Compare commits

...

377 Commits

Author SHA1 Message Date
Emil Velikov
64d51e2507 Add release notes for the 10.2.7 release
Listing bugs fixed and changes made.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-06 00:47:24 +01:00
Emil Velikov
25f5457e84 Update VERSION to 10.2.7
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-06 00:36:45 +01:00
José Fonseca
7ffc8556ca mesa: Move declaration to top of block.
To fix MSVC build.  Trivial.
(cherry picked from commit c98b704128)
2014-09-03 16:39:53 -06:00
Kenneth Graunke
65324b89b8 i965/clip: Fix brw_clip_unfilled.c/compute_offset's assembly.
Due to the destination register width of 1 or 2, these instructions get
ExecSize 1 or 2.  But dir and offset (used as src0) are both registers
of width 4, violating the execsize >= width assertion.

I honestly don't think this could have ever worked.

Fixes Piglit's polygon-offset and polygon-mode-offset tests on Gen4-5.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70441
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit b7679639bc)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36193
2014-09-02 13:35:03 +01:00
Adam Jackson
0c739aa1d2 radeonsi: Don't use anonymous struct trick in atom tracking
I'm somewhat impressed that current gccs will let you do this, but
sufficiently old ones (including 4.4.7 in RHEL6) won't.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit 74388dd24b)
Nominated-by: Jonathan Gray <jsg@jsg.id.au>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76789
2014-09-02 13:35:03 +01:00
Pekka Paalanen
55d28925e6 egl_dri2: fix EXT_image_dma_buf_import fds
The EGL_EXT_image_dma_buf_import specification was revised (according to
its revision history) on Dec 5th, 2013, for EGL to not take ownership of
the file descriptors.

Do not close the file descriptors passed in to eglCreateImageKHR with
EGL_LINUX_DMA_BUF_EXT target.

It is assumed, that the drivers, which ultimately process the file
descriptors, do not close or modify them in any way either. This avoids
the need to dup(), as it seems we would only need to just close the
dup'd file descriptors right after.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76188
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 08264e5dad)
2014-09-02 13:35:03 +01:00
Emil Velikov
ec7d081e13 mesa: fix make tarballs
Current method of generating distribution tar-balls involves manually
invoking make + target name in the appropriate places. This temporary
solution is used until we get 'make dist' working.

Currently it does not work, as in order to have the target (which is
also a filename) available in the final Makefile we need to add a PHONY
target + use the correct target name.

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 88cbe3908f)

Also squashed together with:

Revert "mesa: fix make tarballs"

This reverts commit 0fbb9a599d.

Rather than adding hacks around the issue drop the sources from the
final tarball, and re-add them back with 'make dist'. This fixes a
problem when running parallel 'make install' fails as it recreates
sources and triggers partial recompilation.

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83355
Reported-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
(cherry picked from commit 5a4e0f3873)
2014-09-02 13:35:02 +01:00
Dave Airlie
c4fa2bc796 i965: add missing parens in vec4 visitor
coverity reported this, Matt said it look like missing parens,
not bad identing, so lets try that.

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 94a909ec2d)
2014-09-02 13:35:02 +01:00
Ilia Mirkin
0dad6d0ab0 nv50: attach the buffer bo to the miptree structures
The current code... makes no sense. Use nouveau_bo_ref to attach the bo
to the exposed resource so as to have the proper lifetime guarantees.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2c44043313)
2014-09-02 13:35:02 +01:00
Ilia Mirkin
00ef4b0ab1 nv50: mt address may not be the underlying bo's start address
With VP2, nv50_miptree is faked because the underlying bo's have to be
laid out in a certain way. This is done by adjusting the address. Make
sure that blits (and everything else for consistency) use the mt address
rather than the bo address as a base.

This fixes retrieving chroma plane with VDPAU.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82255
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9d52e551a5)
2014-09-02 13:35:02 +01:00
Ilia Mirkin
eb7c7032c3 nv50: set the miptree address when clearing bo's in vp2 init
The mt address is about to be used more, make sure it's set
appropriately.

Reported-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2528d402b9)
2014-09-02 13:35:01 +01:00
Ilia Mirkin
a708926792 nv50/ir: avoid creating instructions that can't be emitted
When constant folding a MAD operation, we first fold the multiply and
generate an ADD. However we do so without making sure that the immediate
can be handled in the saturate case. If it can't, load the immediate in
a separate instruction.

Reported-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6c2b079231)
2014-09-02 13:35:01 +01:00
Ilia Mirkin
2a967f7128 nvc0: don't make 1d staging textures linear
Experimentally, the sampler doesn't appear to like these, neither as
buffer nor as rect textures. So remove 1D from the list of texture types
to make linear when used for staging.

This fixes the OSD in mplayer for VDPAU.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 115d9a5525)
2014-09-02 13:35:01 +01:00
Ilia Mirkin
eb38556137 nv50: zero out unbound samplers
Samplers are only defined up to num_samplers, so set all samplers above
nr to NULL so that we don't try to read them again later.

Tested-by: Christian Ruppert <idl0r@qasl.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 362cd26960)
2014-09-02 13:35:01 +01:00
Ilia Mirkin
836b0ae8b6 nvc0/ir: avoid infinite recursion when finding first uses of tex
In certain circumstances, findFirstUses could end up doubling back on
instructions it had already processed, resulting in an infinite
recursion. Avoid this by keeping track of already-visited instructions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83079
Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c4bb436f76)

Conflicts:
	src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.h
2014-09-02 13:34:54 +01:00
Vinson Lee
3fe59905fc gallivm: Fix build with LLVM >= 3.6 r215967.
This LLVM 3.6 commit changed EngineBuilder constructor.

commit 3f4ed32b4398eaf4fe0080d8001ba01e6c2f43c8
Author: Rafael Espindola <rafael.espindola@gmail.com>
Date:   Tue Aug 19 04:04:25 2014 +0000

    Make it explicit that ExecutionEngine takes ownership of the modules.

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215967 91177308-0d34-0410-b5e6-96231b3b80d8

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit c04a6d5c29)
Nominated-by: Marek Olšák <maraeo@gmail.com>
2014-09-02 12:29:58 +01:00
Jan Vesely
8fe85c7742 gallivm: Fix build with latest LLVM
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit e28136343b)
Nominated-by: Marek Olšák <maraeo@gmail.com>

Conflicts:
	src/gallium/auxiliary/gallivm/lp_bld_debug.cpp
2014-09-02 12:29:58 +01:00
Andreas Boll
c89719e955 winsys/radeon: fix nop packet padding for hawaii
// Marek - merged cce58147eb into this one

The initial firmware for hawaii does not support type3 nop packet.
Detect the new hawaii firmware with query RADEON_INFO_ACCEL_WORKING2.
If the returned value is 3, then the new firmware is used.

This patch uses type2 for the old firmware and type3 for the new firmware.

It fixes the cases when the old firmware is used and the user wants to
manually enable acceleration.
The two possible scenarios are:
 - the kernel has no support for the new firmware.
 - the kernel has support for the new firmware but only the old firmware
   is available.

Additionaly this patch disables GPU acceleration on hawaii if the kernel
returns a value < 2. In this case the kernel hasn't the required fixes
for proper acceleration.

v2:
 - Fix indentation
 - Use private struct radeon_drm_winsys instead of public struct radeon_info
 - Rename r600_accel_working2 to accel_working2

v3:
 - Use type2 nop packet for returned value < 3

v4:
 - Fail to initialize winsys for returned value < 2

Cc: mesa-stable@lists.freedesktop.org
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 36771dc60f)

Conflicts:
	src/gallium/winsys/radeon/drm/radeon_drm_winsys.c

Also squashed together with:

winsys/radeon: fix hawaii accel_working2 comment

accel_working2 returns 3 if the new firmware is used.

The comment wasn't updated in v3 of commit:
36771dc winsys/radeon: fix nop packet padding for hawaii

Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 64c379a3a8)
2014-09-02 12:29:58 +01:00
Marek Olšák
6e46a1c0b3 glsl_to_tgsi: allocate and enlarge arrays for temporaries on demand
This fixes crashes if the number of temporaries is greater than 4096.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66184

v2: added fail paths for realloc failures

Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 482def592f)

Conflicts:
	src/mesa/state_tracker/st_glsl_to_tgsi.cpp
2014-09-02 12:29:58 +01:00
Robert Bragg
86b7073739 meta: save and restore swizzle for _GenerateMipmap
This makes sure to use a no-op swizzle while iteratively rendering each
level of a mipmap otherwise we may loose components and effectively
apply the swizzle twice by the time these levels are sampled.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit c6f118484c)
Nominated-by: Kenneth Graunke <kenneth@whitecape.org>
2014-08-27 13:38:30 +01:00
Emil Velikov
36b8a16236 get-pick-list.sh: Require explicit "10.2" for nominating stable patches
A nomination unadorned with a specific version is now interpreted as
being aimed at the 10.3 branch, (which was recently opened).

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-27 13:38:30 +01:00
Carl Worth
fefeba913b Makefile: Switch from md5sums to sha256sums
We switched to these several stable releases ago, (since the MD5 algorithm has
been broken for some time), but only now did I get around to fixing this in
the Makefile rather than just performing this step manually.

CC: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 46d03d37bf)
2014-08-27 13:38:30 +01:00
Alex Deucher
ac37c3c66e radeonsi: add new SI pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 153df68834)
2014-08-27 13:38:30 +01:00
Alex Deucher
438346b6c5 radeonsi: add new CIK pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit f50b6b4895)
2014-08-27 13:38:30 +01:00
Tom Stellard
8d9a4112f2 r600g/compute: Don't initialize vertex_buffer_state masks to 0x2
cs_vertex_buffer_state.enabled_mask and
cs_vertex_buffer_state.dirty_mask are both updated when
r600_set_constant_buffer() is called, so we don't need to manually
update these values.

This fixes a crash with OpenCL programs that have a kernel with no
arguments.

https://bugs.freedesktop.org/show_bug.cgi?id=82671

CC: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bf7a60f41d)
2014-08-27 13:38:29 +01:00
Tom Stellard
f293bb9664 pipe-loader: Fix memory leak v2
v2:
  - Change driver_name to char*

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
CC: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 43d954342e)

Conflicts:
	src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c
2014-08-27 13:38:29 +01:00
Tom Stellard
0931f475fa radeon: Add work-around for missing Hainan support in clang < 3.6 v2
v2:
  - Add missing break.

https://bugs.freedesktop.org/show_bug.cgi?id=82709

CC: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8109664ded)
2014-08-27 13:38:29 +01:00
Brian Paul
6f8d2db2ef mesa: fix NULL pointer deref bug in _mesa_drawbuffers()
This is a follow-on fix to commit 39b40ad144.  Fixes a crash if the
user calls glDrawBuffers(0, NULL).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82814
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 31ce84a81f)
2014-08-27 13:38:29 +01:00
Marek Olšák
0374fdd5b6 radeonsi: save scissor state and sample mask for u_blitter
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 7792f9858b)
2014-08-27 13:38:29 +01:00
Ilia Mirkin
fa03aa9a8a nouveau: don't keep stale pointer to free'd data
If ->sys is non-null, we might decide that it's where the data is
stored.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ef130b6050)
2014-08-27 13:38:29 +01:00
Ilia Mirkin
301a23c624 nouveau: make sure to invalidate any vbo state as well
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8867ffbf95)
2014-08-27 13:38:29 +01:00
Marek Olšák
12e4e88c95 r600g: fix constant buffer fetches
Somebody forgot to do this. It was uncovered by recent st/mesa changes.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82139

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
(cherry picked from commit da9c3ed304)
2014-08-27 13:38:29 +01:00
Anuj Phogat
0ecb1cfad3 i965: Bail on vec4 copy propagation for scratch writes with source modifiers
Fixes Khronos GLES3 CTS test:
dynamic_expression_array_access_vertex

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 7c1ea00eaf)
2014-08-27 13:38:28 +01:00
Tom Stellard
42bd348d21 clover: Flush the command queue in clReleaseCommandQueue()
This is required by the spec.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

CC: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ed3f7eadad)
2014-08-27 13:38:28 +01:00
Emil Velikov
e8d7e3bc57 cherry-ignore: reject a15088338e
The commit "radeonsi/compute: Stop leaking the input buffer" is not
meant for 10.2 as it relies on code that was never part of this
branch and breaks the build.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-27 13:37:26 +01:00
Emil Velikov
869e6e3955 cherry-ignore: drop whitespace fix
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-27 13:26:14 +01:00
Tom Stellard
a92c1b1904 radeonsi/compute: Call si_pm4_free_state() after emitting compute state
This will decrement the reference count for buffers referenced in the
command stream will prevent us from leaking them.

CC: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1e2e550671)
2014-08-27 13:25:19 +01:00
Tom Stellard
c5885eca15 radeonsi/compute: Update reference counts for buffers in si_set_global_binding()
CC: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 05e9681d55)

Conflicts:
	src/gallium/drivers/radeonsi/si_compute.c
2014-08-27 13:25:19 +01:00
Emil Velikov
f1ff7753f1 cherry-ignore: PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE is not it 10.2
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-27 13:24:26 +01:00
Tom Stellard
7b78cf4ae2 radeon/compute: Fix reported values for MAX_GLOBAL_SIZE and MAX_MEM_ALLOC_SIZE
There is a hard limit in older kernels of 256 MB for buffer allocations,
so report this value as MAX_MEM_ALLOC_SIZE and adjust MAX_GLOBAL_SIZE
to statisfy requirements of OpenCL.

CC: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 77ea58ca81)
2014-08-27 13:22:46 +01:00
Emil Velikov
395d4e11cf cherry-ignore: remove patch that lacking previous dependencies
The patch clears up the msse4.1 handling, yet mesa 10.2 lacks earlier
patches that this one relies on.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-27 13:22:45 +01:00
Emil Velikov
cab73f6eba android: dri/i915: do not build an 'empty' driver
The variable i915_C_FILES changed to i915_FILES with commit
34d4216e64 back in mesa 9.1/9.2. Yet we've missed to update the
the android build, essentially creating an dummy/empty driver that
can never work.

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 5facd003a0)
2014-08-25 23:45:17 +01:00
Emil Velikov
0e52107ff8 android: glsl: the stlport over the limited Android STL
The latter lacks various functionality used by mesa/glsl.

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 07f583186d)
2014-08-25 23:39:47 +01:00
Emil Velikov
aba5b15068 android: drop HAL_PIXEL_FORMAT_RGBA_{5551,4444}
Upstream Android (system/core) has dropped these formats with commit
6bac41f1bf9(get rid of HAL pixelformats 5551 and 4444) yet does not
mention why.

These formats never really worked so we're safe to drop them as well.

Identical commit is available in the android-x86 external/mesa repo

    commit 06a2d36edc
    Author: Chih-Wei Huang <cwhuang@linux.org.tw>
    Date:   Wed Sep 25 01:16:57 2013 +0800

        android: get rid of HAL pixelformats 5551 and 4444

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit dfa6dc5eb8)
2014-08-25 23:39:30 +01:00
Emil Velikov
477babf6f4 android: gallium/auxiliary: drop log2/log2f redefitions
Recent versions of bionic has picked up support for these functions,
leading to build issues due to the redefition of the symbols.

Note: wrapping things in #ifdef does not cut it :\

Identical patch is available in chromium, android-x86 and perhaps other
projects.

    commit 66c1c789ce3407472de9ed620c9f815639058835
    Author: rmcilroy@chromium.org
    Date:   Wed Apr 02 10:59:34 2014 +0000

        Porting to x64 Android. Remove redefinitions of log2 and log2f.

        BUG=
        R=kbr@chromium.org

        Review URL: https://codereview.chromium.org/216773005

    commit 9cc0a0d2b0
    Author: Chih-Wei Huang <cwhuang@linux.org.tw>
    Date:   Sun Jul 21 23:04:19 2013 +0800

        android: remove log2, log2f

        The functions are already defined in the latest bionic.

Cc: Chia-I Wu <olvaffe@gmail.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Chia-I Wu <olvaffe@gmail.com>
(cherry picked from commit 51a9a09ba8)
2014-08-25 23:39:04 +01:00
Emil Velikov
1cab5b5144 android: egl/main: add/enable freedreno
For all everyone willing to give the freedreno driver
a go they can now build it under Android.

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Cc: Rob Clark <robclark@freedesktop.org>
Cc: freedreno@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 792041ebe5)
2014-08-25 23:38:43 +01:00
Emil Velikov
6761e28a46 android: gallium/freedreno: add preliminary build
For all the people interested in testing the freedreno driver on
their Android devices. The next commit will hook these up within
the libEGL driver (via the gallium-egl backend).

There may be some rough edges but those can be sorted when a
willing builder/tester comes along.

v2:
 - s/freefreno/freedreno/. Spotted by Matt Turner.
 - Use the installed libdrm headers.

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Cc: Rob Clark <robclark@freedesktop.org>
Cc: freedreno@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit bf05e06757)
2014-08-25 23:38:18 +01:00
Emil Velikov
0cdebea90c automake: gallium/freedreno: drop spurious include dirs
Rather than including two extra folders only for two headers,
just prefix the headers and be done with it.

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Cc: Rob Clark <robclark@freedesktop.org>
Cc: freedreno@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 458d03a4a4)

Conflicts:
	src/gallium/drivers/freedreno/Makefile.am
2014-08-25 23:35:47 +01:00
Paulo Sergio Travaglia
6601260464 android: egl/main: resolve radeon linking issues
- link against libdrm_radeon
 - link the r600 driver against libstlport
 - linkin the newly added libmesa_pipe_radeon library
required by r600 and radeonsi drivers

v2: Include pipe_radeon after pipe_r600/radeonsi.

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
[Emil Velikov] Split up and add commit message.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

(cherry picked from commit aae453afe8)
2014-08-25 23:34:29 +01:00
Paulo Sergio Travaglia
a3eb6989ad android: gallium/radeon: attempt to fix the android build
- include the correct folders
 - add a new buildscript for the common radeon folder

v2: Use the installed libdrm headers over the DRM_TOP ones.

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
[Emil Velikov] Split up and add commit message.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

(cherry picked from commit 5bbfa308c9)
2014-08-25 23:34:22 +01:00
Emil Velikov
eda6510425 android: egl/main: fixup the nouveau build
For a while the nouveau pipe driver has been a static library
and it has been using STL for even longer.
Correct add the link and cleanup the gallium_DRIVERS.

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 825fa2873f)
2014-08-25 23:34:14 +01:00
Emil Velikov
8cb517e43a android: gallium/nouveau: fix include folders, link against libstlport
nouveau uses STL for a while now thus we need to include
external/stlport/libstlport.mk in order to get the build
at least partially working.

v2: Use the installed libdrm headers over the DRM_TOP ones.

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 6b510c6338)
2014-08-25 23:33:31 +01:00
Emil Velikov
cdfc7c6c74 configure.ac: bail out if building gallium_gbm without gallium_egl
The former is the only user of the latter. As such building gbm
without egl makes little to no sense.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 1d1ec76bdf)
2014-08-25 23:32:36 +01:00
Kenneth Graunke
62f60edd77 i965/vec4: Respect ir->force_writemask_all in Gen8 code generation.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2e180e4c09)
2014-08-25 23:27:51 +01:00
Kenneth Graunke
66f88a4640 i965/vec4: Set NoMask for GS_OPCODE_SET_VERTEX_COUNT on Gen8+.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7b6b61ba83)
2014-08-25 23:27:44 +01:00
Ian Romanick
d82ca4e2b2 mesa: Handle uninitialized textures like other textures in get_tex_level_parameter_image
Instead of catching the special case early, handle it by constructing a
fake gl_texture_image that will cause the values required by the OpenGL
4.0 spec to be returned.

Previously, calling

    glGenTextures(1, &t);
    glBindTexture(GL_TEXTURE_2D, t);
    glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, 0xDEADBEEF, &value);

would not generate an error.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Suggested-by: Brian Paul <brianp@vmware.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Cc: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit ee58c71a65)
2014-08-19 15:24:18 -07:00
Carl Worth
8b056fc486 docs: Add sha256 sums for the 10.2.6 release 2014-08-19 15:17:50 -07:00
Carl Worth
346dda24bf Add release notes for the 10.2.6 release
Listing bugs fixed and changes made.
2014-08-19 14:01:02 -07:00
Carl Worth
efc7aa3187 Update VERSION to 10.2.6
In preparation for the 10.2.6 release.
2014-08-19 13:58:32 -07:00
Brian Paul
5532cf9d7e mesa: fix assertion in _mesa_drawbuffers()
Fixes failed assertion when _mesa_update_draw_buffers() was called
with GL_DRAW_BUFFER == GL_FRONT_AND_BACK.  The piglit gl30basic hit
this.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 39b40ad144)
2014-08-11 14:38:22 -07:00
Maarten Lankhorst
bca8865aec configure.ac: Do not require llvm on x32
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Maarten Lankhorst <dev@mblankhorst.nl>
(cherry picked from commit 4c16e6a8e0)
2014-08-11 14:37:57 -07:00
Marek Olšák
629df2fbdd radeonsi: fix CMASK and HTILE allocation on Tahiti
Tahiti has 12 tile pipes, but P8 pipe config.

It looks like there is no way to get the pipe config except for reading
GB_TILE_MODE. The TILING_CONFIG ioctl doesn't return more than 8 pipes,
so we can't use that for Hawaii.

This fixes a regression caused by fcb6c0d2b8
(cherry picked from 9b046474c9 as part of
Mesa 10.2.5) on Tahiti.

v2: add an assertion and print an error on failure

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 955505f6ff)
2014-08-11 14:36:24 -07:00
Marek Olšák
22ec7df0d7 radeonsi: fix a hang with instancing in Unigine Heaven/Valley on Hawaii
This isn't documented anywhere, but it's the only thing that works
for this case.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 515269b3a7)

Conflicts:
	src/gallium/drivers/radeonsi/si_state_draw.c
2014-08-11 14:34:52 -07:00
Marek Olšák
1e0820032d radeon,r200: fix buffer validation after CS flush
This validates all bound buffers (CB, ZB, textures, DMA) at the beginning
of CS. This fixes "bo->space_accouned" assertion failures.

Tested by: Jochen Rollwagen <joro-2013@t-online.de>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

(cherry picked from commit 085a861545)
2014-08-11 14:27:45 -07:00
Marek Olšák
485fd2a4eb st/mesa: fix blit-based partial TexSubImage for 1D arrays
This fixes piglit spec/EXT_texture_array/render-1darray.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 0b5d88a518)
2014-08-11 14:27:10 -07:00
Pali Rohár
56a455f5ba configure: check for dladdr via AC_CHECK_FUNC/AC_CHECK_LIB
Use both macros as in some cases using AC_CHECK_FUNCS alone may fail.
Thus HAVE_DLADDR will not be defined, and as a result most of the code
in megadriver_stub.c will not be compiled. Breaking the backwards
compatibility between older libGL/xserver(s) and DRI megadrivers.

Cc: Jon TURNEY <jon.turney@dronecode.org.uk>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
[Emil Velikov] Commit message.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

(cherry picked from commit 39a4cc45a4)

Also squashed together with:

configure.ac: Use LIBS rather than LDFLAGS to add -ldl to dladdr check

ec8ebff "Check for dladdr()" erroneously uses LDFLAGS rather than LIBS to add
-ldl to the dladdr check.

Replace the workaround in 39a4cc4 of explicitly checking in libdl, with a more
correct approach of using LIBS.

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Pali Rohár <pali.rohar@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a2e1dc0cce)
2014-08-11 14:26:18 -07:00
Anuj Phogat
3e541f0ab8 meta: Fix datatype computation in get_temp_image_type()
Changes in the patch will cause datatype to be computed
correctly for 8 and 16 bit integer formats. For example:
GL_RG8I, GL_RG16I etc.

Fixes many failures in gles3 Khronos CTS test:
copy_tex_image_conversions_required
copy_tex_image_conversions_forbidden

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 338fef61f8)

Conflicts:
	src/mesa/drivers/common/meta.c
2014-08-11 14:25:01 -07:00
Anuj Phogat
82d46ebc07 egl: Fix OpenGL ES version checks in _eglParseContextAttribList()
We would generate EGL_BAD_CONFIG because _eglGetContextAPIBit
returns zero for the combination of EGL_OPENGL_ES_API and a major
version > 3.  By just returning zero, the caller can't tell the
difference between a bad version (which should generate
EGL_BAD_MATCH) and a bad API (which should generate
EGL_BAD_CONFIG).  This patch causes us to filter out major
versions > 3 at a point where we can generate the correct error.

Fixes gles3 Khronos CTS test:
egl_create_context.egl_create_context

V2: Fix commit message as suggested by Ian.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit d308f57fe7)
2014-08-06 14:52:05 -07:00
Anuj Phogat
29e2f11f5d meta: Use _mesa_get_format_bits() to get the GL_RED_BITS
We currently get red bits from ctx->DrawBuffer->Visual.redBits
by making a false assumption that the texture we're writing to
(in glCopyTexImage2D()) is used as a DrawBuffer.

Fixes many failures in gles3 Khronos CTS test:
copy_tex_image_conversions_required

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 7de90890c6)

Conflicts:
	src/mesa/drivers/common/meta.c
2014-08-06 14:50:18 -07:00
Anuj Phogat
8dc1caa1ad mesa: Allow GL_TEXTURE_CUBE_MAP target with compressed internal formats
GL_TEXTURE_CUBE_MAP is an allowed texture target in glTexStorage2D()
and is allowed to be used (like GL_TEXTURE_2D) with compressed internal
formats.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit c7def2257a)
2014-08-06 14:49:47 -07:00
Anuj Phogat
eef8b88e63 mesa: Add gles3 condition for normalized internal formats in glCopyTexImage*()
Fixes many failures in gles3 Khronos CTS test: packed_pixels

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 2fc4205461)
2014-08-06 14:49:36 -07:00
Anuj Phogat
326648add1 mesa: Add utility function _mesa_is_enum_format_unorm()
V2: Add missing formats.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 938b3d0034)
2014-08-06 14:49:27 -07:00
Anuj Phogat
ad0d1a1ab3 mesa: Add gles3 error condition for GL_RGBA10_A2 buffer format in glCopyTexImage*()
Fixes many failures in gles3 Khronos CTS test: packed_pixels

Khronos bug# 9807
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

(cherry picked from commit 6df48ff27a)
2014-08-06 14:49:18 -07:00
Anuj Phogat
712933f1d6 mesa: Add a gles3 error condition for sized internalformat in glCopyTexImage*()
Fixes many failures in gles3 Khronos CTS test: packed_pixels

V2: Add the check for alpha bits to avoid confusion.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 5c0d2a12f3)
2014-08-06 14:49:10 -07:00
Anuj Phogat
8829b3c37a mesa: Add a helper function _mesa_is_enum_format_unsized()
Function is utilized by next patch in the series.

V2: Add missing formats.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit e0fe00eeac)
2014-08-06 14:37:04 -07:00
Anuj Phogat
4c02b3d90e mesa: Don't allow snorm internal formats in glCopyTexImage*() in GLES3
Fixes few failures in gles3 Khronos CTS test: packed_pixels

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 2d362a6aee)
2014-08-06 14:36:46 -07:00
Anuj Phogat
b6fe3e7ac2 mesa: Add utility function _mesa_is_enum_format_snorm()
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 845b5ec89f)
2014-08-06 14:36:31 -07:00
Anuj Phogat
323f0246ef mesa: Fix condition for using compressed internalformat in glCompressedTexImage3D()
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 3c7a0c690a)
2014-08-06 14:36:06 -07:00
Anuj Phogat
686247ff62 mesa: Add error condition for using compressed internalformat in glTexStorage3D()
Fixes gles3 Khronos CTS test: texture_storage_texture_internal_formats

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit e27c9f3a02)
2014-08-06 14:35:46 -07:00
Anuj Phogat
d21e0d34fc mesa: Turn target_can_be_compressed() in to a utility function
V2:  Declare the function in teximage.h

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit ac2adf66c1)
2014-08-06 14:35:17 -07:00
Anuj Phogat
8bb6628cb9 mesa: Fix error condition for valid texture targets in glTexStorage* functions
Fixes gles3 Khronos CTS test: texture_storage_texture_targets

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit a94d78438d)
2014-08-06 14:34:57 -07:00
Ilia Mirkin
69d6ceda43 mesa/st: only convert AND(a, NOT(b)) into MAD when not using native integers
Native integers imply a somewhat different handling of booleans. Instead
of being 1.0/0.0 floats, they are 0 (true) / -1 (false) integers. As such
the original optimization no longer applies.

Reported-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7b3d0a9a1e)
2014-08-06 14:33:51 -07:00
Jordan Justen
1abd1d0430 i965/miptree: Layout 1D Array as 2D Array with height of 1
1D array miptrees were being laid out as a 2D texture with 1 slice.
This happened due to the mesa core storing the 1D array slice count in
the height field. On Intel hardware, we want to create a 2D array with
a height of 1 for the 1D array case.

Fixes assertion failure in piglit (gen6, gen8):
spec/glsl-1.30/execution/tex-miplevel-selection textureOffset 1DArrayShadow

In release builds of Mesa, this test was observed to cause a GPU hang
on gen8.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81450
Tested-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit c860a379d2)
2014-08-06 14:33:03 -07:00
Roland Scheidegger
3baf37f076 gallivm: fix up out-of-bounds level when using conformant out-of-bound behavior
When using (d3d10) conformant out-of-bound behavior for texel fetching
(currently always enabled) the level still needs to be set to a safe value
even though the offset in the end won't get used because the level is used
to look up the mip offset itself and the actual strides, which might otherwise
crash.
For simplicity, we'll use level 0 in this case (this ought to be safe, llvmpipe
does not actually fill in level 0 information if first_level is larger, but
some random strides / offsets shouldn't hurt as ultimately we always use
offset 0 in this case).
Fixes a crash in some in-house test where random huge levels appear in
lp_build_fetch_texel() (the test actually uses level 0 always but if the
fetching happens in a block with a execution mask random values may appear).

CC: <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 5a12155503)
2014-08-06 14:31:48 -07:00
Carl Worth
12d68e9aaf docs: Add sha256 sums to the 10.2.5 release notes 2014-08-02 19:10:30 -07:00
Carl Worth
a53047f6d1 docs: Add release notes for 10.2.5 2014-08-02 19:06:05 -07:00
Carl Worth
b83b9b677b Update version to 10.2.5
In preparation for the 10.2.5 release, of course.
2014-08-02 19:03:34 -07:00
Marek Olšák
853cd6a4f7 radeonsi: use DRAW_PREAMBLE on CIK
It's the same as setting the 3 regs separately, but shorter, and it also
seems to be required on GFX7.2 and later. This doesn't fix Hawaii.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 315f3c171d)
2014-07-31 15:23:14 -07:00
Christian König
c66da3d457 radeonsi: fix order of r600_need_dma_space and r600_context_bo_reloc
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit c8011c1885)
2014-07-31 15:22:49 -07:00
Marek Olšák
f75dfcee10 radeonsi: fix build because of lack of draw_indirect infrastructure in 10.2 2014-07-31 15:22:21 -07:00
Carl Worth
490d8ddf87 cherry-ignore: Ignore a few patches picked in the previous stable release
I don't know what happened here, but these three commits were picked earlier,
but the commit IDs referenced in their commit messages do not currently
appear in the master branch, (perhaps a force-push occurred?).

Anyway, we can ignore these now.
2014-07-30 22:09:46 -07:00
Jason Ekstrand
b9c5a8f869 main/get_hash_params: Add GL_SAMPLE_SHADING_ARB
GL_SAMPLE_SHADING is specified as a valid pname for glGet in the
GL_ARB_sample_shading extension.  It seems as if we forgot to add it to the
table of pnames.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 3ea922dd7c)
2014-07-30 22:09:46 -07:00
José Fonseca
c84b367b18 st/wgl: Clamp wglChoosePixelFormatARB's output nNumFormats to nMaxFormats.
While running https://github.com/nvMcJohn/apitest with apitrace I noticed that Mesa was producing bogus results:

  wglChoosePixelFormatARB(hdc, piAttribIList = {...}, pfAttribFList = &0, nMaxFormats = 1, piFormats = {19, 65576, 37, 198656, 131075, 0, 402653184, 0, 0, 0, 0, -573575710}, nNumFormats = &12) = TRUE

However https://www.opengl.org/registry/specs/ARB/wgl_pixel_format.txt states

    <nNumFormats> returns the number of matching formats. The returned
    value is guaranteed to be no larger than <nMaxFormats>.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 66a1b3a1da)
2014-07-30 22:09:46 -07:00
Marek Olšák
71102219ea r600g,radeonsi: switch all occurences of array_size to util_max_layer
This fixes 3D texture support in all these cases, because array_size is 1
with 3D textures and depth0 actually contains the "array size".
util_max_layer is universal and returns the last layer index for any texture
target.

A lot of the cases below can't actually be hit with 3D textures, but let's
be consistent.

This fixes a failure in:
    piglit layered-rendering/clear-color-all-types 3d single_level
for r600g and radeonsi, which was caused by an incorrect CMASK size
calculation.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit a9528cef6b)
2014-07-30 22:09:45 -07:00
Marek Olšák
d26ac40bad radeonsi: fix occlusion queries on Hawaii
This was just a guess - and it worked!

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 71ce92200e)
2014-07-30 22:09:45 -07:00
Marek Olšák
50dcc2eb26 winsys/radeon: fix vram_size overflow with Hawaii
This fixes piglit spec/!OpenGL 3.1/minmax.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 156b7e244c)
2014-07-30 22:09:45 -07:00
Marek Olšák
0dfcf50639 radeonsi: fix a hang with streamout on Hawaii
I actually couldn't reproduce this one, but internal docs recommend this
workaround. Better safe than sorry.

Also, the number of dwords for the sync packets is increased by 4 instead
of 2, because it wasn't bumped last time when a new packet was added there.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0e7f56313d)
2014-07-30 22:09:45 -07:00
Marek Olšák
6ff3da2509 radeonsi: fix a hang with instancing on Hawaii
This fixes "piglit/bin/arb_transform_feedback2-draw-auto instanced".

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3d9e87406c)
2014-07-30 22:09:45 -07:00
Marek Olšák
378def4cab gallium/util: add a helper for calculating primitive count from vertex count
This is needed by the following commit which is a candidate for stable too.

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c7407b94a8)
2014-07-30 22:09:45 -07:00
Marek Olšák
fcb6c0d2b8 radeonsi: fix CMASK and HTILE calculations for Hawaii
This fixes the checkerboard pattern in glxgears and anything that triggers
fast color clear.

num_channels is always <= 8, but Hawaii has 16 pipes.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9b046474c9)
2014-07-30 22:09:45 -07:00
Marek Olšák
d59406cdb7 r600g: switch SNORM conversion to DX and GLES behavior
it also matches GL 4.2

further discussion:
http://lists.freedesktop.org/archives/mesa-dev/2013-August/042680.html

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit d5bcb5e8de)
2014-07-30 22:09:45 -07:00
Ilia Mirkin
f2023b8dc8 nvc0: make sure that the local memory allocation is aligned to 0x10
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 47e5a8d7a2)
2014-07-30 22:09:45 -07:00
Kenneth Graunke
37114cc2eb i965/fs: Set LastRT on the final FB write on Broadwell.
In Piglit's EXT_framebuffer_multisample/alpha-to-coverage-dual-src-blend
test, key->nr_color_regions == 2, but the dual source blend FB write has
ir->target set to 0.  So we failed to set "Last Render Target Select" on
any FB write message.

We only emit one FB write per render target, so my comment about setting
LastRT on every FB write directed at the last color region is a bit...
misinformed.  According to the documentation, depth buffer writes and
scoreboard updates happen on the FB write with LastRT set, so I believe
we want to set it only once.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d4d886a0bc)

Conflicts:
	src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
2014-07-30 22:09:45 -07:00
Anuj Phogat
f876eae80b mesa: Don't use memcpy() in _mesa_texstore() for float depth texture data
because float depth texture data needs clamping to [0.0, 1.0]. Let the
_mesa_texstore() fallback to slower path.

Fixes Khronos GLES3 CTS tests:
shadow_execution_vert
shadow_execution_frag

V2: Move the check to _mesa_texstore_can_use_memcpy() function.
    Add check for floating point data types.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 9548ba6e7b)
2014-07-30 22:09:45 -07:00
Kenneth Graunke
8e91b094c8 i965/fs: Fix gl_SampleMask handling for SIMD16 on Gen8+.
We actually want to use mov(16), not mov(8).

Fixes 7 Piglit tests: ARB_sample_shading/builtin-gl-sample-mask [2468]
and ARB_sample_shading/builtin-gl-sample-mask-simple [468].

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80991
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 29af97f280)
2014-07-30 22:09:45 -07:00
Kenneth Graunke
6893c25c7b i965/fs: Fix gl_SampleID for 2x MSAA and SIMD16 mode.
We might be able to do this without an extra program key field, but this
is non-invasive and fixes the bug, for now.

This fixes the following Piglit tests on Broadwell:
- ARB_sample_shading/builtin-gl-sample-id 2
- ARB_sample_shading/builtin-gl-sample-position 2
- EXT_framebuffer_multisample/multisample-blit 2 color
- EXT_framebuffer_multisample/multisample-blit 2 color linear
- EXT_framebuffer_multisample/multisample-blit 2 depth
- EXT_framebuffer_multisample/no-color 2 depth combined
- EXT_framebuffer_multisample/no-color 2 depth separate
- EXT_framebuffer_multisample/no-color 2 depth single
- EXT_framebuffer_multisample/no-color 2 depth-computed combined
- EXT_framebuffer_multisample/no-color 2 depth-computed separate
- EXT_framebuffer_multisample/no-color 2 depth-computed single
- EXT_framebuffer_multisample/unaligned-blit 2 color msaa
- EXT_framebuffer_multisample/unaligned-blit 2 depth msaa

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80991
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 38ffef7840)
2014-07-30 22:09:38 -07:00
Kenneth Graunke
62ebb85cd4 i965: Add missing persample_shading field to brw_wm_debug_recompile.
Otherwise, the performance warning for shader recompiles will just say
"something else".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 4cf47c80fc)
2014-07-30 16:31:39 -07:00
Jason Ekstrand
8c8dc8c9e9 main/format_pack: Fix a wrong datatype in pack_ubyte_R8G8_UNORM
Before it was only storing one of the color components due to truncation.
With this patch it now properly stores all of them.

Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ecd3e89b32)
2014-07-30 16:28:54 -07:00
Anuj Phogat
775895110c i965: Fix z_offset computation in intel_miptree_unmap_depthstencil()
The bug is triggered by using glTexSubImage2d() with GL_DEPTH_STENCIL
as base internal format and non-zero x, y offsets. Currently x, y
offsets are ignored while updating the texture image.

Fixes Khronos GLES3 CTS tests:
npot_tex_sub_image_2d
npot_tex_sub_image_3d
npot_pbo_tex_sub_image_2d
npot_pbo_tex_sub_image_2d

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 984a02ba55)
2014-07-30 16:28:30 -07:00
Adel Gadllah
386648c555 i915: Fix up intelInitScreen2 for DRI3
Commit 442442026e updated both i915 and i965 for DRI3 support,
but one check in intelInitScreen2 was missed for i915 causing crashes
when trying to use i915 with DRI3.

So fix that up.

Reported-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
References: https://bugzilla.redhat.com/show_bug.cgi?id=1115323
References: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=754297
Tested-by: František Zatloukal <Zatloukal.Frantisek@gmail.com>
Tested-by: Dirk Griesbach <spamthis@freenet.de>
Signed-off-by: Adel Gadllah <adel.gadllah@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b656e3c603)
2014-07-30 16:27:59 -07:00
Thorsten Glaser
0e6d8ca573 nv50: fix build failure on m68k due to invalid struct alignment assumptions
Make alignment assumptions explicit by inserting correct padding with
unknown struct members.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3cfe6bc9cc)
2014-07-30 16:27:23 -07:00
Tom Stellard
1233cdd98d clover: Call end_query before getting timestamp result v2
v2:
  - Move the end_query() call into the timestamp constructor.
  - Still pass false as the wait parameter to get_query_result().

Reviewed-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>

CC: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 74dfd86ed6)

Conflicts:
	src/gallium/state_trackers/clover/core/timestamp.cpp
2014-07-30 16:26:44 -07:00
Ian Romanick
7a856002d3 mesa: Don't allow GL_TEXTURE_{LUMINANCE,INTENSITY}_* queries outside compat profile
There are no queries for GL_TEXTURE_LUMINANCE_SIZE,
GL_TEXTURE_INTENSITY_SIZE, GL_TEXTURE_LUMINANCE_TYPE, or
GL_TEXTURE_INTENSITY_TYPE in any version of OpenGL ES or desktop OpenGL
core profile.

NOTE: Without changes to piglit, this regresses
required-sized-texture-formats.

v2: Rebase on different initial change.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9a723b970e)
2014-07-30 16:10:38 -07:00
Ian Romanick
d8074a6a1c mesa: Don't allow GL_TEXTURE_BORDER queries outside compat profile
There are no texture borders in any version of OpenGL ES or desktop
OpenGL core profile.

Fixes piglit's gl-3.2-texture-border-deprecated.

v2: Rebase on different initial change.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 750286600b)
2014-07-30 16:10:11 -07:00
Carl Worth
816d37e5c5 docs: Add SHA256 checksums for the 10.2.4 release
Just after tagging the commit used to create the release.
2014-07-18 16:46:28 -07:00
Carl Worth
efe8cb1e53 Add release notes for 10.2.4
Just prior to the release.
2014-07-18 12:45:19 -07:00
Carl Worth
54733e5cb8 Update VERSION to 10.2.4
In preparation for the 10.2.4 release, of course.
2014-07-18 12:37:31 -07:00
Kenneth Graunke
6388ad51ff i965: Enable compressed multisample support (CMS) on Broadwell.
Everything is in place and appears to be working.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 8cf289c3ef)
2014-07-17 15:59:01 -07:00
Kenneth Graunke
ab0ad8f7e9 i965: Add 2x MSAA support to the MCS allocation function.
2x MSAA also uses 8 bits, just like 4x.  More bits are unused.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit db184d43b0)
2014-07-17 15:59:01 -07:00
Kenneth Graunke
1c386d5c35 i965: Hook up the MCS buffers in SURFACE_STATE on Broadwell.
MCS buffers are never allocated on Broadwell, so this does nothing for
now, but puts the infrastructure in place for when they do exist.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit a248b2a4eb)
2014-07-17 15:59:01 -07:00
Kenneth Graunke
e3c0c23873 i965: Drop SINT workaround for CMS layout on Broadwell.
According to the documentation, we don't need this SINT workaround on
Broadwell.  (Or at least, it doesn't mention that we need it.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit e10311be9f)
2014-07-17 15:59:01 -07:00
Kenneth Graunke
2a90fbfce4 i965: Add plumbing for Broadwell's auxiliary surface support.
Broadwell generalizes the MCS fields to allow for multiple kinds of
auxiliary surfaces.  This patch adds the plumbing to set those values,
but doesn't yet hook any up.

v2: (by Jordan Justen) Use mt for qpitch; pitch is tiles - 1.
v3: Don't forget to subtract 1 from aux_mt->pitch.
v4: Drop unnecessary aux_mt->offset (caught by Jordan Justen).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit fd77187689)
2014-07-17 15:59:01 -07:00
Jordan Justen
d374cfe0bc i965: Add auxiliary surface field #defines for Broadwell.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit a46cb6a971)
2014-07-17 15:59:01 -07:00
Matt Turner
b56908d7db i965/fs: Set correct number of regs_written for MCS fetches.
regs_written is in units of virtual GRFs.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit dfd117b857)
2014-07-17 15:59:01 -07:00
Eric Anholt
c6a6acb6b4 i965: Generalize the pixel_x/y workaround for all UW types.
This is the only case where a fs_reg in brw_fs_visitor is used during
optimization/code generation, and it meant that optimizations had to be
careful to not move pixel_x/y's register number without updating it.

Additionally, it turns out we had a couple of other UW values that weren't
getting this treatment (like gl_SampleID), so this more general fix is
probably a good idea (though I wasn't able to replicate problems with
either pixel_[xy]'s values or gl_SampleID, even when telling the register
allocator to reuse registers immediately)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 66f5c8df06)
2014-07-17 15:59:00 -07:00
Kenneth Graunke
64ff84abae i965/fs: Use WE_all for gl_SampleID header register munging.
This code should execute without regard to the currently executing
channels.  Asking for gl_SampleID inside control flow might break in
strange ways.  It appears to break even at the top of the program in
SIMD16 mode occasionally as well.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 6dc9e4e22a19108057162d9d8f8c7d559545f8de)
2014-07-17 15:59:00 -07:00
Matt Turner
8f4e03c397 i965/fs: Don't use brw_imm_* unnecessarily.
Using brw_imm_* creates a source with file=HW_REG, and the scheduler
inserts barrier dependencies when it sees HW_REG. None of these are
hardware-registers in the sense that they're special and scheduling
shouldn't touch them. A few of the modified cases already have HW_REGs
for other sources, so it won't allow extra flexibility in some cases.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit c938be8ad2)
(This patch was cherry-picked to make the next commit apply cleanly.)
2014-07-17 15:59:00 -07:00
Kenneth Graunke
258f35441a i965/fs: Set force_uncompressed and force_sechalf on samplepos setup.
gen8_fs_generator uses these to decide whether to set the execution size
to 8 or 16, so we incorrectly made both of these MOVs the full width in
SIMD16 shaders.  (It happened to work out on Gen4-7.)

Setting them should also help inform optimization passes what's really
going on, which could help avoid bugs.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2014-07-17 15:59:00 -07:00
Kenneth Graunke
cb04294b42 i965: Set execution size to 8 for instructions with force_sechalf set.
Both inst->force_uncompressed and inst->force_sechalf mean that the
generated instruction should be uncompressed and have an execution size
of 8.  We don't require the visitor to set both flags - setting
inst->force_sechalf by itself is supposed to be enough.

On Gen4-7, guess_execution_size() demoted instructions to 8-wide based
on the default compression state.  On Gen8+, we instead set a default
execution size, which worked great...except that we forgot to check
inst->force_sechalf when deciding whether to use 8 or 16.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 1c62126612752f6eedb66f705cc3ff1e11beea5d)
2014-07-17 15:59:00 -07:00
Matt Turner
d389a863f2 i965/vec4: Constant propagate into 2-src math instructions on Gen8.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 7192207de1)
2014-07-17 15:59:00 -07:00
Matt Turner
7fcfdfb17b i965/fs: Constant propagate into 2-src math instructions on Gen8.
total instructions in shared programs: 1878133 -> 1876986 (-0.06%)
instructions in affected programs:     153007 -> 151860 (-0.75%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 038eb649b3)
2014-07-17 15:59:00 -07:00
Matt Turner
8612a12a62 i965/fs: Make try_constant_propagate() static.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit aca4a951ea)
2014-07-17 15:59:00 -07:00
Matt Turner
8f787d3ca2 i965/fs: Don't fix_math_operand() on Gen >= 8.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 48f1143c64)
2014-07-17 15:59:00 -07:00
Matt Turner
b323fa8957 i965/vec4: Don't fix_math_operand() on Gen >= 8.
The emit_math?_gen? functions serve to implement workarounds for the
math instruction, none of which exist on Gen8+.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit b24e1cc604)
2014-07-17 15:59:00 -07:00
Matt Turner
d5d94598cb i965/vec4: Don't return void from a void function.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 0e800dfe75)
2014-07-17 15:59:00 -07:00
Kenneth Graunke
2efd0a3479 i965: Don't copy propagate abs into Broadwell logic instructions.
It's not clear what abs on logical instructions means on Broadwell, and
it doesn't appear to do anything sensible.

Fixes 270 Piglit tests (the bitand/bitor/bitxor tests with abs).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81157
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a2de656278)
2014-07-17 15:59:00 -07:00
Abdiel Janulgue
1a832e5846 i965/vec4: skip copy-propate for logical instructions with negated src entries
The negation source modifier on src registers has changed meaning in Broadwell when
used with logical operations. Don't copy propagate when negate src modifier is set
and when the destination instruction is a logical op.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit c17db7537f)
2014-07-17 15:59:00 -07:00
Abdiel Janulgue
d55a897929 i965/fs: skip copy-propate for logical instructions with negated src entries
The negation source modifier on src registers has changed meaning in Broadwell when
used with logical operations. Don't copy propagate when negate src modifier is set
and when the destination instruction is a logical op.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit 609d00e13e)
2014-07-17 15:59:00 -07:00
Abdiel Janulgue
276c6bb369 i965/fs: Refactor check for potential copy propagated instructions.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit a66660d2b7)
2014-07-17 15:59:00 -07:00
Marek Olšák
0273f22a10 radeonsi: add support for TXB2
This is needed by latest fixes for samplerCubeShadow with bias.
Otherwise, a crash occurs.
2014-07-17 14:29:10 -07:00
Marek Olšák
e731031372 radeonsi: fix samplerCubeShadow with bias
Pack the depth value before overwriting it with cube coordinates.

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit b279f0143f)

Conflicts:
	src/gallium/drivers/radeonsi/si_shader.c
2014-07-14 11:46:43 -07:00
Marek Olšák
906727dccb st/mesa: fix samplerCubeShadow with bias
It has 5 coordinates: (x,y,z,depth,lodbias)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit a11fff329e)
2014-07-14 11:43:22 -07:00
Brian Paul
d69c9114df gallium/u_blitter: fix some shader memory leaks
The _msaa shaders weren't getting freed.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 378fa34c7b)
2014-07-10 12:59:50 -07:00
Brian Paul
9b062d2020 st/mesa: fix geometry shader memory leak
Spotted by Charmaine Lee.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit d10204930f)
2014-07-10 12:59:34 -07:00
Brian Paul
37005cafa4 mesa: fix geometry shader memory leaks
Spotted by Charmaine Lee.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 176b64b811)
2014-07-10 12:59:19 -07:00
Marek Olšák
abe859d56e gallium: fix u_default_transfer_inline_write for textures
This doesn't fix any known issue. In fact, radeon drivers ignore all
the discard flags for textures and implicitly do "discard range"
for any write transfer.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit fe6be9926f)
2014-07-10 12:58:56 -07:00
Ilia Mirkin
1e6620997f nvc0/ir: use manual TXD when offsets are involved
Something about how we're implementing offsets for TXD is wrong, just
flip to the generic quadop-based implementation in that case.

This is the minimal fix appropriate for backporting.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 114d46829d)
2014-07-10 12:58:28 -07:00
Ilia Mirkin
9fd133747b nvc0/ir: do quadops on the right texture coordinates for TXD
handleTEX moves the layer as the first argument. This makes sure that
the quadops deal with the texture coordinates.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit afea9bae67)
2014-07-10 12:58:05 -07:00
Ilia Mirkin
5e1bfed1ca nv50/ir: ignore bias for samplerCubeShadow on nv50
Unfortunately there's no good way to do this on the nv50 shader isa.
Dropping the bias seems preferable to doing the compare post-filtering.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1065aa92f4)
2014-07-10 12:57:43 -07:00
Ilia Mirkin
0618881c82 nv50/ir: retrieve shadow compare from first arg
This can only happen with texture(samplerCubeShadow, bias), where the
compare will be in the first argument.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 30d91e0eec)
2014-07-10 12:57:17 -07:00
Carl Worth
d00d73d1e1 docs: Add sha256 checksums for the 10.2.3 release
This was not possible until the previous commit was complete, used for
building archives, and then tagged.
2014-07-07 16:17:21 -07:00
Carl Worth
33cb9f9503 docs: Add release notes for the 10.2.3 release.
Which is imminent.
2014-07-07 16:12:42 -07:00
Rob Clark
0186858227 freedreno/a3xx: vtx formats
Add support for more vertex buffer formats.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 06e9536e5f)

Squashed with:

freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit ba6a490bbc)
2014-07-07 16:09:36 -07:00
Rob Clark
b20c82f74c freedreno: fix for null textures
Some apps seem to give us a null sampler/view for texture slots which
come before the last used texture slot.  In particular 0ad triggers
this.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 6aeeb706d2)
2014-07-07 16:09:36 -07:00
Rob Clark
8f77fbb6af freedreno/a3xx: texture fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit aa78c4586d)

Squashed with:

freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 2456be63e9)
2014-07-07 16:09:21 -07:00
Rob Clark
afcb63802f freedreno: few caps fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>

(cherry picked from commit 286863939f)
2014-07-07 16:09:21 -07:00
Rob Clark
8b2d1068b5 freedreno/a3xx: fix blend opcode
Seems the opcodes are slightly different from a2xx.  Resync headers and
move blend_func() helper into hw generation specific code.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit a4d229b099)
2014-07-07 16:09:20 -07:00
Rob Clark
f96e3e5351 freedreno/a3xx: fix depth/stencil gmem restore
We already multiply by bytes per pixel for this, so f3ba7611 broke
mem2gmem for depth/stencil.  Drop the now-redundant mutiply by cpp.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit b81de5352d)
2014-07-07 16:09:20 -07:00
Rob Clark
55b6821a9f freedreno/a3xx: fix depth/stencil GMEM positioning
In cases where there was no color buf bound, there were inconsistancies
in register settings related to position of depth/stencil inside GMEM.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit f3ba761129)

Squashed with:

freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 4da8267c36)
2014-07-07 16:09:00 -07:00
Rob Clark
a1b7c7d88e freedreno: use OUT_RELOCW when buffer is written
These aren't buffers we ever read back from CPU, so using incorrect
reloc fxn wasn't really harming anything.  But might as well be correct.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 0d54904c04)
2014-07-03 21:26:09 -07:00
Rob Clark
ff9cea8776 xa: fix segfault
Fixes:

  Program received signal SIGSEGV, Segmentation fault.
  bind_samplers (comp=0x21b054, comp=0x21b054, ctx=0x211430)
      at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:445
  445						mask_pic->srf->tex->format);
  (gdb) bt
  #0  bind_samplers (comp=0x21b054, comp=0x21b054, ctx=0x211430)
      at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:445
  #1  xa_composite_prepare (ctx=0x211430, comp=comp@entry=0x21b054)
      at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:488
  #2  0xb6f454b4 in XAPrepareComposite (op=<optimized out>, pSrcPicture=<optimized out>,
      pMaskPicture=<optimized out>, pDstPicture=<optimized out>, pSrc=0x5b3ad8, pMask=0x0,
      pDst=0x5923b8) at msm-exa-xa.c:533

We can't yet handle solid fill mask, so explicitly reject that, rather
than segfaulting.  Otherwise DDX would need to check XA version to see
if solid fill mask were supported.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit b7e7ae9f60)
2014-07-03 21:25:50 -07:00
Ilia Mirkin
95ff8c6f18 nvc0: add a memory barrier when there are persistent UBOs
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9a37eb8adb)
2014-07-03 20:04:56 -07:00
Ilia Mirkin
e11b3f8fbc nv50: do an explicit flush on draw when there are persistent buffers
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5d4f5218bb)
2014-07-03 20:04:12 -07:00
Ilia Mirkin
da80e6a1c4 nv50: disable dedicated ubo upload method
The hardware allows multiple simultaneous renders with the same
memory-backed constbufs but with each invocation having different
values. However in order for that to work, the data has to be streamed
in via the right constbuf slot. We weren't doing that for UBOs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b2b7c65122)
2014-07-03 20:03:06 -07:00
Aaron Watry
5ba1cf1893 radeon/llvm: Allocate space for kernel metadata operands
Previously, we were assuming that kernel metadata nodes only had 1 operand.

Kernels which have attributes can have more than 1, e.g.:
!0 = metadata !{void (i32 addrspace(1)*)* @testKernel, metadata !1}
!1 = metadata !{metadata !"work_group_size_hint", i32 4, i32 1, i32 1}

Attempting to get the kernel without the correct number of attributes led
to memory corruption and luxrays crashing out.

Fixes the cl/program/execute/attributes.cl piglit test.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76223
CC: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 824197efd5)
2014-07-03 20:02:17 -07:00
Thomas Hellstrom
ff02e7995c st/xa: Don't close the drm fd on failure v2
If XA fails to initialize with pipe_loader enabled, the pipe_loader's
cleanup function will close the drm file descriptor. That's pretty bad
because the file descriptor will probably be the X server driver's only
connection to drm. Temporarily solve this by dup()'ing the file descriptor
before handing it over to the pipe loader.

This fixes freedesktop.org bugzilla bug #80645.

v2: Fix CC addresses.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
(cherry picked from commit 35cf3831d7)

Conflicts:
	src/gallium/state_trackers/xa/xa_tracker.c
2014-07-03 18:42:46 -07:00
Michel Dänzer
ee4274c393 radeon/llvm: Use the llvm.rsq.clamped intrinsic for RSQ
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Laurent Carlier <lordheavym@gmail.com>

https://bugs.freedesktop.org/show_bug.cgi?id=80015

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b9f501bc6b)

Squashed together with the earlier:

radeon/llvm: Adapt to AMDGPU.rsq intrinsic change in LLVM 3.5

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 93b6b1fa83)
2014-07-03 18:39:43 -07:00
Carl Worth
7b21ee08db cherry-ignore: Add a patch that's been rejected
It may be that the patch is just fine, but at the very least it needs a better
commit message.
2014-07-03 18:36:06 -07:00
Kenneth Graunke
f9718e4b93 i965/disasm: Fix INTEL_DEBUG=fs on Broadwell for ARB_fp applications.
Apparently INTEL_DEBUG=fs has crashed on Broadwell for anything using
ARB_fragment_program since commit 9cee3ff5.  We need to NULL-check the
right field.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c60a4ba7e3)

Conflicts:
	src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
2014-07-03 18:35:54 -07:00
Jasper St. Pierre
3ca2119593 glxext: Send the Drawable's ID in the GLX_BufferSwapComplete event
While the official INTEL_swap_event specification says that the drawable
field should contain the GLXDrawable, not the Drawable, the existing
DRI2 code in dri2.c that translates from DRI2_BufferSwapComplete sends out
GLX_BufferSwapComplete with the Drawable's ID, so existing codebases
like Clutter/Cogl rely on getting the Drawable.

Match DRI2's error here and stuff the event with the X Drawable, not
the GLX drawable.

This fixes apps seeing wrong drawables through an indirect GLX context
or with DRI3, which uses the GLX_BufferSwapComplete event directly on
the wire instead of translates Present in mesa.

At the same time, also modify the structure for the event to make sure
that clients don't make the same mistake. This is not an API or ABI
break, as GLXDrawable and Drawable are both typedefs for XID.

Signed-off-by: Jasper St. Pierre <jstpierre@mecheye.net>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b4dcf87f34)
2014-07-03 18:03:11 -07:00
Kenneth Graunke
ad79d7e987 i965: Include marketing names for Broadwell GPUs.
Intel would like us to include the marketing names.  Developers
additionally want "Broadwell GT1/2/3" because it makes it easier
to identify what hardware users have when they request assistance
or report issues.

Including both makes it easy for everyone to map between the names.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 05126b9bb5)
2014-07-03 18:02:00 -07:00
Takashi Iwai
9bd6dc9371 llvmpipe: Fix zero-division in llvmpipe_texture_layout()
Fix the crash of "gnome-control-center info" invocation on QEMU where
zero height is passed at init.

(sroland: simplify logic by eliminating the div altogether, using 64bit mul.)

Fixes: https://bugzilla.novell.com/show_bug.cgi?id=879462

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6b8b17153a)
2014-07-03 18:01:14 -07:00
Ilia Mirkin
89e3b89796 nouveau: dup fd before passing it to device
nouveau screens are reused for the same device node. However in the
scenario where we create screen 1, screen 2, and then delete screen 1,
the surrounding code might also close the original device node. To
protect against this, dup the fd and use the dup'd fd in the
nouveau_device. Also tell the nouveau_device that it is the owner of the
fd so that it will be closed on destruction.

Also make sure to free the nouveau_device in case of any failure.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79823
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@ubuntu.com>
(cherry picked from commit a59f2bb17b)
2014-07-03 18:00:37 -07:00
Tobias Klausmann
bcff69f18f nv50/ir: allow gl_ViewportIndex to work on non-provoking vertices
Previously, if we had something like:

  gl_ViewportIndex = idx;
  for(int i = 0; i < gl_in.length(); i++) {
     gl_Position = gl_in[i].gl_Position;
     EmitVertex();
  }
  EndPrimitive();

The right viewport index would not be set on the primitive because the
last vertex is the provoking one. However blob drivers appear to move
the gl_ViewportIndex write into the for loop, allowing the application
to be ignorant of this detail.

While the application is technically wrong here, because the blob does
it and other drivers appear to implicitly work this way as well, we add
a buffer register that viewport index writes go into, which is then
exported before every EmitVertex() call.

This fixes the remaining piglit tests in ARB_viewport_array for nv50/nvc0.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 98a86f61a8)
2014-07-03 17:46:01 -07:00
Roland Scheidegger
6ae4aff303 draw: (trivial) fix clamping of viewport index
The old logic would let all negative values go through unclamped, with
potentially disastrous results (probably trying to fetch viewport values
from random memory locations). GL has undefined rendering for vp indices
outside valid range but that's a bit too undefined...
(The logic is now the same as in llvmpipe.)

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 604e54de78)
2014-07-03 17:44:50 -07:00
Kenneth Graunke
500849f9cf i965: Don't emit SURFACE_STATEs for gather workarounds on Broadwell.
As far as I can tell, Broadwell doesn't need any of the SURFACE_STATE
workarounds for textureGather() bugs, so there's no need to emit
a second set of identical copies.

To keep things simple, just point the gather surface index base to the
same place as the texture surface index base.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f6a99d1167)
2014-07-03 17:44:43 -07:00
Carl Worth
05add05438 docs: Add sha256 sums for the 10.2.2 release
Which, of course, we couldn't do until making the release files and tagging
the release.
2014-06-24 21:45:41 -07:00
Carl Worth
623e68fb1b docs: Add release notes for 10.2.2 release
Which is ready to go.
2014-06-24 21:30:02 -07:00
Carl Worth
a9750ff7b5 Update VERSION to 10.2.2
In preparation for the 10.2.2 release.
2014-06-24 21:26:25 -07:00
Ville Syrjälä
274be620a8 i915: Fix gen2 texblend setup
Fix an off by one in the texture unit walk during texblend
setup on gen2. This caused the last enabled texunit to be
skipped resulting in totally messed up texturing.

This is a regression introduced here:
 commit 1ad443ecdd
 Author: Eric Anholt <eric@anholt.net>
 Date:   Wed Apr 23 15:35:27 2014 -0700

    i915: Redo texture unit walking on i830.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit ca55a1aaa7)
2014-06-23 15:04:35 -07:00
Kenneth Graunke
5751b661ad i965: Save meta stencil blit programs in the context.
When the last context in a share group is destroyed, the hash table
containing all of the shader programs (ctx->Shared->ShaderObjects) is
destroyed, throwing away all of the shader programs.

Using a static variable to store program IDs ends up holding on to them
after this, so we think we still have a compiled program, when it
actually got destroyed.  _mesa_UseProgram then hits GL errors, since no
program by that ID exists.

Instead, store the program IDs in the context, so we know to recompile
if our context gets destroyed and the application creates another one.

Fixes es3conform tests when run without -minfmt (where it creates
separate contexts for testing each visual).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77865
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a20994d616)

Conflicts:
	src/mesa/drivers/dri/i965/brw_meta_stencil_blit.c
2014-06-23 15:04:11 -07:00
Daniel Manjarres
282ca8ba98 glx: Don't crash on swap event for a Window (non-GLXWindow)
Prior to GLX 1.3 there was the glxMakeCurrent() function that took a
single drawable handle. The Drawable could be either a bare XID for a
Window or an XID for a glxpixmap.

GLX 1.3 added glxMakeContextCurrent that takes 2 handles: one for
reading, one for writing. Nowadays the old glxMakeCurrent call is
implemented as a call to glxMakeContextCurrent with the single handle
duplicated.

Because of this it is allowed to use a plain-old Window ID as an
argument to glxMakeContextCurrent, although nobody really documents this
sort of thing. The manpage for the NEW call specifies the arguments as
GLXPixmaps, but the actual code accepts Window XIDs too, and handles
them correctly.

Similarly, the glxSelectEvents function can also take a bare Window XID.

The "piglit" tests all use GLXWindows and/or GLXPixmaps. You never
tested swap events with a bare Window XID. That is what my app was
doing.

The swap_events code worked with Window XIDs in mesa 7.x.y. The new code
added in versions 8, 9, and 10 assumes that all buffer swap events have
a GLXPixmap associated with them. Because of the historical quirks
above, this is not true. Swap events for bare Window XIDs do NOT have a
glxpixmap resulting in a segfault.

Any app that uses the old school glxMakeCurrent call with a Window XID
while trying to use swap_events will crash when the libs try to lookup
the nonexistent GLXPixmap associated with the incoming swap event.

I believe that the people who wrote the spec overlooked this, because
the "sbc" field comes from the OML_sync extension that is defined in
terms of glxpixmaps only.

v2 (idr): Formatting changes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54372
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 86bd2196b4)
2014-06-23 15:01:16 -07:00
Iago Toral Quiroga
c50fa76c7e mesa: Copy Geom.UsesEndPrimitive when cloning a geometry program.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 96a95f48ea)
2014-06-23 15:01:06 -07:00
Tom Stellard
ad9264366a clover: Don't use llvm's global context
An LLVMContext should only be accessed by a single and using the global
context was causing crashes in multi-threaded environments.  Now we use
a separate context for each compile.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4aa128a123)
2014-06-23 15:00:49 -07:00
Tom Stellard
855adad132 clover: Prevent Clang from printing number of errors and warnings to stderr.
https://bugs.freedesktop.org/show_bug.cgi?id=78581

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0cc391f013)
2014-06-23 15:00:37 -07:00
Ilia Mirkin
3568cf8128 nv30: hack to avoid errors on unexpected color/zeta combinations
This is just a hack, it should be possible to create a temporary zeta
surface and render to that instead. However that's more complicated and
this avoids the render being entirely broken and errors being reported
by the card.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 25182e249e)
2014-06-23 15:00:14 -07:00
Ilia Mirkin
aca2d98c35 nv30: avoid dangling references to deleted contexts
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c092c46b27)
2014-06-23 15:00:00 -07:00
Ilia Mirkin
08317fa9c4 nv30: plug some memory leaks on screen destroy and shader compile
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5af80f6268)
2014-06-23 14:59:47 -07:00
Ian Romanick
4d0c445af6 meta: Respect the driver's maximum number of draw buffers
Commit c1c1cf5f9 added infrastructure for saving and restoring draw
buffer state.  However, it universially used MAX_DRAW_BUFFERS, but many
drivers support far fewer than that at limit.  For example, the radeon
and i915 drivers only support 1.  Using MAX_DRAW_BUFFERS causes meta to
generate GL errors.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80115
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org> [on Broadwell]
Tested-by: jpsinthemix@verizon.net
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cc219d1d65)
2014-06-23 14:59:28 -07:00
Kristian Høgsberg
9ad103d664 mesa: Remove glClear optimization based on drawable size
A drawable size of 0x0 means that we don't have buffers for a drawable yet,
not that we have a zero-sized buffer.  Core mesa shouldn't be optimizing out
drawing based on buffer size, since the draw call could be what triggers
the driver to go and get buffers.  As discussed in the referenced bug report,
the optimization was added as part of a scatter-shot attempt to fix a
different problem.  There's no other example in mesa core of using the
buffer size in this way.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74005
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7928b946ad)
2014-06-23 14:59:06 -07:00
Grigori Goronzy
12fcbcde47 radeon/uvd: disable VC-1 simple/main on UVD 2.x
It's about as broken as on later UVD revisions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66452
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 6cd30f5d73)
2014-06-23 14:58:46 -07:00
Ilia Mirkin
d8e3158a43 nv50: make sure to mark first scissor dirty after blit
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit af05270ccf)
2014-06-23 14:58:33 -07:00
Kenneth Graunke
ef5f998b76 i965: Use 8x4 aligned rectangles for HiZ operations on Broadwell.
Like on Haswell, we need to use 8x4 aligned rectangle primitives for
hierarchical depth buffer resolves and depth clears.  See the comments
in brw_blorp.cpp's brw_hiz_op_params() constructor.  (The Broadwell
documentation confirms that this is still necessary.)

This patch makes the Broadwell code follow the same behavior as Chad and
Jordan's Gen7 BLORP code.  Based on a patch by Topi Pohjolainen.

This fixes es3conform's framebuffer_blit_functionality_scissor_blit
test, with no Piglit regressions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 49659ad90c)
2014-06-23 14:58:22 -07:00
Kenneth Graunke
31dd2a6f18 i965/vec4: Use the sampler for pull constant loads on Broadwell.
We've used the LD sampler message for pull constant loads on earlier
hardware for some time, and also were already using it for the FS on
Broadwell.  This patch makes us use it for Broadwell VS/GS as well.

I believe that when I wrote this code in 2012, we still used the data
port in some cases, and I somehow neglected to convert it while
rebasing.

Improves performance in GLBenchmark 2.7 Egypt by 416.978% +/- 2.25821%
(n = 17).  Many other applications should benefit similarly: this speeds
up uniform array access in the VS, which is commonly used for skinning
shaders, among other things.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Ben Widawsky <ben@bwidawsk.net>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5d8e246ac8)
2014-06-23 14:57:56 -07:00
Kenneth Graunke
c07485eab1 i965: Add missing newlines to a few perf_debug messages.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 847abaccc0)
2014-06-23 14:57:27 -07:00
Kenneth Graunke
3b941857ee i965: Drop Broadwell perf_debugs about missing MOCS that aren't missing.
I actually added MOCS support for these things, but forgot to delete the
corresponding perf_debug() warnings.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d053a05ef3)
2014-06-23 14:56:50 -07:00
Kenneth Graunke
6b753df1f4 i965: Add missing MOCS setup for 3DSTATE_INDEX_BUFFER on Broadwell.
Somehow I missed this when adding all of the other MOCS values.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7f256c1c70)
2014-06-23 14:56:31 -07:00
Kenneth Graunke
01a79ac679 i965/vec4: Fix dead code elimination for VGRFs of size > 1.
When faced with code such as:

    mov vgrf31.0:UD, 960D
    mov vgrf31.1:UD, vgrf30.xxxx:UD

The dead code eliminator didn't consider reg_offsets, so it decided that
the second instruction was writing was writing to the same register as
the first one, and eliminated the first one.  But they're actually
different registers.

This fixes INTEL_DEBUG=shader_time for vertex shaders.  In the above
code, vgrf31.0 represents the offset into the shader_time buffer where
the data should be written, and vgrf31.1 represents the actual time
data.  With a completely undefined offset, results were...unexpected.

I think this is probably one of the few cases (maybe only case) where we
generate multiple MOVs to a large VGRF.  Normally, we just use them as
texturing results; the other SEND-from-GRF uses a size 1 VGRF.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79029
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit d0575d98fc)
2014-06-23 14:56:11 -07:00
Jason Ekstrand
83be6a5517 meta_blit: properly compute texture width for the CopyTexSubImage fallback
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit ffe609cc69)
2014-06-23 14:55:48 -07:00
Emil Velikov
348125e7f7 configure: correctly autodetect xvmc/vdpau/omx
Commit e62b7d38a1 (configure: autodetect video state-trackers
when non swrast driver is present) added a check that caused
the autodetection to be omitted when we have the swrast gallium
driver. Whereas it should have skipped the VL targets when only
swrast was selected.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79907
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 816d392b58)
2014-06-23 11:47:24 -07:00
Neil Roberts
126600c918 i965: Set the fast clear color value for texture surfaces
When a multisampled texture is used for sampling the fast clear color value
needs to be programmed into the surface state. This was being left as all
zeroes so if the surface was cleared to a value other than black then it
wouldn't work properly. This doesn't matter for single-sample textures because
in that case the MCS buffer is resolved before it is used as a texture source.

https://bugs.freedesktop.org/show_bug.cgi?id=79729

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 765efeef88)
2014-06-23 11:47:04 -07:00
Kenneth Graunke
ee2035a95f i965: Invalidate live intervals when inserting Gen4 SEND workarounds.
We need to invalidate the live intervals when inserting new
instructions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 237aac39b1)
2014-06-23 11:46:43 -07:00
Kenneth Graunke
07a6f8bcab i965: Don't use the head sentinel as an fs_inst in Gen4 workaround code.
When walking backwards, we want to stop at the head sentinel, which is
where scan_inst->prev->prev == NULL, not scan_inst->prev == NULL.

Fixes random crashes, as well as valgrind errors.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ecc78eab11)
2014-06-23 11:46:17 -07:00
Michel Dänzer
1d46c58b83 configure: Only check for OpenCL without LLVM when the latter is certain
LLVM is enabled by default for some architectures, but the test was failing
before that.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2d399bb183)
2014-06-23 11:45:48 -07:00
Adrian Negreanu
f7fd6e52ec android, dricore: undefined reference to _mesa_streaming_load_memcpy
_mesa_streaming_load_memcpy is defined in main/streaming-load-memcpy.c
I'm adding it to the dricore lib

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit 357a8b6f33)
2014-06-23 11:43:11 -07:00
Adrian Negreanu
a46fa0f9de android, mesa_gen_matypes: pull in timespec POSIX definition
This fixes:
  include/c11/threads_posix.h: In function 'cnd_timedwait':
  include/c11/threads_posix.h:140:21: error: storage size of 'abs_time' isn't known

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit 6eb3888c86)
2014-06-23 11:43:02 -07:00
Adrian Negreanu
f4a19c1e2c android, egl: typo dri2_fallback_pixmap_surface -> dri2_fallback_create_pixmap_surface
I used commit bc8b07a6 as reference, and only the droid_display_vtbl had this issue.

This fixes:
src/egl/drivers/dri2/platform_android.c:641:29:
  error: 'dri2_fallback_pixmap_surface' undeclared here (not in a function)

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit 6980cae6ae)
2014-06-23 11:42:53 -07:00
Adrian Negreanu
bed18b082a android, egl: add correct drm include for libmesa_egl_dri2
Fixes:
  src/egl/drivers/dri2/platform_android.c:38:
  include/GL/internal/dri_interface.h:51:17:
    fatal error: drm.h: No such file or directory

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit 4dc5545eff)
2014-06-23 11:42:42 -07:00
Adrian Negreanu
43752c3c37 android: add src/gallium/auxiliary as include path for libmesa_dricore
This fixes:
In file included from
/home/adrian/workspace/mesa/mesa-master.git/src/mesa/vbo/vbo_exec_api.c:445:0:
/home/adrian/workspace/mesa/mesa-master.git/src/mesa/vbo/vbo_attrib_tmp.h:28:38:
fatal error: util/u_format_r11g11b10f.h: No such file or directory

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit 0048483f73)
2014-06-23 11:40:25 -07:00
Adrian Negreanu
aa03f78fc8 android: add libloader to libGLES_mesa and libmesa_egl_dri2
This fixes
  src/egl/drivers/dri2/platform_android.c:664: error: undefined reference to 'loader_set_logger'
  src/egl/drivers/dri2/platform_android.c:678: error: undefined reference to 'loader_get_driver_for_fd'

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit a49ebfab1d)
2014-06-23 11:40:14 -07:00
Adrian Negreanu
6194593661 android: adapt to the megadriver mechanism
Fixes linker error:
  ld:
  .../libmesa_dri_common_intermediates/libmesa_dri_common.a(dri_util.o):
    in function globalDriverAPI:dri_util.c(.data.rel+0x0): error:
    undefined reference to 'driDriverAPI'

As an example, you can see that mesa_dri_drivers
also uses common/libmegadriver_stub (src/mesa/drivers/dri/Makefile.am)

The _stub part might be confusing, but
it actually provides the dri-driver shared lib constructor,
megadriver_stub_init, which will later on load the real
platform dependent part and call
l __driDriverGetExtensions_<platform>

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit aba0f152be)
2014-06-23 11:39:59 -07:00
Adrian Negreanu
7654120e86 add megadriver_stub_FILES
So that android part can also use $(megadriver_stub_FILES)

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit eb3f80dbba)
2014-06-23 11:39:43 -07:00
Emil Velikov
d6d80b44c4 configure: error out when building opencl without LLVM
Cc: Tom Stellard <thomas.stellard@amd.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit 93257a56b5)
2014-06-23 11:39:28 -07:00
José Fonseca
a5d00e243c mesa/main: Prevent sefgault on glGetIntegerv(GL_ATOMIC_COUNTER_BUFFER_BINDING).
A recent ApiTrace change, that tries to dump more buffer state
causes Mesa from my distro (10.1.4) to segfaults here.

I haven't actually confirm this fixes it (I can't repro on master),
but it seems a good idea to be defensive here anyway.

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit eb58aa9cf0)
2014-06-23 11:39:15 -07:00
Ilia Mirkin
bfff355cef gk110/ir: fix bfind emission
There is a short-immediate version as well, but it should never end up
getting used since it would have gotten folded earlier.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bd7dd3ed06)
2014-06-23 11:38:58 -07:00
Ilia Mirkin
1e1bdee5ec gk110/ir: fix emitting constbuf file index
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7a67318794)
2014-06-23 11:38:38 -07:00
Ilia Mirkin
9e50fc3812 gk110/ir: emit saturate flag on fadd when needed
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4a3a71a183)
2014-06-23 11:38:10 -07:00
Emil Velikov
8c319b3f98 targets/xa: limit the amount of exported symbols
In the presence of LLVM the final library exports every symbol from
the llvm namespace. Resolve this by using a version script (w/o the
version/name tag).

Considering that there are only ~35 symbols, explicitly list them
to minimize the chances of rogue symbols sneaking in.

v2: Conditionally include the version-script.

Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> (v1)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit a75baba2f1)
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2014-06-16 15:32:16 +02:00
Ian Romanick
70ce1031e7 docs: Add MD5 checksum, etc. for 10.2.1 release
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-06 23:28:53 -07:00
Ian Romanick
8c4845d29b docs: Add initial 10.2.1 release notes
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-06 23:20:00 -07:00
Ian Romanick
1b69ea1c6d Bump version to 10.2.1 2014-06-06 23:20:00 -07:00
Ian Romanick
c2fc9fb907 radeonsi: Fix build error introduced in 5ab9a9c
While resolving conflicts in cherry picking commit d226191, I
accidentally introduced some garbage.  Because radeonsi isn't built by
default, the problem went unnoticed by me.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reported-by: Laurent Carlier <lordheavym@gmail.com>
Tested-by: Laurent Carlier <lordheavym@gmail.com>
2014-06-06 23:19:53 -07:00
Ian Romanick
28d41e409d docs: Add MD5 checksum, etc. for 10.1 release
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-06 21:17:02 -07:00
Ian Romanick
f836ef63fd Bump version to 10.2 (final)
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-06 20:40:00 -07:00
Ilia Mirkin
99b9a0973a gk110/ir: fix slct emission
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9fef8b3d81)
2014-06-06 20:40:00 -07:00
Ilia Mirkin
d36d53b564 gk110/ir: fix interp mode emission
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d588a4919b)
2014-06-06 18:40:58 -07:00
Ilia Mirkin
283cd12933 nvc0: don't bother trying to set up compute for gk110+
The nouveau fw currently prints a bunch of errors. No point in seeing
those all the time, esp since compute doesn't really work in the first
place.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Conflicts:
	src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
(cherry picked from commit ca65fc418f)
2014-06-06 18:40:21 -07:00
Ilia Mirkin
aa8ea648f4 gk110: add in forgotten code for gk110 isa
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Conflicts:
	src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
(cherry picked from commit b9ec766bd0)
2014-06-06 18:37:07 -07:00
Ilia Mirkin
e901f40764 gk110/ir: fix ISAD emission with register args
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ed1b9e5721)
2014-06-06 18:19:45 -07:00
Ilia Mirkin
d5e47ee66b gk110/ir: fix quadon opcode emission
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6e046508a1)
2014-06-06 18:19:10 -07:00
Ilia Mirkin
932a5dadda gk110/ir: emit texbar the same way that the blob does
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 73eec47ef8)
2014-06-06 18:14:50 -07:00
Tobias Klausmann
203bc289a0 nv50/ir: clear subop when folding constant expressions
Some operations (e.g. OP_MUL/OP_MAD/OP_EXTBF) might have a subop set.
After folding, make sure that it is cleared

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3164bfc734)
2014-06-06 18:14:22 -07:00
Kenneth Graunke
11b3011805 i965: Support GL_CLAMP natively on Broadwell.
The new hardware actually supports this OpenGL 1.x feature natively,
so we can finally drop our shader workarounds.

Not many applications use GL_CLAMP, and most use it unintentionally, but
it's trivial to do right, so we should.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 221169693b)
2014-06-06 18:13:03 -07:00
Kenneth Graunke
c62bc58cce i965: Pass brw to translate_wrap_mode().
This lets us do generation checks.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7f3d64a77b)
2014-06-06 18:12:20 -07:00
Kenneth Graunke
304e80e356 i965: Fix copy and pasted values in Broadwell code.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7913b4b97b)
2014-06-06 18:11:54 -07:00
Sinclair Yeh
f4aca6868a egl: Check for NULL native_window in eglCreateWindowSurface
We have customers using NULL as a way to test the robustness of the API.
Without this check, EGL will segfault trying to dereference
dri2_surf->wl_win->private because wl_win is NULL.

This fix adds a check and sets EGL_BAD_NATIVE_WINDOW

v2: Incorporated feedback from idr - moved the check to a higher level
function.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 91ff0d4c65)
2014-06-06 18:11:30 -07:00
Marek Olšák
5ab9a9c0cc r600g,radeonsi: don't use hardware MSAA resolve if dst is fast-cleared
It doesn't work and our docs say so too.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit d226191820)
2014-06-06 18:08:23 -07:00
Marek Olšák
ae16f443c2 r600g,radeonsi: disable fast clear if render condition is on
For some reason, CP DMA doesn't follow the predicate bit if I enable it,
so this is the only option.

This fixes piglit: spec/NV_conditional_render/clear

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit bf701a84eb)
2014-06-06 18:03:10 -07:00
José Fonseca
b8241bb3f2 mesa: Make glGetIntegerv(GL_*_ARRAY_SIZE) return GL_BGRA.
Same as b026b6bbfe, but
COLOR_ARRAY_SIZE/SECONDARY_COLOR_ARRAY_SIZE.

Ideally we wouldn't munge the incoming state, so that we wouldn't need
to unmunge it back on glGet*.  But the array size state is copied and
referred in many places, many of which couldn't take an GLenum like
GL_BGRA instead of a plain integer.  So just hack around on glGet*,
to ensure there is no risk of introducing regressions elsewhere.

This bug causes problems to Apitrace, resulting in wrong traces.  See
https://github.com/apitrace/apitrace/issues/261 for details.

Tested with piglit arb_vertex_array_bgra-get, which was created for this
purpose.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e3e13d6b85)
2014-06-06 17:54:32 -07:00
José Fonseca
224c193237 mesa/main: Make get_hash.c values constant.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 53468dee03)
2014-06-06 17:35:45 -07:00
Beren Minor
494f916125 egl/main: Fix eglMakeCurrent when releasing context from current thread.
EGL 1.4 Specification says that
eglMakeCurrent(display, EGL_NO_SURFACE, EGL_NO_SURFACE, EGL_NO_CONTEXT)
can be used to release the current thread's ownership on the surfaces
and context.

MESA's egl implementation was only accepting the parameters when the
KHR_surfaceless_context extension is supported.

[chadv] Add quote from the EGL 1.4 spec.
Cc: "10,1, 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 0ca0d5743f)
2014-06-06 17:15:51 -07:00
Marek Olšák
767bc05309 Revert "glx: load dri driver with RTLD_LOCAL so dlclose never fails to unload"
This reverts commit e3cc0d90e1.

It breaks too many apps and completely breaks my desktop too.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79469

We'll probably need to re-release all stable versions after this is committed.

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0d5ec2c615)
2014-06-06 17:13:03 -07:00
Roland Scheidegger
3aaae6056e llvmpipe: fix crash when not all attachments are populated in a fb
Framebuffers can have NULL attachments since a while. llvmpipe handled
that properly for lp_rast_shade_quads_mask but it seems the change didn't
make it to lp_rast_shade_tile.
This fixes piglit fbo-drawbuffers-none test (though I need to increase
the FB_SIZE from 32 to 256 so the tris cover some tiles fully).
https://bugs.freedesktop.org/show_bug.cgi?id=79421

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 576868140b)
2014-06-06 17:06:55 -07:00
Ian Romanick
8b71741222 Bump version to 10.2-rc5
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-05-30 17:11:47 -07:00
Lubomir Rintel
15ec4ef0da i915: add a missing NULL pointer check
mesaVisual can be NULL with configless context since this commit:

    commit 551d459af4
    Author: Neil Roberts <neil@linux.intel.com>
    Date:   Fri Mar 7 18:05:47 2014 +0000

    Add the EGL_MESA_configless_context extension
...
    Previously the i965 and i915 drivers were explicitly creating a zeroed visual
    whenever 0 is passed for the EGLConfig.

We attempt to dereference the visual in i915 and now we don't create a
zeroed-out one one it crashes, breaking at least weston in an i915. There's
no point in doing so as it would be zero anyway.

v2: Fixed a typo in commit message.  Added some tags.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1100967
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 90b5747856)
2014-05-30 17:11:47 -07:00
Ian Romanick
9fde5670e2 glapi: Duplicate GLES1 prototypes in glapi_dispatch.c
These prototypes are necessary because GLES1 library builds will create
dispatch functions for them.  We can't directly include GLES/gl.h
because it would conflict the previously-included GL/gl.h.  Since GLES1
ABI is not expected to every add more functions, the path of least
resistance is to just duplicate the prototypes for the functions that
aren't already in desktop OpenGL.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79294
Acked-by: Matt Turner <mattst88@gmail.com>
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7b1aeec9cd)
2014-05-30 17:11:47 -07:00
Ilia Mirkin
76e112380a nvc0: revert mistaken logic to collapse color outputs to the beginning
In commit af38ef907, I added a "fix" to color outputs not being assigned
correctly when sample mask was being output. This was totally wrong --
the color indices (i.e. "si" values) were the ones that were wrong. Undo
that hunk.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 0d699530ff)

Requested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-30 17:11:15 -07:00
Ilia Mirkin
8ac81e5b66 mesa/st: fix color outputs in presence of sample mask output
Commit c5d822dad9 added support for sample mask incorrectly. It became
treated as a color output, and messed up the color output indices.
Revert the hunk that did that, and add explicit support just like for
depth/stencil writes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit ab7bd7093d)

Requested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-30 17:11:15 -07:00
Rob Clark
6d23a0b2a6 configure: fix build error with XA
Fixes:

xa_tracker.c: In function 'xa_tracker_create':
 xa_tracker.c:147:5: error: implicit declaration of function 'pipe_loader_drm_probe_fd' [-Werror=implicit-function-declaration]

in some build configurations, as XA now implicitly depends on
gallium_drm_loader.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
(cherry picked from commit 20d14ef263)

Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=511700
Requested-by: Matt Turner <mattst88@gmail.com>
2014-05-30 17:11:15 -07:00
Pavel Popov
8f984928cc i965: Fix Line Stipple enable bit in 3DSTATE_SF for Haswell.
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Pavel Popov <pavel.e.popov@intel.com>
(cherry picked from commit d292d40207)
2014-05-30 17:11:15 -07:00
Jerome Glisse
7ab2363c11 glx: load dri driver with RTLD_LOCAL so dlclose never fails to unload
There is no reason anymore to load with RTLD_GLOBAL and for some driver
this even result in dlclose failing to unload leading to catastrophic
failure with swrast fallback.

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
(cherry picked from commit e3cc0d90e1)
2014-05-29 15:48:53 -07:00
Brian Paul
55b9effa4a glsl: fix use-after free bug/crash in ast_declarator_list::hir()
The call to get_variable_being_redeclared() may delete 'var' so we
can't reference var->name afterward.  We fix that by examining the
var's name before making that call.

Fixes valgrind warnings and possible crash when running the piglit
tests/spec/glsl-1.30/execution/clipping/vs-clip-distance-in-param.shader_test
test (and probably others).

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit f9cecca7a6)
2014-05-29 15:48:02 -07:00
Kenneth Graunke
5347fc5295 i965: Fix repeated usage of rectangle texture coordinate scaling.
Previously, we set up new entries in the params[] array on every access
of a rectangle texture.  Unfortunately, we only reserve space for
(2 * MaxTextureImageUnits) extra entries, so programs which accessed
rectangle textures more times than that would write off the end of the
array and likely crash.

We don't really have a decent mapping between the index returned by
_mesa_add_state_reference and our index into the params array, so we
have to manually search for it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78691
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit bb9623a1a8)
2014-05-29 15:47:29 -07:00
Topi Pohjolainen
e8e48889e6 meta/blit: Use gl_FragColor also in the msaa blit shader
Fixes framebuffer_blit_functionality_multisampled_to_singlesampled_blit
es3 cts test on bdw. Also fixes this on ivb when ivb is forced to use
the meta path.

No piglit regressions on IVB.

Further input from Ken:

 "Unfortunately, this doesn't fix MRT for integer data.

  In the single-sampled case, since we're directly copying data, we were
  read/copy/write data as "float" values, which actually contained the
  integer bits.  Here, we can't do that since we need to process the
  actual integer data.

  I do wonder if we could use intBitsToFloat/uintBitsToFloat to stuff the
  integer bits in the float gl_FragColor output.  Just a crazy idea.

  In the long term (post 10.2), I think we should draft an extension that
  allows you to do "layout(location = all)" on user-defined fragment
  shader outputs.  (Or some similar syntax.)"

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit a6022e5405)
2014-05-29 15:46:26 -07:00
Topi Pohjolainen
af3d4eddc1 i965/meta: Store stencil texturing mode
Meta path needs to keep the current texture object's state. Fixes
the following gles3 cts tests on bdw:

framebuffer_blit_functionality_negative_width_blit.test: fail
framebuffer_blit_functionality_all_buffer_blit.test: fail
framebuffer_blit_functionality_negative_height_blit.test: fail
framebuffer_blit_functionality_missing_buffers_blit.test: fail
framebuffer_blit_functionality_negative_dimensions_blit.test: fail
framebuffer_blit_functionality_minifying_blit.test: fail
framebuffer_blit_functionality_magnifying_blit.test: fail

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 57730d67f6)
2014-05-29 15:45:43 -07:00
Topi Pohjolainen
75ae4fff35 meta/blit: Add stencil texturing mode save and restore
v2 (Ken): Only restore the mode if it has changed.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit c246828c4d)
2014-05-29 15:44:45 -07:00
Matt Turner
c984e5bd2e Revert "i965: Don't make instructions with a null dest a barrier to scheduling."
This reverts commit 42a26cb5e4.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78648
(cherry picked from commit 0d3f83f4ad)
2014-05-29 15:44:09 -07:00
Matt Turner
ca6b38b80a Revert "i965/fs: Simplify interference scan in register coalescing."
This reverts commit 5ff1e446d4.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77704
(cherry picked from commit a39428cf5c)
2014-05-29 15:42:43 -07:00
Matt Turner
b814afeb6c Revert "i965/fs: Give up in interference check if we see a WHILE."
This reverts commit 55de1c035c.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fc025a6719)
2014-05-29 15:41:53 -07:00
Matt Turner
17c7ead727 Revert "i965/fs: Reduce restrictions on interference in register coalescing."
This reverts commit f770123f58.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78692
(cherry picked from commit ccb1ea8a15)
2014-05-29 15:40:55 -07:00
Emil Velikov
2a29dbdc6e glx: do not leak dri3Display
v2: Do not wrap the code in ifdef HAVE_DRI3 (suggested by Keith)

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Cc: Keith Packard <keithp@keithp.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit eb2241f8a9)
2014-05-29 15:40:09 -07:00
Matt Turner
03e93f6079 Revert "i965/fs: Change fs_visitor::emit_lrp to use MAC for gen<6"
This reverts commit a6860100b8.

Why this code didn't work in all circumstances is unknown and without a
working Ironlake simulator (which uses a different AUB format) we'll
probably never know, short of a lot of experimentation, and spending a
bunch of time to try to optimize a few instructions on Ironlake is not
time well spent.

Moreover, for mix(vec4, vec4, vec4) using the accumulator introduces a
dependence between the otherwise independent per-component calculations.
Not using the accumulator, even if it means an extra instruction per
component might be preferable. We don't know, we don't have data, and
we don't have the necessary register on Ironlake for shader_time to tell
us.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77707
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit c2c639ecf6)
2014-05-29 15:17:53 -07:00
Matt Turner
bc4b9467af Revert "i965/vec4: Change vec4_visitor::emit_lrp to use MAC for gen<6"
This reverts commit 2dfbbeca50 with the
comment about MAC and implicit accumulator removed.

Why this code didn't work in all circumstances is unknown and without a
working Ironlake simulator (which uses a different AUB format) we'll
probably never know, short of a lot of experimentation, and spending a
bunch of time to try to optimize a few instructions on Ironlake is not
time well spent.

Moreover, for mix(vec4, vec4, vec4) using the accumulator introduces a
dependence between the otherwise independent per-component calculations.
Not using the accumulator, even if it means an extra instruction per
component might be preferable. We don't know, we don't have data, and
we don't have the necessary register on Ironlake for shader_time to tell
us.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77703
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit db42dd8952)
2014-05-29 15:17:28 -07:00
Christoph Bumiller
7efdc55f5f nv50/ir/tgsi: optimize KIL
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d479713d25)
2014-05-29 15:16:56 -07:00
Christoph Bumiller
9ea859931e nv50/ir: fix lowering of predicated instructions (without defs)
Note that predicated instructions with defs are still not supported
because transformation to SSA doesn't handle them yet.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 452a4151aa)
2014-05-29 15:16:24 -07:00
Christoph Bumiller
4e5296208d nv50/ir/opt: fix constant folding with saturate modifier
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3b0867f35b)
2014-05-29 15:16:03 -07:00
Christoph Bumiller
1ced952686 nv50/ir/tgsi: TGSI_OPCODE_POW replicates its result
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2f2d1b3d9b)
2014-05-29 15:15:59 -07:00
Christoph Bumiller
afe723ce5f nv50,nvc0: set constbufs dirty on pipe context switch
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 49eccef06b)
2014-05-29 15:15:39 -07:00
Christoph Bumiller
8b74c2bdbd nv50: setup scissors on clear_render_target/depth_stencil
[imirkin: add logic to also clear the "regular" scissors]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 200382be85)
2014-05-29 15:15:10 -07:00
Christoph Bumiller
4afbd9b0e2 nv50,nvc0: always pull out bufctx on context destruction
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7d11b761f2)
2014-05-29 15:01:49 -07:00
Ian Romanick
697316fe06 Bump version to 10.2-rc4
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-05-23 17:36:42 -07:00
Ian Romanick
bfaee5277a Merge remote-tracking branch 'robclark/freedreno-10.2' into 10.2 2014-05-23 17:21:59 -07:00
Pavel Popov
9a8f12ae03 i965: Properly return *RESET* status in glGetGraphicsResetStatusARB
The glGetGraphicsResetStatusARB from ARB_robustness extension always
returns GUILTY_CONTEXT_RESET_ARB and never returns NO_ERROR for guilty
context with LOSE_CONTEXT_ON_RESET_ARB strategy.  This is because Mesa
returns GUILTY_CONTEXT_RESET_ARB if batch_active !=0 whereas kernel
driver never reset batch_active and this variable always > 0 for guilty
context.  The same behaviour also can be observed for batch_pending and
INNOCENT_CONTEXT_RESET_ARB.

But ARB_robustness spec says:

  If a reset status other than NO_ERROR is returned and subsequent calls
  return NO_ERROR, the context reset was encountered and completed. If a
  reset status is repeatedly returned, the context may be in the process
  of resetting.

  8. How should the application react to a reset context event?
  RESOLVED: For this extension, the application is expected to query the
  reset status until NO_ERROR is returned. If a reset is encountered, at
  least one *RESET* status will be returned. Once NO_ERROR is
  encountered, the application can safely destroy the old context and
  create a new one.

The main problem is the context may be in the process of resetting and
in this case a reset status should be repeatedly returned.  But looks
like the kernel driver returns nonzero active/pending only if the
context reset has already been encountered and completed.  For this
reason the *RESET* status cannot be repeatedly returned and should be
returned only once.

The reset_count and brw->reset_count variables can be used to control
that glGetGraphicsResetStatusARB returns *RESET* status only once for
each context.  Note the i915 triggers reset_count twice which allows to
return correct reset count immediately after active/pending have been
incremented.

v2 (idr): Trivial reformatting of comments.

Signed-off-by: Pavel Popov <pavel.e.popov@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8dc4a98c44)
2014-05-23 09:57:18 -07:00
Emil Velikov
a31062fcb3 targets/egl-static: add missing line break in ldflags
Accidently omitted by commit 7b7944ee1c.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>
(cherry picked from commit e0372239a5)
2014-05-23 09:57:15 -07:00
James Legg
a1fff38c96 mesa: Fix unbinding GL_DEPTH_STENCIL_ATTACHMENT
glFramebufferRender(..., GL_DEPTH_STENCIL_ATTACHMENT, ..., 0) only
detached the depth buffer and not the stencil buffer.

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=79115
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 846c715abb)
2014-05-23 09:56:26 -07:00
Jordan Justen
1db3ebd8a5 meta blit: Set Z texcoord during meta blit to sample the correct layer
If the source renderbuffer has a depth > 0, then send a Z texcoord
which is set to the source attachment Z offset.

This fixes piglit's gl-3.2-layered-rendering-gl-layer-render with the
GL_TEXTURE_2D_MULTISAMPLE_ARRAY case test on i965/gen8.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 57876fee38)
2014-05-23 09:55:23 -07:00
Kenneth Graunke
7cf3a674ea i965: Listen to BRW_NEW_FRAGMENT_PROGRAM for 3DSTATE_PS_BLEND.
brw_color_buffer_write_enabled depends on brw->fragment_program, which
means we have to listen to BRW_NEW_FRAGMENT_PROGRAM.

On most generations, this was only called from a function that already
subscribed.  However, on Broadwell, we failed to listen to the necessary
event in the atom that emits 3DSTATE_PS_BLEND.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 746921cbb4)
2014-05-23 09:54:41 -07:00
Kenneth Graunke
d2521a44af i965: Use WE_all for FB write header setup on Broadwell.
I forgot to disable writemasking on the OR and MOV which set the render
target index and "source 0 alpha present to render target" bit.

Using get_element_ud is equivalent and avoids a line-wrap.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7d3985ca6c)
2014-05-23 09:54:15 -07:00
Anuj Phogat
00f2dcb791 meta: Use gl_FragColor to output color values to all the draw buffers
_mesa_meta_setup_blit_shader() currently generates a fragment shader
which, irrespective of the number of draw buffers, writes the color
to only one 'out' variable. Current shader rely on an undefined
behavior and possibly works by chance.

From OpenGL 4.0  spec, page 256:
  "If a fragment shader writes to gl_FragColor, DrawBuffers specifies a
   set of draw buffers into which the single fragment color defined by
   gl_FragColor is written. If a fragment shader writes to gl_FragData,
   or a user-defined varying out variable, DrawBuffers specifies a set
   of draw buffers into which each of the multiple output colors defined
   by these variables are separately written. If a fragment shader writes
   to none of gl_FragColor, gl_FragData, nor any user defined varying out
   variables, the values of the fragment colors following shader execution
   are undefined, and may differ for each fragment color."

OpenGL 4.4 spec, page 463, added an additional line in this section:
  "If some, but not all user-defined output variables are written, the
   values of fragment colors corresponding to unwritten variables are
   similarly undefined."

V2: Write color output to gl_FragColor instead of writing to multiple
    'out' variables. This'll avoid recompiling the shader every time
    draw buffers count is updated.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 46737cebd3)
2014-05-23 09:53:42 -07:00
Anuj Phogat
ed1ffa0197 meta: Refactor _mesa_meta_setup_blit_shader() to avoid duplicate shader code
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit bee2915210)
2014-05-23 09:52:29 -07:00
Ilia Mirkin
5d056f51ab tgsi: add GS_INVOCATIONS to property names array
In commit 4be146b1, I neglected to add the new property to the strings
array. This leads to the string '(null)' to be printed instead when
converting a GS shader to text.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit cdeb7004e0)
2014-05-23 09:51:49 -07:00
Ilia Mirkin
6be7789e11 nv50,nvc0: fix 3d blits with mipmap levels
Make sure to normalize the z coordinates as well as the x/y ones when
there are mipmaps present. Fixes 3d mipmap generation, which now uses
the blit path.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
(cherry picked from commit 28360fcad7)
2014-05-23 09:51:26 -07:00
Ilia Mirkin
d6a4c3c29c nv50/ir: fix constant folding for OP_MUL subop HIGH
These instructions can come in either through IMUL_HI/UMUL_HI TGSI
opcodes, or from OP_DIV constant folding.

Also make sure that the constant foldings which delete the original
instruction still get counted as having done something.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
(cherry picked from commit d2a3de19c6)
2014-05-23 09:51:06 -07:00
Ilia Mirkin
9028b94670 nv50/ir: fix s32 x s32 -> high s32 multiply logic
Retrieving the high 32 bits of a signed multiply is rather annoying. It
appears that the simplest way to do this is to compute the absolute
value of the arguments, and perform a u32 x u32 -> u64 operation. If the
arguments' signs differ, then negate the result. Since there is no u64
support in the cvt instruction, we have the perform the 2's complement
negation "by hand".

This logic can come into use by the IMUL_HI instruction (very unlikely
to be seen), as well as from constant folding of division by a constant.
Fixes dolphin's divisions by 255.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
(cherry picked from commit d3a5cf052c)
2014-05-23 09:50:26 -07:00
Kenneth Graunke
085d6bd5e7 meta: Avoid _swrast_BlitFramebuffer in the meta CopyTexSubImage code.
This is a replacement for bd44ac8b5c
that should actually work.

Fixes Piglit's copyteximage-border on swrast, as well as one of
es3conform's packed_pixels_pixelstore test.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78546
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2ecc7268ba)
2014-05-23 09:49:28 -07:00
Kenneth Graunke
fd0ea5be9d meta: Split _swrast_BlitFramebuffer out of the meta blit path.
Separating the software fallbacks from the rest of the meta path (which
is usually hardware accelerated) gives callers better control over their
blitting options.

For example, i965 might want to try meta blit, hardware blits, then
swrast as a last resort.  Splitting it makes that possible.

This updates all callers to maintain the existing behavior (even in the
few cases where it isn't desirable behavior - later patches can change
that).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 54540ea691)
2014-05-23 09:48:13 -07:00
Kenneth Graunke
27d4836f35 meta: Drop unnecessary early returns in _mesa_meta_BlitFramebuffer.
These aren't necessary - all of the following code is predicated on mask
being non-zero, so no code will get executed anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Courtney Goeltzenleuchter <courtney@lunarg.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d89ce333cc)
2014-05-23 09:47:37 -07:00
Kenneth Graunke
e306ba9a9b Revert "i965: Don't _swrast_BlitFramebuffer when doing CopyTexSubImage."
This reverts commit bd44ac8b5c.

Fixes:
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78842
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78843

Re-breaks:
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705
but that will be fixed properly in a few commits.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2fa3796bc1)
2014-05-23 09:46:57 -07:00
Topi Pohjolainen
81fb9ef112 i965/fbo: Only try stencil meta blits on gen >= 8
I don't have an ILK at hand but the fix should be trivial.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78872
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 21dddb22c1)
2014-05-23 09:46:28 -07:00
Kenneth Graunke
32549f3f17 mesa: Disable GL_EXT_framebuffer_multisample_blit_scaled on Broadwell.
It's not properly implemented in the meta code, and we don't have time
to fix it for 10.2.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0b96d362bf)
2014-05-23 09:45:52 -07:00
Ilia Mirkin
9576e17804 nv50/ir: fix integer mul lowering for u32 x u32 -> high u32
UNION appears to expect that all of its sources are conditionally
defined. Otherwise it inserts an unpredicated mov instruction which
overwrites the desired result. This fixes tests that use UMUL_HI, and
much less directly, unsigned integer division by a constant, which uses
this functionality in a peephole pass.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
(cherry picked from commit 5b8f1a0f7c)
2014-05-23 09:45:13 -07:00
Ilia Mirkin
cc65bc4d15 nv50/ir: make sure that texprep/texquerylod's args get coalesced
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
(cherry picked from commit 4ebaabcccb)
2014-05-23 09:40:26 -07:00
Jeremy Huddleston Sequoia
25e641213f darwin: Fix test for kCGLPFAOpenGLProfile support at runtime
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
(cherry picked from commit 7a109268ab)
2014-05-20 10:55:12 -07:00
Rob Clark
e084f71548 freedreno: don't advertise texture arrays for now
I think a3xx and later should support (it is part of GLES3), but this
isn't needed for the time being and still needs to be reversed.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 10:55:54 -04:00
Rob Clark
cdd328639f freedreno/a3xx: shadow sampler support
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:48:49 -04:00
Rob Clark
6440561737 freedreno/a3xx/compiler: refactor trans_samp()
Split it up into some smaller fxns so it doesn't grow into a huge
monster as we add things.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:48:49 -04:00
Rob Clark
fb4461b7dc freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:48:20 -04:00
Rob Clark
fec2b45d02 freedreno/a3xx: use util_format_compose_swizzles()
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:48:20 -04:00
Rob Clark
d0c813c40a freedreno/a3xx/compiler: 1D textures
Gallium already gives us height==1 for these, so the texture state is
already setup correctly to emulate 1D textures as a Nx1 2D texture.  We
just need to supply the .y coord.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:48:20 -04:00
Rob Clark
a05c073d79 freedreno: fix caps
In particular, we want mesa to emulate primitive restart for us.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:48:20 -04:00
Rob Clark
031ee21961 freedreno: fix index buffer offset
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:48:20 -04:00
Rob Clark
b7604eff4c freedreno/a3xx: add sRBG texture support
That was easy.  Turns out it is just a matter of setting one bit.
Enable sampling from sRGB texture, and therefore enable GL 2.1 :-)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:48:20 -04:00
Rob Clark
80da86c650 freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:48:20 -04:00
Rob Clark
3c0ca023dd freedreno/a3xx: fix write to bogus register
The loops for updating the multiple packed fields in SP_VS_OUT[] and
SP_VS_VPC_DST[] will zero out one register beyond the last that on
required.  Which is normally not a problem (and is kinda convenient
when looking at cmdstream dumps) unless we have maximum (16) varyings.

Fix loop termination condition so that this does not happen.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:47:20 -04:00
Rob Clark
516db26e1e freedreno/a3xx: account for special inputs/outputs
We need to size input/output tables big enough for special inputs/
outputs (gl_Position, gl_FrontFacing, etc) which, while they don't
count towards the hw limit of 16 attributes or 16 varyings, we do
still need to track them all the same.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:47:19 -04:00
Rob Clark
d5d9984c2b freedreno/a3xx: fix MAX_INPUTS shader cap
Hardware only supports 16.  Which fd3_shader_variant properly reflected,
but the pipe cap did not, leading to array overflow (and shaders that
could not possibly work).

Also a bunch of asserts to make problems like this easier to see.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:47:19 -04:00
Ryan Houdek
6db6f05fae freedreno/a3xx/compiler: add KILL_IF
The KILL_IF opcode could potentially be merged in to the regular KILL
opcode function.  It was a pain to do so, so I've left is separated
for cleanliness.

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:47:19 -04:00
Ryan Houdek
c338759051 freedreno/a3xx/compiler: start adding integer support
Adds a large sum of TGSI opcodes to the a3xx compiler.

For integer opcodes we have 28 opcodes added.
Adds 4 floating point compare opcodes

If GLSL 1.30 is enabled, this allows the GLSL 1.30 piglits to have a
completion amount of 432/641.

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:46:38 -04:00
Rob Clark
47a6830e22 freedreno/a3xx: occlusion query support
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:46:38 -04:00
Rob Clark
3ffc507c94 freedreno: add support for hw queries
Real GPU queries need some infrastructure to track samples per tile and
accumulate the results.  But fortunately this can be shared across GPU
generation.

See:
https://github.com/freedreno/freedreno/wiki/Queries#hardware-queries

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:46:38 -04:00
Rob Clark
c94e339adc freedreno/query: allow multiple query implementations
Split out fd_query into an abstract base class, to allow multiple
implementations.  The current sw based queries are moved into
fd_sw_query.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:45:50 -04:00
Rob Clark
a5951d09a5 freedreno/a3xx: add point-size
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:45:50 -04:00
Rob Clark
3475ca1f00 freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:45:50 -04:00
Rob Clark
3733cc3e8f freedreno/a2xx: fix compiler warning
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 08:45:50 -04:00
Jeremy Huddleston Sequoia
ac49f97f12 glapi: Avoid heap corruption in _glapi_table
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Chia-I Wu <olv@lunarg.com>
(cherry picked from commit ff5456d1ac)
2014-05-20 01:39:17 -07:00
Ian Romanick
d0aa394741 Bump version to 10.2-rc3
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-05-16 23:48:44 -07:00
Brian Paul
4baf6f12a5 mesa: fix double-freeing of dispatch tables inside glBegin/End.
We allocate dispatch tables for BeginEnd and OutsideBeginEnd.  But
when we destroy the context we were freeing the BeginEnd and Exec
tables.  If Exec==BeginEnd we did a double-free.  This would happen
if the context was destroyed while inside a glBegin/End pair.  Now
free the BeginEnd and OutsideBeginEnd pointers.

Cc: "10.1", "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit ef6b6658f9)
2014-05-16 23:46:34 -07:00
Michel Dänzer
21792665c7 glsl_to_tgsi: Make sure the 'shader' member is always initialized
Fixes the valgrind report below and random crashes with piglit on radeonsi.

==30005== Conditional jump or move depends on uninitialised value(s)
==30005==    at 0xB13584E: st_translate_program (st_glsl_to_tgsi.cpp:5100)
==30005==    by 0xB14698B: st_translate_fragment_program (st_program.c:747)
==30005==    by 0xB14777D: st_get_fp_variant (st_program.c:824)
==30005==    by 0xB11219C: get_color_fp_variant (st_cb_drawpixels.c:1042)
==30005==    by 0xB1131AE: st_DrawPixels (st_cb_drawpixels.c:1154)
==30005==    by 0xAFF8806: _mesa_DrawPixels (drawpix.c:162)
==30005==    by 0x4EB86DB: stub_glDrawPixels (generated_dispatch.c:6640)
==30005==    by 0x4F1DF08: piglit_visualize_image (piglit-util-gl.c:1574)
==30005==    by 0x40691D: draw_image_to_window_system_fb(int, bool) (draw-buffers-common.cpp:733)
==30005==    by 0x406C8B: draw_reference_image(bool, bool) (draw-buffers-common.cpp:854)
==30005==    by 0x40722A: piglit_display (alpha-to-coverage-dual-src-blend.cpp:117)
==30005==    by 0x4EA7168: run_test (piglit_fbo_framework.c:52)

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 2bab95973d)
2014-05-16 23:45:50 -07:00
Topi Pohjolainen
872ea423ac i965/fb: Use meta path for stencil up/downsampling
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit d45fadf11a)
2014-05-16 23:45:24 -07:00
Topi Pohjolainen
ad8ad99eff i965/meta: Stencil blit for miptree updownsampling
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 475216a4f0)
2014-05-16 23:43:16 -07:00
Topi Pohjolainen
62f1509070 i965/fb: Use meta path for stencil blits
This is effective only on gen8 for now as previous generations still
go through blorp.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b18f6b9b86)
2014-05-16 23:42:08 -07:00
Topi Pohjolainen
eb2ef1641c i965/meta: Stencil blits
v2: Create the intel renderbuffer with level hardcoded to zero instead
    of overriding it in the surface state configuration. Also moved the
    dimension adjustments for tiling, mip level, msaa into the render
    buffer creation. Finally prepares for another blit path needed for
    miptree updownsampling.
v3 (Ken): Dropped unnecessary memory context for "ralloc_asprintf()"

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit d1829badf5)
2014-05-16 23:41:56 -07:00
Topi Pohjolainen
947b60d19e meta: Refactor state save/restore for framebuffer texture blits
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 2a549c43a8)

Note: This patch was cherry picked so that the next patch would build.
2014-05-16 23:41:40 -07:00
Topi Pohjolainen
cb37016f89 i965: Extend brw_get_rb_for_first_slice() for specified level/layer
v2: Configure stencil directly for final dimensions instead of
    adjusting bit by bit for tiling, mip level and msaa.
v3 (Ken): Used non-static constant for horizontal alignment

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 9d752c098c)
2014-05-16 23:31:05 -07:00
Topi Pohjolainen
43ea5f9347 i965/gen8: Surface state overriding for stencil
v2: Allow hardware to offset accesses to individual layers. Also leave
    the mip-level overriding for the creator of the intel renderbuffer
    to handle. Merged with "i965/gen8: Allow stencil buffers to be
    configured as single sampled"

Ken: I left the "_mesa_problem()" still in place. I think it is clearer
     to remove it in a separate patch.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 36caae48b2)
2014-05-16 23:28:04 -07:00
Topi Pohjolainen
b5e717a618 i965/wm: Surface state overrides for configuring w-tiled as y-tiled
v2: Use intel_mipmap_tree::total_width in order to get correct alignment
    automatically. Also use "mt->total_height / mt->physical_depth0" as
    surface height allowing hardware to offset to correct slice.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 6aefaa4eb2)
2014-05-16 23:27:29 -07:00
Jordan Justen
f5848ec2e4 i965 meta up/downsample: Fix renderbuffer _BaseFormat
mt->format is of type mesa_format, and therefore can't be
used with _mesa_base_fbo_format which requires a GLenum input.

On gen8, this fixes various piglit fbo-depthstencil tests with
samples > 1.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 103057b2b7)
2014-05-16 23:26:58 -07:00
Roland Scheidegger
79a34441d5 mesa/st: fix number of ubos being declared in a shader
Previously the code used the total number of ubos being declared in the
linked program (so the ubos of all shaders combined), use the number
from the particular shader instead.
This fixes an assertion failure with piglit arb_uniform_buffer_object-maxblocks
seen in llvmpipe since 8a9f5ecdb1 as it now emits
code for each declared buffer, not just the ones actually used.

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 3e817e7e56)
2014-05-16 23:17:47 -07:00
Emil Velikov
1041fb86c0 docs: Add a note about llvm-shared-libs and libxatracker
Both changes landed in 10.2, and for people not following the
development cycle these will come as a surprise. Note that the
pipe_* interface is not stable.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit e48054d036)
2014-05-16 23:15:14 -07:00
Emil Velikov
b1aa25907a configure: correctly set LD_NO_UNDEFINED
Commit 11623be934 was meant to have this hunk, which
I accidently dropped during git rebase.

Cc: 10.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Julien Cristau <jcristau@debian.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jonathan Gray <jsg@jsg.id.au>
(cherry picked from commit f57d092199)
2014-05-16 23:15:09 -07:00
Michel Dänzer
5d6e822d03 radeonsi: Fix anisotropic filtering state setup
Bring it back in line with r600g. I broke this in the original radeonsi
bringup. :(

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78537

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c5828b0599)
2014-05-16 23:14:36 -07:00
Jonathan Gray
26d5b22039 glsl: simplify the M_PI*f macros, fixes build on OpenBSD
The M_PI*f macros used a preprocessor paste to append 'f'
to M_PI defines, which works if the values are only numbers
but breaks on OpenBSD where M_PI definitions have casts
and brackets to meet requirements of a future version of POSIX,

http://austingroupbugs.net/view.php?id=801
http://austingroupbugs.net/view.php?id=828

Simplify the M_PI*f macros by using casts directly in the defines
as suggested by Kenneth Graunke.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78665
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
(cherry picked from commit 0c0bbe77d0)
2014-05-16 23:13:37 -07:00
Kenneth Graunke
3171da3402 i965: Don't _swrast_BlitFramebuffer when doing CopyTexSubImage.
The point of copytexsubimage_using_blit_framebuffer is to use a hardware
accelerated BlitFramebuffer path.  If that fails, we shouldn't do a
swrast blit---we should try our CTSI fallback code.

This is especially important for i965 and GLES, where we don't even
create a swrast context.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bd44ac8b5c)
2014-05-16 23:13:04 -07:00
Kristian Høgsberg
875fd92d16 wayland: Move version 2 request to end of interface specification
We're moving towards requiring interface additions to be appended to the
end of the interface block.  No functional change, opcodes are assigned as
before, but version 2 additions are now grouped together, which prevents
a scanner warning.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
(cherry picked from commit 06842d436e)
2014-05-16 23:12:45 -07:00
Topi Pohjolainen
fb5c68d312 meta: Refactor configuration of renderbuffer sampling
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 4dc9c314c8)
2014-05-16 23:07:02 -07:00
Topi Pohjolainen
0e7b0f2a0a meta: Refactor binding of renderbuffer as texture image
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit a2952315ac)
2014-05-16 23:05:22 -07:00
Topi Pohjolainen
5f495b85a0 meta: Merge compiling and linking of blit program
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ac4db0aa55)
2014-05-16 23:04:22 -07:00
Topi Pohjolainen
253834cbf6 i965/blorp: Expose coordinate scissoring and mirroring
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 3a43cd0c3e)
2014-05-16 23:00:40 -07:00
Topi Pohjolainen
f5c083dbc3 i965/gen8: Use helper variables for surface parameters
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 4a92ad5531)
2014-05-16 22:55:44 -07:00
Jordan Justen
2b4a871e05 i965/gen8: Set depth extent field
The depth extent field is used to limit the allowed slice range that
can be rendered to.

With the previous setting, only slice 0 could be rendered.

This fixes piglit amd_vertex_shader_layer-layered-depth-texture-render.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit c51c192891)
2014-05-14 12:19:16 -07:00
Jordan Justen
27da0bbeb4 i965/gen8 depth: Set depth size based on LOD0 for 3D textures
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit 294ada2fef)
2014-05-14 12:19:14 -07:00
Jordan Justen
91e2808c41 i965/gen7 depth: Set depth size based on LOD0 for 3D textures
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit e6d6ed55ab)
2014-05-14 12:19:13 -07:00
Jordan Justen
6cad93daab i965/gen8 renderbuffer: Set depth size based on LOD0 for 3D textures
Fixes piglit's
'gl-3.2-layered-rendering-clear-color-all-types 3d mipmapped'

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit e47d08adef)
2014-05-14 12:19:12 -07:00
Jordan Justen
71f78bb87e i965/gen7 renderbuffer: Set depth size based on LOD0 for 3D textures
If blorp is disabled for color clears, then piglit's
'gl-3.2-layered-rendering-clear-color-all-types 3d mipmapped'
will fail.

Currently, gen8 fails similarly on this test because gen8
does not use blorp.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit b875f39e29)
2014-05-14 12:19:08 -07:00
Chris Forbes
ab43a98fcf i965/Gen8: Set up layer constraints properly for depth buffers
Same issues as the previous commit fixed for Gen7:
- Bogus physical->logical layer conversion; depth/stencil surfaces
  are still IMS layout on Gen8.
- mt_layer ignored in layered rendering case, which breaks handling
  of views with MinLayer.
- Render target array extent not set correctly for arrays.

I'm not able to test this one since I can't get a Broadwell yet, but
it's the same set of fixes as for Gen7.

V2: Restore the MAX2() to account for zero depth/layer_count.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 23e9f06569)
2014-05-14 12:16:54 -07:00
Chris Forbes
af228e999c i965/Gen7: Set up layer constraints properly for depth buffers
Again, a few problems:
- Layered attachments did not honor MinLayer.
- Non-layered MSAA attachments rendered to the wrong layer due to
  dividing by the layer count. All depth buffers use the IMS layout, so
  the physical layer count == logical layer count.
- Layered attachments were not limited to irb->layer_count, so we could
  render off the end of the texture.

V2: Restore the MAX2() to account for zero depth/layer_count.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 77d55ef481)
2014-05-14 12:16:51 -07:00
Chris Forbes
725a27e04d i965/Gen8: Set up layer constraints properly for renderbuffers
Fixing the same issues the previous commit does for Gen7.

Note that I can't test this one, since I don't have a Broadwell.

V2: Restore the MAX2() to account for zero depth/layer_count.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 9269ea599c)
2014-05-14 12:16:50 -07:00
Chris Forbes
b0609b715b i965/Gen7: Set up layer constraints properly for renderbuffers
There were a few problems here, which mostly just broke layered
rendering into a view:

- Render target view extent was always set to be == depth. This is
  benign for non-layered-rendering, but allows writes off the end of the
  render target for layered rendering, which ends badly.
- Layered rendering did not honor the mt_layer setting, so would not
  properly handle MinLayer being set on a view.

V2: Restore the MAX2() to account for zero depth/layer_count.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit dd43900b7b)
2014-05-14 12:16:47 -07:00
Ilia Mirkin
ca549a0f19 nv50,nvc0: fix blit 3d path for 1d array textures
Need to adjust coordinates since the shader receives the array index as
depth in z, but the TEX instruction expects it to be the second
coordinate for a 1D array texture. This fixes fbo-generatemipmap-array.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8baed87212)
2014-05-13 10:19:04 -07:00
Ilia Mirkin
407bff9db0 nv50,nvc0: leave queries on during blit, turn them on for 2d engine
Fixes the new logic of the conditional rendering piglit test.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4467c0c9fb)
2014-05-13 10:18:05 -07:00
Ilia Mirkin
0e14b19492 mesa/st: leave current query enabled during glBlitFramebuffer
Also make sure that pipe_blit_info gets zero'd out so that query isn't
accidentally left enabled.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 64a7ddf40d)
2014-05-13 10:11:00 -07:00
Ilia Mirkin
a233f4c303 gallium: add bit to pipe_blit_info to leave current query enabled
Previously the implication was that queries should be disabled during
blits. However glBlitFramebuffer() is supposed to obey the current
query, and this new bit will indicate that to the driver.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 752ce0affb)
2014-05-13 10:08:33 -07:00
Ilia Mirkin
7a81788c67 nv50: fix setting of texture ms info to be per-stage
Different textures may be bound to each slot for each stage. So we need
to be able to upload ms parameters for each one without stages
overwriting each other.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 863573b9cb)
2014-05-13 10:08:01 -07:00
Ilia Mirkin
13bb2bc84b nv50/ir: make sure to reverse cond codes on all the OP_SET variants
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.2 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 68f47cad0d)
2014-05-13 09:57:28 -07:00
Ian Romanick
98b66e8d96 Add .cherry-ignore file
e696727 adds a change, and 155f98d reverts that change.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-05-13 09:55:23 -07:00
Ian Romanick
0b3126bddd mesa: Bump version to 10.2-rc2
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-05-09 20:10:38 -07:00
Emil Velikov
f2682b3b9f glx/tests: Partially revert commit 51e3569573
C++ does not support designated initializers, thus compilation
is not guaranteed to succeed. Surprisingly gcc 4.6.3 fails to
build the code, while version 4.9.0 compiles it without a hitch.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78403
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
(cherry picked from commit 326b8e253e)
2014-05-09 20:10:38 -07:00
Emil Velikov
d259928a56 configure: error out if building GBM without dri
Both backends require --enable-dri, and building an empty libgbm
makes little to no sense. Error out at configure to prevent the
user from shooting themselves in the foot.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78225
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit e477d12c33)
2014-05-09 20:10:38 -07:00
Kenneth Graunke
ec6bd21162 i965: Fix GPU hangs on Broadwell in shaders with some control flow.
According to the documentation, we need to set the source 0 register
type to IMM for flow control instructions that have both JIP and UIP.

Fixes GPU hangs in approximately 10 Piglit tests, 5 es3conform tests,
Unigine Crypt, a WebGL raytracer demo, and several Steam titles.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75478
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75878
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76939
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Tested-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 9584959123)
2014-05-09 20:10:37 -07:00
Tom Stellard
53a0f9d0ba radeonsi: Enable geometry shaders with LLVM 3.4.1
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 93c2ebbd83)
2014-05-09 20:10:37 -07:00
Tom Stellard
0f0f1106b6 configure.ac: Add LLVM_VERSION_PATCH to DEFINES
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c5d0008325)
2014-05-09 20:10:37 -07:00
Thomas Hellstrom
2b34277bbd st/xa: Fix performance regression introduced by commit "Cache render target surface"
The mentioned commit has the nasty side-effect of turning off accelerated
copies.

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 9306b7c171)
2014-05-09 20:10:37 -07:00
Tom Stellard
e29daf82cc clover: Destory pipe_screen when device does not support compute v2
v2:
  - Make sure screen was successfully created before destroying it.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit c5f0c98c49)
2014-05-09 20:10:37 -07:00
Tom Stellard
03673bcf6c pipe-loader: Don't destroy the winsys in the sw loader
The screen takes ownership of the winsys, and is responsible for
destroying it.  Users of pipe-loader should make sure they destory
and  screens they've created to avoid memory leaks.

This fixes a crash in clover introduced by
ce6c17c083 where the pipe-loader was
destroying the winsys while a screen was still using it.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit c650033b86)
2014-05-09 20:10:37 -07:00
Roland Scheidegger
af47859aed draw: do not use draw_get_option_use_llvm() inside draw execution paths
1c73e919a4 made it possible to not allocate
the tgsi machine if llvm was used. However, draw_get_option_use_llvm() is
not reliable after draw context creation, since drivers can explicitly
request a non-llvm draw context even if draw_get_option_use_llvm() would
return true (and softpipe does just that) which leads to crashes.
Thus use draw->llvm to determine if we're using llvm or not instead (and
make draw->llvm available even if HAVE_LLVM is false so we don't have to put
even more ifdefs).

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 9af68e9b1d)
2014-05-09 20:10:37 -07:00
Kenneth Graunke
e120f1a958 mesa: Fix MaxNumLayers for 1D array textures.
1D array targets store the number of slices in the Height field.

Cc: "10.2 10.1 10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 5c399ca8e4)
2014-05-09 18:27:26 -07:00
Kenneth Graunke
cc92276cb8 i965: Enable GL_ARB_texture_view on Broadwell.
This is a port of commit c9c08867ed.
A tiny bit of extra work was necessary to not break stencil texturing.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit ecfc418b68)
2014-05-08 14:57:12 -07:00
Ilia Mirkin
fac042fa05 nv50/ir/gk110: fix set with f32 dest
Should fix comparison opcodes like SGE/SLT/etc which expected a float to
be returned. These were previously getting integer 0/-1 values.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: 10.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e7047f2917)
2014-05-08 14:50:33 -07:00
Ian Romanick
d26b59ec27 linker: Fix consumer_inputs_with_locations indexing
In an earlier incarnation of populate_consumer_input_sets and
get_matching_input, the consumer_inputs_with_locations array was indexed
using the user-specified location.  In that version, only user-defined
varyings were included in the array.

In the current incarnation, the Mesa location is used to index the
array, and built-in varyings are included.

This change fixes the unit test to exepect gl_ClipDistance in the array,
and it resizes the arrays to actually be big enough.  It's just dumb
luck that the existing piglit tests use small enough locations to not
stomp the stack. :(

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78258
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Cc: Vinson Lee <vlee@freedesktop.org>
(cherry picked from commit f7bf37cb13)
2014-05-07 09:50:52 -07:00
Kenneth Graunke
c2c15a9a37 meta: Only clear the requested color buffers.
This path is used to implement both glClear and glClearBuffer; the
latter is only supposed to clear particular buffers.  Core Mesa provides
us that information in the buffers bitmask; we must only clear buffers
mentioned there.

To accomplish this, we save/restore the color draw buffers state, and
use glDrawBuffers to restrict drawing to the relevant buffers.

Fixes Piglit's spec/!OpenGL 3.0/clearbuffer-mixed-formats and
spec/ARB_framebuffer_object/fbo-drawbuffers-none glClearBuffer tests
for drivers using meta clears (such as Broadwell).

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77852
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77856
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 9701c6984d)
2014-05-07 09:49:13 -07:00
Kenneth Graunke
e6c98309c6 meta: Add infrastructure for saving/restoring the DrawBuffers state.
Sometimes we need to configure what draw buffers we render to, without
creating a new FBO.  This path will make that possible.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit c1c1cf5f92)
2014-05-07 09:48:34 -07:00
Kenneth Graunke
ffc0cc027a meta: Add a new MESA_META_DRAW_BUFFERS bit.
This will be used for saving/restoring the glDrawBuffers state.
For now, make sure that existing users of MESA_META_ALL don't get
the new bit, since they probably won't want it.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit e526ebf35c)
2014-05-07 09:48:34 -07:00
Kenneth Graunke
658d0410d0 meta: Unify the GLSL and fixed-function clear paths.
The majority of _mesa_meta_Clear and _mesa_meta_glsl_Clear was the same;
adding a boolean for whether to use GLSL allows us to share most of it
without polluting either path too much.

Tested for regressions by hacking i965 to always use the non-GLSL path.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 7c8df60f31)
2014-05-07 09:48:34 -07:00
Kenneth Graunke
a1dd1e62fa i965: Always intel_prepare_render() after invalidating front buffers.
Fixes glean/texture_srgb, which hit recursive-flush prevention
assertions in vbo_exec_FlushVertices.

This probably hurts the performance of front buffer rendering, but
very few people in their right mind do front buffer rendering.

Fixes Glean's texture_srgb test.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit cde8bad1c9)
2014-05-07 09:48:34 -07:00
Tapani Pälli
c7a3c2d29d glsl: fix bogus layout qualifier warnings
Print out GL_ARB_explicit_attrib_location warnings only
when parsing attribute that uses "location" qualifier.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77245
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e65917f94e)
2014-05-07 09:48:34 -07:00
Kenneth Graunke
0a5034517a i965: Set miptree target field when creating from a BO.
Prior to commit 8435b60a35, the region
equivalent of this function called intel_miptree_create_layout, which
set mt->target to target.  With that commit, it no longer copied target.

Piglit's ext_image_dma_buf_import-sample_[xa]rgb8888 tests would then
hit an assertion failure, where image->TexObject->Target was
GL_TEXTURE_EXTERNAL_OES, and mt->target was GL_TEXTURE_2D.

Copying the target fixes this assertion failure.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 829cb0423d)
2014-05-05 10:10:54 -07:00
Ian Romanick
e8f6150320 mesa: Bump version to 10.2-rc1
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-05-02 21:17:00 -07:00
280 changed files with 6774 additions and 1595 deletions

View File

@@ -24,7 +24,7 @@
# BOARD_GPU_DRIVERS should be defined. The valid values are
#
# classic drivers: i915 i965
# gallium drivers: swrast i915g ilo nouveau r300g r600g radeonsi vmwgfx
# gallium drivers: swrast freedreno i915g ilo nouveau r300g r600g radeonsi vmwgfx
#
# The main target is libGLES_mesa. For each classic driver enabled, a DRI
# module will also be built. DRI modules will be loaded by libGLES_mesa.
@@ -42,7 +42,7 @@ DRM_TOP := external/drm
DRM_GRALLOC_TOP := hardware/drm_gralloc
classic_drivers := i915 i965
gallium_drivers := swrast i915g ilo nouveau r300g r600g radeonsi vmwgfx
gallium_drivers := swrast freedreno i915g ilo nouveau r300g r600g radeonsi vmwgfx
MESA_GPU_DRIVERS := $(strip $(BOARD_GPU_DRIVERS))

View File

@@ -64,14 +64,13 @@ IGNORE_FILES = \
parsers: configure
$(MAKE) -C src/glsl glsl_parser.cpp glsl_parser.h glsl_lexer.cpp glcpp/glcpp-lex.c glcpp/glcpp-parse.c glcpp/glcpp-parse.h
$(MAKE) -C src/mesa program/lex.yy.c program/program_parse.tab.c program/program_parse.tab.h
# Everything for new a Mesa release:
ARCHIVES = $(PACKAGE_NAME).tar.gz \
$(PACKAGE_NAME).tar.bz2 \
$(PACKAGE_NAME).zip
tarballs: md5
tarballs: checksums
rm -f ../$(PACKAGE_DIR) $(PACKAGE_NAME).tar
manifest.txt: .git
@@ -98,9 +97,9 @@ $(PACKAGE_NAME).zip: parsers ../$(PACKAGE_DIR) manifest.txt
zip -q -@ $(PACKAGE_NAME).zip < $(PACKAGE_DIR)/manifest.txt ; \
mv $(PACKAGE_NAME).zip $(PACKAGE_DIR)
md5: $(ARCHIVES)
@-md5sum $(PACKAGE_NAME).tar.gz
@-md5sum $(PACKAGE_NAME).tar.bz2
@-md5sum $(PACKAGE_NAME).zip
checksums: $(ARCHIVES)
@-sha256sum $(PACKAGE_NAME).tar.gz
@-sha256sum $(PACKAGE_NAME).tar.bz2
@-sha256sum $(PACKAGE_NAME).zip
.PHONY: tarballs md5

View File

@@ -1 +1 @@
10.2.0-devel
10.2.7

30
bin/.cherry-ignore Normal file
View File

@@ -0,0 +1,30 @@
# The first is the change, and the second is the revert of that change.
e6967270c75a5b669152127bb7a746d55f4407a6 i965: Fix depth (array slices) computation for 1D_ARRAY render targets.
155f98d49fdc2f46c760f8214327b3804ee60079 Revert "i965: Fix depth (array slices) computation for 1D_ARRAY render targets."
# This patch didn't have enough in the commit message to convince me it
# is a bug fix, (email sent to author asking for more information).
41d759d076737f94976f5294b734dbc437a12bae
# These patch were already cherry-picked before the 10.2.4 release.
#
# But get-pick-list.sh doesn't realize that because the commit messages for
# these on the stable branch reference commit IDs that don't actually appear
# on master. I'm not sure what happened, (perhaps master was force-pushed at
# some point?).
2eaf3f670fea4ce4466340141244e41a45542c13
e5adc560cc8544200faa3e04504202839626ab37
cf1b5eee7f36af29d1d5caba3538ad4985e51f81
# The patch depends on earlier ones that are not part of 10.2.
b3121bfd413973f460e2cc9a9f852bdfa1265fcf mesa: guard better when building with sse4.1 optimisations
# The PIPE_CAP is not in mesa 10.2 - breaks the build.
72969e0efb7a5a011629c1001e81aa2329ede6b1 radeon/compute: Report a value for PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE
# No whitespace fixes for the stable branches.
38fccc37c1fa57c1fd373e8d71621bb4aed31083 radeonsi/compute: Whitespace fixes
# The commit relies of patches restructuring r600_resource, which never made
# it in the 10.2 branch.
a15088338ebe544efd90bfa7934cb99521488141 radeonsi/compute: Stop leaking the input buffer

View File

@@ -14,7 +14,7 @@ git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*mesa-stable\)' HEAD..origin/master |\
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*10\.2.*mesa-stable\)' HEAD..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.

View File

@@ -331,6 +331,19 @@ LDFLAGS=$save_LDFLAGS
AC_SUBST([GC_SECTIONS])
dnl
dnl OpenBSD does not have DT_NEEDED entries for libc by design
dnl so when these flags are passed to ld via libtool the checks will fail
dnl
case "$host_os" in
openbsd*)
LD_NO_UNDEFINED="" ;;
*)
LD_NO_UNDEFINED="-Wl,--no-undefined" ;;
esac
AC_SUBST([LD_NO_UNDEFINED])
dnl
dnl compatibility symlinks
dnl
@@ -481,10 +494,10 @@ AC_CHECK_FUNC([dlopen], [DEFINES="$DEFINES -DHAVE_DLOPEN"],
AC_SUBST([DLOPEN_LIBS])
dnl Check if that library also has dladdr
save_LDFLAGS="$LDFLAGS"
LDFLAGS="$LDFLAGS $DLOPEN_LIBS"
save_LIBS="$LIBS"
LIBS="$LIBS $DLOPEN_LIBS"
AC_CHECK_FUNCS([dladdr])
LDFLAGS="$save_LDFLAGS"
LIBS="$save_LIBS"
case "$host_os" in
darwin*|mingw*)
@@ -1179,6 +1192,13 @@ if test "x$enable_gbm" = xyes; then
if test "x$enable_shared_glapi" = xno; then
AC_MSG_ERROR([gbm_dri requires --enable-shared-glapi])
fi
else
# Strictly speaking libgbm does not require --enable-dri, although
# both of its backends do. Thus one can build libgbm without any
# backends if --disable-dri is set.
# To avoid unnecessary complexity of checking if at least one backend
# is available when building, just mandate --enable-dri.
AC_MSG_ERROR([gbm requires --enable-dri])
fi
fi
AM_CONDITIONAL(HAVE_GBM, test "x$enable_gbm" = xyes)
@@ -1253,6 +1273,10 @@ if test "x$enable_gallium_gbm" = xyes; then
AC_MSG_ERROR([gbm_gallium requires --enable-dri to build])
fi
if test "x$enable_gallium_egl" != xyes; then
AC_MSG_ERROR([gbm_gallium is only used by egl_gallium])
fi
GALLIUM_STATE_TRACKERS_DIRS="gbm $GALLIUM_STATE_TRACKERS_DIRS"
GALLIUM_TARGET_DIRS="$GALLIUM_TARGET_DIRS gbm"
enable_gallium_loader=yes
@@ -1273,6 +1297,7 @@ if test "x$enable_xa" = xyes; then
fi
GALLIUM_STATE_TRACKERS_DIRS="xa $GALLIUM_STATE_TRACKERS_DIRS"
enable_gallium_loader=yes
enable_gallium_drm_loader=yes
fi
AM_CONDITIONAL(HAVE_ST_XA, test "x$enable_xa" = xyes)
@@ -1303,7 +1328,7 @@ AM_CONDITIONAL(HAVE_OPENVG, test "x$enable_openvg" = xyes)
dnl
dnl Gallium G3DVL configuration
dnl
if test -n "$with_gallium_drivers" && ! echo "$with_gallium_drivers" | grep -q 'swrast'; then
if test -n "$with_gallium_drivers" -a "x$with_gallium_drivers" != xswrast; then
if test "x$enable_xvmc" = xauto; then
PKG_CHECK_EXISTS([xvmc], [enable_xvmc=yes], [enable_xvmc=no])
fi
@@ -1605,6 +1630,12 @@ if test "x$enable_gallium_llvm" = xyes; then
AC_COMPUTE_INT([LLVM_VERSION_MINOR], [LLVM_VERSION_MINOR],
[#include "${LLVM_INCLUDEDIR}/llvm/Config/llvm-config.h"])
dnl In LLVM 3.4.1 patch level was defined in config.h and not
dnl llvm-config.h
AC_COMPUTE_INT([LLVM_VERSION_PATCH], [LLVM_VERSION_PATCH],
[#include "${LLVM_INCLUDEDIR}/llvm/Config/config.h"],
LLVM_VERSION_PATCH=0) dnl Default if LLVM_VERSION_PATCH not found
if test -n "${LLVM_VERSION_MAJOR}"; then
LLVM_VERSION_INT="${LLVM_VERSION_MAJOR}0${LLVM_VERSION_MINOR}"
else
@@ -1627,7 +1658,7 @@ if test "x$enable_gallium_llvm" = xyes; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} option"
fi
fi
DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT"
DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT -DLLVM_VERSION_PATCH=$LLVM_VERSION_PATCH"
MESA_LLVM=1
dnl Check for Clang internal headers
@@ -1646,6 +1677,10 @@ if test "x$enable_gallium_llvm" = xyes; then
else
MESA_LLVM=0
LLVM_VERSION_INT=0
if test "x$enable_opencl" = xyes; then
AC_MSG_ERROR([cannot enable OpenCL without LLVM])
fi
fi
dnl Directory for XVMC libs
@@ -1726,6 +1761,7 @@ gallium_check_st() {
gallium_require_llvm() {
if test "x$MESA_LLVM" = x0; then
case "$host" in *gnux32) return;; esac
case "$host_cpu" in
i*86|x86_64|amd64) AC_MSG_ERROR([LLVM is required to build $1 on x86 and x86_64]);;
esac

View File

@@ -16,6 +16,20 @@
<h1>News</h1>
<h2>June 6, 2014</h2>
<p>
<a href="relnotes/10.2.1.html">Mesa 10.2.1</a> is released. This release
only fixes a build error in the radeonsi driver that was introduced between
10.2-rc5 and the 10.2 final release.
</p>
<h2>June 6, 2014</h2>
<p>
<a href="relnotes/10.2.html">Mesa 10.2</a> is released. This is a new
development release. See the release notes for more information about
the release.
</p>
<h2>April 18, 2014</h2>
<p>
<a href="relnotes/10.1.1.html">Mesa 10.1.1</a> is released.

View File

@@ -21,6 +21,8 @@ The release notes summarize what's new or changed in each Mesa release.
</p>
<ul>
<li><a href="relnotes/10.2.1.html">10.2.1 release notes</a>
<li><a href="relnotes/10.2.html">10.2 release notes</a>
<li><a href="relnotes/10.1.1.html">10.1.1 release notes</a>
<li><a href="relnotes/10.1.html">10.1 release notes</a>
<li><a href="relnotes/10.0.5.html">10.0.5 release notes</a>

61
docs/relnotes/10.2.1.html Normal file
View File

@@ -0,0 +1,61 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.2.1 Release Notes / June 6, 2014</h1>
<p>
Mesa 10.2.1 is a bug fix release which fixes bugs found since the 10.1 release.
</p>
<p>
Mesa 10.2.1 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
96f892dae2d0bb14ac9c2113f586c909 MesaLib-10.2.1.tar.gz
093f9b5d077e5f6061dcd7b01b7aa51a MesaLib-10.2.1.tar.bz2
6ab76c1608e5deed1eb8b54c62d7a48a MesaLib-10.2.1.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>
Mesa 10.2 had a build problem in the radeonsi driver due to an error resolving
conflicts in a patch cherry-pick from master. The build error is fixed.
</p>
<h2>Changes</h2>
<p>Ian Romanick (3):</p>
<ul>
<li>docs: Add MD5 checksum, etc. for 10.1 release</li>
<li>radeonsi: Fix build error introduced in 5ab9a9c</li>
<li>Bump version to 10.2.1</li>
</ul>
</div>
</body>
</html>

181
docs/relnotes/10.2.2.html Normal file
View File

@@ -0,0 +1,181 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.2.2 Release Notes / June 24, 2014</h1>
<p>
Mesa 10.2.2 is a bug fix release which fixes bugs found since the 10.2.1 release.
</p>
<p>
Mesa 10.2.2 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
38c4a40364000f89cddaa1694f6f3cfb444981d1110238ce603093585477399c MesaLib-10.2.2.tar.bz2
2af2ec8b4db624c352e961eefbcce6c8d1f86d44c5542f6f378c50e1b958d453 MesaLib-10.2.2.tar.gz
d4c0372da59367a344d62ebcdf5cf61039c9cae6925f40f2dab8f8d95cf22da9 MesaLib-10.2.2.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54372">Bug 54372</a> - GLX_INTEL_swap_event crashes driver when swapping window buffers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66452">Bug 66452</a> - JUNIPER UVD accelerated playback of WMV3 streams does not work</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74005">Bug 74005</a> - [i965 Bisected]Piglit/glx_glx-make-glxdrawable-current fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77865">Bug 77865</a> - [BDW] Many Ogles3conform framebuffer_blit cases fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78581">Bug 78581</a> - OpenCL: clBuildProgram prints error messages directly rather than storing them</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79029">Bug 79029</a> - INTEL_DEBUG=shader_time is full of lies</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79729">Bug 79729</a> - [i965] glClear on a multisample texture doesn't work</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79907">Bug 79907</a> - Mesa 10.2.1 --enable-vdpau default=auto broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80115">Bug 80115</a> - MESA_META_DRAW_BUFFERS induced GL_INVALID_VALUE errors</li>
</ul>
<h2>Changes</h2>
<p>Adrian Negreanu (8):</p>
<ul>
<li>add megadriver_stub_FILES</li>
<li>android: adapt to the megadriver mechanism</li>
<li>android: add libloader to libGLES_mesa and libmesa_egl_dri2</li>
<li>android: add src/gallium/auxiliary as include path for libmesa_dricore</li>
<li>android, egl: add correct drm include for libmesa_egl_dri2</li>
<li>android, egl: typo dri2_fallback_pixmap_surface -&gt; dri2_fallback_create_pixmap_surface</li>
<li>android, mesa_gen_matypes: pull in timespec POSIX definition</li>
<li>android, dricore: undefined reference to _mesa_streaming_load_memcpy</li>
</ul>
<p>Carl Worth (1):</p>
<ul>
<li>Update VERSION to 10.2.2</li>
</ul>
<p>Daniel Manjarres (1):</p>
<ul>
<li>glx: Don't crash on swap event for a Window (non-GLXWindow)</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>targets/xa: limit the amount of exported symbols</li>
<li>configure: error out when building opencl without LLVM</li>
<li>configure: correctly autodetect xvmc/vdpau/omx</li>
</ul>
<p>Grigori Goronzy (1):</p>
<ul>
<li>radeon/uvd: disable VC-1 simple/main on UVD 2.x</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>mesa: Copy Geom.UsesEndPrimitive when cloning a geometry program.</li>
</ul>
<p>Ian Romanick (3):</p>
<ul>
<li>docs: Add initial 10.2.1 release notes</li>
<li>docs: Add MD5 checksum, etc. for 10.2.1 release</li>
<li>meta: Respect the driver's maximum number of draw buffers</li>
</ul>
<p>Ilia Mirkin (7):</p>
<ul>
<li>gk110/ir: emit saturate flag on fadd when needed</li>
<li>gk110/ir: fix emitting constbuf file index</li>
<li>gk110/ir: fix bfind emission</li>
<li>nv50: make sure to mark first scissor dirty after blit</li>
<li>nv30: plug some memory leaks on screen destroy and shader compile</li>
<li>nv30: avoid dangling references to deleted contexts</li>
<li>nv30: hack to avoid errors on unexpected color/zeta combinations</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>meta_blit: properly compute texture width for the CopyTexSubImage fallback</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>mesa/main: Prevent sefgault on glGetIntegerv(GL_ATOMIC_COUNTER_BUFFER_BINDING).</li>
</ul>
<p>Kenneth Graunke (9):</p>
<ul>
<li>i965: Don't use the head sentinel as an fs_inst in Gen4 workaround code.</li>
<li>i965: Invalidate live intervals when inserting Gen4 SEND workarounds.</li>
<li>i965/vec4: Fix dead code elimination for VGRFs of size &gt; 1.</li>
<li>i965: Add missing MOCS setup for 3DSTATE_INDEX_BUFFER on Broadwell.</li>
<li>i965: Drop Broadwell perf_debugs about missing MOCS that aren't missing.</li>
<li>i965: Add missing newlines to a few perf_debug messages.</li>
<li>i965/vec4: Use the sampler for pull constant loads on Broadwell.</li>
<li>i965: Use 8x4 aligned rectangles for HiZ operations on Broadwell.</li>
<li>i965: Save meta stencil blit programs in the context.</li>
</ul>
<p>Kristian Høgsberg (1):</p>
<ul>
<li>mesa: Remove glClear optimization based on drawable size</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>configure: Only check for OpenCL without LLVM when the latter is certain</li>
</ul>
<p>Neil Roberts (1):</p>
<ul>
<li>i965: Set the fast clear color value for texture surfaces</li>
</ul>
<p>Tom Stellard (2):</p>
<ul>
<li>clover: Prevent Clang from printing number of errors and warnings to stderr.</li>
<li>clover: Don't use llvm's global context</li>
</ul>
<p>Ville Syrjälä (1):</p>
<ul>
<li>i915: Fix gen2 texblend setup</li>
</ul>
</div>
</body>
</html>

130
docs/relnotes/10.2.3.html Normal file
View File

@@ -0,0 +1,130 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.2.3 Release Notes / July 7, 2014</h1>
<p>
Mesa 10.2.3 is a bug fix release which fixes bugs found since the 10.2.2 release.
</p>
<p>
Mesa 10.2.3 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
e482a96170c98b17d6aba0d6e4dda4b9a2e61c39587bb64ac38cadfa4aba4aeb MesaLib-10.2.3.tar.bz2
96cffacaa1c52ae659b3b0f91be2eebf5528b748934256751261fb79ea3d6636 MesaLib-10.2.3.tar.gz
82cab6ff14c8038ee39842dbdea0d447a78d119efd8d702d1497bc7c246434e9 MesaLib-10.2.3.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76223">Bug 76223</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79823">Bug 79823</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80015">Bug 80015</a> - </li>
</ul>
<h2>Changes</h2>
<p>Aaron Watry (1):</p>
<ul>
<li>radeon/llvm: Allocate space for kernel metadata operands</li>
</ul>
<p>Carl Worth (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.2.2 release</li>
<li>cherry-ignore: Add a patch that's been rejected</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>nouveau: dup fd before passing it to device</li>
<li>nv50: disable dedicated ubo upload method</li>
<li>nv50: do an explicit flush on draw when there are persistent buffers</li>
<li>nvc0: add a memory barrier when there are persistent UBOs</li>
</ul>
<p>Jasper St. Pierre (1):</p>
<ul>
<li>glxext: Send the Drawable's ID in the GLX_BufferSwapComplete event</li>
</ul>
<p>Kenneth Graunke (3):</p>
<ul>
<li>i965: Don't emit SURFACE_STATEs for gather workarounds on Broadwell.</li>
<li>i965: Include marketing names for Broadwell GPUs.</li>
<li>i965/disasm: Fix INTEL_DEBUG=fs on Broadwell for ARB_fp applications.</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>radeon/llvm: Use the llvm.rsq.clamped intrinsic for RSQ</li>
</ul>
<p>Rob Clark (9):</p>
<ul>
<li>xa: fix segfault</li>
<li>freedreno: use OUT_RELOCW when buffer is written</li>
<li>freedreno/a3xx: fix depth/stencil GMEM positioning</li>
<li>freedreno/a3xx: fix depth/stencil gmem restore</li>
<li>freedreno/a3xx: fix blend opcode</li>
<li>freedreno: few caps fixes</li>
<li>freedreno/a3xx: texture fixes</li>
<li>freedreno: fix for null textures</li>
<li>freedreno/a3xx: vtx formats</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>draw: (trivial) fix clamping of viewport index</li>
</ul>
<p>Takashi Iwai (1):</p>
<ul>
<li>llvmpipe: Fix zero-division in llvmpipe_texture_layout()</li>
</ul>
<p>Thomas Hellstrom (1):</p>
<ul>
<li>st/xa: Don't close the drm fd on failure v2</li>
</ul>
<p>Tobias Klausmann (1):</p>
<ul>
<li>nv50/ir: allow gl_ViewportIndex to work on non-provoking vertices</li>
</ul>
</div>
</body>
</html>

127
docs/relnotes/10.2.4.html Normal file
View File

@@ -0,0 +1,127 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.2.4 Release Notes / July 18, 2014</h1>
<p>
Mesa 10.2.4 is a bug fix release which fixes bugs found since the 10.2.3 release.
</p>
<p>
Mesa 10.2.4 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
06a2341244eb85c283f59f70161e06ded106f835ed9b6be1ef0243bd9344811a MesaLib-10.2.4.tar.bz2
33e3c8b4343503e7d7d17416c670438860a2fd99ec93ea3327f73c3abe33b5e4 MesaLib-10.2.4.tar.gz
e26791a4a62a61b82e506e6ba031812d09697d1a831e8239af67e5722a8ee538 MesaLib-10.2.4.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81157">Bug 81157</a> - [BDW]Piglit some spec_glsl-1.50_execution_built-in-functions* cases fail</li>
</ul>
<h2>Changes</h2>
<p>Abdiel Janulgue (3):</p>
<ul>
<li>i965/fs: Refactor check for potential copy propagated instructions.</li>
<li>i965/fs: skip copy-propate for logical instructions with negated src entries</li>
<li>i965/vec4: skip copy-propate for logical instructions with negated src entries</li>
</ul>
<p>Brian Paul (3):</p>
<ul>
<li>mesa: fix geometry shader memory leaks</li>
<li>st/mesa: fix geometry shader memory leak</li>
<li>gallium/u_blitter: fix some shader memory leaks</li>
</ul>
<p>Carl Worth (2):</p>
<ul>
<li>docs: Add sha256 checksums for the 10.2.3 release</li>
<li>Update VERSION to 10.2.4</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>i965: Generalize the pixel_x/y workaround for all UW types.</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>nv50/ir: retrieve shadow compare from first arg</li>
<li>nv50/ir: ignore bias for samplerCubeShadow on nv50</li>
<li>nvc0/ir: do quadops on the right texture coordinates for TXD</li>
<li>nvc0/ir: use manual TXD when offsets are involved</li>
</ul>
<p>Jordan Justen (1):</p>
<ul>
<li>i965: Add auxiliary surface field #defines for Broadwell.</li>
</ul>
<p>Kenneth Graunke (9):</p>
<ul>
<li>i965: Don't copy propagate abs into Broadwell logic instructions.</li>
<li>i965: Set execution size to 8 for instructions with force_sechalf set.</li>
<li>i965/fs: Set force_uncompressed and force_sechalf on samplepos setup.</li>
<li>i965/fs: Use WE_all for gl_SampleID header register munging.</li>
<li>i965: Add plumbing for Broadwell's auxiliary surface support.</li>
<li>i965: Drop SINT workaround for CMS layout on Broadwell.</li>
<li>i965: Hook up the MCS buffers in SURFACE_STATE on Broadwell.</li>
<li>i965: Add 2x MSAA support to the MCS allocation function.</li>
<li>i965: Enable compressed multisample support (CMS) on Broadwell.</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>gallium: fix u_default_transfer_inline_write for textures</li>
<li>st/mesa: fix samplerCubeShadow with bias</li>
<li>radeonsi: fix samplerCubeShadow with bias</li>
<li>radeonsi: add support for TXB2</li>
</ul>
<p>Matt Turner (8):</p>
<ul>
<li>i965/vec4: Don't return void from a void function.</li>
<li>i965/vec4: Don't fix_math_operand() on Gen &gt;= 8.</li>
<li>i965/fs: Don't fix_math_operand() on Gen &gt;= 8.</li>
<li>i965/fs: Make try_constant_propagate() static.</li>
<li>i965/fs: Constant propagate into 2-src math instructions on Gen8.</li>
<li>i965/vec4: Constant propagate into 2-src math instructions on Gen8.</li>
<li>i965/fs: Don't use brw_imm_* unnecessarily.</li>
<li>i965/fs: Set correct number of regs_written for MCS fetches.</li>
</ul>
</div>
</body>
</html>

188
docs/relnotes/10.2.5.html Normal file
View File

@@ -0,0 +1,188 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.2.5 Release Notes / August 2, 2014</h1>
<p>
Mesa 10.2.5 is a bug fix release which fixes bugs found since the 10.2.4 release.
</p>
<p>
Mesa 10.2.5 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
b4459f0bf7f4a3c8fb78ece3c9d2eac3d0e5bf38cb470f2a72705e744bd0310d MesaLib-10.2.5.tar.bz2
7b4dd0cb683f8c7dc48a3e7a315742bed58ddcd7b756c462aca4177bd1acdc79 MesaLib-10.2.5.tar.gz
6180565914fb238dd77ccdaff96b6155d9a6e1b3e981ebbf6a6851301b384fed MesaLib-10.2.5.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80991">Bug 80991</a> - [BDW]Piglit spec_ARB_sample_shading_builtin-gl-sample-mask_2 fails</li>
</ul>
<h2>Changes</h2>
<p>Abdiel Janulgue (3):</p>
<ul>
<li>i965/fs: Refactor check for potential copy propagated instructions.</li>
<li>i965/fs: skip copy-propate for logical instructions with negated src entries</li>
<li>i965/vec4: skip copy-propate for logical instructions with negated src entries</li>
</ul>
<p>Adel Gadllah (1):</p>
<ul>
<li>i915: Fix up intelInitScreen2 for DRI3</li>
</ul>
<p>Anuj Phogat (2):</p>
<ul>
<li>i965: Fix z_offset computation in intel_miptree_unmap_depthstencil()</li>
<li>mesa: Don't use memcpy() in _mesa_texstore() for float depth texture data</li>
</ul>
<p>Brian Paul (3):</p>
<ul>
<li>mesa: fix geometry shader memory leaks</li>
<li>st/mesa: fix geometry shader memory leak</li>
<li>gallium/u_blitter: fix some shader memory leaks</li>
</ul>
<p>Carl Worth (6):</p>
<ul>
<li>docs: Add sha256 checksums for the 10.2.3 release</li>
<li>Update VERSION to 10.2.4</li>
<li>Add release notes for 10.2.4</li>
<li>docs: Add SHA256 checksums for the 10.2.4 release</li>
<li>cherry-ignore: Ignore a few patches picked in the previous stable release</li>
<li>Update version to 10.2.5</li>
</ul>
<p>Christian König (1):</p>
<ul>
<li>radeonsi: fix order of r600_need_dma_space and r600_context_bo_reloc</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>i965: Generalize the pixel_x/y workaround for all UW types.</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>mesa: Don't allow GL_TEXTURE_BORDER queries outside compat profile</li>
<li>mesa: Don't allow GL_TEXTURE_{LUMINANCE,INTENSITY}_* queries outside compat profile</li>
</ul>
<p>Ilia Mirkin (5):</p>
<ul>
<li>nv50/ir: retrieve shadow compare from first arg</li>
<li>nv50/ir: ignore bias for samplerCubeShadow on nv50</li>
<li>nvc0/ir: do quadops on the right texture coordinates for TXD</li>
<li>nvc0/ir: use manual TXD when offsets are involved</li>
<li>nvc0: make sure that the local memory allocation is aligned to 0x10</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>main/format_pack: Fix a wrong datatype in pack_ubyte_R8G8_UNORM</li>
<li>main/get_hash_params: Add GL_SAMPLE_SHADING_ARB</li>
</ul>
<p>Jordan Justen (1):</p>
<ul>
<li>i965: Add auxiliary surface field #defines for Broadwell.</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>st/wgl: Clamp wglChoosePixelFormatARB's output nNumFormats to nMaxFormats.</li>
</ul>
<p>Kenneth Graunke (13):</p>
<ul>
<li>i965: Don't copy propagate abs into Broadwell logic instructions.</li>
<li>i965: Set execution size to 8 for instructions with force_sechalf set.</li>
<li>i965/fs: Set force_uncompressed and force_sechalf on samplepos setup.</li>
<li>i965/fs: Use WE_all for gl_SampleID header register munging.</li>
<li>i965: Add plumbing for Broadwell's auxiliary surface support.</li>
<li>i965: Drop SINT workaround for CMS layout on Broadwell.</li>
<li>i965: Hook up the MCS buffers in SURFACE_STATE on Broadwell.</li>
<li>i965: Add 2x MSAA support to the MCS allocation function.</li>
<li>i965: Enable compressed multisample support (CMS) on Broadwell.</li>
<li>i965: Add missing persample_shading field to brw_wm_debug_recompile.</li>
<li>i965/fs: Fix gl_SampleID for 2x MSAA and SIMD16 mode.</li>
<li>i965/fs: Fix gl_SampleMask handling for SIMD16 on Gen8+.</li>
<li>i965/fs: Set LastRT on the final FB write on Broadwell.</li>
</ul>
<p>Marek Olšák (14):</p>
<ul>
<li>gallium: fix u_default_transfer_inline_write for textures</li>
<li>st/mesa: fix samplerCubeShadow with bias</li>
<li>radeonsi: fix samplerCubeShadow with bias</li>
<li>radeonsi: add support for TXB2</li>
<li>r600g: switch SNORM conversion to DX and GLES behavior</li>
<li>radeonsi: fix CMASK and HTILE calculations for Hawaii</li>
<li>gallium/util: add a helper for calculating primitive count from vertex count</li>
<li>radeonsi: fix a hang with instancing on Hawaii</li>
<li>radeonsi: fix a hang with streamout on Hawaii</li>
<li>winsys/radeon: fix vram_size overflow with Hawaii</li>
<li>radeonsi: fix occlusion queries on Hawaii</li>
<li>r600g,radeonsi: switch all occurences of array_size to util_max_layer</li>
<li>radeonsi: fix build because of lack of draw_indirect infrastructure in 10.2</li>
<li>radeonsi: use DRAW_PREAMBLE on CIK</li>
</ul>
<p>Matt Turner (8):</p>
<ul>
<li>i965/vec4: Don't return void from a void function.</li>
<li>i965/vec4: Don't fix_math_operand() on Gen &gt;= 8.</li>
<li>i965/fs: Don't fix_math_operand() on Gen &gt;= 8.</li>
<li>i965/fs: Make try_constant_propagate() static.</li>
<li>i965/fs: Constant propagate into 2-src math instructions on Gen8.</li>
<li>i965/vec4: Constant propagate into 2-src math instructions on Gen8.</li>
<li>i965/fs: Don't use brw_imm_* unnecessarily.</li>
<li>i965/fs: Set correct number of regs_written for MCS fetches.</li>
</ul>
<p>Thorsten Glaser (1):</p>
<ul>
<li>nv50: fix build failure on m68k due to invalid struct alignment assumptions</li>
</ul>
<p>Tom Stellard (1):</p>
<ul>
<li>clover: Call end_query before getting timestamp result v2</li>
</ul>
</div>
</body>
</html>

118
docs/relnotes/10.2.6.html Normal file
View File

@@ -0,0 +1,118 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.2.6 Release Notes / August 19, 2014</h1>
<p>
Mesa 10.2.6 is a bug fix release which fixes bugs found since the 10.2.5 release.
</p>
<p>
Mesa 10.2.6 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
193314d2adba98e43697d726739ac46b4299aae324fa1821aa226890c28ac806 MesaLib-10.2.6.tar.bz2
f7a45a5977b485eb95ac024205c584a0c112fe3951c2313c797579bb16a7a448 MesaLib-10.2.6.tar.gz
6d086d6fcda8f317adfaaae40011decf2f2e2dc80819c4a7a77c76f73512e8d8 MesaLib-10.2.6.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81450">Bug 81450</a> - [BDW]Piglit spec_glsl-1.30_execution_tex-miplevel-selection_textureGrad_1DArray cases intel_do_flush_locked failed</li>
</ul>
<h2>Changes</h2>
<p>Anuj Phogat (15):</p>
<ul>
<li>mesa: Fix error condition for valid texture targets in glTexStorage* functions</li>
<li>mesa: Turn target_can_be_compressed() in to a utility function</li>
<li>mesa: Add error condition for using compressed internalformat in glTexStorage3D()</li>
<li>mesa: Fix condition for using compressed internalformat in glCompressedTexImage3D()</li>
<li>mesa: Add utility function _mesa_is_enum_format_snorm()</li>
<li>mesa: Don't allow snorm internal formats in glCopyTexImage*() in GLES3</li>
<li>mesa: Add a helper function _mesa_is_enum_format_unsized()</li>
<li>mesa: Add a gles3 error condition for sized internalformat in glCopyTexImage*()</li>
<li>mesa: Add gles3 error condition for GL_RGBA10_A2 buffer format in glCopyTexImage*()</li>
<li>mesa: Add utility function _mesa_is_enum_format_unorm()</li>
<li>mesa: Add gles3 condition for normalized internal formats in glCopyTexImage*()</li>
<li>mesa: Allow GL_TEXTURE_CUBE_MAP target with compressed internal formats</li>
<li>meta: Use _mesa_get_format_bits() to get the GL_RED_BITS</li>
<li>egl: Fix OpenGL ES version checks in _eglParseContextAttribList()</li>
<li>meta: Fix datatype computation in get_temp_image_type()</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>mesa: fix assertion in _mesa_drawbuffers()</li>
</ul>
<p>Carl Worth (2):</p>
<ul>
<li>docs: Add sha256 sums to the 10.2.5 release notes</li>
<li>Update VERSION to 10.2.6</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>mesa/st: only convert AND(a, NOT(b)) into MAD when not using native integers</li>
</ul>
<p>Jordan Justen (1):</p>
<ul>
<li>i965/miptree: Layout 1D Array as 2D Array with height of 1</li>
</ul>
<p>Maarten Lankhorst (1):</p>
<ul>
<li>configure.ac: Do not require llvm on x32</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>st/mesa: fix blit-based partial TexSubImage for 1D arrays</li>
<li>radeon,r200: fix buffer validation after CS flush</li>
<li>radeonsi: fix a hang with instancing in Unigine Heaven/Valley on Hawaii</li>
<li>radeonsi: fix CMASK and HTILE allocation on Tahiti</li>
</ul>
<p>Pali Rohár (1):</p>
<ul>
<li>configure: check for dladdr via AC_CHECK_FUNC/AC_CHECK_LIB</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>gallivm: fix up out-of-bounds level when using conformant out-of-bound behavior</li>
</ul>
</div>
</body>
</html>

208
docs/relnotes/10.2.7.html Normal file
View File

@@ -0,0 +1,208 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.2.7 Release Notes / September 06, 2014</h1>
<p>
Mesa 10.2.7 is a bug fix release which fixes bugs found since the 10.2.6 release.
</p>
<p>
Mesa 10.2.7 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=36193">Bug 36193</a> - [i965] brw_eu_emit.c:182: validate_reg: Assertion `execsize &gt;= width' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66184">Bug 66184</a> - src/mesa/state_tracker/st_glsl_to_tgsi.cpp:3216:simplify_cmp: Assertion `inst-&gt;dst.index &lt; 4096' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70441">Bug 70441</a> - [Gen4-5 clip] Piglit spec_OpenGL_1.1_polygon-offset hits (execsize &gt;= width) assertion</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76188">Bug 76188</a> - EGL_EXT_image_dma_buf_import fd ownership is incorrect</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76789">Bug 76789</a> - [radeonsi] si_descriptors.c requires -std=gnu99 or -fms-extensions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82139">Bug 82139</a> - [r600g, bisected] multiple ubo piglit regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82255">Bug 82255</a> - [VP2] Chroma planes are vertically stretched during VDPAU playback</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82671">Bug 82671</a> - [r600g-evergreen][compute]Empty kernel execution causes crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82709">Bug 82709</a> - OpenCL not working on radeon hainan</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82814">Bug 82814</a> - glDrawBuffers(0, NULL) segfaults in _mesa_drawbuffers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83079">Bug 83079</a> - [NVC0] Dota 2 (Linux native and Wine) crash with Nouveau Drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83355">Bug 83355</a> - FTBFS: src/mesa/program/program_lexer.l:122:64: error: unknown type name 'YYSTYPE'</li>
</ul>
<h2>Changes</h2>
<p>Adam Jackson (1):</p>
<ul>
<li>radeonsi: Don't use anonymous struct trick in atom tracking</li>
</ul>
<p>Alex Deucher (2):</p>
<ul>
<li>radeonsi: add new CIK pci ids</li>
<li>radeonsi: add new SI pci ids</li>
</ul>
<p>Andreas Boll (1):</p>
<ul>
<li>winsys/radeon: fix nop packet padding for hawaii</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>i965: Bail on vec4 copy propagation for scratch writes with source modifiers</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>mesa: fix NULL pointer deref bug in _mesa_drawbuffers()</li>
</ul>
<p>Carl Worth (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.2.6 release</li>
<li>Makefile: Switch from md5sums to sha256sums</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>i965: add missing parens in vec4 visitor</li>
</ul>
<p>Emil Velikov (17):</p>
<ul>
<li>configure.ac: bail out if building gallium_gbm without gallium_egl</li>
<li>android: gallium/nouveau: fix include folders, link against libstlport</li>
<li>android: egl/main: fixup the nouveau build</li>
<li>automake: gallium/freedreno: drop spurious include dirs</li>
<li>android: gallium/freedreno: add preliminary build</li>
<li>android: egl/main: add/enable freedreno</li>
<li>android: gallium/auxiliary: drop log2/log2f redefitions</li>
<li>android: drop HAL_PIXEL_FORMAT_RGBA_{5551,4444}</li>
<li>android: glsl: the stlport over the limited Android STL</li>
<li>android: dri/i915: do not build an 'empty' driver</li>
<li>cherry-ignore: remove patch that lacking previous dependencies</li>
<li>cherry-ignore: PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE is not it 10.2</li>
<li>cherry-ignore: drop whitespace fix</li>
<li>cherry-ignore: reject a15088338eb</li>
<li>get-pick-list.sh: Require explicit "10.2" for nominating stable patches</li>
<li>mesa: fix make tarballs</li>
<li>Update VERSION to 10.2.7</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>mesa: Handle uninitialized textures like other textures in get_tex_level_parameter_image</li>
</ul>
<p>Ilia Mirkin (9):</p>
<ul>
<li>nouveau: make sure to invalidate any vbo state as well</li>
<li>nouveau: don't keep stale pointer to free'd data</li>
<li>nvc0/ir: avoid infinite recursion when finding first uses of tex</li>
<li>nv50: zero out unbound samplers</li>
<li>nvc0: don't make 1d staging textures linear</li>
<li>nv50/ir: avoid creating instructions that can't be emitted</li>
<li>nv50: set the miptree address when clearing bo's in vp2 init</li>
<li>nv50: mt address may not be the underlying bo's start address</li>
<li>nv50: attach the buffer bo to the miptree structures</li>
</ul>
<p>Jan Vesely (1):</p>
<ul>
<li>gallivm: Fix build with latest LLVM</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>mesa: Move declaration to top of block.</li>
</ul>
<p>Kenneth Graunke (3):</p>
<ul>
<li>i965/vec4: Set NoMask for GS_OPCODE_SET_VERTEX_COUNT on Gen8+.</li>
<li>i965/vec4: Respect ir-&gt;force_writemask_all in Gen8 code generation.</li>
<li>i965/clip: Fix brw_clip_unfilled.c/compute_offset's assembly.</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>r600g: fix constant buffer fetches</li>
<li>radeonsi: save scissor state and sample mask for u_blitter</li>
<li>glsl_to_tgsi: allocate and enlarge arrays for temporaries on demand</li>
</ul>
<p>Paulo Sergio Travaglia (2):</p>
<ul>
<li>android: gallium/radeon: attempt to fix the android build</li>
<li>android: egl/main: resolve radeon linking issues</li>
</ul>
<p>Pekka Paalanen (1):</p>
<ul>
<li>egl_dri2: fix EXT_image_dma_buf_import fds</li>
</ul>
<p>Robert Bragg (1):</p>
<ul>
<li>meta: save and restore swizzle for _GenerateMipmap</li>
</ul>
<p>Tom Stellard (7):</p>
<ul>
<li>radeon/compute: Fix reported values for MAX_GLOBAL_SIZE and MAX_MEM_ALLOC_SIZE</li>
<li>radeonsi/compute: Update reference counts for buffers in si_set_global_binding()</li>
<li>radeonsi/compute: Call si_pm4_free_state() after emitting compute state</li>
<li>clover: Flush the command queue in clReleaseCommandQueue()</li>
<li>radeon: Add work-around for missing Hainan support in clang &lt; 3.6 v2</li>
<li>pipe-loader: Fix memory leak v2</li>
<li>r600g/compute: Don't initialize vertex_buffer_state masks to 0x2</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>gallivm: Fix build with LLVM &gt;= 3.6 r215967.</li>
</ul>
</div>
</body>
</html>

View File

@@ -14,7 +14,7 @@
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.2 Release Notes / TBD</h1>
<h1>Mesa 10.2 Release Notes / June 6, 2014</h1>
<p>
Mesa 10.2 is a new development release.
@@ -33,7 +33,9 @@ because compatibility contexts are not supported.
<h2>MD5 checksums</h2>
<pre>
TBD.
c87bfb6dd5cbcf1fdef42e5ccd972581 MesaLib-10.2.0.tar.gz
7aaba90bd7169a94ae2fe83febdec963 MesaLib-10.2.0.tar.bz2
58b203aca15dadc25ab4d1126db1052b MesaLib-10.2.0.zip
</pre>
@@ -67,6 +69,25 @@ TBD.
<h2>Changes</h2>
<ul>
<li>Renamed <i>--with-llvm-shared-libs</i> to <i>--enable-llvm-shared-libs</i></li>
<p>
The option is used to control how mesa is linked against LLVM, and now
defaults to enabled (shared linking).
</p>
<li>Split <i>libxatracker.so</i> into a standalone library which can be used
with any gallium driver.</li>
<p>
Previously the library was linked statically against vmware's virtual gpu
driver(svga), whereas now it loads a shared pipe_*.so driver. Provide the
following options during configure, if you would like support for svga driver
<i>--enable-xa --with-gallium-drivers=svga</i>
</p>
<p>
Note: The files are installed in $(libdir)/gallium-pipe/ and the interface
between them and libxatracker.so is <strong>not</strong> stable.
</p>
</ul>
</div>

View File

@@ -518,7 +518,7 @@ typedef struct {
unsigned long serial; /* # of last request processed by server */
Bool send_event; /* true if this came from a SendEvent request */
Display *display; /* Display the event was read from */
GLXDrawable drawable; /* drawable on which event was requested in event mask */
Drawable drawable; /* drawable on which event was requested in event mask */
int event_type;
int64_t ust;
int64_t msc;

View File

@@ -91,24 +91,24 @@ CHIPSET(0x0F32, byt, "Intel(R) Bay Trail")
CHIPSET(0x0F33, byt, "Intel(R) Bay Trail")
CHIPSET(0x0157, byt, "Intel(R) Bay Trail")
CHIPSET(0x0155, byt, "Intel(R) Bay Trail")
CHIPSET(0x1602, bdw_gt1, "Intel(R) Broadwell")
CHIPSET(0x1606, bdw_gt1, "Intel(R) Broadwell")
CHIPSET(0x160A, bdw_gt1, "Intel(R) Broadwell")
CHIPSET(0x160B, bdw_gt1, "Intel(R) Broadwell")
CHIPSET(0x160D, bdw_gt1, "Intel(R) Broadwell")
CHIPSET(0x160E, bdw_gt1, "Intel(R) Broadwell")
CHIPSET(0x1612, bdw_gt2, "Intel(R) Broadwell")
CHIPSET(0x1616, bdw_gt2, "Intel(R) Broadwell")
CHIPSET(0x161A, bdw_gt2, "Intel(R) Broadwell")
CHIPSET(0x161B, bdw_gt2, "Intel(R) Broadwell")
CHIPSET(0x161D, bdw_gt2, "Intel(R) Broadwell")
CHIPSET(0x161E, bdw_gt2, "Intel(R) Broadwell")
CHIPSET(0x1622, bdw_gt3, "Intel(R) Broadwell")
CHIPSET(0x1626, bdw_gt3, "Intel(R) Broadwell")
CHIPSET(0x162A, bdw_gt3, "Intel(R) Broadwell")
CHIPSET(0x162B, bdw_gt3, "Intel(R) Broadwell")
CHIPSET(0x162D, bdw_gt3, "Intel(R) Broadwell")
CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell")
CHIPSET(0x1602, bdw_gt1, "Intel(R) Broadwell GT1")
CHIPSET(0x1606, bdw_gt1, "Intel(R) Broadwell GT1")
CHIPSET(0x160A, bdw_gt1, "Intel(R) Broadwell GT1")
CHIPSET(0x160B, bdw_gt1, "Intel(R) Broadwell GT1")
CHIPSET(0x160D, bdw_gt1, "Intel(R) Broadwell GT1")
CHIPSET(0x160E, bdw_gt1, "Intel(R) Broadwell GT1")
CHIPSET(0x1612, bdw_gt2, "Intel(R) HD Graphics 5600 (Broadwell GT2)")
CHIPSET(0x1616, bdw_gt2, "Intel(R) HD Graphics 5500 (Broadwell GT2)")
CHIPSET(0x161A, bdw_gt2, "Intel(R) Broadwell GT2")
CHIPSET(0x161B, bdw_gt2, "Intel(R) Broadwell GT2")
CHIPSET(0x161D, bdw_gt2, "Intel(R) Broadwell GT2")
CHIPSET(0x161E, bdw_gt2, "Intel(R) HD Graphics 5300 (Broadwell GT2)")
CHIPSET(0x1622, bdw_gt3, "Intel(R) Iris Pro 6200 (Broadwell GT3e)")
CHIPSET(0x1626, bdw_gt3, "Intel(R) HD Graphics 6000 (Broadwell GT3)")
CHIPSET(0x162A, bdw_gt3, "Intel(R) Iris Pro P6300 (Broadwell GT3e)")
CHIPSET(0x162B, bdw_gt3, "Intel(R) Iris 6100 (Broadwell GT3)")
CHIPSET(0x162D, bdw_gt3, "Intel(R) Broadwell GT3")
CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell GT3")
CHIPSET(0x22B0, chv, "Intel(R) Cherryview")
CHIPSET(0x22B1, chv, "Intel(R) Cherryview")
CHIPSET(0x22B2, chv, "Intel(R) Cherryview")

View File

@@ -38,6 +38,7 @@ CHIPSET(0x6828, VERDE_6828, VERDE)
CHIPSET(0x6829, VERDE_6829, VERDE)
CHIPSET(0x682A, VERDE_682A, VERDE)
CHIPSET(0x682B, VERDE_682B, VERDE)
CHIPSET(0x682C, VERDE_682C, VERDE)
CHIPSET(0x682D, VERDE_682D, VERDE)
CHIPSET(0x682F, VERDE_682F, VERDE)
CHIPSET(0x6830, VERDE_6830, VERDE)
@@ -54,8 +55,11 @@ CHIPSET(0x6600, OLAND_6600, OLAND)
CHIPSET(0x6601, OLAND_6601, OLAND)
CHIPSET(0x6602, OLAND_6602, OLAND)
CHIPSET(0x6603, OLAND_6603, OLAND)
CHIPSET(0x6604, OLAND_6604, OLAND)
CHIPSET(0x6605, OLAND_6605, OLAND)
CHIPSET(0x6606, OLAND_6606, OLAND)
CHIPSET(0x6607, OLAND_6607, OLAND)
CHIPSET(0x6608, OLAND_6608, OLAND)
CHIPSET(0x6610, OLAND_6610, OLAND)
CHIPSET(0x6611, OLAND_6611, OLAND)
CHIPSET(0x6613, OLAND_6613, OLAND)
@@ -73,6 +77,8 @@ CHIPSET(0x666F, HAINAN_666F, HAINAN)
CHIPSET(0x6640, BONAIRE_6640, BONAIRE)
CHIPSET(0x6641, BONAIRE_6641, BONAIRE)
CHIPSET(0x6646, BONAIRE_6646, BONAIRE)
CHIPSET(0x6647, BONAIRE_6647, BONAIRE)
CHIPSET(0x6649, BONAIRE_6649, BONAIRE)
CHIPSET(0x6650, BONAIRE_6650, BONAIRE)
CHIPSET(0x6651, BONAIRE_6651, BONAIRE)
@@ -132,6 +138,7 @@ CHIPSET(0x1313, KAVERI_1313, KAVERI)
CHIPSET(0x1315, KAVERI_1315, KAVERI)
CHIPSET(0x1316, KAVERI_1316, KAVERI)
CHIPSET(0x1317, KAVERI_1317, KAVERI)
CHIPSET(0x1318, KAVERI_1318, KAVERI)
CHIPSET(0x131B, KAVERI_131B, KAVERI)
CHIPSET(0x131C, KAVERI_131C, KAVERI)
CHIPSET(0x131D, KAVERI_131D, KAVERI)

View File

@@ -40,8 +40,12 @@ LOCAL_C_INCLUDES := \
$(MESA_TOP)/src/mapi \
$(MESA_TOP)/src/egl/main \
$(MESA_TOP)/src/loader \
$(DRM_TOP)/include/drm \
$(DRM_GRALLOC_TOP)
LOCAL_STATIC_LIBRARIES := \
libloader
LOCAL_MODULE := libmesa_egl_dri2
include $(MESA_COMMON_MK)

View File

@@ -1663,36 +1663,13 @@ dri2_check_dma_buf_format(const _EGLImageAttribs *attrs)
/**
* The spec says:
*
* "If eglCreateImageKHR is successful for a EGL_LINUX_DMA_BUF_EXT target,
* the EGL takes ownership of the file descriptor and is responsible for
* closing it, which it may do at any time while the EGLDisplay is
* initialized."
* "If eglCreateImageKHR is successful for a EGL_LINUX_DMA_BUF_EXT target, the
* EGL will take a reference to the dma_buf(s) which it will release at any
* time while the EGLDisplay is initialized. It is the responsibility of the
* application to close the dma_buf file descriptors."
*
* Therefore we must never close or otherwise modify the file descriptors.
*/
static void
dri2_take_dma_buf_ownership(const int *fds, unsigned num_fds)
{
int already_closed[num_fds];
unsigned num_closed = 0;
unsigned i, j;
for (i = 0; i < num_fds; ++i) {
/**
* The same file descriptor can be referenced multiple times in case more
* than one plane is found in the same buffer, just with a different
* offset.
*/
for (j = 0; j < num_closed; ++j) {
if (already_closed[j] == fds[i])
break;
}
if (j == num_closed) {
close(fds[i]);
already_closed[num_closed++] = fds[i];
}
}
}
static _EGLImage *
dri2_create_image_dma_buf(_EGLDisplay *disp, _EGLContext *ctx,
EGLClientBuffer buffer, const EGLint *attr_list)
@@ -1755,8 +1732,6 @@ dri2_create_image_dma_buf(_EGLDisplay *disp, _EGLContext *ctx,
return EGL_NO_IMAGE_KHR;
res = dri2_create_image_from_dri(disp, dri_image);
if (res)
dri2_take_dma_buf_ownership(fds, num_fds);
return res;
}

View File

@@ -54,8 +54,6 @@ get_format_bpp(int native)
bpp = 3;
break;
case HAL_PIXEL_FORMAT_RGB_565:
case HAL_PIXEL_FORMAT_RGBA_5551:
case HAL_PIXEL_FORMAT_RGBA_4444:
bpp = 2;
break;
default:
@@ -371,8 +369,6 @@ dri2_create_image_android_native_buffer(_EGLDisplay *disp, _EGLContext *ctx,
format = __DRI_IMAGE_FORMAT_XBGR8888;
break;
case HAL_PIXEL_FORMAT_RGB_888:
case HAL_PIXEL_FORMAT_RGBA_5551:
case HAL_PIXEL_FORMAT_RGBA_4444:
/* unsupported */
default:
_eglLog(_EGL_WARNING, "unsupported native buffer format 0x%x", buf->format);
@@ -638,7 +634,7 @@ droid_log(EGLint level, const char *msg)
static struct dri2_egl_display_vtbl droid_display_vtbl = {
.authenticate = NULL,
.create_window_surface = droid_create_window_surface,
.create_pixmap_surface = dri2_fallback_pixmap_surface,
.create_pixmap_surface = dri2_fallback_create_pixmap_surface,
.create_pbuffer_surface = droid_create_pbuffer_surface,
.destroy_surface = droid_destroy_surface,
.create_image = droid_create_image_khr,

View File

@@ -95,6 +95,12 @@ gallium_DRIVERS :=
# swrast
gallium_DRIVERS += libmesa_pipe_softpipe libmesa_winsys_sw_android
# freedreno
ifneq ($(filter freedreno, $(MESA_GPU_DRIVERS)),)
gallium_DRIVERS += libmesa_winsys_freedreno libmesa_pipe_freedreno
LOCAL_SHARED_LIBRARIES += libdrm_freedreno
endif
# i915g
ifneq ($(filter i915g, $(MESA_GPU_DRIVERS)),)
gallium_DRIVERS += libmesa_winsys_i915 libmesa_pipe_i915
@@ -109,28 +115,29 @@ endif
# nouveau
ifneq ($(filter nouveau, $(MESA_GPU_DRIVERS)),)
gallium_DRIVERS += \
libmesa_winsys_nouveau \
libmesa_pipe_nvfx \
libmesa_pipe_nv50 \
libmesa_pipe_nvc0 \
libmesa_pipe_nouveau
gallium_DRIVERS += libmesa_winsys_nouveau libmesa_pipe_nouveau
LOCAL_SHARED_LIBRARIES += libdrm_nouveau
LOCAL_SHARED_LIBRARIES += libstlport
endif
# r300g/r600g/radeonsi
ifneq ($(filter r300g r600g radeonsi, $(MESA_GPU_DRIVERS)),)
gallium_DRIVERS += libmesa_winsys_radeon
LOCAL_SHARED_LIBRARIES += libdrm_radeon
ifneq ($(filter r300g, $(MESA_GPU_DRIVERS)),)
gallium_DRIVERS += libmesa_pipe_r300
endif
endif # r300g
ifneq ($(filter r600g radeonsi, $(MESA_GPU_DRIVERS)),)
ifneq ($(filter r600g, $(MESA_GPU_DRIVERS)),)
gallium_DRIVERS += libmesa_pipe_r600
endif
LOCAL_SHARED_LIBRARIES += libstlport
endif # r600g
ifneq ($(filter radeonsi, $(MESA_GPU_DRIVERS)),)
gallium_DRIVERS += libmesa_pipe_radeonsi
endif
endif
endif # radeonsi
gallium_DRIVERS += libmesa_pipe_radeon
endif # r600g || radeonsi
endif # r300g || r600g || radeonsi
# vmwgfx
ifneq ($(filter vmwgfx, $(MESA_GPU_DRIVERS)),)
@@ -154,11 +161,14 @@ LOCAL_STATIC_LIBRARIES := \
libmesa_glsl \
libmesa_glsl_utils \
libmesa_gallium \
libloader \
$(LOCAL_STATIC_LIBRARIES)
endif # MESA_BUILD_GALLIUM
LOCAL_STATIC_LIBRARIES := \
$(LOCAL_STATIC_LIBRARIES) \
libloader
LOCAL_MODULE := libGLES_mesa
LOCAL_MODULE_PATH := $(TARGET_OUT_SHARED_LIBRARIES)/egl

View File

@@ -524,8 +524,12 @@ eglMakeCurrent(EGLDisplay dpy, EGLSurface draw, EGLSurface read,
if (!context && ctx != EGL_NO_CONTEXT)
RETURN_EGL_ERROR(disp, EGL_BAD_CONTEXT, EGL_FALSE);
if (!draw_surf || !read_surf) {
/* surfaces may be NULL if surfaceless */
if (!disp->Extensions.KHR_surfaceless_context)
/* From the EGL 1.4 (20130211) spec:
*
* To release the current context without assigning a new one, set ctx
* to EGL_NO_CONTEXT and set draw and read to EGL_NO_SURFACE.
*/
if (!disp->Extensions.KHR_surfaceless_context && ctx != EGL_NO_CONTEXT)
RETURN_EGL_ERROR(disp, EGL_BAD_SURFACE, EGL_FALSE);
if ((!draw_surf && draw != EGL_NO_SURFACE) ||
@@ -567,6 +571,10 @@ _eglCreateWindowSurfaceCommon(_EGLDisplay *disp, EGLConfig config,
EGLSurface ret;
_EGL_CHECK_CONFIG(disp, conf, EGL_NO_SURFACE, drv);
if (native_window == NULL)
RETURN_EGL_ERROR(disp, EGL_BAD_NATIVE_WINDOW, EGL_NO_SURFACE);
surf = drv->API.CreateWindowSurface(drv, disp, conf, native_window,
attrib_list);
ret = (surf) ? _eglLinkSurface(surf) : EGL_NO_SURFACE;

View File

@@ -322,11 +322,14 @@ _eglParseContextAttribList(_EGLContext *ctx, _EGLDisplay *dpy,
break;
case 3:
default:
/* Don't put additional version checks here. We don't know that
* there won't be versions > 3.0.
*/
break;
default:
err = EGL_BAD_MATCH;
break;
}
}

View File

@@ -135,22 +135,6 @@
<arg name="stride2" type="int"/>
</request>
<!-- Create a wayland buffer for the prime fd. Use for regular and planar
buffers. Pass 0 for offset and stride for unused planes. -->
<request name="create_prime_buffer" since="2">
<arg name="id" type="new_id" interface="wl_buffer"/>
<arg name="name" type="fd"/>
<arg name="width" type="int"/>
<arg name="height" type="int"/>
<arg name="format" type="uint"/>
<arg name="offset0" type="int"/>
<arg name="stride0" type="int"/>
<arg name="offset1" type="int"/>
<arg name="stride1" type="int"/>
<arg name="offset2" type="int"/>
<arg name="stride2" type="int"/>
</request>
<!-- Notification of the path of the drm device which is used by
the server. The client should use this device for creating
local buffers. Only buffers created from this device should
@@ -177,6 +161,25 @@
<event name="capabilities">
<arg name="value" type="uint"/>
</event>
<!-- Version 2 additions -->
<!-- Create a wayland buffer for the prime fd. Use for regular and planar
buffers. Pass 0 for offset and stride for unused planes. -->
<request name="create_prime_buffer" since="2">
<arg name="id" type="new_id" interface="wl_buffer"/>
<arg name="name" type="fd"/>
<arg name="width" type="int"/>
<arg name="height" type="int"/>
<arg name="format" type="uint"/>
<arg name="offset0" type="int"/>
<arg name="stride0" type="int"/>
<arg name="offset1" type="int"/>
<arg name="stride1" type="int"/>
<arg name="offset2" type="int"/>
<arg name="stride2" type="int"/>
</request>
</interface>
</protocol>

View File

@@ -34,6 +34,11 @@ SUBDIRS := \
# swrast
SUBDIRS += winsys/sw/android drivers/softpipe
# freedreno
ifneq ($(filter freedreno, $(MESA_GPU_DRIVERS)),)
SUBDIRS += winsys/freedreno/drm drivers/freedreno
endif
# i915g
ifneq ($(filter i915g, $(MESA_GPU_DRIVERS)),)
SUBDIRS += winsys/i915/drm drivers/i915
@@ -57,6 +62,8 @@ SUBDIRS += winsys/radeon/drm
ifneq ($(filter r300g, $(MESA_GPU_DRIVERS)),)
SUBDIRS += drivers/r300
endif
ifneq ($(filter r600g radeonsi, $(MESA_GPU_DRIVERS)),)
SUBDIRS += drivers/radeon
ifneq ($(filter r600g, $(MESA_GPU_DRIVERS)),)
SUBDIRS += drivers/r600
endif
@@ -64,6 +71,7 @@ ifneq ($(filter radeonsi, $(MESA_GPU_DRIVERS)),)
SUBDIRS += drivers/radeonsi
endif
endif
endif
# vmwgfx
ifneq ($(filter vmwgfx, $(MESA_GPU_DRIVERS)),)

View File

@@ -1000,6 +1000,8 @@ draw_get_shader_param_no_llvm(unsigned shader, enum pipe_shader_cap param)
/**
* XXX: Results for PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS because there are two
* different ways of setting textures, and drivers typically only support one.
* Drivers requesting a draw context explicitly without llvm must call
* draw_get_shader_param_no_llvm instead.
*/
int
draw_get_shader_param(unsigned shader, enum pipe_shader_cap param)

View File

@@ -597,7 +597,7 @@ int draw_geometry_shader_run(struct draw_geometry_shader *shader,
#ifdef HAVE_LLVM
if (draw_get_option_use_llvm()) {
if (shader->draw->llvm) {
shader->gs_output = output_verts->verts;
if (max_out_prims > shader->max_out_prims) {
unsigned i;
@@ -674,7 +674,7 @@ int draw_geometry_shader_run(struct draw_geometry_shader *shader,
void draw_geometry_shader_prepare(struct draw_geometry_shader *shader,
struct draw_context *draw)
{
boolean use_llvm = draw_get_option_use_llvm();
boolean use_llvm = draw->llvm != NULL;
if (!use_llvm && shader && shader->machine->Tokens != shader->state.tokens) {
tgsi_exec_machine_bind_shader(shader->machine,
shader->state.tokens,
@@ -686,7 +686,7 @@ void draw_geometry_shader_prepare(struct draw_geometry_shader *shader,
boolean
draw_gs_init( struct draw_context *draw )
{
if (!draw_get_option_use_llvm()) {
if (!draw->llvm) {
draw->gs.tgsi.machine = tgsi_exec_machine_create();
if (!draw->gs.tgsi.machine)
return FALSE;
@@ -715,7 +715,7 @@ draw_create_geometry_shader(struct draw_context *draw,
const struct pipe_shader_state *state)
{
#ifdef HAVE_LLVM
boolean use_llvm = draw_get_option_use_llvm();
boolean use_llvm = draw->llvm != NULL;
struct llvm_geometry_shader *llvm_gs;
#endif
struct draw_geometry_shader *gs;
@@ -870,7 +870,7 @@ void draw_delete_geometry_shader(struct draw_context *draw,
return;
}
#ifdef HAVE_LLVM
if (draw_get_option_use_llvm()) {
if (draw->llvm) {
struct llvm_geometry_shader *shader = llvm_geometry_shader(dgs);
struct draw_gs_llvm_variant_list_item *li;

View File

@@ -47,7 +47,6 @@
#include "tgsi/tgsi_scan.h"
#ifdef HAVE_LLVM
struct draw_llvm;
struct gallivm_state;
#endif
@@ -69,6 +68,7 @@ struct tgsi_exec_machine;
struct tgsi_sampler;
struct draw_pt_front_end;
struct draw_assembler;
struct draw_llvm;
/**
@@ -318,9 +318,7 @@ struct draw_context
unsigned start_instance;
unsigned start_index;
#ifdef HAVE_LLVM
struct draw_llvm *llvm;
#endif
/** Texture sampler and sampler view state.
* Note that we have arrays indexed by shader type. At this time
@@ -495,7 +493,7 @@ draw_stats_clipper_primitives(struct draw_context *draw,
static INLINE unsigned
draw_clamp_viewport_idx(int idx)
{
return ((PIPE_MAX_VIEWPORTS > idx || idx < 0) ? idx : 0);
return ((PIPE_MAX_VIEWPORTS > idx && idx >= 0) ? idx : 0);
}
/**

View File

@@ -149,7 +149,7 @@ draw_vs_init( struct draw_context *draw )
{
draw->dump_vs = debug_get_option_gallium_dump_vs();
if (!draw_get_option_use_llvm()) {
if (!draw->llvm) {
draw->vs.tgsi.machine = tgsi_exec_machine_create();
if (!draw->vs.tgsi.machine)
return FALSE;
@@ -175,7 +175,7 @@ draw_vs_destroy( struct draw_context *draw )
if (draw->vs.emit_cache)
translate_cache_destroy(draw->vs.emit_cache);
if (!draw_get_option_use_llvm())
if (!draw->llvm)
tgsi_exec_machine_destroy(draw->vs.tgsi.machine);
}

View File

@@ -63,7 +63,7 @@ vs_exec_prepare( struct draw_vertex_shader *shader,
{
struct exec_vertex_shader *evs = exec_vertex_shader(shader);
debug_assert(!draw_get_option_use_llvm());
debug_assert(!draw->llvm);
/* Specify the vertex program to interpret/execute.
* Avoid rebinding when possible.
*/
@@ -97,7 +97,7 @@ vs_exec_run_linear( struct draw_vertex_shader *shader,
unsigned slot;
boolean clamp_vertex_color = shader->draw->rasterizer->clamp_vertex_color;
debug_assert(!draw_get_option_use_llvm());
debug_assert(!shader->draw->llvm);
tgsi_exec_set_constant_buffers(machine, PIPE_MAX_CONSTANT_BUFFERS,
constants, const_size);

View File

@@ -34,6 +34,10 @@
#include <llvm/Support/Format.h>
#include <llvm/Support/MemoryObject.h>
#if HAVE_LLVM >= 0x0306
#include <llvm/Target/TargetSubtargetInfo.h>
#endif
#if HAVE_LLVM >= 0x0300
#include <llvm/Support/TargetRegistry.h>
#include <llvm/MC/MCSubtargetInfo.h>
@@ -302,7 +306,11 @@ disassemble(const void* func, llvm::raw_ostream & Out)
OwningPtr<TargetMachine> TM(T->createTargetMachine(Triple, ""));
#endif
#if HAVE_LLVM >= 0x0306
const TargetInstrInfo *TII = TM->getSubtargetImpl()->getInstrInfo();
#else
const TargetInstrInfo *TII = TM->getInstrInfo();
#endif
/*
* Wrap the data in a MemoryObject

View File

@@ -266,7 +266,11 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
using namespace llvm;
std::string Error;
#if HAVE_LLVM >= 0x0306
EngineBuilder builder(std::unique_ptr<Module>(unwrap(M)));
#else
EngineBuilder builder(unwrap(M));
#endif
/**
* LLVM 3.1+ haven't more "extern unsigned llvm::StackAlignmentOverride" and

View File

@@ -927,6 +927,7 @@ lp_build_nearest_mip_level(struct lp_build_sample_context *bld,
bld->int_coord_bld.type,
out);
}
level = lp_build_andnot(&bld->int_coord_bld, level, *out_of_bounds);
*level_out = level;
}
else {

View File

@@ -66,7 +66,7 @@ struct pipe_loader_device {
} pci;
} u; /**< Discriminated by \a type */
const char *driver_name;
char *driver_name;
const struct pipe_loader_ops *ops;
};

View File

@@ -256,6 +256,7 @@ pipe_loader_drm_release(struct pipe_loader_device **dev)
util_dl_close(ddev->lib);
close(ddev->fd);
FREE(ddev->base.driver_name);
FREE(ddev);
*dev = NULL;
}

View File

@@ -145,9 +145,6 @@ pipe_loader_sw_release(struct pipe_loader_device **dev)
{
struct pipe_loader_sw_device *sdev = pipe_loader_sw_device(*dev);
if (sdev->ws && sdev->ws->destroy)
sdev->ws->destroy(sdev->ws);
if (sdev->lib)
util_dl_close(sdev->lib);

View File

@@ -120,7 +120,8 @@ const char *tgsi_property_names[TGSI_PROPERTY_COUNT] =
"FS_COORD_PIXEL_CENTER",
"FS_COLOR0_WRITES_ALL_CBUFS",
"FS_DEPTH_LAYOUT",
"VS_PROHIBIT_UCPS"
"VS_PROHIBIT_UCPS",
"GS_INVOCATIONS",
};
const char *tgsi_type_names[5] =

View File

@@ -383,6 +383,15 @@ void util_blitter_destroy(struct blitter_context *blitter)
if (ctx->fs_texfetch_stencil[i])
ctx->delete_fs_state(pipe, ctx->fs_texfetch_stencil[i]);
if (ctx->fs_texfetch_col_msaa[i])
ctx->delete_fs_state(pipe, ctx->fs_texfetch_col_msaa[i]);
if (ctx->fs_texfetch_depth_msaa[i])
ctx->delete_fs_state(pipe, ctx->fs_texfetch_depth_msaa[i]);
if (ctx->fs_texfetch_depthstencil_msaa[i])
ctx->delete_fs_state(pipe, ctx->fs_texfetch_depthstencil_msaa[i]);
if (ctx->fs_texfetch_stencil_msaa[i])
ctx->delete_fs_state(pipe, ctx->fs_texfetch_stencil_msaa[i]);
for (j = 0; j< Elements(ctx->fs_resolve[i]); j++)
for (f = 0; f < 2; f++)
if (ctx->fs_resolve[i][j][f])

View File

@@ -149,28 +149,6 @@ roundf(float x)
#endif /* _MSC_VER */
#ifdef PIPE_OS_ANDROID
static INLINE
double log2(double d)
{
return log(d) * (1.0 / M_LN2);
}
/* workaround a conflict with main/imports.h */
#ifdef log2f
#undef log2f
#endif
static INLINE
float log2f(float f)
{
return logf(f) * (float) (1.0 / M_LN2);
}
#endif
#if __STDC_VERSION__ < 199901L && (!defined(__cplusplus) || defined(_MSC_VER))
static INLINE long int
lrint(double d)

View File

@@ -136,6 +136,21 @@ u_prim_vertex_count(unsigned prim)
return (likely(prim < PIPE_PRIM_MAX)) ? &prim_table[prim] : NULL;
}
/**
* Given a vertex count, return the number of primitives.
* For polygons, return the number of triangles.
*/
static INLINE unsigned
u_prims_for_vertices(unsigned prim, unsigned num)
{
const struct u_prim_vertex_count *info = u_prim_vertex_count(prim);
if (num < info->min)
return 0;
return 1 + ((num - info->min) / info->incr);
}
static INLINE boolean u_validate_pipe_prim( unsigned pipe_prim, unsigned nr )
{
const struct u_prim_vertex_count *count = u_prim_vertex_count(pipe_prim);

View File

@@ -25,8 +25,8 @@ void u_default_transfer_inline_write( struct pipe_context *pipe,
usage |= PIPE_TRANSFER_WRITE;
/* transfer_inline_write implicitly discards the rewritten buffer range */
/* XXX this looks very broken for non-buffer resources having more than one dim. */
if (box->x == 0 && box->width == resource->width0) {
if (resource->target == PIPE_BUFFER &&
box->x == 0 && box->width == resource->width0) {
usage |= PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE;
} else {
usage |= PIPE_TRANSFER_DISCARD_RANGE;

View File

@@ -0,0 +1,44 @@
# Copyright (C) 2014 Emil Velikov <emil.l.velikov@gmail.com>
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included
# in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
LOCAL_PATH := $(call my-dir)
# get C_SOURCES
include $(LOCAL_PATH)/Makefile.sources
include $(CLEAR_VARS)
LOCAL_SRC_FILES := \
$(C_SOURCES) \
$(a2xx_SOURCES) \
$(a3xx_SOURCES)
LOCAL_CFLAGS := \
-Wno-packed-bitfield-compat
LOCAL_C_INCLUDES := \
$(LOCAL_PATH)/ir3 \
$(TARGET_OUT_HEADERS)/libdrm \
$(TARGET_OUT_HEADERS)/freedreno
LOCAL_MODULE := libmesa_pipe_freedreno
include $(GALLIUM_COMMON_MK)
include $(BUILD_STATIC_LIBRARY)

View File

@@ -5,8 +5,6 @@ include $(top_srcdir)/src/gallium/Automake.inc
AM_CFLAGS = \
-Wno-packed-bitfield-compat \
-I$(top_srcdir)/src/gallium/drivers/freedreno/a3xx \
-I$(top_srcdir)/src/gallium/drivers/freedreno/a2xx \
$(GALLIUM_DRIVER_CFLAGS) \
$(FREEDRENO_CFLAGS)

View File

@@ -3,6 +3,8 @@ C_SOURCES := \
freedreno_lowering.c \
freedreno_program.c \
freedreno_query.c \
freedreno_query_hw.c \
freedreno_query_sw.c \
freedreno_fence.c \
freedreno_resource.c \
freedreno_surface.c \
@@ -38,6 +40,7 @@ a3xx_SOURCES := \
a3xx/fd3_emit.c \
a3xx/fd3_gmem.c \
a3xx/fd3_program.c \
a3xx/fd3_query.c \
a3xx/fd3_rasterizer.c \
a3xx/fd3_screen.c \
a3xx/fd3_texture.c \

View File

@@ -10,11 +10,11 @@ git clone https://github.com/freedreno/envytools.git
The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno.xml ( 364 bytes, from 2013-11-30 14:47:15)
- /home/robclark/src/freedreno/envytools/rnndb/freedreno_copyright.xml ( 1453 bytes, from 2013-03-31 16:51:27)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32840 bytes, from 2014-01-05 14:44:21)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 9009 bytes, from 2014-01-11 16:56:35)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 12362 bytes, from 2014-01-07 14:47:36)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 56545 bytes, from 2014-02-26 16:32:11)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 8344 bytes, from 2013-11-30 14:49:47)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 9859 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14477 bytes, from 2014-05-16 11:51:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 58020 bytes, from 2014-06-13 17:29:47)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 26602 bytes, from 2014-06-13 17:28:10)
Copyright (C) 2013-2014 by the following authors:
- Rob Clark <robdclark@gmail.com> (robclark)
@@ -203,6 +203,15 @@ enum a2xx_rb_copy_sample_select {
SAMPLE_0123 = 6,
};
enum a2xx_rb_blend_opcode {
BLEND_DST_PLUS_SRC = 0,
BLEND_SRC_MINUS_DST = 1,
BLEND_MIN_DST_SRC = 2,
BLEND_MAX_DST_SRC = 3,
BLEND_DST_MINUS_SRC = 4,
BLEND_DST_PLUS_SRC_BIAS = 5,
};
enum adreno_mmu_clnt_beh {
BEH_NEVR = 0,
BEH_TRAN_RNG = 1,
@@ -996,7 +1005,7 @@ static inline uint32_t A2XX_RB_BLEND_CONTROL_COLOR_SRCBLEND(enum adreno_rb_blend
}
#define A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN__MASK 0x000000e0
#define A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN__SHIFT 5
static inline uint32_t A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN(enum adreno_rb_blend_opcode val)
static inline uint32_t A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN(enum a2xx_rb_blend_opcode val)
{
return ((val) << A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN__SHIFT) & A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN__MASK;
}
@@ -1014,7 +1023,7 @@ static inline uint32_t A2XX_RB_BLEND_CONTROL_ALPHA_SRCBLEND(enum adreno_rb_blend
}
#define A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN__MASK 0x00e00000
#define A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN__SHIFT 21
static inline uint32_t A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN(enum adreno_rb_blend_opcode val)
static inline uint32_t A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN(enum a2xx_rb_blend_opcode val)
{
return ((val) << A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN__SHIFT) & A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN__MASK;
}

View File

@@ -34,6 +34,27 @@
#include "fd2_context.h"
#include "fd2_util.h"
static enum a2xx_rb_blend_opcode
blend_func(unsigned func)
{
switch (func) {
case PIPE_BLEND_ADD:
return BLEND_DST_PLUS_SRC;
case PIPE_BLEND_MIN:
return BLEND_MIN_DST_SRC;
case PIPE_BLEND_MAX:
return BLEND_MAX_DST_SRC;
case PIPE_BLEND_SUBTRACT:
return BLEND_SRC_MINUS_DST;
case PIPE_BLEND_REVERSE_SUBTRACT:
return BLEND_DST_MINUS_SRC;
default:
DBG("invalid blend func: %x", func);
return 0;
}
}
void *
fd2_blend_state_create(struct pipe_context *pctx,
const struct pipe_blend_state *cso)
@@ -61,10 +82,10 @@ fd2_blend_state_create(struct pipe_context *pctx,
so->rb_blendcontrol =
A2XX_RB_BLEND_CONTROL_COLOR_SRCBLEND(fd_blend_factor(rt->rgb_src_factor)) |
A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN(fd_blend_func(rt->rgb_func)) |
A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN(blend_func(rt->rgb_func)) |
A2XX_RB_BLEND_CONTROL_COLOR_DESTBLEND(fd_blend_factor(rt->rgb_dst_factor)) |
A2XX_RB_BLEND_CONTROL_ALPHA_SRCBLEND(fd_blend_factor(rt->alpha_src_factor)) |
A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN(fd_blend_func(rt->alpha_func)) |
A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN(blend_func(rt->alpha_func)) |
A2XX_RB_BLEND_CONTROL_ALPHA_DESTBLEND(fd_blend_factor(rt->alpha_dst_factor));
if (rt->colormask & PIPE_MASK_R)

View File

@@ -125,7 +125,7 @@ emit_texture(struct fd_ringbuffer *ring, struct fd_context *ctx,
{
unsigned const_idx = fd2_get_const_idx(ctx, tex, samp_id);
static const struct fd2_sampler_stateobj dummy_sampler = {};
struct fd2_sampler_stateobj *sampler;
const struct fd2_sampler_stateobj *sampler;
struct fd2_pipe_sampler_view *view;
if (emitted & (1 << const_idx))

View File

@@ -10,11 +10,11 @@ git clone https://github.com/freedreno/envytools.git
The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno.xml ( 364 bytes, from 2013-11-30 14:47:15)
- /home/robclark/src/freedreno/envytools/rnndb/freedreno_copyright.xml ( 1453 bytes, from 2013-03-31 16:51:27)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32840 bytes, from 2014-01-05 14:44:21)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 9009 bytes, from 2014-01-11 16:56:35)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 12362 bytes, from 2014-01-07 14:47:36)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 56545 bytes, from 2014-02-26 16:32:11)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 8344 bytes, from 2013-11-30 14:49:47)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 9859 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14477 bytes, from 2014-05-16 11:51:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 58020 bytes, from 2014-06-13 17:29:47)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 26602 bytes, from 2014-06-13 17:28:10)
Copyright (C) 2013-2014 by the following authors:
- Rob Clark <robdclark@gmail.com> (robclark)
@@ -41,31 +41,11 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/
enum a3xx_render_mode {
RB_RENDERING_PASS = 0,
RB_TILING_PASS = 1,
RB_RESOLVE_PASS = 2,
};
enum a3xx_tile_mode {
LINEAR = 0,
TILE_32X32 = 2,
};
enum a3xx_threadmode {
MULTI = 0,
SINGLE = 1,
};
enum a3xx_instrbuffermode {
BUFFER = 1,
};
enum a3xx_threadsize {
TWO_QUADS = 0,
FOUR_QUADS = 1,
};
enum a3xx_state_block_id {
HLSQ_BLOCK_ID_TP_TEX = 2,
HLSQ_BLOCK_ID_TP_MIPMAP = 3,
@@ -180,12 +160,6 @@ enum a3xx_color_swap {
XYZW = 3,
};
enum a3xx_msaa_samples {
MSAA_ONE = 0,
MSAA_TWO = 1,
MSAA_FOUR = 2,
};
enum a3xx_sp_perfcounter_select {
SP_FS_CFLOW_INSTRUCTIONS = 12,
SP_FS_FULL_ALU_INSTRUCTIONS = 14,
@@ -212,21 +186,26 @@ enum a3xx_rop_code {
ROP_SET = 15,
};
enum adreno_rb_copy_control_mode {
RB_COPY_RESOLVE = 1,
RB_COPY_DEPTH_STENCIL = 5,
enum a3xx_rb_blend_opcode {
BLEND_DST_PLUS_SRC = 0,
BLEND_SRC_MINUS_DST = 1,
BLEND_DST_MINUS_SRC = 2,
BLEND_MIN_DST_SRC = 3,
BLEND_MAX_DST_SRC = 4,
};
enum a3xx_tex_filter {
A3XX_TEX_NEAREST = 0,
A3XX_TEX_LINEAR = 1,
A3XX_TEX_ANISO = 2,
};
enum a3xx_tex_clamp {
A3XX_TEX_REPEAT = 0,
A3XX_TEX_CLAMP_TO_EDGE = 1,
A3XX_TEX_MIRROR_REPEAT = 2,
A3XX_TEX_CLAMP_NONE = 3,
A3XX_TEX_CLAMP_TO_BORDER = 3,
A3XX_TEX_MIRROR_CLAMP = 4,
};
enum a3xx_tex_swiz {
@@ -337,6 +316,7 @@ enum a3xx_tex_type {
#define REG_A3XX_RBBM_INT_0_STATUS 0x00000064
#define REG_A3XX_RBBM_PERFCTR_CTL 0x00000080
#define A3XX_RBBM_PERFCTR_CTL_ENABLE 0x00000001
#define REG_A3XX_RBBM_PERFCTR_LOAD_CMD0 0x00000081
@@ -570,6 +550,10 @@ static inline uint32_t REG_A3XX_CP_PROTECT_REG(uint32_t i0) { return 0x00000460
#define REG_A3XX_CP_AHB_FAULT 0x0000054d
#define REG_A3XX_SP_GLOBAL_MEM_SIZE 0x00000e22
#define REG_A3XX_SP_GLOBAL_MEM_ADDR 0x00000e23
#define REG_A3XX_GRAS_CL_CLIP_CNTL 0x00002040
#define A3XX_GRAS_CL_CLIP_CNTL_IJ_PERSP_CENTER 0x00001000
#define A3XX_GRAS_CL_CLIP_CNTL_CLIP_DISABLE 0x00010000
@@ -644,8 +628,26 @@ static inline uint32_t A3XX_GRAS_CL_VPORT_ZSCALE(float val)
}
#define REG_A3XX_GRAS_SU_POINT_MINMAX 0x00002068
#define A3XX_GRAS_SU_POINT_MINMAX_MIN__MASK 0x0000ffff
#define A3XX_GRAS_SU_POINT_MINMAX_MIN__SHIFT 0
static inline uint32_t A3XX_GRAS_SU_POINT_MINMAX_MIN(float val)
{
return ((((uint32_t)(val * 8.0))) << A3XX_GRAS_SU_POINT_MINMAX_MIN__SHIFT) & A3XX_GRAS_SU_POINT_MINMAX_MIN__MASK;
}
#define A3XX_GRAS_SU_POINT_MINMAX_MAX__MASK 0xffff0000
#define A3XX_GRAS_SU_POINT_MINMAX_MAX__SHIFT 16
static inline uint32_t A3XX_GRAS_SU_POINT_MINMAX_MAX(float val)
{
return ((((uint32_t)(val * 8.0))) << A3XX_GRAS_SU_POINT_MINMAX_MAX__SHIFT) & A3XX_GRAS_SU_POINT_MINMAX_MAX__MASK;
}
#define REG_A3XX_GRAS_SU_POINT_SIZE 0x00002069
#define A3XX_GRAS_SU_POINT_SIZE__MASK 0xffffffff
#define A3XX_GRAS_SU_POINT_SIZE__SHIFT 0
static inline uint32_t A3XX_GRAS_SU_POINT_SIZE(float val)
{
return ((((uint32_t)(val * 8.0))) << A3XX_GRAS_SU_POINT_SIZE__SHIFT) & A3XX_GRAS_SU_POINT_SIZE__MASK;
}
#define REG_A3XX_GRAS_SU_POLY_OFFSET_SCALE 0x0000206c
#define A3XX_GRAS_SU_POLY_OFFSET_SCALE_VAL__MASK 0x00ffffff
@@ -885,7 +887,7 @@ static inline uint32_t A3XX_RB_MRT_BLEND_CONTROL_RGB_SRC_FACTOR(enum adreno_rb_b
}
#define A3XX_RB_MRT_BLEND_CONTROL_RGB_BLEND_OPCODE__MASK 0x000000e0
#define A3XX_RB_MRT_BLEND_CONTROL_RGB_BLEND_OPCODE__SHIFT 5
static inline uint32_t A3XX_RB_MRT_BLEND_CONTROL_RGB_BLEND_OPCODE(enum adreno_rb_blend_opcode val)
static inline uint32_t A3XX_RB_MRT_BLEND_CONTROL_RGB_BLEND_OPCODE(enum a3xx_rb_blend_opcode val)
{
return ((val) << A3XX_RB_MRT_BLEND_CONTROL_RGB_BLEND_OPCODE__SHIFT) & A3XX_RB_MRT_BLEND_CONTROL_RGB_BLEND_OPCODE__MASK;
}
@@ -903,7 +905,7 @@ static inline uint32_t A3XX_RB_MRT_BLEND_CONTROL_ALPHA_SRC_FACTOR(enum adreno_rb
}
#define A3XX_RB_MRT_BLEND_CONTROL_ALPHA_BLEND_OPCODE__MASK 0x00e00000
#define A3XX_RB_MRT_BLEND_CONTROL_ALPHA_BLEND_OPCODE__SHIFT 21
static inline uint32_t A3XX_RB_MRT_BLEND_CONTROL_ALPHA_BLEND_OPCODE(enum adreno_rb_blend_opcode val)
static inline uint32_t A3XX_RB_MRT_BLEND_CONTROL_ALPHA_BLEND_OPCODE(enum a3xx_rb_blend_opcode val)
{
return ((val) << A3XX_RB_MRT_BLEND_CONTROL_ALPHA_BLEND_OPCODE__SHIFT) & A3XX_RB_MRT_BLEND_CONTROL_ALPHA_BLEND_OPCODE__MASK;
}
@@ -986,12 +988,19 @@ static inline uint32_t A3XX_RB_COPY_CONTROL_MSAA_RESOLVE(enum a3xx_msaa_samples
{
return ((val) << A3XX_RB_COPY_CONTROL_MSAA_RESOLVE__SHIFT) & A3XX_RB_COPY_CONTROL_MSAA_RESOLVE__MASK;
}
#define A3XX_RB_COPY_CONTROL_DEPTHCLEAR 0x00000008
#define A3XX_RB_COPY_CONTROL_MODE__MASK 0x00000070
#define A3XX_RB_COPY_CONTROL_MODE__SHIFT 4
static inline uint32_t A3XX_RB_COPY_CONTROL_MODE(enum adreno_rb_copy_control_mode val)
{
return ((val) << A3XX_RB_COPY_CONTROL_MODE__SHIFT) & A3XX_RB_COPY_CONTROL_MODE__MASK;
}
#define A3XX_RB_COPY_CONTROL_FASTCLEAR__MASK 0x00000f00
#define A3XX_RB_COPY_CONTROL_FASTCLEAR__SHIFT 8
static inline uint32_t A3XX_RB_COPY_CONTROL_FASTCLEAR(uint32_t val)
{
return ((val) << A3XX_RB_COPY_CONTROL_FASTCLEAR__SHIFT) & A3XX_RB_COPY_CONTROL_FASTCLEAR__MASK;
}
#define A3XX_RB_COPY_CONTROL_GMEM_BASE__MASK 0xffffc000
#define A3XX_RB_COPY_CONTROL_GMEM_BASE__SHIFT 14
static inline uint32_t A3XX_RB_COPY_CONTROL_GMEM_BASE(uint32_t val)
@@ -1034,6 +1043,12 @@ static inline uint32_t A3XX_RB_COPY_DEST_INFO_SWAP(enum a3xx_color_swap val)
{
return ((val) << A3XX_RB_COPY_DEST_INFO_SWAP__SHIFT) & A3XX_RB_COPY_DEST_INFO_SWAP__MASK;
}
#define A3XX_RB_COPY_DEST_INFO_DITHER_MODE__MASK 0x00000c00
#define A3XX_RB_COPY_DEST_INFO_DITHER_MODE__SHIFT 10
static inline uint32_t A3XX_RB_COPY_DEST_INFO_DITHER_MODE(enum adreno_rb_dither_mode val)
{
return ((val) << A3XX_RB_COPY_DEST_INFO_DITHER_MODE__SHIFT) & A3XX_RB_COPY_DEST_INFO_DITHER_MODE__MASK;
}
#define A3XX_RB_COPY_DEST_INFO_COMPONENT_ENABLE__MASK 0x0003c000
#define A3XX_RB_COPY_DEST_INFO_COMPONENT_ENABLE__SHIFT 14
static inline uint32_t A3XX_RB_COPY_DEST_INFO_COMPONENT_ENABLE(uint32_t val)
@@ -1074,7 +1089,7 @@ static inline uint32_t A3XX_RB_DEPTH_INFO_DEPTH_FORMAT(enum adreno_rb_depth_form
#define A3XX_RB_DEPTH_INFO_DEPTH_BASE__SHIFT 11
static inline uint32_t A3XX_RB_DEPTH_INFO_DEPTH_BASE(uint32_t val)
{
return ((val >> 10) << A3XX_RB_DEPTH_INFO_DEPTH_BASE__SHIFT) & A3XX_RB_DEPTH_INFO_DEPTH_BASE__MASK;
return ((val >> 12) << A3XX_RB_DEPTH_INFO_DEPTH_BASE__SHIFT) & A3XX_RB_DEPTH_INFO_DEPTH_BASE__MASK;
}
#define REG_A3XX_RB_DEPTH_PITCH 0x00002103
@@ -1202,6 +1217,8 @@ static inline uint32_t A3XX_RB_WINDOW_OFFSET_Y(uint32_t val)
}
#define REG_A3XX_RB_SAMPLE_COUNT_CONTROL 0x00002110
#define A3XX_RB_SAMPLE_COUNT_CONTROL_RESET 0x00000001
#define A3XX_RB_SAMPLE_COUNT_CONTROL_COPY 0x00000002
#define REG_A3XX_RB_SAMPLE_COUNT_ADDR 0x00002111
@@ -1366,10 +1383,36 @@ static inline uint32_t A3XX_HLSQ_CONST_FSPRESV_RANGE_REG_ENDENTRY(uint32_t val)
}
#define REG_A3XX_HLSQ_CL_NDRANGE_0_REG 0x0000220a
#define A3XX_HLSQ_CL_NDRANGE_0_REG_WORKDIM__MASK 0x00000003
#define A3XX_HLSQ_CL_NDRANGE_0_REG_WORKDIM__SHIFT 0
static inline uint32_t A3XX_HLSQ_CL_NDRANGE_0_REG_WORKDIM(uint32_t val)
{
return ((val) << A3XX_HLSQ_CL_NDRANGE_0_REG_WORKDIM__SHIFT) & A3XX_HLSQ_CL_NDRANGE_0_REG_WORKDIM__MASK;
}
#define A3XX_HLSQ_CL_NDRANGE_0_REG_LOCALSIZE0__MASK 0x00000ffc
#define A3XX_HLSQ_CL_NDRANGE_0_REG_LOCALSIZE0__SHIFT 2
static inline uint32_t A3XX_HLSQ_CL_NDRANGE_0_REG_LOCALSIZE0(uint32_t val)
{
return ((val) << A3XX_HLSQ_CL_NDRANGE_0_REG_LOCALSIZE0__SHIFT) & A3XX_HLSQ_CL_NDRANGE_0_REG_LOCALSIZE0__MASK;
}
#define A3XX_HLSQ_CL_NDRANGE_0_REG_LOCALSIZE1__MASK 0x003ff000
#define A3XX_HLSQ_CL_NDRANGE_0_REG_LOCALSIZE1__SHIFT 12
static inline uint32_t A3XX_HLSQ_CL_NDRANGE_0_REG_LOCALSIZE1(uint32_t val)
{
return ((val) << A3XX_HLSQ_CL_NDRANGE_0_REG_LOCALSIZE1__SHIFT) & A3XX_HLSQ_CL_NDRANGE_0_REG_LOCALSIZE1__MASK;
}
#define A3XX_HLSQ_CL_NDRANGE_0_REG_LOCALSIZE2__MASK 0xffc00000
#define A3XX_HLSQ_CL_NDRANGE_0_REG_LOCALSIZE2__SHIFT 22
static inline uint32_t A3XX_HLSQ_CL_NDRANGE_0_REG_LOCALSIZE2(uint32_t val)
{
return ((val) << A3XX_HLSQ_CL_NDRANGE_0_REG_LOCALSIZE2__SHIFT) & A3XX_HLSQ_CL_NDRANGE_0_REG_LOCALSIZE2__MASK;
}
#define REG_A3XX_HLSQ_CL_NDRANGE_1_REG 0x0000220b
static inline uint32_t REG_A3XX_HLSQ_CL_GLOBAL_WORK(uint32_t i0) { return 0x0000220b + 0x2*i0; }
#define REG_A3XX_HLSQ_CL_NDRANGE_2_REG 0x0000220c
static inline uint32_t REG_A3XX_HLSQ_CL_GLOBAL_WORK_SIZE(uint32_t i0) { return 0x0000220b + 0x2*i0; }
static inline uint32_t REG_A3XX_HLSQ_CL_GLOBAL_WORK_OFFSET(uint32_t i0) { return 0x0000220c + 0x2*i0; }
#define REG_A3XX_HLSQ_CL_CONTROL_0_REG 0x00002211
@@ -1377,7 +1420,9 @@ static inline uint32_t A3XX_HLSQ_CONST_FSPRESV_RANGE_REG_ENDENTRY(uint32_t val)
#define REG_A3XX_HLSQ_CL_KERNEL_CONST_REG 0x00002214
#define REG_A3XX_HLSQ_CL_KERNEL_GROUP_X_REG 0x00002215
static inline uint32_t REG_A3XX_HLSQ_CL_KERNEL_GROUP(uint32_t i0) { return 0x00002215 + 0x1*i0; }
static inline uint32_t REG_A3XX_HLSQ_CL_KERNEL_GROUP_RATIO(uint32_t i0) { return 0x00002215 + 0x1*i0; }
#define REG_A3XX_HLSQ_CL_KERNEL_GROUP_Y_REG 0x00002216
@@ -1492,6 +1537,12 @@ static inline uint32_t A3XX_VFD_DECODE_INSTR_REGID(uint32_t val)
{
return ((val) << A3XX_VFD_DECODE_INSTR_REGID__SHIFT) & A3XX_VFD_DECODE_INSTR_REGID__MASK;
}
#define A3XX_VFD_DECODE_INSTR_SWAP__MASK 0x00c00000
#define A3XX_VFD_DECODE_INSTR_SWAP__SHIFT 22
static inline uint32_t A3XX_VFD_DECODE_INSTR_SWAP(enum a3xx_color_swap val)
{
return ((val) << A3XX_VFD_DECODE_INSTR_SWAP__SHIFT) & A3XX_VFD_DECODE_INSTR_SWAP__MASK;
}
#define A3XX_VFD_DECODE_INSTR_SHIFTCNT__MASK 0x1f000000
#define A3XX_VFD_DECODE_INSTR_SHIFTCNT__SHIFT 24
static inline uint32_t A3XX_VFD_DECODE_INSTR_SHIFTCNT(uint32_t val)
@@ -1624,6 +1675,7 @@ static inline uint32_t A3XX_SP_VS_CTRL_REG0_THREADSIZE(enum a3xx_threadsize val)
}
#define A3XX_SP_VS_CTRL_REG0_SUPERTHREADMODE 0x00200000
#define A3XX_SP_VS_CTRL_REG0_PIXLODENABLE 0x00400000
#define A3XX_SP_VS_CTRL_REG0_COMPUTEMODE 0x00800000
#define A3XX_SP_VS_CTRL_REG0_LENGTH__MASK 0xff000000
#define A3XX_SP_VS_CTRL_REG0_LENGTH__SHIFT 24
static inline uint32_t A3XX_SP_VS_CTRL_REG0_LENGTH(uint32_t val)
@@ -1797,6 +1849,7 @@ static inline uint32_t A3XX_SP_FS_CTRL_REG0_THREADSIZE(enum a3xx_threadsize val)
}
#define A3XX_SP_FS_CTRL_REG0_SUPERTHREADMODE 0x00200000
#define A3XX_SP_FS_CTRL_REG0_PIXLODENABLE 0x00400000
#define A3XX_SP_FS_CTRL_REG0_COMPUTEMODE 0x00800000
#define A3XX_SP_FS_CTRL_REG0_LENGTH__MASK 0xff000000
#define A3XX_SP_FS_CTRL_REG0_LENGTH__SHIFT 24
static inline uint32_t A3XX_SP_FS_CTRL_REG0_LENGTH(uint32_t val)
@@ -1976,6 +2029,42 @@ static inline uint32_t A3XX_TPL1_TP_FS_TEX_OFFSET_BASETABLEPTR(uint32_t val)
#define REG_A3XX_VBIF_OUT_AXI_AOOO 0x0000305f
#define REG_A3XX_VBIF_PERF_CNT_EN 0x00003070
#define A3XX_VBIF_PERF_CNT_EN_CNT0 0x00000001
#define A3XX_VBIF_PERF_CNT_EN_CNT1 0x00000002
#define A3XX_VBIF_PERF_CNT_EN_PWRCNT0 0x00000004
#define A3XX_VBIF_PERF_CNT_EN_PWRCNT1 0x00000008
#define A3XX_VBIF_PERF_CNT_EN_PWRCNT2 0x00000010
#define REG_A3XX_VBIF_PERF_CNT_CLR 0x00003071
#define A3XX_VBIF_PERF_CNT_CLR_CNT0 0x00000001
#define A3XX_VBIF_PERF_CNT_CLR_CNT1 0x00000002
#define A3XX_VBIF_PERF_CNT_CLR_PWRCNT0 0x00000004
#define A3XX_VBIF_PERF_CNT_CLR_PWRCNT1 0x00000008
#define A3XX_VBIF_PERF_CNT_CLR_PWRCNT2 0x00000010
#define REG_A3XX_VBIF_PERF_CNT_SEL 0x00003072
#define REG_A3XX_VBIF_PERF_CNT0_LO 0x00003073
#define REG_A3XX_VBIF_PERF_CNT0_HI 0x00003074
#define REG_A3XX_VBIF_PERF_CNT1_LO 0x00003075
#define REG_A3XX_VBIF_PERF_CNT1_HI 0x00003076
#define REG_A3XX_VBIF_PERF_PWR_CNT0_LO 0x00003077
#define REG_A3XX_VBIF_PERF_PWR_CNT0_HI 0x00003078
#define REG_A3XX_VBIF_PERF_PWR_CNT1_LO 0x00003079
#define REG_A3XX_VBIF_PERF_PWR_CNT1_HI 0x0000307a
#define REG_A3XX_VBIF_PERF_PWR_CNT2_LO 0x0000307b
#define REG_A3XX_VBIF_PERF_PWR_CNT2_HI 0x0000307c
#define REG_A3XX_VSC_BIN_SIZE 0x00000c01
#define A3XX_VSC_BIN_SIZE_WIDTH__MASK 0x0000001f
#define A3XX_VSC_BIN_SIZE_WIDTH__SHIFT 0
@@ -2249,6 +2338,12 @@ static inline uint32_t A3XX_TEX_SAMP_0_WRAP_R(enum a3xx_tex_clamp val)
{
return ((val) << A3XX_TEX_SAMP_0_WRAP_R__SHIFT) & A3XX_TEX_SAMP_0_WRAP_R__MASK;
}
#define A3XX_TEX_SAMP_0_COMPARE_FUNC__MASK 0x00700000
#define A3XX_TEX_SAMP_0_COMPARE_FUNC__SHIFT 20
static inline uint32_t A3XX_TEX_SAMP_0_COMPARE_FUNC(enum adreno_compare_func val)
{
return ((val) << A3XX_TEX_SAMP_0_COMPARE_FUNC__SHIFT) & A3XX_TEX_SAMP_0_COMPARE_FUNC__MASK;
}
#define A3XX_TEX_SAMP_0_UNNORM_COORDS 0x80000000
#define REG_A3XX_TEX_SAMP_1 0x00000001
@@ -2267,6 +2362,7 @@ static inline uint32_t A3XX_TEX_SAMP_1_MIN_LOD(float val)
#define REG_A3XX_TEX_CONST_0 0x00000000
#define A3XX_TEX_CONST_0_TILED 0x00000001
#define A3XX_TEX_CONST_0_SRGB 0x00000004
#define A3XX_TEX_CONST_0_SWIZ_X__MASK 0x00000070
#define A3XX_TEX_CONST_0_SWIZ_X__SHIFT 4
static inline uint32_t A3XX_TEX_CONST_0_SWIZ_X(enum a3xx_tex_swiz val)
@@ -2303,6 +2399,7 @@ static inline uint32_t A3XX_TEX_CONST_0_FMT(enum a3xx_tex_fmt val)
{
return ((val) << A3XX_TEX_CONST_0_FMT__SHIFT) & A3XX_TEX_CONST_0_FMT__MASK;
}
#define A3XX_TEX_CONST_0_NOCONVERT 0x20000000
#define A3XX_TEX_CONST_0_TYPE__MASK 0xc0000000
#define A3XX_TEX_CONST_0_TYPE__SHIFT 30
static inline uint32_t A3XX_TEX_CONST_0_TYPE(enum a3xx_tex_type val)

View File

@@ -34,6 +34,27 @@
#include "fd3_context.h"
#include "fd3_util.h"
static enum a3xx_rb_blend_opcode
blend_func(unsigned func)
{
switch (func) {
case PIPE_BLEND_ADD:
return BLEND_DST_PLUS_SRC;
case PIPE_BLEND_MIN:
return BLEND_MIN_DST_SRC;
case PIPE_BLEND_MAX:
return BLEND_MAX_DST_SRC;
case PIPE_BLEND_SUBTRACT:
return BLEND_SRC_MINUS_DST;
case PIPE_BLEND_REVERSE_SUBTRACT:
return BLEND_DST_MINUS_SRC;
default:
DBG("invalid blend func: %x", func);
return 0;
}
}
void *
fd3_blend_state_create(struct pipe_context *pctx,
const struct pipe_blend_state *cso)
@@ -80,10 +101,10 @@ fd3_blend_state_create(struct pipe_context *pctx,
so->rb_mrt[i].blend_control =
A3XX_RB_MRT_BLEND_CONTROL_RGB_SRC_FACTOR(fd_blend_factor(rt->rgb_src_factor)) |
A3XX_RB_MRT_BLEND_CONTROL_RGB_BLEND_OPCODE(fd_blend_func(rt->rgb_func)) |
A3XX_RB_MRT_BLEND_CONTROL_RGB_BLEND_OPCODE(blend_func(rt->rgb_func)) |
A3XX_RB_MRT_BLEND_CONTROL_RGB_DEST_FACTOR(fd_blend_factor(rt->rgb_dst_factor)) |
A3XX_RB_MRT_BLEND_CONTROL_ALPHA_SRC_FACTOR(fd_blend_factor(rt->alpha_src_factor)) |
A3XX_RB_MRT_BLEND_CONTROL_ALPHA_BLEND_OPCODE(fd_blend_func(rt->alpha_func)) |
A3XX_RB_MRT_BLEND_CONTROL_ALPHA_BLEND_OPCODE(blend_func(rt->alpha_func)) |
A3XX_RB_MRT_BLEND_CONTROL_ALPHA_DEST_FACTOR(fd_blend_factor(rt->alpha_dst_factor)) |
A3XX_RB_MRT_BLEND_CONTROL_CLAMP_ENABLE;

View File

@@ -1074,77 +1074,154 @@ trans_arl(const struct instr_translater *t,
add_src_reg(ctx, instr, tmp_src, chan)->flags |= IR3_REG_HALF;
}
/* texture fetch/sample instructions: */
static void
trans_samp(const struct instr_translater *t,
struct fd3_compile_context *ctx,
/*
* texture fetch/sample instructions:
*/
struct tex_info {
int8_t order[4];
unsigned src_wrmask, flags;
};
static const struct tex_info *
get_tex_info(struct fd3_compile_context *ctx,
struct tgsi_full_instruction *inst)
{
struct ir3_instruction *instr;
struct tgsi_src_register *coord = &inst->Src[0].Register;
struct tgsi_src_register *samp = &inst->Src[1].Register;
unsigned tex = inst->Texture.Texture;
int8_t *order;
unsigned i, flags = 0, src_wrmask;
bool needs_mov = false;
static const struct tex_info tex1d = {
.order = { 0, -1, -1, -1 }, /* coord.x */
.src_wrmask = TGSI_WRITEMASK_XY,
.flags = 0,
};
static const struct tex_info tex1ds = {
.order = { 0, -1, 2, -1 }, /* coord.xz */
.src_wrmask = TGSI_WRITEMASK_XYZ,
.flags = IR3_INSTR_S,
};
static const struct tex_info tex2d = {
.order = { 0, 1, -1, -1 }, /* coord.xy */
.src_wrmask = TGSI_WRITEMASK_XY,
.flags = 0,
};
static const struct tex_info tex2ds = {
.order = { 0, 1, 2, -1 }, /* coord.xyz */
.src_wrmask = TGSI_WRITEMASK_XYZ,
.flags = IR3_INSTR_S,
};
static const struct tex_info tex3d = {
.order = { 0, 1, 2, -1 }, /* coord.xyz */
.src_wrmask = TGSI_WRITEMASK_XYZ,
.flags = IR3_INSTR_3D,
};
static const struct tex_info tex3ds = {
.order = { 0, 1, 2, 3 }, /* coord.xyzw */
.src_wrmask = TGSI_WRITEMASK_XYZW,
.flags = IR3_INSTR_S | IR3_INSTR_3D,
};
static const struct tex_info txp1d = {
.order = { 0, -1, 3, -1 }, /* coord.xw */
.src_wrmask = TGSI_WRITEMASK_XYZ,
.flags = IR3_INSTR_P,
};
static const struct tex_info txp1ds = {
.order = { 0, -1, 2, 3 }, /* coord.xzw */
.src_wrmask = TGSI_WRITEMASK_XYZW,
.flags = IR3_INSTR_P | IR3_INSTR_S,
};
static const struct tex_info txp2d = {
.order = { 0, 1, 3, -1 }, /* coord.xyw */
.src_wrmask = TGSI_WRITEMASK_XYZ,
.flags = IR3_INSTR_P,
};
static const struct tex_info txp2ds = {
.order = { 0, 1, 2, 3 }, /* coord.xyzw */
.src_wrmask = TGSI_WRITEMASK_XYZW,
.flags = IR3_INSTR_P | IR3_INSTR_S,
};
static const struct tex_info txp3d = {
.order = { 0, 1, 2, 3 }, /* coord.xyzw */
.src_wrmask = TGSI_WRITEMASK_XYZW,
.flags = IR3_INSTR_P | IR3_INSTR_3D,
};
switch (t->arg) {
unsigned tex = inst->Texture.Texture;
switch (inst->Instruction.Opcode) {
case TGSI_OPCODE_TEX:
switch (tex) {
case TGSI_TEXTURE_1D:
return &tex1d;
case TGSI_TEXTURE_SHADOW1D:
return &tex1ds;
case TGSI_TEXTURE_2D:
case TGSI_TEXTURE_RECT:
order = (int8_t[4]){ 0, 1, -1, -1 };
src_wrmask = TGSI_WRITEMASK_XY;
break;
return &tex2d;
case TGSI_TEXTURE_SHADOW2D:
case TGSI_TEXTURE_SHADOWRECT:
return &tex2ds;
case TGSI_TEXTURE_3D:
case TGSI_TEXTURE_CUBE:
order = (int8_t[4]){ 0, 1, 2, -1 };
src_wrmask = TGSI_WRITEMASK_XYZ;
flags |= IR3_INSTR_3D;
break;
return &tex3d;
case TGSI_TEXTURE_SHADOWCUBE:
return &tex3ds;
default:
compile_error(ctx, "unknown texture type: %s\n",
tgsi_texture_names[tex]);
break;
return NULL;
}
break;
case TGSI_OPCODE_TXP:
switch (tex) {
case TGSI_TEXTURE_1D:
return &txp1d;
case TGSI_TEXTURE_SHADOW1D:
return &txp1ds;
case TGSI_TEXTURE_2D:
case TGSI_TEXTURE_RECT:
order = (int8_t[4]){ 0, 1, 3, -1 };
src_wrmask = TGSI_WRITEMASK_XYZ;
break;
return &txp2d;
case TGSI_TEXTURE_SHADOW2D:
case TGSI_TEXTURE_SHADOWRECT:
return &txp2ds;
case TGSI_TEXTURE_3D:
case TGSI_TEXTURE_CUBE:
order = (int8_t[4]){ 0, 1, 2, 3 };
src_wrmask = TGSI_WRITEMASK_XYZW;
flags |= IR3_INSTR_3D;
break;
return &txp3d;
default:
compile_error(ctx, "unknown texture type: %s\n",
tgsi_texture_names[tex]);
break;
}
flags |= IR3_INSTR_P;
break;
default:
compile_assert(ctx, 0);
break;
}
compile_assert(ctx, 0);
return NULL;
}
static struct tgsi_src_register *
get_tex_coord(struct fd3_compile_context *ctx,
struct tgsi_full_instruction *inst,
const struct tex_info *tinf)
{
struct tgsi_src_register *coord = &inst->Src[0].Register;
struct ir3_instruction *instr;
unsigned tex = inst->Texture.Texture;
bool needs_mov = false;
unsigned i;
/* cat5 instruction cannot seem to handle const or relative: */
if (is_rel_or_const(coord))
needs_mov = true;
/* 1D textures we fix up w/ 0.0 as 2nd coord: */
if ((tex == TGSI_TEXTURE_1D) || (tex == TGSI_TEXTURE_SHADOW1D))
needs_mov = true;
/* The texture sample instructions need to coord in successive
* registers/components (ie. src.xy but not src.yx). And TXP
* needs the .w component in .z for 2D.. so in some cases we
* might need to emit some mov instructions to shuffle things
* around:
*/
for (i = 1; (i < 4) && (order[i] >= 0) && !needs_mov; i++)
if (src_swiz(coord, i) != (src_swiz(coord, 0) + order[i]))
for (i = 1; (i < 4) && (tinf->order[i] >= 0) && !needs_mov; i++)
if (src_swiz(coord, i) != (src_swiz(coord, 0) + tinf->order[i]))
needs_mov = true;
if (needs_mov) {
@@ -1157,28 +1234,55 @@ trans_samp(const struct instr_translater *t,
/* need to move things around: */
tmp_src = get_internal_temp(ctx, &tmp_dst);
for (j = 0; (j < 4) && (order[j] >= 0); j++) {
instr = instr_create(ctx, 1, 0);
for (j = 0; j < 4; j++) {
if (tinf->order[j] < 0)
continue;
instr = instr_create(ctx, 1, 0); /* mov */
instr->cat1.src_type = type_mov;
instr->cat1.dst_type = type_mov;
add_dst_reg(ctx, instr, &tmp_dst, j);
add_src_reg(ctx, instr, coord,
src_swiz(coord, order[j]));
src_swiz(coord, tinf->order[j]));
}
/* fix up .y coord: */
if ((tex == TGSI_TEXTURE_1D) ||
(tex == TGSI_TEXTURE_SHADOW1D)) {
instr = instr_create(ctx, 1, 0); /* mov */
instr->cat1.src_type = type_mov;
instr->cat1.dst_type = type_mov;
add_dst_reg(ctx, instr, &tmp_dst, 1); /* .y */
ir3_reg_create(instr, 0, IR3_REG_IMMED)->fim_val = 0.5;
}
coord = tmp_src;
}
return coord;
}
static void
trans_samp(const struct instr_translater *t,
struct fd3_compile_context *ctx,
struct tgsi_full_instruction *inst)
{
struct ir3_instruction *instr;
struct tgsi_dst_register *dst = &inst->Dst[0].Register;
struct tgsi_src_register *coord;
struct tgsi_src_register *samp = &inst->Src[1].Register;
const struct tex_info *tinf;
tinf = get_tex_info(ctx, inst);
coord = get_tex_coord(ctx, inst, tinf);
instr = instr_create(ctx, 5, t->opc);
instr->cat5.type = get_ftype(ctx);
instr->cat5.samp = samp->Index;
instr->cat5.tex = samp->Index;
instr->flags |= flags;
instr->flags |= tinf->flags;
add_dst_reg_wrmask(ctx, instr, &inst->Dst[0].Register, 0,
inst->Dst[0].Register.WriteMask);
add_src_reg_wrmask(ctx, instr, coord, coord->SwizzleX, src_wrmask);
add_dst_reg_wrmask(ctx, instr, dst, 0, dst->WriteMask);
add_src_reg_wrmask(ctx, instr, coord, coord->SwizzleX, tinf->src_wrmask);
}
/*
@@ -1231,15 +1335,19 @@ trans_cmp(const struct instr_translater *t,
switch (t->tgsi_opc) {
case TGSI_OPCODE_SEQ:
case TGSI_OPCODE_FSEQ:
condition = IR3_COND_EQ;
break;
case TGSI_OPCODE_SNE:
case TGSI_OPCODE_FSNE:
condition = IR3_COND_NE;
break;
case TGSI_OPCODE_SGE:
case TGSI_OPCODE_FSGE:
condition = IR3_COND_GE;
break;
case TGSI_OPCODE_SLT:
case TGSI_OPCODE_FSLT:
condition = IR3_COND_LT;
break;
case TGSI_OPCODE_SLE:
@@ -1269,11 +1377,15 @@ trans_cmp(const struct instr_translater *t,
switch (t->tgsi_opc) {
case TGSI_OPCODE_SEQ:
case TGSI_OPCODE_FSEQ:
case TGSI_OPCODE_SGE:
case TGSI_OPCODE_FSGE:
case TGSI_OPCODE_SLE:
case TGSI_OPCODE_SNE:
case TGSI_OPCODE_FSNE:
case TGSI_OPCODE_SGT:
case TGSI_OPCODE_SLT:
case TGSI_OPCODE_FSLT:
/* cov.u16f16 dst, tmp0 */
instr = instr_create(ctx, 1, 0);
instr->cat1.src_type = get_utype(ctx);
@@ -1293,6 +1405,96 @@ trans_cmp(const struct instr_translater *t,
put_dst(ctx, inst, dst);
}
/*
* USNE(a,b) = (a != b) ? 1 : 0
* cmps.u32.ne dst, a, b
*
* USEQ(a,b) = (a == b) ? 1 : 0
* cmps.u32.eq dst, a, b
*
* ISGE(a,b) = (a > b) ? 1 : 0
* cmps.s32.ge dst, a, b
*
* USGE(a,b) = (a > b) ? 1 : 0
* cmps.u32.ge dst, a, b
*
* ISLT(a,b) = (a < b) ? 1 : 0
* cmps.s32.lt dst, a, b
*
* USLT(a,b) = (a < b) ? 1 : 0
* cmps.u32.lt dst, a, b
*
* UCMP(a,b,c) = (a < 0) ? b : c
* cmps.u32.lt tmp0, a, {0}
* sel.b16 dst, b, tmp0, c
*/
static void
trans_icmp(const struct instr_translater *t,
struct fd3_compile_context *ctx,
struct tgsi_full_instruction *inst)
{
struct ir3_instruction *instr;
struct tgsi_dst_register *dst = get_dst(ctx, inst);
struct tgsi_src_register constval0;
struct tgsi_src_register *a0, *a1, *a2;
unsigned condition;
a0 = &inst->Src[0].Register; /* a */
a1 = &inst->Src[1].Register; /* b */
switch (t->tgsi_opc) {
case TGSI_OPCODE_USNE:
condition = IR3_COND_NE;
break;
case TGSI_OPCODE_USEQ:
condition = IR3_COND_EQ;
break;
case TGSI_OPCODE_ISGE:
case TGSI_OPCODE_USGE:
condition = IR3_COND_GE;
break;
case TGSI_OPCODE_ISLT:
case TGSI_OPCODE_USLT:
condition = IR3_COND_LT;
break;
case TGSI_OPCODE_UCMP:
get_immediate(ctx, &constval0, 0);
a0 = &inst->Src[0].Register; /* a */
a1 = &constval0; /* {0} */
condition = IR3_COND_LT;
break;
default:
compile_assert(ctx, 0);
return;
}
if (is_const(a0) && is_const(a1))
a0 = get_unconst(ctx, a0);
if (t->tgsi_opc == TGSI_OPCODE_UCMP) {
struct tgsi_dst_register tmp_dst;
struct tgsi_src_register *tmp_src;
tmp_src = get_internal_temp(ctx, &tmp_dst);
/* cmps.u32.lt tmp, a0, a1 */
instr = instr_create(ctx, 2, t->opc);
instr->cat2.condition = condition;
vectorize(ctx, instr, &tmp_dst, 2, a0, 0, a1, 0);
a1 = &inst->Src[1].Register;
a2 = &inst->Src[2].Register;
/* sel.{b32,b16} dst, src2, tmp, src1 */
instr = instr_create(ctx, 3, OPC_SEL_B32);
vectorize(ctx, instr, dst, 3, a1, 0, tmp_src, 0, a2, 0);
} else {
/* cmps.{u32,s32}.<cond> dst, a0, a1 */
instr = instr_create(ctx, 2, t->opc);
instr->cat2.condition = condition;
vectorize(ctx, instr, dst, 2, a0, 0, a1, 0);
}
put_dst(ctx, inst, dst);
}
/*
* Conditional / Flow control
*/
@@ -1533,7 +1735,7 @@ trans_endif(const struct instr_translater *t,
}
/*
* Kill / Kill-if
* Kill
*/
static void
@@ -1579,6 +1781,76 @@ trans_kill(const struct instr_translater *t,
ctx->kill[ctx->kill_count++] = instr;
}
/*
* Kill-If
*/
static void
trans_killif(const struct instr_translater *t,
struct fd3_compile_context *ctx,
struct tgsi_full_instruction *inst)
{
struct tgsi_src_register *src = &inst->Src[0].Register;
struct ir3_instruction *instr, *immed, *cond = NULL;
bool inv = false;
immed = create_immed(ctx, 0.0);
/* cmps.f.ne p0.x, cond, {0.0} */
instr = instr_create(ctx, 2, OPC_CMPS_F);
instr->cat2.condition = IR3_COND_NE;
ir3_reg_create(instr, regid(REG_P0, 0), 0);
ir3_reg_create(instr, 0, IR3_REG_SSA)->instr = immed;
add_src_reg(ctx, instr, src, src->SwizzleX);
cond = instr;
/* kill p0.x */
instr = instr_create(ctx, 0, OPC_KILL);
instr->cat0.inv = inv;
ir3_reg_create(instr, 0, 0); /* dummy dst */
ir3_reg_create(instr, 0, IR3_REG_SSA)->instr = cond;
ctx->kill[ctx->kill_count++] = instr;
}
/*
* I2F / U2F / F2I / F2U
*/
static void
trans_cov(const struct instr_translater *t,
struct fd3_compile_context *ctx,
struct tgsi_full_instruction *inst)
{
struct ir3_instruction *instr;
struct tgsi_dst_register *dst = get_dst(ctx, inst);
struct tgsi_src_register *src = &inst->Src[0].Register;
// cov.f32s32 dst, tmp0 /
instr = instr_create(ctx, 1, 0);
switch (t->tgsi_opc) {
case TGSI_OPCODE_U2F:
instr->cat1.src_type = TYPE_U32;
instr->cat1.dst_type = TYPE_F32;
break;
case TGSI_OPCODE_I2F:
instr->cat1.src_type = TYPE_S32;
instr->cat1.dst_type = TYPE_F32;
break;
case TGSI_OPCODE_F2U:
instr->cat1.src_type = TYPE_F32;
instr->cat1.dst_type = TYPE_U32;
break;
case TGSI_OPCODE_F2I:
instr->cat1.src_type = TYPE_F32;
instr->cat1.dst_type = TYPE_S32;
break;
}
vectorize(ctx, instr, dst, 1, src, 0);
}
/*
* Handlers for TGSI instructions which do have 1:1 mapping to native
* instructions:
@@ -1616,9 +1888,11 @@ instr_cat2(const struct instr_translater *t,
switch (t->tgsi_opc) {
case TGSI_OPCODE_ABS:
case TGSI_OPCODE_IABS:
src0_flags = IR3_REG_ABS;
break;
case TGSI_OPCODE_SUB:
case TGSI_OPCODE_INEG:
src1_flags = IR3_REG_NEGATE;
break;
}
@@ -1724,6 +1998,22 @@ static const struct instr_translater translaters[TGSI_OPCODE_LAST] = {
INSTR(SUB, instr_cat2, .opc = OPC_ADD_F),
INSTR(MIN, instr_cat2, .opc = OPC_MIN_F),
INSTR(MAX, instr_cat2, .opc = OPC_MAX_F),
INSTR(UADD, instr_cat2, .opc = OPC_ADD_U),
INSTR(IMIN, instr_cat2, .opc = OPC_MIN_S),
INSTR(UMIN, instr_cat2, .opc = OPC_MIN_U),
INSTR(IMAX, instr_cat2, .opc = OPC_MAX_S),
INSTR(UMAX, instr_cat2, .opc = OPC_MAX_U),
INSTR(AND, instr_cat2, .opc = OPC_AND_B),
INSTR(OR, instr_cat2, .opc = OPC_OR_B),
INSTR(NOT, instr_cat2, .opc = OPC_NOT_B),
INSTR(XOR, instr_cat2, .opc = OPC_XOR_B),
INSTR(UMUL, instr_cat2, .opc = OPC_MUL_U),
INSTR(SHL, instr_cat2, .opc = OPC_SHL_B),
INSTR(USHR, instr_cat2, .opc = OPC_SHR_B),
INSTR(ISHR, instr_cat2, .opc = OPC_ASHR_B),
INSTR(IABS, instr_cat2, .opc = OPC_ABSNEG_S),
INSTR(INEG, instr_cat2, .opc = OPC_ABSNEG_S),
INSTR(AND, instr_cat2, .opc = OPC_AND_B),
INSTR(MAD, instr_cat3, .opc = OPC_MAD_F32, .hopc = OPC_MAD_F16),
INSTR(TRUNC, instr_cat2, .opc = OPC_TRUNC_F),
INSTR(CLAMP, trans_clamp),
@@ -1741,16 +2031,33 @@ static const struct instr_translater translaters[TGSI_OPCODE_LAST] = {
INSTR(TXP, trans_samp, .opc = OPC_SAM, .arg = TGSI_OPCODE_TXP),
INSTR(SGT, trans_cmp),
INSTR(SLT, trans_cmp),
INSTR(FSLT, trans_cmp),
INSTR(SGE, trans_cmp),
INSTR(FSGE, trans_cmp),
INSTR(SLE, trans_cmp),
INSTR(SNE, trans_cmp),
INSTR(FSNE, trans_cmp),
INSTR(SEQ, trans_cmp),
INSTR(FSEQ, trans_cmp),
INSTR(CMP, trans_cmp),
INSTR(USNE, trans_icmp, .opc = OPC_CMPS_U),
INSTR(USEQ, trans_icmp, .opc = OPC_CMPS_U),
INSTR(ISGE, trans_icmp, .opc = OPC_CMPS_S),
INSTR(USGE, trans_icmp, .opc = OPC_CMPS_U),
INSTR(ISLT, trans_icmp, .opc = OPC_CMPS_S),
INSTR(USLT, trans_icmp, .opc = OPC_CMPS_U),
INSTR(UCMP, trans_icmp, .opc = OPC_CMPS_U),
INSTR(IF, trans_if),
INSTR(UIF, trans_if),
INSTR(ELSE, trans_else),
INSTR(ENDIF, trans_endif),
INSTR(END, instr_cat0, .opc = OPC_END),
INSTR(KILL, trans_kill, .opc = OPC_KILL),
INSTR(KILL_IF, trans_killif, .opc = OPC_KILL),
INSTR(I2F, trans_cov),
INSTR(U2F, trans_cov),
INSTR(F2I, trans_cov),
INSTR(F2U, trans_cov),
};
static fd3_semantic
@@ -1935,6 +2242,8 @@ decl_in(struct fd3_compile_context *ctx, struct tgsi_full_declaration *decl)
DBG("decl in -> r%d", i);
compile_assert(ctx, n < ARRAY_SIZE(so->inputs));
so->inputs[n].semantic = decl_semantic(&decl->Semantic);
so->inputs[n].compmask = (1 << ncomp) - 1;
so->inputs[n].regid = r;
@@ -2024,6 +2333,8 @@ decl_out(struct fd3_compile_context *ctx, struct tgsi_full_declaration *decl)
ncomp = 4;
compile_assert(ctx, n < ARRAY_SIZE(so->outputs));
so->outputs[n].semantic = decl_semantic(&decl->Semantic);
so->outputs[n].regid = regid(i, comp);
@@ -2147,6 +2458,7 @@ compile_instructions(struct fd3_compile_context *ctx)
struct tgsi_full_immediate *imm =
&ctx->parser.FullToken.FullImmediate;
unsigned n = ctx->so->immediates_count++;
compile_assert(ctx, n < ARRAY_SIZE(ctx->so->immediates));
memcpy(ctx->so->immediates[n].val, imm->u, 16);
break;
}

View File

@@ -1324,6 +1324,8 @@ decl_in(struct fd3_compile_context *ctx, struct tgsi_full_declaration *decl)
DBG("decl in -> r%d", i + base); // XXX
compile_assert(ctx, n < ARRAY_SIZE(so->inputs));
so->inputs[n].semantic = decl_semantic(&decl->Semantic);
so->inputs[n].compmask = (1 << ncomp) - 1;
so->inputs[n].ncomp = ncomp;
@@ -1410,6 +1412,7 @@ decl_out(struct fd3_compile_context *ctx, struct tgsi_full_declaration *decl)
for (i = decl->Range.First; i <= decl->Range.Last; i++) {
unsigned n = so->outputs_count++;
compile_assert(ctx, n < ARRAY_SIZE(so->outputs));
so->outputs[n].semantic = decl_semantic(&decl->Semantic);
so->outputs[n].regid = regid(i + base, comp);
}

View File

@@ -33,6 +33,7 @@
#include "fd3_emit.h"
#include "fd3_gmem.h"
#include "fd3_program.h"
#include "fd3_query.h"
#include "fd3_rasterizer.h"
#include "fd3_texture.h"
#include "fd3_zsa.h"
@@ -134,5 +135,7 @@ fd3_context_create(struct pipe_screen *pscreen, void *priv)
fd3_ctx->solid_vbuf = create_solid_vertexbuf(pctx);
fd3_ctx->blit_texcoord_vbuf = create_blit_texcoord_vertexbuf(pctx);
fd3_query_context_init(pctx);
return pctx;
}

View File

@@ -195,8 +195,10 @@ emit_textures(struct fd_ringbuffer *ring,
OUT_RING(ring, CP_LOAD_STATE_1_STATE_TYPE(ST_CONSTANTS) |
CP_LOAD_STATE_1_EXT_SRC_ADDR(0));
for (i = 0; i < tex->num_textures; i++) {
struct fd3_pipe_sampler_view *view =
fd3_pipe_sampler_view(tex->textures[i]);
static const struct fd3_pipe_sampler_view dummy_view = {};
const struct fd3_pipe_sampler_view *view = tex->textures[i] ?
fd3_pipe_sampler_view(tex->textures[i]) :
&dummy_view;
OUT_RING(ring, view->texconst0);
OUT_RING(ring, view->texconst1);
OUT_RING(ring, view->texconst2 |
@@ -213,8 +215,10 @@ emit_textures(struct fd_ringbuffer *ring,
OUT_RING(ring, CP_LOAD_STATE_1_STATE_TYPE(ST_CONSTANTS) |
CP_LOAD_STATE_1_EXT_SRC_ADDR(0));
for (i = 0; i < tex->num_textures; i++) {
struct fd3_pipe_sampler_view *view =
fd3_pipe_sampler_view(tex->textures[i]);
static const struct fd3_pipe_sampler_view dummy_view = {};
const struct fd3_pipe_sampler_view *view = tex->textures[i] ?
fd3_pipe_sampler_view(tex->textures[i]) :
&dummy_view;
struct fd_resource *rsc = view->tex_resource;
for (j = 0; j < view->mipaddrs; j++) {
@@ -323,9 +327,12 @@ fd3_emit_vertex_bufs(struct fd_ringbuffer *ring,
if (vp->inputs[i].compmask) {
struct pipe_resource *prsc = vbufs[i].prsc;
struct fd_resource *rsc = fd_resource(prsc);
enum a3xx_vtx_fmt fmt = fd3_pipe2vtx(vbufs[i].format);
enum pipe_format pfmt = vbufs[i].format;
enum a3xx_vtx_fmt fmt = fd3_pipe2vtx(pfmt);
bool switchnext = (i != last);
uint32_t fs = util_format_get_blocksize(vbufs[i].format);
uint32_t fs = util_format_get_blocksize(pfmt);
debug_assert(fmt != ~0);
OUT_PKT0(ring, REG_A3XX_VFD_FETCH(j), 2);
OUT_RING(ring, A3XX_VFD_FETCH_INSTR_0_FETCHSIZE(fs - 1) |
@@ -339,6 +346,7 @@ fd3_emit_vertex_bufs(struct fd_ringbuffer *ring,
OUT_RING(ring, A3XX_VFD_DECODE_INSTR_CONSTFILL |
A3XX_VFD_DECODE_INSTR_WRITEMASK(vp->inputs[i].compmask) |
A3XX_VFD_DECODE_INSTR_FORMAT(fmt) |
A3XX_VFD_DECODE_INSTR_SWAP(fd3_pipe2swap(pfmt)) |
A3XX_VFD_DECODE_INSTR_REGID(vp->inputs[i].regid) |
A3XX_VFD_DECODE_INSTR_SHIFTCNT(fs) |
A3XX_VFD_DECODE_INSTR_LASTCOMPVALID |

View File

@@ -82,7 +82,7 @@ emit_mrt(struct fd_ringbuffer *ring, unsigned nr_bufs,
stride = bin_w * rsc->cpp;
if (bases) {
base = bases[i] * rsc->cpp;
base = bases[i];
}
} else {
stride = slice->pitch * rsc->cpp;
@@ -106,9 +106,17 @@ emit_mrt(struct fd_ringbuffer *ring, unsigned nr_bufs,
}
static uint32_t
depth_base(struct fd_gmem_stateobj *gmem)
depth_base(struct fd_context *ctx)
{
return align(gmem->bin_w * gmem->bin_h, 0x4000);
struct fd_gmem_stateobj *gmem = &ctx->gmem;
struct pipe_framebuffer_state *pfb = &ctx->framebuffer;
uint32_t cpp = 4;
if (pfb->cbufs[0]) {
struct fd_resource *rsc =
fd_resource(pfb->cbufs[0]->texture);
cpp = rsc->cpp;
}
return align(gmem->bin_w * gmem->bin_h * cpp, 0x4000);
}
static bool
@@ -156,7 +164,7 @@ emit_binning_workaround(struct fd_context *ctx)
OUT_RING(ring, A3XX_RB_COPY_CONTROL_MSAA_RESOLVE(MSAA_ONE) |
A3XX_RB_COPY_CONTROL_MODE(0) |
A3XX_RB_COPY_CONTROL_GMEM_BASE(0));
OUT_RELOC(ring, fd_resource(fd3_ctx->solid_vbuf)->bo, 0x20, 0, -1); /* RB_COPY_DEST_BASE */
OUT_RELOCW(ring, fd_resource(fd3_ctx->solid_vbuf)->bo, 0x20, 0, -1); /* RB_COPY_DEST_BASE */
OUT_RING(ring, A3XX_RB_COPY_DEST_PITCH_PITCH(128));
OUT_RING(ring, A3XX_RB_COPY_DEST_INFO_TILE(LINEAR) |
A3XX_RB_COPY_DEST_INFO_FORMAT(RB_R8G8B8A8_UNORM) |
@@ -399,12 +407,7 @@ fd3_emit_tile_gmem2mem(struct fd_context *ctx, struct fd_tile *tile)
}}, 1);
if (ctx->resolve & (FD_BUFFER_DEPTH | FD_BUFFER_STENCIL)) {
uint32_t base = 0;
if (pfb->cbufs[0]) {
struct fd_resource *rsc =
fd_resource(pfb->cbufs[0]->texture);
base = depth_base(&ctx->gmem) * rsc->cpp;
}
uint32_t base = depth_base(ctx);
emit_gmem2mem_surf(ctx, RB_COPY_DEPTH_STENCIL, base, pfb->zsbuf);
}
@@ -458,7 +461,7 @@ fd3_emit_tile_mem2gmem(struct fd_context *ctx, struct fd_tile *tile)
y1 = ((float)tile->yoff + bin_h) / ((float)pfb->height);
OUT_PKT3(ring, CP_MEM_WRITE, 5);
OUT_RELOC(ring, fd_resource(fd3_ctx->blit_texcoord_vbuf)->bo, 0, 0, 0);
OUT_RELOCW(ring, fd_resource(fd3_ctx->blit_texcoord_vbuf)->bo, 0, 0, 0);
OUT_RING(ring, fui(x0));
OUT_RING(ring, fui(y0));
OUT_RING(ring, fui(x1));
@@ -558,7 +561,7 @@ fd3_emit_tile_mem2gmem(struct fd_context *ctx, struct fd_tile *tile)
bin_h = gmem->bin_h;
if (ctx->restore & (FD_BUFFER_DEPTH | FD_BUFFER_STENCIL))
emit_mem2gmem_surf(ctx, depth_base(gmem), pfb->zsbuf, bin_w);
emit_mem2gmem_surf(ctx, depth_base(ctx), pfb->zsbuf, bin_w);
if (ctx->restore & FD_BUFFER_COLOR)
emit_mem2gmem_surf(ctx, 0, pfb->cbufs[0], bin_w);
@@ -639,7 +642,7 @@ update_vsc_pipe(struct fd_context *ctx)
int i;
OUT_PKT0(ring, REG_A3XX_VSC_SIZE_ADDRESS, 1);
OUT_RELOC(ring, fd3_ctx->vsc_size_mem, 0, 0, 0); /* VSC_SIZE_ADDRESS */
OUT_RELOCW(ring, fd3_ctx->vsc_size_mem, 0, 0, 0); /* VSC_SIZE_ADDRESS */
for (i = 0; i < 8; i++) {
struct fd_vsc_pipe *pipe = &ctx->pipe[i];
@@ -654,7 +657,7 @@ update_vsc_pipe(struct fd_context *ctx)
A3XX_VSC_PIPE_CONFIG_Y(pipe->y) |
A3XX_VSC_PIPE_CONFIG_W(pipe->w) |
A3XX_VSC_PIPE_CONFIG_H(pipe->h));
OUT_RELOC(ring, pipe->bo, 0, 0, 0); /* VSC_PIPE[i].DATA_ADDRESS */
OUT_RELOCW(ring, pipe->bo, 0, 0, 0); /* VSC_PIPE[i].DATA_ADDRESS */
OUT_RING(ring, fd_bo_size(pipe->bo) - 32); /* VSC_PIPE[i].DATA_LENGTH */
}
}
@@ -789,6 +792,7 @@ fd3_emit_tile_init(struct fd_context *ctx)
{
struct fd_ringbuffer *ring = ctx->ring;
struct fd_gmem_stateobj *gmem = &ctx->gmem;
uint32_t rb_render_control;
fd3_emit_restore(ctx);
@@ -813,8 +817,10 @@ fd3_emit_tile_init(struct fd_context *ctx)
patch_draws(ctx, IGNORE_VISIBILITY);
}
patch_rbrc(ctx, A3XX_RB_RENDER_CONTROL_ENABLE_GMEM |
A3XX_RB_RENDER_CONTROL_BIN_WIDTH(gmem->bin_w));
rb_render_control = A3XX_RB_RENDER_CONTROL_ENABLE_GMEM |
A3XX_RB_RENDER_CONTROL_BIN_WIDTH(gmem->bin_w);
patch_rbrc(ctx, rb_render_control);
}
/* before mem2gmem */
@@ -827,7 +833,7 @@ fd3_emit_tile_prep(struct fd_context *ctx, struct fd_tile *tile)
uint32_t reg;
OUT_PKT0(ring, REG_A3XX_RB_DEPTH_INFO, 2);
reg = A3XX_RB_DEPTH_INFO_DEPTH_BASE(depth_base(gmem));
reg = A3XX_RB_DEPTH_INFO_DEPTH_BASE(depth_base(ctx));
if (pfb->zsbuf) {
reg |= A3XX_RB_DEPTH_INFO_DEPTH_FORMAT(fd_pipe2depth(pfb->zsbuf->format));
}

View File

@@ -406,7 +406,7 @@ fd3_program_emit(struct fd_ringbuffer *ring,
A3XX_SP_VS_PARAM_REG_PSIZEREGID(psize_regid) |
A3XX_SP_VS_PARAM_REG_TOTALVSOUTVAR(align(fp->total_in, 4) / 4));
for (i = 0, j = -1; j < (int)fp->inputs_count; i++) {
for (i = 0, j = -1; (i < 8) && (j < (int)fp->inputs_count); i++) {
uint32_t reg = 0;
OUT_PKT0(ring, REG_A3XX_SP_VS_OUT_REG(i), 1);
@@ -428,7 +428,7 @@ fd3_program_emit(struct fd_ringbuffer *ring,
OUT_RING(ring, reg);
}
for (i = 0, j = -1; j < (int)fp->inputs_count; i++) {
for (i = 0, j = -1; (i < 4) && (j < (int)fp->inputs_count); i++) {
uint32_t reg = 0;
OUT_PKT0(ring, REG_A3XX_SP_VS_VPC_DST_REG(i), 1);

View File

@@ -91,7 +91,7 @@ struct fd3_shader_variant {
struct {
fd3_semantic semantic;
uint8_t regid;
} outputs[16];
} outputs[16 + 2]; /* +POSITION +PSIZE */
bool writes_pos, writes_psize;
/* vertices/inputs: */
@@ -104,7 +104,7 @@ struct fd3_shader_variant {
/* in theory inloc of fs should match outloc of vs: */
uint8_t inloc;
uint8_t bary;
} inputs[16];
} inputs[16 + 2]; /* +POSITION +FACE */
unsigned total_in; /* sum of inputs (scalar) */

View File

@@ -0,0 +1,139 @@
/* -*- mode: C; c-file-style: "k&r"; tab-width 4; indent-tabs-mode: t; -*- */
/*
* Copyright (C) 2014 Rob Clark <robclark@freedesktop.org>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
* Authors:
* Rob Clark <robclark@freedesktop.org>
*/
#include "freedreno_query_hw.h"
#include "freedreno_context.h"
#include "freedreno_util.h"
#include "fd3_query.h"
#include "fd3_util.h"
struct fd_rb_samp_ctrs {
uint64_t ctr[16];
};
/*
* Occlusion Query:
*
* OCCLUSION_COUNTER and OCCLUSION_PREDICATE differ only in how they
* interpret results
*/
static struct fd_hw_sample *
occlusion_get_sample(struct fd_context *ctx, struct fd_ringbuffer *ring)
{
struct fd_hw_sample *samp =
fd_hw_sample_init(ctx, sizeof(struct fd_rb_samp_ctrs));
/* Set RB_SAMPLE_COUNT_ADDR to samp->offset plus value of
* HW_QUERY_BASE_REG register:
*/
OUT_PKT3(ring, CP_SET_CONSTANT, 3);
OUT_RING(ring, CP_REG(REG_A3XX_RB_SAMPLE_COUNT_ADDR) | 0x80000000);
OUT_RING(ring, HW_QUERY_BASE_REG);
OUT_RING(ring, samp->offset);
OUT_PKT0(ring, REG_A3XX_RB_SAMPLE_COUNT_CONTROL, 1);
OUT_RING(ring, A3XX_RB_SAMPLE_COUNT_CONTROL_COPY);
OUT_PKT3(ring, CP_DRAW_INDX, 3);
OUT_RING(ring, 0x00000000);
OUT_RING(ring, DRAW(DI_PT_POINTLIST_A2XX, DI_SRC_SEL_AUTO_INDEX,
INDEX_SIZE_IGN, USE_VISIBILITY));
OUT_RING(ring, 0); /* NumIndices */
OUT_PKT3(ring, CP_EVENT_WRITE, 1);
OUT_RING(ring, ZPASS_DONE);
OUT_PKT0(ring, REG_A3XX_RBBM_PERFCTR_CTL, 1);
OUT_RING(ring, A3XX_RBBM_PERFCTR_CTL_ENABLE);
OUT_PKT0(ring, REG_A3XX_VBIF_PERF_CNT_EN, 1);
OUT_RING(ring, A3XX_VBIF_PERF_CNT_EN_CNT0 |
A3XX_VBIF_PERF_CNT_EN_CNT1 |
A3XX_VBIF_PERF_CNT_EN_PWRCNT0 |
A3XX_VBIF_PERF_CNT_EN_PWRCNT1 |
A3XX_VBIF_PERF_CNT_EN_PWRCNT2);
return samp;
}
static uint64_t
count_samples(const struct fd_rb_samp_ctrs *start,
const struct fd_rb_samp_ctrs *end)
{
uint64_t n = 0;
unsigned i;
/* not quite sure what all of these are, possibly different
* counters for each MRT render target:
*/
for (i = 0; i < 16; i += 4)
n += end->ctr[i] - start->ctr[i];
return n;
}
static void
occlusion_counter_accumulate_result(struct fd_context *ctx,
const void *start, const void *end,
union pipe_query_result *result)
{
uint64_t n = count_samples(start, end);
result->u64 += n;
}
static void
occlusion_predicate_accumulate_result(struct fd_context *ctx,
const void *start, const void *end,
union pipe_query_result *result)
{
uint64_t n = count_samples(start, end);
result->b |= (n > 0);
}
static const struct fd_hw_sample_provider occlusion_counter = {
.query_type = PIPE_QUERY_OCCLUSION_COUNTER,
.active = FD_STAGE_DRAW, /* | FD_STAGE_CLEAR ??? */
.get_sample = occlusion_get_sample,
.accumulate_result = occlusion_counter_accumulate_result,
};
static const struct fd_hw_sample_provider occlusion_predicate = {
.query_type = PIPE_QUERY_OCCLUSION_PREDICATE,
.active = FD_STAGE_DRAW, /* | FD_STAGE_CLEAR ??? */
.get_sample = occlusion_get_sample,
.accumulate_result = occlusion_predicate_accumulate_result,
};
void fd3_query_context_init(struct pipe_context *pctx)
{
fd_hw_query_register_provider(pctx, &occlusion_counter);
fd_hw_query_register_provider(pctx, &occlusion_predicate);
}

View File

@@ -0,0 +1,36 @@
/* -*- mode: C; c-file-style: "k&r"; tab-width 4; indent-tabs-mode: t; -*- */
/*
* Copyright (C) 2014 Rob Clark <robclark@freedesktop.org>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
* Authors:
* Rob Clark <robclark@freedesktop.org>
*/
#ifndef FD3_QUERY_H_
#define FD3_QUERY_H_
#include "pipe/p_context.h"
void fd3_query_context_init(struct pipe_context *pctx);
#endif /* FD3_QUERY_H_ */

View File

@@ -40,6 +40,7 @@ fd3_rasterizer_state_create(struct pipe_context *pctx,
const struct pipe_rasterizer_state *cso)
{
struct fd3_rasterizer_stateobj *so;
float psize_min, psize_max;
so = CALLOC_STRUCT(fd3_rasterizer_stateobj);
if (!so)
@@ -47,19 +48,28 @@ fd3_rasterizer_state_create(struct pipe_context *pctx,
so->base = *cso;
if (cso->point_size_per_vertex) {
psize_min = util_get_min_point_size(cso);
psize_max = 8192;
} else {
/* Force the point size to be as if the vertex output was disabled. */
psize_min = cso->point_size;
psize_max = cso->point_size;
}
/*
if (cso->line_stipple_enable) {
??? TODO line stipple
}
TODO cso->half_pixel_center
TODO cso->point_size
TODO psize_min/psize_max
if (cso->multisample)
TODO
*/
so->gras_cl_clip_cntl = A3XX_GRAS_CL_CLIP_CNTL_IJ_PERSP_CENTER; /* ??? */
so->gras_su_point_minmax = 0xffc00010; /* ??? */
so->gras_su_point_size = 0x00000008; /* ??? */
so->gras_su_point_minmax =
A3XX_GRAS_SU_POINT_MINMAX_MIN(psize_min/2) |
A3XX_GRAS_SU_POINT_MINMAX_MAX(psize_max/2);
so->gras_su_point_size = A3XX_GRAS_SU_POINT_SIZE(cso->point_size/2);
so->gras_su_poly_offset_scale =
A3XX_GRAS_SU_POLY_OFFSET_SCALE_VAL(cso->offset_scale);
so->gras_su_poly_offset_offset =

View File

@@ -30,6 +30,7 @@
#include "util/u_string.h"
#include "util/u_memory.h"
#include "util/u_inlines.h"
#include "util/u_format.h"
#include "fd3_texture.h"
#include "fd3_util.h"
@@ -47,12 +48,14 @@ tex_clamp(unsigned wrap)
case PIPE_TEX_WRAP_REPEAT:
return A3XX_TEX_REPEAT;
case PIPE_TEX_WRAP_CLAMP:
case PIPE_TEX_WRAP_CLAMP_TO_BORDER:
case PIPE_TEX_WRAP_CLAMP_TO_EDGE:
return A3XX_TEX_CLAMP_TO_EDGE;
case PIPE_TEX_WRAP_CLAMP_TO_BORDER:
return A3XX_TEX_CLAMP_TO_BORDER;
case PIPE_TEX_WRAP_MIRROR_CLAMP:
case PIPE_TEX_WRAP_MIRROR_CLAMP_TO_BORDER:
case PIPE_TEX_WRAP_MIRROR_CLAMP_TO_EDGE:
return A3XX_TEX_MIRROR_CLAMP;
case PIPE_TEX_WRAP_MIRROR_REPEAT:
return A3XX_TEX_MIRROR_REPEAT;
default:
@@ -99,6 +102,9 @@ fd3_sampler_state_create(struct pipe_context *pctx,
A3XX_TEX_SAMP_0_WRAP_T(tex_clamp(cso->wrap_t)) |
A3XX_TEX_SAMP_0_WRAP_R(tex_clamp(cso->wrap_r));
if (cso->compare_mode)
so->texsamp0 |= A3XX_TEX_SAMP_0_COMPARE_FUNC(cso->compare_func); /* maps 1:1 */
if (cso->min_mip_filter != PIPE_TEX_MIPFILTER_NONE) {
so->texsamp1 =
A3XX_TEX_SAMP_1_MIN_LOD(cso->min_lod) |
@@ -158,6 +164,10 @@ fd3_sampler_view_create(struct pipe_context *pctx, struct pipe_resource *prsc,
A3XX_TEX_CONST_0_MIPLVLS(miplevels) |
fd3_tex_swiz(cso->format, cso->swizzle_r, cso->swizzle_g,
cso->swizzle_b, cso->swizzle_a);
if (util_format_is_srgb(cso->format))
so->texconst0 |= A3XX_TEX_CONST_0_SRGB;
so->texconst1 =
A3XX_TEX_CONST_1_FETCHSIZE(fd3_pipe2fetchsize(cso->format)) |
A3XX_TEX_CONST_1_WIDTH(prsc->width0) |

View File

@@ -37,70 +37,44 @@ fd3_pipe2vtx(enum pipe_format format)
{
switch (format) {
/* 8-bit buffers. */
case PIPE_FORMAT_A8_UNORM:
case PIPE_FORMAT_I8_UNORM:
case PIPE_FORMAT_L8_UNORM:
case PIPE_FORMAT_R8_UNORM:
case PIPE_FORMAT_L8_SRGB:
return VFMT_NORM_UBYTE_8;
case PIPE_FORMAT_A8_SNORM:
case PIPE_FORMAT_I8_SNORM:
case PIPE_FORMAT_L8_SNORM:
case PIPE_FORMAT_R8_SNORM:
return VFMT_NORM_BYTE_8;
case PIPE_FORMAT_A8_UINT:
case PIPE_FORMAT_I8_UINT:
case PIPE_FORMAT_L8_UINT:
case PIPE_FORMAT_R8_UINT:
return VFMT_UBYTE_8;
case PIPE_FORMAT_A8_SINT:
case PIPE_FORMAT_I8_SINT:
case PIPE_FORMAT_L8_SINT:
case PIPE_FORMAT_R8_SINT:
return VFMT_BYTE_8;
/* 16-bit buffers. */
case PIPE_FORMAT_R16_UNORM:
case PIPE_FORMAT_A16_UNORM:
case PIPE_FORMAT_L16_UNORM:
case PIPE_FORMAT_I16_UNORM:
case PIPE_FORMAT_Z16_UNORM:
return VFMT_NORM_USHORT_16;
case PIPE_FORMAT_R16_SNORM:
case PIPE_FORMAT_A16_SNORM:
case PIPE_FORMAT_L16_SNORM:
case PIPE_FORMAT_I16_SNORM:
return VFMT_NORM_SHORT_16;
case PIPE_FORMAT_R16_UINT:
case PIPE_FORMAT_A16_UINT:
case PIPE_FORMAT_L16_UINT:
case PIPE_FORMAT_I16_UINT:
return VFMT_USHORT_16;
case PIPE_FORMAT_R16_SINT:
case PIPE_FORMAT_A16_SINT:
case PIPE_FORMAT_L16_SINT:
case PIPE_FORMAT_I16_SINT:
return VFMT_SHORT_16;
case PIPE_FORMAT_L8A8_UNORM:
case PIPE_FORMAT_R16_FLOAT:
return VFMT_FLOAT_16;
case PIPE_FORMAT_R8G8_UNORM:
return VFMT_NORM_UBYTE_8_8;
case PIPE_FORMAT_L8A8_SNORM:
case PIPE_FORMAT_R8G8_SNORM:
return VFMT_NORM_BYTE_8_8;
case PIPE_FORMAT_L8A8_UINT:
case PIPE_FORMAT_R8G8_UINT:
return VFMT_UBYTE_8_8;
case PIPE_FORMAT_L8A8_SINT:
case PIPE_FORMAT_R8G8_SINT:
return VFMT_BYTE_8_8;
@@ -121,42 +95,62 @@ fd3_pipe2vtx(enum pipe_format format)
case PIPE_FORMAT_A8B8G8R8_UNORM:
case PIPE_FORMAT_A8R8G8B8_UNORM:
case PIPE_FORMAT_B8G8R8A8_UNORM:
case PIPE_FORMAT_B8G8R8X8_UNORM:
case PIPE_FORMAT_R8G8B8A8_UNORM:
case PIPE_FORMAT_R8G8B8X8_UNORM:
case PIPE_FORMAT_X8B8G8R8_UNORM:
case PIPE_FORMAT_X8R8G8B8_UNORM:
case PIPE_FORMAT_A8B8G8R8_SRGB:
case PIPE_FORMAT_B8G8R8A8_SRGB:
return VFMT_NORM_UBYTE_8_8_8_8;
case PIPE_FORMAT_R8G8B8A8_SNORM:
case PIPE_FORMAT_R8G8B8X8_SNORM:
return VFMT_NORM_BYTE_8_8_8_8;
case PIPE_FORMAT_R8G8B8A8_UINT:
case PIPE_FORMAT_R8G8B8X8_UINT:
return VFMT_UBYTE_8_8_8_8;
case PIPE_FORMAT_R8G8B8A8_SINT:
case PIPE_FORMAT_R8G8B8X8_SINT:
return VFMT_BYTE_8_8_8_8;
/* TODO probably need gles3 blob drivers to find the 32bit int formats:
case PIPE_FORMAT_R32_UINT:
case PIPE_FORMAT_R32_SINT:
case PIPE_FORMAT_A32_UINT:
case PIPE_FORMAT_A32_SINT:
case PIPE_FORMAT_L32_UINT:
case PIPE_FORMAT_L32_SINT:
case PIPE_FORMAT_I32_UINT:
case PIPE_FORMAT_I32_SINT:
*/
case PIPE_FORMAT_R16G16_SSCALED:
return VFMT_SHORT_16_16;
case PIPE_FORMAT_R16G16_FLOAT:
return VFMT_FLOAT_16_16;
case PIPE_FORMAT_R16G16_UINT:
return VFMT_USHORT_16_16;
case PIPE_FORMAT_R16G16_UNORM:
return VFMT_NORM_USHORT_16_16;
case PIPE_FORMAT_R16G16_SNORM:
return VFMT_NORM_SHORT_16_16;
case PIPE_FORMAT_R10G10B10A2_UNORM:
return VFMT_NORM_UINT_10_10_10_2;
case PIPE_FORMAT_R10G10B10A2_SNORM:
return VFMT_NORM_INT_10_10_10_2;
case PIPE_FORMAT_R10G10B10A2_USCALED:
return VFMT_UINT_10_10_10_2;
case PIPE_FORMAT_R10G10B10A2_SSCALED:
return VFMT_INT_10_10_10_2;
/* 48-bit buffers. */
case PIPE_FORMAT_R16G16B16_FLOAT:
return VFMT_FLOAT_16_16_16;
case PIPE_FORMAT_R16G16B16_SSCALED:
return VFMT_SHORT_16_16_16;
case PIPE_FORMAT_R16G16B16_UINT:
return VFMT_USHORT_16_16_16;
case PIPE_FORMAT_R16G16B16_SNORM:
return VFMT_NORM_SHORT_16_16_16;
case PIPE_FORMAT_R16G16B16_UNORM:
return VFMT_NORM_USHORT_16_16_16;
case PIPE_FORMAT_R32_FLOAT:
case PIPE_FORMAT_A32_FLOAT:
case PIPE_FORMAT_L32_FLOAT:
case PIPE_FORMAT_I32_FLOAT:
case PIPE_FORMAT_Z32_FLOAT:
return VFMT_FLOAT_32;
@@ -177,23 +171,14 @@ fd3_pipe2vtx(enum pipe_format format)
return VFMT_SHORT_16_16_16_16;
case PIPE_FORMAT_R32G32_FLOAT:
case PIPE_FORMAT_L32A32_FLOAT:
return VFMT_FLOAT_32_32;
case PIPE_FORMAT_R32G32_FIXED:
return VFMT_FIXED_32_32;
case PIPE_FORMAT_R16G16B16A16_FLOAT:
case PIPE_FORMAT_R16G16B16X16_FLOAT:
return VFMT_FLOAT_16_16_16_16;
/* TODO probably need gles3 blob drivers to find the 32bit int formats:
case PIPE_FORMAT_R32G32_SINT:
case PIPE_FORMAT_R32G32_UINT:
case PIPE_FORMAT_L32A32_UINT:
case PIPE_FORMAT_L32A32_SINT:
*/
/* 96-bit buffers. */
case PIPE_FORMAT_R32G32B32_FLOAT:
return VFMT_FLOAT_32_32_32;
@@ -203,7 +188,6 @@ fd3_pipe2vtx(enum pipe_format format)
/* 128-bit buffers. */
case PIPE_FORMAT_R32G32B32A32_FLOAT:
case PIPE_FORMAT_R32G32B32X32_FLOAT:
return VFMT_FLOAT_32_32_32_32;
case PIPE_FORMAT_R32G32B32A32_FIXED:
@@ -214,6 +198,20 @@ fd3_pipe2vtx(enum pipe_format format)
case PIPE_FORMAT_R32G32B32A32_UNORM:
case PIPE_FORMAT_R32G32B32A32_SINT:
case PIPE_FORMAT_R32G32B32A32_UINT:
case PIPE_FORMAT_R32_UINT:
case PIPE_FORMAT_R32_SINT:
case PIPE_FORMAT_A32_UINT:
case PIPE_FORMAT_A32_SINT:
case PIPE_FORMAT_L32_UINT:
case PIPE_FORMAT_L32_SINT:
case PIPE_FORMAT_I32_UINT:
case PIPE_FORMAT_I32_SINT:
case PIPE_FORMAT_R32G32_SINT:
case PIPE_FORMAT_R32G32_UINT:
case PIPE_FORMAT_L32A32_UINT:
case PIPE_FORMAT_L32A32_SINT:
*/
default:
@@ -235,6 +233,10 @@ fd3_pipe2tex(enum pipe_format format)
case PIPE_FORMAT_B8G8R8X8_UNORM:
case PIPE_FORMAT_R8G8B8A8_UNORM:
case PIPE_FORMAT_R8G8B8X8_UNORM:
case PIPE_FORMAT_B8G8R8A8_SRGB:
case PIPE_FORMAT_B8G8R8X8_SRGB:
case PIPE_FORMAT_R8G8B8A8_SRGB:
case PIPE_FORMAT_R8G8B8X8_SRGB:
return TFMT_NORM_UINT_8_8_8_8;
case PIPE_FORMAT_Z24X8_UNORM:
@@ -275,6 +277,12 @@ fd3_pipe2fetchsize(enum pipe_format format)
case PIPE_FORMAT_B8G8R8A8_UNORM:
case PIPE_FORMAT_B8G8R8X8_UNORM:
case PIPE_FORMAT_R8G8B8A8_UNORM:
case PIPE_FORMAT_R8G8B8X8_UNORM:
case PIPE_FORMAT_B8G8R8A8_SRGB:
case PIPE_FORMAT_B8G8R8X8_SRGB:
case PIPE_FORMAT_R8G8B8A8_SRGB:
case PIPE_FORMAT_R8G8B8X8_SRGB:
case PIPE_FORMAT_Z24X8_UNORM:
case PIPE_FORMAT_Z24_UNORM_S8_UINT:
return TFETCH_4_BYTE;
@@ -348,8 +356,22 @@ fd3_pipe2swap(enum pipe_format format)
switch (format) {
case PIPE_FORMAT_B8G8R8A8_UNORM:
case PIPE_FORMAT_B8G8R8X8_UNORM:
case PIPE_FORMAT_B8G8R8A8_SRGB:
case PIPE_FORMAT_B8G8R8X8_SRGB:
return WXYZ;
case PIPE_FORMAT_A8R8G8B8_UNORM:
case PIPE_FORMAT_X8R8G8B8_UNORM:
case PIPE_FORMAT_A8R8G8B8_SRGB:
case PIPE_FORMAT_X8R8G8B8_SRGB:
return ZYXW;
case PIPE_FORMAT_A8B8G8R8_UNORM:
case PIPE_FORMAT_X8B8G8R8_UNORM:
case PIPE_FORMAT_A8B8G8R8_SRGB:
case PIPE_FORMAT_X8B8G8R8_SRGB:
return XYZW;
case PIPE_FORMAT_R8G8B8A8_UNORM:
case PIPE_FORMAT_R8G8B8X8_UNORM:
case PIPE_FORMAT_Z24X8_UNORM:
@@ -379,14 +401,14 @@ fd3_tex_swiz(enum pipe_format format, unsigned swizzle_r, unsigned swizzle_g,
{
const struct util_format_description *desc =
util_format_description(format);
uint8_t swiz[] = {
unsigned char swiz[4] = {
swizzle_r, swizzle_g, swizzle_b, swizzle_a,
PIPE_SWIZZLE_ZERO, PIPE_SWIZZLE_ONE,
PIPE_SWIZZLE_ONE, PIPE_SWIZZLE_ONE,
};
}, rswiz[4];
return A3XX_TEX_CONST_0_SWIZ_X(tex_swiz(swiz[desc->swizzle[0]])) |
A3XX_TEX_CONST_0_SWIZ_Y(tex_swiz(swiz[desc->swizzle[1]])) |
A3XX_TEX_CONST_0_SWIZ_Z(tex_swiz(swiz[desc->swizzle[2]])) |
A3XX_TEX_CONST_0_SWIZ_W(tex_swiz(swiz[desc->swizzle[3]]));
util_format_compose_swizzles(desc->swizzle, swiz, rswiz);
return A3XX_TEX_CONST_0_SWIZ_X(tex_swiz(rswiz[0])) |
A3XX_TEX_CONST_0_SWIZ_Y(tex_swiz(rswiz[1])) |
A3XX_TEX_CONST_0_SWIZ_Z(tex_swiz(rswiz[2])) |
A3XX_TEX_CONST_0_SWIZ_W(tex_swiz(rswiz[3]));
}

View File

@@ -10,11 +10,11 @@ git clone https://github.com/freedreno/envytools.git
The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno.xml ( 364 bytes, from 2013-11-30 14:47:15)
- /home/robclark/src/freedreno/envytools/rnndb/freedreno_copyright.xml ( 1453 bytes, from 2013-03-31 16:51:27)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32840 bytes, from 2014-01-05 14:44:21)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 9009 bytes, from 2014-01-11 16:56:35)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 12362 bytes, from 2014-01-07 14:47:36)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 56545 bytes, from 2014-02-26 16:32:11)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 8344 bytes, from 2013-11-30 14:49:47)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 9859 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14477 bytes, from 2014-05-16 11:51:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 58020 bytes, from 2014-06-13 17:29:47)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 26602 bytes, from 2014-06-13 17:28:10)
Copyright (C) 2013-2014 by the following authors:
- Rob Clark <robdclark@gmail.com> (robclark)
@@ -87,15 +87,6 @@ enum adreno_rb_blend_factor {
FACTOR_SRC_ALPHA_SATURATE = 16,
};
enum adreno_rb_blend_opcode {
BLEND_DST_PLUS_SRC = 0,
BLEND_SRC_MINUS_DST = 1,
BLEND_MIN_DST_SRC = 2,
BLEND_MAX_DST_SRC = 3,
BLEND_DST_MINUS_SRC = 4,
BLEND_DST_PLUS_SRC_BIAS = 5,
};
enum adreno_rb_surface_endian {
ENDIAN_NONE = 0,
ENDIAN_8IN16 = 1,
@@ -116,6 +107,39 @@ enum adreno_rb_depth_format {
DEPTHX_24_8 = 1,
};
enum adreno_rb_copy_control_mode {
RB_COPY_RESOLVE = 1,
RB_COPY_CLEAR = 2,
RB_COPY_DEPTH_STENCIL = 5,
};
enum a3xx_render_mode {
RB_RENDERING_PASS = 0,
RB_TILING_PASS = 1,
RB_RESOLVE_PASS = 2,
RB_COMPUTE_PASS = 3,
};
enum a3xx_msaa_samples {
MSAA_ONE = 0,
MSAA_TWO = 1,
MSAA_FOUR = 2,
};
enum a3xx_threadmode {
MULTI = 0,
SINGLE = 1,
};
enum a3xx_instrbuffermode {
BUFFER = 1,
};
enum a3xx_threadsize {
TWO_QUADS = 0,
FOUR_QUADS = 1,
};
#define REG_AXXX_CP_RB_BASE 0x000001c0
#define REG_AXXX_CP_RB_CNTL 0x000001c1

View File

@@ -10,11 +10,11 @@ git clone https://github.com/freedreno/envytools.git
The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno.xml ( 364 bytes, from 2013-11-30 14:47:15)
- /home/robclark/src/freedreno/envytools/rnndb/freedreno_copyright.xml ( 1453 bytes, from 2013-03-31 16:51:27)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32840 bytes, from 2014-01-05 14:44:21)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 9009 bytes, from 2014-01-11 16:56:35)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 12362 bytes, from 2014-01-07 14:47:36)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 56545 bytes, from 2014-02-26 16:32:11)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 8344 bytes, from 2013-11-30 14:49:47)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 9859 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14477 bytes, from 2014-05-16 11:51:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 58020 bytes, from 2014-06-13 17:29:47)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 26602 bytes, from 2014-06-13 17:28:10)
Copyright (C) 2013-2014 by the following authors:
- Rob Clark <robdclark@gmail.com> (robclark)
@@ -164,6 +164,11 @@ enum adreno_pm4_type3_packets {
CP_SET_BIN = 76,
CP_TEST_TWO_MEMS = 113,
CP_WAIT_FOR_ME = 19,
CP_SET_DRAW_STATE = 67,
CP_DRAW_INDX_OFFSET = 56,
CP_DRAW_INDIRECT = 40,
CP_DRAW_INDX_INDIRECT = 41,
CP_DRAW_AUTO = 36,
IN_IB_PREFETCH_END = 23,
IN_SUBBLK_PREFETCH = 31,
IN_INSTR_PREFETCH = 32,
@@ -351,6 +356,93 @@ static inline uint32_t CP_DRAW_INDX_2_2_NUM_INDICES(uint32_t val)
return ((val) << CP_DRAW_INDX_2_2_NUM_INDICES__SHIFT) & CP_DRAW_INDX_2_2_NUM_INDICES__MASK;
}
#define REG_CP_DRAW_INDX_OFFSET_0 0x00000000
#define CP_DRAW_INDX_OFFSET_0_PRIM_TYPE__MASK 0x0000003f
#define CP_DRAW_INDX_OFFSET_0_PRIM_TYPE__SHIFT 0
static inline uint32_t CP_DRAW_INDX_OFFSET_0_PRIM_TYPE(enum pc_di_primtype val)
{
return ((val) << CP_DRAW_INDX_OFFSET_0_PRIM_TYPE__SHIFT) & CP_DRAW_INDX_OFFSET_0_PRIM_TYPE__MASK;
}
#define CP_DRAW_INDX_OFFSET_0_SOURCE_SELECT__MASK 0x000000c0
#define CP_DRAW_INDX_OFFSET_0_SOURCE_SELECT__SHIFT 6
static inline uint32_t CP_DRAW_INDX_OFFSET_0_SOURCE_SELECT(enum pc_di_src_sel val)
{
return ((val) << CP_DRAW_INDX_OFFSET_0_SOURCE_SELECT__SHIFT) & CP_DRAW_INDX_OFFSET_0_SOURCE_SELECT__MASK;
}
#define CP_DRAW_INDX_OFFSET_0_VIS_CULL__MASK 0x00000700
#define CP_DRAW_INDX_OFFSET_0_VIS_CULL__SHIFT 8
static inline uint32_t CP_DRAW_INDX_OFFSET_0_VIS_CULL(enum pc_di_vis_cull_mode val)
{
return ((val) << CP_DRAW_INDX_OFFSET_0_VIS_CULL__SHIFT) & CP_DRAW_INDX_OFFSET_0_VIS_CULL__MASK;
}
#define CP_DRAW_INDX_OFFSET_0_INDEX_SIZE__MASK 0x00000800
#define CP_DRAW_INDX_OFFSET_0_INDEX_SIZE__SHIFT 11
static inline uint32_t CP_DRAW_INDX_OFFSET_0_INDEX_SIZE(enum pc_di_index_size val)
{
return ((val) << CP_DRAW_INDX_OFFSET_0_INDEX_SIZE__SHIFT) & CP_DRAW_INDX_OFFSET_0_INDEX_SIZE__MASK;
}
#define CP_DRAW_INDX_OFFSET_0_NOT_EOP 0x00001000
#define CP_DRAW_INDX_OFFSET_0_SMALL_INDEX 0x00002000
#define CP_DRAW_INDX_OFFSET_0_PRE_DRAW_INITIATOR_ENABLE 0x00004000
#define CP_DRAW_INDX_OFFSET_0_NUM_INDICES__MASK 0xffff0000
#define CP_DRAW_INDX_OFFSET_0_NUM_INDICES__SHIFT 16
static inline uint32_t CP_DRAW_INDX_OFFSET_0_NUM_INDICES(uint32_t val)
{
return ((val) << CP_DRAW_INDX_OFFSET_0_NUM_INDICES__SHIFT) & CP_DRAW_INDX_OFFSET_0_NUM_INDICES__MASK;
}
#define REG_CP_DRAW_INDX_OFFSET_1 0x00000001
#define REG_CP_DRAW_INDX_OFFSET_2 0x00000002
#define CP_DRAW_INDX_OFFSET_2_NUM_INDICES__MASK 0xffffffff
#define CP_DRAW_INDX_OFFSET_2_NUM_INDICES__SHIFT 0
static inline uint32_t CP_DRAW_INDX_OFFSET_2_NUM_INDICES(uint32_t val)
{
return ((val) << CP_DRAW_INDX_OFFSET_2_NUM_INDICES__SHIFT) & CP_DRAW_INDX_OFFSET_2_NUM_INDICES__MASK;
}
#define REG_CP_DRAW_INDX_OFFSET_2 0x00000002
#define CP_DRAW_INDX_OFFSET_2_INDX_BASE__MASK 0xffffffff
#define CP_DRAW_INDX_OFFSET_2_INDX_BASE__SHIFT 0
static inline uint32_t CP_DRAW_INDX_OFFSET_2_INDX_BASE(uint32_t val)
{
return ((val) << CP_DRAW_INDX_OFFSET_2_INDX_BASE__SHIFT) & CP_DRAW_INDX_OFFSET_2_INDX_BASE__MASK;
}
#define REG_CP_DRAW_INDX_OFFSET_2 0x00000002
#define CP_DRAW_INDX_OFFSET_2_INDX_SIZE__MASK 0xffffffff
#define CP_DRAW_INDX_OFFSET_2_INDX_SIZE__SHIFT 0
static inline uint32_t CP_DRAW_INDX_OFFSET_2_INDX_SIZE(uint32_t val)
{
return ((val) << CP_DRAW_INDX_OFFSET_2_INDX_SIZE__SHIFT) & CP_DRAW_INDX_OFFSET_2_INDX_SIZE__MASK;
}
#define REG_CP_SET_DRAW_STATE_0 0x00000000
#define CP_SET_DRAW_STATE_0_COUNT__MASK 0x0000ffff
#define CP_SET_DRAW_STATE_0_COUNT__SHIFT 0
static inline uint32_t CP_SET_DRAW_STATE_0_COUNT(uint32_t val)
{
return ((val) << CP_SET_DRAW_STATE_0_COUNT__SHIFT) & CP_SET_DRAW_STATE_0_COUNT__MASK;
}
#define CP_SET_DRAW_STATE_0_DIRTY 0x00010000
#define CP_SET_DRAW_STATE_0_DISABLE 0x00020000
#define CP_SET_DRAW_STATE_0_DISABLE_ALL_GROUPS 0x00040000
#define CP_SET_DRAW_STATE_0_LOAD_IMMED 0x00080000
#define CP_SET_DRAW_STATE_0_GROUP_ID__MASK 0x1f000000
#define CP_SET_DRAW_STATE_0_GROUP_ID__SHIFT 24
static inline uint32_t CP_SET_DRAW_STATE_0_GROUP_ID(uint32_t val)
{
return ((val) << CP_SET_DRAW_STATE_0_GROUP_ID__SHIFT) & CP_SET_DRAW_STATE_0_GROUP_ID__MASK;
}
#define REG_CP_SET_DRAW_STATE_1 0x00000001
#define CP_SET_DRAW_STATE_1_ADDR__MASK 0xffffffff
#define CP_SET_DRAW_STATE_1_ADDR__SHIFT 0
static inline uint32_t CP_SET_DRAW_STATE_1_ADDR(uint32_t val)
{
return ((val) << CP_SET_DRAW_STATE_1_ADDR__SHIFT) & CP_SET_DRAW_STATE_1_ADDR__MASK;
}
#define REG_CP_SET_BIN_0 0x00000000
#define REG_CP_SET_BIN_1 0x00000001

View File

@@ -34,6 +34,7 @@
#include "freedreno_state.h"
#include "freedreno_gmem.h"
#include "freedreno_query.h"
#include "freedreno_query_hw.h"
#include "freedreno_util.h"
static struct fd_ringbuffer *next_rb(struct fd_context *ctx)
@@ -145,6 +146,7 @@ fd_context_destroy(struct pipe_context *pctx)
DBG("");
fd_prog_fini(pctx);
fd_hw_query_fini(pctx);
util_slab_destroy(&ctx->transfer_pool);
@@ -221,6 +223,7 @@ fd_context_init(struct fd_context *ctx, struct pipe_screen *pscreen,
fd_query_context_init(pctx);
fd_texture_init(pctx);
fd_state_init(pctx);
fd_hw_query_init(pctx);
ctx->blitter = util_blitter_create(pctx);
if (!ctx->blitter)

View File

@@ -33,6 +33,7 @@
#include "pipe/p_context.h"
#include "indices/u_primconvert.h"
#include "util/u_blitter.h"
#include "util/u_double_list.h"
#include "util/u_slab.h"
#include "util/u_string.h"
@@ -82,16 +83,80 @@ struct fd_vertex_stateobj {
unsigned num_elements;
};
/* Bitmask of stages in rendering that a particular query query is
* active. Queries will be automatically started/stopped (generating
* additional fd_hw_sample_period's) on entrance/exit from stages that
* are applicable to the query.
*
* NOTE: set the stage to NULL at end of IB to ensure no query is still
* active. Things aren't going to work out the way you want if a query
* is active across IB's (or between tile IB and draw IB)
*/
enum fd_render_stage {
FD_STAGE_NULL = 0x00,
FD_STAGE_DRAW = 0x01,
FD_STAGE_CLEAR = 0x02,
/* TODO before queries which include MEM2GMEM or GMEM2MEM will
* work we will need to call fd_hw_query_prepare() from somewhere
* appropriate so that queries in the tiling IB get backed with
* memory to write results to.
*/
FD_STAGE_MEM2GMEM = 0x04,
FD_STAGE_GMEM2MEM = 0x08,
/* used for driver internal draws (ie. util_blitter_blit()): */
FD_STAGE_BLIT = 0x10,
};
#define MAX_HW_SAMPLE_PROVIDERS 4
struct fd_hw_sample_provider;
struct fd_hw_sample;
struct fd_context {
struct pipe_context base;
struct fd_device *dev;
struct fd_screen *screen;
struct blitter_context *blitter;
struct primconvert_context *primconvert;
/* slab for pipe_transfer allocations: */
struct util_slab_mempool transfer_pool;
/* slabs for fd_hw_sample and fd_hw_sample_period allocations: */
struct util_slab_mempool sample_pool;
struct util_slab_mempool sample_period_pool;
/* next sample offset.. incremented for each sample in the batch/
* submit, reset to zero on next submit.
*/
uint32_t next_sample_offset;
/* sample-providers for hw queries: */
const struct fd_hw_sample_provider *sample_providers[MAX_HW_SAMPLE_PROVIDERS];
/* cached samples (in case multiple queries need to reference
* the same sample snapshot)
*/
struct fd_hw_sample *sample_cache[MAX_HW_SAMPLE_PROVIDERS];
/* tracking for current stage, to know when to start/stop
* any active queries:
*/
enum fd_render_stage stage;
/* list of active queries: */
struct list_head active_queries;
/* list of queries that are not active, but were active in the
* current submit:
*/
struct list_head current_queries;
/* current query result bo and tile stride: */
struct fd_bo *query_bo;
uint32_t query_tile_stride;
/* table with PIPE_PRIM_MAX entries mapping PIPE_PRIM_x to
* DI_PT_x value to use for draw initiator. There are some
* slight differences between generation:

View File

@@ -36,6 +36,7 @@
#include "freedreno_context.h"
#include "freedreno_state.h"
#include "freedreno_resource.h"
#include "freedreno_query_hw.h"
#include "freedreno_util.h"
@@ -70,7 +71,7 @@ fd_draw_emit(struct fd_context *ctx, struct fd_ringbuffer *ring,
idx_bo = fd_resource(idx->buffer)->bo;
idx_type = size2indextype(idx->index_size);
idx_size = idx->index_size * info->count;
idx_offset = idx->offset;
idx_offset = idx->offset + (info->start * idx->index_size);
src_sel = DI_SRC_SEL_DMA;
} else {
idx_bo = NULL;
@@ -156,6 +157,7 @@ fd_draw_vbo(struct pipe_context *pctx, const struct pipe_draw_info *info)
/* and any buffers used, need to be resolved: */
ctx->resolve |= buffers;
fd_hw_query_set_stage(ctx, ctx->ring, FD_STAGE_DRAW);
ctx->draw(ctx, info);
}
@@ -188,6 +190,8 @@ fd_clear(struct pipe_context *pctx, unsigned buffers,
util_format_short_name(pipe_surface_format(pfb->cbufs[0])),
util_format_short_name(pipe_surface_format(pfb->zsbuf)));
fd_hw_query_set_stage(ctx, ctx->ring, FD_STAGE_CLEAR);
ctx->clear(ctx, buffers, color, depth, stencil);
ctx->dirty |= FD_DIRTY_ZSA |

View File

@@ -35,6 +35,7 @@
#include "freedreno_gmem.h"
#include "freedreno_context.h"
#include "freedreno_resource.h"
#include "freedreno_query_hw.h"
#include "freedreno_util.h"
/*
@@ -273,17 +274,24 @@ render_tiles(struct fd_context *ctx)
ctx->emit_tile_prep(ctx, tile);
if (ctx->restore)
if (ctx->restore) {
fd_hw_query_set_stage(ctx, ctx->ring, FD_STAGE_MEM2GMEM);
ctx->emit_tile_mem2gmem(ctx, tile);
fd_hw_query_set_stage(ctx, ctx->ring, FD_STAGE_NULL);
}
ctx->emit_tile_renderprep(ctx, tile);
fd_hw_query_prepare_tile(ctx, i, ctx->ring);
/* emit IB to drawcmds: */
OUT_IB(ctx->ring, ctx->draw_start, ctx->draw_end);
fd_reset_wfi(ctx);
/* emit gmem2mem to transfer tile back to system memory: */
fd_hw_query_set_stage(ctx, ctx->ring, FD_STAGE_GMEM2MEM);
ctx->emit_tile_gmem2mem(ctx, tile);
fd_hw_query_set_stage(ctx, ctx->ring, FD_STAGE_NULL);
}
}
@@ -292,6 +300,8 @@ render_sysmem(struct fd_context *ctx)
{
ctx->emit_sysmem_prep(ctx);
fd_hw_query_prepare_tile(ctx, 0, ctx->ring);
/* emit IB to drawcmds: */
OUT_IB(ctx->ring, ctx->draw_start, ctx->draw_end);
fd_reset_wfi(ctx);
@@ -314,6 +324,11 @@ fd_gmem_render_tiles(struct pipe_context *pctx)
}
}
/* close out the draw cmds by making sure any active queries are
* paused:
*/
fd_hw_query_set_stage(ctx, ctx->ring, FD_STAGE_NULL);
/* mark the end of the clear/draw cmds before emitting per-tile cmds: */
fd_ringmarker_mark(ctx->draw_end);
fd_ringmarker_mark(ctx->binning_end);
@@ -326,6 +341,7 @@ fd_gmem_render_tiles(struct pipe_context *pctx)
DBG("rendering sysmem (%s/%s)",
util_format_short_name(pipe_surface_format(pfb->cbufs[0])),
util_format_short_name(pipe_surface_format(pfb->zsbuf)));
fd_hw_query_prepare(ctx, 1);
render_sysmem(ctx);
ctx->stats.batch_sysmem++;
} else {
@@ -334,6 +350,7 @@ fd_gmem_render_tiles(struct pipe_context *pctx)
DBG("rendering %dx%d tiles (%s/%s)", gmem->nbins_x, gmem->nbins_y,
util_format_short_name(pipe_surface_format(pfb->cbufs[0])),
util_format_short_name(pipe_surface_format(pfb->zsbuf)));
fd_hw_query_prepare(ctx, gmem->nbins_x * gmem->nbins_y);
render_tiles(ctx);
ctx->stats.batch_gmem++;
}

View File

@@ -1,7 +1,7 @@
/* -*- mode: C; c-file-style: "k&r"; ttxab-width 4; indent-tabs-mode: t; -*- */
/*
* Copyright (C) 2012 Rob Clark <robclark@freedesktop.org>
* Copyright (C) 2013 Rob Clark <robclark@freedesktop.org>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
@@ -27,63 +27,27 @@
*/
#include "pipe/p_state.h"
#include "util/u_string.h"
#include "util/u_memory.h"
#include "util/u_inlines.h"
#include "os/os_time.h"
#include "freedreno_query.h"
#include "freedreno_query_sw.h"
#include "freedreno_query_hw.h"
#include "freedreno_context.h"
#include "freedreno_util.h"
#define FD_QUERY_DRAW_CALLS (PIPE_QUERY_DRIVER_SPECIFIC + 0)
#define FD_QUERY_BATCH_TOTAL (PIPE_QUERY_DRIVER_SPECIFIC + 1) /* total # of batches (submits) */
#define FD_QUERY_BATCH_SYSMEM (PIPE_QUERY_DRIVER_SPECIFIC + 2) /* batches using system memory (GMEM bypass) */
#define FD_QUERY_BATCH_GMEM (PIPE_QUERY_DRIVER_SPECIFIC + 3) /* batches using GMEM */
#define FD_QUERY_BATCH_RESTORE (PIPE_QUERY_DRIVER_SPECIFIC + 4) /* batches requiring GMEM restore */
/* Currently just simple cpu query's supported.. probably need
* to refactor this a bit when I'm eventually ready to add gpu
* queries:
/*
* Pipe Query interface:
*/
struct fd_query {
int type;
/* storage for the collected data */
union pipe_query_result data;
bool active;
uint64_t begin_value, end_value;
uint64_t begin_time, end_time;
};
static inline struct fd_query *
fd_query(struct pipe_query *pq)
{
return (struct fd_query *)pq;
}
static struct pipe_query *
fd_create_query(struct pipe_context *pctx, unsigned query_type)
{
struct fd_context *ctx = fd_context(pctx);
struct fd_query *q;
switch (query_type) {
case PIPE_QUERY_PRIMITIVES_GENERATED:
case PIPE_QUERY_PRIMITIVES_EMITTED:
case FD_QUERY_DRAW_CALLS:
case FD_QUERY_BATCH_TOTAL:
case FD_QUERY_BATCH_SYSMEM:
case FD_QUERY_BATCH_GMEM:
case FD_QUERY_BATCH_RESTORE:
break;
default:
return NULL;
}
q = CALLOC_STRUCT(fd_query);
q = fd_sw_create_query(ctx, query_type);
if (!q)
return NULL;
q->type = query_type;
q = fd_hw_create_query(ctx, query_type);
return (struct pipe_query *) q;
}
@@ -92,64 +56,21 @@ static void
fd_destroy_query(struct pipe_context *pctx, struct pipe_query *pq)
{
struct fd_query *q = fd_query(pq);
free(q);
}
static uint64_t
read_counter(struct pipe_context *pctx, int type)
{
struct fd_context *ctx = fd_context(pctx);
switch (type) {
case PIPE_QUERY_PRIMITIVES_GENERATED:
/* for now same thing as _PRIMITIVES_EMITTED */
case PIPE_QUERY_PRIMITIVES_EMITTED:
return ctx->stats.prims_emitted;
case FD_QUERY_DRAW_CALLS:
return ctx->stats.draw_calls;
case FD_QUERY_BATCH_TOTAL:
return ctx->stats.batch_total;
case FD_QUERY_BATCH_SYSMEM:
return ctx->stats.batch_sysmem;
case FD_QUERY_BATCH_GMEM:
return ctx->stats.batch_gmem;
case FD_QUERY_BATCH_RESTORE:
return ctx->stats.batch_restore;
}
return 0;
}
static bool
is_rate_query(struct fd_query *q)
{
switch (q->type) {
case FD_QUERY_BATCH_TOTAL:
case FD_QUERY_BATCH_SYSMEM:
case FD_QUERY_BATCH_GMEM:
case FD_QUERY_BATCH_RESTORE:
return true;
default:
return false;
}
q->funcs->destroy_query(fd_context(pctx), q);
}
static void
fd_begin_query(struct pipe_context *pctx, struct pipe_query *pq)
{
struct fd_query *q = fd_query(pq);
q->active = true;
q->begin_value = read_counter(pctx, q->type);
if (is_rate_query(q))
q->begin_time = os_time_get();
q->funcs->begin_query(fd_context(pctx), q);
}
static void
fd_end_query(struct pipe_context *pctx, struct pipe_query *pq)
{
struct fd_query *q = fd_query(pq);
q->active = false;
q->end_value = read_counter(pctx, q->type);
if (is_rate_query(q))
q->end_time = os_time_get();
q->funcs->end_query(fd_context(pctx), q);
}
static boolean
@@ -157,21 +78,7 @@ fd_get_query_result(struct pipe_context *pctx, struct pipe_query *pq,
boolean wait, union pipe_query_result *result)
{
struct fd_query *q = fd_query(pq);
if (q->active)
return false;
util_query_clear_result(result, q->type);
result->u64 = q->end_value - q->begin_value;
if (is_rate_query(q)) {
double fps = (result->u64 * 1000000) /
(double)(q->end_time - q->begin_time);
result->u64 = (uint64_t)fps;
}
return true;
return q->funcs->get_query_result(fd_context(pctx), q, wait, result);
}
static int

View File

@@ -1,7 +1,7 @@
/* -*- mode: C; c-file-style: "k&r"; tab-width 4; indent-tabs-mode: t; -*- */
/*
* Copyright (C) 2012 Rob Clark <robclark@freedesktop.org>
* Copyright (C) 2013 Rob Clark <robclark@freedesktop.org>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
@@ -31,6 +31,37 @@
#include "pipe/p_context.h"
struct fd_context;
struct fd_query;
struct fd_query_funcs {
void (*destroy_query)(struct fd_context *ctx,
struct fd_query *q);
void (*begin_query)(struct fd_context *ctx, struct fd_query *q);
void (*end_query)(struct fd_context *ctx, struct fd_query *q);
boolean (*get_query_result)(struct fd_context *ctx,
struct fd_query *q, boolean wait,
union pipe_query_result *result);
};
struct fd_query {
const struct fd_query_funcs *funcs;
bool active;
int type;
};
static inline struct fd_query *
fd_query(struct pipe_query *pq)
{
return (struct fd_query *)pq;
}
#define FD_QUERY_DRAW_CALLS (PIPE_QUERY_DRIVER_SPECIFIC + 0)
#define FD_QUERY_BATCH_TOTAL (PIPE_QUERY_DRIVER_SPECIFIC + 1) /* total # of batches (submits) */
#define FD_QUERY_BATCH_SYSMEM (PIPE_QUERY_DRIVER_SPECIFIC + 2) /* batches using system memory (GMEM bypass) */
#define FD_QUERY_BATCH_GMEM (PIPE_QUERY_DRIVER_SPECIFIC + 3) /* batches using GMEM */
#define FD_QUERY_BATCH_RESTORE (PIPE_QUERY_DRIVER_SPECIFIC + 4) /* batches requiring GMEM restore */
void fd_query_screen_init(struct pipe_screen *pscreen);
void fd_query_context_init(struct pipe_context *pctx);

View File

@@ -0,0 +1,465 @@
/* -*- mode: C; c-file-style: "k&r"; tab-width 4; indent-tabs-mode: t; -*- */
/*
* Copyright (C) 2014 Rob Clark <robclark@freedesktop.org>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
* Authors:
* Rob Clark <robclark@freedesktop.org>
*/
#include "pipe/p_state.h"
#include "util/u_memory.h"
#include "util/u_inlines.h"
#include "freedreno_query_hw.h"
#include "freedreno_context.h"
#include "freedreno_util.h"
struct fd_hw_sample_period {
struct fd_hw_sample *start, *end;
struct list_head list;
};
/* maps query_type to sample provider idx: */
static int pidx(unsigned query_type)
{
switch (query_type) {
case PIPE_QUERY_OCCLUSION_COUNTER:
return 0;
case PIPE_QUERY_OCCLUSION_PREDICATE:
return 1;
default:
return -1;
}
}
static struct fd_hw_sample *
get_sample(struct fd_context *ctx, struct fd_ringbuffer *ring,
unsigned query_type)
{
struct fd_hw_sample *samp = NULL;
int idx = pidx(query_type);
if (!ctx->sample_cache[idx]) {
ctx->sample_cache[idx] =
ctx->sample_providers[idx]->get_sample(ctx, ring);
}
fd_hw_sample_reference(ctx, &samp, ctx->sample_cache[idx]);
return samp;
}
static void
clear_sample_cache(struct fd_context *ctx)
{
int i;
for (i = 0; i < ARRAY_SIZE(ctx->sample_cache); i++)
fd_hw_sample_reference(ctx, &ctx->sample_cache[i], NULL);
}
static bool
is_active(struct fd_hw_query *hq, enum fd_render_stage stage)
{
return !!(hq->provider->active & stage);
}
static void
resume_query(struct fd_context *ctx, struct fd_hw_query *hq,
struct fd_ringbuffer *ring)
{
assert(!hq->period);
hq->period = util_slab_alloc(&ctx->sample_period_pool);
list_inithead(&hq->period->list);
hq->period->start = get_sample(ctx, ring, hq->base.type);
/* NOTE: util_slab_alloc() does not zero out the buffer: */
hq->period->end = NULL;
}
static void
pause_query(struct fd_context *ctx, struct fd_hw_query *hq,
struct fd_ringbuffer *ring)
{
assert(hq->period && !hq->period->end);
hq->period->end = get_sample(ctx, ring, hq->base.type);
list_addtail(&hq->period->list, &hq->current_periods);
hq->period = NULL;
}
static void
destroy_periods(struct fd_context *ctx, struct list_head *list)
{
struct fd_hw_sample_period *period, *s;
LIST_FOR_EACH_ENTRY_SAFE(period, s, list, list) {
fd_hw_sample_reference(ctx, &period->start, NULL);
fd_hw_sample_reference(ctx, &period->end, NULL);
list_del(&period->list);
util_slab_free(&ctx->sample_period_pool, period);
}
}
static void
fd_hw_destroy_query(struct fd_context *ctx, struct fd_query *q)
{
struct fd_hw_query *hq = fd_hw_query(q);
destroy_periods(ctx, &hq->periods);
destroy_periods(ctx, &hq->current_periods);
list_del(&hq->list);
free(hq);
}
static void
fd_hw_begin_query(struct fd_context *ctx, struct fd_query *q)
{
struct fd_hw_query *hq = fd_hw_query(q);
if (q->active)
return;
/* begin_query() should clear previous results: */
destroy_periods(ctx, &hq->periods);
if (is_active(hq, ctx->stage))
resume_query(ctx, hq, ctx->ring);
q->active = true;
/* add to active list: */
list_del(&hq->list);
list_addtail(&hq->list, &ctx->active_queries);
}
static void
fd_hw_end_query(struct fd_context *ctx, struct fd_query *q)
{
struct fd_hw_query *hq = fd_hw_query(q);
if (!q->active)
return;
if (is_active(hq, ctx->stage))
pause_query(ctx, hq, ctx->ring);
q->active = false;
/* move to current list: */
list_del(&hq->list);
list_addtail(&hq->list, &ctx->current_queries);
}
/* helper to get ptr to specified sample: */
static void * sampptr(struct fd_hw_sample *samp, uint32_t n, void *ptr)
{
return ((char *)ptr) + (samp->tile_stride * n) + samp->offset;
}
static boolean
fd_hw_get_query_result(struct fd_context *ctx, struct fd_query *q,
boolean wait, union pipe_query_result *result)
{
struct fd_hw_query *hq = fd_hw_query(q);
const struct fd_hw_sample_provider *p = hq->provider;
struct fd_hw_sample_period *period;
if (q->active)
return false;
/* if the app tries to read back the query result before the
* back is submitted, that forces us to flush so that there
* are actually results to wait for:
*/
if (!LIST_IS_EMPTY(&hq->list)) {
DBG("reading query result forces flush!");
ctx->needs_flush = true;
fd_context_render(&ctx->base);
}
util_query_clear_result(result, q->type);
if (LIST_IS_EMPTY(&hq->periods))
return true;
assert(LIST_IS_EMPTY(&hq->list));
assert(LIST_IS_EMPTY(&hq->current_periods));
assert(!hq->period);
if (LIST_IS_EMPTY(&hq->periods))
return true;
/* if !wait, then check the last sample (the one most likely to
* not be ready yet) and bail if it is not ready:
*/
if (!wait) {
int ret;
period = LIST_ENTRY(struct fd_hw_sample_period,
hq->periods.prev, list);
ret = fd_bo_cpu_prep(period->end->bo, ctx->screen->pipe,
DRM_FREEDRENO_PREP_READ | DRM_FREEDRENO_PREP_NOSYNC);
if (ret)
return false;
fd_bo_cpu_fini(period->end->bo);
}
/* sum the result across all sample periods: */
LIST_FOR_EACH_ENTRY(period, &hq->periods, list) {
struct fd_hw_sample *start = period->start;
struct fd_hw_sample *end = period->end;
unsigned i;
/* start and end samples should be from same batch: */
assert(start->bo == end->bo);
assert(start->num_tiles == end->num_tiles);
for (i = 0; i < start->num_tiles; i++) {
void *ptr;
fd_bo_cpu_prep(start->bo, ctx->screen->pipe,
DRM_FREEDRENO_PREP_READ);
ptr = fd_bo_map(start->bo);
p->accumulate_result(ctx, sampptr(period->start, i, ptr),
sampptr(period->end, i, ptr), result);
fd_bo_cpu_fini(start->bo);
}
}
return true;
}
static const struct fd_query_funcs hw_query_funcs = {
.destroy_query = fd_hw_destroy_query,
.begin_query = fd_hw_begin_query,
.end_query = fd_hw_end_query,
.get_query_result = fd_hw_get_query_result,
};
struct fd_query *
fd_hw_create_query(struct fd_context *ctx, unsigned query_type)
{
struct fd_hw_query *hq;
struct fd_query *q;
int idx = pidx(query_type);
if ((idx < 0) || !ctx->sample_providers[idx])
return NULL;
hq = CALLOC_STRUCT(fd_hw_query);
if (!hq)
return NULL;
hq->provider = ctx->sample_providers[idx];
list_inithead(&hq->periods);
list_inithead(&hq->current_periods);
list_inithead(&hq->list);
q = &hq->base;
q->funcs = &hw_query_funcs;
q->type = query_type;
return q;
}
struct fd_hw_sample *
fd_hw_sample_init(struct fd_context *ctx, uint32_t size)
{
struct fd_hw_sample *samp = util_slab_alloc(&ctx->sample_pool);
pipe_reference_init(&samp->reference, 1);
samp->size = size;
samp->offset = ctx->next_sample_offset;
/* NOTE: util_slab_alloc() does not zero out the buffer: */
samp->bo = NULL;
samp->num_tiles = 0;
samp->tile_stride = 0;
ctx->next_sample_offset += size;
return samp;
}
void
__fd_hw_sample_destroy(struct fd_context *ctx, struct fd_hw_sample *samp)
{
if (samp->bo)
fd_bo_del(samp->bo);
util_slab_free(&ctx->sample_pool, samp);
}
static void
prepare_sample(struct fd_hw_sample *samp, struct fd_bo *bo,
uint32_t num_tiles, uint32_t tile_stride)
{
if (samp->bo) {
assert(samp->bo == bo);
assert(samp->num_tiles == num_tiles);
assert(samp->tile_stride == tile_stride);
return;
}
samp->bo = bo;
samp->num_tiles = num_tiles;
samp->tile_stride = tile_stride;
}
static void
prepare_query(struct fd_hw_query *hq, struct fd_bo *bo,
uint32_t num_tiles, uint32_t tile_stride)
{
struct fd_hw_sample_period *period, *s;
/* prepare all the samples in the query: */
LIST_FOR_EACH_ENTRY_SAFE(period, s, &hq->current_periods, list) {
prepare_sample(period->start, bo, num_tiles, tile_stride);
prepare_sample(period->end, bo, num_tiles, tile_stride);
/* move from current_periods list to periods list: */
list_del(&period->list);
list_addtail(&period->list, &hq->periods);
}
}
static void
prepare_queries(struct fd_context *ctx, struct fd_bo *bo,
uint32_t num_tiles, uint32_t tile_stride,
struct list_head *list, bool remove)
{
struct fd_hw_query *hq, *s;
LIST_FOR_EACH_ENTRY_SAFE(hq, s, list, list) {
prepare_query(hq, bo, num_tiles, tile_stride);
if (remove)
list_delinit(&hq->list);
}
}
/* called from gmem code once total storage requirements are known (ie.
* number of samples times number of tiles)
*/
void
fd_hw_query_prepare(struct fd_context *ctx, uint32_t num_tiles)
{
uint32_t tile_stride = ctx->next_sample_offset;
struct fd_bo *bo;
if (ctx->query_bo)
fd_bo_del(ctx->query_bo);
if (tile_stride > 0) {
bo = fd_bo_new(ctx->dev, tile_stride * num_tiles,
DRM_FREEDRENO_GEM_CACHE_WCOMBINE |
DRM_FREEDRENO_GEM_TYPE_KMEM);
} else {
bo = NULL;
}
ctx->query_bo = bo;
ctx->query_tile_stride = tile_stride;
prepare_queries(ctx, bo, num_tiles, tile_stride,
&ctx->active_queries, false);
prepare_queries(ctx, bo, num_tiles, tile_stride,
&ctx->current_queries, true);
/* reset things for next batch: */
ctx->next_sample_offset = 0;
}
void
fd_hw_query_prepare_tile(struct fd_context *ctx, uint32_t n,
struct fd_ringbuffer *ring)
{
uint32_t tile_stride = ctx->query_tile_stride;
uint32_t offset = tile_stride * n;
/* bail if no queries: */
if (tile_stride == 0)
return;
fd_wfi(ctx, ring);
OUT_PKT0 (ring, HW_QUERY_BASE_REG, 1);
OUT_RELOCW(ring, ctx->query_bo, offset, 0, 0);
}
void
fd_hw_query_set_stage(struct fd_context *ctx, struct fd_ringbuffer *ring,
enum fd_render_stage stage)
{
/* special case: internal blits (like mipmap level generation)
* go through normal draw path (via util_blitter_blit()).. but
* we need to ignore the FD_STAGE_DRAW which will be set, so we
* don't enable queries which should be paused during internal
* blits:
*/
if ((ctx->stage == FD_STAGE_BLIT) &&
(stage != FD_STAGE_NULL))
return;
if (stage != ctx->stage) {
struct fd_hw_query *hq;
LIST_FOR_EACH_ENTRY(hq, &ctx->active_queries, list) {
bool was_active = is_active(hq, ctx->stage);
bool now_active = is_active(hq, stage);
if (now_active && !was_active)
resume_query(ctx, hq, ring);
else if (was_active && !now_active)
pause_query(ctx, hq, ring);
}
}
clear_sample_cache(ctx);
ctx->stage = stage;
}
void
fd_hw_query_register_provider(struct pipe_context *pctx,
const struct fd_hw_sample_provider *provider)
{
struct fd_context *ctx = fd_context(pctx);
int idx = pidx(provider->query_type);
assert((0 <= idx) && (idx < MAX_HW_SAMPLE_PROVIDERS));
assert(!ctx->sample_providers[idx]);
ctx->sample_providers[idx] = provider;
}
void
fd_hw_query_init(struct pipe_context *pctx)
{
struct fd_context *ctx = fd_context(pctx);
util_slab_create(&ctx->sample_pool, sizeof(struct fd_hw_sample),
16, UTIL_SLAB_SINGLETHREADED);
util_slab_create(&ctx->sample_period_pool, sizeof(struct fd_hw_sample_period),
16, UTIL_SLAB_SINGLETHREADED);
list_inithead(&ctx->active_queries);
list_inithead(&ctx->current_queries);
}
void
fd_hw_query_fini(struct pipe_context *pctx)
{
struct fd_context *ctx = fd_context(pctx);
util_slab_destroy(&ctx->sample_pool);
util_slab_destroy(&ctx->sample_period_pool);
}

View File

@@ -0,0 +1,164 @@
/* -*- mode: C; c-file-style: "k&r"; tab-width 4; indent-tabs-mode: t; -*- */
/*
* Copyright (C) 2014 Rob Clark <robclark@freedesktop.org>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
* Authors:
* Rob Clark <robclark@freedesktop.org>
*/
#ifndef FREEDRENO_QUERY_HW_H_
#define FREEDRENO_QUERY_HW_H_
#include "util/u_double_list.h"
#include "freedreno_query.h"
#include "freedreno_context.h"
/*
* HW Queries:
*
* See: https://github.com/freedreno/freedreno/wiki/Queries#hardware-queries
*
* Hardware queries will be specific to gpu generation, but they need
* some common infrastructure for triggering start/stop samples at
* various points (for example, to exclude mem2gmem/gmem2mem or clear)
* as well as per tile tracking.
*
* NOTE: in at least some cases hw writes sample values to memory addr
* specified in some register. So we don't really have the option to
* just sample the same counter multiple times for multiple different
* queries with the same query_type. So we cache per sample provider
* the most recent sample since the last draw. This way multiple
* sample periods for multiple queries can reference the same sample.
*
* fd_hw_sample_provider:
* - one per query type, registered/implemented by gpu generation
* specific code
* - can construct fd_hw_samples on demand
* - most recent sample (since last draw) cached so multiple
* different queries can ref the same sample
*
* fd_hw_sample:
* - abstracts one snapshot of counter value(s) across N tiles
* - backing object not allocated until submit time when number
* of samples and number of tiles is known
*
* fd_hw_sample_period:
* - consists of start and stop sample
* - a query accumulates a list of sample periods
* - the query result is the sum of the sample periods
*/
struct fd_hw_sample_provider {
unsigned query_type;
/* stages applicable to the query type: */
enum fd_render_stage active;
/* when a new sample is required, emit appropriate cmdstream
* and return a sample object:
*/
struct fd_hw_sample *(*get_sample)(struct fd_context *ctx,
struct fd_ringbuffer *ring);
/* accumulate the results from specified sample period: */
void (*accumulate_result)(struct fd_context *ctx,
const void *start, const void *end,
union pipe_query_result *result);
};
struct fd_hw_sample {
struct pipe_reference reference; /* keep this first */
/* offset and size of the sample are know at the time the
* sample is constructed.
*/
uint32_t size;
uint32_t offset;
/* backing object, offset/stride/etc are determined not when
* the sample is constructed, but when the batch is submitted.
* This way we can defer allocation until total # of requested
* samples, and total # of tiles, is known.
*/
struct fd_bo *bo;
uint32_t num_tiles;
uint32_t tile_stride;
};
struct fd_hw_sample_period;
struct fd_hw_query {
struct fd_query base;
const struct fd_hw_sample_provider *provider;
/* list of fd_hw_sample_period in previous submits: */
struct list_head periods;
/* list of fd_hw_sample_period's in current submit: */
struct list_head current_periods;
/* if active and not paused, the current sample period (not
* yet added to current_periods):
*/
struct fd_hw_sample_period *period;
struct list_head list; /* list-node in ctx->active_queries */
};
static inline struct fd_hw_query *
fd_hw_query(struct fd_query *q)
{
return (struct fd_hw_query *)q;
}
struct fd_query * fd_hw_create_query(struct fd_context *ctx, unsigned query_type);
/* helper for sample providers: */
struct fd_hw_sample * fd_hw_sample_init(struct fd_context *ctx, uint32_t size);
/* don't call directly, use fd_hw_sample_reference() */
void __fd_hw_sample_destroy(struct fd_context *ctx, struct fd_hw_sample *samp);
void fd_hw_query_prepare(struct fd_context *ctx, uint32_t num_tiles);
void fd_hw_query_prepare_tile(struct fd_context *ctx, uint32_t n,
struct fd_ringbuffer *ring);
void fd_hw_query_set_stage(struct fd_context *ctx,
struct fd_ringbuffer *ring, enum fd_render_stage stage);
void fd_hw_query_register_provider(struct pipe_context *pctx,
const struct fd_hw_sample_provider *provider);
void fd_hw_query_init(struct pipe_context *pctx);
void fd_hw_query_fini(struct pipe_context *pctx);
static inline void
fd_hw_sample_reference(struct fd_context *ctx,
struct fd_hw_sample **ptr, struct fd_hw_sample *samp)
{
struct fd_hw_sample *old_samp = *ptr;
if (pipe_reference(&(*ptr)->reference, &samp->reference))
__fd_hw_sample_destroy(ctx, old_samp);
if (ptr)
*ptr = samp;
}
#endif /* FREEDRENO_QUERY_HW_H_ */

View File

@@ -0,0 +1,165 @@
/* -*- mode: C; c-file-style: "k&r"; tab-width 4; indent-tabs-mode: t; -*- */
/*
* Copyright (C) 2014 Rob Clark <robclark@freedesktop.org>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
* Authors:
* Rob Clark <robclark@freedesktop.org>
*/
#include "pipe/p_state.h"
#include "util/u_string.h"
#include "util/u_memory.h"
#include "util/u_inlines.h"
#include "os/os_time.h"
#include "freedreno_query_sw.h"
#include "freedreno_context.h"
#include "freedreno_util.h"
/*
* SW Queries:
*
* In the core, we have some support for basic sw counters
*/
static void
fd_sw_destroy_query(struct fd_context *ctx, struct fd_query *q)
{
struct fd_sw_query *sq = fd_sw_query(q);
free(sq);
}
static uint64_t
read_counter(struct fd_context *ctx, int type)
{
switch (type) {
case PIPE_QUERY_PRIMITIVES_GENERATED:
/* for now same thing as _PRIMITIVES_EMITTED */
case PIPE_QUERY_PRIMITIVES_EMITTED:
return ctx->stats.prims_emitted;
case FD_QUERY_DRAW_CALLS:
return ctx->stats.draw_calls;
case FD_QUERY_BATCH_TOTAL:
return ctx->stats.batch_total;
case FD_QUERY_BATCH_SYSMEM:
return ctx->stats.batch_sysmem;
case FD_QUERY_BATCH_GMEM:
return ctx->stats.batch_gmem;
case FD_QUERY_BATCH_RESTORE:
return ctx->stats.batch_restore;
}
return 0;
}
static bool
is_rate_query(struct fd_query *q)
{
switch (q->type) {
case FD_QUERY_BATCH_TOTAL:
case FD_QUERY_BATCH_SYSMEM:
case FD_QUERY_BATCH_GMEM:
case FD_QUERY_BATCH_RESTORE:
return true;
default:
return false;
}
}
static void
fd_sw_begin_query(struct fd_context *ctx, struct fd_query *q)
{
struct fd_sw_query *sq = fd_sw_query(q);
q->active = true;
sq->begin_value = read_counter(ctx, q->type);
if (is_rate_query(q))
sq->begin_time = os_time_get();
}
static void
fd_sw_end_query(struct fd_context *ctx, struct fd_query *q)
{
struct fd_sw_query *sq = fd_sw_query(q);
q->active = false;
sq->end_value = read_counter(ctx, q->type);
if (is_rate_query(q))
sq->end_time = os_time_get();
}
static boolean
fd_sw_get_query_result(struct fd_context *ctx, struct fd_query *q,
boolean wait, union pipe_query_result *result)
{
struct fd_sw_query *sq = fd_sw_query(q);
if (q->active)
return false;
util_query_clear_result(result, q->type);
result->u64 = sq->end_value - sq->begin_value;
if (is_rate_query(q)) {
double fps = (result->u64 * 1000000) /
(double)(sq->end_time - sq->begin_time);
result->u64 = (uint64_t)fps;
}
return true;
}
static const struct fd_query_funcs sw_query_funcs = {
.destroy_query = fd_sw_destroy_query,
.begin_query = fd_sw_begin_query,
.end_query = fd_sw_end_query,
.get_query_result = fd_sw_get_query_result,
};
struct fd_query *
fd_sw_create_query(struct fd_context *ctx, unsigned query_type)
{
struct fd_sw_query *sq;
struct fd_query *q;
switch (query_type) {
case PIPE_QUERY_PRIMITIVES_GENERATED:
case PIPE_QUERY_PRIMITIVES_EMITTED:
case FD_QUERY_DRAW_CALLS:
case FD_QUERY_BATCH_TOTAL:
case FD_QUERY_BATCH_SYSMEM:
case FD_QUERY_BATCH_GMEM:
case FD_QUERY_BATCH_RESTORE:
break;
default:
return NULL;
}
sq = CALLOC_STRUCT(fd_sw_query);
if (!sq)
return NULL;
q = &sq->base;
q->funcs = &sw_query_funcs;
q->type = query_type;
return q;
}

View File

@@ -0,0 +1,55 @@
/* -*- mode: C; c-file-style: "k&r"; tab-width 4; indent-tabs-mode: t; -*- */
/*
* Copyright (C) 2014 Rob Clark <robclark@freedesktop.org>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
* Authors:
* Rob Clark <robclark@freedesktop.org>
*/
#ifndef FREEDRENO_QUERY_SW_H_
#define FREEDRENO_QUERY_SW_H_
#include "freedreno_query.h"
/*
* SW Queries:
*
* In the core, we have some support for basic sw counters
*/
struct fd_sw_query {
struct fd_query base;
uint64_t begin_value, end_value;
uint64_t begin_time, end_time;
};
static inline struct fd_sw_query *
fd_sw_query(struct fd_query *q)
{
return (struct fd_sw_query *)q;
}
struct fd_query * fd_sw_create_query(struct fd_context *ctx,
unsigned query_type);
#endif /* FREEDRENO_QUERY_SW_H_ */

View File

@@ -36,6 +36,7 @@
#include "freedreno_screen.h"
#include "freedreno_surface.h"
#include "freedreno_context.h"
#include "freedreno_query_hw.h"
#include "freedreno_util.h"
#include <errno.h>
@@ -47,6 +48,10 @@ realloc_bo(struct fd_resource *rsc, uint32_t size)
uint32_t flags = DRM_FREEDRENO_GEM_CACHE_WCOMBINE |
DRM_FREEDRENO_GEM_TYPE_KMEM; /* TODO */
/* if we start using things other than write-combine,
* be sure to check for PIPE_RESOURCE_FLAG_MAP_COHERENT
*/
if (rsc->bo)
fd_bo_del(rsc->bo);
@@ -401,7 +406,9 @@ render_blit(struct pipe_context *pctx, struct pipe_blit_info *info)
util_blitter_save_fragment_sampler_views(ctx->blitter,
ctx->fragtex.num_textures, ctx->fragtex.textures);
fd_hw_query_set_stage(ctx, ctx->ring, FD_STAGE_BLIT);
util_blitter_blit(ctx->blitter, info);
fd_hw_query_set_stage(ctx, ctx->ring, FD_STAGE_NULL);
return true;
}

View File

@@ -50,8 +50,8 @@
#include "freedreno_query.h"
#include "freedreno_util.h"
#include "fd2_screen.h"
#include "fd3_screen.h"
#include "a2xx/fd2_screen.h"
#include "a3xx/fd3_screen.h"
/* XXX this should go away */
#include "state_tracker/drm_driver.h"
@@ -143,6 +143,8 @@ tables for things that differ if the delta is not too much..
static int
fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
{
struct fd_screen *screen = fd_screen(pscreen);
/* this is probably not totally correct.. but it's a start: */
switch (param) {
/* Supported features (boolean caps). */
@@ -159,11 +161,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_MIXED_COLORBUFFER_FORMATS:
case PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT:
case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER:
case PIPE_CAP_SM3:
case PIPE_CAP_SEAMLESS_CUBE_MAP:
case PIPE_CAP_PRIMITIVE_RESTART:
case PIPE_CAP_CONDITIONAL_RENDER:
case PIPE_CAP_TEXTURE_BARRIER:
case PIPE_CAP_VERTEX_COLOR_UNCLAMPED:
case PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION:
case PIPE_CAP_TGSI_INSTANCEID:
@@ -173,13 +171,18 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_COMPUTE:
case PIPE_CAP_START_INSTANCE:
case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS:
case PIPE_CAP_TEXTURE_MULTISAMPLE:
case PIPE_CAP_USER_CONSTANT_BUFFERS:
case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT:
return 1;
case PIPE_CAP_SHADER_STENCIL_EXPORT:
case PIPE_CAP_TGSI_TEXCOORD:
case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER:
case PIPE_CAP_CONDITIONAL_RENDER:
case PIPE_CAP_PRIMITIVE_RESTART:
case PIPE_CAP_TEXTURE_MULTISAMPLE:
case PIPE_CAP_TEXTURE_BARRIER:
case PIPE_CAP_SM3:
return 0;
case PIPE_CAP_CONSTANT_BUFFER_OFFSET_ALIGNMENT:
@@ -205,7 +208,6 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_TGSI_VS_LAYER:
case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
case PIPE_CAP_TEXTURE_GATHER_SM5:
case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT:
case PIPE_CAP_FAKE_SW_MSAA:
case PIPE_CAP_TEXTURE_QUERY_LOD:
case PIPE_CAP_SAMPLE_SHADING:
@@ -229,17 +231,18 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_TEXTURE_CUBE_LEVELS:
return MAX_MIP_LEVELS;
case PIPE_CAP_MAX_TEXTURE_ARRAY_LAYERS:
return 9192;
return 0; /* TODO: a3xx+ should support (required in gles3) */
/* Render targets. */
case PIPE_CAP_MAX_RENDER_TARGETS:
return 1;
/* Timer queries. */
/* Queries. */
case PIPE_CAP_QUERY_TIME_ELAPSED:
case PIPE_CAP_OCCLUSION_QUERY:
case PIPE_CAP_QUERY_TIMESTAMP:
return 0;
case PIPE_CAP_OCCLUSION_QUERY:
return (screen->gpu_id >= 300) ? 1: 0;
case PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET:
case PIPE_CAP_MIN_TEXEL_OFFSET:
@@ -252,7 +255,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_ENDIANNESS:
return PIPE_ENDIAN_LITTLE;
case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
return 64;
default:
@@ -315,7 +318,7 @@ fd_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
case PIPE_SHADER_CAP_MAX_CONTROL_FLOW_DEPTH:
return 8; /* XXX */
case PIPE_SHADER_CAP_MAX_INPUTS:
return 32;
return 16;
case PIPE_SHADER_CAP_MAX_TEMPS:
return 64; /* Max native temporaries. */
case PIPE_SHADER_CAP_MAX_ADDRS:

View File

@@ -57,7 +57,7 @@ static void bind_sampler_states(struct fd_texture_stateobj *prog,
for (i = 0; i < nr; i++) {
if (hwcso[i])
new_nr++;
new_nr = i + 1;
prog->samplers[i] = hwcso[i];
prog->dirty_samplers |= (1 << i);
}
@@ -78,7 +78,7 @@ static void set_sampler_views(struct fd_texture_stateobj *prog,
for (i = 0; i < nr; i++) {
if (views[i])
new_nr++;
new_nr = i + 1;
pipe_sampler_view_reference(&prog->textures[i], views[i]);
prog->dirty_samplers |= (1 << i);
}

View File

@@ -111,26 +111,6 @@ fd_blend_factor(unsigned factor)
}
}
enum adreno_rb_blend_opcode
fd_blend_func(unsigned func)
{
switch (func) {
case PIPE_BLEND_ADD:
return BLEND_DST_PLUS_SRC;
case PIPE_BLEND_MIN:
return BLEND_MIN_DST_SRC;
case PIPE_BLEND_MAX:
return BLEND_MAX_DST_SRC;
case PIPE_BLEND_SUBTRACT:
return BLEND_SRC_MINUS_DST;
case PIPE_BLEND_REVERSE_SUBTRACT:
return BLEND_DST_MINUS_SRC;
default:
DBG("invalid blend func: %x", func);
return 0;
}
}
enum adreno_pa_su_sc_draw
fd_polygon_mode(unsigned mode)
{

View File

@@ -45,7 +45,6 @@
enum adreno_rb_depth_format fd_pipe2depth(enum pipe_format format);
enum pc_di_index_size fd_pipe2index(enum pipe_format format);
enum adreno_rb_blend_factor fd_blend_factor(unsigned factor);
enum adreno_rb_blend_opcode fd_blend_func(unsigned func);
enum adreno_pa_su_sc_draw fd_polygon_mode(unsigned mode);
enum adreno_stencil_op fd_stencil_op(unsigned op);
@@ -223,11 +222,18 @@ OUT_IB(struct fd_ringbuffer *ring, struct fd_ringmarker *start,
emit_marker(ring, 6);
}
/* CP_SCRATCH_REG4 is used to hold base address for query results: */
#define HW_QUERY_BASE_REG REG_AXXX_CP_SCRATCH_REG4
static inline void
emit_marker(struct fd_ringbuffer *ring, int scratch_idx)
{
extern unsigned marker_cnt;
OUT_PKT0(ring, REG_AXXX_CP_SCRATCH_REG0 + scratch_idx, 1);
unsigned reg = REG_AXXX_CP_SCRATCH_REG0 + scratch_idx;
assert(reg != HW_QUERY_BASE_REG);
if (reg == HW_QUERY_BASE_REG)
return;
OUT_PKT0(ring, reg, 1);
OUT_RING(ring, ++marker_cnt);
}

View File

@@ -312,9 +312,15 @@ lp_rast_shade_tile(struct lp_rasterizer_task *task,
/* color buffer */
for (i = 0; i < scene->fb.nr_cbufs; i++){
stride[i] = scene->cbufs[i].stride;
color[i] = lp_rast_get_unswizzled_color_block_pointer(task, i, tile_x + x,
tile_y + y, inputs->layer);
if (scene->fb.cbufs[i]) {
stride[i] = scene->cbufs[i].stride;
color[i] = lp_rast_get_unswizzled_color_block_pointer(task, i, tile_x + x,
tile_y + y, inputs->layer);
}
else {
stride[i] = 0;
color[i] = NULL;
}
}
/* depth buffer */

View File

@@ -115,7 +115,7 @@ llvmpipe_texture_layout(struct llvmpipe_screen *screen,
lpr->row_stride[level] = align(nblocksx * block_size, util_cpu_caps.cacheline);
/* if row_stride * height > LP_MAX_TEXTURE_SIZE */
if (lpr->row_stride[level] > LP_MAX_TEXTURE_SIZE / nblocksy) {
if ((uint64_t)lpr->row_stride[level] * nblocksy > LP_MAX_TEXTURE_SIZE) {
/* image too large */
goto fail;
}

View File

@@ -28,18 +28,19 @@ include $(LOCAL_PATH)/Makefile.sources
include $(CLEAR_VARS)
LOCAL_SRC_FILES := $(C_SOURCES) \
LOCAL_SRC_FILES := \
$(C_SOURCES) \
$(NV30_C_SOURCES) \
$(NV50_CODEGEN_SOURCES) \
$(NV50_C_SOURES) \
$(NVC0_CODEGEN_SOURCES) \
$(NVC0_C_SOURCES)
LOCAL_C_INCLUDES := $(DRM_TOP) \
$(DRM_TOP)/include/drm \
$(DRM_TOP)/nouveau
LOCAL_C_INCLUDES := \
$(TARGET_OUT_HEADERS)/libdrm
LOCAL_MODULE := libmesa_pipe_nouveau
include external/stlport/libstlport.mk
include $(GALLIUM_COMMON_MK)
include $(BUILD_STATIC_LIBRARY)

View File

@@ -177,6 +177,7 @@ struct nv50_ir_prog_info
uint8_t vertexId; /* system value index of VertexID */
uint8_t edgeFlagIn;
uint8_t edgeFlagOut;
int8_t viewportId; /* output index of ViewportIndex */
uint8_t fragDepth; /* output index of FragDepth */
uint8_t sampleMask; /* output index of SampleMask */
boolean sampleInterp; /* perform sample interp on all fp inputs */

View File

@@ -287,10 +287,12 @@ CodeEmitterGK110::emitPredicate(const Instruction *i)
void
CodeEmitterGK110::setCAddress14(const ValueRef& src)
{
const int32_t addr = src.get()->asSym()->reg.data.offset / 4;
const Storage& res = src.get()->asSym()->reg;
const int32_t addr = res.data.offset / 4;
code[0] |= (addr & 0x01ff) << 23;
code[1] |= (addr & 0x3e00) >> 9;
code[1] |= res.fileIndex << 5;
}
void
@@ -413,7 +415,6 @@ CodeEmitterGK110::emitForm_21(const Instruction *i, uint32_t opc2,
case FILE_MEMORY_CONST:
code[1] &= (s == 2) ? ~(0x4 << 28) : ~(0x8 << 28);
setCAddress14(i->src(s));
code[1] |= i->getSrc(s)->reg.fileIndex << 5;
break;
case FILE_IMMEDIATE:
setShortImmediate(i, s);
@@ -555,6 +556,7 @@ CodeEmitterGK110::emitFADD(const Instruction *i)
RND_(2a, F);
ABS_(31, 0);
NEG_(33, 0);
SAT_(35);
if (code[0] & 0x1) {
modNegAbsF32_3b(i, 1);
@@ -633,7 +635,7 @@ CodeEmitterGK110::emitISAD(const Instruction *i)
{
assert(i->dType == TYPE_S32 || i->dType == TYPE_U32);
emitForm_21(i, 0x1fc, 0xb74);
emitForm_21(i, 0x1f4, 0xb74);
if (i->dType == TYPE_S32)
code[1] |= 1 << 19;
@@ -711,7 +713,7 @@ CodeEmitterGK110::emitEXTBF(const Instruction *i)
void
CodeEmitterGK110::emitBFIND(const Instruction *i)
{
emitForm_21(i, 0x618, 0xc18);
emitForm_C(i, 0x218, 0x2);
if (i->dType == TYPE_S32)
code[1] |= 0x80000;
@@ -915,6 +917,9 @@ CodeEmitterGK110::emitSET(const CmpInstruction *i)
modNegAbsF32_3b(i, 1);
}
FTZ_(3a);
if (i->dType == TYPE_F32)
code[1] |= 1 << 23;
}
if (i->sType == TYPE_S32)
code[1] |= 1 << 19;
@@ -949,7 +954,7 @@ CodeEmitterGK110::emitSLCT(const CmpInstruction *i)
FTZ_(32);
emitCondCode(cc, 0x33, 0xf);
} else {
emitForm_21(i, 0x1a4, 0xb20);
emitForm_21(i, 0x1a0, 0xb20);
emitCondCode(cc, 0x34, 0x7);
}
}
@@ -964,7 +969,7 @@ void CodeEmitterGK110::emitSELP(const Instruction *i)
void CodeEmitterGK110::emitTEXBAR(const Instruction *i)
{
code[0] = 0x00000002 | (i->subOp << 23);
code[0] = 0x0000003e | (i->subOp << 23);
code[1] = 0x77000000;
emitPredicate(i);
@@ -1201,7 +1206,7 @@ CodeEmitterGK110::emitFlow(const Instruction *i)
case OP_PRECONT: code[1] = 0x15800000; mask = 2; break;
case OP_PRERET: code[1] = 0x13800000; mask = 2; break;
case OP_QUADON: code[1] = 0x1b000000; mask = 0; break;
case OP_QUADON: code[1] = 0x1b800000; mask = 0; break;
case OP_QUADPOP: code[1] = 0x1c000000; mask = 0; break;
case OP_BRKPT: code[1] = 0x00000000; mask = 0; break;
default:
@@ -1323,7 +1328,8 @@ CodeEmitterGK110::emitOUT(const Instruction *i)
void
CodeEmitterGK110::emitInterpMode(const Instruction *i)
{
code[1] |= i->ipa << 21; // TODO: INTERP_SAMPLEID
code[1] |= (i->ipa & 0x3) << 21; // TODO: INTERP_SAMPLEID
code[1] |= (i->ipa & 0xc) << (19 - 2);
}
void

View File

@@ -790,6 +790,8 @@ bool Source::scanSource()
info->prop.gp.instanceCount = 1; // default value
}
info->io.viewportId = -1;
info->immd.data = (uint32_t *)MALLOC(scan.immediate_count * 16);
info->immd.type = (ubyte *)MALLOC(scan.immediate_count * sizeof(ubyte));
@@ -982,6 +984,9 @@ bool Source::scanDeclaration(const struct tgsi_full_declaration *decl)
case TGSI_SEMANTIC_SAMPLEMASK:
info->io.sampleMask = i;
break;
case TGSI_SEMANTIC_VIEWPORT_INDEX:
info->io.viewportId = i;
break;
default:
break;
}
@@ -1258,6 +1263,8 @@ private:
Stack joinBBs; // fork BB, for inserting join ops on ENDIF
Stack loopBBs; // loop headers
Stack breakBBs; // end of / after loop
Value *viewport;
};
Symbol *
@@ -1555,8 +1562,16 @@ Converter::storeDst(const tgsi::Instruction::DstRegister dst, int c,
mkOp2(OP_WRSV, TYPE_U32, NULL, dstToSym(dst, c), val);
} else
if (f == TGSI_FILE_OUTPUT && prog->getType() != Program::TYPE_FRAGMENT) {
if (ptr || (info->out[idx].mask & (1 << c)))
mkStore(OP_EXPORT, TYPE_U32, dstToSym(dst, c), ptr, val);
if (ptr || (info->out[idx].mask & (1 << c))) {
/* Save the viewport index into a scratch register so that it can be
exported at EMIT time */
if (info->out[idx].sn == TGSI_SEMANTIC_VIEWPORT_INDEX &&
viewport != NULL)
mkOp1(OP_MOV, TYPE_U32, viewport, val);
else
mkStore(OP_EXPORT, TYPE_U32, dstToSym(dst, c), ptr, val);
}
} else
if (f == TGSI_FILE_TEMPORARY ||
f == TGSI_FILE_PREDICATE ||
@@ -2199,7 +2214,6 @@ Converter::handleInstruction(const struct tgsi_full_instruction *insn)
case TGSI_OPCODE_IMUL_HI:
case TGSI_OPCODE_UMUL_HI:
case TGSI_OPCODE_OR:
case TGSI_OPCODE_POW:
case TGSI_OPCODE_SHL:
case TGSI_OPCODE_ISHR:
case TGSI_OPCODE_USHR:
@@ -2254,6 +2268,11 @@ Converter::handleInstruction(const struct tgsi_full_instruction *insn)
FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi)
mkOp1(OP_MOV, TYPE_U32, dst0[c], fetchSrc(0, c));
break;
case TGSI_OPCODE_POW:
val0 = mkOp2v(op, TYPE_F32, getScratch(), fetchSrc(0, 0), fetchSrc(1, 0));
FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi)
mkOp1(OP_MOV, TYPE_F32, dst0[c], val0);
break;
case TGSI_OPCODE_EX2:
case TGSI_OPCODE_LG2:
val0 = mkOp1(op, TYPE_F32, getScratch(), fetchSrc(0, 0))->getDef(0);
@@ -2453,7 +2472,12 @@ Converter::handleInstruction(const struct tgsi_full_instruction *insn)
break;
case TGSI_OPCODE_KILL_IF:
val0 = new_LValue(func, FILE_PREDICATE);
mask = 0;
for (c = 0; c < 4; ++c) {
const int s = tgsi.getSrc(0).getSwizzle(c);
if (mask & (1 << s))
continue;
mask |= 1 << s;
mkCmp(OP_SET, CC_LT, TYPE_F32, val0, TYPE_F32, fetchSrc(0, c), zero);
mkOp(OP_DISCARD, TYPE_NONE, NULL)->setPredicate(CC_P, val0);
}
@@ -2480,7 +2504,7 @@ Converter::handleInstruction(const struct tgsi_full_instruction *insn)
break;
case TGSI_OPCODE_TXB2:
case TGSI_OPCODE_TXL2:
handleTEX(dst0, 2, 2, 0x10, 0x11, 0x00, 0x00);
handleTEX(dst0, 2, 2, 0x10, 0x0f, 0x00, 0x00);
break;
case TGSI_OPCODE_SAMPLE:
case TGSI_OPCODE_SAMPLE_B:
@@ -2514,6 +2538,13 @@ Converter::handleInstruction(const struct tgsi_full_instruction *insn)
mkCvt(OP_CVT, dstTy, dst0[c], srcTy, fetchSrc(0, c));
break;
case TGSI_OPCODE_EMIT:
/* export the saved viewport index */
if (viewport != NULL) {
Symbol *vpSym = mkSymbol(FILE_SHADER_OUTPUT, 0, TYPE_U32,
info->out[info->io.viewportId].slot[0] * 4);
mkStore(OP_EXPORT, TYPE_U32, vpSym, NULL, viewport);
}
/* fallthrough */
case TGSI_OPCODE_ENDPRIM:
// get vertex stream if specified (must be immediate)
src0 = tgsi.srcCount() ?
@@ -2943,6 +2974,11 @@ Converter::run()
mkOp1(OP_RCP, TYPE_F32, fragCoord[3], fragCoord[3]);
}
if (info->io.viewportId >= 0)
viewport = getScratch();
else
viewport = NULL;
for (ip = 0; ip < code->scan.num_instructions; ++ip) {
if (!handleInstruction(&code->insns[ip]))
return false;

View File

@@ -37,18 +37,25 @@ namespace nv50_ir {
// ah*bl 00
//
// fffe0001 + fffe0001
//
// Note that this sort of splitting doesn't work for signed values, so we
// compute the sign on those manually and then perform an unsigned multiply.
static bool
expandIntegerMUL(BuildUtil *bld, Instruction *mul)
{
const bool highResult = mul->subOp == NV50_IR_SUBOP_MUL_HIGH;
DataType fTy = mul->sType; // full type
DataType hTy;
DataType fTy; // full type
switch (mul->sType) {
case TYPE_S32: fTy = TYPE_U32; break;
case TYPE_S64: fTy = TYPE_U64; break;
default: fTy = mul->sType; break;
}
DataType hTy; // half type
switch (fTy) {
case TYPE_S32: hTy = TYPE_S16; break;
case TYPE_U32: hTy = TYPE_U16; break;
case TYPE_U64: hTy = TYPE_U32; break;
case TYPE_S64: hTy = TYPE_S32; break;
default:
return false;
}
@@ -59,15 +66,25 @@ expandIntegerMUL(BuildUtil *bld, Instruction *mul)
bld->setPosition(mul, true);
Value *s[2];
Value *a[2], *b[2];
Value *c[2];
Value *t[4];
for (int j = 0; j < 4; ++j)
t[j] = bld->getSSA(fullSize);
s[0] = mul->getSrc(0);
s[1] = mul->getSrc(1);
if (isSignedType(mul->sType)) {
s[0] = bld->getSSA(fullSize);
s[1] = bld->getSSA(fullSize);
bld->mkOp1(OP_ABS, mul->sType, s[0], mul->getSrc(0));
bld->mkOp1(OP_ABS, mul->sType, s[1], mul->getSrc(1));
}
// split sources into halves
i[0] = bld->mkSplit(a, halfSize, mul->getSrc(0));
i[1] = bld->mkSplit(b, halfSize, mul->getSrc(1));
i[0] = bld->mkSplit(a, halfSize, s[0]);
i[1] = bld->mkSplit(b, halfSize, s[1]);
i[2] = bld->mkOp2(OP_MUL, fTy, t[0], a[0], b[1]);
i[3] = bld->mkOp3(OP_MAD, fTy, t[1], a[1], b[0], t[0]);
@@ -75,23 +92,76 @@ expandIntegerMUL(BuildUtil *bld, Instruction *mul)
i[4] = bld->mkOp3(OP_MAD, fTy, t[3], a[0], b[0], t[2]);
if (highResult) {
Value *r[3];
Value *c[2];
Value *r[5];
Value *imm = bld->loadImm(NULL, 1 << (halfSize * 8));
c[0] = bld->getSSA(1, FILE_FLAGS);
c[1] = bld->getSSA(1, FILE_FLAGS);
for (int j = 0; j < 3; ++j)
for (int j = 0; j < 5; ++j)
r[j] = bld->getSSA(fullSize);
i[8] = bld->mkOp2(OP_SHR, fTy, r[0], t[1], bld->mkImm(halfSize * 8));
i[6] = bld->mkOp2(OP_ADD, fTy, r[1], r[0], imm);
bld->mkOp2(OP_UNION, TYPE_U32, r[2], r[1], r[0]);
i[5] = bld->mkOp3(OP_MAD, fTy, mul->getDef(0), a[1], b[1], r[2]);
bld->mkMov(r[3], r[0])->setPredicate(CC_NC, c[0]);
bld->mkOp2(OP_UNION, TYPE_U32, r[2], r[1], r[3]);
i[5] = bld->mkOp3(OP_MAD, fTy, r[4], a[1], b[1], r[2]);
// set carry defs / sources
i[3]->setFlagsDef(1, c[0]);
i[4]->setFlagsDef(0, c[1]); // actual result not required, just the carry
// actual result required in negative case, but ignored for
// unsigned. for some reason the compiler ends up dropping the whole
// instruction if the destination is unused but the flags are.
if (isSignedType(mul->sType))
i[4]->setFlagsDef(1, c[1]);
else
i[4]->setFlagsDef(0, c[1]);
i[6]->setPredicate(CC_C, c[0]);
i[5]->setFlagsSrc(3, c[1]);
if (isSignedType(mul->sType)) {
Value *cc[2];
Value *rr[7];
Value *one = bld->getSSA(fullSize);
bld->loadImm(one, 1);
for (int j = 0; j < 7; j++)
rr[j] = bld->getSSA(fullSize);
// NOTE: this logic uses predicates because splitting basic blocks is
// ~impossible during the SSA phase. The RA relies on a correlation
// between edge order and phi node sources.
// Set the sign of the result based on the inputs
bld->mkOp2(OP_XOR, fTy, NULL, mul->getSrc(0), mul->getSrc(1))
->setFlagsDef(0, (cc[0] = bld->getSSA(1, FILE_FLAGS)));
// 1s complement of 64-bit value
bld->mkOp1(OP_NOT, fTy, rr[0], r[4])
->setPredicate(CC_S, cc[0]);
bld->mkOp1(OP_NOT, fTy, rr[1], t[3])
->setPredicate(CC_S, cc[0]);
// add to low 32-bits, keep track of the carry
Instruction *n = bld->mkOp2(OP_ADD, fTy, NULL, rr[1], one);
n->setPredicate(CC_S, cc[0]);
n->setFlagsDef(0, (cc[1] = bld->getSSA(1, FILE_FLAGS)));
// If there was a carry, add 1 to the upper 32 bits
// XXX: These get executed even if they shouldn't be
bld->mkOp2(OP_ADD, fTy, rr[2], rr[0], one)
->setPredicate(CC_C, cc[1]);
bld->mkMov(rr[3], rr[0])
->setPredicate(CC_NC, cc[1]);
bld->mkOp2(OP_UNION, fTy, rr[4], rr[2], rr[3]);
// Merge the results from the negative and non-negative paths
bld->mkMov(rr[5], rr[4])
->setPredicate(CC_S, cc[0]);
bld->mkMov(rr[6], r[4])
->setPredicate(CC_NS, cc[0]);
bld->mkOp2(OP_UNION, mul->sType, mul->getDef(0), rr[5], rr[6]);
} else {
bld->mkMov(mul->getDef(0), r[4]);
}
} else {
bld->mkMov(mul->getDef(0), t[3]);
}
@@ -591,6 +661,10 @@ void NV50LoweringPreSSA::loadTexMsInfo(uint32_t off, Value **ms,
Value *tmp = new_LValue(func, FILE_GPR);
uint8_t b = prog->driver->io.resInfoCBSlot;
off += prog->driver->io.suInfoBase;
if (prog->getType() > Program::TYPE_VERTEX)
off += 16 * 2 * 4;
if (prog->getType() > Program::TYPE_GEOMETRY)
off += 16 * 2 * 4;
*ms_x = bld.mkLoadv(TYPE_U32, bld.mkSymbol(
FILE_MEMORY_CONST, b, TYPE_U32, off + 0), NULL);
*ms_y = bld.mkLoadv(TYPE_U32, bld.mkSymbol(
@@ -723,6 +797,16 @@ NV50LoweringPreSSA::handleTXB(TexInstruction *i)
const CondCode cc[4] = { CC_EQU, CC_S, CC_C, CC_O };
int l, d;
// We can't actually apply bias *and* do a compare for a cube
// texture. Since the compare has to be done before the filtering, just
// drop the bias on the floor.
if (i->tex.target == TEX_TARGET_CUBE_SHADOW) {
i->op = OP_TEX;
i->setSrc(3, i->getSrc(4));
i->setSrc(4, NULL);
return handleTEX(i);
}
handleTEX(i);
Value *bias = i->getSrc(i->tex.target.getArgCount());
if (bias->isUniform())
@@ -1205,8 +1289,11 @@ NV50LoweringPreSSA::checkPredicate(Instruction *insn)
Value *pred = insn->getPredicate();
Value *cdst;
if (!pred || pred->reg.file == FILE_FLAGS)
// FILE_PREDICATE will simply be changed to FLAGS on conversion to SSA
if (!pred ||
pred->reg.file == FILE_FLAGS || pred->reg.file == FILE_PREDICATE)
return;
cdst = bld.getSSA(1, FILE_FLAGS);
bld.mkCmp(OP_SET, CC_NEU, insn->dType, cdst, insn->dType, bld.loadImm(NULL, 0), pred);

View File

@@ -26,6 +26,7 @@
#include "codegen/nv50_ir_target_nvc0.h"
#include <limits>
#include <tr1/unordered_set>
namespace nv50_ir {
@@ -148,7 +149,8 @@ private:
bool insertTextureBarriers(Function *);
inline bool insnDominatedBy(const Instruction *, const Instruction *) const;
void findFirstUses(const Instruction *tex, const Instruction *def,
std::list<TexUse>&);
std::list<TexUse>&,
std::tr1::unordered_set<const Instruction *>&);
void findOverwritingDefs(const Instruction *tex, Instruction *insn,
const BasicBlock *term,
std::list<TexUse>&);
@@ -230,15 +232,31 @@ NVC0LegalizePostRA::findOverwritingDefs(const Instruction *texi,
}
void
NVC0LegalizePostRA::findFirstUses(const Instruction *texi,
const Instruction *insn,
std::list<TexUse> &uses)
NVC0LegalizePostRA::findFirstUses(
const Instruction *texi,
const Instruction *insn,
std::list<TexUse> &uses,
std::tr1::unordered_set<const Instruction *>& visited)
{
for (int d = 0; insn->defExists(d); ++d) {
Value *v = insn->getDef(d);
for (Value::UseIterator u = v->uses.begin(); u != v->uses.end(); ++u) {
Instruction *usei = (*u)->getInsn();
/* XXX HACK ALERT XXX
*
* This shouldn't have to be here, we should always be making forward
* progress by looking at the uses. However this somehow does not
* appear to be the case. Probably because this is being done right
* after RA, when the defs/uses lists have been messed with by node
* merging. This should probably be moved to being done right before
* RA. But this will do for now.
*/
if (visited.find(usei) != visited.end())
continue;
visited.insert(usei);
if (usei->op == OP_PHI || usei->op == OP_UNION) {
// need a barrier before WAW cases
for (int s = 0; usei->srcExists(s); ++s) {
@@ -253,11 +271,11 @@ NVC0LegalizePostRA::findFirstUses(const Instruction *texi,
usei->op == OP_PHI ||
usei->op == OP_UNION) {
// these uses don't manifest in the machine code
findFirstUses(texi, usei, uses);
findFirstUses(texi, usei, uses, visited);
} else
if (usei->op == OP_MOV && usei->getDef(0)->equals(usei->getSrc(0)) &&
usei->subOp != NV50_IR_SUBOP_MOV_FINAL) {
findFirstUses(texi, usei, uses);
findFirstUses(texi, usei, uses, visited);
} else {
addTexUse(uses, usei, insn);
}
@@ -313,8 +331,10 @@ NVC0LegalizePostRA::insertTextureBarriers(Function *fn)
uses = new std::list<TexUse>[texes.size()];
if (!uses)
return false;
for (size_t i = 0; i < texes.size(); ++i)
findFirstUses(texes[i], texes[i], uses[i]);
for (size_t i = 0; i < texes.size(); ++i) {
std::tr1::unordered_set<const Instruction *> visited;
findFirstUses(texes[i], texes[i], uses[i], visited);
}
// determine the barrier level at each use
for (size_t i = 0; i < texes.size(); ++i) {
@@ -814,6 +834,7 @@ NVC0LoweringPass::handleManualTXD(TexInstruction *i)
Value *zero = bld.loadImm(bld.getSSA(), 0);
int l, c;
const int dim = i->tex.target.getDim();
const int array = i->tex.target.isArray();
i->op = OP_TEX; // no need to clone dPdx/dPdy later
@@ -824,7 +845,7 @@ NVC0LoweringPass::handleManualTXD(TexInstruction *i)
for (l = 0; l < 4; ++l) {
// mov coordinates from lane l to all lanes
for (c = 0; c < dim; ++c)
bld.mkQuadop(0x00, crd[c], l, i->getSrc(c), zero);
bld.mkQuadop(0x00, crd[c], l, i->getSrc(c + array), zero);
// add dPdx from lane l to lanes dx
for (c = 0; c < dim; ++c)
bld.mkQuadop(qOps[l][0], crd[c], l, i->dPdx[c].get(), crd[c]);
@@ -834,7 +855,7 @@ NVC0LoweringPass::handleManualTXD(TexInstruction *i)
// texture
bld.insert(tex = cloneForward(func, i));
for (c = 0; c < dim; ++c)
tex->setSrc(c, crd[c]);
tex->setSrc(c + array, crd[c]);
// save results
for (c = 0; i->defExists(c); ++c) {
Instruction *mov;
@@ -870,7 +891,8 @@ NVC0LoweringPass::handleTXD(TexInstruction *txd)
if (dim > 2 ||
txd->tex.target.isCube() ||
arg > 4 ||
txd->tex.target.isShadow())
txd->tex.target.isShadow() ||
txd->tex.useOffsets)
return handleManualTXD(txd);
for (int c = 0; c < dim; ++c) {

View File

@@ -187,7 +187,8 @@ LoadPropagation::checkSwapSrc01(Instruction *insn)
return;
}
if (insn->op == OP_SET)
if (insn->op == OP_SET || insn->op == OP_SET_AND ||
insn->op == OP_SET_OR || insn->op == OP_SET_XOR)
insn->asCmp()->setCond = reverseCondCode(insn->asCmp()->setCond);
else
if (insn->op == OP_SLCT)
@@ -424,7 +425,17 @@ ConstantFolding::expr(Instruction *i,
case TYPE_F32: res.data.f32 = a->data.f32 * b->data.f32; break;
case TYPE_F64: res.data.f64 = a->data.f64 * b->data.f64; break;
case TYPE_S32:
case TYPE_U32: res.data.u32 = a->data.u32 * b->data.u32; break;
if (i->subOp == NV50_IR_SUBOP_MUL_HIGH) {
res.data.s32 = ((int64_t)a->data.s32 * b->data.s32) >> 32;
break;
}
/* fallthrough */
case TYPE_U32:
if (i->subOp == NV50_IR_SUBOP_MUL_HIGH) {
res.data.u32 = ((uint64_t)a->data.u32 * b->data.u32) >> 32;
break;
}
res.data.u32 = a->data.u32 * b->data.u32; break;
default:
return;
}
@@ -549,9 +560,14 @@ ConstantFolding::expr(Instruction *i,
ImmediateValue src0;
if (i->src(0).getImmediate(src0))
expr(i, src0, *i->getSrc(1)->asImm());
if (i->saturate && !prog->getTarget()->isSatSupported(i)) {
bld.setPosition(i, false);
i->setSrc(1, bld.loadImm(NULL, res.data.u32));
}
} else {
i->op = OP_MOV;
i->op = i->saturate ? OP_SAT : OP_MOV; /* SAT handled by unary() */
}
i->subOp = 0;
}
void
@@ -601,6 +617,7 @@ ConstantFolding::unary(Instruction *i, const ImmediateValue &imm)
switch (i->op) {
case OP_NEG: res.data.f32 = -imm.reg.data.f32; break;
case OP_ABS: res.data.f32 = fabsf(imm.reg.data.f32); break;
case OP_SAT: res.data.f32 = CLAMP(imm.reg.data.f32, 0.0f, 1.0f); break;
case OP_RCP: res.data.f32 = 1.0f / imm.reg.data.f32; break;
case OP_RSQ: res.data.f32 = 1.0f / sqrtf(imm.reg.data.f32); break;
case OP_LG2: res.data.f32 = log2f(imm.reg.data.f32); break;
@@ -690,12 +707,41 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
{
const int t = !s;
const operation op = i->op;
Instruction *newi = i;
switch (i->op) {
case OP_MUL:
if (i->dType == TYPE_F32)
tryCollapseChainedMULs(i, s, imm0);
if (i->subOp == NV50_IR_SUBOP_MUL_HIGH) {
assert(!isFloatType(i->sType));
if (imm0.isInteger(1) && i->dType == TYPE_S32) {
bld.setPosition(i, false);
// Need to set to the sign value, which is a compare.
newi = bld.mkCmp(OP_SET, CC_LT, TYPE_S32, i->getDef(0),
TYPE_S32, i->getSrc(t), bld.mkImm(0));
delete_Instruction(prog, i);
} else if (imm0.isInteger(0) || imm0.isInteger(1)) {
// The high bits can't be set in this case (either mul by 0 or
// unsigned by 1)
i->op = OP_MOV;
i->subOp = 0;
i->setSrc(0, new_ImmediateValue(prog, 0u));
i->src(0).mod = Modifier(0);
i->setSrc(1, NULL);
} else if (!imm0.isNegative() && imm0.isPow2()) {
// Translate into a shift
imm0.applyLog2();
i->op = OP_SHR;
i->subOp = 0;
imm0.reg.data.u32 = 32 - imm0.reg.data.u32;
i->setSrc(0, i->getSrc(t));
i->src(0).mod = i->src(t).mod;
i->setSrc(1, new_ImmediateValue(prog, imm0.reg.data.u32));
i->src(1).mod = 0;
}
} else
if (imm0.isInteger(0)) {
i->op = OP_MOV;
i->setSrc(0, new_ImmediateValue(prog, 0u));
@@ -786,7 +832,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
else
tA = tB;
tB = s ? bld.getSSA() : i->getDef(0);
bld.mkOp2(OP_ADD, TYPE_U32, tB, mul->getDef(0), tA);
newi = bld.mkOp2(OP_ADD, TYPE_U32, tB, mul->getDef(0), tA);
if (s)
bld.mkOp2(OP_SHR, TYPE_U32, i->getDef(0), tB, bld.mkImm(s));
@@ -818,7 +864,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
tA = bld.getSSA();
bld.mkCmp(OP_SET, CC_LT, TYPE_S32, tA, TYPE_S32, i->getSrc(0), bld.mkImm(0));
tD = (d < 0) ? bld.getSSA() : i->getDef(0)->asLValue();
bld.mkOp2(OP_SUB, TYPE_U32, tD, tB, tA);
newi = bld.mkOp2(OP_SUB, TYPE_U32, tD, tB, tA);
if (d < 0)
bld.mkOp1(OP_NEG, TYPE_S32, i->getDef(0), tB);
@@ -882,6 +928,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
case OP_ABS:
case OP_NEG:
case OP_SAT:
case OP_LG2:
case OP_RCP:
case OP_SQRT:
@@ -896,7 +943,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
default:
return;
}
if (i->op != op)
if (newi->op != op)
foldCount++;
}

View File

@@ -998,7 +998,9 @@ GCRA::doCoalesce(ArrayList& insns, unsigned int mask)
case OP_TXQ:
case OP_TXD:
case OP_TXG:
case OP_TXLQ:
case OP_TEXCSAA:
case OP_TEXPREP:
if (!(mask & JOIN_MASK_TEX))
break;
for (c = 0; insn->srcExists(c) && c != insn->predSrc; ++c)

View File

@@ -331,6 +331,8 @@ TargetNV50::insnCanLoad(const Instruction *i, int s,
return false;
if (sf == FILE_IMMEDIATE)
return false;
if (i->subOp == NV50_IR_SUBOP_MUL_HIGH && sf == FILE_MEMORY_CONST)
return false;
ldSize = 2;
} else {
ldSize = typeSizeof(ld->dType);

View File

@@ -165,6 +165,9 @@ nv30_context_destroy(struct pipe_context *pipe)
if (nv30->draw)
draw_destroy(nv30->draw);
if (nv30->screen->base.pushbuf->user_priv == &nv30->bufctx)
nv30->screen->base.pushbuf->user_priv = NULL;
nouveau_bufctx_del(&nv30->bufctx);
if (nv30->screen->cur_ctx == nv30)

View File

@@ -325,6 +325,12 @@ nv30_screen_destroy(struct pipe_screen *pscreen)
nouveau_fence_ref(NULL, &screen->base.fence.current);
}
nouveau_bo_ref(NULL, &screen->notify);
nouveau_heap_destroy(&screen->query_heap);
nouveau_heap_destroy(&screen->vp_exec_heap);
nouveau_heap_destroy(&screen->vp_data_heap);
nouveau_object_del(&screen->query);
nouveau_object_del(&screen->fence);
nouveau_object_del(&screen->ntfy);

View File

@@ -23,6 +23,7 @@
*
*/
#include "util/u_format.h"
#include "util/u_helpers.h"
#include "util/u_inlines.h"
@@ -360,6 +361,22 @@ nv30_set_framebuffer_state(struct pipe_context *pipe,
nv30->framebuffer = *fb;
nv30->dirty |= NV30_NEW_FRAMEBUFFER;
/* Hardware can't handle different swizzled-ness or different blocksizes
* for zs and cbufs. If both are supplied and something doesn't match,
* blank out the zs for now so that at least *some* rendering can occur.
*/
if (fb->nr_cbufs > 0 && fb->zsbuf) {
struct nv30_miptree *color_mt = nv30_miptree(fb->cbufs[0]->texture);
struct nv30_miptree *zeta_mt = nv30_miptree(fb->zsbuf->texture);
if (color_mt->swizzled != zeta_mt->swizzled ||
(util_format_get_blocksize(fb->zsbuf->format) > 2) !=
(util_format_get_blocksize(fb->cbufs[0]->format) > 2)) {
nv30->framebuffer.zsbuf = NULL;
debug_printf("Mismatched color and zeta formats, ignoring zeta.\n");
}
}
}
static void

View File

@@ -1225,6 +1225,7 @@ out:
if(fpc)
{
FREE(fpc->r_temp);
FREE(fpc->r_imm);
util_dynarray_fini(&fpc->if_stack);
util_dynarray_fini(&fpc->label_relocs);
util_dynarray_fini(&fpc->imm_data);

View File

@@ -61,7 +61,7 @@ static void
nv50_memory_barrier(struct pipe_context *pipe, unsigned flags)
{
struct nv50_context *nv50 = nv50_context(pipe);
int i;
int i, s;
if (flags & PIPE_BARRIER_MAPPED_BUFFER) {
for (i = 0; i < nv50->num_vtxbufs; ++i) {
@@ -74,6 +74,26 @@ nv50_memory_barrier(struct pipe_context *pipe, unsigned flags)
if (nv50->idxbuf.buffer &&
nv50->idxbuf.buffer->flags & PIPE_RESOURCE_FLAG_MAP_PERSISTENT)
nv50->base.vbo_dirty = TRUE;
for (s = 0; s < 3 && !nv50->cb_dirty; ++s) {
uint32_t valid = nv50->constbuf_valid[s];
while (valid && !nv50->cb_dirty) {
const unsigned i = ffs(valid) - 1;
struct pipe_resource *res;
valid &= ~(1 << i);
if (nv50->constbuf[s][i].user)
continue;
res = nv50->constbuf[s][i].u.buf;
if (!res)
continue;
if (res->flags & PIPE_RESOURCE_FLAG_MAP_PERSISTENT)
nv50->cb_dirty = TRUE;
}
}
}
}
@@ -122,12 +142,9 @@ nv50_destroy(struct pipe_context *pipe)
{
struct nv50_context *nv50 = nv50_context(pipe);
if (nv50_context_screen(nv50)->cur_ctx == nv50) {
nv50->base.pushbuf->kick_notify = NULL;
if (nv50_context_screen(nv50)->cur_ctx == nv50)
nv50_context_screen(nv50)->cur_ctx = NULL;
nouveau_pushbuf_bufctx(nv50->base.pushbuf, NULL);
}
/* need to flush before destroying the bufctx */
nouveau_pushbuf_bufctx(nv50->base.pushbuf, NULL);
nouveau_pushbuf_kick(nv50->base.pushbuf, nv50->base.pushbuf->channel);
nv50_context_unreference_resources(nv50);
@@ -256,7 +273,14 @@ nv50_create(struct pipe_screen *pscreen, void *priv)
nv50->base.screen = &screen->base;
nv50->base.copy_data = nv50_m2mf_copy_linear;
nv50->base.push_data = nv50_sifc_linear_u8;
/* FIXME: Make it possible to use this again. The problem is that there is
* some clever logic in the card that allows for multiple renders to happen
* when there are only constbuf changes. However that relies on the
* constbuf updates happening to the right constbuf slots. Currently
* implementation just makes it go through a separate slot which doesn't
* properly update the right constbuf data.
nv50->base.push_cb = nv50_cb_push;
*/
nv50->screen = screen;
pipe->screen = pscreen;

View File

@@ -78,16 +78,16 @@
/* 8 user clip planes, at 4 32-bit floats each */
#define NV50_CB_AUX_UCP_OFFSET 0x0000
#define NV50_CB_AUX_UCP_SIZE (8 * 4 * 4)
/* 256 textures, each with ms_x, ms_y u32 pairs */
/* 16 textures * 3 shaders, each with ms_x, ms_y u32 pairs */
#define NV50_CB_AUX_TEX_MS_OFFSET 0x0080
#define NV50_CB_AUX_TEX_MS_SIZE (256 * 2 * 4)
#define NV50_CB_AUX_TEX_MS_SIZE (16 * 3 * 2 * 4)
/* For each MS level (4), 8 sets of 32-bit integer pairs sample offsets */
#define NV50_CB_AUX_MS_OFFSET 0x880
#define NV50_CB_AUX_MS_OFFSET 0x200
#define NV50_CB_AUX_MS_SIZE (4 * 8 * 4 * 2)
/* Sample position pairs for the current output MS level */
#define NV50_CB_AUX_SAMPLE_OFFSET 0x980
#define NV50_CB_AUX_SAMPLE_OFFSET 0x300
#define NV50_CB_AUX_SAMPLE_OFFSET_SIZE (4 * 8 * 2)
/* next spot: 0x9c0 */
/* next spot: 0x340 */
/* 4 32-bit floats for the vertex runout, put at the end */
#define NV50_CB_AUX_RUNOUT_OFFSET (NV50_CB_AUX_SIZE - 0x10)
@@ -106,6 +106,7 @@ struct nv50_context {
struct nouveau_bufctx *bufctx;
uint32_t dirty;
boolean cb_dirty;
struct {
uint32_t instance_elts; /* bitmask of per-instance elements */

View File

@@ -332,7 +332,7 @@ nv50_render_condition(struct pipe_context *pipe,
nv50->cond_cond = condition;
nv50->cond_mode = mode;
PUSH_SPACE(push, 6);
PUSH_SPACE(push, 9);
if (!pq) {
BEGIN_NV04(push, NV50_3D(COND_MODE), 1);
@@ -351,6 +351,10 @@ nv50_render_condition(struct pipe_context *pipe,
PUSH_DATAh(push, q->bo->offset + q->offset);
PUSH_DATA (push, q->bo->offset + q->offset);
PUSH_DATA (push, NV50_3D_COND_MODE_RES_NON_ZERO);
BEGIN_NV04(push, NV50_2D(COND_ADDRESS_HIGH), 2);
PUSH_DATAh(push, q->bo->offset + q->offset);
PUSH_DATA (push, q->bo->offset + q->offset);
}
void

View File

@@ -397,6 +397,8 @@ nv50_screen_init_hwctx(struct nv50_screen *screen)
PUSH_DATA (push, 0);
BEGIN_NV04(push, SUBC_2D(0x0888), 1);
PUSH_DATA (push, 1);
BEGIN_NV04(push, NV50_2D(COND_MODE), 1);
PUSH_DATA (push, NV50_2D_COND_MODE_ALWAYS);
BEGIN_NV04(push, SUBC_3D(NV01_SUBCHAN_OBJECT), 1);
PUSH_DATA (push, screen->tesla->handle);

View File

@@ -585,9 +585,12 @@ nv50_stage_sampler_states_bind(struct nv50_context *nv50, int s,
nv50_screen_tsc_unlock(nv50->screen, old);
}
assert(nv50->num_samplers[s] <= PIPE_MAX_SAMPLERS);
for (; i < nv50->num_samplers[s]; ++i)
if (nv50->samplers[s][i])
for (; i < nv50->num_samplers[s]; ++i) {
if (nv50->samplers[s][i]) {
nv50_screen_tsc_unlock(nv50->screen, nv50->samplers[s][i]);
nv50->samplers[s][i] = NULL;
}
}
nv50->num_samplers[s] = nr;

Some files were not shown because too many files have changed in this diff Show More