Compare commits

..

599 Commits

Author SHA1 Message Date
Jonathan Gray
20370d4f1b mesa: automake: include mesa_glinterop.h in distfile
Add mesa_glinterop.h to the list of headers that will get included
in the distfile as it is required to build Mesa itself.

Corrects a regression introduced in a89faa2022.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 23392abf50)
2016-11-10 21:57:37 +00:00
Emil Velikov
72539c5e38 Update version to 12.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-10 21:03:41 +00:00
Dave Airlie
e7c1408870 Revert "st/vdpau: use linear layout for output surfaces"
This reverts commit d180de3532.

This is a radeon specific hack that causes problems on nouveau
when combined with the SHARED flag later. If radeonsi needs a fix
for this, please fix it in the driver.

[chk]
Using linear surfaces for this makes sense because tilling isn't
beneficial and the surfaces can potentially be shared with other GPUs
using the VDPAU OpenGL interop.

[airlied]
I think we need a flag that isn't SHARED/LINEAR that is more
SHARED_OTHER_GPU.

[mareko]
Does radeonsi need PIPE_BIND_VIDEO_DECODE_OUTPUT that it would translate
into linear ?

[mareko]
My only concern is decoding performance. If the decoder works in 64x1
blocks, tiling will hurt. That's the theory. I don't know how the
decoder works.

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> (I+A)
(cherry picked from commit d0d5f7600c)
2016-11-08 20:45:03 +00:00
Marek Olšák
422e4da25c glx: make interop ABI visible again
This was broken when the GLAPI use was removed from mesa_glinterop.h.

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 64c2593a5c)
2016-11-08 20:45:03 +00:00
Marek Olšák
1040360e9f egl: make interop ABI visible again
This was broken when the GLAPI use was removed from mesa_glinterop.h.

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ee39d4456e)
2016-11-08 20:45:03 +00:00
Marek Olšák
ad7d21bc3a egl: use util/macros.h
I need the definition of PUBLIC.

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit bf51b45313)
[Emil Velikov: Keep the MIN2 macro]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 20:44:27 +00:00
Jason Ekstrand
3d4a219dd8 intel/blorp: Rework our usage of ralloc when compiling shaders
Previously, we were creating the shader with a NULL ralloc context and then
trusting in blorp_compile_fs to clean it up.  The only problem was that
blorp_compile_fs didn't clean up its context properly so we were leaking.
When I went to fix that, I realized that it couldn't because it has to
return the shader binary which is allocated off of that context and used by
the caller.  The solution is to make blorp_compile_fs take a ralloc
context, allocate the nir_shaders directly off that context, and clean it
all up in whatever function creates the shader and calls blorp_compile_fs.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0, 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 43dadb6edd)
[Emil Velikov: attribute src/intel/blorp file movement]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/intel/blorp/blorp.c
	src/intel/blorp/blorp_clear.c
	src/intel/blorp/blorp_priv.h
	src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
2016-11-08 16:23:23 +00:00
Samuel Pitoiset
76a77249ed nvc0/ir: fix emission of IMAD with NEG modifiers
The emitter tried to emit sub instead of subr when src0 has
actually a NEG modifier.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 84e946380b)
2016-11-08 16:23:23 +00:00
Tapani Pälli
c1f138149e egl: set preserved behavior for surface only if config supports it
Otherwise we can end up with mismatching behavior between config and
surface when client queries surface attributes. As example, configs
for DRI3 do not support preserved behavior but here we were setting
preserved behavior for pixmap and pbuffer.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98326
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
(cherry picked from commit 2035930966)
2016-11-08 16:23:22 +00:00
Emil Velikov
3a27b813b4 cherry-ignore: add ClientWaitSync fixes
Patches (and extension overall) depends on gallium API and driver work
which hasn't landed in branch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 16:23:22 +00:00
Marek Olšák
ac3abe534b winsys/amdgpu: fix radeon_surf::macro_tile_index for imported textures
Maybe this is why SDMA has been broken for many amdgpu users?

SDMA is the only block which is used with imported textures and relies
on this variable. DB also uses it, but it doesn't get imported textures,
so it's unaffected.

I do get SDMA failures on Tonga before this patch if R600_DEBUG=testdma
is changed to use imported textures.

Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 6ec3b2a4b1)
[Emil Velikov: resolve trivial conflicts - SI support does not exist in
branch]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
2016-11-08 16:23:22 +00:00
Marek Olšák
20008a9fb8 gallium/radeon: make sure the address of separate CMASK is aligned properly
This should fix random GPU hangs on Hawaii and Fiji.

Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit dce05b3423)
2016-11-08 16:23:22 +00:00
Samuel Pitoiset
79a1cc2364 nvc0: use correct bufctx when invalidating CP textures
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7b2712c367)
2016-11-08 16:23:22 +00:00
Tapani Pälli
cac5e31b0f mesa: fix error handling in DrawBuffers
Patch rearranges error checking so that enum checking provided via
destmask happens before other checks. It needs to be done in this
order because other error checks do not work properly if there were
invalid enums passed.

Patch also refines one existing check and it's documentation to match
GLES 3.0 spec (also in later specs). This was somewhat mysteriously
referring to desktop GL but had a check for gles3.

Fixes following dEQP tests:

   dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers

no CI regressions observed.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98134
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit a1652a059e)
2016-11-08 16:23:22 +00:00
Tapani Pälli
979e4b9c3f egl: add check that eglCreateContext gets a valid config
Fixes following dEQP test:

   dEQP-EGL.functional.negative_api.create_context

v2: don't break EGL_KHR_no_config_context (Eric Engestrom)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5876f3c85a)
[Emil Velikov: drop EGL_NO_CONFIG_KHR, use MESA_configless_context]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/egl/main/eglapi.c
2016-11-08 16:23:22 +00:00
Emil Velikov
71d0b5f7c7 cherry-ignore: add N/A EGL revert
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 16:23:22 +00:00
Tapani Pälli
8fbad64732 egl/dri2: set max values for pbuffer width and height
While these max values were previously fixed for pbuffer creation, this
change makes also eglGetConfigAttrib() return correct values.

Fixes following dEQP tests:

   dEQP-EGL.functional.create_surface.pbuffer.rgb888_no_depth_no_stencil
   dEQP-EGL.functional.create_surface.pbuffer.rgb888_depth_stencil
   dEQP-EGL.functional.create_surface.pbuffer.rgba8888_no_depth_no_stencil
   dEQP-EGL.functional.create_surface.pbuffer.rgba8888_depth_stencil

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98326
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b91e1e38e8)
2016-11-08 16:23:21 +00:00
Axel Davy
a106d8c872 st/nine: Fix locking CubeTexture surfaces.
Only one face of Cubetextures was locked when in DEFAULT Pool.
Fixes:
https://github.com/iXit/Mesa-3D/issues/129

CC: "12.0 13.0" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit eed605a473)
2016-11-08 16:23:21 +00:00
Axel Davy
5abbe84671 st/nine: Fix mistake in Volume9 UnlockBox
In the format fallback path,
the height was used instead of the depth.

CC: "12.0 13.0" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit fe7bb46134)
2016-11-08 16:23:21 +00:00
Jonathan Gray
7133f0054d mapi: automake: set VISIBILITY_CFLAGS for shared glapi
shared glapi was previously built without setting CFLAGS for
AM_CFLAGS and VISIBILITY_CFLAGS.

This resulted in symbols being exported that shouldn't be.

The x86 and sparc assembly versions of the dispatch table partially
mitigated this by using .hidden.  Otherwise shared_dispatch_stub_*
were being exported.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: "11.2 12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 907ace5798)
2016-11-08 16:23:21 +00:00
Stencel, Joanna
b5cb4f5980 egl/wayland: add missing destroy_window callback
The original patch by Joanna added the function pointer and callback yet
things got only partially applied - the infra was added, but the
implementation was missing.

Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Fixes: 690ead4a13 ("egl/wayland-egl: Fix for segfault in
dri2_wl_destroy_surface.")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

(cherry picked from commit 2e0ab61e29)
2016-11-08 16:23:21 +00:00
Emil Velikov
d2be28c2bd automake: don't forget to pick wglext.h in the tarball
Earlier commit reworked the header install rules, to ensure that the
correct ones are installed only as needed.

By doing so it dropped a wildcard which was effectively including the
wglext.h header in the tarball.

Add the header to the top-level noinst_HEADERS, since the it is not
meant to be installed (autoconf is not used on Windows plaforms).

Fixes: a89faa2022 ("autoconf: Make header install distinct for various
APIs (v2)")
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Cc: Chuck Atkins <chuck.atkins@kitware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

(cherry picked from commit 3511a86111)
2016-11-08 16:23:21 +00:00
Emil Velikov
cd2db885bf get-pick-list.sh: Require explicit "12.0" for nominating stable patches
A nomination unadorned with a specific version is now interpreted as
being aimed at the 12.0 branch, which was recently opened.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 16:23:21 +00:00
Nicolai Hähnle
648f012459 radeonsi: fix 64-bit loads from LDS
Fixes spec/arb_tessellation_shader/execution/dvec[23]-vs-tcs-tes, among
others.

Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 4a2dbfff05)
2016-11-08 16:23:21 +00:00
Nicolai Hähnle
f8f9d7528a st/mesa: only set primitive_restart when the restart index is in range
Even when enabled, primitive restart has no effect when the restart index
is larger than the representable values in the index buffer.

Fixes GL45-CTS.gtf31.GL3Tests.primitive_restart.primitive_restart_upconvert
for radeonsi VI.

v2: add an explanatory comment

Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
(cherry picked from commit bfa50f88ce)
2016-11-08 16:23:20 +00:00
Nicolai Hähnle
3a030e886d st/glsl_to_tgsi: fix block copies of arrays of doubles
Set the type of the left-hand side to the same as the right-hand side,
so that when the base type is double, the writemask of the MOV instruction
is properly fixed up.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ca592af880)
2016-11-08 16:23:20 +00:00
Ilia Mirkin
4d478aad50 nv50/ir: process texture offset sources as regular sources
With ARB_gpu_shader5, texture offsets can be any source, including TEMPs
and IN's. Make sure to process them as regular sources so that we pick
up masks, etc.

This should fix some CTS tests that feed offsets directly to
textureGatherOffset, and we were not picking up the input use, thus not
advertising it in the shader header.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dave Airlie <airlied@redhat.com>
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cd45d758ff)
2016-11-08 16:23:20 +00:00
Ilia Mirkin
e8ae2da8a0 nv50,nvc0: avoid reading out of bounds when getting bogus so info
The state tracker tries to attach the info to the wrong shader. This is
easy enough to protect against.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 313fba5ee1)
2016-11-08 16:23:20 +00:00
Jonathan Gray
34cb65716e genxml: add generated headers to EXTRA_DIST
Building the Mesa 12.0.3 distfile failed on a system without python
as generated files were not included in the distfile.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 41754f743f)
2016-11-08 16:23:20 +00:00
Ilia Mirkin
4ddcb9cb22 gm107/ir: fix bit offset of tex lod setting for indirect texturing
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8c78fdb328)
2016-11-08 16:23:20 +00:00
Ilia Mirkin
aeabbc1e1d gm107/ir: fix texturing with indirect samplers
The indirect handle has to come right after the coordinates, so if there
was a sample/bias/depth compare/offset, everything would end up being
shifted by one argument position.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ecea2f69ef)
2016-11-08 16:23:20 +00:00
Kenneth Graunke
9cecb50bf2 i965: Fix gl_InvocationID in dual object GS where invocations == 1.
dEQP-GLES31.functional.geometry_shading.instanced.geometry_1_invocations
draws using a geometry shader that specifies

   layout(points, invocations = 1) in;

and then uses gl_InvocationID.  According to the Haswell PRM, the
"GS Instance ID 0" (and 1) thread payload fields are undefined in
dual object mode:

   "If 'dispatch mode' is DUAL_OBJECT this field is not valid."

But there's no point in using them - if there's only one invocation,
the ID will be 0.  So just load a constant.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 9f677d6541)
2016-11-08 16:23:20 +00:00
Nicolai Hähnle
c403863348 st/glsl_to_tgsi: fix atomic counter addressing
When more than one atomic counter buffer is in use, UniformStorage[n].opaque
is set up to contain indices that are contiguous across all used buffers.

This appears to be used by i965 via NIR, but for TGSI we do not treat atomic
counter buffers as opaque, so using the data in the opaque array is incorrect.

Fixes GL45-CTS.compute_shader.resource-atomic-counter.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 1dd99a15a4)
2016-11-08 16:23:19 +00:00
Nicolai Hähnle
7c7973606f radeonsi: fix indirect loads of 64 bit constants
This fixes GL45-CTS.compute_shader.fp64-case3.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 51f9b38ce8)
2016-11-08 16:23:19 +00:00
Chad Versace
341889d6ca egl: Don't advertise unsupported platform extensions
Mesa's set of supported platform extensions depends on the autoconf
option --with-egl-platforms=foo,bar,baz. If --with-egl-platforms lacks
foo, then eglGetPlatformDisplay(EGL_PLATFORM_FOO, ...) unconditonally
fails.

So, if --with-egl-platforms lacks foo, then remove
EGL_VENDOR_platform_foo from the EGL client extension string.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit c177ef9d47)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/egl/main/eglglobals.c
2016-11-08 16:23:19 +00:00
Emil Velikov
866aee0264 egl/x11: don't crash if dri2_dpy->conn is NULL
The dri3 version of commits 60e9c35b3a and 6de9a03bed.

While using xcb_connect() guarantees that we always get a non NULL
return value, XGetXCBConnection() does/can not.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit b10c05d4ff)
2016-11-08 16:23:19 +00:00
Emil Velikov
32a469b8ed isl/gen6: correctly check msaa layout samples count
Samples == 1 is a valid value, so returning false is plain wrong.
Seeming copy/paste typo introduced since day 1.

Fixes: afdadec77f ("isl: Implement isl_surf_init() for gen4-gen9")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit 84f9ef1de4)
2016-11-08 16:23:19 +00:00
Vinson Lee
eb9236e275 Revert "mesa_glinterop: remove inclusion of GLX header"
This reverts commit 8472045b16.

Conflicts:

	include/GL/mesa_glinterop.h

This patch fixes this build error with GCC 4.4.

  Compiling src/glx/dri_common_interop.c ...
In file included from src/glx/dri_common_interop.c:33:
include/GL/mesa_glinterop.h:62: error: redefinition of typedef ‘GLXContext’
include/GL/glx.h:165: note: previous declaration of ‘GLXContext’ was here

Fixes: 8472045b16 ("mesa_glinterop: remove inclusion of GLX header")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96770
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
(cherry picked from commit c10dcb2ce8)

Squashed with commit:

mesa_glinterop: allow building without X and related headers

This commit effectively reverts c10dcb2ce8
and fixes the typedef redefinition which inspired it.

In order to prevent requiring X packages at build time earlier commit
forward declared the required X/GLX typedefs. Since that approach
introduced typedef redefinition (a C11 feature) it was reverted.

To avoid the redefinition while _not_ mandating X and related headers
forward declare the structs and use those through the header.

As anyone uses the mesa interop header they ensure that the X (or others
in terms of EGL) headers are included, which ensures that everything is
resolved within the compilation unit.

Cc: Vinson Lee <vlee@freedesktop.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Chih-Wei Huang <cwhuang@android-x86.org>
Fixes: c10dcb2ce8 ("Revert "mesa_glinterop: remove inclusion of GLX
header"")
Fixes: 8472045b16 ("mesa_glinterop: remove inclusion of GLX header")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96770
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

(cherry picked from commit c85b34ffd0)
2016-11-08 16:23:19 +00:00
Mario Kleiner
50afa72f3c glx: Perform check for valid fbconfig against proper X-Screen.
Commit cf804b4455
('glx: fix crash with bad fbconfig') introduced a check
in glXCreateNewContext() if the given config is a valid
fbconfig.

Unfortunately the check always checks the given config against
the fbconfigs of the DefaultScreen(dpy), instead of the
actual X-Screen specified in the config config->screen.

This leads to failure whenever a GL context is created
on a non-DefaultScreen(dpy), e.g., on X-Screen 1 of
a multi-x-screen setup, where the default screen is
typically 0.

Fix this by using config->screen instead of DefaultScreen(dpy).

Tested to fix context creation failure on a dual-x-screen setup.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 0c94ed0987)
2016-11-08 16:23:19 +00:00
Dave Airlie
80409971c0 anv/wsi: fix apps that acquire multiple images up front
This fix was found in the radv codebase when running dota2,
no idea if anyone has reported it on anv, but the same problem
occurs.

Once an image is acquired we need to mark it busy.

Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8980ac0411)

Squashed with commit

anv: fix the wayland wsi busy flag setting

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a3834ebaf9)
2016-11-08 16:23:19 +00:00
Dave Airlie
920150f28a anv: initialise and increment send_sbc
At least set this to not be uninitialised memory.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit dfe74fd1a9)
2016-11-08 16:23:18 +00:00
Marek Olšák
12bdcc105c radeonsi: disable ReZ
This is a serious performance fix. Discovered by luck.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94354

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit e12c1cab5d)
2016-11-08 16:23:18 +00:00
Nicolai Hähnle
1cb2a483ba st/mesa: fix vertex elements setup for doubles
Whether one or two slots are taken up by one API array depends on the
vertex shader, not on how the array is configured. When an array is
set up with fewer components than the shader expects, the high components
are undefined.

Fixes GL45-CTS.vertex_attrib_binding.basic-inputL-case1.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d413fbb159)
2016-11-08 16:23:18 +00:00
Nicolai Hähnle
79e84bafc1 st/glsl_to_tgsi: fix textureGatherOffset with indirectly loaded offsets
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1d7685e52c)
2016-11-08 16:23:18 +00:00
Nicolai Hähnle
5181624675 st/glsl_to_tgsi: simplify translate_tex_offset
This fixes a bug with offsets from uniforms which seems to have only been
noticed as a crash in piglit's
arb_gpu_shader5/compiler/builtin-functions/fs-gatherOffset-uniform-offset.frag
on radeonsi.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b234e37765)
2016-11-08 16:23:18 +00:00
Ilia Mirkin
d18d830744 nvc0/ir: fix textureGather with a single offset
Recent fix for non-const offsets broke the case of a single offset (vs 4
offsets). The later code relies on the offs array to contain null values
to tell whether they should be added onto the srcs list.

Fixes: 5239bd592 ("nvc0/ir: fix overwriting of value backing non-constant gather offset")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit a48a343c29)
2016-11-08 16:23:18 +00:00
Ilia Mirkin
bef0cc6287 nv50/ir: copy over value's register id when resolving merge of a phi
The offset needs to be properly copied over to the phi value, otherwise
it will get assigned to the base of the merge instead of the proper
location.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 300b5ad023)
2016-11-08 16:23:18 +00:00
Nicolai Hähnle
f0f5bd9607 st/glsl_to_tgsi: disable on-the-fly peephole for 64-bit operations
This optimization is incorrect with 64-bit operations, because the
channel-splitting logic in emit_asm ends up being applied twice to
the source operands.

A lucky coincidence of how the writemask test works resulted in this
optimization basically never being applied anyway. As far as I can tell,
the only case where it would (incorrectly) have been applied is something
like

    dvec2 d;
    float x = (float)d.y;

which nobody seems to have ever done. But the moral equivalent does occur
in one of the component layout piglit test.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 63193b9cde)
2016-11-08 16:23:18 +00:00
Axel Davy
ca135ebd76 st/nine: Fix the calculation of the number of vs inputs
Fixes hangs on radeonsi, and assert on llvmpipe.

Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d2fd296648)
2016-11-08 16:23:17 +00:00
Axel Davy
ec2751f967 gallium/util: Really allow aliasing of dst for u_box_union_*
Gallium nine relies on aliasing to work with this function.
Without this patch, dirty region tracking was incorrect, which
could lead to incorrect textures or vertex buffers.
Fixes several game bugs with nine.
Fixes https://github.com/iXit/Mesa-3D/issues/234

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2290eac84e)
2016-11-08 16:23:17 +00:00
Ilia Mirkin
24a16d0a9d nvc0/ir: fix overwriting of value backing non-constant gather offset
Normally the value is an immediate, which is moved to some temporary, so
there's no problem. In the case of a non-constant offset (as allowed by
ARB_gpu_shader5), we have to take care to copy it first before using it
to build up the bits.

This fixes a compilation error observed in F1 2015.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 5239bd5920)
2016-11-08 16:23:17 +00:00
Eric Anholt
9c5de2546c gallium: Fix install-gallium-links.mk on non-bash /bin/sh
Debian uses dash by default, which doesn't do '+='.  Fixes servo's
osmesa-based headless testing system, which was looking for libOSMesa in
the lib/ directory.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ec9ed1c4d8)
2016-11-08 16:23:17 +00:00
Martin Peres
17f59da9db loader/dri3: import prime buffers in the currently-bound screen
This tries to mirrors the codepath taken by DRI2 in IntelSetTexBuffer2()
and fixes many applications when using DRI3:
 - Totem with libva on hw-accelerated decoding
 - obs-studio, using Window Capture (Xcomposite) as a Source
 - gstreamer with VAAPI

v2:
 - introduce get_dri_screen() in the dri3 loader's vtable (krh)

Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Tested-by: Ionut Biru <biru.ionut@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71759
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
(cherry picked from commit a599b1c203)
2016-11-08 16:23:17 +00:00
Martin Peres
0ae4d909ea loader/dri3: add get_dri_screen() to the vtable
This allows querying the current active screen from the
loader's common code.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
(cherry picked from commit 0247e5ee3e)
2016-11-08 16:23:17 +00:00
Chuck Atkins
2fedb106bb autoconf: Make header install distinct for various APIs (v2)
This fixes a problem where GL headers would only get installed if
glx was enabled.  So if osmesa was enabled but not glx, then the
GL headers required by osmesa would be missing from the install.

v2: Dropped unneeded mesa_glinterop.h redundant osmesa.h install

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit a89faa2022)
2016-11-08 16:23:17 +00:00
Chad Versace
c5de4cbf32 i965/sync: Fix uninitalized usage and leak of mutex
We locked an unitialized mutex in the callstack
    glClientWaitSync
    intel_gl_client_wait_sync
    brw_fence_client_wait_sync
because we forgot to initialize it in intel_gl_fence_sync.
(The EGLSync codepath didn't have this bug. It initialized the mutex in
intel_dri_create_sync).

We also forgot to tear down (mtx_destroy) the mutex when destroying
the sync object.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ce1d67c2e5)
2016-11-08 16:23:17 +00:00
Marek Olšák
e7b274c552 radeonsi: fix texture border colors for compute shaders
There are VM faults without this.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit cc4a19c4ad)
2016-11-08 16:23:16 +00:00
Marek Olšák
63c1a5d391 radeonsi: fix interpolateAt opcodes for .zw components
Not returning garbage in .zw seems pretty important.

This fixes:
GL45-CTS.shader_multisample_interpolation.render.interpolate_at_*_check.*

Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 1b37e5541c)
2016-11-08 16:23:16 +00:00
Kenneth Graunke
d16be6898b i965: Add missing BRW_CS_PROG_DATA to CS work group surface atom.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit bab1c05634)
2016-11-08 16:23:16 +00:00
Kenneth Graunke
4b0512ade4 i965: Add missing BRW_NEW_CS_PROG_DATA to compute constant atom.
CACHE_NEW_CS_PROG hasn't existed in quite a long time...the old
comment was there, but not the actual bit.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit ce6c80ebbb)
2016-11-08 16:23:16 +00:00
Emil Velikov
39344de587 cherry-ignore: add update_renderbuffer_read_surfaces()
The function (and underlying work) is not in branch. Former introduced
with 786108e7b2.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 16:23:16 +00:00
Kenneth Graunke
915683485f i965: Move BRW_NEW_FRAGMENT_PROGRAM from 3DSTATE_PS to PS_EXTRA.
3DSTATE_PS doesn't need this.  3DSTATE_PS_EXTRA however does,
for brw_color_buffer_write_enabled().

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 0047d600af)
2016-11-08 16:23:16 +00:00
Kenneth Graunke
bc04c92aef i965: Add missing BRW_NEW_VS_PROG_DATA to 3DSTATE_CLIP.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 28e1538be7)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/drivers/dri/i965/gen6_clip_state.c
2016-11-08 16:23:16 +00:00
Kenneth Graunke
5fb22e1258 i965: Fix missing _NEW_TRANSFORM in Gen8+ 3DSTATE_DS atom.
Needed for user clip plane enables.  Broken since this code was
introduced.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 78df96256b)
2016-11-08 16:23:16 +00:00
Chad Versace
b5de58d7e1 egl: Fix truncation error in _eglParseSyncAttribList64
The function stores EGLAttrib values in EGLint variables. On 64-bit
systems, this truncated the values.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 69adb9a778)
2016-11-08 16:23:16 +00:00
Emil Velikov
082ea77cdf cherry-ignore: add EGL_KHR_debug fix
The extension landed after the branchpoint.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 16:23:15 +00:00
Nicolai Hähnle
a5c0b8784a gallium/radeon: cleanup and fix branch emits
Some of the existing code is needlessly complicated. The basic principle
should be: control-flow opcodes emit branches to properly terminate the
current block, _unless_ the current block already has a terminator (which
happens if and only if there was a BRK or CONT).

This also fixes a bug where multiple terminators were created in a block.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97887
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 6f87d7a146)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
2016-11-08 16:23:15 +00:00
James Legg
020550e099 radeonsi: Fix primitive restart when index changes
If primitive restart is enabled for two consecutive draws which use
different primitive restart indices, then the first draw's primitive
restart index was incorrectly used for the second draw.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98025

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit e33f31d61f)
2016-11-08 16:23:15 +00:00
Matt Whitlock
e7491c3bbd gallium/winsys: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 42ed8a6c9c)
2016-11-08 16:23:15 +00:00
Matt Whitlock
39c0535646 st/xa: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit ac6064f918)
2016-11-08 16:23:15 +00:00
Matt Whitlock
7c27d56535 st/dri: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 0c060f691c)
2016-11-08 16:23:15 +00:00
Matt Whitlock
ea3e778bff gallium/auxiliary: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 5d0069eca2)
2016-11-08 16:23:15 +00:00
Matt Whitlock
d82738fbd9 egl/android: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit c8fd7d060d)
2016-11-08 16:23:15 +00:00
Jason Ekstrand
4e1298be04 nir/spirv/cfg: Use a nop intrinsic for tagging the ends of blocks
Previously, we were saving off the last nir_block in a vtn_block before
moving on so that we could find the nir_block again when it came time to
handle phi sources.  Unfortunately, NIR's control flow modification code is
inconsistent when it comes to how it splits blocks so the block pointer we
saved off may point to a block somewhere else in the shader by the time we
get around to handling phi sources.  In order to get around this, we insert
a nop instruction and use that as the logical end of our block.  Since the
control flow manipulation code respects instructions, the nop will keeps
its place like any other instruction and we can easily find the end of our
block when we need it.

This fixes a bug triggered by a couple of vkQuake shaders.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97233
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 6ffbfc760d)
2016-11-08 16:23:14 +00:00
Jason Ekstrand
17429a22a6 nir: Add a nop intrinsic
This intrinsic has no destination, no sources, no variables, and can be
eliminated.  In other words, it does nothing and will always get deleted by
dead code elimination.  However, it does provide a quick-and-easy way to
temporarily tag a particular location in a NIR shader.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7697b4b98b)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/compiler/nir/nir_intrinsics.h
2016-11-08 16:23:14 +00:00
Tapani Pälli
d8d7de3b29 egl: stop claiming support for pbuffer + msaa
This fixes a crash in egl-create-msaa-pbuffer-surface Piglit test
and same crash in many dEQP EGL tests.

I also found that some Qt example did a workaround because of this
crash: https://bugreports.qt.io/browse/QTBUG-47509

v2: Ian pointed out that v1 removed support for all multisample
    configs, including window ones. This one removes pbuffer bit
    when adding configs, now only pbuffer+msaa gets rejected and
    window+msaa continues to work. Fixed also comment (Emil)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 4d6d55deef)
2016-11-08 16:23:14 +00:00
Jason Ekstrand
5e4aeeb8ec nir/spirv/cfg: Detect switch_break after loop_break/continue
While the current CFG code is valid in the case where a switch break also
happens to be a loop continue, it's a bit suboptimal.  Since hardware is
capable of handling the continue as a direct jump, it's better to use a
continue instruction when we can than to bother with all of the nasty
switch break lowering.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ef3c5ac7fb)
2016-11-08 16:23:14 +00:00
Jason Ekstrand
12d09e24f8 nir/spirv/cfg: Handle switches whose break block is a loop continue
It is possible that the break block of a switch is actually the continue of
the loop containing the switch.  In this case, we need to identify the
break block as a continue and break out of current level of CFG handling.
If we don't, the continue portion of the loop will get handled twice, once
by following after the break and a second time by the loop handling code
handling it explicitly.

This fixes 6 of the new Vulkan CTS tests:
 - dEQP-VK.spirv_assembly.instruction.graphics.opphi.out_of_order*
 - dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order*

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4d02faede5)
2016-11-08 16:23:14 +00:00
Eric Anholt
cd88ea6d82 travis: Upgrade LLVM dependency to 3.5 and enable LLVM drivers.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
(cherry picked from commit 78ab62b1e9)
2016-11-08 16:23:14 +00:00
Eric Anholt
c004014db4 travis: Enable vc4 in libdrm to satisfy vc4 test build dependency.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
(cherry picked from commit 084678ccbb)
2016-11-08 16:23:14 +00:00
Eric Anholt
6931dab9b4 travis: Update to the Ubuntu Trusty image.
This will hopefully fix wget from x.org (no real reason explained in
Travis CI bug reports), and may also mean that we can enable LLVM driver
builds.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
(cherry picked from commit 80a872f3f0)
2016-11-08 16:23:14 +00:00
Eric Anholt
a8010d31ce travis: Parse configure.ac to pick an updated LIBDRM_VERSION.
Travis has been broken a couple of times by configure.ac updates.  To make
it useful, auto-update the version necessary.

This could potentially be used for other dependencies, too, but those get
bumped less frequently.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
(cherry picked from commit ecbc76cf6e)
2016-11-08 16:23:13 +00:00
Ian Romanick
0909e54c3c glsl: Fix cut-and-paste bug in hierarchical visitor ir_expression::accept
At this point in the code, s must be visit_continue.  If the child
returned visit_stop, visit_stop is the only correct thing to return.

Found by inspection.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ea6ed2379d)
2016-11-08 16:23:13 +00:00
Tim Rowley
6cd0438840 configure.ac: add llvm inteljitevents component if enabled
Needed to successfully link llvmpipe or swr when using shared llvm libs
built with inteljitevents enabled.

v2: Make adding inteljitevents component global rather than just
llvmpipe/swr, since libgallium will have a symbol dependency.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit bacdd9ef4c)
2016-11-08 16:23:13 +00:00
Nicholas Bishop
f228c90f80 st/dri: check pipe_screen->resource_get_handle() return value
Change dri2_query_image to check the return value of resource_get_handle
and return GL_FALSE if an error occurs.

For reference this is an example callstack that should propagate the
error back to the user:

    i915_drm_buffer_get_handle
    i915_texture_get_handle
    u_resource_get_handle_vtbl
    dri2_query_image
    gbm_dri_bo_get_fd
    gbm_bo_get_fd

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Nicholas Bishop <nbishop@neverware.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
[Emil Velikov: Split from larger patch, polish coding style, cc stable]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

(cherry picked from commit aa560e8e63)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/state_trackers/dri/dri2.c
2016-11-08 16:23:13 +00:00
Nicholas Bishop
29320aa06a gbm: return appropriate error when queryImage() fails
Change gbm_dri_bo_get_fd to check the return value of queryImage and
return -1 (an invalid file descriptor) if an error occurs.

Update the comment for gbm_bo_get_fd to return -1, since (apart from the
above) we've already return -1 on error.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Nicholas Bishop <nbishop@neverware.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
[Emil Velikov: Split from larger patch, polish coding style, cc stable]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

(cherry picked from commit 2d05ba2ca0)
2016-11-08 16:23:13 +00:00
Emil Velikov
48f001b836 cherry-ignore: add vaapi encode fix
The encode codepaths landed after the branch point.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 16:23:13 +00:00
Brian Paul
0823c98a65 st/mesa: fix swizzle issue in st_create_sampler_view_from_stobj()
Some demos, like Heaven, were creating and destroying a large number
of sampler views because of a swizzle issue.

Basically, we compute the sampler view's swizzle by examining the
texture format, user swizzle, depth mode, etc.  Later, during validation
we recompute that swizzle (in case something like depth mode changes)
and see if it matches the view's swizzle.

In the case of PIPE_FORMAT_RGTC2_UNORM, get_texture_format_swizzle
returned SWIZZLE_XYZW but the u_sampler_view_default_template() function
was setting the sampler view's swizzle to SWIZZLE_XY01.  This mismatch
caused the validation step to always "fail" so we'd destroy the old
sampler view and create a new one.

By removing the conditional, the sampler view's swizzle and the computed
texture swizzle match and validation "passes".  When creating a new sampler
view, we always want to use the texture swizzle which we just computed.

Fixes VMware issue 1733389.

Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit 1cdc232e1a)
2016-11-08 16:23:13 +00:00
Samuel Pitoiset
a287f820b8 gk110/ir: fix wrong emission of OP_NOT
This should emit src0 instead of src1.
Found by inspection.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit d8b4f5fcca)
2016-11-08 16:23:13 +00:00
Samuel Pitoiset
725ef1fbba nvc0/ir: fix subops for IMAD
Offset was wrong, it's at bit 8, not 4. Also, uses subr instead
of sub when src2 has neg. Similar to GK110 now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 50baaf6bc6)
2016-11-08 16:23:13 +00:00
Marek Olšák
a4a2378408 mesa: fix glGetFramebufferAttachmentParameteriv w/ on-demand FRONT_BACK alloc
This fixes 66 CTS tests on st/mesa.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit d58a3906cb)
2016-11-08 16:23:12 +00:00
Emil Velikov
cc34777cec cherry-ignore: add non-applicable i965 commit
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 16:23:12 +00:00
Hans de Goede
cd614f31b8 pipe_loader_sw: Fix fd leak when instantiated via pipe_loader_sw_probe_kms
Make pipe_loader_sw_probe_kms take ownership of the passed in fd,
like pipe_loader_drm_probe_fd does.

The only caller is dri_kms_init_screen which passes in a dupped fd,
just like dri2_init_screen passes in a dupped fd to
pipe_loader_drm_probe_fd.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 459cc94507)

Squashed with commit:

pipe_loader_sw: Don't invoke Unix close() on Windows.

Trivial.

(cherry picked from commit c6d17701c8)
2016-11-08 16:23:12 +00:00
Vedran Miletić
67c99c1245 clover: Fix build against clang SVN >= r273191
setLangDefaults() now requires PreprocessorOptions as an argument.

Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 82e0bbd01a)
Nominated-by: Andreas Boll <andreas.boll.dev@gmail.com>
Nominated-by: Timo Aaltonen <tjaalton@ubuntu.com>
2016-11-08 16:23:12 +00:00
Kenneth Graunke
6a72af2aeb mesa: Expose RESET_NOTIFICATION_STRATEGY with KHR_robustness.
This is supposed to be exposed with the GL_KHR_robustness extension,
which we support on ES 2.0 and later.  On desktop GL, it's also exposed
by GL_ARB_robustness, which is supported by all drivers ("dummy_true").
so we also allow desktop GL.

Fixes:
- ES32-CTS.robust.robustness.noResetNotification
- ES32-CTS.robust.robustness.loseContextOnReset

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 3bcdc2e3db)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/main/get.c
	src/mesa/main/get_hash_params.py
2016-11-08 16:23:12 +00:00
Kenneth Graunke
e591b0b206 nir: Call nir_metadata_preserve from nir_lower_alu_to_scalar().
This is mandatory.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit e6eed3533e)
2016-11-08 16:23:12 +00:00
Brendan King
96aa7ca98c configure.ac: fix the name of the Wayland Scanner pc file
The Wayland Scanner pkg-config file is called wayland-scanner.pc.

Fixes: 153539bd9d ("configure: rework wayland_scanner
       handling (fix make distcheck)")

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Brendan King <Brendan.King@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 95f3e5861c)
2016-11-08 16:23:12 +00:00
Marek Olšák
b1c5719d7b radeonsi: fix FP64 UBO loads with indirect uniform block indexing
No known tests.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 15a127bc2c)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
        src/gallium/drivers/radeonsi/si_shader.c
2016-11-08 16:23:12 +00:00
Ilia Mirkin
ec1f6700b6 st/mesa: fix is_scissor_enabled when X/Y are negative
Similar to commit 49c24d8a24 ("i965: fix noop_scissor range issue on
width/height") - take the X/Y into account to determine whether the
scissor covers the whole area or not.

Fixes the recently-added gl-1.0-scissor-depth-clear-negative-xy piglit
test.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 742832434a)
2016-11-08 16:23:11 +00:00
Julien Isorce
b2495c2202 st/va: also honors interlaced preference when providing a video format
This fixes a crash when using the prefered video format with vaapisink
on Nvidia hardwares.
Also caught by the following assert:
  nouveau_vp3_video.c:91: Assertion `templat->interlaced' failed.

TEST= gst-launch-1.0 videotestsrc ! video/x-raw, format=NV12 ! vaapisink

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Tested-by: Víctor Manuel Jáquez Leal <vjaquez@igalia.com>
Tested-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit bf901a2f8c)
2016-11-08 16:23:11 +00:00
Chuanbo Weng
7ad97bc307 gbm: fix potential NULL deref of mapImage/unmapImage.
The mapImage/unmapImage functions of DRIimage extension can be NULL,
so we should add additional check for them.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 9a1eb54237)
2016-11-08 16:23:11 +00:00
Ilia Mirkin
106f2dc8a7 gm107/ir: AL2P writes to a predicate register
We have to force it to write to predicate 7 (aka PT) in order for it not
to mess up another predicate. Unclear what would be returned in the
predicate, perhaps an error code for out-of-bounds requests. Blob
doesn't seem to check it.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit a22aee5ad1)
2016-11-08 16:23:11 +00:00
Marek Olšák
a3c232db2f radeonsi: take compute shader and dispatch indirect memory usage into account
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit e62caf576e)

Squashed with commit:

radeonsi: flush TC L2 before using a compute indirect buffer

There is no known test for this.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 08bcbfdc07)
2016-11-08 16:23:11 +00:00
Max Staudt
f75a108434 r300g: Set R300_VAP_CNTL on RSxxx to avoid triangle flickering
On the RSxxx chip series, HW TCL is missing and r300_emit_vs_state()
is never called.

However, if R300_VAP_CNTL is never set, the hardware (at least the
RS690 I tested this on) comes up with rendering artifacts, and
parts that are uploaded before this "fix" remain broken in VRAM.
This causes artifacts as in fdo#69076 ("triangle flickering").

It seems like this setup needs to happen at least once after power on
for 3D rendering to work properly. In the DDX with EXA, this happens in
RADEON_SWITCH_TO_3D() when processing an XRENDER Composite or an
Xv request. So playing back a video or starting a GTK+2 application
fixes 3D rendering for the rest of the session. However, this auto-fix
doesn't happen when EXA is not used, such as with GLAMOR or Wayland.

This patch ensures the register is configured even in absence of
the DDX's EXA module.

The register setting is taken from:
  xf86-video-ati  --  RADEONInit3DEngineInternal()
  mesa/src/mesa/drivers/dri/r300  --  r300EmitClearState()

Tested on RS690.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Max Staudt <mstaudt@suse.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 02675622b0)
2016-11-08 16:23:11 +00:00
Jason Ekstrand
258e651e0f nir/spirv: Refactor variable deocration handling
Previously, we dind't apply variable decorations to the members of a split
structure variable.  This doesn't quite work, unfortunately, because things
such as the "flat" qualifier may get applied to an entire structure instead
of propagated to the members.  This fixes 9 of the new CTS tests in the
dEQP-VK.glsl.linkage.varying.struct.* group.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a00bd7bc27)
2016-11-08 16:23:11 +00:00
Jason Ekstrand
662a7c627b nir/spirv: Break variable decoration handling into a helper
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f5505730d3)
2016-11-08 16:23:11 +00:00
Ilia Mirkin
ed8e99761d nir: fix definition of pack_uvec2_to_uint
Found by inspection. Untested beyond compilation. This also matches the
logic used in nir_lower_alu_to_scalar.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8c8874eafb)
2016-11-08 16:23:10 +00:00
Ilia Mirkin
98c9fcf259 mesa/formatquery: limit ES target support, fix core context support
First off, as late as ES 3.2, GetInternalformat only supports
RENDERBUFFER and 2DMS(_ARRAY) targets.

Secondly, the _mesa_has_ext helpers are very accurate... a little too
accurate, some might say. If we only show an extension in compat
profiles because core profiles have the functionality guaranteed, they
will return false. Fix these to either check for a core profile
explicitly, or to a different-but-identical extension available in core
profile.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matteo Bruni <matteo.mystral@gmail.com>
Tested-by: Matteo Bruni <matteo.mystral@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c42acd93d4)
2016-11-08 16:23:10 +00:00
Ilia Mirkin
5eabc81d50 main: GL_RGB10_A2UI does not come with GL 3.0/EXT_texture_integer
Add a separate extension check for that format. Prevents glTexImage from
trying to find a matching format, which fails on drivers without support
for this format.

Fixes: sized-texture-format-channels (on a3xx)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 36347c8d6f)
2016-11-08 16:23:10 +00:00
Jason Ekstrand
2fbce4c9e1 nir/spirv: Use the correct sources for CompareExchange on images
The CompareExchange operation has two "Memory Semantics" parameters instead
of one so the real arguments start at w[7] instead of w[6].

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit f2a10937d8)
2016-11-08 16:23:10 +00:00
Jason Ekstrand
ab6126bb1d nir/spirv: Swap the argument order for AtomicCompareExchange
SPIR-V has the two arguments in the opposite order from GLSL.  NIR uses the
GLSL order so we had them backwards.

Fixes dEQP-VK.spirv_assembly.instruction.compute.opatomic.compex

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 0ead7bef6b)
2016-11-08 16:23:10 +00:00
Marek Olšák
f5de7da4e1 radeonsi: fix cubemaps viewed as 2D
This fixes: GL43-CTS.texture_view.view_sampling

v2: fix a typo, merge both if statements

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit a4fa215058)
2016-11-08 16:23:10 +00:00
Ilia Mirkin
05ec6a7c03 a3xx: use window scissor to simulate viewport xy clip
Unfortunately a3xx does not have a separate disable for depth clipping,
so when depth clamp is enabled, we disable the whole 3d clipper logic.
This in turn also gets rid of the xy clip that it would normally do.
When we detect this would happen, instead we integrate the viewport into
the window scissor. This may have slightly different behavior around
wide points, but it's unlikely that anything depends on this.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ca313e00b6)
[Emil Velikov: s|batch->||g]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/freedreno/a3xx/fd3_emit.c
2016-11-08 16:23:10 +00:00
Ilia Mirkin
dee992caa1 a3xx: make use of software clipping when hw can't handle it
The hw clipper only handles up to 6 UCPs. If there are more than 6 UCPs,
or a clip vertex, or clip distances are in use, then we must use the
fallback discard-based clipping from the frag shader.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 83d7230fd5)
2016-11-08 16:23:10 +00:00
Ilia Mirkin
68620d14d4 a3xx: make sure to actually clamp depth as requested
We were previously ... not clamping. I guess this meant that everything
got clamped to 1/0, which was enough to pass the existing tests. Or
perhaps the clamping would only happen to the rasterized depth value and
not the frag shader's output depth value.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit dac72234c7)
[Emil Velikov: s|batch->||g]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 16:23:10 +00:00
Ilia Mirkin
42a890891c nv30: set usage to staging so that the buffer is allocated in GART
The code a few lines below expects to migrate the bo in question to
VRAM. Since we're filling the initial data via CPU, it's more efficient
to create the temporary buffer in GART. There is no "push" method
implemented, otherwise we'd use that instead.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 6118bcab4e)
2016-10-31 17:39:42 +00:00
Michel Dänzer
d3d33918c7 loader/dri3: Overhaul dri3_update_num_back
Always use 3 buffers when flipping. With only 2 buffers, we have to wait
for a flip to complete (which takes non-0 time even with asynchronous
flips) before we can start working on the next frame. We were previously
only using 2 buffers for flipping if the X server supports asynchronous
flips, even when we're not using asynchronous flips. This could result
in bad performance (the referenced bug report is an extreme case, where
the inter-frame stalls were preventing the GPU from reaching its maximum
clocks).

I couldn't measure any performance boost using 4 buffers with flipping.
Performance actually seemed to go down slightly, but that might have
been just noise.

Without flipping, a single back buffer is enough for swap interval 0,
but we need to use 2 back buffers when the swap interval is non-0,
otherwise we have to wait for the swap interval to pass before we can
start working on the next frame. This condition was previously reversed.

Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97260
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 1e3218bc5b)

Squashed with commit:

loader/dri3: Always use at least two back buffers

This can make a significant difference for performance with some extreme
test cases such as vblank_mode=0 glxgears.

Fixes: 1e3218bc5b ("loader/dri3: Overhaul dri3_update_num_back")
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97549
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
(cherry picked from commit dc3bb5db8c)
2016-10-31 17:39:32 +00:00
Emil Velikov
09460b8cf7 docs: add sha256 checksums for 12.0.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-15 11:29:24 +01:00
Emil Velikov
d79b2e7bf3 docs: add release notes for 12.0.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-15 10:19:26 +01:00
Emil Velikov
e487048f8c Update version to 12.0.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-15 10:15:57 +01:00
Emil Velikov
71b47b9cfe Revert "i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations"
This reverts commit be0344f630.

The commit depends on 48e9ecc47f ("Revert "i965/miptree: Set
logical_depth0 == 6 for cube maps") which was reverted earlier.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97781
2016-09-14 12:14:01 +01:00
Jose Fonseca
bde8f418bd appveyor: Update winflexbison download URL.
This particular version got moved into a `old_versions` subdirectory.
2016-09-14 11:21:04 +01:00
Emil Velikov
614fb93a6d docs: add sha256 checksums for 12.0.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-05 16:03:06 +01:00
Emil Velikov
2fc6a31f10 docs: add release notes for 12.0.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-05 12:14:11 +01:00
Emil Velikov
63001e7ddf Update version to 12.0.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-05 12:09:24 +01:00
Emil Velikov
7757de1ebf glx/glvnd: list the strcmp arguments in correct order
Currently, due to the inverse order, strcmp will produce negative result
when the needle is towards the start of the haystack. Thus on the next
iteration(s) we'll end up further towards the end and eventually fail to
locate the entry.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 62b224d428)
2016-09-05 11:59:25 +01:00
Ilia Mirkin
8e9b6161eb gk110/ir: fix quadop dall emission
We recently starting to always emit the NDV (== dall) bit for quadops.
However it was folded into the wrong code word.

Fixes: e0a067ed48 (nv50/ir: always emit the NDV bit for OP_QUADOP)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 61e978524a)
2016-09-05 11:37:18 +01:00
Ilia Mirkin
7c96b11fd6 a4xx: make sure to actually clamp depth as requested
We were previously ... not clamping. I guess this meant that everything
got clamped to 1/0, which was enough to pass the existing tests. Or
perhaps the clamping would only happen to the rasterized depth value and
not the frag shader's output depth value.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org

(cherry-picked from 89f00f749f)
[imirkin: adjust ctx->batch to just ctx]
2016-09-05 11:37:07 +01:00
Emil Velikov
49e84b8f18 Revert "i965/miptree: Set logical_depth0 == 6 for cube maps"
This reverts commit 48e9ecc47f.

The commit regressed several piglit tests on SNB/ILK hardware.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97567
2016-09-05 11:37:06 +01:00
Jose Fonseca
463d9ea0dc appveyor: Force Visual Studio 2013 image.
It seems the default build image is now Visual Studio 2015, and Visual
Studio 2013 is not installed.
2016-09-01 22:10:08 +01:00
Jose Fonseca
53e8701c7b appveyor: Install pywin32 extensions.
AppVeyor build images seem to have been upgraded to Python 2.7.12, but
no longer have pywin32 pre-installed.
2016-09-01 22:10:08 +01:00
Ilia Mirkin
0fa0e2a505 nv30: only bail on color/depth bpp mismatch when surfaces are swizzled
The actual restriction is a little weaker than I originally thought. See
https://bugs.freedesktop.org/show_bug.cgi?id=92306#c17 for the
suggestion. This also explain why things weren't *always* failing
before, only sometimes. We will allocate a non-swizzled depth buffer for
NPOT winsys buffer sizes, which they almost always are.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8caf2cb0c0)
2016-09-01 11:39:47 +01:00
Jason Ekstrand
9a8d605398 anv: Rework pipeline caching
The original pipeline cache the Kristian wrote was based on a now-false
premise that the shaders can be stored in the pipeline cache.  The Vulkan
1.0 spec explicitly states that the pipeline cache object is transiant and
you are allowed to delete it after using it to create a pipeline with no
ill effects.  As nice as Kristian's design was, it doesn't jive with the
expectation provided by the Vulkan spec.

The new pipeline cache uses reference-counted anv_shader_bin objects that
are backed by a large state pool.  The cache itself is just a hash table
mapping keys hashes to anv_shader_bin objects.  This has the added
advantage of removing one more hand-rolled hash table from mesa.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97476
Acked-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
(cherry picked from commit 10f9901bce)
2016-09-01 11:39:47 +01:00
Jason Ekstrand
17d40ca82b anv/pipeline: Add support for caching the push constant map
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
(cherry picked from commit ffcef720b7)
[Emil Velikov: dependency for the next patch]
Nominated-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-01 11:39:47 +01:00
Jason Ekstrand
a8e4b59cfd anv: Add a struct for storing a compiled shader
This new anv_shader_bin struct stores the compiled kernel (as an anv_state)
as well as all of the metadata that is generated at shader compile time.
The struct is very similar to the old cache_entry struct except that it
is reference counted and stores the actual pipeline_bind_map.  Similarly to
cache_entry, much of the actual data is floating-size and stored after the
main struct.  Unlike cache_entry, which was storred in GPU-accessable
memory, the storage for anv_shader_bin kernels comes from a state pool.
The struct itself is reference-counted so that it can be used by multiple
pipelines at a time without fear of allocation issues.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Acked-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
(cherry picked from commit 6899718470)
2016-09-01 11:39:47 +01:00
Jason Ekstrand
2566315063 anv: Add pipeline_has_stage guards a few places
All of these worked before because they were depending on prog_data to be
null.  Soon, we won't be able to depend on a nice prog_data pointer and
it's nice to be more explicit anyway.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 13c09fdd0c)
2016-09-01 11:39:47 +01:00
Jason Ekstrand
b529a77d79 anv: Remove unused fields from anv_pipeline_bind_map
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b259d86ad6)
2016-09-01 11:39:47 +01:00
Jason Ekstrand
d159ca4fa2 anv/pipeline: Properly handle OOM during shader compilation
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d5945bec12)
2016-09-01 11:39:47 +01:00
Jason Ekstrand
1ce414accf anv/allocator: Correctly set the number of buckets
The range from ANV_MIN_STATE_SIZE_LOG2 to ANV_MAX_STATE_SIZE_LOG2 should
be inclusive and we have asserts that ensure that you never try to allocate
a state larger than (1 << ANV_MAX_STATE_SIZE_LOG2).  However, without
adding 1 to the difference, we allocate 1 too few bucckts and so, even
though we have an assert, anything landing in the last bucket will fail to
allocate properly..

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a0f5c496e3)
2016-09-01 11:39:47 +01:00
Jason Ekstrand
e12b7486b3 anv/pipeline: Fix bind maps for fragment output arrays
Found by inspection.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4200c2266e)
2016-09-01 11:39:47 +01:00
Jason Ekstrand
1d0c79b13b anv/descriptor_set: memset anv_descriptor_set_layout
We hash this data structure so we can't afford to have uninitialized data
even if it is just structure padding.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d316cec1c1)
2016-09-01 11:39:46 +01:00
Samuel Pitoiset
04f04ab6a6 nv50/ir: always emit the NDV bit for OP_QUADOP
This silences a divergent error found with F1 2015.

Basically, the NDV bit has to be set when a FSWZ instruction is
inside divergent code, but it's not needed otherwise. The correct
fix should be to set it only in divergent code situations.

GM107 emitter already sets that bit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e0a067ed48)
2016-09-01 11:39:46 +01:00
Emil Velikov
5af16ddf84 i915: Check return value of screen->image.loader->getBuffers
Ported from the i965 commit e7ab358e81.

Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Cc: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 5de640a518)
2016-09-01 11:39:46 +01:00
Ilia Mirkin
178c34c535 nouveau: always enable at least one RC
Experimentally, this is required for glxgears and others to display the
proper colors. This is also what the code used to do before the
referenced commit.

Fixes: c703658b39 (mesa: Drop _EnabledUnits.)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 357d8261f1)
2016-09-01 11:39:46 +01:00
Brian Paul
7c583adfb5 mesa: fix format conversion bug in get_tex_rgba_uncompressed()
We need to set the need_convert flag with each loop iteration, not
just when the rgba pointer is null.

Bug reported by Markus Müller <mueller@imfusion.de> on mesa-users list.
Fixes new piglit arb_texture_float-get-tex3d test.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit b9b88516f8)
2016-09-01 11:39:46 +01:00
Ilia Mirkin
f70585e56a main: add missing EXTRA_END in OES_sample_variables get check
Fixes: 3002296cb6 (mesa: add GL_OES_shader_multisample_interpolation support)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 05b37e20de)
2016-09-01 11:39:46 +01:00
Jason Ekstrand
2d48468e58 isl: Allow multisampled array textures
This probably isn't the only thing that needs to be done to get
multisampled array textures working in Vulkan but I think this is all that
ISL really needs and it does fix 8 of the new CTS tests.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit fb89551047)
2016-09-01 11:39:46 +01:00
Ian Romanick
ab0183172f glsl: Mark cube map array sampler types as reserved in GLSL ES 3.10
All the GLSL 4.x keywords were added to the list of reserved keywords
in GLSL ES 3.10.  As far as I can tell, these are the only ones that
were missed.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit c879dbc4e4)
2016-09-01 11:39:46 +01:00
Miklós Máté
1d4c887020 vbo: set draw_id
Fixes conditional jump depending on uninitialized value
in si_state_draw.c:593

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit b9ac72b511)
2016-09-01 11:39:46 +01:00
Chad Versace
a0e81225bd i965: Respect miptree offsets in intel_readpixels_tiled_memcpy()
Respect intel_miptree_slice::x_offset,y_offset and
intel_mipmap_tree::offset. All three may be non-zero when glReadPixels
is called on an EGLImage created from the non-base slice of a miptree.

Patch 2/2 that fixes test
'dEQP-EGL.functional.image.create.gles2_cubemap_*'.

Reported-by: Haixia Shi <hshi@chromium.org>
Diagnosed-by: Haixia Shi <hshi@chromium.org>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Change-Id: I4b397b27e55a743a7094d29fb0a6a4b6b34352b0
(cherry picked from commit 5b03975889)
2016-09-01 11:39:46 +01:00
Chad Versace
6898eb5859 i965: Fix miptree layout for EGLImage-based renderbuffers
When glEGLImageTargetRenderbufferStorageOES() was given an EGLImage
created from the non-base slice of a miptree,
intel_image_target_renderbuffer_storage() forgot to apply the intra-tile
offsets __DRIimage::tile_x,tile_y to the miptree layout.

This patch fixes the problem with a quick hack suitable for
cherry-picking. A proper fix requires more thorough plumbing in
intel_miptree_create_layout() and brw_tex_layout().

Patch 1/2 that fixes test
'dEQP-EGL.functional.image.create.gles2_cubemap_*'.

Reported-by: Haixia Shi <hshi@chromium.org>
Diagnosed-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Change-Id: I8a64b0048a1ee9e714ebb3f33fffd8334036450b
(cherry picked from commit c82f99e883)
2016-09-01 11:39:46 +01:00
Matt Turner
9aa4e400d2 nir: Walk blocks in source code order in lower_vars_to_ssa.
Prior to this commit rename_variables_block() is recursively called,
performing a depth-first traversal of the control flow graph. The
function uses a non-trivial amount of stack space for local variables,
which puts us in danger of smashing the stack, given a sufficiently deep
dominance tree.

XCOM: Enemy Within contains a shader with such a dominance tree (1574
nir_blocks in total, depth of at least 143).

Jason tells me that he believes that any walk over the nir_blocks that
respects dominance is sufficient (a DFS might have been necessary prior
to the introduction of nir_phi_builder).

In fact, the introduction of nir_phi_builder made the problem worse:
rename_variables_block(), walks to the bottom of the dominance tree
before calling nir_phi_builder_value_get_block_def() which walks back to
the top of the dominance tree...

In any case, this patch ensures we avoid that problem as well.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97225
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit e53130cc27)
2016-09-01 11:39:46 +01:00
Brian Paul
b061b2e3eb swrast: fix incorrectly positioned putImage() in swrast driver
Some front buffer rendering was in the wrong position.  This included
scissored clears, glDrawPixels and glCopyPixels.  The problem was the
y coordinate passed to putImage() didn't match the y coordinate passed
to getImage().

We fix this by setting xrb->map_y to the inverted coordinate in
swrast_map_renderbuffer() which is used later by the putImage() call.
Also pass xrb->map_y to getImage() to be symmetric.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97426
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 2a2dc416b6)
2016-09-01 11:39:45 +01:00
Marek Olšák
69acfb7c94 radeonsi: disable SDMA texture copying on Carrizo
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 3ff0b67e1b)
2016-09-01 11:39:45 +01:00
Jason Ekstrand
544a92ad49 anv: Include the pipeline layout in the shader hash
The pipeline layout affects shader compilation because it is what
determines binding table locations as well as whether or not a particular
buffer has dynamic offsets.  Since this affects the generated shader, it
needs to be in the hash.  This fixes a bunch of CTS tests now that the CTS
is using a pipeline cache.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2301705dee)
2016-09-01 11:39:45 +01:00
Samuel Pitoiset
9fced1aa53 nvc0: invalidate textures/samplers on GK104+
Like Fermi, textures and samplers are aliased between 3D and compute,
especially the TIC_FLUSH/TSC_FLUSH methods and we have to re-validate
these resources when switching between the two pipelines.

This fixes a GPU hang with Elemental (and most likely with other UE4 demos).

Tested on GK107 and GM107.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a227b0a4f1)
2016-09-01 11:39:45 +01:00
Marek Olšák
5ad09f744c radeonsi: fix VM faults due NULL internal const buffers on CIK
They are harmless, but the interrupts do decrease performance.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97039

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2c13abb491)
2016-09-01 11:39:45 +01:00
Nicolai Hähnle
6001e37b8e radeonsi: add si_set_rw_buffer to be used for internal descriptors
So that callers outside of si_descriptors.c need to worry less about the
details of descriptor handling.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit ba4a2840c7)
2016-09-01 11:39:45 +01:00
Tomasz Figa
e45e500d4b gallium/winsys/kms: Look up the GEM handle after importing a prime FD
drmPrimeHandleToFD() will return the same GEM handle every time the same
buffer is imported, even from a different prime FD. Since GEM handles
are not reference counted, we need to make sure that each GEM handle is
referenced only by one display target struct, by looking it up in
kms_sw->bo_list first and bumping the refcount of the found dt on hit
and falling back to creating a new dt only on miss.

v2: Split into separate function.
    Use helper function for lookup.

v3 [Emil Velikov]:
    Rename kms_sw_displaytarget_{lookup,find_and_ref} (Jordan)

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com> (v2)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 577f85e2bb)
2016-09-01 11:39:45 +01:00
Tomasz Figa
4ec509533e gallium/winsys/kms: Move display target handle lookup to separate function
As a preparation to use the lookup in more than once place, move the
code that looks up given KMS/GEM handle to a separate function. This
change should not introduce any functional changes.

v2: Split into separate patch.
    Move lookup code into separate function.

v3 [Emil Velikov]:
    Rename kms_sw_displaytarget_{lookup,find_and_ref} (Jordan)

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com> (v2)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 0465c72d46)
2016-09-01 11:39:45 +01:00
Tomasz Figa
731e6575e6 gallium/winsys/kms: Fully initialize kms_sw_dt at prime import time (v2)
Currently kms_sw_displaytarget_add_from_prime() allocates the struct and
fills in only some of the fields, resulting in a half-baked struct that
needs to be further completed by the caller. To make this a bit more
consistent, pass width, height and stride to this function and fill in
everything there, so that caller can take the returned struct as is.

v2: Split from one big patch into four fixing one thing at a time.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit e71b78ebf9)
2016-09-01 11:39:45 +01:00
Tomasz Figa
ab157ffd86 gallium/winsys/kms: Fix double refcount when importing from prime FD (v2)
Currently the code creates a display target struct with refcount field
initialized to 1 and then the caller again increments it, leading to
a leaked reference. Let's remove the unnecessary increment.

v2: Split from one big patch into four fixing one thing at a time.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 0aa6a818ef)
2016-09-01 11:39:45 +01:00
Ilia Mirkin
61a6d84679 nv50/ir: make sure cfg iterator always hits all blocks
In some very specially-crafted cases, we could attempt to visit a node
that has already been visited, and then run out of bb's to visit, while
there were still cross blocks on the list. Make sure that those get
moved over in that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96274
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 092f994a03)
2016-09-01 11:39:45 +01:00
Eric Anholt
fabc5c2783 vc4: Fix leak of the bo_handles table.
(cherry picked from commit 9f95690959)
2016-09-01 11:39:45 +01:00
Rob Herring
ec68600280 vc4: add hash table look-up for exported dmabufs
It is necessary to reuse existing BOs when dmabufs are imported. There
are 2 cases that need to be handled. dmabufs can be created/exported and
imported by the same process and can be imported multiple times.
Copying other drivers, add a hash table to track exported BOs so the
BOs get reused.

v2: Whitespace fixup (by anholt)

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 9ace2c1355)
2016-09-01 11:39:45 +01:00
Eric Anholt
838c1cbde4 vc4: Fix a leak of the src[] array of VPM reads in optimization.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a0671d67de)
2016-09-01 11:39:44 +01:00
Eric Anholt
965ceef596 vc4: Disable early Z with computed depth.
We don't tell the hardware whether we're computing depth, so we need
to manage early Z state manually.  Fixes piglit early-z.

(cherry picked from commit ce8504d196)
2016-09-01 11:39:44 +01:00
Eric Anholt
0f097f28eb vc4: Close our screen's fd on screen close.
We're passed in a freshly dup()ed fd on screen create, so we should close
it on exit.  Debugged by Hugh Cole-Baker.

(cherry picked from commit c65a00eaff)
2016-09-01 11:39:44 +01:00
Rob Herring
7c583f85e1 vc4: fix vc4_resource_from_handle() stride calculation
The expected stride calculation is completely wrong. It should
ultimately be multiplying cpp and width rather than dividing. The width
also needs to be aligned to the tiling width first before converting to
stride bytes.

The whole stride check here is possibly pointless. Any buffers which
were allocated outside of vc4 may have strides with larger alignment
requirements.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 067c5b10b6)
2016-09-01 11:39:44 +01:00
Matt Turner
cdfd7c7b72 mesa: Use AC_HEADER_MAJOR to include correct header for major().
Gentoo has been smoke testing an upcoming change to glibc.

Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=580392
(cherry picked from commit 20553e4a2d)
2016-09-01 11:39:44 +01:00
Stencel, Joanna
e671f40e79 egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface.
Segfault occurs when destroying EGL surface attached to already destroyed
Wayland window. The fix is to set to NULL the pointer of surface's
native window when wl_egl_destroy_window() is called.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Stencel, Joanna <joanna.stencel@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 690ead4a13)
2016-09-01 11:39:44 +01:00
Jason Ekstrand
42fe245370 anv/clear: Clear E5B9G9R9 images as R32_UINT
We can't actually clear these images normally because we can't render to
them.  Instead, we have to manually unpack the rgb9e5 color value on the
CPU and clear it as R32_UINT.  We still have a bit of work to do to clear
non-power-of-two images, but this should get all of the power-of-two clears
working on at least Haswell.  This fixes three of the new Vulkan CTS tests
in the dEQP-VK.api.image_clearing.clear_color_image.* group.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7bdccd104b)
[Emil Velikov: rgb9e5 header is renamed in master
s/format_rgb9e5.h/u_format_rgb9e5.h/]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-01 11:39:44 +01:00
Jason Ekstrand
015b920fb4 anv/clear: Make cmd_clear_image take an actual VkClearValue
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit afa7ca0f77)
2016-09-01 11:39:44 +01:00
Jason Ekstrand
32a0038a02 anv/blit2d: Add support for RGB destinations
This fixes 104 of the new image_clearing and copy_and_blit Vulkan CTS
tests.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cf3cf2ecfc)
2016-09-01 11:39:44 +01:00
Jason Ekstrand
346f9e5a85 anv/blit2d: Add a format parameter to bind_dst and create_iview
Signed-off-by: Jasosn Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 16ddda8452)
[Emil Velikov: don't attribute if using ISL_TILING_W. patches that
attribute and require the ISL_TILING_W handling aren't in 12.0]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/intel/vulkan/anv_meta_blit2d.c
2016-09-01 11:39:44 +01:00
Dave Airlie
7fe8cad4e4 st/glsl_to_tgsi: fix st_src_reg_for_double constant.
This needs to set the src swizzle so it doesn't access the .zw
members ever when we are just emitting a 0 constant here.

This fixes:
vert-conversion-explicit-dvec3-bvec3.shader_test
and a bunch of other fp64 tests on softpipe and radeonsi.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 26187f3890)
2016-09-01 11:39:44 +01:00
Daniel Scharrer
6847a37363 mesa: Fix fixed function spot lighting on newer hardware (again)
This was first fixed in commit b3f9c5c and then broken again in commit
fe2d2c7, which removed the abs modifier from input registers.

v2: Don't change the size of struct ureg.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Daniel Scharrer <daniel@constexpr.org>
(cherry picked from commit 16ef7ab5c1)
2016-09-01 11:39:44 +01:00
Matt Turner
bf61f2e5c3 i965/vec4: Ignore swizzle of VGRF for use by var_range_end().
var_range_end(v, n) loops over the n components of variable number v and
finds the maximum value, giving the last use of any component of v.
Therefore it expects v to correspond to the variable associated with the
.x channel of the VGRF.

var_from_reg() however returns the variable for the first channel of the
VGRF, post-swizzle.

So, if the last register had a swizzle with y, z, or w in the swizzle
component, we would read out of bounds. For any other register, we would
read liveness information from the next register.

The fix is to convert the src_reg to a dst_reg in order to call the
dst_reg version of var_from_reg() that doesn't consider the swizzle.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit e7c376adfd)
2016-09-01 11:39:43 +01:00
Emil Velikov
d2f0a2925f cherry-ignore: temporary(?) drop "a4xx: make sure to actually clamp depth"
The commit depends a 700+ patch introducing fd_batch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-01 11:39:43 +01:00
Ilia Mirkin
9e71069d8f a4xx: only disable depth clipping, not all clipping, when requested
The previous bit disables the whole clipper, including the regular
viewport-related clipping that would go on. The two new bits disable
near and far clipping (separately, as verified with the
depth-clamp-range piglit).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit cd8e30452f)
2016-09-01 11:39:43 +01:00
Ilia Mirkin
8d1029fb7b vbo: add basevertex when looking up elements for vbo splitting
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97351
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 659dc10d32)
2016-09-01 11:39:43 +01:00
Emil Velikov
16751e0be4 isl: automake: use VISIBILITY_CFLAGS to restrict symbol visibility
v2: Add VISIBILITY_CFLAGS to AM_CFLAGS (Ken)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d61d259518)
[Emil Velikov: drop not applicable gen4-6 hunks]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/intel/isl/Makefile.am
2016-09-01 11:39:43 +01:00
mil Velikov
52f094859a anv: remove dummy VK_DEBUG_MARKER_EXT entry points
The vkCmdDbgMarker{Begin,End} symbols are exported, yet the json does no
advertise that the driver supports the extension. Furthermore the
functions are empty stubs.

Remove those until we get a proper implementation and json notation.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ebd5dc8826)
2016-09-01 11:39:43 +01:00
Emil Velikov
21411adc27 anv: do not export the Vulkan API
With version 1 of the Loader interface there is an internal/private symbol
(vk_icdGetInstanceProcAddr) which is used to retrieve all the API from the
Vulkan entrypoints from the ICD. Implying that exposing the Vulkan API is not
recommended.

Version 2 goes a step further explicitly forbiding the ICD from exposing Vulkan
symbols (and adding a negotiation API)

As a reference:
 - Nvidia 367.35
Missing negotiation API - version 1.
Exposes only vk_icdGetInstanceProcAddr.

 - AMD 16.30.3.306809
Have negotiation API - version 2,
Exposes vk_icdGetInstanceProcAddr.
Exposes a couple of Vulkan entry points - seems to be in violation with the spec.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 49394e8d77)
2016-09-01 11:39:43 +01:00
Emil Velikov
e609ec2c1a anv: automake: build with -Bsymbolic
Explicitly suggested in the Loader interface version 2 section, but it's good
idea either way. It essentially, ensures that our symbols are not interposed.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 1cdb6ca40b)
2016-09-01 11:39:43 +01:00
Emil Velikov
610857acea anv: automake: use VISIBILITY_CFLAGS to restrict symbol visibility
Hide the internal symbols and annotate the vk_icdGetInstanceProcAddr as public
since the loader needs it (since v1 of the loader interface).

v2: Add VISIBILITY_CFLAGS to AM_CFLAGS (Ken)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 40e4fff563)
2016-09-01 11:39:43 +01:00
Emil Velikov
de1f9ce703 anv: remove internal 'validate' layer
Presently the layer has only a single entry point. As mentioned by Jason the
function does not validate anything that isn't checked elsewhere, thus we can
drop the whole thing.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b0d56f2f4f)
2016-09-01 11:39:43 +01:00
Kenneth Graunke
951b508e44 i965: Fix barrier count shift in scalar TCS backend.
The "Barrier Count" field goes in 14:9 of m0.2.  The vec4 backend
correctly shifts by 9, but the scalar backend only shifted by 8.

It's not like this changed - I think I just made a typo when writing
the original scalar TCS backend code.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
(cherry picked from commit d14dd727f4)
2016-09-01 11:39:43 +01:00
Kenneth Graunke
47b72990fe i965: Fix execution size of scalar TCS barrier setup code.
Previously, the scalar TCS backend was generating:

mov(8)   g17<1>UD     0x00000000UD    { align1 WE_all 1Q compacted };
and(8)   g17.2<1>UD   g0.2<0,1,0>UD   0x0001e000UD  { align1 WE_all 1Q };
shl(8)   g17.2<1>UD   g17.2<8,8,1>UD  0x0000000bUD  { align1 WE_all 1Q };
or(8)    g17.2<1>UD   g17.2<8,8,1>UD  0x00008200UD  { align1 WE_all 1Q };
send(8)  null<1>UW    g17<8,8,1>UD
         gateway (barrier msg) mlen 1 rlen 0 { align1 WE_all 1Q };

This is rubbish - g17.2<8,8,1>UD spans two registers, and is an illegal
region.  Not to mention it clobbers 8 channels of data when we only
wanted to touch m0.2.

Instead, we want:

mov(8)   g17<1>UD     0x00000000UD    { align1 WE_all 1Q compacted };
and(1)   g17.2<1>UD   g0.2<0,1,0>UD   0x0001e000UD  { align1 WE_all };
shl(1)   g17.2<1>UD   g17.2<0,1,0>UD  0x0000000bUD  { align1 WE_all };
or(1)    g17.2<1>UD   g17.2<0,1,0>UD  0x00008200UD  { align1 WE_all };
send(8)  null<1>UW    g17<8,8,1>UD
         gateway (barrier msg) mlen 1 rlen 0 { align1 WE_all 1Q };

Using component() accomplishes this.

Fixes GL44-CTS.tessellation_shader.tessellation_shader_tc_barriers.
barrier_guarded_read_write_calls on Skylake.  Probably fixes other
barrier issues on Gen8+.

v2: Use a group(1, 0) builder so inst->exec_size is set correctly
    (thanks to Francisco Jerez for catching that it was incorrect).

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v1]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 159f037755)
2016-09-01 11:39:42 +01:00
Kenneth Graunke
9cf5eb292b i965: Implement the WaPreventHSTessLevelsInterference workaround.
Fixes several GL44-CTS.tessellation_shader (and GL45 and ES31) subcases:
- vertex_spacing
- tessellation_shader_point_mode.points_verification
- tessellation_shader_quads_tessellation.inner_tessellation_level_rounding

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
(cherry picked from commit 9e778837ff)
[Emil Velikov: attribute for the lack of gl_linked_shader struct.]
[Namely: s/tes->info./shader_prog->/;s/gl_linked_shader/gl_shader/]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/drivers/dri/i965/brw_tcs.c
2016-09-01 11:39:42 +01:00
Kenneth Graunke
6ea7a82b5e nir/builder: Add bany_inequal and bany helpers.
The first simply picks the bany_inequal[234] opcodes based on the SSA
def's number of components.  The latter implicitly compares with zero
to achieve the same semantics of GLSL's any().

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
(cherry picked from commit d8971128ac)
2016-09-01 11:39:42 +01:00
Kenneth Graunke
ab441496ca mesa: Fix uf10_to_f32() scale factor in the E == 0 and M != 0 case.
GL_EXT_packed_float, 2.1.B Unsigned 10-Bit Floating-Point Numbers:

        0.0,                      if E == 0 and M == 0,
        2^-14 * (M / 32),         if E == 0 and M != 0,
        2^(E-15) * (1 + M/32),    if 0 < E < 31,
        INF,                      if E == 31 and M == 0, or
        NaN,                      if E == 31 and M != 0,

In the second case (E == 0 and M != 0), we were multiplying the mantissa
by 2^-20, when we should have been multiplying by 2^-19 (which is
2^(-14 + -5), or 2^-14 * 2^-5, or 2^-14 / 32).

The previous section defines the formula for 11-bit numbers, which is:

        2^-14 * (M / 64),         if E == 0 and M != 0,

In other words, we had accidentally copy and pasted the 11-bit code
to the 10-bit case, and neglected to change the exponent.

Fixes dEQP-GLES3.functional.pbo.renderbuffer.r11f_g11f_b10f_triangles
when run with surface dimensions of 1536x1152 or 1920x1080.

Cc: mesa-stable@lists.freedesktop.org
References: https://code.google.com/p/chrome-os-partner/issues/detail?id=56244
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Stephane Marchesin <stephane.marchesin@gmail.com>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
(cherry picked from commit 01e99cba04)
2016-09-01 11:39:42 +01:00
Michel Dänzer
d1142926c2 glx: Don't use current context in __glXSendError
There's no guarantee that there is one, and we don't need one anyway.

Fixes piglit tests:

glx@glx-fbconfig-bad
glx@glx_ext_import_context@import context, multi process
glx@glx_ext_import_context@import context, single process

Fixes: 2e3f067458 ("glx: fix error code when there is no context bound")
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 4ac640e3d2)
2016-09-01 11:39:42 +01:00
Ilia Mirkin
8b76a3744c nv50/ir: fix bb positions after exit instructions
It's fairly rare that the BB layout puts BBs after the exit block, which
is likely the reason these issues lingered for so long.

This fixes a fraction of issues with the giant pixmark piano shader.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e988999791)
2016-09-01 11:39:42 +01:00
Dave Airlie
d279dec359 anv: fix writemask on blit fragment shader.
I'm not sure if anything even uses this, but I found this on radv, so
just fix it on anv for consistency.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c2f2252037)
2016-09-01 11:39:42 +01:00
Bernard Kilarski
c04ee8c110 glx: fix error code when there is no context bound
v2: change all related NULL checks to check against dummyContext
v3: really check for dummyContext *only* when ctx was from
    __glXGetCurrentContext
v4: cover more checks, add dummyBuffer, dummyVtable (Emil)

Signed-off-by: Bernard Kilarski <bernard.r.kilarski@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2e3f067458)
2016-09-01 11:39:42 +01:00
Ilia Mirkin
07df4bf0c8 nv50,nvc0: fix depth range when halfz is enabled
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5c1ccd8053)
2016-09-01 11:39:42 +01:00
Ilia Mirkin
32c009b116 gallium/util: add helper to compute zmin/zmax for a viewport state
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c85b7f0e87)
2016-09-01 11:39:42 +01:00
Ilia Mirkin
ea29312a79 vbo: allow DrawElementsBaseVertex in display lists
Looks like it was missed originally. The multi version is there already.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97331
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 68b64f32e8)
2016-09-01 11:39:42 +01:00
Kenneth Graunke
bc40bc5527 glsl: Fix invariant matching in GLSL 4.30 and GLSL ES 1.00.
Old languages (GLSL <= 4.20 and GLSL ES 1.00) require "invariant"
to be specified on both inputs and outputs, and match when linking.

New languages only allow outputs to be qualified as "invariant"
and remove the "invariant must match" restriction when linking
varyings (because no input can have that qualifier).

Commit 426a50e208 introduced the new
behavior for ES 3.00.  It also removed the "must match" restriction
for ES 1.00 shaders, which I believe is incorrect.  This patch adds
that back, as well as making 4.30+ follow the new rules.

Thanks to Qiankun Miao for noticing this discrepancy.

Fixes a WebGL 2.0 conformance test when run in Chromium:
https://www.khronos.org/registry/webgl/sdk/tests/deqp/data/gles3/shaders/qualification_order.html?webglVersion=2

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96971
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit f9f462936a)
2016-09-01 11:39:42 +01:00
Ian Romanick
0c25b3e4b0 glcpp: Only disallow #undef of pre-defined macros on GLSL ES >= 3.00 shaders
Section 3.4 (Preprocessor) of the GLSL ES 3.00 spec says:

   It is an error to undefine or to redefine a built-in (pre-defined)
   macro name.

The GLSL ES 1.00 spec does not contain this text.

Section 3.3 (Preprocessor) of the GLSL 1.30 spec says:

   #define and #undef functionality are defined as is standard for C++
   preprocessors for macro definitions both with and without macro
   parameters.

At least as far as I can tell GCC allow '#undef __FILE__'.  Furthermore,
there are desktop OpenGL conformance tests that expect '#undef
__VERSION__' and '#undef GL_core_profile' to work.

Fixes:

    GL45-CTS.shaders.preprocessor.definitions.undefine_version_vertex
    GL45-CTS.shaders.preprocessor.definitions.undefine_version_fragment
    GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_vertex
    GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_fragment

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 50b49d242d)

Squashed with commit

glcpp: Update tests for new #undef of built-in macro rules.

Ian recently changed the preprocessor to allow this in most GLSL
versions, but not GLSL ES 3.00+.  This patch converts the existing
test that expects a failure to a #version 300 es shader, and adds
a #version 110 shader to make sure that it's allowed.

Fixes 'make check'.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97307
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
(cherry picked from commit 1f47f78fc3)
2016-09-01 11:39:29 +01:00
Ian Romanick
e1698fa455 glcpp: Track the actual version instead of just the version_resolved flag
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit eda6349346)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/compiler/glsl/glcpp/glcpp-parse.y
2016-09-01 10:06:24 +01:00
Jason Ekstrand
3dca5c8eb1 i965/vec4: Make opt_vector_float reset at the top of each block
The pass isn't really control-flow aware and you can get into case where it
tries to combine instructions from different blocks.  This can actually
lead to an assertion failure when removing unneeded instructions if part of
the vector is set in one block and part in another.  This prevents
regressions in the next commit.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4c3a6b07e2)
2016-09-01 10:06:24 +01:00
Marek Olšák
659d9f189c radeonsi: only set dual source blending for MRT0
This is the proper fix for Overlord and Witcher 2 hangs.

The hang condition is that 1 app must write to MRT0 and MRT1 from a pixel
shader while MRT1 is disabled in CB_TARGET_MASK (does this generate
unflushable pixel quads? I don't know), and another app (e.g. Glamor)
must enable dual source blending in both MRT0 and MRT1. The hw gets
confused, which leads to corruption and hangs.

Cc: 12.0 11.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
(cherry picked from commit 947e0614d0)
2016-09-01 10:06:23 +01:00
Nicolai Hähnle
1959b57310 radeonsi: flush TC L2 cache for indirect draw data
This fixes a bug when indirect draw data is generated by transform
feedback.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 2852dedaa0)
2016-09-01 10:06:23 +01:00
Kenneth Graunke
fce2e3b493 glsl: Fix location bias for patch variables.
We need to subtract VARYING_SLOT_PATCH0, not VARYING_SLOT_VAR0.

Since "patch" only applies to inputs and outputs, we can just handle
this once outside the switch statement, rather than replicating the
check twice and complicating the earlier conditions.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 398428f406)
2016-09-01 10:06:23 +01:00
Kenneth Graunke
236172997c glsl: Fix the program resource names of gl_TessLevelOuter/Inner[].
These are lowered to gl_TessLevel{Outer,Inner}MESA.  We need them to
appear in the program resource list with their original names and types.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 1556f16e46)
2016-09-01 10:06:23 +01:00
Kenneth Graunke
cd009c46be glsl: Delete bogus ir_set_program_inouts assert.
This assertion is bogus.  Varying structs, and arrays of structs, are
allowed by GLSL, and we can see them here.  While we currently don't
have any partial-variable support for those, simply returning false
and marking the entire thing as used is certainly legitimate.

I believe this is often swept under the rug by varying packing,
but that's disabled in certain tessellation situations.

Hit by 20 dEQP-GLES31.functional.tessellation.user_defined_io.* tests.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 4a49851da1)
2016-09-01 10:06:23 +01:00
Nanley Chery
f20168723f anv/gen7_pipeline: Set PixelShaderKillPixel for discards
According to the IVB PRM Vol2 P1, this bit must be set if a pixel shader
contains a discard instruction.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97207
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit c495c18b24)
2016-09-01 10:06:23 +01:00
Jan Ziak
6980f686d0 loader: fix memory leak in loader_dri3_open
Found via "valgrind --leak-check=full glxgears".

Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0x9b@gmail.com>
Acked-by: Boyan Ding <boyan.j.ding@gmail.com>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit fd32868590)
2016-09-01 10:06:23 +01:00
Marek Olšák
9e22182223 gallium/util: fix align64
it cut off the upper 32 bits

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
(cherry picked from commit 6db93cd167)
2016-09-01 10:06:23 +01:00
Nicolas Boichat
e82567c02b egl/dri2: Add reference count for dri2_egl_display
android.opengl.cts.WrapperTest#testGetIntegerv1 CTS test calls
eglTerminate, followed by eglReleaseThread. A similar case is
observed in this bug: https://bugs.freedesktop.org/show_bug.cgi?id=69622,
where the test calls eglTerminate, then eglMakeCurrent(dpy, NULL, NULL, NULL).

With the current code, dri2_dpy structure is freed on eglTerminate
call, so the display is not initialized when eglReleaseThread calls
MakeCurrent with NULL parameters, to unbind the context, which
causes a a segfault in drv->API.MakeCurrent (dri2_make_current),
either in glFlush or in a latter call.

eglTerminate specifies that "If contexts or surfaces associated
with display is current to any thread, they are not released until
they are no longer current as a result of eglMakeCurrent."

However, to properly free the current context/surface (i.e., call
glFlush, unbindContext, driDestroyContext), we still need the
display vtbl (and possibly an active dri dpy connection). Therefore,
we add some reference counter to dri2_egl_display, to make sure
the structure is kept allocated as long as it is required.

One drawback of this is that eglInitialize may not completely reinitialize
the display (if eglTerminate was called with a current context), however,
this seems to meet the EGL spec quite well, and does not permanently
leak any context/display even for incorrectly written apps.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 9ee683f877)

Squashed with commit

egl/dri2: dri2_make_current: Release previous context's display

eglMakeCurrent can also be used to change the active display. In that
case, we need to decrement ref_count of the previous display (possibly
destroying it), and increment it on the next display.

Also, old_dsurf/old_rsurf cannot be non-NULL if old_ctx is NULL, so
we only need to test if old_ctx is non-NULL.

v2: Save the old display before destroying the context.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97214
Fixes: 9ee683f877 (egl/dri2: Add reference count for dri2_egl_display)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reported-by: Alexandr Zelinsky <mexahotabop@w1l.ru>
Tested-by: Alexandr Zelinsky <mexahotabop@w1l.ru>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
(cherry picked from commit 78e3cea419)

Squashed with commit

egl/dri2: dri2_initialize: Do not reference-count TestOnly display

In the case where dri2_initialize is called with a TestOnly display,
the display is not actually initialized, so dri2_egl_display always
fails, and we cannot do any reference counting.

Fixes piglit spec@egl_khr_create_context@verify gl flavor (reproducible
with LIBGL_ALWAYS_SOFTWARE=1).

Fixes: 9ee683f877 (egl/dri2: Add reference count for dri2_egl_display)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reported-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 4f3f8bb59d)
2016-09-01 10:05:40 +01:00
Nicolas Boichat
6503a05f35 egl/android: Set dpy->DriverData to NULL on error
Avoid use-after-free on error.

Fixes: 9ee683f877 (egl/dri2: Add reference count for dri2_egl_display)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Tested-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit c0580f6a38)
2016-09-01 10:05:40 +01:00
Nicolas Boichat
447501914a egl/drm: Set disp->DriverData to NULL on error
Avoid use-after-free on error.

Fixes: 9ee683f877 (egl/dri2: Add reference count for dri2_egl_display)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Tested-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit a9e8fb7397)
2016-09-01 10:05:40 +01:00
Nicolas Boichat
f8a5d340e8 egl/surfaceless: Set disp->DriverData to NULL on error
Avoid use-after-free on error.

Fixes: 9ee683f877 (egl/dri2: Add reference count for dri2_egl_display)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Tested-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 0e67d86540)
2016-09-01 10:05:39 +01:00
Nicolas Boichat
6ec0c92b3c egl/wayland: Set disp->DriverData to NULL on error
Avoid use-after-free, fix spec@egl_khr_fence_sync@conformance.

Fixes: 9ee683f877 (egl/dri2: Add reference count for dri2_egl_display)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reported-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Tested-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 48fd952f28)
2016-09-01 10:05:39 +01:00
Jan Ziak
fbde508c18 egl/x11: avoid using freed memory if dri2 init fails
Found with valgrind:

==4841== Invalid read of size 4
==4841==    at 0x56BDC80: dri2_initialize (egl_dri2.c:783)
==4841==    by 0x56BAFE5: _eglMatchAndInitialize (egldriver.c:261)
==4841==    by 0x56BB15E: _eglMatchDriver (egldriver.c:295)
==4841==    by 0x56B58C9: eglInitialize (eglapi.c:480)
==4841==    by 0x4F537DC: _glfwInitEGL (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x4F4BEFB: _glfwPlatformInit (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x4F46F40: glfwInit (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x402E59: main
==4841==  Address 0x6a05824 is 148 bytes inside a block of size 480 free'd
==4841==    at 0x4C2B680: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4841==    by 0x56C2AAE: dri2_initialize_x11_swrast (platform_x11.c:1233)
==4841==    by 0x56C2AAE: dri2_initialize_x11 (platform_x11.c:1493)
==4841==    by 0x56BDCEB: dri2_initialize (egl_dri2.c:805)
==4841==    by 0x56BAFAF: _eglMatchAndInitialize (egldriver.c:261)
==4841==    by 0x56BB0C9: _eglMatchDriver (egldriver.c:292)
==4841==    by 0x56B58C9: eglInitialize (eglapi.c:480)
==4841==    by 0x4F537DC: _glfwInitEGL (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x4F4BEFB: _glfwPlatformInit (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x4F46F40: glfwInit (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x402E59: main
==4841==  Block was alloc'd at
==4841==    at 0x4C2A868: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4841==    by 0x56C2A47: dri2_initialize_x11_swrast (platform_x11.c:1171)
==4841==    by 0x56C2A47: dri2_initialize_x11 (platform_x11.c:1493)
==4841==    by 0x56BDCEB: dri2_initialize (egl_dri2.c:805)
==4841==    by 0x56BAFAF: _eglMatchAndInitialize (egldriver.c:261)
==4841==    by 0x56BB0C9: _eglMatchDriver (egldriver.c:292)
==4841==    by 0x56B58C9: eglInitialize (eglapi.c:480)
==4841==    by 0x4F537DC: _glfwInitEGL (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x4F4BEFB: _glfwPlatformInit (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x4F46F40: glfwInit (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x402E59: main

Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0x9b@gmail.com>
Fixes: 9ee683f877 (egl/dri2: Add reference count for dri2_egl_display)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolas Boichat <drinkcat@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 769ac1ec78)
2016-09-01 10:05:39 +01:00
Nicolai Hähnle
178b889823 glsl: fix optimization of discard nested multiple levels
The order of optimizations can lead to the conditional discard optimization
being applied twice to the same discard statement. In this case, we must
ensure that both conditions are applied.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96762
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 21556d86fc)
[Emil Velikov: s/get_head_raw()/head/]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/compiler/glsl/opt_conditional_discard.cpp
2016-07-28 17:05:28 +01:00
Nicolai Hähnle
7208d82dfb st_glsl_to_tgsi: only skip over slots of an input array that are present
When an application declares varying arrays but does not actually do any
indirect indexing, some array indices may end up unused in the consuming
shader, so the number of input slots that correspond to the array ends
up less than the array_size.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 185b0c15ab)
2016-07-28 14:51:32 +01:00
Jason Ekstrand
be0344f630 i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations
intel_mipmap_tree::logical_depth0 is now in number of 2D slices so we no
longer need to be multiplying by 6.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5d76690f17)
2016-07-28 14:50:32 +01:00
Nicolai Hähnle
6156d8d93e radeonsi: ensure sample locations are set for line and polygon smoothing
Since commit d938b8c, the sample locations are no longer set unconditionally,
so we need to set the atom to dirty on all chips, not just Polaris.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3d69357da9)
2016-07-28 14:49:12 +01:00
Nicolai Hähnle
3237c07e98 radeonsi: fix Polaris MSAA regression
The regression was introduced by commit d938b8c. The problem here is that in
order to use the small primitive filter, we need to explicitly set the sample
locations to 0. But the DB doesn't properly process the change of sample
locations without a flush, and so we can end up with incorrect Z values.

Instead of doing a flush, just disable the small primitive filter when MSAA
is force-disabled.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96908
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f755da0f2f)
2016-07-28 14:47:49 +01:00
Kenneth Graunke
5f09454e34 mesa: Don't call GenerateMipmap if Width or Height == 0.
One of the WebGL 2.0 conformance tests is trying to call
glGenerateMipmaps with a width and height of 0.  With the meta
implementation, this generates a "framebuffer attachment incomplete"
status, and falls back to the CPU path, calling MapTextureImage.

Except that there's no actual texture to map, and we assert fail.

There's no work to do in this case.  The test expects it to succeed,
so just return early with no error and avoid hassling the driver.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96911
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit f80bea2d80)
2016-07-28 14:46:50 +01:00
Jason Ekstrand
1951df7812 anv/pipeline: Set up point coord enables
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b33bccb519)
2016-07-28 14:45:48 +01:00
Kenneth Graunke
2f8cd4a8c3 mesa: Add GL_BGRA_EXT to the list of GenerateMipmap internal formats.
The GL_EXT_texture_format_BGRA8888 extension specification defines a
GL_BGRA_EXT unsized internal format (which is a little odd - usually
BGRA is a pixel transfer format).  The extension is written against
the ES 1.0 specification, so it's a little hard to map, but I believe
it's effectively adding it to the table used here, so we should allow
it here as well.

Note that GL_EXT_texture_format_BGRA8888 is always enabled (dummy_true),
so we don't need to check if it's enabled here.

This fixes mipmap generation in Skia and ChromeOS.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
References: https://bugs.chromium.org/p/chromium/issues/detail?id=630371
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reported-by: Stéphane Marchesin <marcheu@chromium.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit cb70773129)
2016-07-28 14:44:45 +01:00
Kenneth Graunke
2f80fb368b i965: Fix shared atomic intrinsics to pay attention to base.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 76e161056a)
2016-07-28 14:43:35 +01:00
Kenneth Graunke
e1c20919a8 nir: Add a base const_index to shared atomic intrinsics.
Commit 52e75dcb8c made nir_lower_io
start using nir_intrinsic_set_base instead of writing const_index[0]
directly.  However, those intrinsics apparently don't /have/ a base,
so this caused assert failures.

However, the old code was happily setting non-existent const_index
fields, so it was pretty bogus too.

Jason pointed out that load_shared and store_shared have a base,
and that the i965 driver uses that field.  So presumably atomics
should have one as well, so that loads/stores/atomics all refer
to variables with consistent addressing.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit cf6f2d3ce7)
2016-07-28 14:40:24 +01:00
Kenneth Graunke
a94056e2c7 i965: Include VUE handles for GS with invocations > 1.
We always resort to the pull model for instanced GS inputs.  So, we'd
better include the VUE handles, or else we can't actually pull anything.

Ian reports that on his branch with OES_geometry_shader enabled,
this fixes a bunch of dEQP-GLES31.functional.geometry_shading tests::

- instanced.draw_2_instances_geometry_2_invocations
- instanced.draw_2_instances_geometry_8_invocations
- instanced.draw_4_instances_geometry_2_invocations
- instanced.draw_4_instances_geometry_8_invocations
- instanced.draw_8_instances_geometry_2_invocations
- instanced.draw_8_instances_geometry_8_invocations
- instanced.geometry_2_invocations
- instanced.geometry_32_invocations
- instanced.geometry_8_invocations
- instanced.geometry_max_invocations
- instanced.geometry_output_different_2_invocations
- instanced.geometry_output_different_32_invocations
- instanced.geometry_output_different_8_invocations
- instanced.geometry_output_different_max_invocations
- instanced.invocation_output_vary_by_attribute
- instanced.invocation_output_vary_by_texture
- instanced.invocation_output_vary_by_uniform
- query.primitives_generated_instanced

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 2db357e4c3)
2016-07-28 14:21:29 +01:00
Chuck Atkins
d70f97784b swr: Refactor checks for compiler feature flags
Encapsulate the test for which flags are needed to get a compiler to
support certain features.  Along with this, give various options to try
for AVX and AVX2 support.  Ideally we want to use specific instruction
set feature flags, like -mavx2 for instance instead of -march=haswell,
but the flags required for certain compilers are different.  This
allows, for AVX2 for instance, GCC to use -mavx2 -mfma -mbmi2 -mf16c
while the Intel compiler which doesn't support those flags can fall
back to using -march=core-avx2.

This addresses a bug where the Intel compiler will silently ignore the
AVX2 instruction feature flags and then potentially fail to build.

v2: Pass preprocessor-check argument as true-state instead of
    false-state for clarity.
v3: Reduce AVX2 define test to just __AVX2__.  Additional defines suchas
    __FMA__, __BMI2__, and __F16C__ appear to be inconsistently defined
    w.r.t thier availability.
v4: Fix C++11 flags being added globally and add more logic to
    swr_require_cxx_feature_flags

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Tested-by: Tim Rowley <timothy.o.rowley@Intel.com>
Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
(cherry picked from commit c1bf6692be)
2016-07-27 15:15:34 +01:00
Tim Rowley
cd10b86026 swr: switch from overriding -march to selecting features
Acked-by: Chuck Atkins <chuck.atkins@kitware.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
(cherry picked from commit 5a64549f54)
2016-07-27 15:14:57 +01:00
Marek Olšák
3b4c74963a winsys/amdgpu: disallow DCC with mipmaps
It has never been implemented. master will get a different fix.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96381

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
2016-07-27 14:37:01 +01:00
Samuel Pitoiset
faa432c0b6 nvc0: upload sample locations on GM20x
This fixes a bunch of multisample piglit tests on GM206, like
bin/arb_texture_multisample-texelfetch 2 -auto -fbo

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit e7b2ce5fd8)
[Emil Velikov: resolve conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
2016-07-27 14:21:10 +01:00
Rob Herring
271eee2464 Android: add missing u_math.h include path for libmesa_isl
Commit 87d062a940 ("i965: Fix shared local memory size for Gen9+.")
added u_math.h include which broke the Android build:

In file included from external/mesa3d/src/intel/isl/isl_storage_image.c:25:
In file included from external/mesa3d/src/mesa/drivers/dri/i965/brw_compiler.h:29:
external/mesa3d/src/mesa/main/macros.h:35:10: fatal error: 'util/u_math.h' file not found
         ^

Add the missing include paths for libmesa_isl.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Kenneth Garunke <kenneth@whitecape.org>
Nominated-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 789ed13284)
2016-07-27 14:06:27 +01:00
Matt Turner
fd2312e745 mapi: Massage code to allow clang to compile.
According to https://llvm.org/bugs/show_bug.cgi?id=19778#c3 this code
was violating the spec, resulting in it failing to compile.

Cc: mesa-stable@lists.freedesktop.org
Co-authored-by: Tomasz Paweł Gajc <tpgxyz@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89599
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 5ec140c17b)

Squashed with commit:

mapi: fix typo in macro name

Fixes: 5ec140c17b ("mapi: Massage code to allow clang to compile.")
Reported-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 4da9f7e7ce)
2016-07-27 11:07:53 +01:00
Jason Ekstrand
67032af87a nir/inline: Constant-initialize local variables in the callee if needed
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9d503aea06)
2016-07-21 16:08:59 +01:00
Jason Ekstrand
f9964dd2c6 nir: Add a nir_deref_foreach_leaf helper
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dc9f2436c3)
2016-07-21 16:06:08 +01:00
Kenneth Graunke
1c9412bb1a anv: Properly call gen75_emit_state_base_address on Haswell.
This should fix MOCS values.  Caught by Coverity.

CID: 1364155

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit e614062e54)
2016-07-21 16:05:12 +01:00
Kenneth Graunke
726f47e495 genxml: Rename "API Rendering Disable" to "Rendering Disable".
Gen7/7.5 call it "Rendering Disable" while Gen8/9 prefix it with "API".

Pick one for consistency, and so we can share code between generations.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 87660579f5)
2016-07-21 16:04:04 +01:00
Kenneth Graunke
1dd0c22ab0 anv: Unify 3DSTATE_CLIP code across generations.
The bulk of this is the same.  There are just a couple fields that only
exist on one generation or another, and we can easily handle those with
an #ifdef.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit bfd9942cdc)
2016-07-21 16:03:03 +01:00
Kenneth Graunke
a00081a9e8 anv: Enable early culling on Gen7.
We set the cull mode, but forgot the enable bit.  Gen8 uses this.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 44502afd82)
2016-07-21 16:02:06 +01:00
Kenneth Graunke
626f21051b anv: Fix near plane clipping on Gen7/7.5.
The Gen7/7.5 clip code used APIMODE_OGL, while the Gen8+ clip code used
APIMODE_D3D.  The meaning hasn't changed, so one of these must be wrong.

It appears that the hardware documentation is completely wrong.  It
claims that the "API Mode" bit means:

   0h    APIMODE_OGL    NEAR_VP boundary == 0.0 (NDC)
   1h    APIMODE_D3D    NEAR_VP boundary == -1.0 (NDC)

However, DirectX typically uses 0.0 for the near plane, while unextended
OpenGL uses -1.0.  i965's gen6_clip_state.c uses APIMODE_D3D for the
GL_ZERO_TO_ONE case, so I believe the meanings are backwards from what
the documentation says.

Section 23.2 ("Primitive Clipping") of the Vulkan 1.0.21 specification
contains the following equations:

   -w_c <= x_c <= w_c
   -w_c <= y_c <= w_c
      0 <= z_c <= w_c

This means that Vulkan follows D3D semantics.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 0d77f08042)
2016-07-21 16:01:05 +01:00
Kenneth Graunke
629a7b32e0 genxml: Add APIMODE_D3D missing enum values and improve consistency.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 6b67270262)
2016-07-21 15:59:45 +01:00
Kenneth Graunke
749e4cb96b genxml: Add CLIPMODE_* prefix to 3DSTATE_CLIP's "Clip Mode" enum values.
Gen6-7.5 use CLIPMODE_REJECT_ALL, while Gen8+ just used REJECT_ALL.
Being consistent will let me unify code, and I prefer having the prefix.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit c31cf532af)
2016-07-21 15:58:37 +01:00
Jason Ekstrand
48e9ecc47f i965/miptree: Set logical_depth0 == 6 for cube maps
This matches what we do for cube maps where logical_depth0 is in number of
face-layers rather than number of cubes.  This does mean that we will
temporarily be setting the surface bounds too loose for cube map textures
but we are already setting them too loose for cube arrays and we will be
fixing that in the next commit anyway.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Cc: "12.0 11.2 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e19b7f7f1b)
2016-07-21 15:57:39 +01:00
Jason Ekstrand
57c1d0ea07 i965/miptree: Enforce that height == 1 for 1-D array textures
The GL API and mesa internals do this differently than we do.  In GL, there
is no depth parameter for 1-D arrays and height is used.  In the i965
miptree code we do the sane thing and make height == 1 and use depth for
number of slices.  This makes for a mismatch every time we create a 1-D
array texture from GL.  Instead of actually solving this problem, we just
said "1-D is hard, let's make sure it works no matter which way we pass the
parameters" and called it a day.

This commit fixes the one GL -> i965 transition point where we weren't
already handling 1-D array textures to do the right thing and then replaces
the magic fixup code with an assert that you're doing the right thing.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Cc: "12.0 11.2 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d4d505d0b0)
2016-07-21 14:32:33 +01:00
Stefan Dirsch
fb8b548ac1 Avoid overflow in 'last' variable of FindGLXFunction(...)
This 'last' variable used in FindGLXFunction(...) may become negative,
but has been defined as unsigned int resulting in an overflow,
finally resulting in a segfault when accessing _glXDispatchTableStrings[...].
Fixed this by definining it as signed int. 'first' variable also needs to be
defined as signed int. Otherwise condition for while loop fails due to C
implicitly converting signed to unsigned values before comparison.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Stefan Dirsch <sndirsch@suse.de>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 27ef7bfd6c)
2016-07-21 14:31:27 +01:00
Tomasz Figa
92799eee93 egl/android: Stop leaking DRI images
Current implementation of the DRI image loader does not free the images
created in get_back_bo() and so leaks memory. Moreover, it creates a new
image every time the DRI driver queries for buffers, even if the backing
native buffer has not changed. leaking memory again.

This patch adds missing call to destroyImage() in droid_enqueue_buffer()
and a check if image is already created to get_back_bo() to fix the
above.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 9e1248d075)
2016-07-21 14:29:40 +01:00
Tomasz Figa
b717b49286 egl/android: Check return value of dri2_get_dri_config()
It might return NULL if specific config variant is unsupported.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 94282b6dd0)
2016-07-21 14:29:40 +01:00
Emil Velikov
bb0c49bf76 i965: store reference to the context within struct brw_fence (v2)
As the spec allows for {server,client}_wait_sync to be called without
currently bound context, while our implementation requires context
pointer.

v2: Add a mutex and acquire it for the duration of
    brw_fence_client_wait() and brw_fence_is_completed() as suggested
    by Chad.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
(cherry picked from commit 4f48674d51)
2016-07-21 14:29:40 +01:00
Nicolas Boichat
de695014eb egl/dri2: dri2_make_current: Set EGL error if bindContext fails
Without this, if a configuration is, say, available only on GLES2/3, but
not on GLES1, and is rejected by the dri module's bindContext call,
eglMakeCurrent fails with error "EGL_SUCCESS".

In this patch, we set error to EGL_BAD_MATCH, which is what CTS/dEQP
dEQP-EGL.functional.surfaceless_context expect.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 9bebef4034)
2016-07-21 14:29:40 +01:00
Tomasz Figa
67e04622d8 egl/android: Remove unused variables
There are some unused variables left after previous clean-ups triggering
compiler warnings. Let's remove them.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ccda100a5a)
2016-07-21 14:29:40 +01:00
Haixia Shi
fde7fc1ab2 platform_android: prevent deadlock in droid_swap_buffers
To avoid blocking other EGL calls, release the display mutex before
we enqueue buffer to android frameworks and re-acquire the mutex
upon return.

v2: moved lock/unlock inside droid_window_enqueue_buffer().

TEST=verify pinch zoom in Photos app no longer causes hangs

Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 1ea233c6f3)
2016-07-21 14:29:40 +01:00
Tomasz Figa
abaf0e9817 gallium/dri: Add shared glapi to LIBADD on Android
An earlier patch fixed the problem for classic drivers, however Gallium
was still left broken. This patch applies the same workaround to
Gallium, when compiled for Android. Following is a quote from the
original patch:

0cbc90c57c mesa: dri: Add shared glapi to LIBADD on Android

/system/vendor/lib/dri/*_dri.so actually depend on libglapi: without
this, loading the so file fails with:
cannot locate symbol "__emutls_v._glapi_tls_Context"

On non-Android (non-bionic) platform, EGL uses the following
workflow, which works fine:
  dlopen("libglapi.so", RTLD_LAZY | RTLD_GLOBAL);
  dlopen("dri/<driver>_dri.so", RTLD_NOW | RTLD_GLOBAL);

However, bionic does not respect the RTLD_GLOBAL flag, and the dri
library cannot find symbols in libglapi.so, so we need to link
to libglapi.so explicitly. Android.mk already does this.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 70a28afb29)
2016-07-21 14:01:15 +01:00
Emil Velikov
485c6d231e mesa: scons: list builddir before srcdir
Analogous to previous commit.

Note: scons always uses OOT builds, while the in-tree generated files
could be created either manually or by the autoconf build.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 1c7c0d77ac)
[Emil Velikov: remove not-applicable "Dir('main')" entry]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/SConscript
2016-07-21 13:59:25 +01:00
Emil Velikov
1f3ce210cd mesa: automake: list builddir before srcdir
In the case of building in out-of-tree fashion, while having generated
in-tree sources, the latter [likely stale] files will be used.

Flip the order to prevent that.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit eafa82e20e)
2016-07-21 13:53:16 +01:00
Samuel Pitoiset
f123f574fa gm107/ir: make use of ADD32I for all immediates
ADD only allows to emit 19-bits immediates.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9c63224540)
2016-07-21 13:51:55 +01:00
Samuel Pitoiset
bbb0587c78 gm107/ir: add missing NEG modifier for IADD32I
Like FADD32I, the NEG modifier of src0 is at position 56.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 0904a2ba97)
2016-07-21 13:50:33 +01:00
Andreas Boll
fb4ab871a8 configure.ac: Use ${datarootdir} for --with-vulkan-icddir help string too
The help string wasn't updated in cbc37f7.

Fixes: cbc37f7 ("anv: install the intel_icd.json to ${datarootdir} by
default")

Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit d66cb7c84f)
2016-07-21 13:49:34 +01:00
Ilia Mirkin
53bb4e0354 nv50,nvc0: srgb rendering is only available for rgba/bgra
Mark both L8_SRGB and L8A8_SRGB as non-renderable (the latter already
didn't have the bind flags). This makes the state tracker pick a
different format when rendering is required, or mark the fb as
incomplete. This fixes:

  bin/getteximage-formats init-by-clear-and-render -auto -fbo
  bin/getteximage-formats init-by-rendering -auto -fbo

which previously ran into srgb-encoding differences.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ed9dd3bcd9)
2016-07-21 13:48:16 +01:00
Leo Liu
aeb3ca9754 vl/dri3: fix a memory leak from front buffer
Inspired by fix for mem leak of vdpau interop, resource_from_handle
set texture reference count, that need to be decreased and released,
recall there is a similar case for DRI3, that is with VA-API glx
extension, there is temporary TFP(texture from pixmap), we target it
through dma-buf. leak happens when without count down the reference.

Checked and found with mpv vo=opengl case, there only one static TFP,
the leak happens once, but for totem player using gstreamer VA-API glx,
the dynamic TFP for each frame, so leak quite a bit.

This fixes mem leak for mpv and totem.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 134d6e4e4f)
2016-07-21 13:46:55 +01:00
Jason Ekstrand
b1b601fc7c anv: Handle VK_WHOLE_SIZE properly for buffer views
The old calculation, which used view->offset, encorporated buffer->offset
into the size calculation where it doesn't belong.  This meant that, if
buffer->offset > buffer->size, you would always get a negative size.  This
fixes 170 dEQP-VK.renderpass.attachment.* Vulkan CTS tests on Haswell.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 593731ea3c)
[Emil Velikov: s|bpb / 8|bs|g]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/intel/vulkan/anv_image.c
2016-07-21 13:46:02 +01:00
Jason Ekstrand
7441632753 anv: Add an align_down_npot_u32 helper
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 827405f072)
2016-07-21 12:31:28 +01:00
Jason Ekstrand
56d6f64206 anv: Enable independentBlend on gen7
We can totally do it, we were just only setting up one BLEND_STATE and, now
that the code is unified with gen8, we should be handling it correctly.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f124f4a394)
2016-07-21 12:30:32 +01:00
Jason Ekstrand
f194c84b37 anv/pipeline: Unify blend state setup between gen7 and gen8
This fixes all 674 broken dEQP-VK.pipeline.blend Vulkan CTS tests on
Haswell.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a2e7b2e653)
2016-07-21 12:29:34 +01:00
Jason Ekstrand
cb9a2a4b85 genxml: Make gen6-7 blending look more like gen8
This renames BLEND_STATE to BLEND_STATE_ENTRY and adds an new struct
BLEND_STATE which is just an array of 8 BLEND_STATE_ENTRYs.  This will make
it much easier to write gen-agnostic blend handling code.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit aaa202ebe7)
2016-07-21 12:28:20 +01:00
Brian Paul
9798eb14da mesa: use _mesa_clear_texture_image() in clear_texture_fields()
This avoids a failed assert(img->_BaseFormat != -1) in
init_teximage_fields_ms() because the internalFormat argument is GL_NONE.
This was hit when using glTexStorage() to do a proxy texture test.

Fixes a failure with the updated Piglit tex3d-maxsize test.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit e477d92c94)
2016-07-21 12:27:27 +01:00
Nanley Chery
60e41ca10a isl: Fix isl_tiling_is_any_y()
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 00caba4152)
2016-07-21 12:25:54 +01:00
Nanley Chery
826117a1c4 anv/device: Fix max buffer range limits
Set limits that are consistent with ISL's assertions in
isl_genX(buffer_fill_state_s)() and Anvil's format-DescriptorType
mapping in anv_isl_format_for_descriptor_type().

Fixes the following new crucible tests:
* stress.limits.buffer-update.range.uniform
* stress.limits.buffer-update.range.storage

These tests are in this patch: https://patchwork.freedesktop.org/patch/98726/

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit a5748cb920)
2016-07-21 12:24:55 +01:00
Nanley Chery
f334b6deaa isl: Fix assert on raw buffer surface state size
See inline PRM reference.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 028f6d8317)
2016-07-21 12:23:58 +01:00
Nanley Chery
30c6bff143 anv/descriptor_set: Fix binding partly undefined descriptor sets
Section 13.2.3. of the Vulkan spec requires that implementations be able to
bind sparsely-defined Descriptor Sets without any errors or exceptions.

When binding a descriptor set that contains a dynamic buffer binding/descriptor,
the driver attempts to dereference the descriptor's buffer_view field if it is
non-NULL. It currently segfaults on undefined descriptors as this field is never
zero-initialized. Zero undefined descriptors to avoid segfaulting. This
solution was suggested by Jason Ekstrand.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96850
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit fd16e64321)
2016-07-21 12:23:00 +01:00
Brian Paul
474b169c1f svga: handle mismatched number of samplers, sampler views
in svga_init_shader_key_common().  Since the CSO module only tracks
sampler views for fragment shaders, the number of samplers and sampler
views can be mismatched for other types of shaders.  This situation
triggered an assertion in Chrome with maps.google.com

This patch adds defensive code to handle that situation.

Fixes VMware bug 1694027
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>

(cherry picked from commit 50a669de4e)
2016-07-21 12:21:57 +01:00
Leo Liu
6deeccf5aa st/omx/enc: check uninitialized list from task release
The uninitialized list should be checked and returned.

Thank Julien for the notification and suggested fix.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b9d10e79c8)
2016-07-21 12:20:56 +01:00
Jason Ekstrand
bf11931c95 glsl/types: Use _mesa_hash_data for hashing function types
This is way better than the stupid string approach especially since you
could overflow the string.  Again, I thought I had something better at one
point but it obviously got lost.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b919100d61)
2016-07-21 12:19:38 +01:00
Jason Ekstrand
17e1b016fc glsl/types: Fix function type comparison function
It was returning true if the function types have different lengths rather
than false.  This was new with the SPIR-V to NIR pass and I thought I'd
fixed it a while ago but it may have gotten lost in rebasing somewhere.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 11ac1c4dbb)
2016-07-21 12:18:19 +01:00
Christian König
b0d1395480 st/mesa: fix reference counting bug in st_vdpau
Otherwise we leak the resources created for the DMA-buf descriptors.

Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Tested-and-Reviewed by: Leo Liu <leo.liu@amd.com>
Ack-by: Tom St Denis <tom.stdenis@amd.com>

(cherry picked from commit 9ce52baf7f)
2016-07-21 12:17:21 +01:00
Jason Ekstrand
ad09c8142f anv: Add a stub for CmdCopyQueryPoolResults on Ivy Bridge
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b9e99282a6)
2016-07-21 12:16:14 +01:00
Tim Rowley
2e010ab1cc Revert "gallium: Force blend color to 16-byte alignment"
This reverts commit d8d6091a84.

Heap allocations may be only 8-byte aligned on 32-bit system, and so having
members with 16-byte alignment (such as in the case where pipe_blend_color is
embedded in radeonsi's si_context) is undefined behavior which indeed causes
crashes when compiled with gcc -O3.

Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96835
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
Acked-by: Chuck Atkins <chuck.atkins@kitware.com>
(cherry picked from commit 29f53d7937)
2016-07-21 12:08:18 +01:00
Jason Ekstrand
0aae486a8b nir/spirv: Don't multiply the push constant block size by 4
I have no idea why we were multiplying by 4 before.  The offsets we get
from SPIR-V are in bytes and so is nir->num_uniforms so there's no need to
do any adjustment whatsoever.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 49476576dd)
2016-07-21 12:07:01 +01:00
Marek Olšák
605063953d radeonsi: add a workaround for a compute VGPR-usage LLVM bug
v2: use abort(), describe which LLVM version is affected

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit d227dbe272)
2016-07-21 12:05:48 +01:00
Marek Olšák
cbdbf67f1c glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast
This bug is uncovered by glsl/lower_if_to_cond_assign.
I don't know if it can be reproduced in any other way.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit ead7736821)
2016-07-21 12:04:46 +01:00
Ilia Mirkin
a2aec66444 mesa: set _NEW_BUFFERS when updating texture bound to current buffers
When a glTexImage call updates the parameters of a currently bound
framebuffer, we might miss out on revalidating whether it is complete.
Make sure to set _NEW_BUFFERS which will trigger the revalidation in
that case.

Also while we're at it, fix the fb parameter passed in to the eventual
RenderTexture call.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94148
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
(cherry picked from commit da7223ebdc)
2016-07-21 12:03:44 +01:00
Ilia Mirkin
1ed8237b1a st/mesa: return appropriate mesa format for ETC texture formats
Even when the backend driver does not support ETC formats, we handle the
decoding into an uncompressed backing texture. However as far as core
mesa is concerned, it's an ETC texture and we should return the relevant
ETC mesa format. This condition can get hit when using glTexStorage to
create the texture object.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 00d4315d37)
2016-07-21 12:02:43 +01:00
Ilia Mirkin
e7de53fefd mesa: etc2 online compression is unsupported, don't attempt it
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8ee3cdde04)
2016-07-21 12:00:55 +01:00
Samuel Pitoiset
76a2950c1e nvc0: fix the driver cb size when draw parameters are used
The size of the driver constant buffer for each stage should be 2048
and not 512 because it has been increased recently for buffers/images.
While we are at it, do the same change for indirect draws.

This fixes all ARB_shader_draw_parameters tests on GM107.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 31a615677b)
2016-07-21 11:59:33 +01:00
Samuel Pitoiset
d6c387933d nvc0/ir: fix images indirect access on Fermi
This fixes the following piglits:

arb_arrays_of_arrays-basic-imagestore-mixed-const-non-const-uniform-index
arb_arrays_of_arrays-basic-imagestore-mixed-const-non-const-uniform-index2

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 19d0450b27)
2016-07-21 11:57:57 +01:00
Nicolai Hähnle
60eabe9ad3 radeonsi: explicitly choose center locations for 1xAA on Polaris
Unlike SC, the small primitive filter does not automatically use center
locations in 1xAA mode, so this is needed to avoid artifacts caused by
the small primitive filter discarding triangles that it shouldn't.

As a side effect of how the effective number of samples is now calculated,
this patch also avoids submitting the sample locations for line/poly smoothing
when they're not really needed.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit d938b8c0bf)
2016-07-21 11:56:25 +01:00
Francisco Jerez
04584d5835 i965: Fix remaining flush vs invalidate race conditions in brw_emit_pipe_control_flush.
This hardware race condition has caused problems several times already
(see "i965: Fix cache pollution race during L3 partitioning set-up.",
"i965: Fix brw_render_cache_set_check_flush's PIPE_CONTROLs." and
"i965: intel_texture_barrier reimplemented").  The problem is that
whenever we attempt to both flush and invalidate multiple caches with
a single pipe control command the flush and invalidation happen in
reverse order, so the contents flushed from the R/W caches aren't
guaranteed to become visible from the invalidated caches after the
PIPE_CONTROL command completes execution if some concurrent rendering
workload happened to pollute any of the invalidated R/O caches in the
short window of time between the invalidation and flush.

This makes sure that brw_emit_pipe_control_flush() has the effect
expected by most callers of making the contents flushed from any R/W
caches visible from the invalidated R/O caches.

Cc: "12.0 11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 37b901003b)
2016-07-21 11:54:41 +01:00
Francisco Jerez
2035ac24ce i965: Make room in the batch epilogue for three more pipe controls.
Review carefully, it sucks to have to keep track of the number of
command packet dwords emitted in the batch epilogue manually.  The
MI_REPORT_PERF_COUNT_BATCH_DWORDS calculation was obviously wrong.

Cc: "12.0 11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 0bd3a121c6)
2016-07-21 11:52:56 +01:00
Francisco Jerez
72d287e347 i965: Emit SKL VF cache invalidation W/A from brw_emit_pipe_control_flush.
There were two places in the driver doing a pipe control VF cache
flush, one of them was missing this workaround, move it down into
brw_emit_pipe_control_flush to make sure we don't miss it again.

Cc: "12.0 11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
(cherry picked from commit a10879f48c)
2016-07-21 11:51:53 +01:00
Ian Romanick
3a35da7e8a glsl: Pack integer and double varyings as flat even if interpolation mode is none
v2: Also update varying_matches::compute_packing_class().  Suggested by
Timothy Arceri.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Gregory Hainaut <gregory.hainaut@gmail.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 3119871bd9)
2016-07-21 11:48:58 +01:00
Ian Romanick
bc68532a06 mesa: Strip arrayness from interface block names in some IO validation
Outputs from the vertex shader need to be able to match
per-vertex-arrayed inputs of later stages.  Acomplish this by stripping
one level of arrayness from the names and types of outputs going to a
per-vertex-arrayed stage.

v2: Add missing checks for TESS_EVAL->GEOMETRY.  Noticed by Timothy
Arceri.

v3: Use a slightly simpler stage check suggested by Ilia.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Gregory Hainaut <gregory.hainaut@gmail.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 73a6a4ce49)
2016-07-21 11:30:46 +01:00
Emil Velikov
edfc17a19a docs: add sha256 checksums for 12.0.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-07-09 00:02:13 +01:00
Emil Velikov
04277f058d docs: add release notes for 12.0.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-07-08 23:48:58 +01:00
Emil Velikov
a2770e55a2 Update version to 12.0.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-07-08 23:48:11 +01:00
Emil Velikov
a705f82a56 radeon: reference the correct cdw/max_dw
With commit f41f78cda1 ("radeonsi: drop the DRAW_PREAMBLE packet on
Polaris") we failed to attribute that the separate current/prev
radeon_winsys_cs_chunk(s) are not applicable/available in branch.

The latter of which introduced with commit 89ba076de4 ("radeon/winsys:
introduce radeon_winsys_cs_chunk").

Just drop "current." from the respective places to get things up and
running again.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96864
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-07-08 23:48:11 +01:00
Emil Velikov
3a146a789c docs: add sha256 checksums for 12.0.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-07-08 23:48:11 +01:00
Emil Velikov
8b06176f31 docs: Update 12.0.0 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-07-08 14:45:24 +01:00
Emil Velikov
ab8938817f Update version to 12.0.0(final)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-07-08 14:43:18 +01:00
Neha Bhende
88a095962f svga: Fix failures caused in fedora 24
SVGA_3D_CMD_DX_GENRATE_MIPMAP & SVGA_3D_CMD_DX_SET_PREDICATION commands
are not presents in fedora 24 kernel module. Because of this
reason application like supertuxkart are not running.

v2: Add few comments and code modifications suggested by Brian P.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit 7988513ac3)
2016-07-08 14:29:58 +01:00
Ilia Mirkin
450f076482 glsl: don't try to lower non-gl builtins as if they were gl_FragData
If a shader has an output array, it will get treated as though it were
gl_FragData and rewritten into gl_out_FragData instances. We only want
this to happen on the actual gl_FragData and not everything else.

This is a small part of the problem pointed out by the below bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a37e46323c)
2016-07-08 14:29:50 +01:00
Emil Velikov
168fdc6a07 bugzilla_mesa.sh: Drop "Bug " from sed command
After a recent Bugzilla update the word is no longer in the title. Thus
the script ended up producing bogus HTML.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f35f8464ec)
2016-07-07 16:12:34 +01:00
Akihiko Odaki
5193fe9f4f mesa: don't install GLX files if GLX is not built
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Akihiko Odaki <akihiko.odaki.4i@stu.hosei.ac.jp>
[Emil Velikov: Drop guards around dri_interface.h, add stable tag]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

(cherry picked from commit 42968424fb)
2016-07-07 16:12:34 +01:00
Mathias Fröhlich
4a3d510b5b osmesa: Export OSMesaCreateContextAttribs.
Since the function is exported like any other
public api function and put in the header
as if you could link against it, export it also
from shared objects.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 13affe0d3f)
2016-07-07 16:12:34 +01:00
Rob Clark
63b7c6ffc8 glsl: add driconf to zero-init unintialized vars
Some games are sloppy.. perhaps because it is defined behavior for DX or
perhaps because nv blob driver defaults things to zero.

So add driconf param to force uninitialized variables to default to zero.

This issue was observed with rust, from steam store.  But has surfaced
elsewhere in the past.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f78a6b1ce3)
2016-07-07 16:12:33 +01:00
Rob Clark
3276443935 i965: don't drop const initializers in vector splitting
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 01ccb0d91e)
2016-07-07 16:12:33 +01:00
Rob Clark
8a700c562c freedreno: fix crash on smaller gpus and higher resolutions
Devices with smaller GMEM size need more tiles.  On db410c at 2048x1152,
glmark2 shadow needed ~330 tiles for fullscreen.  Lets bump it up to
512.  (Maybe with MRT you could end up needing more, but at that point
things are probably going to be painfully slow.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 7295428e41)
2016-07-07 16:12:33 +01:00
Emil Velikov
dcc4df858a anv: vulkan: remove the anv_device.$(OBJEXT) rule
Atm the actual rule will expand to foo.o which is used for static
libraries only.

Thus the automake manual recommendation [to use OBJEXT] won't help us,
since since we're working with a shared library.

Thus let's 'demote' the file and add it back to BUILT_SOURCES. This will
manage all the complexity for us, at the (existing expense) of working
only with the all, check and install targets.

The crazy (why the issue was hard to spot):
If the dependencies (.deps/*.Plo) are already created one can alter the
anv_device.$(OBJEXT) line and/or nuke it all together. That won't lead
to any warnings/issues, even though the Makefile is regenerated.

Moral of the story:
Always rm -rf top_builddir or don't resolve the dependencies manually
and use BUILT_SOURCES.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96825
Fixes: d7a604c3f7a ("anv: use cache uuid based on the build timestamp.")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
(cherry picked from commit 9618e2a24c)
2016-07-07 16:12:33 +01:00
Emil Velikov
dbb4c3c7c8 anv: install the intel_icd.json to ${datarootdir} by default
As mentioned by the spec (and used by Archlinux and Debian) default to
${datarootdir} as opposed to ${sysconfdir} for the default location.

Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit cbc37f72e3)
2016-07-07 16:12:33 +01:00
Emil Velikov
7af5c2834c swr: automake: don't ship LLVM version specific generated sources
Otherwise things will fail to build, if the builder is using another
version of LLVM.

v2: annotate all the dependencies of builder_gen.h
v3: clean the generated files as needed
v4: comment cleanups (Tim)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Tim Rowley <timothy.o.rowley@intel.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com> (v2)
Reported-by: Chuck Atkins <chuck.atkins@kitware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 744d0d8f3b)
2016-07-07 16:12:33 +01:00
Emil Velikov
0e5e20dca0 automake: don't mandate git_sha1.h/MESA_GIT_SHA1
It has proven subtle to get it right both from the build side POV (see
commit list below) and builders due to their varying workflows.

Furthermore it does not fully fulfil the reason why it was enforced -
to detect uniqueness between different builds, in order to distinguish
and invalidate Vulkan/GL caches.

With that having a much better solution (previous commit) we can drop
this solution.

This effectively reverts the following commits:
359d9dfec3 ("mesa: automake: add directory prefix for git_sha1.h")
2c424e00c3 ("mesa: automake: ensure that git_sha1.h.tmp has the right
attributes")
b7f7ec7843 ("mesa: automake: distclean git_sha1.h when building OOT")
8229fe68b5 ("automake: get in-tree `make distclean' working again.")

Cc: Timo Aaltonen <tjaalton@debian.org>
Cc: Haixia Shi <hshi@chromium.org>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 22e9357028)
2016-07-07 16:12:33 +01:00
Emil Velikov
cc2c350416 anv: use cache uuid based on the build timestamp.
Do not rely on the git sha1:
 - its current truncated form makes it less unique
 - it does not attribute for local (Vulkand or otherwise) changes

Use a timestamp produced at the time of build. It's perfectly unique,
unless someone explicitly thinkers with their system clock. Even then
chances of producing the exact same one are very small, if not zero.

v2: Remove .tmp rule. Its not needed since we want for the header to be
regenerated on each time we call make (Eric).

v3:
 - Honour SOURCE_DATE_EPOCH, to make the build reproducible (Michel)
 - Replace the generated header with a define, to prevent needless
builds on consecutive `make' and/or `make install' calls. (Dave)

v4:
 - Keep the timestamp generation at make time. (Jason)

v5:
 - Ensure that file is regenerated on incremental builds.

Cc: Michel Dänzer <michel@daenzer.net>
Cc: Dave Airlie <airlied@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit addb099ce8)
2016-07-07 16:12:33 +01:00
Emil Velikov
66fe2be1f5 clover: conditionally use MESA_GIT_SHA1
Considering how hard/annoying it was for many peoples' workflow to
properly generate the macro, it will be demoted to conditionally
available with follow-up commits.

v2: Kill off gracious blank line (Vedran).

Cc: mesa-stable@lists.freedesktop.org
Cc: Vedran Miletić <vedran@miletic.net>
Cc: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Vedran Miletić <vedran@miletic.net>
(cherry picked from commit f98530b739)
2016-07-07 16:12:33 +01:00
Dave Airlie
568ba49673 Revert "st/glsl_to_tgsi: don't increase immediate index by 1."
This reverts commit 27d456cc87.

DOH, what seems right and what is right with fp64 are always
two different things.

This regressed:
spec@arb_gpu_shader_fp64@shader_storage@layout-std140-fp64-mixed-shader
on radeonsi

Reported-by: Michel Dänzer <michel@daenzer.net>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit cb728df967)
2016-07-07 16:12:33 +01:00
Samuel Pitoiset
134523aa7d nvc0/ir: reset the base offset for indirect images accesses
In presence of an indirect image access, the base offset should be
zeroed because the stride will be computed twice. This is a pretty
rare situation but it can happen when tex.r > 0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f3b9fff3c3)
2016-07-07 16:12:33 +01:00
Samuel Pitoiset
9f364ed35e gm107/ir: fix sign bit emission for FADD32I
When emitting OP_SUB, the sign bit for FADD and FADD32I is not
at the same position. It's at position 45 for FADD but 51 for FADD32I.

This fixes the following piglit test:
tests/spec/arb_fragment_program/fdo30337b.shader_test

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cb828b7b18)
2016-07-07 16:12:32 +01:00
Lionel Landwerlin
d0e1d6b1c8 anv/wsi: create swapchain images using specified image usage
The image usage specified by the caller of vkCreateSwapchainKHR should be
passed onto the internal image creation. Otherwise the driver might later
crash when the user tries to use the image as a combined sampler even though
the creation was explicitly created with VK_IMAGE_USAGE_TRANSFER_SRC_BIT.

Leaving the previous VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT as this might be
expected even if the swapchain is created without any flag.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96791
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dbbc4fb4cc)
2016-07-07 16:12:32 +01:00
Dave Airlie
7d40db8cdb st/glsl_to_tgsi: don't increase immediate index by 1.
Immediates are stored into a separate table, and are
consolidated, so if we get an immediate we don't need
to offset it as the index it has is correct.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 27d456cc87)
2016-07-07 16:12:32 +01:00
Nicolai Hähnle
250e13e585 st/mesa: check the texture image level in st_texture_match_image
Otherwise, 1x1 images of arbitrarily high level are accepted.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96639#add_comment
Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 07cc838b10)
2016-07-07 16:12:32 +01:00
Nicolai Hähnle
5469fc9800 st/mesa: an incomplete texture may have a zero-size first image
Fixes a regression introduced by commit 42624ea83 which triggered
an assertion in
dEQP-GLES2.functional.texture.completeness.cube.not_positive_level_0

While stImage must have a non-zero size as verified by the caller, we also
look at the size of the base image in an attempt to make a better guess at
the level0 size (this is important when the base image size is odd). However,
the base image may have a zero size even when it exists.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96629
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 0ba053b34c)
2016-07-07 16:12:32 +01:00
Chuck Atkins
951be8a50c gallium: Force blend color to 16-byte alignment
This aligns the 4-element color float array to 16 byte boundaries.  This
should allow compiler vectorizers to generate better optimizations.
Also fixes broken vectorization generated by Intel compiler.

v2: Fixed indentation and added a lengthy comment explaining the
    reason for the alignment.

Cc: <mesa-stable@lists.freedesktop.org>
Reported-by: Tim Rowley <timothy.o.rowley@intel.com>
Tested-by: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit d8d6091a84)
2016-07-07 16:12:32 +01:00
Emil Velikov
b08e5a1940 Revert "swr: Refactor checks for compiler feature flags"
This reverts commit a380199e3968462da8291e8dda25888f19e86783.
2016-07-07 16:12:32 +01:00
Chuck Atkins
2ad47d912e swr: Refactor checks for compiler feature flags
Encapsulate the test for which flags are needed to get a compiler to
support certain features.  Along with this, give various options to try
for AVX and AVX2 support.  Ideally we want to use specific instruction
set feature flags, like -mavx2 for instance instead of -march=haswell,
but the flags required for certain compilers are different.  This
allows, for AVX2 for instance, GCC to use -mavx2 -mfma -mbmi2 -mf16c
while the Intel compiler which doesn't support those flags can fall
back to using -march=core-avx2.

This addresses a bug where the Intel compiler will silently ignore the
AVX2 instruction feature flags and then potentially fail to build.

v2: Pass preprocessor-check argument as true-state instead of
    false-state for clarity.
v3: Reduce AVX2 define test to just __AVX2__.  Additional defines suchas
    __FMA__, __BMI2__, and __F16C__ appear to be inconsistently defined
    w.r.t thier availability.
v4: Fix C++11 flags being added globally and add more logic to
    swr_require_cxx_feature_flags

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Tested-by: Tim Rowley <timothy.o.rowley@Intel.com>
Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
(cherry picked from commit c1bf6692be)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-07-07 16:12:32 +01:00
Ian Romanick
bb819a9e21 mapi: Export all GLES 3.1 functions in libGLESv2.so
Khronos recommends that the GLES 3.1 library also be called libGLESv2.
It also requires that functions be statically linkable from that
library.

NOTE: Mesa has supported the EGL_KHR_get_all_proc_addresses extension
since at least Mesa 10.5, so applications targeting Linux should use
eglGetProcAddress to avoid problems running binaries on systems with
older, non-GLES 3.1 libGLESv2 libraries.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Cc: Mike Gorchak <mike.gorchak.qnx@gmail.com>
Reported-by: Mike Gorchak <mike.gorchak.qnx@gmail.com>
Acked-by: Chad Versace <chad.versace@intel.com>
(cherry picked from commit 5921f372c8)
2016-07-07 16:12:32 +01:00
sonjiang
0076e14f53 radeon/uvd: fix a h265 context size bug
Fixes a h265 video corruption bug which caused by uvd fw interface changes.

Signed-off-by: sonjiang <sonny.jiang@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit b928ff6f62)
2016-07-07 16:12:32 +01:00
sonjiang
930425df1e radeon/uvd: separate uvd context buffer from DPB
Adapt driver for Polairs uvd firmware interface changes.

Signed-off-by: sonjiang <sonny.jiang@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 5c80354a23)
2016-07-07 16:12:32 +01:00
sonjiang
700c1412e7 radeon: uvd add uvd fw version for amdgpu
Because Polaris uvd fw interface changes, the driver need to check fw version
to apply right interface. This change is to add uvd fw version.

Signed-off-by: sonjiang <sonny.jiang@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 28f85eab49)
[Emil Velikov: resolve trivial s/bool/boolean/ conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeon/radeon_winsys.h
2016-07-07 16:12:31 +01:00
Samuel Pitoiset
1eaba7b5b3 gm107/ir: make sure that flagsDef is set when emitting setcond
Rely on the existence of a second destination when emitting a setcond
flag is dangerous, because this doesn't mean that the flag has been
correctly set. Instead rely on flagsDef like what emitX() does
for flagsSrc.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cc97b6a34a)
2016-07-07 16:12:31 +01:00
Marek Olšák
684e555aaa radeonsi: set PA_SU_SMALL_PRIM_FILTER_CNTL register on Polaris
This was missing.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c1dbc563f4)
2016-07-07 16:12:31 +01:00
Kenneth Graunke
dcc6dde5f9 i965: Make emit_urb_writes() not produce an EOT message for GS.
emit_urb_writes() contains code to emit an EOT write with no actual
data when there are no output varyings.  This makes sense for the VS
and TES stages, where it's called once at the end of the program.

However, in the geometry shader stage, emit_urb_writes() is called once
for every EmitVertex().  We explicitly emit a URB write with EOT set at
the end of the shader, separately from this path.  So we'd better not
terminate the thread.  This could get us into trouble for shaders which
do EmitVertex() with no varyings followed by SSBO/image/atomic writes.

It also caused us to emit multiple sends with EOT set, which apparently
confuses the register allocator into not using g112-g127 for all but
the first one.  This caused EU validation failures in OglGSCloth
shaders in shader-db.  (The actual application was fine, but shader-db
thinks there are no outputs because it doesn't understand transform
feedback.)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 7e7e501acf)
2016-07-07 16:12:31 +01:00
Kenneth Graunke
6d47c3eb4e glsl: Ignore ir_texture in lower_const_arrays_to_uniforms.
The only part of an ir_texture which can be an array is the
offsets array in textureGatherOffsets() calls.  We don't want
to lower those, because they're required to remain constants.

Fixes textureGatherOffsets with Gallium drivers such as llvmpipe,
which commit ef78df8d3b regressed.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit a36a73a7b8)
2016-07-07 16:12:31 +01:00
Samuel Pitoiset
7f0984ca51 gm107/ir: add missing setcond flags for LOP variants
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7b9b096775)
2016-07-07 16:12:31 +01:00
Samuel Pitoiset
94bfef8a71 gm107/ir: make use of LOP32I for all immediates
LOP only allows to emit 19-bits immediates.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 83a4f28dc2)
2016-07-07 16:12:31 +01:00
Dave Airlie
2bd400d953 virgl: reduce some limits for now
These need to be passed from the host in caps structure if they
are larger, this fixes a bunch of tests on Intel hw, that I'd
put the limits too high for.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c7cc264ca9)
2016-07-07 16:12:31 +01:00
Samuel Pitoiset
d4ab5bcad5 gm107/ir: make use of MOV32I for all immediates
MOV only allows to emit 19-bits immediates. This is similar to the
previous fix I did for IMUL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c7fa3c92f8)
2016-07-07 16:12:31 +01:00
Jordan Justen
b82362f52c i965: Use miptree to decide format on multi-plane images for gen < 7
This wasn't handled correctly for multi-plane images on gen < 7 in
727a9b2493.

Reported-by: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96674
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 367cf3a2e3)
2016-07-07 16:12:31 +01:00
Samuel Pitoiset
2fc24cfd8c gm107/ir: make use of IMUL32I for all immediates
IMUL only allows to emit 19-bits immediates. This is similar to
d30768025a which fixed the same thing
for the GK110 emitter.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b84c97587b)
2016-07-07 16:12:30 +01:00
Jordan Justen
507d19c44f i965: Skip update_texture_surface when the plane doesn't exist
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96607
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@intel.com>
(cherry picked from commit 727a9b2493)
2016-07-07 16:12:30 +01:00
Kenneth Graunke
a6a246e17c i965: Set fs_inst::base_mrf = -1 by default.
On MRF platforms, we need to set base_mrf to the first MRF value we'd
like to use for the message.  On send-from-GRF platforms, we set it to
-1 to indicate that the operation doesn't use MRFs.

As MRF platforms are becoming increasingly a thing of the past, we've
forgotten to bother with this.  It makes more sense to set it to -1 by
default, so we don't have to think about it for new code.

I searched the code for every instance of 'mlen =' in brw_fs*cpp, and
it appears that all MRF-based messages correctly program a base_mrf.

Forgetting to set base_mrf = -1 can confuse the register allocator,
causing it to think we have a large fake-MRF region.  This ends up
moving the send-with-EOT registers earlier, sometimes even out of
the g112-g127 range, which is illegal.  For example, this fixes
illegal sends in Piglit's arb_gpu_shader_fp64-layout-std430-fp64-shader,
which had SSBO messages with mlen > 0 but base_mrf == 0.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 3e04e3758e)
2016-07-07 16:12:30 +01:00
Marek Olšák
a51a9d7ba3 radeonsi: fix fractional odd tessellation spacing for Polaris
ported from Vulkan (and no source explains why this is needed)

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 28d0d0c5b4)
2016-07-07 16:12:30 +01:00
Marek Olšák
4f784775a7 radeonsi: fix a compute shader hang with big threadgroups on SI & CI
ported from Vulkan

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 1e8adb0ee4)
[Emil Velikov: resolve trivial conflict in si_launch_grid()]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_compute.c
2016-07-07 16:12:30 +01:00
Ilia Mirkin
48fe283158 nvc0: when mapping directly, provide accurate xfer info + start
We were ignoring the incoming box parameters, and were providing totally
bogus stride/layer stride, and other bits, for when a non-full-surface
map was requested.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b433cb51e5)
2016-06-24 21:33:24 +01:00
Nicolai Hähnle
f41f78cda1 radeonsi: drop the DRAW_PREAMBLE packet on Polaris
It will be removed from the firmware for the Polaris.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 0da890e62c)
2016-06-24 21:31:45 +01:00
Nicolai Hähnle
197e2eaea8 radeonsi: use DRAW_(INDEX_)INDIRECT_MULTI on Polaris
The non-MULTI variants will be removed in Polaris firmware.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 2aa0485902)
2016-06-24 21:30:37 +01:00
Jordan Justen
eadccf8c67 i965: Preserve the internal format of the dri image
Since the OpenGLES API is strict about the internal format matching
the for many operations, we need to preserve it.

See _mesa_es3_error_check_format_and_type in
src/mesa/main/glformats.c.

Fixes ES2-CTS.gtf.GL2ExtensionTests.egl_image.egl_image

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96351
Reported-by: Mark Janes <mark.a.janes@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@intel.com>
(cherry picked from commit c36a363a2d)
2016-06-24 21:29:47 +01:00
Kenneth Graunke
1fc705366c i965: Implement rasterizer discard via SOL unless required for queries.
We currently use CL_INVOCATION_COUNT for the GL_PRIMITIVES_GENERATED
query, which involves passing all primitives to the clipper.  When
rasterizer discard is enabled, we program the clipper in REJECT_ALL
mode, rather than using the SOL stage's "Rendering Disable" feature.

See commit f09b91f782 for an explanation
of why we implement GL_PRIMITIVES_GENERATED this way.

Apparently the SOL stage's "Rendering Disable" feature is a lot faster
than having the clipper reject all primitives.  It's safe to use when
no GL_PRIMITIVES_GENERATED query is active, as we don't care about
CL_INVOCATION_COUNT incrementing.

This patch makes us use SO_RENDERING_DISABLE when no query is active,
but continues falling back to the clipper in REJECT_ALL mode when the
queries are enabled.  It brings back the perf_debug for the clipper
case (which I removed in commit 1f9445ff57, thinking it wasn't useful).

Improves performance in Gl32GSCloth by 84.8303% +/- 2.07132% (n = 10)
on my Broadwell GT2 laptop.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit b0629e6894)
2016-06-24 21:28:59 +01:00
Kenneth Graunke
91de94a119 i965: Combine 3DSTATE_STREAMOUT emitters and genX_sol_state atoms.
They're basically the same.  Let's avoid the code duplication.

v2: Fix SO_BUFFER_ENABLE stuff to only happen on Gen < 8 (caught
    by Jason Ekstrand).

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 4db98f8beb)
2016-06-24 21:27:28 +01:00
Kenneth Graunke
d7ea3eada7 glsl: Don't constant propagate arrays.
Constant propagation on arrays doesn't make a lot of sense.  If the
array is only accessed with constant indexes, then opt_array_splitting
would split it up.  Otherwise, we have variable indexing.  If there's
multiple accesses, then constant propagation would end up replicating
the data.

The lower_const_arrays_to_uniforms pass creates uniforms for each
ir_constant with array type that it encounters.  This means that it
creates redundant uniforms for each copy of the constant, which means
uploading too much data.  It can even mean exceeding the maximum number
of uniform components, causing link failures.

We could try and teach the pass to de-duplicate the data by hashing
constants, but it makes more sense to avoid duplicating it in the first
place.  We should promote constant arrays to uniforms, then propagate
the uniform access.

Fixes the TressFX shaders from Tomb Raider, which exceeded the maximum
number of uniform components by a huge margin and failed to link.

On Broadwell:

total instructions in shared programs: 9067702 -> 9068202 (0.01%)
instructions in affected programs: 10335 -> 10835 (4.84%)
helped: 10 (Hoard, Shadow of Mordor, Amnesia: The Dark Descent)
HURT: 20 (Natural Selection 2)

loops in affected programs: 4 -> 0

The hurt programs appear to no longer have a constarray uniform, as
all constants were successfully propagated.  Apparently before this
patch, we successfully unrolled a loop containing array access, but
only after promoting constant arrays to uniforms.  With this patch,
we unroll it first, so all array access is direct, and the array
is split up, and individual constants are propagated.  This seems
better.

Cc: mesa-stable@lists.freedesktop.org
Reported-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit fb857b5eea)
2016-06-24 21:26:39 +01:00
Kenneth Graunke
0af4f5c1ba glsl: Make lower_const_arrays_to_uniforms work directly on constants.
There's really no point in looking at ir_dereference_array of a
constant.  It also misses cases like:

  (assign () (var_ref tmp) (constant (array ...) ...))

No changes in shader-db, but keeps it working after the next commit.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit ef78df8d3b)
2016-06-24 21:25:50 +01:00
Kenneth Graunke
335193107a i965: Copy propagate before doing variable index lowering.
The scalar backend currently doesn't support variable indexing on
temporary arrays, but it does support it on uniform arrays, and
some stages support it for input arrays.  Make sure these are
propagated through before exploding indirects into piles of
if-ladders unnecessarily.

On Broadwell, no instruction count change in shader-db.

total cycles in shared programs: 80675652 -> 80674928 (-0.00%)
cycles in affected programs: 649972 -> 649248 (-0.11%)
helped: 386
HURT: 165

This will help avoid code quality regressions in a future commit.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit f7741c5211)
2016-06-24 21:25:00 +01:00
Kenneth Graunke
32bb867118 glsl: Propagate invariant/precise after lowering const arrays.
The new uniform may need precise as well.

Fixes copy propagation of constant array uniforms in Tomb Raider shaders.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 586f4a42e7)
2016-06-24 21:24:09 +01:00
Kenneth Graunke
60b5ba557e glsl: Split arrays even in the presence of whole-array copies.
Previously, we failed to split constant arrays.  Code such as

   int[2] numbers = int[](1, 2);

would generates a whole-array assignment:

  (assign () (var_ref numbers)
             (constant (array int 4) (constant int 1) (constant int 2)))

opt_array_splitting generally tried to visit ir_dereference_array nodes,
and avoid recursing into the inner ir_dereference_variable.  So if it
ever saw a ir_dereference_variable, it assumed this was a whole-array
read and bailed.  However, in the above case, there's no array deref,
and we can totally handle it - we just have to "unroll" the assignment,
creating assignments for each element.

This was mitigated by the fact that we constant propagate whole arrays,
so a dereference of a single component would usually get the desired
single value anyway.  However, I plan to stop doing that shortly;
early experiments with disabling constant propagation of arrays
revealed this shortcoming.

This patch causes some arrays in Gl32GSCloth's geometry shaders to be
split, which allows other optimizations to eliminate unused GS inputs.
The VS then doesn't have to write them, which eliminates the entire VS
(5 -> 2 instructions).  It still renders correctly.

No other change in shader-db.

v2: Drop !AOA check and improve a comment (feedback from Tim Arceri).

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit c264fdbc07)
2016-06-24 21:23:20 +01:00
Kenneth Graunke
9013f56bb7 glsl: Make constant propagation's folder not propagate into an LHS.
opt_constant_propagation.cpp contains constant folding code which can
actually do constant propagation in some cases.  It was happily
propagating constants into the left-hand-side of assignments.

For example,

   (assign () (var_ref temp) (constant ...))

would brilliantly be turned into:

   (assign () (constant ...) (constant ....))

This is a bigger hammer than necessary - it prevents propagation
into the left-hand-side altogether.  We could certainly do better
someday.  Notably, the constant propagation pass itself already
takes this approach - it's just the constant propagation pass's
built-in constant folding code (which actually propagates, too)
that was broken.

No change in shader-db, but prevents regressions after future commits.
It seems plausible that this could be hit today, but I haven't seen it
happen.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit acf5444044)
2016-06-24 21:22:31 +01:00
Ardinartsev Nikita
133d0f0882 i965: Avoid division by zero.
Fixes regression introduced by af5ca43f26

Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95419
(cherry picked from commit 01c89ccc5d)
2016-06-24 21:21:34 +01:00
Tim Rowley
1e8fb90f19 swr: push/pop DEBUG macro around llvm includes
llvm redefines DEBUG; adding push/pop prevents a undefined reference
to debug_refcnt_state in llvm-3.7+.

v2: add undef DEBUG

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit 9ca741c645)
2016-06-24 21:20:28 +01:00
Jose Fonseca
6c1911effb include: Require MSVC 2013 Update 4.
Earlier MSVC 2013 releases have troubles compiling some of our C99 code,
so make sure we have Update 4 to avoid confusion.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 805dbdf06d)
2016-06-24 20:58:23 +01:00
Jason Ekstrand
aefcbf41ef anv: Use different BOs for different scratch sizes and stages
This solves a race condition where we can end up having different stages
stomp on each other because they're all trying to scratch in the same BO
but they have different views of its layout.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c2f2c8e407)
2016-06-24 20:57:17 +01:00
Jason Ekstrand
892cbc202c genxml: Make ScratchSpaceBasePointer an address instead of an offset
While we're here, we also fixup MEDIA_VFE_STATE and rename the field in
3DSTATE_VS on gen6-7.5 to be consistent with the others.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 45c0f60999)
2016-06-24 20:56:10 +01:00
Jason Ekstrand
7a4641cdbe anv: Add an allocator for scratch buffers
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 966bed17c1)
2016-06-24 20:55:06 +01:00
Jason Ekstrand
13d82b7690 genxml: Put append counter fields before MCS in RENDER_SURFACE_STATE on gen7
The pack header generation scripts can't handle the case where you have
two addresses in the same dword; they just take whatever is the last one.
This meant that the MCS address wasn't properly getting handled.  Since we
don't care about append counters, we can just re-arrange the XML for now.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 89ded099f8)
2016-06-24 20:53:35 +01:00
Jason Ekstrand
94cd7425e8 anv,isl: Lower storage image formats in anv
ISL was being a bit too clever for its own good and lowering the format for
us.  This is all well and good *if* we always want to lower it.  However,
the GL driver selectively lowers the format depending on whether the
surface is write-only or not.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d82322eb18)
2016-06-24 20:52:38 +01:00
Jason Ekstrand
40a9ffbbca isl/state: Allow for full 31-bit buffer texture sizes
Ivy Bridge and above can handle up to 2^31 elements for RAW buffer
surfaces.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 97f12773b8)
2016-06-24 20:51:48 +01:00
Jason Ekstrand
02bf08e124 isl/state: Don't use designated initializers for buffer surface state
Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bb64e666ba)
2016-06-24 20:50:53 +01:00
Jason Ekstrand
feaa68e38a isl/state: Add assertions for buffer surface restrictions
Acked-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4061fde66e)
2016-06-24 20:49:57 +01:00
Jason Ekstrand
72cc8544a8 isl/state: Don't set SurfacePitch for gen9 1-D textures
This field is ignored by the hardware in this case and, on very large 1-D
textures, it can end up being larger than the maximum allowed value.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ce24097abe)
2016-06-24 20:49:05 +01:00
Jason Ekstrand
fcefb53c37 isl/state: Use TILEWALK_XMAJOR for linear surfaces on gen7
This matches better what happens on gen8 where the "Tiled Surface" and
"Tile Walke" bits are combined into a single two-bit value.  This is also
more consistent with what the GL driver does.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f47e23a8b6)
2016-06-24 20:48:09 +01:00
Jason Ekstrand
913e9e14f0 isl/state: Emit no-op mip tail setup on SKL
This hasn't ever been a problem in the past but it is recommended by the
hardware docs.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 96706bad5f)
2016-06-24 20:47:21 +01:00
Jason Ekstrand
a49f97fae3 isl/state: Only set cube face enables if usage includes CUBE_BIT
It seems safe to set it all the time, but this reduces the diff between
the way i965 does it and what ISL does.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 14d7c16e50)
2016-06-24 20:46:31 +01:00
Jason Ekstrand
672872051d isl/state: Use the layout for computing qpitch rather than dimensions
For depth/stencil 1-D textures on SKL, we want them layed out in the old
format that has been used since gen4.  In order for the surface state
fill-out code to handle, this it needs to distinguish based on layout
rather than just dimensionality.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5d24e9cfa1)
2016-06-24 20:45:39 +01:00
Jason Ekstrand
667beb92a9 isl/state: Set the IntegerSurfaceFormat bit on Haswell
This fixes 688 Vulkan CTS tests on Haswell.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6a43204afa)
2016-06-24 20:44:46 +01:00
Jason Ekstrand
262282c1bf isl/format: Mark R9G9B9E5 as containing 9-bit unsigned float channels
Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 324103da75)
2016-06-24 20:43:55 +01:00
Jason Ekstrand
350ae65585 isl/state: Don't set RenderTargetViewExtent for texture surfaces
The docs specify that this only matters for render targets and surfaces
used with typed dataport messages.  On some platforms (gen4-6) the Depth
field has more bits than RenderTargetViewExtent so we can have textures
with more levels than we can render to.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 215282c9f4)
2016-06-24 20:43:01 +01:00
Jason Ekstrand
415869c5c9 isl/state: Set SurfaceArray based on the surface dimension
According to the PRM, you can't set SurfaceArray for 3D or buffer textures.
There doesn't seem to be a good reason not to set it when we can.  On the
other hand, if we don't set it we can end up getting strange results for
1-layer array textures such as textureSize() returning the wrong results.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bb326f7b01)
2016-06-24 20:42:07 +01:00
Jason Ekstrand
6a3f08be3a isl/state: Don't force-disable L2 bypass for everything
We already set the bit in the few cases where it's required by the docs so
there's no need to set it all the time.  This has no noticable perf impact
for Dota 2 on Vulkan with the time demo I have.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d050ffbce9)
2016-06-24 20:41:15 +01:00
Jason Ekstrand
0315650532 isl/state: Refactor the setup of clear colors
This commit switches clear colors to use #if's instead of a C if.  This
lets us properly handle SNB where the clear color field doesn't exist.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 87f0ffa646)
2016-06-24 20:40:19 +01:00
Jason Ekstrand
9259e0f990 isl/state: Refactor the per-gen isl_to_gen_h/valign tables
This moves the #if's around so that halign and valign have different sets
of #if conditions.  This also prepares us for SNB because isl_to_gen_halign
is not defined at all on gen6.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 62a5e6e031)
2016-06-24 20:39:23 +01:00
Jason Ekstrand
dbc94da586 isl/state: Return an extent3d from the halign/valign helper
Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b1b0d6fb54)
2016-06-24 20:38:27 +01:00
Jason Ekstrand
1dd276aa7c isl/state: Put pitch calculations together
This is purely cosmetic, but it makes things look a bit more readable.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a60ae9e10a)
2016-06-24 20:37:27 +01:00
Jason Ekstrand
652161bdc8 isl/state: Put all dimension setup together and towards the top
This is purely cosmetic, but it makes things look a bit more readable.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 70c8afc0c8)
2016-06-24 20:36:22 +01:00
Jason Ekstrand
29b24d75eb isl/state: Put surface format setup at the top
This is purely cosmetic, but it makes things look a bit more readable.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e66e70ef47)
2016-06-24 20:35:22 +01:00
Jason Ekstrand
8b3333d1df isl/state: Remove some unused fields
They're already zero-initialized and we have no plans of doing anything
more interesting with them.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 39baea551f)
2016-06-24 20:34:30 +01:00
Jason Ekstrand
bf59ce8869 isl/state: Don't use designated initializers for the surface state
While designated initializers are nice, they also force us to put some
things in the initializer and some things later.  Surface state setup is
complicated enough that this really hurts readability in the long run.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit caf2af4181)
2016-06-24 20:33:35 +01:00
Jason Ekstrand
0a7671a309 genxml/gen8,9: Prefix the multisample format enum with MSFMT
This is what gen7 does and it's nice to have a prefix

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit de1d194856)
2016-06-24 20:32:34 +01:00
Jason Ekstrand
69234ef45e i965/gen4: Subtract 1 from buffer sizes
The PRM states that the values put in Width, Height, and Depth should be
various bits from the value size - 1.  We seem to have done this wrong
more-or-less from the start.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2a1cc94d27)
2016-06-24 20:31:42 +01:00
Jason Ekstrand
2681454102 i965/fs: Use a default Y coordinate of 0 for TXF on gen9+
Previously, we were incrementing length but not actually putting anything
in the Y coordinate.  This meant that 1-D TXF operations had a garbage
array index.  If the surface is emitted as 1-D non-array, the coordinate
gets discarded and it works fine.  If it happens to be bound as an array
surface, it may count as an out-of-bounds array access and you get zero.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0195299c86)
2016-06-24 20:30:48 +01:00
Jason Ekstrand
e9fd680fde i965/gen8: Use the qpitch from the aux_mt for AUX_QPITCH
Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1436238b75)
2016-06-24 20:29:56 +01:00
Jason Ekstrand
6a6947d89a i965/blorp/gen8: Use the correct max level and layer in emit_surface_states
We were adding in the base which is wrong because the values given in the
miptree are relative to zero and not the base layer/level.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 620f81d2ed)
2016-06-24 20:29:03 +01:00
Jason Ekstrand
1673dec65c i965: Drop the maximum 3D texture size to 512 on Sandy Bridge
The RenderTargetViewExtent field of RENDER_SURFACE_STATE is supposed to be
set to the depth of a 3-D texture when rendering.  Unfortunatley, that
field is only 9 bits on Sandy Bridge and prior so we can't actually bind
a 3-D texturing for rendering if it has depth > 512.  On Ivy Bridge, this
field was bumpped to 11 bits so we can go all the way up to 2048.  On Iron
Lake and prior, we don't support layered rendering and we use OffsetX/Y
hacks to render to particular layers so 2048 is ok there too.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6ba88bce64)
2016-06-24 20:28:06 +01:00
Jason Ekstrand
af12f81147 i965/gen4-6: Handle gl_texture_object::BaseLevel and MinLayer correctly
This is basically a direct translation of what we do for gen7.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83036
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0f9cd74aab)
2016-06-24 20:27:12 +01:00
Jason Ekstrand
d9219b5b79 i965/gen4: Pull texture formats from the texture object not the miptree
This makes texture views sort-of work.  It doesn't add full texture view
support for gen4-5 but it is enough to fix the GL_ARB_copy_image formats
piglit test on Iron Lake.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83036
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ee39d3ba91)
2016-06-24 20:26:14 +01:00
Ilia Mirkin
abfed13bf4 glsl: only match gl_FragData and not gl_SecondaryFragDataEXT
There's special logic around finding gl_FragData. It latches onto any
array with FRAG_RESULT_DATA0. However gl_SecondaryFragDataEXT[], added
by GL_EXT_blend_func_extended, fits those parameters as well. The real
frag data array should have index 0 though, so we can use that to
distinguish them.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96617
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 36ed1b695e)
2016-06-24 20:25:10 +01:00
Ilia Mirkin
8ac0a713f7 nv50,nvc0: fix start_instance in manual push path
The start instance is applied as an offset into the buffer directly,
ignoring the divisor, not as an instance id offset that respects the
divisor.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 1f4bca798d)
2016-06-24 20:23:49 +01:00
Ilia Mirkin
f7af3868f7 translate: fix start_instance parameter in sse version
The generic version gets this right already, but this was using an
incorrect formula in SSE.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 5b0d64886d)
2016-06-24 20:22:16 +01:00
Jason Ekstrand
15d06d4d61 anv/cmd: Dirty descriptor sets when a new pipeline is bound
Ever since c2581a9375, the binding table layout has depended on the
pipeline.  This means that whenever we change pipelines we also need to
re-emit binding tables for the new layout.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 35b53c8d47)
2016-06-24 20:21:18 +01:00
Jason Ekstrand
6fd7d618f4 anv/cmd: Move emit_descriptor_pointers to genX_cmd_buffer.c
It's tiny and fully generic so there's really no reason for it to be in a
gen7-specific file.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2bfe0c3374)
2016-06-24 20:20:21 +01:00
Jason Ekstrand
045d6bc023 anv/cmd: Move flush_descriptor_sets to anv_cmd_buffer.c
There's no good reason for recompiling it

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9df4d6bb36)
2016-06-24 20:19:11 +01:00
Jason Ekstrand
b2fe134064 spirv: Use the system value version of gl_FrontFace
SPIR-V treats it as an input but NIR wants the system value.  This
shouldn't have been too much of a surprise given that we have to do the
same conversion in the GLSL IR to NIR pass.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 295e03c980)
2016-06-24 20:03:46 +01:00
Kenneth Graunke
2e8129ddf8 i965: Reorganize prog_data->total_scratch code a bit.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 40013c5033)
2016-06-24 18:17:50 +01:00
Emil Velikov
5e0b11cb6d Update version to 12.0.0-rc4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-21 13:32:04 +01:00
Nicolai Hähnle
6306930c3f st/mesa: flush bitmap cache before CopyImageSubData
Found by inspection.

Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit f9ddd52317)
2016-06-21 11:53:55 +01:00
Nicolai Hähnle
76377387c2 st/mesa: flush bitmap cache before texture functions
As far as I can tell, a sequence of glBitmap followed by texture functions
that refer to a texture bound as the framebuffer is well within what should
be allowed.

Found by inspection.

Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit e7fff3cfe1)
2016-06-21 11:52:36 +01:00
Nicolai Hähnle
6775b169cd st/mesa: flush bitmap cache before compute dispatch
In the unlikely case that a program uses glBitmap to render to a framebuffer
whose texture is bound in a compute shader.

Found by inspection.

Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit c542b7e43d)
2016-06-21 11:51:20 +01:00
Kenneth Graunke
a0235eb0f7 i965: Fix multiplication of immediates on Cherryview/Broxton.
Cherryview and Broxton don't support DW x DW multiplication.  We have
piles of code to handle this, but apparently weren't retyping in the
immediate case.

For example,
tests/spec/arb_tessellation_shader/execution/dvec3-vs-tcs-tes
makes the simulator angry about instructions such as:

   mul(8) r18<1>:D r10.0<8;8,1>:D 0x00000003:D

Just retype to W or UW.  It should be safe on all platforms.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit cd89c834a8)
2016-06-21 11:49:55 +01:00
Jason Ekstrand
09a098bdeb anv: Add proper support for depth clamping
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit eb6764c4a7)
2016-06-21 11:48:39 +01:00
Jason Ekstrand
f3c8dde2e4 anv/cmd_buffer: Split emit_viewport in two
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8a46b505cb)
2016-06-21 11:47:20 +01:00
Jason Ekstrand
3fddb9fd46 anv/cmd_buffer: Set depth/stencil extent based on the image
It used to be based on the framebuffer which isn't quite right.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 20e95a746d)
2016-06-21 11:46:03 +01:00
Jason Ekstrand
f614a1f4d8 anv/cmd_buffer: Don't crash if push constants are provided for missing stages
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b65f2e4163)
2016-06-21 11:44:48 +01:00
Jason Ekstrand
f4bc7218d5 anv/pipeline: Do invariance propagation on SPIR-V shaders
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e6c2fe4519)
2016-06-21 11:43:29 +01:00
Jason Ekstrand
77f241bd37 nir/alu_to_scalar: Respect the exact ALU operation qualifier
Just setting builder->exact isn't sufficient because that only applies to
instructions that are built with the builder but instructions created
manually and only inserted using the builder are left alone.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bec07b7292)
2016-06-21 11:41:49 +01:00
Jason Ekstrand
deedb368de nir: Add a pass for propagating invariant decorations
This pass is similar to propagate_invariance in the GLSL compiler.  The
real "output" of this pass is that any algebraic operations which are
eventually consumed by an invariant variable get marked as "exact".

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 202751fbb7)
2016-06-21 11:37:37 +01:00
Jason Ekstrand
bac23b13eb nir/algebraic: Remove imprecise flog2 optimizations
While mathematically correct, these two optimizations result in an
expression with substantially lower precision than the original.  For any
positive finite floating-point value, log2(x) is well-defined and finite.
More precisely, it is in the range [-150, 150] so any sum of logarithms
log2(a) + log2(b) is also well-defined and finite as long as a and b are
both positive and finite.  However, if a and b are either very small or
very large, their product may get flushed to infinity or zero causing
log2(a * b) to be nowhere close to log2(a) + log2(b).

This imprecision was causing incorrect rendering in Talos Principal because
part of its HDR rendering process involves doing 8 texture operations,
clamping the result to [0, 65000], taking a dot-product with a constant,
and then taking the log2.  This is done 6 or 8 times and summed to produce
the final result which is written to a red texture.  In cases where you
have a region of the screen that is very dark, it can end up getting a
result value of -inf which is not what is intended.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96425
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 68e308d853)
2016-06-21 11:36:08 +01:00
Nicolai Hähnle
b03b256e92 radeonsi: fix calculation of valid RB mask per SE
The old calculation treated too many RBs as disabled.

Cc: 11.0 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit c95175581e)
2016-06-21 11:34:38 +01:00
Nicolai Hähnle
52ae654569 radeonsi: raise SI_PM4_MAX_DW
The old limit, introduced in commit afa752d3f0,
was exceeded by 4 SE configurations which hit si_write_harvested_raster_configs.

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 6c2e636982)
2016-06-21 11:33:00 +01:00
Roland Scheidegger
f675339b22 gallivm: don't use integer min/max sse intrinsics with llvm >= 3.9
Apparently, these are deprecated. There's some AutoUpgrade feature which
is supposed to promote these to cmp/select, which apparently doesn't work
with jit code. It is possible it's not actually even meant to work (see
the bug filed against llvm which couldn't provide an answer neither)
but in any case this is meant to be only temporary unless the intrinsics
are really illegal. So, just use the fallback code (which should be cmp/select,
we're actually doing cmp/sext/trunc/select, but in any case llvm 3.9 manages
to optimize this back to pmin/pmax in the end).

This addresses https://llvm.org/bugs/show_bug.cgi?id=28176

CC: <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Aaron Watry <awatry@gmail.com>
(cherry picked from commit b0cf99165a)
2016-06-21 11:31:08 +01:00
Ilia Mirkin
cdbcd315b3 nvc0: don't make use of push hint if there are no non-const user vbos
This makes the check match up what we do on nv50 as well - there's no
point in switching over the push path if everything's in managed
buffers. This can happen when a shader uses a vertex without an enabled
array - we end up passing it a constant attribute.

This also has the effect of "fixing" some flickering in Talos. I have no
idea why. I've stared at the push logic forwards, backwards, and
sideways. By always forcing the push path (which is slow), the
flickering also goes away, but other rendering is still wrong
(specifically draw 383068 as identified in the bug). However by not
switching over to the push path, draw 383068 is correct.

Note that other flickering remains in Talos, like the red/green
walls/floors. This takes care of the shadow flickering though.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90513
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 154c0a42a2)
2016-06-21 11:29:27 +01:00
Ilia Mirkin
7f1a4dc740 gk104/ir: fix tex use generation to be more careful about eliding uses
If we have a loop, instructions before the tex might be added as tex
uses, and those may in fact dominate all other uses of the tex results.
This however doesn't mean that we don't need a texbar after the tex.
Only check if uses dominate each other they are dominated by the tex.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96565
Fixes: 7752bbc44 (gk104/ir: simplify and fool-proof texbar algorithm)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 1804aa0b80)
2016-06-21 11:27:50 +01:00
Samuel Iglesias Gonsálvez
97440cc2ed i965/fs: indirect addressing with doubles is not supported in CHV/BSW/BXT
From the Cherryview's PRM, Volume 7, 3D Media GPGPU Engine, Register Region
Restrictions, page 844:

  "When source or destination datatype is 64b or operation is integer DWord
   multiply, indirect addressing must not be used."

v2:
- Fix it for Broxton too.

v3:
- Simplify code by using subscript() and not creating a new num_components
variable (Kenneth).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit bdab572a86)
2016-06-17 14:41:16 +01:00
Iago Toral Quiroga
3265becac3 i965/fs: Fix single-precision to double-precision conversions for CHV/BSW/BXT
From the Cherryview PRM, Volume 7, 3D Media GPGPU Engine,
Register Region Restrictions:

   "When source or destination is 64b (...), regioning in Align1
    must follow these rules:

    1. Source and destination horizontal stride must be aligned to
       the same qword.
    (...)"

v2:
- Fix it for Broxton too.

v3:
- Remove inst->regs_written change as it is not necessary (Ken)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 0177dbb6c2)
2016-06-17 14:40:12 +01:00
Ian Romanick
033279c961 mesa: If validation fails in a debug context just emit a debug message
There are quite a few pipelines that desktop applications (including a
bunch of piglit test) can expect to have run but don't meet the GLES
requirements.  Instead of failing validation, just emit a debug message.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Gregory Hainaut <gregory.hainaut@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 6bec55a780)
2016-06-17 14:39:16 +01:00
Ian Romanick
6572273631 glsl: Always strip arrayness in precision_qualifier_allowed
Previously some callers of precision_qualifier_allowed would strip the
arrayness from the type and some would not.  As a result, some places
would not notice that float[6], for example, needed a precision
qualifier.

Fixes the new piglit test no-default-float-array-precision.frag.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Gregory Hainaut <gregory.hainaut@gmail.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 9c87282041)
2016-06-17 14:38:08 +01:00
Kenneth Graunke
dab4a6001b i965: Use a uniform for gl_PatchVerticesIn in the TCS on Gen8+.
We still need to recompile the passthrough shader when this value
changes, as it also affects the output vertex count.  But otherwise,
we can eliminate recompiles on Gen8+.

We probably want to do this for Gen7 as well, but that requires
rewriting the input release code to use a loop, which is a trade-off
I'd need to consider in more detail.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c319512e16)
2016-06-17 14:37:06 +01:00
Kenneth Graunke
286ed3aff0 glsl: Optionally lower TCS gl_PatchVerticesIn to a uniform.
i965 has no special hardware for this, so the best way to implement
this is to pass it in via a uniform.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 2b867264d2)
2016-06-17 14:28:44 +01:00
Kenneth Graunke
baa6ef4ed0 i965: Use a uniform for gl_PatchVerticesIn in the TES.
Fixes three GL44-CTS.tessellation_shader subtests:
- max_patch_vertices
- single.max_patch_vertices
- tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn

These use gl_PatchVerticesIn in the TES, but don't link against a
TCS (which would allow the linker to lower it to a constant).  We had
no handling for the system value in the backend, so it would just
assert fail.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 1bc194cd64)
2016-06-17 14:27:42 +01:00
Kenneth Graunke
b7e91a0421 glsl: Optionally lower TES gl_PatchVerticesIn to a uniform.
i965 has no special hardware for this, so we need to pass this value in
as a uniform (unless the TES is linked against a TCS, in which case the
linker can just replace this with a constant).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 0be2105137)
2016-06-17 14:19:26 +01:00
Nicolai Hähnle
05c5ed47d1 mesa/main: fix integer overflows in _mesa_image_offset
Found using -fsanitize=undefined.

Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 6510e07345)
2016-06-17 14:18:27 +01:00
Kenneth Graunke
a9647850d1 mesa: Pass gl_constant_value union into _mesa_fetch_state().
We've had some trouble in the past with copying integers around via
float pointers, as the C compiler sometimes uses x87 floating point
registers to load values on 32-bit systems.  Passing the
gl_constant_value union should be safer.

To avoid churn, this patch creates a "GLfloat *value" variable so
existing uses can stay the same.

Not observed to fix anything, but I was in the area adding more integer
state vars, and thought it'd be wise.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8b408972ff)
2016-06-17 14:01:23 +01:00
Emil Velikov
7d41c8aa25 Update version to 12.0.0-rc3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-15 09:29:14 +01:00
Nicolai Hähnle
575f9eaa2d radeonsi: mark buffer texture range valid for shader images
When a shader image view into a buffer texture can be written to, the buffer's
valid range must be updated, or subsequent transfers may incorrectly skip
synchronization.

This fixes a bug that was exposed in Xephyr by PBO acceleration for glReadPixels,
reported by Michel Dänzer.

Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a64c7cd2ba)

Back-ported from commit a64c7cd2ba:
- include util/u_format.h
- code was extracted to si_set_shader_image in master, move it back

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
--
 src/gallium/drivers/radeonsi/si_descriptors.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)
2016-06-15 09:29:14 +01:00
Ilia Mirkin
792a5ee425 nv50/ir: record number of threads in a compute shader
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 27a51ff9b4)
2016-06-15 09:29:14 +01:00
Ilia Mirkin
59841f5466 nvc0/ir: limit max number of regs based on availability in SM
This effectively limits registers to 32 and 64 for fermi and kepler when
1024 threads are used, but allows the full amount to be used with
smaller thread sizes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 1f895caba0)
2016-06-15 09:29:14 +01:00
Tomasz Figa
966ee94558 i965: Check return value of screen->image.loader->getBuffers (v2)
The images struct is an uninitialized local variable on the stack. If the
callback returns 0, the struct might not have been updated and so should
be considered uninitialized. Currently the code ignores the return value,
which (depending on stack contents) might end up in reading a non-zero
value from images.image_mask and dereferencing further fields.

Another solution would be to initialize image_mask with 0, but checking
the return value seems more sensible and it is what Gallium is doing.

v2: fix typos in commit message,
    fix indentation,
    remove unnecessary parentheses and pointer dereference to keep line
    length reasonable.

Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit e7ab358e81)
2016-06-15 09:29:14 +01:00
Dylan Baker
8ed5204182 isl: Replace bash generator with python generator
This replaces the current bash generator with a python based generator
using mako. It's quite fast and works with both python 2.7 and python
3.5, and should work with 3.3+ and maybe even 3.2.

It produces an almost identical file except for a minor layout changes,
and the addition of a "generated file, do not edit" warning.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 5a87bc7181)
2016-06-15 09:29:14 +01:00
Bas Nieuwenhuizen
28294573c7 radeonsi: Reinitialize all descriptors in CE preamble.
This fixes a problem with the CE preamble and restoring only stuff in the
preamble when needed.

To illustrate suppose we have two graphics IB's 1 and 2, which  are submitted in
that order. Furthermore suppose IB 1 does not use CE ram, but IB 2 does, and we
have a context switch at the start of IB 1, but not between IB 1 and IB 2.

The old code put the CE RAM loads in the preamble of IB 2. As the preamble of
IB 1 does not have the loads and the preamble of IB 2 does not get executed, the
old values are not load into CE RAM.

Fix this by always restoring the entire CE RAM.

v2: - Just load all descriptor set buffers instead of load and store the entire
      CE RAM.
    - Leave the ce_ram_dirty tracking in place for the non-preamble case.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Note: This commit differs from the one in master - 54f755fa0f
("radeonsi: Reinitialize all descriptors in CE preamble.")
2016-06-15 09:29:13 +01:00
Emil Velikov
7bed792ebb cherry-ignore: drop the "i965 bring back INTEL_PRECISE_TRIG"
The commit that removes it isn't in branch, thus there's nothing to do
here.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-15 09:29:13 +01:00
Samuel Iglesias Gonsálvez
7d5cdb7675 i965: Defeat the register stride checker in pull uniform messages.
Pulling DF uniforms from pull constant buffer generates messages like:
    send(4)         g12<1>DF        g12<0,1,0>F
         sampler ld SIMD4x2 Surface = 1 Sampler = 0 mlen 1 rlen 1

which produces GPU hangs in Cherryview/Braswell:

    "For 64-bit Align1 operation or multiplication of dwords in CHV,
     source horizontal stride must be aligned to qword."

This seems to be documented in the Cherryview PRM, Volume 7, Page 843:

    "When source or destination datatype is 64b or operation is integer
     DWord multiply, regioning in Align1 must follow these rules:

     1. Source and Destination horizontal stride must be aligned to the
        same qword."

We should set the destination type to UD, D, or F so that
the register stride checker doesn't notice.  The destination type of
send messages is basically irrelevant anyway.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit a0ed8503b7)
2016-06-15 09:29:13 +01:00
Kenneth Graunke
465be91421 i965: Defeat the register stride checker in URB reads.
Pulling DF inputs from the URB generates messages like:

   send(8)         g23<1>DF        g1<8,8,1>UD
                   urb 3 SIMD8 read mlen 1 rlen 2      { align1 1Q };

which makes the simulator angry:

   "For 64-bit Align1 operation or multiplication of dwords in CHV,
    source horizontal stride must be aligned to qword."

This seems to be documented in the Cherryview PRM, Volume 7, Page 823:

   "When source or destination datatype is 64b or operation is integer
    DWord multiply, regioning in Align1 must follow these rules:

    1. Source and Destination horizontal stride must be aligned to the
       same qword."

Setting the source horizontal stride to QWord is insane, as it's the
message header containing 8 URB handles in a single 32-bit DWord.
Instead, we should whack the destination type to UD, D, or F so that
the register stride checker doesn't notice.  The destination type of
send messages is basically irrelevant anyway.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit ed3ba651f6)
2016-06-15 09:29:13 +01:00
Kenneth Graunke
4a6fecdf69 i965: Fix issues with number of VS URB entries on Cherryview/Broxton.
Cherryview/Broxton annoyingly have a minimum number of VS URB entries
of 34, which is not a multiple of 8.  When the VS size is less than 9,
the number of VS entries has to be a multiple of 8.

Notably, BLORP programmed the minimum number of VS URB entries (34), with
a size of 1 (less than 9), which is invalid.

It seemed like this could be a problem in the regular URB code as well,
so I went ahead and updated that to be safe.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 9f37df06da)
2016-06-15 09:29:13 +01:00
Timothy Arceri
883a1b3bd2 glsl: make sure UBO arrays are sized in ES
This check was removed in 5b2675093e add it back in.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
https://bugs.freedesktop.org/show_bug.cgi?id=96349
(cherry picked from commit b010fa8567)
2016-06-15 09:29:13 +01:00
Vedran Miletić
a71e0fd8cd clover: Update OpenCL version string to match OpenGL
Change MESA into Mesa in CL_PLATFORM_VERSION and CL_DEVICE_VERSION. For
both, always append git version suffix from git_sha1.h.

v5: move semicolon to same line as MESA_GIT_SHA1.
v4: drop #ifdef guards.
v3: add missing include.
v2: change CL_DEVICE_VERSION as well.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 4825264f75)

Squashed with commit

clover: Include generated sources in AM_CPPFLAGS

git_sha1.c is generated in $(top_builddir)/src.

Fixes out-of-tree builds since 4825264f75.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96516
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit fafe026dbe)
2016-06-15 09:29:13 +01:00
Francisco Jerez
547b5d2daa i965/fs: Fix regs_written for SIMD-lowered instructions some more.
ISTR having suggested this during review of the recent FP64 changes to
the SIMD lowering pass, but it doesn't look like it was taken into
account in the end.  Using the fs_reg::component_size helper instead
of this open-coded variant makes sure that the stride is taken into
account correctly.  Fixes at least the following piglit tests with
spilling forced on (since otherwise regs_written would be calculated
incorrectly and the spilling code would be rather confused about how
much data needs to be spilled):

 spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-shader
 spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-mixed-shader

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit bd9f972651)
2016-06-15 09:29:13 +01:00
Francisco Jerez
7154fa614b i965: Fix cross-primitive scratch corruption when changing the per-thread allocation.
I haven't found any mention of this in the hardware docs, but
experimentally what seems to be going on is that when the per-thread
scratch slot size is changed between two pipelined draw calls, shader
invocations using the old and new scratch size setting may end up
being executed in parallel, causing their scratch offset calculations
to be based in a different partitioning of the scratch space, which
can cause their thread-local scratch space to overlap leading to
cross-thread scratch corruption.

I've been experimenting with alternative workarounds, like emitting a
PIPE_CONTROL with DC flush and CS stall between draw (or dispatch
compute) calls using different per-thread scratch allocation settings,
or avoiding reuse of the scratch BO if the per-thread scratch
allocation doesn't exactly match the original.  Both seem to be as
effective as this workaround, but they have potential performance
implications, while this should be basically for free.

Fixes over 40 failures in our CI system with spilling forced on
(including CTS, dEQP and Piglit failures) on a number of different
platforms from Gen4 to Gen9.  The 'glsl-max-varyings' piglit test
seems to be able to reproduce this bug consistently in the vertex
shader on at least Gen4, Gen8 and Gen9 with spilling forced on.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit a84b5d43e2)
2016-06-15 09:29:13 +01:00
Francisco Jerez
2cf78b4851 i965: Keep track of the per-thread scratch allocation in brw_stage_state.
This will be used to find out what per-thread slot size a previously
allocated scratch BO was used with in order to fix a hardware race
condition without introducing additional stalls or memory allocations.
Instead of calling brw_get_scratch_bo() manually from the various
codegen functions, call a new helper function that keeps track of the
per-thread scratch size and conditionally allocates a larger scratch
BO.

v2: Handle BO allocation manually instead of relying on
    brw_get_scratch_bo (Ken).

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d960284e44)
2016-06-15 09:29:12 +01:00
Francisco Jerez
b9f69df93d i965: Fix scratch overallocation if the original slot size was already a power of two.
The bitwise arithmetic trick used in brw_get_scratch_size() to clamp
the scratch allocation to 1KB has the unintended side effect that it
will cause us to allocate 2x the required amount of scratch space if
the original per-thread scratch size happened to be already a power of
two.  Instead use the obvious MAX2 idiom to clamp the scratch
allocation to the expected range.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 013ae4a70a)
2016-06-15 09:29:12 +01:00
Kenneth Graunke
eaa8561230 i965: Fix encode_slm_size() to take a generation, not a device info.
In the Vulkan driver, we have the generation number (a compile time
constant) but not necessarily the brw_device_info struct.  I meant
to rework the function to take a generation number instead of a
brw_device_info pointer to accomodate this.  But I forgot, and left
it taking a brw_device_info pointer, while making Vulkan pass the
generation number (8, 9, ...) directly.  This led to crashes.

Brown paper bag fix for commit 87d062a940.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96504
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5a0d294d38)
2016-06-15 09:29:12 +01:00
Kenneth Graunke
9edc2f1828 i965: Don't leak scratch BOs for TCS/TES.
These need to be freed too.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 667e5cec76)
2016-06-15 09:29:12 +01:00
Nanley Chery
5e41ac197f anv/pipeline: Don't dereference NULL dynamic state pointers
Add guards to prevent dereferencing NULL dynamic pipeline state. Asserts
of pCreateInfo members are moved to the earliest points at which they
should not be NULL.

This fixes a segfault seen in the McNopper demo, VKTS_Example09.

v3 (Jason Ekstrand):
   - Fix disabled rasterization check
   - Revert opaque detection of color attachment usage

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a4a5917248)
2016-06-15 09:29:12 +01:00
Nanley Chery
cdeb3e8eb4 anv: Document and rename anv_pipeline_init_dynamic_state()
To reduce confusion, clarify that the state being copied is not dynamic.

This agrees with the Vulkan spec's usage of the term. Various sections
specify that the various pipeline state which have VkDynamicState enums
(e.g. viewport, scissor, etc.) may or may not be dynamic.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a0d84a9ef9)
2016-06-15 09:29:12 +01:00
Samuel Pitoiset
0b71ef5e46 nvc0/ir: clamp the UBO index for compute on Kepler
We already check that the address is not "too far", but we should also
clamp the UBO index in order to avoid looking at the wrong place in the
driver cb. This is a pretty rare situation though.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7f257abc1b)
2016-06-15 09:29:12 +01:00
Jimmy Berry
c01ebdc83e st/va: hardlink driver instances to gallium_drv_video.so
Removes the need to set LIBVA_DRIVER_NAME=gallium for supported targets and is
consistent with vdpau and general gallium drivers.

Note: some versions of libva can detect the gallium name and use the
backend. Although that behaviour seems inconsistent since it only works
for some platforms/backends.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 0c0f841e5d)
2016-06-15 09:29:12 +01:00
Emil Velikov
501e8421f8 swr: automake: add missing -I flag
When building from a release tarball (where the generated/built files
are in srcdir) in an OOT fashion we need to have both builddir and
srcdir in the includes list.

Otherwise we'll error out, as the file (header gen_knobs.h in this case)
won't be in the location where we are looking.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit fcb5a75a66)
2016-06-15 09:29:12 +01:00
Emil Velikov
3162e2f9fc automake: add SWR to `make distcheck' gallium drivers
Will allows us to catch missing files and build issues before getting
the tarball out for general consumption.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f4d26856df)
2016-06-15 09:29:11 +01:00
Emil Velikov
766f852616 configure.ac: strip out the llvm-config -march/mtune flags
Otherwise drivers such as SWR that depend on providing their own values
will fail to build.

v2: Add -mcpu for good measure (Chuck)

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chuck Atkins <chuck.atkins@kitware.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
(cherry picked from commit bab5ab6940)
2016-06-15 09:29:11 +01:00
Chuck Atkins
b499d1062d swr: Add missing headers for package inclusion
CC: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit c86fcaca72)
2016-06-15 09:29:11 +01:00
Emil Velikov
939cd6edac automake: get in-tree `make distclean' working again.
With earlier commit we've handled the `make distclean' out of tree
build, yet we failed to attribute that for in-tree builds the test
condition will return 1. Thus effectively the target will be considered
as "failed".

Fixes: b7f7ec7843 ("mesa: automake: distclean git_sha1.h when building
OOT")
Cc: <mesa-stable@lists.freedesktop.org>
Tested-by: Andy Furniss <adf.lists@gmail.com>
Reported-by: Andy Furniss <adf.lists@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

(cherry picked from commit 8229fe68b5)
2016-06-15 09:29:11 +01:00
Kenneth Graunke
2b6817c91c i965: Use the correct number of threads for compute shaders.
We were programming the number of threads per subslice, when we should
have been programming the total number of threads on the GPU as a whole.

Thanks to Curro and Jordan for helping track this down!

On Skylake GT3e:
- Improves performance in Unreal's Elemental Demo by roughly 1.5-1.7x.
- Improves performance in Synmark's Gl43CSDof by roughly 3.7x.
- Improves performance in Synmark's Gl43GSCloth by roughly 1.18x.

On Broadwell GT2:
- Improves performance in Unreal's Elemental Demo by roughly 1.2-1.5x.
- Improves performance in Synmark's Gl43CSDof by roughly 2.0x.
- Improves performance in Synmark's Gl43GSCloth by 1.47035% +/-
  0.255654% (n=25).

On Haswell GT3e:
- Improves performance in Unreal's Elemental Demo (in GL 4.3 mode)
  by roughly 1.10x.
- Improves performance in Synmark's Gl43CSDof by roughly 1.18x.
- Decreases performance in Synmark's Gl43CSCloth by -1.99484% +/-
  0.432771% (n=64).

On Ivybridge GT2:
- Improves performance in Unreal's Elemental Demo (in GL 4.2 mode)
  by roughly 1.03x.
- Improves performance in Synmark's G/43CSDof by roughly 1.25x.
- No change in Synmark's Gl43CSCloth (n=28).

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 0fb85ac08d)
2016-06-15 09:29:11 +01:00
Kenneth Graunke
9a118c79e7 i965: Assert that the scratch spaces are in range.
I don't know that anything actually guarantees this, but if we exceed
the limits, we may end up overflowing and trashing random buffers that
happen to be nearby in the VMA space, leading to rendering corruption,
hangs, or worse.

We should really fix this properly.  However, the pitfall has existed
for ages, so for now we should at least detect it.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 1db37ebecf)
2016-06-15 09:29:11 +01:00
Kenneth Graunke
be426c46ab i965: Fix CS scratch size calculations on Ivybridge and Baytrail.
These are linear, not powers of two, and much more limited.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit a42a93dc12)
2016-06-15 09:29:11 +01:00
Kenneth Graunke
02f381bb17 i965: Fix Haswell CS per-thread scratch space encoding.
Most scratch stages use power of two sizes, in kilobytes, where
0 means 1kB.  But compute shaders on Haswell have a minimum of 2kB,
and use a representation where 0 = 2kB.

This meant that we were effectively telling the hardware to allocate
each thread twice as much space as we meant to, while simultaneously
not allocating that much space in the buffer, leading to overflows.

Note that the existing code is completely wrong for Ivybridge,
but that will take additional work to sort out, so I've left it
as is for now.  A subsequent commit will take care of that.

Together with the previous patches, this fixes rendering corruption
on Synmark's Gl43CSDof on Haswell.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 147a90d82a)
2016-06-15 09:29:11 +01:00
Kenneth Graunke
e84116f364 i965: Account for poor address calculations in Haswell CS scratch size.
Curro figured this out by investigating the simulator.  Apparently
there's also a workaround in the Windows driver.  I'm not sure it's
actually documented anywhere.

We were underallocating the scratch buffer by a factor of 128/70.

v2: Rename threads_per_subslice to scratch_ids_per_subslice
    (suggested by Jordan Justen).

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit a7d029d3df)
2016-06-15 09:29:11 +01:00
Kenneth Graunke
6c5c1bc1b9 i965: Allocate scratch space for the maximum number of compute threads.
We were allocating enough space for the number of threads per subslice,
when we should have been allocating space for the number of threads in
the entire GPU.

Even though we currently run with a reduced thread count (due to a bug),
we might still overflow the scratch buffer because the address
calculation is based on the FFTID, which can depend on exactly which
threads, EUs, and threads are executing.  We need to allocate enough
for every possible thread that could run.

Fixes rendering corruption in Synmark's Gl43CSDof on Gen8+.
Earlier platforms need additional bug fixes.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 2213ffdb4b)
2016-06-15 09:29:10 +01:00
Kenneth Graunke
fdcc6a855b i965: Set subslice_total on Gen7/7.5 platforms.
We'll use this for compute shader thread counts and scratch space
calculations shortly.

Note that subslices are referred to as "half slices" on Ivybridge.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 9cd8f95809)
2016-06-15 09:29:10 +01:00
Kenneth Graunke
c9477e0a80 i965: Fix shared local memory size for Gen9+.
Skylake changes the representation of shared local memory size:

 Size   | 0 kB | 1 kB | 2 kB | 4 kB | 8 kB | 16 kB | 32 kB | 64 kB |
 -------------------------------------------------------------------
 Gen7-8 |    0 | none | none |    1 |    2 |     4 |     8 |    16 |
 -------------------------------------------------------------------
 Gen9+  |    0 |    1 |    2 |    3 |    4 |     5 |     6 |     7 |

The old formula would substantially underallocate the amount of space.
This fixes GPU hangs on Skylake when running with full thread counts.

v2: Fix the Vulkan driver too, use a helper function, and fix the table
    in the comments and commit message.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 87d062a940)
2016-06-15 09:29:10 +01:00
Ilia Mirkin
8d9bf67bba mesa: add drawbuffer argument to ClearNamedFramebufferfi
This was fixed in revision 47 of the ARB_dsa spec in Oct 22, 2015. Since
it's horrible to have differing APIs across library versions, we should
attempt to minimize the impact by backporting it as far as possible and
hope no one notices.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7d7e015381)
2016-06-15 09:29:10 +01:00
Ilia Mirkin
ca009cf8ba GL: update glcorearb.h to svn 32433
This brings in the fixed glClearNamedFramebufferfi definition, as well
as a lot of GLsizei -> GLsizeiptr changes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 92351a71a8)
2016-06-15 09:29:10 +01:00
Ilia Mirkin
7487d5cbdc GL: update glext to svn 32957
This brings in defines from GL_EXT_window_rectangles and fixes the
glClearNamedFramebufferfi definition.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f81374fd3e)
2016-06-15 09:29:10 +01:00
Anuj Phogat
72dbdf6f89 gallium: Fix region overlap conditions for rectangles with a shared edge
>From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels":
"The pixels corresponding to these buffers are copied from the source
rectangle bounded by the locations (srcX0, srcY 0) and (srcX1, srcY 1)
to the destination rectangle bounded by the locations (dstX0, dstY 0)
and (dstX1, dstY 1). The lower bounds of the rectangle are inclusive,
while the upper bounds are exclusive."

So, the rectangles sharing just an edge shouldn't overlap.
 -----------
|           |
 ------- ---
|       |   |
|       |   |
 ------- ---

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 466b320163)
2016-06-15 09:29:10 +01:00
Anuj Phogat
dd1943f904 mesa: Fix region overlap conditions for rectangles with a shared edge
>From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels":
"The pixels corresponding to these buffers are copied from the source
 rectangle bounded by the locations (srcX0, srcY 0) and (srcX1, srcY 1)
 to the destination rectangle bounded by the locations (dstX0, dstY 0)
 and (dstX1, dstY 1). The lower bounds of the rectangle are inclusive,
 while the upper bounds are exclusive."

So, the rectangles sharing just an edge shouldn't overlap.
     -----------
    |           |
     ------- ---
    |       |   |
    |       |   |
     ------- ---

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit f8679badd4)
2016-06-15 09:29:10 +01:00
Jason Ekstrand
6eb0240a32 anv/entrypoints: Rework #if guards
This reworks the #if guards a bit.  When Emil originally wrote them, he
just guarded everything.  However, part of what anv_entrypoints_gen.py
generates is a hash table for looking up entrypoints based on their name.
This table *cannot* get out of sync between C and python regardless of
preprocessor flags.  In order to prevent this, this commit makes us use
void pointers in the dispatch table for those entrypoints which aren't
available.  This means that the dispatch table size and entry order is
constant and it should never get out-of-sync with the python.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8d37556ec9)
2016-06-15 09:29:10 +01:00
Jason Ekstrand
eb0197ad53 anv/entrypoints: Use the function pointer types provided by vulkan.h
This is a bit cleaner than generating the types ourselves when making the
table.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9ed0d9dd06)
2016-06-15 09:29:09 +01:00
Jason Ekstrand
242ac96a24 anv/entrypoints: Emit #if guards for all platforms
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d1a53f91ee)
2016-06-15 09:28:54 +01:00
Nicolai Hähnle
c03b4444d1 st/mesa: use base level size as "guess" when available
When an applications specifies mip levels _before_ setting a mipmap texture
filter, we will initially guess a single texture level. When the second level
image is created, we try to allocate the full texture -- however, we get the
base level size guess wrong if that size is odd. This leads to yet another
re-allocation of the texture later during st_finalize_texture.

Even worse, this re-allocation breaks a (reasonable) assumption made by
st_generate_mipmaps, because the re-allocation in the finalization call will
again allocate a single-level pipe texture (based on the non-mipmap texture
filter!). As a result, mipmap generation fails in interesting ways.

All of this can be avoided by just using the fact that we already know the
size of the base level.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95529
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 42624ea837)
2016-06-14 15:48:40 +01:00
Jason Ekstrand
ad684cee3a anv: Remove the PhysicalDeviceLimits FINISHME
At this point, the limits are probably more-or-less correct.  If there is
an invalid limit, that's a bug not a FINSHME.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a1e69930e4)
2016-06-14 15:48:40 +01:00
Jason Ekstrand
ea24c9be4a anv/pipeline_cache: Allow for an zero-sized cache
This gets ANV_ENABLE_PIPELINE_CACHE=false working again.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4f5bbf804b)
2016-06-14 15:48:40 +01:00
Jason Ekstrand
86dbf1ef4b anv/pipeline: Store the (set, binding, index) tripple in the bind map
This way the the bind map (which we're caching) is mostly independent of
the pipeline layout.  The only coupling remaining is that we pull the array
size of a binding out of the layout.  However, that size is also specified
in the shader and should always match so it's not really coupled.  This
rendering issues in Dota 2.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a1a25db699)
2016-06-14 15:48:40 +01:00
Jason Ekstrand
b1f217b5a9 anv/descriptor_set: Ensure that bindings are always in increasing order
Since applications are allowed to specify some set of bindings which need
not be dense they also need not be in order.  For most things, this doesn't
matter, but it could result getting the wrong dynamic offsets. This adds a
quick-and-dirty sort to ensure that everything is always in increasing
order of binding index.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c13c5ac561)
2016-06-14 15:48:40 +01:00
Jason Ekstrand
a0be8d3d08 anv/descriptor_set: Add a type field in debug builds
This allows for some extra validation and makes it easier to see what's
going on when poking around in gdb.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e2265926f2)
2016-06-14 15:48:40 +01:00
Jason Ekstrand
901c78786f anv/descriptor_set: Set array_size to zero for non-existant descriptors
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cd21015abd)
2016-06-14 15:48:40 +01:00
Leo Liu
986159437d vl/dri3: support receiving new pixmap for front buffer
With glx of gstreamer-vaapi, the temporary pixmap for front buffer gets
renewed in each frame, so when we receive a new pixmap, should get a new
front buffer for it.

This also fixes Totem player playback corruption.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2ad443e4cc)
2016-06-14 15:48:39 +01:00
Leo Liu
ab75b22029 vl/dri3: get Makefile properly
From original commit, the macro "if HAVE_DRI3" was in Makefile.sources,
this file is shared with SCons, SCons is not able to parse this marco,
the SCons build failed. Jose quickly gave two approaches and quick fix
with his second approach, thanks Jose for the solutions and fixes.

This patch is Jose's first approach, and it's more proper, because the
dri3 c file should not be included to build when DRI3 is not enabled.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0ef8500aab)
2016-06-14 15:48:39 +01:00
Daniel Czarnowski
5cae2ac47e glx: fix crash with bad fbconfig
GLX documentation states:
	glXCreateNewContext can generate the following errors: (...)
	GLXBadFBConfig if config is not a valid GLXFBConfig

Function checks if the given config is a valid config and sets proper
error code.

Fixes currently crashing glx-fbconfig-bad Piglit test.

v2: coding style cleanups (Emil, Topi)
    use DefaultScreen macro (Emil)

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cf804b4455)
2016-06-14 15:48:39 +01:00
Jason Ekstrand
7d515b26bb i965: Emit surface states for extra planes prior to gen8
When Kristian implemented GL_TEXTURE_EXTERNAL_OES, he hooked it up for gen8
but not for gen7 or earlier.  It all works, we just need to emit the states
for the extra planes.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 037ce5d734)
2016-06-14 15:48:39 +01:00
Marc-André Lureau
0e554f54dc virgl: fix checking fences
When calling virgl_fence_wait() with timeout=0,
virgl_{drm,vtest}_resource_is_busy() is called. However, it returns TRUE
for a busy resource, whereace virgl_fence_wait() should return TRUE for
a completed (non-busy) resource.

This fixes running supertuxkart in a VM (I could not reproduce locally
with vtest though there is a similar fix)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit dc81b3ad43)
2016-06-14 15:48:39 +01:00
Nicolai Hähnle
201f357c52 st/mesa: directly compute level=0 texture size in st_finalize_texture
The width0/height0/depth0 on stObj may not have been set at this point.
Observed in a trace that set up levels 2..9 of a 2d texture, and set the base
level to 2, with height 1. This made the guess logic always bail.

Originally investigated by Ilia Mirkin, this patch gets rid of the somewhat
redundant storage of width0/height0/depth0 and makes sure we always compute
pipe texture sizes that are compatible with the base level image of the
GL texture.

Fixes the gl-1.2-texture-base-level piglit test provided by Brian Paul.

v2:
- try to re-use an existing pipe texture when possible
- handle a corner case where the base level is not level 0 and it is of
  size 1x1x1

v3:
- ptHeight = ptWidth in cube map 1x1 case (suggested by Brian)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit bd5c41fe5f)
2016-06-14 15:48:39 +01:00
Ilia Mirkin
bf3d6d9601 st/mesa: use buffer usage history to set dirty flags for revalidation
We were previously unconditionally doing this for arrays and ubo's, and
ignoring texture/storage/atomic buffers. Instead use the usage history
to determine which atoms need to be revalidated.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6e6fd911da)
2016-06-14 15:48:39 +01:00
Marek Olšák
f51e99f704 gallium/radeon: don't allocate DCC for non-renderable texture formats
R9G9B9E5 is the only uncompressed one hopefully.

This fixes incorrect rendering not discovered (due to a lack of tests)
until DCC mipmapping was enabled.

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit d4d733e39d)
2016-06-14 15:48:39 +01:00
Nicolai Hähnle
b2afa23a40 tgsi/scan: add uses_derivatives (v2)
v2:
- TG4 does not calculate derivatives (Ilia)
- also handle SAMPLE* instructions (Roland)

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit d3a584defe)
2016-06-14 15:48:39 +01:00
Ilia Mirkin
6f38259419 st/mesa: revalidate image atoms when a texture is updated
A texture may be redefined with _NEW_TEXTURE, which might have been
bound to a shader image slot. We have to revalidate the image atoms to
pick up on the new resource.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c81b090c92)
2016-06-14 15:48:39 +01:00
Ilia Mirkin
bba2299735 gk104/ir: fix conditions for adding a texbar
Sometimes a register source can actually be double- or even quad-wide.
We must make sure that the inserted texbars take that width into
account.

Based on an earlier patch by Samuel Pitoiset.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 71ad8a173f)
2016-06-14 15:48:38 +01:00
Dave Airlie
49c53a2987 i965/gen8: fix cull distance emission for tessellation shaders.
This fixes some cases of:
GL45-CTS.cull_distance.functional
on Skylake.

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c295923d13)
2016-06-14 15:48:38 +01:00
Samuel Pitoiset
4306e01ece nv50/ir: use round toward 0 when converting doubles to integers
Like floats, we should use the round toward 0 mode instead of the
nearest one (which is the default) for doubles to integers.

This fixes all arb_gpu_shader_fp64 piglits which convert doubles to
integers (16 tests).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 08ddfe7b2f)
2016-06-14 15:48:38 +01:00
Dave Airlie
b9920d2bba mesa/program_resource: return -1 for index if no location.
The GL4.5 spec quote seems clear on this:
"The value -1 will be returned by either command if an error occurs,
if name does not identify an active variable on programInterface,
or if name identifies an active variable that does not have a valid
location assigned, as described above."

This fixes:
GL45-CTS.program_interface_query.output-built-in

[airlied: use _mesa_program_resource_location_index as
suggested by Eduardo]
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

(cherry picked from commit 07403014c3)
2016-06-14 15:48:38 +01:00
Nicolai Hähnle
9bf30be693 radeonsi: set descriptor dirty mask on shader buffer unbind
Found randomly while skimming the code. This might have caused VM faults in
robustness tests.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit ec2b52e2d9)
2016-06-14 15:48:38 +01:00
Samuel Iglesias Gonsálvez
05d33806cd i965/gs/scalar: Fix load input for doubles
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2b648ec17c)
2016-06-14 15:48:38 +01:00
Samuel Iglesias Gonsálvez
507d25f6f1 i965/fs: fix offset when loading double vector input varyings
When we are not packing a double input varying, we might need to
read its data in a non-aligned to 64-bit offset, so we read
the wrong data. This is happening when using explicit locations
in varyings because Mesa disables packing varying for that case.

const_index is in 32-bit size units but offset() is multiplying
it by destination type size units. When operating with double
input varyings, const_index value could be not aligned to 64 bits.
To fix it, we load the double vector as if it was a float based vector
with twice the number of components.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2d6f82a294)
2016-06-14 15:48:38 +01:00
Samuel Iglesias Gonsálvez
4daa331e25 i965/fs: fix FS_OPCODE_CINTERP for unpacked double input varyings
Data starts at suboffet 3 in 32-bit units (12 bytes), so it is not
64-bit aligned and the current implementation fails to read the data
properly. Instead, when there is is a double input varying, read it as
vector of floats with twice the number of components.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cb30727648)
2016-06-14 15:48:38 +01:00
Dave Airlie
fc0a469e4c glsl: geom shader max_vertices layout must match.
From GLSL 4.5 spec, "4.4.2.3 Geometry Outputs".
"all geometry shader output vertex count declarations in a
program must declare the same count."

Fixes:
GL45-CTS.geometry_shader.output.conflicted_output_vertices_max

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4c86399378)
2016-06-14 15:48:38 +01:00
Dave Airlie
0ce3dc9a30 i965: don't use NumLayers for 3D textures.
For 3D textures we shouldn't be using NumLayers, we need
to get it from the depth.

This fixes:
GL45-CTS.geometry_shader.layered_framebuffer.clear_call_support

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ff2e569153)
2016-06-14 15:48:37 +01:00
Dave Airlie
89bc5f9a90 glsl: for anonymous struct matching use without_array() (v3)
With tessellation shaders we can have cases where we have
arrays of anon structs, so make sure we match using without_array().

Fixes:
GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_in

v2:
test lengths match as well (Ilia)
v3:
descend array lengths to check for matches as well (Ilia)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1f66a4b689)
2016-06-14 15:48:37 +01:00
Dave Airlie
09f48203c5 glsl/ast: don't crash when func_name is NULL
This fixes a crash in
GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types

If we can't find the func_name in one of these paths,
we have emitted an earlier error so just return here.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 6702c15810)
2016-06-14 15:48:37 +01:00
Dave Airlie
997bcc45ec glsl: handle ast_aggregate in has_sequence_subexpression. (v2)
GL43-CTS.compute_shader.work-group-size does
uniform uint g_uniform[gl_WorkGroupSize.z + 20] = { 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24 };

The initializer triggers the GLSL 4.30/GLES3 tests
for constant sequence subexpressions, so it doesn't
happen unless you are using those, so just return
false as this path is now reachable.

v2: update commit msg with diagnosis
Acked-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4336196b7f)
2016-06-14 15:48:37 +01:00
Ilia Mirkin
a0e36438a8 nv50,nvc0: fix BGR10_A2UI vertex format
This is mostly academic as this is not reachable from GL, which only has
the packed RGB10_A2UI vertex format.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 092ec3920f)
2016-06-14 15:48:37 +01:00
Samuel Pitoiset
954829ebbb nvc0: do not clear surfaces bins in the validate function
We should not call nouveau_bufctx_reset() inside a validate function.
This only affects Fermi where images are aliased between 3D and CP.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit be365f34f0)
2016-06-14 15:48:37 +01:00
Samuel Pitoiset
f12a16ec99 nvc0: re-validate images after launching a grid on Fermi
Images invalidation is a bit weird on Fermi and there is already a hack
which forces invalidating all images when launching a computer shader
to help in fixing 3D<->CP interaction.

However, we need to re-validate images for compute because
nvc0_compute_invalidate_surfaces() will destroy the previous binding.
This is not really good for performance purposes but this might be
improved later.

This fixes the following piglits:
- spec/arb_compute_shader/execution/basic-uniform-access
- spec/arb_compute_shader/execution/mutiple-texture-reading
- spec/arb_compute_shader/execution/multiple-workgroups
- spec/glsl-4.30/execution/built-in-functions/cs-* (207 tests)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 43d3ecfb33)
2016-06-14 15:48:37 +01:00
Ilia Mirkin
ceb9ed0e38 nvc0: reduce overhead from always marking images dirty
We would revalidate images when anything was touched at all. Which is
unfortunate, since the state tracker does not use CSO's to reduce the
workload. So instead implement a protocol to ensure that something has
changed before revalidating all the images.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fd6bbc2ee2)
2016-06-14 15:48:37 +01:00
Ilia Mirkin
5a63ae9f15 nvc0: reduce overhead from always marking buffers dirty
We would revalidate buffers when anything was touched at all. Which is
unfortunate, since the state tracker does not use CSO's to reduce the
workload. So instead implement a protocol to ensure that something has
changed before revalidating all the SSBOs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0f673db6f0)
2016-06-14 15:48:37 +01:00
Ilia Mirkin
a95560bac5 nvc0: fix memory barrier flag handling
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e8ee161b16)
2016-06-14 15:48:37 +01:00
Ilia Mirkin
1adbe2f45c nvc0: mark bound buffer range valid
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 29abbeecd8)
2016-06-14 15:48:37 +01:00
Marek Olšák
ccc9783a98 r600g: write WAIT_UNTIL in the correct place
This has been wrong all along. Fixing this will allow removing useless
cache flushes.

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
(cherry picked from commit 7746903d3a)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
c632590996 anv/blit: Use CLAMP_TO_EDGE for scaled blits
When upscaling you can end up interpolating between the edge pixel and one
past the edge.  Using CLAMP_TO_EDGE seems like the most reasonable thing to
do in this case.  This fixes two of the new Vulkan CTS tests in
dEQP-VK.api.copy_and_blit.blit_image.*

Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>

(cherry picked from commit 441194edd9)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
2830ae638c anv/copy: Account for the anv_surface.offset when creating a blit2d_surf
This was causing problems if the user tried to copy to/from the stencil
portion of a combined depth/stencil image.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9313a56816)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
5ca18b6a4b nir/spirv: Make a decoration switch complete
Getting rid of the default case makes the compiler warn if we are missing
cases.  While we're here, we also add the one missing case.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 526a8de22d)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
bb4ff53a71 nir/spirv: Make unhandled decorations and capabilities non-fatal
glslang frequently throw bogus decorations into shaders.  While we are free
to assert-fail, it's a bit nicer to the application to just warn.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 62c6e94bd6)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
d5dc87a1ef nir/spirv: Add a way to print non-fatal warnings
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ed14d21d04)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
85579221a4 nir/spirv: Add string lookup tables for a couple of SPIR-V enums
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2e46a5d155)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
6eb39fa255 nir/spirv: Complete the list of capabilities
Previously we supported a subset of capabilities and just left a default
case for the others.  It's time to stop being lazy and actually audit the
capabilities.  This should bring them up-to-date with reality.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5a1e56f344)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
13999dc70d anv/pipeline: Add support for early depth stencil
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9fa958e95b)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
a3fce26907 i965/fs Add a wm_prog_data bit for has_side_effects
This is more accurate than calling
_mesa_active_fragment_shader_has_side_effects because it looks at whether
or not the SSBOs, images, or atomic buffers are actually written rather
than just existing in the program.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3fb289f957)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
7ddbe91435 anv/pipeline: Silently pass tests if depth or stencil is missing
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 56a178922f)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
c6ca6f0728 anv/pipeline: Unify gen7/8 emit_ds_state
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bc7f7e1953)
2016-06-14 15:48:35 +01:00
Jason Ekstrand
300737042c genxml/gen6,7,75: s/BackFace/Backface
This is more consistent with gen8+

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fdc3c5dd05)
2016-06-14 15:48:35 +01:00
Jason Ekstrand
09b2be7a51 nir/spirv: Handle the WorkgroupSize builtin decoration
This fixes the 7 dEQP-VK.pipeline.spec_constant.compute.local_size.* tests
in the latest dev version of the Vulkan CTS.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1f7b54ed29)
2016-06-14 15:48:35 +01:00
Jason Ekstrand
6efd37f30d nir/spirv: Use breaks instead of returns in constant handling
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b26cdd65e8)
2016-06-14 15:48:35 +01:00
Jason Ekstrand
af2a278dfe anv/pipeline: Refactor specialization constant handling a bit
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a19ae36ce5)
2016-06-14 15:48:35 +01:00
Jason Ekstrand
07f5e621cf nir/lower_indirect_derefs: Use the direct array deref for recursion
This fixes about 100 of the new Vulkan CTS tests.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 45542f554c)
2016-06-14 15:48:35 +01:00
Jason Ekstrand
d0dddbf4ee anv/clear: Handle ClearImage on 3-D images
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 59f06ac389)
2016-06-14 15:48:35 +01:00
Francisco Jerez
cbb02ebd74 Revert "i965/fs: Allow scalar source regions on SNB math instructions."
This reverts commit c1107cec44.
Apparently the hardware spec text I quoted in the commit message was
outright lying about scalar source math being supported on SNB, the
hardware seems to load 32 contiguous bits of data for each channel
regardless of the regioning mode.  Fixes regressions in the following
CTS tests (which we didn't catch early due to CTS being temporarily
disabled in our CI system):

   es2-cts.gtf.gl.atan.atan_vec3_frag_xvary
   es2-cts.gtf.gl.cos.cos_vec2_frag_xvary
   es2-cts.gtf.gl.atan.atan_vec2_frag_xvary
   es2-cts.gtf.gl.pow.pow_vec2_frag_xvary_yconsthalf
   es2-cts.gtf.gl.cos.cos_float_frag_xvary
   es2-cts.gtf.gl.pow.pow_float_frag_xvary_yconsthalf
   es2-cts.gtf.gl.atan.atan_vec3_frag_xvaryyvary
   es2-cts.gtf.gl.pow.pow_vec3_frag_xvary_yconsthalf
   es2-cts.gtf.gl.cos.cos_vec3_frag_xvary
   es2-cts.gtf.gl.atan.atan_vec2_frag_xvaryyvary

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96346
Reported-by: Mark Janes <mark.a.janes@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 7244dc1e06)
2016-06-14 15:48:35 +01:00
Francisco Jerez
d5528d5f55 i965/vec4: Fix cmod propagation not to propagate non-identity cmod into CMP(N).
The conditional mod of these instructions determines the semantics of
the comparison itself (rather than being evaluated based on the result
of the instruction as is usually the case for most other instructions
that allow conditional mods), so it's in general not legal to
propagate a conditional mod into a CMP instruction.  This prevents
cmod propagation from (mis)optimizing:

 cmp.z.f0 tmp, ...
 mov.z.f0 null, tmp

into:

 cmp.z.f0 tmp, ...

which gives the negation of the flag result of the original sequence.
I originally noticed this while working on SIMD32 in the scalar
back-end, but the same scenario is likely to be possible in vec4
programs so this commit ports the bugfix with the same name from the
scalar back-end to the vec4 cmod propagation pass.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit a2135c6fd9)
2016-06-14 15:48:35 +01:00
Emil Velikov
0e540b4a15 anv: add the X related and Wayland CFLAGS to VULKAN_ENTRYPOINT_CPPFLAGS
Otherwise we will fail to find the headers in some scenarios.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
(cherry picked from commit 7a3a0d9212)
2016-06-14 15:48:35 +01:00
Dave Airlie
911eddd37b mesa/get: return correct value for layer provoking vertex.
This fixes:
GL45-CTS.geometry_shader.layered_rendering.layered_rendering

on Skylake.

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d10ae20b96)
2016-06-10 12:36:36 +01:00
Samuel Pitoiset
2185edf699 nvc0: mark buffer texture range valid for shader images
Loosely based on radeonsi (Thanks to Nicolai).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 28590eb949)
2016-06-10 12:35:15 +01:00
Francisco Jerez
09f0e97d1c i965/fs: Reindent emit_zip().
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 060c8d245d)
2016-06-10 12:34:19 +01:00
Francisco Jerez
2db670cf3e i965/fs: Skip SIMD lowering destination zipping if possible.
Skipping the temporary allocation and copy instructions is easy (just
return dst), but the conditions used to find out whether the copy can
be optimized out safely without breaking the program are rather
complex: The destination must be exactly one component of at most the
execution width of the lowered instruction, and all source regions of
the instruction must be either fully disjoint from the destination or
be aligned with it group by group.

v2: Don't handle partial source-destination overlap for simplicity
    (Jason).  No instruction count regressions with respect to v1 in
    either shader-db or the few FP64 shader_runner test-cases with
    partial overlap I've checked manually.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 7aa76d66a1)
2016-06-10 12:33:14 +01:00
Anuj Phogat
5000556d5d blorp: Fix 16x multisample scaled blits
Piglit test ext_framebuffer_multisample_blit_scaled-blit-scaled
(with added 16x sample support) now passes with this patch.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 75da9c9933)
2016-06-10 12:32:17 +01:00
Dave Airlie
ab525a637a mesa/copyimage: report INVALID_VALUE for missing cube face
The specs says INVALID_VALUE for exceeding dimensions,
which is really what is happening here.

This fixes:
GL45-CTS.copy_image.non_existent_mipmap

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit af7bf610cf)
2016-06-10 12:31:22 +01:00
Dave Airlie
d130c53ac1 mesa/copyimage: fix num samples check to handle renderbuffers.
This test was only happening for textures, but there is
nothing in the spec to say this, so test it for all cases.

This fixes:
GL45-CTS.copy_image.invalid_target

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c0856eacf1)
2016-06-10 12:30:23 +01:00
Nanley Chery
669836e1be mesa/extensions: Fix ES1 extension reporting
Commit eda15abd84 , unintentionally
advertised these extensions in ES1 contexts. Undo this error.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c06cef7f9b)
2016-06-10 12:26:10 +01:00
Eric Engestrom
1dce03e4c1 st/osmesa: remove double-write (overwriting)
These two lines have been here since the file was created.
I'm guessing the second one was just for testing during dev, so it's the
one that's going away.

CoverityID: 1296205

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 17f4c723eb)
2016-06-10 12:08:02 +01:00
Emil Velikov
a7649abe9f Update version to 12.0.0-rc2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-07 12:35:59 +01:00
Emil Velikov
bcfda0a1fe mesa: automake: distclean git_sha1.h when building OOT
In the case of out-of-tree (OOT) builds, in particular when building
from tarball, we'll end up with the file in both srcdir and builddir.

We want the former to remain intact (since we need it on rebuild) while
the latter should be removed otherwise `make distclean' gets angry at
us.

Ideally there'll be a solution that feels a bit less of a hack. Until
then this does the job exactly as expected.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b7f7ec7843)
2016-06-07 12:35:53 +01:00
Emil Velikov
998e503592 mesa: automake: ensure that git_sha1.h.tmp has the right attributes
... when copied from git_sha1.h.

As the latter file can we lacking the write attribute, one should set it
explicitly. Otherwise we'll get a warning/failure at cleanup stage.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 2c424e00c3)
2016-06-07 12:35:50 +01:00
Emil Velikov
5e3e292502 mesa: automake: add directory prefix for git_sha1.h
Otherwise the build will assume that we've talking about builddir, which
is not the case in the else statement.

Here the file is already generated and is part of the tarball.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 359d9dfec3)
2016-06-07 12:35:46 +01:00
Emil Velikov
3be5c6a9ec egl: android: don't add the image loader extension for !render_node
With earlier commit we introduced support for render_node devices, which
was couples with the use of the image loader extension.

As the work was inspired by egl/wayland we (erroneously) added the
extension for the !render_node path as well.

That works for wayland, as the implementations of the DRI2 and IMAGE
loader extensions converge behind the scenes. As that is not yet
the case for Android we shouldn't expose the extension.

Fixes: 34ddef39ce ("egl: android: add dma-buf fd support")

Cc: <mesa-stable@lists.freedesktop.org>
Reported-by: Mauro Rossi <issor.oruam@gmail.com>
Tested-by: Mauro Rossi <issor.oruam@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 1816c837c1)
2016-06-07 12:35:40 +01:00
Emil Velikov
a26ca04fe3 anv: let anv_entrypoints_gen.py generate proper Wayland/Xcb guards
The generated sources should follow the example set by the vulkan
headers and our non-generated code. Namely: the code for all supported
platforms should be available, each one guarded by its respective
VK_USE_PLATFORM_*_KHR macro.

v2: Reword commit message.

Cc: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96285
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1 over IRC)
(cherry picked from commit b8e1f59d62)
2016-06-03 01:44:56 +01:00
Mauro Rossi
1a5d6a232f isl: add support for Android libmesa_isl static library
isl library is needed to build i965, libmesa_isl static library is added
to fix related Android building errors.

Any attempt to build libmesa_genxml as phony package module failed to deliver
gen{7,75,8,9}_pack.h generated headers, needed for libmesa_isl_gen{7,75,8,9}

Due to constraints in Android Build System, libmesa_genxml is built as static,
at least one source is needed, so dummy.c is autogenerated for this scope,
libmesa_genxml dependency is declared using LOCAL_WHOLE_STATIC_LIBRARIES,
to avoid building errors due to missing genxml/gen{7,75,8,9}_pack.h headers.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 278c2212ac)
2016-06-02 22:35:29 +01:00
Mauro Rossi
702a1121c9 android: libmesa_glsl: add a dependency on libmesa_nir static
Fixes the following building error:

target  C++: libmesa_glsl <= external/mesa/src/compiler/glsl/glsl_to_nir.cpp
In file included from external/mesa/src/compiler/glsl/glsl_to_nir.h:28:0,
                 from external/mesa/src/compiler/glsl/glsl_to_nir.cpp:28:
external/mesa/src/compiler/nir/nir.h:42:25: fatal error: nir_opcodes.h: No such file or directory
compilation terminated.
build/core/binary.mk:432: recipe for target 'out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_glsl_intermediates/glsl/glsl_to_nir.o' failed
make: *** [out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_glsl_intermediates/glsl/glsl_to_nir.o] Error 1
make: *** Waiting for unfinished jobs....

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 4143245c23)
2016-06-02 22:35:29 +01:00
Emil Velikov
9a21315ea9 isl: automake: don't include isl_format_layout.c in two lists.
Including the file in both ISL_FILES and ISL_GENERATED_FILES makes
the actual dependency list less obvious.

v2: Drop unrelated vulkan hunk (Jason).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit af1a0ae8ce)
2016-06-02 22:35:29 +01:00
Emil Velikov
94630ce0c7 automake: bring back the .PHONY git_sha1.h.tmp rule
With earlier commit 3689ef32af ("automake: rework the git_sha1.h rule,
include in tarball") we/I erroneously removed the PHONY rule and the
temporary file.

The former is used to ensure that the header is regenerated when on each
make invocation, while the latter helps us avoid the unneeded rebuild(s)
when the SHA1 hasn't changed.

Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit af2637aa32)
2016-06-02 22:35:29 +01:00
Christian König
6ad61d90ea radeon/uvd: fix the H264 level for Tonga v2
We support 5.2 for a while now.

v2: we even support 5.2 for H264, 5.1 is for HEVC.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b3e75c3997)
2016-06-02 14:04:14 +01:00
Jordan Justen
a136b8bfe2 i965: Remove old CS local ID handling
The old method pushed data for each channels uvec3 data of
gl_LocalInvocationID.

The new method pushes 1 dword of data that is a 'thread local ID'
value. Based on that value, we can generate gl_LocalInvocationIndex
and gl_LocalInvocationID with some calculations.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 0a3acff5b5)
2016-06-02 14:02:05 +01:00
Jordan Justen
52ba7abe1e i965: Enable cross-thread constants and compact local IDs for hsw+
The cross thread constant support appears on Haswell. It allows us to
upload a set of uniform data for all threads without duplicating it
per thread.

One complication is that cross-thread constants are loaded into
registers before per-thread constants. Previously, our local IDs were
loaded before the uniform data and treated as 'payload' data, even
though they were actually pushed into the registers like the other
uniform data.

Therefore, in this patch we simultaneously enable a newer layout where
each thread now uses a single uniform slot for a unique local ID for
the thread. This uniform is handled specially to make sure it is added
last into the uniform push constant registers. This minimizes our
usage of push constant registers, and maximizes our ability to use
cross-thread constants for registers.

To swap from the old to the new layout, we also need to flip some
lowering pass switches to let our driver handle the lowering instead.
We also no longer force thread_local_id_index to -1.

v4:
 * Minimize size of patch that switches from the old local ID layout
   to the new layout (Jason)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit b1f22c6317)
2016-06-02 14:01:31 +01:00
Jordan Justen
28ecf2b90e anv: Support new local ID generation & cross-thread constants
The cross thread constant support appears on Haswell. It allows us to
upload a set of uniform data for all threads without duplicating it
per thread.

We also support per-thread data which allows us to store a per-thread
ID in one of the uniforms that can be used to calculate the
gl_LocalInvocationIndex and gl_LocalInvocationID variables.

v4:
 * Support the old local ID push constant layout as well (Jason)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 3ba9594f32)
2016-06-02 14:01:04 +01:00
Jordan Justen
ead833a395 i965: Support new local ID push constant & cross-thread constants
The cross thread constant support appears on Haswell. It allows us to
upload a set of uniform data for all threads without duplicating it
per thread.

We also support per-thread data which allows us to store a per-thread
ID in one of the uniforms that can be used to calculate the
gl_LocalInvocationIndex and gl_LocalInvocationID variables.

v4:
 * Support the old local ID push constant layout as well (Jason)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 30685392e0)
2016-06-02 13:59:04 +01:00
Jordan Justen
ee77c4a099 i965: Add CS push constant info to brw_cs_prog_data
We need information about push constants in a few places for the GL
driver, and another couple places for the vulkan driver.

When we add support for uploading both a common (cross-thread) set of
push constants, combined with the previous per-thread push constant
data, things are going to get even more complicated. To simplify
things, we add push constant info into the cs prog_data struct.

The cross-thread constant support is added as of Haswell. To support
it we need to make sure all push constants with uniform values are
added to earlier registers. The register that varies per thread and
holds the thread invocation's unique local ID needs to be added last.

For now we add the code that would calculate cross-thread constatn
information for hsw+, but we force it (cross_thread_supported) off
until the other parts of the driver support it.

v4:
 * Support older local ID push constant layout as well. (Jason)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit d437798ace)
2016-06-02 13:56:54 +01:00
Jordan Justen
a94be40ecc i965: Store number of threads in brw_cs_prog_data
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 1b79e7ebbd)
2016-06-02 13:54:44 +01:00
Jordan Justen
632d7ef148 i965: Add nir based intrinsic lowering and thread ID uniform
We add a lowering pass for nir intrinsics. This pass can replace nir
intrinsics with driver specific nir lower code.

We lower the gl_LocalInvocationIndex intrinsic based on a uniform
which is loaded with a thread specific ID.

We also lower the gl_LocalInvocationID based on
gl_LocalInvocationIndex.

v2:
 * Create variable during lowering pass. (Ken)

v3:
 * Don't create a variable, but instead just insert an intrisic call
   to load a uniform from the allocated location. (Jason)

v4:
 * Don't run this pass if thread_local_id_index < 0

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 3ef0957dac)
2016-06-02 13:53:58 +01:00
Jordan Justen
5513300f59 i965: Put CS local thread ID uniform in last push register
This thread ID uniform will be used to compute the
gl_LocalInvocationIndex and gl_LocalInvocationID values.

It is important for this uniform to be added in the last push constant
register. fs_visitor::assign_constant_locations is updated to make
sure this happens.

The reason this is important is that the cross-thread push constant
registers are loaded first, and the per-thread push constant registers
are loaded after that. (Broadwell adds another push constant upload
mechanism which reverses this order, but we are ignoring this for
now.)

v2:
 * Add variable in intrinsics lowering pass
 * Make sure the ID is pushed last in assign_constant_locations, and
   that we save a spot for the ID in the push constants

v3:
 * Simplify code based with Jason's suggestions.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 04fc72501a)
2016-06-02 13:53:24 +01:00
Jordan Justen
33d0016836 i965: Add uniform for a CS thread local base ID
v4:
 * Force thread_local_id_index to -1 for now, and have
   fs_visitor::setup_cs_payload look at thread_local_id_index. This
   enables us to more easily cut over from the old local ID layout to
   the new layout, as suggested by Jason.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit fa279dfbf0)
2016-06-02 13:51:11 +01:00
Jordan Justen
169b700dfd i965: Add nir channel_num system value
v2:
 * simd16/32 fixes (curro)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 8f48d23e0f)
2016-06-02 13:48:20 +01:00
Jordan Justen
33e985f8b9 nir: Make lowering gl_LocalInvocationIndex optional
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 6f316c9d86)
2016-06-02 13:45:29 +01:00
Jordan Justen
c9de6190a0 glsl: Add glsl LowerCsDerivedVariables option
v2:
 * Move lower flag to context constants. (Ken)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 7b9def3583)
2016-06-02 13:38:06 +01:00
Jason Ekstrand
05d88165d9 i965/fs: Copy the offset when lowering logical pull constant sends
This fixes 64 Vulkan CTS tests per gen

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96299
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1205999c22)
2016-06-02 13:37:30 +01:00
Dave Airlie
d1cf18497a glsl/distance: make sure we use clip dist varying slot for lowered var.
When lowering, we always want to use the clip dist varying.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8d4f4adfbd)
2016-06-02 13:36:45 +01:00
Kenneth Graunke
5a44d36b46 i965: Fix isoline reads in scalar TES.
Isolines aren't reversed.  commit 5b2d8c2273 fixed this for the vec4
TES backend, but not the scalar one.

Found while debugging GL45-CTS.tessellation_shader.
tessellation_control_to_tessellation_evaluation.gl_tessLevel.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 25e1b8d366)
2016-06-02 13:36:14 +01:00
Ian Romanick
0e54eebeed glsl: Use Geom.VerticesOut == -1 to specify unset
Because apparently layout(max_vertices=0) is a thing.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a428c955ce)
2016-06-02 13:35:18 +01:00
Ian Romanick
0ab1a3957a i965: If control_data_header_size_bits is zero, don't do EndPrimitive
This can occur when max_vertices=0 is explicitly specified.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b27dfa5403)
2016-06-02 13:34:43 +01:00
Ian Romanick
1398a9510f mesa: Fix bogus strncmp
The string "[0]\0" is the same as "[0]" as far as the C string datatype
is concerned.  That string has length 3.  strncmp(s, length_3_string, 4)
is the same as strcmp(s, length_3_string), so make it be strcmp.

v2: Not the same as strncmp(..., 3).  Noticed by Ilia.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 049bb94d2e)
2016-06-02 13:33:53 +01:00
Ilia Mirkin
b265796c79 nir: allow sat on all float destination types
With the introduction of fp64 and fp16 to nir, there are now a bunch of
float types running around. A F1 2015 shader ends up with an i2f.sat
operation, which has a nir_type_float32 destination. Allow sat on all
the float destination types.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit ca135a2612)
2016-06-02 13:32:52 +01:00
Alex Deucher
4a00da1662 radeonsi: fix the raster config setup for 1 RB iceland chips
I didn't realize there were 1 and 2 RB variants when this code
was originally added.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bd85e4a041)
2016-06-02 13:32:05 +01:00
Dave Airlie
e817522728 mesa/sampler: fix error codes for sampler parameters.
The initial ARB_sampler_objects spec had GL_INVALID_VALUE in it,
however version 8 of it fixed this, and the GL specs also have
the fixed value in them.

Fixes:
GL45-CTS.texture_border_clamp.samplerparameteri_non_gen_sampler_error

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 6400144041)
2016-06-02 13:31:18 +01:00
Dave Airlie
915cc490d7 glsl: define some GLES3 constants in GLSL 4.1
The GLSL 4.1 spec adds:
gl_MaxVertexUniformVectors
gl_MaxFragmentUniformVectors
gl_MaxVaryingVectors

This fixes:
GL45-CTS.gtf31.GL3Tests.uniform_buffer_object.uniform_buffer_object_build_in_constants

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 0ebf4257a3)
2016-06-02 13:30:24 +01:00
Topi Pohjolainen
683c6940d8 i965: Add norbc debug option
This INTEL_DEBUG option disables lossless compression (also known
as render buffer compression).

v2: (Matt) Use likely(!lossless_compression_disabled) instead of
           !likely(lossless_compression_disabled)
    (Grazvydas) Update docs/envvars.html

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 6ca118d2f4)
2016-06-02 13:28:22 +01:00
Topi Pohjolainen
2d483256d5 i965/gen9: Configure rbc buffers as plain for non-rbc tex views
Fixes rendering in Shadow of Mordor with rbc. Application writes
RGBA_UNORM texture filling it with values the application wants to
later on treat as SRGB_ALPHA.
Intel driver enables lossless compression for the buffer by the time
of writing. However, the driver fails to make sure the buffer can be
sampled as something else later on and unfortunately there is
restriction in the hardware for using lossless compression for srgb
formats which looks to extend itself to the sampling engine also.
Requesting srgb to linear conversion on top of compressed buffer
results the color values to be pretty much garbage.

Fortunately none of tracked benchmarks showed a regression with
this.

v2 (Matt): Add missing space

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 30e9e6bd07)
2016-06-02 13:27:53 +01:00
Kenneth Graunke
8c627af1f0 i965: Fix the passthrough TCS for isolines.
We weren't setting up several of the uniform values for the patch
header, so we'd crash when uploading push constants.  We at least
need to initialize them to zero.  We also had the isoline parameters
reversed, so it would also render incorrectly (if it didn't crash).

Fixes a new Piglit test(*) (isoline-no-tcs), as well as crashes in
GL44-CTS.tessellation_shader.single.max_patch_vertices.

(*) https://lists.freedesktop.org/archives/piglit/2016-May/019866.html

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit a3dc99f3d4)
2016-06-02 13:27:23 +01:00
Dave Airlie
86e367a572 i965/xfb: skip components in correct buffer.
The driver was adding the skip components but always for buffer 0.

This fixes:
GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_skip_multiple_buffers

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ebb81cd683)
2016-06-02 13:26:50 +01:00
Dave Airlie
64015c03bb glsl/linker: fix multiple streams transform feedback.
e2791b38b4
mesa/program_interface_query: fix transform feedback varyings.

caused a regression in
GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_multiple_streams
on radeonsi.

The problem was it was using the skip components varying to set
the stream id, when it should wait until a varying was written,
this just adds the varying checks in the right place.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1fe7bbb911)
2016-06-02 13:25:59 +01:00
Dave Airlie
99fcfd985e mesa/bufferobj: use mapping range in BufferSubData.
According to GL4.5 spec:
An INVALID_OPERATION error is generated if any part of the speci-
fied buffer range is mapped with MapBufferRange or MapBuffer (see sec-
tion 6.3), unless it was mapped with MAP_PERSISTENT_BIT set in the Map-
BufferRange access flags.

So we should use the if range is mapped path.

This fixes:
GL45-CTS.buffer_storage.map_persistent_buffer_sub_data

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "12.0, 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e891f7cf55)
2016-06-02 13:25:08 +01:00
Ilia Mirkin
7bc29c784a nv50/ir: fix error finding free element in bitset in some situations
This really only hits for bitsets with a size of a multiple of 32. We
can end up with pos = -1 as a result of the ffs, which we in turn decide
is a valid position (since we fall through the loop and i == 1, we end
up adding 32 to it, so end up returning 31 again).

Up until recently this was largely unreachable, as the register file
sizes were all 63 or 255. However with the advent of compute shaders
which can restrict the number of registers, this can now happen.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 18d11c9989)
2016-06-02 13:24:08 +01:00
Timothy Arceri
b2b7f05da6 Revert "glsl: fix xfb_offset unsized array validation"
This reverts commit aac90ba292.

The commit caused a regression in:
piglit.spec.glsl-1_50.compiler.gs-input-nonarray-named-block.geom

Also the CTS test it was meant to fix seems like it may be bogus.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 98d40b4d11)
2016-06-02 13:21:36 +01:00
Francisco Jerez
eb56a2f250 i965/fs: Allow scalar source regions on SNB math instructions.
I haven't found any evidence that this isn't supported by the
hardware, in fact according to the SNB hardware spec:

 "The supported regioning modes for math instructions are align16,
  align1 with the following restrictions:
   - Scalar source is supported.
  [...]
   - Source and destination offset must be the same, except the case of
     scalar source."

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit c1107cec44)
2016-06-02 13:20:45 +01:00
Francisco Jerez
c1269825cf i965/fs: Fix constant combining for instructions that cannot accept source mods.
This is the case for SNB math instructions so we need to be careful
and insert the literal value of the immediate into the table (rather
than its absolute value) if the instruction is unable to invert the
sign of the constant on the fly.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 06d8765bc0)
2016-06-02 13:20:15 +01:00
Francisco Jerez
f651a4bb2e i965/fs: Extend remove_duplicate_mrf_writes() to handle non-VGRF to MRF copies.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 303ec22ed6)
2016-06-02 13:19:41 +01:00
Francisco Jerez
44029d4237 i965/fs: Fix compute_to_mrf() to coalesce VGRFs initialized by multiple single-GRF writes.
Which requires using a bitset instead of a boolean flag to keep track
of the GRFs we've seen a generating instruction for already.  The
search loop continues until all instructions initializing the value of
the source VGRF have been found, or it is determined that coalescing
is not possible.

Fixes a few piglit test cases on Gen4-6 which were regressed by
6956015aa5 due to the different (yet
perfectly valid) ordering in which copy instructions are emitted now
by the simd lowering pass, which had the side effect of causing this
optimization pass to start corrupting the program in cases where a
VGRF-to-MRF copy instruction would be eliminated but only the last
instruction writing to the source VGRF region would be rewritten to
point to the target MRF.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 4fe4f6e8a7)
2016-06-02 13:19:07 +01:00
Francisco Jerez
910fa7a824 i965/fs: Teach compute_to_mrf() about the COMPR4 address transformation.
This will be required to correctly transform the destination of 8-wide
instructions that write a single GRF of a VGRF to MRF copy marked
COMPR4.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 1898673f58)
2016-06-02 13:18:33 +01:00
Francisco Jerez
3b78304025 i965/fs: Refactor compute_to_mrf() to split search and rewrite into separate loops.
This will allow compute_to_mrf to handle cases where the source of the
VGRF-to-MRF copy is initialized by more than one instruction.  In such
cases we cannot rewrite the destination of any of the generating
instructions until it's known whether the whole VGRF source region can
be coalesced into the destination MRF, which will imply continuing the
search until all generating instructions have been found or it has
been determined that the VGRF and MRF registers cannot be coalesced.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 485fbaff03)
2016-06-02 13:18:00 +01:00
Francisco Jerez
dd96daa55e i965/fs: Fix compute-to-mrf VGRF region coverage condition.
Compute-to-mrf was checking whether the destination of scan_inst is
more than one component (making assumptions about the instruction data
type) in order to find out whether the result is being fully copied
into the MRF destination, which is rather inaccurate in cases where a
single-component instruction is only partially contained in the source
region, or when the execution size of the copy and scan_inst
instructions differ.  Instead check whether the destination region of
the instruction is really contained within the bounds of the source
region of the copy.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 4b0ec9f475)
2016-06-02 13:17:26 +01:00
Francisco Jerez
a6011c6fc6 i965/fs: Simplify and improve accuracy of compute_to_mrf() by using regions_overlap().
Compute-to-mrf was being rather heavy-handed about checking whether
instruction source or destination regions interfere with the copy
instruction, which could conceivably lead to program miscompilation.
Fix it by using regions_overlap() instead of the open-coded and
dubiously correct overlap checks.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit bb61e24787)
2016-06-02 13:16:52 +01:00
Francisco Jerez
2d83aad693 i965/fs: Teach regions_overlap() about COMPR4 MRF regions.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 88f380a2dd)
2016-06-02 13:16:04 +01:00
Dylan Baker
665f57c513 Don't use python 3
Now there are not files that require python 3, so for now just remove
the python 3 dependency and use python 2. I think the right plan is to
just get all of the python ready for python 3, and then use whatever
python is available.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 604010a7ed)
2016-06-02 13:15:38 +01:00
Dylan Baker
7e62585ee8 genxml: change chbang to python 2
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ab31817fed)
2016-06-02 13:15:08 +01:00
Dylan Baker
4dd70617a1 genxml: use the isalpha method rather than str.isalpha.
This fixes gen_pack_header to work on python 2, where name[0] is unicode
not str.

Signed-off-by: Dylan Bake <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 12c1a01c72)
2016-06-02 13:14:38 +01:00
Dylan Baker
9ed6965749 genxml: require future imports for python2 compatibility.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a45a25418b)
2016-06-02 13:14:08 +01:00
Dylan Baker
aed6230269 genxml: mark re strings as raw
This is a correctness issue.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e5681e4d70)
2016-06-02 13:13:39 +01:00
Dylan Baker
f73a68ec37 genxml: Make classes descendants of object
This is the default in python3, but in python2 you get old style
classes. No one likes old-style classes.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit de2e9da2e9)
2016-06-02 13:13:09 +01:00
Dylan Baker
0c12887764 genxml: mark gen_pack_header.py as encoded in utf-8
There is unicode in this file, and I'm actually surprised that the
python interpreter hasn't gotten grumpy.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9f50e3572c)
2016-06-02 13:12:35 +01:00
Marek Olšák
145705e49c mesa: fix crash in driver_RenderTexture_is_safe
This just fixed the crash with the apitrace in bug report.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95246

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 8a10192b4b)
2016-06-02 13:11:43 +01:00
Dave Airlie
d3c92267e0 glsl/images: bounds check image unit assignment
The CTS test:
GL45-CTS.multi_bind.dispatch_bind_image_textures
binds 192 image uniforms, we reject this later,
but not until after we trash the contents of the
struct gl_shader.

Error now reads:
Too many compute shader image uniforms (192 > 16)
instead of
Too many compute shader image uniforms (2745344416 > 16)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit f87352d769)
2016-06-02 13:10:50 +01:00
Ilia Mirkin
36e26f2ee2 nvc0/ir: fix spilling predicates to registers
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4b1a167a2b)
2016-06-02 12:54:52 +01:00
Emil Velikov
9a56e7d25b Update version to 12.0.0-rc1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 19:20:34 +01:00
Emil Velikov
7ad2cb6f08 docs: rename release notes to 12.0.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 19:20:34 +01:00
Emil Velikov
a43a368457 nir: add the SConscript.nir to the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 922b471777)
2016-05-30 19:20:33 +01:00
1809 changed files with 64997 additions and 147810 deletions

View File

@@ -1,34 +0,0 @@
# To use this config on you editor, follow the instructions at:
# http://editorconfig.org
root = true
[*]
charset = utf-8
insert_final_newline = true
[*.{c,h,cpp,hpp,cc,hh}]
indent_style = space
indent_size = 3
[{Makefile*,*.mk}]
indent_style = tab
[{*.py,SCons*}]
indent_style = space
indent_size = 4
[*.pl]
indent_style = space
indent_size = 4
[*.m4]
indent_style = space
indent_size = 2
[*.yml]
indent_style = space
indent_size = 2
[*.patch]
trim_trailing_whitespace = false

1
.gitignore vendored
View File

@@ -49,4 +49,3 @@ Makefile.in
.install-mesa-links
.install-gallium-links
/src/git_sha1.h
TAGS

View File

@@ -88,11 +88,9 @@ Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <s3734770@mai
Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <carli@carli-laptop.(none)>
Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <Carl-Philip.Haensch@mailbox.tu-dresden.de>
Chad Versace <chadversary@chromium.org> <chad@kiwitree.net>
Chad Versace <chadversary@chromium.org> <chad@chad-versace.us>
Chad Versace <chadversary@chromium.org> <Chad Versace chad@chad-versace.us>
Chad Versace <chadversary@chromium.org> <chad.versace@intel.com>
Chad Versace <chadversary@chromium.org> <chad.versace@linux.intel.com>
Chad Versace <chad.versace@intel.com> <chad@chad-versace.us>
Chad Versace <chad.versace@intel.com> <Chad Versace chad@chad-versace.us>
Chad Versace <chad.versace@intel.com> <chad.versace@linux.intel.com>
Chia-I Wu <olvaffe@gmail.com> <olv@lunarg.com>
Chia-I Wu <olvaffe@gmail.com> Chia-Wu <olvaffe@gmail.com>
@@ -140,8 +138,6 @@ Dmitry Cherkassov <dcherkassov@gmail.com> Dmitry Cherkasov <dcherkassov@gmail.co
Dylan Baker <dylanx.c.baker@intel.com> <baker.dylan.c@gmail.com>
Edward O'Callaghan <funfunctor@folklore1984.net> <eocallaghan@alterapraxis.com>
Emeric Grange <emeric.grange@gmail.com> Emeric <emeric.grange@gmail.com>
Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@collabora.com>
@@ -278,7 +274,7 @@ Marc Dietrich <marvin24@gmx.de> marvin24 <marvin24@gmx.de>
Marcin Ślusarz <marcin.slusarz@gmail.com> Marcin Slusarz <marcin.slusarz@gmail.com>
Marek Olšák <maraeo@gmail.com> <marek.olsak@amd.com>
Marek Olšák <marek.olsak@amd.com> <maraeo@gmail.com>
Mario Kleiner <mario.kleiner.de@gmail.com> kleinerm <mario.kleiner@tuebingen.mpg.de>
Mario Kleiner <mario.kleiner.de@gmail.com> <mario.kleiner@tuebingen.mpg.de>

View File

@@ -11,6 +11,7 @@ addons:
apt:
packages:
- libdrm-dev
- libudev-dev
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libxcb-dri2-0-dev

View File

@@ -34,10 +34,6 @@ MESA_VERSION := $(shell cat $(MESA_TOP)/VERSION)
LOCAL_CFLAGS += \
-Wno-unused-parameter \
-Wno-date-time \
-Wno-pointer-arith \
-Wno-missing-field-initializers \
-Wno-initializer-overrides \
-Wno-mismatched-tags \
-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \
-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\" \
-DANDROID_VERSION=0x0$(MESA_ANDROID_MAJOR_VERSION)0$(MESA_ANDROID_MINOR_VERSION)
@@ -82,14 +78,6 @@ LOCAL_CFLAGS += \
-D__STDC_LIMIT_MACROS
endif
ifneq ($(LOCAL_IS_HOST_MODULE),true)
# add libdrm if there are hardware drivers
ifneq ($(filter-out swrast,$(MESA_GPU_DRIVERS)),)
LOCAL_CFLAGS += -DHAVE_LIBDRM
LOCAL_SHARED_LIBRARIES += libdrm
endif
endif
LOCAL_CPPFLAGS += \
$(if $(filter true,$(MESA_LOLLIPOP_BUILD)),-D_USING_LIBCXX) \
-Wno-error=non-virtual-dtor \

View File

@@ -95,8 +95,8 @@ SUBDIRS := \
src/mesa \
src/util \
src/egl \
src/amd \
src/intel \
src/intel/genxml \
src/intel/isl \
src/mesa/drivers/dri
INC_DIRS := $(call all-named-subdir-makefiles,$(SUBDIRS))

View File

@@ -44,7 +44,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \
--with-egl-platforms=x11,wayland,drm,surfaceless \
--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \
--with-gallium-drivers=i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,swr \
--with-vulkan-drivers=intel,radeon
--with-vulkan-drivers=intel
ACLOCAL_AMFLAGS = -I m4

View File

@@ -104,7 +104,3 @@ F: src/egl/drivers/dri2/platform_wayland.c
FREEDRENO
R: Rob Clark <robclark@freedesktop.org>
F: src/gallium/drivers/freedreno/
GLX
R: Adam Jackson <ajax@redhat.com>
F: src/glx/

View File

@@ -1 +1 @@
13.0.2
12.0.4

View File

@@ -1,5 +1,25 @@
# Commit was picked with -x
907ace57986733add2aebfa9dd7c83c67efed70e mapi: automake: set VISIBILITY_CFLAGS for shared glapi
# The offending commit that this patch (part) reverts isn't in 12.0
be32a2132785fbc119f17e62070e007ee7d17af7 i965/compiler: Bring back the INTEL_PRECISE_TRIG environment variable
# Commit was reverted shortly after it landed in master
a39ad185932eab4f25a0cb2b112c10d8700ef242 configure.ac: honour LLVM_LIBDIR when linking against LLVM
# The patch depends on the batch_cache work at least.
89f00f749fda4c1beca38f362c7f86bdc6e32785 a4xx: make sure to actually clamp depth as requested
# The patch depends on the 'generic' interoplation and location
# implementation introduced with 2d6dd30a9b30
114874b22beafb2d07006b197c62d717fc7f80cc i965/fs: Use sample interpolation for interpolateAtCentroid in persample mode
# VAAPI encode landed after the branch point.
a5993022275c20061ac025d9adc26c5f9d02afee st/va Avoid VBR bitrate calculation overflow v2
# EGL_KHR_debug landed after the branch point.
17084b6f9340f798111e53e08f5d35c7630cee48 egl: Fix missing unlock in eglGetSyncAttribKHR
# Depends on update_renderbuffer_read_surfaces at least
f2b9b0c730e345bcffa9eadabb25af3ab02642f2 i965: Add missing BRW_NEW_FS_PROG_DATA to render target reads.
# The commit in question hasn't landed in branch
1ef787339774bc7f1cc9c1615722f944005e070c Revert "egl/android: Set EGL_MAX_PBUFFER_WIDTH and EGL_MAX_PBUFFER_HEIGHT"
# Patches depend on the fence_finish() gallium API change and corresponding driver work
f240ad98bc05281ea7013d91973cb5f932ae9434 st/mesa: unduplicate st_check_sync code
b687f766fddb7b39479cd9ee0427984029ea3559 st/mesa: allow multiple concurrent waiters in ClientWaitSync

View File

@@ -1,3 +0,0 @@
[*.sh]
indent_style = space
indent_size = 2

View File

@@ -14,7 +14,7 @@ git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*mesa-stable\)' HEAD..origin/master |\
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*12\.0.*mesa-stable\)' HEAD..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.

View File

@@ -86,7 +86,7 @@ def AddOptions(opts):
from SCons.Options.EnumOption import EnumOption
opts.Add(EnumOption('build', 'build type', 'debug',
allowed_values=('debug', 'checked', 'profile',
'release', 'opt')))
'release')))
opts.Add(BoolOption('verbose', 'verbose output', 'no'))
opts.Add(EnumOption('machine', 'use machine-specific assembly code',
default_machine,

View File

@@ -74,11 +74,11 @@ LIBDRM_AMDGPU_REQUIRED=2.4.63
LIBDRM_INTEL_REQUIRED=2.4.61
LIBDRM_NVVIEUX_REQUIRED=2.4.66
LIBDRM_NOUVEAU_REQUIRED=2.4.66
LIBDRM_FREEDRENO_REQUIRED=2.4.68
LIBDRM_VC4_REQUIRED=2.4.69
LIBDRM_FREEDRENO_REQUIRED=2.4.67
DRI2PROTO_REQUIRED=2.6
DRI3PROTO_REQUIRED=1.0
PRESENTPROTO_REQUIRED=1.0
LIBUDEV_REQUIRED=151
GLPROTO_REQUIRED=1.4.14
LIBOMXIL_BELLAGIO_REQUIRED=0.0
LIBVA_REQUIRED=0.38.0
@@ -89,8 +89,7 @@ XCBDRI2_REQUIRED=1.8
XCBGLX_REQUIRED=1.8.1
XSHMFENCE_REQUIRED=1.1
XVMC_REQUIRED=1.0.6
PYTHON_MAKO_REQUIRED=0.8.0
LIBSENSORS_REQUIRED=4.0.0
PYTHON_MAKO_REQUIRED=0.3.4
dnl Check for progs
AC_PROG_CPP
@@ -109,7 +108,6 @@ LT_PREREQ([2.2])
LT_INIT([disable-static])
AC_CHECK_PROG(RM, rm, [rm -f])
AC_CHECK_PROG(XXD, xxd, [xxd])
AX_PROG_BISON([],
AS_IF([test ! -f "$srcdir/src/compiler/glsl/glcpp/glcpp-parse.c"],
@@ -257,12 +255,15 @@ case "$host_os" in
*-android)
android=yes
;;
linux*|*-gnu*|gnu*|cygwin*)
linux*|*-gnu*|gnu*)
DEFINES="$DEFINES -D_GNU_SOURCE"
;;
solaris*)
DEFINES="$DEFINES -DSVR4"
;;
cygwin*)
DEFINES="$DEFINES -D_XOPEN_SOURCE=700"
;;
esac
AM_CONDITIONAL(HAVE_ANDROID, test "x$android" = xyes)
@@ -301,9 +302,16 @@ if test "x$GCC" = xyes; then
# Restore CFLAGS; VISIBILITY_CFLAGS are added to it where needed.
CFLAGS=$save_CFLAGS
# Work around aliasing bugs - developers should comment this out
CFLAGS="$CFLAGS -fno-strict-aliasing"
# We don't want floating-point math functions to set errno or trap
CFLAGS="$CFLAGS -fno-math-errno -fno-trapping-math"
# gcc's builtin memcmp is slower than glibc's
# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
CFLAGS="$CFLAGS -fno-builtin-memcmp"
# Flags to help ensure that certain portions of the code -- and only those
# portions -- can be built with MSVC:
# - src/util, src/gallium/auxiliary, rc/gallium/drivers/llvmpipe, and
@@ -340,8 +348,12 @@ if test "x$GXX" = xyes; then
# Restore CXXFLAGS; VISIBILITY_CXXFLAGS are added to it where needed.
CXXFLAGS=$save_CXXFLAGS
# We don't want floating-point math functions to set errno or trap
CXXFLAGS="$CXXFLAGS -fno-math-errno -fno-trapping-math"
# Work around aliasing bugs - developers should comment this out
CXXFLAGS="$CXXFLAGS -fno-strict-aliasing"
# gcc's builtin memcmp is slower than glibc's
# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
CXXFLAGS="$CXXFLAGS -fno-builtin-memcmp"
fi
AC_SUBST([MSVC2013_COMPAT_CFLAGS])
@@ -387,17 +399,6 @@ fi
AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS)
dnl Check for new-style atomic builtins
AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
int main() {
int n;
return __atomic_load_n(&n, __ATOMIC_ACQUIRE);
}]])], GCC_ATOMIC_BUILTINS_SUPPORTED=1)
if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then
DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
fi
AM_CONDITIONAL([GCC_ATOMIC_BUILTINS_SUPPORTED], [test x$GCC_ATOMIC_BUILTINS_SUPPORTED = x1])
dnl Check for Endianness
AC_C_BIGENDIAN(
little_endian=no,
@@ -826,21 +827,9 @@ dnl to -pthread, which causes problems if we need -lpthread to appear in
dnl pkgconfig files.
test -z "$PTHREAD_LIBS" && PTHREAD_LIBS="-lpthread"
dnl pthread-stubs is mandatory on targets where it exists
case "$host_os" in
cygwin* )
pthread_stubs_possible="no"
;;
* )
pthread_stubs_possible="yes"
;;
esac
if test "x$pthread_stubs_possible" = xyes; then
PKG_CHECK_MODULES(PTHREADSTUBS, pthread-stubs)
AC_SUBST(PTHREADSTUBS_CFLAGS)
AC_SUBST(PTHREADSTUBS_LIBS)
fi
PKG_CHECK_MODULES(PTHREADSTUBS, pthread-stubs)
AC_SUBST(PTHREADSTUBS_CFLAGS)
AC_SUBST(PTHREADSTUBS_LIBS)
dnl SELinux awareness.
AC_ARG_ENABLE([selinux],
@@ -883,32 +872,6 @@ AC_ARG_ENABLE([dri],
[enable_dri="$enableval"],
[enable_dri=yes])
AC_ARG_ENABLE([gallium-extra-hud],
[AS_HELP_STRING([--enable-gallium-extra-hud],
[enable HUD block/NIC I/O HUD stats support @<:@default=disabled@:>@])],
[enable_gallium_extra_hud="$enableval"],
[enable_gallium_extra_hud=no])
AM_CONDITIONAL(HAVE_GALLIUM_EXTRA_HUD, test "x$enable_gallium_extra_hud" = xyes)
if test "x$enable_gallium_extra_hud" = xyes ; then
DEFINES="${DEFINES} -DHAVE_GALLIUM_EXTRA_HUD=1"
fi
#TODO: no pkgconfig .pc available for libsensors.
#PKG_CHECK_MODULES([LIBSENSORS], [libsensors >= $LIBSENSORS_REQUIRED], [enable_lmsensors=yes], [enable_lmsensors=no])
AC_ARG_ENABLE([lmsensors],
[AS_HELP_STRING([--enable-lmsensors],
[enable HUD lmsensor support @<:@default=disabled@:>@])],
[enable_lmsensors="$enableval"],
[enable_lmsensors=no])
AM_CONDITIONAL(HAVE_LIBSENSORS, test "x$enable_lmsensors" = xyes)
if test "x$enable_lmsensors" = xyes ; then
DEFINES="${DEFINES} -DHAVE_LIBSENSORS=1"
LIBSENSORS_LDFLAGS="-lsensors"
else
LIBSENSORS_LDFLAGS=""
fi
AC_SUBST(LIBSENSORS_LDFLAGS)
case "$host_os" in
linux*)
dri3_default=yes
@@ -1140,20 +1103,11 @@ if test "x$have_libdrm" = xyes; then
DEFINES="$DEFINES -DHAVE_LIBDRM"
fi
require_libdrm() {
if test "x$have_libdrm" != xyes; then
AC_MSG_ERROR([$1 requires libdrm >= $LIBDRM_REQUIRED])
fi
}
# Select which platform-dependent DRI code gets built
case "$host_os" in
darwin*)
dri_platform='apple' ;;
cygwin*)
dri_platform='windows' ;;
gnu*)
gnu*|cygwin*)
dri_platform='none' ;;
*)
dri_platform='drm' ;;
@@ -1169,9 +1123,6 @@ AM_CONDITIONAL(HAVE_DRISW_KMS, test "x$have_drisw_kms" = xyes )
AM_CONDITIONAL(HAVE_DRI2, test "x$enable_dri" = xyes -a "x$dri_platform" = xdrm -a "x$have_libdrm" = xyes )
AM_CONDITIONAL(HAVE_DRI3, test "x$enable_dri3" = xyes -a "x$dri_platform" = xdrm -a "x$have_libdrm" = xyes )
AM_CONDITIONAL(HAVE_APPLEDRI, test "x$enable_dri" = xyes -a "x$dri_platform" = xapple )
AM_CONDITIONAL(HAVE_LMSENSORS, test "x$enable_lmsensors" = xyes )
AM_CONDITIONAL(HAVE_GALLIUM_EXTRA_HUD, test "x$enable_gallium_extra_hud" = xyes )
AM_CONDITIONAL(HAVE_WINDOWSDRI, test "x$enable_dri" = xyes -a "x$dri_platform" = xwindows )
AC_ARG_ENABLE([shared-glapi],
[AS_HELP_STRING([--enable-shared-glapi],
@@ -1352,9 +1303,23 @@ if test "x$with_sha1" = "x"; then
fi
fi
AM_CONDITIONAL([ENABLE_SHADER_CACHE], [test x$enable_shader_cache = xyes])
if test "x$enable_shader_cache" = "xyes"; then
AC_DEFINE([ENABLE_SHADER_CACHE], [1], [Enable shader cache])
fi
case "$host_os" in
linux*)
need_pci_id=yes ;;
*)
need_pci_id=no ;;
esac
PKG_CHECK_MODULES([LIBUDEV], [libudev >= $LIBUDEV_REQUIRED],
have_libudev=yes, have_libudev=no)
AC_ARG_ENABLE([sysfs],
[AS_HELP_STRING([--enable-sysfs],
[enable /sys PCI identification @<:@default=disabled@:>@])],
[have_sysfs="$enableval"],
[have_sysfs=no]
)
if test "x$enable_dri" = xyes; then
if test "$enable_static" = yes; then
@@ -1398,7 +1363,9 @@ xdri)
if test x"$driglx_direct" = xyes; then
if test x"$dri_platform" = xdrm ; then
DEFINES="$DEFINES -DGLX_USE_DRM"
require_libdrm "Direct rendering"
if test "x$have_libdrm" != xyes; then
AC_MSG_ERROR([Direct rendering requires libdrm >= $LIBDRM_REQUIRED])
fi
PKG_CHECK_MODULES([DRI2PROTO], [dri2proto >= $DRI2PROTO_REQUIRED])
GL_PC_REQ_PRIV="$GL_PC_REQ_PRIV libdrm >= $LIBDRM_REQUIRED"
@@ -1420,9 +1387,6 @@ xdri)
if test x"$dri_platform" = xapple ; then
DEFINES="$DEFINES -DGLX_USE_APPLEGL"
fi
if test x"$dri_platform" = xwindows ; then
DEFINES="$DEFINES -DGLX_USE_WINDOWSGL"
fi
fi
# add xf86vidmode if available
@@ -1442,6 +1406,17 @@ xdri)
;;
esac
have_pci_id=no
if test "$have_libudev" = yes; then
DEFINES="$DEFINES -DHAVE_LIBUDEV"
have_pci_id=yes
fi
if test "$have_sysfs" = yes; then
DEFINES="$DEFINES -DHAVE_SYSFS"
have_pci_id=yes
fi
# This is outside the case (above) so that it is invoked even for non-GLX
# builds.
AM_CONDITIONAL(HAVE_XF86VIDMODE, test "x$HAVE_XF86VIDMODE" = xyes)
@@ -1550,6 +1525,10 @@ if test "x$enable_dri" = xyes; then
DEFINES="$DEFINES -DHAVE_DRI3"
fi
if test "x$have_pci_id" != xyes; then
AC_MSG_ERROR([libudev-dev or sysfs required for building DRI])
fi
case "$host_cpu" in
powerpc* | sparc*)
# Build only the drivers for cards that exist on PowerPC/sparc
@@ -1681,13 +1660,6 @@ if test -n "$with_vulkan_drivers"; then
HAVE_INTEL_VULKAN=yes;
;;
xradeon)
PKG_CHECK_MODULES([AMDGPU], [libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED])
HAVE_RADEON_VULKAN=yes;
if test "x$with_sha1" == "x"; then
AC_MSG_ERROR([radv vulkan driver requires SHA1])
fi
;;
*)
AC_MSG_ERROR([Vulkan driver '$driver' does not exist])
;;
@@ -1757,6 +1729,10 @@ if test "x$enable_gbm" = xauto; then
esac
fi
if test "x$enable_gbm" = xyes; then
if test "x$need_pci_id$have_pci_id" = xyesno; then
AC_MSG_ERROR([gbm requires udev >= $LIBUDEV_REQUIRED or sysfs])
fi
if test "x$enable_dri" = xyes; then
if test "x$enable_shared_glapi" = xno; then
AC_MSG_ERROR([gbm_dri requires --enable-shared-glapi])
@@ -1771,8 +1747,11 @@ if test "x$enable_gbm" = xyes; then
fi
fi
AM_CONDITIONAL(HAVE_GBM, test "x$enable_gbm" = xyes)
# FINISHME: GBM has a number of dependencies which we should add below
GBM_PC_REQ_PRIV=""
if test "x$need_pci_id$have_libudev" = xyesyes; then
GBM_PC_REQ_PRIV="libudev >= $LIBUDEV_REQUIRED"
else
GBM_PC_REQ_PRIV=""
fi
GBM_PC_LIB_PRIV="$DLOPEN_LIBS"
AC_SUBST([GBM_PC_REQ_PRIV])
AC_SUBST([GBM_PC_LIB_PRIV])
@@ -2032,6 +2011,8 @@ egl_platforms=`IFS=', '; echo $with_egl_platforms`
for plat in $egl_platforms; do
case "$plat" in
wayland)
test "x$have_libdrm" != xyes &&
AC_MSG_ERROR([EGL platform wayland requires libdrm >= $LIBDRM_REQUIRED])
PKG_CHECK_MODULES([WAYLAND], [wayland-client >= $WAYLAND_REQUIRED wayland-server >= $WAYLAND_REQUIRED])
@@ -2047,9 +2028,13 @@ for plat in $egl_platforms; do
drm)
test "x$enable_gbm" = "xno" &&
AC_MSG_ERROR([EGL platform drm needs gbm])
test "x$have_libdrm" != xyes &&
AC_MSG_ERROR([EGL platform drm requires libdrm >= $LIBDRM_REQUIRED])
;;
surfaceless)
test "x$have_libdrm" != xyes &&
AC_MSG_ERROR([EGL platform surfaceless requires libdrm >= $LIBDRM_REQUIRED])
;;
android)
@@ -2060,11 +2045,10 @@ for plat in $egl_platforms; do
;;
esac
case "$plat" in
wayland|drm|surfaceless)
require_libdrm "Platform $plat"
;;
esac
case "$plat$need_pci_id$have_pci_id" in
waylandyesno|drmyesno)
AC_MSG_ERROR([cannot build $plat platform without udev >= $LIBUDEV_REQUIRED or sysfs]) ;;
esac
done
# libEGL wants to default to the first platform specified in
@@ -2159,7 +2143,7 @@ if test "x$enable_gallium_llvm" = xauto; then
i*86|x86_64|amd64) enable_gallium_llvm=yes;;
esac
fi
if test "x$enable_gallium_llvm" = xyes || test "x$HAVE_RADEON_VULKAN" = xyes; then
if test "x$enable_gallium_llvm" = xyes; then
if test -n "$llvm_prefix"; then
AC_PATH_TOOL([LLVM_CONFIG], [llvm-config], [no], ["$llvm_prefix/bin"])
else
@@ -2205,7 +2189,7 @@ if test "x$enable_gallium_llvm" = xyes || test "x$HAVE_RADEON_VULKAN" = xyes; th
fi
if test "x$enable_opencl" = xyes; then
llvm_check_version_for "3" "6" "0" "opencl"
llvm_check_version_for "3" "5" "0" "opencl"
LLVM_COMPONENTS="${LLVM_COMPONENTS} all-targets ipo linker instrumentation"
LLVM_COMPONENTS="${LLVM_COMPONENTS} irreader option objcarcopts profiledata"
@@ -2284,6 +2268,12 @@ AC_SUBST([D3D_DRIVER_INSTALL_DIR])
dnl
dnl Gallium helper functions
dnl
gallium_require_drm() {
if test "x$have_libdrm" != xyes; then
AC_MSG_ERROR([$1 requires libdrm >= $LIBDRM_REQUIRED])
fi
}
gallium_require_llvm() {
if test "x$MESA_LLVM" = x0; then
case "$host" in *gnux32) return;; esac
@@ -2293,6 +2283,12 @@ gallium_require_llvm() {
fi
}
gallium_require_drm_loader() {
if test "x$need_pci_id$have_pci_id" = xyesno; then
AC_MSG_ERROR([Gallium drm loader requires libudev >= $LIBUDEV_REQUIRED or sysfs])
fi
}
dnl This is for Glamor. Skip this if OpenGL is disabled.
require_egl_drm() {
if test "x$enable_opengl" = xno; then
@@ -2317,7 +2313,10 @@ radeon_llvm_check() {
else
amdgpu_llvm_target_name='amdgpu'
fi
llvm_check_version_for $2 $3 $4 $1
if test "x$enable_gallium_llvm" != "xyes"; then
AC_MSG_ERROR([--enable-gallium-llvm is required when building $1])
fi
llvm_check_version_for "3" "6" "0" $1
if test true && $LLVM_CONFIG --targets-built | grep -iqvw $amdgpu_llvm_target_name ; then
AC_MSG_ERROR([LLVM $amdgpu_llvm_target_name not enabled in your LLVM build.])
fi
@@ -2328,13 +2327,6 @@ radeon_llvm_check() {
fi
}
radeon_gallium_llvm_check() {
if test "x$enable_gallium_llvm" != "xyes"; then
AC_MSG_ERROR([--enable-gallium-llvm is required when building $1])
fi
radeon_llvm_check $*
}
swr_llvm_check() {
gallium_require_llvm $1
if test ${LLVM_VERSION_INT} -lt 306; then
@@ -2391,30 +2383,35 @@ if test -n "$with_gallium_drivers"; then
case "x$driver" in
xsvga)
HAVE_GALLIUM_SVGA=yes
require_libdrm "svga"
gallium_require_drm "svga"
gallium_require_drm_loader
;;
xi915)
HAVE_GALLIUM_I915=yes
PKG_CHECK_MODULES([INTEL], [libdrm_intel >= $LIBDRM_INTEL_REQUIRED])
require_libdrm "Gallium i915"
gallium_require_drm "Gallium i915"
gallium_require_drm_loader
;;
xilo)
HAVE_GALLIUM_ILO=yes
PKG_CHECK_MODULES([INTEL], [libdrm_intel >= $LIBDRM_INTEL_REQUIRED])
require_libdrm "Gallium i965/ilo"
gallium_require_drm "Gallium i965/ilo"
gallium_require_drm_loader
;;
xr300)
HAVE_GALLIUM_R300=yes
PKG_CHECK_MODULES([RADEON], [libdrm_radeon >= $LIBDRM_RADEON_REQUIRED])
require_libdrm "Gallium R300"
gallium_require_drm "Gallium R300"
gallium_require_drm_loader
gallium_require_llvm "Gallium R300"
;;
xr600)
HAVE_GALLIUM_R600=yes
PKG_CHECK_MODULES([RADEON], [libdrm_radeon >= $LIBDRM_RADEON_REQUIRED])
require_libdrm "Gallium R600"
gallium_require_drm "Gallium R600"
gallium_require_drm_loader
if test "x$enable_opencl" = xyes; then
radeon_gallium_llvm_check "r600g" "3" "6" "0"
radeon_llvm_check "r600g"
LLVM_COMPONENTS="${LLVM_COMPONENTS} bitreader asmparser"
fi
;;
@@ -2422,19 +2419,22 @@ if test -n "$with_gallium_drivers"; then
HAVE_GALLIUM_RADEONSI=yes
PKG_CHECK_MODULES([RADEON], [libdrm_radeon >= $LIBDRM_RADEON_REQUIRED])
PKG_CHECK_MODULES([AMDGPU], [libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED])
require_libdrm "radeonsi"
radeon_gallium_llvm_check "radeonsi" "3" "6" "0"
gallium_require_drm "radeonsi"
gallium_require_drm_loader
radeon_llvm_check "radeonsi"
require_egl_drm "radeonsi"
;;
xnouveau)
HAVE_GALLIUM_NOUVEAU=yes
PKG_CHECK_MODULES([NOUVEAU], [libdrm_nouveau >= $LIBDRM_NOUVEAU_REQUIRED])
require_libdrm "nouveau"
gallium_require_drm "nouveau"
gallium_require_drm_loader
;;
xfreedreno)
HAVE_GALLIUM_FREEDRENO=yes
PKG_CHECK_MODULES([FREEDRENO], [libdrm_freedreno >= $LIBDRM_FREEDRENO_REQUIRED])
require_libdrm "freedreno"
gallium_require_drm "freedreno"
gallium_require_drm_loader
;;
xswrast)
HAVE_GALLIUM_SOFTPIPE=yes
@@ -2464,8 +2464,8 @@ if test -n "$with_gallium_drivers"; then
;;
xvc4)
HAVE_GALLIUM_VC4=yes
PKG_CHECK_MODULES([VC4], [libdrm_vc4 >= $LIBDRM_VC4_REQUIRED])
require_libdrm "vc4"
gallium_require_drm "vc4"
gallium_require_drm_loader
PKG_CHECK_MODULES([SIMPENROSE], [simpenrose],
[USE_VC4_SIMULATOR=yes;
@@ -2474,7 +2474,8 @@ if test -n "$with_gallium_drivers"; then
;;
xvirgl)
HAVE_GALLIUM_VIRGL=yes
require_libdrm "virgl"
gallium_require_drm "virgl"
gallium_require_drm_loader
require_egl_drm "virgl"
;;
*)
@@ -2484,10 +2485,6 @@ if test -n "$with_gallium_drivers"; then
done
fi
if test "x$HAVE_RADEON_VULKAN" = "xyes"; then
radeon_llvm_check "radv" "3" "9" "0"
fi
dnl Set LLVM_LIBS - This is done after the driver configuration so
dnl that drivers can add additional components to LLVM_COMPONENTS.
dnl Previously, gallium drivers were updating LLVM_LIBS directly
@@ -2536,7 +2533,7 @@ if test "x$MESA_LLVM" != x0; then
AC_MSG_WARN([Building mesa with statically linked LLVM may cause compilation issues])
dnl We need to link to llvm system libs when using static libs
dnl However, only llvm 3.5+ provides --system-libs
if test $LLVM_VERSION_MAJOR -ge 4 -o $LLVM_VERSION_MAJOR -eq 3 -a $LLVM_VERSION_MINOR -ge 5; then
if test $LLVM_VERSION_MAJOR -eq 3 -a $LLVM_VERSION_MINOR -ge 5; then
LLVM_LIBS="$LLVM_LIBS `$LLVM_CONFIG --system-libs`"
fi
fi
@@ -2579,13 +2576,8 @@ AM_CONDITIONAL(HAVE_R200_DRI, test x$HAVE_R200_DRI = xyes)
AM_CONDITIONAL(HAVE_RADEON_DRI, test x$HAVE_RADEON_DRI = xyes)
AM_CONDITIONAL(HAVE_SWRAST_DRI, test x$HAVE_SWRAST_DRI = xyes)
AM_CONDITIONAL(HAVE_RADEON_VULKAN, test "x$HAVE_RADEON_VULKAN" = xyes)
AM_CONDITIONAL(HAVE_INTEL_VULKAN, test "x$HAVE_INTEL_VULKAN" = xyes)
AM_CONDITIONAL(HAVE_AMD_DRIVERS, test "x$HAVE_GALLIUM_R600" = xyes -o \
"x$HAVE_GALLIUM_RADEONSI" = xyes -o \
"x$HAVE_RADEON_VULKAN" = xyes)
AM_CONDITIONAL(HAVE_INTEL_DRIVERS, test "x$HAVE_INTEL_VULKAN" = xyes -o \
"x$HAVE_I965_DRI" = xyes)
@@ -2624,8 +2616,6 @@ VA_MINOR=`$PKG_CONFIG --modversion libva | $SED -n 's/.*\.\(.*\)\..*$/\1/p'`
AC_SUBST([VA_MAJOR], $VA_MAJOR)
AC_SUBST([VA_MINOR], $VA_MINOR)
AM_CONDITIONAL(HAVE_VULKAN_COMMON, test "x$VULKAN_DRIVERS" != "x")
AC_SUBST([XVMC_MAJOR], 1)
AC_SUBST([XVMC_MINOR], 0)
@@ -2679,9 +2669,6 @@ CXXFLAGS="$CXXFLAGS $USER_CXXFLAGS"
dnl Substitute the config
AC_CONFIG_FILES([Makefile
src/Makefile
src/amd/Makefile
src/amd/common/Makefile
src/amd/vulkan/Makefile
src/compiler/Makefile
src/egl/Makefile
src/egl/main/egl.pc
@@ -2756,11 +2743,10 @@ AC_CONFIG_FILES([Makefile
src/glx/Makefile
src/glx/apple/Makefile
src/glx/tests/Makefile
src/glx/windows/Makefile
src/glx/windows/windowsdriproto.pc
src/gtest/Makefile
src/intel/Makefile
src/intel/tools/Makefile
src/intel/genxml/Makefile
src/intel/isl/Makefile
src/intel/vulkan/Makefile
src/loader/Makefile
src/mapi/Makefile
@@ -2784,14 +2770,16 @@ AC_CONFIG_FILES([Makefile
src/mesa/drivers/x11/Makefile
src/mesa/main/tests/Makefile
src/util/Makefile
src/util/tests/hash_table/Makefile
src/vulkan/wsi/Makefile])
src/util/tests/hash_table/Makefile])
AC_OUTPUT
# Fix up dependencies in *.Plo files, where we changed the extension of a
# source file
$SED -i -e 's/brw_blorp.cpp/brw_blorp.c/' src/mesa/drivers/dri/i965/.deps/brw_blorp.Plo
$SED -i -e 's/gen6_blorp.cpp/gen6_blorp.c/' src/mesa/drivers/dri/i965/.deps/gen6_blorp.Plo
$SED -i -e 's/gen7_blorp.cpp/gen7_blorp.c/' src/mesa/drivers/dri/i965/.deps/gen7_blorp.Plo
$SED -i -e 's/gen8_blorp.cpp/gen8_blorp.c/' src/mesa/drivers/dri/i965/.deps/gen8_blorp.Plo
dnl
@@ -2890,19 +2878,6 @@ else
echo " Gallium: no"
fi
echo ""
if test "x$enable_gallium_extra_hud" != xyes; then
echo " HUD extra stats: no"
else
echo " HUD extra stats: yes"
fi
if test "x$enable_lmsensors" != xyes; then
echo " HUD lmsensors: no"
else
echo " HUD lmsensors: yes"
fi
dnl Shader cache
echo ""
echo " Shader cache: $enable_shader_cache"

View File

@@ -107,11 +107,11 @@ GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
GL_ARB_vertex_type_2_10_10_10_rev DONE (swr)
GL 4.0, GLSL 4.00 --- all DONE: i965/gen8+, nvc0, r600, radeonsi
GL 4.0, GLSL 4.00 --- all DONE: nvc0, r600, radeonsi
GL_ARB_draw_buffers_blend DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_draw_indirect DONE (i965/gen7+, llvmpipe, softpipe, swr)
GL_ARB_gpu_shader5 DONE (i965/gen7+)
GL_ARB_draw_buffers_blend DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_draw_indirect DONE (i965, llvmpipe, softpipe, swr)
GL_ARB_gpu_shader5 DONE (i965)
- 'precise' qualifier DONE
- Dynamically uniform sampler array indices DONE (softpipe)
- Dynamically uniform UBO array indices DONE ()
@@ -124,214 +124,154 @@ GL 4.0, GLSL 4.00 --- all DONE: i965/gen8+, nvc0, r600, radeonsi
- Enhanced per-sample shading DONE ()
- Interpolation functions DONE ()
- New overload resolution rules DONE
GL_ARB_gpu_shader_fp64 DONE (llvmpipe, softpipe)
GL_ARB_sample_shading DONE (i965/gen6+, nv50)
GL_ARB_shader_subroutine DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_tessellation_shader DONE (i965/gen7+)
GL_ARB_texture_buffer_object_rgb32 DONE (i965/gen6+, llvmpipe, softpipe, swr)
GL_ARB_texture_cube_map_array DONE (i965/gen6+, nv50, llvmpipe, softpipe)
GL_ARB_texture_gather DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_gpu_shader_fp64 DONE (i965/gen8+, llvmpipe, softpipe)
GL_ARB_sample_shading DONE (i965, nv50)
GL_ARB_shader_subroutine DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_tessellation_shader DONE (i965)
GL_ARB_texture_buffer_object_rgb32 DONE (i965, llvmpipe, softpipe, swr)
GL_ARB_texture_cube_map_array DONE (i965, nv50, llvmpipe, softpipe)
GL_ARB_texture_gather DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_texture_query_lod DONE (i965, nv50, softpipe)
GL_ARB_transform_feedback2 DONE (i965/gen7+, nv50, llvmpipe, softpipe, swr)
GL_ARB_transform_feedback3 DONE (i965/gen7+, nv50, llvmpipe, softpipe, swr)
GL_ARB_transform_feedback2 DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_transform_feedback3 DONE (i965, nv50, llvmpipe, softpipe, swr)
GL 4.1, GLSL 4.10 --- all DONE: i965/gen8+, nvc0, r600, radeonsi
GL 4.1, GLSL 4.10 --- all DONE: nvc0, r600, radeonsi
GL_ARB_ES2_compatibility DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_get_program_binary DONE (0 binary formats)
GL_ARB_separate_shader_objects DONE (all drivers)
GL_ARB_shader_precision DONE (all drivers that support GLSL 4.10)
GL_ARB_vertex_attrib_64bit DONE (llvmpipe, softpipe)
GL_ARB_vertex_attrib_64bit DONE (i965/gen8+, llvmpipe, softpipe)
GL_ARB_viewport_array DONE (i965, nv50, llvmpipe, softpipe)
GL 4.2, GLSL 4.20 -- all DONE: i965/gen8+, nvc0, radeonsi
GL 4.2, GLSL 4.20 -- all DONE: radeonsi
GL_ARB_texture_compression_bptc DONE (i965, r600)
GL_ARB_texture_compression_bptc DONE (i965, nvc0, r600, radeonsi)
GL_ARB_compressed_texture_pixel_storage DONE (all drivers)
GL_ARB_shader_atomic_counters DONE (i965, softpipe)
GL_ARB_shader_atomic_counters DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_texture_storage DONE (all drivers)
GL_ARB_transform_feedback_instanced DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_base_instance DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_shader_image_load_store DONE (i965, softpipe)
GL_ARB_transform_feedback_instanced DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_base_instance DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_shader_image_load_store DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_conservative_depth DONE (all drivers that support GLSL 1.30)
GL_ARB_shading_language_420pack DONE (all drivers that support GLSL 1.30)
GL_ARB_shading_language_packing DONE (all drivers)
GL_ARB_internalformat_query DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_internalformat_query DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_map_buffer_alignment DONE (all drivers)
GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, radeonsi
GL 4.3, GLSL 4.30:
GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30)
GL_ARB_ES3_compatibility DONE (all drivers that support GLSL 3.30)
GL_ARB_clear_buffer_object DONE (all drivers)
GL_ARB_compute_shader DONE (i965, softpipe)
GL_ARB_copy_image DONE (i965, nv50, r600, softpipe, llvmpipe)
GL_ARB_compute_shader DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_copy_image DONE (i965, nv50, nvc0, r600, radeonsi)
GL_KHR_debug DONE (all drivers)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_fragment_layer_viewport DONE (i965, nv50, r600, llvmpipe, softpipe)
GL_ARB_framebuffer_no_attachments DONE (i965, r600, softpipe)
GL_ARB_fragment_layer_viewport DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe)
GL_ARB_framebuffer_no_attachments DONE (i965, nvc0, r600, radeonsi, softpipe)
GL_ARB_internalformat_query2 DONE (all drivers)
GL_ARB_invalidate_subdata DONE (all drivers)
GL_ARB_multi_draw_indirect DONE (i965, r600, llvmpipe, softpipe, swr)
GL_ARB_multi_draw_indirect DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_program_interface_query DONE (all drivers)
GL_ARB_robust_buffer_access_behavior DONE (i965)
GL_ARB_shader_image_size DONE (i965, softpipe)
GL_ARB_shader_storage_buffer_object DONE (i965, softpipe)
GL_ARB_stencil_texturing DONE (i965/hsw+, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_texture_buffer_range DONE (nv50, i965, r600, llvmpipe)
GL_ARB_robust_buffer_access_behavior DONE (i965, nvc0, radeonsi)
GL_ARB_shader_image_size DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_shader_storage_buffer_object DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_stencil_texturing DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_texture_buffer_range DONE (nv50, nvc0, i965, r600, radeonsi, llvmpipe)
GL_ARB_texture_query_levels DONE (all drivers that support GLSL 1.30)
GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample)
GL_ARB_texture_view DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_texture_view DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_vertex_attrib_binding DONE (all drivers)
GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, radeonsi
GL 4.4, GLSL 4.40:
GL_MAX_VERTEX_ATTRIB_STRIDE DONE (all drivers)
GL_ARB_buffer_storage DONE (i965, nv50, r600)
GL_ARB_clear_texture DONE (i965, nv50, r600)
GL_ARB_enhanced_layouts DONE (i965, nv50, llvmpipe, softpipe)
GL_ARB_buffer_storage DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_clear_texture DONE (i965, nv50, nvc0)
GL_ARB_enhanced_layouts in progress (Timothy)
- compile-time constant expressions DONE
- explicit byte offsets for blocks DONE
- forced alignment within blocks DONE
- specified vec4-slot component numbers DONE (i965, nv50, llvmpipe, softpipe)
- specified vec4-slot component numbers in progress
- specified transform/feedback layout DONE
- input/output block locations DONE
GL_ARB_multi_bind DONE (all drivers)
GL_ARB_query_buffer_object DONE (i965/hsw+)
GL_ARB_texture_mirror_clamp_to_edge DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_texture_stencil8 DONE (i965/hsw+, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_query_buffer_object DONE (i965/hsw+, nvc0)
GL_ARB_texture_mirror_clamp_to_edge DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_texture_stencil8 DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL 4.5, GLSL 4.50 -- all DONE: nvc0, radeonsi
GL 4.5, GLSL 4.50:
GL_ARB_ES3_1_compatibility DONE (i965/hsw+)
GL_ARB_clip_control DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_conditional_render_inverted DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_cull_distance DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_derivative_control DONE (i965, nv50, r600)
GL_ARB_ES3_1_compatibility DONE (nvc0, radeonsi)
GL_ARB_clip_control DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_conditional_render_inverted DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_cull_distance DONE (i965, nv50, nvc0, llvmpipe, softpipe)
GL_ARB_derivative_control DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_direct_state_access DONE (all drivers)
GL_ARB_get_texture_sub_image DONE (all drivers)
GL_ARB_shader_texture_image_samples DONE (i965, nv50, r600)
GL_ARB_texture_barrier DONE (i965, nv50, r600)
GL_ARB_shader_texture_image_samples DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_texture_barrier DONE (i965, nv50, nvc0, r600, radeonsi)
GL_KHR_context_flush_control DONE (all - but needs GLX/EGL extension to be useful)
GL_KHR_robustness DONE (i965)
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)
These are the extensions cherry-picked to make GLES 3.1
GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, radeonsi
GLES3.1, GLSL ES 3.1
GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30)
GL_ARB_compute_shader DONE (i965/gen7+, softpipe)
GL_ARB_draw_indirect DONE (i965/gen7+, r600, llvmpipe, softpipe, swr)
GL_ARB_compute_shader DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_draw_indirect DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_framebuffer_no_attachments DONE (i965/gen7+, r600, softpipe)
GL_ARB_framebuffer_no_attachments DONE (i965, nvc0, r600, radeonsi, softpipe)
GL_ARB_program_interface_query DONE (all drivers)
GL_ARB_shader_atomic_counters DONE (i965/gen7+, softpipe)
GL_ARB_shader_image_load_store DONE (i965/gen7+, softpipe)
GL_ARB_shader_image_size DONE (i965/gen7+, softpipe)
GL_ARB_shader_storage_buffer_object DONE (i965/gen7+, softpipe)
GL_ARB_shader_atomic_counters DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_shader_image_load_store DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_shader_image_size DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_shader_storage_buffer_object DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_shading_language_packing DONE (all drivers)
GL_ARB_separate_shader_objects DONE (all drivers)
GL_ARB_stencil_texturing DONE (nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_texture_multisample (Multisample textures) DONE (i965/gen7+, nv50, r600, llvmpipe, softpipe)
GL_ARB_stencil_texturing DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_texture_multisample (Multisample textures) DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample)
GL_ARB_vertex_attrib_binding DONE (all drivers)
GS5 Enhanced textureGather DONE (i965/gen7+, r600)
GS5 Packing/bitfield/conversion functions DONE (i965/gen6+, r600)
GS5 Enhanced textureGather DONE (i965, nvc0, r600, radeonsi)
GS5 Packing/bitfield/conversion functions DONE (i965, nvc0, r600, radeonsi)
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)
Additional functionality not covered above:
glMemoryBarrierByRegion DONE
glGetTexLevelParameter[fi]v - needs updates DONE
glGetBooleani_v - restrict to GLES enums
gl_HelperInvocation support DONE (i965, r600)
GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+
gl_HelperInvocation support DONE (i965, nvc0, r600, radeonsi)
GLES3.2, GLSL ES 3.2
GL_EXT_color_buffer_float DONE (all drivers)
GL_KHR_blend_equation_advanced DONE (i965)
GL_KHR_blend_equation_advanced not started
GL_KHR_debug DONE (all drivers)
GL_KHR_robustness DONE (i965, nvc0, radeonsi)
GL_KHR_robustness DONE (i965)
GL_KHR_texture_compression_astc_ldr DONE (i965/gen9+)
GL_OES_copy_image DONE (all drivers)
GL_OES_copy_image DONE (i965)
GL_OES_draw_buffers_indexed DONE (all drivers that support GL_ARB_draw_buffers_blend)
GL_OES_draw_elements_base_vertex DONE (all drivers)
GL_OES_geometry_shader DONE (i965/gen8+, nvc0, radeonsi)
GL_OES_geometry_shader started (idr)
GL_OES_gpu_shader5 DONE (all drivers that support GL_ARB_gpu_shader5)
GL_OES_primitive_bounding_box DONE (i965/gen7+, nvc0, radeonsi)
GL_OES_primitive_bounding_box not started
GL_OES_sample_shading DONE (i965, nvc0, r600, radeonsi)
GL_OES_sample_variables DONE (i965, nvc0, r600, radeonsi)
GL_OES_shader_image_atomic DONE (all drivers that support GL_ARB_shader_image_load_store)
GL_OES_shader_io_blocks DONE (i965/gen8+, nvc0, radeonsi)
GL_OES_shader_multisample_interpolation DONE (i965, nvc0, r600, radeonsi)
GL_OES_tessellation_shader DONE (all drivers that support GL_ARB_tessellation_shader)
GL_OES_tessellation_shader started (Ken)
GL_OES_texture_border_clamp DONE (all drivers)
GL_OES_texture_buffer DONE (i965, nvc0, radeonsi)
GL_OES_texture_cube_map_array DONE (i965/gen8+, nvc0, radeonsi)
GL_OES_texture_cube_map_array not started (based on GL_ARB_texture_cube_map_array, which is done for all drivers)
GL_OES_texture_stencil8 DONE (all drivers that support GL_ARB_texture_stencil8)
GL_OES_texture_storage_multisample_2d_array DONE (all drivers that support GL_ARB_texture_multisample)
Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES version:
GL_ARB_bindless_texture started (airlied)
GL_ARB_cl_event not started
GL_ARB_compute_variable_group_size DONE (nvc0, radeonsi)
GL_ARB_ES3_2_compatibility DONE (i965/gen8+)
GL_ARB_fragment_shader_interlock not started
GL_ARB_gl_spirv not started
GL_ARB_gpu_shader_int64 started (airlied for core and Gallium, idr for i965)
GL_ARB_indirect_parameters DONE (nvc0, radeonsi)
GL_ARB_parallel_shader_compile not started, but Chia-I Wu did some related work in 2014
GL_ARB_pipeline_statistics_query DONE (i965, nvc0, radeonsi, softpipe, swr)
GL_ARB_post_depth_coverage not started
GL_ARB_robustness_isolation not started
GL_ARB_sample_locations not started
GL_ARB_seamless_cubemap_per_texture DONE (i965, nvc0, radeonsi, r600, softpipe, swr)
GL_ARB_shader_atomic_counter_ops DONE (nvc0, radeonsi, softpipe)
GL_ARB_shader_ballot not started
GL_ARB_shader_clock DONE (i965/gen7+)
GL_ARB_shader_draw_parameters DONE (i965, nvc0, radeonsi)
GL_ARB_shader_group_vote DONE (nvc0)
GL_ARB_shader_stencil_export DONE (i965/gen9+, radeonsi, softpipe, llvmpipe, swr)
GL_ARB_shader_viewport_layer_array DONE (i965/gen6+)
GL_ARB_sparse_buffer not started
GL_ARB_sparse_texture not started
GL_ARB_sparse_texture2 not started
GL_ARB_sparse_texture_clamp not started
GL_ARB_texture_filter_minmax not started
GL_ARB_transform_feedback_overflow_query not started
GL_KHR_blend_equation_advanced_coherent DONE (i965/gen9+)
GL_KHR_no_error not started
GL_KHR_texture_compression_astc_hdr DONE (core only)
GL_KHR_texture_compression_astc_sliced_3d not started
GL_OES_depth_texture_cube_map DONE (all drivers that support GLSL 1.30+)
GL_OES_EGL_image DONE (all drivers)
GL_OES_EGL_image_external_essl3 not started
GL_OES_required_internalformat not started - GLES2 extension based on OpenGL ES 3.0 feature
GL_OES_surfaceless_context DONE (all drivers)
GL_OES_texture_compression_astc DONE (core only)
GL_OES_texture_float DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_float_linear DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_half_float DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_half_float_linear DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
GL_OES_texture_view not started - based on GL_ARB_texture_view
GL_OES_viewport_array DONE (i965, nvc0, radeonsi)
GLX_ARB_context_flush_control not started
GLX_ARB_robustness_application_isolation not started
GLX_ARB_robustness_share_group_isolation not started
The following extensions are not part of any OpenGL or OpenGL ES version, and
we DO NOT WANT implementations of these extensions for Mesa.
GL_ARB_geometry_shader4 Superseded by GL 3.2 geometry shaders
GL_ARB_matrix_palette Superseded by GL_ARB_vertex_program
GL_ARB_shading_language_include Not interesting
GL_ARB_shadow_ambient Superseded by GL_ARB_fragment_program
GL_ARB_vertex_blend Superseded by GL_ARB_vertex_program
More info about these features and the work involved can be found at
http://dri.freedesktop.org/wiki/MissingFunctionality

View File

@@ -38,7 +38,7 @@ including:
<p>
Other companies including
<a href="https://01.org/linuxgraphics">Intel</a>
<a href="http://www.intellinuxgraphics.org/index.html">Intel</a>
and RedHat also actively contribute to the project.
Intel has recently contributed the new GLSL compiler in Mesa 7.9.
</p>

View File

@@ -252,14 +252,10 @@ check for regressions.
<h3>Mailing Patches</h3>
<p>
Patches should be sent to the mesa-dev mailing list for review:
<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev">
mesa-dev@lists.freedesktop.org<a/>.
When submitting a patch make sure to use
<a href="https://git-scm.com/docs/git-send-email">git send-email</a>
rather than attaching patches to emails. Sending patches as
attachments prevents people from being able to provide in-line review
comments.
Patches should be sent to the Mesa mailing list for review.
When submitting a patch make sure to use git send-email rather than attaching
patches to emails. Sending patches as attachments prevents people from being
able to provide in-line review comments.
</p>
<p>
@@ -688,11 +684,9 @@ To add a new GL extension to Mesa you have to do at least the following.
</li>
<li>
Add a new entry to the <code>gl_extensions</code> struct in mtypes.h
if the extension requires driver capabilities not already exposed by
another extension.
</li>
<li>
Add a new entry to the src/mesa/main/extensions_table.h file.
Update the <code>extensions.c</code> file.
</li>
<li>
From this point, the best way to proceed is to find another extension,
@@ -703,18 +697,12 @@ To add a new GL extension to Mesa you have to do at least the following.
If the new extension adds new GL state, the functions in get.c, enable.c
and attrib.c will most likely require new code.
</li>
<li>
To determine if the new extension is active in the current context,
use the auto-generated _mesa_has_##name_str() function defined in
src/mesa/main/extensions.h.
</li>
<li>
The dispatch tests check_table.cpp and dispatch_sanity.cpp
should be updated with details about the new extensions functions. These
tests are run using 'make check'
</li>
</ul>
</p>

View File

@@ -50,17 +50,8 @@ sometimes be useful for debugging end-user issues.
if the application generates a GL_INVALID_ENUM error, a corresponding error
message indicating where the error occurred, and possibly why, will be
printed to stderr.<br>
For release builds, MESA_DEBUG defaults to off (no debug output).
MESA_DEBUG accepts the following comma-separated list of named
flags, which adds extra behaviour to just set MESA_DEBUG=1:
<ul>
<li>silent - turn off debug messages. Only useful for debug builds.</li>
<li>flush - flush after each drawing command</li>
<li>incomplete_tex - extra debug messages when a texture is incomplete</li>
<li>incomplete_fbo - extra debug messages when a fbo is incomplete</li>
</ul>
If the value of MESA_DEBUG is 'FP' floating point arithmetic errors will
generate exceptions.
<li>MESA_LOG_FILE - specifies a file name for logging all errors, warnings,
etc., rather than stderr
<li>MESA_TEX_PROG - if set, implement conventional texture env modes with
@@ -153,10 +144,11 @@ See the <a href="xlibdriver.html">Xlib software driver page</a> for details.
<li>bat - emit batch information</li>
<li>pix - emit messages about pixel operations</li>
<li>buf - emit messages about buffer objects</li>
<li>reg - emit messages about regions</li>
<li>fbo - emit messages about framebuffers</li>
<li>fs - dump shader assembly for fragment shaders</li>
<li>gs - dump shader assembly for geometry shaders</li>
<li>sync - after sending each batch, emit a message and wait for that batch to finish rendering</li>
<li>sync - emit messages about synchronization</li>
<li>prim - emit messages about drawing primitives</li>
<li>vert - emit messages about vertex assembly</li>
<li>dri - emit messages about the DRI interface</li>
@@ -171,18 +163,9 @@ See the <a href="xlibdriver.html">Xlib software driver page</a> for details.
<li>blorp - emit messages about the blorp operations (blits &amp; clears)</li>
<li>nodualobj - suppress generation of dual-object geometry shader code</li>
<li>optimizer - dump shader assembly to files at each optimization pass and iteration that make progress</li>
<li>ann - annotate IR in assembly dumps</li>
<li>no8 - don't generate SIMD8 fragment shader</li>
<li>vec4 - force vec4 mode in vertex shader</li>
<li>spill_fs - force spilling of all registers in the scalar backend (useful to debug spilling code)</li>
<li>spill_vec4 - force spilling of all registers in the vec4 backend (useful to debug spilling code)</li>
<li>cs - dump shader assembly for compute shaders</li>
<li>hex - print instruction hex dump with the disassembly</li>
<li>nocompact - disable instruction compaction</li>
<li>tcs - dump shader assembly for tessellation control shaders</li>
<li>tes - dump shader assembly for tessellation evaluation shaders</li>
<li>l3 - emit messages about the new L3 state during transitions</li>
<li>do32 - generate compute shader SIMD32 programs even if workgroup size doesn't exceed the SIMD16 limit</li>
<li>norbc - disable single sampled render buffer compression</li>
</ul>
</ul>
@@ -215,10 +198,8 @@ Mesa EGL supports different sets of environment variables. See the
<li>GALLIUM_HUD_TOGGLE_SIGNAL - toggle visibility via user specified signal.
Especially useful to toggle hud at specific points of application and
disable for unencumbered viewing the rest of the time. For example, set
GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_TOGGLE_SIGNAL to 10 (SIGUSR1).
GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_SIGNAL_TOGGLE to 10 (SIGUSR1).
Use kill -10 <pid> to toggle the hud as desired.
<li>GALLIUM_DRIVER - useful in combination with LIBGL_ALWAYS_SOFTWARE=1 for
choosing one of the software renderers "softpipe", "llvmpipe" or "swr".
<li>GALLIUM_LOG_FILE - specifies a file for logging all errors, warnings, etc.
rather than stderr.
<li>GALLIUM_PRINT_OPTIONS - if non-zero, print all the Gallium environment

View File

@@ -57,7 +57,7 @@ drivers for X.org.
<ul>
<li>See the <a href="http://dri.freedesktop.org/">DRI website</a>
for more information.</li>
<li>See <a href="https://01.org/linuxgraphics">01.org</a>
<li>See <a href="http://intellinuxgraphics.org">intellinuxgraphics.org</a>
for more information about Intel drivers.</li>
<li>See <a href="http://nouveau.freedesktop.org">nouveau.freedesktop.org</a>
for more information about Nouveau drivers.</li>

View File

@@ -56,8 +56,8 @@ You can find some further To-do lists here:
<b>Common To-Do lists:</b>
</p>
<ul>
<li><a href="http://cgit.freedesktop.org/mesa/mesa/tree/docs/features.txt">
<b>features.txt</b></a> - Status of OpenGL 3.x / 4.x features in Mesa.</li>
<li><a href="http://cgit.freedesktop.org/mesa/mesa/tree/docs/GL3.txt">
<b>GL3.txt</b></a> - Status of OpenGL 3.x / 4.x features in Mesa.</li>
<li><a href="http://dri.freedesktop.org/wiki/MissingFunctionality">
<b>MissingFunctionality</b></a> - Detailed information about missing OpenGL features.</li>
</ul>

View File

@@ -16,31 +16,6 @@
<h1>News</h1>
<h2>September 15, 2016</h2>
<p>
<a href="relnotes/12.0.3.html">Mesa 12.0.3</a> is released.
This is a bug-fix release.
</p>
<h2>September 2, 2016</h2>
<p>
<a href="relnotes/12.0.2.html">Mesa 12.0.2</a> is released.
This is a bug-fix release.
</p>
<h2>July 8, 2016</h2>
<p>
<a href="relnotes/12.0.1.html">Mesa 12.0.1</a> is released.
This is a bug-fix release, resolving build issues in the r600 and
radeonsi drivers.
</p>
<p>
<a href="relnotes/12.0.0.html">Mesa 12.0.0</a> is released. This is a
new development release. See the release notes for more information
about the release.
</p>
<h2>May 9, 2016</h2>
<p>
<a href="relnotes/11.1.4.html">Mesa 11.1.4</a> and

View File

@@ -173,27 +173,6 @@ of the OpenGL specification is implemented.
</p>
<h2>Version 12.x features</h2>
<p>
Version 12.x of Mesa implements the OpenGL 4.3 API, but not all drivers
support OpenGL 4.3.
</p>
<h2>Version 11.x features</h2>
<p>
Version 11.x of Mesa implements the OpenGL 4.1 API, but not all drivers
support OpenGL 4.1.
</p>
<h2>Version 10.x features</h2>
<p>
Version 10.x of Mesa implements the OpenGL 3.3 API, but not all drivers
support OpenGL 3.3.
</p>
<h2>Version 9.x features</h2>
<p>
Version 9.x of Mesa implements the OpenGL 3.1 API.
@@ -203,10 +182,6 @@ community contributed features required for OpenGL 3.1. The primary
features added since the Mesa 8.0 release are
GL_ARB_texture_buffer_object and GL_ARB_uniform_buffer_object.
</p>
<p>
Version 9.0 of Mesa also included the first release of the Clover state
tracker for OpenCL.
</p>
<h2>Version 8.x features</h2>

View File

@@ -21,10 +21,6 @@ The release notes summarize what's new or changed in each Mesa release.
</p>
<ul>
<li><a href="relnotes/12.0.3.html">12.0.3 release notes</a>
<li><a href="relnotes/12.0.2.html">12.0.2 release notes</a>
<li><a href="relnotes/12.0.1.html">12.0.1 release notes</a>
<li><a href="relnotes/12.0.0.html">12.0.0 release notes</a>
<li><a href="relnotes/11.2.2.html">11.2.2 release notes</a>
<li><a href="relnotes/11.1.4.html">11.1.4 release notes</a>
<li><a href="relnotes/11.2.1.html">11.2.1 release notes</a>

View File

@@ -16,6 +16,8 @@
<h1>Mesa 12.0.1 Release Notes / July 8, 2016</h1>
<h1>Mesa 12.0.1 Release Notes / July 8, 2016</h1>
<p>
Mesa 12.0.1 is a bug fix release which fixes bugs found since the 12.0.1 release.
</p>

View File

@@ -1,311 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 13.0.0 Release Notes / November 1, 2016</h1>
<p>
Mesa 13.0.0 is a new development release.
People who are concerned with stability and reliability should stick
with a previous release or wait for Mesa 13.0.1.
</p>
<p>
Mesa 13.0.0 implements the OpenGL 4.4 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.4. OpenGL
4.4 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
4a54d7cdc1a94a8dae05a75ccff48356406d51b0d6a64cbdc641c266e3e008eb mesa-13.0.0.tar.gz
94edb4ebff82066a68be79d9c2627f15995e1fe10f67ab3fc63deb842027d727 mesa-13.0.0.tar.xz
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>OpenGL ES 3.1 on i965/hsw</li>
<li>OpenGL ES 3.2 on i965/gen9+ (Skylake and later)</li>
<li>GL_ARB_ES3_1_compatibility on i965</li>
<li>GL_ARB_ES3_2_compatibility on i965/gen8+</li>
<li>GL_ARB_clear_texture on r600, radeonsi</li>
<li>GL_ARB_compute_variable_group_size on nvc0, radeonsi</li>
<li>GL_ARB_cull_distance on radeonsi</li>
<li>GL_ARB_enhanced_layouts on i965, nv50, nvc0, radeonsi, llvmpipe, softpipe</li>
<li>GL_ARB_indirect_parameters on radeonsi</li>
<li>GL_ARB_query_buffer_object on radeonsi</li>
<li>GL_ARB_shader_draw_parameters on radeonsi</li>
<li>GL_ARB_shader_group_vote on nvc0</li>
<li>GL_ARB_shader_viewport_layer_array on i965/gen6+</li>
<li>GL_ARB_stencil_texturing on i965/hsw</li>
<li>GL_ARB_texture_stencil8 on i965/hsw</li>
<li>GL_EXT_window_rectangles on nv50, nvc0</li>
<li>GL_KHR_blend_equation_advanced on i965</li>
<li>GL_KHR_robustness on nvc0, radeonsi</li>
<li>GL_KHR_texture_compression_astc_sliced_3d on i965</li>
<li>GL_OES_copy_image on nv50, nvc0, r600, radeonsi, softpipe, llvmpipe</li>
<li>GL_OES_geometry_shader on i965/gen8+, nvc0, radeonsi</li>
<li>GL_OES_primitive_bounding_box on i965/gen7+, nvc0, radeonsi</li>
<li>GL_OES_texture_cube_map_array on i965/gen8+, nvc0, radeonsi</li>
<li>GL_OES_tessellation_shader on i965/gen7+, nvc0, radeonsi</li>
<li>GL_OES_viewport_array on nvc0, radeonsi</li>
<li>GL_ANDROID_extension_pack_es31a on i965/gen9+</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61907">Bug 61907</a> - Indirect rendering of multi-texture vertex arrays broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69622">Bug 69622</a> - eglTerminate then eglMakeCurrent crahes</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71759">Bug 71759</a> - Intel driver fails with &quot;intel_do_flush_locked failed: No such file or directory&quot; if buffer imported with EGL_NATIVE_PIXMAP_KHR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83036">Bug 83036</a> - [ILK]Piglit spec_ARB_copy_image_arb_copy_image-formats fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89599">Bug 89599</a> - symbol 'x86_64_entry_start' is already defined when building with LLVM/clang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90513">Bug 90513</a> - Odd gray and red flicker in The Talos Principle on GK104</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91342">Bug 91342</a> - Very dark textures on some objects in indoors environments in Postal 2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92306">Bug 92306</a> - GL Excess demo renders incorrectly on nv43</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94148">Bug 94148</a> - Framebuffer considered invalid when a draw call is done before glCheckFramebufferStatus</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94354">Bug 94354</a> - R9285 Unigine Valley perf regression since radeonsi: use re-Z</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94561">Bug 94561</a> - [llvmpipe] PIPE_CAP_VIDEO_MEMORY reports negative value on 32 bits (with 16GB ram)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94627">Bug 94627</a> - Game Risen on wine black grass</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94681">Bug 94681</a> - dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 takes 25 minutes to compile</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95000">Bug 95000</a> - deqp: assert in dEQP-GLES3.functional.vertex_arrays.single_attribute.strides.fixed.user_ptr_stride17_components2_quads1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95130">Bug 95130</a> - Derivatives of gl_Color wrong when helper pixels used</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95246">Bug 95246</a> - Segfault in glBindFramebuffer()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95419">Bug 95419</a> - [HSW][regression][bisect] RPG Maker game gives &quot;invalid floating point operation&quot; at startup</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95462">Bug 95462</a> - [BXT,BSW] arb_gpu_shader_fp64 causes gpu hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95529">Bug 95529</a> - [regression, bisected] Image corruption in Chrome</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96235">Bug 96235</a> - st_nir.h:34: error: redefinition of typedef nir_shader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96274">Bug 96274</a> - [NVC0] Failure when compiling compute shader: Assertion `bb-&gt;getFirst()-&gt;serial &lt;= bb-&gt;getExit()-&gt;serial' failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96285">Bug 96285</a> - Mesa build broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96299">Bug 96299</a> - [vulkan] 64 regressions due to mesa d5f2f32</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96343">Bug 96343</a> - oom since st/mesa: implement PBO downloads for ReadPixels</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96346">Bug 96346</a> - [SNB,CTS] es2-cts.gtf.gl.atan regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96349">Bug 96349</a> - [CTS,SKL,BSW,BDW,KBL,BXT] es31-cts.arrays_of_arrays.interactionuniformbuffers3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96351">Bug 96351</a> - [CTS,SKL,KBL,BXT] es2-cts.gtf.gl2extensiontests.egl_image.egl_image</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96358">Bug 96358</a> - SSO: wrong interface validation between GS and VS (regresion due to latest gles 3.1)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96425">Bug 96425</a> - [bisected] occasional dark render in The Talos Principle</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96484">Bug 96484</a> - [vulkan] deqp-vk.glsl.builtin.precision.sin / cos regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96504">Bug 96504</a> - [vulkancts] compute tests crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96516">Bug 96516</a> - [bisected: 482526] &quot;clover: Update OpenCL version string to match OpenGL&quot;: clover's build fails because of missing git_sha1.h</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96528">Bug 96528</a> - Location qualifier segfaults during shader compilation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96541">Bug 96541</a> - Tonga Unreal elemental bad rendering since radeonsi: Decompress DCC textures in a render feedback loop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96565">Bug 96565</a> - Clive Barker's Jericho displays strange,vivid colors when motion blur enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96607">Bug 96607</a> - [bisected] texture misrender / flicker in The Talos Principle on SKL</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96617">Bug 96617</a> - gl_SecondaryFragDataEXT doesn't work for extended blend func</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96629">Bug 96629</a> - dEQP-GLES2.functional.texture.completeness.cube.not_positive_level_0: Assertion `width &gt;= 1' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96639">Bug 96639</a> - st/mesa: transfer_map with too-high level with dEQP-GLES2.functional.texture.completeness.cube.extra_level</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96674">Bug 96674</a> - [SNB, ILK] spec.ext_image_dma_buf_import.ext_image_dma_buf_import-sample_nv1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96729">Bug 96729</a> - Wrong shader compilation error message</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96762">Bug 96762</a> - [radeonsi,apitrace] Firewatch: nothing rendered in scrollable (text) areas</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96765">Bug 96765</a> - BindFragDataLocationIndexed on array fragment shader output.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96770">Bug 96770</a> - include/GL/mesa_glinterop.h:62: error: redefinition of typedef GLXContext</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96782">Bug 96782</a> - [regression bisected] R600 fp64 and glsl-4.00 piglit failures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96791">Bug 96791</a> - Cannot use image from swapchains for sampling</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96825">Bug 96825</a> - anv_device.c:31:27: fatal error: anv_timestamp.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96835">Bug 96835</a> - &quot;gallium: Force blend color to 16-byte alignment&quot; crash with &quot;-march=native -O3&quot; causes some 32bit games to crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96850">Bug 96850</a> - Crucible tests fail for 32bit mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96878">Bug 96878</a> - [Bisected: cc2d0e6][HSW] &quot;GPU HANG&quot; msg after autologin to gnome-session</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96908">Bug 96908</a> - [radeonsi] MSAA causes graphical artifacts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96911">Bug 96911</a> - webgl2 conformance2/textures/misc/tex-mipmap-levels.html crashes 12.1 Intel driver</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96949">Bug 96949</a> - [regression] Piglit numSamples assertion failures with 9a23a177b90</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96950">Bug 96950</a> - Another regression from bc4e0c486: vbo: Use a bitmask to track the active arrays in vbo_exec*.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96971">Bug 96971</a> - invariant qualifier is not valid for shader inputs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97019">Bug 97019</a> - [clover] build failure in llvm/codegen/native.cpp:129:52</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97032">Bug 97032</a> - [BDW,SKL] piglit.spec.arb_gpu_shader5.arb_gpu_shader5-interpolateatcentroid-flat</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97033">Bug 97033</a> - [BDW,SKL] piglit.spec.arb_gpu_shader_fp64.varying-packing.simple regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97039">Bug 97039</a> - The Talos Principle and Serious Sam 3 GPU faults</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97083">Bug 97083</a> - [IVB,BYT] GPU hang on deqp-gles31.functional.separate.shader.random</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97140">Bug 97140</a> - dd_draw.c:949:11: error: implicit declaration of function 'fmemopen' is invalid in C99 [-Werror,-Wimplicit-function-declaration]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97207">Bug 97207</a> - [IVY BRIDGE] Fragment shader discard writing to depth</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97214">Bug 97214</a> - X not running with error &quot;Failed to make EGL context current&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97225">Bug 97225</a> - [i965 on HD4600 Haswell] xcom switch to ingame cinematics cause segmentation fault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97231">Bug 97231</a> - GL_DEPTH_CLAMP doesn't clamp to the far plane</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97233">Bug 97233</a> - vkQuake VkSpecializationMapEntry related bug</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97260">Bug 97260</a> - R9 290 low performance in Linux 4.7</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97267">Bug 97267</a> - [BDW] GL45-CTS.texture_cube_map_array.sampling asserts inside brw_fs.cpp</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97278">Bug 97278</a> - [vulkancts,HSW] all vulkancts tests assert on HSW</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97285">Bug 97285</a> - Darkness in Dota 2 after Patch &quot;Make Gallium's BlitFramebuffer follow the GL 4.4 sRGB rules&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97286">Bug 97286</a> - `make check` fails uniform-initializer-test</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97305">Bug 97305</a> - Gallium: TBOs and images set the offset in elements, not bytes</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97307">Bug 97307</a> - glsl/glcpp/tests/glcpp-test regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97309">Bug 97309</a> - piglit.spec.glsl-1_30.compiler.switch-statement.switch-case-duplicated.vert regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97322">Bug 97322</a> - GenerateMipmap creates wrong mipmap for sRGB texture</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97331">Bug 97331</a> - glDrawElementsBaseVertex doesn't work in display list on i915</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97351">Bug 97351</a> - DrawElementsBaseVertex with VBO ignores base vertex on Intel GMA 9xx in some cases</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97413">Bug 97413</a> - BioShock Infinite crashes on startup with Mesa Git version, R7 370</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97426">Bug 97426</a> - glScissor gives vertically inverted result</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97448">Bug 97448</a> - [HSW] deqp-vk.api_.copy_and_blit.image_to_image_stencil regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97476">Bug 97476</a> - Shader binaries should not be stored in the PipelineCache</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97477">Bug 97477</a> - i915g: gl_FragCoord is always (0.0, max_y)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97513">Bug 97513</a> - clover reports wrong device pointer size</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97549">Bug 97549</a> - [SNB, BXT] up to 40% perf drop from &quot;loader/dri3: Overhaul dri3_update_num_back&quot; commit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97587">Bug 97587</a> - make check nir/tests/control_flow_tests regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97761">Bug 97761</a> - es2-cts.gtf.gl2extensiontests.egl_image_external.testsimpleunassociated crashes</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97773">Bug 97773</a> - New Mesa master now results in warnings in glrender (and subsurfaces and simple-egl), black screen</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97779">Bug 97779</a> - [regression, bisected][BDW, GPU hang] stuck on render ring, always reproducible</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97790">Bug 97790</a> - Vulkan cts regressions due to 24be63066</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97804">Bug 97804</a> - Later precision statement isn't overriding earlier one</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97808">Bug 97808</a> - &quot;tgsi/scan: don't set interp flags for inputs only used by INTERP instructions&quot; causes glitches in wine with gallium nine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97887">Bug 97887</a> - llvm segfault in janusvr -render vive</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97894">Bug 97894</a> - Crash in u_transfer_unmap_vtbl when unmapping a buffer mapped in different context</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97952">Bug 97952</a> - /usr/include/string.h:518:12: error: exception specification in declaration does not match previous declaration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97969">Bug 97969</a> - [radeonsi, bisected: fb827c0] Video decoding shows green artifacts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97976">Bug 97976</a> - VCE regression BO to small for addr since winsys/amdgpu: enable buffer allocation from slabs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98005">Bug 98005</a> - VCE dual instance encoding inconsistent since st/va: enable dual instances encode by sync surface</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98025">Bug 98025</a> - [radeonsi] incorrect primitive restart index used</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98128">Bug 98128</a> - nir/tests/control_flow_tests.cpp:79:73: error: nir_loop_first_cf_node was not declared in this scope</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98131">Bug 98131</a> - Compiler should reject lowp/mediump qualifiers on atomic_uints</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98133">Bug 98133</a> - GetSynciv should raise an error if bufSize &lt; 0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98134">Bug 98134</a> - dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers wants a different GL error code</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98135">Bug 98135</a> - dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.transform_feedback_varyings wants a different GL error code</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98167">Bug 98167</a> - [vulkan, radv] missing libgcrypt and openssl devel results in linker error in libvulkan_common</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98172">Bug 98172</a> - Concurrent call to glClientWaitSync results in segfault in one of the waiters.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98244">Bug 98244</a> - dEQP: textureOffset(sampler2DArrayShadow, ...) should not exist.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98264">Bug 98264</a> - Build broken for i965 due to multiple deifnitions of intelFenceExtension</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98307">Bug 98307</a> - &quot;st/glsl_to_tgsi: explicitly track all input and output declaration&quot; broke flightgear colors on rs780</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98326">Bug 98326</a> - [dEQP, EGL] pbuffer depth/stencil tests fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98415">Bug 98415</a> - Vulkan Driver JSON file contains incorrect field</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98431">Bug 98431</a> - UnrealEngine v4 demos startup fails to blorp blit assert</li>
</ul>
<h2>Changes</h2>
Mesa no longer depends on libudev.
</div>
</body>
</html>

View File

@@ -1,188 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 13.0.1 Release Notes / November 14, 2016</h1>
<p>
Mesa 13.0.1 is a bug fix release which fixes bugs found since the 13.0.0 release.
</p>
<p>
Mesa 13.0.1 implements the OpenGL 4.4 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.4. OpenGL
4.4 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
7cbb91dead05cde279ee95f86e8321c8e1c8fc9deb88f12e0f587672a10d88c5 mesa-13.0.1.tar.gz
71962fb2bf77d33b0ad4a565b490dbbeaf4619099c6d9722f04a73187957a731 mesa-13.0.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97715">Bug 97715</a> - [ILK,G45,G965] piglit.spec.arb_separate_shader_objects.misc api error checks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98012">Bug 98012</a> - [IVB] Segfault when running Dolphin twice with Vulkan</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98512">Bug 98512</a> - radeon r600 vdpau: Invalid command stream: texture bo too small</li>
</ul>
<h2>Changes</h2>
<p>Adam Jackson (2):</p>
<ul>
<li>glx/glvnd: Don't modify the dummy slot in the dispatch table</li>
<li>glx/glvnd: Fix dispatch function names and indices</li>
</ul>
<p>Andreas Boll (1):</p>
<ul>
<li>glx/windows: Add wgl.h to the sources list</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>i965: Fix GPU hang related to multiple render targets and alpha testing</li>
</ul>
<p>Chih-Wei Huang (1):</p>
<ul>
<li>android: avoid using libdrm with host modules</li>
</ul>
<p>Darren Salt (1):</p>
<ul>
<li>radv/pipeline: Don't dereference NULL dynamic state pointers</li>
</ul>
<p>Dave Airlie (8):</p>
<ul>
<li>radv: expose xlib platform extension</li>
<li>radv: fix dual source blending</li>
<li>Revert "st/vdpau: use linear layout for output surfaces"</li>
<li>radv: emit correct last export when Z/stencil export is enabled</li>
<li>ac/nir: add support for discard_if intrinsic (v2)</li>
<li>nir: add conditional discard optimisation (v4)</li>
<li>radv: enable conditional discard optimisation on radv.</li>
<li>radv: fix GetFenceStatus for signaled fences</li>
</ul>
<p>Emil Velikov (6):</p>
<ul>
<li>docs: add sha256 checksums for 13.0.0</li>
<li>amd/addrlib: limit fastcall/regparm to GCC i386</li>
<li>anv: use correct .specVersion for extensions</li>
<li>radv: use correct .specVersion for extensions</li>
<li>radv: Suffix the radeon_icd file with the host CPU</li>
<li>Update version to 13.0.1</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>vc4: Use Newton-Raphson on the 1/W write to fix glmark2 terrain.</li>
</ul>
<p>Francisco Jerez (1):</p>
<ul>
<li>nir: Flip gl_SamplePosition in nir_lower_wpos_ytransform().</li>
</ul>
<p>Fredrik Höglund (1):</p>
<ul>
<li>radv: add support for anisotropic filtering on VI+</li>
</ul>
<p>Jason Ekstrand (21):</p>
<ul>
<li>anv/device: Return DEVICE_LOST if execbuf2 fails</li>
<li>vulkan/wsi/x11: Better handle wsi_x11_connection_create failure</li>
<li>vulkan/wsi/x11: Clean up connections in finish_wsi</li>
<li>anv: Better handle return codes from anv_physical_device_init</li>
<li>intel/blorp: Use wm_prog_data instead of hand-rolling our own</li>
<li>intel/blorp: Pass a brw_stage_prog_data to upload_shader</li>
<li>anv/pipeline: Put actual pointers in anv_shader_bin</li>
<li>anv/pipeline: Properly cache prog_data::param</li>
<li>intel/blorp: Emit all the binding tables</li>
<li>anv/device: Add an execbuf wrapper</li>
<li>anv: Add a cmd_buffer_execbuf helper</li>
<li>anv: Don't presume to know what address is in a surface relocation</li>
<li>anv: Add a new bo_pool_init helper</li>
<li>anv/allocator: Simplify anv_scratch_pool</li>
<li>anv: Initialize anv_bo::offset to -1</li>
<li>anv/batch_chain: Improve write_reloc</li>
<li>anv: Add an anv_execbuf helper struct</li>
<li>anv/batch: Move last_ss_pool_bo_offset to the command buffer</li>
<li>anv: Move relocation handling from EndCommandBuffer to QueueSubmit</li>
<li>anv/cmd_buffer: Take a command buffer instead of a batch in two helpers</li>
<li>anv/cmd_buffer: Enable a CS stall workaround for Sky Lake gt4</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>glsl: Update deref types when resizing implicitly sized arrays.</li>
<li>mesa: Fix pixel shader scratch space allocation on Gen9+ platforms.</li>
</ul>
<p>Kristian Høgsberg (1):</p>
<ul>
<li>anv: Do relocations in userspace before execbuf ioctl</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>egl: use util/macros.h</li>
<li>egl: make interop ABI visible again</li>
<li>glx: make interop ABI visible again</li>
<li>radeonsi: fix an assertion failure in si_decompress_sampler_color_textures</li>
</ul>
<p>Nicolai Hähnle (4):</p>
<ul>
<li>radeonsi: fix BFE/BFI lowering for GLSL semantics</li>
<li>glsl: fix lowering of UBO references of named blocks</li>
<li>st/glsl_to_tgsi: fix dvec[34] loads from SSBO</li>
<li>st/mesa: fix the layer of VDPAU surface samplers</li>
</ul>
<p>Steven Toth (3):</p>
<ul>
<li>gallium/hud: fix a problem where objects are free'd while in use.</li>
<li>gallium/hud: close a previously opened handle</li>
<li>gallium/hud: protect against and initialization race</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>mesa/glsl: delete previously linked shaders earlier when linking</li>
</ul>
</div>
</body>
</html>

View File

@@ -1,188 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 13.0.2 Release Notes / November 28, 2016</h1>
<p>
Mesa 13.0.2 is a bug fix release which fixes bugs found since the 13.0.1 release.
</p>
<p>
Mesa 13.0.2 implements the OpenGL 4.4 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.4. OpenGL
4.4 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97321">Bug 97321</a> - Query INFO_LOG_LENGTH for empty info log should return 0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97420">Bug 97420</a> - &quot;#version 0&quot; crashes glsl_compiler</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98632">Bug 98632</a> - Fix build on Hurd without PATH_MAX</li>
</ul>
<h2>Changes</h2>
<p>Ben Widawsky (3):</p>
<ul>
<li>i965: Add some APL and KBL SKU strings</li>
<li>i965: Reorder PCI ID list to match release order</li>
<li>i965/glk: Add basic Geminilake support</li>
</ul>
<p>Dave Airlie (14):</p>
<ul>
<li>radv: fix texturesamples to handle single sample case</li>
<li>wsi: fix VK_INCOMPLETE for vkGetSwapchainImagesKHR</li>
<li>radv: don't crash on null swapchain destroy.</li>
<li>ac/nir/llvm: fix channel in texture gather lowering code.</li>
<li>radv: make sure to flush input attachments correctly.</li>
<li>radv: fix image view creation for depth and stencil only</li>
<li>radv: spir-v allows texture size query with and without lod.</li>
<li>vulkan/wsi/x11: handle timeouts properly in next image acquire (v1.1)</li>
<li>vulkan/wsi: store present mode in swapchain base class</li>
<li>vulkan/wsi/x11: add support for IMMEDIATE present mode</li>
<li>radv: fix texel fetch offset with 2d arrays.</li>
<li>radv/si: fix optimal micro tile selection</li>
<li>radv/ac/llvm: shadow samplers only return one value.</li>
<li>radv: fix 3D clears with baseMiplevel</li>
</ul>
<p>Eduardo Lima Mitev (2):</p>
<ul>
<li>vulkan/wsi/x11: Fix behavior of vkGetPhysicalDeviceSurfaceFormatsKHR</li>
<li>vulkan/wsi/x11: Fix behavior of vkGetPhysicalDeviceSurfacePresentModesKHR</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>docs: add sha256 checksums for 13.0.1</li>
<li>cherry-ignore: add reverted LLVM_LIBDIR patch</li>
<li>anv: fix enumeration of properties</li>
<li>radv: honour the number of properties available</li>
<li>Update version to 13.0.2</li>
</ul>
<p>Eric Anholt (3):</p>
<ul>
<li>vc4: Don't abort when a shader compile fails.</li>
<li>vc4: Clamp the shadow comparison value.</li>
<li>vc4: Fix register class handling of DDX/DDY arguments.</li>
</ul>
<p>Gwan-gyeong Mun (2):</p>
<ul>
<li>util/disk_cache: close a previously opened handle in disk_cache_put (v2)</li>
<li>anv: Fix unintentional integer overflow in anv_CreateDmaBufImageINTEL</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>anv/format: handle unsupported formats properly</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>glcpp: Handle '#version 0' and other invalid values</li>
<li>glsl: Parse 0 as a preprocessor INTCONSTANT</li>
</ul>
<p>Jason Ekstrand (15):</p>
<ul>
<li>anv/gen8: Stall when needed in Cmd(Set|Reset)Event</li>
<li>anv/wsi: Set the fence to signaled in AcquireNextImageKHR</li>
<li>anv: Rework fences</li>
<li>vulkan/wsi/wayland: Include pthread.h</li>
<li>vulkan/wsi/wayland: Clean up some error handling paths</li>
<li>vulkan/wsi: Report the correct min/maxImageCount</li>
<li>i965/gs: Allow primitive id to be a system value</li>
<li>anv: Handle null in all destructors</li>
<li>anv/fence: Handle ANV_FENCE_CREATE_SIGNALED_BIT</li>
<li>nir/spirv: Fix handling of gl_PrimitiveId</li>
<li>anv/blorp: Ignore clears for attachments first used as resolve destinations</li>
<li>anv: Implement a depth stall restriction on gen7</li>
<li>anv/cmd_buffer: Handle running out of binding tables in compute shaders</li>
<li>anv/cmd_buffer: Emit a CS stall before setting a CS pipeline</li>
<li>vulkan/wsi/x11: Implement FIFO mode.</li>
</ul>
<p>Jordan Justen (2):</p>
<ul>
<li>isl: Fix height calculation in isl_msaa_interleaved_scale_px_to_sa</li>
<li>i965/hsw: Set integer mode in sampling state for stencil texturing</li>
</ul>
<p>Kenneth Graunke (4):</p>
<ul>
<li>intel: Set min_ds_entries on Broxton.</li>
<li>i965: Fix compute shader crash.</li>
<li>mesa: Drop PATH_MAX usage.</li>
<li>i965: Fix GS push inputs with enhanced layouts.</li>
</ul>
<p>Kevin Strasser (1):</p>
<ul>
<li>vulkan/wsi: Add a thread-safe queue implementation</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>anv: fix multi level clears with VK_REMAINING_MIP_LEVELS</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>gbm: request correct version of the DRI2_FENCE extension</li>
</ul>
<p>Nicolai Hähnle (2):</p>
<ul>
<li>radeonsi: store group_size_variable in struct si_compute</li>
<li>glsl/lower_output_reads: fix geometry shader output handling with conditional emit</li>
</ul>
<p>Steinar H. Gunderson (1):</p>
<ul>
<li>Fix races during _mesa_HashWalk().</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>mesa: fix empty program log length</li>
</ul>
</div>
</body>
</html>

View File

@@ -1,120 +0,0 @@
Name
MESA_platform_surfaceless
Name Strings
EGL_MESA_platform_surfaceless
Contributors
Chad Versace <chadversary@google.com>
Haixia Shi <hshi@google.com>
Stéphane Marchesin <marcheu@google.com>
Zach Reizner <zachr@chromium.org>
Gurchetan Singh <gurchetansingh@google.com>
Contacts
Chad Versace <chadversary@google.com>
Status
DRAFT
Version
Version 2, 2016-10-13
Number
EGL Extension #TODO
Extension Type
EGL client extension
Dependencies
Requires EGL 1.5 or later; or EGL 1.4 with EGL_EXT_platform_base.
This extension is written against the EGL 1.5 Specification (draft
20140122).
This extension interacts with EGL_EXT_platform_base as follows. If the
implementation supports EGL_EXT_platform_base, then text regarding
eglGetPlatformDisplay applies also to eglGetPlatformDisplayEXT;
eglCreatePlatformWindowSurface to eglCreatePlatformWindowSurfaceEXT; and
eglCreatePlatformPixmapSurface to eglCreatePlatformPixmapSurfaceEXT.
Overview
This extension defines a new EGL platform, the "surfaceless" platform. This
platfom's defining property is that it has no native surfaces, and hence
neither eglCreatePlatformWindowSurface nor eglCreatePlatformPixmapSurface
can be used. The platform is independent of any native window system.
The platform's intended use case is for enabling OpenGL and OpenGL ES
applications on systems where no window system exists. However, the
platform's permitted usage is not restricted to this case. Since the
platform is independent of any native window system, it may also be used on
systems where a window system is present.
New Types
None
New Procedures and Functions
None
New Tokens
Accepted as the <platform> argument of eglGetPlatformDisplay:
EGL_PLATFORM_SURFACELESS_MESA 0x31DD
Additions to the EGL Specification
None.
New Behavior
To determine if the EGL implementation supports this extension, clients
should query the EGL_EXTENSIONS string of EGL_NO_DISPLAY.
To obtain an EGLDisplay on the surfaceless platform, call
eglGetPlatformDisplay with <platform> set to EGL_PLATFORM_SURFACELESS_MESA.
The <native_display> parameter must be EGL_DEFAULT_DISPLAY.
eglCreatePlatformWindowSurface fails when called with a <display> that
belongs to the surfaceless platform. It returns EGL_NO_SURFACE and
generates EGL_BAD_NATIVE_WINDOW. The justification for this unconditional
failure is that the surfaceless platform has no native windows, and
therefore the <native_window> parameter is always invalid.
Likewise, eglCreatePlatformPixmapSurface also fails when called with a
<display> that belongs to the surfaceless platform. It returns
EGL_NO_SURFACE and generates EGL_BAD_NATIVE_PIXMAP.
The surfaceless platform imposes no platform-specific restrictions on the
creation of pbuffers, as eglCreatePbufferSurface has no native surface
parameter. Specifically, if the EGLDisplay advertises an EGLConfig whose
EGL_SURFACE_TYPE attribute contains EGL_PBUFFER_BIT, then the EGLDisplay
permits the creation of pbuffers with that config.
Issues
None.
Revision History
Version 2, 2016-10-13 (Chad Versace)
- Assign enum values
- Define interfactions with EGL 1.4 and EGL_EXT_platform_base.
- Add Gurchetan as contributor, as he implemented the pbuffer support.
Version 1, 2016-09-23 (Chad Versace)
- Initial version
- Posted for review at
https://lists.freedesktop.org/archives/mesa-dev/2016-September/129549.html

View File

@@ -12,12 +12,11 @@ Contact
Status
Superseded by the functionally identical EGL_KHR_no_config_context
extension.
Proposal
Version
Version 2, September 9, 2016
Version 1, February 28, 2014
Number
@@ -122,8 +121,5 @@ Issues
Revision History
Version 2, September 9, 2016
Defer to EGL_KHR_no_config_context (Adam Jackson)
Version 1, February 28, 2014
Initial draft (Neil Roberts)

View File

@@ -1,520 +0,0 @@
Name
MESA_shader_integer_functions
Name Strings
GL_MESA_shader_integer_functions
Contact
Ian Romanick <ian.d.romanick@intel.com>
Contributors
All the contributors of GL_ARB_gpu_shader5
Status
Supported by all GLSL 1.30 capable drivers in Mesa 12.1 and later
Version
Version 2, July 7, 2016
Number
TBD
Dependencies
This extension is written against the OpenGL 3.2 (Compatibility Profile)
Specification.
This extension is written against Version 1.50 (Revision 09) of the OpenGL
Shading Language Specification.
GLSL 1.30 is required.
This extension interacts with ARB_gpu_shader5.
This extension interacts with ARB_gpu_shader_fp64.
This extension interacts with NV_gpu_shader5.
Overview
GL_ARB_gpu_shader5 extends GLSL in a number of useful ways. Much of this
added functionality requires significant hardware support. There are many
aspects, however, that can be easily implmented on any GPU with "real"
integer support (as opposed to simulating integers using floating point
calculations).
This extension provides a set of new features to the OpenGL Shading
Language to support capabilities of these GPUs, extending the capabilities
of version 1.30 of the OpenGL Shading Language. Shaders
using the new functionality provided by this extension should enable this
functionality via the construct
#extension GL_MESA_shader_integer_functions : require (or enable)
This extension provides a variety of new features for all shader types,
including:
* support for implicitly converting signed integer types to unsigned
types, as well as more general implicit conversion and function
overloading infrastructure to support new data types introduced by
other extensions;
* new built-in functions supporting:
* splitting a floating-point number into a significand and exponent
(frexp), or building a floating-point number from a significand and
exponent (ldexp);
* integer bitfield manipulation, including functions to find the
position of the most or least significant set bit, count the number
of one bits, and bitfield insertion, extraction, and reversal;
* extended integer precision math, including add with carry, subtract
with borrow, and extenended multiplication;
The resulting extension is a strict subset of GL_ARB_gpu_shader5.
IP Status
No known IP claims.
New Procedures and Functions
None
New Tokens
None
Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification
(OpenGL Operation)
None.
Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification
(Rasterization)
None.
Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification
(Per-Fragment Operations and the Frame Buffer)
None.
Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification
(Special Functions)
None.
Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification
(State and State Requests)
None.
Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile)
Specification (Invariance)
None.
Additions to the AGL/GLX/WGL Specifications
None.
Modifications to The OpenGL Shading Language Specification, Version 1.50
(Revision 09)
Including the following line in a shader can be used to control the
language features described in this extension:
#extension GL_MESA_shader_integer_functions : <behavior>
where <behavior> is as specified in section 3.3.
New preprocessor #defines are added to the OpenGL Shading Language:
#define GL_MESA_shader_integer_functions 1
Modify Section 4.1.10, Implicit Conversions, p. 27
(modify table of implicit conversions)
Can be implicitly
Type of expression converted to
--------------------- -----------------
int uint, float
ivec2 uvec2, vec2
ivec3 uvec3, vec3
ivec4 uvec4, vec4
uint float
uvec2 vec2
uvec3 vec3
uvec4 vec4
(modify second paragraph of the section) No implicit conversions are
provided to convert from unsigned to signed integer types or from
floating-point to integer types. There are no implicit array or structure
conversions.
(insert before the final paragraph of the section) When performing
implicit conversion for binary operators, there may be multiple data types
to which the two operands can be converted. For example, when adding an
int value to a uint value, both values can be implicitly converted to uint
and float. In such cases, a floating-point type is chosen if either
operand has a floating-point type. Otherwise, an unsigned integer type is
chosen if either operand has an unsigned integer type. Otherwise, a
signed integer type is chosen.
Modify Section 5.9, Expressions, p. 57
(modify bulleted list as follows, adding support for implicit conversion
between signed and unsigned types)
Expressions in the shading language are built from the following:
* Constants of type bool, int, int64_t, uint, uint64_t, float, all vector
types, and all matrix types.
...
* The operator modulus (%) operates on signed or unsigned integer scalars
or vectors. If the fundamental types of the operands do not match, the
conversions from Section 4.1.10 "Implicit Conversions" are applied to
produce matching types. ...
Modify Section 6.1, Function Definitions, p. 63
(modify description of overloading, beginning at the top of p. 64)
Function names can be overloaded. The same function name can be used for
multiple functions, as long as the parameter types differ. If a function
name is declared twice with the same parameter types, then the return
types and all qualifiers must also match, and it is the same function
being declared. For example,
vec4 f(in vec4 x, out vec4 y); // (A)
vec4 f(in vec4 x, out uvec4 y); // (B) okay, different argument type
vec4 f(in ivec4 x, out uvec4 y); // (C) okay, different argument type
int f(in vec4 x, out ivec4 y); // error, only return type differs
vec4 f(in vec4 x, in vec4 y); // error, only qualifier differs
vec4 f(const in vec4 x, out vec4 y); // error, only qualifier differs
When function calls are resolved, an exact type match for all the
arguments is sought. If an exact match is found, all other functions are
ignored, and the exact match is used. If no exact match is found, then
the implicit conversions in Section 4.1.10 (Implicit Conversions) will be
applied to find a match. Mismatched types on input parameters (in or
inout or default) must have a conversion from the calling argument type
to the formal parameter type. Mismatched types on output parameters (out
or inout) must have a conversion from the formal parameter type to the
calling argument type.
If implicit conversions can be used to find more than one matching
function, a single best-matching function is sought. To determine a best
match, the conversions between calling argument and formal parameter
types are compared for each function argument and pair of matching
functions. After these comparisons are performed, each pair of matching
functions are compared. A function definition A is considered a better
match than function definition B if:
* for at least one function argument, the conversion for that argument
in A is better than the corresponding conversion in B; and
* there is no function argument for which the conversion in B is better
than the corresponding conversion in A.
If a single function definition is considered a better match than every
other matching function definition, it will be used. Otherwise, a
semantic error occurs and the shader will fail to compile.
To determine whether the conversion for a single argument in one match is
better than that for another match, the following rules are applied, in
order:
1. An exact match is better than a match involving any implicit
conversion.
2. A match involving an implicit conversion from float to double is
better than a match involving any other implicit conversion.
3. A match involving an implicit conversion from either int or uint to
float is better than a match involving an implicit conversion from
either int or uint to double.
If none of the rules above apply to a particular pair of conversions,
neither conversion is considered better than the other.
For the function prototypes (A), (B), and (C) above, the following
examples show how the rules apply to different sets of calling argument
types:
f(vec4, vec4); // exact match of vec4 f(in vec4 x, out vec4 y)
f(vec4, uvec4); // exact match of vec4 f(in vec4 x, out ivec4 y)
f(vec4, ivec4); // matched to vec4 f(in vec4 x, out vec4 y)
// (C) not relevant, can't convert vec4 to
// ivec4. (A) better than (B) for 2nd
// argument (rule 2), same on first argument.
f(ivec4, vec4); // NOT matched. All three match by implicit
// conversion. (C) is better than (A) and (B)
// on the first argument. (A) is better than
// (B) and (C).
Modify Section 8.3, Common Functions, p. 84
(add support for single-precision frexp and ldexp functions)
Syntax:
genType frexp(genType x, out genIType exp);
genType ldexp(genType x, in genIType exp);
The function frexp() splits each single-precision floating-point number in
<x> into a binary significand, a floating-point number in the range [0.5,
1.0), and an integral exponent of two, such that:
x = significand * 2 ^ exponent
The significand is returned by the function; the exponent is returned in
the parameter <exp>. For a floating-point value of zero, the significant
and exponent are both zero. For a floating-point value that is an
infinity or is not a number, the results of frexp() are undefined.
If the input <x> is a vector, this operation is performed in a
component-wise manner; the value returned by the function and the value
written to <exp> are vectors with the same number of components as <x>.
The function ldexp() builds a single-precision floating-point number from
each significand component in <x> and the corresponding integral exponent
of two in <exp>, returning:
significand * 2 ^ exponent
If this product is too large to be represented as a single-precision
floating-point value, the result is considered undefined.
If the input <x> is a vector, this operation is performed in a
component-wise manner; the value passed in <exp> and returned by the
function are vectors with the same number of components as <x>.
(add support for new integer built-in functions)
Syntax:
genIType bitfieldExtract(genIType value, int offset, int bits);
genUType bitfieldExtract(genUType value, int offset, int bits);
genIType bitfieldInsert(genIType base, genIType insert, int offset,
int bits);
genUType bitfieldInsert(genUType base, genUType insert, int offset,
int bits);
genIType bitfieldReverse(genIType value);
genUType bitfieldReverse(genUType value);
genIType bitCount(genIType value);
genIType bitCount(genUType value);
genIType findLSB(genIType value);
genIType findLSB(genUType value);
genIType findMSB(genIType value);
genIType findMSB(genUType value);
The function bitfieldExtract() extracts bits <offset> through
<offset>+<bits>-1 from each component in <value>, returning them in the
least significant bits of corresponding component of the result. For
unsigned data types, the most significant bits of the result will be set
to zero. For signed data types, the most significant bits will be set to
the value of bit <offset>+<base>-1. If <bits> is zero, the result will be
zero. The result will be undefined if <offset> or <bits> is negative, or
if the sum of <offset> and <bits> is greater than the number of bits used
to store the operand. Note that for vector versions of bitfieldExtract(),
a single pair of <offset> and <bits> values is shared for all components.
The function bitfieldInsert() inserts the <bits> least significant bits of
each component of <insert> into the corresponding component of <base>.
The result will have bits numbered <offset> through <offset>+<bits>-1
taken from bits 0 through <bits>-1 of <insert>, and all other bits taken
directly from the corresponding bits of <base>. If <bits> is zero, the
result will simply be <base>. The result will be undefined if <offset> or
<bits> is negative, or if the sum of <offset> and <bits> is greater than
the number of bits used to store the operand. Note that for vector
versions of bitfieldInsert(), a single pair of <offset> and <bits> values
is shared for all components.
The function bitfieldReverse() reverses the bits of <value>. The bit
numbered <n> of the result will be taken from bit (<bits>-1)-<n> of
<value>, where <bits> is the total number of bits used to represent
<value>.
The function bitCount() returns the number of one bits in the binary
representation of <value>.
The function findLSB() returns the bit number of the least significant one
bit in the binary representation of <value>. If <value> is zero, -1 will
be returned.
The function findMSB() returns the bit number of the most significant bit
in the binary representation of <value>. For positive integers, the
result will be the bit number of the most significant one bit. For
negative integers, the result will be the bit number of the most
significant zero bit. For a <value> of zero or negative one, -1 will be
returned.
(support for unsigned integer add/subtract with carry-out)
Syntax:
genUType uaddCarry(genUType x, genUType y, out genUType carry);
genUType usubBorrow(genUType x, genUType y, out genUType borrow);
The function uaddCarry() adds 32-bit unsigned integers or vectors <x> and
<y>, returning the sum modulo 2^32. The value <carry> is set to zero if
the sum was less than 2^32, or one otherwise.
The function usubBorrow() subtracts the 32-bit unsigned integer or vector
<y> from <x>, returning the difference if non-negative or 2^32 plus the
difference, otherwise. The value <borrow> is set to zero if x >= y, or
one otherwise.
(support for signed and unsigned multiplies, with 32-bit inputs and a
64-bit result spanning two 32-bit outputs)
Syntax:
void umulExtended(genUType x, genUType y, out genUType msb,
out genUType lsb);
void imulExtended(genIType x, genIType y, out genIType msb,
out genIType lsb);
The functions umulExtended() and imulExtended() multiply 32-bit unsigned
or signed integers or vectors <x> and <y>, producing a 64-bit result. The
32 least significant bits are returned in <lsb>; the 32 most significant
bits are returned in <msb>.
GLX Protocol
None.
Dependencies on ARB_gpu_shader_fp64
This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
of implicit conversions supported in the OpenGL Shading Language. If more
than one of these extensions is supported, an expression of one type may
be converted to another type if that conversion is allowed by any of these
specifications.
If ARB_gpu_shader_fp64 or a similar extension introducing new data types
is not supported, the function overloading rule in the GLSL specification
preferring promotion an input parameters to smaller type to a larger type
is never applicable, as all data types are of the same size. That rule
and the example referring to "double" should be removed.
Dependencies on NV_gpu_shader5
This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
of implicit conversions supported in the OpenGL Shading Language. If more
than one of these extensions is supported, an expression of one type may
be converted to another type if that conversion is allowed by any of these
specifications.
If NV_gpu_shader5 is supported, integer data types are supported with four
different precisions (8-, 16, 32-, and 64-bit) and floating-point data
types are supported with three different precisions (16-, 32-, and
64-bit). The extension adds the following rule for output parameters,
which is similar to the one present in this extension for input
parameters:
5. If the formal parameters in both matches are output parameters, a
conversion from a type with a larger number of bits per component is
better than a conversion from a type with a smaller number of bits
per component. For example, a conversion from an "int16_t" formal
parameter type to "int" is better than one from an "int8_t" formal
parameter type to "int".
Such a rule is not provided in this extension because there is no
combination of types in this extension and ARB_gpu_shader_fp64 where this
rule has any effect.
Errors
None
New State
None
New Implementation Dependent State
None
Issues
(1) What should this extension be called?
UNRESOLVED. This extension borrows from GL_ARB_gpu_shader5, so creating
some sort of a play on that name would be viable. However, nothing in
this extension should require SM5 hardware, so such a name would be a
little misleading and weird.
Since the primary purpose is to add integer related functions from
GL_ARB_gpu_shader5, call this extension GL_MESA_shader_integer_functions
for now.
(2) Why is some of the formatting in this extension weird?
RESOLVED: This extension is formatted to minimize the differences (as
reported by 'diff --side-by-side -W180') with the GL_ARB_gpu_shader5
specification.
(3) Should ldexp and frexp be included?
RESOLVED: Yes. Few GPUs have native instructions to implement these
functions. These are generally implemented using existing GLSL built-in
functions and the other functions provided by this extension.
(4) Should umulExtended and imulExtended be included?
RESOLVED: Yes. These functions should be implementable on any GPU that
can support the rest of this extension, but the implementation may be
complex. The implementation on a GPU that only supports 32bit x 32bit =
32bit multiplication would be quite expensive. However, many GPUs
(including OpenGL 4.0 GPUs that already support this function) have a
32bit x 16bit = 48bit multiplier. The implementation there is only
trivially more expensive than regular 32bit multiplication.
(5) Should the pack and unpack functions be included?
RESOLVED: No. These functions are already available via
GL_ARB_shading_language_packing.
(6) Should the "BitsTo" functions be included?
RESOLVED: No. These functions are already available via
GL_ARB_shader_bit_encoding.
Revision History
Rev. Date Author Changes
---- ----------- -------- -----------------------------------------
2 7-Jul-2016 idr Fix typo in #extension line
1 20-Jun-2016 idr Initial version based on GL_ARB_gpu_shader5.

View File

@@ -1,18 +1,10 @@
The definitive source for enum values and reserved ranges are the XML files in
the Khronos registry:
https://cvs.khronos.org/svn/repos/ogl/trunk/doc/registry/public/api/egl.xml
https://cvs.khronos.org/svn/repos/ogl/trunk/doc/registry/public/api/gl.xml
https://cvs.khronos.org/svn/repos/ogl/trunk/doc/registry/public/api/glx.xml
https://cvs.khronos.org/svn/repos/ogl/trunk/doc/registry/public/api/wgl.xml
See the OpenGL ARB enum registry at http://www.opengl.org/registry/api/enum.spec
GL blocks allocated to Mesa:
Blocks allocated to Mesa:
0x8750-0x875F
0x8BB0-0x8BBF
EGL blocks allocated to Mesa:
0x31D0-0x31DF
0x3290-0x329F
GL_MESA_packed_depth_stencil
GL_DEPTH_STENCIL_MESA 0x8750
@@ -21,7 +13,7 @@ GL_MESA_packed_depth_stencil
GL_UNSIGNED_SHORT_15_1_MESA 0x8753
GL_UNSIGNED_SHORT_1_15_REV_MESA 0x8754
GL_MESA_trace:
GL_MESA_trace.spec:
GL_TRACE_ALL_BITS_MESA 0xFFFF
GL_TRACE_OPERATIONS_BIT_MESA 0x0001
GL_TRACE_PRIMITIVES_BIT_MESA 0x0002
@@ -32,12 +24,12 @@ GL_MESA_trace:
GL_TRACE_MASK_MESA 0x8755
GL_TRACE_NAME_MESA 0x8756
GL_MESA_ycbcr_texture:
MESA_ycbcr_texture.spec:
GL_YCBCR_MESA 0x8757
GL_UNSIGNED_SHORT_8_8_MESA 0x85BA /* same as Apple's */
GL_UNSIGNED_SHORT_8_8_REV_MESA 0x85BB /* same as Apple's */
GL_MESA_pack_invert:
GL_MESA_pack_invert.spec
GL_PACK_INVERT_MESA 0x8758
GL_MESA_shader_debug.spec: (obsolete)
@@ -45,7 +37,7 @@ GL_MESA_shader_debug.spec: (obsolete)
GL_DEBUG_PRINT_MESA 0x875A
GL_DEBUG_ASSERT_MESA 0x875B
GL_MESA_program_debug: (obsolete)
GL_MESA_program_debug.spec: (obsolete)
GL_FRAGMENT_PROGRAM_CALLBACK_MESA 0x????
GL_VERTEX_PROGRAM_CALLBACK_MESA 0x????
GL_FRAGMENT_PROGRAM_POSITION_MESA 0x????
@@ -63,24 +55,3 @@ GL_MESAX_texture_stack:
GL_TEXTURE_1D_STACK_BINDING_MESAX 0x875D
GL_TEXTURE_2D_STACK_BINDING_MESAX 0x875E
EGL_MESA_drm_image
EGL_DRM_BUFFER_FORMAT_MESA 0x31D0
EGL_DRM_BUFFER_USE_MESA 0x31D1
EGL_DRM_BUFFER_FORMAT_ARGB32_MESA 0x31D2
EGL_DRM_BUFFER_MESA 0x31D3
EGL_DRM_BUFFER_STRIDE_MESA 0x31D4
EGL_MESA_platform_gbm
EGL_PLATFORM_GBM_MESA 0x31D7
EGL_MESA_platform_surfaceless
EGL_PLATFORM_SURFACELESS_MESA 0x31DD
EGL_WL_bind_wayland_display
EGL_TEXTURE_FORMAT 0x3080
EGL_WAYLAND_BUFFER_WL 0x31D5
EGL_WAYLAND_PLANE_WL 0x31D6
EGL_TEXTURE_Y_U_V_WL 0x31D7
EGL_TEXTURE_Y_UV_WL 0x31D8
EGL_TEXTURE_Y_XUXV_WL 0x31D9
EGL_WAYLAND_Y_INVERTED_WL 0x31DB

View File

@@ -199,7 +199,7 @@ This incurs a small performance penalty.
<h2>Extensions</h2>
<p>
The following Mesa-specific extensions are implemented in the Xlib driver.
The following MESA-specific extensions are implemented in the Xlib driver.
</p>
<h3>GLX_MESA_pixmap_colormap</h3>

View File

@@ -1,2 +0,0 @@
[*.h]
indent_style = tab

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2013-2016 The Khronos Group Inc.
** Copyright (c) 2013-2014 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
@@ -38,7 +38,7 @@ extern "C" {
#include <EGL/eglplatform.h>
#define EGL_EGLEXT_VERSION 20160809
#define EGL_EGLEXT_VERSION 20150508
/* Generated C header for:
* API: egl
@@ -99,33 +99,6 @@ EGLAPI EGLSyncKHR EGLAPIENTRY eglCreateSync64KHR (EGLDisplay dpy, EGLenum type,
#define EGL_CONTEXT_OPENGL_NO_ERROR_KHR 0x31B3
#endif /* EGL_KHR_create_context_no_error */
#ifndef EGL_KHR_debug
#define EGL_KHR_debug 1
typedef void *EGLLabelKHR;
typedef void *EGLObjectKHR;
typedef void (EGLAPIENTRY *EGLDEBUGPROCKHR)(EGLenum error,const char *command,EGLint messageType,EGLLabelKHR threadLabel,EGLLabelKHR objectLabel,const char* message);
#define EGL_OBJECT_THREAD_KHR 0x33B0
#define EGL_OBJECT_DISPLAY_KHR 0x33B1
#define EGL_OBJECT_CONTEXT_KHR 0x33B2
#define EGL_OBJECT_SURFACE_KHR 0x33B3
#define EGL_OBJECT_IMAGE_KHR 0x33B4
#define EGL_OBJECT_SYNC_KHR 0x33B5
#define EGL_OBJECT_STREAM_KHR 0x33B6
#define EGL_DEBUG_MSG_CRITICAL_KHR 0x33B9
#define EGL_DEBUG_MSG_ERROR_KHR 0x33BA
#define EGL_DEBUG_MSG_WARN_KHR 0x33BB
#define EGL_DEBUG_MSG_INFO_KHR 0x33BC
#define EGL_DEBUG_CALLBACK_KHR 0x33B8
typedef EGLint (EGLAPIENTRYP PFNEGLDEBUGMESSAGECONTROLKHRPROC) (EGLDEBUGPROCKHR callback, const EGLAttrib *attrib_list);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYDEBUGKHRPROC) (EGLint attribute, EGLAttrib *value);
typedef EGLint (EGLAPIENTRYP PFNEGLLABELOBJECTKHRPROC) (EGLDisplay display, EGLenum objectType, EGLObjectKHR object, EGLLabelKHR label);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLint EGLAPIENTRY eglDebugMessageControlKHR (EGLDEBUGPROCKHR callback, const EGLAttrib *attrib_list);
EGLAPI EGLBoolean EGLAPIENTRY eglQueryDebugKHR (EGLint attribute, EGLAttrib *value);
EGLAPI EGLint EGLAPIENTRY eglLabelObjectKHR (EGLDisplay display, EGLenum objectType, EGLObjectKHR object, EGLLabelKHR label);
#endif
#endif /* EGL_KHR_debug */
#ifndef EGL_KHR_fence_sync
#define EGL_KHR_fence_sync 1
typedef khronos_utime_nanoseconds_t EGLTimeKHR;
@@ -250,16 +223,6 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQuerySurface64KHR (EGLDisplay dpy, EGLSurface s
#endif
#endif /* EGL_KHR_lock_surface3 */
#ifndef EGL_KHR_mutable_render_buffer
#define EGL_KHR_mutable_render_buffer 1
#define EGL_MUTABLE_RENDER_BUFFER_BIT_KHR 0x1000
#endif /* EGL_KHR_mutable_render_buffer */
#ifndef EGL_KHR_no_config_context
#define EGL_KHR_no_config_context 1
#define EGL_NO_CONFIG_KHR ((EGLConfig)0)
#endif /* EGL_KHR_no_config_context */
#ifndef EGL_KHR_partial_update
#define EGL_KHR_partial_update 1
#define EGL_BUFFER_AGE_KHR 0x313D
@@ -439,28 +402,11 @@ EGLAPI void EGLAPIENTRY eglSetBlobCacheFuncsANDROID (EGLDisplay dpy, EGLSetBlobF
#endif
#endif /* EGL_ANDROID_blob_cache */
#ifndef EGL_ANDROID_create_native_client_buffer
#define EGL_ANDROID_create_native_client_buffer 1
#define EGL_NATIVE_BUFFER_USAGE_ANDROID 0x3143
#define EGL_NATIVE_BUFFER_USAGE_PROTECTED_BIT_ANDROID 0x00000001
#define EGL_NATIVE_BUFFER_USAGE_RENDERBUFFER_BIT_ANDROID 0x00000002
#define EGL_NATIVE_BUFFER_USAGE_TEXTURE_BIT_ANDROID 0x00000004
typedef EGLClientBuffer (EGLAPIENTRYP PFNEGLCREATENATIVECLIENTBUFFERANDROIDPROC) (const EGLint *attrib_list);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLClientBuffer EGLAPIENTRY eglCreateNativeClientBufferANDROID (const EGLint *attrib_list);
#endif
#endif /* EGL_ANDROID_create_native_client_buffer */
#ifndef EGL_ANDROID_framebuffer_target
#define EGL_ANDROID_framebuffer_target 1
#define EGL_FRAMEBUFFER_TARGET_ANDROID 0x3147
#endif /* EGL_ANDROID_framebuffer_target */
#ifndef EGL_ANDROID_front_buffer_auto_refresh
#define EGL_ANDROID_front_buffer_auto_refresh 1
#define EGL_FRONT_BUFFER_AUTO_REFRESH_ANDROID 0x314C
#endif /* EGL_ANDROID_front_buffer_auto_refresh */
#ifndef EGL_ANDROID_image_native_buffer
#define EGL_ANDROID_image_native_buffer 1
#define EGL_NATIVE_BUFFER_ANDROID 0x3140
@@ -478,15 +424,6 @@ EGLAPI EGLint EGLAPIENTRY eglDupNativeFenceFDANDROID (EGLDisplay dpy, EGLSyncKHR
#endif
#endif /* EGL_ANDROID_native_fence_sync */
#ifndef EGL_ANDROID_presentation_time
#define EGL_ANDROID_presentation_time 1
typedef khronos_stime_nanoseconds_t EGLnsecsANDROID;
typedef EGLBoolean (EGLAPIENTRYP PFNEGLPRESENTATIONTIMEANDROIDPROC) (EGLDisplay dpy, EGLSurface surface, EGLnsecsANDROID time);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglPresentationTimeANDROID (EGLDisplay dpy, EGLSurface surface, EGLnsecsANDROID time);
#endif
#endif /* EGL_ANDROID_presentation_time */
#ifndef EGL_ANDROID_recordable
#define EGL_ANDROID_recordable 1
#define EGL_RECORDABLE_ANDROID 0x3142
@@ -679,13 +616,9 @@ EGLAPI EGLSurface EGLAPIENTRY eglCreatePlatformPixmapSurfaceEXT (EGLDisplay dpy,
#define EGL_PLATFORM_X11_SCREEN_EXT 0x31D6
#endif /* EGL_EXT_platform_x11 */
#ifndef EGL_EXT_protected_content
#define EGL_EXT_protected_content 1
#define EGL_PROTECTED_CONTENT_EXT 0x32C0
#endif /* EGL_EXT_protected_content */
#ifndef EGL_EXT_protected_surface
#define EGL_EXT_protected_surface 1
#define EGL_PROTECTED_CONTENT_EXT 0x32C0
#endif /* EGL_EXT_protected_surface */
#ifndef EGL_EXT_stream_consumer_egloutput
@@ -764,12 +697,6 @@ EGLAPI EGLSurface EGLAPIENTRY eglCreatePixmapSurfaceHI (EGLDisplay dpy, EGLConfi
#define EGL_CONTEXT_PRIORITY_LOW_IMG 0x3103
#endif /* EGL_IMG_context_priority */
#ifndef EGL_IMG_image_plane_attribs
#define EGL_IMG_image_plane_attribs 1
#define EGL_NATIVE_BUFFER_MULTIPLANE_SEPARATE_IMG 0x3105
#define EGL_NATIVE_BUFFER_PLANE_OFFSET_IMG 0x3106
#endif /* EGL_IMG_image_plane_attribs */
#ifndef EGL_MESA_drm_image
#define EGL_MESA_drm_image 1
#define EGL_DRM_BUFFER_FORMAT_MESA 0x31D0
@@ -885,48 +812,6 @@ EGLAPI EGLBoolean EGLAPIENTRY eglPostSubBufferNV (EGLDisplay dpy, EGLSurface sur
#endif
#endif /* EGL_NV_post_sub_buffer */
#ifndef EGL_NV_robustness_video_memory_purge
#define EGL_NV_robustness_video_memory_purge 1
#define EGL_GENERATE_RESET_ON_VIDEO_MEMORY_PURGE_NV 0x334C
#endif /* EGL_NV_robustness_video_memory_purge */
#ifndef EGL_NV_stream_consumer_gltexture_yuv
#define EGL_NV_stream_consumer_gltexture_yuv 1
#define EGL_YUV_PLANE0_TEXTURE_UNIT_NV 0x332C
#define EGL_YUV_PLANE1_TEXTURE_UNIT_NV 0x332D
#define EGL_YUV_PLANE2_TEXTURE_UNIT_NV 0x332E
typedef EGLBoolean (EGLAPIENTRYP PFNEGLSTREAMCONSUMERGLTEXTUREEXTERNALATTRIBSNVPROC) (EGLDisplay dpy, EGLStreamKHR stream, EGLAttrib *attrib_list);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglStreamConsumerGLTextureExternalAttribsNV (EGLDisplay dpy, EGLStreamKHR stream, EGLAttrib *attrib_list);
#endif
#endif /* EGL_NV_stream_consumer_gltexture_yuv */
#ifndef EGL_NV_stream_metadata
#define EGL_NV_stream_metadata 1
#define EGL_MAX_STREAM_METADATA_BLOCKS_NV 0x3250
#define EGL_MAX_STREAM_METADATA_BLOCK_SIZE_NV 0x3251
#define EGL_MAX_STREAM_METADATA_TOTAL_SIZE_NV 0x3252
#define EGL_PRODUCER_METADATA_NV 0x3253
#define EGL_CONSUMER_METADATA_NV 0x3254
#define EGL_PENDING_METADATA_NV 0x3328
#define EGL_METADATA0_SIZE_NV 0x3255
#define EGL_METADATA1_SIZE_NV 0x3256
#define EGL_METADATA2_SIZE_NV 0x3257
#define EGL_METADATA3_SIZE_NV 0x3258
#define EGL_METADATA0_TYPE_NV 0x3259
#define EGL_METADATA1_TYPE_NV 0x325A
#define EGL_METADATA2_TYPE_NV 0x325B
#define EGL_METADATA3_TYPE_NV 0x325C
typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYDISPLAYATTRIBNVPROC) (EGLDisplay dpy, EGLint attribute, EGLAttrib *value);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLSETSTREAMMETADATANVPROC) (EGLDisplay dpy, EGLStreamKHR stream, EGLint n, EGLint offset, EGLint size, const void *data);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYSTREAMMETADATANVPROC) (EGLDisplay dpy, EGLStreamKHR stream, EGLenum name, EGLint n, EGLint offset, EGLint size, void *data);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglQueryDisplayAttribNV (EGLDisplay dpy, EGLint attribute, EGLAttrib *value);
EGLAPI EGLBoolean EGLAPIENTRY eglSetStreamMetadataNV (EGLDisplay dpy, EGLStreamKHR stream, EGLint n, EGLint offset, EGLint size, const void *data);
EGLAPI EGLBoolean EGLAPIENTRY eglQueryStreamMetadataNV (EGLDisplay dpy, EGLStreamKHR stream, EGLenum name, EGLint n, EGLint offset, EGLint size, void *data);
#endif
#endif /* EGL_NV_stream_metadata */
#ifndef EGL_NV_stream_sync
#define EGL_NV_stream_sync 1
#define EGL_SYNC_NEW_FRAME_NV 0x321F

View File

@@ -84,11 +84,6 @@ typedef EGLBoolean (EGLAPIENTRYP PFNEGLSWAPBUFFERSREGIONNOK) (EGLDisplay dpy, EG
#define EGL_NO_CONFIG_MESA ((EGLConfig)0)
#endif
#ifndef EGL_MESA_platform_surfaceless
#define EGL_MESA_platform_surfaceless 1
#define EGL_PLATFORM_SURFACELESS_MESA 0x31DD
#endif /* EGL_MESA_platform_surfaceless */
#ifdef __cplusplus
}
#endif

View File

@@ -33,7 +33,7 @@ extern "C" {
** used to make the header, and the header can be found at
** http://www.opengl.org/registry/
**
** Khronos $Revision: 33061 $ on $Date: 2016-07-14 20:14:13 -0400 (Thu, 14 Jul 2016) $
** Khronos $Revision: 32957 $ on $Date: 2016-06-09 17:03:08 -0400 (Thu, 09 Jun 2016) $
*/
#if defined(_WIN32) && !defined(APIENTRY) && !defined(__CYGWIN__) && !defined(__SCITECH_SNAP__)
@@ -53,7 +53,7 @@ extern "C" {
#define GLAPI extern
#endif
#define GL_GLEXT_VERSION 20160714
#define GL_GLEXT_VERSION 20160609
/* Generated C header for:
* API: gl
@@ -8836,11 +8836,6 @@ GLAPI void APIENTRY glBlendFuncSeparateINGR (GLenum sfactorRGB, GLenum dfactorRG
#define GL_INTERLACE_READ_INGR 0x8568
#endif /* GL_INGR_interlace_read */
#ifndef GL_INTEL_conservative_rasterization
#define GL_INTEL_conservative_rasterization 1
#define GL_CONSERVATIVE_RASTERIZATION_INTEL 0x83FE
#endif /* GL_INTEL_conservative_rasterization */
#ifndef GL_INTEL_fragment_shader_ordering
#define GL_INTEL_fragment_shader_ordering 1
#endif /* GL_INTEL_fragment_shader_ordering */

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2013-2016 The Khronos Group Inc.
** Copyright (c) 2013-2014 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
@@ -33,10 +33,10 @@ extern "C" {
** used to make the header, and the header can be found at
** http://www.opengl.org/registry/
**
** Khronos $Revision: 32889 $ on $Date: 2016-05-31 07:09:51 -0400 (Tue, 31 May 2016) $
** Khronos $Revision: 27684 $ on $Date: 2014-08-11 01:21:35 -0700 (Mon, 11 Aug 2014) $
*/
#define GLX_GLXEXT_VERSION 20160531
#define GLX_GLXEXT_VERSION 20140810
/* Generated C header for:
* API: glx
@@ -250,26 +250,6 @@ __GLXextFuncPtr glXGetProcAddressARB (const GLubyte *procName);
#define GLX_GPU_NUM_SIMD_AMD 0x21A6
#define GLX_GPU_NUM_RB_AMD 0x21A7
#define GLX_GPU_NUM_SPI_AMD 0x21A8
typedef unsigned int ( *PFNGLXGETGPUIDSAMDPROC) (unsigned int maxCount, unsigned int *ids);
typedef int ( *PFNGLXGETGPUINFOAMDPROC) (unsigned int id, int property, GLenum dataType, unsigned int size, void *data);
typedef unsigned int ( *PFNGLXGETCONTEXTGPUIDAMDPROC) (GLXContext ctx);
typedef GLXContext ( *PFNGLXCREATEASSOCIATEDCONTEXTAMDPROC) (unsigned int id, GLXContext share_list);
typedef GLXContext ( *PFNGLXCREATEASSOCIATEDCONTEXTATTRIBSAMDPROC) (unsigned int id, GLXContext share_context, const int *attribList);
typedef Bool ( *PFNGLXDELETEASSOCIATEDCONTEXTAMDPROC) (GLXContext ctx);
typedef Bool ( *PFNGLXMAKEASSOCIATEDCONTEXTCURRENTAMDPROC) (GLXContext ctx);
typedef GLXContext ( *PFNGLXGETCURRENTASSOCIATEDCONTEXTAMDPROC) (void);
typedef void ( *PFNGLXBLITCONTEXTFRAMEBUFFERAMDPROC) (GLXContext dstCtx, GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1, GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1, GLbitfield mask, GLenum filter);
#ifdef GLX_GLXEXT_PROTOTYPES
unsigned int glXGetGPUIDsAMD (unsigned int maxCount, unsigned int *ids);
int glXGetGPUInfoAMD (unsigned int id, int property, GLenum dataType, unsigned int size, void *data);
unsigned int glXGetContextGPUIDAMD (GLXContext ctx);
GLXContext glXCreateAssociatedContextAMD (unsigned int id, GLXContext share_list);
GLXContext glXCreateAssociatedContextAttribsAMD (unsigned int id, GLXContext share_context, const int *attribList);
Bool glXDeleteAssociatedContextAMD (GLXContext ctx);
Bool glXMakeAssociatedContextCurrentAMD (GLXContext ctx);
GLXContext glXGetCurrentAssociatedContextAMD (void);
void glXBlitContextFramebufferAMD (GLXContext dstCtx, GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1, GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1, GLbitfield mask, GLenum filter);
#endif
#endif /* GLX_AMD_gpu_association */
#ifndef GLX_EXT_buffer_age
@@ -317,11 +297,6 @@ void glXFreeContextEXT (Display *dpy, GLXContext context);
#endif
#endif /* GLX_EXT_import_context */
#ifndef GLX_EXT_libglvnd
#define GLX_EXT_libglvnd 1
#define GLX_VENDOR_NAMES_EXT 0x20F6
#endif /* GLX_EXT_libglvnd */
#ifndef GLX_EXT_stereo_tree
#define GLX_EXT_stereo_tree 1
typedef struct {
@@ -548,11 +523,6 @@ int glXBindVideoDeviceNV (Display *dpy, unsigned int video_slot, unsigned int vi
#endif
#endif /* GLX_NV_present_video */
#ifndef GLX_NV_robustness_video_memory_purge
#define GLX_NV_robustness_video_memory_purge 1
#define GLX_GENERATE_RESET_ON_VIDEO_MEMORY_PURGE_NV 0x20F7
#endif /* GLX_NV_robustness_video_memory_purge */
#ifndef GLX_NV_swap_group
#define GLX_NV_swap_group 1
typedef Bool ( *PFNGLXJOINSWAPGROUPNVPROC) (Display *dpy, GLXDrawable drawable, GLuint group);

View File

@@ -1094,7 +1094,7 @@ struct __DRIdri2ExtensionRec {
* extensions.
*/
#define __DRI_IMAGE "DRI_IMAGE"
#define __DRI_IMAGE_VERSION 13
#define __DRI_IMAGE_VERSION 12
/**
* These formats correspond to the similarly named MESA_FORMAT_*
@@ -1208,8 +1208,6 @@ struct __DRIdri2ExtensionRec {
#define __DRI_IMAGE_ATTRIB_FOURCC 0x2008 /* available in versions 11 */
#define __DRI_IMAGE_ATTRIB_NUM_PLANES 0x2009 /* available in versions 11 */
#define __DRI_IMAGE_ATTRIB_OFFSET 0x200A /* available in versions 13 */
enum __DRIYUVColorSpace {
__DRI_YUV_COLOR_SPACE_UNDEFINED = 0,
__DRI_YUV_COLOR_SPACE_ITU_REC601 = 0x327F,

View File

@@ -97,7 +97,7 @@ struct mesa_glinterop_device_info {
/* The callee will overwrite it if it supports a lower version.
*
* The caller should check the value and access up-to the version supported
* by the callee.
* by the the callee.
*/
/* NOTE: Do not use the MESA_GLINTEROP_DEVICE_INFO_VERSION macro */
uint32_t version;
@@ -125,7 +125,7 @@ struct mesa_glinterop_export_in {
/* The callee will overwrite it if it supports a lower version.
*
* The caller should check the value and access up-to the version supported
* by the callee.
* by the the callee.
*/
/* NOTE: Do not use the MESA_GLINTEROP_EXPORT_IN_VERSION macro */
uint32_t version;
@@ -190,7 +190,7 @@ struct mesa_glinterop_export_out {
/* The callee will overwrite it if it supports a lower version.
*
* The caller should check the value and access up-to the version supported
* by the callee.
* by the the callee.
*/
/* NOTE: Do not use the MESA_GLINTEROP_EXPORT_OUT_VERSION macro */
uint32_t version;

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2013-2016 The Khronos Group Inc.
** Copyright (c) 2013-2014 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
@@ -33,7 +33,7 @@ extern "C" {
** used to make the header, and the header can be found at
** http://www.opengl.org/registry/
**
** Khronos $Revision: 32686 $ on $Date: 2016-04-19 21:08:44 -0400 (Tue, 19 Apr 2016) $
** Khronos $Revision: 27684 $ on $Date: 2014-08-11 01:21:35 -0700 (Mon, 11 Aug 2014) $
*/
#if defined(_WIN32) && !defined(APIENTRY) && !defined(__CYGWIN__) && !defined(__SCITECH_SNAP__)
@@ -41,7 +41,7 @@ extern "C" {
#include <windows.h>
#endif
#define WGL_WGLEXT_VERSION 20160419
#define WGL_WGLEXT_VERSION 20140810
/* Generated C header for:
* API: wgl

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2013-2016 The Khronos Group Inc.
** Copyright (c) 2013 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
@@ -33,16 +33,12 @@ extern "C" {
** used to make the header, and the header can be found at
** http://www.opengl.org/registry/
**
** Khronos $Revision: 32749 $ on $Date: 2016-04-28 09:03:03 -0700 (Thu, 28 Apr 2016) $
** Khronos $Revision: 24614 $ on $Date: 2013-12-30 04:44:46 -0800 (Mon, 30 Dec 2013) $
*/
#include <GLES2/gl2platform.h>
#ifndef GL_APIENTRYP
#define GL_APIENTRYP GL_APIENTRY*
#endif
/* Generated on date 20160428 */
/* Generated on date 20131230 */
/* Generated C header for:
* API: gles2
@@ -378,148 +374,6 @@ typedef khronos_uint8_t GLubyte;
#define GL_RENDERBUFFER_BINDING 0x8CA7
#define GL_MAX_RENDERBUFFER_SIZE 0x84E8
#define GL_INVALID_FRAMEBUFFER_OPERATION 0x0506
typedef void (GL_APIENTRYP PFNGLACTIVETEXTUREPROC) (GLenum texture);
typedef void (GL_APIENTRYP PFNGLATTACHSHADERPROC) (GLuint program, GLuint shader);
typedef void (GL_APIENTRYP PFNGLBINDATTRIBLOCATIONPROC) (GLuint program, GLuint index, const GLchar *name);
typedef void (GL_APIENTRYP PFNGLBINDBUFFERPROC) (GLenum target, GLuint buffer);
typedef void (GL_APIENTRYP PFNGLBINDFRAMEBUFFERPROC) (GLenum target, GLuint framebuffer);
typedef void (GL_APIENTRYP PFNGLBINDRENDERBUFFERPROC) (GLenum target, GLuint renderbuffer);
typedef void (GL_APIENTRYP PFNGLBINDTEXTUREPROC) (GLenum target, GLuint texture);
typedef void (GL_APIENTRYP PFNGLBLENDCOLORPROC) (GLfloat red, GLfloat green, GLfloat blue, GLfloat alpha);
typedef void (GL_APIENTRYP PFNGLBLENDEQUATIONPROC) (GLenum mode);
typedef void (GL_APIENTRYP PFNGLBLENDEQUATIONSEPARATEPROC) (GLenum modeRGB, GLenum modeAlpha);
typedef void (GL_APIENTRYP PFNGLBLENDFUNCPROC) (GLenum sfactor, GLenum dfactor);
typedef void (GL_APIENTRYP PFNGLBLENDFUNCSEPARATEPROC) (GLenum sfactorRGB, GLenum dfactorRGB, GLenum sfactorAlpha, GLenum dfactorAlpha);
typedef void (GL_APIENTRYP PFNGLBUFFERDATAPROC) (GLenum target, GLsizeiptr size, const void *data, GLenum usage);
typedef void (GL_APIENTRYP PFNGLBUFFERSUBDATAPROC) (GLenum target, GLintptr offset, GLsizeiptr size, const void *data);
typedef GLenum (GL_APIENTRYP PFNGLCHECKFRAMEBUFFERSTATUSPROC) (GLenum target);
typedef void (GL_APIENTRYP PFNGLCLEARPROC) (GLbitfield mask);
typedef void (GL_APIENTRYP PFNGLCLEARCOLORPROC) (GLfloat red, GLfloat green, GLfloat blue, GLfloat alpha);
typedef void (GL_APIENTRYP PFNGLCLEARDEPTHFPROC) (GLfloat d);
typedef void (GL_APIENTRYP PFNGLCLEARSTENCILPROC) (GLint s);
typedef void (GL_APIENTRYP PFNGLCOLORMASKPROC) (GLboolean red, GLboolean green, GLboolean blue, GLboolean alpha);
typedef void (GL_APIENTRYP PFNGLCOMPILESHADERPROC) (GLuint shader);
typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXIMAGE2DPROC) (GLenum target, GLint level, GLenum internalformat, GLsizei width, GLsizei height, GLint border, GLsizei imageSize, const void *data);
typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLsizei width, GLsizei height, GLenum format, GLsizei imageSize, const void *data);
typedef void (GL_APIENTRYP PFNGLCOPYTEXIMAGE2DPROC) (GLenum target, GLint level, GLenum internalformat, GLint x, GLint y, GLsizei width, GLsizei height, GLint border);
typedef void (GL_APIENTRYP PFNGLCOPYTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint x, GLint y, GLsizei width, GLsizei height);
typedef GLuint (GL_APIENTRYP PFNGLCREATEPROGRAMPROC) (void);
typedef GLuint (GL_APIENTRYP PFNGLCREATESHADERPROC) (GLenum type);
typedef void (GL_APIENTRYP PFNGLCULLFACEPROC) (GLenum mode);
typedef void (GL_APIENTRYP PFNGLDELETEBUFFERSPROC) (GLsizei n, const GLuint *buffers);
typedef void (GL_APIENTRYP PFNGLDELETEFRAMEBUFFERSPROC) (GLsizei n, const GLuint *framebuffers);
typedef void (GL_APIENTRYP PFNGLDELETEPROGRAMPROC) (GLuint program);
typedef void (GL_APIENTRYP PFNGLDELETERENDERBUFFERSPROC) (GLsizei n, const GLuint *renderbuffers);
typedef void (GL_APIENTRYP PFNGLDELETESHADERPROC) (GLuint shader);
typedef void (GL_APIENTRYP PFNGLDELETETEXTURESPROC) (GLsizei n, const GLuint *textures);
typedef void (GL_APIENTRYP PFNGLDEPTHFUNCPROC) (GLenum func);
typedef void (GL_APIENTRYP PFNGLDEPTHMASKPROC) (GLboolean flag);
typedef void (GL_APIENTRYP PFNGLDEPTHRANGEFPROC) (GLfloat n, GLfloat f);
typedef void (GL_APIENTRYP PFNGLDETACHSHADERPROC) (GLuint program, GLuint shader);
typedef void (GL_APIENTRYP PFNGLDISABLEPROC) (GLenum cap);
typedef void (GL_APIENTRYP PFNGLDISABLEVERTEXATTRIBARRAYPROC) (GLuint index);
typedef void (GL_APIENTRYP PFNGLDRAWARRAYSPROC) (GLenum mode, GLint first, GLsizei count);
typedef void (GL_APIENTRYP PFNGLDRAWELEMENTSPROC) (GLenum mode, GLsizei count, GLenum type, const void *indices);
typedef void (GL_APIENTRYP PFNGLENABLEPROC) (GLenum cap);
typedef void (GL_APIENTRYP PFNGLENABLEVERTEXATTRIBARRAYPROC) (GLuint index);
typedef void (GL_APIENTRYP PFNGLFINISHPROC) (void);
typedef void (GL_APIENTRYP PFNGLFLUSHPROC) (void);
typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERRENDERBUFFERPROC) (GLenum target, GLenum attachment, GLenum renderbuffertarget, GLuint renderbuffer);
typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERTEXTURE2DPROC) (GLenum target, GLenum attachment, GLenum textarget, GLuint texture, GLint level);
typedef void (GL_APIENTRYP PFNGLFRONTFACEPROC) (GLenum mode);
typedef void (GL_APIENTRYP PFNGLGENBUFFERSPROC) (GLsizei n, GLuint *buffers);
typedef void (GL_APIENTRYP PFNGLGENERATEMIPMAPPROC) (GLenum target);
typedef void (GL_APIENTRYP PFNGLGENFRAMEBUFFERSPROC) (GLsizei n, GLuint *framebuffers);
typedef void (GL_APIENTRYP PFNGLGENRENDERBUFFERSPROC) (GLsizei n, GLuint *renderbuffers);
typedef void (GL_APIENTRYP PFNGLGENTEXTURESPROC) (GLsizei n, GLuint *textures);
typedef void (GL_APIENTRYP PFNGLGETACTIVEATTRIBPROC) (GLuint program, GLuint index, GLsizei bufSize, GLsizei *length, GLint *size, GLenum *type, GLchar *name);
typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMPROC) (GLuint program, GLuint index, GLsizei bufSize, GLsizei *length, GLint *size, GLenum *type, GLchar *name);
typedef void (GL_APIENTRYP PFNGLGETATTACHEDSHADERSPROC) (GLuint program, GLsizei maxCount, GLsizei *count, GLuint *shaders);
typedef GLint (GL_APIENTRYP PFNGLGETATTRIBLOCATIONPROC) (GLuint program, const GLchar *name);
typedef void (GL_APIENTRYP PFNGLGETBOOLEANVPROC) (GLenum pname, GLboolean *data);
typedef void (GL_APIENTRYP PFNGLGETBUFFERPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);
typedef GLenum (GL_APIENTRYP PFNGLGETERRORPROC) (void);
typedef void (GL_APIENTRYP PFNGLGETFLOATVPROC) (GLenum pname, GLfloat *data);
typedef void (GL_APIENTRYP PFNGLGETFRAMEBUFFERATTACHMENTPARAMETERIVPROC) (GLenum target, GLenum attachment, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETINTEGERVPROC) (GLenum pname, GLint *data);
typedef void (GL_APIENTRYP PFNGLGETPROGRAMIVPROC) (GLuint program, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETPROGRAMINFOLOGPROC) (GLuint program, GLsizei bufSize, GLsizei *length, GLchar *infoLog);
typedef void (GL_APIENTRYP PFNGLGETRENDERBUFFERPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETSHADERIVPROC) (GLuint shader, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETSHADERINFOLOGPROC) (GLuint shader, GLsizei bufSize, GLsizei *length, GLchar *infoLog);
typedef void (GL_APIENTRYP PFNGLGETSHADERPRECISIONFORMATPROC) (GLenum shadertype, GLenum precisiontype, GLint *range, GLint *precision);
typedef void (GL_APIENTRYP PFNGLGETSHADERSOURCEPROC) (GLuint shader, GLsizei bufSize, GLsizei *length, GLchar *source);
typedef const GLubyte *(GL_APIENTRYP PFNGLGETSTRINGPROC) (GLenum name);
typedef void (GL_APIENTRYP PFNGLGETTEXPARAMETERFVPROC) (GLenum target, GLenum pname, GLfloat *params);
typedef void (GL_APIENTRYP PFNGLGETTEXPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETUNIFORMFVPROC) (GLuint program, GLint location, GLfloat *params);
typedef void (GL_APIENTRYP PFNGLGETUNIFORMIVPROC) (GLuint program, GLint location, GLint *params);
typedef GLint (GL_APIENTRYP PFNGLGETUNIFORMLOCATIONPROC) (GLuint program, const GLchar *name);
typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBFVPROC) (GLuint index, GLenum pname, GLfloat *params);
typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBIVPROC) (GLuint index, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBPOINTERVPROC) (GLuint index, GLenum pname, void **pointer);
typedef void (GL_APIENTRYP PFNGLHINTPROC) (GLenum target, GLenum mode);
typedef GLboolean (GL_APIENTRYP PFNGLISBUFFERPROC) (GLuint buffer);
typedef GLboolean (GL_APIENTRYP PFNGLISENABLEDPROC) (GLenum cap);
typedef GLboolean (GL_APIENTRYP PFNGLISFRAMEBUFFERPROC) (GLuint framebuffer);
typedef GLboolean (GL_APIENTRYP PFNGLISPROGRAMPROC) (GLuint program);
typedef GLboolean (GL_APIENTRYP PFNGLISRENDERBUFFERPROC) (GLuint renderbuffer);
typedef GLboolean (GL_APIENTRYP PFNGLISSHADERPROC) (GLuint shader);
typedef GLboolean (GL_APIENTRYP PFNGLISTEXTUREPROC) (GLuint texture);
typedef void (GL_APIENTRYP PFNGLLINEWIDTHPROC) (GLfloat width);
typedef void (GL_APIENTRYP PFNGLLINKPROGRAMPROC) (GLuint program);
typedef void (GL_APIENTRYP PFNGLPIXELSTOREIPROC) (GLenum pname, GLint param);
typedef void (GL_APIENTRYP PFNGLPOLYGONOFFSETPROC) (GLfloat factor, GLfloat units);
typedef void (GL_APIENTRYP PFNGLREADPIXELSPROC) (GLint x, GLint y, GLsizei width, GLsizei height, GLenum format, GLenum type, void *pixels);
typedef void (GL_APIENTRYP PFNGLRELEASESHADERCOMPILERPROC) (void);
typedef void (GL_APIENTRYP PFNGLRENDERBUFFERSTORAGEPROC) (GLenum target, GLenum internalformat, GLsizei width, GLsizei height);
typedef void (GL_APIENTRYP PFNGLSAMPLECOVERAGEPROC) (GLfloat value, GLboolean invert);
typedef void (GL_APIENTRYP PFNGLSCISSORPROC) (GLint x, GLint y, GLsizei width, GLsizei height);
typedef void (GL_APIENTRYP PFNGLSHADERBINARYPROC) (GLsizei count, const GLuint *shaders, GLenum binaryformat, const void *binary, GLsizei length);
typedef void (GL_APIENTRYP PFNGLSHADERSOURCEPROC) (GLuint shader, GLsizei count, const GLchar *const*string, const GLint *length);
typedef void (GL_APIENTRYP PFNGLSTENCILFUNCPROC) (GLenum func, GLint ref, GLuint mask);
typedef void (GL_APIENTRYP PFNGLSTENCILFUNCSEPARATEPROC) (GLenum face, GLenum func, GLint ref, GLuint mask);
typedef void (GL_APIENTRYP PFNGLSTENCILMASKPROC) (GLuint mask);
typedef void (GL_APIENTRYP PFNGLSTENCILMASKSEPARATEPROC) (GLenum face, GLuint mask);
typedef void (GL_APIENTRYP PFNGLSTENCILOPPROC) (GLenum fail, GLenum zfail, GLenum zpass);
typedef void (GL_APIENTRYP PFNGLSTENCILOPSEPARATEPROC) (GLenum face, GLenum sfail, GLenum dpfail, GLenum dppass);
typedef void (GL_APIENTRYP PFNGLTEXIMAGE2DPROC) (GLenum target, GLint level, GLint internalformat, GLsizei width, GLsizei height, GLint border, GLenum format, GLenum type, const void *pixels);
typedef void (GL_APIENTRYP PFNGLTEXPARAMETERFPROC) (GLenum target, GLenum pname, GLfloat param);
typedef void (GL_APIENTRYP PFNGLTEXPARAMETERFVPROC) (GLenum target, GLenum pname, const GLfloat *params);
typedef void (GL_APIENTRYP PFNGLTEXPARAMETERIPROC) (GLenum target, GLenum pname, GLint param);
typedef void (GL_APIENTRYP PFNGLTEXPARAMETERIVPROC) (GLenum target, GLenum pname, const GLint *params);
typedef void (GL_APIENTRYP PFNGLTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLsizei width, GLsizei height, GLenum format, GLenum type, const void *pixels);
typedef void (GL_APIENTRYP PFNGLUNIFORM1FPROC) (GLint location, GLfloat v0);
typedef void (GL_APIENTRYP PFNGLUNIFORM1FVPROC) (GLint location, GLsizei count, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM1IPROC) (GLint location, GLint v0);
typedef void (GL_APIENTRYP PFNGLUNIFORM1IVPROC) (GLint location, GLsizei count, const GLint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM2FPROC) (GLint location, GLfloat v0, GLfloat v1);
typedef void (GL_APIENTRYP PFNGLUNIFORM2FVPROC) (GLint location, GLsizei count, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM2IPROC) (GLint location, GLint v0, GLint v1);
typedef void (GL_APIENTRYP PFNGLUNIFORM2IVPROC) (GLint location, GLsizei count, const GLint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM3FPROC) (GLint location, GLfloat v0, GLfloat v1, GLfloat v2);
typedef void (GL_APIENTRYP PFNGLUNIFORM3FVPROC) (GLint location, GLsizei count, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM3IPROC) (GLint location, GLint v0, GLint v1, GLint v2);
typedef void (GL_APIENTRYP PFNGLUNIFORM3IVPROC) (GLint location, GLsizei count, const GLint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM4FPROC) (GLint location, GLfloat v0, GLfloat v1, GLfloat v2, GLfloat v3);
typedef void (GL_APIENTRYP PFNGLUNIFORM4FVPROC) (GLint location, GLsizei count, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM4IPROC) (GLint location, GLint v0, GLint v1, GLint v2, GLint v3);
typedef void (GL_APIENTRYP PFNGLUNIFORM4IVPROC) (GLint location, GLsizei count, const GLint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX2FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX3FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX4FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUSEPROGRAMPROC) (GLuint program);
typedef void (GL_APIENTRYP PFNGLVALIDATEPROGRAMPROC) (GLuint program);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB1FPROC) (GLuint index, GLfloat x);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB1FVPROC) (GLuint index, const GLfloat *v);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB2FPROC) (GLuint index, GLfloat x, GLfloat y);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB2FVPROC) (GLuint index, const GLfloat *v);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB3FPROC) (GLuint index, GLfloat x, GLfloat y, GLfloat z);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB3FVPROC) (GLuint index, const GLfloat *v);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB4FPROC) (GLuint index, GLfloat x, GLfloat y, GLfloat z, GLfloat w);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB4FVPROC) (GLuint index, const GLfloat *v);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBPOINTERPROC) (GLuint index, GLint size, GLenum type, GLboolean normalized, GLsizei stride, const void *pointer);
typedef void (GL_APIENTRYP PFNGLVIEWPORTPROC) (GLint x, GLint y, GLsizei width, GLsizei height);
GL_APICALL void GL_APIENTRY glActiveTexture (GLenum texture);
GL_APICALL void GL_APIENTRY glAttachShader (GLuint program, GLuint shader);
GL_APICALL void GL_APIENTRY glBindAttribLocation (GLuint program, GLuint index, const GLchar *name);

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2013-2016 The Khronos Group Inc.
** Copyright (c) 2013-2015 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
@@ -33,14 +33,14 @@ extern "C" {
** used to make the header, and the header can be found at
** http://www.opengl.org/registry/
**
** Khronos $Revision: 33080 $ on $Date: 2016-08-05 04:09:22 -0700 (Fri, 05 Aug 2016) $
** Khronos $Revision: 32120 $ on $Date: 2015-10-15 04:27:13 -0700 (Thu, 15 Oct 2015) $
*/
#ifndef GL_APIENTRYP
#define GL_APIENTRYP GL_APIENTRY*
#endif
/* Generated on date 20160805 */
/* Generated on date 20151015 */
/* Generated C header for:
* API: gles2
@@ -52,10 +52,6 @@ extern "C" {
* Extensions removed: _nomatch_^
*/
#ifndef GL_ARB_sparse_texture2
#define GL_ARB_sparse_texture2 1
#endif /* GL_ARB_sparse_texture2 */
#ifndef GL_KHR_blend_equation_advanced
#define GL_KHR_blend_equation_advanced 1
#define GL_MULTIPLY_KHR 0x9294
@@ -756,34 +752,6 @@ GL_APICALL GLboolean GL_APIENTRY glIsVertexArrayOES (GLuint array);
#define GL_INT_10_10_10_2_OES 0x8DF7
#endif /* GL_OES_vertex_type_10_10_10_2 */
#ifndef GL_OES_viewport_array
#define GL_OES_viewport_array 1
#define GL_MAX_VIEWPORTS_OES 0x825B
#define GL_VIEWPORT_SUBPIXEL_BITS_OES 0x825C
#define GL_VIEWPORT_BOUNDS_RANGE_OES 0x825D
#define GL_VIEWPORT_INDEX_PROVOKING_VERTEX_OES 0x825F
typedef void (GL_APIENTRYP PFNGLVIEWPORTARRAYVOESPROC) (GLuint first, GLsizei count, const GLfloat *v);
typedef void (GL_APIENTRYP PFNGLVIEWPORTINDEXEDFOESPROC) (GLuint index, GLfloat x, GLfloat y, GLfloat w, GLfloat h);
typedef void (GL_APIENTRYP PFNGLVIEWPORTINDEXEDFVOESPROC) (GLuint index, const GLfloat *v);
typedef void (GL_APIENTRYP PFNGLSCISSORARRAYVOESPROC) (GLuint first, GLsizei count, const GLint *v);
typedef void (GL_APIENTRYP PFNGLSCISSORINDEXEDOESPROC) (GLuint index, GLint left, GLint bottom, GLsizei width, GLsizei height);
typedef void (GL_APIENTRYP PFNGLSCISSORINDEXEDVOESPROC) (GLuint index, const GLint *v);
typedef void (GL_APIENTRYP PFNGLDEPTHRANGEARRAYFVOESPROC) (GLuint first, GLsizei count, const GLfloat *v);
typedef void (GL_APIENTRYP PFNGLDEPTHRANGEINDEXEDFOESPROC) (GLuint index, GLfloat n, GLfloat f);
typedef void (GL_APIENTRYP PFNGLGETFLOATI_VOESPROC) (GLenum target, GLuint index, GLfloat *data);
#ifdef GL_GLEXT_PROTOTYPES
GL_APICALL void GL_APIENTRY glViewportArrayvOES (GLuint first, GLsizei count, const GLfloat *v);
GL_APICALL void GL_APIENTRY glViewportIndexedfOES (GLuint index, GLfloat x, GLfloat y, GLfloat w, GLfloat h);
GL_APICALL void GL_APIENTRY glViewportIndexedfvOES (GLuint index, const GLfloat *v);
GL_APICALL void GL_APIENTRY glScissorArrayvOES (GLuint first, GLsizei count, const GLint *v);
GL_APICALL void GL_APIENTRY glScissorIndexedOES (GLuint index, GLint left, GLint bottom, GLsizei width, GLsizei height);
GL_APICALL void GL_APIENTRY glScissorIndexedvOES (GLuint index, const GLint *v);
GL_APICALL void GL_APIENTRY glDepthRangeArrayfvOES (GLuint first, GLsizei count, const GLfloat *v);
GL_APICALL void GL_APIENTRY glDepthRangeIndexedfOES (GLuint index, GLfloat n, GLfloat f);
GL_APICALL void GL_APIENTRY glGetFloati_vOES (GLenum target, GLuint index, GLfloat *data);
#endif
#endif /* GL_OES_viewport_array */
#ifndef GL_AMD_compressed_3DC_texture
#define GL_AMD_compressed_3DC_texture 1
#define GL_3DC_X_AMD 0x87F9
@@ -1118,21 +1086,6 @@ GL_APICALL void GL_APIENTRY glBufferStorageEXT (GLenum target, GLsizeiptr size,
#endif
#endif /* GL_EXT_buffer_storage */
#ifndef GL_EXT_clip_cull_distance
#define GL_EXT_clip_cull_distance 1
#define GL_MAX_CLIP_DISTANCES_EXT 0x0D32
#define GL_MAX_CULL_DISTANCES_EXT 0x82F9
#define GL_MAX_COMBINED_CLIP_AND_CULL_DISTANCES_EXT 0x82FA
#define GL_CLIP_DISTANCE0_EXT 0x3000
#define GL_CLIP_DISTANCE1_EXT 0x3001
#define GL_CLIP_DISTANCE2_EXT 0x3002
#define GL_CLIP_DISTANCE3_EXT 0x3003
#define GL_CLIP_DISTANCE4_EXT 0x3004
#define GL_CLIP_DISTANCE5_EXT 0x3005
#define GL_CLIP_DISTANCE6_EXT 0x3006
#define GL_CLIP_DISTANCE7_EXT 0x3007
#endif /* GL_EXT_clip_cull_distance */
#ifndef GL_EXT_color_buffer_float
#define GL_EXT_color_buffer_float 1
#endif /* GL_EXT_color_buffer_float */
@@ -1459,15 +1412,6 @@ GL_APICALL void GL_APIENTRY glGetIntegeri_vEXT (GLenum target, GLuint index, GLi
#define GL_ANY_SAMPLES_PASSED_CONSERVATIVE_EXT 0x8D6A
#endif /* GL_EXT_occlusion_query_boolean */
#ifndef GL_EXT_polygon_offset_clamp
#define GL_EXT_polygon_offset_clamp 1
#define GL_POLYGON_OFFSET_CLAMP_EXT 0x8E1B
typedef void (GL_APIENTRYP PFNGLPOLYGONOFFSETCLAMPEXTPROC) (GLfloat factor, GLfloat units, GLfloat clamp);
#ifdef GL_GLEXT_PROTOTYPES
GL_APICALL void GL_APIENTRY glPolygonOffsetClampEXT (GLfloat factor, GLfloat units, GLfloat clamp);
#endif
#endif /* GL_EXT_polygon_offset_clamp */
#ifndef GL_EXT_post_depth_coverage
#define GL_EXT_post_depth_coverage 1
#endif /* GL_EXT_post_depth_coverage */
@@ -1481,12 +1425,6 @@ GL_APICALL void GL_APIENTRY glPrimitiveBoundingBoxEXT (GLfloat minX, GLfloat min
#endif
#endif /* GL_EXT_primitive_bounding_box */
#ifndef GL_EXT_protected_textures
#define GL_EXT_protected_textures 1
#define GL_CONTEXT_FLAG_PROTECTED_CONTENT_BIT_EXT 0x00000010
#define GL_TEXTURE_PROTECTED_EXT 0x8BFA
#endif /* GL_EXT_protected_textures */
#ifndef GL_EXT_pvrtc_sRGB
#define GL_EXT_pvrtc_sRGB 1
#define GL_COMPRESSED_SRGB_PVRTC_2BPPV1_EXT 0x8A54
@@ -1666,10 +1604,6 @@ GL_APICALL void GL_APIENTRY glProgramUniformMatrix4x3fvEXT (GLuint program, GLin
#define GL_FRAGMENT_SHADER_DISCARDS_SAMPLES_EXT 0x8A52
#endif /* GL_EXT_shader_framebuffer_fetch */
#ifndef GL_EXT_shader_group_vote
#define GL_EXT_shader_group_vote 1
#endif /* GL_EXT_shader_group_vote */
#ifndef GL_EXT_shader_implicit_conversions
#define GL_EXT_shader_implicit_conversions 1
#endif /* GL_EXT_shader_implicit_conversions */
@@ -1682,10 +1616,6 @@ GL_APICALL void GL_APIENTRY glProgramUniformMatrix4x3fvEXT (GLuint program, GLin
#define GL_EXT_shader_io_blocks 1
#endif /* GL_EXT_shader_io_blocks */
#ifndef GL_EXT_shader_non_constant_global_initializers
#define GL_EXT_shader_non_constant_global_initializers 1
#endif /* GL_EXT_shader_non_constant_global_initializers */
#ifndef GL_EXT_shader_pixel_local_storage
#define GL_EXT_shader_pixel_local_storage 1
#define GL_MAX_SHADER_PIXEL_LOCAL_STORAGE_FAST_SIZE_EXT 0x8F63
@@ -1693,21 +1623,6 @@ GL_APICALL void GL_APIENTRY glProgramUniformMatrix4x3fvEXT (GLuint program, GLin
#define GL_SHADER_PIXEL_LOCAL_STORAGE_EXT 0x8F64
#endif /* GL_EXT_shader_pixel_local_storage */
#ifndef GL_EXT_shader_pixel_local_storage2
#define GL_EXT_shader_pixel_local_storage2 1
#define GL_MAX_SHADER_COMBINED_LOCAL_STORAGE_FAST_SIZE_EXT 0x9650
#define GL_MAX_SHADER_COMBINED_LOCAL_STORAGE_SIZE_EXT 0x9651
#define GL_FRAMEBUFFER_INCOMPLETE_INSUFFICIENT_SHADER_COMBINED_LOCAL_STORAGE_EXT 0x9652
typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERPIXELLOCALSTORAGESIZEEXTPROC) (GLuint target, GLsizei size);
typedef GLsizei (GL_APIENTRYP PFNGLGETFRAMEBUFFERPIXELLOCALSTORAGESIZEEXTPROC) (GLuint target);
typedef void (GL_APIENTRYP PFNGLCLEARPIXELLOCALSTORAGEUIEXTPROC) (GLsizei offset, GLsizei n, const GLuint *values);
#ifdef GL_GLEXT_PROTOTYPES
GL_APICALL void GL_APIENTRY glFramebufferPixelLocalStorageSizeEXT (GLuint target, GLsizei size);
GL_APICALL GLsizei GL_APIENTRY glGetFramebufferPixelLocalStorageSizeEXT (GLuint target);
GL_APICALL void GL_APIENTRY glClearPixelLocalStorageuiEXT (GLsizei offset, GLsizei n, const GLuint *values);
#endif
#endif /* GL_EXT_shader_pixel_local_storage2 */
#ifndef GL_EXT_shader_texture_lod
#define GL_EXT_shader_texture_lod 1
#endif /* GL_EXT_shader_texture_lod */
@@ -1973,39 +1888,11 @@ GL_APICALL void GL_APIENTRY glTextureViewEXT (GLuint texture, GLenum target, GLu
#define GL_UNPACK_SKIP_PIXELS_EXT 0x0CF4
#endif /* GL_EXT_unpack_subimage */
#ifndef GL_EXT_window_rectangles
#define GL_EXT_window_rectangles 1
#define GL_INCLUSIVE_EXT 0x8F10
#define GL_EXCLUSIVE_EXT 0x8F11
#define GL_WINDOW_RECTANGLE_EXT 0x8F12
#define GL_WINDOW_RECTANGLE_MODE_EXT 0x8F13
#define GL_MAX_WINDOW_RECTANGLES_EXT 0x8F14
#define GL_NUM_WINDOW_RECTANGLES_EXT 0x8F15
typedef void (GL_APIENTRYP PFNGLWINDOWRECTANGLESEXTPROC) (GLenum mode, GLsizei count, const GLint *box);
#ifdef GL_GLEXT_PROTOTYPES
GL_APICALL void GL_APIENTRY glWindowRectanglesEXT (GLenum mode, GLsizei count, const GLint *box);
#endif
#endif /* GL_EXT_window_rectangles */
#ifndef GL_FJ_shader_binary_GCCSO
#define GL_FJ_shader_binary_GCCSO 1
#define GL_GCCSO_SHADER_BINARY_FJ 0x9260
#endif /* GL_FJ_shader_binary_GCCSO */
#ifndef GL_IMG_framebuffer_downsample
#define GL_IMG_framebuffer_downsample 1
#define GL_FRAMEBUFFER_INCOMPLETE_MULTISAMPLE_AND_DOWNSAMPLE_IMG 0x913C
#define GL_NUM_DOWNSAMPLE_SCALES_IMG 0x913D
#define GL_DOWNSAMPLE_SCALES_IMG 0x913E
#define GL_FRAMEBUFFER_ATTACHMENT_TEXTURE_SCALE_IMG 0x913F
typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERTEXTURE2DDOWNSAMPLEIMGPROC) (GLenum target, GLenum attachment, GLenum textarget, GLuint texture, GLint level, GLint xscale, GLint yscale);
typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERTEXTURELAYERDOWNSAMPLEIMGPROC) (GLenum target, GLenum attachment, GLuint texture, GLint level, GLint layer, GLint xscale, GLint yscale);
#ifdef GL_GLEXT_PROTOTYPES
GL_APICALL void GL_APIENTRY glFramebufferTexture2DDownsampleIMG (GLenum target, GLenum attachment, GLenum textarget, GLuint texture, GLint level, GLint xscale, GLint yscale);
GL_APICALL void GL_APIENTRY glFramebufferTextureLayerDownsampleIMG (GLenum target, GLenum attachment, GLuint texture, GLint level, GLint layer, GLint xscale, GLint yscale);
#endif
#endif /* GL_IMG_framebuffer_downsample */
#ifndef GL_IMG_multisampled_render_to_texture
#define GL_IMG_multisampled_render_to_texture 1
#define GL_RENDERBUFFER_SAMPLES_IMG 0x9133
@@ -2057,11 +1944,6 @@ GL_APICALL void GL_APIENTRY glFramebufferTexture2DMultisampleIMG (GLenum target,
#define GL_CUBIC_MIPMAP_LINEAR_IMG 0x913B
#endif /* GL_IMG_texture_filter_cubic */
#ifndef GL_INTEL_conservative_rasterization
#define GL_INTEL_conservative_rasterization 1
#define GL_CONSERVATIVE_RASTERIZATION_INTEL 0x83FE
#endif /* GL_INTEL_conservative_rasterization */
#ifndef GL_INTEL_framebuffer_CMAA
#define GL_INTEL_framebuffer_CMAA 1
typedef void (GL_APIENTRYP PFNGLAPPLYFRAMEBUFFERATTACHMENTCMAAINTELPROC) (void);
@@ -2238,17 +2120,6 @@ GL_APICALL void GL_APIENTRY glSubpixelPrecisionBiasNV (GLuint xbits, GLuint ybit
#endif
#endif /* GL_NV_conservative_raster */
#ifndef GL_NV_conservative_raster_pre_snap_triangles
#define GL_NV_conservative_raster_pre_snap_triangles 1
#define GL_CONSERVATIVE_RASTER_MODE_NV 0x954D
#define GL_CONSERVATIVE_RASTER_MODE_POST_SNAP_NV 0x954E
#define GL_CONSERVATIVE_RASTER_MODE_PRE_SNAP_TRIANGLES_NV 0x954F
typedef void (GL_APIENTRYP PFNGLCONSERVATIVERASTERPARAMETERINVPROC) (GLenum pname, GLint param);
#ifdef GL_GLEXT_PROTOTYPES
GL_APICALL void GL_APIENTRY glConservativeRasterParameteriNV (GLenum pname, GLint param);
#endif
#endif /* GL_NV_conservative_raster_pre_snap_triangles */
#ifndef GL_NV_copy_buffer
#define GL_NV_copy_buffer 1
#define GL_COPY_READ_BUFFER_NV 0x8F36
@@ -2436,109 +2307,6 @@ GL_APICALL void GL_APIENTRY glRenderbufferStorageMultisampleNV (GLenum target, G
#define GL_NV_geometry_shader_passthrough 1
#endif /* GL_NV_geometry_shader_passthrough */
#ifndef GL_NV_gpu_shader5
#define GL_NV_gpu_shader5 1
typedef khronos_int64_t GLint64EXT;
typedef khronos_uint64_t GLuint64EXT;
#define GL_INT64_NV 0x140E
#define GL_UNSIGNED_INT64_NV 0x140F
#define GL_INT8_NV 0x8FE0
#define GL_INT8_VEC2_NV 0x8FE1
#define GL_INT8_VEC3_NV 0x8FE2
#define GL_INT8_VEC4_NV 0x8FE3
#define GL_INT16_NV 0x8FE4
#define GL_INT16_VEC2_NV 0x8FE5
#define GL_INT16_VEC3_NV 0x8FE6
#define GL_INT16_VEC4_NV 0x8FE7
#define GL_INT64_VEC2_NV 0x8FE9
#define GL_INT64_VEC3_NV 0x8FEA
#define GL_INT64_VEC4_NV 0x8FEB
#define GL_UNSIGNED_INT8_NV 0x8FEC
#define GL_UNSIGNED_INT8_VEC2_NV 0x8FED
#define GL_UNSIGNED_INT8_VEC3_NV 0x8FEE
#define GL_UNSIGNED_INT8_VEC4_NV 0x8FEF
#define GL_UNSIGNED_INT16_NV 0x8FF0
#define GL_UNSIGNED_INT16_VEC2_NV 0x8FF1
#define GL_UNSIGNED_INT16_VEC3_NV 0x8FF2
#define GL_UNSIGNED_INT16_VEC4_NV 0x8FF3
#define GL_UNSIGNED_INT64_VEC2_NV 0x8FF5
#define GL_UNSIGNED_INT64_VEC3_NV 0x8FF6
#define GL_UNSIGNED_INT64_VEC4_NV 0x8FF7
#define GL_FLOAT16_NV 0x8FF8
#define GL_FLOAT16_VEC2_NV 0x8FF9
#define GL_FLOAT16_VEC3_NV 0x8FFA
#define GL_FLOAT16_VEC4_NV 0x8FFB
#define GL_PATCHES 0x000E
typedef void (GL_APIENTRYP PFNGLUNIFORM1I64NVPROC) (GLint location, GLint64EXT x);
typedef void (GL_APIENTRYP PFNGLUNIFORM2I64NVPROC) (GLint location, GLint64EXT x, GLint64EXT y);
typedef void (GL_APIENTRYP PFNGLUNIFORM3I64NVPROC) (GLint location, GLint64EXT x, GLint64EXT y, GLint64EXT z);
typedef void (GL_APIENTRYP PFNGLUNIFORM4I64NVPROC) (GLint location, GLint64EXT x, GLint64EXT y, GLint64EXT z, GLint64EXT w);
typedef void (GL_APIENTRYP PFNGLUNIFORM1I64VNVPROC) (GLint location, GLsizei count, const GLint64EXT *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM2I64VNVPROC) (GLint location, GLsizei count, const GLint64EXT *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM3I64VNVPROC) (GLint location, GLsizei count, const GLint64EXT *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM4I64VNVPROC) (GLint location, GLsizei count, const GLint64EXT *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM1UI64NVPROC) (GLint location, GLuint64EXT x);
typedef void (GL_APIENTRYP PFNGLUNIFORM2UI64NVPROC) (GLint location, GLuint64EXT x, GLuint64EXT y);
typedef void (GL_APIENTRYP PFNGLUNIFORM3UI64NVPROC) (GLint location, GLuint64EXT x, GLuint64EXT y, GLuint64EXT z);
typedef void (GL_APIENTRYP PFNGLUNIFORM4UI64NVPROC) (GLint location, GLuint64EXT x, GLuint64EXT y, GLuint64EXT z, GLuint64EXT w);
typedef void (GL_APIENTRYP PFNGLUNIFORM1UI64VNVPROC) (GLint location, GLsizei count, const GLuint64EXT *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM2UI64VNVPROC) (GLint location, GLsizei count, const GLuint64EXT *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM3UI64VNVPROC) (GLint location, GLsizei count, const GLuint64EXT *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM4UI64VNVPROC) (GLint location, GLsizei count, const GLuint64EXT *value);
typedef void (GL_APIENTRYP PFNGLGETUNIFORMI64VNVPROC) (GLuint program, GLint location, GLint64EXT *params);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1I64NVPROC) (GLuint program, GLint location, GLint64EXT x);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2I64NVPROC) (GLuint program, GLint location, GLint64EXT x, GLint64EXT y);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3I64NVPROC) (GLuint program, GLint location, GLint64EXT x, GLint64EXT y, GLint64EXT z);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4I64NVPROC) (GLuint program, GLint location, GLint64EXT x, GLint64EXT y, GLint64EXT z, GLint64EXT w);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1I64VNVPROC) (GLuint program, GLint location, GLsizei count, const GLint64EXT *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2I64VNVPROC) (GLuint program, GLint location, GLsizei count, const GLint64EXT *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3I64VNVPROC) (GLuint program, GLint location, GLsizei count, const GLint64EXT *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4I64VNVPROC) (GLuint program, GLint location, GLsizei count, const GLint64EXT *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1UI64NVPROC) (GLuint program, GLint location, GLuint64EXT x);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2UI64NVPROC) (GLuint program, GLint location, GLuint64EXT x, GLuint64EXT y);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3UI64NVPROC) (GLuint program, GLint location, GLuint64EXT x, GLuint64EXT y, GLuint64EXT z);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4UI64NVPROC) (GLuint program, GLint location, GLuint64EXT x, GLuint64EXT y, GLuint64EXT z, GLuint64EXT w);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1UI64VNVPROC) (GLuint program, GLint location, GLsizei count, const GLuint64EXT *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2UI64VNVPROC) (GLuint program, GLint location, GLsizei count, const GLuint64EXT *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3UI64VNVPROC) (GLuint program, GLint location, GLsizei count, const GLuint64EXT *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4UI64VNVPROC) (GLuint program, GLint location, GLsizei count, const GLuint64EXT *value);
#ifdef GL_GLEXT_PROTOTYPES
GL_APICALL void GL_APIENTRY glUniform1i64NV (GLint location, GLint64EXT x);
GL_APICALL void GL_APIENTRY glUniform2i64NV (GLint location, GLint64EXT x, GLint64EXT y);
GL_APICALL void GL_APIENTRY glUniform3i64NV (GLint location, GLint64EXT x, GLint64EXT y, GLint64EXT z);
GL_APICALL void GL_APIENTRY glUniform4i64NV (GLint location, GLint64EXT x, GLint64EXT y, GLint64EXT z, GLint64EXT w);
GL_APICALL void GL_APIENTRY glUniform1i64vNV (GLint location, GLsizei count, const GLint64EXT *value);
GL_APICALL void GL_APIENTRY glUniform2i64vNV (GLint location, GLsizei count, const GLint64EXT *value);
GL_APICALL void GL_APIENTRY glUniform3i64vNV (GLint location, GLsizei count, const GLint64EXT *value);
GL_APICALL void GL_APIENTRY glUniform4i64vNV (GLint location, GLsizei count, const GLint64EXT *value);
GL_APICALL void GL_APIENTRY glUniform1ui64NV (GLint location, GLuint64EXT x);
GL_APICALL void GL_APIENTRY glUniform2ui64NV (GLint location, GLuint64EXT x, GLuint64EXT y);
GL_APICALL void GL_APIENTRY glUniform3ui64NV (GLint location, GLuint64EXT x, GLuint64EXT y, GLuint64EXT z);
GL_APICALL void GL_APIENTRY glUniform4ui64NV (GLint location, GLuint64EXT x, GLuint64EXT y, GLuint64EXT z, GLuint64EXT w);
GL_APICALL void GL_APIENTRY glUniform1ui64vNV (GLint location, GLsizei count, const GLuint64EXT *value);
GL_APICALL void GL_APIENTRY glUniform2ui64vNV (GLint location, GLsizei count, const GLuint64EXT *value);
GL_APICALL void GL_APIENTRY glUniform3ui64vNV (GLint location, GLsizei count, const GLuint64EXT *value);
GL_APICALL void GL_APIENTRY glUniform4ui64vNV (GLint location, GLsizei count, const GLuint64EXT *value);
GL_APICALL void GL_APIENTRY glGetUniformi64vNV (GLuint program, GLint location, GLint64EXT *params);
GL_APICALL void GL_APIENTRY glProgramUniform1i64NV (GLuint program, GLint location, GLint64EXT x);
GL_APICALL void GL_APIENTRY glProgramUniform2i64NV (GLuint program, GLint location, GLint64EXT x, GLint64EXT y);
GL_APICALL void GL_APIENTRY glProgramUniform3i64NV (GLuint program, GLint location, GLint64EXT x, GLint64EXT y, GLint64EXT z);
GL_APICALL void GL_APIENTRY glProgramUniform4i64NV (GLuint program, GLint location, GLint64EXT x, GLint64EXT y, GLint64EXT z, GLint64EXT w);
GL_APICALL void GL_APIENTRY glProgramUniform1i64vNV (GLuint program, GLint location, GLsizei count, const GLint64EXT *value);
GL_APICALL void GL_APIENTRY glProgramUniform2i64vNV (GLuint program, GLint location, GLsizei count, const GLint64EXT *value);
GL_APICALL void GL_APIENTRY glProgramUniform3i64vNV (GLuint program, GLint location, GLsizei count, const GLint64EXT *value);
GL_APICALL void GL_APIENTRY glProgramUniform4i64vNV (GLuint program, GLint location, GLsizei count, const GLint64EXT *value);
GL_APICALL void GL_APIENTRY glProgramUniform1ui64NV (GLuint program, GLint location, GLuint64EXT x);
GL_APICALL void GL_APIENTRY glProgramUniform2ui64NV (GLuint program, GLint location, GLuint64EXT x, GLuint64EXT y);
GL_APICALL void GL_APIENTRY glProgramUniform3ui64NV (GLuint program, GLint location, GLuint64EXT x, GLuint64EXT y, GLuint64EXT z);
GL_APICALL void GL_APIENTRY glProgramUniform4ui64NV (GLuint program, GLint location, GLuint64EXT x, GLuint64EXT y, GLuint64EXT z, GLuint64EXT w);
GL_APICALL void GL_APIENTRY glProgramUniform1ui64vNV (GLuint program, GLint location, GLsizei count, const GLuint64EXT *value);
GL_APICALL void GL_APIENTRY glProgramUniform2ui64vNV (GLuint program, GLint location, GLsizei count, const GLuint64EXT *value);
GL_APICALL void GL_APIENTRY glProgramUniform3ui64vNV (GLuint program, GLint location, GLsizei count, const GLuint64EXT *value);
GL_APICALL void GL_APIENTRY glProgramUniform4ui64vNV (GLuint program, GLint location, GLsizei count, const GLuint64EXT *value);
#endif
#endif /* GL_NV_gpu_shader5 */
#ifndef GL_NV_image_formats
#define GL_NV_image_formats 1
#endif /* GL_NV_image_formats */
@@ -2945,10 +2713,6 @@ GL_APICALL void GL_APIENTRY glResolveDepthValuesNV (void);
#define GL_NV_sample_mask_override_coverage 1
#endif /* GL_NV_sample_mask_override_coverage */
#ifndef GL_NV_shader_atomic_fp16_vector
#define GL_NV_shader_atomic_fp16_vector 1
#endif /* GL_NV_shader_atomic_fp16_vector */
#ifndef GL_NV_shader_noperspective_interpolation
#define GL_NV_shader_noperspective_interpolation 1
#endif /* GL_NV_shader_noperspective_interpolation */
@@ -3015,26 +2779,6 @@ GL_APICALL GLboolean GL_APIENTRY glIsEnablediNV (GLenum target, GLuint index);
#define GL_NV_viewport_array2 1
#endif /* GL_NV_viewport_array2 */
#ifndef GL_NV_viewport_swizzle
#define GL_NV_viewport_swizzle 1
#define GL_VIEWPORT_SWIZZLE_POSITIVE_X_NV 0x9350
#define GL_VIEWPORT_SWIZZLE_NEGATIVE_X_NV 0x9351
#define GL_VIEWPORT_SWIZZLE_POSITIVE_Y_NV 0x9352
#define GL_VIEWPORT_SWIZZLE_NEGATIVE_Y_NV 0x9353
#define GL_VIEWPORT_SWIZZLE_POSITIVE_Z_NV 0x9354
#define GL_VIEWPORT_SWIZZLE_NEGATIVE_Z_NV 0x9355
#define GL_VIEWPORT_SWIZZLE_POSITIVE_W_NV 0x9356
#define GL_VIEWPORT_SWIZZLE_NEGATIVE_W_NV 0x9357
#define GL_VIEWPORT_SWIZZLE_X_NV 0x9358
#define GL_VIEWPORT_SWIZZLE_Y_NV 0x9359
#define GL_VIEWPORT_SWIZZLE_Z_NV 0x935A
#define GL_VIEWPORT_SWIZZLE_W_NV 0x935B
typedef void (GL_APIENTRYP PFNGLVIEWPORTSWIZZLENVPROC) (GLuint index, GLenum swizzlex, GLenum swizzley, GLenum swizzlez, GLenum swizzlew);
#ifdef GL_GLEXT_PROTOTYPES
GL_APICALL void GL_APIENTRY glViewportSwizzleNV (GLuint index, GLenum swizzlex, GLenum swizzley, GLenum swizzlez, GLenum swizzlew);
#endif
#endif /* GL_NV_viewport_swizzle */
#ifndef GL_OVR_multiview
#define GL_OVR_multiview 1
#define GL_FRAMEBUFFER_ATTACHMENT_TEXTURE_NUM_VIEWS_OVR 0x9630

View File

@@ -1,7 +1,7 @@
#ifndef __gl2platform_h_
#define __gl2platform_h_
/* $Revision: 23328 $ on $Date:: 2013-10-02 02:28:28 -0700 #$ */
/* $Revision: 10602 $ on $Date:: 2010-03-04 22:35:34 -0800 #$ */
/*
* This document is licensed under the SGI Free Software B License Version

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2013-2016 The Khronos Group Inc.
** Copyright (c) 2013 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
@@ -33,21 +33,17 @@ extern "C" {
** used to make the header, and the header can be found at
** http://www.opengl.org/registry/
**
** Khronos $Revision: 32749 $ on $Date: 2016-04-28 09:03:03 -0700 (Thu, 28 Apr 2016) $
** Khronos $Revision: 24614 $ on $Date: 2013-12-30 04:44:46 -0800 (Mon, 30 Dec 2013) $
*/
#include <GLES3/gl3platform.h>
#ifndef GL_APIENTRYP
#define GL_APIENTRYP GL_APIENTRY*
#endif
/* Generated on date 20160428 */
/* Generated on date 20131230 */
/* Generated C header for:
* API: gles2
* Profile: common
* Versions considered: 2\.[0-9]|3\.0
* Versions considered: [23]\.[0-9]
* Versions emitted: .*
* Default extensions included: None
* Additional extensions included: _nomatch_^
@@ -378,148 +374,6 @@ typedef khronos_uint8_t GLubyte;
#define GL_RENDERBUFFER_BINDING 0x8CA7
#define GL_MAX_RENDERBUFFER_SIZE 0x84E8
#define GL_INVALID_FRAMEBUFFER_OPERATION 0x0506
typedef void (GL_APIENTRYP PFNGLACTIVETEXTUREPROC) (GLenum texture);
typedef void (GL_APIENTRYP PFNGLATTACHSHADERPROC) (GLuint program, GLuint shader);
typedef void (GL_APIENTRYP PFNGLBINDATTRIBLOCATIONPROC) (GLuint program, GLuint index, const GLchar *name);
typedef void (GL_APIENTRYP PFNGLBINDBUFFERPROC) (GLenum target, GLuint buffer);
typedef void (GL_APIENTRYP PFNGLBINDFRAMEBUFFERPROC) (GLenum target, GLuint framebuffer);
typedef void (GL_APIENTRYP PFNGLBINDRENDERBUFFERPROC) (GLenum target, GLuint renderbuffer);
typedef void (GL_APIENTRYP PFNGLBINDTEXTUREPROC) (GLenum target, GLuint texture);
typedef void (GL_APIENTRYP PFNGLBLENDCOLORPROC) (GLfloat red, GLfloat green, GLfloat blue, GLfloat alpha);
typedef void (GL_APIENTRYP PFNGLBLENDEQUATIONPROC) (GLenum mode);
typedef void (GL_APIENTRYP PFNGLBLENDEQUATIONSEPARATEPROC) (GLenum modeRGB, GLenum modeAlpha);
typedef void (GL_APIENTRYP PFNGLBLENDFUNCPROC) (GLenum sfactor, GLenum dfactor);
typedef void (GL_APIENTRYP PFNGLBLENDFUNCSEPARATEPROC) (GLenum sfactorRGB, GLenum dfactorRGB, GLenum sfactorAlpha, GLenum dfactorAlpha);
typedef void (GL_APIENTRYP PFNGLBUFFERDATAPROC) (GLenum target, GLsizeiptr size, const void *data, GLenum usage);
typedef void (GL_APIENTRYP PFNGLBUFFERSUBDATAPROC) (GLenum target, GLintptr offset, GLsizeiptr size, const void *data);
typedef GLenum (GL_APIENTRYP PFNGLCHECKFRAMEBUFFERSTATUSPROC) (GLenum target);
typedef void (GL_APIENTRYP PFNGLCLEARPROC) (GLbitfield mask);
typedef void (GL_APIENTRYP PFNGLCLEARCOLORPROC) (GLfloat red, GLfloat green, GLfloat blue, GLfloat alpha);
typedef void (GL_APIENTRYP PFNGLCLEARDEPTHFPROC) (GLfloat d);
typedef void (GL_APIENTRYP PFNGLCLEARSTENCILPROC) (GLint s);
typedef void (GL_APIENTRYP PFNGLCOLORMASKPROC) (GLboolean red, GLboolean green, GLboolean blue, GLboolean alpha);
typedef void (GL_APIENTRYP PFNGLCOMPILESHADERPROC) (GLuint shader);
typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXIMAGE2DPROC) (GLenum target, GLint level, GLenum internalformat, GLsizei width, GLsizei height, GLint border, GLsizei imageSize, const void *data);
typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLsizei width, GLsizei height, GLenum format, GLsizei imageSize, const void *data);
typedef void (GL_APIENTRYP PFNGLCOPYTEXIMAGE2DPROC) (GLenum target, GLint level, GLenum internalformat, GLint x, GLint y, GLsizei width, GLsizei height, GLint border);
typedef void (GL_APIENTRYP PFNGLCOPYTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint x, GLint y, GLsizei width, GLsizei height);
typedef GLuint (GL_APIENTRYP PFNGLCREATEPROGRAMPROC) (void);
typedef GLuint (GL_APIENTRYP PFNGLCREATESHADERPROC) (GLenum type);
typedef void (GL_APIENTRYP PFNGLCULLFACEPROC) (GLenum mode);
typedef void (GL_APIENTRYP PFNGLDELETEBUFFERSPROC) (GLsizei n, const GLuint *buffers);
typedef void (GL_APIENTRYP PFNGLDELETEFRAMEBUFFERSPROC) (GLsizei n, const GLuint *framebuffers);
typedef void (GL_APIENTRYP PFNGLDELETEPROGRAMPROC) (GLuint program);
typedef void (GL_APIENTRYP PFNGLDELETERENDERBUFFERSPROC) (GLsizei n, const GLuint *renderbuffers);
typedef void (GL_APIENTRYP PFNGLDELETESHADERPROC) (GLuint shader);
typedef void (GL_APIENTRYP PFNGLDELETETEXTURESPROC) (GLsizei n, const GLuint *textures);
typedef void (GL_APIENTRYP PFNGLDEPTHFUNCPROC) (GLenum func);
typedef void (GL_APIENTRYP PFNGLDEPTHMASKPROC) (GLboolean flag);
typedef void (GL_APIENTRYP PFNGLDEPTHRANGEFPROC) (GLfloat n, GLfloat f);
typedef void (GL_APIENTRYP PFNGLDETACHSHADERPROC) (GLuint program, GLuint shader);
typedef void (GL_APIENTRYP PFNGLDISABLEPROC) (GLenum cap);
typedef void (GL_APIENTRYP PFNGLDISABLEVERTEXATTRIBARRAYPROC) (GLuint index);
typedef void (GL_APIENTRYP PFNGLDRAWARRAYSPROC) (GLenum mode, GLint first, GLsizei count);
typedef void (GL_APIENTRYP PFNGLDRAWELEMENTSPROC) (GLenum mode, GLsizei count, GLenum type, const void *indices);
typedef void (GL_APIENTRYP PFNGLENABLEPROC) (GLenum cap);
typedef void (GL_APIENTRYP PFNGLENABLEVERTEXATTRIBARRAYPROC) (GLuint index);
typedef void (GL_APIENTRYP PFNGLFINISHPROC) (void);
typedef void (GL_APIENTRYP PFNGLFLUSHPROC) (void);
typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERRENDERBUFFERPROC) (GLenum target, GLenum attachment, GLenum renderbuffertarget, GLuint renderbuffer);
typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERTEXTURE2DPROC) (GLenum target, GLenum attachment, GLenum textarget, GLuint texture, GLint level);
typedef void (GL_APIENTRYP PFNGLFRONTFACEPROC) (GLenum mode);
typedef void (GL_APIENTRYP PFNGLGENBUFFERSPROC) (GLsizei n, GLuint *buffers);
typedef void (GL_APIENTRYP PFNGLGENERATEMIPMAPPROC) (GLenum target);
typedef void (GL_APIENTRYP PFNGLGENFRAMEBUFFERSPROC) (GLsizei n, GLuint *framebuffers);
typedef void (GL_APIENTRYP PFNGLGENRENDERBUFFERSPROC) (GLsizei n, GLuint *renderbuffers);
typedef void (GL_APIENTRYP PFNGLGENTEXTURESPROC) (GLsizei n, GLuint *textures);
typedef void (GL_APIENTRYP PFNGLGETACTIVEATTRIBPROC) (GLuint program, GLuint index, GLsizei bufSize, GLsizei *length, GLint *size, GLenum *type, GLchar *name);
typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMPROC) (GLuint program, GLuint index, GLsizei bufSize, GLsizei *length, GLint *size, GLenum *type, GLchar *name);
typedef void (GL_APIENTRYP PFNGLGETATTACHEDSHADERSPROC) (GLuint program, GLsizei maxCount, GLsizei *count, GLuint *shaders);
typedef GLint (GL_APIENTRYP PFNGLGETATTRIBLOCATIONPROC) (GLuint program, const GLchar *name);
typedef void (GL_APIENTRYP PFNGLGETBOOLEANVPROC) (GLenum pname, GLboolean *data);
typedef void (GL_APIENTRYP PFNGLGETBUFFERPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);
typedef GLenum (GL_APIENTRYP PFNGLGETERRORPROC) (void);
typedef void (GL_APIENTRYP PFNGLGETFLOATVPROC) (GLenum pname, GLfloat *data);
typedef void (GL_APIENTRYP PFNGLGETFRAMEBUFFERATTACHMENTPARAMETERIVPROC) (GLenum target, GLenum attachment, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETINTEGERVPROC) (GLenum pname, GLint *data);
typedef void (GL_APIENTRYP PFNGLGETPROGRAMIVPROC) (GLuint program, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETPROGRAMINFOLOGPROC) (GLuint program, GLsizei bufSize, GLsizei *length, GLchar *infoLog);
typedef void (GL_APIENTRYP PFNGLGETRENDERBUFFERPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETSHADERIVPROC) (GLuint shader, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETSHADERINFOLOGPROC) (GLuint shader, GLsizei bufSize, GLsizei *length, GLchar *infoLog);
typedef void (GL_APIENTRYP PFNGLGETSHADERPRECISIONFORMATPROC) (GLenum shadertype, GLenum precisiontype, GLint *range, GLint *precision);
typedef void (GL_APIENTRYP PFNGLGETSHADERSOURCEPROC) (GLuint shader, GLsizei bufSize, GLsizei *length, GLchar *source);
typedef const GLubyte *(GL_APIENTRYP PFNGLGETSTRINGPROC) (GLenum name);
typedef void (GL_APIENTRYP PFNGLGETTEXPARAMETERFVPROC) (GLenum target, GLenum pname, GLfloat *params);
typedef void (GL_APIENTRYP PFNGLGETTEXPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETUNIFORMFVPROC) (GLuint program, GLint location, GLfloat *params);
typedef void (GL_APIENTRYP PFNGLGETUNIFORMIVPROC) (GLuint program, GLint location, GLint *params);
typedef GLint (GL_APIENTRYP PFNGLGETUNIFORMLOCATIONPROC) (GLuint program, const GLchar *name);
typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBFVPROC) (GLuint index, GLenum pname, GLfloat *params);
typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBIVPROC) (GLuint index, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBPOINTERVPROC) (GLuint index, GLenum pname, void **pointer);
typedef void (GL_APIENTRYP PFNGLHINTPROC) (GLenum target, GLenum mode);
typedef GLboolean (GL_APIENTRYP PFNGLISBUFFERPROC) (GLuint buffer);
typedef GLboolean (GL_APIENTRYP PFNGLISENABLEDPROC) (GLenum cap);
typedef GLboolean (GL_APIENTRYP PFNGLISFRAMEBUFFERPROC) (GLuint framebuffer);
typedef GLboolean (GL_APIENTRYP PFNGLISPROGRAMPROC) (GLuint program);
typedef GLboolean (GL_APIENTRYP PFNGLISRENDERBUFFERPROC) (GLuint renderbuffer);
typedef GLboolean (GL_APIENTRYP PFNGLISSHADERPROC) (GLuint shader);
typedef GLboolean (GL_APIENTRYP PFNGLISTEXTUREPROC) (GLuint texture);
typedef void (GL_APIENTRYP PFNGLLINEWIDTHPROC) (GLfloat width);
typedef void (GL_APIENTRYP PFNGLLINKPROGRAMPROC) (GLuint program);
typedef void (GL_APIENTRYP PFNGLPIXELSTOREIPROC) (GLenum pname, GLint param);
typedef void (GL_APIENTRYP PFNGLPOLYGONOFFSETPROC) (GLfloat factor, GLfloat units);
typedef void (GL_APIENTRYP PFNGLREADPIXELSPROC) (GLint x, GLint y, GLsizei width, GLsizei height, GLenum format, GLenum type, void *pixels);
typedef void (GL_APIENTRYP PFNGLRELEASESHADERCOMPILERPROC) (void);
typedef void (GL_APIENTRYP PFNGLRENDERBUFFERSTORAGEPROC) (GLenum target, GLenum internalformat, GLsizei width, GLsizei height);
typedef void (GL_APIENTRYP PFNGLSAMPLECOVERAGEPROC) (GLfloat value, GLboolean invert);
typedef void (GL_APIENTRYP PFNGLSCISSORPROC) (GLint x, GLint y, GLsizei width, GLsizei height);
typedef void (GL_APIENTRYP PFNGLSHADERBINARYPROC) (GLsizei count, const GLuint *shaders, GLenum binaryformat, const void *binary, GLsizei length);
typedef void (GL_APIENTRYP PFNGLSHADERSOURCEPROC) (GLuint shader, GLsizei count, const GLchar *const*string, const GLint *length);
typedef void (GL_APIENTRYP PFNGLSTENCILFUNCPROC) (GLenum func, GLint ref, GLuint mask);
typedef void (GL_APIENTRYP PFNGLSTENCILFUNCSEPARATEPROC) (GLenum face, GLenum func, GLint ref, GLuint mask);
typedef void (GL_APIENTRYP PFNGLSTENCILMASKPROC) (GLuint mask);
typedef void (GL_APIENTRYP PFNGLSTENCILMASKSEPARATEPROC) (GLenum face, GLuint mask);
typedef void (GL_APIENTRYP PFNGLSTENCILOPPROC) (GLenum fail, GLenum zfail, GLenum zpass);
typedef void (GL_APIENTRYP PFNGLSTENCILOPSEPARATEPROC) (GLenum face, GLenum sfail, GLenum dpfail, GLenum dppass);
typedef void (GL_APIENTRYP PFNGLTEXIMAGE2DPROC) (GLenum target, GLint level, GLint internalformat, GLsizei width, GLsizei height, GLint border, GLenum format, GLenum type, const void *pixels);
typedef void (GL_APIENTRYP PFNGLTEXPARAMETERFPROC) (GLenum target, GLenum pname, GLfloat param);
typedef void (GL_APIENTRYP PFNGLTEXPARAMETERFVPROC) (GLenum target, GLenum pname, const GLfloat *params);
typedef void (GL_APIENTRYP PFNGLTEXPARAMETERIPROC) (GLenum target, GLenum pname, GLint param);
typedef void (GL_APIENTRYP PFNGLTEXPARAMETERIVPROC) (GLenum target, GLenum pname, const GLint *params);
typedef void (GL_APIENTRYP PFNGLTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLsizei width, GLsizei height, GLenum format, GLenum type, const void *pixels);
typedef void (GL_APIENTRYP PFNGLUNIFORM1FPROC) (GLint location, GLfloat v0);
typedef void (GL_APIENTRYP PFNGLUNIFORM1FVPROC) (GLint location, GLsizei count, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM1IPROC) (GLint location, GLint v0);
typedef void (GL_APIENTRYP PFNGLUNIFORM1IVPROC) (GLint location, GLsizei count, const GLint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM2FPROC) (GLint location, GLfloat v0, GLfloat v1);
typedef void (GL_APIENTRYP PFNGLUNIFORM2FVPROC) (GLint location, GLsizei count, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM2IPROC) (GLint location, GLint v0, GLint v1);
typedef void (GL_APIENTRYP PFNGLUNIFORM2IVPROC) (GLint location, GLsizei count, const GLint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM3FPROC) (GLint location, GLfloat v0, GLfloat v1, GLfloat v2);
typedef void (GL_APIENTRYP PFNGLUNIFORM3FVPROC) (GLint location, GLsizei count, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM3IPROC) (GLint location, GLint v0, GLint v1, GLint v2);
typedef void (GL_APIENTRYP PFNGLUNIFORM3IVPROC) (GLint location, GLsizei count, const GLint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM4FPROC) (GLint location, GLfloat v0, GLfloat v1, GLfloat v2, GLfloat v3);
typedef void (GL_APIENTRYP PFNGLUNIFORM4FVPROC) (GLint location, GLsizei count, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM4IPROC) (GLint location, GLint v0, GLint v1, GLint v2, GLint v3);
typedef void (GL_APIENTRYP PFNGLUNIFORM4IVPROC) (GLint location, GLsizei count, const GLint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX2FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX3FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX4FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUSEPROGRAMPROC) (GLuint program);
typedef void (GL_APIENTRYP PFNGLVALIDATEPROGRAMPROC) (GLuint program);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB1FPROC) (GLuint index, GLfloat x);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB1FVPROC) (GLuint index, const GLfloat *v);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB2FPROC) (GLuint index, GLfloat x, GLfloat y);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB2FVPROC) (GLuint index, const GLfloat *v);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB3FPROC) (GLuint index, GLfloat x, GLfloat y, GLfloat z);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB3FVPROC) (GLuint index, const GLfloat *v);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB4FPROC) (GLuint index, GLfloat x, GLfloat y, GLfloat z, GLfloat w);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB4FVPROC) (GLuint index, const GLfloat *v);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBPOINTERPROC) (GLuint index, GLint size, GLenum type, GLboolean normalized, GLsizei stride, const void *pointer);
typedef void (GL_APIENTRYP PFNGLVIEWPORTPROC) (GLint x, GLint y, GLsizei width, GLsizei height);
GL_APICALL void GL_APIENTRY glActiveTexture (GLenum texture);
GL_APICALL void GL_APIENTRY glAttachShader (GLuint program, GLuint shader);
GL_APICALL void GL_APIENTRY glBindAttribLocation (GLuint program, GLuint index, const GLchar *name);
@@ -851,22 +705,6 @@ typedef unsigned short GLhalf;
#define GL_COLOR_ATTACHMENT13 0x8CED
#define GL_COLOR_ATTACHMENT14 0x8CEE
#define GL_COLOR_ATTACHMENT15 0x8CEF
#define GL_COLOR_ATTACHMENT16 0x8CF0
#define GL_COLOR_ATTACHMENT17 0x8CF1
#define GL_COLOR_ATTACHMENT18 0x8CF2
#define GL_COLOR_ATTACHMENT19 0x8CF3
#define GL_COLOR_ATTACHMENT20 0x8CF4
#define GL_COLOR_ATTACHMENT21 0x8CF5
#define GL_COLOR_ATTACHMENT22 0x8CF6
#define GL_COLOR_ATTACHMENT23 0x8CF7
#define GL_COLOR_ATTACHMENT24 0x8CF8
#define GL_COLOR_ATTACHMENT25 0x8CF9
#define GL_COLOR_ATTACHMENT26 0x8CFA
#define GL_COLOR_ATTACHMENT27 0x8CFB
#define GL_COLOR_ATTACHMENT28 0x8CFC
#define GL_COLOR_ATTACHMENT29 0x8CFD
#define GL_COLOR_ATTACHMENT30 0x8CFE
#define GL_COLOR_ATTACHMENT31 0x8CFF
#define GL_FRAMEBUFFER_INCOMPLETE_MULTISAMPLE 0x8D56
#define GL_MAX_SAMPLES 0x8D57
#define GL_HALF_FLOAT 0x140B
@@ -988,111 +826,7 @@ typedef unsigned short GLhalf;
#define GL_MAX_ELEMENT_INDEX 0x8D6B
#define GL_NUM_SAMPLE_COUNTS 0x9380
#define GL_TEXTURE_IMMUTABLE_LEVELS 0x82DF
typedef void (GL_APIENTRYP PFNGLREADBUFFERPROC) (GLenum src);
typedef void (GL_APIENTRYP PFNGLDRAWRANGEELEMENTSPROC) (GLenum mode, GLuint start, GLuint end, GLsizei count, GLenum type, const void *indices);
typedef void (GL_APIENTRYP PFNGLTEXIMAGE3DPROC) (GLenum target, GLint level, GLint internalformat, GLsizei width, GLsizei height, GLsizei depth, GLint border, GLenum format, GLenum type, const void *pixels);
typedef void (GL_APIENTRYP PFNGLTEXSUBIMAGE3DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLenum type, const void *pixels);
typedef void (GL_APIENTRYP PFNGLCOPYTEXSUBIMAGE3DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLint x, GLint y, GLsizei width, GLsizei height);
typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXIMAGE3DPROC) (GLenum target, GLint level, GLenum internalformat, GLsizei width, GLsizei height, GLsizei depth, GLint border, GLsizei imageSize, const void *data);
typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXSUBIMAGE3DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLsizei imageSize, const void *data);
typedef void (GL_APIENTRYP PFNGLGENQUERIESPROC) (GLsizei n, GLuint *ids);
typedef void (GL_APIENTRYP PFNGLDELETEQUERIESPROC) (GLsizei n, const GLuint *ids);
typedef GLboolean (GL_APIENTRYP PFNGLISQUERYPROC) (GLuint id);
typedef void (GL_APIENTRYP PFNGLBEGINQUERYPROC) (GLenum target, GLuint id);
typedef void (GL_APIENTRYP PFNGLENDQUERYPROC) (GLenum target);
typedef void (GL_APIENTRYP PFNGLGETQUERYIVPROC) (GLenum target, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETQUERYOBJECTUIVPROC) (GLuint id, GLenum pname, GLuint *params);
typedef GLboolean (GL_APIENTRYP PFNGLUNMAPBUFFERPROC) (GLenum target);
typedef void (GL_APIENTRYP PFNGLGETBUFFERPOINTERVPROC) (GLenum target, GLenum pname, void **params);
typedef void (GL_APIENTRYP PFNGLDRAWBUFFERSPROC) (GLsizei n, const GLenum *bufs);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX2X3FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX3X2FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX2X4FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX4X2FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX3X4FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX4X3FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLBLITFRAMEBUFFERPROC) (GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1, GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1, GLbitfield mask, GLenum filter);
typedef void (GL_APIENTRYP PFNGLRENDERBUFFERSTORAGEMULTISAMPLEPROC) (GLenum target, GLsizei samples, GLenum internalformat, GLsizei width, GLsizei height);
typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERTEXTURELAYERPROC) (GLenum target, GLenum attachment, GLuint texture, GLint level, GLint layer);
typedef void *(GL_APIENTRYP PFNGLMAPBUFFERRANGEPROC) (GLenum target, GLintptr offset, GLsizeiptr length, GLbitfield access);
typedef void (GL_APIENTRYP PFNGLFLUSHMAPPEDBUFFERRANGEPROC) (GLenum target, GLintptr offset, GLsizeiptr length);
typedef void (GL_APIENTRYP PFNGLBINDVERTEXARRAYPROC) (GLuint array);
typedef void (GL_APIENTRYP PFNGLDELETEVERTEXARRAYSPROC) (GLsizei n, const GLuint *arrays);
typedef void (GL_APIENTRYP PFNGLGENVERTEXARRAYSPROC) (GLsizei n, GLuint *arrays);
typedef GLboolean (GL_APIENTRYP PFNGLISVERTEXARRAYPROC) (GLuint array);
typedef void (GL_APIENTRYP PFNGLGETINTEGERI_VPROC) (GLenum target, GLuint index, GLint *data);
typedef void (GL_APIENTRYP PFNGLBEGINTRANSFORMFEEDBACKPROC) (GLenum primitiveMode);
typedef void (GL_APIENTRYP PFNGLENDTRANSFORMFEEDBACKPROC) (void);
typedef void (GL_APIENTRYP PFNGLBINDBUFFERRANGEPROC) (GLenum target, GLuint index, GLuint buffer, GLintptr offset, GLsizeiptr size);
typedef void (GL_APIENTRYP PFNGLBINDBUFFERBASEPROC) (GLenum target, GLuint index, GLuint buffer);
typedef void (GL_APIENTRYP PFNGLTRANSFORMFEEDBACKVARYINGSPROC) (GLuint program, GLsizei count, const GLchar *const*varyings, GLenum bufferMode);
typedef void (GL_APIENTRYP PFNGLGETTRANSFORMFEEDBACKVARYINGPROC) (GLuint program, GLuint index, GLsizei bufSize, GLsizei *length, GLsizei *size, GLenum *type, GLchar *name);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBIPOINTERPROC) (GLuint index, GLint size, GLenum type, GLsizei stride, const void *pointer);
typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBIIVPROC) (GLuint index, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBIUIVPROC) (GLuint index, GLenum pname, GLuint *params);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBI4IPROC) (GLuint index, GLint x, GLint y, GLint z, GLint w);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBI4UIPROC) (GLuint index, GLuint x, GLuint y, GLuint z, GLuint w);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBI4IVPROC) (GLuint index, const GLint *v);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBI4UIVPROC) (GLuint index, const GLuint *v);
typedef void (GL_APIENTRYP PFNGLGETUNIFORMUIVPROC) (GLuint program, GLint location, GLuint *params);
typedef GLint (GL_APIENTRYP PFNGLGETFRAGDATALOCATIONPROC) (GLuint program, const GLchar *name);
typedef void (GL_APIENTRYP PFNGLUNIFORM1UIPROC) (GLint location, GLuint v0);
typedef void (GL_APIENTRYP PFNGLUNIFORM2UIPROC) (GLint location, GLuint v0, GLuint v1);
typedef void (GL_APIENTRYP PFNGLUNIFORM3UIPROC) (GLint location, GLuint v0, GLuint v1, GLuint v2);
typedef void (GL_APIENTRYP PFNGLUNIFORM4UIPROC) (GLint location, GLuint v0, GLuint v1, GLuint v2, GLuint v3);
typedef void (GL_APIENTRYP PFNGLUNIFORM1UIVPROC) (GLint location, GLsizei count, const GLuint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM2UIVPROC) (GLint location, GLsizei count, const GLuint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM3UIVPROC) (GLint location, GLsizei count, const GLuint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM4UIVPROC) (GLint location, GLsizei count, const GLuint *value);
typedef void (GL_APIENTRYP PFNGLCLEARBUFFERIVPROC) (GLenum buffer, GLint drawbuffer, const GLint *value);
typedef void (GL_APIENTRYP PFNGLCLEARBUFFERUIVPROC) (GLenum buffer, GLint drawbuffer, const GLuint *value);
typedef void (GL_APIENTRYP PFNGLCLEARBUFFERFVPROC) (GLenum buffer, GLint drawbuffer, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLCLEARBUFFERFIPROC) (GLenum buffer, GLint drawbuffer, GLfloat depth, GLint stencil);
typedef const GLubyte *(GL_APIENTRYP PFNGLGETSTRINGIPROC) (GLenum name, GLuint index);
typedef void (GL_APIENTRYP PFNGLCOPYBUFFERSUBDATAPROC) (GLenum readTarget, GLenum writeTarget, GLintptr readOffset, GLintptr writeOffset, GLsizeiptr size);
typedef void (GL_APIENTRYP PFNGLGETUNIFORMINDICESPROC) (GLuint program, GLsizei uniformCount, const GLchar *const*uniformNames, GLuint *uniformIndices);
typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMSIVPROC) (GLuint program, GLsizei uniformCount, const GLuint *uniformIndices, GLenum pname, GLint *params);
typedef GLuint (GL_APIENTRYP PFNGLGETUNIFORMBLOCKINDEXPROC) (GLuint program, const GLchar *uniformBlockName);
typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMBLOCKIVPROC) (GLuint program, GLuint uniformBlockIndex, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMBLOCKNAMEPROC) (GLuint program, GLuint uniformBlockIndex, GLsizei bufSize, GLsizei *length, GLchar *uniformBlockName);
typedef void (GL_APIENTRYP PFNGLUNIFORMBLOCKBINDINGPROC) (GLuint program, GLuint uniformBlockIndex, GLuint uniformBlockBinding);
typedef void (GL_APIENTRYP PFNGLDRAWARRAYSINSTANCEDPROC) (GLenum mode, GLint first, GLsizei count, GLsizei instancecount);
typedef void (GL_APIENTRYP PFNGLDRAWELEMENTSINSTANCEDPROC) (GLenum mode, GLsizei count, GLenum type, const void *indices, GLsizei instancecount);
typedef GLsync (GL_APIENTRYP PFNGLFENCESYNCPROC) (GLenum condition, GLbitfield flags);
typedef GLboolean (GL_APIENTRYP PFNGLISSYNCPROC) (GLsync sync);
typedef void (GL_APIENTRYP PFNGLDELETESYNCPROC) (GLsync sync);
typedef GLenum (GL_APIENTRYP PFNGLCLIENTWAITSYNCPROC) (GLsync sync, GLbitfield flags, GLuint64 timeout);
typedef void (GL_APIENTRYP PFNGLWAITSYNCPROC) (GLsync sync, GLbitfield flags, GLuint64 timeout);
typedef void (GL_APIENTRYP PFNGLGETINTEGER64VPROC) (GLenum pname, GLint64 *data);
typedef void (GL_APIENTRYP PFNGLGETSYNCIVPROC) (GLsync sync, GLenum pname, GLsizei bufSize, GLsizei *length, GLint *values);
typedef void (GL_APIENTRYP PFNGLGETINTEGER64I_VPROC) (GLenum target, GLuint index, GLint64 *data);
typedef void (GL_APIENTRYP PFNGLGETBUFFERPARAMETERI64VPROC) (GLenum target, GLenum pname, GLint64 *params);
typedef void (GL_APIENTRYP PFNGLGENSAMPLERSPROC) (GLsizei count, GLuint *samplers);
typedef void (GL_APIENTRYP PFNGLDELETESAMPLERSPROC) (GLsizei count, const GLuint *samplers);
typedef GLboolean (GL_APIENTRYP PFNGLISSAMPLERPROC) (GLuint sampler);
typedef void (GL_APIENTRYP PFNGLBINDSAMPLERPROC) (GLuint unit, GLuint sampler);
typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERIPROC) (GLuint sampler, GLenum pname, GLint param);
typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERIVPROC) (GLuint sampler, GLenum pname, const GLint *param);
typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERFPROC) (GLuint sampler, GLenum pname, GLfloat param);
typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERFVPROC) (GLuint sampler, GLenum pname, const GLfloat *param);
typedef void (GL_APIENTRYP PFNGLGETSAMPLERPARAMETERIVPROC) (GLuint sampler, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETSAMPLERPARAMETERFVPROC) (GLuint sampler, GLenum pname, GLfloat *params);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBDIVISORPROC) (GLuint index, GLuint divisor);
typedef void (GL_APIENTRYP PFNGLBINDTRANSFORMFEEDBACKPROC) (GLenum target, GLuint id);
typedef void (GL_APIENTRYP PFNGLDELETETRANSFORMFEEDBACKSPROC) (GLsizei n, const GLuint *ids);
typedef void (GL_APIENTRYP PFNGLGENTRANSFORMFEEDBACKSPROC) (GLsizei n, GLuint *ids);
typedef GLboolean (GL_APIENTRYP PFNGLISTRANSFORMFEEDBACKPROC) (GLuint id);
typedef void (GL_APIENTRYP PFNGLPAUSETRANSFORMFEEDBACKPROC) (void);
typedef void (GL_APIENTRYP PFNGLRESUMETRANSFORMFEEDBACKPROC) (void);
typedef void (GL_APIENTRYP PFNGLGETPROGRAMBINARYPROC) (GLuint program, GLsizei bufSize, GLsizei *length, GLenum *binaryFormat, void *binary);
typedef void (GL_APIENTRYP PFNGLPROGRAMBINARYPROC) (GLuint program, GLenum binaryFormat, const void *binary, GLsizei length);
typedef void (GL_APIENTRYP PFNGLPROGRAMPARAMETERIPROC) (GLuint program, GLenum pname, GLint value);
typedef void (GL_APIENTRYP PFNGLINVALIDATEFRAMEBUFFERPROC) (GLenum target, GLsizei numAttachments, const GLenum *attachments);
typedef void (GL_APIENTRYP PFNGLINVALIDATESUBFRAMEBUFFERPROC) (GLenum target, GLsizei numAttachments, const GLenum *attachments, GLint x, GLint y, GLsizei width, GLsizei height);
typedef void (GL_APIENTRYP PFNGLTEXSTORAGE2DPROC) (GLenum target, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height);
typedef void (GL_APIENTRYP PFNGLTEXSTORAGE3DPROC) (GLenum target, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height, GLsizei depth);
typedef void (GL_APIENTRYP PFNGLGETINTERNALFORMATIVPROC) (GLenum target, GLenum internalformat, GLenum pname, GLsizei bufSize, GLint *params);
GL_APICALL void GL_APIENTRY glReadBuffer (GLenum src);
GL_APICALL void GL_APIENTRY glReadBuffer (GLenum mode);
GL_APICALL void GL_APIENTRY glDrawRangeElements (GLenum mode, GLuint start, GLuint end, GLsizei count, GLenum type, const void *indices);
GL_APICALL void GL_APIENTRY glTexImage3D (GLenum target, GLint level, GLint internalformat, GLsizei width, GLsizei height, GLsizei depth, GLint border, GLenum format, GLenum type, const void *pixels);
GL_APICALL void GL_APIENTRY glTexSubImage3D (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLenum type, const void *pixels);

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2013-2016 The Khronos Group Inc.
** Copyright (c) 2013-2014 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
@@ -38,16 +38,12 @@ extern "C" {
#include <GLES3/gl3platform.h>
#ifndef GL_APIENTRYP
#define GL_APIENTRYP GL_APIENTRY*
#endif
/* Generated on date 20160428 */
/* Generated on date 20140317 */
/* Generated C header for:
* API: gles2
* Profile: common
* Versions considered: 2\.[0-9]|3\.[01]
* Versions considered: 2.[0-9]|3.[01]
* Versions emitted: .*
* Default extensions included: None
* Additional extensions included: _nomatch_^
@@ -378,148 +374,6 @@ typedef khronos_uint8_t GLubyte;
#define GL_RENDERBUFFER_BINDING 0x8CA7
#define GL_MAX_RENDERBUFFER_SIZE 0x84E8
#define GL_INVALID_FRAMEBUFFER_OPERATION 0x0506
typedef void (GL_APIENTRYP PFNGLACTIVETEXTUREPROC) (GLenum texture);
typedef void (GL_APIENTRYP PFNGLATTACHSHADERPROC) (GLuint program, GLuint shader);
typedef void (GL_APIENTRYP PFNGLBINDATTRIBLOCATIONPROC) (GLuint program, GLuint index, const GLchar *name);
typedef void (GL_APIENTRYP PFNGLBINDBUFFERPROC) (GLenum target, GLuint buffer);
typedef void (GL_APIENTRYP PFNGLBINDFRAMEBUFFERPROC) (GLenum target, GLuint framebuffer);
typedef void (GL_APIENTRYP PFNGLBINDRENDERBUFFERPROC) (GLenum target, GLuint renderbuffer);
typedef void (GL_APIENTRYP PFNGLBINDTEXTUREPROC) (GLenum target, GLuint texture);
typedef void (GL_APIENTRYP PFNGLBLENDCOLORPROC) (GLfloat red, GLfloat green, GLfloat blue, GLfloat alpha);
typedef void (GL_APIENTRYP PFNGLBLENDEQUATIONPROC) (GLenum mode);
typedef void (GL_APIENTRYP PFNGLBLENDEQUATIONSEPARATEPROC) (GLenum modeRGB, GLenum modeAlpha);
typedef void (GL_APIENTRYP PFNGLBLENDFUNCPROC) (GLenum sfactor, GLenum dfactor);
typedef void (GL_APIENTRYP PFNGLBLENDFUNCSEPARATEPROC) (GLenum sfactorRGB, GLenum dfactorRGB, GLenum sfactorAlpha, GLenum dfactorAlpha);
typedef void (GL_APIENTRYP PFNGLBUFFERDATAPROC) (GLenum target, GLsizeiptr size, const void *data, GLenum usage);
typedef void (GL_APIENTRYP PFNGLBUFFERSUBDATAPROC) (GLenum target, GLintptr offset, GLsizeiptr size, const void *data);
typedef GLenum (GL_APIENTRYP PFNGLCHECKFRAMEBUFFERSTATUSPROC) (GLenum target);
typedef void (GL_APIENTRYP PFNGLCLEARPROC) (GLbitfield mask);
typedef void (GL_APIENTRYP PFNGLCLEARCOLORPROC) (GLfloat red, GLfloat green, GLfloat blue, GLfloat alpha);
typedef void (GL_APIENTRYP PFNGLCLEARDEPTHFPROC) (GLfloat d);
typedef void (GL_APIENTRYP PFNGLCLEARSTENCILPROC) (GLint s);
typedef void (GL_APIENTRYP PFNGLCOLORMASKPROC) (GLboolean red, GLboolean green, GLboolean blue, GLboolean alpha);
typedef void (GL_APIENTRYP PFNGLCOMPILESHADERPROC) (GLuint shader);
typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXIMAGE2DPROC) (GLenum target, GLint level, GLenum internalformat, GLsizei width, GLsizei height, GLint border, GLsizei imageSize, const void *data);
typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLsizei width, GLsizei height, GLenum format, GLsizei imageSize, const void *data);
typedef void (GL_APIENTRYP PFNGLCOPYTEXIMAGE2DPROC) (GLenum target, GLint level, GLenum internalformat, GLint x, GLint y, GLsizei width, GLsizei height, GLint border);
typedef void (GL_APIENTRYP PFNGLCOPYTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint x, GLint y, GLsizei width, GLsizei height);
typedef GLuint (GL_APIENTRYP PFNGLCREATEPROGRAMPROC) (void);
typedef GLuint (GL_APIENTRYP PFNGLCREATESHADERPROC) (GLenum type);
typedef void (GL_APIENTRYP PFNGLCULLFACEPROC) (GLenum mode);
typedef void (GL_APIENTRYP PFNGLDELETEBUFFERSPROC) (GLsizei n, const GLuint *buffers);
typedef void (GL_APIENTRYP PFNGLDELETEFRAMEBUFFERSPROC) (GLsizei n, const GLuint *framebuffers);
typedef void (GL_APIENTRYP PFNGLDELETEPROGRAMPROC) (GLuint program);
typedef void (GL_APIENTRYP PFNGLDELETERENDERBUFFERSPROC) (GLsizei n, const GLuint *renderbuffers);
typedef void (GL_APIENTRYP PFNGLDELETESHADERPROC) (GLuint shader);
typedef void (GL_APIENTRYP PFNGLDELETETEXTURESPROC) (GLsizei n, const GLuint *textures);
typedef void (GL_APIENTRYP PFNGLDEPTHFUNCPROC) (GLenum func);
typedef void (GL_APIENTRYP PFNGLDEPTHMASKPROC) (GLboolean flag);
typedef void (GL_APIENTRYP PFNGLDEPTHRANGEFPROC) (GLfloat n, GLfloat f);
typedef void (GL_APIENTRYP PFNGLDETACHSHADERPROC) (GLuint program, GLuint shader);
typedef void (GL_APIENTRYP PFNGLDISABLEPROC) (GLenum cap);
typedef void (GL_APIENTRYP PFNGLDISABLEVERTEXATTRIBARRAYPROC) (GLuint index);
typedef void (GL_APIENTRYP PFNGLDRAWARRAYSPROC) (GLenum mode, GLint first, GLsizei count);
typedef void (GL_APIENTRYP PFNGLDRAWELEMENTSPROC) (GLenum mode, GLsizei count, GLenum type, const void *indices);
typedef void (GL_APIENTRYP PFNGLENABLEPROC) (GLenum cap);
typedef void (GL_APIENTRYP PFNGLENABLEVERTEXATTRIBARRAYPROC) (GLuint index);
typedef void (GL_APIENTRYP PFNGLFINISHPROC) (void);
typedef void (GL_APIENTRYP PFNGLFLUSHPROC) (void);
typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERRENDERBUFFERPROC) (GLenum target, GLenum attachment, GLenum renderbuffertarget, GLuint renderbuffer);
typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERTEXTURE2DPROC) (GLenum target, GLenum attachment, GLenum textarget, GLuint texture, GLint level);
typedef void (GL_APIENTRYP PFNGLFRONTFACEPROC) (GLenum mode);
typedef void (GL_APIENTRYP PFNGLGENBUFFERSPROC) (GLsizei n, GLuint *buffers);
typedef void (GL_APIENTRYP PFNGLGENERATEMIPMAPPROC) (GLenum target);
typedef void (GL_APIENTRYP PFNGLGENFRAMEBUFFERSPROC) (GLsizei n, GLuint *framebuffers);
typedef void (GL_APIENTRYP PFNGLGENRENDERBUFFERSPROC) (GLsizei n, GLuint *renderbuffers);
typedef void (GL_APIENTRYP PFNGLGENTEXTURESPROC) (GLsizei n, GLuint *textures);
typedef void (GL_APIENTRYP PFNGLGETACTIVEATTRIBPROC) (GLuint program, GLuint index, GLsizei bufSize, GLsizei *length, GLint *size, GLenum *type, GLchar *name);
typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMPROC) (GLuint program, GLuint index, GLsizei bufSize, GLsizei *length, GLint *size, GLenum *type, GLchar *name);
typedef void (GL_APIENTRYP PFNGLGETATTACHEDSHADERSPROC) (GLuint program, GLsizei maxCount, GLsizei *count, GLuint *shaders);
typedef GLint (GL_APIENTRYP PFNGLGETATTRIBLOCATIONPROC) (GLuint program, const GLchar *name);
typedef void (GL_APIENTRYP PFNGLGETBOOLEANVPROC) (GLenum pname, GLboolean *data);
typedef void (GL_APIENTRYP PFNGLGETBUFFERPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);
typedef GLenum (GL_APIENTRYP PFNGLGETERRORPROC) (void);
typedef void (GL_APIENTRYP PFNGLGETFLOATVPROC) (GLenum pname, GLfloat *data);
typedef void (GL_APIENTRYP PFNGLGETFRAMEBUFFERATTACHMENTPARAMETERIVPROC) (GLenum target, GLenum attachment, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETINTEGERVPROC) (GLenum pname, GLint *data);
typedef void (GL_APIENTRYP PFNGLGETPROGRAMIVPROC) (GLuint program, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETPROGRAMINFOLOGPROC) (GLuint program, GLsizei bufSize, GLsizei *length, GLchar *infoLog);
typedef void (GL_APIENTRYP PFNGLGETRENDERBUFFERPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETSHADERIVPROC) (GLuint shader, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETSHADERINFOLOGPROC) (GLuint shader, GLsizei bufSize, GLsizei *length, GLchar *infoLog);
typedef void (GL_APIENTRYP PFNGLGETSHADERPRECISIONFORMATPROC) (GLenum shadertype, GLenum precisiontype, GLint *range, GLint *precision);
typedef void (GL_APIENTRYP PFNGLGETSHADERSOURCEPROC) (GLuint shader, GLsizei bufSize, GLsizei *length, GLchar *source);
typedef const GLubyte *(GL_APIENTRYP PFNGLGETSTRINGPROC) (GLenum name);
typedef void (GL_APIENTRYP PFNGLGETTEXPARAMETERFVPROC) (GLenum target, GLenum pname, GLfloat *params);
typedef void (GL_APIENTRYP PFNGLGETTEXPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETUNIFORMFVPROC) (GLuint program, GLint location, GLfloat *params);
typedef void (GL_APIENTRYP PFNGLGETUNIFORMIVPROC) (GLuint program, GLint location, GLint *params);
typedef GLint (GL_APIENTRYP PFNGLGETUNIFORMLOCATIONPROC) (GLuint program, const GLchar *name);
typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBFVPROC) (GLuint index, GLenum pname, GLfloat *params);
typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBIVPROC) (GLuint index, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBPOINTERVPROC) (GLuint index, GLenum pname, void **pointer);
typedef void (GL_APIENTRYP PFNGLHINTPROC) (GLenum target, GLenum mode);
typedef GLboolean (GL_APIENTRYP PFNGLISBUFFERPROC) (GLuint buffer);
typedef GLboolean (GL_APIENTRYP PFNGLISENABLEDPROC) (GLenum cap);
typedef GLboolean (GL_APIENTRYP PFNGLISFRAMEBUFFERPROC) (GLuint framebuffer);
typedef GLboolean (GL_APIENTRYP PFNGLISPROGRAMPROC) (GLuint program);
typedef GLboolean (GL_APIENTRYP PFNGLISRENDERBUFFERPROC) (GLuint renderbuffer);
typedef GLboolean (GL_APIENTRYP PFNGLISSHADERPROC) (GLuint shader);
typedef GLboolean (GL_APIENTRYP PFNGLISTEXTUREPROC) (GLuint texture);
typedef void (GL_APIENTRYP PFNGLLINEWIDTHPROC) (GLfloat width);
typedef void (GL_APIENTRYP PFNGLLINKPROGRAMPROC) (GLuint program);
typedef void (GL_APIENTRYP PFNGLPIXELSTOREIPROC) (GLenum pname, GLint param);
typedef void (GL_APIENTRYP PFNGLPOLYGONOFFSETPROC) (GLfloat factor, GLfloat units);
typedef void (GL_APIENTRYP PFNGLREADPIXELSPROC) (GLint x, GLint y, GLsizei width, GLsizei height, GLenum format, GLenum type, void *pixels);
typedef void (GL_APIENTRYP PFNGLRELEASESHADERCOMPILERPROC) (void);
typedef void (GL_APIENTRYP PFNGLRENDERBUFFERSTORAGEPROC) (GLenum target, GLenum internalformat, GLsizei width, GLsizei height);
typedef void (GL_APIENTRYP PFNGLSAMPLECOVERAGEPROC) (GLfloat value, GLboolean invert);
typedef void (GL_APIENTRYP PFNGLSCISSORPROC) (GLint x, GLint y, GLsizei width, GLsizei height);
typedef void (GL_APIENTRYP PFNGLSHADERBINARYPROC) (GLsizei count, const GLuint *shaders, GLenum binaryformat, const void *binary, GLsizei length);
typedef void (GL_APIENTRYP PFNGLSHADERSOURCEPROC) (GLuint shader, GLsizei count, const GLchar *const*string, const GLint *length);
typedef void (GL_APIENTRYP PFNGLSTENCILFUNCPROC) (GLenum func, GLint ref, GLuint mask);
typedef void (GL_APIENTRYP PFNGLSTENCILFUNCSEPARATEPROC) (GLenum face, GLenum func, GLint ref, GLuint mask);
typedef void (GL_APIENTRYP PFNGLSTENCILMASKPROC) (GLuint mask);
typedef void (GL_APIENTRYP PFNGLSTENCILMASKSEPARATEPROC) (GLenum face, GLuint mask);
typedef void (GL_APIENTRYP PFNGLSTENCILOPPROC) (GLenum fail, GLenum zfail, GLenum zpass);
typedef void (GL_APIENTRYP PFNGLSTENCILOPSEPARATEPROC) (GLenum face, GLenum sfail, GLenum dpfail, GLenum dppass);
typedef void (GL_APIENTRYP PFNGLTEXIMAGE2DPROC) (GLenum target, GLint level, GLint internalformat, GLsizei width, GLsizei height, GLint border, GLenum format, GLenum type, const void *pixels);
typedef void (GL_APIENTRYP PFNGLTEXPARAMETERFPROC) (GLenum target, GLenum pname, GLfloat param);
typedef void (GL_APIENTRYP PFNGLTEXPARAMETERFVPROC) (GLenum target, GLenum pname, const GLfloat *params);
typedef void (GL_APIENTRYP PFNGLTEXPARAMETERIPROC) (GLenum target, GLenum pname, GLint param);
typedef void (GL_APIENTRYP PFNGLTEXPARAMETERIVPROC) (GLenum target, GLenum pname, const GLint *params);
typedef void (GL_APIENTRYP PFNGLTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLsizei width, GLsizei height, GLenum format, GLenum type, const void *pixels);
typedef void (GL_APIENTRYP PFNGLUNIFORM1FPROC) (GLint location, GLfloat v0);
typedef void (GL_APIENTRYP PFNGLUNIFORM1FVPROC) (GLint location, GLsizei count, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM1IPROC) (GLint location, GLint v0);
typedef void (GL_APIENTRYP PFNGLUNIFORM1IVPROC) (GLint location, GLsizei count, const GLint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM2FPROC) (GLint location, GLfloat v0, GLfloat v1);
typedef void (GL_APIENTRYP PFNGLUNIFORM2FVPROC) (GLint location, GLsizei count, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM2IPROC) (GLint location, GLint v0, GLint v1);
typedef void (GL_APIENTRYP PFNGLUNIFORM2IVPROC) (GLint location, GLsizei count, const GLint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM3FPROC) (GLint location, GLfloat v0, GLfloat v1, GLfloat v2);
typedef void (GL_APIENTRYP PFNGLUNIFORM3FVPROC) (GLint location, GLsizei count, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM3IPROC) (GLint location, GLint v0, GLint v1, GLint v2);
typedef void (GL_APIENTRYP PFNGLUNIFORM3IVPROC) (GLint location, GLsizei count, const GLint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM4FPROC) (GLint location, GLfloat v0, GLfloat v1, GLfloat v2, GLfloat v3);
typedef void (GL_APIENTRYP PFNGLUNIFORM4FVPROC) (GLint location, GLsizei count, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM4IPROC) (GLint location, GLint v0, GLint v1, GLint v2, GLint v3);
typedef void (GL_APIENTRYP PFNGLUNIFORM4IVPROC) (GLint location, GLsizei count, const GLint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX2FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX3FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX4FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUSEPROGRAMPROC) (GLuint program);
typedef void (GL_APIENTRYP PFNGLVALIDATEPROGRAMPROC) (GLuint program);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB1FPROC) (GLuint index, GLfloat x);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB1FVPROC) (GLuint index, const GLfloat *v);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB2FPROC) (GLuint index, GLfloat x, GLfloat y);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB2FVPROC) (GLuint index, const GLfloat *v);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB3FPROC) (GLuint index, GLfloat x, GLfloat y, GLfloat z);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB3FVPROC) (GLuint index, const GLfloat *v);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB4FPROC) (GLuint index, GLfloat x, GLfloat y, GLfloat z, GLfloat w);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB4FVPROC) (GLuint index, const GLfloat *v);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBPOINTERPROC) (GLuint index, GLint size, GLenum type, GLboolean normalized, GLsizei stride, const void *pointer);
typedef void (GL_APIENTRYP PFNGLVIEWPORTPROC) (GLint x, GLint y, GLsizei width, GLsizei height);
GL_APICALL void GL_APIENTRY glActiveTexture (GLenum texture);
GL_APICALL void GL_APIENTRY glAttachShader (GLuint program, GLuint shader);
GL_APICALL void GL_APIENTRY glBindAttribLocation (GLuint program, GLuint index, const GLchar *name);
@@ -851,22 +705,6 @@ typedef unsigned short GLhalf;
#define GL_COLOR_ATTACHMENT13 0x8CED
#define GL_COLOR_ATTACHMENT14 0x8CEE
#define GL_COLOR_ATTACHMENT15 0x8CEF
#define GL_COLOR_ATTACHMENT16 0x8CF0
#define GL_COLOR_ATTACHMENT17 0x8CF1
#define GL_COLOR_ATTACHMENT18 0x8CF2
#define GL_COLOR_ATTACHMENT19 0x8CF3
#define GL_COLOR_ATTACHMENT20 0x8CF4
#define GL_COLOR_ATTACHMENT21 0x8CF5
#define GL_COLOR_ATTACHMENT22 0x8CF6
#define GL_COLOR_ATTACHMENT23 0x8CF7
#define GL_COLOR_ATTACHMENT24 0x8CF8
#define GL_COLOR_ATTACHMENT25 0x8CF9
#define GL_COLOR_ATTACHMENT26 0x8CFA
#define GL_COLOR_ATTACHMENT27 0x8CFB
#define GL_COLOR_ATTACHMENT28 0x8CFC
#define GL_COLOR_ATTACHMENT29 0x8CFD
#define GL_COLOR_ATTACHMENT30 0x8CFE
#define GL_COLOR_ATTACHMENT31 0x8CFF
#define GL_FRAMEBUFFER_INCOMPLETE_MULTISAMPLE 0x8D56
#define GL_MAX_SAMPLES 0x8D57
#define GL_HALF_FLOAT 0x140B
@@ -988,111 +826,7 @@ typedef unsigned short GLhalf;
#define GL_MAX_ELEMENT_INDEX 0x8D6B
#define GL_NUM_SAMPLE_COUNTS 0x9380
#define GL_TEXTURE_IMMUTABLE_LEVELS 0x82DF
typedef void (GL_APIENTRYP PFNGLREADBUFFERPROC) (GLenum src);
typedef void (GL_APIENTRYP PFNGLDRAWRANGEELEMENTSPROC) (GLenum mode, GLuint start, GLuint end, GLsizei count, GLenum type, const void *indices);
typedef void (GL_APIENTRYP PFNGLTEXIMAGE3DPROC) (GLenum target, GLint level, GLint internalformat, GLsizei width, GLsizei height, GLsizei depth, GLint border, GLenum format, GLenum type, const void *pixels);
typedef void (GL_APIENTRYP PFNGLTEXSUBIMAGE3DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLenum type, const void *pixels);
typedef void (GL_APIENTRYP PFNGLCOPYTEXSUBIMAGE3DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLint x, GLint y, GLsizei width, GLsizei height);
typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXIMAGE3DPROC) (GLenum target, GLint level, GLenum internalformat, GLsizei width, GLsizei height, GLsizei depth, GLint border, GLsizei imageSize, const void *data);
typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXSUBIMAGE3DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLsizei imageSize, const void *data);
typedef void (GL_APIENTRYP PFNGLGENQUERIESPROC) (GLsizei n, GLuint *ids);
typedef void (GL_APIENTRYP PFNGLDELETEQUERIESPROC) (GLsizei n, const GLuint *ids);
typedef GLboolean (GL_APIENTRYP PFNGLISQUERYPROC) (GLuint id);
typedef void (GL_APIENTRYP PFNGLBEGINQUERYPROC) (GLenum target, GLuint id);
typedef void (GL_APIENTRYP PFNGLENDQUERYPROC) (GLenum target);
typedef void (GL_APIENTRYP PFNGLGETQUERYIVPROC) (GLenum target, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETQUERYOBJECTUIVPROC) (GLuint id, GLenum pname, GLuint *params);
typedef GLboolean (GL_APIENTRYP PFNGLUNMAPBUFFERPROC) (GLenum target);
typedef void (GL_APIENTRYP PFNGLGETBUFFERPOINTERVPROC) (GLenum target, GLenum pname, void **params);
typedef void (GL_APIENTRYP PFNGLDRAWBUFFERSPROC) (GLsizei n, const GLenum *bufs);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX2X3FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX3X2FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX2X4FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX4X2FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX3X4FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX4X3FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLBLITFRAMEBUFFERPROC) (GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1, GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1, GLbitfield mask, GLenum filter);
typedef void (GL_APIENTRYP PFNGLRENDERBUFFERSTORAGEMULTISAMPLEPROC) (GLenum target, GLsizei samples, GLenum internalformat, GLsizei width, GLsizei height);
typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERTEXTURELAYERPROC) (GLenum target, GLenum attachment, GLuint texture, GLint level, GLint layer);
typedef void *(GL_APIENTRYP PFNGLMAPBUFFERRANGEPROC) (GLenum target, GLintptr offset, GLsizeiptr length, GLbitfield access);
typedef void (GL_APIENTRYP PFNGLFLUSHMAPPEDBUFFERRANGEPROC) (GLenum target, GLintptr offset, GLsizeiptr length);
typedef void (GL_APIENTRYP PFNGLBINDVERTEXARRAYPROC) (GLuint array);
typedef void (GL_APIENTRYP PFNGLDELETEVERTEXARRAYSPROC) (GLsizei n, const GLuint *arrays);
typedef void (GL_APIENTRYP PFNGLGENVERTEXARRAYSPROC) (GLsizei n, GLuint *arrays);
typedef GLboolean (GL_APIENTRYP PFNGLISVERTEXARRAYPROC) (GLuint array);
typedef void (GL_APIENTRYP PFNGLGETINTEGERI_VPROC) (GLenum target, GLuint index, GLint *data);
typedef void (GL_APIENTRYP PFNGLBEGINTRANSFORMFEEDBACKPROC) (GLenum primitiveMode);
typedef void (GL_APIENTRYP PFNGLENDTRANSFORMFEEDBACKPROC) (void);
typedef void (GL_APIENTRYP PFNGLBINDBUFFERRANGEPROC) (GLenum target, GLuint index, GLuint buffer, GLintptr offset, GLsizeiptr size);
typedef void (GL_APIENTRYP PFNGLBINDBUFFERBASEPROC) (GLenum target, GLuint index, GLuint buffer);
typedef void (GL_APIENTRYP PFNGLTRANSFORMFEEDBACKVARYINGSPROC) (GLuint program, GLsizei count, const GLchar *const*varyings, GLenum bufferMode);
typedef void (GL_APIENTRYP PFNGLGETTRANSFORMFEEDBACKVARYINGPROC) (GLuint program, GLuint index, GLsizei bufSize, GLsizei *length, GLsizei *size, GLenum *type, GLchar *name);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBIPOINTERPROC) (GLuint index, GLint size, GLenum type, GLsizei stride, const void *pointer);
typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBIIVPROC) (GLuint index, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBIUIVPROC) (GLuint index, GLenum pname, GLuint *params);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBI4IPROC) (GLuint index, GLint x, GLint y, GLint z, GLint w);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBI4UIPROC) (GLuint index, GLuint x, GLuint y, GLuint z, GLuint w);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBI4IVPROC) (GLuint index, const GLint *v);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBI4UIVPROC) (GLuint index, const GLuint *v);
typedef void (GL_APIENTRYP PFNGLGETUNIFORMUIVPROC) (GLuint program, GLint location, GLuint *params);
typedef GLint (GL_APIENTRYP PFNGLGETFRAGDATALOCATIONPROC) (GLuint program, const GLchar *name);
typedef void (GL_APIENTRYP PFNGLUNIFORM1UIPROC) (GLint location, GLuint v0);
typedef void (GL_APIENTRYP PFNGLUNIFORM2UIPROC) (GLint location, GLuint v0, GLuint v1);
typedef void (GL_APIENTRYP PFNGLUNIFORM3UIPROC) (GLint location, GLuint v0, GLuint v1, GLuint v2);
typedef void (GL_APIENTRYP PFNGLUNIFORM4UIPROC) (GLint location, GLuint v0, GLuint v1, GLuint v2, GLuint v3);
typedef void (GL_APIENTRYP PFNGLUNIFORM1UIVPROC) (GLint location, GLsizei count, const GLuint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM2UIVPROC) (GLint location, GLsizei count, const GLuint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM3UIVPROC) (GLint location, GLsizei count, const GLuint *value);
typedef void (GL_APIENTRYP PFNGLUNIFORM4UIVPROC) (GLint location, GLsizei count, const GLuint *value);
typedef void (GL_APIENTRYP PFNGLCLEARBUFFERIVPROC) (GLenum buffer, GLint drawbuffer, const GLint *value);
typedef void (GL_APIENTRYP PFNGLCLEARBUFFERUIVPROC) (GLenum buffer, GLint drawbuffer, const GLuint *value);
typedef void (GL_APIENTRYP PFNGLCLEARBUFFERFVPROC) (GLenum buffer, GLint drawbuffer, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLCLEARBUFFERFIPROC) (GLenum buffer, GLint drawbuffer, GLfloat depth, GLint stencil);
typedef const GLubyte *(GL_APIENTRYP PFNGLGETSTRINGIPROC) (GLenum name, GLuint index);
typedef void (GL_APIENTRYP PFNGLCOPYBUFFERSUBDATAPROC) (GLenum readTarget, GLenum writeTarget, GLintptr readOffset, GLintptr writeOffset, GLsizeiptr size);
typedef void (GL_APIENTRYP PFNGLGETUNIFORMINDICESPROC) (GLuint program, GLsizei uniformCount, const GLchar *const*uniformNames, GLuint *uniformIndices);
typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMSIVPROC) (GLuint program, GLsizei uniformCount, const GLuint *uniformIndices, GLenum pname, GLint *params);
typedef GLuint (GL_APIENTRYP PFNGLGETUNIFORMBLOCKINDEXPROC) (GLuint program, const GLchar *uniformBlockName);
typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMBLOCKIVPROC) (GLuint program, GLuint uniformBlockIndex, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMBLOCKNAMEPROC) (GLuint program, GLuint uniformBlockIndex, GLsizei bufSize, GLsizei *length, GLchar *uniformBlockName);
typedef void (GL_APIENTRYP PFNGLUNIFORMBLOCKBINDINGPROC) (GLuint program, GLuint uniformBlockIndex, GLuint uniformBlockBinding);
typedef void (GL_APIENTRYP PFNGLDRAWARRAYSINSTANCEDPROC) (GLenum mode, GLint first, GLsizei count, GLsizei instancecount);
typedef void (GL_APIENTRYP PFNGLDRAWELEMENTSINSTANCEDPROC) (GLenum mode, GLsizei count, GLenum type, const void *indices, GLsizei instancecount);
typedef GLsync (GL_APIENTRYP PFNGLFENCESYNCPROC) (GLenum condition, GLbitfield flags);
typedef GLboolean (GL_APIENTRYP PFNGLISSYNCPROC) (GLsync sync);
typedef void (GL_APIENTRYP PFNGLDELETESYNCPROC) (GLsync sync);
typedef GLenum (GL_APIENTRYP PFNGLCLIENTWAITSYNCPROC) (GLsync sync, GLbitfield flags, GLuint64 timeout);
typedef void (GL_APIENTRYP PFNGLWAITSYNCPROC) (GLsync sync, GLbitfield flags, GLuint64 timeout);
typedef void (GL_APIENTRYP PFNGLGETINTEGER64VPROC) (GLenum pname, GLint64 *data);
typedef void (GL_APIENTRYP PFNGLGETSYNCIVPROC) (GLsync sync, GLenum pname, GLsizei bufSize, GLsizei *length, GLint *values);
typedef void (GL_APIENTRYP PFNGLGETINTEGER64I_VPROC) (GLenum target, GLuint index, GLint64 *data);
typedef void (GL_APIENTRYP PFNGLGETBUFFERPARAMETERI64VPROC) (GLenum target, GLenum pname, GLint64 *params);
typedef void (GL_APIENTRYP PFNGLGENSAMPLERSPROC) (GLsizei count, GLuint *samplers);
typedef void (GL_APIENTRYP PFNGLDELETESAMPLERSPROC) (GLsizei count, const GLuint *samplers);
typedef GLboolean (GL_APIENTRYP PFNGLISSAMPLERPROC) (GLuint sampler);
typedef void (GL_APIENTRYP PFNGLBINDSAMPLERPROC) (GLuint unit, GLuint sampler);
typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERIPROC) (GLuint sampler, GLenum pname, GLint param);
typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERIVPROC) (GLuint sampler, GLenum pname, const GLint *param);
typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERFPROC) (GLuint sampler, GLenum pname, GLfloat param);
typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERFVPROC) (GLuint sampler, GLenum pname, const GLfloat *param);
typedef void (GL_APIENTRYP PFNGLGETSAMPLERPARAMETERIVPROC) (GLuint sampler, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETSAMPLERPARAMETERFVPROC) (GLuint sampler, GLenum pname, GLfloat *params);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBDIVISORPROC) (GLuint index, GLuint divisor);
typedef void (GL_APIENTRYP PFNGLBINDTRANSFORMFEEDBACKPROC) (GLenum target, GLuint id);
typedef void (GL_APIENTRYP PFNGLDELETETRANSFORMFEEDBACKSPROC) (GLsizei n, const GLuint *ids);
typedef void (GL_APIENTRYP PFNGLGENTRANSFORMFEEDBACKSPROC) (GLsizei n, GLuint *ids);
typedef GLboolean (GL_APIENTRYP PFNGLISTRANSFORMFEEDBACKPROC) (GLuint id);
typedef void (GL_APIENTRYP PFNGLPAUSETRANSFORMFEEDBACKPROC) (void);
typedef void (GL_APIENTRYP PFNGLRESUMETRANSFORMFEEDBACKPROC) (void);
typedef void (GL_APIENTRYP PFNGLGETPROGRAMBINARYPROC) (GLuint program, GLsizei bufSize, GLsizei *length, GLenum *binaryFormat, void *binary);
typedef void (GL_APIENTRYP PFNGLPROGRAMBINARYPROC) (GLuint program, GLenum binaryFormat, const void *binary, GLsizei length);
typedef void (GL_APIENTRYP PFNGLPROGRAMPARAMETERIPROC) (GLuint program, GLenum pname, GLint value);
typedef void (GL_APIENTRYP PFNGLINVALIDATEFRAMEBUFFERPROC) (GLenum target, GLsizei numAttachments, const GLenum *attachments);
typedef void (GL_APIENTRYP PFNGLINVALIDATESUBFRAMEBUFFERPROC) (GLenum target, GLsizei numAttachments, const GLenum *attachments, GLint x, GLint y, GLsizei width, GLsizei height);
typedef void (GL_APIENTRYP PFNGLTEXSTORAGE2DPROC) (GLenum target, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height);
typedef void (GL_APIENTRYP PFNGLTEXSTORAGE3DPROC) (GLenum target, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height, GLsizei depth);
typedef void (GL_APIENTRYP PFNGLGETINTERNALFORMATIVPROC) (GLenum target, GLenum internalformat, GLenum pname, GLsizei bufSize, GLint *params);
GL_APICALL void GL_APIENTRY glReadBuffer (GLenum src);
GL_APICALL void GL_APIENTRY glReadBuffer (GLenum mode);
GL_APICALL void GL_APIENTRY glDrawRangeElements (GLenum mode, GLuint start, GLuint end, GLsizei count, GLenum type, const void *indices);
GL_APICALL void GL_APIENTRY glTexImage3D (GLenum target, GLint level, GLint internalformat, GLsizei width, GLsizei height, GLsizei depth, GLint border, GLenum format, GLenum type, const void *pixels);
GL_APICALL void GL_APIENTRY glTexSubImage3D (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLenum type, const void *pixels);
@@ -1373,74 +1107,6 @@ GL_APICALL void GL_APIENTRY glGetInternalformativ (GLenum target, GLenum interna
#define GL_MAX_VERTEX_ATTRIB_RELATIVE_OFFSET 0x82D9
#define GL_MAX_VERTEX_ATTRIB_BINDINGS 0x82DA
#define GL_MAX_VERTEX_ATTRIB_STRIDE 0x82E5
typedef void (GL_APIENTRYP PFNGLDISPATCHCOMPUTEPROC) (GLuint num_groups_x, GLuint num_groups_y, GLuint num_groups_z);
typedef void (GL_APIENTRYP PFNGLDISPATCHCOMPUTEINDIRECTPROC) (GLintptr indirect);
typedef void (GL_APIENTRYP PFNGLDRAWARRAYSINDIRECTPROC) (GLenum mode, const void *indirect);
typedef void (GL_APIENTRYP PFNGLDRAWELEMENTSINDIRECTPROC) (GLenum mode, GLenum type, const void *indirect);
typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERPARAMETERIPROC) (GLenum target, GLenum pname, GLint param);
typedef void (GL_APIENTRYP PFNGLGETFRAMEBUFFERPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETPROGRAMINTERFACEIVPROC) (GLuint program, GLenum programInterface, GLenum pname, GLint *params);
typedef GLuint (GL_APIENTRYP PFNGLGETPROGRAMRESOURCEINDEXPROC) (GLuint program, GLenum programInterface, const GLchar *name);
typedef void (GL_APIENTRYP PFNGLGETPROGRAMRESOURCENAMEPROC) (GLuint program, GLenum programInterface, GLuint index, GLsizei bufSize, GLsizei *length, GLchar *name);
typedef void (GL_APIENTRYP PFNGLGETPROGRAMRESOURCEIVPROC) (GLuint program, GLenum programInterface, GLuint index, GLsizei propCount, const GLenum *props, GLsizei bufSize, GLsizei *length, GLint *params);
typedef GLint (GL_APIENTRYP PFNGLGETPROGRAMRESOURCELOCATIONPROC) (GLuint program, GLenum programInterface, const GLchar *name);
typedef void (GL_APIENTRYP PFNGLUSEPROGRAMSTAGESPROC) (GLuint pipeline, GLbitfield stages, GLuint program);
typedef void (GL_APIENTRYP PFNGLACTIVESHADERPROGRAMPROC) (GLuint pipeline, GLuint program);
typedef GLuint (GL_APIENTRYP PFNGLCREATESHADERPROGRAMVPROC) (GLenum type, GLsizei count, const GLchar *const*strings);
typedef void (GL_APIENTRYP PFNGLBINDPROGRAMPIPELINEPROC) (GLuint pipeline);
typedef void (GL_APIENTRYP PFNGLDELETEPROGRAMPIPELINESPROC) (GLsizei n, const GLuint *pipelines);
typedef void (GL_APIENTRYP PFNGLGENPROGRAMPIPELINESPROC) (GLsizei n, GLuint *pipelines);
typedef GLboolean (GL_APIENTRYP PFNGLISPROGRAMPIPELINEPROC) (GLuint pipeline);
typedef void (GL_APIENTRYP PFNGLGETPROGRAMPIPELINEIVPROC) (GLuint pipeline, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1IPROC) (GLuint program, GLint location, GLint v0);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2IPROC) (GLuint program, GLint location, GLint v0, GLint v1);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3IPROC) (GLuint program, GLint location, GLint v0, GLint v1, GLint v2);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4IPROC) (GLuint program, GLint location, GLint v0, GLint v1, GLint v2, GLint v3);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1UIPROC) (GLuint program, GLint location, GLuint v0);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2UIPROC) (GLuint program, GLint location, GLuint v0, GLuint v1);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3UIPROC) (GLuint program, GLint location, GLuint v0, GLuint v1, GLuint v2);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4UIPROC) (GLuint program, GLint location, GLuint v0, GLuint v1, GLuint v2, GLuint v3);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1FPROC) (GLuint program, GLint location, GLfloat v0);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2FPROC) (GLuint program, GLint location, GLfloat v0, GLfloat v1);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3FPROC) (GLuint program, GLint location, GLfloat v0, GLfloat v1, GLfloat v2);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4FPROC) (GLuint program, GLint location, GLfloat v0, GLfloat v1, GLfloat v2, GLfloat v3);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1IVPROC) (GLuint program, GLint location, GLsizei count, const GLint *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2IVPROC) (GLuint program, GLint location, GLsizei count, const GLint *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3IVPROC) (GLuint program, GLint location, GLsizei count, const GLint *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4IVPROC) (GLuint program, GLint location, GLsizei count, const GLint *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1UIVPROC) (GLuint program, GLint location, GLsizei count, const GLuint *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2UIVPROC) (GLuint program, GLint location, GLsizei count, const GLuint *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3UIVPROC) (GLuint program, GLint location, GLsizei count, const GLuint *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4UIVPROC) (GLuint program, GLint location, GLsizei count, const GLuint *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1FVPROC) (GLuint program, GLint location, GLsizei count, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2FVPROC) (GLuint program, GLint location, GLsizei count, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3FVPROC) (GLuint program, GLint location, GLsizei count, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4FVPROC) (GLuint program, GLint location, GLsizei count, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX2FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX3FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX4FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX2X3FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX3X2FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX2X4FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX4X2FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX3X4FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX4X3FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
typedef void (GL_APIENTRYP PFNGLVALIDATEPROGRAMPIPELINEPROC) (GLuint pipeline);
typedef void (GL_APIENTRYP PFNGLGETPROGRAMPIPELINEINFOLOGPROC) (GLuint pipeline, GLsizei bufSize, GLsizei *length, GLchar *infoLog);
typedef void (GL_APIENTRYP PFNGLBINDIMAGETEXTUREPROC) (GLuint unit, GLuint texture, GLint level, GLboolean layered, GLint layer, GLenum access, GLenum format);
typedef void (GL_APIENTRYP PFNGLGETBOOLEANI_VPROC) (GLenum target, GLuint index, GLboolean *data);
typedef void (GL_APIENTRYP PFNGLMEMORYBARRIERPROC) (GLbitfield barriers);
typedef void (GL_APIENTRYP PFNGLMEMORYBARRIERBYREGIONPROC) (GLbitfield barriers);
typedef void (GL_APIENTRYP PFNGLTEXSTORAGE2DMULTISAMPLEPROC) (GLenum target, GLsizei samples, GLenum internalformat, GLsizei width, GLsizei height, GLboolean fixedsamplelocations);
typedef void (GL_APIENTRYP PFNGLGETMULTISAMPLEFVPROC) (GLenum pname, GLuint index, GLfloat *val);
typedef void (GL_APIENTRYP PFNGLSAMPLEMASKIPROC) (GLuint maskNumber, GLbitfield mask);
typedef void (GL_APIENTRYP PFNGLGETTEXLEVELPARAMETERIVPROC) (GLenum target, GLint level, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETTEXLEVELPARAMETERFVPROC) (GLenum target, GLint level, GLenum pname, GLfloat *params);
typedef void (GL_APIENTRYP PFNGLBINDVERTEXBUFFERPROC) (GLuint bindingindex, GLuint buffer, GLintptr offset, GLsizei stride);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBFORMATPROC) (GLuint attribindex, GLint size, GLenum type, GLboolean normalized, GLuint relativeoffset);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBIFORMATPROC) (GLuint attribindex, GLint size, GLenum type, GLuint relativeoffset);
typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBBINDINGPROC) (GLuint attribindex, GLuint bindingindex);
typedef void (GL_APIENTRYP PFNGLVERTEXBINDINGDIVISORPROC) (GLuint bindingindex, GLuint divisor);
GL_APICALL void GL_APIENTRY glDispatchCompute (GLuint num_groups_x, GLuint num_groups_y, GLuint num_groups_z);
GL_APICALL void GL_APIENTRY glDispatchComputeIndirect (GLintptr indirect);
GL_APICALL void GL_APIENTRY glDrawArraysIndirect (GLenum mode, const void *indirect);

File diff suppressed because it is too large Load Diff

View File

@@ -1,3 +0,0 @@
[*.h]
indent_style = space
indent_size = 4

View File

@@ -184,7 +184,7 @@ mtx_destroy(mtx_t *mtx)
* Thus the linker will be happy and things don't clash when building
* with -O1 or greater.
*/
#if defined(HAVE_FUNC_ATTRIBUTE_WEAK) && !defined(__CYGWIN__)
#ifdef HAVE_FUNC_ATTRIBUTE_WEAK
__attribute__((weak))
int pthread_mutexattr_init(pthread_mutexattr_t *attr);

View File

@@ -1,3 +0,0 @@
[*.h]
indent_style = space
indent_size = 4

View File

@@ -109,10 +109,6 @@ CHIPSET(0x162A, bdw_gt3, "Intel(R) Iris Pro P6300 (Broadwell GT3e)")
CHIPSET(0x162B, bdw_gt3, "Intel(R) Iris 6100 (Broadwell GT3)")
CHIPSET(0x162D, bdw_gt3, "Intel(R) Broadwell GT3")
CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell GT3")
CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)")
CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* Overridden in brw_get_renderer_string */
CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x1902, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
CHIPSET(0x1906, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
CHIPSET(0x190A, skl_gt1, "Intel(R) Skylake GT1")
@@ -138,30 +134,34 @@ CHIPSET(0x1932, skl_gt4, "Intel(R) Iris Pro Graphics 580 (Skylake GT4e)")
CHIPSET(0x193A, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4e)")
CHIPSET(0x193B, skl_gt4, "Intel(R) Iris Pro Graphics 580 (Skylake GT4e)")
CHIPSET(0x193D, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4e)")
CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)")
CHIPSET(0x1A84, bxt, "Intel(R) HD Graphics (Broxton)")
CHIPSET(0x1A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")
CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics 505 (Broxton)")
CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics 500 (Broxton 2x6)")
CHIPSET(0x5902, kbl_gt1, "Intel(R) Kabylake GT1")
CHIPSET(0x5906, kbl_gt1, "Intel(R) Kabylake GT1")
CHIPSET(0x590A, kbl_gt1, "Intel(R) Kabylake GT1")
CHIPSET(0x5908, kbl_gt1, "Intel(R) Kabylake GT1")
CHIPSET(0x590B, kbl_gt1, "Intel(R) Kabylake GT1")
CHIPSET(0x590E, kbl_gt1, "Intel(R) Kabylake GT1")
CHIPSET(0x5913, kbl_gt1_5, "Intel(R) Kabylake GT1.5")
CHIPSET(0x5915, kbl_gt1_5, "Intel(R) Kabylake GT1.5")
CHIPSET(0x5917, kbl_gt1_5, "Intel(R) Kabylake GT1.5")
CHIPSET(0x5912, kbl_gt2, "Intel(R) Kabylake GT2")
CHIPSET(0x5916, kbl_gt2, "Intel(R) HD Graphics 620 (Kabylake GT2)")
CHIPSET(0x5916, kbl_gt2, "Intel(R) Kabylake GT2")
CHIPSET(0x591A, kbl_gt2, "Intel(R) Kabylake GT2")
CHIPSET(0x591B, kbl_gt2, "Intel(R) Kabylake GT2")
CHIPSET(0x591D, kbl_gt2, "Intel(R) Kabylake GT2")
CHIPSET(0x591E, kbl_gt2, "Intel(R) HD Graphics 615 (Kabylake GT2)")
CHIPSET(0x591E, kbl_gt2, "Intel(R) Kabylake GT2")
CHIPSET(0x5921, kbl_gt2, "Intel(R) Kabylake GT2F")
CHIPSET(0x5923, kbl_gt3, "Intel(R) Kabylake GT3")
CHIPSET(0x5926, kbl_gt3, "Intel(R) Kabylake GT3")
CHIPSET(0x5927, kbl_gt3, "Intel(R) Kabylake GT3")
CHIPSET(0x592A, kbl_gt3, "Intel(R) Kabylake GT3")
CHIPSET(0x592B, kbl_gt3, "Intel(R) Kabylake GT3")
CHIPSET(0x5932, kbl_gt4, "Intel(R) Kabylake GT4")
CHIPSET(0x593A, kbl_gt4, "Intel(R) Kabylake GT4")
CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")
CHIPSET(0x3184, glk, "Intel(R) HD Graphics (Geminilake)")
CHIPSET(0x3185, glk_2x6, "Intel(R) HD Graphics (Geminilake 2x6)")
CHIPSET(0x593D, kbl_gt4, "Intel(R) Kabylake GT4")
CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)")
CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* Overridden in brw_get_renderer_string */
CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)")
CHIPSET(0x1A84, bxt, "Intel(R) HD Graphics (Broxton)")
CHIPSET(0x1A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")
CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics (Broxton)")
CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")

View File

@@ -1,3 +0,0 @@
[*.h]
indent_style = space
indent_size = 4

View File

@@ -0,0 +1,72 @@
# ===========================================================================
# http://www.gnu.org/software/autoconf-archive/ax_check_compile_flag.html
# ===========================================================================
#
# SYNOPSIS
#
# AX_CHECK_COMPILE_FLAG(FLAG, [ACTION-SUCCESS], [ACTION-FAILURE], [EXTRA-FLAGS])
#
# DESCRIPTION
#
# Check whether the given FLAG works with the current language's compiler
# or gives an error. (Warnings, however, are ignored)
#
# ACTION-SUCCESS/ACTION-FAILURE are shell commands to execute on
# success/failure.
#
# If EXTRA-FLAGS is defined, it is added to the current language's default
# flags (e.g. CFLAGS) when the check is done. The check is thus made with
# the flags: "CFLAGS EXTRA-FLAGS FLAG". This can for example be used to
# force the compiler to issue an error when a bad flag is given.
#
# NOTE: Implementation based on AX_CFLAGS_GCC_OPTION. Please keep this
# macro in sync with AX_CHECK_{PREPROC,LINK}_FLAG.
#
# LICENSE
#
# Copyright (c) 2008 Guido U. Draheim <guidod@gmx.de>
# Copyright (c) 2011 Maarten Bosmans <mkbosmans@gmail.com>
#
# This program is free software: you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the
# Free Software Foundation, either version 3 of the License, or (at your
# option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
# Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program. If not, see <http://www.gnu.org/licenses/>.
#
# As a special exception, the respective Autoconf Macro's copyright owner
# gives unlimited permission to copy, distribute and modify the configure
# scripts that are the output of Autoconf when processing the Macro. You
# need not follow the terms of the GNU General Public License when using
# or distributing such scripts, even though portions of the text of the
# Macro appear in them. The GNU General Public License (GPL) does govern
# all other use of the material that constitutes the Autoconf Macro.
#
# This special exception to the GPL applies to versions of the Autoconf
# Macro released by the Autoconf Archive. When you make and distribute a
# modified version of the Autoconf Macro, you may extend this special
# exception to the GPL to apply to your modified version as well.
#serial 2
AC_DEFUN([AX_CHECK_COMPILE_FLAG],
[AC_PREREQ(2.59)dnl for _AC_LANG_PREFIX
AS_VAR_PUSHDEF([CACHEVAR],[ax_cv_check_[]_AC_LANG_ABBREV[]flags_$4_$1])dnl
AC_CACHE_CHECK([whether _AC_LANG compiler accepts $1], CACHEVAR, [
ax_check_save_flags=$[]_AC_LANG_PREFIX[]FLAGS
_AC_LANG_PREFIX[]FLAGS="$[]_AC_LANG_PREFIX[]FLAGS $4 $1"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM()],
[AS_VAR_SET(CACHEVAR,[yes])],
[AS_VAR_SET(CACHEVAR,[no])])
_AC_LANG_PREFIX[]FLAGS=$ax_check_save_flags])
AS_IF([test x"AS_VAR_GET(CACHEVAR)" = xyes],
[m4_default([$2], :)],
[m4_default([$3], :)])
AS_VAR_POPDEF([CACHEVAR])dnl
])dnl AX_CHECK_COMPILE_FLAGS

View File

@@ -103,14 +103,8 @@ def python_scan(node, env, path):
# http://www.scons.org/doc/0.98.5/HTML/scons-user/c2781.html#AEN2789
# https://docs.python.org/2/library/modulefinder.html
contents = node.get_contents()
# Tell ModuleFinder to search dependencies in the script dir, and the glapi
# dirs
source_dir = node.get_dir().abspath
GLAPI = env.Dir('#src/mapi/glapi/gen').abspath
path = [source_dir, GLAPI] + sys.path
finder = modulefinder.ModuleFinder(path=path)
source_dir = node.get_dir()
finder = modulefinder.ModuleFinder()
finder.run_script(node.abspath)
results = []
for name, mod in finder.modules.iteritems():

View File

@@ -256,7 +256,7 @@ def generate(env):
if env['build'] == 'profile':
env['debug'] = False
env['profile'] = True
if env['build'] in ('release', 'opt'):
if env['build'] == 'release':
env['debug'] = False
env['profile'] = False
@@ -301,8 +301,6 @@ def generate(env):
cppdefines += ['NDEBUG']
if env['build'] == 'profile':
cppdefines += ['PROFILE']
if env['build'] in ('opt', 'profile'):
cppdefines += ['VMX86_STATS']
if env['platform'] in ('posix', 'linux', 'freebsd', 'darwin'):
cppdefines += [
'_POSIX_SOURCE',
@@ -452,7 +450,7 @@ def generate(env):
ccflags += [
'/O2', # optimize for speed
]
if env['build'] in ('release', 'opt'):
if env['build'] == 'release':
if not env['clang']:
ccflags += [
'/GL', # enable whole program optimization
@@ -563,7 +561,7 @@ def generate(env):
shlinkflags += ['-Wl,--enable-stdcall-fixup']
#shlinkflags += ['-Wl,--kill-at']
if msvc:
if env['build'] in ('release', 'opt') and not env['clang']:
if env['build'] == 'release' and not env['clang']:
# enable Link-time Code Generation
linkflags += ['/LTCG']
env.Append(ARFLAGS = ['/LTCG'])
@@ -652,6 +650,7 @@ def generate(env):
env.PkgCheckModules('XCB', ['x11-xcb', 'xcb-glx >= 1.8.1', 'xcb-dri2 >= 1.8'])
env.PkgCheckModules('XF86VIDMODE', ['xxf86vm'])
env.PkgCheckModules('DRM', ['libdrm >= 2.4.38'])
env.PkgCheckModules('UDEV', ['libudev >= 151'])
if env['x11']:
env.Append(CPPPATH = env['X11_CPPPATH'])

View File

@@ -865,7 +865,7 @@ sub top_of_mesa_tree {
$lk_path .= "/";
}
if ( (-f "${lk_path}docs/mesa.css")
&& (-f "${lk_path}docs/features.txt")
&& (-f "${lk_path}docs/GL3.txt")
&& (-f "${lk_path}src/mesa/main/version.c")
&& (-f "${lk_path}REVIEWERS")
&& (-d "${lk_path}scripts")) {

View File

@@ -74,10 +74,6 @@ endif
# include only conditionally ?
SUBDIRS += compiler
if HAVE_AMD_DRIVERS
SUBDIRS += amd
endif
if HAVE_INTEL_DRIVERS
SUBDIRS += intel
endif
@@ -111,25 +107,11 @@ if HAVE_EGL
SUBDIRS += egl
endif
if HAVE_INTEL_DRIVERS
SUBDIRS += intel/tools
endif
if HAVE_VULKAN_COMMON
SUBDIRS += vulkan/wsi
endif
## Requires the i965 compiler (part of mesa) and wayland-drm
if HAVE_INTEL_VULKAN
SUBDIRS += intel/vulkan
endif
# Requires wayland-drm
if HAVE_RADEON_VULKAN
SUBDIRS += amd/common
SUBDIRS += amd/vulkan
endif
if HAVE_GALLIUM
SUBDIRS += gallium
endif
@@ -145,15 +127,12 @@ AM_CPPFLAGS = \
-I$(top_srcdir)/include/ \
-I$(top_srcdir)/src/mapi/ \
-I$(top_srcdir)/src/mesa/ \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
$(DEFINES)
noinst_LTLIBRARIES = libglsl_util.la
libglsl_util_la_SOURCES = \
mesa/main/extensions_table.c \
mesa/main/imports.c \
mesa/program/prog_parameter.c \
mesa/program/prog_hash_table.c \
mesa/program/symbol_table.c \
mesa/program/dummy_errors.c

View File

@@ -1,9 +1,6 @@
import filecmp
import os
import subprocess
Import('*')
if env['platform'] == 'windows':
SConscript('getopt/SConscript')
@@ -15,50 +12,6 @@ if env['hostonly']:
# compilation
Return()
def write_git_sha1_h_file(filename):
"""Mesa looks for a git_sha1.h file at compile time in order to display
the current git hash id in the GL_VERSION string. This function tries
to retrieve the git hashid and write the header file. An empty file
will be created if anything goes wrong."""
args = [ 'git', 'rev-parse', '--short=10', 'HEAD' ]
try:
(commit, foo) = subprocess.Popen(args, stdout=subprocess.PIPE).communicate()
except:
print "Warning: exception in write_git_sha1_h_file()"
# git log command didn't work
if not os.path.exists(filename):
dirname = os.path.dirname(filename)
if dirname and not os.path.exists(dirname):
os.makedirs(dirname)
# create an empty file if none already exists
f = open(filename, "w")
f.close()
return
# note that commit[:-1] removes the trailing newline character
commit = '#define MESA_GIT_SHA1 "git-%s"\n' % commit[:-1]
tempfile = "git_sha1.h.tmp"
f = open(tempfile, "w")
f.write(commit)
f.close()
if not os.path.exists(filename) or not filecmp.cmp(tempfile, filename):
# The filename does not exist or it's different from the new file,
# so replace old file with new.
if os.path.exists(filename):
os.remove(filename)
os.rename(tempfile, filename)
return
# Create the git_sha1.h header file
write_git_sha1_h_file("git_sha1.h")
# and update CPPPATH so the git_sha1.h header can be found
env.Append(CPPPATH = ["#" + env['build_dir']])
if env['platform'] != 'windows':
SConscript('loader/SConscript')

View File

@@ -1,44 +0,0 @@
# Copyright © 2016 Red Hat.
# Copyright © 2016 Mauro Rossi <issor.oruam@gmail.com>
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice (including the next
# paragraph) shall be included in all copies or substantial portions of the
# Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
# ---------------------------------------
# Build libmesa_amdgpu_addrlib
# ---------------------------------------
include $(CLEAR_VARS)
LOCAL_MODULE := libmesa_amdgpu_addrlib
LOCAL_SRC_FILES := $(ADDRLIB_FILES)
LOCAL_CFLAGS := -DBRAHMA_BUILD=1
LOCAL_C_INCLUDES := \
$(MESA_TOP)/src \
$(MESA_TOP)/src/amd/common \
$(MESA_TOP)/src/amd/addrlib \
$(MESA_TOP)/src/amd/addrlib/core \
$(MESA_TOP)/src/amd/addrlib/inc/chip/r800 \
$(MESA_TOP)/src/amd/addrlib/r800/chip
include $(MESA_COMMON_MK)
include $(BUILD_STATIC_LIBRARY)

View File

@@ -1,28 +0,0 @@
# Copyright © 2016 Red Hat.
# Copyright © 2016 Mauro Rossi <issor.oruam@gmail.com>
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice (including the next
# paragraph) shall be included in all copies or substantial portions of the
# Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
LOCAL_PATH := $(call my-dir)
# Import variables
include $(LOCAL_PATH)/Makefile.sources
include $(LOCAL_PATH)/Android.addrlib.mk

View File

@@ -1,38 +0,0 @@
# Copyright 2016 Red Hat Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice (including the next
# paragraph) shall be included in all copies or substantial portions of the
# Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
ADDRLIB_LIBS = addrlib/libamdgpu_addrlib.la
addrlib_libamdgpu_addrlib_la_CPPFLAGS = \
-I$(top_srcdir)/src/ \
-I$(srcdir)/common \
-I$(srcdir)/addrlib \
-I$(srcdir)/addrlib/core \
-I$(srcdir)/addrlib/inc/chip/r800 \
-I$(srcdir)/addrlib/r800/chip \
-DBRAHMA_BUILD=1
addrlib_libamdgpu_addrlib_la_CXXFLAGS = \
$(VISIBILITY_CXXFLAGS)
noinst_LTLIBRARIES += $(ADDRLIB_LIBS)
addrlib_libamdgpu_addrlib_la_SOURCES = $(ADDRLIB_FILES)

View File

@@ -1,27 +0,0 @@
# Copyright © 2016 Red Hat.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice (including the next
# paragraph) shall be included in all copies or substantial portions of the
# Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
include Makefile.sources
noinst_LTLIBRARIES =
EXTRA_DIST = $(COMMON_HEADER_FILES)
include Makefile.addrlib.am

View File

@@ -1,27 +0,0 @@
COMMON_HEADER_FILES = \
common/sid.h \
common/r600d_common.h \
common/amd_family.h \
common/amd_kernel_code_t.h \
common/amdgpu_id.h
ADDRLIB_FILES = \
addrlib/addrinterface.cpp \
addrlib/addrinterface.h \
addrlib/addrtypes.h \
addrlib/core/addrcommon.h \
addrlib/core/addrelemlib.cpp \
addrlib/core/addrelemlib.h \
addrlib/core/addrlib.cpp \
addrlib/core/addrlib.h \
addrlib/core/addrobject.cpp \
addrlib/core/addrobject.h \
addrlib/inc/chip/r800/si_gb_reg.h \
addrlib/inc/lnx_common_defs.h \
addrlib/r800/chip/si_ci_vi_merged_enum.h \
addrlib/r800/ciaddrlib.cpp \
addrlib/r800/ciaddrlib.h \
addrlib/r800/egbaddrlib.cpp \
addrlib/r800/egbaddrlib.h \
addrlib/r800/siaddrlib.cpp \
addrlib/r800/siaddrlib.h

View File

@@ -1,51 +0,0 @@
# Copyright © 2016 Bas Nieuwenhuizen
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice (including the next
# paragraph) shall be included in all copies or substantial portions of the
# Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
include Makefile.sources
# TODO cleanup these
AM_CPPFLAGS = \
$(VALGRIND_CFLAGS) \
$(DEFINES) \
-I$(top_srcdir)/include \
-I$(top_builddir)/src \
-I$(top_srcdir)/src \
-I$(top_builddir)/src/compiler \
-I$(top_builddir)/src/compiler/nir \
-I$(top_srcdir)/src/compiler \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mesa \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/gallium/include
AM_CFLAGS = $(VISIBILITY_CFLAGS) \
$(PTHREAD_CFLAGS) \
$(LLVM_CFLAGS) \
$(LIBELF_CFLAGS)
AM_CXXFLAGS = \
$(VISIBILITY_CXXFLAGS) \
$(LLVM_CXXFLAGS)
noinst_LTLIBRARIES = libamd_common.la
libamd_common_la_SOURCES = $(AMD_COMPILER_SOURCES)

View File

@@ -1,29 +0,0 @@
# Copyright © 2016 Bas Nieuwenhuizen
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice (including the next
# paragraph) shall be included in all copies or substantial portions of the
# Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
AMD_COMPILER_SOURCES := \
ac_binary.c \
ac_binary.h \
ac_llvm_helper.cpp \
ac_llvm_util.c \
ac_llvm_util.h \
ac_nir_to_llvm.c \
ac_nir_to_llvm.h

View File

@@ -1,288 +0,0 @@
/*
* Copyright 2014 Advanced Micro Devices, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
* Authors: Tom Stellard <thomas.stellard@amd.com>
*
* Based on radeon_elf_util.c.
*/
#include "ac_binary.h"
#include "util/u_math.h"
#include "util/u_memory.h"
#include <gelf.h>
#include <libelf.h>
#include <stdio.h>
#include <sid.h>
#define SPILLED_SGPRS 0x4
#define SPILLED_VGPRS 0x8
static void parse_symbol_table(Elf_Data *symbol_table_data,
const GElf_Shdr *symbol_table_header,
struct ac_shader_binary *binary)
{
GElf_Sym symbol;
unsigned i = 0;
unsigned symbol_count =
symbol_table_header->sh_size / symbol_table_header->sh_entsize;
/* We are over allocating this list, because symbol_count gives the
* total number of symbols, and we will only be filling the list
* with offsets of global symbols. The memory savings from
* allocating the correct size of this list will be small, and
* I don't think it is worth the cost of pre-computing the number
* of global symbols.
*/
binary->global_symbol_offsets = CALLOC(symbol_count, sizeof(uint64_t));
while (gelf_getsym(symbol_table_data, i++, &symbol)) {
unsigned i;
if (GELF_ST_BIND(symbol.st_info) != STB_GLOBAL ||
symbol.st_shndx == 0 /* Undefined symbol */) {
continue;
}
binary->global_symbol_offsets[binary->global_symbol_count] =
symbol.st_value;
/* Sort the list using bubble sort. This list will usually
* be small. */
for (i = binary->global_symbol_count; i > 0; --i) {
uint64_t lhs = binary->global_symbol_offsets[i - 1];
uint64_t rhs = binary->global_symbol_offsets[i];
if (lhs < rhs) {
break;
}
binary->global_symbol_offsets[i] = lhs;
binary->global_symbol_offsets[i - 1] = rhs;
}
++binary->global_symbol_count;
}
}
static void parse_relocs(Elf *elf, Elf_Data *relocs, Elf_Data *symbols,
unsigned symbol_sh_link,
struct ac_shader_binary *binary)
{
unsigned i;
if (!relocs || !symbols || !binary->reloc_count) {
return;
}
binary->relocs = CALLOC(binary->reloc_count,
sizeof(struct ac_shader_reloc));
for (i = 0; i < binary->reloc_count; i++) {
GElf_Sym symbol;
GElf_Rel rel;
char *symbol_name;
struct ac_shader_reloc *reloc = &binary->relocs[i];
gelf_getrel(relocs, i, &rel);
gelf_getsym(symbols, GELF_R_SYM(rel.r_info), &symbol);
symbol_name = elf_strptr(elf, symbol_sh_link, symbol.st_name);
reloc->offset = rel.r_offset;
strncpy(reloc->name, symbol_name, sizeof(reloc->name)-1);
reloc->name[sizeof(reloc->name)-1] = 0;
}
}
void ac_elf_read(const char *elf_data, unsigned elf_size,
struct ac_shader_binary *binary)
{
char *elf_buffer;
Elf *elf;
Elf_Scn *section = NULL;
Elf_Data *symbols = NULL, *relocs = NULL;
size_t section_str_index;
unsigned symbol_sh_link = 0;
/* One of the libelf implementations
* (http://www.mr511.de/software/english.htm) requires calling
* elf_version() before elf_memory().
*/
elf_version(EV_CURRENT);
elf_buffer = MALLOC(elf_size);
memcpy(elf_buffer, elf_data, elf_size);
elf = elf_memory(elf_buffer, elf_size);
elf_getshdrstrndx(elf, &section_str_index);
while ((section = elf_nextscn(elf, section))) {
const char *name;
Elf_Data *section_data = NULL;
GElf_Shdr section_header;
if (gelf_getshdr(section, &section_header) != &section_header) {
fprintf(stderr, "Failed to read ELF section header\n");
return;
}
name = elf_strptr(elf, section_str_index, section_header.sh_name);
if (!strcmp(name, ".text")) {
section_data = elf_getdata(section, section_data);
binary->code_size = section_data->d_size;
binary->code = MALLOC(binary->code_size * sizeof(unsigned char));
memcpy(binary->code, section_data->d_buf, binary->code_size);
} else if (!strcmp(name, ".AMDGPU.config")) {
section_data = elf_getdata(section, section_data);
binary->config_size = section_data->d_size;
binary->config = MALLOC(binary->config_size * sizeof(unsigned char));
memcpy(binary->config, section_data->d_buf, binary->config_size);
} else if (!strcmp(name, ".AMDGPU.disasm")) {
/* Always read disassembly if it's available. */
section_data = elf_getdata(section, section_data);
binary->disasm_string = strndup(section_data->d_buf,
section_data->d_size);
} else if (!strncmp(name, ".rodata", 7)) {
section_data = elf_getdata(section, section_data);
binary->rodata_size = section_data->d_size;
binary->rodata = MALLOC(binary->rodata_size * sizeof(unsigned char));
memcpy(binary->rodata, section_data->d_buf, binary->rodata_size);
} else if (!strncmp(name, ".symtab", 7)) {
symbols = elf_getdata(section, section_data);
symbol_sh_link = section_header.sh_link;
parse_symbol_table(symbols, &section_header, binary);
} else if (!strcmp(name, ".rel.text")) {
relocs = elf_getdata(section, section_data);
binary->reloc_count = section_header.sh_size /
section_header.sh_entsize;
}
}
parse_relocs(elf, relocs, symbols, symbol_sh_link, binary);
if (elf){
elf_end(elf);
}
FREE(elf_buffer);
/* Cache the config size per symbol */
if (binary->global_symbol_count) {
binary->config_size_per_symbol =
binary->config_size / binary->global_symbol_count;
} else {
binary->global_symbol_count = 1;
binary->config_size_per_symbol = binary->config_size;
}
}
static
const unsigned char *ac_shader_binary_config_start(
const struct ac_shader_binary *binary,
uint64_t symbol_offset)
{
unsigned i;
for (i = 0; i < binary->global_symbol_count; ++i) {
if (binary->global_symbol_offsets[i] == symbol_offset) {
unsigned offset = i * binary->config_size_per_symbol;
return binary->config + offset;
}
}
return binary->config;
}
static const char *scratch_rsrc_dword0_symbol =
"SCRATCH_RSRC_DWORD0";
static const char *scratch_rsrc_dword1_symbol =
"SCRATCH_RSRC_DWORD1";
void ac_shader_binary_read_config(struct ac_shader_binary *binary,
struct ac_shader_config *conf,
unsigned symbol_offset)
{
unsigned i;
const unsigned char *config =
ac_shader_binary_config_start(binary, symbol_offset);
bool really_needs_scratch = false;
/* LLVM adds SGPR spills to the scratch size.
* Find out if we really need the scratch buffer.
*/
for (i = 0; i < binary->reloc_count; i++) {
const struct ac_shader_reloc *reloc = &binary->relocs[i];
if (!strcmp(scratch_rsrc_dword0_symbol, reloc->name) ||
!strcmp(scratch_rsrc_dword1_symbol, reloc->name)) {
really_needs_scratch = true;
break;
}
}
for (i = 0; i < binary->config_size_per_symbol; i+= 8) {
unsigned reg = util_le32_to_cpu(*(uint32_t*)(config + i));
unsigned value = util_le32_to_cpu(*(uint32_t*)(config + i + 4));
switch (reg) {
case R_00B028_SPI_SHADER_PGM_RSRC1_PS:
case R_00B128_SPI_SHADER_PGM_RSRC1_VS:
case R_00B228_SPI_SHADER_PGM_RSRC1_GS:
case R_00B848_COMPUTE_PGM_RSRC1:
conf->num_sgprs = MAX2(conf->num_sgprs, (G_00B028_SGPRS(value) + 1) * 8);
conf->num_vgprs = MAX2(conf->num_vgprs, (G_00B028_VGPRS(value) + 1) * 4);
conf->float_mode = G_00B028_FLOAT_MODE(value);
break;
case R_00B02C_SPI_SHADER_PGM_RSRC2_PS:
conf->lds_size = MAX2(conf->lds_size, G_00B02C_EXTRA_LDS_SIZE(value));
break;
case R_00B84C_COMPUTE_PGM_RSRC2:
conf->lds_size = MAX2(conf->lds_size, G_00B84C_LDS_SIZE(value));
break;
case R_0286CC_SPI_PS_INPUT_ENA:
conf->spi_ps_input_ena = value;
break;
case R_0286D0_SPI_PS_INPUT_ADDR:
conf->spi_ps_input_addr = value;
break;
case R_0286E8_SPI_TMPRING_SIZE:
case R_00B860_COMPUTE_TMPRING_SIZE:
/* WAVESIZE is in units of 256 dwords. */
if (really_needs_scratch)
conf->scratch_bytes_per_wave =
G_00B860_WAVESIZE(value) * 256 * 4;
break;
case SPILLED_SGPRS:
conf->spilled_sgprs = value;
break;
case SPILLED_VGPRS:
conf->spilled_vgprs = value;
break;
default:
{
static bool printed;
if (!printed) {
fprintf(stderr, "Warning: LLVM emitted unknown "
"config register: 0x%x\n", reg);
printed = true;
}
}
break;
}
if (!conf->spi_ps_input_addr)
conf->spi_ps_input_addr = conf->spi_ps_input_ena;
}
}

View File

@@ -1,88 +0,0 @@
/*
* Copyright 2014 Advanced Micro Devices, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
* Authors: Tom Stellard <thomas.stellard@amd.com>
*
*/
#pragma once
#include <stdint.h>
struct ac_shader_reloc {
char name[32];
uint64_t offset;
};
struct ac_shader_binary {
/** Shader code */
unsigned char *code;
unsigned code_size;
/** Config/Context register state that accompanies this shader.
* This is a stream of dword pairs. First dword contains the
* register address, the second dword contains the value.*/
unsigned char *config;
unsigned config_size;
/** The number of bytes of config information for each global symbol.
*/
unsigned config_size_per_symbol;
/** Constant data accessed by the shader. This will be uploaded
* into a constant buffer. */
unsigned char *rodata;
unsigned rodata_size;
/** List of symbol offsets for the shader */
uint64_t *global_symbol_offsets;
unsigned global_symbol_count;
struct ac_shader_reloc *relocs;
unsigned reloc_count;
/** Disassembled shader in a string. */
char *disasm_string;
};
struct ac_shader_config {
unsigned num_sgprs;
unsigned num_vgprs;
unsigned spilled_sgprs;
unsigned spilled_vgprs;
unsigned lds_size;
unsigned spi_ps_input_ena;
unsigned spi_ps_input_addr;
unsigned float_mode;
unsigned scratch_bytes_per_wave;
};
/*
* Parse the elf binary stored in \p elf_data and create a
* ac_shader_binary object.
*/
void ac_elf_read(const char *elf_data, unsigned elf_size,
struct ac_shader_binary *binary);
void ac_shader_binary_read_config(struct ac_shader_binary *binary,
struct ac_shader_config *conf,
unsigned symbol_offset);

View File

@@ -1,46 +0,0 @@
/*
* Copyright 2014 Advanced Micro Devices, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sub license, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial portions
* of the Software.
*
*/
/* based on Marek's patch to lp_bld_misc.cpp */
// Workaround http://llvm.org/PR23628
#if HAVE_LLVM >= 0x0307
# pragma push_macro("DEBUG")
# undef DEBUG
#endif
#include "ac_nir_to_llvm.h"
#include <llvm-c/Core.h>
#include <llvm/Target/TargetOptions.h>
#include <llvm/ExecutionEngine/ExecutionEngine.h>
extern "C" void
ac_add_attr_dereferenceable(LLVMValueRef val, uint64_t bytes)
{
llvm::Argument *A = llvm::unwrap<llvm::Argument>(val);
llvm::AttrBuilder B;
B.addDereferenceableAttr(bytes);
A->addAttr(llvm::AttributeSet::get(A->getContext(), A->getArgNo() + 1, B));
}

View File

@@ -1,142 +0,0 @@
/*
* Copyright 2014 Advanced Micro Devices, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sub license, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial portions
* of the Software.
*
*/
/* based on pieces from si_pipe.c and radeon_llvm_emit.c */
#include "ac_llvm_util.h"
#include <llvm-c/Core.h>
#include "c11/threads.h"
#include <assert.h>
#include <stdio.h>
static void ac_init_llvm_target()
{
#if HAVE_LLVM < 0x0307
LLVMInitializeR600TargetInfo();
LLVMInitializeR600Target();
LLVMInitializeR600TargetMC();
LLVMInitializeR600AsmPrinter();
#else
LLVMInitializeAMDGPUTargetInfo();
LLVMInitializeAMDGPUTarget();
LLVMInitializeAMDGPUTargetMC();
LLVMInitializeAMDGPUAsmPrinter();
#endif
}
static once_flag ac_init_llvm_target_once_flag = ONCE_FLAG_INIT;
static LLVMTargetRef ac_get_llvm_target(const char *triple)
{
LLVMTargetRef target = NULL;
char *err_message = NULL;
call_once(&ac_init_llvm_target_once_flag, ac_init_llvm_target);
if (LLVMGetTargetFromTriple(triple, &target, &err_message)) {
fprintf(stderr, "Cannot find target for triple %s ", triple);
if (err_message) {
fprintf(stderr, "%s\n", err_message);
}
LLVMDisposeMessage(err_message);
return NULL;
}
return target;
}
static const char *ac_get_llvm_processor_name(enum radeon_family family)
{
switch (family) {
case CHIP_TAHITI:
return "tahiti";
case CHIP_PITCAIRN:
return "pitcairn";
case CHIP_VERDE:
return "verde";
case CHIP_OLAND:
return "oland";
case CHIP_HAINAN:
return "hainan";
case CHIP_BONAIRE:
return "bonaire";
case CHIP_KABINI:
return "kabini";
case CHIP_KAVERI:
return "kaveri";
case CHIP_HAWAII:
return "hawaii";
case CHIP_MULLINS:
return "mullins";
case CHIP_TONGA:
return "tonga";
case CHIP_ICELAND:
return "iceland";
case CHIP_CARRIZO:
return "carrizo";
#if HAVE_LLVM <= 0x0307
case CHIP_FIJI:
return "tonga";
case CHIP_STONEY:
return "carrizo";
#else
case CHIP_FIJI:
return "fiji";
case CHIP_STONEY:
return "stoney";
#endif
#if HAVE_LLVM <= 0x0308
case CHIP_POLARIS10:
return "tonga";
case CHIP_POLARIS11:
return "tonga";
#else
case CHIP_POLARIS10:
return "polaris10";
case CHIP_POLARIS11:
return "polaris11";
#endif
default:
return "";
}
}
LLVMTargetMachineRef ac_create_target_machine(enum radeon_family family)
{
assert(family >= CHIP_TAHITI);
const char *triple = "amdgcn--";
LLVMTargetRef target = ac_get_llvm_target(triple);
LLVMTargetMachineRef tm = LLVMCreateTargetMachine(
target,
triple,
ac_get_llvm_processor_name(family),
"+DumpCode,+vgpr-spilling",
LLVMCodeGenLevelDefault,
LLVMRelocDefault,
LLVMCodeModelDefault);
return tm;
}

View File

@@ -1,31 +0,0 @@
/*
* Copyright 2016 Bas Nieuwenhuizen
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sub license, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial portions
* of the Software.
*
*/
#pragma once
#include <llvm-c/TargetMachine.h>
#include "amd_family.h"
LLVMTargetMachineRef ac_create_target_machine(enum radeon_family family);

File diff suppressed because it is too large Load Diff

View File

@@ -1,119 +0,0 @@
/*
* Copyright © 2016 Bas Nieuwenhuizen
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#pragma once
#include <stdbool.h>
#include "llvm-c/Core.h"
#include "llvm-c/TargetMachine.h"
#include "amd_family.h"
struct ac_shader_binary;
struct ac_shader_config;
struct nir_shader;
struct radv_pipeline_layout;
struct ac_vs_variant_key {
uint32_t instance_rate_inputs;
};
struct ac_fs_variant_key {
uint32_t col_format;
uint32_t is_int8;
};
union ac_shader_variant_key {
struct ac_vs_variant_key vs;
struct ac_fs_variant_key fs;
};
struct ac_nir_compiler_options {
struct radv_pipeline_layout *layout;
union ac_shader_variant_key key;
bool unsafe_math;
enum radeon_family family;
enum chip_class chip_class;
};
struct ac_shader_variant_info {
unsigned num_user_sgprs;
unsigned num_input_sgprs;
unsigned num_input_vgprs;
union {
struct {
unsigned param_exports;
unsigned pos_exports;
unsigned vgpr_comp_cnt;
uint32_t export_mask;
bool writes_pointsize;
uint8_t clip_dist_mask;
uint8_t cull_dist_mask;
} vs;
struct {
unsigned num_interp;
uint32_t input_mask;
unsigned output_mask;
uint32_t flat_shaded_mask;
bool has_pcoord;
bool can_discard;
bool writes_z;
bool writes_stencil;
bool early_fragment_test;
bool writes_memory;
} fs;
struct {
unsigned block_size[3];
} cs;
};
};
void ac_compile_nir_shader(LLVMTargetMachineRef tm,
struct ac_shader_binary *binary,
struct ac_shader_config *config,
struct ac_shader_variant_info *shader_info,
struct nir_shader *nir,
const struct ac_nir_compiler_options *options,
bool dump_shader);
/* SHADER ABI defines */
/* offset in dwords */
#define AC_USERDATA_DESCRIPTOR_SET_0 0
#define AC_USERDATA_DESCRIPTOR_SET_1 2
#define AC_USERDATA_DESCRIPTOR_SET_2 4
#define AC_USERDATA_DESCRIPTOR_SET_3 6
#define AC_USERDATA_PUSH_CONST_DYN 8
#define AC_USERDATA_VS_VERTEX_BUFFERS 10
#define AC_USERDATA_VS_BASE_VERTEX 12
#define AC_USERDATA_VS_START_INSTANCE 13
#define AC_USERDATA_PS_SAMPLE_POS 10
#define AC_USERDATA_CS_GRID_SIZE 10
#ifdef __cplusplus
extern "C"
#endif
void ac_add_attr_dereferenceable(LLVMValueRef val, uint64_t bytes);

View File

@@ -1,111 +0,0 @@
/*
* Copyright 2008 Corbin Simpson <MostAwesomeDude@gmail.com>
* Copyright 2010 Marek Olšák <maraeo@gmail.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE. */
#ifndef AMD_FAMILY_H
#define AMD_FAMILY_H
enum radeon_family {
CHIP_UNKNOWN = 0,
CHIP_R300, /* R3xx-based cores. */
CHIP_R350,
CHIP_RV350,
CHIP_RV370,
CHIP_RV380,
CHIP_RS400,
CHIP_RC410,
CHIP_RS480,
CHIP_R420, /* R4xx-based cores. */
CHIP_R423,
CHIP_R430,
CHIP_R480,
CHIP_R481,
CHIP_RV410,
CHIP_RS600,
CHIP_RS690,
CHIP_RS740,
CHIP_RV515, /* R5xx-based cores. */
CHIP_R520,
CHIP_RV530,
CHIP_R580,
CHIP_RV560,
CHIP_RV570,
CHIP_R600,
CHIP_RV610,
CHIP_RV630,
CHIP_RV670,
CHIP_RV620,
CHIP_RV635,
CHIP_RS780,
CHIP_RS880,
CHIP_RV770,
CHIP_RV730,
CHIP_RV710,
CHIP_RV740,
CHIP_CEDAR,
CHIP_REDWOOD,
CHIP_JUNIPER,
CHIP_CYPRESS,
CHIP_HEMLOCK,
CHIP_PALM,
CHIP_SUMO,
CHIP_SUMO2,
CHIP_BARTS,
CHIP_TURKS,
CHIP_CAICOS,
CHIP_CAYMAN,
CHIP_ARUBA,
CHIP_TAHITI,
CHIP_PITCAIRN,
CHIP_VERDE,
CHIP_OLAND,
CHIP_HAINAN,
CHIP_BONAIRE,
CHIP_KAVERI,
CHIP_KABINI,
CHIP_HAWAII,
CHIP_MULLINS,
CHIP_TONGA,
CHIP_ICELAND,
CHIP_CARRIZO,
CHIP_FIJI,
CHIP_STONEY,
CHIP_POLARIS10,
CHIP_POLARIS11,
CHIP_LAST,
};
enum chip_class {
CLASS_UNKNOWN = 0,
R300,
R400,
R500,
R600,
R700,
EVERGREEN,
CAYMAN,
SI,
CIK,
VI,
};
#endif

View File

@@ -1,534 +0,0 @@
/*
* Copyright 2015,2016 Advanced Micro Devices, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*/
#ifndef AMDKERNELCODET_H
#define AMDKERNELCODET_H
//---------------------------------------------------------------------------//
// AMD Kernel Code, and its dependencies //
//---------------------------------------------------------------------------//
// Sets val bits for specified mask in specified dst packed instance.
#define AMD_HSA_BITS_SET(dst, mask, val) \
dst &= (~(1 << mask ## _SHIFT) & ~mask); \
dst |= (((val) << mask ## _SHIFT) & mask)
// Gets bits for specified mask from specified src packed instance.
#define AMD_HSA_BITS_GET(src, mask) \
((src & mask) >> mask ## _SHIFT) \
/* Every amd_*_code_t has the following properties, which are composed of
* a number of bit fields. Every bit field has a mask (AMD_CODE_PROPERTY_*),
* bit width (AMD_CODE_PROPERTY_*_WIDTH, and bit shift amount
* (AMD_CODE_PROPERTY_*_SHIFT) for convenient access. Unused bits must be 0.
*
* (Note that bit fields cannot be used as their layout is
* implementation defined in the C standard and so cannot be used to
* specify an ABI)
*/
enum amd_code_property_mask_t {
/* Enable the setup of the SGPR user data registers
* (AMD_CODE_PROPERTY_ENABLE_SGPR_*), see documentation of amd_kernel_code_t
* for initial register state.
*
* The total number of SGPRuser data registers requested must not
* exceed 16. Any requests beyond 16 will be ignored.
*
* Used to set COMPUTE_PGM_RSRC2.USER_SGPR (set to total count of
* SGPR user data registers enabled up to 16).
*/
AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_BUFFER_SHIFT = 0,
AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_BUFFER_WIDTH = 1,
AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_BUFFER = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_BUFFER_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_BUFFER_SHIFT,
AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_PTR_SHIFT = 1,
AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_PTR_WIDTH = 1,
AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_PTR = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_PTR_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_PTR_SHIFT,
AMD_CODE_PROPERTY_ENABLE_SGPR_QUEUE_PTR_SHIFT = 2,
AMD_CODE_PROPERTY_ENABLE_SGPR_QUEUE_PTR_WIDTH = 1,
AMD_CODE_PROPERTY_ENABLE_SGPR_QUEUE_PTR = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_QUEUE_PTR_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_QUEUE_PTR_SHIFT,
AMD_CODE_PROPERTY_ENABLE_SGPR_KERNARG_SEGMENT_PTR_SHIFT = 3,
AMD_CODE_PROPERTY_ENABLE_SGPR_KERNARG_SEGMENT_PTR_WIDTH = 1,
AMD_CODE_PROPERTY_ENABLE_SGPR_KERNARG_SEGMENT_PTR = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_KERNARG_SEGMENT_PTR_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_KERNARG_SEGMENT_PTR_SHIFT,
AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_ID_SHIFT = 4,
AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_ID_WIDTH = 1,
AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_ID = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_ID_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_ID_SHIFT,
AMD_CODE_PROPERTY_ENABLE_SGPR_FLAT_SCRATCH_INIT_SHIFT = 5,
AMD_CODE_PROPERTY_ENABLE_SGPR_FLAT_SCRATCH_INIT_WIDTH = 1,
AMD_CODE_PROPERTY_ENABLE_SGPR_FLAT_SCRATCH_INIT = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_FLAT_SCRATCH_INIT_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_FLAT_SCRATCH_INIT_SHIFT,
AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_SIZE_SHIFT = 6,
AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_SIZE_WIDTH = 1,
AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_SIZE = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_SIZE_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_SIZE_SHIFT,
AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_X_SHIFT = 7,
AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_X_WIDTH = 1,
AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_X = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_X_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_X_SHIFT,
AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Y_SHIFT = 8,
AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Y_WIDTH = 1,
AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Y = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Y_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Y_SHIFT,
AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Z_SHIFT = 9,
AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Z_WIDTH = 1,
AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Z = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Z_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Z_SHIFT,
AMD_CODE_PROPERTY_RESERVED1_SHIFT = 10,
AMD_CODE_PROPERTY_RESERVED1_WIDTH = 6,
AMD_CODE_PROPERTY_RESERVED1 = ((1 << AMD_CODE_PROPERTY_RESERVED1_WIDTH) - 1) << AMD_CODE_PROPERTY_RESERVED1_SHIFT,
/* Control wave ID base counter for GDS ordered-append. Used to set
* COMPUTE_DISPATCH_INITIATOR.ORDERED_APPEND_ENBL. (Not sure if
* ORDERED_APPEND_MODE also needs to be settable)
*/
AMD_CODE_PROPERTY_ENABLE_ORDERED_APPEND_GDS_SHIFT = 16,
AMD_CODE_PROPERTY_ENABLE_ORDERED_APPEND_GDS_WIDTH = 1,
AMD_CODE_PROPERTY_ENABLE_ORDERED_APPEND_GDS = ((1 << AMD_CODE_PROPERTY_ENABLE_ORDERED_APPEND_GDS_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_ORDERED_APPEND_GDS_SHIFT,
/* The interleave (swizzle) element size in bytes required by the
* code for private memory. This must be 2, 4, 8 or 16. This value
* is provided to the finalizer when it is invoked and is recorded
* here. The hardware will interleave the memory requests of each
* lane of a wavefront by this element size to ensure each
* work-item gets a distinct memory memory location. Therefore, the
* finalizer ensures that all load and store operations done to
* private memory do not exceed this size. For example, if the
* element size is 4 (32-bits or dword) and a 64-bit value must be
* loaded, the finalizer will generate two 32-bit loads. This
* ensures that the interleaving will get the work-item
* specific dword for both halves of the 64-bit value. If it just
* did a 64-bit load then it would get one dword which belonged to
* its own work-item, but the second dword would belong to the
* adjacent lane work-item since the interleaving is in dwords.
*
* The value used must match the value that the runtime configures
* the GPU flat scratch (SH_STATIC_MEM_CONFIG.ELEMENT_SIZE). This
* is generally DWORD.
*
* USE VALUES FROM THE AMD_ELEMENT_BYTE_SIZE_T ENUM.
*/
AMD_CODE_PROPERTY_PRIVATE_ELEMENT_SIZE_SHIFT = 17,
AMD_CODE_PROPERTY_PRIVATE_ELEMENT_SIZE_WIDTH = 2,
AMD_CODE_PROPERTY_PRIVATE_ELEMENT_SIZE = ((1 << AMD_CODE_PROPERTY_PRIVATE_ELEMENT_SIZE_WIDTH) - 1) << AMD_CODE_PROPERTY_PRIVATE_ELEMENT_SIZE_SHIFT,
/* Are global memory addresses 64 bits. Must match
* amd_kernel_code_t.hsail_machine_model ==
* HSA_MACHINE_LARGE. Must also match
* SH_MEM_CONFIG.PTR32 (GFX6 (SI)/GFX7 (CI)),
* SH_MEM_CONFIG.ADDRESS_MODE (GFX8 (VI)+).
*/
AMD_CODE_PROPERTY_IS_PTR64_SHIFT = 19,
AMD_CODE_PROPERTY_IS_PTR64_WIDTH = 1,
AMD_CODE_PROPERTY_IS_PTR64 = ((1 << AMD_CODE_PROPERTY_IS_PTR64_WIDTH) - 1) << AMD_CODE_PROPERTY_IS_PTR64_SHIFT,
/* Indicate if the generated ISA is using a dynamically sized call
* stack. This can happen if calls are implemented using a call
* stack and recursion, alloca or calls to indirect functions are
* present. In these cases the Finalizer cannot compute the total
* private segment size at compile time. In this case the
* workitem_private_segment_byte_size only specifies the statically
* know private segment size, and additional space must be added
* for the call stack.
*/
AMD_CODE_PROPERTY_IS_DYNAMIC_CALLSTACK_SHIFT = 20,
AMD_CODE_PROPERTY_IS_DYNAMIC_CALLSTACK_WIDTH = 1,
AMD_CODE_PROPERTY_IS_DYNAMIC_CALLSTACK = ((1 << AMD_CODE_PROPERTY_IS_DYNAMIC_CALLSTACK_WIDTH) - 1) << AMD_CODE_PROPERTY_IS_DYNAMIC_CALLSTACK_SHIFT,
/* Indicate if code generated has support for debugging. */
AMD_CODE_PROPERTY_IS_DEBUG_SUPPORTED_SHIFT = 21,
AMD_CODE_PROPERTY_IS_DEBUG_SUPPORTED_WIDTH = 1,
AMD_CODE_PROPERTY_IS_DEBUG_SUPPORTED = ((1 << AMD_CODE_PROPERTY_IS_DEBUG_SUPPORTED_WIDTH) - 1) << AMD_CODE_PROPERTY_IS_DEBUG_SUPPORTED_SHIFT,
AMD_CODE_PROPERTY_IS_XNACK_SUPPORTED_SHIFT = 22,
AMD_CODE_PROPERTY_IS_XNACK_SUPPORTED_WIDTH = 1,
AMD_CODE_PROPERTY_IS_XNACK_SUPPORTED = ((1 << AMD_CODE_PROPERTY_IS_XNACK_SUPPORTED_WIDTH) - 1) << AMD_CODE_PROPERTY_IS_XNACK_SUPPORTED_SHIFT,
AMD_CODE_PROPERTY_RESERVED2_SHIFT = 23,
AMD_CODE_PROPERTY_RESERVED2_WIDTH = 9,
AMD_CODE_PROPERTY_RESERVED2 = ((1 << AMD_CODE_PROPERTY_RESERVED2_WIDTH) - 1) << AMD_CODE_PROPERTY_RESERVED2_SHIFT
};
/* AMD Kernel Code Object (amd_kernel_code_t). GPU CP uses the AMD Kernel
* Code Object to set up the hardware to execute the kernel dispatch.
*
* Initial Kernel Register State.
*
* Initial kernel register state will be set up by CP/SPI prior to the start
* of execution of every wavefront. This is limited by the constraints of the
* current hardware.
*
* The order of the SGPR registers is defined, but the Finalizer can specify
* which ones are actually setup in the amd_kernel_code_t object using the
* enable_sgpr_* bit fields. The register numbers used for enabled registers
* are dense starting at SGPR0: the first enabled register is SGPR0, the next
* enabled register is SGPR1 etc.; disabled registers do not have an SGPR
* number.
*
* The initial SGPRs comprise up to 16 User SRGPs that are set up by CP and
* apply to all waves of the grid. It is possible to specify more than 16 User
* SGPRs using the enable_sgpr_* bit fields, in which case only the first 16
* are actually initialized. These are then immediately followed by the System
* SGPRs that are set up by ADC/SPI and can have different values for each wave
* of the grid dispatch.
*
* SGPR register initial state is defined as follows:
*
* Private Segment Buffer (enable_sgpr_private_segment_buffer):
* Number of User SGPR registers: 4. V# that can be used, together with
* Scratch Wave Offset as an offset, to access the Private/Spill/Arg
* segments using a segment address. It must be set as follows:
* - Base address: of the scratch memory area used by the dispatch. It
* does not include the scratch wave offset. It will be the per process
* SH_HIDDEN_PRIVATE_BASE_VMID plus any offset from this dispatch (for
* example there may be a per pipe offset, or per AQL Queue offset).
* - Stride + data_format: Element Size * Index Stride (???)
* - Cache swizzle: ???
* - Swizzle enable: SH_STATIC_MEM_CONFIG.SWIZZLE_ENABLE (must be 1 for
* scratch)
* - Num records: Flat Scratch Work Item Size / Element Size (???)
* - Dst_sel_*: ???
* - Num_format: ???
* - Element_size: SH_STATIC_MEM_CONFIG.ELEMENT_SIZE (will be DWORD, must
* agree with amd_kernel_code_t.privateElementSize)
* - Index_stride: SH_STATIC_MEM_CONFIG.INDEX_STRIDE (will be 64 as must
* be number of wavefront lanes for scratch, must agree with
* amd_kernel_code_t.wavefrontSize)
* - Add tid enable: 1
* - ATC: from SH_MEM_CONFIG.PRIVATE_ATC,
* - Hash_enable: ???
* - Heap: ???
* - Mtype: from SH_STATIC_MEM_CONFIG.PRIVATE_MTYPE
* - Type: 0 (a buffer) (???)
*
* Dispatch Ptr (enable_sgpr_dispatch_ptr):
* Number of User SGPR registers: 2. 64 bit address of AQL dispatch packet
* for kernel actually executing.
*
* Queue Ptr (enable_sgpr_queue_ptr):
* Number of User SGPR registers: 2. 64 bit address of AmdQueue object for
* AQL queue on which the dispatch packet was queued.
*
* Kernarg Segment Ptr (enable_sgpr_kernarg_segment_ptr):
* Number of User SGPR registers: 2. 64 bit address of Kernarg segment. This
* is directly copied from the kernargPtr in the dispatch packet. Having CP
* load it once avoids loading it at the beginning of every wavefront.
*
* Dispatch Id (enable_sgpr_dispatch_id):
* Number of User SGPR registers: 2. 64 bit Dispatch ID of the dispatch
* packet being executed.
*
* Flat Scratch Init (enable_sgpr_flat_scratch_init):
* Number of User SGPR registers: 2. This is 2 SGPRs.
*
* For CI/VI:
* The first SGPR is a 32 bit byte offset from SH_MEM_HIDDEN_PRIVATE_BASE
* to base of memory for scratch for this dispatch. This is the same offset
* used in computing the Scratch Segment Buffer base address. The value of
* Scratch Wave Offset must be added by the kernel code and moved to
* SGPRn-4 for use as the FLAT SCRATCH BASE in flat memory instructions.
*
* The second SGPR is 32 bit byte size of a single work-item's scratch
* memory usage. This is directly loaded from the dispatch packet Private
* Segment Byte Size and rounded up to a multiple of DWORD.
*
* \todo [Does CP need to round this to >4 byte alignment?]
*
* The kernel code must move to SGPRn-3 for use as the FLAT SCRATCH SIZE in
* flat memory instructions. Having CP load it once avoids loading it at
* the beginning of every wavefront.
*
* Private Segment Size (enable_sgpr_private_segment_size):
* Number of User SGPR registers: 1. The 32 bit byte size of a single
* work-item's scratch memory allocation. This is the value from the dispatch
* packet. Private Segment Byte Size rounded up by CP to a multiple of DWORD.
*
* \todo [Does CP need to round this to >4 byte alignment?]
*
* Having CP load it once avoids loading it at the beginning of every
* wavefront.
*
* \todo [This will not be used for CI/VI since it is the same value as
* the second SGPR of Flat Scratch Init.
*
* Grid Work-Group Count X (enable_sgpr_grid_workgroup_count_x):
* Number of User SGPR registers: 1. 32 bit count of the number of
* work-groups in the X dimension for the grid being executed. Computed from
* the fields in the HsaDispatchPacket as
* ((gridSize.x+workgroupSize.x-1)/workgroupSize.x).
*
* Grid Work-Group Count Y (enable_sgpr_grid_workgroup_count_y):
* Number of User SGPR registers: 1. 32 bit count of the number of
* work-groups in the Y dimension for the grid being executed. Computed from
* the fields in the HsaDispatchPacket as
* ((gridSize.y+workgroupSize.y-1)/workgroupSize.y).
*
* Only initialized if <16 previous SGPRs initialized.
*
* Grid Work-Group Count Z (enable_sgpr_grid_workgroup_count_z):
* Number of User SGPR registers: 1. 32 bit count of the number of
* work-groups in the Z dimension for the grid being executed. Computed
* from the fields in the HsaDispatchPacket as
* ((gridSize.z+workgroupSize.z-1)/workgroupSize.z).
*
* Only initialized if <16 previous SGPRs initialized.
*
* Work-Group Id X (enable_sgpr_workgroup_id_x):
* Number of System SGPR registers: 1. 32 bit work group id in X dimension
* of grid for wavefront. Always present.
*
* Work-Group Id Y (enable_sgpr_workgroup_id_y):
* Number of System SGPR registers: 1. 32 bit work group id in Y dimension
* of grid for wavefront.
*
* Work-Group Id Z (enable_sgpr_workgroup_id_z):
* Number of System SGPR registers: 1. 32 bit work group id in Z dimension
* of grid for wavefront. If present then Work-group Id Y will also be
* present
*
* Work-Group Info (enable_sgpr_workgroup_info):
* Number of System SGPR registers: 1. {first_wave, 14'b0000,
* ordered_append_term[10:0], threadgroup_size_in_waves[5:0]}
*
* Private Segment Wave Byte Offset
* (enable_sgpr_private_segment_wave_byte_offset):
* Number of System SGPR registers: 1. 32 bit byte offset from base of
* dispatch scratch base. Must be used as an offset with Private/Spill/Arg
* segment address when using Scratch Segment Buffer. It must be added to
* Flat Scratch Offset if setting up FLAT SCRATCH for flat addressing.
*
*
* The order of the VGPR registers is defined, but the Finalizer can specify
* which ones are actually setup in the amd_kernel_code_t object using the
* enableVgpr* bit fields. The register numbers used for enabled registers
* are dense starting at VGPR0: the first enabled register is VGPR0, the next
* enabled register is VGPR1 etc.; disabled registers do not have an VGPR
* number.
*
* VGPR register initial state is defined as follows:
*
* Work-Item Id X (always initialized):
* Number of registers: 1. 32 bit work item id in X dimension of work-group
* for wavefront lane.
*
* Work-Item Id X (enable_vgpr_workitem_id > 0):
* Number of registers: 1. 32 bit work item id in Y dimension of work-group
* for wavefront lane.
*
* Work-Item Id X (enable_vgpr_workitem_id > 0):
* Number of registers: 1. 32 bit work item id in Z dimension of work-group
* for wavefront lane.
*
*
* The setting of registers is being done by existing GPU hardware as follows:
* 1) SGPRs before the Work-Group Ids are set by CP using the 16 User Data
* registers.
* 2) Work-group Id registers X, Y, Z are set by SPI which supports any
* combination including none.
* 3) Scratch Wave Offset is also set by SPI which is why its value cannot
* be added into the value Flat Scratch Offset which would avoid the
* Finalizer generated prolog having to do the add.
* 4) The VGPRs are set by SPI which only supports specifying either (X),
* (X, Y) or (X, Y, Z).
*
* Flat Scratch Dispatch Offset and Flat Scratch Size are adjacent SGRRs so
* they can be moved as a 64 bit value to the hardware required SGPRn-3 and
* SGPRn-4 respectively using the Finalizer ?FLAT_SCRATCH? Register.
*
* The global segment can be accessed either using flat operations or buffer
* operations. If buffer operations are used then the Global Buffer used to
* access HSAIL Global/Readonly/Kernarg (which are combine) segments using a
* segment address is not passed into the kernel code by CP since its base
* address is always 0. Instead the Finalizer generates prolog code to
* initialize 4 SGPRs with a V# that has the following properties, and then
* uses that in the buffer instructions:
* - base address of 0
* - no swizzle
* - ATC=1
* - MTYPE set to support memory coherence specified in
* amd_kernel_code_t.globalMemoryCoherence
*
* When the Global Buffer is used to access the Kernarg segment, must add the
* dispatch packet kernArgPtr to a kernarg segment address before using this V#.
* Alternatively scalar loads can be used if the kernarg offset is uniform, as
* the kernarg segment is constant for the duration of the kernel execution.
*/
typedef struct amd_kernel_code_s {
uint32_t amd_kernel_code_version_major;
uint32_t amd_kernel_code_version_minor;
uint16_t amd_machine_kind;
uint16_t amd_machine_version_major;
uint16_t amd_machine_version_minor;
uint16_t amd_machine_version_stepping;
/* Byte offset (possibly negative) from start of amd_kernel_code_t
* object to kernel's entry point instruction. The actual code for
* the kernel is required to be 256 byte aligned to match hardware
* requirements (SQ cache line is 16). The code must be position
* independent code (PIC) for AMD devices to give runtime the
* option of copying code to discrete GPU memory or APU L2
* cache. The Finalizer should endeavour to allocate all kernel
* machine code in contiguous memory pages so that a device
* pre-fetcher will tend to only pre-fetch Kernel Code objects,
* improving cache performance.
*/
int64_t kernel_code_entry_byte_offset;
/* Range of bytes to consider prefetching expressed as an offset
* and size. The offset is from the start (possibly negative) of
* amd_kernel_code_t object. Set both to 0 if no prefetch
* information is available.
*/
int64_t kernel_code_prefetch_byte_offset;
uint64_t kernel_code_prefetch_byte_size;
/* Number of bytes of scratch backing memory required for full
* occupancy of target chip. This takes into account the number of
* bytes of scratch per work-item, the wavefront size, the maximum
* number of wavefronts per CU, and the number of CUs. This is an
* upper limit on scratch. If the grid being dispatched is small it
* may only need less than this. If the kernel uses no scratch, or
* the Finalizer has not computed this value, it must be 0.
*/
uint64_t max_scratch_backing_memory_byte_size;
/* Shader program settings for CS. Contains COMPUTE_PGM_RSRC1 and
* COMPUTE_PGM_RSRC2 registers.
*/
uint64_t compute_pgm_resource_registers;
/* Code properties. See amd_code_property_mask_t for a full list of
* properties.
*/
uint32_t code_properties;
/* The amount of memory required for the combined private, spill
* and arg segments for a work-item in bytes. If
* is_dynamic_callstack is 1 then additional space must be added to
* this value for the call stack.
*/
uint32_t workitem_private_segment_byte_size;
/* The amount of group segment memory required by a work-group in
* bytes. This does not include any dynamically allocated group
* segment memory that may be added when the kernel is
* dispatched.
*/
uint32_t workgroup_group_segment_byte_size;
/* Number of byte of GDS required by kernel dispatch. Must be 0 if
* not using GDS.
*/
uint32_t gds_segment_byte_size;
/* The size in bytes of the kernarg segment that holds the values
* of the arguments to the kernel. This could be used by CP to
* prefetch the kernarg segment pointed to by the dispatch packet.
*/
uint64_t kernarg_segment_byte_size;
/* Number of fbarrier's used in the kernel and all functions it
* calls. If the implementation uses group memory to allocate the
* fbarriers then that amount must already be included in the
* workgroup_group_segment_byte_size total.
*/
uint32_t workgroup_fbarrier_count;
/* Number of scalar registers used by a wavefront. This includes
* the special SGPRs for VCC, Flat Scratch Base, Flat Scratch Size
* and XNACK (for GFX8 (VI)). It does not include the 16 SGPR added if a
* trap handler is enabled. Used to set COMPUTE_PGM_RSRC1.SGPRS.
*/
uint16_t wavefront_sgpr_count;
/* Number of vector registers used by each work-item. Used to set
* COMPUTE_PGM_RSRC1.VGPRS.
*/
uint16_t workitem_vgpr_count;
/* If reserved_vgpr_count is 0 then must be 0. Otherwise, this is the
* first fixed VGPR number reserved.
*/
uint16_t reserved_vgpr_first;
/* The number of consecutive VGPRs reserved by the client. If
* is_debug_supported then this count includes VGPRs reserved
* for debugger use.
*/
uint16_t reserved_vgpr_count;
/* If reserved_sgpr_count is 0 then must be 0. Otherwise, this is the
* first fixed SGPR number reserved.
*/
uint16_t reserved_sgpr_first;
/* The number of consecutive SGPRs reserved by the client. If
* is_debug_supported then this count includes SGPRs reserved
* for debugger use.
*/
uint16_t reserved_sgpr_count;
/* If is_debug_supported is 0 then must be 0. Otherwise, this is the
* fixed SGPR number used to hold the wave scratch offset for the
* entire kernel execution, or uint16_t(-1) if the register is not
* used or not known.
*/
uint16_t debug_wavefront_private_segment_offset_sgpr;
/* If is_debug_supported is 0 then must be 0. Otherwise, this is the
* fixed SGPR number of the first of 4 SGPRs used to hold the
* scratch V# used for the entire kernel execution, or uint16_t(-1)
* if the registers are not used or not known.
*/
uint16_t debug_private_segment_buffer_sgpr;
/* The maximum byte alignment of variables used by the kernel in
* the specified memory segment. Expressed as a power of two. Must
* be at least HSA_POWERTWO_16.
*/
uint8_t kernarg_segment_alignment;
uint8_t group_segment_alignment;
uint8_t private_segment_alignment;
/* Wavefront size expressed as a power of two. Must be a power of 2
* in range 1..64 inclusive. Used to support runtime query that
* obtains wavefront size, which may be used by application to
* allocated dynamic group memory and set the dispatch work-group
* size.
*/
uint8_t wavefront_size;
int32_t call_convention;
uint8_t reserved3[12];
uint64_t runtime_loader_kernel_symbol;
uint64_t control_directives[16];
} amd_kernel_code_t;
#endif // AMDKERNELCODET_H

View File

@@ -1,7 +0,0 @@
# Generated source files
/radv_entrypoints.c
/radv_entrypoints.h
/radv_timestamp.h
/dev_icd.json
/vk_format_table.c
/radeon_icd.*.json

View File

@@ -1,172 +0,0 @@
# Copyright © 2016 Red Hat
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice (including the next
# paragraph) shall be included in all copies or substantial portions of the
# Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
include Makefile.sources
vulkan_includedir = $(includedir)/vulkan
vulkan_include_HEADERS = \
$(top_srcdir)/include/vulkan/vk_platform.h \
$(top_srcdir)/include/vulkan/vulkan.h
lib_LTLIBRARIES = libvulkan_radeon.la
# The gallium includes are for the util/u_math.h include from main/macros.h
AM_CPPFLAGS = \
$(AMDGPU_CFLAGS) \
$(VALGRIND_CFLAGS) \
$(DEFINES) \
-I$(top_srcdir)/include \
-I$(top_builddir)/src \
-I$(top_srcdir)/src \
-I$(top_srcdir)/src/vulkan/wsi \
-I$(top_srcdir)/src/amd \
-I$(top_srcdir)/src/amd/common \
-I$(top_builddir)/src/compiler \
-I$(top_builddir)/src/compiler/nir \
-I$(top_srcdir)/src/compiler \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mesa \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/gallium/include
AM_CFLAGS = \
$(VISIBILITY_CFLAGS) \
$(PTHREAD_CFLAGS) \
$(LLVM_CFLAGS)
VULKAN_SOURCES = \
$(VULKAN_GENERATED_FILES) \
$(VULKAN_FILES)
VULKAN_LIB_DEPS =
if HAVE_PLATFORM_X11
AM_CPPFLAGS += \
$(XCB_DRI3_CFLAGS) \
-DVK_USE_PLATFORM_XCB_KHR \
-DVK_USE_PLATFORM_XLIB_KHR
VULKAN_SOURCES += $(VULKAN_WSI_X11_FILES)
# FIXME: Use pkg-config for X11-xcb ldflags.
VULKAN_LIB_DEPS += $(XCB_DRI3_LIBS) -lX11-xcb
endif
if HAVE_PLATFORM_WAYLAND
AM_CPPFLAGS += \
-I$(top_builddir)/src/egl/wayland/wayland-drm \
-I$(top_srcdir)/src/egl/wayland/wayland-drm \
$(WAYLAND_CFLAGS) \
-DVK_USE_PLATFORM_WAYLAND_KHR
VULKAN_SOURCES += $(VULKAN_WSI_WAYLAND_FILES)
VULKAN_LIB_DEPS += \
$(top_builddir)/src/egl/wayland/wayland-drm/libwayland-drm.la \
$(WAYLAND_LIBS)
endif
noinst_LTLIBRARIES = libvulkan_common.la
libvulkan_common_la_SOURCES = $(VULKAN_SOURCES)
VULKAN_LIB_DEPS += \
libvulkan_common.la \
$(top_builddir)/src/vulkan/wsi/libvulkan_wsi.la \
$(top_builddir)/src/amd/common/libamd_common.la \
$(top_builddir)/src/amd/addrlib/libamdgpu_addrlib.la \
$(top_builddir)/src/compiler/nir/libnir.la \
$(top_builddir)/src/util/libmesautil.la \
$(LLVM_LIBS) \
$(LIBELF_LIBS) \
$(PTHREAD_LIBS) \
$(AMDGPU_LIBS) \
$(LIBDRM_LIBS) \
$(PTHREAD_LIBS) \
$(DLOPEN_LIBS) \
-lm
nodist_EXTRA_libvulkan_radeon_la_SOURCES = dummy.cpp
libvulkan_radeon_la_SOURCES = $(VULKAN_GEM_FILES)
radv_entrypoints.h : radv_entrypoints_gen.py $(vulkan_include_HEADERS)
$(AM_V_GEN) cat $(vulkan_include_HEADERS) |\
$(PYTHON2) $(srcdir)/radv_entrypoints_gen.py header > $@
radv_entrypoints.c : radv_entrypoints_gen.py $(vulkan_include_HEADERS)
$(AM_V_GEN) cat $(vulkan_include_HEADERS) |\
$(PYTHON2) $(srcdir)/radv_entrypoints_gen.py code > $@
.PHONY: radv_timestamp.h
radv_timestamp.h:
@echo "Updating radv_timestamp.h"
$(AM_V_GEN) echo "#define RADV_TIMESTAMP \"$(TIMESTAMP_CMD)\"" > $@
vk_format_table.c: vk_format_table.py \
vk_format_parse.py \
vk_format_layout.csv
$(PYTHON2) $(srcdir)/vk_format_table.py $(srcdir)/vk_format_layout.csv > $@
BUILT_SOURCES = $(VULKAN_GENERATED_FILES)
CLEANFILES = $(BUILT_SOURCES) dev_icd.json radeon_icd.@host_cpu@.json
EXTRA_DIST = \
$(top_srcdir)/include/vulkan/vk_icd.h \
dev_icd.json.in \
radeon_icd.json.in \
radv_entrypoints_gen.py \
vk_format_layout.csv \
vk_format_parse.py \
vk_format_table.py
libvulkan_radeon_la_LIBADD = $(VULKAN_LIB_DEPS)
libvulkan_radeon_la_LDFLAGS = \
-shared \
-module \
-no-undefined \
-avoid-version \
$(BSYMBOLIC) \
$(LLVM_LDFLAGS) \
$(GC_SECTIONS) \
$(LD_NO_UNDEFINED)
icdconfdir = @VULKAN_ICD_INSTALL_DIR@
icdconf_DATA = radeon_icd.@host_cpu@.json
# The following is used for development purposes, by setting VK_ICD_FILENAMES.
noinst_DATA = dev_icd.json
dev_icd.json : dev_icd.json.in
$(AM_V_GEN) $(SED) \
-e "s#@build_libdir@#${abs_top_builddir}/${LIB_DIR}#" \
< $(srcdir)/dev_icd.json.in > $@
radeon_icd.@host_cpu@.json : radeon_icd.json.in
$(AM_V_GEN) $(SED) \
-e "s#@install_libdir@#${libdir}#" \
< $(srcdir)/radeon_icd.json.in > $@
include $(top_srcdir)/install-lib-links.mk

View File

@@ -1,77 +0,0 @@
# Copyright © 2016 Red Hat
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice (including the next
# paragraph) shall be included in all copies or substantial portions of the
# Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
RADV_WS_AMDGPU_FILES := \
winsys/amdgpu/radv_amdgpu_bo.c \
winsys/amdgpu/radv_amdgpu_bo.h \
winsys/amdgpu/radv_amdgpu_cs.c \
winsys/amdgpu/radv_amdgpu_cs.h \
winsys/amdgpu/radv_amdgpu_surface.c \
winsys/amdgpu/radv_amdgpu_surface.h \
winsys/amdgpu/radv_amdgpu_winsys.c \
winsys/amdgpu/radv_amdgpu_winsys.h \
winsys/amdgpu/radv_amdgpu_winsys_public.h
VULKAN_FILES := \
radv_cmd_buffer.c \
radv_cs.h \
radv_device.c \
radv_descriptor_set.c \
radv_descriptor_set.h \
radv_formats.c \
radv_image.c \
radv_meta.c \
radv_meta.h \
radv_meta_blit.c \
radv_meta_blit2d.c \
radv_meta_buffer.c \
radv_meta_bufimage.c \
radv_meta_clear.c \
radv_meta_copy.c \
radv_meta_decompress.c \
radv_meta_fast_clear.c \
radv_meta_resolve.c \
radv_meta_resolve_cs.c \
radv_pass.c \
radv_pipeline.c \
radv_pipeline_cache.c \
radv_private.h \
radv_radeon_winsys.h \
radv_query.c \
radv_util.c \
radv_util.h \
radv_wsi.c \
si_cmd_buffer.c \
vk_format_table.c \
vk_format.h \
$(RADV_WS_AMDGPU_FILES)
VULKAN_WSI_WAYLAND_FILES := \
radv_wsi_wayland.c
VULKAN_WSI_X11_FILES := \
radv_wsi_x11.c
VULKAN_GENERATED_FILES := \
radv_entrypoints.c \
radv_entrypoints.h \
radv_timestamp.h

View File

@@ -1,7 +0,0 @@
{
"file_format_version": "1.0.0",
"ICD": {
"library_path": "@build_libdir@/libvulkan_radeon.so",
"api_version": "1.0.3"
}
}

View File

@@ -1,7 +0,0 @@
{
"file_format_version": "1.0.0",
"ICD": {
"library_path": "@install_libdir@/libvulkan_radeon.so",
"api_version": "1.0.3"
}
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,121 +0,0 @@
/*
* Copyright © 2016 Red Hat.
* Copyright © 2016 Bas Nieuwenhuizen
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#ifndef RADV_CS_H
#define RADV_CS_H
#include <string.h>
#include <stdint.h>
#include <assert.h>
#include "r600d_common.h"
static inline unsigned radeon_check_space(struct radeon_winsys *ws,
struct radeon_winsys_cs *cs,
unsigned needed)
{
if (cs->max_dw - cs->cdw < needed)
ws->cs_grow(cs, needed);
return cs->cdw + needed;
}
static inline void radeon_set_config_reg_seq(struct radeon_winsys_cs *cs, unsigned reg, unsigned num)
{
assert(reg < R600_CONTEXT_REG_OFFSET);
assert(cs->cdw + 2 + num <= cs->max_dw);
radeon_emit(cs, PKT3(PKT3_SET_CONFIG_REG, num, 0));
radeon_emit(cs, (reg - R600_CONFIG_REG_OFFSET) >> 2);
}
static inline void radeon_set_config_reg(struct radeon_winsys_cs *cs, unsigned reg, unsigned value)
{
radeon_set_config_reg_seq(cs, reg, 1);
radeon_emit(cs, value);
}
static inline void radeon_set_context_reg_seq(struct radeon_winsys_cs *cs, unsigned reg, unsigned num)
{
assert(reg >= R600_CONTEXT_REG_OFFSET);
assert(cs->cdw + 2 + num <= cs->max_dw);
radeon_emit(cs, PKT3(PKT3_SET_CONTEXT_REG, num, 0));
radeon_emit(cs, (reg - R600_CONTEXT_REG_OFFSET) >> 2);
}
static inline void radeon_set_context_reg(struct radeon_winsys_cs *cs, unsigned reg, unsigned value)
{
radeon_set_context_reg_seq(cs, reg, 1);
radeon_emit(cs, value);
}
static inline void radeon_set_context_reg_idx(struct radeon_winsys_cs *cs,
unsigned reg, unsigned idx,
unsigned value)
{
assert(reg >= R600_CONTEXT_REG_OFFSET);
assert(cs->cdw + 3 <= cs->max_dw);
radeon_emit(cs, PKT3(PKT3_SET_CONTEXT_REG, 1, 0));
radeon_emit(cs, (reg - R600_CONTEXT_REG_OFFSET) >> 2 | (idx << 28));
radeon_emit(cs, value);
}
static inline void radeon_set_sh_reg_seq(struct radeon_winsys_cs *cs, unsigned reg, unsigned num)
{
assert(reg >= SI_SH_REG_OFFSET && reg < SI_SH_REG_END);
assert(cs->cdw + 2 + num <= cs->max_dw);
radeon_emit(cs, PKT3(PKT3_SET_SH_REG, num, 0));
radeon_emit(cs, (reg - SI_SH_REG_OFFSET) >> 2);
}
static inline void radeon_set_sh_reg(struct radeon_winsys_cs *cs, unsigned reg, unsigned value)
{
radeon_set_sh_reg_seq(cs, reg, 1);
radeon_emit(cs, value);
}
static inline void radeon_set_uconfig_reg_seq(struct radeon_winsys_cs *cs, unsigned reg, unsigned num)
{
assert(reg >= CIK_UCONFIG_REG_OFFSET && reg < CIK_UCONFIG_REG_END);
assert(cs->cdw + 2 + num <= cs->max_dw);
radeon_emit(cs, PKT3(PKT3_SET_UCONFIG_REG, num, 0));
radeon_emit(cs, (reg - CIK_UCONFIG_REG_OFFSET) >> 2);
}
static inline void radeon_set_uconfig_reg(struct radeon_winsys_cs *cs, unsigned reg, unsigned value)
{
radeon_set_uconfig_reg_seq(cs, reg, 1);
radeon_emit(cs, value);
}
static inline void radeon_set_uconfig_reg_idx(struct radeon_winsys_cs *cs,
unsigned reg, unsigned idx,
unsigned value)
{
assert(reg >= CIK_UCONFIG_REG_OFFSET && reg < CIK_UCONFIG_REG_END);
assert(cs->cdw + 3 <= cs->max_dw);
radeon_emit(cs, PKT3(PKT3_SET_UCONFIG_REG, 1, 0));
radeon_emit(cs, (reg - CIK_UCONFIG_REG_OFFSET) >> 2 | (idx << 28));
radeon_emit(cs, value);
}
#endif /* RADV_CS_H */

View File

@@ -1,717 +0,0 @@
/*
* Copyright © 2016 Red Hat.
* Copyright © 2016 Bas Nieuwenhuizen
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include "util/mesa-sha1.h"
#include "radv_private.h"
#include "sid.h"
VkResult radv_CreateDescriptorSetLayout(
VkDevice _device,
const VkDescriptorSetLayoutCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkDescriptorSetLayout* pSetLayout)
{
RADV_FROM_HANDLE(radv_device, device, _device);
struct radv_descriptor_set_layout *set_layout;
assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO);
uint32_t max_binding = 0;
uint32_t immutable_sampler_count = 0;
for (uint32_t j = 0; j < pCreateInfo->bindingCount; j++) {
max_binding = MAX2(max_binding, pCreateInfo->pBindings[j].binding);
if (pCreateInfo->pBindings[j].pImmutableSamplers)
immutable_sampler_count += pCreateInfo->pBindings[j].descriptorCount;
}
size_t size = sizeof(struct radv_descriptor_set_layout) +
(max_binding + 1) * sizeof(set_layout->binding[0]) +
immutable_sampler_count * sizeof(struct radv_sampler *);
set_layout = vk_alloc2(&device->alloc, pAllocator, size, 8,
VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (!set_layout)
return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
/* We just allocate all the samplers at the end of the struct */
struct radv_sampler **samplers =
(struct radv_sampler **)&set_layout->binding[max_binding + 1];
set_layout->binding_count = max_binding + 1;
set_layout->shader_stages = 0;
set_layout->size = 0;
memset(set_layout->binding, 0, size - sizeof(struct radv_descriptor_set_layout));
uint32_t buffer_count = 0;
uint32_t dynamic_offset_count = 0;
for (uint32_t j = 0; j < pCreateInfo->bindingCount; j++) {
const VkDescriptorSetLayoutBinding *binding = &pCreateInfo->pBindings[j];
uint32_t b = binding->binding;
uint32_t alignment;
switch (binding->descriptorType) {
case VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC:
case VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC:
set_layout->binding[b].dynamic_offset_count = 1;
set_layout->dynamic_shader_stages |= binding->stageFlags;
set_layout->binding[b].size = 0;
set_layout->binding[b].buffer_count = 1;
alignment = 1;
break;
case VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER:
case VK_DESCRIPTOR_TYPE_STORAGE_BUFFER:
case VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER:
case VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER:
set_layout->binding[b].size = 16;
set_layout->binding[b].buffer_count = 1;
alignment = 16;
break;
case VK_DESCRIPTOR_TYPE_STORAGE_IMAGE:
case VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE:
case VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT:
/* main descriptor + fmask descriptor */
set_layout->binding[b].size = 64;
set_layout->binding[b].buffer_count = 1;
alignment = 32;
break;
case VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER:
/* main descriptor + fmask descriptor + sampler */
set_layout->binding[b].size = 96;
set_layout->binding[b].buffer_count = 1;
alignment = 32;
break;
case VK_DESCRIPTOR_TYPE_SAMPLER:
set_layout->binding[b].size = 16;
alignment = 16;
break;
default:
unreachable("unknown descriptor type\n");
break;
}
set_layout->size = align(set_layout->size, alignment);
assert(binding->descriptorCount > 0);
set_layout->binding[b].type = binding->descriptorType;
set_layout->binding[b].array_size = binding->descriptorCount;
set_layout->binding[b].offset = set_layout->size;
set_layout->binding[b].buffer_offset = buffer_count;
set_layout->binding[b].dynamic_offset_offset = dynamic_offset_count;
set_layout->size += binding->descriptorCount * set_layout->binding[b].size;
buffer_count += binding->descriptorCount * set_layout->binding[b].buffer_count;
dynamic_offset_count += binding->descriptorCount *
set_layout->binding[b].dynamic_offset_count;
if (binding->pImmutableSamplers) {
set_layout->binding[b].immutable_samplers = samplers;
samplers += binding->descriptorCount;
for (uint32_t i = 0; i < binding->descriptorCount; i++)
set_layout->binding[b].immutable_samplers[i] =
radv_sampler_from_handle(binding->pImmutableSamplers[i]);
} else {
set_layout->binding[b].immutable_samplers = NULL;
}
set_layout->shader_stages |= binding->stageFlags;
}
set_layout->buffer_count = buffer_count;
set_layout->dynamic_offset_count = dynamic_offset_count;
*pSetLayout = radv_descriptor_set_layout_to_handle(set_layout);
return VK_SUCCESS;
}
void radv_DestroyDescriptorSetLayout(
VkDevice _device,
VkDescriptorSetLayout _set_layout,
const VkAllocationCallbacks* pAllocator)
{
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(radv_descriptor_set_layout, set_layout, _set_layout);
if (!set_layout)
return;
vk_free2(&device->alloc, pAllocator, set_layout);
}
/*
* Pipeline layouts. These have nothing to do with the pipeline. They are
* just muttiple descriptor set layouts pasted together
*/
VkResult radv_CreatePipelineLayout(
VkDevice _device,
const VkPipelineLayoutCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkPipelineLayout* pPipelineLayout)
{
RADV_FROM_HANDLE(radv_device, device, _device);
struct radv_pipeline_layout *layout;
struct mesa_sha1 *ctx;
assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO);
layout = vk_alloc2(&device->alloc, pAllocator, sizeof(*layout), 8,
VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (layout == NULL)
return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
layout->num_sets = pCreateInfo->setLayoutCount;
unsigned dynamic_offset_count = 0;
ctx = _mesa_sha1_init();
for (uint32_t set = 0; set < pCreateInfo->setLayoutCount; set++) {
RADV_FROM_HANDLE(radv_descriptor_set_layout, set_layout,
pCreateInfo->pSetLayouts[set]);
layout->set[set].layout = set_layout;
layout->set[set].dynamic_offset_start = dynamic_offset_count;
for (uint32_t b = 0; b < set_layout->binding_count; b++) {
dynamic_offset_count += set_layout->binding[b].array_size * set_layout->binding[b].dynamic_offset_count;
}
_mesa_sha1_update(ctx, set_layout->binding,
sizeof(set_layout->binding[0]) * set_layout->binding_count);
}
layout->dynamic_offset_count = dynamic_offset_count;
layout->push_constant_size = 0;
for (unsigned i = 0; i < pCreateInfo->pushConstantRangeCount; ++i) {
const VkPushConstantRange *range = pCreateInfo->pPushConstantRanges + i;
layout->push_constant_size = MAX2(layout->push_constant_size,
range->offset + range->size);
}
layout->push_constant_size = align(layout->push_constant_size, 16);
_mesa_sha1_update(ctx, &layout->push_constant_size,
sizeof(layout->push_constant_size));
_mesa_sha1_final(ctx, layout->sha1);
*pPipelineLayout = radv_pipeline_layout_to_handle(layout);
return VK_SUCCESS;
}
void radv_DestroyPipelineLayout(
VkDevice _device,
VkPipelineLayout _pipelineLayout,
const VkAllocationCallbacks* pAllocator)
{
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(radv_pipeline_layout, pipeline_layout, _pipelineLayout);
if (!pipeline_layout)
return;
vk_free2(&device->alloc, pAllocator, pipeline_layout);
}
#define EMPTY 1
static VkResult
radv_descriptor_set_create(struct radv_device *device,
struct radv_descriptor_pool *pool,
struct radv_cmd_buffer *cmd_buffer,
const struct radv_descriptor_set_layout *layout,
struct radv_descriptor_set **out_set)
{
struct radv_descriptor_set *set;
unsigned mem_size = sizeof(struct radv_descriptor_set) +
sizeof(struct radeon_winsys_bo *) * layout->buffer_count;
set = vk_alloc2(&device->alloc, NULL, mem_size, 8,
VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (!set)
return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
memset(set, 0, mem_size);
if (layout->dynamic_offset_count) {
unsigned size = sizeof(struct radv_descriptor_range) *
layout->dynamic_offset_count;
set->dynamic_descriptors = vk_alloc2(&device->alloc, NULL, size, 8,
VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (!set->dynamic_descriptors) {
vk_free2(&device->alloc, NULL, set);
return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
}
}
set->layout = layout;
if (layout->size) {
uint32_t layout_size = align_u32(layout->size, 32);
set->size = layout->size;
if (!cmd_buffer) {
if (pool->current_offset + layout_size <= pool->size) {
set->bo = pool->bo;
set->mapped_ptr = (uint32_t*)(pool->mapped_ptr + pool->current_offset);
set->va = device->ws->buffer_get_va(set->bo) + pool->current_offset;
pool->current_offset += layout_size;
} else {
int entry = pool->free_list, prev_entry = -1;
uint32_t offset;
while (entry >= 0) {
if (pool->free_nodes[entry].size >= layout_size) {
if (prev_entry >= 0)
pool->free_nodes[prev_entry].next = pool->free_nodes[entry].next;
else
pool->free_list = pool->free_nodes[entry].next;
break;
}
prev_entry = entry;
entry = pool->free_nodes[entry].next;
}
if (entry < 0) {
vk_free2(&device->alloc, NULL, set);
return vk_error(VK_ERROR_OUT_OF_DEVICE_MEMORY);
}
offset = pool->free_nodes[entry].offset;
pool->free_nodes[entry].next = pool->full_list;
pool->full_list = entry;
set->bo = pool->bo;
set->mapped_ptr = (uint32_t*)(pool->mapped_ptr + offset);
set->va = device->ws->buffer_get_va(set->bo) + offset;
}
} else {
unsigned bo_offset;
if (!radv_cmd_buffer_upload_alloc(cmd_buffer, set->size, 32,
&bo_offset,
(void**)&set->mapped_ptr)) {
vk_free2(&device->alloc, NULL, set->dynamic_descriptors);
vk_free2(&device->alloc, NULL, set);
return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
}
set->va = device->ws->buffer_get_va(cmd_buffer->upload.upload_bo);
set->va += bo_offset;
}
}
if (pool)
list_add(&set->descriptor_pool, &pool->descriptor_sets);
else
list_inithead(&set->descriptor_pool);
for (unsigned i = 0; i < layout->binding_count; ++i) {
if (!layout->binding[i].immutable_samplers)
continue;
unsigned offset = layout->binding[i].offset / 4;
if (layout->binding[i].type == VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER)
offset += 16;
for (unsigned j = 0; j < layout->binding[i].array_size; ++j) {
struct radv_sampler* sampler = layout->binding[i].immutable_samplers[j];
memcpy(set->mapped_ptr + offset, &sampler->state, 16);
offset += layout->binding[i].size / 4;
}
}
*out_set = set;
return VK_SUCCESS;
}
static void
radv_descriptor_set_destroy(struct radv_device *device,
struct radv_descriptor_pool *pool,
struct radv_descriptor_set *set,
bool free_bo)
{
if (free_bo && set->size) {
assert(pool->full_list >= 0);
int next = pool->free_nodes[pool->full_list].next;
pool->free_nodes[pool->full_list].next = pool->free_list;
pool->free_nodes[pool->full_list].offset = (uint8_t*)set->mapped_ptr - pool->mapped_ptr;
pool->free_nodes[pool->full_list].size = align_u32(set->size, 32);
pool->free_list = pool->full_list;
pool->full_list = next;
}
if (set->dynamic_descriptors)
vk_free2(&device->alloc, NULL, set->dynamic_descriptors);
if (!list_empty(&set->descriptor_pool))
list_del(&set->descriptor_pool);
vk_free2(&device->alloc, NULL, set);
}
VkResult
radv_temp_descriptor_set_create(struct radv_device *device,
struct radv_cmd_buffer *cmd_buffer,
VkDescriptorSetLayout _layout,
VkDescriptorSet *_set)
{
RADV_FROM_HANDLE(radv_descriptor_set_layout, layout, _layout);
struct radv_descriptor_set *set;
VkResult ret;
ret = radv_descriptor_set_create(device, NULL, cmd_buffer, layout, &set);
*_set = radv_descriptor_set_to_handle(set);
return ret;
}
void
radv_temp_descriptor_set_destroy(struct radv_device *device,
VkDescriptorSet _set)
{
RADV_FROM_HANDLE(radv_descriptor_set, set, _set);
radv_descriptor_set_destroy(device, NULL, set, false);
}
VkResult radv_CreateDescriptorPool(
VkDevice _device,
const VkDescriptorPoolCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkDescriptorPool* pDescriptorPool)
{
RADV_FROM_HANDLE(radv_device, device, _device);
struct radv_descriptor_pool *pool;
unsigned max_sets = pCreateInfo->maxSets * 2;
int size = sizeof(struct radv_descriptor_pool) +
max_sets * sizeof(struct radv_descriptor_pool_free_node);
uint64_t bo_size = 0;
pool = vk_alloc2(&device->alloc, pAllocator, size, 8,
VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (!pool)
return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
memset(pool, 0, sizeof(*pool));
pool->free_list = -1;
pool->full_list = 0;
pool->free_nodes[max_sets - 1].next = -1;
pool->max_sets = max_sets;
for (int i = 0; i + 1 < max_sets; ++i)
pool->free_nodes[i].next = i + 1;
for (unsigned i = 0; i < pCreateInfo->poolSizeCount; ++i) {
switch(pCreateInfo->pPoolSizes[i].type) {
case VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC:
case VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC:
break;
case VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER:
case VK_DESCRIPTOR_TYPE_STORAGE_BUFFER:
case VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER:
case VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER:
case VK_DESCRIPTOR_TYPE_SAMPLER:
/* 32 as we may need to align for images */
bo_size += 32 * pCreateInfo->pPoolSizes[i].descriptorCount;
break;
case VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE:
case VK_DESCRIPTOR_TYPE_STORAGE_IMAGE:
case VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT:
bo_size += 64 * pCreateInfo->pPoolSizes[i].descriptorCount;
break;
case VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER:
bo_size += 96 * pCreateInfo->pPoolSizes[i].descriptorCount;
break;
default:
unreachable("unknown descriptor type\n");
break;
}
}
if (bo_size) {
pool->bo = device->ws->buffer_create(device->ws, bo_size,
32, RADEON_DOMAIN_VRAM, 0);
pool->mapped_ptr = (uint8_t*)device->ws->buffer_map(pool->bo);
}
pool->size = bo_size;
list_inithead(&pool->descriptor_sets);
*pDescriptorPool = radv_descriptor_pool_to_handle(pool);
return VK_SUCCESS;
}
void radv_DestroyDescriptorPool(
VkDevice _device,
VkDescriptorPool _pool,
const VkAllocationCallbacks* pAllocator)
{
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(radv_descriptor_pool, pool, _pool);
if (!pool)
return;
list_for_each_entry_safe(struct radv_descriptor_set, set,
&pool->descriptor_sets, descriptor_pool) {
radv_descriptor_set_destroy(device, pool, set, false);
}
if (pool->bo)
device->ws->buffer_destroy(pool->bo);
vk_free2(&device->alloc, pAllocator, pool);
}
VkResult radv_ResetDescriptorPool(
VkDevice _device,
VkDescriptorPool descriptorPool,
VkDescriptorPoolResetFlags flags)
{
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(radv_descriptor_pool, pool, descriptorPool);
list_for_each_entry_safe(struct radv_descriptor_set, set,
&pool->descriptor_sets, descriptor_pool) {
radv_descriptor_set_destroy(device, pool, set, false);
}
pool->current_offset = 0;
pool->free_list = -1;
pool->full_list = 0;
pool->free_nodes[pool->max_sets - 1].next = -1;
for (int i = 0; i + 1 < pool->max_sets; ++i)
pool->free_nodes[i].next = i + 1;
return VK_SUCCESS;
}
VkResult radv_AllocateDescriptorSets(
VkDevice _device,
const VkDescriptorSetAllocateInfo* pAllocateInfo,
VkDescriptorSet* pDescriptorSets)
{
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(radv_descriptor_pool, pool, pAllocateInfo->descriptorPool);
VkResult result = VK_SUCCESS;
uint32_t i;
struct radv_descriptor_set *set;
/* allocate a set of buffers for each shader to contain descriptors */
for (i = 0; i < pAllocateInfo->descriptorSetCount; i++) {
RADV_FROM_HANDLE(radv_descriptor_set_layout, layout,
pAllocateInfo->pSetLayouts[i]);
result = radv_descriptor_set_create(device, pool, NULL, layout, &set);
if (result != VK_SUCCESS)
break;
pDescriptorSets[i] = radv_descriptor_set_to_handle(set);
}
if (result != VK_SUCCESS)
radv_FreeDescriptorSets(_device, pAllocateInfo->descriptorPool,
i, pDescriptorSets);
return result;
}
VkResult radv_FreeDescriptorSets(
VkDevice _device,
VkDescriptorPool descriptorPool,
uint32_t count,
const VkDescriptorSet* pDescriptorSets)
{
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(radv_descriptor_pool, pool, descriptorPool);
for (uint32_t i = 0; i < count; i++) {
RADV_FROM_HANDLE(radv_descriptor_set, set, pDescriptorSets[i]);
if (set)
radv_descriptor_set_destroy(device, pool, set, true);
}
return VK_SUCCESS;
}
static void write_texel_buffer_descriptor(struct radv_device *device,
unsigned *dst,
struct radeon_winsys_bo **buffer_list,
const VkBufferView _buffer_view)
{
RADV_FROM_HANDLE(radv_buffer_view, buffer_view, _buffer_view);
memcpy(dst, buffer_view->state, 4 * 4);
*buffer_list = buffer_view->bo;
}
static void write_buffer_descriptor(struct radv_device *device,
unsigned *dst,
struct radeon_winsys_bo **buffer_list,
const VkDescriptorBufferInfo *buffer_info)
{
RADV_FROM_HANDLE(radv_buffer, buffer, buffer_info->buffer);
uint64_t va = device->ws->buffer_get_va(buffer->bo);
uint32_t range = buffer_info->range;
if (buffer_info->range == VK_WHOLE_SIZE)
range = buffer->size - buffer_info->offset;
va += buffer_info->offset + buffer->offset;
dst[0] = va;
dst[1] = S_008F04_BASE_ADDRESS_HI(va >> 32);
dst[2] = range;
dst[3] = S_008F0C_DST_SEL_X(V_008F0C_SQ_SEL_X) |
S_008F0C_DST_SEL_Y(V_008F0C_SQ_SEL_Y) |
S_008F0C_DST_SEL_Z(V_008F0C_SQ_SEL_Z) |
S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W) |
S_008F0C_NUM_FORMAT(V_008F0C_BUF_NUM_FORMAT_FLOAT) |
S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32);
*buffer_list = buffer->bo;
}
static void write_dynamic_buffer_descriptor(struct radv_device *device,
struct radv_descriptor_range *range,
struct radeon_winsys_bo **buffer_list,
const VkDescriptorBufferInfo *buffer_info)
{
RADV_FROM_HANDLE(radv_buffer, buffer, buffer_info->buffer);
uint64_t va = device->ws->buffer_get_va(buffer->bo);
unsigned size = buffer_info->range;
if (buffer_info->range == VK_WHOLE_SIZE)
size = buffer->size - buffer_info->offset;
va += buffer_info->offset + buffer->offset;
range->va = va;
range->size = size;
*buffer_list = buffer->bo;
}
static void
write_image_descriptor(struct radv_device *device,
unsigned *dst,
struct radeon_winsys_bo **buffer_list,
const VkDescriptorImageInfo *image_info)
{
RADV_FROM_HANDLE(radv_image_view, iview, image_info->imageView);
memcpy(dst, iview->descriptor, 8 * 4);
memcpy(dst + 8, iview->fmask_descriptor, 8 * 4);
*buffer_list = iview->bo;
}
static void
write_combined_image_sampler_descriptor(struct radv_device *device,
unsigned *dst,
struct radeon_winsys_bo **buffer_list,
const VkDescriptorImageInfo *image_info,
bool has_sampler)
{
RADV_FROM_HANDLE(radv_sampler, sampler, image_info->sampler);
write_image_descriptor(device, dst, buffer_list, image_info);
/* copy over sampler state */
if (has_sampler)
memcpy(dst + 16, sampler->state, 16);
}
static void
write_sampler_descriptor(struct radv_device *device,
unsigned *dst,
const VkDescriptorImageInfo *image_info)
{
RADV_FROM_HANDLE(radv_sampler, sampler, image_info->sampler);
memcpy(dst, sampler->state, 16);
}
void radv_UpdateDescriptorSets(
VkDevice _device,
uint32_t descriptorWriteCount,
const VkWriteDescriptorSet* pDescriptorWrites,
uint32_t descriptorCopyCount,
const VkCopyDescriptorSet* pDescriptorCopies)
{
RADV_FROM_HANDLE(radv_device, device, _device);
uint32_t i, j;
for (i = 0; i < descriptorWriteCount; i++) {
const VkWriteDescriptorSet *writeset = &pDescriptorWrites[i];
RADV_FROM_HANDLE(radv_descriptor_set, set, writeset->dstSet);
const struct radv_descriptor_set_binding_layout *binding_layout =
set->layout->binding + writeset->dstBinding;
uint32_t *ptr = set->mapped_ptr;
struct radeon_winsys_bo **buffer_list = set->descriptors;
ptr += binding_layout->offset / 4;
ptr += binding_layout->size * writeset->dstArrayElement / 4;
buffer_list += binding_layout->buffer_offset;
buffer_list += binding_layout->buffer_count * writeset->dstArrayElement;
for (j = 0; j < writeset->descriptorCount; ++j) {
switch(writeset->descriptorType) {
case VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC:
case VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC: {
unsigned idx = writeset->dstArrayElement + j;
idx += binding_layout->dynamic_offset_offset;
write_dynamic_buffer_descriptor(device, set->dynamic_descriptors + idx,
buffer_list, writeset->pBufferInfo + j);
break;
}
case VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER:
case VK_DESCRIPTOR_TYPE_STORAGE_BUFFER:
write_buffer_descriptor(device, ptr, buffer_list,
writeset->pBufferInfo + j);
break;
case VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER:
case VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER:
write_texel_buffer_descriptor(device, ptr, buffer_list,
writeset->pTexelBufferView[j]);
break;
case VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE:
case VK_DESCRIPTOR_TYPE_STORAGE_IMAGE:
case VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT:
write_image_descriptor(device, ptr, buffer_list,
writeset->pImageInfo + j);
break;
case VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER:
write_combined_image_sampler_descriptor(device, ptr, buffer_list,
writeset->pImageInfo + j,
!binding_layout->immutable_samplers);
break;
case VK_DESCRIPTOR_TYPE_SAMPLER:
assert(!binding_layout->immutable_samplers);
write_sampler_descriptor(device, ptr,
writeset->pImageInfo + j);
break;
default:
unreachable("unimplemented descriptor type");
break;
}
ptr += binding_layout->size / 4;
buffer_list += binding_layout->buffer_count;
}
}
if (descriptorCopyCount)
radv_finishme("copy descriptors");
}

View File

@@ -1,85 +0,0 @@
/*
* Copyright © 2016 Bas Nieuwenhuizen
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#ifndef RADV_DESCRIPTOR_SET_H
#define RADV_DESCRIPTOR_SET_H
#include <vulkan/vulkan.h>
#define MAX_SETS 8
struct radv_descriptor_set_binding_layout {
VkDescriptorType type;
/* Number of array elements in this binding */
uint16_t array_size;
uint16_t offset;
uint16_t buffer_offset;
uint16_t dynamic_offset_offset;
/* redundant with the type, each for a single array element */
uint16_t size;
uint16_t buffer_count;
uint16_t dynamic_offset_count;
/* Immutable samplers (or NULL if no immutable samplers) */
struct radv_sampler **immutable_samplers;
};
struct radv_descriptor_set_layout {
/* Number of bindings in this descriptor set */
uint16_t binding_count;
/* Total size of the descriptor set with room for all array entries */
uint16_t size;
/* Shader stages affected by this descriptor set */
uint16_t shader_stages;
uint16_t dynamic_shader_stages;
/* Number of buffers in this descriptor set */
uint16_t buffer_count;
/* Number of dynamic offsets used by this descriptor set */
uint16_t dynamic_offset_count;
/* Bindings in this descriptor set */
struct radv_descriptor_set_binding_layout binding[0];
};
struct radv_pipeline_layout {
struct {
struct radv_descriptor_set_layout *layout;
uint32_t size;
uint32_t dynamic_offset_start;
} set[MAX_SETS];
uint32_t num_sets;
uint32_t push_constant_size;
uint32_t dynamic_offset_count;
unsigned char sha1[20];
};
#endif /* RADV_DESCRIPTOR_SET_H */

File diff suppressed because it is too large Load Diff

View File

@@ -1,293 +0,0 @@
# coding=utf-8
#
# Copyright © 2015 Intel Corporation
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice (including the next
# paragraph) shall be included in all copies or substantial portions of the
# Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
#
import fileinput, re, sys
# Each function typedef in the vulkan.h header is all on one line and matches
# this regepx. We hope that won't change.
p = re.compile('typedef ([^ ]*) *\((?:VKAPI_PTR)? *\*PFN_vk([^(]*)\)(.*);')
entrypoints = []
# We generate a static hash table for entry point lookup
# (vkGetProcAddress). We use a linear congruential generator for our hash
# function and a power-of-two size table. The prime numbers are determined
# experimentally.
none = 0xffff
hash_size = 256
u32_mask = 2**32 - 1
hash_mask = hash_size - 1
prime_factor = 5024183
prime_step = 19
def hash(name):
h = 0;
for c in name:
h = (h * prime_factor + ord(c)) & u32_mask
return h
def get_platform_guard_macro(name):
if "Xlib" in name:
return "VK_USE_PLATFORM_XLIB_KHR"
elif "Xcb" in name:
return "VK_USE_PLATFORM_XCB_KHR"
elif "Wayland" in name:
return "VK_USE_PLATFORM_WAYLAND_KHR"
elif "Mir" in name:
return "VK_USE_PLATFORM_MIR_KHR"
elif "Android" in name:
return "VK_USE_PLATFORM_ANDROID_KHR"
elif "Win32" in name:
return "VK_USE_PLATFORM_WIN32_KHR"
else:
return None
def print_guard_start(name):
guard = get_platform_guard_macro(name)
if guard is not None:
print "#ifdef {0}".format(guard)
def print_guard_end(name):
guard = get_platform_guard_macro(name)
if guard is not None:
print "#endif // {0}".format(guard)
opt_header = False
opt_code = False
if (sys.argv[1] == "header"):
opt_header = True
sys.argv.pop()
elif (sys.argv[1] == "code"):
opt_code = True
sys.argv.pop()
# Parse the entry points in the header
i = 0
for line in fileinput.input():
m = p.match(line)
if (m):
if m.group(2) == 'VoidFunction':
continue
fullname = "vk" + m.group(2)
h = hash(fullname)
entrypoints.append((m.group(1), m.group(2), m.group(3), i, h))
i = i + 1
# For outputting entrypoints.h we generate a radv_EntryPoint() prototype
# per entry point.
if opt_header:
print "/* This file generated from vk_gen.py, don't edit directly. */\n"
print "struct radv_dispatch_table {"
print " union {"
print " void *entrypoints[%d];" % len(entrypoints)
print " struct {"
for type, name, args, num, h in entrypoints:
guard = get_platform_guard_macro(name)
if guard is not None:
print "#ifdef {0}".format(guard)
print " PFN_vk{0} {0};".format(name)
print "#else"
print " void *{0};".format(name)
print "#endif"
else:
print " PFN_vk{0} {0};".format(name)
print " };\n"
print " };\n"
print "};\n"
for type, name, args, num, h in entrypoints:
print_guard_start(name)
print "%s radv_%s%s;" % (type, name, args)
print_guard_end(name)
exit()
print """/*
* Copyright © 2015 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
/* DO NOT EDIT! This is a generated file. */
#include "radv_private.h"
struct radv_entrypoint {
uint32_t name;
uint32_t hash;
};
/* We use a big string constant to avoid lots of reloctions from the entry
* point table to lots of little strings. The entries in the entry point table
* store the index into this big string.
*/
static const char strings[] ="""
offsets = []
i = 0;
for type, name, args, num, h in entrypoints:
print " \"vk%s\\0\"" % name
offsets.append(i)
i += 2 + len(name) + 1
print " ;"
# Now generate the table of all entry points
print "\nstatic const struct radv_entrypoint entrypoints[] = {"
for type, name, args, num, h in entrypoints:
print " { %5d, 0x%08x }," % (offsets[num], h)
print "};\n"
print """
/* Weak aliases for all potential implementations. These will resolve to
* NULL if they're not defined, which lets the resolve_entrypoint() function
* either pick the correct entry point.
*/
"""
for layer in [ "radv" ]:
for type, name, args, num, h in entrypoints:
print_guard_start(name)
print "%s %s_%s%s __attribute__ ((weak));" % (type, layer, name, args)
print_guard_end(name)
print "\nconst struct radv_dispatch_table %s_layer = {" % layer
for type, name, args, num, h in entrypoints:
print_guard_start(name)
print " .%s = %s_%s," % (name, layer, name)
print_guard_end(name)
print "};\n"
print """
void * __attribute__ ((noinline))
radv_resolve_entrypoint(uint32_t index)
{
return radv_layer.entrypoints[index];
}
"""
# Now generate the hash table used for entry point look up. This is a
# uint16_t table of entry point indices. We use 0xffff to indicate an entry
# in the hash table is empty.
map = [none for f in xrange(hash_size)]
collisions = [0 for f in xrange(10)]
for type, name, args, num, h in entrypoints:
level = 0
while map[h & hash_mask] != none:
h = h + prime_step
level = level + 1
if level > 9:
collisions[9] += 1
else:
collisions[level] += 1
map[h & hash_mask] = num
print "/* Hash table stats:"
print " * size %d entries" % hash_size
print " * collisions entries"
for i in xrange(10):
if (i == 9):
plus = "+"
else:
plus = " "
print " * %2d%s %4d" % (i, plus, collisions[i])
print " */\n"
print "#define none 0x%04x\n" % none
print "static const uint16_t map[] = {"
for i in xrange(0, hash_size, 8):
print " ",
for j in xrange(i, i + 8):
if map[j] & 0xffff == 0xffff:
print " none,",
else:
print "0x%04x," % (map[j] & 0xffff),
print
print "};"
# Finally we generate the hash table lookup function. The hash function and
# linear probing algorithm matches the hash table generated above.
print """
void *
radv_lookup_entrypoint(const char *name)
{
static const uint32_t prime_factor = %d;
static const uint32_t prime_step = %d;
const struct radv_entrypoint *e;
uint32_t hash, h, i;
const char *p;
hash = 0;
for (p = name; *p; p++)
hash = hash * prime_factor + *p;
h = hash;
do {
i = map[h & %d];
if (i == none)
return NULL;
e = &entrypoints[i];
h += prime_step;
} while (e->hash != hash);
if (strcmp(name, strings + e->name) != 0)
return NULL;
return radv_resolve_entrypoint(i);
}
""" % (prime_factor, prime_step, hash_mask)

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -1,388 +0,0 @@
/*
* Copyright © 2016 Red Hat
* based on intel anv code:
* Copyright © 2015 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#include "radv_meta.h"
#include <fcntl.h>
#include <limits.h>
#include <pwd.h>
#include <sys/stat.h>
void
radv_meta_save(struct radv_meta_saved_state *state,
const struct radv_cmd_buffer *cmd_buffer,
uint32_t dynamic_mask)
{
state->old_pipeline = cmd_buffer->state.pipeline;
state->old_descriptor_set0 = cmd_buffer->state.descriptors[0];
memcpy(state->old_vertex_bindings, cmd_buffer->state.vertex_bindings,
sizeof(state->old_vertex_bindings));
state->dynamic_mask = dynamic_mask;
radv_dynamic_state_copy(&state->dynamic, &cmd_buffer->state.dynamic,
dynamic_mask);
memcpy(state->push_constants, cmd_buffer->push_constants, MAX_PUSH_CONSTANTS_SIZE);
}
void
radv_meta_restore(const struct radv_meta_saved_state *state,
struct radv_cmd_buffer *cmd_buffer)
{
cmd_buffer->state.pipeline = state->old_pipeline;
radv_bind_descriptor_set(cmd_buffer, state->old_descriptor_set0, 0);
memcpy(cmd_buffer->state.vertex_bindings, state->old_vertex_bindings,
sizeof(state->old_vertex_bindings));
cmd_buffer->state.vb_dirty |= (1 << RADV_META_VERTEX_BINDING_COUNT) - 1;
cmd_buffer->state.dirty |= RADV_CMD_DIRTY_PIPELINE;
radv_dynamic_state_copy(&cmd_buffer->state.dynamic, &state->dynamic,
state->dynamic_mask);
cmd_buffer->state.dirty |= state->dynamic_mask;
memcpy(cmd_buffer->push_constants, state->push_constants, MAX_PUSH_CONSTANTS_SIZE);
cmd_buffer->push_constant_stages |= VK_SHADER_STAGE_ALL_GRAPHICS | VK_SHADER_STAGE_COMPUTE_BIT;
}
void
radv_meta_save_pass(struct radv_meta_saved_pass_state *state,
const struct radv_cmd_buffer *cmd_buffer)
{
state->pass = cmd_buffer->state.pass;
state->subpass = cmd_buffer->state.subpass;
state->framebuffer = cmd_buffer->state.framebuffer;
state->attachments = cmd_buffer->state.attachments;
state->render_area = cmd_buffer->state.render_area;
}
void
radv_meta_restore_pass(const struct radv_meta_saved_pass_state *state,
struct radv_cmd_buffer *cmd_buffer)
{
cmd_buffer->state.pass = state->pass;
cmd_buffer->state.subpass = state->subpass;
cmd_buffer->state.framebuffer = state->framebuffer;
cmd_buffer->state.attachments = state->attachments;
cmd_buffer->state.render_area = state->render_area;
if (state->subpass)
radv_emit_framebuffer_state(cmd_buffer);
}
void
radv_meta_save_compute(struct radv_meta_saved_compute_state *state,
const struct radv_cmd_buffer *cmd_buffer,
unsigned push_constant_size)
{
state->old_pipeline = cmd_buffer->state.compute_pipeline;
state->old_descriptor_set0 = cmd_buffer->state.descriptors[0];
if (push_constant_size)
memcpy(state->push_constants, cmd_buffer->push_constants, push_constant_size);
}
void
radv_meta_restore_compute(const struct radv_meta_saved_compute_state *state,
struct radv_cmd_buffer *cmd_buffer,
unsigned push_constant_size)
{
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer), VK_PIPELINE_BIND_POINT_COMPUTE,
radv_pipeline_to_handle(state->old_pipeline));
radv_bind_descriptor_set(cmd_buffer, state->old_descriptor_set0, 0);
if (push_constant_size) {
memcpy(cmd_buffer->push_constants, state->push_constants, push_constant_size);
cmd_buffer->push_constant_stages |= VK_SHADER_STAGE_COMPUTE_BIT;
}
}
VkImageViewType
radv_meta_get_view_type(const struct radv_image *image)
{
switch (image->type) {
case VK_IMAGE_TYPE_1D: return VK_IMAGE_VIEW_TYPE_1D;
case VK_IMAGE_TYPE_2D: return VK_IMAGE_VIEW_TYPE_2D;
case VK_IMAGE_TYPE_3D: return VK_IMAGE_VIEW_TYPE_3D;
default:
unreachable("bad VkImageViewType");
}
}
/**
* When creating a destination VkImageView, this function provides the needed
* VkImageViewCreateInfo::subresourceRange::baseArrayLayer.
*/
uint32_t
radv_meta_get_iview_layer(const struct radv_image *dest_image,
const VkImageSubresourceLayers *dest_subresource,
const VkOffset3D *dest_offset)
{
switch (dest_image->type) {
case VK_IMAGE_TYPE_1D:
case VK_IMAGE_TYPE_2D:
return dest_subresource->baseArrayLayer;
case VK_IMAGE_TYPE_3D:
/* HACK: Vulkan does not allow attaching a 3D image to a framebuffer,
* but meta does it anyway. When doing so, we translate the
* destination's z offset into an array offset.
*/
return dest_offset->z;
default:
assert(!"bad VkImageType");
return 0;
}
}
static void *
meta_alloc(void* _device, size_t size, size_t alignment,
VkSystemAllocationScope allocationScope)
{
struct radv_device *device = _device;
return device->alloc.pfnAllocation(device->alloc.pUserData, size, alignment,
VK_SYSTEM_ALLOCATION_SCOPE_DEVICE);
}
static void *
meta_realloc(void* _device, void *original, size_t size, size_t alignment,
VkSystemAllocationScope allocationScope)
{
struct radv_device *device = _device;
return device->alloc.pfnReallocation(device->alloc.pUserData, original,
size, alignment,
VK_SYSTEM_ALLOCATION_SCOPE_DEVICE);
}
static void
meta_free(void* _device, void *data)
{
struct radv_device *device = _device;
return device->alloc.pfnFree(device->alloc.pUserData, data);
}
static bool
radv_builtin_cache_path(char *path)
{
char *xdg_cache_home = getenv("XDG_CACHE_HOME");
const char *suffix = "/radv_builtin_shaders";
const char *suffix2 = "/.cache/radv_builtin_shaders";
struct passwd pwd, *result;
char path2[PATH_MAX + 1]; /* PATH_MAX is not a real max,but suffices here. */
if (xdg_cache_home) {
if (strlen(xdg_cache_home) + strlen(suffix) > PATH_MAX)
return false;
strcpy(path, xdg_cache_home);
strcat(path, suffix);
return true;
}
getpwuid_r(getuid(), &pwd, path2, PATH_MAX - strlen(suffix2), &result);
if (!result)
return false;
strcpy(path, pwd.pw_dir);
strcat(path, "/.cache");
mkdir(path, 0755);
strcat(path, suffix);
return true;
}
static void
radv_load_meta_pipeline(struct radv_device *device)
{
char path[PATH_MAX + 1];
struct stat st;
void *data = NULL;
if (!radv_builtin_cache_path(path))
return;
int fd = open(path, O_RDONLY);
if (fd < 0)
return;
if (fstat(fd, &st))
goto fail;
data = malloc(st.st_size);
if (!data)
goto fail;
if(read(fd, data, st.st_size) == -1)
goto fail;
radv_pipeline_cache_load(&device->meta_state.cache, data, st.st_size);
fail:
free(data);
close(fd);
}
static void
radv_store_meta_pipeline(struct radv_device *device)
{
char path[PATH_MAX + 1], path2[PATH_MAX + 7];
size_t size;
void *data = NULL;
if (!device->meta_state.cache.modified)
return;
if (radv_GetPipelineCacheData(radv_device_to_handle(device),
radv_pipeline_cache_to_handle(&device->meta_state.cache),
&size, NULL))
return;
if (!radv_builtin_cache_path(path))
return;
strcpy(path2, path);
strcat(path2, "XXXXXX");
int fd = mkstemp(path2);//open(path, O_WRONLY | O_CREAT, 0600);
if (fd < 0)
return;
data = malloc(size);
if (!data)
goto fail;
if (radv_GetPipelineCacheData(radv_device_to_handle(device),
radv_pipeline_cache_to_handle(&device->meta_state.cache),
&size, data))
goto fail;
if(write(fd, data, size) == -1)
goto fail;
rename(path2, path);
fail:
free(data);
close(fd);
unlink(path2);
}
VkResult
radv_device_init_meta(struct radv_device *device)
{
VkResult result;
device->meta_state.alloc = (VkAllocationCallbacks) {
.pUserData = device,
.pfnAllocation = meta_alloc,
.pfnReallocation = meta_realloc,
.pfnFree = meta_free,
};
device->meta_state.cache.alloc = device->meta_state.alloc;
radv_pipeline_cache_init(&device->meta_state.cache, device);
radv_load_meta_pipeline(device);
result = radv_device_init_meta_clear_state(device);
if (result != VK_SUCCESS)
goto fail_clear;
result = radv_device_init_meta_resolve_state(device);
if (result != VK_SUCCESS)
goto fail_resolve;
result = radv_device_init_meta_blit_state(device);
if (result != VK_SUCCESS)
goto fail_blit;
result = radv_device_init_meta_blit2d_state(device);
if (result != VK_SUCCESS)
goto fail_blit2d;
result = radv_device_init_meta_bufimage_state(device);
if (result != VK_SUCCESS)
goto fail_bufimage;
result = radv_device_init_meta_depth_decomp_state(device);
if (result != VK_SUCCESS)
goto fail_depth_decomp;
result = radv_device_init_meta_buffer_state(device);
if (result != VK_SUCCESS)
goto fail_buffer;
result = radv_device_init_meta_fast_clear_flush_state(device);
if (result != VK_SUCCESS)
goto fail_fast_clear;
result = radv_device_init_meta_resolve_compute_state(device);
if (result != VK_SUCCESS)
goto fail_resolve_compute;
return VK_SUCCESS;
fail_resolve_compute:
radv_device_finish_meta_fast_clear_flush_state(device);
fail_fast_clear:
radv_device_finish_meta_buffer_state(device);
fail_buffer:
radv_device_finish_meta_depth_decomp_state(device);
fail_depth_decomp:
radv_device_finish_meta_bufimage_state(device);
fail_bufimage:
radv_device_finish_meta_blit2d_state(device);
fail_blit2d:
radv_device_finish_meta_blit_state(device);
fail_blit:
radv_device_finish_meta_resolve_state(device);
fail_resolve:
radv_device_finish_meta_clear_state(device);
fail_clear:
radv_pipeline_cache_finish(&device->meta_state.cache);
return result;
}
void
radv_device_finish_meta(struct radv_device *device)
{
radv_device_finish_meta_clear_state(device);
radv_device_finish_meta_resolve_state(device);
radv_device_finish_meta_blit_state(device);
radv_device_finish_meta_blit2d_state(device);
radv_device_finish_meta_bufimage_state(device);
radv_device_finish_meta_depth_decomp_state(device);
radv_device_finish_meta_buffer_state(device);
radv_device_finish_meta_fast_clear_flush_state(device);
radv_device_finish_meta_resolve_compute_state(device);
radv_store_meta_pipeline(device);
radv_pipeline_cache_finish(&device->meta_state.cache);
}
/*
* The most common meta operations all want to have the viewport
* reset and any scissors disabled. The rest of the dynamic state
* should have no effect.
*/
void
radv_meta_save_graphics_reset_vport_scissor(struct radv_meta_saved_state *saved_state,
struct radv_cmd_buffer *cmd_buffer)
{
uint32_t dirty_state = (1 << VK_DYNAMIC_STATE_VIEWPORT) | (1 << VK_DYNAMIC_STATE_SCISSOR);
radv_meta_save(saved_state, cmd_buffer, dirty_state);
cmd_buffer->state.dynamic.viewport.count = 0;
cmd_buffer->state.dynamic.scissor.count = 0;
cmd_buffer->state.dirty |= dirty_state;
}

View File

@@ -1,193 +0,0 @@
/*
* Copyright © 2016 Red Hat
* based on intel anv code:
* Copyright © 2015 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#ifndef RADV_META_H
#define RADV_META_H
#include "radv_private.h"
#ifdef __cplusplus
extern "C" {
#endif
#define RADV_META_VERTEX_BINDING_COUNT 2
struct radv_meta_saved_state {
struct radv_vertex_binding old_vertex_bindings[RADV_META_VERTEX_BINDING_COUNT];
struct radv_descriptor_set *old_descriptor_set0;
struct radv_pipeline *old_pipeline;
/**
* Bitmask of (1 << VK_DYNAMIC_STATE_*). Defines the set of saved dynamic
* state.
*/
uint32_t dynamic_mask;
struct radv_dynamic_state dynamic;
char push_constants[128];
};
struct radv_meta_saved_pass_state {
struct radv_render_pass *pass;
const struct radv_subpass *subpass;
struct radv_attachment_state *attachments;
struct radv_framebuffer *framebuffer;
VkRect2D render_area;
};
struct radv_meta_saved_compute_state {
struct radv_descriptor_set *old_descriptor_set0;
struct radv_pipeline *old_pipeline;
char push_constants[128];
};
VkResult radv_device_init_meta_clear_state(struct radv_device *device);
void radv_device_finish_meta_clear_state(struct radv_device *device);
VkResult radv_device_init_meta_resolve_state(struct radv_device *device);
void radv_device_finish_meta_resolve_state(struct radv_device *device);
VkResult radv_device_init_meta_depth_decomp_state(struct radv_device *device);
void radv_device_finish_meta_depth_decomp_state(struct radv_device *device);
VkResult radv_device_init_meta_fast_clear_flush_state(struct radv_device *device);
void radv_device_finish_meta_fast_clear_flush_state(struct radv_device *device);
VkResult radv_device_init_meta_blit_state(struct radv_device *device);
void radv_device_finish_meta_blit_state(struct radv_device *device);
VkResult radv_device_init_meta_blit2d_state(struct radv_device *device);
void radv_device_finish_meta_blit2d_state(struct radv_device *device);
VkResult radv_device_init_meta_buffer_state(struct radv_device *device);
void radv_device_finish_meta_buffer_state(struct radv_device *device);
VkResult radv_device_init_meta_resolve_compute_state(struct radv_device *device);
void radv_device_finish_meta_resolve_compute_state(struct radv_device *device);
void radv_meta_save(struct radv_meta_saved_state *state,
const struct radv_cmd_buffer *cmd_buffer,
uint32_t dynamic_mask);
void radv_meta_restore(const struct radv_meta_saved_state *state,
struct radv_cmd_buffer *cmd_buffer);
void radv_meta_save_pass(struct radv_meta_saved_pass_state *state,
const struct radv_cmd_buffer *cmd_buffer);
void radv_meta_restore_pass(const struct radv_meta_saved_pass_state *state,
struct radv_cmd_buffer *cmd_buffer);
void radv_meta_save_compute(struct radv_meta_saved_compute_state *state,
const struct radv_cmd_buffer *cmd_buffer,
unsigned push_constant_size);
void radv_meta_restore_compute(const struct radv_meta_saved_compute_state *state,
struct radv_cmd_buffer *cmd_buffer,
unsigned push_constant_size);
VkImageViewType radv_meta_get_view_type(const struct radv_image *image);
uint32_t radv_meta_get_iview_layer(const struct radv_image *dest_image,
const VkImageSubresourceLayers *dest_subresource,
const VkOffset3D *dest_offset);
struct radv_meta_blit2d_surf {
/** The size of an element in bytes. */
uint8_t bs;
VkFormat format;
struct radv_image *image;
unsigned level;
unsigned layer;
VkImageAspectFlags aspect_mask;
};
struct radv_meta_blit2d_buffer {
struct radv_buffer *buffer;
uint32_t offset;
uint32_t pitch;
uint8_t bs;
VkFormat format;
};
struct radv_meta_blit2d_rect {
uint32_t src_x, src_y;
uint32_t dst_x, dst_y;
uint32_t width, height;
};
void radv_meta_begin_blit2d(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_saved_state *save);
void radv_meta_blit2d(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_blit2d_surf *src_img,
struct radv_meta_blit2d_buffer *src_buf,
struct radv_meta_blit2d_surf *dst,
unsigned num_rects,
struct radv_meta_blit2d_rect *rects);
void radv_meta_end_blit2d(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_saved_state *save);
VkResult radv_device_init_meta_bufimage_state(struct radv_device *device);
void radv_device_finish_meta_bufimage_state(struct radv_device *device);
void radv_meta_begin_bufimage(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_saved_compute_state *save);
void radv_meta_end_bufimage(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_saved_compute_state *save);
void radv_meta_image_to_buffer(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_blit2d_surf *src,
struct radv_meta_blit2d_buffer *dst,
unsigned num_rects,
struct radv_meta_blit2d_rect *rects);
void radv_decompress_depth_image_inplace(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image,
VkImageSubresourceRange *subresourceRange);
void radv_resummarize_depth_image_inplace(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image,
VkImageSubresourceRange *subresourceRange);
void radv_fast_clear_flush_image_inplace(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image);
void radv_meta_save_graphics_reset_vport_scissor(struct radv_meta_saved_state *saved_state,
struct radv_cmd_buffer *cmd_buffer);
void radv_meta_resolve_compute_image(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *src_image,
VkImageLayout src_image_layout,
struct radv_image *dest_image,
VkImageLayout dest_image_layout,
uint32_t region_count,
const VkImageResolve *regions);
#ifdef __cplusplus
}
#endif
#endif /* RADV_META_H */

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -1,543 +0,0 @@
#include "radv_meta.h"
#include "nir/nir_builder.h"
#include "sid.h"
#include "radv_cs.h"
static nir_shader *
build_buffer_fill_shader(struct radv_device *dev)
{
nir_builder b;
nir_builder_init_simple_shader(&b, NULL, MESA_SHADER_COMPUTE, NULL);
b.shader->info.name = ralloc_strdup(b.shader, "meta_buffer_fill");
b.shader->info.cs.local_size[0] = 64;
b.shader->info.cs.local_size[1] = 1;
b.shader->info.cs.local_size[2] = 1;
nir_ssa_def *invoc_id = nir_load_system_value(&b, nir_intrinsic_load_local_invocation_id, 0);
nir_ssa_def *wg_id = nir_load_system_value(&b, nir_intrinsic_load_work_group_id, 0);
nir_ssa_def *block_size = nir_imm_ivec4(&b,
b.shader->info.cs.local_size[0],
b.shader->info.cs.local_size[1],
b.shader->info.cs.local_size[2], 0);
nir_ssa_def *global_id = nir_iadd(&b, nir_imul(&b, wg_id, block_size), invoc_id);
nir_ssa_def *offset = nir_imul(&b, global_id, nir_imm_int(&b, 16));
offset = nir_swizzle(&b, offset, (unsigned[]) {0, 0, 0, 0}, 1, false);
nir_intrinsic_instr *dst_buf = nir_intrinsic_instr_create(b.shader,
nir_intrinsic_vulkan_resource_index);
dst_buf->src[0] = nir_src_for_ssa(nir_imm_int(&b, 0));
nir_intrinsic_set_desc_set(dst_buf, 0);
nir_intrinsic_set_binding(dst_buf, 0);
nir_ssa_dest_init(&dst_buf->instr, &dst_buf->dest, 1, 32, NULL);
nir_builder_instr_insert(&b, &dst_buf->instr);
nir_intrinsic_instr *load = nir_intrinsic_instr_create(b.shader, nir_intrinsic_load_push_constant);
load->src[0] = nir_src_for_ssa(nir_imm_int(&b, 0));
load->num_components = 1;
nir_ssa_dest_init(&load->instr, &load->dest, 1, 32, "fill_value");
nir_builder_instr_insert(&b, &load->instr);
nir_ssa_def *swizzled_load = nir_swizzle(&b, &load->dest.ssa, (unsigned[]) { 0, 0, 0, 0}, 4, false);
nir_intrinsic_instr *store = nir_intrinsic_instr_create(b.shader, nir_intrinsic_store_ssbo);
store->src[0] = nir_src_for_ssa(swizzled_load);
store->src[1] = nir_src_for_ssa(&dst_buf->dest.ssa);
store->src[2] = nir_src_for_ssa(offset);
nir_intrinsic_set_write_mask(store, 0xf);
store->num_components = 4;
nir_builder_instr_insert(&b, &store->instr);
return b.shader;
}
static nir_shader *
build_buffer_copy_shader(struct radv_device *dev)
{
nir_builder b;
nir_builder_init_simple_shader(&b, NULL, MESA_SHADER_COMPUTE, NULL);
b.shader->info.name = ralloc_strdup(b.shader, "meta_buffer_copy");
b.shader->info.cs.local_size[0] = 64;
b.shader->info.cs.local_size[1] = 1;
b.shader->info.cs.local_size[2] = 1;
nir_ssa_def *invoc_id = nir_load_system_value(&b, nir_intrinsic_load_local_invocation_id, 0);
nir_ssa_def *wg_id = nir_load_system_value(&b, nir_intrinsic_load_work_group_id, 0);
nir_ssa_def *block_size = nir_imm_ivec4(&b,
b.shader->info.cs.local_size[0],
b.shader->info.cs.local_size[1],
b.shader->info.cs.local_size[2], 0);
nir_ssa_def *global_id = nir_iadd(&b, nir_imul(&b, wg_id, block_size), invoc_id);
nir_ssa_def *offset = nir_imul(&b, global_id, nir_imm_int(&b, 16));
offset = nir_swizzle(&b, offset, (unsigned[]) {0, 0, 0, 0}, 1, false);
nir_intrinsic_instr *dst_buf = nir_intrinsic_instr_create(b.shader,
nir_intrinsic_vulkan_resource_index);
dst_buf->src[0] = nir_src_for_ssa(nir_imm_int(&b, 0));
nir_intrinsic_set_desc_set(dst_buf, 0);
nir_intrinsic_set_binding(dst_buf, 0);
nir_ssa_dest_init(&dst_buf->instr, &dst_buf->dest, 1, 32, NULL);
nir_builder_instr_insert(&b, &dst_buf->instr);
nir_intrinsic_instr *src_buf = nir_intrinsic_instr_create(b.shader,
nir_intrinsic_vulkan_resource_index);
src_buf->src[0] = nir_src_for_ssa(nir_imm_int(&b, 0));
nir_intrinsic_set_desc_set(src_buf, 0);
nir_intrinsic_set_binding(src_buf, 1);
nir_ssa_dest_init(&src_buf->instr, &src_buf->dest, 1, 32, NULL);
nir_builder_instr_insert(&b, &src_buf->instr);
nir_intrinsic_instr *load = nir_intrinsic_instr_create(b.shader, nir_intrinsic_load_ssbo);
load->src[0] = nir_src_for_ssa(&src_buf->dest.ssa);
load->src[1] = nir_src_for_ssa(offset);
nir_ssa_dest_init(&load->instr, &load->dest, 4, 32, NULL);
load->num_components = 4;
nir_builder_instr_insert(&b, &load->instr);
nir_intrinsic_instr *store = nir_intrinsic_instr_create(b.shader, nir_intrinsic_store_ssbo);
store->src[0] = nir_src_for_ssa(&load->dest.ssa);
store->src[1] = nir_src_for_ssa(&dst_buf->dest.ssa);
store->src[2] = nir_src_for_ssa(offset);
nir_intrinsic_set_write_mask(store, 0xf);
store->num_components = 4;
nir_builder_instr_insert(&b, &store->instr);
return b.shader;
}
VkResult radv_device_init_meta_buffer_state(struct radv_device *device)
{
VkResult result;
struct radv_shader_module fill_cs = { .nir = NULL };
struct radv_shader_module copy_cs = { .nir = NULL };
zero(device->meta_state.buffer);
fill_cs.nir = build_buffer_fill_shader(device);
copy_cs.nir = build_buffer_copy_shader(device);
VkDescriptorSetLayoutCreateInfo fill_ds_create_info = {
.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO,
.bindingCount = 1,
.pBindings = (VkDescriptorSetLayoutBinding[]) {
{
.binding = 0,
.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
.descriptorCount = 1,
.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT,
.pImmutableSamplers = NULL
},
}
};
result = radv_CreateDescriptorSetLayout(radv_device_to_handle(device),
&fill_ds_create_info,
&device->meta_state.alloc,
&device->meta_state.buffer.fill_ds_layout);
if (result != VK_SUCCESS)
goto fail;
VkDescriptorSetLayoutCreateInfo copy_ds_create_info = {
.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO,
.bindingCount = 2,
.pBindings = (VkDescriptorSetLayoutBinding[]) {
{
.binding = 0,
.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
.descriptorCount = 1,
.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT,
.pImmutableSamplers = NULL
},
{
.binding = 1,
.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
.descriptorCount = 1,
.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT,
.pImmutableSamplers = NULL
},
}
};
result = radv_CreateDescriptorSetLayout(radv_device_to_handle(device),
&copy_ds_create_info,
&device->meta_state.alloc,
&device->meta_state.buffer.copy_ds_layout);
if (result != VK_SUCCESS)
goto fail;
VkPipelineLayoutCreateInfo fill_pl_create_info = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
.setLayoutCount = 1,
.pSetLayouts = &device->meta_state.buffer.fill_ds_layout,
.pushConstantRangeCount = 1,
.pPushConstantRanges = &(VkPushConstantRange){VK_SHADER_STAGE_COMPUTE_BIT, 0, 4},
};
result = radv_CreatePipelineLayout(radv_device_to_handle(device),
&fill_pl_create_info,
&device->meta_state.alloc,
&device->meta_state.buffer.fill_p_layout);
if (result != VK_SUCCESS)
goto fail;
VkPipelineLayoutCreateInfo copy_pl_create_info = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
.setLayoutCount = 1,
.pSetLayouts = &device->meta_state.buffer.copy_ds_layout,
.pushConstantRangeCount = 0,
};
result = radv_CreatePipelineLayout(radv_device_to_handle(device),
&copy_pl_create_info,
&device->meta_state.alloc,
&device->meta_state.buffer.copy_p_layout);
if (result != VK_SUCCESS)
goto fail;
VkPipelineShaderStageCreateInfo fill_pipeline_shader_stage = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
.stage = VK_SHADER_STAGE_COMPUTE_BIT,
.module = radv_shader_module_to_handle(&fill_cs),
.pName = "main",
.pSpecializationInfo = NULL,
};
VkComputePipelineCreateInfo fill_vk_pipeline_info = {
.sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO,
.stage = fill_pipeline_shader_stage,
.flags = 0,
.layout = device->meta_state.buffer.fill_p_layout,
};
result = radv_CreateComputePipelines(radv_device_to_handle(device),
radv_pipeline_cache_to_handle(&device->meta_state.cache),
1, &fill_vk_pipeline_info, NULL,
&device->meta_state.buffer.fill_pipeline);
if (result != VK_SUCCESS)
goto fail;
VkPipelineShaderStageCreateInfo copy_pipeline_shader_stage = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
.stage = VK_SHADER_STAGE_COMPUTE_BIT,
.module = radv_shader_module_to_handle(&copy_cs),
.pName = "main",
.pSpecializationInfo = NULL,
};
VkComputePipelineCreateInfo copy_vk_pipeline_info = {
.sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO,
.stage = copy_pipeline_shader_stage,
.flags = 0,
.layout = device->meta_state.buffer.copy_p_layout,
};
result = radv_CreateComputePipelines(radv_device_to_handle(device),
radv_pipeline_cache_to_handle(&device->meta_state.cache),
1, &copy_vk_pipeline_info, NULL,
&device->meta_state.buffer.copy_pipeline);
if (result != VK_SUCCESS)
goto fail;
ralloc_free(fill_cs.nir);
ralloc_free(copy_cs.nir);
return VK_SUCCESS;
fail:
radv_device_finish_meta_buffer_state(device);
ralloc_free(fill_cs.nir);
ralloc_free(copy_cs.nir);
return result;
}
void radv_device_finish_meta_buffer_state(struct radv_device *device)
{
if (device->meta_state.buffer.copy_pipeline)
radv_DestroyPipeline(radv_device_to_handle(device),
device->meta_state.buffer.copy_pipeline,
&device->meta_state.alloc);
if (device->meta_state.buffer.fill_pipeline)
radv_DestroyPipeline(radv_device_to_handle(device),
device->meta_state.buffer.fill_pipeline,
&device->meta_state.alloc);
if (device->meta_state.buffer.copy_p_layout)
radv_DestroyPipelineLayout(radv_device_to_handle(device),
device->meta_state.buffer.copy_p_layout,
&device->meta_state.alloc);
if (device->meta_state.buffer.fill_p_layout)
radv_DestroyPipelineLayout(radv_device_to_handle(device),
device->meta_state.buffer.fill_p_layout,
&device->meta_state.alloc);
if (device->meta_state.buffer.copy_ds_layout)
radv_DestroyDescriptorSetLayout(radv_device_to_handle(device),
device->meta_state.buffer.copy_ds_layout,
&device->meta_state.alloc);
if (device->meta_state.buffer.fill_ds_layout)
radv_DestroyDescriptorSetLayout(radv_device_to_handle(device),
device->meta_state.buffer.fill_ds_layout,
&device->meta_state.alloc);
}
static void fill_buffer_shader(struct radv_cmd_buffer *cmd_buffer,
struct radeon_winsys_bo *bo,
uint64_t offset, uint64_t size, uint32_t value)
{
struct radv_device *device = cmd_buffer->device;
uint64_t block_count = round_up_u64(size, 1024);
struct radv_meta_saved_compute_state saved_state;
VkDescriptorSet ds;
radv_meta_save_compute(&saved_state, cmd_buffer, 4);
radv_temp_descriptor_set_create(device, cmd_buffer,
device->meta_state.buffer.fill_ds_layout,
&ds);
struct radv_buffer dst_buffer = {
.bo = bo,
.offset = offset,
.size = size
};
radv_UpdateDescriptorSets(radv_device_to_handle(device),
1, /* writeCount */
(VkWriteDescriptorSet[]) {
{
.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
.dstSet = ds,
.dstBinding = 0,
.dstArrayElement = 0,
.descriptorCount = 1,
.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
.pBufferInfo = &(VkDescriptorBufferInfo) {
.buffer = radv_buffer_to_handle(&dst_buffer),
.offset = 0,
.range = size
}
}
}, 0, NULL);
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_COMPUTE,
device->meta_state.buffer.fill_pipeline);
radv_CmdBindDescriptorSets(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_COMPUTE,
device->meta_state.buffer.fill_p_layout, 0, 1,
&ds, 0, NULL);
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.buffer.fill_p_layout,
VK_SHADER_STAGE_COMPUTE_BIT, 0, 4,
&value);
radv_CmdDispatch(radv_cmd_buffer_to_handle(cmd_buffer), block_count, 1, 1);
radv_temp_descriptor_set_destroy(device, ds);
radv_meta_restore_compute(&saved_state, cmd_buffer, 4);
}
static void copy_buffer_shader(struct radv_cmd_buffer *cmd_buffer,
struct radeon_winsys_bo *src_bo,
struct radeon_winsys_bo *dst_bo,
uint64_t src_offset, uint64_t dst_offset,
uint64_t size)
{
struct radv_device *device = cmd_buffer->device;
uint64_t block_count = round_up_u64(size, 1024);
struct radv_meta_saved_compute_state saved_state;
VkDescriptorSet ds;
radv_meta_save_compute(&saved_state, cmd_buffer, 0);
radv_temp_descriptor_set_create(device, cmd_buffer,
device->meta_state.buffer.copy_ds_layout,
&ds);
struct radv_buffer dst_buffer = {
.bo = dst_bo,
.offset = dst_offset,
.size = size
};
struct radv_buffer src_buffer = {
.bo = src_bo,
.offset = src_offset,
.size = size
};
radv_UpdateDescriptorSets(radv_device_to_handle(device),
2, /* writeCount */
(VkWriteDescriptorSet[]) {
{
.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
.dstSet = ds,
.dstBinding = 0,
.dstArrayElement = 0,
.descriptorCount = 1,
.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
.pBufferInfo = &(VkDescriptorBufferInfo) {
.buffer = radv_buffer_to_handle(&dst_buffer),
.offset = 0,
.range = size
}
},
{
.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
.dstSet = ds,
.dstBinding = 1,
.dstArrayElement = 0,
.descriptorCount = 1,
.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
.pBufferInfo = &(VkDescriptorBufferInfo) {
.buffer = radv_buffer_to_handle(&src_buffer),
.offset = 0,
.range = size
}
}
}, 0, NULL);
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_COMPUTE,
device->meta_state.buffer.copy_pipeline);
radv_CmdBindDescriptorSets(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_COMPUTE,
device->meta_state.buffer.copy_p_layout, 0, 1,
&ds, 0, NULL);
radv_CmdDispatch(radv_cmd_buffer_to_handle(cmd_buffer), block_count, 1, 1);
radv_temp_descriptor_set_destroy(device, ds);
radv_meta_restore_compute(&saved_state, cmd_buffer, 0);
}
void radv_fill_buffer(struct radv_cmd_buffer *cmd_buffer,
struct radeon_winsys_bo *bo,
uint64_t offset, uint64_t size, uint32_t value)
{
assert(!(offset & 3));
assert(!(size & 3));
if (size >= 4096)
fill_buffer_shader(cmd_buffer, bo, offset, size, value);
else if (size) {
uint64_t va = cmd_buffer->device->ws->buffer_get_va(bo);
va += offset;
cmd_buffer->device->ws->cs_add_buffer(cmd_buffer->cs, bo, 8);
si_cp_dma_clear_buffer(cmd_buffer, va, size, value);
}
}
static
void radv_copy_buffer(struct radv_cmd_buffer *cmd_buffer,
struct radeon_winsys_bo *src_bo,
struct radeon_winsys_bo *dst_bo,
uint64_t src_offset, uint64_t dst_offset,
uint64_t size)
{
if (size >= 4096 && !(size & 3) && !(src_offset & 3) && !(dst_offset & 3))
copy_buffer_shader(cmd_buffer, src_bo, dst_bo,
src_offset, dst_offset, size);
else if (size) {
uint64_t src_va = cmd_buffer->device->ws->buffer_get_va(src_bo);
uint64_t dst_va = cmd_buffer->device->ws->buffer_get_va(dst_bo);
src_va += src_offset;
dst_va += dst_offset;
cmd_buffer->device->ws->cs_add_buffer(cmd_buffer->cs, src_bo, 8);
cmd_buffer->device->ws->cs_add_buffer(cmd_buffer->cs, dst_bo, 8);
si_cp_dma_buffer_copy(cmd_buffer, src_va, dst_va, size);
}
}
void radv_CmdFillBuffer(
VkCommandBuffer commandBuffer,
VkBuffer dstBuffer,
VkDeviceSize dstOffset,
VkDeviceSize fillSize,
uint32_t data)
{
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
RADV_FROM_HANDLE(radv_buffer, dst_buffer, dstBuffer);
if (fillSize == VK_WHOLE_SIZE)
fillSize = (dst_buffer->size - dstOffset) & ~3ull;
radv_fill_buffer(cmd_buffer, dst_buffer->bo, dst_buffer->offset + dstOffset,
fillSize, data);
}
void radv_CmdCopyBuffer(
VkCommandBuffer commandBuffer,
VkBuffer srcBuffer,
VkBuffer destBuffer,
uint32_t regionCount,
const VkBufferCopy* pRegions)
{
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
RADV_FROM_HANDLE(radv_buffer, src_buffer, srcBuffer);
RADV_FROM_HANDLE(radv_buffer, dest_buffer, destBuffer);
for (unsigned r = 0; r < regionCount; r++) {
uint64_t src_offset = src_buffer->offset + pRegions[r].srcOffset;
uint64_t dest_offset = dest_buffer->offset + pRegions[r].dstOffset;
uint64_t copy_size = pRegions[r].size;
radv_copy_buffer(cmd_buffer, src_buffer->bo, dest_buffer->bo,
src_offset, dest_offset, copy_size);
}
}
void radv_CmdUpdateBuffer(
VkCommandBuffer commandBuffer,
VkBuffer dstBuffer,
VkDeviceSize dstOffset,
VkDeviceSize dataSize,
const uint32_t* pData)
{
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
RADV_FROM_HANDLE(radv_buffer, dst_buffer, dstBuffer);
uint64_t words = dataSize / 4;
uint64_t va = cmd_buffer->device->ws->buffer_get_va(dst_buffer->bo);
va += dstOffset + dst_buffer->offset;
assert(!(dataSize & 3));
assert(!(va & 3));
if (dataSize < 4096) {
cmd_buffer->device->ws->cs_add_buffer(cmd_buffer->cs, dst_buffer->bo, 8);
radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, words + 4);
radeon_emit(cmd_buffer->cs, PKT3(PKT3_WRITE_DATA, 2 + words, 0));
radeon_emit(cmd_buffer->cs, S_370_DST_SEL(V_370_MEMORY_SYNC) |
S_370_WR_CONFIRM(1) |
S_370_ENGINE_SEL(V_370_ME));
radeon_emit(cmd_buffer->cs, va);
radeon_emit(cmd_buffer->cs, va >> 32);
radeon_emit_array(cmd_buffer->cs, pData, words);
} else {
uint32_t buf_offset;
radv_cmd_buffer_upload_data(cmd_buffer, dataSize, 32, pData, &buf_offset);
radv_copy_buffer(cmd_buffer, cmd_buffer->upload.upload_bo, dst_buffer->bo,
buf_offset, dstOffset + dst_buffer->offset, dataSize);
}
}

View File

@@ -1,396 +0,0 @@
#include "radv_meta.h"
#include "nir/nir_builder.h"
static nir_shader *
build_nir_itob_compute_shader(struct radv_device *dev)
{
nir_builder b;
const struct glsl_type *sampler_type = glsl_sampler_type(GLSL_SAMPLER_DIM_2D,
false,
false,
GLSL_TYPE_FLOAT);
const struct glsl_type *img_type = glsl_sampler_type(GLSL_SAMPLER_DIM_BUF,
false,
false,
GLSL_TYPE_FLOAT);
nir_builder_init_simple_shader(&b, NULL, MESA_SHADER_COMPUTE, NULL);
b.shader->info.name = ralloc_strdup(b.shader, "meta_itob_cs");
b.shader->info.cs.local_size[0] = 16;
b.shader->info.cs.local_size[1] = 16;
b.shader->info.cs.local_size[2] = 1;
nir_variable *input_img = nir_variable_create(b.shader, nir_var_uniform,
sampler_type, "s_tex");
input_img->data.descriptor_set = 0;
input_img->data.binding = 0;
nir_variable *output_img = nir_variable_create(b.shader, nir_var_uniform,
img_type, "out_img");
output_img->data.descriptor_set = 0;
output_img->data.binding = 1;
nir_ssa_def *invoc_id = nir_load_system_value(&b, nir_intrinsic_load_local_invocation_id, 0);
nir_ssa_def *wg_id = nir_load_system_value(&b, nir_intrinsic_load_work_group_id, 0);
nir_ssa_def *block_size = nir_imm_ivec4(&b,
b.shader->info.cs.local_size[0],
b.shader->info.cs.local_size[1],
b.shader->info.cs.local_size[2], 0);
nir_ssa_def *global_id = nir_iadd(&b, nir_imul(&b, wg_id, block_size), invoc_id);
nir_intrinsic_instr *offset = nir_intrinsic_instr_create(b.shader, nir_intrinsic_load_push_constant);
offset->src[0] = nir_src_for_ssa(nir_imm_int(&b, 0));
offset->num_components = 2;
nir_ssa_dest_init(&offset->instr, &offset->dest, 2, 32, "offset");
nir_builder_instr_insert(&b, &offset->instr);
nir_intrinsic_instr *stride = nir_intrinsic_instr_create(b.shader, nir_intrinsic_load_push_constant);
stride->src[0] = nir_src_for_ssa(nir_imm_int(&b, 8));
stride->num_components = 1;
nir_ssa_dest_init(&stride->instr, &stride->dest, 1, 32, "stride");
nir_builder_instr_insert(&b, &stride->instr);
nir_ssa_def *img_coord = nir_iadd(&b, global_id, &offset->dest.ssa);
nir_tex_instr *tex = nir_tex_instr_create(b.shader, 2);
tex->sampler_dim = GLSL_SAMPLER_DIM_2D;
tex->op = nir_texop_txf;
tex->src[0].src_type = nir_tex_src_coord;
tex->src[0].src = nir_src_for_ssa(img_coord);
tex->src[1].src_type = nir_tex_src_lod;
tex->src[1].src = nir_src_for_ssa(nir_imm_int(&b, 0));
tex->dest_type = nir_type_float;
tex->is_array = false;
tex->coord_components = 2;
tex->texture = nir_deref_var_create(tex, input_img);
tex->sampler = NULL;
nir_ssa_dest_init(&tex->instr, &tex->dest, 4, 32, "tex");
nir_builder_instr_insert(&b, &tex->instr);
nir_ssa_def *pos_x = nir_channel(&b, global_id, 0);
nir_ssa_def *pos_y = nir_channel(&b, global_id, 1);
nir_ssa_def *tmp = nir_imul(&b, pos_y, &stride->dest.ssa);
tmp = nir_iadd(&b, tmp, pos_x);
nir_ssa_def *coord = nir_vec4(&b, tmp, tmp, tmp, tmp);
nir_ssa_def *outval = &tex->dest.ssa;
nir_intrinsic_instr *store = nir_intrinsic_instr_create(b.shader, nir_intrinsic_image_store);
store->src[0] = nir_src_for_ssa(coord);
store->src[1] = nir_src_for_ssa(nir_ssa_undef(&b, 1, 32));
store->src[2] = nir_src_for_ssa(outval);
store->variables[0] = nir_deref_var_create(store, output_img);
nir_builder_instr_insert(&b, &store->instr);
return b.shader;
}
/* Image to buffer - don't write use image accessors */
static VkResult
radv_device_init_meta_itob_state(struct radv_device *device)
{
VkResult result;
struct radv_shader_module cs = { .nir = NULL };
zero(device->meta_state.itob);
cs.nir = build_nir_itob_compute_shader(device);
/*
* two descriptors one for the image being sampled
* one for the buffer being written.
*/
VkDescriptorSetLayoutCreateInfo ds_create_info = {
.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO,
.bindingCount = 2,
.pBindings = (VkDescriptorSetLayoutBinding[]) {
{
.binding = 0,
.descriptorType = VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE,
.descriptorCount = 1,
.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT,
.pImmutableSamplers = NULL
},
{
.binding = 1,
.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER,
.descriptorCount = 1,
.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT,
.pImmutableSamplers = NULL
},
}
};
result = radv_CreateDescriptorSetLayout(radv_device_to_handle(device),
&ds_create_info,
&device->meta_state.alloc,
&device->meta_state.itob.img_ds_layout);
if (result != VK_SUCCESS)
goto fail;
VkPipelineLayoutCreateInfo pl_create_info = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
.setLayoutCount = 1,
.pSetLayouts = &device->meta_state.itob.img_ds_layout,
.pushConstantRangeCount = 1,
.pPushConstantRanges = &(VkPushConstantRange){VK_SHADER_STAGE_COMPUTE_BIT, 0, 12},
};
result = radv_CreatePipelineLayout(radv_device_to_handle(device),
&pl_create_info,
&device->meta_state.alloc,
&device->meta_state.itob.img_p_layout);
if (result != VK_SUCCESS)
goto fail;
/* compute shader */
VkPipelineShaderStageCreateInfo pipeline_shader_stage = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
.stage = VK_SHADER_STAGE_COMPUTE_BIT,
.module = radv_shader_module_to_handle(&cs),
.pName = "main",
.pSpecializationInfo = NULL,
};
VkComputePipelineCreateInfo vk_pipeline_info = {
.sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO,
.stage = pipeline_shader_stage,
.flags = 0,
.layout = device->meta_state.itob.img_p_layout,
};
result = radv_CreateComputePipelines(radv_device_to_handle(device),
radv_pipeline_cache_to_handle(&device->meta_state.cache),
1, &vk_pipeline_info, NULL,
&device->meta_state.itob.pipeline);
if (result != VK_SUCCESS)
goto fail;
ralloc_free(cs.nir);
return VK_SUCCESS;
fail:
ralloc_free(cs.nir);
return result;
}
static void
radv_device_finish_meta_itob_state(struct radv_device *device)
{
if (device->meta_state.itob.img_p_layout) {
radv_DestroyPipelineLayout(radv_device_to_handle(device),
device->meta_state.itob.img_p_layout,
&device->meta_state.alloc);
}
if (device->meta_state.itob.img_ds_layout) {
radv_DestroyDescriptorSetLayout(radv_device_to_handle(device),
device->meta_state.itob.img_ds_layout,
&device->meta_state.alloc);
}
if (device->meta_state.itob.pipeline) {
radv_DestroyPipeline(radv_device_to_handle(device),
device->meta_state.itob.pipeline,
&device->meta_state.alloc);
}
}
void
radv_device_finish_meta_bufimage_state(struct radv_device *device)
{
radv_device_finish_meta_itob_state(device);
}
VkResult
radv_device_init_meta_bufimage_state(struct radv_device *device)
{
VkResult result;
result = radv_device_init_meta_itob_state(device);
if (result != VK_SUCCESS)
return result;
return VK_SUCCESS;
}
void
radv_meta_begin_bufimage(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_saved_compute_state *save)
{
radv_meta_save_compute(save, cmd_buffer, 12);
}
void
radv_meta_end_bufimage(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_saved_compute_state *save)
{
radv_meta_restore_compute(save, cmd_buffer, 12);
}
static void
create_iview(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_blit2d_surf *surf,
VkImageUsageFlags usage,
struct radv_image_view *iview)
{
radv_image_view_init(iview, cmd_buffer->device,
&(VkImageViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = radv_image_to_handle(surf->image),
.viewType = VK_IMAGE_VIEW_TYPE_2D,
.format = surf->format,
.subresourceRange = {
.aspectMask = surf->aspect_mask,
.baseMipLevel = surf->level,
.levelCount = 1,
.baseArrayLayer = surf->layer,
.layerCount = 1
},
}, cmd_buffer, usage);
}
static void
create_bview(struct radv_cmd_buffer *cmd_buffer,
struct radv_buffer *buffer,
unsigned offset,
VkFormat format,
struct radv_buffer_view *bview)
{
radv_buffer_view_init(bview, cmd_buffer->device,
&(VkBufferViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_BUFFER_VIEW_CREATE_INFO,
.flags = 0,
.buffer = radv_buffer_to_handle(buffer),
.format = format,
.offset = offset,
.range = VK_WHOLE_SIZE,
}, cmd_buffer);
}
struct itob_temps {
struct radv_image_view src_iview;
struct radv_buffer_view dst_bview;
VkDescriptorSet set;
};
static void
itob_bind_src_image(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_blit2d_surf *src,
struct radv_meta_blit2d_rect *rect,
struct itob_temps *tmp)
{
create_iview(cmd_buffer, src, VK_IMAGE_USAGE_SAMPLED_BIT, &tmp->src_iview);
}
static void
itob_bind_dst_buffer(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_blit2d_buffer *dst,
struct radv_meta_blit2d_rect *rect,
struct itob_temps *tmp)
{
create_bview(cmd_buffer, dst->buffer, dst->offset, dst->format, &tmp->dst_bview);
}
static void
itob_bind_descriptors(struct radv_cmd_buffer *cmd_buffer,
struct itob_temps *tmp)
{
struct radv_device *device = cmd_buffer->device;
VkDevice vk_device = radv_device_to_handle(cmd_buffer->device);
radv_temp_descriptor_set_create(device, cmd_buffer,
device->meta_state.itob.img_ds_layout,
&tmp->set);
radv_UpdateDescriptorSets(vk_device,
2, /* writeCount */
(VkWriteDescriptorSet[]) {
{
.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
.dstSet = tmp->set,
.dstBinding = 0,
.dstArrayElement = 0,
.descriptorCount = 1,
.descriptorType = VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE,
.pImageInfo = (VkDescriptorImageInfo[]) {
{
.sampler = NULL,
.imageView = radv_image_view_to_handle(&tmp->src_iview),
.imageLayout = VK_IMAGE_LAYOUT_GENERAL,
},
}
},
{
.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
.dstSet = tmp->set,
.dstBinding = 1,
.dstArrayElement = 0,
.descriptorCount = 1,
.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER,
.pTexelBufferView = (VkBufferView[]) { radv_buffer_view_to_handle(&tmp->dst_bview) },
}
}, 0, NULL);
radv_CmdBindDescriptorSets(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_COMPUTE,
device->meta_state.itob.img_p_layout, 0, 1,
&tmp->set, 0, NULL);
}
static void
itob_unbind_src_image(struct radv_cmd_buffer *cmd_buffer,
struct itob_temps *temps)
{
}
static void
bind_pipeline(struct radv_cmd_buffer *cmd_buffer)
{
VkPipeline pipeline =
cmd_buffer->device->meta_state.itob.pipeline;
if (cmd_buffer->state.compute_pipeline != radv_pipeline_from_handle(pipeline)) {
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);
}
}
void
radv_meta_image_to_buffer(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_blit2d_surf *src,
struct radv_meta_blit2d_buffer *dst,
unsigned num_rects,
struct radv_meta_blit2d_rect *rects)
{
struct radv_device *device = cmd_buffer->device;
for (unsigned r = 0; r < num_rects; ++r) {
struct itob_temps temps;
itob_bind_src_image(cmd_buffer, src, &rects[r], &temps);
itob_bind_dst_buffer(cmd_buffer, dst, &rects[r], &temps);
itob_bind_descriptors(cmd_buffer, &temps);
bind_pipeline(cmd_buffer);
unsigned push_constants[3] = {
rects[r].src_x,
rects[r].src_y,
dst->pitch
};
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.itob.img_p_layout,
VK_SHADER_STAGE_COMPUTE_BIT, 0, 12,
push_constants);
radv_unaligned_dispatch(cmd_buffer, rects[r].width, rects[r].height, 1);
radv_temp_descriptor_set_destroy(cmd_buffer->device, temps.set);
itob_unbind_src_image(cmd_buffer, &temps);
}
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,399 +0,0 @@
/*
* Copyright © 2016 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#include "radv_meta.h"
#include "vk_format.h"
static VkExtent3D
meta_image_block_size(const struct radv_image *image)
{
const struct vk_format_description *desc = vk_format_description(image->vk_format);
return (VkExtent3D) { desc->block.width, desc->block.height, 1 };
}
/* Returns the user-provided VkBufferImageCopy::imageExtent in units of
* elements rather than texels. One element equals one texel or one block
* if Image is uncompressed or compressed, respectively.
*/
static struct VkExtent3D
meta_region_extent_el(const struct radv_image *image,
const struct VkExtent3D *extent)
{
const VkExtent3D block = meta_image_block_size(image);
return radv_sanitize_image_extent(image->type, (VkExtent3D) {
.width = DIV_ROUND_UP(extent->width , block.width),
.height = DIV_ROUND_UP(extent->height, block.height),
.depth = DIV_ROUND_UP(extent->depth , block.depth),
});
}
/* Returns the user-provided VkBufferImageCopy::imageOffset in units of
* elements rather than texels. One element equals one texel or one block
* if Image is uncompressed or compressed, respectively.
*/
static struct VkOffset3D
meta_region_offset_el(const struct radv_image *image,
const struct VkOffset3D *offset)
{
const VkExtent3D block = meta_image_block_size(image);
return radv_sanitize_image_offset(image->type, (VkOffset3D) {
.x = offset->x / block.width,
.y = offset->y / block.height,
.z = offset->z / block.depth,
});
}
static VkFormat
vk_format_for_size(int bs)
{
switch (bs) {
case 1: return VK_FORMAT_R8_UINT;
case 2: return VK_FORMAT_R8G8_UINT;
case 4: return VK_FORMAT_R8G8B8A8_UINT;
case 8: return VK_FORMAT_R16G16B16A16_UINT;
case 16: return VK_FORMAT_R32G32B32A32_UINT;
default:
unreachable("Invalid format block size");
}
}
static struct radv_meta_blit2d_surf
blit_surf_for_image_level_layer(struct radv_image* image, VkImageAspectFlags aspectMask,
int level, int layer)
{
VkFormat format = image->vk_format;
if (aspectMask & VK_IMAGE_ASPECT_DEPTH_BIT)
format = vk_format_depth_only(format);
else if (aspectMask & VK_IMAGE_ASPECT_STENCIL_BIT)
format = vk_format_stencil_only(format);
if (!image->surface.dcc_size)
format = vk_format_for_size(vk_format_get_blocksize(format));
return (struct radv_meta_blit2d_surf) {
.format = format,
.bs = vk_format_get_blocksize(format),
.level = level,
.layer = layer,
.image = image,
.aspect_mask = aspectMask,
};
}
static void
meta_copy_buffer_to_image(struct radv_cmd_buffer *cmd_buffer,
struct radv_buffer* buffer,
struct radv_image* image,
uint32_t regionCount,
const VkBufferImageCopy* pRegions)
{
struct radv_meta_saved_state saved_state;
/* The Vulkan 1.0 spec says "dstImage must have a sample count equal to
* VK_SAMPLE_COUNT_1_BIT."
*/
assert(image->samples == 1);
radv_meta_save_graphics_reset_vport_scissor(&saved_state, cmd_buffer);
for (unsigned r = 0; r < regionCount; r++) {
/**
* From the Vulkan 1.0.6 spec: 18.3 Copying Data Between Images
* extent is the size in texels of the source image to copy in width,
* height and depth. 1D images use only x and width. 2D images use x, y,
* width and height. 3D images use x, y, z, width, height and depth.
*
*
* Also, convert the offsets and extent from units of texels to units of
* blocks - which is the highest resolution accessible in this command.
*/
const VkOffset3D img_offset_el =
meta_region_offset_el(image, &pRegions[r].imageOffset);
const VkExtent3D bufferExtent = {
.width = pRegions[r].bufferRowLength ?
pRegions[r].bufferRowLength : pRegions[r].imageExtent.width,
.height = pRegions[r].bufferImageHeight ?
pRegions[r].bufferImageHeight : pRegions[r].imageExtent.height,
};
const VkExtent3D buf_extent_el =
meta_region_extent_el(image, &bufferExtent);
/* Start creating blit rect */
const VkExtent3D img_extent_el =
meta_region_extent_el(image, &pRegions[r].imageExtent);
struct radv_meta_blit2d_rect rect = {
.width = img_extent_el.width,
.height = img_extent_el.height,
};
/* Create blit surfaces */
struct radv_meta_blit2d_surf img_bsurf =
blit_surf_for_image_level_layer(image,
pRegions[r].imageSubresource.aspectMask,
pRegions[r].imageSubresource.mipLevel,
pRegions[r].imageSubresource.baseArrayLayer);
struct radv_meta_blit2d_buffer buf_bsurf = {
.bs = img_bsurf.bs,
.format = img_bsurf.format,
.buffer = buffer,
.offset = pRegions[r].bufferOffset,
.pitch = buf_extent_el.width,
};
/* Loop through each 3D or array slice */
unsigned num_slices_3d = img_extent_el.depth;
unsigned num_slices_array = pRegions[r].imageSubresource.layerCount;
unsigned slice_3d = 0;
unsigned slice_array = 0;
while (slice_3d < num_slices_3d && slice_array < num_slices_array) {
rect.dst_x = img_offset_el.x;
rect.dst_y = img_offset_el.y;
/* Perform Blit */
radv_meta_blit2d(cmd_buffer, NULL, &buf_bsurf, &img_bsurf, 1, &rect);
/* Once we've done the blit, all of the actual information about
* the image is embedded in the command buffer so we can just
* increment the offset directly in the image effectively
* re-binding it to different backing memory.
*/
buf_bsurf.offset += buf_extent_el.width *
buf_extent_el.height * buf_bsurf.bs;
img_bsurf.layer++;
if (image->type == VK_IMAGE_TYPE_3D)
slice_3d++;
else
slice_array++;
}
}
radv_meta_restore(&saved_state, cmd_buffer);
}
void radv_CmdCopyBufferToImage(
VkCommandBuffer commandBuffer,
VkBuffer srcBuffer,
VkImage destImage,
VkImageLayout destImageLayout,
uint32_t regionCount,
const VkBufferImageCopy* pRegions)
{
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
RADV_FROM_HANDLE(radv_image, dest_image, destImage);
RADV_FROM_HANDLE(radv_buffer, src_buffer, srcBuffer);
meta_copy_buffer_to_image(cmd_buffer, src_buffer, dest_image,
regionCount, pRegions);
}
static void
meta_copy_image_to_buffer(struct radv_cmd_buffer *cmd_buffer,
struct radv_buffer* buffer,
struct radv_image* image,
uint32_t regionCount,
const VkBufferImageCopy* pRegions)
{
struct radv_meta_saved_compute_state saved_state;
radv_meta_begin_bufimage(cmd_buffer, &saved_state);
for (unsigned r = 0; r < regionCount; r++) {
/**
* From the Vulkan 1.0.6 spec: 18.3 Copying Data Between Images
* extent is the size in texels of the source image to copy in width,
* height and depth. 1D images use only x and width. 2D images use x, y,
* width and height. 3D images use x, y, z, width, height and depth.
*
*
* Also, convert the offsets and extent from units of texels to units of
* blocks - which is the highest resolution accessible in this command.
*/
const VkOffset3D img_offset_el =
meta_region_offset_el(image, &pRegions[r].imageOffset);
const VkExtent3D bufferExtent = {
.width = pRegions[r].bufferRowLength ?
pRegions[r].bufferRowLength : pRegions[r].imageExtent.width,
.height = pRegions[r].bufferImageHeight ?
pRegions[r].bufferImageHeight : pRegions[r].imageExtent.height,
};
const VkExtent3D buf_extent_el =
meta_region_extent_el(image, &bufferExtent);
/* Start creating blit rect */
const VkExtent3D img_extent_el =
meta_region_extent_el(image, &pRegions[r].imageExtent);
struct radv_meta_blit2d_rect rect = {
.width = img_extent_el.width,
.height = img_extent_el.height,
};
/* Create blit surfaces */
struct radv_meta_blit2d_surf img_info =
blit_surf_for_image_level_layer(image,
pRegions[r].imageSubresource.aspectMask,
pRegions[r].imageSubresource.mipLevel,
pRegions[r].imageSubresource.baseArrayLayer);
struct radv_meta_blit2d_buffer buf_info = {
.bs = img_info.bs,
.format = img_info.format,
.buffer = buffer,
.offset = pRegions[r].bufferOffset,
.pitch = buf_extent_el.width,
};
/* Loop through each 3D or array slice */
unsigned num_slices_3d = img_extent_el.depth;
unsigned num_slices_array = pRegions[r].imageSubresource.layerCount;
unsigned slice_3d = 0;
unsigned slice_array = 0;
while (slice_3d < num_slices_3d && slice_array < num_slices_array) {
rect.src_x = img_offset_el.x;
rect.src_y = img_offset_el.y;
/* Perform Blit */
radv_meta_image_to_buffer(cmd_buffer, &img_info, &buf_info, 1, &rect);
buf_info.offset += buf_extent_el.width *
buf_extent_el.height * buf_info.bs;
img_info.layer++;
if (image->type == VK_IMAGE_TYPE_3D)
slice_3d++;
else
slice_array++;
}
}
radv_meta_end_bufimage(cmd_buffer, &saved_state);
}
void radv_CmdCopyImageToBuffer(
VkCommandBuffer commandBuffer,
VkImage srcImage,
VkImageLayout srcImageLayout,
VkBuffer destBuffer,
uint32_t regionCount,
const VkBufferImageCopy* pRegions)
{
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
RADV_FROM_HANDLE(radv_image, src_image, srcImage);
RADV_FROM_HANDLE(radv_buffer, dst_buffer, destBuffer);
meta_copy_image_to_buffer(cmd_buffer, dst_buffer, src_image,
regionCount, pRegions);
}
void radv_CmdCopyImage(
VkCommandBuffer commandBuffer,
VkImage srcImage,
VkImageLayout srcImageLayout,
VkImage destImage,
VkImageLayout destImageLayout,
uint32_t regionCount,
const VkImageCopy* pRegions)
{
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
RADV_FROM_HANDLE(radv_image, src_image, srcImage);
RADV_FROM_HANDLE(radv_image, dest_image, destImage);
struct radv_meta_saved_state saved_state;
/* From the Vulkan 1.0 spec:
*
* vkCmdCopyImage can be used to copy image data between multisample
* images, but both images must have the same number of samples.
*/
assert(src_image->samples == dest_image->samples);
radv_meta_save_graphics_reset_vport_scissor(&saved_state, cmd_buffer);
for (unsigned r = 0; r < regionCount; r++) {
assert(pRegions[r].srcSubresource.aspectMask ==
pRegions[r].dstSubresource.aspectMask);
/* Create blit surfaces */
struct radv_meta_blit2d_surf b_src =
blit_surf_for_image_level_layer(src_image,
pRegions[r].srcSubresource.aspectMask,
pRegions[r].srcSubresource.mipLevel,
pRegions[r].srcSubresource.baseArrayLayer);
struct radv_meta_blit2d_surf b_dst =
blit_surf_for_image_level_layer(dest_image,
pRegions[r].dstSubresource.aspectMask,
pRegions[r].dstSubresource.mipLevel,
pRegions[r].dstSubresource.baseArrayLayer);
/* for DCC */
b_src.format = b_dst.format;
/**
* From the Vulkan 1.0.6 spec: 18.4 Copying Data Between Buffers and Images
* imageExtent is the size in texels of the image to copy in width, height
* and depth. 1D images use only x and width. 2D images use x, y, width
* and height. 3D images use x, y, z, width, height and depth.
*
* Also, convert the offsets and extent from units of texels to units of
* blocks - which is the highest resolution accessible in this command.
*/
const VkOffset3D dst_offset_el =
meta_region_offset_el(dest_image, &pRegions[r].dstOffset);
const VkOffset3D src_offset_el =
meta_region_offset_el(src_image, &pRegions[r].srcOffset);
const VkExtent3D img_extent_el =
meta_region_extent_el(src_image, &pRegions[r].extent);
/* Start creating blit rect */
struct radv_meta_blit2d_rect rect = {
.width = img_extent_el.width,
.height = img_extent_el.height,
};
/* Loop through each 3D or array slice */
unsigned num_slices_3d = img_extent_el.depth;
unsigned num_slices_array = pRegions[r].dstSubresource.layerCount;
unsigned slice_3d = 0;
unsigned slice_array = 0;
while (slice_3d < num_slices_3d && slice_array < num_slices_array) {
/* Finish creating blit rect */
rect.dst_x = dst_offset_el.x;
rect.dst_y = dst_offset_el.y;
rect.src_x = src_offset_el.x;
rect.src_y = src_offset_el.y;
/* Perform Blit */
radv_meta_blit2d(cmd_buffer, &b_src, NULL, &b_dst, 1, &rect);
b_src.layer++;
b_dst.layer++;
if (dest_image->type == VK_IMAGE_TYPE_3D)
slice_3d++;
else
slice_array++;
}
}
radv_meta_restore(&saved_state, cmd_buffer);
}

View File

@@ -1,463 +0,0 @@
/*
* Copyright © 2016 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#include <assert.h>
#include <stdbool.h>
#include "radv_meta.h"
#include "radv_private.h"
#include "nir/nir_builder.h"
#include "sid.h"
/**
* Vertex attributes used by all pipelines.
*/
struct vertex_attrs {
float position[2]; /**< 3DPRIM_RECTLIST */
};
/* passthrough vertex shader */
static nir_shader *
build_nir_vs(void)
{
const struct glsl_type *vec4 = glsl_vec4_type();
nir_builder b;
nir_variable *a_position;
nir_variable *v_position;
nir_builder_init_simple_shader(&b, NULL, MESA_SHADER_VERTEX, NULL);
b.shader->info.name = ralloc_strdup(b.shader, "meta_depth_decomp_vs");
a_position = nir_variable_create(b.shader, nir_var_shader_in, vec4,
"a_position");
a_position->data.location = VERT_ATTRIB_GENERIC0;
v_position = nir_variable_create(b.shader, nir_var_shader_out, vec4,
"gl_Position");
v_position->data.location = VARYING_SLOT_POS;
nir_copy_var(&b, v_position, a_position);
return b.shader;
}
/* simple passthrough shader */
static nir_shader *
build_nir_fs(void)
{
nir_builder b;
nir_builder_init_simple_shader(&b, NULL, MESA_SHADER_FRAGMENT, NULL);
b.shader->info.name = ralloc_asprintf(b.shader,
"meta_depth_decomp_noop_fs");
return b.shader;
}
static VkResult
create_pass(struct radv_device *device)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
const VkAllocationCallbacks *alloc = &device->meta_state.alloc;
VkAttachmentDescription attachment;
attachment.format = VK_FORMAT_UNDEFINED;
attachment.samples = 1;
attachment.loadOp = VK_ATTACHMENT_LOAD_OP_LOAD;
attachment.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
attachment.initialLayout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL;
attachment.finalLayout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL;
result = radv_CreateRenderPass(device_h,
&(VkRenderPassCreateInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO,
.attachmentCount = 1,
.pAttachments = &attachment,
.subpassCount = 1,
.pSubpasses = &(VkSubpassDescription) {
.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS,
.inputAttachmentCount = 0,
.colorAttachmentCount = 0,
.pColorAttachments = NULL,
.pResolveAttachments = NULL,
.pDepthStencilAttachment = &(VkAttachmentReference) {
.attachment = 0,
.layout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL,
},
.preserveAttachmentCount = 0,
.pPreserveAttachments = NULL,
},
.dependencyCount = 0,
},
alloc,
&device->meta_state.depth_decomp.pass);
return result;
}
static VkResult
create_pipeline(struct radv_device *device,
VkShaderModule vs_module_h)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
struct radv_shader_module fs_module = {
.nir = build_nir_fs(),
};
if (!fs_module.nir) {
/* XXX: Need more accurate error */
result = VK_ERROR_OUT_OF_HOST_MEMORY;
goto cleanup;
}
const VkGraphicsPipelineCreateInfo pipeline_create_info = {
.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
.stageCount = 2,
.pStages = (VkPipelineShaderStageCreateInfo[]) {
{
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
.stage = VK_SHADER_STAGE_VERTEX_BIT,
.module = vs_module_h,
.pName = "main",
},
{
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
.stage = VK_SHADER_STAGE_FRAGMENT_BIT,
.module = radv_shader_module_to_handle(&fs_module),
.pName = "main",
},
},
.pVertexInputState = &(VkPipelineVertexInputStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO,
.vertexBindingDescriptionCount = 1,
.pVertexBindingDescriptions = (VkVertexInputBindingDescription[]) {
{
.binding = 0,
.stride = sizeof(struct vertex_attrs),
.inputRate = VK_VERTEX_INPUT_RATE_VERTEX
},
},
.vertexAttributeDescriptionCount = 1,
.pVertexAttributeDescriptions = (VkVertexInputAttributeDescription[]) {
{
/* Position */
.location = 0,
.binding = 0,
.format = VK_FORMAT_R32G32_SFLOAT,
.offset = offsetof(struct vertex_attrs, position),
},
},
},
.pInputAssemblyState = &(VkPipelineInputAssemblyStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO,
.topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP,
.primitiveRestartEnable = false,
},
.pViewportState = &(VkPipelineViewportStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO,
.viewportCount = 0,
.scissorCount = 0,
},
.pRasterizationState = &(VkPipelineRasterizationStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO,
.depthClampEnable = false,
.rasterizerDiscardEnable = false,
.polygonMode = VK_POLYGON_MODE_FILL,
.cullMode = VK_CULL_MODE_NONE,
.frontFace = VK_FRONT_FACE_COUNTER_CLOCKWISE,
},
.pMultisampleState = &(VkPipelineMultisampleStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO,
.rasterizationSamples = 1,
.sampleShadingEnable = false,
.pSampleMask = NULL,
.alphaToCoverageEnable = false,
.alphaToOneEnable = false,
},
.pColorBlendState = &(VkPipelineColorBlendStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO,
.logicOpEnable = false,
.attachmentCount = 0,
.pAttachments = NULL,
},
.pDepthStencilState = &(VkPipelineDepthStencilStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_DEPTH_STENCIL_STATE_CREATE_INFO,
.depthTestEnable = false,
.depthWriteEnable = false,
.depthBoundsTestEnable = false,
.stencilTestEnable = false,
},
.pDynamicState = NULL,
.renderPass = device->meta_state.depth_decomp.pass,
.subpass = 0,
};
result = radv_graphics_pipeline_create(device_h,
radv_pipeline_cache_to_handle(&device->meta_state.cache),
&pipeline_create_info,
&(struct radv_graphics_pipeline_create_info) {
.use_rectlist = true,
.db_flush_depth_inplace = true,
.db_flush_stencil_inplace = true,
},
&device->meta_state.alloc,
&device->meta_state.depth_decomp.decompress_pipeline);
if (result != VK_SUCCESS)
goto cleanup;
result = radv_graphics_pipeline_create(device_h,
radv_pipeline_cache_to_handle(&device->meta_state.cache),
&pipeline_create_info,
&(struct radv_graphics_pipeline_create_info) {
.use_rectlist = true,
.db_flush_depth_inplace = true,
.db_flush_stencil_inplace = true,
.db_resummarize = true,
},
&device->meta_state.alloc,
&device->meta_state.depth_decomp.resummarize_pipeline);
if (result != VK_SUCCESS)
goto cleanup;
goto cleanup;
cleanup:
ralloc_free(fs_module.nir);
return result;
}
void
radv_device_finish_meta_depth_decomp_state(struct radv_device *device)
{
struct radv_meta_state *state = &device->meta_state;
VkDevice device_h = radv_device_to_handle(device);
VkRenderPass pass_h = device->meta_state.depth_decomp.pass;
const VkAllocationCallbacks *alloc = &device->meta_state.alloc;
if (pass_h)
radv_DestroyRenderPass(device_h, pass_h,
&device->meta_state.alloc);
VkPipeline pipeline_h = state->depth_decomp.decompress_pipeline;
if (pipeline_h) {
radv_DestroyPipeline(device_h, pipeline_h, alloc);
}
pipeline_h = state->depth_decomp.resummarize_pipeline;
if (pipeline_h) {
radv_DestroyPipeline(device_h, pipeline_h, alloc);
}
}
VkResult
radv_device_init_meta_depth_decomp_state(struct radv_device *device)
{
VkResult res = VK_SUCCESS;
zero(device->meta_state.depth_decomp);
struct radv_shader_module vs_module = { .nir = build_nir_vs() };
if (!vs_module.nir) {
/* XXX: Need more accurate error */
res = VK_ERROR_OUT_OF_HOST_MEMORY;
goto fail;
}
res = create_pass(device);
if (res != VK_SUCCESS)
goto fail;
VkShaderModule vs_module_h = radv_shader_module_to_handle(&vs_module);
res = create_pipeline(device, vs_module_h);
if (res != VK_SUCCESS)
goto fail;
goto cleanup;
fail:
radv_device_finish_meta_depth_decomp_state(device);
cleanup:
ralloc_free(vs_module.nir);
return res;
}
static void
emit_depth_decomp(struct radv_cmd_buffer *cmd_buffer,
const VkOffset2D *dest_offset,
const VkExtent2D *depth_decomp_extent,
VkPipeline pipeline_h)
{
struct radv_device *device = cmd_buffer->device;
VkCommandBuffer cmd_buffer_h = radv_cmd_buffer_to_handle(cmd_buffer);
uint32_t offset;
const struct vertex_attrs vertex_data[3] = {
{
.position = {
dest_offset->x,
dest_offset->y,
},
},
{
.position = {
dest_offset->x,
dest_offset->y + depth_decomp_extent->height,
},
},
{
.position = {
dest_offset->x + depth_decomp_extent->width,
dest_offset->y,
},
},
};
radv_cmd_buffer_upload_data(cmd_buffer, sizeof(vertex_data), 16, vertex_data, &offset);
struct radv_buffer vertex_buffer = {
.device = device,
.size = sizeof(vertex_data),
.bo = cmd_buffer->upload.upload_bo,
.offset = offset,
};
VkBuffer vertex_buffer_h = radv_buffer_to_handle(&vertex_buffer);
radv_CmdBindVertexBuffers(cmd_buffer_h,
/*firstBinding*/ 0,
/*bindingCount*/ 1,
(VkBuffer[]) { vertex_buffer_h },
(VkDeviceSize[]) { 0 });
RADV_FROM_HANDLE(radv_pipeline, pipeline, pipeline_h);
if (cmd_buffer->state.pipeline != pipeline) {
radv_CmdBindPipeline(cmd_buffer_h, VK_PIPELINE_BIND_POINT_GRAPHICS,
pipeline_h);
}
radv_CmdDraw(cmd_buffer_h, 3, 1, 0, 0);
}
static void radv_process_depth_image_inplace(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image,
VkImageSubresourceRange *subresourceRange,
VkPipeline pipeline_h)
{
struct radv_meta_saved_state saved_state;
struct radv_meta_saved_pass_state saved_pass_state;
VkDevice device_h = radv_device_to_handle(cmd_buffer->device);
VkCommandBuffer cmd_buffer_h = radv_cmd_buffer_to_handle(cmd_buffer);
uint32_t width = radv_minify(image->extent.width,
subresourceRange->baseMipLevel);
uint32_t height = radv_minify(image->extent.height,
subresourceRange->baseMipLevel);
if (!image->htile.size)
return;
radv_meta_save_pass(&saved_pass_state, cmd_buffer);
radv_meta_save_graphics_reset_vport_scissor(&saved_state, cmd_buffer);
for (uint32_t layer = 0; layer < subresourceRange->layerCount; layer++) {
struct radv_image_view iview;
radv_image_view_init(&iview, cmd_buffer->device,
&(VkImageViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = radv_image_to_handle(image),
.format = image->vk_format,
.subresourceRange = {
.aspectMask = VK_IMAGE_ASPECT_DEPTH_BIT,
.baseMipLevel = subresourceRange->baseMipLevel,
.levelCount = 1,
.baseArrayLayer = subresourceRange->baseArrayLayer + layer,
.layerCount = 1,
},
},
cmd_buffer, VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT);
VkFramebuffer fb_h;
radv_CreateFramebuffer(device_h,
&(VkFramebufferCreateInfo) {
.sType = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO,
.attachmentCount = 1,
.pAttachments = (VkImageView[]) {
radv_image_view_to_handle(&iview)
},
.width = width,
.height = height,
.layers = 1
},
&cmd_buffer->pool->alloc,
&fb_h);
radv_CmdBeginRenderPass(cmd_buffer_h,
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = cmd_buffer->device->meta_state.depth_decomp.pass,
.framebuffer = fb_h,
.renderArea = {
.offset = {
0,
0,
},
.extent = {
width,
height,
}
},
.clearValueCount = 0,
.pClearValues = NULL,
},
VK_SUBPASS_CONTENTS_INLINE);
emit_depth_decomp(cmd_buffer, &(VkOffset2D){0, 0 }, &(VkExtent2D){width, height}, pipeline_h);
radv_CmdEndRenderPass(cmd_buffer_h);
radv_DestroyFramebuffer(device_h, fb_h,
&cmd_buffer->pool->alloc);
}
radv_meta_restore(&saved_state, cmd_buffer);
radv_meta_restore_pass(&saved_pass_state, cmd_buffer);
}
void radv_decompress_depth_image_inplace(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image,
VkImageSubresourceRange *subresourceRange)
{
radv_process_depth_image_inplace(cmd_buffer, image, subresourceRange,
cmd_buffer->device->meta_state.depth_decomp.decompress_pipeline);
}
void radv_resummarize_depth_image_inplace(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image,
VkImageSubresourceRange *subresourceRange)
{
radv_process_depth_image_inplace(cmd_buffer, image, subresourceRange,
cmd_buffer->device->meta_state.depth_decomp.resummarize_pipeline);
}

View File

@@ -1,486 +0,0 @@
/*
* Copyright © 2016 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#include <assert.h>
#include <stdbool.h>
#include "radv_meta.h"
#include "radv_private.h"
#include "nir/nir_builder.h"
#include "sid.h"
/**
* Vertex attributes used by all pipelines.
*/
struct vertex_attrs {
float position[2]; /**< 3DPRIM_RECTLIST */
};
/* passthrough vertex shader */
static nir_shader *
build_nir_vs(void)
{
const struct glsl_type *vec4 = glsl_vec4_type();
nir_builder b;
nir_variable *a_position;
nir_variable *v_position;
nir_builder_init_simple_shader(&b, NULL, MESA_SHADER_VERTEX, NULL);
b.shader->info.name = ralloc_strdup(b.shader, "meta_fast_clear_vs");
a_position = nir_variable_create(b.shader, nir_var_shader_in, vec4,
"a_position");
a_position->data.location = VERT_ATTRIB_GENERIC0;
v_position = nir_variable_create(b.shader, nir_var_shader_out, vec4,
"gl_Position");
v_position->data.location = VARYING_SLOT_POS;
nir_copy_var(&b, v_position, a_position);
return b.shader;
}
/* simple passthrough shader */
static nir_shader *
build_nir_fs(void)
{
nir_builder b;
nir_builder_init_simple_shader(&b, NULL, MESA_SHADER_FRAGMENT, NULL);
b.shader->info.name = ralloc_asprintf(b.shader,
"meta_fast_clear_noop_fs");
return b.shader;
}
static VkResult
create_pass(struct radv_device *device)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
const VkAllocationCallbacks *alloc = &device->meta_state.alloc;
VkAttachmentDescription attachment;
attachment.format = VK_FORMAT_UNDEFINED;
attachment.samples = 1;
attachment.loadOp = VK_ATTACHMENT_LOAD_OP_LOAD;
attachment.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
attachment.initialLayout = VK_IMAGE_LAYOUT_GENERAL;
attachment.finalLayout = VK_IMAGE_LAYOUT_GENERAL;
result = radv_CreateRenderPass(device_h,
&(VkRenderPassCreateInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO,
.attachmentCount = 1,
.pAttachments = &attachment,
.subpassCount = 1,
.pSubpasses = &(VkSubpassDescription) {
.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS,
.inputAttachmentCount = 0,
.colorAttachmentCount = 1,
.pColorAttachments = (VkAttachmentReference[]) {
{
.attachment = 0,
.layout = VK_IMAGE_LAYOUT_GENERAL,
},
},
.pResolveAttachments = NULL,
.pDepthStencilAttachment = &(VkAttachmentReference) {
.attachment = VK_ATTACHMENT_UNUSED,
},
.preserveAttachmentCount = 0,
.pPreserveAttachments = NULL,
},
.dependencyCount = 0,
},
alloc,
&device->meta_state.fast_clear_flush.pass);
return result;
}
static VkResult
create_pipeline(struct radv_device *device,
VkShaderModule vs_module_h)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
struct radv_shader_module fs_module = {
.nir = build_nir_fs(),
};
if (!fs_module.nir) {
/* XXX: Need more accurate error */
result = VK_ERROR_OUT_OF_HOST_MEMORY;
goto cleanup;
}
const VkPipelineShaderStageCreateInfo stages[2] = {
{
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
.stage = VK_SHADER_STAGE_VERTEX_BIT,
.module = vs_module_h,
.pName = "main",
},
{
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
.stage = VK_SHADER_STAGE_FRAGMENT_BIT,
.module = radv_shader_module_to_handle(&fs_module),
.pName = "main",
},
};
const VkPipelineVertexInputStateCreateInfo vi_state = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO,
.vertexBindingDescriptionCount = 1,
.pVertexBindingDescriptions = (VkVertexInputBindingDescription[]) {
{
.binding = 0,
.stride = sizeof(struct vertex_attrs),
.inputRate = VK_VERTEX_INPUT_RATE_VERTEX
},
},
.vertexAttributeDescriptionCount = 1,
.pVertexAttributeDescriptions = (VkVertexInputAttributeDescription[]) {
{
/* Position */
.location = 0,
.binding = 0,
.format = VK_FORMAT_R32G32_SFLOAT,
.offset = offsetof(struct vertex_attrs, position),
},
}
};
const VkPipelineInputAssemblyStateCreateInfo ia_state = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO,
.topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP,
.primitiveRestartEnable = false,
};
const VkPipelineColorBlendStateCreateInfo blend_state = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO,
.logicOpEnable = false,
.attachmentCount = 1,
.pAttachments = (VkPipelineColorBlendAttachmentState []) {
{
.colorWriteMask = VK_COLOR_COMPONENT_R_BIT |
VK_COLOR_COMPONENT_G_BIT |
VK_COLOR_COMPONENT_B_BIT |
VK_COLOR_COMPONENT_A_BIT,
},
}
};
const VkPipelineRasterizationStateCreateInfo rs_state = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO,
.depthClampEnable = false,
.rasterizerDiscardEnable = false,
.polygonMode = VK_POLYGON_MODE_FILL,
.cullMode = VK_CULL_MODE_NONE,
.frontFace = VK_FRONT_FACE_COUNTER_CLOCKWISE,
};
result = radv_graphics_pipeline_create(device_h,
radv_pipeline_cache_to_handle(&device->meta_state.cache),
&(VkGraphicsPipelineCreateInfo) {
.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
.stageCount = 2,
.pStages = stages,
.pVertexInputState = &vi_state,
.pInputAssemblyState = &ia_state,
.pViewportState = &(VkPipelineViewportStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO,
.viewportCount = 0,
.scissorCount = 0,
},
.pRasterizationState = &rs_state,
.pMultisampleState = &(VkPipelineMultisampleStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO,
.rasterizationSamples = 1,
.sampleShadingEnable = false,
.pSampleMask = NULL,
.alphaToCoverageEnable = false,
.alphaToOneEnable = false,
},
.pColorBlendState = &blend_state,
.pDynamicState = NULL,
.renderPass = device->meta_state.fast_clear_flush.pass,
.subpass = 0,
},
&(struct radv_graphics_pipeline_create_info) {
.use_rectlist = true,
.custom_blend_mode = V_028808_CB_ELIMINATE_FAST_CLEAR,
},
&device->meta_state.alloc,
&device->meta_state.fast_clear_flush.cmask_eliminate_pipeline);
if (result != VK_SUCCESS)
goto cleanup;
result = radv_graphics_pipeline_create(device_h,
radv_pipeline_cache_to_handle(&device->meta_state.cache),
&(VkGraphicsPipelineCreateInfo) {
.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
.stageCount = 2,
.pStages = stages,
.pVertexInputState = &vi_state,
.pInputAssemblyState = &ia_state,
.pViewportState = &(VkPipelineViewportStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO,
.viewportCount = 0,
.scissorCount = 0,
},
.pRasterizationState = &rs_state,
.pMultisampleState = &(VkPipelineMultisampleStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO,
.rasterizationSamples = 1,
.sampleShadingEnable = false,
.pSampleMask = NULL,
.alphaToCoverageEnable = false,
.alphaToOneEnable = false,
},
.pColorBlendState = &blend_state,
.pDynamicState = NULL,
.renderPass = device->meta_state.fast_clear_flush.pass,
.subpass = 0,
},
&(struct radv_graphics_pipeline_create_info) {
.use_rectlist = true,
.custom_blend_mode = V_028808_CB_FMASK_DECOMPRESS,
},
&device->meta_state.alloc,
&device->meta_state.fast_clear_flush.fmask_decompress_pipeline);
if (result != VK_SUCCESS)
goto cleanup_cmask;
goto cleanup;
cleanup_cmask:
radv_DestroyPipeline(device_h, device->meta_state.fast_clear_flush.cmask_eliminate_pipeline, &device->meta_state.alloc);
cleanup:
ralloc_free(fs_module.nir);
return result;
}
void
radv_device_finish_meta_fast_clear_flush_state(struct radv_device *device)
{
struct radv_meta_state *state = &device->meta_state;
VkDevice device_h = radv_device_to_handle(device);
VkRenderPass pass_h = device->meta_state.fast_clear_flush.pass;
const VkAllocationCallbacks *alloc = &device->meta_state.alloc;
if (pass_h)
radv_DestroyRenderPass(device_h, pass_h,
&device->meta_state.alloc);
VkPipeline pipeline_h = state->fast_clear_flush.cmask_eliminate_pipeline;
if (pipeline_h) {
radv_DestroyPipeline(device_h, pipeline_h, alloc);
}
pipeline_h = state->fast_clear_flush.fmask_decompress_pipeline;
if (pipeline_h) {
radv_DestroyPipeline(device_h, pipeline_h, alloc);
}
}
VkResult
radv_device_init_meta_fast_clear_flush_state(struct radv_device *device)
{
VkResult res = VK_SUCCESS;
zero(device->meta_state.fast_clear_flush);
struct radv_shader_module vs_module = { .nir = build_nir_vs() };
if (!vs_module.nir) {
/* XXX: Need more accurate error */
res = VK_ERROR_OUT_OF_HOST_MEMORY;
goto fail;
}
res = create_pass(device);
if (res != VK_SUCCESS)
goto fail;
VkShaderModule vs_module_h = radv_shader_module_to_handle(&vs_module);
res = create_pipeline(device, vs_module_h);
if (res != VK_SUCCESS)
goto fail;
goto cleanup;
fail:
radv_device_finish_meta_fast_clear_flush_state(device);
cleanup:
ralloc_free(vs_module.nir);
return res;
}
static void
emit_fast_clear_flush(struct radv_cmd_buffer *cmd_buffer,
const VkExtent2D *resolve_extent,
bool fmask_decompress)
{
struct radv_device *device = cmd_buffer->device;
VkCommandBuffer cmd_buffer_h = radv_cmd_buffer_to_handle(cmd_buffer);
uint32_t offset;
const struct vertex_attrs vertex_data[3] = {
{
.position = {
0,
0,
},
},
{
.position = {
0,
resolve_extent->height,
},
},
{
.position = {
resolve_extent->width,
0,
},
},
};
cmd_buffer->state.flush_bits |= (RADV_CMD_FLAG_FLUSH_AND_INV_CB |
RADV_CMD_FLAG_FLUSH_AND_INV_CB_META);
radv_cmd_buffer_upload_data(cmd_buffer, sizeof(vertex_data), 16, vertex_data, &offset);
struct radv_buffer vertex_buffer = {
.device = device,
.size = sizeof(vertex_data),
.bo = cmd_buffer->upload.upload_bo,
.offset = offset,
};
VkBuffer vertex_buffer_h = radv_buffer_to_handle(&vertex_buffer);
radv_CmdBindVertexBuffers(cmd_buffer_h,
/*firstBinding*/ 0,
/*bindingCount*/ 1,
(VkBuffer[]) { vertex_buffer_h },
(VkDeviceSize[]) { 0 });
VkPipeline pipeline_h;
if (fmask_decompress)
pipeline_h = device->meta_state.fast_clear_flush.fmask_decompress_pipeline;
else
pipeline_h = device->meta_state.fast_clear_flush.cmask_eliminate_pipeline;
RADV_FROM_HANDLE(radv_pipeline, pipeline, pipeline_h);
if (cmd_buffer->state.pipeline != pipeline) {
radv_CmdBindPipeline(cmd_buffer_h, VK_PIPELINE_BIND_POINT_GRAPHICS,
pipeline_h);
}
radv_CmdDraw(cmd_buffer_h, 3, 1, 0, 0);
cmd_buffer->state.flush_bits |= (RADV_CMD_FLAG_FLUSH_AND_INV_CB |
RADV_CMD_FLAG_FLUSH_AND_INV_CB_META);
si_emit_cache_flush(cmd_buffer);
}
/**
*/
void
radv_fast_clear_flush_image_inplace(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image)
{
struct radv_meta_saved_state saved_state;
struct radv_meta_saved_pass_state saved_pass_state;
VkDevice device_h = radv_device_to_handle(cmd_buffer->device);
VkCommandBuffer cmd_buffer_h = radv_cmd_buffer_to_handle(cmd_buffer);
radv_meta_save_pass(&saved_pass_state, cmd_buffer);
radv_meta_save_graphics_reset_vport_scissor(&saved_state, cmd_buffer);
struct radv_image_view iview;
radv_image_view_init(&iview, cmd_buffer->device,
&(VkImageViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = radv_image_to_handle(image),
.format = image->vk_format,
.subresourceRange = {
.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
.baseMipLevel = 0,
.levelCount = 1,
.baseArrayLayer = 0,
.layerCount = 1,
},
},
cmd_buffer, VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT);
VkFramebuffer fb_h;
radv_CreateFramebuffer(device_h,
&(VkFramebufferCreateInfo) {
.sType = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO,
.attachmentCount = 1,
.pAttachments = (VkImageView[]) {
radv_image_view_to_handle(&iview)
},
.width = image->extent.width,
.height = image->extent.height,
.layers = 1
},
&cmd_buffer->pool->alloc,
&fb_h);
radv_CmdBeginRenderPass(cmd_buffer_h,
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = cmd_buffer->device->meta_state.fast_clear_flush.pass,
.framebuffer = fb_h,
.renderArea = {
.offset = {
0,
0,
},
.extent = {
image->extent.width,
image->extent.height,
}
},
.clearValueCount = 0,
.pClearValues = NULL,
},
VK_SUBPASS_CONTENTS_INLINE);
emit_fast_clear_flush(cmd_buffer,
&(VkExtent2D) { image->extent.width, image->extent.height },
image->fmask.size > 0);
radv_CmdEndRenderPass(cmd_buffer_h);
radv_DestroyFramebuffer(device_h, fb_h,
&cmd_buffer->pool->alloc);
radv_meta_restore(&saved_state, cmd_buffer);
radv_meta_restore_pass(&saved_pass_state, cmd_buffer);
}

View File

@@ -1,672 +0,0 @@
/*
* Copyright © 2016 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#include <assert.h>
#include <stdbool.h>
#include "radv_meta.h"
#include "radv_private.h"
#include "nir/nir_builder.h"
#include "sid.h"
/**
* Vertex attributes used by all pipelines.
*/
struct vertex_attrs {
float position[2]; /**< 3DPRIM_RECTLIST */
float tex_position[2];
};
/* passthrough vertex shader */
static nir_shader *
build_nir_vs(void)
{
const struct glsl_type *vec4 = glsl_vec4_type();
nir_builder b;
nir_variable *a_position;
nir_variable *v_position;
nir_variable *a_tex_position;
nir_variable *v_tex_position;
nir_builder_init_simple_shader(&b, NULL, MESA_SHADER_VERTEX, NULL);
b.shader->info.name = ralloc_strdup(b.shader, "meta_resolve_vs");
a_position = nir_variable_create(b.shader, nir_var_shader_in, vec4,
"a_position");
a_position->data.location = VERT_ATTRIB_GENERIC0;
v_position = nir_variable_create(b.shader, nir_var_shader_out, vec4,
"gl_Position");
v_position->data.location = VARYING_SLOT_POS;
a_tex_position = nir_variable_create(b.shader, nir_var_shader_in, vec4,
"a_tex_position");
a_tex_position->data.location = VERT_ATTRIB_GENERIC1;
v_tex_position = nir_variable_create(b.shader, nir_var_shader_out, vec4,
"v_tex_position");
v_tex_position->data.location = VARYING_SLOT_VAR0;
nir_copy_var(&b, v_position, a_position);
nir_copy_var(&b, v_tex_position, a_tex_position);
return b.shader;
}
/* simple passthrough shader */
static nir_shader *
build_nir_fs(void)
{
const struct glsl_type *vec4 = glsl_vec4_type();
nir_builder b;
nir_variable *v_tex_position; /* vec4, varying texture coordinate */
nir_variable *f_color; /* vec4, fragment output color */
nir_builder_init_simple_shader(&b, NULL, MESA_SHADER_FRAGMENT, NULL);
b.shader->info.name = ralloc_asprintf(b.shader,
"meta_resolve_fs");
v_tex_position = nir_variable_create(b.shader, nir_var_shader_in, vec4,
"v_tex_position");
v_tex_position->data.location = VARYING_SLOT_VAR0;
f_color = nir_variable_create(b.shader, nir_var_shader_out, vec4,
"f_color");
f_color->data.location = FRAG_RESULT_DATA0;
nir_copy_var(&b, f_color, v_tex_position);
return b.shader;
}
static VkResult
create_pass(struct radv_device *device)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
const VkAllocationCallbacks *alloc = &device->meta_state.alloc;
VkAttachmentDescription attachments[2];
int i;
for (i = 0; i < 2; i++) {
attachments[i].format = VK_FORMAT_UNDEFINED;
attachments[i].samples = 1;
attachments[i].loadOp = VK_ATTACHMENT_LOAD_OP_LOAD;
attachments[i].storeOp = VK_ATTACHMENT_STORE_OP_STORE;
attachments[i].initialLayout = VK_IMAGE_LAYOUT_GENERAL;
attachments[i].finalLayout = VK_IMAGE_LAYOUT_GENERAL;
}
result = radv_CreateRenderPass(device_h,
&(VkRenderPassCreateInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO,
.attachmentCount = 2,
.pAttachments = attachments,
.subpassCount = 1,
.pSubpasses = &(VkSubpassDescription) {
.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS,
.inputAttachmentCount = 0,
.colorAttachmentCount = 2,
.pColorAttachments = (VkAttachmentReference[]) {
{
.attachment = 0,
.layout = VK_IMAGE_LAYOUT_GENERAL,
},
{
.attachment = 1,
.layout = VK_IMAGE_LAYOUT_GENERAL,
},
},
.pResolveAttachments = NULL,
.pDepthStencilAttachment = &(VkAttachmentReference) {
.attachment = VK_ATTACHMENT_UNUSED,
},
.preserveAttachmentCount = 0,
.pPreserveAttachments = NULL,
},
.dependencyCount = 0,
},
alloc,
&device->meta_state.resolve.pass);
return result;
}
static VkResult
create_pipeline(struct radv_device *device,
VkShaderModule vs_module_h)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
struct radv_shader_module fs_module = {
.nir = build_nir_fs(),
};
if (!fs_module.nir) {
/* XXX: Need more accurate error */
result = VK_ERROR_OUT_OF_HOST_MEMORY;
goto cleanup;
}
result = radv_graphics_pipeline_create(device_h,
radv_pipeline_cache_to_handle(&device->meta_state.cache),
&(VkGraphicsPipelineCreateInfo) {
.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
.stageCount = 2,
.pStages = (VkPipelineShaderStageCreateInfo[]) {
{
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
.stage = VK_SHADER_STAGE_VERTEX_BIT,
.module = vs_module_h,
.pName = "main",
},
{
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
.stage = VK_SHADER_STAGE_FRAGMENT_BIT,
.module = radv_shader_module_to_handle(&fs_module),
.pName = "main",
},
},
.pVertexInputState = &(VkPipelineVertexInputStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO,
.vertexBindingDescriptionCount = 1,
.pVertexBindingDescriptions = (VkVertexInputBindingDescription[]) {
{
.binding = 0,
.stride = sizeof(struct vertex_attrs),
.inputRate = VK_VERTEX_INPUT_RATE_VERTEX
},
},
.vertexAttributeDescriptionCount = 2,
.pVertexAttributeDescriptions = (VkVertexInputAttributeDescription[]) {
{
/* Position */
.location = 0,
.binding = 0,
.format = VK_FORMAT_R32G32_SFLOAT,
.offset = offsetof(struct vertex_attrs, position),
},
{
/* Texture Coordinate */
.location = 1,
.binding = 0,
.format = VK_FORMAT_R32G32_SFLOAT,
.offset = offsetof(struct vertex_attrs, tex_position),
},
},
},
.pInputAssemblyState = &(VkPipelineInputAssemblyStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO,
.topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP,
.primitiveRestartEnable = false,
},
.pViewportState = &(VkPipelineViewportStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO,
.viewportCount = 0,
.scissorCount = 0,
},
.pRasterizationState = &(VkPipelineRasterizationStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO,
.depthClampEnable = false,
.rasterizerDiscardEnable = false,
.polygonMode = VK_POLYGON_MODE_FILL,
.cullMode = VK_CULL_MODE_NONE,
.frontFace = VK_FRONT_FACE_COUNTER_CLOCKWISE,
},
.pMultisampleState = &(VkPipelineMultisampleStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO,
.rasterizationSamples = 1,
.sampleShadingEnable = false,
.pSampleMask = NULL,
.alphaToCoverageEnable = false,
.alphaToOneEnable = false,
},
.pColorBlendState = &(VkPipelineColorBlendStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO,
.logicOpEnable = false,
.attachmentCount = 2,
.pAttachments = (VkPipelineColorBlendAttachmentState []) {
{
.colorWriteMask = VK_COLOR_COMPONENT_R_BIT |
VK_COLOR_COMPONENT_G_BIT |
VK_COLOR_COMPONENT_B_BIT |
VK_COLOR_COMPONENT_A_BIT,
},
{
.colorWriteMask = 0,
}
},
},
.pDynamicState = NULL,
.renderPass = device->meta_state.resolve.pass,
.subpass = 0,
},
&(struct radv_graphics_pipeline_create_info) {
.use_rectlist = true,
.custom_blend_mode = V_028808_CB_RESOLVE,
},
&device->meta_state.alloc,
&device->meta_state.resolve.pipeline);
if (result != VK_SUCCESS)
goto cleanup;
goto cleanup;
cleanup:
ralloc_free(fs_module.nir);
return result;
}
void
radv_device_finish_meta_resolve_state(struct radv_device *device)
{
struct radv_meta_state *state = &device->meta_state;
VkDevice device_h = radv_device_to_handle(device);
VkRenderPass pass_h = device->meta_state.resolve.pass;
const VkAllocationCallbacks *alloc = &device->meta_state.alloc;
if (pass_h)
radv_DestroyRenderPass(device_h, pass_h,
&device->meta_state.alloc);
VkPipeline pipeline_h = state->resolve.pipeline;
if (pipeline_h) {
radv_DestroyPipeline(device_h, pipeline_h, alloc);
}
}
VkResult
radv_device_init_meta_resolve_state(struct radv_device *device)
{
VkResult res = VK_SUCCESS;
zero(device->meta_state.resolve);
struct radv_shader_module vs_module = { .nir = build_nir_vs() };
if (!vs_module.nir) {
/* XXX: Need more accurate error */
res = VK_ERROR_OUT_OF_HOST_MEMORY;
goto fail;
}
res = create_pass(device);
if (res != VK_SUCCESS)
goto fail;
VkShaderModule vs_module_h = radv_shader_module_to_handle(&vs_module);
res = create_pipeline(device, vs_module_h);
if (res != VK_SUCCESS)
goto fail;
goto cleanup;
fail:
radv_device_finish_meta_resolve_state(device);
cleanup:
ralloc_free(vs_module.nir);
return res;
}
static void
emit_resolve(struct radv_cmd_buffer *cmd_buffer,
const VkOffset2D *src_offset,
const VkOffset2D *dest_offset,
const VkExtent2D *resolve_extent)
{
struct radv_device *device = cmd_buffer->device;
VkCommandBuffer cmd_buffer_h = radv_cmd_buffer_to_handle(cmd_buffer);
uint32_t offset;
const struct vertex_attrs vertex_data[3] = {
{
.position = {
dest_offset->x,
dest_offset->y,
},
.tex_position = {
src_offset->x,
src_offset->y,
},
},
{
.position = {
dest_offset->x,
dest_offset->y + resolve_extent->height,
},
.tex_position = {
src_offset->x,
src_offset->y + resolve_extent->height,
},
},
{
.position = {
dest_offset->x + resolve_extent->width,
dest_offset->y,
},
.tex_position = {
src_offset->x + resolve_extent->width,
src_offset->y,
},
},
};
cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_FLUSH_AND_INV_CB;
radv_cmd_buffer_upload_data(cmd_buffer, sizeof(vertex_data), 16, vertex_data, &offset);
struct radv_buffer vertex_buffer = {
.device = device,
.size = sizeof(vertex_data),
.bo = cmd_buffer->upload.upload_bo,
.offset = offset,
};
VkBuffer vertex_buffer_h = radv_buffer_to_handle(&vertex_buffer);
radv_CmdBindVertexBuffers(cmd_buffer_h,
/*firstBinding*/ 0,
/*bindingCount*/ 1,
(VkBuffer[]) { vertex_buffer_h },
(VkDeviceSize[]) { 0 });
VkPipeline pipeline_h = device->meta_state.resolve.pipeline;
RADV_FROM_HANDLE(radv_pipeline, pipeline, pipeline_h);
if (cmd_buffer->state.pipeline != pipeline) {
radv_CmdBindPipeline(cmd_buffer_h, VK_PIPELINE_BIND_POINT_GRAPHICS,
pipeline_h);
}
radv_CmdDraw(cmd_buffer_h, 3, 1, 0, 0);
cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_FLUSH_AND_INV_CB;
si_emit_cache_flush(cmd_buffer);
}
void radv_CmdResolveImage(
VkCommandBuffer cmd_buffer_h,
VkImage src_image_h,
VkImageLayout src_image_layout,
VkImage dest_image_h,
VkImageLayout dest_image_layout,
uint32_t region_count,
const VkImageResolve* regions)
{
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, cmd_buffer_h);
RADV_FROM_HANDLE(radv_image, src_image, src_image_h);
RADV_FROM_HANDLE(radv_image, dest_image, dest_image_h);
struct radv_device *device = cmd_buffer->device;
struct radv_meta_saved_state saved_state;
VkDevice device_h = radv_device_to_handle(device);
bool use_compute_resolve = false;
/* we can use the hw resolve only for single full resolves */
if (region_count == 1) {
if (regions[0].srcOffset.x ||
regions[0].srcOffset.y ||
regions[0].srcOffset.z)
use_compute_resolve = true;
if (regions[0].dstOffset.x ||
regions[0].dstOffset.y ||
regions[0].dstOffset.z)
use_compute_resolve = true;
if (regions[0].extent.width != src_image->extent.width ||
regions[0].extent.height != src_image->extent.height ||
regions[0].extent.depth != src_image->extent.depth)
use_compute_resolve = true;
} else
use_compute_resolve = true;
if (use_compute_resolve) {
radv_fast_clear_flush_image_inplace(cmd_buffer, src_image);
radv_meta_resolve_compute_image(cmd_buffer,
src_image,
src_image_layout,
dest_image,
dest_image_layout,
region_count, regions);
return;
}
radv_meta_save_graphics_reset_vport_scissor(&saved_state, cmd_buffer);
assert(src_image->samples > 1);
assert(dest_image->samples == 1);
if (src_image->samples >= 16) {
/* See commit aa3f9aaf31e9056a255f9e0472ebdfdaa60abe54 for the
* glBlitFramebuffer workaround for samples >= 16.
*/
radv_finishme("vkCmdResolveImage: need interpolation workaround when "
"samples >= 16");
}
if (src_image->array_size > 1)
radv_finishme("vkCmdResolveImage: multisample array images");
for (uint32_t r = 0; r < region_count; ++r) {
const VkImageResolve *region = &regions[r];
/* From the Vulkan 1.0 spec:
*
* - The aspectMask member of srcSubresource and dstSubresource must
* only contain VK_IMAGE_ASPECT_COLOR_BIT
*
* - The layerCount member of srcSubresource and dstSubresource must
* match
*/
assert(region->srcSubresource.aspectMask == VK_IMAGE_ASPECT_COLOR_BIT);
assert(region->dstSubresource.aspectMask == VK_IMAGE_ASPECT_COLOR_BIT);
assert(region->srcSubresource.layerCount ==
region->dstSubresource.layerCount);
const uint32_t src_base_layer =
radv_meta_get_iview_layer(src_image, &region->srcSubresource,
&region->srcOffset);
const uint32_t dest_base_layer =
radv_meta_get_iview_layer(dest_image, &region->dstSubresource,
&region->dstOffset);
/**
* From Vulkan 1.0.6 spec: 18.6 Resolving Multisample Images
*
* extent is the size in texels of the source image to resolve in width,
* height and depth. 1D images use only x and width. 2D images use x, y,
* width and height. 3D images use x, y, z, width, height and depth.
*
* srcOffset and dstOffset select the initial x, y, and z offsets in
* texels of the sub-regions of the source and destination image data.
* extent is the size in texels of the source image to resolve in width,
* height and depth. 1D images use only x and width. 2D images use x, y,
* width and height. 3D images use x, y, z, width, height and depth.
*/
const struct VkExtent3D extent =
radv_sanitize_image_extent(src_image->type, region->extent);
const struct VkOffset3D srcOffset =
radv_sanitize_image_offset(src_image->type, region->srcOffset);
const struct VkOffset3D dstOffset =
radv_sanitize_image_offset(dest_image->type, region->dstOffset);
for (uint32_t layer = 0; layer < region->srcSubresource.layerCount;
++layer) {
struct radv_image_view src_iview;
radv_image_view_init(&src_iview, cmd_buffer->device,
&(VkImageViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = src_image_h,
.viewType = radv_meta_get_view_type(src_image),
.format = src_image->vk_format,
.subresourceRange = {
.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
.baseMipLevel = region->srcSubresource.mipLevel,
.levelCount = 1,
.baseArrayLayer = src_base_layer + layer,
.layerCount = 1,
},
},
cmd_buffer, VK_IMAGE_USAGE_SAMPLED_BIT);
struct radv_image_view dest_iview;
radv_image_view_init(&dest_iview, cmd_buffer->device,
&(VkImageViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = dest_image_h,
.viewType = radv_meta_get_view_type(dest_image),
.format = dest_image->vk_format,
.subresourceRange = {
.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
.baseMipLevel = region->dstSubresource.mipLevel,
.levelCount = 1,
.baseArrayLayer = dest_base_layer + layer,
.layerCount = 1,
},
},
cmd_buffer, VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT);
VkFramebuffer fb_h;
radv_CreateFramebuffer(device_h,
&(VkFramebufferCreateInfo) {
.sType = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO,
.attachmentCount = 2,
.pAttachments = (VkImageView[]) {
radv_image_view_to_handle(&src_iview),
radv_image_view_to_handle(&dest_iview),
},
.width = radv_minify(dest_image->extent.width,
region->dstSubresource.mipLevel),
.height = radv_minify(dest_image->extent.height,
region->dstSubresource.mipLevel),
.layers = 1
},
&cmd_buffer->pool->alloc,
&fb_h);
radv_CmdBeginRenderPass(cmd_buffer_h,
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.resolve.pass,
.framebuffer = fb_h,
.renderArea = {
.offset = {
dstOffset.x,
dstOffset.y,
},
.extent = {
extent.width,
extent.height,
}
},
.clearValueCount = 0,
.pClearValues = NULL,
},
VK_SUBPASS_CONTENTS_INLINE);
emit_resolve(cmd_buffer,
&(VkOffset2D) {
.x = srcOffset.x,
.y = srcOffset.y,
},
&(VkOffset2D) {
.x = dstOffset.x,
.y = dstOffset.y,
},
&(VkExtent2D) {
.width = extent.width,
.height = extent.height,
});
radv_CmdEndRenderPass(cmd_buffer_h);
radv_DestroyFramebuffer(device_h, fb_h,
&cmd_buffer->pool->alloc);
}
}
radv_meta_restore(&saved_state, cmd_buffer);
}
/**
* Emit any needed resolves for the current subpass.
*/
void
radv_cmd_buffer_resolve_subpass(struct radv_cmd_buffer *cmd_buffer)
{
struct radv_framebuffer *fb = cmd_buffer->state.framebuffer;
const struct radv_subpass *subpass = cmd_buffer->state.subpass;
struct radv_meta_saved_state saved_state;
/* FINISHME(perf): Skip clears for resolve attachments.
*
* From the Vulkan 1.0 spec:
*
* If the first use of an attachment in a render pass is as a resolve
* attachment, then the loadOp is effectively ignored as the resolve is
* guaranteed to overwrite all pixels in the render area.
*/
if (!subpass->has_resolve)
return;
radv_meta_save_graphics_reset_vport_scissor(&saved_state, cmd_buffer);
for (uint32_t i = 0; i < subpass->color_count; ++i) {
VkAttachmentReference src_att = subpass->color_attachments[i];
VkAttachmentReference dest_att = subpass->resolve_attachments[i];
struct radv_image *dst_img = cmd_buffer->state.framebuffer->attachments[dest_att.attachment].attachment->image;
if (dest_att.attachment == VK_ATTACHMENT_UNUSED)
continue;
if (dst_img->surface.dcc_size) {
radv_initialize_dcc(cmd_buffer, dst_img, 0xffffffff);
cmd_buffer->state.attachments[dest_att.attachment].current_layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
}
struct radv_subpass resolve_subpass = {
.color_count = 2,
.color_attachments = (VkAttachmentReference[]) { src_att, dest_att },
.depth_stencil_attachment = { .attachment = VK_ATTACHMENT_UNUSED },
};
radv_cmd_buffer_set_subpass(cmd_buffer, &resolve_subpass, false);
/* Subpass resolves must respect the render area. We can ignore the
* render area here because vkCmdBeginRenderPass set the render area
* with 3DSTATE_DRAWING_RECTANGLE.
*
* XXX(chadv): Does the hardware really respect
* 3DSTATE_DRAWING_RECTANGLE when draing a 3DPRIM_RECTLIST?
*/
emit_resolve(cmd_buffer,
&(VkOffset2D) { 0, 0 },
&(VkOffset2D) { 0, 0 },
&(VkExtent2D) { fb->width, fb->height });
}
cmd_buffer->state.subpass = subpass;
radv_meta_restore(&saved_state, cmd_buffer);
}

View File

@@ -1,461 +0,0 @@
/*
* Copyright © 2016 Dave Airlie
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#include <assert.h>
#include <stdbool.h>
#include "radv_meta.h"
#include "radv_private.h"
#include "nir/nir_builder.h"
#include "sid.h"
#include "vk_format.h"
static nir_shader *
build_resolve_compute_shader(struct radv_device *dev, bool is_integer, int samples)
{
nir_builder b;
char name[64];
nir_if *outer_if = NULL;
const struct glsl_type *sampler_type = glsl_sampler_type(GLSL_SAMPLER_DIM_MS,
false,
false,
GLSL_TYPE_FLOAT);
const struct glsl_type *img_type = glsl_sampler_type(GLSL_SAMPLER_DIM_2D,
false,
false,
GLSL_TYPE_FLOAT);
snprintf(name, 64, "meta_resolve_cs-%d-%s", samples, is_integer ? "int" : "float");
nir_builder_init_simple_shader(&b, NULL, MESA_SHADER_COMPUTE, NULL);
b.shader->info.name = ralloc_strdup(b.shader, name);
b.shader->info.cs.local_size[0] = 16;
b.shader->info.cs.local_size[1] = 16;
b.shader->info.cs.local_size[2] = 1;
nir_variable *input_img = nir_variable_create(b.shader, nir_var_uniform,
sampler_type, "s_tex");
input_img->data.descriptor_set = 0;
input_img->data.binding = 0;
nir_variable *output_img = nir_variable_create(b.shader, nir_var_uniform,
img_type, "out_img");
output_img->data.descriptor_set = 0;
output_img->data.binding = 1;
nir_ssa_def *invoc_id = nir_load_system_value(&b, nir_intrinsic_load_local_invocation_id, 0);
nir_ssa_def *wg_id = nir_load_system_value(&b, nir_intrinsic_load_work_group_id, 0);
nir_ssa_def *block_size = nir_imm_ivec4(&b,
b.shader->info.cs.local_size[0],
b.shader->info.cs.local_size[1],
b.shader->info.cs.local_size[2], 0);
nir_ssa_def *global_id = nir_iadd(&b, nir_imul(&b, wg_id, block_size), invoc_id);
nir_intrinsic_instr *src_offset = nir_intrinsic_instr_create(b.shader, nir_intrinsic_load_push_constant);
src_offset->src[0] = nir_src_for_ssa(nir_imm_int(&b, 0));
src_offset->num_components = 2;
nir_ssa_dest_init(&src_offset->instr, &src_offset->dest, 2, 32, "src_offset");
nir_builder_instr_insert(&b, &src_offset->instr);
nir_intrinsic_instr *dst_offset = nir_intrinsic_instr_create(b.shader, nir_intrinsic_load_push_constant);
dst_offset->src[0] = nir_src_for_ssa(nir_imm_int(&b, 8));
dst_offset->num_components = 2;
nir_ssa_dest_init(&dst_offset->instr, &dst_offset->dest, 2, 32, "dst_offset");
nir_builder_instr_insert(&b, &dst_offset->instr);
nir_ssa_def *img_coord = nir_iadd(&b, global_id, &src_offset->dest.ssa);
/* do a txf_ms on each sample */
nir_ssa_def *tmp;
nir_tex_instr *tex = nir_tex_instr_create(b.shader, 2);
tex->sampler_dim = GLSL_SAMPLER_DIM_MS;
tex->op = nir_texop_txf_ms;
tex->src[0].src_type = nir_tex_src_coord;
tex->src[0].src = nir_src_for_ssa(img_coord);
tex->src[1].src_type = nir_tex_src_ms_index;
tex->src[1].src = nir_src_for_ssa(nir_imm_int(&b, 0));
tex->dest_type = nir_type_float;
tex->is_array = false;
tex->coord_components = 2;
tex->texture = nir_deref_var_create(tex, input_img);
tex->sampler = NULL;
nir_ssa_dest_init(&tex->instr, &tex->dest, 4, 32, "tex");
nir_builder_instr_insert(&b, &tex->instr);
tmp = &tex->dest.ssa;
nir_variable *color =
nir_local_variable_create(b.impl, glsl_vec4_type(), "color");
if (!is_integer && samples > 1) {
nir_tex_instr *tex_all_same = nir_tex_instr_create(b.shader, 1);
tex_all_same->sampler_dim = GLSL_SAMPLER_DIM_MS;
tex_all_same->op = nir_texop_samples_identical;
tex_all_same->src[0].src_type = nir_tex_src_coord;
tex_all_same->src[0].src = nir_src_for_ssa(img_coord);
tex_all_same->dest_type = nir_type_float;
tex_all_same->is_array = false;
tex_all_same->coord_components = 2;
tex_all_same->texture = nir_deref_var_create(tex_all_same, input_img);
tex_all_same->sampler = NULL;
nir_ssa_dest_init(&tex_all_same->instr, &tex_all_same->dest, 1, 32, "tex");
nir_builder_instr_insert(&b, &tex_all_same->instr);
nir_ssa_def *all_same = nir_ine(&b, &tex_all_same->dest.ssa, nir_imm_int(&b, 0));
nir_if *if_stmt = nir_if_create(b.shader);
if_stmt->condition = nir_src_for_ssa(all_same);
nir_cf_node_insert(b.cursor, &if_stmt->cf_node);
b.cursor = nir_after_cf_list(&if_stmt->then_list);
for (int i = 1; i < samples; i++) {
nir_tex_instr *tex_add = nir_tex_instr_create(b.shader, 2);
tex_add->sampler_dim = GLSL_SAMPLER_DIM_MS;
tex_add->op = nir_texop_txf_ms;
tex_add->src[0].src_type = nir_tex_src_coord;
tex_add->src[0].src = nir_src_for_ssa(img_coord);
tex_add->src[1].src_type = nir_tex_src_ms_index;
tex_add->src[1].src = nir_src_for_ssa(nir_imm_int(&b, i));
tex_add->dest_type = nir_type_float;
tex_add->is_array = false;
tex_add->coord_components = 2;
tex_add->texture = nir_deref_var_create(tex_add, input_img);
tex_add->sampler = NULL;
nir_ssa_dest_init(&tex_add->instr, &tex_add->dest, 4, 32, "tex");
nir_builder_instr_insert(&b, &tex_add->instr);
tmp = nir_fadd(&b, tmp, &tex_add->dest.ssa);
}
tmp = nir_fdiv(&b, tmp, nir_imm_float(&b, samples));
nir_store_var(&b, color, tmp, 0xf);
b.cursor = nir_after_cf_list(&if_stmt->else_list);
outer_if = if_stmt;
}
nir_store_var(&b, color, &tex->dest.ssa, 0xf);
if (outer_if)
b.cursor = nir_after_cf_node(&outer_if->cf_node);
nir_ssa_def *newv = nir_load_var(&b, color);
nir_ssa_def *coord = nir_iadd(&b, global_id, &dst_offset->dest.ssa);
nir_intrinsic_instr *store = nir_intrinsic_instr_create(b.shader, nir_intrinsic_image_store);
store->src[0] = nir_src_for_ssa(coord);
store->src[1] = nir_src_for_ssa(nir_ssa_undef(&b, 1, 32));
store->src[2] = nir_src_for_ssa(newv);
store->variables[0] = nir_deref_var_create(store, output_img);
nir_builder_instr_insert(&b, &store->instr);
return b.shader;
}
static VkResult
create_layout(struct radv_device *device)
{
VkResult result;
/*
* two descriptors one for the image being sampled
* one for the buffer being written.
*/
VkDescriptorSetLayoutCreateInfo ds_create_info = {
.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO,
.bindingCount = 2,
.pBindings = (VkDescriptorSetLayoutBinding[]) {
{
.binding = 0,
.descriptorType = VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE,
.descriptorCount = 1,
.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT,
.pImmutableSamplers = NULL
},
{
.binding = 1,
.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_IMAGE,
.descriptorCount = 1,
.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT,
.pImmutableSamplers = NULL
},
}
};
result = radv_CreateDescriptorSetLayout(radv_device_to_handle(device),
&ds_create_info,
&device->meta_state.alloc,
&device->meta_state.resolve_compute.ds_layout);
if (result != VK_SUCCESS)
goto fail;
VkPipelineLayoutCreateInfo pl_create_info = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
.setLayoutCount = 1,
.pSetLayouts = &device->meta_state.resolve_compute.ds_layout,
.pushConstantRangeCount = 1,
.pPushConstantRanges = &(VkPushConstantRange){VK_SHADER_STAGE_COMPUTE_BIT, 0, 16},
};
result = radv_CreatePipelineLayout(radv_device_to_handle(device),
&pl_create_info,
&device->meta_state.alloc,
&device->meta_state.resolve_compute.p_layout);
if (result != VK_SUCCESS)
goto fail;
return VK_SUCCESS;
fail:
return result;
}
static VkResult
create_resolve_pipeline(struct radv_device *device,
int samples,
bool is_integer,
VkPipeline *pipeline)
{
VkResult result;
struct radv_shader_module cs = { .nir = NULL };
cs.nir = build_resolve_compute_shader(device, is_integer, samples);
/* compute shader */
VkPipelineShaderStageCreateInfo pipeline_shader_stage = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
.stage = VK_SHADER_STAGE_COMPUTE_BIT,
.module = radv_shader_module_to_handle(&cs),
.pName = "main",
.pSpecializationInfo = NULL,
};
VkComputePipelineCreateInfo vk_pipeline_info = {
.sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO,
.stage = pipeline_shader_stage,
.flags = 0,
.layout = device->meta_state.resolve_compute.p_layout,
};
result = radv_CreateComputePipelines(radv_device_to_handle(device),
radv_pipeline_cache_to_handle(&device->meta_state.cache),
1, &vk_pipeline_info, NULL,
pipeline);
if (result != VK_SUCCESS)
goto fail;
ralloc_free(cs.nir);
return VK_SUCCESS;
fail:
ralloc_free(cs.nir);
return result;
}
VkResult
radv_device_init_meta_resolve_compute_state(struct radv_device *device)
{
struct radv_meta_state *state = &device->meta_state;
VkResult res;
memset(&device->meta_state.resolve_compute, 0, sizeof(device->meta_state.resolve_compute));
res = create_layout(device);
if (res != VK_SUCCESS)
return res;
for (uint32_t i = 0; i < MAX_SAMPLES_LOG2; ++i) {
uint32_t samples = 1 << i;
res = create_resolve_pipeline(device, samples, false,
&state->resolve_compute.rc[i].pipeline);
res = create_resolve_pipeline(device, samples, true,
&state->resolve_compute.rc[i].i_pipeline);
}
return res;
}
void
radv_device_finish_meta_resolve_compute_state(struct radv_device *device)
{
struct radv_meta_state *state = &device->meta_state;
for (uint32_t i = 0; i < MAX_SAMPLES_LOG2; ++i) {
radv_DestroyPipeline(radv_device_to_handle(device),
state->resolve_compute.rc[i].pipeline,
&state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->resolve_compute.rc[i].i_pipeline,
&state->alloc);
}
radv_DestroyDescriptorSetLayout(radv_device_to_handle(device),
state->resolve_compute.ds_layout,
&state->alloc);
radv_DestroyPipelineLayout(radv_device_to_handle(device),
state->resolve_compute.p_layout,
&state->alloc);
}
void radv_meta_resolve_compute_image(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *src_image,
VkImageLayout src_image_layout,
struct radv_image *dest_image,
VkImageLayout dest_image_layout,
uint32_t region_count,
const VkImageResolve *regions)
{
struct radv_device *device = cmd_buffer->device;
struct radv_meta_saved_compute_state saved_state;
const uint32_t samples = src_image->samples;
const uint32_t samples_log2 = ffs(samples) - 1;
radv_meta_save_compute(&saved_state, cmd_buffer, 16);
for (uint32_t r = 0; r < region_count; ++r) {
const VkImageResolve *region = &regions[r];
assert(region->srcSubresource.aspectMask == VK_IMAGE_ASPECT_COLOR_BIT);
assert(region->dstSubresource.aspectMask == VK_IMAGE_ASPECT_COLOR_BIT);
assert(region->srcSubresource.layerCount == region->dstSubresource.layerCount);
const uint32_t src_base_layer =
radv_meta_get_iview_layer(src_image, &region->srcSubresource,
&region->srcOffset);
const uint32_t dest_base_layer =
radv_meta_get_iview_layer(dest_image, &region->dstSubresource,
&region->dstOffset);
const struct VkExtent3D extent =
radv_sanitize_image_extent(src_image->type, region->extent);
const struct VkOffset3D srcOffset =
radv_sanitize_image_offset(src_image->type, region->srcOffset);
const struct VkOffset3D dstOffset =
radv_sanitize_image_offset(dest_image->type, region->dstOffset);
for (uint32_t layer = 0; layer < region->srcSubresource.layerCount;
++layer) {
struct radv_image_view src_iview;
VkDescriptorSet set;
radv_image_view_init(&src_iview, cmd_buffer->device,
&(VkImageViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = radv_image_to_handle(src_image),
.viewType = radv_meta_get_view_type(src_image),
.format = src_image->vk_format,
.subresourceRange = {
.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
.baseMipLevel = region->srcSubresource.mipLevel,
.levelCount = 1,
.baseArrayLayer = src_base_layer + layer,
.layerCount = 1,
},
},
cmd_buffer, VK_IMAGE_USAGE_SAMPLED_BIT);
struct radv_image_view dest_iview;
radv_image_view_init(&dest_iview, cmd_buffer->device,
&(VkImageViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = radv_image_to_handle(dest_image),
.viewType = radv_meta_get_view_type(dest_image),
.format = dest_image->vk_format,
.subresourceRange = {
.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
.baseMipLevel = region->dstSubresource.mipLevel,
.levelCount = 1,
.baseArrayLayer = dest_base_layer + layer,
.layerCount = 1,
},
},
cmd_buffer, VK_IMAGE_USAGE_STORAGE_BIT);
radv_temp_descriptor_set_create(device, cmd_buffer,
device->meta_state.resolve_compute.ds_layout,
&set);
radv_UpdateDescriptorSets(radv_device_to_handle(device),
2, /* writeCount */
(VkWriteDescriptorSet[]) {
{
.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
.dstSet = set,
.dstBinding = 0,
.dstArrayElement = 0,
.descriptorCount = 1,
.descriptorType = VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE,
.pImageInfo = (VkDescriptorImageInfo[]) {
{
.sampler = NULL,
.imageView = radv_image_view_to_handle(&src_iview),
.imageLayout = VK_IMAGE_LAYOUT_GENERAL,
},
}
},
{
.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
.dstSet = set,
.dstBinding = 1,
.dstArrayElement = 0,
.descriptorCount = 1,
.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_IMAGE,
.pImageInfo = (VkDescriptorImageInfo[]) {
{
.sampler = NULL,
.imageView = radv_image_view_to_handle(&dest_iview),
.imageLayout = VK_IMAGE_LAYOUT_GENERAL,
},
}
}
}, 0, NULL);
radv_CmdBindDescriptorSets(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_COMPUTE,
device->meta_state.resolve_compute.p_layout, 0, 1,
&set, 0, NULL);
VkPipeline pipeline;
if (vk_format_is_int(src_image->vk_format))
pipeline = device->meta_state.resolve_compute.rc[samples_log2].i_pipeline;
else
pipeline = device->meta_state.resolve_compute.rc[samples_log2].pipeline;
if (cmd_buffer->state.compute_pipeline != radv_pipeline_from_handle(pipeline)) {
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);
}
unsigned push_constants[4] = {
srcOffset.x,
srcOffset.y,
dstOffset.x,
dstOffset.y,
};
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.resolve_compute.p_layout,
VK_SHADER_STAGE_COMPUTE_BIT, 0, 16,
push_constants);
radv_unaligned_dispatch(cmd_buffer, extent.width, extent.height, 1);
radv_temp_descriptor_set_destroy(cmd_buffer->device, set);
}
}
radv_meta_restore_compute(&saved_state, cmd_buffer, 16);
}

View File

@@ -1,183 +0,0 @@
/*
* Copyright © 2016 Red Hat.
* Copyright © 2016 Bas Nieuwenhuizen
*
* based in part on anv driver which is:
* Copyright © 2015 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#include "radv_private.h"
VkResult radv_CreateRenderPass(
VkDevice _device,
const VkRenderPassCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkRenderPass* pRenderPass)
{
RADV_FROM_HANDLE(radv_device, device, _device);
struct radv_render_pass *pass;
size_t size;
size_t attachments_offset;
assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO);
size = sizeof(*pass);
size += pCreateInfo->subpassCount * sizeof(pass->subpasses[0]);
attachments_offset = size;
size += pCreateInfo->attachmentCount * sizeof(pass->attachments[0]);
pass = vk_alloc2(&device->alloc, pAllocator, size, 8,
VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (pass == NULL)
return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
memset(pass, 0, size);
pass->attachment_count = pCreateInfo->attachmentCount;
pass->subpass_count = pCreateInfo->subpassCount;
pass->attachments = (void *) pass + attachments_offset;
for (uint32_t i = 0; i < pCreateInfo->attachmentCount; i++) {
struct radv_render_pass_attachment *att = &pass->attachments[i];
att->format = pCreateInfo->pAttachments[i].format;
att->samples = pCreateInfo->pAttachments[i].samples;
att->load_op = pCreateInfo->pAttachments[i].loadOp;
att->stencil_load_op = pCreateInfo->pAttachments[i].stencilLoadOp;
att->initial_layout = pCreateInfo->pAttachments[i].initialLayout;
att->final_layout = pCreateInfo->pAttachments[i].finalLayout;
// att->store_op = pCreateInfo->pAttachments[i].storeOp;
// att->stencil_store_op = pCreateInfo->pAttachments[i].stencilStoreOp;
}
uint32_t subpass_attachment_count = 0;
VkAttachmentReference *p;
for (uint32_t i = 0; i < pCreateInfo->subpassCount; i++) {
const VkSubpassDescription *desc = &pCreateInfo->pSubpasses[i];
subpass_attachment_count +=
desc->inputAttachmentCount +
desc->colorAttachmentCount +
/* Count colorAttachmentCount again for resolve_attachments */
desc->colorAttachmentCount;
}
if (subpass_attachment_count) {
pass->subpass_attachments =
vk_alloc2(&device->alloc, pAllocator,
subpass_attachment_count * sizeof(VkAttachmentReference), 8,
VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (pass->subpass_attachments == NULL) {
vk_free2(&device->alloc, pAllocator, pass);
return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
}
} else
pass->subpass_attachments = NULL;
p = pass->subpass_attachments;
for (uint32_t i = 0; i < pCreateInfo->subpassCount; i++) {
const VkSubpassDescription *desc = &pCreateInfo->pSubpasses[i];
struct radv_subpass *subpass = &pass->subpasses[i];
subpass->input_count = desc->inputAttachmentCount;
subpass->color_count = desc->colorAttachmentCount;
if (desc->inputAttachmentCount > 0) {
subpass->input_attachments = p;
p += desc->inputAttachmentCount;
for (uint32_t j = 0; j < desc->inputAttachmentCount; j++) {
subpass->input_attachments[j]
= desc->pInputAttachments[j];
}
}
if (desc->colorAttachmentCount > 0) {
subpass->color_attachments = p;
p += desc->colorAttachmentCount;
for (uint32_t j = 0; j < desc->colorAttachmentCount; j++) {
subpass->color_attachments[j]
= desc->pColorAttachments[j];
}
}
subpass->has_resolve = false;
if (desc->pResolveAttachments) {
subpass->resolve_attachments = p;
p += desc->colorAttachmentCount;
for (uint32_t j = 0; j < desc->colorAttachmentCount; j++) {
uint32_t a = desc->pResolveAttachments[j].attachment;
subpass->resolve_attachments[j]
= desc->pResolveAttachments[j];
if (a != VK_ATTACHMENT_UNUSED)
subpass->has_resolve = true;
}
}
if (desc->pDepthStencilAttachment) {
subpass->depth_stencil_attachment =
*desc->pDepthStencilAttachment;
} else {
subpass->depth_stencil_attachment.attachment = VK_ATTACHMENT_UNUSED;
}
}
for (unsigned i = 0; i < pCreateInfo->dependencyCount; ++i) {
uint32_t dst = pCreateInfo->pDependencies[i].dstSubpass;
if (dst == VK_SUBPASS_EXTERNAL) {
pass->end_barrier.src_stage_mask = pCreateInfo->pDependencies[i].srcStageMask;
pass->end_barrier.src_access_mask = pCreateInfo->pDependencies[i].srcAccessMask;
pass->end_barrier.dst_access_mask = pCreateInfo->pDependencies[i].dstAccessMask;
} else {
pass->subpasses[dst].start_barrier.src_stage_mask = pCreateInfo->pDependencies[i].srcStageMask;
pass->subpasses[dst].start_barrier.src_access_mask = pCreateInfo->pDependencies[i].srcAccessMask;
pass->subpasses[dst].start_barrier.dst_access_mask = pCreateInfo->pDependencies[i].dstAccessMask;
}
}
*pRenderPass = radv_render_pass_to_handle(pass);
return VK_SUCCESS;
}
void radv_DestroyRenderPass(
VkDevice _device,
VkRenderPass _pass,
const VkAllocationCallbacks* pAllocator)
{
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(radv_render_pass, pass, _pass);
if (!_pass)
return;
vk_free2(&device->alloc, pAllocator, pass->subpass_attachments);
vk_free2(&device->alloc, pAllocator, pass);
}
void radv_GetRenderAreaGranularity(
VkDevice device,
VkRenderPass renderPass,
VkExtent2D* pGranularity)
{
pGranularity->width = 1;
pGranularity->height = 1;
}

File diff suppressed because it is too large Load Diff

Some files were not shown because too many files have changed in this diff Show More