Compare commits

32 Commits

Author SHA1 Message Date
Emil Velikov
01579a9d00 docs: add release notes for 12.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-05 15:31:47 +00:00
Emil Velikov
cd9a116558 Update version to 12.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-05 15:25:21 +00:00
Marek Olšák
4a5cce8bd5 radeonsi: silence runtime warnings with LLVM 3.9
Such as:
Warning: LLVM emitted unknown config register: 0x4

This is a non-intrusive back port of commit 0f7a6ea5e7.
2016-12-05 13:15:35 +00:00
Marek Olšák
b4c28b1755 radeonsi: disable RB+ blend optimizations for dual source blending
This fixes dual source blending on Stoney. The fix was copied from Vulkan.
The problem was discovered during internal testing.

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 5e5573b1bf)
2016-12-05 13:13:11 +00:00
Marek Olšák
4f71f93878 radeonsi: set CB_BLEND1_CONTROL.ENABLE for dual source blending
copied from Vulkan

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit ff50c44a5f)
2016-12-05 13:12:21 +00:00
Marek Olšák
a9e5a98c19 radeonsi: always set all blend registers
better safe than sorry

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 87b208a54e)

Conflicts:
	src/gallium/drivers/radeonsi/si_state.c
2016-12-05 13:11:05 +00:00
Nanley Chery
c1cb184488 mesa/fbobject: Update CubeMapFace when reusing textures
Framebuffer attachments can be specified through FramebufferTexture*
calls. Upon specifying a depth (or stencil) framebuffer attachment that
internally reuses a texture, the cube map face of the new attachment
would not be updated (defaulting to TEXTURE_CUBE_MAP_POSITIVE_X).
Fix this issue by actually updating the CubeMapFace field.

This bug manifested itself in BindFramebuffer calls performed on
framebuffers whose stencil attachments internally reused a depth
texture.  When binding a framebuffer, we walk through the framebuffer's
attachments and update each one's corresponding gl_renderbuffer. Since
the framebuffer's depth and stencil attachments may share a
gl_renderbuffer and the walk visits the stencil attachment after
the depth attachment, the uninitialized CubeMapFace forced rendering
to TEXTURE_CUBE_MAP_POSITIVE_X.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77662
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 63318d34ac)
2016-12-02 21:23:10 +00:00
Marek Olšák
e3ef7da79c gallium/radeon: add support for sharing textures with DCC between processes
v2: use a function for calculating WORD1 of bo metadata

[Lyude]
On Fedora 24 and 25, I ended up noticing some rather nasty graphical
glitches on my desktop (using an R9 380 w/ amdgpu, Mesa version 12.0.4)
while I was in Wayland where the content of windows was garbled, as seen
here:

https://people.freedesktop.org/~lyudess/archive/11-30-2017/amdgpu-fix-example.png

After doing some reverse bisecting with Mesa v13, I ended up tracking
down the fix to this patch, which seems to fix the problem entirely on
all of the systems I've tested.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Lyude <lyude@redhat.com>
CC: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 095803a37a)
2016-12-02 19:57:55 +00:00
Matt Turner
9666f75b1b anv: Replace "abi_versions" with correct "api_version".
git history shows "abi_versions" was used from the outset.

Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98415
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 07755237d3)
2016-12-02 19:57:55 +00:00
Marek Olšák
0afbb9d052 radeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the kernel allows it
Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit b425b57d1e)
2016-12-02 19:57:55 +00:00
Marek Olšák
bd114e6be6 radeonsi: fix a crash in imageSize for cubemap arrays
Sometimes it was f32, other times it was i32. Now it's always i32.

This fixes:
GL45-CTS.texture_cube_map_array.image_texture_size.texture_size_compute_sh

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 3e756f09d4)
2016-12-02 19:57:55 +00:00
Marek Olšák
29bac28a04 radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader
This fixes:
GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 03708deed2)
2016-12-02 19:57:55 +00:00
Marek Olšák
31aa3c014b gallium/radeon: set VPORT_ZMIN/MAX registers correctly
Calculate depth ranges from viewport states and
pipe_rasterizer_state::clip_halfz.
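
The depth range calculation described above can be sketched roughly as follows. This is an illustrative sketch only, not the driver code: the struct viewport_z type and the viewport_depth_range() helper are hypothetical names, and the sketch assumes the usual convention that clip-space Z spans [0, 1] when clip_halfz is set and [-1, 1] otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the Z components of a viewport transform. */
struct viewport_z {
    float scale;
    float translate;
};

/* Compute the window-space depth range covered by a viewport.
 * With clip_halfz, clip-space Z spans [0, 1]; otherwise it spans [-1, 1]. */
static void
viewport_depth_range(const struct viewport_z *vp, bool clip_halfz,
                     float *zmin, float *zmax)
{
    assert(vp != NULL && zmin != NULL && zmax != NULL);

    float near_z = clip_halfz ? vp->translate : vp->translate - vp->scale;
    float far_z = vp->translate + vp->scale;

    /* A negative scale reverses the range, so order the two ends. */
    *zmin = near_z < far_z ? near_z : far_z;
    *zmax = near_z < far_z ? far_z : near_z;
}
```

With scale = translate = 0.5 and clip_halfz unset, this yields the default [0, 1] range; the resulting pair is what would be programmed into the ZMIN/ZMAX registers.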

The evergreend.h change is required to silence a warning.

This fixes this recently updated piglit: arb_depth_clamp/depth-clamp-range

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 687c4be9cf)
2016-12-02 19:57:55 +00:00
Marek Olšák
b65a812d60 gallium/radeon: unify viewport emission code
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 8b0507672e)
2016-12-02 19:57:54 +00:00
Haixia Shi
5dd6e23ad8 mesa: change state query return value for RGB565
GL_BGR and GL_UNSIGNED_SHORT_5_6_5_REV are not defined anywhere in the
OpenGL ES 3.2 (or earlier) specifications, and there are no known extensions
in the Khronos registry that would add these enums as valid responses for
glGetIntegerv(GL_IMPLEMENTATION_COLOR_READ_TYPE) and
glGetIntegerv(GL_IMPLEMENTATION_COLOR_READ_FORMAT) queries.

Note that this patch does not change the bit layout returned by the query. As
defined by the GL spec, the bit layouts of GL_RGB + GL_UNSIGNED_SHORT_5_6_5 and
GL_BGR + GL_UNSIGNED_SHORT_5_6_5_REV are identical.
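
To illustrate why the two format/type pairs describe the same bits, here is a small sketch. The helper names pack_rgb_565() and pack_bgr_565_rev() are made up for this example; the packing follows the GL packed-pixel rules, where a non-REV type puts the first component in the most significant bits and a _REV type puts it in the least significant bits.

```c
#include <assert.h>
#include <stdint.h>

/* GL_RGB + GL_UNSIGNED_SHORT_5_6_5: components arrive as R, G, B and the
 * first component (red) occupies the most significant bits. */
static uint16_t pack_rgb_565(unsigned r, unsigned g, unsigned b)
{
    assert(r < 32 && g < 64 && b < 32);
    return (uint16_t)((r << 11) | (g << 5) | b);
}

/* GL_BGR + GL_UNSIGNED_SHORT_5_6_5_REV: components arrive as B, G, R and
 * _REV places the first component (blue) in the least significant bits. */
static uint16_t pack_bgr_565_rev(unsigned b, unsigned g, unsigned r)
{
    assert(r < 32 && g < 64 && b < 32);
    return (uint16_t)(b | (g << 5) | (r << 11));
}
```

For the same colour, both helpers produce the same 16-bit value, so only the enums reported by the query change, not the pixel data.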

TEST=dEQP-GLES3.functional.state_query.integers.*

Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Change-Id: I81bbc8ccdc7e125edaeae443baf6fa8fdefcc6b6
(cherry picked from commit 8c56ff643b)
2016-12-02 19:57:54 +00:00
Adam Jackson
422b584c00 glx/glvnd: Fix dispatch function names and indices
As this array was not actually sorted, FindGLXFunction's binary search
would only sometimes work.
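
The failure mode can be reproduced with a minimal sketch. find_name() and sort_names() below are hypothetical helpers, not the glvnd dispatch code, and the array contents in the usage note are placeholder names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Binary search over an array of names; returns the index or -1.
 * The result is only reliable if names[] is sorted in strcmp() order. */
static int find_name(const char *const *names, size_t count, const char *key)
{
    assert(names != NULL && key != NULL);

    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = strcmp(key, names[mid]);
        if (cmp == 0)
            return (int)mid;
        if (cmp < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return -1;
}

/* qsort() comparator for an array of C strings. */
static int compare_names(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Put the array into the order that find_name() relies on. */
static void sort_names(const char **names, size_t count)
{
    qsort(names, count, sizeof(names[0]), compare_names);
}
```

On the unsorted array {"glXC", "glXB", "glXA"}, a lookup of "glXB" happens to succeed because it sits at the midpoint, while a lookup of "glXA" fails even though the entry is present. After sort_names(), both lookups succeed.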

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit 8bca8d89ef)
2016-12-02 19:57:54 +00:00
Adam Jackson
b1bced0d1f glx/glvnd: Don't modify the dummy slot in the dispatch table
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit deb0eb1660)
2016-12-02 19:57:54 +00:00
Steinar H. Gunderson
9baee818b6 Fix races during _mesa_HashWalk().
There is currently no protection against walking a hash (using
_mesa_HashWalk()) and modifying it at the same time, for instance by inserting
or deleting elements. This leads to segfaults in multithreaded code if e.g.
someone calls glTexImage2D (which may have to walk the list of FBOs) while
another thread is calling glDeleteFramebuffers, with the two contexts
sharing lists.

The reason for this is that _mesa_HashWalk() doesn't actually take the mutex
that normally protects the hash; it takes an entirely different mutex.
Thus, walks are only protected against other walks, and there is also no
outer lock taking this. There is an old comment saying that this is to fix
problems with deadlock if the callback needs to take a mutex; we solve this
by changing the mutex to be recursive.

A demonstration Helgrind hit from a real application:

==13412== Possible data race during write of size 8 at 0x3498C6A8 by thread #1
==13412== Locks held: 2, at addresses 0x1AF09530 0x2B3DF400
==13412==    at 0x1F040C99: _mesa_hash_table_remove (hash_table.c:395)
==13412==    by 0x1EE98174: _mesa_HashRemove_unlocked (hash.c:350)
==13412==    by 0x1EE98174: _mesa_HashRemove (hash.c:365)
==13412==    by 0x1EE2372D: _mesa_DeleteFramebuffers (fbobject.c:2669)
==13412==    by 0x6105AA4: movit::ResourcePool::cleanup_unlinked_fbos(void*) (resource_pool.cpp:473)
==13412==    by 0x610615B: movit::ResourcePool::release_fbo(unsigned int) (resource_pool.cpp:442)
[...]
==13412== This conflicts with a previous read of size 8 by thread #20
==13412== Locks held: 2, at addresses 0x1AF09558 0x1AF73318
==13412==    at 0x1F040CD9: _mesa_hash_table_next_entry (hash_table.c:415)
==13412==    by 0x1EE982A8: _mesa_HashWalk (hash.c:426)
==13412==    by 0x1EED6DFD: _mesa_update_fbo_texture.part.33 (teximage.c:2683)
==13412==    by 0x1EED9410: _mesa_update_fbo_texture (teximage.c:3043)
==13412==    by 0x1EED9410: teximage (teximage.c:3073)
==13412==    by 0x1EEDA28F: _mesa_TexImage2D (teximage.c:3105)
==13412==    by 0x166A68: operator() (mixer.cpp:454)

There are many more interactions than just these two possible.
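
The locking scheme described above (walks take the same mutex as insertions and removals, and that mutex is recursive so a walk callback can modify the table) might look roughly like this. It is a sketch under stated assumptions: it uses POSIX threads rather than Mesa's own threading wrappers, and struct table and all table_*() functions are hypothetical stand-ins, not the _mesa_Hash* API.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 8

/* Hypothetical miniature table guarded by a single mutex. */
struct table {
    pthread_mutex_t mutex;
    bool used[TABLE_SIZE];
};

typedef void (*walk_cb)(struct table *t, size_t key, void *user_data);

static void table_init(struct table *t)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    /* Recursive: the owning thread may lock again without deadlocking. */
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&t->mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    for (size_t i = 0; i < TABLE_SIZE; i++)
        t->used[i] = false;
}

static void table_insert(struct table *t, size_t key)
{
    assert(key < TABLE_SIZE);
    pthread_mutex_lock(&t->mutex);
    t->used[key] = true;
    pthread_mutex_unlock(&t->mutex);
}

static void table_remove(struct table *t, size_t key)
{
    assert(key < TABLE_SIZE);
    pthread_mutex_lock(&t->mutex);
    t->used[key] = false;
    pthread_mutex_unlock(&t->mutex);
}

/* The walk holds the same mutex as insert/remove, so other threads cannot
 * modify the table mid-walk, while the callback itself still can. */
static void table_walk(struct table *t, walk_cb cb, void *user_data)
{
    pthread_mutex_lock(&t->mutex);
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        if (t->used[i])
            cb(t, i, user_data);
    }
    pthread_mutex_unlock(&t->mutex);
}

static size_t table_count(struct table *t)
{
    size_t n = 0;
    pthread_mutex_lock(&t->mutex);
    for (size_t i = 0; i < TABLE_SIZE; i++)
        n += t->used[i] ? 1 : 0;
    pthread_mutex_unlock(&t->mutex);
    return n;
}

/* Example callback that re-enters the lock by removing odd keys. */
static void remove_odd_cb(struct table *t, size_t key, void *user_data)
{
    size_t *removed = user_data;
    if (key % 2 == 1) {
        table_remove(t, key);
        (*removed)++;
    }
}
```

With a non-recursive mutex, remove_odd_cb() would deadlock inside table_walk(); that is the problem the old separate walk mutex was working around.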

Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Steinar H. Gunderson <steinar+mesa@gunderson.no>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 2e2562cabb)
2016-12-02 19:57:54 +00:00
Jason Ekstrand
68dd6ad433 anv/cmd_buffer: Enable a CS stall workaround for Sky Lake gt4
This fixes hangs in Dota2

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a6c3d0f92b)
2016-12-02 19:57:54 +00:00
Jason Ekstrand
6bcdb0611f anv/cmd_buffer: Take a command buffer instead of a batch in two helpers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1e3e347fd5)
2016-12-02 19:57:54 +00:00
Emil Velikov
0703bab2cd cherry-ignore: add reverted LLVM_LIBDIR patch
The patch was reverted shortly after it was merged.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-02 19:57:54 +00:00
Anuj Phogat
a7b662633e i965: Fix GPU hang related to multiple render targets and alpha testing
This patch should have been part of commit e592f7df.
When there are multiple render targets with alpha testing enabled and the
fragment shader doesn't write to draw buffer zero, the GPU hangs on SKL.
No GPU hang is seen on HSW. The simulator gives a warning for all gen6+ h/w:
"Illegal render target write message length 0xa expected 0xc"

This patch fixes the GPU hang as well as the simulator warning with
the new piglit test fbo-mrt-alphatest-no-buffer-zero-write:
https://patchwork.freedesktop.org/patch/118212

No regressions in Jenkins CI system.

Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit b9df2251c1)
2016-12-02 19:57:54 +00:00
Marek Olšák
faa684802f radeonsi: fix an assertion failure in si_decompress_sampler_color_textures
This fixes a crash in Deus Ex: Mankind Divided. Release builds were
unaffected, so it's not too serious.

Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 00baaa4752)
2016-12-02 19:57:54 +00:00
Jason Ekstrand
9a844035c0 i965/fs/generator: Don't use the address immediate for MOV_INDIRECT
The address immediate field is only 9 bits and, since the value is in
bytes, the highest GRF we can point to with it is g15.  This makes it
pretty close to useless for MOV_INDIRECT.  There were already piles of
restrictions preventing us from using it prior to Broadwell, so let's get
rid of the gen8+ code path entirely.
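
The g15 limit follows from simple arithmetic, sketched below. The helper names are hypothetical, and the sketch assumes the 9-bit field is treated as an unsigned byte offset and that one GRF register is 32 bytes (256 bits) wide.

```c
#include <assert.h>

#define GRF_SIZE_BYTES 32u /* one GRF register is 256 bits wide */

/* Largest value an unsigned field of `bits` bits can hold. */
static unsigned max_field_value(unsigned bits)
{
    assert(bits > 0 && bits < 32);
    return (1u << bits) - 1u;
}

/* Highest GRF number reachable by a byte offset that fits in `bits` bits. */
static unsigned highest_addressable_grf(unsigned bits)
{
    return max_field_value(bits) / GRF_SIZE_BYTES;
}
```

For 9 bits the largest offset is 511 bytes, and 511 / 32 gives 15, so nothing beyond g15 can be addressed through the immediate.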

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97779
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 2a4a86862c)
2016-12-02 19:57:54 +00:00
Tim Rowley
5f4284fd36 swr: [rasterizer] add support for llvm-3.9
v2: use signed compare, remove unneeded vmask

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
(cherry picked from commit f810907669)
2016-12-02 19:57:54 +00:00
Tim Rowley
a4cd90283a swr: [rasterizer jitter] fix llvm-3.7 compile
d3d97f8 broke llvm-3.7, which has a mismatched API for
setDataLayout/getDataLayout.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
(cherry picked from commit ae4f2c849a)
2016-12-02 19:57:53 +00:00
Tim Rowley
0934f29c50 swr: [rasterizer jitter] cleanup supporting different llvm versions
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit d3d97f8395)
2016-12-02 19:57:53 +00:00
Kenneth Graunke
e6bc5248aa intel: Fix pixel shader scratch space allocation on Gen9+ platforms.
We had missed a bit of errata - PS scratch needs to be computed as if
there were 4 subslices per slice, rather than 3.

This is a conservative backport of commit aaee3daa90.
It only increases the scratch amount, unlike the original commit which
decreases it on Skylake GT1-3 to avoid overallocating.
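
As a rough illustration of why sizing for 4 subslices instead of 3 is the conservative choice, consider the simplified model below. The formula and the ps_scratch_bytes() helper are assumptions made for this sketch, not the driver's actual computation, and the numbers in the usage note are hypothetical.

```c
#include <assert.h>

/* Simplified, hypothetical model: total scratch is the per-thread size
 * times the thread count per subslice times the assumed subslice count. */
static unsigned long
ps_scratch_bytes(unsigned per_thread_bytes, unsigned threads_per_subslice,
                 unsigned subslices)
{
    assert(subslices > 0);
    return (unsigned long)per_thread_bytes * threads_per_subslice * subslices;
}
```

With 1024 bytes per thread and 56 threads per subslice, sizing for 3 subslices gives 172032 bytes while sizing for 4 gives 229376 bytes, so the backport can only over-allocate, never under-allocate.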

Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2016-11-30 07:01:47 -08:00
Marek Olšák
352902218e gallium/radeon: make sure HTILE address is aligned properly
This should fix random GPU hangs on Hawaii and Fiji.
It's already been fixed in 13.0 and later.

Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-23 18:58:24 +01:00
Marek Olšák
6e77fbc8d7 gallium/radeon: fix behavior of GLSL findLSB(0)
This is for 12.0 and older. A different commit fixes 13.0 and newer.

Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
2016-11-11 22:41:45 +01:00
Emil Velikov
7b9d7257b2 docs: add sha256 checksums for 12.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-11 01:55:08 +00:00
Emil Velikov
3776e97f9d docs: add release notes for 12.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-11 01:53:32 +00:00
41 changed files with 992 additions and 308 deletions

@@ -1 +1 @@
-12.0.4
+12.0.5

@@ -23,3 +23,6 @@ f2b9b0c730e345bcffa9eadabb25af3ab02642f2 i965: Add missing BRW_NEW_FS_PROG_DATA
# Patches depend on the fence_finish() gallium API change and corresponding driver work
f240ad98bc05281ea7013d91973cb5f932ae9434 st/mesa: unduplicate st_check_sync code
b687f766fddb7b39479cd9ee0427984029ea3559 st/mesa: allow multiple concurrent waiters in ClientWaitSync
# Commit was reverted shortly after it landed in master
a39ad185932eab4f25a0cb2b112c10d8700ef242 configure.ac: honour LLVM_LIBDIR when linking against LLVM

321
docs/relnotes/12.0.4.html Normal file
@@ -0,0 +1,321 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 12.0.4 Release Notes / November 10, 2016</h1>
<p>
Mesa 12.0.4 is a bug fix release which fixes bugs found since the 12.0.3 release.
</p>
<p>
Mesa 12.0.4 implements the OpenGL 4.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.3. OpenGL
4.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
22026ce4f1c6a7908b0d10ff057decec0a5633afe7f38a0cef5c08d0689f02a6 mesa-12.0.4.tar.gz
5d6003da867d3f54e5000b4acdfc37e6cce5b6a4459274fdad73e24bd2f0065e mesa-12.0.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71759">Bug 71759</a> - Intel driver fails with &quot;intel_do_flush_locked failed: No such file or directory&quot; if buffer imported with EGL_NATIVE_PIXMAP_KHR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94354">Bug 94354</a> - R9285 Unigine Valley perf regression since radeonsi: use re-Z</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96770">Bug 96770</a> - include/GL/mesa_glinterop.h:62: error: redefinition of typedef GLXContext</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97231">Bug 97231</a> - GL_DEPTH_CLAMP doesn't clamp to the far plane</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97233">Bug 97233</a> - vkQuake VkSpecializationMapEntry related bug</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97260">Bug 97260</a> - R9 290 low performance in Linux 4.7</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97549">Bug 97549</a> - [SNB, BXT] up to 40% perf drop from &quot;loader/dri3: Overhaul dri3_update_num_back&quot; commit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97887">Bug 97887</a> - llvm segfault in janusvr -render vive</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98025">Bug 98025</a> - [radeonsi] incorrect primitive restart index used</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98134">Bug 98134</a> - dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers wants a different GL error code</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98326">Bug 98326</a> - [dEQP, EGL] pbuffer depth/stencil tests fail</li>
</ul>
<h2>Changes</h2>
<p>Axel Davy (4):</p>
<ul>
<li>gallium/util: Really allow aliasing of dst for u_box_union_*</li>
<li>st/nine: Fix the calculation of the number of vs inputs</li>
<li>st/nine: Fix mistake in Volume9 UnlockBox</li>
<li>st/nine: Fix locking CubeTexture surfaces.</li>
</ul>
<p>Brendan King (1):</p>
<ul>
<li>configure.ac: fix the name of the Wayland Scanner pc file</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>st/mesa: fix swizzle issue in st_create_sampler_view_from_stobj()</li>
</ul>
<p>Chad Versace (3):</p>
<ul>
<li>egl: Fix truncation error in _eglParseSyncAttribList64</li>
<li>i965/sync: Fix uninitialized usage and leak of mutex</li>
<li>egl: Don't advertise unsupported platform extensions</li>
</ul>
<p>Chuanbo Weng (1):</p>
<ul>
<li>gbm: fix potential NULL deref of mapImage/unmapImage.</li>
</ul>
<p>Chuck Atkins (1):</p>
<ul>
<li>autoconf: Make header install distinct for various APIs (v2)</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>anv: initialise and increment send_sbc</li>
<li>anv/wsi: fix apps that acquire multiple images up front</li>
<li>Revert "st/vdpau: use linear layout for output surfaces"</li>
</ul>
<p>Emil Velikov (12):</p>
<ul>
<li>docs: add sha256 checksums for 12.0.3</li>
<li>cherry-ignore: add non-applicable i965 commit</li>
<li>cherry-ignore: add vaapi encode fix</li>
<li>cherry-ignore: add EGL_KHR_debug fix</li>
<li>cherry-ignore: add update_renderbuffer_read_surfaces()</li>
<li>isl/gen6: correctly check msaa layout samples count</li>
<li>egl/x11: don't crash if dri2_dpy-&gt;conn is NULL</li>
<li>get-pick-list.sh: Require explicit "12.0" for nominating stable patches</li>
<li>automake: don't forget to pick wglext.h in the tarball</li>
<li>cherry-ignore: add N/A EGL revert</li>
<li>cherry-ignore: add ClientWaitSync fixes</li>
<li>Update version to 12.0.4</li>
</ul>
<p>Eric Anholt (5):</p>
<ul>
<li>travis: Parse configure.ac to pick an updated LIBDRM_VERSION.</li>
<li>travis: Update to the Ubuntu Trusty image.</li>
<li>travis: Enable vc4 in libdrm to satisfy vc4 test build dependency.</li>
<li>travis: Upgrade LLVM dependency to 3.5 and enable LLVM drivers.</li>
<li>gallium: Fix install-gallium-links.mk on non-bash /bin/sh</li>
</ul>
<p>Hans de Goede (1):</p>
<ul>
<li>pipe_loader_sw: Fix fd leak when instantiated via pipe_loader_sw_probe_kms</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>glsl: Fix cut-and-paste bug in hierarchical visitor ir_expression::accept</li>
</ul>
<p>Ilia Mirkin (16):</p>
<ul>
<li>nv30: set usage to staging so that the buffer is allocated in GART</li>
<li>a3xx: make sure to actually clamp depth as requested</li>
<li>a3xx: make use of software clipping when hw can't handle it</li>
<li>a3xx: use window scissor to simulate viewport xy clip</li>
<li>main: GL_RGB10_A2UI does not come with GL 3.0/EXT_texture_integer</li>
<li>mesa/formatquery: limit ES target support, fix core context support</li>
<li>nir: fix definition of pack_uvec2_to_uint</li>
<li>gm107/ir: AL2P writes to a predicate register</li>
<li>st/mesa: fix is_scissor_enabled when X/Y are negative</li>
<li>nvc0/ir: fix overwriting of value backing non-constant gather offset</li>
<li>nv50/ir: copy over value's register id when resolving merge of a phi</li>
<li>nvc0/ir: fix textureGather with a single offset</li>
<li>gm107/ir: fix texturing with indirect samplers</li>
<li>gm107/ir: fix bit offset of tex lod setting for indirect texturing</li>
<li>nv50,nvc0: avoid reading out of bounds when getting bogus so info</li>
<li>nv50/ir: process texture offset sources as regular sources</li>
</ul>
<p>James Legg (1):</p>
<ul>
<li>radeonsi: Fix primitive restart when index changes</li>
</ul>
<p>Jason Ekstrand (9):</p>
<ul>
<li>nir/spirv: Swap the argument order for AtomicCompareExchange</li>
<li>nir/spirv: Use the correct sources for CompareExchange on images</li>
<li>nir/spirv: Break variable decoration handling into a helper</li>
<li>nir/spirv: Refactor variable decoration handling</li>
<li>nir/spirv/cfg: Handle switches whose break block is a loop continue</li>
<li>nir/spirv/cfg: Detect switch_break after loop_break/continue</li>
<li>nir: Add a nop intrinsic</li>
<li>nir/spirv/cfg: Use a nop intrinsic for tagging the ends of blocks</li>
<li>intel/blorp: Rework our usage of ralloc when compiling shaders</li>
</ul>
<p>Jonathan Gray (3):</p>
<ul>
<li>genxml: add generated headers to EXTRA_DIST</li>
<li>mapi: automake: set VISIBILITY_CFLAGS for shared glapi</li>
<li>mesa: automake: include mesa_glinterop.h in distfile</li>
</ul>
<p>Julien Isorce (1):</p>
<ul>
<li>st/va: also honors interlaced preference when providing a video format</li>
</ul>
<p>Kenneth Graunke (8):</p>
<ul>
<li>nir: Call nir_metadata_preserve from nir_lower_alu_to_scalar().</li>
<li>mesa: Expose RESET_NOTIFICATION_STRATEGY with KHR_robustness.</li>
<li>i965: Fix missing _NEW_TRANSFORM in Gen8+ 3DSTATE_DS atom.</li>
<li>i965: Add missing BRW_NEW_VS_PROG_DATA to 3DSTATE_CLIP.</li>
<li>i965: Move BRW_NEW_FRAGMENT_PROGRAM from 3DSTATE_PS to PS_EXTRA.</li>
<li>i965: Add missing BRW_NEW_CS_PROG_DATA to compute constant atom.</li>
<li>i965: Add missing BRW_CS_PROG_DATA to CS work group surface atom.</li>
<li>i965: Fix gl_InvocationID in dual object GS where invocations == 1.</li>
</ul>
<p>Marek Olšák (12):</p>
<ul>
<li>radeonsi: fix cubemaps viewed as 2D</li>
<li>radeonsi: take compute shader and dispatch indirect memory usage into account</li>
<li>radeonsi: fix FP64 UBO loads with indirect uniform block indexing</li>
<li>mesa: fix glGetFramebufferAttachmentParameteriv w/ on-demand FRONT_BACK alloc</li>
<li>radeonsi: fix interpolateAt opcodes for .zw components</li>
<li>radeonsi: fix texture border colors for compute shaders</li>
<li>radeonsi: disable ReZ</li>
<li>gallium/radeon: make sure the address of separate CMASK is aligned properly</li>
<li>winsys/amdgpu: fix radeon_surf::macro_tile_index for imported textures</li>
<li>egl: use util/macros.h</li>
<li>egl: make interop ABI visible again</li>
<li>glx: make interop ABI visible again</li>
</ul>
<p>Mario Kleiner (1):</p>
<ul>
<li>glx: Perform check for valid fbconfig against proper X-Screen.</li>
</ul>
<p>Martin Peres (2):</p>
<ul>
<li>loader/dri3: add get_dri_screen() to the vtable</li>
<li>loader/dri3: import prime buffers in the currently-bound screen</li>
</ul>
<p>Matt Whitlock (5):</p>
<ul>
<li>egl/android: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>
<li>gallium/auxiliary: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>
<li>st/dri: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>
<li>st/xa: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>
<li>gallium/winsys: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>
</ul>
<p>Max Staudt (1):</p>
<ul>
<li>r300g: Set R300_VAP_CNTL on RSxxx to avoid triangle flickering</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>loader/dri3: Overhaul dri3_update_num_back</li>
</ul>
<p>Nicholas Bishop (2):</p>
<ul>
<li>gbm: return appropriate error when queryImage() fails</li>
<li>st/dri: check pipe_screen-&gt;resource_get_handle() return value</li>
</ul>
<p>Nicolai Hähnle (10):</p>
<ul>
<li>gallium/radeon: cleanup and fix branch emits</li>
<li>st/glsl_to_tgsi: disable on-the-fly peephole for 64-bit operations</li>
<li>st/glsl_to_tgsi: simplify translate_tex_offset</li>
<li>st/glsl_to_tgsi: fix textureGatherOffset with indirectly loaded offsets</li>
<li>st/mesa: fix vertex elements setup for doubles</li>
<li>radeonsi: fix indirect loads of 64 bit constants</li>
<li>st/glsl_to_tgsi: fix atomic counter addressing</li>
<li>st/glsl_to_tgsi: fix block copies of arrays of doubles</li>
<li>st/mesa: only set primitive_restart when the restart index is in range</li>
<li>radeonsi: fix 64-bit loads from LDS</li>
</ul>
<p>Samuel Pitoiset (4):</p>
<ul>
<li>nvc0/ir: fix subops for IMAD</li>
<li>gk110/ir: fix wrong emission of OP_NOT</li>
<li>nvc0: use correct bufctx when invalidating CP textures</li>
<li>nvc0/ir: fix emission of IMAD with NEG modifiers</li>
</ul>
<p>Stencel, Joanna (1):</p>
<ul>
<li>egl/wayland: add missing destroy_window callback</li>
</ul>
<p>Tapani Pälli (5):</p>
<ul>
<li>egl: stop claiming support for pbuffer + msaa</li>
<li>egl/dri2: set max values for pbuffer width and height</li>
<li>egl: add check that eglCreateContext gets a valid config</li>
<li>mesa: fix error handling in DrawBuffers</li>
<li>egl: set preserved behavior for surface only if config supports it</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>configure.ac: add llvm inteljitevents component if enabled</li>
</ul>
<p>Vedran Miletić (1):</p>
<ul>
<li>clover: Fix build against clang SVN &gt;= r273191</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>Revert "mesa_glinterop: remove inclusion of GLX header"</li>
</ul>
</div>
</body>
</html>

137
docs/relnotes/12.0.5.html Normal file
@@ -0,0 +1,137 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 12.0.5 Release Notes / December 5, 2016</h1>
<p>
Mesa 12.0.5 is a bug fix release which fixes bugs found since the 12.0.4 release.
</p>
<p>
Mesa 12.0.5 implements the OpenGL 4.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.3. OpenGL
4.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77662">Bug 77662</a> - Fail to render to different faces of depth-stencil cube map</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97779">Bug 97779</a> - [regression, bisected][BDW, GPU hang] stuck on render ring, always reproducible</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98415">Bug 98415</a> - Vulkan Driver JSON file contains incorrect field</li>
</ul>
<h2>Changes</h2>
<p>Adam Jackson (2):</p>
<ul>
<li>glx/glvnd: Don't modify the dummy slot in the dispatch table</li>
<li>glx/glvnd: Fix dispatch function names and indices</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>i965: Fix GPU hang related to multiple render targets and alpha testing</li>
</ul>
<p>Emil Velikov (4):</p>
<ul>
<li>docs: add release notes for 12.0.4</li>
<li>docs: add sha256 checksums for 12.0.4</li>
<li>cherry-ignore: add reverted LLVM_LIBDIR patch</li>
<li>Update version to 12.0.5</li>
</ul>
<p>Haixia Shi (1):</p>
<ul>
<li>mesa: change state query return value for RGB565</li>
</ul>
<p>Jason Ekstrand (3):</p>
<ul>
<li>i965/fs/generator: Don't use the address immediate for MOV_INDIRECT</li>
<li>anv/cmd_buffer: Take a command buffer instead of a batch in two helpers</li>
<li>anv/cmd_buffer: Enable a CS stall workaround for Sky Lake gt4</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>intel: Fix pixel shader scratch space allocation on Gen9+ platforms.</li>
</ul>
<p>Marek Olšák (13):</p>
<ul>
<li>gallium/radeon: fix behavior of GLSL findLSB(0)</li>
<li>gallium/radeon: make sure HTILE address is aligned properly</li>
<li>radeonsi: fix an assertion failure in si_decompress_sampler_color_textures</li>
<li>gallium/radeon: unify viewport emission code</li>
<li>gallium/radeon: set VPORT_ZMIN/MAX registers correctly</li>
<li>radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader</li>
<li>radeonsi: fix a crash in imageSize for cubemap arrays</li>
<li>radeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the kernel allows it</li>
<li>gallium/radeon: add support for sharing textures with DCC between processes</li>
<li>radeonsi: always set all blend registers</li>
<li>radeonsi: set CB_BLEND1_CONTROL.ENABLE for dual source blending</li>
<li>radeonsi: disable RB+ blend optimizations for dual source blending</li>
<li>radeonsi: silence runtime warnings with LLVM 3.9</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>anv: Replace "abi_versions" with correct "api_version".</li>
</ul>
<p>Nanley Chery (1):</p>
<ul>
<li>mesa/fbobject: Update CubeMapFace when reusing textures</li>
</ul>
<p>Steinar H. Gunderson (1):</p>
<ul>
<li>Fix races during _mesa_HashWalk().</li>
</ul>
<p>Tim Rowley (3):</p>
<ul>
<li>swr: [rasterizer jitter] cleanup supporting different llvm versions</li>
<li>swr: [rasterizer jitter] fix llvm-3.7 compile</li>
<li>swr: [rasterizer] add support for llvm-3.9</li>
</ul>
</div>
</body>
</html>

@@ -473,6 +473,7 @@ static void *evergreen_create_rs_state(struct pipe_context *ctx,
r600_init_command_buffer(&rs->buffer, 30);
rs->scissor_enable = state->scissor;
rs->clip_halfz = state->clip_halfz;
rs->flatshade = state->flatshade;
rs->sprite_coord_enable = state->sprite_coord_enable;
rs->two_side = state->light_twoside;

@@ -1862,8 +1862,8 @@
#define R_0283F8_SQ_VTX_SEMANTIC_30 0x000283F8
#define R_0283FC_SQ_VTX_SEMANTIC_31 0x000283FC
#define R_0288F0_SQ_VTX_SEMANTIC_CLEAR 0x000288F0
-#define R_0282D0_PA_SC_VPORT_ZMIN_0 0x000282D0
-#define R_0282D4_PA_SC_VPORT_ZMAX_0 0x000282D4
+#define R_0282D0_PA_SC_VPORT_ZMIN_0 0x0282D0
+#define R_0282D4_PA_SC_VPORT_ZMAX_0 0x0282D4
#define R_028400_VGT_MAX_VTX_INDX 0x00028400
#define R_028404_VGT_MIN_VTX_INDX 0x00028404
#define R_028408_VGT_INDX_OFFSET 0x00028408

@@ -308,6 +308,7 @@ void r600_begin_new_cs(struct r600_context *ctx)
ctx->b.scissors.dirty_mask = (1 << R600_MAX_VIEWPORTS) - 1;
r600_mark_atom_dirty(ctx, &ctx->b.scissors.atom);
ctx->b.viewports.dirty_mask = (1 << R600_MAX_VIEWPORTS) - 1;
ctx->b.viewports.depth_range_dirty_mask = (1 << R600_MAX_VIEWPORTS) - 1;
r600_mark_atom_dirty(ctx, &ctx->b.viewports.atom);
if (ctx->b.chip_class <= EVERGREEN) {
r600_mark_atom_dirty(ctx, &ctx->config_state.atom);

@@ -274,6 +274,7 @@ struct r600_rasterizer_state {
bool offset_enable;
bool scissor_enable;
bool multisample_enable;
bool clip_halfz;
};
struct r600_poly_offset_state {

@@ -459,6 +459,7 @@ static void *r600_create_rs_state(struct pipe_context *ctx,
r600_init_command_buffer(&rs->buffer, 30);
rs->scissor_enable = state->scissor;
rs->clip_halfz = state->clip_halfz;
rs->flatshade = state->flatshade;
rs->sprite_coord_enable = state->sprite_coord_enable;
rs->two_side = state->light_twoside;

@@ -364,7 +364,7 @@ static void r600_bind_rs_state(struct pipe_context *ctx, void *state)
r600_mark_atom_dirty(rctx, &rctx->clip_misc_state.atom);
}
r600_set_scissor_enable(&rctx->b, rs->scissor_enable);
r600_viewport_set_rast_deps(&rctx->b, rs->scissor_enable, rs->clip_halfz);
/* Re-emit PA_SC_LINE_STIPPLE. */
rctx->last_primitive_type = -1;

@@ -366,6 +366,10 @@ struct r600_common_screen {
void (*query_opaque_metadata)(struct r600_common_screen *rscreen,
struct r600_texture *rtex,
struct radeon_bo_metadata *md);
void (*apply_opaque_metadata)(struct r600_common_screen *rscreen,
struct r600_texture *rtex,
struct radeon_bo_metadata *md);
};
/* This encapsulates a state or an operation which can be emitted into the GPU
@@ -430,6 +434,7 @@ struct r600_scissors {
struct r600_viewports {
struct r600_atom atom;
unsigned dirty_mask;
unsigned depth_range_dirty_mask;
struct pipe_viewport_state states[R600_MAX_VIEWPORTS];
struct r600_signed_scissor as_scissor[R600_MAX_VIEWPORTS];
};
@@ -469,6 +474,7 @@ struct r600_common_context {
struct r600_scissors scissors;
struct r600_viewports viewports;
bool scissor_enabled;
bool clip_halfz;
bool vs_writes_viewport_index;
bool vs_disables_clipping_viewport;
@@ -669,7 +675,8 @@ void r600_init_context_texture_functions(struct r600_common_context *rctx);
/* r600_viewport.c */
void evergreen_apply_scissor_bug_workaround(struct r600_common_context *rctx,
struct pipe_scissor_state *scissor);
void r600_set_scissor_enable(struct r600_common_context *rctx, bool enable);
void r600_viewport_set_rast_deps(struct r600_common_context *rctx,
bool scissor_enable, bool clip_halfz);
void r600_update_vs_writes_viewport_index(struct r600_common_context *rctx,
struct tgsi_shader_info *info);
void r600_init_viewport_functions(struct r600_common_context *rctx);

@@ -723,10 +723,11 @@ static void r600_texture_alloc_cmask_separate(struct r600_common_screen *rscreen
}
static unsigned r600_texture_get_htile_size(struct r600_common_screen *rscreen,
struct r600_texture *rtex)
struct r600_texture *rtex,
unsigned *base_align)
{
unsigned cl_width, cl_height, width, height;
unsigned slice_elements, slice_bytes, pipe_interleave_bytes, base_align;
unsigned slice_elements, slice_bytes, pipe_interleave_bytes;
unsigned num_pipes = rscreen->info.num_tile_pipes;
if (rscreen->chip_class <= EVERGREEN &&
@@ -788,7 +789,7 @@ static unsigned r600_texture_get_htile_size(struct r600_common_screen *rscreen,
slice_bytes = slice_elements * 4;
pipe_interleave_bytes = rscreen->info.pipe_interleave_bytes;
base_align = num_pipes * pipe_interleave_bytes;
*base_align = num_pipes * pipe_interleave_bytes;
rtex->htile.pitch = width;
rtex->htile.height = height;
@@ -796,20 +797,22 @@ static unsigned r600_texture_get_htile_size(struct r600_common_screen *rscreen,
rtex->htile.yalign = cl_height * 8;
return (util_max_layer(&rtex->resource.b.b, 0) + 1) *
align(slice_bytes, base_align);
align(slice_bytes, *base_align);
}
static void r600_texture_allocate_htile(struct r600_common_screen *rscreen,
struct r600_texture *rtex)
{
unsigned htile_size = r600_texture_get_htile_size(rscreen, rtex);
unsigned alignment = 0;
unsigned htile_size = r600_texture_get_htile_size(rscreen, rtex,
&alignment);
if (!htile_size)
return;
rtex->htile_buffer = (struct r600_resource*)
pipe_buffer_create(&rscreen->b, PIPE_BIND_CUSTOM,
PIPE_USAGE_DEFAULT, htile_size);
r600_aligned_buffer_create(&rscreen->b, 0, PIPE_USAGE_DEFAULT,
htile_size, alignment);
if (rtex->htile_buffer == NULL) {
/* this is not a fatal error as we can still keep rendering
* without htile buffer */
@@ -965,8 +968,12 @@ r600_texture_create_object(struct pipe_screen *screen,
}
}
if (!buf && rtex->surface.dcc_size &&
!(rscreen->debug_flags & DBG_NO_DCC)) {
/* Shared textures must always set up DCC here.
* If it's not present, it will be disabled by
* apply_opaque_metadata later.
*/
if (rtex->surface.dcc_size &&
(buf || !(rscreen->debug_flags & DBG_NO_DCC))) {
/* Reserve space for the DCC buffer. */
rtex->dcc_offset = align64(rtex->size, rtex->surface.dcc_alignment);
rtex->size = rtex->dcc_offset + rtex->surface.dcc_size;
@@ -993,7 +1000,9 @@ r600_texture_create_object(struct pipe_screen *screen,
rtex->cmask.offset, rtex->cmask.size,
0xCCCCCCCC, R600_COHERENCY_NONE);
}
if (rtex->dcc_offset) {
/* Initialize DCC only if the texture is not being imported. */
if (!buf && rtex->dcc_offset) {
r600_screen_clear_buffer(rscreen, &rtex->resource.b.b,
rtex->dcc_offset,
rtex->surface.dcc_size,
@@ -1159,6 +1168,10 @@ static struct pipe_resource *r600_texture_from_handle(struct pipe_screen *screen
rtex->resource.is_shared = true;
rtex->resource.external_usage = usage;
if (rscreen->apply_opaque_metadata)
rscreen->apply_opaque_metadata(rscreen, rtex, &metadata);
return &rtex->resource.b.b;
}
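
The HTILE hunks above change `r600_texture_get_htile_size` to report the base alignment through an out-parameter, so the caller can request a buffer with that alignment. A minimal C sketch of that pattern follows. `align_pot` and `htile_size_sketch` are hypothetical names, the per-slice byte count is passed in rather than derived from the surface, and the alignment is assumed to be a power of two.

```c
/* Round value up to the next multiple of alignment.
 * Assumption: alignment is a non-zero power of two. */
static unsigned align_pot(unsigned value, unsigned alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical stand-in for the size query: returns the total size and
 * writes the required base alignment to *base_align for the allocator. */
static unsigned htile_size_sketch(unsigned num_layers, unsigned slice_bytes,
                                  unsigned num_pipes,
                                  unsigned pipe_interleave_bytes,
                                  unsigned *base_align)
{
    *base_align = num_pipes * pipe_interleave_bytes;
    return num_layers * align_pot(slice_bytes, *base_align);
}
```

The point of the out-parameter is that the size and the alignment come from the same computation, so the allocation cannot silently use a different alignment than the one the size was padded to.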

@@ -22,6 +22,7 @@
*/
#include "r600_cs.h"
#include "util/u_viewport.h"
#include "tgsi/tgsi_scan.h"
#define GET_MAX_SCISSOR(rctx) (rctx->chip_class >= EVERGREEN ? 16384 : 8192)
@@ -260,6 +261,7 @@ static void r600_set_viewport_states(struct pipe_context *ctx,
const struct pipe_viewport_state *state)
{
struct r600_common_context *rctx = (struct r600_common_context *)ctx;
unsigned mask;
int i;
for (i = 0; i < num_viewports; i++) {
@@ -270,13 +272,28 @@ static void r600_set_viewport_states(struct pipe_context *ctx,
&rctx->viewports.as_scissor[index]);
}
rctx->viewports.dirty_mask |= ((1 << num_viewports) - 1) << start_slot;
rctx->scissors.dirty_mask |= ((1 << num_viewports) - 1) << start_slot;
mask = ((1 << num_viewports) - 1) << start_slot;
rctx->viewports.dirty_mask |= mask;
rctx->viewports.depth_range_dirty_mask |= mask;
rctx->scissors.dirty_mask |= mask;
rctx->set_atom_dirty(rctx, &rctx->viewports.atom, true);
rctx->set_atom_dirty(rctx, &rctx->scissors.atom, true);
}
static void r600_emit_viewports(struct r600_common_context *rctx, struct r600_atom *atom)
static void r600_emit_one_viewport(struct r600_common_context *rctx,
struct pipe_viewport_state *state)
{
struct radeon_winsys_cs *cs = rctx->gfx.cs;
radeon_emit(cs, fui(state->scale[0]));
radeon_emit(cs, fui(state->translate[0]));
radeon_emit(cs, fui(state->scale[1]));
radeon_emit(cs, fui(state->translate[1]));
radeon_emit(cs, fui(state->scale[2]));
radeon_emit(cs, fui(state->translate[2]));
}
static void r600_emit_viewports(struct r600_common_context *rctx)
{
struct radeon_winsys_cs *cs = rctx->gfx.cs;
struct pipe_viewport_state *states = rctx->viewports.states;
@@ -288,12 +305,7 @@ static void r600_emit_viewports(struct r600_common_context *rctx, struct r600_at
return;
radeon_set_context_reg_seq(cs, R_02843C_PA_CL_VPORT_XSCALE, 6);
radeon_emit(cs, fui(states[0].scale[0]));
radeon_emit(cs, fui(states[0].translate[0]));
radeon_emit(cs, fui(states[0].scale[1]));
radeon_emit(cs, fui(states[0].translate[1]));
radeon_emit(cs, fui(states[0].scale[2]));
radeon_emit(cs, fui(states[0].translate[2]));
r600_emit_one_viewport(rctx, &states[0]);
rctx->viewports.dirty_mask &= ~1; /* clear one bit */
return;
}
@@ -305,25 +317,70 @@ static void r600_emit_viewports(struct r600_common_context *rctx, struct r600_at
radeon_set_context_reg_seq(cs, R_02843C_PA_CL_VPORT_XSCALE +
start * 4 * 6, count * 6);
for (i = start; i < start+count; i++) {
radeon_emit(cs, fui(states[i].scale[0]));
radeon_emit(cs, fui(states[i].translate[0]));
radeon_emit(cs, fui(states[i].scale[1]));
radeon_emit(cs, fui(states[i].translate[1]));
radeon_emit(cs, fui(states[i].scale[2]));
radeon_emit(cs, fui(states[i].translate[2]));
}
for (i = start; i < start+count; i++)
r600_emit_one_viewport(rctx, &states[i]);
}
rctx->viewports.dirty_mask = 0;
}
void r600_set_scissor_enable(struct r600_common_context *rctx, bool enable)
static void r600_emit_depth_ranges(struct r600_common_context *rctx)
{
if (rctx->scissor_enabled != enable) {
rctx->scissor_enabled = enable;
struct radeon_winsys_cs *cs = rctx->gfx.cs;
struct pipe_viewport_state *states = rctx->viewports.states;
unsigned mask = rctx->viewports.depth_range_dirty_mask;
float zmin, zmax;
/* The simple case: Only 1 viewport is active. */
if (!rctx->vs_writes_viewport_index) {
if (!(mask & 1))
return;
util_viewport_zmin_zmax(&states[0], rctx->clip_halfz, &zmin, &zmax);
radeon_set_context_reg_seq(cs, R_0282D0_PA_SC_VPORT_ZMIN_0, 2);
radeon_emit(cs, fui(zmin));
radeon_emit(cs, fui(zmax));
rctx->viewports.depth_range_dirty_mask &= ~1; /* clear one bit */
return;
}
while (mask) {
int start, count, i;
u_bit_scan_consecutive_range(&mask, &start, &count);
radeon_set_context_reg_seq(cs, R_0282D0_PA_SC_VPORT_ZMIN_0 +
start * 4 * 2, count * 2);
for (i = start; i < start+count; i++) {
util_viewport_zmin_zmax(&states[i], rctx->clip_halfz, &zmin, &zmax);
radeon_emit(cs, fui(zmin));
radeon_emit(cs, fui(zmax));
}
}
rctx->viewports.depth_range_dirty_mask = 0;
}
static void r600_emit_viewport_states(struct r600_common_context *rctx,
struct r600_atom *atom)
{
r600_emit_viewports(rctx);
r600_emit_depth_ranges(rctx);
}
/* Set viewport dependencies on pipe_rasterizer_state. */
void r600_viewport_set_rast_deps(struct r600_common_context *rctx,
bool scissor_enable, bool clip_halfz)
{
if (rctx->scissor_enabled != scissor_enable) {
rctx->scissor_enabled = scissor_enable;
rctx->scissors.dirty_mask = (1 << R600_MAX_VIEWPORTS) - 1;
rctx->set_atom_dirty(rctx, &rctx->scissors.atom, true);
}
if (rctx->clip_halfz != clip_halfz) {
rctx->clip_halfz = clip_halfz;
rctx->viewports.depth_range_dirty_mask = (1 << R600_MAX_VIEWPORTS) - 1;
rctx->set_atom_dirty(rctx, &rctx->viewports.atom, true);
}
}
/**
@@ -357,14 +414,16 @@ void r600_update_vs_writes_viewport_index(struct r600_common_context *rctx,
if (rctx->scissors.dirty_mask)
rctx->set_atom_dirty(rctx, &rctx->scissors.atom, true);
if (rctx->viewports.dirty_mask)
if (rctx->viewports.dirty_mask ||
rctx->viewports.depth_range_dirty_mask)
rctx->set_atom_dirty(rctx, &rctx->viewports.atom, true);
}
void r600_init_viewport_functions(struct r600_common_context *rctx)
{
rctx->scissors.atom.emit = r600_emit_scissors;
rctx->viewports.atom.emit = r600_emit_viewports;
rctx->viewports.atom.emit = r600_emit_viewport_states;
rctx->scissors.atom.num_dw = (2 + 16 * 2) + 6;
rctx->viewports.atom.num_dw = 2 + 16 * 6;
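
The new `r600_emit_depth_ranges` path depends on `clip_halfz`, which is why a change to that rasterizer flag marks every depth range dirty. The sketch below shows what the zmin/zmax computation plausibly looks like. It is an assumption about the helper's behaviour, not a copy of `util_viewport_zmin_zmax`: it takes window depth to be `translate[2] + scale[2] * z`, with clip-space z in [-1, 1] by default and in [0, 1] when `clip_halfz` is set. `viewport_sketch` and `viewport_zmin_zmax_sketch` are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical reduced viewport state: only scale/translate are needed. */
struct viewport_sketch {
    float scale[3];
    float translate[3];
};

/* Compute the depth range covered by the viewport transform.
 * halfz == true  : clip-space z spans [0, 1]
 * halfz == false : clip-space z spans [-1, 1] */
static void viewport_zmin_zmax_sketch(const struct viewport_sketch *vp,
                                      bool halfz, float *zmin, float *zmax)
{
    float near_z = halfz ? vp->translate[2]
                         : vp->translate[2] - vp->scale[2];
    float far_z = vp->translate[2] + vp->scale[2];

    /* A negative scale inverts the range, so order the two ends. */
    *zmin = near_z < far_z ? near_z : far_z;
    *zmax = near_z < far_z ? far_z : near_z;
}
```

With `scale[2] = translate[2] = 0.5`, the default convention yields the range 0 to 1 while the half-z convention yields 0.5 to 1, so the same viewport state needs different register values depending on the rasterizer state.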

@@ -241,5 +241,7 @@
#define S_028254_BR_Y(x) (((unsigned)(x) & 0x7FFF) << 16)
#define G_028254_BR_Y(x) (((x) >> 16) & 0x7FFF)
#define C_028254_BR_Y 0x8000FFFF
#define R_0282D0_PA_SC_VPORT_ZMIN_0 0x0282D0
#define R_0282D4_PA_SC_VPORT_ZMAX_0 0x0282D4
#endif

@@ -1303,23 +1303,32 @@ static void emit_lsb(const struct lp_build_tgsi_action * action,
struct lp_build_emit_data * emit_data)
{
struct gallivm_state *gallivm = bld_base->base.gallivm;
LLVMBuilderRef builder = gallivm->builder;
LLVMValueRef args[2] = {
emit_data->args[0],
/* The value of 1 means that ffs(x=0) = undef, so LLVM won't
* add special code to check for x=0. The reason is that
* the LLVM behavior for x=0 is different from what we
* need here.
*
* The hardware already implements the correct behavior.
* need here. However, LLVM also assumes that ffs(x) is
* in [0, 31], but GLSL expects that ffs(0) = -1, so
* a conditional assignment to handle 0 is still required.
*/
lp_build_const_int32(gallivm, 1)
LLVMConstInt(LLVMInt1TypeInContext(gallivm->context), 1, 0)
};
emit_data->output[emit_data->chan] =
LLVMValueRef lsb =
lp_build_intrinsic(gallivm->builder, "llvm.cttz.i32",
emit_data->dst_type, args, ARRAY_SIZE(args),
LLVMReadNoneAttribute);
/* TODO: We need an intrinsic to skip this conditional. */
/* Check for zero: */
emit_data->output[emit_data->chan] =
LLVMBuildSelect(builder,
LLVMBuildICmp(builder, LLVMIntEQ, args[0],
bld_base->uint_bld.zero, ""),
lp_build_const_int32(gallivm, -1), lsb, "");
}
/* Find the last bit set. */
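
The `emit_lsb` change above pairs a count-trailing-zeros intrinsic, whose result is undefined for a zero input, with an explicit select that returns -1 for zero, as GLSL expects. A scalar C analogue is sketched below. It assumes a GCC- or Clang-compatible compiler for `__builtin_ctz`, and `find_lsb_sketch` is a hypothetical name. Unlike the LLVM IR, which computes the count unconditionally and then selects, C must test for zero first because calling `__builtin_ctz(0)` is undefined behaviour.

```c
#include <stdint.h>

/* Index of the least significant set bit, or -1 when no bit is set.
 * __builtin_ctz is undefined for 0, so the zero case is handled first. */
static int32_t find_lsb_sketch(uint32_t x)
{
    return x == 0 ? -1 : (int32_t)__builtin_ctz(x);
}
```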

@@ -376,7 +376,9 @@ si_decompress_sampler_color_textures(struct si_context *sctx,
assert(view);
tex = (struct r600_texture *)view->texture;
assert(tex->cmask.size || tex->fmask.size || tex->dcc_offset);
/* CMASK or DCC can be discarded and we can still end up here. */
if (!tex->cmask.size && !tex->fmask.size && !tex->dcc_offset)
continue;
si_blit_decompress_color(&sctx->b.b, tex,
view->u.tex.first_level, view->u.tex.last_level,

@@ -202,7 +202,12 @@ static void si_initialize_compute(struct si_context *sctx)
radeon_emit(cs, bc_va >> 8); /* R_030E00_TA_CS_BC_BASE_ADDR */
radeon_emit(cs, bc_va >> 40); /* R_030E04_TA_CS_BC_BASE_ADDR_HI */
} else {
radeon_set_config_reg(cs, R_00950C_TA_CS_BC_BASE_ADDR, bc_va >> 8);
if (sctx->screen->b.info.drm_major == 3 ||
(sctx->screen->b.info.drm_major == 2 &&
sctx->screen->b.info.drm_minor >= 48)) {
radeon_set_config_reg(cs, R_00950C_TA_CS_BC_BASE_ADDR,
bc_va >> 8);
}
}
sctx->cs_shader_state.emitted_program = NULL;

@@ -231,6 +231,7 @@ void si_begin_new_cs(struct si_context *ctx)
ctx->b.scissors.dirty_mask = (1 << R600_MAX_VIEWPORTS) - 1;
ctx->b.viewports.dirty_mask = (1 << R600_MAX_VIEWPORTS) - 1;
ctx->b.viewports.depth_range_dirty_mask = (1 << R600_MAX_VIEWPORTS) - 1;
si_mark_atom_dirty(ctx, &ctx->b.scissors.atom);
si_mark_atom_dirty(ctx, &ctx->b.viewports.atom);
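
The dirty masks in these hunks are plain bitfields with one bit per viewport slot. The emit code then walks runs of consecutive set bits so that each run becomes a single register sequence. The sketch below illustrates both halves in portable C. `slot_range_mask` and `scan_consecutive_range_sketch` are hypothetical names, and the scan is a simple loop-based illustration, not Mesa's `u_bit_scan_consecutive_range`.

```c
/* Mask with bits [start_slot, start_slot + count) set.
 * Assumption: count < 32 and start_slot + count <= 32. */
static unsigned slot_range_mask(unsigned start_slot, unsigned count)
{
    return ((1u << count) - 1) << start_slot;
}

/* Remove the lowest run of consecutive set bits from *mask and report
 * where it started and how long it was. *mask must be non-zero. */
static void scan_consecutive_range_sketch(unsigned *mask, int *start, int *count)
{
    int s = 0, c = 0;

    while (!((*mask >> s) & 1u))      /* skip clear bits below the run */
        s++;
    while (s + c < 32 && ((*mask >> (s + c)) & 1u))
        c++;                          /* measure the run */
    for (int i = s; i < s + c; i++)
        *mask &= ~(1u << i);          /* clear the bits just reported */

    *start = s;
    *count = c;
}
```

Keeping a separate `depth_range_dirty_mask` lets a `clip_halfz` change re-emit only the ZMIN/ZMAX registers without also re-emitting the scale/translate registers.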

@@ -1667,7 +1667,12 @@ static void declare_system_value(
}
case TGSI_SEMANTIC_VERTICESIN:
value = unpack_param(ctx, SI_PARAM_TCS_OUT_LAYOUT, 26, 6);
if (ctx->type == PIPE_SHADER_TESS_CTRL)
value = unpack_param(ctx, SI_PARAM_TCS_OUT_LAYOUT, 26, 6);
else if (ctx->type == PIPE_SHADER_TESS_EVAL)
value = unpack_param(ctx, SI_PARAM_TCS_OFFCHIP_LAYOUT, 9, 7);
else
assert(!"invalid shader stage for TGSI_SEMANTIC_VERTICESIN");
break;
case TGSI_SEMANTIC_TESSINNER:
@@ -4028,7 +4033,7 @@ static void resq_fetch_args(
const struct tgsi_full_instruction *inst = emit_data->inst;
const struct tgsi_full_src_register *reg = &inst->Src[0];
emit_data->dst_type = LLVMVectorType(bld_base->base.elem_type, 4);
emit_data->dst_type = ctx->v4i32;
if (reg->Register.File == TGSI_FILE_BUFFER) {
emit_data->args[0] = shader_buffer_fetch_rsrc(ctx, reg);
@@ -4079,9 +4084,7 @@ static void resq_emit(
LLVMValueRef imm6 = lp_build_const_int32(gallivm, 6);
LLVMValueRef z = LLVMBuildExtractElement(builder, out, imm2, "");
z = LLVMBuildBitCast(builder, z, bld_base->uint_bld.elem_type, "");
z = LLVMBuildSDiv(builder, z, imm6, "");
z = LLVMBuildBitCast(builder, z, bld_base->base.elem_type, "");
out = LLVMBuildInsertElement(builder, out, z, imm2, "");
}
}
@@ -5862,6 +5865,9 @@ void si_shader_binary_read_config(struct radeon_shader_binary *binary,
conf->scratch_bytes_per_wave =
G_00B860_WAVESIZE(value) * 256 * 4 * 1;
break;
case 0x4:
case 0x8:
break; /* just spilling stats, not important */
default:
{
static bool printed;
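
The `TGSI_SEMANTIC_VERTICESIN` fix reads the patch vertex count from a different packed parameter depending on the shader stage: 6 bits at offset 26 for the tessellation control shader, 7 bits at offset 9 for the evaluation shader. The field positions come from the diff. The helper below is only an assumption about what `unpack_param` does with them, namely a shift followed by a mask, and `unpack_bits_sketch` is a hypothetical name.

```c
#include <stdint.h>

/* Extract a bitwidth-wide field that starts at bit rshift.
 * The mask is skipped when the field reaches bit 31, since the shift
 * alone already discards everything below it. */
static uint32_t unpack_bits_sketch(uint32_t value, unsigned rshift,
                                   unsigned bitwidth)
{
    value >>= rshift;
    if (rshift + bitwidth < 32)
        value &= (1u << bitwidth) - 1;
    return value;
}
```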

@@ -461,16 +461,19 @@ static void *si_create_blend_state_mode(struct pipe_context *ctx,
S_028760_ALPHA_COMB_FCN(V_028760_OPT_COMB_BLEND_DISABLED);
/* Only set dual source blending for MRT0 to avoid a hang. */
if (i >= 1 && blend->dual_src_blend)
continue;
if (i >= 1 && blend->dual_src_blend) {
/* Vulkan does this for dual source blending. */
if (i == 1)
blend_cntl |= S_028780_ENABLE(1);
if (!state->rt[j].colormask)
si_pm4_set_reg(pm4, R_028780_CB_BLEND0_CONTROL + i * 4, blend_cntl);
continue;
}
/* cb_render_state will disable unused ones */
blend->cb_target_mask |= (unsigned)state->rt[j].colormask << (4 * i);
if (!state->rt[j].blend_enable) {
if (!state->rt[j].colormask || !state->rt[j].blend_enable) {
si_pm4_set_reg(pm4, R_028780_CB_BLEND0_CONTROL + i * 4, blend_cntl);
continue;
}
@@ -551,6 +554,17 @@ static void *si_create_blend_state_mode(struct pipe_context *ctx,
}
if (sctx->b.family == CHIP_STONEY) {
/* Disable RB+ blend optimizations for dual source blending.
* Vulkan does this.
*/
if (blend->dual_src_blend) {
for (int i = 0; i < 8; i++) {
sx_mrt_blend_opt[i] =
S_028760_COLOR_COMB_FCN(V_028760_OPT_COMB_NONE) |
S_028760_ALPHA_COMB_FCN(V_028760_OPT_COMB_NONE);
}
}
for (int i = 0; i < 8; i++)
si_pm4_set_reg(pm4, R_028760_SX_MRT0_BLEND_OPT + i * 4,
sx_mrt_blend_opt[i]);
@@ -728,6 +742,7 @@ static void *si_create_rs_state(struct pipe_context *ctx,
}
rs->scissor_enable = state->scissor;
rs->clip_halfz = state->clip_halfz;
rs->two_side = state->light_twoside;
rs->multisample_enable = state->multisample;
rs->force_persample_interp = state->force_persample_interp;
@@ -857,7 +872,7 @@ static void si_bind_rs_state(struct pipe_context *ctx, void *state)
si_mark_atom_dirty(sctx, &sctx->msaa_sample_locs.atom);
}
r600_set_scissor_enable(&sctx->b, rs->scissor_enable);
r600_viewport_set_rast_deps(&sctx->b, rs->scissor_enable, rs->clip_halfz);
si_pm4_bind_state(sctx, rasterizer, rs);
si_update_poly_offset_state(sctx);
@@ -3427,6 +3442,11 @@ void si_init_state_functions(struct si_context *sctx)
si_init_config(sctx);
}
static uint32_t si_get_bo_metadata_word1(struct r600_common_screen *rscreen)
{
return (ATI_VENDOR_ID << 16) | rscreen->info.pci_id;
}
static void si_query_opaque_metadata(struct r600_common_screen *rscreen,
struct r600_texture *rtex,
struct radeon_bo_metadata *md)
@@ -3461,7 +3481,7 @@ static void si_query_opaque_metadata(struct r600_common_screen *rscreen,
md->metadata[0] = 1; /* metadata image format version 1 */
/* TILE_MODE_INDEX is ambiguous without a PCI ID. */
md->metadata[1] = (ATI_VENDOR_ID << 16) | rscreen->info.pci_id;
md->metadata[1] = si_get_bo_metadata_word1(rscreen);
si_make_texture_descriptor(sscreen, rtex, true,
res->target, res->format,
@@ -3485,9 +3505,37 @@ static void si_query_opaque_metadata(struct r600_common_screen *rscreen,
md->size_metadata = (11 + res->last_level) * 4;
}
static void si_apply_opaque_metadata(struct r600_common_screen *rscreen,
struct r600_texture *rtex,
struct radeon_bo_metadata *md)
{
uint32_t *desc = &md->metadata[2];
if (rscreen->chip_class < VI)
return;
/* Return if DCC is enabled. The texture should be set up with it
* already.
*/
if (md->size_metadata >= 11 * 4 &&
md->metadata[0] != 0 &&
md->metadata[1] == si_get_bo_metadata_word1(rscreen) &&
G_008F28_COMPRESSION_EN(desc[6])) {
assert(rtex->dcc_offset == ((uint64_t)desc[7] << 8));
return;
}
/* Disable DCC. These are always set by texture_from_handle and must
* be cleared here.
*/
rtex->dcc_offset = 0;
rtex->cb_color_info &= ~VI_S_028C70_DCC_ENABLE(1);
}
void si_init_screen_state_functions(struct si_screen *sscreen)
{
sscreen->b.query_opaque_metadata = si_query_opaque_metadata;
sscreen->b.apply_opaque_metadata = si_apply_opaque_metadata;
}
static void

@@ -78,6 +78,7 @@ struct si_state_rasterizer {
bool clamp_fragment_color;
bool rasterizer_discard;
bool scissor_enable;
bool clip_halfz;
};
struct si_dsa_stencil_ref_part {

@@ -35,11 +35,13 @@
#include "JitManager.h"
#include "fetch_jit.h"
#pragma push_macro("DEBUG")
#undef DEBUG
#if defined(_WIN32)
#include "llvm/ADT/Triple.h"
#endif
#include "llvm/IR/Function.h"
#include "llvm/Support/DynamicLibrary.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/SourceMgr.h"
@@ -53,6 +55,8 @@
#include "llvm/ExecutionEngine/JITEventListener.h"
#endif
#pragma pop_macro("DEBUG")
#include "core/state.h"
#include "state_llvm.h"
@@ -237,6 +241,13 @@ bool JitManager::SetupModuleFromIR(const uint8_t *pIR)
return false;
}
#if HAVE_LLVM == 0x307
// llvm-3.7 has mismatched setDataLayout/getDataLayout APIs
newModule->setDataLayout(*mpExec->getDataLayout());
#else
newModule->setDataLayout(mpExec->getDataLayout());
#endif
mpCurrentModule = newModule.get();
#if defined(_WIN32)
// Needed for MCJIT on windows
@@ -251,7 +262,6 @@ bool JitManager::SetupModuleFromIR(const uint8_t *pIR)
return true;
}
//////////////////////////////////////////////////////////////////////////
/// @brief Dump function x86 assembly to file.
/// @note This should only be called after the module has been jitted to x86 and the

@@ -54,7 +54,7 @@
#endif
#ifndef HAVE_LLVM
#define HAVE_LLVM (LLVM_VERSION_MAJOR << 8) || LLVM_VERSION_MINOR
#define HAVE_LLVM ((LLVM_VERSION_MAJOR << 8) | LLVM_VERSION_MINOR)
#endif
#include "llvm/IR/Verifier.h"
@@ -66,8 +66,12 @@
#if HAVE_LLVM == 0x306
#include "llvm/PassManager.h"
using FunctionPassManager = llvm::FunctionPassManager;
using PassManager = llvm::PassManager;
#else
#include "llvm/IR/LegacyPassManager.h"
using FunctionPassManager = llvm::legacy::FunctionPassManager;
using PassManager = llvm::legacy::PassManager;
#endif
#include "llvm/CodeGen/Passes.h"
@@ -77,6 +81,7 @@
#include "llvm/Transforms/IPO.h"
#include "llvm/Transforms/Scalar.h"
#include "llvm/Support/Host.h"
#include "llvm/Support/DynamicLibrary.h"
#pragma pop_macro("DEBUG")
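
The `HAVE_LLVM` fix in this file replaces a logical OR with a bitwise OR and wraps the expansion in parentheses. The short C sketch below shows why the old form could never equal a value such as `0x307`. The function names are hypothetical and exist only to put the two expressions side by side.

```c
/* Old expression: logical OR collapses any non-zero input to 1. */
static unsigned pack_version_logical(unsigned major, unsigned minor)
{
    return (major << 8) || minor;
}

/* Fixed expression: bitwise OR keeps major in the high byte and minor
 * in the low byte, e.g. 3.9 becomes 0x309. */
static unsigned pack_version_bitwise(unsigned major, unsigned minor)
{
    return (major << 8) | minor;
}
```

The outer parentheses in the macro matter as well: `==` binds more tightly than `|`, so an unparenthesized expansion inside `#if HAVE_LLVM == 0x307` would compare only the minor version against `0x307`.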

@@ -31,7 +31,6 @@
#include "blend_jit.h"
#include "builder.h"
#include "state_llvm.h"
#include "llvm/IR/DataLayout.h"
#include <sstream>
@@ -725,12 +724,7 @@ struct BlendJit : public Builder
JitManager::DumpToFile(blendFunc, "");
#if HAVE_LLVM == 0x306
FunctionPassManager
#else
llvm::legacy::FunctionPassManager
#endif
passes(JM()->mpCurrentModule);
::FunctionPassManager passes(JM()->mpCurrentModule);
passes.add(createBreakCriticalEdgesPass());
passes.add(createCFGSimplificationPass());

@@ -30,8 +30,6 @@
#include "builder.h"
#include "common/rdtsc_buckets.h"
#include "llvm/Support/DynamicLibrary.h"
void __cdecl CallPrint(const char* fmt, ...);
//////////////////////////////////////////////////////////////////////////
@@ -322,6 +320,32 @@ CallInst *Builder::CALL(Value *Callee, const std::initializer_list<Value*> &args
return CALLA(Callee, args);
}
#if HAVE_LLVM > 0x306
CallInst *Builder::CALL(Value *Callee, Value* arg)
{
std::vector<Value*> args;
args.push_back(arg);
return CALLA(Callee, args);
}
CallInst *Builder::CALL2(Value *Callee, Value* arg1, Value* arg2)
{
std::vector<Value*> args;
args.push_back(arg1);
args.push_back(arg2);
return CALLA(Callee, args);
}
CallInst *Builder::CALL3(Value *Callee, Value* arg1, Value* arg2, Value* arg3)
{
std::vector<Value*> args;
args.push_back(arg1);
args.push_back(arg2);
args.push_back(arg3);
return CALLA(Callee, args);
}
#endif
Value *Builder::VRCP(Value *va)
{
return FDIV(VIMMED1(1.0f), va); // 1 / a
@@ -676,20 +700,22 @@ Value *Builder::PSHUFB(Value* a, Value* b)
/// lower 8 values are used.
Value *Builder::PMOVSXBD(Value* a)
{
Value* res;
// llvm-3.9 removed the pmovsxbd intrinsic
#if HAVE_LLVM < 0x309
// use avx2 byte sign extend instruction if available
if(JM()->mArch.AVX2())
{
res = VPMOVSXBD(a);
Function *pmovsxbd = Intrinsic::getDeclaration(JM()->mpCurrentModule, Intrinsic::x86_avx2_pmovsxbd);
return CALL(pmovsxbd, std::initializer_list<Value*>{a});
}
else
#endif
{
// VPMOVSXBD output type
Type* v8x32Ty = VectorType::get(mInt32Ty, 8);
// Extract 8 values from 128bit lane and sign extend
res = S_EXT(VSHUFFLE(a, a, C<int>({0, 1, 2, 3, 4, 5, 6, 7})), v8x32Ty);
return S_EXT(VSHUFFLE(a, a, C<int>({0, 1, 2, 3, 4, 5, 6, 7})), v8x32Ty);
}
return res;
}
//////////////////////////////////////////////////////////////////////////
@@ -698,20 +724,22 @@ Value *Builder::PMOVSXBD(Value* a)
/// @param a - 128bit SIMD lane(8x16bit) of 16bit integer values.
Value *Builder::PMOVSXWD(Value* a)
{
Value* res;
// llvm-3.9 removed the pmovsxwd intrinsic
#if HAVE_LLVM < 0x309
// use avx2 word sign extend if available
if(JM()->mArch.AVX2())
{
res = VPMOVSXWD(a);
Function *pmovsxwd = Intrinsic::getDeclaration(JM()->mpCurrentModule, Intrinsic::x86_avx2_pmovsxwd);
return CALL(pmovsxwd, std::initializer_list<Value*>{a});
}
else
#endif
{
// VPMOVSXWD output type
Type* v8x32Ty = VectorType::get(mInt32Ty, 8);
// Extract 8 values from 128bit lane and sign extend
res = S_EXT(VSHUFFLE(a, a, C<int>({0, 1, 2, 3, 4, 5, 6, 7})), v8x32Ty);
return S_EXT(VSHUFFLE(a, a, C<int>({0, 1, 2, 3, 4, 5, 6, 7})), v8x32Ty);
}
return res;
}
//////////////////////////////////////////////////////////////////////////
@@ -726,8 +754,7 @@ Value *Builder::PERMD(Value* a, Value* idx)
// use avx2 permute instruction if available
if(JM()->mArch.AVX2())
{
// llvm 3.6.0 swapped the order of the args to vpermd
res = VPERMD(idx, a);
res = VPERMD(a, idx);
}
else
{
@@ -852,9 +879,15 @@ Value *Builder::CVTPS2PH(Value* a, Value* rounding)
Value *Builder::PMAXSD(Value* a, Value* b)
{
// llvm-3.9 removed the pmax intrinsics
#if HAVE_LLVM >= 0x309
Value* cmp = ICMP_SGT(a, b);
return SELECT(cmp, a, b);
#else
if (JM()->mArch.AVX2())
{
return VPMAXSD(a, b);
Function* pmaxsd = Intrinsic::getDeclaration(JM()->mpCurrentModule, Intrinsic::x86_avx2_pmaxs_d);
return CALL(pmaxsd, {a, b});
}
else
{
@@ -877,13 +910,20 @@ Value *Builder::PMAXSD(Value* a, Value* b)
return result;
}
#endif
}
Value *Builder::PMINSD(Value* a, Value* b)
{
// llvm-3.9 removed the pmin intrinsics
#if HAVE_LLVM >= 0x309
Value* cmp = ICMP_SLT(a, b);
return SELECT(cmp, a, b);
#else
if (JM()->mArch.AVX2())
{
return VPMINSD(a, b);
Function* pminsd = Intrinsic::getDeclaration(JM()->mpCurrentModule, Intrinsic::x86_avx2_pmins_d);
return CALL(pminsd, {a, b});
}
else
{
@@ -906,6 +946,7 @@ Value *Builder::PMINSD(Value* a, Value* b)
return result;
}
#endif
}
void Builder::Gather4(const SWR_FORMAT format, Value* pSrcBase, Value* byteOffsets,
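
For LLVM 3.9 and later, the `PMAXSD` and `PMINSD` hunks above drop the AVX2 intrinsics in favour of a signed compare followed by a select. The scalar C sketch below mirrors that lane by lane. The 8-lane width and the function names are assumptions made for illustration, and this is plain C, not the builder's IR-emitting code.

```c
#include <stdint.h>

#define LANES_SKETCH 8 /* assumed lane count: 8 x 32-bit */

/* Per-lane signed maximum: cmp = a > b, then select(cmp, a, b). */
static void pmaxsd_sketch(const int32_t *a, const int32_t *b, int32_t *out)
{
    for (int i = 0; i < LANES_SKETCH; i++) {
        int cmp = a[i] > b[i];
        out[i] = cmp ? a[i] : b[i];
    }
}

/* Per-lane signed minimum: cmp = a < b, then select(cmp, a, b). */
static void pminsd_sketch(const int32_t *a, const int32_t *b, int32_t *out)
{
    for (int i = 0; i < LANES_SKETCH; i++) {
        int cmp = a[i] < b[i];
        out[i] = cmp ? a[i] : b[i];
    }
}
```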

@@ -72,6 +72,12 @@ int32_t S_IMMED(Value* i);
Value *GEP(Value* ptr, const std::initializer_list<Value*> &indexList);
Value *GEP(Value* ptr, const std::initializer_list<uint32_t> &indexList);
CallInst *CALL(Value *Callee, const std::initializer_list<Value*> &args);
#if HAVE_LLVM > 0x306
CallInst *CALL(Value *Callee) { return CALLA(Callee); }
CallInst *CALL(Value *Callee, Value* arg);
CallInst *CALL2(Value *Callee, Value* arg1, Value* arg2);
CallInst *CALL3(Value *Callee, Value* arg1, Value* arg2, Value* arg3);
#endif
LoadInst *LOAD(Value *BasePtr, const std::initializer_list<uint32_t> &offset, const llvm::Twine& name = "");
LoadInst *LOADV(Value *BasePtr, const std::initializer_list<Value*> &offset, const llvm::Twine& name = "");

@@ -31,7 +31,6 @@
#include "fetch_jit.h"
#include "builder.h"
#include "state_llvm.h"
#include "llvm/IR/DataLayout.h"
#include <sstream>
#include <tuple>
@@ -181,12 +180,7 @@ Function* FetchJit::Create(const FETCH_COMPILE_STATE& fetchState)
verifyFunction(*fetch);
#if HAVE_LLVM == 0x306
FunctionPassManager
#else
llvm::legacy::FunctionPassManager
#endif
setupPasses(JM()->mpCurrentModule);
::FunctionPassManager setupPasses(JM()->mpCurrentModule);
///@todo We don't need the CFG passes for fetch. (e.g. BreakCriticalEdges and CFGSimplification)
setupPasses.add(createBreakCriticalEdgesPass());
@@ -198,12 +192,7 @@ Function* FetchJit::Create(const FETCH_COMPILE_STATE& fetchState)
JitManager::DumpToFile(fetch, "se");
#if HAVE_LLVM == 0x306
FunctionPassManager
#else
llvm::legacy::FunctionPassManager
#endif
optPasses(JM()->mpCurrentModule);
::FunctionPassManager optPasses(JM()->mpCurrentModule);
///@todo Haven't touched these either. Need to remove some of these and add others.
optPasses.add(createCFGSimplificationPass());

@@ -91,8 +91,6 @@ intrinsics = [
["VRCPPS", "x86_avx_rcp_ps_256", ["a"]],
["VMINPS", "x86_avx_min_ps_256", ["a", "b"]],
["VMAXPS", "x86_avx_max_ps_256", ["a", "b"]],
["VPMINSD", "x86_avx2_pmins_d", ["a", "b"]],
["VPMAXSD", "x86_avx2_pmaxs_d", ["a", "b"]],
["VROUND", "x86_avx_round_ps_256", ["a", "rounding"]],
["VCMPPS", "x86_avx_cmp_ps_256", ["a", "b", "cmpop"]],
["VBLENDVPS", "x86_avx_blendv_ps_256", ["a", "b", "mask"]],
@@ -100,9 +98,7 @@ intrinsics = [
["VMASKLOADD", "x86_avx2_maskload_d_256", ["src", "mask"]],
["VMASKMOVPS", "x86_avx_maskload_ps_256", ["src", "mask"]],
["VPSHUFB", "x86_avx2_pshuf_b", ["a", "b"]],
["VPMOVSXBD", "x86_avx2_pmovsxbd", ["a"]], # sign extend packed 8bit components
["VPMOVSXWD", "x86_avx2_pmovsxwd", ["a"]], # sign extend packed 16bit components
["VPERMD", "x86_avx2_permd", ["idx", "a"]],
["VPERMD", "x86_avx2_permd", ["a", "idx"]],
["VPERMPS", "x86_avx2_permps", ["idx", "a"]],
["VCVTPH2PS", "x86_vcvtph2ps_256", ["a"]],
["VCVTPS2PH", "x86_vcvtps2ph_256", ["a", "round"]],
@@ -110,7 +106,6 @@ intrinsics = [
["VPTESTC", "x86_avx_ptestc_256", ["a", "b"]],
["VPTESTZ", "x86_avx_ptestz_256", ["a", "b"]],
["VFMADDPS", "x86_fma_vfmadd_ps_256", ["a", "b", "c"]],
["VCVTTPS2DQ", "x86_avx_cvtt_ps2dq_256", ["a"]],
["VMOVMSKPS", "x86_avx_movmsk_ps_256", ["a"]],
["INTERRUPT", "x86_int", ["a"]],
]
@@ -352,7 +347,29 @@ def generate_x86_cpp(output_file):
'Value *Builder::%s(%s)' % (inst[0], args),
'{',
' Function *func = Intrinsic::getDeclaration(JM()->mpCurrentModule, Intrinsic::%s);' % inst[1],
]
if inst[0] == "VPERMD":
rev_args = ''
first = True
for arg in reversed(inst[2]):
if not first:
rev_args += ', '
rev_args += arg
first = False
output_lines += [
'#if (HAVE_LLVM == 0x306) && (LLVM_VERSION_PATCH == 0)',
' return CALL(func, std::initializer_list<Value*>{%s});' % rev_args,
'#else',
]
output_lines += [
' return CALL(func, std::initializer_list<Value*>{%s});' % pass_args,
]
if inst[0] == "VPERMD":
output_lines += [
'#endif',
]
output_lines += [
'}',
'',
]

@@ -292,12 +292,7 @@ struct StreamOutJit : public Builder
JitManager::DumpToFile(soFunc, "SoFunc");
#if HAVE_LLVM == 0x306
FunctionPassManager
#else
llvm::legacy::FunctionPassManager
#endif
passes(JM()->mpCurrentModule);
::FunctionPassManager passes(JM()->mpCurrentModule);
passes.add(createBreakCriticalEdgesPass());
passes.add(createCFGSimplificationPass());

@@ -17,16 +17,19 @@ const char * const __glXDispatchTableStrings[DI_LAST_INDEX] = {
#define __ATTRIB(field) \
[DI_##field] = "glX"#field
__ATTRIB(BindSwapBarrierSGIX),
__ATTRIB(BindTexImageEXT),
// glXChooseFBConfig implemented by libglvnd
__ATTRIB(ChooseFBConfigSGIX),
// glXChooseVisual implemented by libglvnd
// glXCopyContext implemented by libglvnd
__ATTRIB(CopySubBufferMESA),
// glXCreateContext implemented by libglvnd
__ATTRIB(CreateContextAttribsARB),
__ATTRIB(CreateContextWithConfigSGIX),
__ATTRIB(CreateGLXPbufferSGIX),
// glXCreateGLXPixmap implemented by libglvnd
__ATTRIB(CreateGLXPixmapMESA),
__ATTRIB(CreateGLXPixmapWithConfigSGIX),
// glXCreateNewContext implemented by libglvnd
// glXCreatePbuffer implemented by libglvnd
@@ -51,54 +54,50 @@ const char * const __glXDispatchTableStrings[DI_LAST_INDEX] = {
__ATTRIB(GetFBConfigAttribSGIX),
__ATTRIB(GetFBConfigFromVisualSGIX),
// glXGetFBConfigs implemented by libglvnd
__ATTRIB(GetMscRateOML),
// glXGetProcAddress implemented by libglvnd
// glXGetProcAddressARB implemented by libglvnd
__ATTRIB(GetScreenDriver),
// glXGetSelectedEvent implemented by libglvnd
__ATTRIB(GetSelectedEventSGIX),
__ATTRIB(GetSwapIntervalMESA),
__ATTRIB(GetSyncValuesOML),
__ATTRIB(GetVideoSyncSGI),
// glXGetVisualFromFBConfig implemented by libglvnd
__ATTRIB(GetVisualFromFBConfigSGIX),
// glXImportContextEXT implemented by libglvnd
// glXIsDirect implemented by libglvnd
__ATTRIB(JoinSwapGroupSGIX),
// glXMakeContextCurrent implemented by libglvnd
// glXMakeCurrent implemented by libglvnd
// glXQueryContext implemented by libglvnd
__ATTRIB(QueryContextInfoEXT),
__ATTRIB(QueryCurrentRendererIntegerMESA),
__ATTRIB(QueryCurrentRendererStringMESA),
// glXQueryDrawable implemented by libglvnd
// glXQueryExtension implemented by libglvnd
// glXQueryExtensionsString implemented by libglvnd
__ATTRIB(QueryGLXPbufferSGIX),
__ATTRIB(QueryMaxSwapBarriersSGIX),
__ATTRIB(QueryRendererIntegerMESA),
__ATTRIB(QueryRendererStringMESA),
// glXQueryServerString implemented by libglvnd
// glXQueryVersion implemented by libglvnd
__ATTRIB(ReleaseBuffersMESA),
__ATTRIB(ReleaseTexImageEXT),
// glXSelectEvent implemented by libglvnd
__ATTRIB(SelectEventSGIX),
// glXSwapBuffers implemented by libglvnd
__ATTRIB(SwapBuffersMscOML),
__ATTRIB(SwapIntervalMESA),
__ATTRIB(SwapIntervalSGI),
// glXUseXFont implemented by libglvnd
__ATTRIB(WaitForMscOML),
__ATTRIB(WaitForSbcOML),
// glXWaitGL implemented by libglvnd
__ATTRIB(WaitVideoSyncSGI),
// glXWaitX implemented by libglvnd
__ATTRIB(glXBindSwapBarrierSGIX),
__ATTRIB(glXCopySubBufferMESA),
__ATTRIB(glXCreateGLXPixmapMESA),
__ATTRIB(glXGetMscRateOML),
__ATTRIB(glXGetScreenDriver),
__ATTRIB(glXGetSwapIntervalMESA),
__ATTRIB(glXGetSyncValuesOML),
__ATTRIB(glXJoinSwapGroupSGIX),
__ATTRIB(glXQueryCurrentRendererIntegerMESA),
__ATTRIB(glXQueryCurrentRendererStringMESA),
__ATTRIB(glXQueryMaxSwapBarriersSGIX),
__ATTRIB(glXQueryRendererIntegerMESA),
__ATTRIB(glXQueryRendererStringMESA),
__ATTRIB(glXReleaseBuffersMESA),
__ATTRIB(glXSwapBuffersMscOML),
__ATTRIB(glXSwapIntervalMESA),
__ATTRIB(glXWaitForMscOML),
__ATTRIB(glXWaitForSbcOML),
#undef __ATTRIB
};
@@ -557,49 +556,49 @@ static int dispatch_WaitVideoSyncSGI(int divisor, int remainder,
static void dispatch_glXBindSwapBarrierSGIX(Display *dpy, GLXDrawable drawable,
static void dispatch_BindSwapBarrierSGIX(Display *dpy, GLXDrawable drawable,
int barrier)
{
PFNGLXBINDSWAPBARRIERSGIXPROC pglXBindSwapBarrierSGIX;
PFNGLXBINDSWAPBARRIERSGIXPROC pBindSwapBarrierSGIX;
__GLXvendorInfo *dd;
dd = GetDispatchFromDrawable(dpy, drawable);
if (dd == NULL)
return;
__FETCH_FUNCTION_PTR(glXBindSwapBarrierSGIX);
if (pglXBindSwapBarrierSGIX == NULL)
__FETCH_FUNCTION_PTR(BindSwapBarrierSGIX);
if (pBindSwapBarrierSGIX == NULL)
return;
(*pglXBindSwapBarrierSGIX)(dpy, drawable, barrier);
(*pBindSwapBarrierSGIX)(dpy, drawable, barrier);
}
static void dispatch_glXCopySubBufferMESA(Display *dpy, GLXDrawable drawable,
static void dispatch_CopySubBufferMESA(Display *dpy, GLXDrawable drawable,
int x, int y, int width, int height)
{
PFNGLXCOPYSUBBUFFERMESAPROC pglXCopySubBufferMESA;
PFNGLXCOPYSUBBUFFERMESAPROC pCopySubBufferMESA;
__GLXvendorInfo *dd;
dd = GetDispatchFromDrawable(dpy, drawable);
if (dd == NULL)
return;
__FETCH_FUNCTION_PTR(glXCopySubBufferMESA);
if (pglXCopySubBufferMESA == NULL)
__FETCH_FUNCTION_PTR(CopySubBufferMESA);
if (pCopySubBufferMESA == NULL)
return;
(*pglXCopySubBufferMESA)(dpy, drawable, x, y, width, height);
(*pCopySubBufferMESA)(dpy, drawable, x, y, width, height);
}
static GLXPixmap dispatch_glXCreateGLXPixmapMESA(Display *dpy,
static GLXPixmap dispatch_CreateGLXPixmapMESA(Display *dpy,
XVisualInfo *visinfo,
Pixmap pixmap, Colormap cmap)
{
PFNGLXCREATEGLXPIXMAPMESAPROC pglXCreateGLXPixmapMESA;
PFNGLXCREATEGLXPIXMAPMESAPROC pCreateGLXPixmapMESA;
__GLXvendorInfo *dd;
GLXPixmap ret;
@@ -607,11 +606,11 @@ static GLXPixmap dispatch_glXCreateGLXPixmapMESA(Display *dpy,
if (dd == NULL)
return None;
__FETCH_FUNCTION_PTR(glXCreateGLXPixmapMESA);
if (pglXCreateGLXPixmapMESA == NULL)
__FETCH_FUNCTION_PTR(CreateGLXPixmapMESA);
if (pCreateGLXPixmapMESA == NULL)
return None;
ret = (*pglXCreateGLXPixmapMESA)(dpy, visinfo, pixmap, cmap);
ret = (*pCreateGLXPixmapMESA)(dpy, visinfo, pixmap, cmap);
if (AddDrawableMapping(dpy, ret, dd)) {
/* XXX: Call glXDestroyGLXPixmap which lives in libglvnd. If we're not
* allowed to call it from here, should we extend __glXDispatchTableIndices ?
@@ -624,47 +623,47 @@ static GLXPixmap dispatch_glXCreateGLXPixmapMESA(Display *dpy,
static GLboolean dispatch_glXGetMscRateOML(Display *dpy, GLXDrawable drawable,
static GLboolean dispatch_GetMscRateOML(Display *dpy, GLXDrawable drawable,
int32_t *numerator, int32_t *denominator)
{
PFNGLXGETMSCRATEOMLPROC pglXGetMscRateOML;
PFNGLXGETMSCRATEOMLPROC pGetMscRateOML;
__GLXvendorInfo *dd;
dd = GetDispatchFromDrawable(dpy, drawable);
if (dd == NULL)
return GL_FALSE;
__FETCH_FUNCTION_PTR(glXGetMscRateOML);
if (pglXGetMscRateOML == NULL)
__FETCH_FUNCTION_PTR(GetMscRateOML);
if (pGetMscRateOML == NULL)
return GL_FALSE;
return (*pglXGetMscRateOML)(dpy, drawable, numerator, denominator);
return (*pGetMscRateOML)(dpy, drawable, numerator, denominator);
}
static const char *dispatch_glXGetScreenDriver(Display *dpy, int scrNum)
static const char *dispatch_GetScreenDriver(Display *dpy, int scrNum)
{
typedef const char *(*fn_glXGetScreenDriver_ptr)(Display *dpy, int scrNum);
fn_glXGetScreenDriver_ptr pglXGetScreenDriver;
fn_glXGetScreenDriver_ptr pGetScreenDriver;
__GLXvendorInfo *dd;
dd = __VND->getDynDispatch(dpy, scrNum);
if (dd == NULL)
return NULL;
__FETCH_FUNCTION_PTR(glXGetScreenDriver);
if (pglXGetScreenDriver == NULL)
__FETCH_FUNCTION_PTR(GetScreenDriver);
if (pGetScreenDriver == NULL)
return NULL;
return (*pglXGetScreenDriver)(dpy, scrNum);
return (*pGetScreenDriver)(dpy, scrNum);
}
static int dispatch_glXGetSwapIntervalMESA(void)
static int dispatch_GetSwapIntervalMESA(void)
{
PFNGLXGETSWAPINTERVALMESAPROC pglXGetSwapIntervalMESA;
PFNGLXGETSWAPINTERVALMESAPROC pGetSwapIntervalMESA;
__GLXvendorInfo *dd;
if (!__VND->getCurrentContext())
@@ -674,57 +673,57 @@ static int dispatch_glXGetSwapIntervalMESA(void)
if (dd == NULL)
return 0;
__FETCH_FUNCTION_PTR(glXGetSwapIntervalMESA);
if (pglXGetSwapIntervalMESA == NULL)
__FETCH_FUNCTION_PTR(GetSwapIntervalMESA);
if (pGetSwapIntervalMESA == NULL)
return 0;
return (*pglXGetSwapIntervalMESA)();
return (*pGetSwapIntervalMESA)();
}
static Bool dispatch_glXGetSyncValuesOML(Display *dpy, GLXDrawable drawable,
static Bool dispatch_GetSyncValuesOML(Display *dpy, GLXDrawable drawable,
int64_t *ust, int64_t *msc, int64_t *sbc)
{
PFNGLXGETSYNCVALUESOMLPROC pglXGetSyncValuesOML;
PFNGLXGETSYNCVALUESOMLPROC pGetSyncValuesOML;
__GLXvendorInfo *dd;
dd = GetDispatchFromDrawable(dpy, drawable);
if (dd == NULL)
return False;
__FETCH_FUNCTION_PTR(glXGetSyncValuesOML);
if (pglXGetSyncValuesOML == NULL)
__FETCH_FUNCTION_PTR(GetSyncValuesOML);
if (pGetSyncValuesOML == NULL)
return False;
return (*pglXGetSyncValuesOML)(dpy, drawable, ust, msc, sbc);
return (*pGetSyncValuesOML)(dpy, drawable, ust, msc, sbc);
}
static void dispatch_glXJoinSwapGroupSGIX(Display *dpy, GLXDrawable drawable,
static void dispatch_JoinSwapGroupSGIX(Display *dpy, GLXDrawable drawable,
GLXDrawable member)
{
PFNGLXJOINSWAPGROUPSGIXPROC pglXJoinSwapGroupSGIX;
PFNGLXJOINSWAPGROUPSGIXPROC pJoinSwapGroupSGIX;
__GLXvendorInfo *dd;
dd = GetDispatchFromDrawable(dpy, drawable);
if (dd == NULL)
return;
__FETCH_FUNCTION_PTR(glXJoinSwapGroupSGIX);
if (pglXJoinSwapGroupSGIX == NULL)
__FETCH_FUNCTION_PTR(JoinSwapGroupSGIX);
if (pJoinSwapGroupSGIX == NULL)
return;
(*pglXJoinSwapGroupSGIX)(dpy, drawable, member);
(*pJoinSwapGroupSGIX)(dpy, drawable, member);
}
static Bool dispatch_glXQueryCurrentRendererIntegerMESA(int attribute,
static Bool dispatch_QueryCurrentRendererIntegerMESA(int attribute,
unsigned int *value)
{
PFNGLXQUERYCURRENTRENDERERINTEGERMESAPROC pglXQueryCurrentRendererIntegerMESA;
PFNGLXQUERYCURRENTRENDERERINTEGERMESAPROC pQueryCurrentRendererIntegerMESA;
__GLXvendorInfo *dd;
if (!__VND->getCurrentContext())
@@ -734,18 +733,18 @@ static Bool dispatch_glXQueryCurrentRendererIntegerMESA(int attribute,
if (dd == NULL)
return False;
__FETCH_FUNCTION_PTR(glXQueryCurrentRendererIntegerMESA);
if (pglXQueryCurrentRendererIntegerMESA == NULL)
__FETCH_FUNCTION_PTR(QueryCurrentRendererIntegerMESA);
if (pQueryCurrentRendererIntegerMESA == NULL)
return False;
return (*pglXQueryCurrentRendererIntegerMESA)(attribute, value);
return (*pQueryCurrentRendererIntegerMESA)(attribute, value);
}
static const char *dispatch_glXQueryCurrentRendererStringMESA(int attribute)
static const char *dispatch_QueryCurrentRendererStringMESA(int attribute)
{
PFNGLXQUERYCURRENTRENDERERSTRINGMESAPROC pglXQueryCurrentRendererStringMESA;
PFNGLXQUERYCURRENTRENDERERSTRINGMESAPROC pQueryCurrentRendererStringMESA;
__GLXvendorInfo *dd;
if (!__VND->getCurrentContext())
@@ -755,114 +754,114 @@ static const char *dispatch_glXQueryCurrentRendererStringMESA(int attribute)
if (dd == NULL)
return NULL;
__FETCH_FUNCTION_PTR(glXQueryCurrentRendererStringMESA);
if (pglXQueryCurrentRendererStringMESA == NULL)
__FETCH_FUNCTION_PTR(QueryCurrentRendererStringMESA);
if (pQueryCurrentRendererStringMESA == NULL)
return NULL;
return (*pglXQueryCurrentRendererStringMESA)(attribute);
return (*pQueryCurrentRendererStringMESA)(attribute);
}
static Bool dispatch_glXQueryMaxSwapBarriersSGIX(Display *dpy, int screen,
static Bool dispatch_QueryMaxSwapBarriersSGIX(Display *dpy, int screen,
int *max)
{
PFNGLXQUERYMAXSWAPBARRIERSSGIXPROC pglXQueryMaxSwapBarriersSGIX;
PFNGLXQUERYMAXSWAPBARRIERSSGIXPROC pQueryMaxSwapBarriersSGIX;
__GLXvendorInfo *dd;
dd = __VND->getDynDispatch(dpy, screen);
if (dd == NULL)
return False;
__FETCH_FUNCTION_PTR(glXQueryMaxSwapBarriersSGIX);
if (pglXQueryMaxSwapBarriersSGIX == NULL)
__FETCH_FUNCTION_PTR(QueryMaxSwapBarriersSGIX);
if (pQueryMaxSwapBarriersSGIX == NULL)
return False;
return (*pglXQueryMaxSwapBarriersSGIX)(dpy, screen, max);
return (*pQueryMaxSwapBarriersSGIX)(dpy, screen, max);
}
static Bool dispatch_glXQueryRendererIntegerMESA(Display *dpy, int screen,
static Bool dispatch_QueryRendererIntegerMESA(Display *dpy, int screen,
int renderer, int attribute,
unsigned int *value)
{
PFNGLXQUERYRENDERERINTEGERMESAPROC pglXQueryRendererIntegerMESA;
PFNGLXQUERYRENDERERINTEGERMESAPROC pQueryRendererIntegerMESA;
__GLXvendorInfo *dd;
dd = __VND->getDynDispatch(dpy, screen);
if (dd == NULL)
return False;
__FETCH_FUNCTION_PTR(glXQueryRendererIntegerMESA);
if (pglXQueryRendererIntegerMESA == NULL)
__FETCH_FUNCTION_PTR(QueryRendererIntegerMESA);
if (pQueryRendererIntegerMESA == NULL)
return False;
return (*pglXQueryRendererIntegerMESA)(dpy, screen, renderer, attribute, value);
return (*pQueryRendererIntegerMESA)(dpy, screen, renderer, attribute, value);
}
static const char *dispatch_glXQueryRendererStringMESA(Display *dpy, int screen,
static const char *dispatch_QueryRendererStringMESA(Display *dpy, int screen,
int renderer, int attribute)
{
PFNGLXQUERYRENDERERSTRINGMESAPROC pglXQueryRendererStringMESA;
PFNGLXQUERYRENDERERSTRINGMESAPROC pQueryRendererStringMESA;
__GLXvendorInfo *dd = NULL;
dd = __VND->getDynDispatch(dpy, screen);
if (dd == NULL)
return NULL;
__FETCH_FUNCTION_PTR(glXQueryRendererStringMESA);
if (pglXQueryRendererStringMESA == NULL)
__FETCH_FUNCTION_PTR(QueryRendererStringMESA);
if (pQueryRendererStringMESA == NULL)
return NULL;
return (*pglXQueryRendererStringMESA)(dpy, screen, renderer, attribute);
return (*pQueryRendererStringMESA)(dpy, screen, renderer, attribute);
}
static Bool dispatch_glXReleaseBuffersMESA(Display *dpy, GLXDrawable d)
static Bool dispatch_ReleaseBuffersMESA(Display *dpy, GLXDrawable d)
{
PFNGLXRELEASEBUFFERSMESAPROC pglXReleaseBuffersMESA;
PFNGLXRELEASEBUFFERSMESAPROC pReleaseBuffersMESA;
__GLXvendorInfo *dd;
dd = GetDispatchFromDrawable(dpy, d);
if (dd == NULL)
return False;
__FETCH_FUNCTION_PTR(glXReleaseBuffersMESA);
if (pglXReleaseBuffersMESA == NULL)
__FETCH_FUNCTION_PTR(ReleaseBuffersMESA);
if (pReleaseBuffersMESA == NULL)
return False;
return (*pglXReleaseBuffersMESA)(dpy, d);
return (*pReleaseBuffersMESA)(dpy, d);
}
static int64_t dispatch_glXSwapBuffersMscOML(Display *dpy, GLXDrawable drawable,
static int64_t dispatch_SwapBuffersMscOML(Display *dpy, GLXDrawable drawable,
int64_t target_msc, int64_t divisor,
int64_t remainder)
{
PFNGLXSWAPBUFFERSMSCOMLPROC pglXSwapBuffersMscOML;
PFNGLXSWAPBUFFERSMSCOMLPROC pSwapBuffersMscOML;
__GLXvendorInfo *dd;
dd = GetDispatchFromDrawable(dpy, drawable);
if (dd == NULL)
return 0;
__FETCH_FUNCTION_PTR(glXSwapBuffersMscOML);
if (pglXSwapBuffersMscOML == NULL)
__FETCH_FUNCTION_PTR(SwapBuffersMscOML);
if (pSwapBuffersMscOML == NULL)
return 0;
return (*pglXSwapBuffersMscOML)(dpy, drawable, target_msc, divisor, remainder);
return (*pSwapBuffersMscOML)(dpy, drawable, target_msc, divisor, remainder);
}
static int dispatch_glXSwapIntervalMESA(unsigned int interval)
static int dispatch_SwapIntervalMESA(unsigned int interval)
{
PFNGLXSWAPINTERVALMESAPROC pglXSwapIntervalMESA;
PFNGLXSWAPINTERVALMESAPROC pSwapIntervalMESA;
__GLXvendorInfo *dd;
if (!__VND->getCurrentContext())
@@ -872,52 +871,52 @@ static int dispatch_glXSwapIntervalMESA(unsigned int interval)
if (dd == NULL)
return 0;
__FETCH_FUNCTION_PTR(glXSwapIntervalMESA);
if (pglXSwapIntervalMESA == NULL)
__FETCH_FUNCTION_PTR(SwapIntervalMESA);
if (pSwapIntervalMESA == NULL)
return 0;
return (*pglXSwapIntervalMESA)(interval);
return (*pSwapIntervalMESA)(interval);
}
static Bool dispatch_glXWaitForMscOML(Display *dpy, GLXDrawable drawable,
static Bool dispatch_WaitForMscOML(Display *dpy, GLXDrawable drawable,
int64_t target_msc, int64_t divisor,
int64_t remainder, int64_t *ust,
int64_t *msc, int64_t *sbc)
{
PFNGLXWAITFORMSCOMLPROC pglXWaitForMscOML;
PFNGLXWAITFORMSCOMLPROC pWaitForMscOML;
__GLXvendorInfo *dd;
dd = GetDispatchFromDrawable(dpy, drawable);
if (dd == NULL)
return False;
__FETCH_FUNCTION_PTR(glXWaitForMscOML);
if (pglXWaitForMscOML == NULL)
__FETCH_FUNCTION_PTR(WaitForMscOML);
if (pWaitForMscOML == NULL)
return False;
return (*pglXWaitForMscOML)(dpy, drawable, target_msc, divisor, remainder, ust, msc, sbc);
return (*pWaitForMscOML)(dpy, drawable, target_msc, divisor, remainder, ust, msc, sbc);
}
static Bool dispatch_glXWaitForSbcOML(Display *dpy, GLXDrawable drawable,
static Bool dispatch_WaitForSbcOML(Display *dpy, GLXDrawable drawable,
int64_t target_sbc, int64_t *ust,
int64_t *msc, int64_t *sbc)
{
PFNGLXWAITFORSBCOMLPROC pglXWaitForSbcOML;
PFNGLXWAITFORSBCOMLPROC pWaitForSbcOML;
__GLXvendorInfo *dd;
dd = GetDispatchFromDrawable(dpy, drawable);
if (dd == NULL)
return False;
__FETCH_FUNCTION_PTR(glXWaitForSbcOML);
if (pglXWaitForSbcOML == NULL)
__FETCH_FUNCTION_PTR(WaitForSbcOML);
if (pWaitForSbcOML == NULL)
return False;
return (*pglXWaitForSbcOML)(dpy, drawable, target_sbc, ust, msc, sbc);
return (*pWaitForSbcOML)(dpy, drawable, target_sbc, ust, msc, sbc);
}
#undef __FETCH_FUNCTION_PTR
@@ -928,45 +927,44 @@ const void * const __glXDispatchFunctions[DI_LAST_INDEX + 1] = {
#define __ATTRIB(field) \
[DI_##field] = (void *)dispatch_##field
__ATTRIB(BindTexImageEXT),
__ATTRIB(BindSwapBarrierSGIX),
__ATTRIB(BindTexImageEXT),
__ATTRIB(ChooseFBConfigSGIX),
__ATTRIB(CopySubBufferMESA),
__ATTRIB(CreateContextAttribsARB),
__ATTRIB(CreateContextWithConfigSGIX),
__ATTRIB(CreateGLXPbufferSGIX),
__ATTRIB(CreateGLXPixmapMESA),
__ATTRIB(CreateGLXPixmapWithConfigSGIX),
__ATTRIB(DestroyGLXPbufferSGIX),
__ATTRIB(GetContextIDEXT),
__ATTRIB(GetCurrentDisplayEXT),
__ATTRIB(GetFBConfigAttribSGIX),
__ATTRIB(GetFBConfigFromVisualSGIX),
__ATTRIB(GetMscRateOML),
__ATTRIB(GetScreenDriver),
__ATTRIB(GetSelectedEventSGIX),
__ATTRIB(GetSwapIntervalMESA),
__ATTRIB(GetSyncValuesOML),
__ATTRIB(GetVideoSyncSGI),
__ATTRIB(GetVisualFromFBConfigSGIX),
__ATTRIB(JoinSwapGroupSGIX),
__ATTRIB(QueryContextInfoEXT),
__ATTRIB(QueryCurrentRendererIntegerMESA),
__ATTRIB(QueryCurrentRendererStringMESA),
__ATTRIB(QueryGLXPbufferSGIX),
__ATTRIB(QueryMaxSwapBarriersSGIX),
__ATTRIB(QueryRendererIntegerMESA),
__ATTRIB(QueryRendererStringMESA),
__ATTRIB(ReleaseBuffersMESA),
__ATTRIB(ReleaseTexImageEXT),
__ATTRIB(SelectEventSGIX),
__ATTRIB(SwapBuffersMscOML),
__ATTRIB(SwapIntervalMESA),
__ATTRIB(SwapIntervalSGI),
__ATTRIB(WaitForMscOML),
__ATTRIB(WaitForSbcOML),
__ATTRIB(WaitVideoSyncSGI),
__ATTRIB(glXBindSwapBarrierSGIX),
__ATTRIB(glXCopySubBufferMESA),
__ATTRIB(glXCreateGLXPixmapMESA),
__ATTRIB(glXGetMscRateOML),
__ATTRIB(glXGetScreenDriver),
__ATTRIB(glXGetSwapIntervalMESA),
__ATTRIB(glXGetSyncValuesOML),
__ATTRIB(glXJoinSwapGroupSGIX),
__ATTRIB(glXQueryCurrentRendererIntegerMESA),
__ATTRIB(glXQueryCurrentRendererStringMESA),
__ATTRIB(glXQueryMaxSwapBarriersSGIX),
__ATTRIB(glXQueryRendererIntegerMESA),
__ATTRIB(glXQueryRendererStringMESA),
__ATTRIB(glXReleaseBuffersMESA),
__ATTRIB(glXSwapBuffersMscOML),
__ATTRIB(glXSwapIntervalMESA),
__ATTRIB(glXWaitForMscOML),
__ATTRIB(glXWaitForSbcOML),
[DI_LAST_INDEX] = NULL,
#undef __ATTRIB
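
The `__ATTRIB` tables above build each entry from one token: `DI_##field` pastes the index name and `"glX"#field` stringifies the exported symbol. A minimal sketch of that pattern follows, using a made-up three-entry table (all `DEMO_`/`demo_` names are hypothetical, not Mesa's). It suggests why an entry that already carries a `glX` prefix would end up with a doubled prefix in the string table.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical index enum, kept in alphabetical order of the bare names. */
typedef enum {
    DEMO_BindTexImageEXT,
    DEMO_CopySubBufferMESA,
    DEMO_glXWaitForSbcOML,   /* deliberately carries a stray glX prefix */
    DEMO_LAST_INDEX
} demo_index;

/* DEMO_##field pastes the index name; "glX" #field builds the symbol name. */
#define DEMO_ATTRIB(field) [DEMO_##field] = "glX" #field

static const char *const demo_names[DEMO_LAST_INDEX] = {
    DEMO_ATTRIB(BindTexImageEXT),
    DEMO_ATTRIB(CopySubBufferMESA),
    DEMO_ATTRIB(glXWaitForSbcOML),
};
#undef DEMO_ATTRIB

/* Returns the generated name for an index, or NULL when out of range. */
static const char *demo_name(unsigned i)
{
    return (i < DEMO_LAST_INDEX) ? demo_names[i] : NULL;
}

/* True when the generated symbol name starts with "glXglX". */
static int demo_has_doubled_prefix(const char *name)
{
    assert(name != NULL);
    return strncmp(name, "glXglX", 6) == 0;
}
```

Here `demo_name(DEMO_glXWaitForSbcOML)` is the only entry for which `demo_has_doubled_prefix()` is true, which is the kind of name a symbol lookup would never match.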

@@ -6,16 +6,19 @@
#define __glxlibglvnd_dispatchindex_h__
typedef enum __GLXdispatchIndex {
DI_BindSwapBarrierSGIX,
DI_BindTexImageEXT,
// ChooseFBConfig implemented by libglvnd
DI_ChooseFBConfigSGIX,
// ChooseVisual implemented by libglvnd
// CopyContext implemented by libglvnd
DI_CopySubBufferMESA,
// CreateContext implemented by libglvnd
DI_CreateContextAttribsARB,
DI_CreateContextWithConfigSGIX,
DI_CreateGLXPbufferSGIX,
// CreateGLXPixmap implemented by libglvnd
DI_CreateGLXPixmapMESA,
DI_CreateGLXPixmapWithConfigSGIX,
// CreateNewContext implemented by libglvnd
// CreatePbuffer implemented by libglvnd
@@ -40,6 +43,7 @@ typedef enum __GLXdispatchIndex {
DI_GetFBConfigAttribSGIX,
DI_GetFBConfigFromVisualSGIX,
// GetFBConfigs implemented by libglvnd
DI_GetMscRateOML,
// GetProcAddress implemented by libglvnd
// GetProcAddressARB implemented by libglvnd
// GetSelectedEvent implemented by libglvnd
@@ -47,45 +51,41 @@ typedef enum __GLXdispatchIndex {
DI_GetVideoSyncSGI,
// GetVisualFromFBConfig implemented by libglvnd
DI_GetVisualFromFBConfigSGIX,
DI_GetScreenDriver,
DI_GetSwapIntervalMESA,
DI_GetSyncValuesOML,
// ImportContextEXT implemented by libglvnd
// IsDirect implemented by libglvnd
DI_JoinSwapGroupSGIX,
// MakeContextCurrent implemented by libglvnd
// MakeCurrent implemented by libglvnd
// QueryContext implemented by libglvnd
DI_QueryContextInfoEXT,
DI_QueryCurrentRendererIntegerMESA,
DI_QueryCurrentRendererStringMESA,
// QueryDrawable implemented by libglvnd
// QueryExtension implemented by libglvnd
// QueryExtensionsString implemented by libglvnd
DI_QueryGLXPbufferSGIX,
DI_QueryMaxSwapBarriersSGIX,
DI_QueryRendererIntegerMESA,
DI_QueryRendererStringMESA,
// QueryServerString implemented by libglvnd
// QueryVersion implemented by libglvnd
DI_ReleaseBuffersMESA,
DI_ReleaseTexImageEXT,
// SelectEvent implemented by libglvnd
DI_SelectEventSGIX,
// SwapBuffers implemented by libglvnd
DI_SwapBuffersMscOML,
DI_SwapIntervalMESA,
DI_SwapIntervalSGI,
// UseXFont implemented by libglvnd
// WaitGL implemented by libglvnd
DI_WaitForMscOML,
DI_WaitForSbcOML,
DI_WaitVideoSyncSGI,
// WaitX implemented by libglvnd
DI_glXBindSwapBarrierSGIX,
DI_glXCopySubBufferMESA,
DI_glXCreateGLXPixmapMESA,
DI_glXGetMscRateOML,
DI_glXGetScreenDriver,
DI_glXGetSwapIntervalMESA,
DI_glXGetSyncValuesOML,
DI_glXJoinSwapGroupSGIX,
DI_glXQueryCurrentRendererIntegerMESA,
DI_glXQueryCurrentRendererStringMESA,
DI_glXQueryMaxSwapBarriersSGIX,
DI_glXQueryRendererIntegerMESA,
DI_glXQueryRendererStringMESA,
DI_glXReleaseBuffersMESA,
DI_glXSwapBuffersMscOML,
DI_glXSwapIntervalMESA,
DI_glXWaitForMscOML,
DI_glXWaitForSbcOML,
DI_LAST_INDEX
} __GLXdispatchIndex;

@@ -50,6 +50,9 @@ static void __glXGLVNDSetDispatchIndex(const GLubyte *procName, int index)
{
unsigned internalIndex = FindGLXFunction(procName);
if (internalIndex == DI_FUNCTION_COUNT)
return; /* unknown or static dispatch */
__glXDispatchTableIndices[internalIndex] = index;
}
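
The added guard above returns early when the lookup yields the "not found" sentinel instead of a valid slot. A minimal sketch of that idea, assuming the lookup is a binary search over a sorted name table (the real `FindGLXFunction()` is not shown in this hunk, so `demo_find()` and every other `demo_`/`DEMO_` name here is a hypothetical stand-in):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_FUNCTION_COUNT 3

/* Must stay sorted in strcmp() order for bsearch() to work. */
static const char *const demo_sorted_names[DEMO_FUNCTION_COUNT] = {
    "glXBindTexImageEXT",
    "glXCopySubBufferMESA",
    "glXWaitForSbcOML",
};

static int demo_dispatch_indices[DEMO_FUNCTION_COUNT] = { -1, -1, -1 };

/* bsearch() comparator: key is a string, elem points at a table entry. */
static int demo_compare(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Returns the table slot for name, or DEMO_FUNCTION_COUNT when not found. */
static unsigned demo_find(const char *name)
{
    assert(name != NULL);
    const char *const *hit = bsearch(name, demo_sorted_names,
                                     DEMO_FUNCTION_COUNT,
                                     sizeof(demo_sorted_names[0]),
                                     demo_compare);
    return hit ? (unsigned)(hit - demo_sorted_names) : DEMO_FUNCTION_COUNT;
}

/* Records an index only for known names. */
static void demo_set_index(const char *name, int index)
{
    unsigned slot = demo_find(name);
    if (slot == DEMO_FUNCTION_COUNT)
        return; /* unknown name: storing here would write past the array */
    demo_dispatch_indices[slot] = index;
}
```

Without the early return, an unknown name would store into `demo_dispatch_indices[DEMO_FUNCTION_COUNT]`, one element past the end of the array.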

@@ -2,6 +2,6 @@
"file_format_version": "1.0.0",
"ICD": {
"library_path": "@build_libdir@/libvulkan_intel.so",
"abi_versions": "1.0.3"
"api_version": "1.0.3"
}
}

@@ -1194,22 +1194,25 @@ void genX(CmdEndRenderPass)(
}
static void
emit_ps_depth_count(struct anv_batch *batch,
emit_ps_depth_count(struct anv_cmd_buffer *cmd_buffer,
struct anv_bo *bo, uint32_t offset)
{
anv_batch_emit(batch, GENX(PIPE_CONTROL), pc) {
anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
pc.DestinationAddressType = DAT_PPGTT;
pc.PostSyncOperation = WritePSDepthCount;
pc.DepthStallEnable = true;
pc.Address = (struct anv_address) { bo, offset };
if (GEN_GEN == 9 && cmd_buffer->device->info.gt == 4)
pc.CommandStreamerStallEnable = true;
}
}
static void
emit_query_availability(struct anv_batch *batch,
emit_query_availability(struct anv_cmd_buffer *cmd_buffer,
struct anv_bo *bo, uint32_t offset)
{
anv_batch_emit(batch, GENX(PIPE_CONTROL), pc) {
anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
pc.DestinationAddressType = DAT_PPGTT;
pc.PostSyncOperation = WriteImmediateData;
pc.Address = (struct anv_address) { bo, offset };
@@ -1242,7 +1245,7 @@ void genX(CmdBeginQuery)(
switch (pool->type) {
case VK_QUERY_TYPE_OCCLUSION:
emit_ps_depth_count(&cmd_buffer->batch, &pool->bo,
emit_ps_depth_count(cmd_buffer, &pool->bo,
query * sizeof(struct anv_query_pool_slot));
break;
@@ -1262,10 +1265,10 @@ void genX(CmdEndQuery)(
switch (pool->type) {
case VK_QUERY_TYPE_OCCLUSION:
emit_ps_depth_count(&cmd_buffer->batch, &pool->bo,
emit_ps_depth_count(cmd_buffer, &pool->bo,
query * sizeof(struct anv_query_pool_slot) + 8);
emit_query_availability(&cmd_buffer->batch, &pool->bo,
emit_query_availability(cmd_buffer, &pool->bo,
query * sizeof(struct anv_query_pool_slot) + 16);
break;
@@ -1307,11 +1310,14 @@ void genX(CmdWriteTimestamp)(
pc.DestinationAddressType = DAT_PPGTT,
pc.PostSyncOperation = WriteTimestamp,
pc.Address = (struct anv_address) { &pool->bo, offset };
if (GEN_GEN == 9 && cmd_buffer->device->info.gt == 4)
pc.CommandStreamerStallEnable = true;
}
break;
}
emit_query_availability(&cmd_buffer->batch, &pool->bo, query + 16);
emit_query_availability(cmd_buffer, &pool->bo, query + 16);
}
#if GEN_GEN > 7 || GEN_IS_HASWELL

@@ -2,6 +2,6 @@
"file_format_version": "1.0.0",
"ICD": {
"library_path": "libvulkan_intel.so",
"abi_versions": "1.0.3"
"api_version": "1.0.3"
}
}

@@ -336,7 +336,7 @@ static const struct brw_device_info brw_device_info_chv = {
.max_gs_threads = 336, \
.max_hs_threads = 336, \
.max_ds_threads = 336, \
.max_wm_threads = 64 * 9, \
.max_wm_threads = 64 * 12, \
.max_cs_threads = 56, \
.urb = { \
.size = 384, \
@@ -389,7 +389,7 @@ static const struct brw_device_info brw_device_info_bxt = {
.max_hs_threads = 112,
.max_ds_threads = 112,
.max_gs_threads = 112,
.max_wm_threads = 64 * 3,
.max_wm_threads = 64 * 4,
.max_cs_threads = 6 * 6,
.urb = {
.size = 192,
@@ -412,7 +412,7 @@ static const struct brw_device_info brw_device_info_bxt_2x6 = {
.max_hs_threads = 56, /* XXX: guess */
.max_ds_threads = 56,
.max_gs_threads = 56,
.max_wm_threads = 64 * 2,
.max_wm_threads = 64 * 4,
.max_cs_threads = 6 * 6,
.urb = {
.size = 128,
@@ -439,7 +439,7 @@ static const struct brw_device_info brw_device_info_kbl_gt1 = {
.gt = 1,
.max_cs_threads = 7 * 6,
.max_wm_threads = KBL_MAX_THREADS_PER_PSD * 2,
.max_wm_threads = KBL_MAX_THREADS_PER_PSD * 4,
.urb.size = 192,
.num_slices = 1,
};
@@ -449,7 +449,7 @@ static const struct brw_device_info brw_device_info_kbl_gt1_5 = {
.gt = 1,
.max_cs_threads = 7 * 6,
.max_wm_threads = KBL_MAX_THREADS_PER_PSD * 3,
.max_wm_threads = KBL_MAX_THREADS_PER_PSD * 4,
.num_slices = 1,
};
@@ -457,7 +457,7 @@ static const struct brw_device_info brw_device_info_kbl_gt2 = {
GEN9_FEATURES,
.gt = 2,
.max_wm_threads = KBL_MAX_THREADS_PER_PSD * 3,
.max_wm_threads = KBL_MAX_THREADS_PER_PSD * 4,
.num_slices = 1,
};
@@ -465,7 +465,7 @@ static const struct brw_device_info brw_device_info_kbl_gt3 = {
GEN9_FEATURES,
.gt = 3,
.max_wm_threads = KBL_MAX_THREADS_PER_PSD * 6,
.max_wm_threads = KBL_MAX_THREADS_PER_PSD * 8,
.num_slices = 2,
};
@@ -473,7 +473,7 @@ static const struct brw_device_info brw_device_info_kbl_gt4 = {
GEN9_FEATURES,
.gt = 4,
.max_wm_threads = KBL_MAX_THREADS_PER_PSD * 9,
.max_wm_threads = KBL_MAX_THREADS_PER_PSD * 12,
/*
* From the "L3 Allocation and Programming" documentation:
*

@@ -3885,6 +3885,12 @@ lower_fb_write_logical_send(const fs_builder &bld, fs_inst *inst,
*/
setup_color_payload(bld, key, &sources[length], src0_alpha, 1);
length++;
} else if (key->replicate_alpha && inst->target != 0) {
/* Handle the case when fragment shader doesn't write to draw buffer
* zero. No need to call setup_color_payload() for src0_alpha because
* alpha value will be undefined.
*/
length++;
}
setup_color_payload(bld, key, &sources[length], color0, components);

@@ -385,34 +385,33 @@ fs_generator::generate_mov_indirect(fs_inst *inst,
indirect_byte_offset =
retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW);
struct brw_reg ind_src;
if (devinfo->gen < 8) {
/* From the Haswell PRM section "Register Region Restrictions":
*
* "The lower bits of the AddressImmediate must not overflow to
* change the register address. The lower 5 bits of Address
* Immediate when added to lower 5 bits of address register gives
* the sub-register offset. The upper bits of Address Immediate
* when added to upper bits of address register gives the register
* address. Any overflow from sub-register offset is dropped."
*
* This restriction is only listed in the Haswell PRM but emperical
* testing indicates that it applies on all older generations and is
* lifted on Broadwell.
*
* Since the indirect may cause us to cross a register boundary, this
* makes the base offset almost useless. We could try and do
* something clever where we use a actual base offset if
* base_offset % 32 == 0 but that would mean we were generating
* different code depending on the base offset. Instead, for the
* sake of consistency, we'll just do the add ourselves.
*/
brw_ADD(p, addr, indirect_byte_offset, brw_imm_uw(imm_byte_offset));
ind_src = brw_VxH_indirect(0, 0);
} else {
brw_MOV(p, addr, indirect_byte_offset);
ind_src = brw_VxH_indirect(0, imm_byte_offset);
}
/* There are a number of reasons why we don't use the base offset here.
* One reason is that the field is only 9 bits which means we can only
* use it to access the first 16 GRFs. Also, from the Haswell PRM
* section "Register Region Restrictions":
*
* "The lower bits of the AddressImmediate must not overflow to
* change the register address. The lower 5 bits of Address
* Immediate when added to lower 5 bits of address register gives
* the sub-register offset. The upper bits of Address Immediate
* when added to upper bits of address register gives the register
* address. Any overflow from sub-register offset is dropped."
*
* Since the indirect may cause us to cross a register boundary, this
* makes the base offset almost useless. We could try and do something
* clever where we use a actual base offset if base_offset % 32 == 0 but
* that would mean we were generating different code depending on the
* base offset. Instead, for the sake of consistency, we'll just do the
* add ourselves. This restriction is only listed in the Haswell PRM
* but empirical testing indicates that it applies on all older
* generations and is lifted on Broadwell.
*
* In the end, while base_offset is nice to look at in the generated
* code, using it saves us 0 instructions and would require quite a bit
* of case-by-case work. It's just not worth it.
*/
brw_ADD(p, addr, indirect_byte_offset, brw_imm_uw(imm_byte_offset));
struct brw_reg ind_src = brw_VxH_indirect(0, 0);
brw_inst *mov = brw_MOV(p, dst, retype(ind_src, dst.type));
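
The comment above gives two reasons for not using the base offset: the 9-bit field and the rule that a carry out of the low 5 bits is dropped. A rough arithmetic model of both, written only from the quoted PRM text (it is not hardware-accurate, and all `demo_`/`DEMO_` names are hypothetical), might look like this:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_REG_SIZE 32u  /* bytes per GRF; the low 5 bits are the sub-register offset */
#define DEMO_IMM_BITS 9u   /* width of the immediate field named in the comment */

/* Byte address under the quoted rule: sub-register and register parts are
 * added separately, and the carry out of the low 5 bits is dropped. */
static uint32_t demo_restricted_address(uint32_t addr_reg, uint32_t addr_imm)
{
    uint32_t sub = ((addr_reg % DEMO_REG_SIZE) + (addr_imm % DEMO_REG_SIZE))
                   % DEMO_REG_SIZE;
    uint32_t reg = (addr_reg / DEMO_REG_SIZE) + (addr_imm / DEMO_REG_SIZE);
    return reg * DEMO_REG_SIZE + sub;
}

/* What the generator gets by doing the ADD itself and using a zero immediate. */
static uint32_t demo_explicit_add(uint32_t addr_reg, uint32_t addr_imm)
{
    uint32_t result = demo_restricted_address(addr_reg + addr_imm, 0);
    assert(result == addr_reg + addr_imm); /* a zero immediate cannot lose a carry */
    return result;
}

/* Number of whole registers a 9-bit byte offset can span. */
static uint32_t demo_reachable_grfs(void)
{
    return (1u << DEMO_IMM_BITS) / DEMO_REG_SIZE;
}
```

In this model an address register value of 24 with an immediate of 16 resolves to byte 8 rather than byte 40, while the explicit add reaches byte 40 as intended.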

@@ -2848,6 +2848,7 @@ reuse_framebuffer_texture_attachment(struct gl_framebuffer *fb,
dst_att->Type = src_att->Type;
dst_att->Complete = src_att->Complete;
dst_att->TextureLevel = src_att->TextureLevel;
dst_att->CubeMapFace = src_att->CubeMapFace;
dst_att->Zoffset = src_att->Zoffset;
dst_att->Layered = src_att->Layered;
}

@@ -857,7 +857,7 @@ _mesa_get_color_read_format(struct gl_context *ctx)
if (format == MESA_FORMAT_B8G8R8A8_UNORM)
return GL_BGRA;
else if (format == MESA_FORMAT_B5G6R5_UNORM)
return GL_BGR;
return GL_RGB;
else if (format == MESA_FORMAT_R_UNORM8)
return GL_RED;
@@ -892,7 +892,7 @@ _mesa_get_color_read_type(struct gl_context *ctx)
const GLenum data_type = _mesa_get_format_datatype(format);
if (format == MESA_FORMAT_B5G6R5_UNORM)
return GL_UNSIGNED_SHORT_5_6_5_REV;
return GL_UNSIGNED_SHORT_5_6_5;
switch (data_type) {
case GL_SIGNED_NORMALIZED:
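
For reference, the `GL_RGB` / `GL_UNSIGNED_SHORT_5_6_5` pair returned above describes a 16-bit pixel with red in the five most significant bits and blue in the five least significant bits. A small sketch of packing and unpacking such a pixel (the `demo_` helpers are hypothetical and not part of Mesa):

```c
#include <assert.h>
#include <stdint.h>

struct demo_rgb8 { uint8_t r, g, b; };

/* Pack 5/6/5-bit components: red in bits 15..11, green in 10..5, blue in 4..0. */
static uint16_t demo_pack_565(unsigned r5, unsigned g6, unsigned b5)
{
    assert(r5 < 32 && g6 < 64 && b5 < 32);
    return (uint16_t)((r5 << 11) | (g6 << 5) | b5);
}

/* Unpack one pixel and widen each component to 8 bits. */
static struct demo_rgb8 demo_unpack_565(uint16_t px)
{
    unsigned r5 = (px >> 11) & 0x1f;
    unsigned g6 = (px >> 5) & 0x3f;
    unsigned b5 = px & 0x1f;
    struct demo_rgb8 out;
    /* Replicate the high bits into the low bits so the maximum maps to 255. */
    out.r = (uint8_t)((r5 << 3) | (r5 >> 2));
    out.g = (uint8_t)((g6 << 2) | (g6 >> 4));
    out.b = (uint8_t)((b5 << 3) | (b5 >> 2));
    return out;
}
```

A caller reading pixels back with this format/type pair could run each 16-bit value through `demo_unpack_565()` to obtain 8-bit channels.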

@@ -59,7 +59,6 @@ struct _mesa_HashTable {
struct hash_table *ht;
GLuint MaxKey; /**< highest key inserted so far */
mtx_t Mutex; /**< mutual exclusion lock */
mtx_t WalkMutex; /**< for _mesa_HashWalk() */
GLboolean InDeleteAll; /**< Debug check */
/** Value that would be in the table for DELETED_KEY_VALUE. */
void *deleted_key_data;
@@ -129,8 +128,11 @@ _mesa_NewHashTable(void)
}
_mesa_hash_table_set_deleted_key(table->ht, uint_key(DELETED_KEY_VALUE));
mtx_init(&table->Mutex, mtx_plain);
mtx_init(&table->WalkMutex, mtx_plain);
/*
* Needs to be recursive, since the callback in _mesa_HashWalk()
* is allowed to call _mesa_HashRemove().
*/
mtx_init(&table->Mutex, mtx_recursive);
}
else {
_mesa_error_no_memory(__func__);
@@ -161,7 +163,6 @@ _mesa_DeleteHashTable(struct _mesa_HashTable *table)
_mesa_hash_table_destroy(table->ht, NULL);
mtx_destroy(&table->Mutex);
mtx_destroy(&table->WalkMutex);
free(table);
}
@@ -401,11 +402,6 @@ _mesa_HashDeleteAll(struct _mesa_HashTable *table,
/**
* Walk over all entries in a hash table, calling callback function for each.
* Note: we use a separate mutex in this function to avoid a recursive
* locking deadlock (in case the callback calls _mesa_HashRemove()) and to
* prevent multiple threads/contexts from getting tangled up.
* A lock-less version of this function could be used when the table will
* not be modified.
* \param table the hash table to walk
* \param callback the callback function
* \param userData arbitrary pointer to pass along to the callback
@@ -422,13 +418,13 @@ _mesa_HashWalk(const struct _mesa_HashTable *table,
assert(table);
assert(callback);
mtx_lock(&table2->WalkMutex);
mtx_lock(&table2->Mutex);
hash_table_foreach(table->ht, entry) {
callback((uintptr_t)entry->key, entry->data, userData);
}
if (table->deleted_key_data)
callback(DELETED_KEY_VALUE, table->deleted_key_data, userData);
mtx_unlock(&table2->WalkMutex);
mtx_unlock(&table2->Mutex);
}
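
The change above replaces the separate walk mutex with a single recursive mutex so that a walk callback may call back into the table. A minimal sketch of that locking shape, written with POSIX threads rather than the `mtx_*` calls used in the diff, and with a toy four-slot table (all `demo_` names are hypothetical):

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>

struct demo_table {
    pthread_mutex_t mutex;
    int live[4];   /* 1 when the slot holds an entry */
};

typedef void (*demo_callback)(struct demo_table *t, int key, void *user);

/* Initialise the table with a recursive mutex and four live entries. */
static void demo_table_init(struct demo_table *t)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&t->mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    for (int i = 0; i < 4; i++)
        t->live[i] = 1;
}

/* Remove one entry; safe to call while the same thread is walking. */
static void demo_table_remove(struct demo_table *t, int key)
{
    assert(key >= 0 && key < 4);
    pthread_mutex_lock(&t->mutex);    /* re-entrant on the walking thread */
    t->live[key] = 0;
    pthread_mutex_unlock(&t->mutex);
}

/* Call cb for every live entry while holding the table lock. */
static void demo_table_walk(struct demo_table *t, demo_callback cb, void *user)
{
    pthread_mutex_lock(&t->mutex);
    for (int key = 0; key < 4; key++)
        if (t->live[key])
            cb(t, key, user);
    pthread_mutex_unlock(&t->mutex);
}

/* Example callback: removes odd keys and counts the removals in *user. */
static void demo_remove_odd(struct demo_table *t, int key, void *user)
{
    if (key % 2) {
        demo_table_remove(t, key);
        ++*(int *)user;
    }
}
```

With a non-recursive mutex the nested lock in `demo_table_remove()` would deadlock or be undefined, depending on the mutex type; the recursive type lets the owning thread lock again as long as it unlocks the same number of times.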
static void