Compare commits

49 Commits

Author SHA1 Message Date
Andres Gomez
e644f9996b docs: add release notes for 17.1.8
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-28 16:17:02 +03:00
Andres Gomez
187f6a4c8e Update version to 17.1.8
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-28 16:10:56 +03:00
Andres Gomez
62c7e9a6ee cherry-ignore: add "egl/drm: Fix misused x and y offsets in swrast_*_image*"
fixes: Depend on earlier commit 04a40f7d2a that did not land in branch
and which exposes new API.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-25 16:03:37 +03:00
Andres Gomez
43a76ec1bb cherry-ignore: add "i965: Make a BRW_NEW_FAST_CLEAR_COLOR dirty bit."
stable: 17.2 nomination only. Depends on earlier commit f296c22989
which did not land in branch.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-25 16:03:37 +03:00
Andres Gomez
456f07b845 cherry-ignore: add "i965/tex: Don't pass samples to miptree_create_for_teximage"
stable: Depends on earlier commit 76e2f390f9 which did not land in
branch.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-25 16:03:37 +03:00
Andres Gomez
25793cfe0e cherry-ignore: cherry-ignore: added 17.2 nominations.
stable: 17.2 nominations only.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-25 16:03:37 +03:00
Dave Airlie
3f3e925d40 radv: don't crash if we have no framebuffer
Recording secondaries with no framebuffer attachment may
make this happen, though this might not be the complete solution.

(esp if someone does meta stuff in there, would we have to
save things, not sure).

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4a091b0788)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/amd/vulkan/radv_cmd_buffer.c
2017-08-25 16:03:37 +03:00
Kai Chen
81a1ecda15 egl/wayland: Use roundtrips when awaiting buffer release
In get_back_bo, we use wl_display_dispatch_queue() to block and wait for
a buffer release event. However, not all Wayland compositors flush the
client socket on posting a buffer-release event, so by only blocking
client-side, we may block indefinitely, or at least need to wait for an
input event / frame completion to arrive for the compositor to flush.

We now use dispatch_queue as a first pass, but if our entire buffer pool
is exhausted, use a roundtrip (an immediately-triggered wl_callback) to
ensure that the compositor flushes out our release event immediately.

[daniels: Modified comment and commit message.]

Signed-off-by: Kai Chen <kai.chen@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 151188d1e3)
2017-08-25 16:03:37 +03:00
Ilia Mirkin
52b0ad8666 nv50/ir: properly set sType for TXF ops to U32
All of the coordinates and LOD args are integers for TXF. This mostly
doesn't matter, except for converting into a levelZero=true operation by
removing an explicit zero LOD. For the comparison against zero to work
properly, the sType of the instruction has to be set correctly.

Fixes: KHR-GL45.robust_buffer_access_behavior.texel_fetch
Reported-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 96be442b77)
2017-08-25 16:03:37 +03:00
Marek Olšák
54bb87c25a radeonsi/gfx9: add a temporary workaround for a tessellation driver bug
The workaround will do for now. The root cause is still unknown.

This fixes new piglit: 16in-1out

Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 166823bfd2)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_state_draw.c
2017-08-25 16:03:36 +03:00
Topi Pohjolainen
b85502603f intel/blorp: Adjust intra-tile x when faking rgb with red-only
v2 (Jason): Adjust directly in surf_fake_rgb_with_red()

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101910

CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 393ec1a507)
2017-08-25 16:03:36 +03:00
Christoph Haag
4fce4ce271 mesa: only copy requested compressed teximage cubemap faces
This is analogous to commit 2259b11 which only fixed the regular case

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102308
Signed-off-by: Christoph Haag <haagch+mesadev@frickel.club>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 87556a650a)
[Andres Gomez: helpers had not yet been refactored]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/mesa/main/teximage.c
2017-08-25 16:03:36 +03:00
Jason Ekstrand
996cd238b8 i965: Stop looking at NewDriverState when emitting 3DSTATE_URB
Looking at NewDriverState is not safe in general.  The state atom system
is set up to ensure that new bits that get added to NewDriverState get
accumulated into the set of bits used when emitting atoms but it doesn't
go the other way.  If we read NewDriverState, we may not get the full
picture because the per-pipeline state (3D or compute) does not get
added to NewDriverState before state emit is done.  It's especially
dangerous to do this from BLORP (either explicitly or implicitly when
BLORP calls gen7_upload_urb) because that does not happen during one of
the normal state upload paths.

This commit solves the problem by whacking all of the per-shader-stage
URB sizes to zero whenever we change the total URB size.  We still have
to flag BRW_NEW_URB_SIZE to ensure that the gen7_urb atom triggers but
the actual decision in gen7_upload_urb can now be based entirely on URB
sizes rather than on state atoms.  This also makes BLORP correct because
it just asks for a new URB config whenever the vsize is too small and so
any change to the total URB size will trigger blorp to re-emit as well
because 0 < vs_entry_size.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Bugzilla: https://bugs.freedesktop.org/102289
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit d5e217dbfd)
2017-08-25 16:03:36 +03:00
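
The approach described in the second paragraph of the message above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the i965 code: `struct urb_state`, `set_total_urb_size()` and `needs_urb_reemit()` are hypothetical names. The only logic taken from the commit message is "zero the per-stage entry size whenever the total URB size changes, then decide on sizes alone".

```c
#include <stdbool.h>

/* Hypothetical, simplified URB bookkeeping (not the real brw_context fields). */
struct urb_state {
    unsigned total_size;     /* total URB size currently configured */
    unsigned vs_entry_size;  /* last VS entry size that was emitted */
};

/* Changing the total size invalidates the per-stage sizes by zeroing them. */
static void set_total_urb_size(struct urb_state *urb, unsigned new_total)
{
    if (urb->total_size != new_total) {
        urb->total_size = new_total;
        urb->vs_entry_size = 0;
    }
}

/* A caller asks for a new URB config only when the current entry size is
 * too small. After a total-size change this is always true, because
 * 0 < required_vs_entry_size for any non-zero requirement. */
static bool needs_urb_reemit(const struct urb_state *urb,
                             unsigned required_vs_entry_size)
{
    return urb->vs_entry_size < required_vs_entry_size;
}
```

With this shape, the re-emit decision never has to consult a dirty-bit set that may be incomplete at the time it is read.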
Ilia Mirkin
b31ccc62ab glsl: add a few missing int64 constant propagation cases
Fixes KHR-GL45.shader_ballot_tests.ShaderBallotAvailability, which
causes some silly swizzles to appear, triggering this optimization to
get hit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 9c8f017f77)
2017-08-25 16:03:36 +03:00
Lionel Landwerlin
e3e4477fed i965: perf: minimize the chances to spread queries across batchbuffers
Counters related to timings will be sensitive to any delay introduced
by the software. In particular if our begin & end of performance
queries end up in different batches, time related counters will
exhibit bigger values caused by the time it takes for the kernel
driver to load new requests into the hardware.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit adafe4b733)
2017-08-25 16:03:36 +03:00
Tim Rowley
dfd6753058 swr/rast: switch gen_knobs.cpp license
Unintentionally added with an apache2 license; relicense to match
the rest of the tree.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit fb3e50a351)
2017-08-25 16:03:36 +03:00
Andres Gomez
aa0f85ee86 docs: add sha256 checksums for 17.1.7
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-21 18:22:49 +03:00
Andres Gomez
c2d9f33f2c docs: add release notes for 17.1.7
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-21 18:10:18 +03:00
Andres Gomez
a3dc1060dd Update version to 17.1.7
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-21 18:03:05 +03:00
Andres Gomez
6100cd7170 cherry-ignore: add "radv: handle 10-bit format clamping workaround."
fixes: This commit is complex and has non trivial conflicts due to
previous changes.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-21 14:30:50 +03:00
Andres Gomez
bdee70d473 cherry-ignore: add "virgl: drop precise modifier."
fixes: This commit addressed an earlier commit af22adee4f which did
not land in branch.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-21 14:30:50 +03:00
Andres Gomez
c86696577e cherry-ignore: add "radv: Handle VK_ATTACHMENT_UNUSED in color attachments."
fixes: This commit is complex and has non trivial conflicts due to
multiple previous changes.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-21 14:30:50 +03:00
Andres Gomez
b0f793ccc4 cherry-ignore: added 17.2 nominations.
stable: 17.2 nominations only.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-21 14:30:50 +03:00
Andres Gomez
a1e6128565 cherry-ignore: add "configure: remove trailing "-a" in swr architecture teststable: 17.2 nomination only."
stable: 17.2 nomination only. Depends on earlier commit 1cb5a6061c
which did not land in branch.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-21 14:30:50 +03:00
Andres Gomez
00d96f51e1 cherry-ignore: add "radeon/ac: use ds_swizzle for derivs on si/cik."
stable: Depends on earlier commit 28634ff7d3 which did not land in
branch.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-21 14:30:49 +03:00
Andres Gomez
58afecdbba cherry-ignore: add "swr: use the correct variable for no undefined symbols"
stable: Breaks SWR compilation due to earlier commit f50aa21456 which
did not land in branch.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-21 14:30:49 +03:00
Eric Anholt
81a12778f5 util: Fix build on old glibc.
We need to link librt for u_thread.h's clock_gettime() call.

Fixes: b822d9dd67 ("gallium/util: move u_queue.{c,h} to src/util")
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b94ddc181b)
2017-08-21 14:30:49 +03:00
Dave Airlie
35e9910c20 radv: force cs/ps/l2 flush at end of command stream. (v2)
This seems like a workaround, but we don't see the bug on CIK/VI.

On SI with the dEQP-VK.memory.pipeline_barrier.host_read_transfer_dst.*
tests, when one test completes, the first flush at the start of the next
test causes a VM fault as we've destroyed the VM, but we end up flushing
the compute shader then, and it must still be in the process of doing
something.

Could also be a kernel difference between SI and CIK.

v2: hit this with a bigger hammer. This fixes a bunch of hangs
in the vk cts with the robustness tests.

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101334
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 82ba384c10)
2017-08-21 14:30:49 +03:00
Dave Airlie
98d54d0c95 radv: fix MSAA on SI gpus.
This ports the workaround from radeonsi, that was missing in radv.

This fixes Talos rendering when MSAA is enabled on my Tahiti card.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8bf3930751)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/amd/vulkan/radv_device.c
2017-08-21 14:30:49 +03:00
Dave Airlie
3ba481ba48 radv: fix f16->f32 denorm handling for SI/CIK. (v2)
This just copies the code from the -pro shaders,
and fixes the tests on CIK.

With this CIK passes the same set of conformance
tests as VI.

Fixes: 83e58b03 (radv: flush f32->f16 conversion denormals to zero. (v2))
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 3f389f75b6)
2017-08-21 14:30:49 +03:00
Ilia Mirkin
75ce282ee2 nv50/ir: fix TXQ srcMask
src0.x is always read for the LOD, irrespective of which outputs are
read.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 934511d1f3)
2017-08-19 17:39:35 +03:00
Ilia Mirkin
37e61310e3 nv50/ir: fix srcMask computation for TG4 and TXF
This affects which inputs are marked as used. In a situation where only
the texture instruction uses an input, it might have been ignored as
unused due to input masks.

Affects subtests of KHR-GL45.texture_cube_map_array.sampling

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 054c54d1be)
2017-08-19 17:39:35 +03:00
Frank Richter
baf8c7b1c4 gallium/os: fix os_time_get_nano() to roll over less
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102241
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 7fb7287ce7)
2017-08-19 17:39:35 +03:00
Frank Richter
313fc5331d st/wgl: check for negative delta in wait_swap_interval()
This can happen because of rollover.  See bug report for details.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102241
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit d90e05ad48)
2017-08-19 17:39:35 +03:00
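
Why a negative delta matters can be shown with a small sketch. The function below is hypothetical (`remaining_wait_usec()` and its parameters are not taken from the st/wgl source); it only assumes that a swap-interval wait is computed from the difference between two timestamps. If the clock rolls over, that difference becomes negative and, without the check, the computed wait would be far longer than the intended interval.

```c
#include <stdint.h>

/* Hypothetical helper: how long to keep waiting before the next swap.
 * All values are in microseconds. */
static int64_t remaining_wait_usec(int64_t now, int64_t last_swap,
                                   int64_t min_interval)
{
    int64_t delta = now - last_swap;

    /* A negative delta means the timer rolled over; do not wait at all
     * rather than sleeping for min_interval - delta. */
    if (delta < 0)
        return 0;

    if (delta < min_interval)
        return min_interval - delta;

    return 0;
}
```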
Frank Richter
e4db525e82 st/mesa: fix a null pointer access
Fixes crash with llvmpipe on Windows.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102148
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 496a691e35)
2017-08-19 17:39:35 +03:00
Tim Rowley
ed594f19d7 swr/rast: Fix invalid casting for calls to Interlocked* functions
CID: 1416243, 1416244, 1416255
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit b333bc753e)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/gallium/drivers/swr/rasterizer/core/api.cpp
	src/gallium/drivers/swr/rasterizer/core/threads.cpp
2017-08-19 17:39:35 +03:00
Ilia Mirkin
4e993fc542 glsl/ast: update rhs in addition to the var's constant_value
We continue in the code to do some more things with the rhs, including
setting a constant initializer. If the type is wrong, this causes some
confusion down the line, leading to assertions. This makes sure that the
rhs processing continues to flow as-if the type was correct to start
with (even though the state has been marked as an error state).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101766
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 978c4c597a)
2017-08-19 17:39:35 +03:00
Marek Olšák
e05ea17c50 radeonsi: disable CE by default
It makes performance worse by a very small (hard to measure) amount.
We've done extensive profiling of this feature internally.

Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 1ab7fed707)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_pipe.c
2017-08-19 17:39:35 +03:00
Emil Velikov
774e77ab64 egl: avoid eglCreatePlatform*Surface{EXT,} crash with invalid dpy
If we have an invalid display fed into the functions, the display lookup
will return NULL. Thus as we attempt to get the platform type, we'll
dereference it, leading to a crash.

Keep in mind that this will not happen if Mesa is built without X11 or
when the legacy eglCreate*Surface codepaths are used.

A similar check was added with earlier commit 5e97b8f5ce ("egl: Fix
crashes in eglCreate*Surface"), although it was only applicable when the
surfaceless platform is built.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 26fbb9eacd)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/egl/main/eglapi.c
2017-08-19 17:39:35 +03:00
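
The failure mode described above is a plain NULL dereference after a failed lookup. A minimal sketch of the guard follows, assuming a hypothetical `struct display` and `display_platform()` helper; neither is the real eglapi.c code, and the error value is invented for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the display object and its platform field. */
enum platform { PLATFORM_INVALID = -1, PLATFORM_X11 = 0, PLATFORM_WAYLAND = 1 };

struct display {
    enum platform platform;
};

/* The display lookup returns NULL for an invalid handle, so the platform
 * may only be read after checking the pointer. */
static enum platform display_platform(const struct display *disp)
{
    if (disp == NULL)
        return PLATFORM_INVALID;
    return disp->platform;
}
```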
Marek Olšák
c968de1989 ac: fail shader compilation if libelf is replaced by an incompatible version
UE4Editor has this issue.

This commit prevents hangs (release build) or assertion failures (debug
build). It doesn't fix the editor, but catastrophic scenarios are
prevented.

Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 4630ede102)
2017-08-19 17:39:34 +03:00
Karol Herbst
8aa358bd69 nv50/ir: fix ConstantFolding with saturation
For mul(a, +-1) codegen can generate OP_MOV with a saturation flag
set which is ignored at emission. The same can happen with add(a, 0),
and others.

Adding an assert for detecting more of such issues.

Fixes wrongly rendered water in Hitman Absolution running under wine.
Also a few shaders in Mad Max and Alien Isolation produce such MOVs.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
[imirkin: generalize the fix for other cases]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 24a799ad35)

squashed with:

nv50/ir: clean up saturated values immediately

Since we don't iterate to a fixed point, we can end up in situations
where we have a SAT instruction + a long immediate. This is not legal.
However since it's immediately computable, just run unary straight away
to handle the situation.

Fixes: 24a799ad35 ("nv50/ir: fix ConstantFolding with saturation")
Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 165e18dd21)
2017-08-19 17:38:58 +03:00
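
The first paragraph of the message above can be illustrated outside the compiler. The sketch below is not nv50/ir code; it models a saturating multiply on plain floats, with hypothetical function names, to show why folding `mul(a, 1)` into a bare move is wrong when the saturation flag is then dropped.

```c
/* Clamp to [0, 1], which is what a saturation modifier does to a result. */
static float saturate(float x)
{
    if (x < 0.0f)
        return 0.0f;
    if (x > 1.0f)
        return 1.0f;
    return x;
}

/* Reference semantics of a multiply carrying the saturation flag. */
static float mul_sat(float a, float b)
{
    return saturate(a * b);
}

/* Incorrect fold: treats mul_sat(a, 1) as a plain move and loses the clamp. */
static float fold_mul_by_one_wrong(float a)
{
    return a;
}

/* Correct fold: the multiply disappears, the saturation does not. */
static float fold_mul_by_one(float a)
{
    return saturate(a);
}
```

For an input such as 2.5 the reference and the correct fold both yield 1.0, while the incorrect fold passes 2.5 through unclamped.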
Jason Ekstrand
9f8925702d anv/formats: Allow sampling on depth-only formats on gen7
We can't sample from depth-stencil formats on gen7 but we can sample
from depth-only formats.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102024
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 06d3115bb9)
2017-08-19 13:46:04 +03:00
Dave Airlie
ad07debbd9 radv: avoid GPU hangs if someone does a resolve with non-multisample src (v2)
This is a bug in the app, but I'd rather avoid hanging the GPU,
esp if someone is running in validation and it takes out their
development environment.

v2: get it right, reverse the polarity.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 36a1b61321)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/amd/vulkan/radv_meta_resolve.c
2017-08-19 13:46:04 +03:00
Emil Velikov
0cfd8879b1 egl/x11: don't leak xfixes_query in the error path
If we get a xfixes v1.x we'll error out, without freeing the
xfixes_query reply.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit c961b679fe)
2017-08-19 13:46:04 +03:00
Dave Airlie
1c3dcd3aa4 intel/vec4/gs: reset nr_pull_param if DUAL_INSTANCED compile failed.
If dual object compile fails (as seems to happen with virgl a
fair bit, and does piglit even have any tests for it?), we end up
not restarting the pull params, so we call
vec4_visitor::move_uniform_array_access_to_pull_constant
a second time and it runs over the ends of the alloc.

Fixes: tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test
running inside virgl on ivybridge.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 271fa3a684)
2017-08-19 13:46:04 +03:00
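
The overrun described above comes from retrying a compile without resetting a counter. The sketch below reproduces that pattern in miniature. It is an assumption-laden model, not the vec4 backend: `struct compile_state`, `PULL_PARAM_CAPACITY` and both functions are hypothetical, and the bounds check stands in for what would be an out-of-bounds write in unguarded code.

```c
#include <stdbool.h>
#include <stddef.h>

#define PULL_PARAM_CAPACITY 4  /* hypothetical size of the allocation */

struct compile_state {
    unsigned pull_param[PULL_PARAM_CAPACITY];
    size_t nr_pull_params;
};

/* One compile attempt appends `count` entries. Returns false where
 * unguarded code would have written past the end of the array. */
static bool record_pull_params(struct compile_state *st, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (st->nr_pull_params >= PULL_PARAM_CAPACITY)
            return false;
        st->pull_param[st->nr_pull_params++] = (unsigned)i;
    }
    return true;
}

/* First attempt is assumed to fail for unrelated reasons; the second
 * attempt only fits if the counter is reset in between. */
static bool compile_with_fallback(struct compile_state *st, size_t count,
                                  bool reset_between_attempts)
{
    st->nr_pull_params = 0;
    (void)record_pull_params(st, count);

    if (reset_between_attempts)
        st->nr_pull_params = 0;

    return record_pull_params(st, count);
}
```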
Chris Wilson
894953dae6 i965/blit: Remember to include miptree buffer offset in relocs
Remember to add the offset to the start of the buffer in the relocation
or else we write 0xff into random bytes elsewhere.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit fb63c43fd1)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/mesa/drivers/dri/i965/intel_pixel_bitmap.c
2017-08-19 13:46:04 +03:00
Kenneth Graunke
9e164c6aa9 i965: Delete pitch alignment assertion in get_blit_intratile_offset_el.
The cacheline alignment restriction is on the base address; the pitch
can be anything.

Fixes assertion failures when using primus (say, on glxgears, which
creates a 300x300 linear BGRX surface with a pitch of 1200):

intel_blit.c:190: get_blit_intratile_offset_el: Assertion `mt->surf.row_pitch % 64 == 0' failed.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 595a47b829)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/mesa/drivers/dri/i965/intel_blit.c
2017-08-19 13:46:04 +03:00
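
The numbers in the message above are easy to check: a 300-pixel-wide surface at 4 bytes per pixel has a pitch of 1200 bytes, and 1200 is not a multiple of 64. The helpers below are hypothetical and assume a tightly packed linear surface; they exist only to make that arithmetic explicit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Row pitch in bytes of a tightly packed linear surface (assumption:
 * no padding is added at the end of each row). */
static uint32_t linear_row_pitch(uint32_t width_px, uint32_t bytes_per_pixel)
{
    return width_px * bytes_per_pixel;
}

/* The 64-byte cacheline test that the removed assertion applied to the pitch. */
static bool is_cacheline_aligned(uint64_t value)
{
    return (value % 64) == 0;
}
```

Here `linear_row_pitch(300, 4)` is 1200 and 1200 % 64 is 48, so the old assertion fails for this surface even though only the base address needs the alignment.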
Connor Abbott
7056362f8d ac/nir: fix lsb emission
This makes it match radeonsi. The LLVM backend itself will emit the
correct instruction, but LLVM might do incorrect optimizations since it
thinks the output is undefined when the input is 0, even though it's not
supposed to be. We really need a new intrinsic, or for the backend to
become smarter and recognize this pattern.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <basni@google.com>
(cherry picked from commit 6d731c5651)
[Andres Gomez: nir_to_llvm_context not yet converted into ac_llvm_context]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/amd/common/ac_nir_to_llvm.c
2017-08-19 13:46:04 +03:00
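
The hazard described above has a direct analogue in C. GLSL's `findLSB` is defined to return -1 for an input of 0, whereas a count-trailing-zeros primitive may leave that case undefined. The sketch below is not the ac/nir change itself; it assumes a GCC or Clang compiler for `__builtin_ctz`, whose result is undefined for 0, and shows the explicit guard that keeps the zero case well defined.

```c
#include <stdint.h>

/* Index of the least significant set bit, or -1 when no bit is set.
 * __builtin_ctz(0) is undefined, so zero is handled before calling it. */
static int32_t find_lsb(uint32_t value)
{
    if (value == 0)
        return -1;
    return (int32_t)__builtin_ctz(value);
}
```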
Emil Velikov
2766ed0d45 docs: add sha256 checksums for 17.1.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-07 13:09:08 +01:00
44 changed files with 584 additions and 67 deletions

@@ -1 +1 @@
-17.1.6
+17.1.8

@@ -77,3 +77,124 @@ be5773fa8dfe9255d9abaf5c7d5bbbd2d922da08 Android: fix compile error for DRI2 loa
# stable: Commit was never applied - see above.
d85802e501a67e193a4a363cfe3b4c17c3d9e2e9 Revert "st/mesa: release sampler views when redefining a texture in st_context_teximage"
# stable: Breaks SWR compilation due to earlier commit f50aa21456
# which did not land in branch.
4d53b16f555b2d33216518100fb2cd578428512d swr: use the correct variable for no undefined symbols
# stable: 17.2 nomination only. Depends on earlier commit 28634ff7d3
# which did not land in branch.
cb6f16dce90b4737f62588f8ea5083ee6544787e radeon/ac: use ds_swizzle for derivs on si/cik.
# stable: 17.2 nomination only. Depends on earlier commit 1cb5a6061c
# which did not land in branch.
4d9b0dcccb81ad10113d9aef52b4c84496e879f1 configure: remove trailing "-a" in swr architecture test
# stable: 17.2 nomination only.
31a6750988d7dd431f72ff1ff11bfca83bde5d8c st/dri: NULL check before deref DRI loader .getCapability
# stable: 17.2 nomination only.
9966c85e01a4344d2a6bb76e432e0bed70d52ff6 st/osmesa: add osmesa framebuffer iface hash table per st manager
# stable: 17.2 nomination only.
c15b92ce1160d742ea431062bbe4b3e818bb2aaf intel/isl: Stop padding surfaces
4d27c6095e8385cccd225993452baad4d2e35420 intel/isl: Don't align the height of the last array slice
# stable: 17.2 nomination only.
8e5808fc0c9d9da19a0c7f683c156386d4648842 i965/miptree: Call alloc_aux in create_for_bo
# stable: 17.2 nomination only.
2e9a13bf2205b6e96cba408e3f48f1c3fe49634a radv: Fix decompression on multisampled depth buffers
# stable: 17.2 nomination only.
5563872dbfbf733ed56e1b367bc8944ca59b1c3e isl: Validate row pitch of stencil surfaces.
# stable: 17.2 nomination only.
27fef5d52d44c8684fa4e7a21bd7a4284f3688ee radeonsi/gfx9: use the VI codepath for clamping Z
# stable: 17.2 nomination only.
f7dfc44c617bec0f847ebe49b8672a64354ab13d i965/blorp: Correct type of src_format in call to intel_miptree_texture_aux_usage
# stable: 17.2 nomination only.
5247b311e9b348fedd74980a34c4b6542d85b07b radv/gfx9: fix set predication packet.
fc600eb98d5846fe59f4a79ed1c7ad2a0667e927 radv/gfx9: remove some leftover gfx6 descriptor setup.
674ecbfef2acb17be363867425a013ca151e16b2 radv: emit db_htile_surface reg on gfx9 as well
e43cc3e3afc98783310f81f8c0151a8314044739 radv/gfx9: handle GFX9 opaque metadata
31bb8517a194af733deefe2d821537d994d39365 radv/gfx9: fix tile swizzle handling for gfx9
# stable: 17.2 nomination only.
694d59fbaf4bc85daaff6cc411162dd6d1232968 radv/gfx9: for fast clear use is_linear flag.
# stable: 17.2 nomination only.
49eda75df6aafdf5d2ffe5d9247b516ac7d14691 i965: Always allow CPU readback of the scanout on LLC platforms
# stable: 17.2 nomination only.
4c02e2bd95d16407084914ff7248a1717bdce658 radv: disable texture gather workaround on gfx9.
# fixes: This commit is complex and has non trivial conflicts due to
# multiple previous changes.
ea08a296fe226f5e67366b4db420c2322f38774c radv: Handle VK_ATTACHMENT_UNUSED in color attachments.
# fixes: This commit addressed an earlier commit af22adee4f which did not
# land in branch.
554aa094406f3f5a935c4adbe77569cc9beb4312 virgl: drop precise modifier.
# fixes: This commit is complex and has non trivial conflicts due to
# previous changes.
df61a05019d5c7479d4b29d251af4231f125e61c radv: handle 10-bit format clamping workaround.
# stable: 17.2 nomination only.
611076a41aac3095a82dff2432943d7f8d429822 radv: disable support for VEGA for now.
# stable: 17.2 nomination only.
bc56dfbf3f20504fce13e0f1730eea05ea0ea69a i965: Mark all EGLimages as non-coherent.
# stable: 17.2 nomination only.
61d2f3f1c24323a1c067595ec78dfbfefdc72b41 i965/miptree: Return NONE from texture_aux_usage when fully resolved
# stable: 17.2 nomination only.
b040f51b61d4d5ee671ba9d862e871ac5ac67ddf ac/nir: fixup layer/viewport export for GFX9.
# stable: 17.2 nomination only.
2843c5d15cf7c051d6aaf0744c3c1c7d4a734184 radeonsi: update non-resident bindless descriptors if needed
# stable: 17.2 nomination only.
4734bfc02adad103efa1fa51e4c0f93fcaedb73c Android: Fix LLVM duplicated symbols linking for N and M
# stable: 17.2 nomination only.
0ae9ce0f29ea1973b850a4e6c6cae8606973036e i965/clear: Quantize the depth clear value based on the format
# stable: 17.2 nomination only.
fdef2f0fd19ac6f2715a802d1e14b8ddfa094f11 radeonsi/gfx9: properly handle imported textures with unexpected swizzle mode
8dadb077908ad6d875577ca08e0e04a5741ba95b radeonsi: emit VGT_REUSE_OFF in the right place
# stable: 17.2 nomination only.
df09f1f3cd5110874899ed0f4b4c33ba9b006c50 radv/gfx9: use total levels in texture descriptor
11834195e9c276e1f3756cf8f6161be14124261b radv/gfx9: fix level count in color register setup.
d987b4ab9e240b479c71129c3c261982112c57d8 radv/gfx9: fixup db/stencil disable.
864eb1852778abaa6f63ca106216001c9f375f05 radv: bump space check for indexed draw.
9c080100d336e4f90575d5138508b519ed334eef radv/gfx9: emit sx_mrt_blend registers
5378b5d0710be00d1316e42e692a52d4bc5d92fe radv: cleanup some image view descriptor setup.
a74d98743115b928eaeabc0d58b63174158aa209 radv/image: don't rescale width/height if the format isn't changing
bae7723e132d3177697606c799eabbb7cdde2f38 radv/gfx9: only minify image view width/height/depth before gfx9.
5d26e0baf223b361c9919db213915a82d2dff5c4 radv: don't degrade tiling mode for small compressed or depth texture.
8985ad494bce5a4c365fe38fdf500d8582b5a7d0 radv/gfx9: don't expose linear depth on vega.
# stable: 17.2 nomination only.
43595db30274f714e2b1f6120c2f5ec4c41614fe ac/nir: Cast sources of integer ops to int.
# stable: 17.2 nomination only.
19f6906c1e498499035e98929657e2faebe6c993 radv/gfx9: gfx9 has buffer sizing rules like pre-VI.
# stable: 17.2 nomination only. Depends on earlier commit 76e2f390f98
# which did not land in branch.
f24cf82d6db290a88abfff0669d2c5e2aa463901 i965/tex: Don't pass samples to miptree_create_for_teximage
# stable: 17.2 nomination only. Depends on earlier commit f296c22989ff
# which did not land in branch.
54c41af0aa92333579a72830254ac3aaa9f4aea1 i965: Make a BRW_NEW_FAST_CLEAR_COLOR dirty bit.
# fixes: Depend on earlier commit 04a40f7d2a that did not land in
# branch and which exposes new API.
3a5e3aa5a53cff55a5e31766d713a41ffa5a93d7 egl/drm: Fix misused x and y offsets in swrast_put_image2()
fe2a6281b3b299998fe7399e7dbcc2077d773824 egl/drm: Fix misused x and y offsets in swrast_get_image()

@@ -31,7 +31,8 @@ because compatibility contexts are not supported.
<h2>SHA256 checksums</h2>
<pre>
-TBD
+971831bc1e748b3e8367eee6b9eb509bad2970e3c2f8520ad25f5caa12ca5491 mesa-17.1.6.tar.gz
+0686deadde1f126b20aa67e47e8c50502043eee4ecdf60d5009ffda3cebfee50 mesa-17.1.6.tar.xz
</pre>

148
docs/relnotes/17.1.7.html Normal file

@@ -0,0 +1,148 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.1.7 Release Notes / August 21, 2017</h1>
<p>
Mesa 17.1.7 is a bug fix release which fixes bugs found since the 17.1.6 release.
</p>
<p>
Mesa 17.1.7 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
7ca484fe3194e8185d9a20261845bfd284cc40d0f3fda690d317f85ac7b91af5 mesa-17.1.7.tar.gz
69f472a874b1122404fa0bd13e2d6bf87eb3b9ad9c21d2f39872a96d83d9e5f5 mesa-17.1.7.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101334">Bug 101334</a> - AMD SI cards: Some vulkan apps freeze the system</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101766">Bug 101766</a> - Assertion `!&quot;invalid type&quot;' failed when constant expression involves literal of different type</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102024">Bug 102024</a> - FORMAT_FEATURE_SAMPLED_IMAGE_BIT not supported for D16_UNORM and D32_SFLOAT</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102148">Bug 102148</a> - Crash when running qopenglwidget example on mesa llvmpipe win32</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102241">Bug 102241</a> - gallium/wgl: SwapBuffers freezing regularly with swap interval enabled</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (8):</p>
<ul>
<li>cherry-ignore: add "swr: use the correct variable for no undefined symbols"</li>
<li>cherry-ignore: add "radeon/ac: use ds_swizzle for derivs on si/cik."</li>
<li>cherry-ignore: add "configure: remove trailing "-a" in swr architecture teststable: 17.2 nomination only."</li>
<li>cherry-ignore: added 17.2 nominations.</li>
<li>cherry-ignore: add "radv: Handle VK_ATTACHMENT_UNUSED in color attachments."</li>
<li>cherry-ignore: add "virgl: drop precise modifier."</li>
<li>cherry-ignore: add "radv: handle 10-bit format clamping workaround."</li>
<li>Update version to 17.1.7</li>
</ul>
<p>Chris Wilson (1):</p>
<ul>
<li>i965/blit: Remember to include miptree buffer offset in relocs</li>
</ul>
<p>Connor Abbott (1):</p>
<ul>
<li>ac/nir: fix lsb emission</li>
</ul>
<p>Dave Airlie (5):</p>
<ul>
<li>intel/vec4/gs: reset nr_pull_param if DUAL_INSTANCED compile failed.</li>
<li>radv: avoid GPU hangs if someone does a resolve with non-multisample src (v2)</li>
<li>radv: fix f16-&gt;f32 denorm handling for SI/CIK. (v2)</li>
<li>radv: fix MSAA on SI gpus.</li>
<li>radv: force cs/ps/l2 flush at end of command stream. (v2)</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: add sha256 checksums for 17.1.6</li>
<li>egl/x11: don't leak xfixes_query in the error path</li>
<li>egl: avoid eglCreatePlatform*Surface{EXT,} crash with invalid dpy</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>util: Fix build on old glibc.</li>
</ul>
<p>Frank Richter (3):</p>
<ul>
<li>st/mesa: fix a null pointer access</li>
<li>st/wgl: check for negative delta in wait_swap_interval()</li>
<li>gallium/os: fix os_time_get_nano() to roll over less</li>
</ul>
<p>Ilia Mirkin (3):</p>
<ul>
<li>glsl/ast: update rhs in addition to the var's constant_value</li>
<li>nv50/ir: fix srcMask computation for TG4 and TXF</li>
<li>nv50/ir: fix TXQ srcMask</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>anv/formats: Allow sampling on depth-only formats on gen7</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nv50/ir: fix ConstantFolding with saturation</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Delete pitch alignment assertion in get_blit_intratile_offset_el.</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>ac: fail shader compilation if libelf is replaced by an incompatible version</li>
<li>radeonsi: disable CE by default</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>swr/rast: Fix invalid casting for calls to Interlocked* functions</li>
</ul>
</div>
</body>
</html>

114
docs/relnotes/17.1.8.html Normal file

@@ -0,0 +1,114 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.1.8 Release Notes / August 28, 2017</h1>
<p>
Mesa 17.1.8 is a bug fix release which fixes bugs found since the 17.1.7 release.
</p>
<p>
Mesa 17.1.8 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101910">Bug 101910</a> - [BYT] ES31-CTS.functional.copy_image.non_compressed.viewclass_96_bits.rgb32f_rgb32f</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102308">Bug 102308</a> - segfault in glCompressedTextureSubImage3D</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (6):</p>
<ul>
<li>docs: add sha256 checksums for 17.1.7</li>
<li>cherry-ignore: cherry-ignore: added 17.2 nominations.</li>
<li>cherry-ignore: add "i965/tex: Don't pass samples to miptree_create_for_teximage"</li>
<li>cherry-ignore: add "i965: Make a BRW_NEW_FAST_CLEAR_COLOR dirty bit."</li>
<li>cherry-ignore: add "egl/drm: Fix misused x and y offsets in swrast_*_image*"</li>
<li>Update version to 17.1.8</li>
</ul>
<p>Christoph Haag (1):</p>
<ul>
<li>mesa: only copy requested compressed teximage cubemap faces</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>radv: don't crash if we have no framebuffer</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>glsl: add a few missing int64 constant propagation cases</li>
<li>nv50/ir: properly set sType for TXF ops to U32</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965: Stop looking at NewDriverState when emitting 3DSTATE_URB</li>
</ul>
<p>Kai Chen (1):</p>
<ul>
<li>egl/wayland: Use roundtrips when awaiting buffer release</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>i965: perf: minimize the chances to spread queries across batchbuffers</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi/gfx9: add a temporary workaround for a tessellation driver bug</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>swr/rast: switch gen_knobs.cpp license</li>
</ul>
<p>Topi Pohjolainen (1):</p>
<ul>
<li>intel/blorp: Adjust intra-tile x when faking rgb with red-only</li>
</ul>
</div>
</body>
</html>


@@ -109,7 +109,7 @@ static void parse_relocs(Elf *elf, Elf_Data *relocs, Elf_Data *symbols,
}
}
void ac_elf_read(const char *elf_data, unsigned elf_size,
bool ac_elf_read(const char *elf_data, unsigned elf_size,
struct ac_shader_binary *binary)
{
char *elf_buffer;
@@ -118,6 +118,7 @@ void ac_elf_read(const char *elf_data, unsigned elf_size,
Elf_Data *symbols = NULL, *relocs = NULL;
size_t section_str_index;
unsigned symbol_sh_link = 0;
bool success = true;
/* One of the libelf implementations
* (http://www.mr511.de/software/english.htm) requires calling
@@ -137,7 +138,8 @@ void ac_elf_read(const char *elf_data, unsigned elf_size,
GElf_Shdr section_header;
if (gelf_getshdr(section, &section_header) != &section_header) {
fprintf(stderr, "Failed to read ELF section header\n");
return;
success = false;
break;
}
name = elf_strptr(elf, section_str_index, section_header.sh_name);
if (!strcmp(name, ".text")) {
@@ -148,6 +150,11 @@ void ac_elf_read(const char *elf_data, unsigned elf_size,
} else if (!strcmp(name, ".AMDGPU.config")) {
section_data = elf_getdata(section, section_data);
binary->config_size = section_data->d_size;
if (!binary->config_size) {
fprintf(stderr, ".AMDGPU.config is empty!\n");
success = false;
break;
}
binary->config = MALLOC(binary->config_size * sizeof(unsigned char));
memcpy(binary->config, section_data->d_buf, binary->config_size);
} else if (!strcmp(name, ".AMDGPU.disasm")) {
@@ -186,6 +193,7 @@ void ac_elf_read(const char *elf_data, unsigned elf_size,
binary->global_symbol_count = 1;
binary->config_size_per_symbol = binary->config_size;
}
return success;
}
const unsigned char *ac_shader_binary_config_start(


@@ -83,7 +83,7 @@ struct ac_shader_config {
* Parse the elf binary stored in \p elf_data and create a
* ac_shader_binary object.
*/
void ac_elf_read(const char *elf_data, unsigned elf_size,
bool ac_elf_read(const char *elf_data, unsigned elf_size,
struct ac_shader_binary *binary);
/**


@@ -1135,7 +1135,17 @@ static LLVMValueRef emit_find_lsb(struct nir_to_llvm_context *ctx,
*/
LLVMConstInt(ctx->i32, 1, false),
};
return ac_build_intrinsic(&ctx->ac, "llvm.cttz.i32", ctx->i32, params, 2, AC_FUNC_ATTR_READNONE);
LLVMValueRef lsb = ac_build_intrinsic(&ctx->ac, "llvm.cttz.i32", ctx->i32,
params, 2,
AC_FUNC_ATTR_READNONE);
/* TODO: We need an intrinsic to skip this conditional. */
/* Check for zero: */
return LLVMBuildSelect(ctx->builder, LLVMBuildICmp(ctx->builder,
LLVMIntEQ, src0,
ctx->i32zero, ""),
LLVMConstInt(ctx->i32, -1, 0), lsb, "");
}
static LLVMValueRef emit_ifind_msb(struct nir_to_llvm_context *ctx,
@@ -1239,7 +1249,6 @@ static LLVMValueRef emit_f2f16(struct nir_to_llvm_context *ctx,
src0 = to_float(ctx, src0);
result = LLVMBuildFPTrunc(ctx->builder, src0, ctx->f16, "");
/* TODO SI/CIK options here */
if (ctx->options->chip_class >= VI) {
LLVMValueRef args[2];
/* Check if the result is a denormal - and flush to 0 if so. */
@@ -1253,7 +1262,22 @@ static LLVMValueRef emit_f2f16(struct nir_to_llvm_context *ctx,
if (ctx->options->chip_class >= VI)
result = LLVMBuildSelect(ctx->builder, cond, ctx->f32zero, result, "");
else {
/* for SI/CIK */
/* 0x38800000 is smallest half float value (2^-14) in 32-bit float,
* so compare the result and flush to 0 if it's smaller.
*/
LLVMValueRef temp, cond2;
temp = emit_intrin_1f_param(&ctx->ac, "llvm.fabs",
ctx->f32, result);
cond = LLVMBuildFCmp(ctx->builder, LLVMRealUGT,
LLVMBuildBitCast(ctx->builder, LLVMConstInt(ctx->i32, 0x38800000, false), ctx->f32, ""),
temp, "");
cond2 = LLVMBuildFCmp(ctx->builder, LLVMRealUNE,
temp, ctx->f32zero, "");
cond = LLVMBuildAnd(ctx->builder, cond, cond2, "");
result = LLVMBuildSelect(ctx->builder, cond, ctx->f32zero, result, "");
}
return result;
}
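
The SI/CIK branch above depends on 0x38800000 being the single-precision bit pattern of 2^-14, the smallest normal half-float value. A minimal standalone sketch of that comparison follows. The helper names `bits_to_float` and `flush_f16_denorm` are hypothetical and not part of Mesa, and the sketch uses ordered comparisons, so NaN passes through unchanged, whereas the LLVM code uses the unordered predicates LLVMRealUGT and LLVMRealUNE.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float (memcpy avoids aliasing issues). */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical helper: return 0 for any nonzero value whose magnitude is
 * below the smallest normal half-float (2^-14), otherwise return the value. */
static float flush_f16_denorm(float value)
{
    const float min_normal_half = bits_to_float(0x38800000u); /* 2^-14 */
    float mag = value < 0.0f ? -value : value;

    if (mag < min_normal_half && mag != 0.0f)
        return 0.0f;
    return value;
}
```

Values at or above 2^-14 in magnitude are returned untouched, which mirrors the "compare the result and flush to 0 if it's smaller" comment in the hunk.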


@@ -1106,6 +1106,10 @@ radv_emit_framebuffer_state(struct radv_cmd_buffer *cmd_buffer)
const struct radv_subpass *subpass = cmd_buffer->state.subpass;
int dst_resolve_micro_tile_mode = -1;
/* this may happen for inherited secondary recording */
if (!framebuffer)
return;
if (subpass->has_resolve) {
uint32_t a = subpass->resolve_attachments[0].attachment;
const struct radv_image *image = framebuffer->attachments[a].attachment->image;
@@ -2093,8 +2097,11 @@ VkResult radv_EndCommandBuffer(
{
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
if (cmd_buffer->queue_family_index != RADV_QUEUE_TRANSFER)
if (cmd_buffer->queue_family_index != RADV_QUEUE_TRANSFER) {
if (cmd_buffer->device->physical_device->rad_info.chip_class == SI)
cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_CS_PARTIAL_FLUSH | RADV_CMD_FLAG_PS_PARTIAL_FLUSH | RADV_CMD_FLAG_WRITEBACK_GLOBAL_L2;
si_emit_cache_flush(cmd_buffer);
}
if (!cmd_buffer->device->ws->cs_finalize(cmd_buffer->cs) ||
cmd_buffer->record_fail)


@@ -2730,9 +2730,13 @@ radv_initialise_color_surface(struct radv_device *device,
format != V_028C70_COLOR_24_8) |
S_028C70_NUMBER_TYPE(ntype) |
S_028C70_ENDIAN(endian);
if (iview->image->samples > 1)
if (iview->image->fmask.size)
cb->cb_color_info |= S_028C70_COMPRESSION(1);
if ((iview->image->samples > 1) && iview->image->fmask.size) {
cb->cb_color_info |= S_028C70_COMPRESSION(1);
if (device->physical_device->rad_info.chip_class == SI) {
unsigned fmask_bankh = util_logbase2(iview->image->fmask.bank_height);
cb->cb_color_attrib |= S_028C74_FMASK_BANK_HEIGHT(fmask_bankh);
}
}
if (iview->image->cmask.size &&
!(device->debug_flags & RADV_DEBUG_NO_FAST_CLEARS))


@@ -436,6 +436,11 @@ void radv_CmdResolveImage(
radv_meta_save_graphics_reset_vport_scissor(&saved_state, cmd_buffer);
assert(src_image->samples > 1);
if (src_image->samples <= 1) {
/* this causes GPU hangs if we get past here */
fprintf(stderr, "radv: Illegal resolve operation (src not multisampled), will hang GPU.");
return;
}
assert(dest_image->samples == 1);
if (src_image->samples >= 16) {


@@ -4281,7 +4281,7 @@ process_initializer(ir_variable *var, ast_declaration *decl,
} else {
if (var->type->is_numeric()) {
/* Reduce cascading errors. */
var->constant_value = type->qualifier.flags.q.constant
rhs = var->constant_value = type->qualifier.flags.q.constant
? ir_constant::zero(state, var->type) : NULL;
}
}


@@ -725,6 +725,8 @@ ir_swizzle::constant_expression_value(struct hash_table *variable_context)
case GLSL_TYPE_FLOAT: data.f[i] = v->value.f[swiz_idx[i]]; break;
case GLSL_TYPE_BOOL: data.b[i] = v->value.b[swiz_idx[i]]; break;
case GLSL_TYPE_DOUBLE:data.d[i] = v->value.d[swiz_idx[i]]; break;
case GLSL_TYPE_UINT64:data.u64[i] = v->value.u64[swiz_idx[i]]; break;
case GLSL_TYPE_INT64: data.i64[i] = v->value.i64[swiz_idx[i]]; break;
default: assert(!"Should not get here."); break;
}
}


@@ -237,6 +237,12 @@ ir_constant_propagation_visitor::constant_propagation(ir_rvalue **rvalue) {
case GLSL_TYPE_BOOL:
data.b[i] = found->constant->value.b[rhs_channel];
break;
case GLSL_TYPE_UINT64:
data.u64[i] = found->constant->value.u64[rhs_channel];
break;
case GLSL_TYPE_INT64:
data.i64[i] = found->constant->value.i64[rhs_channel];
break;
default:
assert(!"not reached");
break;


@@ -370,8 +370,13 @@ get_back_bo(struct dri2_egl_surface *dri2_surf)
break;
/* If we don't have a buffer, then block on the server to release one for
* us, and try again. */
if (wl_display_dispatch_queue(dri2_dpy->wl_dpy, dri2_surf->wl_queue) < 0)
* us, and try again. wl_display_dispatch_queue will process any pending
* events, however not all servers flush on issuing a buffer release
* event. So, we spam the server with roundtrips as they always cause a
* client flush.
*/
if (wl_display_roundtrip_queue(dri2_dpy->wl_dpy,
dri2_surf->wl_queue) < 0)
return -1;
}


@@ -647,6 +647,7 @@ dri2_x11_connect(struct dri2_egl_display *dri2_dpy)
error != NULL || xfixes_query->major_version < 2) {
_eglLog(_EGL_WARNING, "DRI2: failed to query xfixes version");
free(error);
free(xfixes_query);
return EGL_FALSE;
}
free(xfixes_query);


@@ -920,7 +920,7 @@ static void *
_fixupNativeWindow(_EGLDisplay *disp, void *native_window)
{
#ifdef HAVE_X11_PLATFORM
if (disp->Platform == _EGL_PLATFORM_X11 && native_window != NULL) {
if (disp && disp->Platform == _EGL_PLATFORM_X11 && native_window != NULL) {
/* The `native_window` parameter for the X11 platform differs between
* eglCreateWindowSurface() and eglCreatePlatformPixmapSurfaceEXT(). In
* eglCreateWindowSurface(), the type of `native_window` is an Xlib
@@ -982,7 +982,7 @@ _fixupNativePixmap(_EGLDisplay *disp, void *native_pixmap)
* `Pixmap*`. Convert `Pixmap*` to `Pixmap` because that's what
* dri2_x11_create_pixmap_surface() expects.
*/
if (disp->Platform == _EGL_PLATFORM_X11 && native_pixmap != NULL)
if (disp && disp->Platform == _EGL_PLATFORM_X11 && native_pixmap != NULL)
return (void *)(* (Pixmap*) native_pixmap);
#endif
return native_pixmap;


@@ -69,10 +69,17 @@ os_time_get_nano(void)
static LARGE_INTEGER frequency;
LARGE_INTEGER counter;
int64_t secs, nanosecs;
if(!frequency.QuadPart)
QueryPerformanceFrequency(&frequency);
QueryPerformanceCounter(&counter);
return counter.QuadPart*INT64_C(1000000000)/frequency.QuadPart;
/* Compute seconds and nanoseconds parts separately to
* reduce severity of precision loss.
*/
secs = counter.QuadPart / frequency.QuadPart;
nanosecs = (counter.QuadPart % frequency.QuadPart) * INT64_C(1000000000)
/ frequency.QuadPart;
return secs*INT64_C(1000000000) + nanosecs;
#else
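
The rewritten Windows path splits the counter into whole seconds and a remainder before scaling. The largest intermediate product is then bounded by `frequency * 1000000000` rather than `counter * 1000000000`. A minimal sketch of the same arithmetic follows, assuming a hypothetical function `ticks_to_ns` that takes plain `int64_t` values in place of `LARGE_INTEGER`:

```c
#include <stdint.h>

/* Hypothetical helper: convert a tick count to nanoseconds.
 * Whole seconds and the sub-second remainder are scaled separately so the
 * multiplication by 1e9 is applied to a value smaller than `frequency`. */
static int64_t ticks_to_ns(int64_t counter, int64_t frequency)
{
    int64_t secs = counter / frequency;
    int64_t nanosecs = (counter % frequency) * INT64_C(1000000000) / frequency;

    return secs * INT64_C(1000000000) + nanosecs;
}
```

With a 10 MHz counter, the old form `counter * 1000000000 / frequency` overflows a signed 64-bit integer once the counter passes roughly 9.2e9 ticks (about 15 minutes), while the split form stays in range as long as the final nanosecond value itself fits.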


@@ -905,6 +905,9 @@ TexInstruction::TexInstruction(Function *fn, operation op)
tex.rIndirectSrc = -1;
tex.sIndirectSrc = -1;
if (op == OP_TXF)
sType = TYPE_U32;
}
TexInstruction::~TexInstruction()


@@ -2006,6 +2006,7 @@ CodeEmitterNVC0::getSRegEncoding(const ValueRef& ref)
void
CodeEmitterNVC0::emitMOV(const Instruction *i)
{
assert(!i->saturate);
if (i->def(0).getFile() == FILE_PREDICATE) {
if (i->src(0).getFile() == FILE_GPR) {
code[0] = 0xfc01c003;


@@ -305,6 +305,8 @@ unsigned int Instruction::srcMask(unsigned int s) const
case TGSI_OPCODE_TXD:
case TGSI_OPCODE_TXL:
case TGSI_OPCODE_TXP:
case TGSI_OPCODE_TXF:
case TGSI_OPCODE_TG4:
case TGSI_OPCODE_TEX_LZ:
case TGSI_OPCODE_TXF_LZ:
case TGSI_OPCODE_LODQ:
@@ -343,6 +345,8 @@ unsigned int Instruction::srcMask(unsigned int s) const
}
}
return mask;
case TGSI_OPCODE_TXQ:
return 1;
case TGSI_OPCODE_XPD:
{
unsigned int x = 0;


@@ -727,7 +727,9 @@ ConstantFolding::expr(Instruction *i,
// Leave PFETCH alone... we just folded its 2 args into 1.
break;
default:
i->op = i->saturate ? OP_SAT : OP_MOV; /* SAT handled by unary() */
i->op = i->saturate ? OP_SAT : OP_MOV;
if (i->saturate)
unary(i, *i->getSrc(0)->asImm());
break;
}
i->subOp = 0;
@@ -1509,6 +1511,17 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
default:
return;
}
// This can get left behind some of the optimizations which simplify
// saturatable values.
if (newi->op == OP_MOV && newi->saturate) {
ImmediateValue tmp;
newi->saturate = 0;
newi->op = OP_SAT;
if (newi->src(0).getImmediate(tmp))
unary(newi, tmp);
}
if (newi->op != op)
foldCount++;
}
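
Both hunks above make sure that a folded immediate which still carries the saturate flag is clamped instead of being emitted as a plain MOV. The sketch below illustrates the idea outside the nv50 IR. The `folded_insn` struct and `finish_fold` function are hypothetical, and it assumes that saturation means clamping a float to the range [0, 1].

```c
#include <stdbool.h>

enum fold_op { FOLD_OP_MOV, FOLD_OP_SAT }; /* hypothetical stand-ins */

struct folded_insn {
    enum fold_op op;
    bool saturate;
    float imm; /* the already-folded immediate operand */
};

/* Assumption: saturation clamps to [0, 1]. */
static float saturate_f32(float v)
{
    if (v < 0.0f)
        return 0.0f;
    if (v > 1.0f)
        return 1.0f;
    return v;
}

/* If folding left a saturating MOV of an immediate, apply the clamp to the
 * immediate and drop the flag, so the MOV emitter never sees it. */
static void finish_fold(struct folded_insn *insn)
{
    if (insn->op == FOLD_OP_MOV && insn->saturate) {
        insn->saturate = false;
        insn->imm = saturate_f32(insn->imm);
    }
}
```

This matches the intent of the `assert(!i->saturate)` added to `emitMOV`: by the time a MOV reaches the emitter, no saturate flag should remain on it.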


@@ -769,6 +769,7 @@ static const struct debug_named_value common_debug_options[] = {
{ "norbplus", DBG_NO_RB_PLUS, "Disable RB+." },
{ "sisched", DBG_SI_SCHED, "Enable LLVM SI Machine Instruction Scheduler." },
{ "mono", DBG_MONOLITHIC_SHADERS, "Use old-style monolithic shaders compiled on demand" },
{ "ce", DBG_CE, "Force enable the constant engine" },
{ "noce", DBG_NO_CE, "Disable the constant engine"},
{ "unsafemath", DBG_UNSAFE_MATH, "Enable unsafe math shader optimizations" },
{ "nodccfb", DBG_NO_DCC_FB, "Disable separate DCC on the main framebuffer" },


@@ -64,12 +64,12 @@
#define R600_PRIM_RECTANGLE_LIST PIPE_PRIM_MAX
/* Debug flags. */
/* logging */
/* logging and features */
#define DBG_TEX (1 << 0)
/* gap - reuse */
#define DBG_COMPUTE (1 << 2)
#define DBG_VM (1 << 3)
/* gap - reuse */
#define DBG_CE (1 << 4)
/* shader logging */
#define DBG_FS (1 << 5)
#define DBG_VS (1 << 6)


@@ -185,15 +185,24 @@ static struct pipe_context *si_create_context(struct pipe_screen *screen,
sctx->b.gfx.cs = ws->cs_create(sctx->b.ctx, RING_GFX,
si_context_gfx_flush, sctx);
/* SI + AMDGPU + CE = GPU hang */
if (!(sscreen->b.debug_flags & DBG_NO_CE) && ws->cs_add_const_ib &&
sscreen->b.chip_class != SI &&
/* These can't use CE due to a power gating bug in the kernel. */
sscreen->b.family != CHIP_CARRIZO &&
sscreen->b.family != CHIP_STONEY &&
/* Some CE bug is causing green screen corruption w/ MPV video
* playback and occasional corruption w/ 3D. */
sscreen->b.chip_class != GFX9) {
bool enable_ce = sscreen->b.chip_class != SI && /* SI hangs */
/* These can't use CE due to a power gating bug in the kernel. */
sscreen->b.family != CHIP_CARRIZO &&
sscreen->b.family != CHIP_STONEY;
/* CE is currently disabled by default, because it makes s_load latency
* worse, because CE IB doesn't run in lockstep with DE.
* Remove this line after that performance issue has been resolved.
*/
enable_ce = false;
/* Apply CE overrides. */
if (sscreen->b.debug_flags & DBG_NO_CE)
enable_ce = false;
else if (sscreen->b.debug_flags & DBG_CE)
enable_ce = true;
if (ws->cs_add_const_ib && enable_ce) {
sctx->ce_ib = ws->cs_add_const_ib(sctx->b.gfx.cs);
if (!sctx->ce_ib)
goto fail;
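
As written in this hunk, the chip-based initial value of `enable_ce` is overwritten by `enable_ce = false`, so the constant engine is used only when the "ce" debug option is set and "noce" is not. A small sketch of that decision logic follows. The flag macros and the `ce_enabled` function are hypothetical stand-ins for `DBG_CE`, `DBG_NO_CE`, and the inline code above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for DBG_CE and DBG_NO_CE. */
#define DEMO_DBG_CE    (1u << 0)
#define DEMO_DBG_NO_CE (1u << 1)

/* Returns whether a constant-engine IB would be created. */
static bool ce_enabled(unsigned debug_flags, bool winsys_has_const_ib)
{
    bool enable_ce = false; /* disabled by default */

    /* Apply overrides; "noce" wins over "ce" when both are set. */
    if (debug_flags & DEMO_DBG_NO_CE)
        enable_ce = false;
    else if (debug_flags & DEMO_DBG_CE)
        enable_ce = true;

    return winsys_has_const_ib && enable_ce;
}
```

The winsys check stays outside the override logic, so forcing "ce" has no effect when `cs_add_const_ib` is unavailable.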


@@ -243,7 +243,10 @@ unsigned si_llvm_compile(LLVMModuleRef M, struct ac_shader_binary *binary,
buffer_size = LLVMGetBufferSize(out_buffer);
buffer_data = LLVMGetBufferStart(out_buffer);
ac_elf_read(buffer_data, buffer_size, binary);
if (!ac_elf_read(buffer_data, buffer_size, binary)) {
fprintf(stderr, "radeonsi: cannot read an ELF shader binary\n");
diag.retval = 1;
}
/* Clean up */
LLVMDisposeMemoryBuffer(out_buffer);


@@ -181,7 +181,11 @@ static void si_emit_derived_tess_state(struct si_context *sctx,
*num_patches = MIN2(*num_patches, 40);
/* SI bug workaround - limit LS-HS threadgroups to only one wave. */
if (sctx->b.chip_class == SI) {
if (sctx->b.chip_class == SI ||
/* TODO: fix GFX9 where a threadgroup contains more than 1 wave and
* LS vertices per patch > HS vertices per patch. Piglit: 16in-1out */
(sctx->b.chip_class == GFX9 &&
num_tcs_input_cp > num_tcs_output_cp)) {
unsigned one_wave = 64 / MAX2(num_tcs_input_cp, num_tcs_output_cp);
*num_patches = MIN2(*num_patches, one_wave);
}
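
The workaround caps the patch count so that one LS-HS threadgroup fits in a single 64-thread wave. The arithmetic can be sketched as below. The function name `limit_patches_to_one_wave` is hypothetical, and both control-point counts are assumed to be nonzero.

```c
/* Hypothetical helper: cap num_patches so that the larger of the input and
 * output control-point counts, times the patch count, fits in 64 threads. */
static unsigned limit_patches_to_one_wave(unsigned num_patches,
                                          unsigned num_tcs_input_cp,
                                          unsigned num_tcs_output_cp)
{
    unsigned max_cp = num_tcs_input_cp > num_tcs_output_cp
                          ? num_tcs_input_cp
                          : num_tcs_output_cp;
    unsigned one_wave = 64 / max_cp;

    return num_patches < one_wave ? num_patches : one_wave;
}
```

For the Piglit case named in the comment (16 input control points, 1 output control point), a request for 40 patches is reduced to 4.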


@@ -1,19 +1,24 @@
/******************************************************************************
* Copyright (C) 2015-2017 Intel Corporation. All Rights Reserved.
*
* Copyright 2015-2017
* Intel Corporation
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* http ://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*
% if gen_header:
* @file ${filename}.h


@@ -192,7 +192,7 @@ void QueueWork(SWR_CONTEXT *pContext)
if (IsDraw)
{
InterlockedIncrement((volatile LONG*)&pContext->drawsOutstandingFE);
InterlockedIncrement(&pContext->drawsOutstandingFE);
}
_ReadWriteBarrier();


@@ -408,12 +408,12 @@ struct DRAW_CONTEXT
bool dependent; // Backend work is dependent on all previous BE
bool isCompute; // Is this DC a compute context?
bool cleanupState; // True if this is the last draw using an entry in the state ring.
volatile bool doneFE; // Is FE work done for this draw?
FE_WORK FeWork;
volatile OSALIGNLINE(bool) doneFE; // Is FE work done for this draw?
volatile OSALIGNLINE(uint32_t) FeLock;
volatile int32_t threadsDone;
volatile OSALIGNLINE(uint32_t) threadsDone;
SYNC_DESC retireCallback; // Call this func when this DC is retired.
@@ -504,9 +504,9 @@ struct SWR_CONTEXT
// Scratch space for workers.
uint8_t** ppScratch;
volatile int32_t drawsOutstandingFE;
volatile OSALIGNLINE(uint32_t) drawsOutstandingFE;
CachingAllocator cachingArenaAllocator;
OSALIGNLINE(CachingAllocator) cachingArenaAllocator;
uint32_t frameCount;
uint32_t lastFrameChecked;


@@ -393,7 +393,7 @@ INLINE void ExecuteCallbacks(SWR_CONTEXT* pContext, uint32_t workerId, DRAW_CONT
// inlined-only version
INLINE int32_t CompleteDrawContextInl(SWR_CONTEXT* pContext, uint32_t workerId, DRAW_CONTEXT* pDC)
{
int32_t result = InterlockedDecrement((volatile LONG*)&pDC->threadsDone);
int32_t result = static_cast<int32_t>(InterlockedDecrement(&pDC->threadsDone));
SWR_ASSERT(result >= 0);
AR_FLUSH(pDC->drawId);
@@ -639,7 +639,7 @@ INLINE void CompleteDrawFE(SWR_CONTEXT* pContext, uint32_t workerId, DRAW_CONTEX
_mm_mfence();
pDC->doneFE = true;
InterlockedDecrement((volatile LONG*)&pContext->drawsOutstandingFE);
InterlockedDecrement(&pContext->drawsOutstandingFE);
}
void WorkOnFifoFE(SWR_CONTEXT *pContext, uint32_t workerId, uint32_t &curDrawFE)


@@ -601,8 +601,11 @@ wait_swap_interval(struct stw_framebuffer *fb)
int64_t min_swap_period =
1.0e6 / stw_dev->refresh_rate * stw_dev->swap_interval;
/* if time since last swap is less than wait period, wait */
if (delta < min_swap_period) {
/* If time since last swap is less than wait period, wait.
* Note that it's possible for the delta to be negative because of
* rollover. See https://bugs.freedesktop.org/show_bug.cgi?id=102241
*/
if ((delta >= 0) && (delta < min_swap_period)) {
float fudge = 1.75f; /* empirical fudge factor */
int64_t wait = (min_swap_period - delta) * fudge;
os_time_sleep(wait);
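
The added `delta >= 0` test means a negative delta caused by timer rollover no longer produces a long sleep. A standalone sketch of the wait computation follows. The function `swap_wait_time` is hypothetical and returns the sleep duration instead of calling `os_time_sleep()`. Both arguments are assumed to be in the same time unit.

```c
#include <stdint.h>

/* Hypothetical helper: how long to sleep before the next swap.
 * Returns 0 when no wait is needed, including when delta is negative
 * because the underlying timer rolled over. */
static int64_t swap_wait_time(int64_t delta, int64_t min_swap_period)
{
    if (delta >= 0 && delta < min_swap_period) {
        float fudge = 1.75f; /* same factor as in the hunk above */
        return (int64_t)((min_swap_period - delta) * fudge);
    }
    return 0;
}
```

Without the sign check, a large negative delta would yield `min_swap_period - delta` as a very large positive wait, which is consistent with the regular SwapBuffers freezes reported in bug 102241.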


@@ -1555,6 +1555,7 @@ surf_fake_rgb_with_red(const struct isl_device *isl_dev,
info->surf.logical_level0_px.width *= 3;
info->surf.phys_level0_sa.width *= 3;
info->tile_x_sa *= 3;
*x *= 3;
*width *= 3;


@@ -897,6 +897,7 @@ brw_compile_gs(const struct brw_compiler *compiler, void *log_data,
memcpy(prog_data->base.base.param, param,
sizeof(gl_constant_value*) * param_count);
prog_data->base.base.nr_params = param_count;
prog_data->base.base.nr_pull_params = 0;
ralloc_free(param);
}
}


@@ -401,7 +401,8 @@ anv_physical_device_get_format_properties(struct anv_physical_device *physical_d
/* Nothing to do here */
} else if (vk_format_is_depth_or_stencil(format)) {
tiled |= VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT;
if (physical_device->info.gen >= 8)
if (vk_format_aspects(format) == VK_IMAGE_ASPECT_DEPTH_BIT ||
physical_device->info.gen >= 8)
tiled |= VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT;
tiled |= VK_FORMAT_FEATURE_BLIT_SRC_BIT |


@@ -1095,6 +1095,14 @@ brw_end_perf_query(struct gl_context *ctx,
obj->oa.begin_report_id + 1);
}
/* We flush the batchbuffer here to minimize the chances that MI_RPC
* delimiting commands end up in different batchbuffers. If that's the
* case, the measurement will include the time it takes for the kernel
* scheduler to load a new request into the hardware. This is manifested
* in tools like frameretrace by spikes in the "GPU Core Clocks"
* counter.
*/
intel_batchbuffer_flush(brw);
--brw->perfquery.n_active_oa_queries;
/* NB: even though the query has now ended, it can't be accumulated


@@ -204,6 +204,15 @@ update_urb_size(struct brw_context *brw, const struct gen_l3_config *cfg)
if (brw->urb.size != sz) {
brw->urb.size = sz;
brw->ctx.NewDriverState |= BRW_NEW_URB_SIZE;
/* If we change the total URB size, reset the individual stage sizes to
* zero so that, even if there is no URB size change, gen7_upload_urb
* still re-emits 3DSTATE_URB_*.
*/
brw->urb.vsize = 0;
brw->urb.gsize = 0;
brw->urb.hsize = 0;
brw->urb.dsize = 0;
}
}


@@ -197,9 +197,7 @@ gen7_upload_urb(struct brw_context *brw, unsigned vs_size,
/* If we're just switching between programs with the same URB requirements,
* skip the rest of the logic.
*/
if (!(brw->ctx.NewDriverState & BRW_NEW_CONTEXT) &&
!(brw->ctx.NewDriverState & BRW_NEW_URB_SIZE) &&
brw->urb.vsize == entry_size[MESA_SHADER_VERTEX] &&
if (brw->urb.vsize == entry_size[MESA_SHADER_VERTEX] &&
brw->urb.gs_present == gs_present &&
brw->urb.gsize == entry_size[MESA_SHADER_GEOMETRY] &&
brw->urb.tess_present == tess_present &&


@@ -161,8 +161,7 @@ blorp_emit_urb_config(struct blorp_batch *batch, unsigned vs_entry_size)
struct brw_context *brw = batch->driver_batch;
#if GEN_GEN >= 7
if (!(brw->ctx.NewDriverState & (BRW_NEW_CONTEXT | BRW_NEW_URB_SIZE)) &&
brw->urb.vsize >= vs_entry_size)
if (brw->urb.vsize >= vs_entry_size)
return;
brw->ctx.NewDriverState |= BRW_NEW_URB_SIZE;


@@ -188,7 +188,6 @@ get_blit_intratile_offset_el(const struct brw_context *brw,
* The offsets we get from ISL in the tiled case are already aligned.
* In the linear case, we need to do some of our own aligning.
*/
assert(mt->pitch % 64 == 0);
uint32_t delta = *base_address_offset & 63;
assert(delta % mt->cpp == 0);
*base_address_offset -= delta;
@@ -826,11 +825,11 @@ intel_miptree_set_alpha_to_one(struct brw_context *brw,
if (brw->gen >= 8) {
OUT_RELOC64(mt->bo,
I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
offset);
mt->offset + offset);
} else {
OUT_RELOC(mt->bo,
I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
offset);
mt->offset + offset);
}
OUT_BATCH(0xffffffff); /* white, but only alpha gets written */
ADVANCE_BATCH_TILED(dst_y_tiled, false);


@@ -294,7 +294,7 @@ do_blit_bitmap( struct gl_context *ctx,
color,
irb->mt->pitch,
irb->mt->bo,
0,
irb->mt->offset,
irb->mt->tiling,
dstx + px,
dsty + py,


@@ -4766,13 +4766,13 @@ _mesa_CompressedTextureSubImage3D(GLuint texture, GLint level, GLint xoffset,
}
/* Copy in each face. */
for (i = 0; i < 6; ++i) {
for (i = zoffset; i < zoffset + depth; ++i) {
texImage = texObj->Image[i][level];
assert(texImage);
_mesa_compressed_texture_sub_image(ctx, 3, texObj, texImage,
texObj->Target, level,
xoffset, yoffset, zoffset,
xoffset, yoffset, 0,
width, height, 1,
format, imageSize, pixels);
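
In this cube map path, `zoffset` selects the first face and `depth` the number of faces, so the loop now visits only the requested faces and uploads each one as a single slice (z offset 0, depth 1). A minimal sketch of the new iteration range follows. The function `mark_requested_faces` is hypothetical, and it assumes the caller has already validated that `0 <= zoffset` and `zoffset + depth <= 6`.

```c
/* Hypothetical helper: record which of the six cube faces would be updated.
 * Returns the number of faces visited. */
static int mark_requested_faces(int zoffset, int depth, int touched[6])
{
    int count = 0;

    for (int i = zoffset; i < zoffset + depth; ++i) {
        touched[i] = 1; /* stands in for the per-face sub-image upload */
        ++count;
    }
    return count;
}
```

The previous loop ran over all six faces regardless of `zoffset` and `depth`, which is consistent with the segfault in glCompressedTextureSubImage3D reported as bug 102308.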


@@ -496,7 +496,7 @@ st_context_flush(struct st_context_iface *stctxi, unsigned flags,
st_flush(st, fence, pipe_flags);
if ((flags & ST_FLUSH_WAIT) && fence) {
if ((flags & ST_FLUSH_WAIT) && fence && *fence) {
st->pipe->screen->fence_finish(st->pipe->screen, NULL, *fence,
PIPE_TIMEOUT_INFINITE);
st->pipe->screen->fence_reference(st->pipe->screen, fence, NULL);


@@ -44,7 +44,9 @@ libmesautil_la_SOURCES = \
$(MESA_UTIL_FILES) \
$(MESA_UTIL_GENERATED_FILES)
libmesautil_la_LIBADD = $(ZLIB_LIBS)
libmesautil_la_LIBADD = \
$(CLOCK_LIB) \
$(ZLIB_LIBS)
roundeven_test_LDADD = -lm