Compare commits

91 Commits

Author SHA1 Message Date
Emil Velikov
2f9820c553 docs: add release notes for 17.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 11:09:05 +00:00
Emil Velikov
5f2d38cc1d Update version to 17.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 11:02:49 +00:00
Emil Velikov
5d961e1630 cherry-ignore: add a few more meson fixes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-09 04:23:18 +00:00
Roland Scheidegger
1bf16e4fbc r600: don't do stack workarounds for hemlock
By the looks of it it seems hemlock is treated separately to cypress, but
certainly it won't need the stack workarounds cedar/redwood (and
seemingly every other eg chip except cypress/juniper) need.
(Discovered by accident.)

Acked-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c2f0e08857)
2018-02-09 04:23:18 +00:00
Jon Turney
70604e8808 glx/apple: locate dispatch table functions to wrap by name
Avoid reaching into the dispatch table internals (and thus having to deal
with the complexities of remap etc.) by identifying functions to wrap by
name.

See:
https://lists.freedesktop.org/archives/mesa-dev/2015-June/086721.html et seq.
https://bugs.freedesktop.org/show_bug.cgi?id=90311

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d3540b405b)
2018-02-09 04:23:18 +00:00
Jon Turney
15beac3a01 glx/apple: include util/debug.h for env_var_as_boolean prototype
mesa/src/glx/glxcmds.c:1295:21: error: implicit declaration of function 'env_var_as_boolean' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
mesa/src/glx/apple/apple_visual.c:85:28: error: implicit declaration of function 'env_var_as_boolean' is invalid in C99 [-Werror,-Wimplicit-function-declaration]

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b37b7b42dc)
2018-02-09 04:23:18 +00:00
Jon Turney
e55ca6768f configure: Default to gbm=no on osx
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 7ad7a07c88)
2018-02-09 04:23:17 +00:00
Igor Gnatenko
9f6e05d11f link mesautil with pthreads
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `u_thread_setname':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:66: undefined reference to `pthread_setname_np'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `thrd_join':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../include/c11/threads_posix.h:336: undefined reference to `pthread_join'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `u_thread_create':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:48: undefined reference to `pthread_sigmask'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `thrd_create':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../include/c11/threads_posix.h:296: undefined reference to `pthread_create'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `u_thread_create':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:50: undefined reference to `pthread_sigmask'
/builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:50: undefined reference to `pthread_sigmask'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `call_once':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../include/c11/threads_posix.h:96: undefined reference to `pthread_once'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `u_thread_get_time_nano':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:84: undefined reference to `pthread_getcpuclockid'
collect2: error: ld returned 1 exit status

Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Igor Gnatenko <ignatenko@redhat.com>
(cherry picked from commit 23ce168048)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104818
2018-02-09 04:23:17 +00:00
Kenneth Graunke
5f862311e7 i965: Bump official kernel requirement to Linux v3.9.
In commit 3f353342a6 (present in 17.3.0)
we started unconditionally using I915_EXEC_NO_RELOC, which was
introduced in Linux v3.9.  ChromeOS kernel 3.8 has backported this,
so it should work too.

Running on older kernels would likely result in every single batch
being rejected by the kernel, which is pretty catastrophic.  Yet, it
appears that nobody noticed.  So, let's just bump the official
requirement and move forward ever so slowly.

Fixes: 3f353342a6 ("i965: Use I915_EXEC_NO_RELOC")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit c3cd2aac27)
2018-02-09 04:23:17 +00:00
Andres Gomez
833808f01c i965: perform 2 uploads with dual slot *64*PASSTHRU formats on gen<8
The emission of vertex attributes corresponding to dvec3 and dvec4
vertex shader input variables was not correct when the <size> passed
to the VertexAttribL* commands was <= 2.

In 61a8a55f55 ("i965/gen8: Fix vertex attrib upload for dvec3/4
shader inputs"), for gen8+ we needed to determine if the attrib was
dual slot to emit 128 or 256-bit, independently of the VAO size.

Similarly, for gen < 8 we also need to determine whether the attrib is
dual slot to force the emission of 256-bits through 2 uploads.

Additionally, we make use of the ISL_FORMAT_R32_FLOAT format in this
second upload to fill these unspecified components with zeros, as we
also do for gen8+.

Fixes the following test on Haswell:
KHR-GL46.vertex_attrib_binding.basic-inputL-case1

v2: Added more inline comments to explain why we are using
    ISL_FORMAT_R32_FLOAT and its consequences, as requested by
    Alejandro and Antía.

Fixes: 75968a668e ("i965/gen7: expose OpenGL 4.2 on Haswell when
supported")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103006
Cc: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Antia Puentes <apuentes@igalia.com>
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5a7aba2e0a)
2018-02-09 04:23:17 +00:00
Michel Dänzer
790cc8abe5 winsys/radeon: Compute is_displayable in surf_drm_to_winsys
It was always 0, breaking (at least) DRI3 with Xwayland.

Bugzilla: https://bugs.freedesktop.org/104306
Fixes: 5f2073be32 ("ac/surface: add ac_surface::is_displayable")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 1cf1bf32ef)
2018-02-09 04:23:17 +00:00
Matthew Nicholls
69beac3f38 radv: remove predication on cache flushes
This can lead to a situation where cache flushes could get conditionally
disabled while still clearing the flush_bits, and thus flushes due to
application pipeline barriers may never get executed.

Fixes: a6c2001ace (radv: add support for cmd predication.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ef272b161e)
[Emil Velikov: trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/amd/vulkan/radv_cmd_buffer.c
2018-02-09 04:23:17 +00:00
Dave Airlie
da327c6ce6 virgl: also remove dimension on indirect.
This fixes some dEQP tests that generated bad shaders.

Fixes: b6f6ead19 (virgl: drop const dimensions on first block.)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 49c61d8b84)
2018-02-09 04:23:17 +00:00
Dave Airlie
64ab67602b radv/gfx9: fix block compression texture views. (v2)
This ports a fix from amdvlk, to fix the sizing for mip levels
when block compressed images are viewed using uncompressed views.

My original fix didn't port the clamping, but it looks like
the clamping is required to stop the sizing going too large.

Fixes:
dEQP-VK.image.texel_view_compatible.graphic.extended*bc*
Doesn't crash DOW3 anymore.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit f6cc15dccd)
2018-02-09 04:23:17 +00:00
Emil Velikov
e27f066126 cherry-ignore: add meson fix
Meson is disabled in branch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-09 04:23:16 +00:00
Maxin B. John
d2258c5538 anv_icd.py: improve reproducible builds
Sort the output to ensure build reproducibility

Signed-off-by: Maxin B. John <maxin.john@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Fixes: 0ab04ba979 ("anv: Use python to generate ICD json files")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 8116b9170b)
2018-02-09 04:22:54 +00:00
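As an illustration of the reproducibility fix in d2258c5538, the sketch below serializes a dictionary with sorted keys so that the output does not depend on insertion order. The helper name `render_icd` and the dictionary contents are hypothetical and are not taken from anv_icd.py; `json.dumps(..., sort_keys=True)` is assumed here as a stand-in for whatever the real script does.

```python
import json


def render_icd(manifest):
    """Serialize a manifest so that key order never affects the output."""
    return json.dumps(manifest, indent=4, sort_keys=True, separators=(",", ": "))


# Same content, different insertion order (hypothetical values).
a = {
    "file_format_version": "1.0.0",
    "ICD": {"library_path": "libexample.so", "api_version": "1.0.0"},
}
b = {
    "ICD": {"api_version": "1.0.0", "library_path": "libexample.so"},
    "file_format_version": "1.0.0",
}

# With sorted keys, both serializations are byte-for-byte identical.
same_output = render_icd(a) == render_icd(b)
```

Without `sort_keys=True`, the two strings would follow each dictionary's insertion order and could differ between builds.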
Dave Airlie
5ef9c58f4b radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)
This seems to be broken, at least the cts tests fail.

This fixes:
dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_4
dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_8

2 samples seems to pass fine, amdvlk doesn't appear to enable TC for
possibly some other reasons here.

This is most likely a hack.

v1.1: add a bit of explanation text. (Samuel)
Fixes: ad3d98da9 (radv: enable tc compatible htile for d32s8 also.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit f4c534ef68)
2018-02-09 04:22:54 +00:00
Emil Velikov
47542b1f99 configure.ac: correct driglx-direct help text
The default was toggled a while back, but the text wasn't updated.

Fixes: bd526ec9e1 ("configure: Always default to
--enable-driglx-direct")
Cc: Jon TURNEY <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 6aeef54644)
2018-02-09 04:22:54 +00:00
Jason Ekstrand
caad5571fb i965: Call prepare_external after implicit window-system MSAA resolves
This fixes some rendering corruption in a couple of Android apps that
use window-system MSAA.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104741
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 2f7205be47)
2018-02-09 04:22:54 +00:00
Emil Velikov
0d3a990c7f cherry-ignore: radv: Don't expose VK_KHX_multiview on android.
stable: The KHX extension is disabled altogether in the stable
branches.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-09 04:22:17 +00:00
Emil Velikov
2b9e16d182 radv: Stop advertising VK_KHX_multiview
We don't want to advertise experimental extensions in actual releases.
However, there's no harm in leaving the code lying around in the tree.

[Emil Velikov: port from equivalent ANV commit]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-09 04:21:24 +00:00
Jason Ekstrand
87ffdbae1c anv: Stop advertising VK_KHX_multiview
We don't want to advertise experimental extensions in actual releases.
However, there's no harm in leaving the code lying around in the tree.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/intel/vulkan/anv_device.c
2018-02-09 04:20:27 +00:00
Lucas Stach
ac087eb40d renderonly: fix dumb BO allocation for non 32bpp formats
Take into account the resource format, instead of applying a hardcoded
32bpp. This not only over-allocates 16bpp formats, but also results in
a wrong stride being filled into the handle.

Fixes: 848b49b288 ("gallium: add renderonly library")
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 0c71a19fe4)
2018-02-09 03:50:11 +00:00
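The effect described in ac087eb40d can be seen with simple arithmetic. This is a minimal sketch that assumes a linear buffer with no pitch alignment; `dumb_bo_layout` is a hypothetical helper, not a function from the renderonly library, and real allocations may round the stride up.

```python
def dumb_bo_layout(width, height, bits_per_pixel):
    """Return (stride_bytes, size_bytes) for a linear buffer, ignoring alignment."""
    stride = width * bits_per_pixel // 8
    return stride, stride * height


# A 640x480 resource in a 16bpp format, sized two ways.
hardcoded = dumb_bo_layout(640, 480, 32)  # what a fixed 32bpp assumption yields
actual = dumb_bo_layout(640, 480, 16)     # what the resource format calls for

# The hardcoded path doubles both the stride and the allocation size.
stride_ratio = hardcoded[0] // actual[0]
```

The doubled size is only wasteful, but the doubled stride is what makes the exported handle describe the wrong layout.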
Jason Ekstrand
6a7e3a152e anv/cmd_buffer: Re-emit the pipeline at every subpass
If we ever hit this edge-case, it can theoretically cause problems for
CNL because we could end up changing render targets without re-emitting
3DSTATE_MULTISAMPLE which is part of the pipeline.  Just get rid of the
edge case.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 97938dac36)
2018-02-09 03:50:11 +00:00
Dave Airlie
b75f12a2f2 r600/sb: insert the else clause when we might depart from a loop
If there is a break inside the else clause and this means we
are breaking from a loop, the loop finalise will want to insert
the LOOP_BREAK/CONTINUE instruction, however if we don't emit
the else there is nowhere for these to end up, so they will end
up in the wrong place.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101442
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8d633f067b)
2018-02-09 03:50:10 +00:00
Emil Velikov
9161ac5c6d cherry-ignore: nir: mark unused space in packed_tex_data
stable: The commit covers nir serialise, which did not land in branch

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-09 03:50:10 +00:00
Emil Velikov
56427ff05e cherry-ignore: add i965 shader cache fixes
The feature is available in the 18.0 branch

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-09 03:50:10 +00:00
Emil Velikov
e1ab1de6b6 cherry-ignore: add r600/amdgpu 18.0 nominations
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-09 03:50:10 +00:00
Emil Velikov
eaa9449c26 cherry-ignore: add gen10 fixes
Initial gen10 support landed in the 18.0 series.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-09 03:50:10 +00:00
Eleni Maria Stea
62e0a8893b mesa: Fix function pointers initialization in state tracker
We assigned the function that gets the device uuid to the GetDriverUuid
function pointer and the function that gets the driver uuid to the
GetDeviceUuid function pointer inside the state tracker. Exchanged the
pointers.

cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 8096b558a7)
[Emil Velikov: trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collaboral.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/mesa/state_tracker/st_context.c
2018-02-09 03:50:10 +00:00
Emil Velikov
7e7b4c2c68 cherry-ignore: ac/nir: set amdgpu.uniform and invariant.load for UBOs
stable: The commit requires earlier commit w41c36c45 which did not land
in branch

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-09 03:50:10 +00:00
Jason Ekstrand
ae5e793fd7 anv/pipeline: Don't look at blend state unless we have an attachment
Without this, we may end up dereferencing blend before we check for
binding->index != UINT32_MAX.  However, Vulkan allows the blend state to
be NULL so long as you don't have any color attachments.  This fixes a
segfault when running The Talos Principle.

Fixes: 12f4e00b69
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit c8949e2498)
2018-02-09 03:50:10 +00:00
Jason Ekstrand
0bc9182f89 intel/fs: Use the original destination region for int MUL lowering
Some hardware (CHV, BXT) has special restrictions on register regions
when doing integer multiplication.  We want to respect those when we
lower to DxW multiplication.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 18fde36ced)

Squashed with:

i965/fs: Reset the register file to VGRF in lower_integer_multiplication

18fde36ced changed the way temporary
registers were allocated in lower_integer_multiplication so that we
allocate regs_written(inst) space and keep the stride of the original
destination register.  This was to ensure that any MUL which originally
followed the CHV/BXT integer multiply regioning restrictions would
continue to follow those restrictions even after lowering.  This works
fine except that I forgot to reset the register file to VGRF so, even
though they were assigned a number from alloc.allocate(), they had the
wrong register file.  This caused some GLES 3.0 CTS tests to start
failing on Sandy Bridge due to attempted reads from the MRF:

    ES3-CTS.functional.shaders.precision.int.highp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.int.mediump_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.int.lowp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.highp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.mediump_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.lowp_mul_fragment.snbm64

This commit remedies this problem by making a new register and setting
the region to match inst->dst, instead of copying inst->dst and
overwriting nr.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103626
Fixes: 18fde36ced
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit db682b8f0e)
2018-02-09 03:50:10 +00:00
Emil Velikov
a094314340 Revert "cherry-ignore: intel/fs: Use the original destination region for int MUL lowering"
This reverts commit 7295b97d61.

Originally the nomination was causing a regression. With that addressed,
we can pick it up alongside its fix.
2018-02-09 03:50:10 +00:00
Chuck Atkins
557f2cd46c configure.ac: add missing llvm dependencies to .pc files
v2: Only add as dependencies for gallium-osmesa and gallium-xlib

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 6ac5e851f1)
2018-02-09 03:50:10 +00:00
Emil Velikov
f23257b623 cherry-ignore: swr/rast: support llvm 3.9 type declarations
stable: The commit requires earlier commit 01ab218bbc which did not land
in branch

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-09 03:50:10 +00:00
Dave Airlie
a78ff020c6 radv: move spi_baryc_cntl to pipeline
We need to enable the pos float location 2 mode anytime we have
persample not just when forced by the frag shader.

This fixes:
dEQP-VK.pipeline.multisample.min_sample_shading*

Fixes: 58c97a079 (radv: enable location at sample when persample is forced.)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 298554541d)
2018-02-09 03:50:10 +00:00
Emil Velikov
5ef3cadf15 cherry-ignore: meson: multiple fixes
stable: The commits address the Meson build that is explicitly disabled
in branch

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-09 03:50:10 +00:00
Jason Ekstrand
4987b561b5 i965/surface_state: Drop brw_aux_surface_disabled
The only purpose of this function is to disable aux on texture surfaces
when the corresponding renderbuffer has aux disabled.  However, the act
of disabling aux on the renderbuffer will cause it to be resolved and
intel_miptree_texture_aux_usage will already check the resolved status
of a texture and return ISL_AUX_USAGE_NONE for it.  Even if we used CCS
for it, that wouldn't really be a problem because the CCS will be in the
pass-through state and so it would effectively be ignored.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 468ea3cc45)
2018-02-09 03:50:10 +00:00
Jason Ekstrand
468a2b6525 i965/miptree: Add an aux_disabled parameter to render_aux_usage
Only one of the callers of intel_miptree_render_aux_usage actually took
brw->draw_aux_buffer_disabled into account.  This was causing us to
ignore draw_aux_buffer_disabled for the intel_miptree_prepare_render.
This isn't a problem because the draw_aux_buffer_disabled entry was set
during texture preparation and we already did the resolve at that time.
However, this also meant that the aux_usage we were passing to
brw_cache_flush_for_render and brw_render_cache_add_bo was wrong so our
automatic cache flushing around aux_usage changes wasn't happening.
This was causing GPU hangs in Oxenfree.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104711
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104411
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104383
Fixes: ea0d2e98ec
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d38ec24f53)
2018-02-09 03:50:10 +00:00
Jason Ekstrand
0dd5120ded i965/miptree: Take an aux_usage in prepare/finish_render
Both callers of intel_miptree_prepare/finish_render have to call
intel_miptree_render_aux_usage anyway for other reasons.  They may as
well pass the result in instead of us calling it again.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit dfe0217905)
2018-02-09 03:50:10 +00:00
Marek Olšák
1d7d13ffc6 radeonsi: don't ignore pitch for imported textures
Cc: 17.2 17.3 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 022c5b22fe)
[Emil Velikov: attribute for lack of slice_size_dw]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/gallium/drivers/radeon/r600_texture.c
2018-02-09 03:50:10 +00:00
Boyuan Zhang
36e1b57bad radeon/uvd: add and manage render picture list
Create a list in the decoder to store all render picture buffer pointers
that are currently being used in reference picture lists.

During the get message buffer call, check each pointer in render_pic_list[]
against the given pic->ref[] list and remove pointers that are no longer
being used by pic->ref[]. Then add the current render surface pointer to
render_pic_list[] and assign the associated index to result.curr_idx.

As a result, result.curr_idx will have the correct index to represent the
current render picture, instead of the previous incrementing values.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 2ec48039b8)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104745
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeon/radeon_uvd.c
2018-02-09 03:50:10 +00:00
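The list management described in 36e1b57bad can be sketched as follows. This is not the radeon_uvd.c code: the function name, the use of `None` for a free slot, and the check for an already-present surface are assumptions made for illustration, and strings stand in for buffer pointers.

```python
def update_render_pic_list(render_pic_list, refs, current):
    """Drop entries no longer referenced, then place `current` in a free slot.

    render_pic_list is a fixed-size list of slots (None means free).
    Returns the slot index assigned to `current`.
    """
    # Release every slot whose picture is absent from the reference list.
    for i, pic in enumerate(render_pic_list):
        if pic is not None and pic not in refs:
            render_pic_list[i] = None
    # Assumption: a surface that is already tracked keeps its slot.
    if current in render_pic_list:
        return render_pic_list.index(current)
    idx = render_pic_list.index(None)  # raises ValueError when no slot is free
    render_pic_list[idx] = current
    return idx


slots = [None] * 4
first = update_render_pic_list(slots, [], "A")      # "A" takes slot 0
second = update_render_pic_list(slots, ["A"], "B")  # "A" still referenced
third = update_render_pic_list(slots, ["B"], "C")   # "A" dropped, slot reused
```

The third call shows the point of the change: the index comes from a slot that was freed, rather than from a counter that only ever increases.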
Boyuan Zhang
2b6d2f6a81 radeon/vcn: add and manage render picture list
Create a list in the decoder to store all render picture buffer pointers
that are currently being used in reference picture lists.

During the get message buffer call, check each pointer in render_pic_list[]
against the given pic->ref[] list and remove pointers that are no longer
being used by pic->ref[]. Then add the current render surface pointer to
render_pic_list[] and assign the associated index to result.curr_idx.

As a result, result.curr_idx will have the correct index to represent the
current render picture, instead of the previous incrementing values.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit f2bfd1cbb7)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104745
2018-02-09 03:50:10 +00:00
Indrajit Das
30a35f8d43 st/va: clear pointers for mpeg2 quantiser matrices
This is to fix VA-API issues with GStreamer and MPEG2.
Since gstreamer does not pass quantiser matrices with each frame, invalid
pointers were being passed to the driver. This patch addresses the same.

Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 338638a8af)
2018-02-09 03:50:09 +00:00
Indrajit Das
e46597f273 radeon/vcn: update quantiser matrices only when requested
Only update them when the pointers are valid.

Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit f5277e8492)
2018-02-09 03:50:09 +00:00
Indrajit Das
08ad68ea19 radeon/uvd: update quantiser matrices only when requested
Only upload them when the pointers are valid.

Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 38dee62c9a)
2018-02-09 03:50:09 +00:00
Indrajit Das
339b43b0af st/omx_bellagio: Update default intra matrix per MPEG2 spec
Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit e05d5b0cf3)
2018-02-09 03:50:09 +00:00
Emil Velikov
1bfeb763fb cherry-ignore: radv: fix sample_mask_in loading. (v3.1)
fixes: The commit requires earlier commit 49d035122e which did not land
in branch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-09 03:50:08 +00:00
Emil Velikov
c465067ff8 cherry-ignore: anv: add explicit 18.0 only nominations
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 05:09:02 +00:00
Emil Velikov
b31e232baa cherry-ignore: swr: refactor swr_create_screen to allow for proper cleanup on error
stable: The commit depends on earlier commit a4be2bcee2 which did not
land in branch

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 05:09:01 +00:00
Emil Velikov
d20d97ec8f cherry-ignore: i965: Accept CONTEXT_ATTRIB_PRIORITY for brwCreateContext
stable: The commit addresses earlier commit 6d87500fe1 which did not
land in branch

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-25 05:08:57 +00:00
Dave Airlie
eaa3da4189 radv: don't use hw resolves for r16g16 norm formats.
radeonsi has a workaround for this, but it uses an R16A16 format,
which Vulkan doesn't have. We could probably come up with a workaround,
but for now just avoid hw resolves.

Fixes:
dEQP-VK.renderpass.suballocation.multisample.r16g16_*norm*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 2a04f5481d (radv/meta: select resolve paths)
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c727ea9370)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/amd/vulkan/radv_meta_resolve.c
2018-01-25 02:57:56 +00:00
Dave Airlie
c30a6252c2 radv: don't use hw resolve for integer image formats
From reading AMDVLK it currently never uses hw resolve paths.

This patch takes from radeonsi which doesn't use hw resolve
for integer formats, and does the same for radv.

This fixes:
dEQP-VK.renderpass.suballocation.multisample*uint tests.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 2a04f5481d (radv/meta: select resolve paths)
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4df414bbd2)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/amd/vulkan/radv_meta_resolve.c
2018-01-25 02:55:57 +00:00
Dave Airlie
1bd25a4d99 radv: add fs_key meta format support to resolve passes.
Some of the hw resolve passes need the SPI color format setup
correctly.

This fixes lots of 16-bit and 32-bit format tests in
dEQP-VK.renderpass.suballocation.multisample*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 316d762186)
2018-01-25 02:52:34 +00:00
Samuel Pitoiset
5e889ae22c radv: create pipeline layout objects for all meta operations
They are dummy objects but the spec requires layout to not be
NULL, this just makes sure we are creating valid pipeline layout
objects. This will allow us to remove some useless checks.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 3595a11648)
2018-01-25 02:52:20 +00:00
Eric Engestrom
558411c21e radeon: remove left over dead code
Fixes: 4e0d99a635 "r100: Use shared debug code"
Cc: Pauli Nieminen <suokkos@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit eee8dd7c33)
2018-01-25 02:46:25 +00:00
Bas Nieuwenhuizen
4a79113e2b ac/nir: Fix vector extraction if source vector has >4 elements.
v2: Add forgotten argument and start offset.

Fixes: 91074bb11b "radv/ac: Implement Float64 SSBO stores."
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 32170d87e3)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/amd/common/ac_nir_to_llvm.c
2018-01-25 02:30:50 +00:00
Bas Nieuwenhuizen
022cdd4eaa ac/nir: Use correct 32-bit component writemask for 64-bit SSBO stores.
Fixes: 91074bb11b "radv/ac: Implement Float64 SSBO stores."
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit f4211e6f93)
2018-01-25 02:26:30 +00:00
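Commit 022cdd4eaa has no body, so the following is only a guess at what the subject line implies: when a 64-bit value is stored as two 32-bit components, each set bit in the per-component writemask has to cover two bits in the 32-bit mask. The helper `widen_writemask` is hypothetical and is not the code in ac_nir_to_llvm.c.

```python
def widen_writemask(mask64, num_components):
    """Expand a per-64-bit-component writemask into a per-32-bit-component one."""
    mask32 = 0
    for i in range(num_components):
        if mask64 & (1 << i):
            # Component i occupies 32-bit components 2*i and 2*i + 1.
            mask32 |= 0b11 << (2 * i)
    return mask32


# Writing components x and z of a dvec3 touches 32-bit components 0, 1, 4, 5.
xz_mask = widen_writemask(0b101, 3)
# Writing only component y of a dvec2 touches 32-bit components 2 and 3.
y_mask = widen_writemask(0b10, 2)
```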
Timothy Arceri
a45a6ed808 ac: fix visit_ssa_undef() for doubles
V2: use LLVMIntTypeInContext()

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 3bccb5dba9)
2018-01-25 02:26:18 +00:00
Dave Airlie
a060dc27b0 ac/nir: account for view index in the user sgpr allocation.
The view index user sgpr wasn't being accounted for properly,
this refactors out the code to decide if it's required and then
uses that info to account for it.

Fixes: 180c1b924e (ac/nir: Add shader support for multiviews.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 3153d74207)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/amd/common/ac_nir_to_llvm.c
2018-01-25 02:24:37 +00:00
Timothy Arceri
78e1165645 ac: fix buffer overflow bug in 64bit SSBO loads
Fixes: 441ee1e65b "radv/ac: Implement Float64 SSBO loads"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit e2b9296146)
2018-01-25 02:13:19 +00:00
Samuel Thibault
ff06368950 glx: fix non-dri build
glXGetDriverConfig parameters do not provide a context to dynamically
check for the presence of the function, so the dispatcher directly calls
glXGetDriverConfig, but in non-dri builds dri_glx.c didn't provide
glXGetDriverConfig.

This change makes it just return NULL in that case.

Fixes: 84f764a759 "glxglvnddispatch: Add missing dispatch for GetDriverConfig"
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 47ac11bcf8)
2018-01-25 02:13:07 +00:00
Bas Nieuwenhuizen
ad764e365b ac/nir: Use instance_rate_inputs per attribute, not per variable.
This did the wrong thing if we had e.g. an array for which only some
of the attributes use the instance index. Tripped up some new CTS
tests.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 5a4dc28500)

Conflicts:
	src/amd/common/ac_nir_to_llvm.c
2018-01-25 02:09:22 +00:00
Jose Fonseca
473d665a4d svga: Prevent use after free.
Courtesy of clang static analyzer.

I was hunting for potential sources of memory corruption using Mesa with
a GL trace, and happened to find this (unrelated) issue.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit dcbb224c68)
2018-01-24 21:06:26 +00:00
Matthew Nicholls
93ffa56658 radv: restore previous stencil reference after depth-stencil clear
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
(cherry picked from commit 005375717b)
2018-01-24 20:33:57 +00:00
Jason Ekstrand
623d843692 i965: Set tiling on BOs imported with modifiers
We need this to ensure that GTT maps work on buffers we get from Vulkan
on the off chance that someone does a readpixels or something.  Soon, we
will be removing GTT maps from i965 entirely and this can be reverted.
Nonetheless, it's needed for stable.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 5048572352)
2018-01-24 20:33:57 +00:00
Jason Ekstrand
8ebfa265e2 i965/bufmgr: Add a create_from_prime_tiled function
This new function is an import and a set tiling in one go.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit b9e7b29705)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/drivers/dri/i965/brw_bufmgr.c
2018-01-24 20:33:00 +00:00
Jason Ekstrand
9b2ac06cd6 i965/miptree: Use the tiling from the modifier instead of the BO
This fixes a bug where we were taking the tiling from the BO regardless
of what the modifier said.  When we got images in from Vulkan where it
doesn't set the tiling on the BO, we would treat them as linear even
though the modifier expressly said to treat it as Y-tiled.
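
A rough sketch of the selection logic, under stated assumptions: the enums and example_pick_tiling are hypothetical stand-ins, not the i965 types, and falling back to the BO tiling when no modifier is given is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>

/* Hypothetical tiling and modifier enums, for illustration only. */
enum example_tiling {
	EXAMPLE_TILING_LINEAR,
	EXAMPLE_TILING_X,
	EXAMPLE_TILING_Y,
};

enum example_modifier {
	EXAMPLE_MOD_INVALID,   /* no modifier supplied by the importer */
	EXAMPLE_MOD_LINEAR,
	EXAMPLE_MOD_X_TILED,
	EXAMPLE_MOD_Y_TILED,
};

/* When a modifier is present it decides the tiling; the tiling recorded
 * on the BO is only consulted when there is no modifier (assumption). */
static enum example_tiling
example_pick_tiling(enum example_modifier mod, enum example_tiling bo_tiling)
{
	switch (mod) {
	case EXAMPLE_MOD_LINEAR:
		return EXAMPLE_TILING_LINEAR;
	case EXAMPLE_MOD_X_TILED:
		return EXAMPLE_TILING_X;
	case EXAMPLE_MOD_Y_TILED:
		return EXAMPLE_TILING_Y;
	case EXAMPLE_MOD_INVALID:
		return bo_tiling;
	}
	assert(!"unhandled modifier");
	return bo_tiling;
}
```

In the case the commit describes, the BO reports linear while the modifier says Y-tiled, and the modifier wins.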

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ad424b2243)
2018-01-24 20:20:27 +00:00
Jason Ekstrand
be2a7b6a28 i965/miptree: Add an explicit tiling parameter to create_for_bo
Otherwise, create_for_bo will just grab the tiling from the BO which is
not what we want when using modifiers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 0465dd13d2)
2018-01-24 20:20:27 +00:00
Bas Nieuwenhuizen
17647d08a5 radv: Don't allow 3d or 1d depth/stencil textures.
addrlib asserts when that happens, and supporting it is not
required, so let's not allow this for now.

It also asserts on fmask, but we don't have the number of samples here.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4584c4ef04)
2018-01-24 20:20:27 +00:00
Bas Nieuwenhuizen
635b9549dc radv: Init variant entry with memset.
This gets memcpy'd and written directly, and due to alignment, this
resulted in uninitialized gaps. This makes those gaps go away.
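
The pattern can be sketched like this. It is a minimal illustration with a hypothetical struct, not the radv variant entry; it assumes, as common compilers behave in practice, that storing to a member leaves already-zeroed padding untouched.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical entry: on typical ABIs there are padding bytes between
 * the uint8_t and the uint32_t because of alignment. */
struct example_entry {
	uint8_t  stage;
	uint32_t code_size;
};

/* Zero the whole struct, padding included, before filling the fields. */
static void example_entry_init(struct example_entry *entry,
                               uint8_t stage, uint32_t code_size)
{
	assert(entry != NULL);
	memset(entry, 0, sizeof(*entry));
	entry->stage = stage;
	entry->code_size = code_size;
}

/* Byte-wise comparison (or writing the raw bytes out) is only
 * deterministic when the padding has a known value. */
static int example_entries_equal(const struct example_entry *a,
                                 const struct example_entry *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}
```

Without the memset, two entries with identical fields could still differ byte for byte, and whatever happened to be in the gaps would be copied along with the fields.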

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8b98929074)
2018-01-24 20:20:27 +00:00
Bas Nieuwenhuizen
43d8d13377 radv: Fix bufimage failure deallocation.
The individual init parts don't clean up their own stuff on failure.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit fb0992e967)
2018-01-24 20:20:27 +00:00
Bas Nieuwenhuizen
1663b7edf0 radv: Fix fragment resolve init memory allocation failure paths.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 2c802ca66c)
2018-01-24 20:20:27 +00:00
Bas Nieuwenhuizen
f1c8bc6e85 radv: Fix freeing meta state if the device pipeline cache fails to allocate.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c685076ab0)
2018-01-24 20:20:27 +00:00
Bas Nieuwenhuizen
87d254b818 radv: Fix memory allocation failure path in compute resolve init.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 71f0315a88)
2018-01-24 20:20:27 +00:00
Bas Nieuwenhuizen
acca16e3fb radv: Fix ordering issue in meta memory allocation failure path.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d956e0bdf5)
2018-01-24 20:20:27 +00:00
Lucas Stach
cf807eff65 etnaviv: dirty TS state when framebuffer has changed
When switching between framebuffers with and without TS, the TS state
needs to be flushed to the command stream even if the derived state
isn't changed.

Fixes: 4ee7c2c284 ("etnaviv: enable TS, but disable autodisable")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit 29a0ea699a)
2018-01-24 20:20:27 +00:00
Grazvydas Ignotas
212a59e216 st/vdpau: release held lock in error path
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit e6abc613e2)
2018-01-24 20:20:27 +00:00
Kenneth Graunke
3cd9d65a1b i965: Bind null render targets for shadow sampling + color.
Portal 2 appears to bind RGBA8888_UNORM textures to a sampler2DShadow,
and calls shadow2D() on it.  This causes undefined behavior in OpenGL.

Unfortunately, our sampler appears to hang in this scenario, which is
not acceptable.  Just give them a null surface instead, which returns
all zeroes.

Fixes GPU hangs in Portal 2 on Kabylake.

Huge thanks to Jason Ekstrand for noticing this crazy behavior while
sifting through crash dumps.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104487
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 3e18c53e59)
2018-01-24 20:20:27 +00:00
Dave Airlie
14ebd7ecd9 r600/sb: fix a bug emitting ar load from a constant.
Some tess shaders were doing MOVA_INT _, c0.x on cayman, and then
hitting an assert in sb_bc_finalize.cpp:translate_kcache.

This makes sure the toplevel kcache tracker gets updated,
and the clause gets fixed up.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 68b976bd91)
2018-01-24 20:20:26 +00:00
Jason Ekstrand
48db8ed822 i965/miptree: Refactor CCS_E and CCS_D cases in render_aux_usage
This commit unifies the CCS_E and CCS_D cases.  This should fix a couple
of subtle issues.  One is that when you use INTEL_DEBUG=norbc to disable
CCS_E, we don't get the sRGB blending workaround.  By unifying the code,
we give CCS_D that workaround as well.

The second issue fixed by this refactor is that the blending workaround
appears to be enabled on all gens but really only applies on gen9.
Due to a happy accident in the way code was laid out, it was only
getting enabled on gen9: gen8 and earlier don't support non-zero-one
clear colors, and gen10 supports sRGB for CCS_E so it got caught in the
format_ccs_e_compat_with_miptree case.  This refactor moves it above the
format_ccs_e_compat_with_miptree case so it's an explicit early exit and
makes it explicitly only on gen9.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 361e1df1ed)
2018-01-24 20:20:15 +00:00
Jason Ekstrand
0b31126ba9 Re-enable regular fast-clears (CCS_D) on gen9+
This reverts commit ee57b15ec7, "i965:
Disable regular fast-clears (CCS_D) on gen9+".  Now that we've fixed the
issue with too many different aux usages in the render cache, it should
be safe to re-enable CCS_D for sRGB.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104163
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f79bb2e651)
[Emil Velikov: resolve trivial conflicts - gen10 is missing in branch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/drivers/dri/i965/brw_meta_util.c
2018-01-24 20:18:47 +00:00
Jason Ekstrand
d6bfb9c31a i965: Track format and aux usage in the render cache
This lets us perform render cache flushes whenever a surface goes from
being used with one aux+format to a different aux+format.

This is the "proper" fix for https://bugs.freedesktop.org/102435.
ee57b15ec7 which was really just a partial
revert of 3e57e9494c was just a hack to
get rid of a hang in a bunch of Valve games.  This solves the actual
problem responsible for the hang and lets us enable CCS_E once again.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102435
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d84275b884)
2018-01-24 20:15:03 +00:00
Jason Ekstrand
6fce0e2065 i965: Track the depth and render caches separately
Previously, we just had one hash set for tracking depth and render
caches called brw_context::render_cache.  This is less than ideal
because the depth and render caches are separate and we can't track
moves between the depth and the render caches.  This limitation led
to some unnecessary flushing around the depth cache.  There are cases
(mostly with BLORP) where we can end up touching a depth or stencil
buffer through the render cache.  To guard against this, blorp would
unconditionally do a render_cache_set_check_flush on its destination
which meant that if you did any rendering (including a BLORP operation)
to a given surface and then used it as a blorp destination, you would
end up flushing it out of the render cache before rendering into it.

Things get worse when you dig into the depth/stencil state code for
regular GL draw calls.  Because we may end up rendering to a depth
or stencil buffer via BLORP, we did a render_cache_set_check_flush on
all depth and stencil buffers in brw_emit_depthbuffer to ensure that
they got flushed out of the render cache prior to using them for depth
or stencil testing.  However, because we also need to track dirtiness
for depth and stencil so that we can implement depth and stencil
texturing correctly, we were adding all depth and stencil buffers to the
render cache set in brw_postdraw_set_buffers_need_resolve.  This meant
that, if anything caused 3DSTATE_DEPTH_BUFFER to get re-emitted
(currently _NEW_BUFFERS, BRW_NEW_BATCH, and BRW_NEW_BLORP), we would
almost always do a full pipeline stall and render/depth cache flush.

The root cause of both of these problems is that we can't tell the
difference between the render and depth caches in our tracking.  This
commit splits our cache tracking into two sets, one for render and one
for depth, and properly handles transitioning between the two.  We still
flush all the caches whenever anything needs to be flushed.  The idea is
that if we're going to take the hit of a flush and stall, we may as well
flush everything in the hopes that we can avoid a flush by something
else later.
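
The split tracking described above can be sketched roughly as follows. This is a simplified illustration and not the i965 code: buffers are plain integer handles, the sets are small fixed arrays, and all example_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_MAX_BOS 16

/* Hypothetical fixed-size set of buffer handles. */
struct example_set {
	int    bos[EXAMPLE_MAX_BOS];
	size_t count;
};

/* One set per cache, plus a counter standing in for the real flush. */
struct example_tracker {
	struct example_set render;
	struct example_set depth;
	unsigned           flushes;
};

static bool example_set_contains(const struct example_set *s, int bo)
{
	for (size_t i = 0; i < s->count; i++)
		if (s->bos[i] == bo)
			return true;
	return false;
}

static void example_set_add(struct example_set *s, int bo)
{
	if (example_set_contains(s, bo))
		return;
	assert(s->count < EXAMPLE_MAX_BOS);
	s->bos[s->count++] = bo;
}

/* As in the commit message: when anything must be flushed, flush
 * everything and forget both sets. */
static void example_flush_all(struct example_tracker *t)
{
	t->render.count = 0;
	t->depth.count = 0;
	t->flushes++;
}

/* Before rendering to bo: flush only if it is dirty in the depth cache. */
static void example_flush_for_render(struct example_tracker *t, int bo)
{
	if (example_set_contains(&t->depth, bo))
		example_flush_all(t);
}

/* Before depth/stencil use of bo: flush only if dirty in the render cache. */
static void example_flush_for_depth(struct example_tracker *t, int bo)
{
	if (example_set_contains(&t->render, bo))
		example_flush_all(t);
}

static void example_render_add(struct example_tracker *t, int bo)
{
	example_set_add(&t->render, bo);
}

static void example_depth_add(struct example_tracker *t, int bo)
{
	example_set_add(&t->depth, bo);
}
```

Using the same buffer for depth twice in a row causes no flush in this model; only moving it from one cache to the other does, which is the transition a single combined set could not distinguish.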

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit fb0e9b5197)
2018-01-24 20:14:47 +00:00
Jason Ekstrand
0bbd60f3e9 i965/blorp: Add more destination flushing
Right now we just always flush the destination for render and aren't
particularly careful about depth or stencil.  Soon, flush_for_render
isn't going to do the same thing as flush_for_depth and we may be doing
a good deal less depth flushing so we should be a bit more precise.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d6d0ac95d5)
2018-01-24 20:14:45 +00:00
Jason Ekstrand
6f5752dba7 i965: Add more precise cache tracking helpers
In theory, this will let us track the depth and render caches
separately.  Right now, they're just wrappers around
brw_render_cache_set_*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 4a09070295)
2018-01-24 20:11:50 +00:00
Jason Ekstrand
e66bafa973 i965: Call brw_cache_flush_for_render in predraw_resolve_framebuffer
This makes sure we flush things out of other caches prior to using a
surface through the render cache.  Currently, this is a no-op because GL
won't let you bind anything other than a color surface as color so it
should never end up in the depth cache.  However, this does complete the
flush/add_bo pair for regular drawing which will be required for the
next commit.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 622786c20c)
2018-01-24 19:27:19 +00:00
Grazvydas Ignotas
510f1b3cb9 st/va: release held locks in error paths
Found with the help of the following Coccinelle semantic patch:
// <smpl>
@@
expression E;
@@

  \(pthread_mutex_lock\|mtx_lock\|simple_mtx_lock\)(E)
  ...
(
  \(pthread_mutex_unlock\|mtx_unlock\|simple_mtx_unlock\)(E);
  ...
  return ...;
|
+ maybe need_unlock(E);
  return ...;
)
// </smpl>
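
The class of bug the semantic patch looks for, and one common way to avoid it, can be sketched as below. This is a generic illustration using pthreads with hypothetical names, not code from st/va.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical shared state guarded by a mutex. */
struct example_ctx {
	pthread_mutex_t lock;
	int             value;
};

/* Returns 0 on success or -EINVAL on bad input.  Every exit taken after
 * the lock is acquired goes through the single unlock at the end. */
static int example_update(struct example_ctx *ctx, int new_value)
{
	int ret = 0;

	assert(ctx != NULL);
	pthread_mutex_lock(&ctx->lock);

	if (new_value < 0) {
		ret = -EINVAL;
		goto out_unlock;   /* a bare "return" here would leave the lock held */
	}

	ctx->value = new_value;

out_unlock:
	pthread_mutex_unlock(&ctx->lock);
	return ret;
}
```

After a failed call the mutex can be taken again immediately, which is the property an early return in the error path would break.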

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 0ad73031ec)
2018-01-24 19:27:19 +00:00
Gert Wollny
694ed0d61a r600/shader: Initialize max_driver_temp_used correctly for the first time
Without this initialization the temp registers used in tgsi_declaration
may use random indices, and this may result in failing translation from TGSI
with an error message "GPR limit exceeded", because the random index is greater
than the allowed limit, implying that the shader uses more temporary registers than
available.

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 5d6470d26b)
2018-01-24 19:27:19 +00:00
Juan A. Suarez Romero
bc1503b13f docs: add sha256 checksums for 17.3.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-01-18 22:34:34 +01:00
73 changed files with 1258 additions and 352 deletions


@@ -1 +1 @@
17.3.3
17.3.4


@@ -4,10 +4,6 @@ ab0809e5529725bd0af6f7b6ce06415020b9d32e meson: fix strtof locale support check
# fixes: The commit addresses Meson which is explicitly disabled for 17.3
44fbbd6fd07e5784b05e08e762e54b6c71f95ab1 util: add mesa-sha1 test to meson
# stable: The commit is causing a regression
# (https://bugs.freedesktop.org/show_bug.cgi?id=103626)
18fde36ced4279f2577097a1a7d31b55f2f5f141 intel/fs: Use the original destination region for int MUL lowering
# stable: The commit addresses earlier commit 6132992cdb which did not land in
# branch
3d2b157e23c9d66df97d59be6efd1098878cc110 i965/fs: Use UW types when using V immediates
@@ -22,3 +18,78 @@ c1ff99fd70cd2ceb2cac4723e4fd5efc93834746 main: Clear shader program data wheneve
# fixes: The commit addresses earlier commit d50937f137 which did not land in
# branch
78a8b73e7d45f55ced98a148b26247d91f4e0171 vulkan/wsi: free cmd pools
# stable: The commit addresses earlier commit 6d87500fe12 which did not land in
# branch
525b4f7548462bfc2e82f2d1f04f61ce6854a3c5 i965: Accept CONTEXT_ATTRIB_PRIORITY for brwCreateContext
# stable: The commit depends on earlier commit a4be2bcee2 which did not land in
# branch
a29d63ecf71546c4798c609e37810f0ec81793d8 swr: refactor swr_create_screen to allow for proper cleanup on error
# stable: Explicit 18.0 only nominations
4b69ba381766cd911eb1284f1b0332a139ec8a75 anv/pipeline: Don't assert on more than 32 samplers
bc0a21e34811e0e1542236dbaf5fb1fa56bbb98c anv/cmd_state: Drop the scratch_size field
d6c9a89d1324ed2c723cbd3c6d8390691c58dfd2 anv/cmd_buffer: Get rid of the meta query workaround
cd3feea74582cea2d18306d167609f4fbe681bb3 anv/cmd_buffer: Rework anv_cmd_state_reset
ddc2d285484a1607f79ffeb2fc6c09367c6aea1f anv/cmd_buffer: Use some pre-existing pipeline temporaries
9af5379228d7be9c7ea41e0912a8770d28ead92b anv/cmd_buffer: Add substructs to anv_cmd_state for graphics and compute
d5592e2fdaa9ce8b98d38b2d29e2a7d2c4abda08 anv: Remove semicolons from vk_error[f] definitions
90cceaa9dd3b12e039a131a50c6866dce04e7fb2 anv/cmd_buffer: Refactor ensure_push_descriptor_set
b9e1ca16f84016f1d40efa9bfee89db48a7702b4 anv/cmd_buffer: Add a helper for binding descriptor sets
31b2144c836485ef6476bd455f1c02b96deafab7 anv/cmd_buffer: Use anv_descriptor_for_binding for samplers
97f96610c8b858267c121c0ad6ffc630e2aafc09 anv: Separate compute and graphics descriptor sets
e85aaec1489b00f24ebef4ae5b1da598091275e1 anv/cmd_buffer: Move dirty bits into anv_cmd_*_state
8bd5ec5b862333c936426ff18d093d07dd006182 anv/cmd_buffer: Move vb_dirty bits into anv_cmd_graphics_state
24caee8975355a2b54b41c484ff3c897e1911760 anv/cmd_buffer: Use a temporary variable for dynamic state
95ff2322948692f5f7b1d444aabe878fba53304c anv/cmd_buffer: Move dynamic state to graphics state
38ec78049f69821091a2d42b0f457a1b044d4273 anv/cmd_buffer: Move num_workgroups to compute state
4064fe59e7144fa822568543cfcc043387645d4e anv/cmd_buffer: Move gen7 index buffer state to graphics state
# fixes: The commit requires earlier commit 49d035122ee which did not land in
# branch
766589d89a211e67f313e8cb38f2d05b09975f96 radv: fix sample_mask_in loading. (v3.1)
# stable: The commits address the Meson build that is explicitly disabled in
# branch
c38c60a63c63b02d1030c6c349aa0a73105e10eb meson: fix BSD build
5781c3d1db4a01e77f416c1685025c4d830ae87d meson: correctly set SYSCONFDIR for loading dirrc
7c8cfe2d59bfc0dbf718a74b08b6dceaa84f7242 meson: fix missing dependencies
53f9131205a63fa8b282ab2a7e96c48209447da0 meson: fix getting cflags from pkg-config
8fae5eddd9982f4586d76471d0196befeb46de24 meson: handle LLVM 'x.x.xgit-revision' versionsi
# stable: The commit requires earlier commit 01ab218bbc which did not land in
# branch
0e879aad2fd1dac102c13d680edf455aa068d5df swr/rast: support llvm 3.9 type declarations
# stable: The commit requires earlier commit w41c36c45 which did not land in
# branch
49b0a140a731069e0e4959c65bfd1b597a4fb141 ac/nir: set amdgpu.uniform and invariant.load for UBOs
# stable: The commits address gen10 support which is missing in branch
ca19ee33d7d39cb89d948b1c983763065975ce5b i965/gen10: Ignore push constant packets during context restore.
78c125af3904c539ea69bec2dd9fdf7a5162854f anv/gen10: Ignore push constant packets during context restore.
bcfd78e4489f538e34138269650fc6cbe8c9d75f i965/gen10: Re-enable push constants.
# stable: The commits are explicit 18.0 nominations
17423c993d0b083c7a77a404b85788687f5efe36 winsys/amdgpu: fix assertion failure with UVD and VCE rings
e0e23ea69cab23b9193b1e7c568fd23fc7073071 r600/eg: construct proper rat mask for image/buffers.
# stable: The commits address the initial shader cache support which did not land in branch
28db950b51274ce296cd625db62abe935d1e4ed9 i965: fix prog_data leak in brw_disk_cache
b99c88037bf64b033579f237ec287857c53b0ad6 i965: fix disk_cache leak when destroying context
# stable: The commit covers nir serialise, which did not land in branch
d0343bef6680cc660ba691bbed31a2a1b7449f79 nir: mark unused space in packed_tex_data
# stable: The KHX extension is disabled all together in the stable branches.
bee9270853c34aa8e4b3d19a125608ee67c87b86 radv: Don't expose VK_KHX_multiview on android.
# fixes: The commit addresses the meson build, which is disabled in branch
4a0bab1d7f942ad0ac9b98ab34e6a9e4694f3c04 meson: libdrm shouldn't appear in Requires.private: if it wasn't found
16bf8138308008f4b889caa827a8291ff72745b8 meson/swr: re-shuffle generated files
bbef9474fa52d9aba06eeede52558fc5ccb762dd meson/swr: Updated copyright dates
d7235ef83b92175537e3b538634ffcff29bf0dce meson: Don't confuse the install and search paths for dri drivers
c75a4e5b465261e982ea31ef875325a3cc30e79d meson: Check for actual LLVM required versions
105178db8f5d7d45b268c7664388d7db90350704 meson: fix test source name for static glapi
c74719cf4adae2fa142e154bff56716427d3d992 glapi: fix check_table test for non-shared glapi with meson


@@ -1207,10 +1207,10 @@ AC_ARG_ENABLE([xa],
[enable_xa=no])
AC_ARG_ENABLE([gbm],
[AS_HELP_STRING([--enable-gbm],
[enable gbm library @<:@default=yes except cygwin@:>@])],
[enable gbm library @<:@default=yes except cygwin and macOS@:>@])],
[enable_gbm="$enableval"],
[case "$host_os" in
cygwin*)
cygwin* | darwin*)
enable_gbm=no
;;
*)
@@ -1535,7 +1535,7 @@ fi
AC_ARG_ENABLE([driglx-direct],
[AS_HELP_STRING([--disable-driglx-direct],
[disable direct rendering in GLX and EGL for DRI \
@<:@default=auto@:>@])],
@<:@default=enabled@:>@])],
[driglx_direct="$enableval"],
[driglx_direct="yes"])
@@ -2711,6 +2711,18 @@ if test "x$enable_llvm" = xyes; then
fi
fi
fi
dnl The gallium-xlib GLX and gallium OSMesa targets directly embed the
dnl swr/llvmpipe driver into the final binary. Adding LLVM_LIBS results in
dnl the LLVM library propagated in the Libs.private of the respective .pc
dnl file which ensures complete dependency information when statically
dnl linking.
if test "x$enable_glx" == xgallium-xlib; then
GL_PC_LIB_PRIV="$GL_PC_LIB_PRIV $LLVM_LIBS"
fi
if test "x$enable_gallium_osmesa" = xyes; then
OSMESA_PC_LIB_PRIV="$OSMESA_PC_LIB_PRIV $LLVM_LIBS"
fi
fi
AM_CONDITIONAL(HAVE_GALLIUM_SVGA, test "x$HAVE_GALLIUM_SVGA" = xyes)


@@ -31,7 +31,8 @@ because compatibility contexts are not supported.
<h2>SHA256 checksums</h2>
<pre>
TBD
c733d37a161501cd81dc9b309ccb613753b98eafc6d35e0847548a6642749772 mesa-17.3.3.tar.gz
41bac5de0ef6adc1f41a1ec0f80c19e361298ce02fa81b5f9ba4fdca33a9379b mesa-17.3.3.tar.xz
</pre>

docs/relnotes/17.3.4.html Normal file

@@ -0,0 +1,274 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.4 Release Notes / January 15, 2018</h1>
<p>
Mesa 17.3.4 is a bug fix release which fixes bugs found since the 17.3.3 release.
</p>
<p>
Mesa 17.3.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90311">Bug 90311</a> - Fail to build libglx with clang at linking stage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101442">Bug 101442</a> - Piglit shaders&#64;ssa&#64;fs-if-def-else-break fails with sb but passes with R600_DEBUG=nosb</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102435">Bug 102435</a> - [skl,kbl] [drm] GPU HANG: ecode 9:0:0x86df7cf9, in csgo_linux64 [4947], reason: Hang on rcs, action: reset</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103006">Bug 103006</a> - [OpenGL CTS] [HSW] KHR-GL45.vertex_attrib_binding.basic-inputL-case1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - [SNB] ES3-CTS.functional.shaders.precision</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104163">Bug 104163</a> - [GEN9+] 2-3% perf drop in GfxBench Manhattan 3.1 from &quot;i965: Disable regular fast-clears (CCS_D) on gen9+&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104487">Bug 104487</a> - [KBL] portal2_linux GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104711">Bug 104711</a> - [skl CCS] Oxenfree (unity engine game) hangs GPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104741">Bug 104741</a> - Graphic corruption for Android apps Telegram and KineMaster</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104745">Bug 104745</a> - HEVC VDPAU decoding broken on RX 460 with UVD Firmware v1.130</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104818">Bug 104818</a> - mesa fails to build on ia64</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (1):</p>
<ul>
<li>i965: perform 2 uploads with dual slot *64*PASSTHRU formats on gen&lt;8</li>
</ul>
<p>Bas Nieuwenhuizen (10):</p>
<ul>
<li>radv: Fix ordering issue in meta memory allocation failure path.</li>
<li>radv: Fix memory allocation failure path in compute resolve init.</li>
<li>radv: Fix freeing meta state if the device pipeline cache fails to allocate.</li>
<li>radv: Fix fragment resolve init memory allocation failure paths.</li>
<li>radv: Fix bufimage failure deallocation.</li>
<li>radv: Init variant entry with memset.</li>
<li>radv: Don't allow 3d or 1d depth/stencil textures.</li>
<li>ac/nir: Use instance_rate_inputs per attribute, not per variable.</li>
<li>ac/nir: Use correct 32-bit component writemask for 64-bit SSBO stores.</li>
<li>ac/nir: Fix vector extraction if source vector has &gt;4 elements.</li>
</ul>
<p>Boyuan Zhang (2):</p>
<ul>
<li>radeon/vcn: add and manage render picture list</li>
<li>radeon/uvd: add and manage render picture list</li>
</ul>
<p>Chuck Atkins (1):</p>
<ul>
<li>configure.ac: add missing llvm dependencies to .pc files</li>
</ul>
<p>Dave Airlie (10):</p>
<ul>
<li>r600/sb: fix a bug emitting ar load from a constant.</li>
<li>ac/nir: account for view index in the user sgpr allocation.</li>
<li>radv: add fs_key meta format support to resolve passes.</li>
<li>radv: don't use hw resolve for integer image formats</li>
<li>radv: don't use hw resolves for r16g16 norm formats.</li>
<li>radv: move spi_baryc_cntl to pipeline</li>
<li>r600/sb: insert the else clause when we might depart from a loop</li>
<li>radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)</li>
<li>radv/gfx9: fix block compression texture views. (v2)</li>
<li>virgl: also remove dimension on indirect.</li>
</ul>
<p>Eleni Maria Stea (1):</p>
<ul>
<li>mesa: Fix function pointers initialization in status tracker</li>
</ul>
<p>Emil Velikov (18):</p>
<ul>
<li>cherry-ignore: i965: Accept CONTEXT_ATTRIB_PRIORITY for brwCreateContext</li>
<li>cherry-ignore: swr: refactor swr_create_screen to allow for proper cleanup on error</li>
<li>cherry-ignore: anv: add explicit 18.0 only nominations</li>
<li>cherry-ignore: radv: fix sample_mask_in loading. (v3.1)</li>
<li>cherry-ignore: meson: multiple fixes</li>
<li>cherry-ignore: swr/rast: support llvm 3.9 type declarations</li>
<li>Revert "cherry-ignore: intel/fs: Use the original destination region for int MUL lowering"</li>
<li>cherry-ignore: ac/nir: set amdgpu.uniform and invariant.load for UBOs</li>
<li>cherry-ignore: add gen10 fixes</li>
<li>cherry-ignore: add r600/amdgpu 18.0 nominations</li>
<li>cherry-ignore: add i965 shader cache fixes</li>
<li>cherry-ignore: nir: mark unused space in packed_tex_data</li>
<li>radv: Stop advertising VK_KHX_multiview</li>
<li>cherry-ignore: radv: Don't expose VK_KHX_multiview on android.</li>
<li>configure.ac: correct driglx-direct help text</li>
<li>cherry-ignore: add meson fix</li>
<li>cherry-ignore: add a few more meson fixes</li>
<li>Update version to 17.3.4</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>radeon: remove left over dead code</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>r600/shader: Initialize max_driver_temp_used correctly for the first time</li>
</ul>
<p>Grazvydas Ignotas (2):</p>
<ul>
<li>st/va: release held locks in error paths</li>
<li>st/vdpau: release held lock in error path</li>
</ul>
<p>Igor Gnatenko (1):</p>
<ul>
<li>link mesautil with pthreads</li>
</ul>
<p>Indrajit Das (4):</p>
<ul>
<li>st/omx_bellagio: Update default intra matrix per MPEG2 spec</li>
<li>radeon/uvd: update quantiser matrices only when requested</li>
<li>radeon/vcn: update quantiser matrices only when requested</li>
<li>st/va: clear pointers for mpeg2 quantiser matrices</li>
</ul>
<p>Jason Ekstrand (19):</p>
<ul>
<li>i965: Call brw_cache_flush_for_render in predraw_resolve_framebuffer</li>
<li>i965: Add more precise cache tracking helpers</li>
<li>i965/blorp: Add more destination flushing</li>
<li>i965: Track the depth and render caches separately</li>
<li>i965: Track format and aux usage in the render cache</li>
<li>Re-enable regular fast-clears (CCS_D) on gen9+</li>
<li>i965/miptree: Refactor CCS_E and CCS_D cases in render_aux_usage</li>
<li>i965/miptree: Add an explicit tiling parameter to create_for_bo</li>
<li>i965/miptree: Use the tiling from the modifier instead of the BO</li>
<li>i965/bufmgr: Add a create_from_prime_tiled function</li>
<li>i965: Set tiling on BOs imported with modifiers</li>
<li>i965/miptree: Take an aux_usage in prepare/finish_render</li>
<li>i965/miptree: Add an aux_disabled parameter to render_aux_usage</li>
<li>i965/surface_state: Drop brw_aux_surface_disabled</li>
<li>intel/fs: Use the original destination region for int MUL lowering</li>
<li>anv/pipeline: Don't look at blend state unless we have an attachment</li>
<li>anv/cmd_buffer: Re-emit the pipeline at every subpass</li>
<li>anv: Stop advertising VK_KHX_multiview</li>
<li>i965: Call prepare_external after implicit window-system MSAA resolves</li>
</ul>
<p>Jon Turney (3):</p>
<ul>
<li>configure: Default to gbm=no on osx</li>
<li>glx/apple: include util/debug.h for env_var_as_boolean prototype</li>
<li>glx/apple: locate dispatch table functions to wrap by name</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>svga: Prevent use after free.</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.3.3</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Bind null render targets for shadow sampling + color.</li>
<li>i965: Bump official kernel requirement to Linux v3.9.</li>
</ul>
<p>Lucas Stach (2):</p>
<ul>
<li>etnaviv: dirty TS state when framebuffer has changed</li>
<li>renderonly: fix dumb BO allocation for non 32bpp formats</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi: don't ignore pitch for imported textures</li>
</ul>
<p>Matthew Nicholls (2):</p>
<ul>
<li>radv: restore previous stencil reference after depth-stencil clear</li>
<li>radv: remove predication on cache flushes</li>
</ul>
<p>Maxin B. John (1):</p>
<ul>
<li>anv_icd.py: improve reproducible builds</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>winsys/radeon: Compute is_displayable in surf_drm_to_winsys</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>r600: don't do stack workarounds for hemlock</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: create pipeline layout objects for all meta operations</li>
</ul>
<p>Samuel Thibault (1):</p>
<ul>
<li>glx: fix non-dri build</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>ac: fix buffer overflow bug in 64bit SSBO loads</li>
<li>ac: fix visit_ssa_undef() for doubles</li>
</ul>
</div>
</body>
</html>


@@ -562,7 +562,30 @@ struct user_sgpr_info {
bool indirect_all_descriptor_sets;
};
static bool needs_view_index_sgpr(struct nir_to_llvm_context *ctx,
gl_shader_stage stage)
{
switch (stage) {
case MESA_SHADER_VERTEX:
if (ctx->shader_info->info.needs_multiview_view_index ||
(!ctx->options->key.vs.as_es && !ctx->options->key.vs.as_ls && ctx->options->key.has_multiview_view_index))
return true;
break;
case MESA_SHADER_TESS_EVAL:
if (ctx->shader_info->info.needs_multiview_view_index || (!ctx->options->key.tes.as_es && ctx->options->key.has_multiview_view_index))
return true;
case MESA_SHADER_GEOMETRY:
case MESA_SHADER_TESS_CTRL:
if (ctx->shader_info->info.needs_multiview_view_index)
return true;
default:
break;
}
return false;
}
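
Reviewer sketch, not part of the patch: the helper above centralizes one stage-dependent predicate so that the SGPR count and the argument declarations are driven by the same answer. A minimal standalone model follows, assuming a trimmed-down key; every `sketch_` / `SKETCH_` name is hypothetical and stands in for the driver's real types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the driver's stage enum and
 * shader key; only the fields the predicate reads are modeled. */
enum sketch_stage {
	SKETCH_VERTEX,
	SKETCH_TESS_CTRL,
	SKETCH_TESS_EVAL,
	SKETCH_GEOMETRY,
	SKETCH_FRAGMENT,
};

struct sketch_key {
	bool shader_reads_view_index; /* models info.needs_multiview_view_index */
	bool has_multiview;           /* models key.has_multiview_view_index */
	bool as_es;                   /* stage is compiled to feed a geometry shader */
	bool as_ls;                   /* vertex stage is compiled to feed tessellation */
};

static bool sketch_needs_view_index(const struct sketch_key *k,
				    enum sketch_stage stage)
{
	assert(k != NULL);
	switch (stage) {
	case SKETCH_VERTEX:
		return k->shader_reads_view_index ||
		       (!k->as_es && !k->as_ls && k->has_multiview);
	case SKETCH_TESS_EVAL:
		return k->shader_reads_view_index ||
		       (!k->as_es && k->has_multiview);
	case SKETCH_GEOMETRY:
	case SKETCH_TESS_CTRL:
		return k->shader_reads_view_index;
	default:
		return false;
	}
}

/* The same answer feeds the SGPR budget, so the count and the declared
 * arguments cannot drift apart. */
static unsigned sketch_count_user_sgprs(unsigned base, bool needs_view_index)
{
	return base + (needs_view_index ? 1u : 0u);
}
```

With multiview enabled and a shader that never reads the view index, the vertex stage still gets the register unless it runs as ES or LS, while the geometry stage does not.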
static void allocate_user_sgprs(struct nir_to_llvm_context *ctx,
bool needs_view_index,
struct user_sgpr_info *user_sgpr_info)
{
memset(user_sgpr_info, 0, sizeof(struct user_sgpr_info));
@@ -616,6 +639,9 @@ static void allocate_user_sgprs(struct nir_to_llvm_context *ctx,
break;
}
if (needs_view_index)
user_sgpr_info->sgpr_count++;
if (ctx->shader_info->info.needs_push_constants)
user_sgpr_info->sgpr_count += 2;
@@ -745,8 +771,8 @@ static void create_function(struct nir_to_llvm_context *ctx,
struct user_sgpr_info user_sgpr_info;
struct arg_info args = {};
LLVMValueRef desc_sets;
allocate_user_sgprs(ctx, &user_sgpr_info);
bool needs_view_index = needs_view_index_sgpr(ctx, stage);
allocate_user_sgprs(ctx, needs_view_index, &user_sgpr_info);
if (user_sgpr_info.need_ring_offsets && !ctx->options->supports_spill) {
add_user_sgpr_argument(&args, const_array(ctx->v4i32, 16), &ctx->ring_offsets); /* address of rings */
@@ -764,7 +790,7 @@ static void create_function(struct nir_to_llvm_context *ctx,
case MESA_SHADER_VERTEX:
radv_define_common_user_sgprs_phase1(ctx, stage, has_previous_stage, previous_stage, &user_sgpr_info, &args, &desc_sets);
radv_define_vs_user_sgprs_phase1(ctx, stage, has_previous_stage, previous_stage, &args);
if (ctx->shader_info->info.needs_multiview_view_index || (!ctx->options->key.vs.as_es && !ctx->options->key.vs.as_ls && ctx->options->key.has_multiview_view_index))
if (needs_view_index)
add_user_sgpr_argument(&args, ctx->i32, &ctx->view_index);
if (ctx->options->key.vs.as_es)
add_sgpr_argument(&args, ctx->i32, &ctx->es2gs_offset); // es2gs offset
@@ -796,7 +822,7 @@ static void create_function(struct nir_to_llvm_context *ctx,
add_user_sgpr_argument(&args, ctx->i32, &ctx->tcs_out_offsets); // tcs out offsets
add_user_sgpr_argument(&args, ctx->i32, &ctx->tcs_out_layout); // tcs out layout
add_user_sgpr_argument(&args, ctx->i32, &ctx->tcs_in_layout); // tcs in layout
if (ctx->shader_info->info.needs_multiview_view_index)
if (needs_view_index)
add_user_sgpr_argument(&args, ctx->i32, &ctx->view_index);
add_vgpr_argument(&args, ctx->i32, &ctx->tcs_patch_id); // patch id
@@ -811,7 +837,7 @@ static void create_function(struct nir_to_llvm_context *ctx,
add_user_sgpr_argument(&args, ctx->i32, &ctx->tcs_out_offsets); // tcs out offsets
add_user_sgpr_argument(&args, ctx->i32, &ctx->tcs_out_layout); // tcs out layout
add_user_sgpr_argument(&args, ctx->i32, &ctx->tcs_in_layout); // tcs in layout
if (ctx->shader_info->info.needs_multiview_view_index)
if (needs_view_index)
add_user_sgpr_argument(&args, ctx->i32, &ctx->view_index);
add_sgpr_argument(&args, ctx->i32, &ctx->oc_lds); // param oc lds
add_sgpr_argument(&args, ctx->i32, &ctx->tess_factor_offset); // tess factor offset
@@ -822,8 +848,9 @@ static void create_function(struct nir_to_llvm_context *ctx,
case MESA_SHADER_TESS_EVAL:
radv_define_common_user_sgprs_phase1(ctx, stage, has_previous_stage, previous_stage, &user_sgpr_info, &args, &desc_sets);
add_user_sgpr_argument(&args, ctx->i32, &ctx->tcs_offchip_layout); // tcs offchip layout
if (ctx->shader_info->info.needs_multiview_view_index || (!ctx->options->key.tes.as_es && ctx->options->key.has_multiview_view_index))
if (needs_view_index)
add_user_sgpr_argument(&args, ctx->i32, &ctx->view_index);
if (ctx->options->key.tes.as_es) {
add_sgpr_argument(&args, ctx->i32, &ctx->oc_lds); // OC LDS
add_sgpr_argument(&args, ctx->i32, NULL); //
@@ -855,7 +882,7 @@ static void create_function(struct nir_to_llvm_context *ctx,
radv_define_vs_user_sgprs_phase1(ctx, stage, has_previous_stage, previous_stage, &args);
add_user_sgpr_argument(&args, ctx->i32, &ctx->gsvs_ring_stride); // gsvs stride
add_user_sgpr_argument(&args, ctx->i32, &ctx->gsvs_num_entries); // gsvs num entires
if (ctx->shader_info->info.needs_multiview_view_index)
if (needs_view_index)
add_user_sgpr_argument(&args, ctx->i32, &ctx->view_index);
add_vgpr_argument(&args, ctx->i32, &ctx->gs_vtx_offset[0]); // vtx01
@@ -880,7 +907,7 @@ static void create_function(struct nir_to_llvm_context *ctx,
radv_define_vs_user_sgprs_phase1(ctx, stage, has_previous_stage, previous_stage, &args);
add_user_sgpr_argument(&args, ctx->i32, &ctx->gsvs_ring_stride); // gsvs stride
add_user_sgpr_argument(&args, ctx->i32, &ctx->gsvs_num_entries); // gsvs num entires
if (ctx->shader_info->info.needs_multiview_view_index)
if (needs_view_index)
add_user_sgpr_argument(&args, ctx->i32, &ctx->view_index);
add_sgpr_argument(&args, ctx->i32, &ctx->gs2vs_offset); // gs2vs offset
add_sgpr_argument(&args, ctx->i32, &ctx->gs_wave_id); // wave id
@@ -2343,6 +2370,46 @@ static LLVMValueRef visit_get_buffer_size(struct ac_nir_context *ctx,
return get_buffer_size(ctx, desc, false);
}
static uint32_t widen_mask(uint32_t mask, unsigned multiplier)
{
uint32_t new_mask = 0;
for(unsigned i = 0; i < 32 && (1u << i) <= mask; ++i)
if (mask & (1u << i))
new_mask |= ((1u << multiplier) - 1u) << (i * multiplier);
return new_mask;
}
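
Reviewer sketch, not part of the patch: `widen_mask` can be tried in isolation. The copy below mirrors the helper above; the only addition is an assert (my own, for this sketch) that keeps `1u << multiplier` well-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Standalone copy of the helper above. Each set bit i of mask becomes a
 * run of `multiplier` set bits starting at bit i * multiplier. */
static uint32_t widen_mask(uint32_t mask, unsigned multiplier)
{
	uint32_t new_mask = 0;

	/* Added for this sketch only: guards the shift by multiplier. */
	assert(multiplier >= 1 && multiplier < 32);

	for (unsigned i = 0; i < 32 && (1u << i) <= mask; ++i)
		if (mask & (1u << i))
			new_mask |= ((1u << multiplier) - 1u) << (i * multiplier);
	return new_mask;
}
```

With a multiplier of 2, a writemask of `0x5` (components 0 and 2) becomes `0x33`, which is the mask a 64-bit store needs once each component is viewed as two dwords.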
static LLVMValueRef extract_vector_range(struct ac_llvm_context *ctx, LLVMValueRef src,
unsigned start, unsigned count)
{
LLVMTypeRef type = LLVMTypeOf(src);
if (LLVMGetTypeKind(type) != LLVMVectorTypeKind) {
assert(start == 0);
assert(count == 1);
return src;
}
unsigned src_elements = LLVMGetVectorSize(type);
assert(start < src_elements);
assert(start + count <= src_elements);
if (start == 0 && count == src_elements)
return src;
if (count == 1)
return LLVMBuildExtractElement(ctx->builder, src, LLVMConstInt(ctx->i32, start, false), "");
assert(count <= 8);
LLVMValueRef indices[8];
for (unsigned i = 0; i < count; ++i)
indices[i] = LLVMConstInt(ctx->i32, start + i, false);
LLVMValueRef swizzle = LLVMConstVector(indices, count);
return LLVMBuildShuffleVector(ctx->builder, src, src, swizzle, "");
}
static void visit_store_ssbo(struct ac_nir_context *ctx,
nir_intrinsic_instr *instr)
{
@@ -2365,6 +2432,8 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
if (components_32bit > 1)
data_type = LLVMVectorType(ctx->ac.f32, components_32bit);
writemask = widen_mask(writemask, elem_size_mult);
base_data = ac_to_float(&ctx->ac, src_data);
base_data = trim_vector(&ctx->ac, base_data, instr->num_components);
base_data = LLVMBuildBitCast(ctx->ac.builder, base_data,
@@ -2374,7 +2443,7 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
int start, count;
LLVMValueRef data;
LLVMValueRef offset;
LLVMValueRef tmp;
u_bit_scan_consecutive_range(&writemask, &start, &count);
/* Due to an LLVM limitation, split 3-element writes
@@ -2384,9 +2453,6 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
count = 2;
}
start *= elem_size_mult;
count *= elem_size_mult;
if (count > 4) {
writemask |= ((1u << (count - 4)) - 1u) << (start + 4);
count = 4;
@@ -2394,30 +2460,14 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
if (count == 4) {
store_name = "llvm.amdgcn.buffer.store.v4f32";
data = base_data;
} else if (count == 2) {
LLVMTypeRef v2f32 = LLVMVectorType(ctx->ac.f32, 2);
tmp = LLVMBuildExtractElement(ctx->ac.builder,
base_data, LLVMConstInt(ctx->ac.i32, start, false), "");
data = LLVMBuildInsertElement(ctx->ac.builder, LLVMGetUndef(v2f32), tmp,
ctx->ac.i32_0, "");
tmp = LLVMBuildExtractElement(ctx->ac.builder,
base_data, LLVMConstInt(ctx->ac.i32, start + 1, false), "");
data = LLVMBuildInsertElement(ctx->ac.builder, data, tmp,
ctx->ac.i32_1, "");
store_name = "llvm.amdgcn.buffer.store.v2f32";
} else {
assert(count == 1);
if (get_llvm_num_components(base_data) > 1)
data = LLVMBuildExtractElement(ctx->ac.builder, base_data,
LLVMConstInt(ctx->ac.i32, start, false), "");
else
data = base_data;
store_name = "llvm.amdgcn.buffer.store.f32";
}
data = extract_vector_range(&ctx->ac, base_data, start, count);
offset = base_offset;
if (start != 0) {
@@ -2527,8 +2577,11 @@ static LLVMValueRef visit_load_buffer(struct ac_nir_context *ctx,
i1false,
};
results[i] = ac_build_intrinsic(&ctx->ac, load_name, data_type, params, 5, 0);
int idx = i;
if (instr->dest.ssa.bit_size == 64)
idx = i > 1 ? 1 : 0;
results[idx] = ac_build_intrinsic(&ctx->ac, load_name, data_type, params, 5, 0);
}
LLVMValueRef ret = results[0];
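
Reviewer sketch, not part of the patch: the index remapping in this hunk can be modeled on its own. The loop around it is not visible here, so the sketch assumes that results land in a two-entry array and that the loop counter advances by up to four dwords per pass, which would put a 64-bit load at `i == 4` on its second pass. `sketch_result_slot` and `SKETCH_NUM_RESULTS` are hypothetical names.

```c
#include <assert.h>

/* Assumed size of the results array; not shown in the hunk. */
enum { SKETCH_NUM_RESULTS = 2 };

/* Maps the loop counter to a slot in the results array. For 64-bit
 * destinations, any second pass is folded onto slot 1. */
static int sketch_result_slot(int i, unsigned bit_size)
{
	int idx = i;

	if (bit_size == 64)
		idx = i > 1 ? 1 : 0;

	/* The slot must stay inside the array. */
	assert(idx >= 0 && idx < SKETCH_NUM_RESULTS);
	return idx;
}
```

Under those assumptions, indexing with the raw counter would write past the array on the second pass of a 64-bit load, while the remapped index stays at 0 or 1.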
@@ -3177,17 +3230,12 @@ visit_store_var(struct ac_nir_context *ctx,
NULL, NULL, &const_index, &indir_index);
if (get_elem_bits(&ctx->ac, LLVMTypeOf(src)) == 64) {
int old_writemask = writemask;
src = LLVMBuildBitCast(ctx->ac.builder, src,
LLVMVectorType(ctx->ac.f32, get_llvm_num_components(src) * 2),
"");
writemask = 0;
for (unsigned chan = 0; chan < 4; chan++) {
if (old_writemask & (1 << chan))
writemask |= 3u << (2 * chan);
}
writemask = widen_mask(writemask, 2);
}
switch (instr->variables[0]->var->data.mode) {
@@ -4901,12 +4949,13 @@ static void visit_ssa_undef(struct ac_nir_context *ctx,
const nir_ssa_undef_instr *instr)
{
unsigned num_components = instr->def.num_components;
LLVMTypeRef type = LLVMIntTypeInContext(ctx->ac.context, instr->def.bit_size);
LLVMValueRef undef;
if (num_components == 1)
undef = LLVMGetUndef(ctx->ac.i32);
undef = LLVMGetUndef(type);
else {
undef = LLVMGetUndef(LLVMVectorType(ctx->ac.i32, num_components));
undef = LLVMGetUndef(LLVMVectorType(type, num_components));
}
_mesa_hash_table_insert(ctx->defs, &instr->def, undef);
}
@@ -5067,16 +5116,16 @@ handle_vs_input_decl(struct nir_to_llvm_context *ctx,
variable->data.driver_location = idx * 4;
if (ctx->options->key.vs.instance_rate_inputs & (1u << index)) {
buffer_index = LLVMBuildAdd(ctx->builder, ctx->abi.instance_id,
ctx->abi.start_instance, "");
ctx->shader_info->vs.vgpr_comp_cnt = MAX2(3,
ctx->shader_info->vs.vgpr_comp_cnt);
} else
buffer_index = LLVMBuildAdd(ctx->builder, ctx->abi.vertex_id,
ctx->abi.base_vertex, "");
for (unsigned i = 0; i < attrib_count; ++i, ++idx) {
if (ctx->options->key.vs.instance_rate_inputs & (1u << (index + 1))) {
buffer_index = LLVMBuildAdd(ctx->builder, ctx->abi.instance_id,
ctx->abi.start_instance, "");
ctx->shader_info->vs.vgpr_comp_cnt =
MAX2(3, ctx->shader_info->vs.vgpr_comp_cnt);
} else
buffer_index = LLVMBuildAdd(ctx->builder, ctx->abi.vertex_id,
ctx->abi.base_vertex, "");
t_offset = LLVMConstInt(ctx->i32, index + i, false);
t_list = ac_build_load_to_sgpr(&ctx->ac, t_list_ptr, t_offset);


@@ -380,7 +380,7 @@ radv_cmd_buffer_after_draw(struct radv_cmd_buffer *cmd_buffer)
flags = RADV_CMD_FLAG_PS_PARTIAL_FLUSH |
RADV_CMD_FLAG_CS_PARTIAL_FLUSH;
si_cs_emit_cache_flush(cmd_buffer->cs, false,
si_cs_emit_cache_flush(cmd_buffer->cs,
cmd_buffer->device->physical_device->rad_info.chip_class,
NULL, 0,
radv_cmd_buffer_uses_mec(cmd_buffer),
@@ -919,7 +919,6 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
{
struct radv_shader_variant *ps;
uint64_t va;
unsigned spi_baryc_cntl = S_0286E0_FRONT_FACE_ALL_BITS(1);
struct radv_blend_state *blend = &pipeline->graphics.blend;
assert (pipeline->shaders[MESA_SHADER_FRAGMENT]);
@@ -941,13 +940,10 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
radeon_set_context_reg(cmd_buffer->cs, R_0286D0_SPI_PS_INPUT_ADDR,
ps->config.spi_ps_input_addr);
if (ps->info.info.ps.force_persample)
spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(2);
radeon_set_context_reg(cmd_buffer->cs, R_0286D8_SPI_PS_IN_CONTROL,
S_0286D8_NUM_INTERP(ps->info.fs.num_interp));
radeon_set_context_reg(cmd_buffer->cs, R_0286E0_SPI_BARYC_CNTL, spi_baryc_cntl);
radeon_set_context_reg(cmd_buffer->cs, R_0286E0_SPI_BARYC_CNTL, pipeline->graphics.spi_baryc_cntl);
radeon_set_context_reg(cmd_buffer->cs, R_028710_SPI_SHADER_Z_FORMAT,
pipeline->graphics.shader_z_format);


@@ -1119,13 +1119,15 @@ VkResult radv_CreateDevice(
result = radv_CreatePipelineCache(radv_device_to_handle(device),
&ci, NULL, &pc);
if (result != VK_SUCCESS)
goto fail;
goto fail_meta;
device->mem_cache = radv_pipeline_cache_from_handle(pc);
*pDevice = radv_device_to_handle(device);
return VK_SUCCESS;
fail_meta:
radv_device_finish_meta(device);
fail:
if (device->trace_bo)
device->ws->buffer_destroy(device->trace_bo);
@@ -1688,7 +1690,6 @@ radv_get_preamble_cs(struct radv_queue *queue,
if (i == 0) {
si_cs_emit_cache_flush(cs,
false,
queue->device->physical_device->rad_info.chip_class,
NULL, 0,
queue->queue_family_index == RING_COMPUTE &&
@@ -1700,7 +1701,6 @@ radv_get_preamble_cs(struct radv_queue *queue,
RADV_CMD_FLAG_INV_GLOBAL_L2);
} else if (i == 1) {
si_cs_emit_cache_flush(cs,
false,
queue->device->physical_device->rad_info.chip_class,
NULL, 0,
queue->queue_family_index == RING_COMPUTE &&


@@ -76,7 +76,7 @@ EXTENSIONS = [
Extension('VK_KHR_wayland_surface', 6, 'VK_USE_PLATFORM_WAYLAND_KHR'),
Extension('VK_KHR_xcb_surface', 6, 'VK_USE_PLATFORM_XCB_KHR'),
Extension('VK_KHR_xlib_surface', 6, 'VK_USE_PLATFORM_XLIB_KHR'),
Extension('VK_KHX_multiview', 1, True),
Extension('VK_KHX_multiview', 1, False),
Extension('VK_EXT_global_priority', 1, 'device->rad_info.has_ctx_priority'),
Extension('VK_AMD_draw_indirect_count', 1, True),
Extension('VK_AMD_rasterization_order', 1, 'device->rad_info.chip_class >= VI && device->rad_info.max_se >= 2'),


@@ -1063,6 +1063,9 @@ static VkResult radv_get_image_format_properties(struct radv_physical_device *ph
if (format_feature_flags == 0)
goto unsupported;
if (info->type != VK_IMAGE_TYPE_2D && vk_format_is_depth_or_stencil(info->format))
goto unsupported;
switch (info->type) {
default:
unreachable("bad vkimage type\n");


@@ -116,7 +116,8 @@ radv_init_surface(struct radv_device *device,
pCreateInfo->mipLevels <= 1 &&
device->physical_device->rad_info.chip_class >= VI &&
((pCreateInfo->format == VK_FORMAT_D32_SFLOAT ||
pCreateInfo->format == VK_FORMAT_D32_SFLOAT_S8_UINT) ||
/* for some reason TC compat with 4/8 samples breaks some cts tests - disable for now */
(pCreateInfo->samples < 4 && pCreateInfo->format == VK_FORMAT_D32_SFLOAT_S8_UINT)) ||
(device->physical_device->rad_info.chip_class >= GFX9 &&
pCreateInfo->format == VK_FORMAT_D16_UNORM)))
surface->flags |= RADEON_SURF_TC_COMPATIBLE_HTILE;
@@ -1047,10 +1048,55 @@ radv_image_view_init(struct radv_image_view *iview,
}
if (iview->vk_format != image->vk_format) {
iview->extent.width = round_up_u32(iview->extent.width * vk_format_get_blockwidth(iview->vk_format),
vk_format_get_blockwidth(image->vk_format));
iview->extent.height = round_up_u32(iview->extent.height * vk_format_get_blockheight(iview->vk_format),
vk_format_get_blockheight(image->vk_format));
unsigned view_bw = vk_format_get_blockwidth(iview->vk_format);
unsigned view_bh = vk_format_get_blockheight(iview->vk_format);
unsigned img_bw = vk_format_get_blockwidth(image->vk_format);
unsigned img_bh = vk_format_get_blockheight(image->vk_format);
iview->extent.width = round_up_u32(iview->extent.width * view_bw, img_bw);
iview->extent.height = round_up_u32(iview->extent.height * view_bh, img_bh);
/* Comment ported from amdvlk -
* If we have the following image:
* Uncompressed pixels Compressed block sizes (4x4)
* mip0: 22 x 22 6 x 6
* mip1: 11 x 11 3 x 3
* mip2: 5 x 5 2 x 2
* mip3: 2 x 2 1 x 1
* mip4: 1 x 1 1 x 1
*
* On GFX9 the descriptor is always programmed with the WIDTH and HEIGHT of the base level and the HW is
* calculating the degradation of the block sizes down the mip-chain as follows (straight-up
* divide-by-two integer math):
* mip0: 6x6
* mip1: 3x3
* mip2: 1x1
* mip3: 1x1
*
* This means that mip2 will be missing texels.
*
* Fix this by calculating the base mip's width and height, then convert that, and round it
* back up to get the level 0 size.
* Clamp the converted size between the original values, and next power of two, which
* means we don't oversize the image.
*/
if (device->physical_device->rad_info.chip_class >= GFX9 &&
vk_format_is_compressed(image->vk_format) &&
!vk_format_is_compressed(iview->vk_format)) {
unsigned rounded_img_w = util_next_power_of_two(iview->extent.width);
unsigned rounded_img_h = util_next_power_of_two(iview->extent.height);
unsigned lvl_width = radv_minify(image->info.width , range->baseMipLevel);
unsigned lvl_height = radv_minify(image->info.height, range->baseMipLevel);
lvl_width = round_up_u32(lvl_width * view_bw, img_bw);
lvl_height = round_up_u32(lvl_height * view_bh, img_bh);
lvl_width <<= range->baseMipLevel;
lvl_height <<= range->baseMipLevel;
iview->extent.width = CLAMP(lvl_width, iview->extent.width, rounded_img_w);
iview->extent.height = CLAMP(lvl_height, iview->extent.height, rounded_img_h);
}
}
iview->base_layer = range->baseArrayLayer;
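
Reviewer sketch, not part of the patch: the GFX9 extent adjustment above can be followed for one dimension with plain integers. The helpers below are stand-ins written for this sketch. In particular, the rounding helper is assumed to divide and round up, which matches the 22 pixel to 6 block figure in the comment. The view extents fed in are my own inputs, assuming the view starts from the minified level size.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the driver helpers; written for this sketch only. */
static uint32_t sketch_div_round_up(uint32_t v, uint32_t a)
{
	assert(a != 0);
	return (v + a - 1) / a;
}

static uint32_t sketch_minify(uint32_t n, uint32_t levels)
{
	uint32_t m = n >> levels;
	return (n != 0 && m == 0) ? 1 : m;
}

static uint32_t sketch_next_pow2(uint32_t x)
{
	uint32_t p = 1;

	assert(x <= 0x80000000u);
	while (p < x)
		p <<= 1;
	return p;
}

static uint32_t sketch_clamp(uint32_t x, uint32_t lo, uint32_t hi)
{
	return x < lo ? lo : (x > hi ? hi : x);
}

/* One dimension of the adjustment for an uncompressed view of a
 * block-compressed image. */
static uint32_t sketch_view_extent(uint32_t view_extent, uint32_t image_extent,
				   uint32_t base_level,
				   uint32_t view_block, uint32_t image_block)
{
	/* Plain conversion from view texels to image blocks. */
	uint32_t extent = sketch_div_round_up(view_extent * view_block, image_block);
	uint32_t rounded = sketch_next_pow2(extent);

	/* Convert the base level's size, then scale it back up to level 0. */
	uint32_t lvl = sketch_minify(image_extent, base_level);
	lvl = sketch_div_round_up(lvl * view_block, image_block);
	lvl <<= base_level;

	/* Never shrink below the plain conversion, never exceed the next
	 * power of two. */
	return sketch_clamp(lvl, extent, rounded);
}
```

For a 22-wide image with 4-wide blocks, a view of level 1 (extent 11) converts to 3 blocks by plain rounding, the base-level reconstruction gives 6, and the clamp settles on 4.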


@@ -377,9 +377,9 @@ fail_resolve_fragment:
fail_resolve_compute:
radv_device_finish_meta_fast_clear_flush_state(device);
fail_fast_clear:
radv_device_finish_meta_buffer_state(device);
fail_query:
radv_device_finish_meta_query_state(device);
fail_query:
radv_device_finish_meta_buffer_state(device);
fail_buffer:
radv_device_finish_meta_depth_decomp_state(device);
fail_depth_decomp:
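
Reviewer sketch, not part of the patch: as I read this hunk, the reordering makes each failure label run the teardown of the stage that was initialized just before the failing one. A minimal model of that unwinding ladder follows, with three hypothetical stages; none of the names are taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical three-stage setup; each flag records whether a stage is live. */
struct sketch_state {
	bool a, b, c;
};

static bool sketch_init(bool *stage, bool should_fail)
{
	if (should_fail)
		return false;
	*stage = true;
	return true;
}

static void sketch_finish(bool *stage)
{
	*stage = false; /* safe to call on a stage that never came up */
}

/* fail_at selects which stage fails (0 means none). */
static bool sketch_setup(struct sketch_state *s, int fail_at)
{
	assert(s != NULL);

	if (!sketch_init(&s->a, fail_at == 1))
		goto fail_a;
	if (!sketch_init(&s->b, fail_at == 2))
		goto fail_b;
	if (!sketch_init(&s->c, fail_at == 3))
		goto fail_c;
	return true;

	/* Labels fall through, so teardown runs in reverse init order. */
fail_c:
	sketch_finish(&s->b);
fail_b:
	sketch_finish(&s->a);
fail_a:
	return false;
}
```

If a label is paired with the wrong teardown, one stage is skipped on the failure path, which is the kind of mismatch the swapped lines appear to correct.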


@@ -901,21 +901,23 @@ radv_device_init_meta_bufimage_state(struct radv_device *device)
result = radv_device_init_meta_itob_state(device);
if (result != VK_SUCCESS)
return result;
goto fail_itob;
result = radv_device_init_meta_btoi_state(device);
if (result != VK_SUCCESS)
goto fail_itob;
goto fail_btoi;
result = radv_device_init_meta_itoi_state(device);
if (result != VK_SUCCESS)
goto fail_btoi;
goto fail_itoi;
result = radv_device_init_meta_cleari_state(device);
if (result != VK_SUCCESS)
goto fail_itoi;
goto fail_cleari;
return VK_SUCCESS;
fail_cleari:
radv_device_finish_meta_cleari_state(device);
fail_itoi:
radv_device_finish_meta_itoi_state(device);
fail_btoi:


@@ -628,6 +628,7 @@ emit_depthstencil_clear(struct radv_cmd_buffer *cmd_buffer,
VK_SHADER_STAGE_VERTEX_BIT, 0, 4,
&clear_value.depth);
uint32_t prev_reference = cmd_buffer->state.dynamic.stencil_reference.front;
if (aspects & VK_IMAGE_ASPECT_STENCIL_BIT) {
radv_CmdSetStencilReference(cmd_buffer_h, VK_STENCIL_FACE_FRONT_BIT,
clear_value.stencil);
@@ -662,6 +663,11 @@ emit_depthstencil_clear(struct radv_cmd_buffer *cmd_buffer,
radv_CmdSetScissor(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, &clear_rect->rect);
radv_CmdDraw(cmd_buffer_h, 3, clear_rect->layerCount, 0, clear_rect->baseArrayLayer);
if (aspects & VK_IMAGE_ASPECT_STENCIL_BIT) {
radv_CmdSetStencilReference(cmd_buffer_h, VK_STENCIL_FACE_FRONT_BIT,
prev_reference);
}
}
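
Reviewer sketch, not part of the patch: the change above is a save and restore around an internal draw that overrides dynamic state. A minimal model follows, assuming a command-buffer state with a single front stencil reference; the `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the command buffer's dynamic state. */
struct sketch_cmd_state {
	uint32_t stencil_reference_front;
};

static void sketch_set_stencil_reference(struct sketch_cmd_state *st, uint32_t ref)
{
	st->stencil_reference_front = ref;
}

/* Models a meta clear: override the reference for the internal draw,
 * then put the caller's value back. */
static void sketch_clear_stencil(struct sketch_cmd_state *st, uint32_t clear_value)
{
	assert(st != NULL);
	uint32_t prev_reference = st->stencil_reference_front;

	sketch_set_stencil_reference(st, clear_value);
	/* ... the clear draw would be recorded here ... */
	sketch_set_stencil_reference(st, prev_reference);
}
```

Without the restore, an application that set a stencil reference before the clear would find the clear value in its place on the next draw.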
static bool


@@ -75,11 +75,29 @@ create_pass(struct radv_device *device,
return result;
}
static VkResult
create_pipeline_layout(struct radv_device *device, VkPipelineLayout *layout)
{
VkPipelineLayoutCreateInfo pl_create_info = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
.setLayoutCount = 0,
.pSetLayouts = NULL,
.pushConstantRangeCount = 0,
.pPushConstantRanges = NULL,
};
return radv_CreatePipelineLayout(radv_device_to_handle(device),
&pl_create_info,
&device->meta_state.alloc,
layout);
}
static VkResult
create_pipeline(struct radv_device *device,
VkShaderModule vs_module_h,
uint32_t samples,
VkRenderPass pass,
VkPipelineLayout layout,
VkPipeline *decompress_pipeline,
VkPipeline *resummarize_pipeline)
{
@@ -165,6 +183,7 @@ create_pipeline(struct radv_device *device,
VK_DYNAMIC_STATE_SCISSOR,
},
},
.layout = layout,
.renderPass = pass,
.subpass = 0,
};
@@ -212,6 +231,9 @@ radv_device_finish_meta_depth_decomp_state(struct radv_device *device)
radv_DestroyRenderPass(radv_device_to_handle(device),
state->depth_decomp[i].pass,
&state->alloc);
radv_DestroyPipelineLayout(radv_device_to_handle(device),
state->depth_decomp[i].p_layout,
&state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->depth_decomp[i].decompress_pipeline,
&state->alloc);
@@ -243,8 +265,14 @@ radv_device_init_meta_depth_decomp_state(struct radv_device *device)
if (res != VK_SUCCESS)
goto fail;
res = create_pipeline_layout(device,
&state->depth_decomp[i].p_layout);
if (res != VK_SUCCESS)
goto fail;
res = create_pipeline(device, vs_module_h, samples,
state->depth_decomp[i].pass,
state->depth_decomp[i].p_layout,
&state->depth_decomp[i].decompress_pipeline,
&state->depth_decomp[i].resummarize_pipeline);
if (res != VK_SUCCESS)


@@ -74,9 +74,27 @@ create_pass(struct radv_device *device)
return result;
}
static VkResult
create_pipeline_layout(struct radv_device *device, VkPipelineLayout *layout)
{
VkPipelineLayoutCreateInfo pl_create_info = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
.setLayoutCount = 0,
.pSetLayouts = NULL,
.pushConstantRangeCount = 0,
.pPushConstantRanges = NULL,
};
return radv_CreatePipelineLayout(radv_device_to_handle(device),
&pl_create_info,
&device->meta_state.alloc,
layout);
}
static VkResult
create_pipeline(struct radv_device *device,
VkShaderModule vs_module_h)
VkShaderModule vs_module_h,
VkPipelineLayout layout)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
@@ -173,6 +191,7 @@ create_pipeline(struct radv_device *device,
VK_DYNAMIC_STATE_SCISSOR,
},
},
.layout = layout,
.renderPass = device->meta_state.fast_clear_flush.pass,
.subpass = 0,
},
@@ -218,6 +237,7 @@ create_pipeline(struct radv_device *device,
VK_DYNAMIC_STATE_SCISSOR,
},
},
.layout = layout,
.renderPass = device->meta_state.fast_clear_flush.pass,
.subpass = 0,
},
@@ -245,6 +265,9 @@ radv_device_finish_meta_fast_clear_flush_state(struct radv_device *device)
radv_DestroyRenderPass(radv_device_to_handle(device),
state->fast_clear_flush.pass, &state->alloc);
radv_DestroyPipelineLayout(radv_device_to_handle(device),
state->fast_clear_flush.p_layout,
&state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->fast_clear_flush.cmask_eliminate_pipeline,
&state->alloc);
@@ -269,8 +292,14 @@ radv_device_init_meta_fast_clear_flush_state(struct radv_device *device)
if (res != VK_SUCCESS)
goto fail;
res = create_pipeline_layout(device,
&device->meta_state.fast_clear_flush.p_layout);
if (res != VK_SUCCESS)
goto fail;
VkShaderModule vs_module_h = radv_shader_module_to_handle(&vs_module);
res = create_pipeline(device, vs_module_h);
res = create_pipeline(device, vs_module_h,
device->meta_state.fast_clear_flush.p_layout);
if (res != VK_SUCCESS)
goto fail;


@@ -26,6 +26,7 @@
#include "radv_meta.h"
#include "radv_private.h"
#include "vk_format.h"
#include "nir/nir_builder.h"
#include "sid.h"
@@ -50,7 +51,7 @@ build_nir_fs(void)
}
static VkResult
create_pass(struct radv_device *device)
create_pass(struct radv_device *device, VkFormat vk_format, VkRenderPass *pass)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
@@ -59,7 +60,7 @@ create_pass(struct radv_device *device)
int i;
for (i = 0; i < 2; i++) {
attachments[i].format = VK_FORMAT_UNDEFINED;
attachments[i].format = vk_format;
attachments[i].samples = 1;
attachments[i].loadOp = VK_ATTACHMENT_LOAD_OP_LOAD;
attachments[i].storeOp = VK_ATTACHMENT_STORE_OP_STORE;
@@ -99,14 +100,16 @@ create_pass(struct radv_device *device)
.dependencyCount = 0,
},
alloc,
&device->meta_state.resolve.pass);
pass);
return result;
}
static VkResult
create_pipeline(struct radv_device *device,
VkShaderModule vs_module_h)
VkShaderModule vs_module_h,
VkPipeline *pipeline,
VkRenderPass pass)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
@@ -121,6 +124,23 @@ create_pipeline(struct radv_device *device,
goto cleanup;
}
VkPipelineLayoutCreateInfo pl_create_info = {
.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
.setLayoutCount = 0,
.pSetLayouts = NULL,
.pushConstantRangeCount = 0,
.pPushConstantRanges = NULL,
};
if (!device->meta_state.resolve.p_layout) {
result = radv_CreatePipelineLayout(radv_device_to_handle(device),
&pl_create_info,
&device->meta_state.alloc,
&device->meta_state.resolve.p_layout);
if (result != VK_SUCCESS)
goto cleanup;
}
result = radv_graphics_pipeline_create(device_h,
radv_pipeline_cache_to_handle(&device->meta_state.cache),
&(VkGraphicsPipelineCreateInfo) {
@@ -196,15 +216,15 @@ create_pipeline(struct radv_device *device,
VK_DYNAMIC_STATE_SCISSOR,
},
},
.renderPass = device->meta_state.resolve.pass,
.layout = device->meta_state.resolve.p_layout,
.renderPass = pass,
.subpass = 0,
},
&(struct radv_graphics_pipeline_create_info) {
.use_rectlist = true,
.custom_blend_mode = V_028808_CB_RESOLVE,
},
&device->meta_state.alloc,
&device->meta_state.resolve.pipeline);
&device->meta_state.alloc, pipeline);
if (result != VK_SUCCESS)
goto cleanup;
@@ -220,17 +240,37 @@ radv_device_finish_meta_resolve_state(struct radv_device *device)
{
struct radv_meta_state *state = &device->meta_state;
radv_DestroyRenderPass(radv_device_to_handle(device),
state->resolve.pass, &state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->resolve.pipeline, &state->alloc);
for (uint32_t j = 0; j < NUM_META_FS_KEYS; j++) {
radv_DestroyRenderPass(radv_device_to_handle(device),
state->resolve.pass[j], &state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->resolve.pipeline[j], &state->alloc);
}
radv_DestroyPipelineLayout(radv_device_to_handle(device),
state->resolve.p_layout, &state->alloc);
}
static VkFormat pipeline_formats[] = {
VK_FORMAT_R8G8B8A8_UNORM,
VK_FORMAT_R8G8B8A8_UINT,
VK_FORMAT_R8G8B8A8_SINT,
VK_FORMAT_A2R10G10B10_UINT_PACK32,
VK_FORMAT_A2R10G10B10_SINT_PACK32,
VK_FORMAT_R16G16B16A16_UNORM,
VK_FORMAT_R16G16B16A16_SNORM,
VK_FORMAT_R16G16B16A16_UINT,
VK_FORMAT_R16G16B16A16_SINT,
VK_FORMAT_R32_SFLOAT,
VK_FORMAT_R32G32_SFLOAT,
VK_FORMAT_R32G32B32A32_SFLOAT
};
VkResult
radv_device_init_meta_resolve_state(struct radv_device *device)
{
VkResult res = VK_SUCCESS;
struct radv_meta_state *state = &device->meta_state;
struct radv_shader_module vs_module = { .nir = radv_meta_build_nir_vs_generate_vertices() };
if (!vs_module.nir) {
/* XXX: Need more accurate error */
@@ -238,14 +278,19 @@ radv_device_init_meta_resolve_state(struct radv_device *device)
goto fail;
}
res = create_pass(device);
if (res != VK_SUCCESS)
goto fail;
for (uint32_t i = 0; i < ARRAY_SIZE(pipeline_formats); ++i) {
VkFormat format = pipeline_formats[i];
unsigned fs_key = radv_format_meta_fs_key(format);
res = create_pass(device, format, &state->resolve.pass[fs_key]);
if (res != VK_SUCCESS)
goto fail;
VkShaderModule vs_module_h = radv_shader_module_to_handle(&vs_module);
res = create_pipeline(device, vs_module_h);
if (res != VK_SUCCESS)
goto fail;
VkShaderModule vs_module_h = radv_shader_module_to_handle(&vs_module);
res = create_pipeline(device, vs_module_h,
&state->resolve.pipeline[fs_key], state->resolve.pass[fs_key]);
if (res != VK_SUCCESS)
goto fail;
}
goto cleanup;
@@ -260,16 +305,18 @@ cleanup:
static void
emit_resolve(struct radv_cmd_buffer *cmd_buffer,
VkFormat vk_format,
const VkOffset2D *dest_offset,
const VkExtent2D *resolve_extent)
{
struct radv_device *device = cmd_buffer->device;
VkCommandBuffer cmd_buffer_h = radv_cmd_buffer_to_handle(cmd_buffer);
unsigned fs_key = radv_format_meta_fs_key(vk_format);
cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_FLUSH_AND_INV_CB;
radv_CmdBindPipeline(cmd_buffer_h, VK_PIPELINE_BIND_POINT_GRAPHICS,
device->meta_state.resolve.pipeline);
device->meta_state.resolve.pipeline[fs_key]);
radv_CmdSetViewport(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, &(VkViewport) {
.x = dest_offset->x,
@@ -300,6 +347,12 @@ static void radv_pick_resolve_method_images(struct radv_image *src_image,
enum radv_resolve_method *method)
{
if (src_image->vk_format == VK_FORMAT_R16G16_UNORM ||
src_image->vk_format == VK_FORMAT_R16G16_SNORM)
*method = RESOLVE_COMPUTE;
else if (vk_format_is_int(src_image->vk_format))
*method = RESOLVE_COMPUTE;
if (dest_image->surface.num_dcc_levels > 0) {
*method = RESOLVE_FRAGMENT;
} else if (dest_image->surface.micro_tile_mode != src_image->surface.micro_tile_mode) {
@@ -389,6 +442,7 @@ void radv_CmdResolveImage(
if (dest_image->surface.dcc_size) {
radv_initialize_dcc(cmd_buffer, dest_image, 0xffffffff);
}
unsigned fs_key = radv_format_meta_fs_key(dest_image->vk_format);
for (uint32_t r = 0; r < region_count; ++r) {
const VkImageResolve *region = &regions[r];
@@ -488,7 +542,7 @@ void radv_CmdResolveImage(
radv_CmdBeginRenderPass(cmd_buffer_h,
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.resolve.pass,
.renderPass = device->meta_state.resolve.pass[fs_key],
.framebuffer = fb_h,
.renderArea = {
.offset = {
@@ -506,6 +560,7 @@ void radv_CmdResolveImage(
VK_SUBPASS_CONTENTS_INLINE);
emit_resolve(cmd_buffer,
dest_iview.vk_format,
&(VkOffset2D) {
.x = dstOffset.x,
.y = dstOffset.y,
@@ -600,6 +655,7 @@ radv_cmd_buffer_resolve_subpass(struct radv_cmd_buffer *cmd_buffer)
radv_cmd_buffer_set_subpass(cmd_buffer, &resolve_subpass, false);
emit_resolve(cmd_buffer,
dst_img->vk_format,
&(VkOffset2D) { 0, 0 },
&(VkExtent2D) { fb->width, fb->height });
}


@@ -253,22 +253,31 @@ radv_device_init_meta_resolve_compute_state(struct radv_device *device)
res = create_layout(device);
if (res != VK_SUCCESS)
return res;
goto fail;
for (uint32_t i = 0; i < MAX_SAMPLES_LOG2; ++i) {
uint32_t samples = 1 << i;
res = create_resolve_pipeline(device, samples, false, false,
&state->resolve_compute.rc[i].pipeline);
if (res != VK_SUCCESS)
goto fail;
res = create_resolve_pipeline(device, samples, true, false,
&state->resolve_compute.rc[i].i_pipeline);
if (res != VK_SUCCESS)
goto fail;
res = create_resolve_pipeline(device, samples, false, true,
&state->resolve_compute.rc[i].srgb_pipeline);
if (res != VK_SUCCESS)
goto fail;
}
return VK_SUCCESS;
fail:
radv_device_finish_meta_resolve_compute_state(device);
return res;
}


@@ -316,16 +316,9 @@ create_resolve_pipeline(struct radv_device *device,
&vk_pipeline_info, &radv_pipeline_info,
&device->meta_state.alloc,
pipeline);
ralloc_free(vs.nir);
ralloc_free(fs.nir);
if (result != VK_SUCCESS)
goto fail;
return VK_SUCCESS;
fail:
ralloc_free(vs.nir);
ralloc_free(fs.nir);
return result;
}
@@ -336,14 +329,19 @@ radv_device_init_meta_resolve_fragment_state(struct radv_device *device)
res = create_layout(device);
if (res != VK_SUCCESS)
return res;
goto fail;
for (uint32_t i = 0; i < MAX_SAMPLES_LOG2; ++i) {
for (unsigned j = 0; j < ARRAY_SIZE(pipeline_formats); ++j) {
res = create_resolve_pipeline(device, i, pipeline_formats[j]);
if (res != VK_SUCCESS)
goto fail;
}
}
return VK_SUCCESS;
fail:
radv_device_finish_meta_resolve_fragment_state(device);
return res;
}


@@ -879,6 +879,8 @@ radv_pipeline_init_multisample_state(struct radv_pipeline *pipeline,
S_028BE0_MAX_SAMPLE_DIST(radv_cayman_get_maxdist(log_samples)) |
S_028BE0_MSAA_EXPOSED_SAMPLES(log_samples); /* CM_R_028BE0_PA_SC_AA_CONFIG */
ms->pa_sc_mode_cntl_1 |= S_028A4C_PS_ITER_SAMPLE(ps_iter_samples > 1);
if (ps_iter_samples > 1)
pipeline->graphics.spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(2);
}
const struct VkPipelineRasterizationStateRasterizationOrderAMD *raster_order =
@@ -1995,6 +1997,7 @@ radv_pipeline_init(struct radv_pipeline *pipeline,
radv_create_shaders(pipeline, device, cache, keys, pStages);
pipeline->graphics.spi_baryc_cntl = S_0286E0_FRONT_FACE_ALL_BITS(1);
radv_pipeline_init_depth_stencil_state(pipeline, pCreateInfo, extra);
radv_pipeline_init_raster_state(pipeline, pCreateInfo);
radv_pipeline_init_multisample_state(pipeline, pCreateInfo);
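
Reviewer sketch, not part of the patch: these hunks move the register value from emit time to pipeline creation, where the sample-shading decision is already known. A minimal model follows. The bit positions in the two macros are illustrative only and are not taken from the driver's register headers; all `sketch_` / `SKETCH_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit layout only; the real field positions are defined in
 * the driver's register headers and are not reproduced here. */
#define SKETCH_FRONT_FACE_ALL_BITS(x)  (((uint32_t)(x) & 0x1u) << 24)
#define SKETCH_POS_FLOAT_LOCATION(x)   (((uint32_t)(x) & 0x3u) << 16)

struct sketch_pipeline {
	uint32_t spi_baryc_cntl;
};

/* Creation time: fold everything known from the create info into one value. */
static void sketch_pipeline_init(struct sketch_pipeline *p, unsigned ps_iter_samples)
{
	assert(p != NULL);
	p->spi_baryc_cntl = SKETCH_FRONT_FACE_ALL_BITS(1);
	if (ps_iter_samples > 1)
		p->spi_baryc_cntl |= SKETCH_POS_FLOAT_LOCATION(2);
}

/* Emit time: no decisions left, just hand back the stored value. */
static uint32_t sketch_emit_baryc(const struct sketch_pipeline *p)
{
	return p->spi_baryc_cntl;
}
```

The base value has to be assigned before the multisample setup runs, since that step only ORs bits into it; the hunk orders the calls the same way.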


@@ -375,6 +375,7 @@ radv_pipeline_cache_insert_shaders(struct radv_device *device,
char* p = entry->code;
struct cache_entry_variant_info info;
memset(&info, 0, sizeof(info));
for (int i = 0; i < MESA_SHADER_STAGES; ++i) {
if (!variants[i])


@@ -449,8 +449,9 @@ struct radv_meta_state {
} cleari;
struct {
VkPipeline pipeline;
VkRenderPass pass;
VkPipelineLayout p_layout;
VkPipeline pipeline[NUM_META_FS_KEYS];
VkRenderPass pass[NUM_META_FS_KEYS];
} resolve;
struct {
@@ -474,12 +475,14 @@ struct radv_meta_state {
} resolve_fragment;
struct {
VkPipelineLayout p_layout;
VkPipeline decompress_pipeline;
VkPipeline resummarize_pipeline;
VkRenderPass pass;
} depth_decomp[1 + MAX_SAMPLES_LOG2];
struct {
VkPipelineLayout p_layout;
VkPipeline cmask_eliminate_pipeline;
VkPipeline fmask_decompress_pipeline;
VkRenderPass pass;
@@ -915,7 +918,6 @@ void si_emit_wait_fence(struct radeon_winsys_cs *cs,
uint64_t va, uint32_t ref,
uint32_t mask);
void si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
bool predicated,
enum chip_class chip_class,
uint32_t *fence_ptr, uint64_t va,
bool is_mec,
@@ -1129,6 +1131,7 @@ struct radv_pipeline {
struct radv_gs_state gs;
uint32_t db_shader_control;
uint32_t shader_z_format;
uint32_t spi_baryc_cntl;
unsigned prim;
unsigned gs_out;
uint32_t vgt_gs_mode;

View File

@@ -919,7 +919,6 @@ si_emit_acquire_mem(struct radeon_winsys_cs *cs,
void
si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
bool predicated,
enum chip_class chip_class,
uint32_t *flush_cnt,
uint64_t flush_va,
@@ -950,7 +949,7 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
/* Necessary for DCC */
if (chip_class >= VI) {
si_cs_emit_write_event_eop(cs,
predicated,
false,
chip_class,
is_mec,
V_028A90_FLUSH_AND_INV_CB_DATA_TS,
@@ -964,12 +963,12 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
}
if (flush_bits & RADV_CMD_FLAG_FLUSH_AND_INV_CB_META) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, predicated));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_FLUSH_AND_INV_CB_META) | EVENT_INDEX(0));
}
if (flush_bits & RADV_CMD_FLAG_FLUSH_AND_INV_DB_META) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, predicated));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_FLUSH_AND_INV_DB_META) | EVENT_INDEX(0));
}
@@ -982,7 +981,7 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
}
if (flush_bits & RADV_CMD_FLAG_CS_PARTIAL_FLUSH) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, predicated));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_CS_PARTIAL_FLUSH) | EVENT_INDEX(4));
}
@@ -1036,14 +1035,14 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
assert(flush_cnt);
uint32_t old_fence = (*flush_cnt)++;
si_cs_emit_write_event_eop(cs, predicated, chip_class, false, cb_db_event, tc_flags, 1,
si_cs_emit_write_event_eop(cs, false, chip_class, false, cb_db_event, tc_flags, 1,
flush_va, old_fence, *flush_cnt);
si_emit_wait_fence(cs, predicated, flush_va, *flush_cnt, 0xffffffff);
si_emit_wait_fence(cs, false, flush_va, *flush_cnt, 0xffffffff);
}
/* VGT state sync */
if (flush_bits & RADV_CMD_FLAG_VGT_FLUSH) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, predicated));
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_VGT_FLUSH) | EVENT_INDEX(0));
}
@@ -1056,13 +1055,13 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
RADV_CMD_FLAG_INV_GLOBAL_L2 |
RADV_CMD_FLAG_WRITEBACK_GLOBAL_L2))) &&
!is_mec) {
radeon_emit(cs, PKT3(PKT3_PFP_SYNC_ME, 0, predicated));
radeon_emit(cs, PKT3(PKT3_PFP_SYNC_ME, 0, 0));
radeon_emit(cs, 0);
}
if ((flush_bits & RADV_CMD_FLAG_INV_GLOBAL_L2) ||
(chip_class <= CIK && (flush_bits & RADV_CMD_FLAG_WRITEBACK_GLOBAL_L2))) {
si_emit_acquire_mem(cs, is_mec, predicated, chip_class >= GFX9,
si_emit_acquire_mem(cs, is_mec, false, chip_class >= GFX9,
cp_coher_cntl |
S_0085F0_TC_ACTION_ENA(1) |
S_0085F0_TCL1_ACTION_ENA(1) |
@@ -1076,7 +1075,7 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
*
* WB doesn't work without NC.
*/
si_emit_acquire_mem(cs, is_mec, predicated,
si_emit_acquire_mem(cs, is_mec, false,
chip_class >= GFX9,
cp_coher_cntl |
S_0301F0_TC_WB_ACTION_ENA(1) |
@@ -1085,7 +1084,7 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
}
if (flush_bits & RADV_CMD_FLAG_INV_VMEM_L1) {
si_emit_acquire_mem(cs, is_mec,
predicated, chip_class >= GFX9,
false, chip_class >= GFX9,
cp_coher_cntl |
S_0085F0_TCL1_ACTION_ENA(1));
cp_coher_cntl = 0;
@@ -1096,7 +1095,7 @@ si_cs_emit_cache_flush(struct radeon_winsys_cs *cs,
* Therefore, it should be last. Done in PFP.
*/
if (cp_coher_cntl)
si_emit_acquire_mem(cs, is_mec, predicated, chip_class >= GFX9, cp_coher_cntl);
si_emit_acquire_mem(cs, is_mec, false, chip_class >= GFX9, cp_coher_cntl);
}
void
@@ -1126,7 +1125,6 @@ si_emit_cache_flush(struct radv_cmd_buffer *cmd_buffer)
ptr = &cmd_buffer->gfx9_fence_idx;
}
si_cs_emit_cache_flush(cmd_buffer->cs,
cmd_buffer->state.predicating,
cmd_buffer->device->physical_device->rad_info.chip_class,
ptr, va,
radv_cmd_buffer_uses_mec(cmd_buffer),

View File

@@ -33,6 +33,7 @@
#include "state_tracker/drm_driver.h"
#include "pipe/p_screen.h"
#include "util/u_format.h"
#include "util/u_inlines.h"
#include "util/u_memory.h"
@@ -73,7 +74,7 @@ renderonly_create_kms_dumb_buffer_for_resource(struct pipe_resource *rsc,
struct drm_mode_create_dumb create_dumb = {
.width = rsc->width0,
.height = rsc->height0,
.bpp = 32,
.bpp = util_format_get_blocksizebits(rsc->format),
};
struct drm_mode_destroy_dumb destroy_dumb = { };

View File

@@ -595,7 +595,8 @@ etna_update_ts_config(struct etna_context *ctx)
}
}
if (new_ts_config != ctx->framebuffer.TS_MEM_CONFIG) {
if (new_ts_config != ctx->framebuffer.TS_MEM_CONFIG ||
(ctx->dirty & ETNA_DIRTY_FRAMEBUFFER)) {
ctx->framebuffer.TS_MEM_CONFIG = new_ts_config;
ctx->dirty |= ETNA_DIRTY_TS;
}

View File

@@ -3127,6 +3127,7 @@ static int r600_shader_from_tgsi(struct r600_context *rctx,
ctx.nliterals = 0;
ctx.literals = NULL;
ctx.max_driver_temp_used = 0;
shader->fs_write_all = ctx.info.properties[TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS] &&
ctx.info.colors_written == 1;

View File

@@ -659,6 +659,7 @@ public:
return false;
switch (hw_chip) {
case HW_CHIP_HEMLOCK:
case HW_CHIP_CYPRESS:
case HW_CHIP_JUNIPER:
return false;

View File

@@ -208,8 +208,25 @@ void bc_finalizer::finalize_if(region_node* r) {
r->push_front(if_jump);
r->push_back(if_pop);
/* the depart/repeat 1 is actually part of the "else" code.
* if it's a depart for an outer loop region it will want to
* insert a LOOP_BREAK or LOOP_CONTINUE in here, so we need
* to emit the else clause.
*/
bool has_else = n_if->next;
if (repdep1->is_depart()) {
depart_node *dep1 = static_cast<depart_node*>(repdep1);
if (dep1->target != r && dep1->target->is_loop())
has_else = true;
}
if (repdep1->is_repeat()) {
repeat_node *rep1 = static_cast<repeat_node*>(repdep1);
if (rep1->target != r && rep1->target->is_loop())
has_else = true;
}
if (has_else) {
cf_node *nelse = sh.create_cf(CF_OP_ELSE);
n_if->insert_after(nelse);

View File

@@ -1130,6 +1130,9 @@ void post_scheduler::emit_clause() {
if (alu.current_ar) {
emit_load_ar();
process_group();
if (!alu.check_clause_limits()) {
// Can't happen since clause only contains MOVA/CF_SET_IDX0/1
}
alu.emit_group();
}

View File

@@ -296,19 +296,20 @@ static int r600_init_surface(struct r600_common_screen *rscreen,
return r;
}
unsigned pitch = pitch_in_bytes_override / bpe;
if (rscreen->chip_class >= GFX9) {
assert(!pitch_in_bytes_override ||
pitch_in_bytes_override == surface->u.gfx9.surf_pitch * bpe);
if (pitch) {
surface->u.gfx9.surf_pitch = pitch;
surface->u.gfx9.surf_slice_size =
(uint64_t)pitch * surface->u.gfx9.surf_height * bpe;
}
surface->u.gfx9.surf_offset = offset;
} else {
if (pitch_in_bytes_override &&
pitch_in_bytes_override != surface->u.legacy.level[0].nblk_x * bpe) {
/* old ddx on evergreen over estimate alignment for 1d, only 1 level
* for those
*/
surface->u.legacy.level[0].nblk_x = pitch_in_bytes_override / bpe;
surface->u.legacy.level[0].slice_size = pitch_in_bytes_override *
surface->u.legacy.level[0].nblk_y;
if (pitch) {
surface->u.legacy.level[0].nblk_x = pitch;
surface->u.legacy.level[0].slice_size =
((uint64_t)pitch * surface->u.legacy.level[0].nblk_y * bpe);
}
if (offset) {
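
Note on the hunk above: the new lines divide the byte pitch by `bpe` once and compute the slice size with a `(uint64_t)` cast, so the product is formed in 64 bits. A minimal sketch of that arithmetic follows. The name `demo_slice_size` and its parameters are hypothetical, not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: slice size in bytes for a surface whose pitch is
 * supplied in bytes. The cast makes the whole product 64-bit, so a large
 * pitch * rows * bpe does not wrap the way a 32-bit multiply would. */
static uint64_t
demo_slice_size(uint32_t pitch_in_bytes, uint32_t bpe, uint32_t rows)
{
   assert(bpe != 0);
   uint32_t pitch = pitch_in_bytes / bpe;   /* pitch in blocks/elements */
   return (uint64_t)pitch * rows * bpe;
}
```

With a 262144-byte pitch, 4 bytes per element and 32768 rows, the result is 2^33 bytes, which a 32-bit multiply would truncate to zero.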

View File

@@ -97,6 +97,8 @@ struct ruvd_decoder {
unsigned cmd;
unsigned cntl;
} reg;
void *render_pic_list[16];
};
/* flush IB to the hardware */
@@ -596,7 +598,7 @@ static struct ruvd_h265 get_h265_msg(struct ruvd_decoder *dec, struct pipe_video
struct pipe_h265_picture_desc *pic)
{
struct ruvd_h265 result;
unsigned i;
unsigned i, j;
memset(&result, 0, sizeof(result));
@@ -676,11 +678,28 @@ static struct ruvd_h265 get_h265_msg(struct ruvd_decoder *dec, struct pipe_video
result.row_height_minus1[i] = pic->pps->row_height_minus1[i];
result.num_delta_pocs_ref_rps_idx = pic->NumDeltaPocsOfRefRpsIdx;
result.curr_idx = pic->CurrPicOrderCntVal;
result.curr_poc = pic->CurrPicOrderCntVal;
for (i = 0 ; i < 16 ; i++) {
for (j = 0; (pic->ref[j] != NULL) && (j < 16) ; j++) {
if (dec->render_pic_list[i] == pic->ref[j])
break;
if (j == 15)
dec->render_pic_list[i] = NULL;
else if (pic->ref[j+1] == NULL)
dec->render_pic_list[i] = NULL;
}
}
for (i = 0 ; i < 16 ; i++) {
if (dec->render_pic_list[i] == NULL) {
dec->render_pic_list[i] = target;
result.curr_idx = i;
break;
}
}
vl_video_buffer_set_associated_data(target, &dec->base,
(void *)(uintptr_t)pic->CurrPicOrderCntVal,
(void *)(uintptr_t)result.curr_idx,
&ruvd_destroy_associated_data);
for (i = 0; i < 16; ++i) {
@@ -723,7 +742,7 @@ static struct ruvd_h265 get_h265_msg(struct ruvd_decoder *dec, struct pipe_video
memcpy(dec->it + 864, pic->pps->sps->ScalingList32x32, 2 * 64);
for (i = 0 ; i < 2 ; i++) {
for (int j = 0 ; j < 15 ; j++)
for (j = 0 ; j < 15 ; j++)
result.direct_reflist[i][j] = pic->RefPicList[i][j];
}
@@ -858,12 +877,17 @@ static struct ruvd_mpeg2 get_mpeg2_msg(struct ruvd_decoder *dec,
for (i = 0; i < 2; ++i)
result.ref_pic_idx[i] = get_ref_pic_idx(dec, pic->ref[i]);
result.load_intra_quantiser_matrix = 1;
result.load_nonintra_quantiser_matrix = 1;
for (i = 0; i < 64; ++i) {
result.intra_quantiser_matrix[i] = pic->intra_matrix[zscan[i]];
result.nonintra_quantiser_matrix[i] = pic->non_intra_matrix[zscan[i]];
if(pic->intra_matrix) {
result.load_intra_quantiser_matrix = 1;
for (i = 0; i < 64; ++i) {
result.intra_quantiser_matrix[i] = pic->intra_matrix[zscan[i]];
}
}
if(pic->non_intra_matrix) {
result.load_nonintra_quantiser_matrix = 1;
for (i = 0; i < 64; ++i) {
result.nonintra_quantiser_matrix[i] = pic->non_intra_matrix[zscan[i]];
}
}
result.profile_and_level_indication = 0;
@@ -1407,6 +1431,8 @@ struct pipe_video_codec *si_common_uvd_create_decoder(struct pipe_context *conte
goto error;
}
for (i = 0; i < 16; i++)
dec->render_pic_list[i] = NULL;
dec->fb_size = (info.family == CHIP_TONGA) ? FB_BUFFER_SIZE_TONGA :
FB_BUFFER_SIZE;
bs_buf_size = width * height * (512 / (16 * 16));
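
Note on the `render_pic_list` hunks above: the decoder now keeps a 16-entry table of render targets, clears entries that are no longer in the reference list, and stores the first free index in `curr_idx`. The sketch below is a simplified variant of that bookkeeping, not the committed loop. All `demo_*` names are hypothetical. It checks the bound before reading `refs[j]`, and with an empty reference list it frees every slot, whereas the loop in the diff leaves the slots untouched in that case.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NUM_SLOTS 16

/* Clear every slot whose picture is absent from the reference list.
 * The list is NULL-terminated unless all 16 entries are in use. */
static void
demo_release_unreferenced(void *slots[DEMO_NUM_SLOTS],
                          void *const refs[DEMO_NUM_SLOTS])
{
   for (unsigned i = 0; i < DEMO_NUM_SLOTS; i++) {
      int still_referenced = 0;

      for (unsigned j = 0; j < DEMO_NUM_SLOTS && refs[j] != NULL; j++) {
         if (slots[i] == refs[j]) {
            still_referenced = 1;
            break;
         }
      }
      if (!still_referenced)
         slots[i] = NULL;
   }
}

/* Store target in the first free slot.
 * Returns the slot index, or -1 if all slots are occupied. */
static int
demo_claim_slot(void *slots[DEMO_NUM_SLOTS], void *target)
{
   assert(target != NULL);
   for (unsigned i = 0; i < DEMO_NUM_SLOTS; i++) {
      if (slots[i] == NULL) {
         slots[i] = target;
         return (int)i;
      }
   }
   return -1;
}
```

Calling `demo_release_unreferenced()` before `demo_claim_slot()` for each picture keeps the returned index within 0..15, rather than passing the picture order count through as the index.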

View File

@@ -78,6 +78,7 @@ struct radeon_decoder {
unsigned bs_size;
unsigned cur_buffer;
void *render_pic_list[16];
};
static rvcn_dec_message_avc_t get_h264_msg(struct radeon_decoder *dec,
@@ -186,7 +187,7 @@ static rvcn_dec_message_hevc_t get_h265_msg(struct radeon_decoder *dec,
struct pipe_h265_picture_desc *pic)
{
rvcn_dec_message_hevc_t result;
unsigned i;
unsigned i, j;
memset(&result, 0, sizeof(result));
result.sps_info_flags = 0;
@@ -273,11 +274,28 @@ static rvcn_dec_message_hevc_t get_h265_msg(struct radeon_decoder *dec,
result.row_height_minus1[i] = pic->pps->row_height_minus1[i];
result.num_delta_pocs_ref_rps_idx = pic->NumDeltaPocsOfRefRpsIdx;
result.curr_idx = pic->CurrPicOrderCntVal;
result.curr_poc = pic->CurrPicOrderCntVal;
for (i = 0 ; i < 16 ; i++) {
for (j = 0; (pic->ref[j] != NULL) && (j < 16) ; j++) {
if (dec->render_pic_list[i] == pic->ref[j])
break;
if (j == 15)
dec->render_pic_list[i] = NULL;
else if (pic->ref[j+1] == NULL)
dec->render_pic_list[i] = NULL;
}
}
for (i = 0 ; i < 16 ; i++) {
if (dec->render_pic_list[i] == NULL) {
dec->render_pic_list[i] = target;
result.curr_idx = i;
break;
}
}
vl_video_buffer_set_associated_data(target, &dec->base,
(void *)(uintptr_t)pic->CurrPicOrderCntVal,
(void *)(uintptr_t)result.curr_idx,
&radeon_dec_destroy_associated_data);
for (i = 0; i < 16; ++i) {
@@ -320,7 +338,7 @@ static rvcn_dec_message_hevc_t get_h265_msg(struct radeon_decoder *dec,
memcpy(dec->it + 864, pic->pps->sps->ScalingList32x32, 2 * 64);
for (i = 0 ; i < 2 ; i++) {
for (int j = 0 ; j < 15 ; j++)
for (j = 0 ; j < 15 ; j++)
result.direct_reflist[i][j] = pic->RefPicList[i][j];
}
@@ -480,12 +498,17 @@ static rvcn_dec_message_mpeg2_vld_t get_mpeg2_msg(struct radeon_decoder *dec,
result.forward_ref_pic_idx = get_ref_pic_idx(dec, pic->ref[0]);
result.backward_ref_pic_idx = get_ref_pic_idx(dec, pic->ref[1]);
result.load_intra_quantiser_matrix = 1;
result.load_nonintra_quantiser_matrix = 1;
for (i = 0; i < 64; ++i) {
result.intra_quantiser_matrix[i] = pic->intra_matrix[zscan[i]];
result.nonintra_quantiser_matrix[i] = pic->non_intra_matrix[zscan[i]];
if(pic->intra_matrix) {
result.load_intra_quantiser_matrix = 1;
for (i = 0; i < 64; ++i) {
result.intra_quantiser_matrix[i] = pic->intra_matrix[zscan[i]];
}
}
if(pic->non_intra_matrix) {
result.load_nonintra_quantiser_matrix = 1;
for (i = 0; i < 64; ++i) {
result.nonintra_quantiser_matrix[i] = pic->non_intra_matrix[zscan[i]];
}
}
result.profile_and_level_indication = 0;
@@ -1236,6 +1259,8 @@ struct pipe_video_codec *radeon_create_decoder(struct pipe_context *context,
goto error;
}
for (i = 0; i < 16; i++)
dec->render_pic_list[i] = NULL;
bs_buf_size = width * height * (512 / (16 * 16));
for (i = 0; i < NUM_BUFFERS; ++i) {
unsigned msg_fb_it_size = FB_BUFFER_OFFSET + FB_BUFFER_SIZE;

View File

@@ -327,6 +327,7 @@ cleanup:
util_bitmask_destroy(svga->stream_output_id_bm);
util_bitmask_destroy(svga->query_id_bm);
FREE(svga);
svga = NULL;
done:
SVGA_STATS_TIME_POP(svgascreen->sws);

View File

@@ -76,7 +76,6 @@ virgl_tgsi_transform_instruction(struct tgsi_transform_context *ctx,
for (unsigned i = 0; i < inst->Instruction.NumSrcRegs; i++) {
if (inst->Src[i].Register.File == TGSI_FILE_CONSTANT &&
inst->Src[i].Register.Dimension &&
!inst->Src[i].Register.Indirect &&
inst->Src[i].Dimension.Index == 0)
inst->Src[i].Register.Dimension = 0;
}

View File

@@ -39,11 +39,11 @@
static uint8_t default_intra_matrix[64] = {
8, 16, 19, 22, 26, 27, 29, 34,
16, 16, 19, 22, 22, 22, 22, 26,
26, 27, 22, 26, 26, 27, 29, 24,
27, 27, 29, 32, 27, 29, 29, 32,
35, 29, 34, 34, 35, 40, 34, 34,
37, 40, 48, 37, 38, 40, 48, 58,
16, 16, 22, 24, 27, 29, 34, 37,
19, 22, 26, 27, 29, 34, 34, 38,
22, 22, 26, 27, 29, 34, 37, 40,
22, 26, 27, 29, 32, 35, 40, 48,
26, 27, 29, 32, 35, 40, 48, 58,
26, 27, 29, 34, 38, 46, 56, 69,
27, 29, 35, 38, 46, 56, 69, 83
};

View File

@@ -308,8 +308,10 @@ vlVaDestroyConfig(VADriverContextP ctx, VAConfigID config_id)
mtx_lock(&drv->mutex);
config = handle_table_get(drv->htab, config_id);
if (!config)
if (!config) {
mtx_unlock(&drv->mutex);
return VA_STATUS_ERROR_INVALID_CONFIG;
}
FREE(config);
handle_table_remove(drv->htab, config_id);
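
Note on the hunk above: this fix, like the similar VA-API and VDPAU hunks in this comparison, adds the missing `mtx_unlock` before an early return. An alternative that makes such leaks harder to reintroduce is a single exit label, sketched below. This is an illustration under assumptions, not the state tracker's code: `demo_lock` is a stand-in for a real mutex so that balance can be checked, and every `demo_*` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a real mutex; it only records whether it is held. */
struct demo_lock { int held; };

static void demo_lock_acquire(struct demo_lock *l) { assert(!l->held); l->held = 1; }
static void demo_lock_release(struct demo_lock *l) { assert(l->held); l->held = 0; }

enum demo_status { DEMO_OK = 0, DEMO_INVALID_CONFIG = 1 };

struct demo_driver {
   struct demo_lock lock;
   void *configs[4];
};

static enum demo_status
demo_destroy_config(struct demo_driver *drv, unsigned id)
{
   enum demo_status status = DEMO_OK;

   demo_lock_acquire(&drv->lock);
   if (id >= 4 || drv->configs[id] == NULL) {
      status = DEMO_INVALID_CONFIG;
      goto out;               /* error path leaves through the same unlock */
   }
   drv->configs[id] = NULL;   /* forget the handle */
out:
   demo_lock_release(&drv->lock);
   return status;
}
```

Either style is correct. The per-branch unlock used in the diff keeps the backported change minimal.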

View File

@@ -548,8 +548,10 @@ vlVaPutImage(VADriverContextP ctx, VASurfaceID surface, VAImageID image,
PIPE_TRANSFER_WRITE |
PIPE_TRANSFER_DISCARD_RANGE,
&dst_box, &transfer);
if (map == NULL)
if (map == NULL) {
mtx_unlock(&drv->mutex);
return VA_STATUS_ERROR_OPERATION_FAILED;
}
u_copy_nv12_from_yv12((const void * const*) data, pitches, i, j,
transfer->stride, tex->array_size,

View File

@@ -57,6 +57,11 @@ vlVaBeginPicture(VADriverContextP ctx, VAContextID context_id, VASurfaceID rende
return VA_STATUS_ERROR_INVALID_CONTEXT;
}
if (u_reduce_video_profile(context->templat.profile) == PIPE_VIDEO_FORMAT_MPEG12) {
context->desc.mpeg12.intra_matrix = NULL;
context->desc.mpeg12.non_intra_matrix = NULL;
}
surf = handle_table_get(drv->htab, render_target);
mtx_unlock(&drv->mutex);
if (!surf || !surf->buffer)
@@ -678,9 +683,11 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID context_id)
vl_compositor_yuv_deint_full(&drv->cstate, &drv->compositor,
old_buf, surf->buffer,
&src_rect, &dst_rect, VL_COMPOSITOR_WEAVE);
} else
} else {
/* Can't convert from progressive to interlaced yet */
mtx_unlock(&drv->mutex);
return VA_STATUS_ERROR_INVALID_SURFACE;
}
}
old_buf->destroy(old_buf);

View File

@@ -369,8 +369,10 @@ vlVdpVideoSurfacePutBitsYCbCr(VdpVideoSurface surface,
if (pformat == PIPE_FORMAT_YV12 &&
p_surf->video_buffer->buffer_format == PIPE_FORMAT_NV12)
conversion = CONVERSION_YV12_TO_NV12;
else
else {
mtx_unlock(&p_surf->device->mutex);
return VDP_STATUS_NO_IMPLEMENTATION;
}
}
sampler_views = p_surf->video_buffer->get_sampler_view_planes(p_surf->video_buffer);

View File

@@ -218,6 +218,9 @@ static void surf_drm_to_winsys(struct radeon_drm_winsys *ws,
}
set_micro_tile_mode(surf_ws, &ws->info);
surf_ws->is_displayable = surf_ws->is_linear ||
surf_ws->micro_tile_mode == RADEON_MICRO_MODE_DISPLAY ||
surf_ws->micro_tile_mode == RADEON_MICRO_MODE_ROTATED;
}
static int radeon_winsys_surface_init(struct radeon_winsys *rws,

View File

@@ -41,7 +41,6 @@
#include "main/glheader.h"
#include "glapi.h"
#include "glapitable.h"
#include "main/dispatch.h"
#include "apple_glx.h"
#include "apple_xgl_api.h"
@@ -61,12 +60,11 @@ static void _apple_glapi_create_table(void) {
assert(__applegl_api);
memcpy(__applegl_api, __ogl_framework_api, sizeof(struct _glapi_table));
SET_ReadPixels(__applegl_api, __applegl_glReadPixels);
SET_CopyPixels(__applegl_api, __applegl_glCopyPixels);
SET_CopyColorTable(__applegl_api, __applegl_glCopyColorTable);
SET_DrawBuffer(__applegl_api, __applegl_glDrawBuffer);
SET_DrawBuffers(__applegl_api, __applegl_glDrawBuffers);
SET_Viewport(__applegl_api, __applegl_glViewport);
_glapi_table_patch(__applegl_api, "ReadPixels", __applegl_glReadPixels);
_glapi_table_patch(__applegl_api, "CopyPixels", __applegl_glCopyPixels);
_glapi_table_patch(__applegl_api, "CopyColorTable", __applegl_glCopyColorTable);
_glapi_table_patch(__applegl_api, "DrawBuffers", __applegl_glDrawBuffer);
_glapi_table_patch(__applegl_api, "Viewport", __applegl_glViewport);
}
void apple_glapi_set_dispatch(void) {

View File

@@ -32,6 +32,7 @@
#include <stdlib.h>
#include <assert.h>
#include <GL/gl.h>
#include <util/debug.h>
/* <rdar://problem/6953344> */
#define glTexImage1D glTexImage1D_OSX

View File

@@ -338,11 +338,15 @@ static Display *dispatch_GetCurrentDisplayEXT(void)
static const char *dispatch_GetDriverConfig(const char *driverName)
{
#if defined(GLX_DIRECT_RENDERING) && !defined(GLX_USE_APPLEGL)
/*
* The options are constant for a given driverName, so we do not need
* a context (and apps expect to be able to call this without one).
*/
return glXGetDriverConfig(driverName);
#else
return NULL;
#endif
}

View File

@@ -43,6 +43,7 @@
#ifdef GLX_USE_APPLEGL
#include "apple/apple_glx_context.h"
#include "apple/apple_glx.h"
#include "util/debug.h"
#else
#include <sys/time.h>
#ifdef XF86VIDMODE

View File

@@ -3485,18 +3485,25 @@ fs_visitor::lower_integer_multiplication()
bool needs_mov = false;
fs_reg orig_dst = inst->dst;
fs_reg low = inst->dst;
if (orig_dst.is_null() || orig_dst.file == MRF ||
regions_overlap(inst->dst, inst->size_written,
inst->src[0], inst->size_read(0)) ||
regions_overlap(inst->dst, inst->size_written,
inst->src[1], inst->size_read(1))) {
needs_mov = true;
inst->dst = fs_reg(VGRF, alloc.allocate(dispatch_width / 8),
inst->dst.type);
/* Get a new VGRF but keep the same stride as inst->dst */
low = fs_reg(VGRF, alloc.allocate(regs_written(inst)),
inst->dst.type);
low.stride = inst->dst.stride;
low.offset = inst->dst.offset % REG_SIZE;
}
fs_reg low = inst->dst;
fs_reg high(VGRF, alloc.allocate(dispatch_width / 8),
/* Get a new VGRF but keep the same stride as inst->dst */
fs_reg high(VGRF, alloc.allocate(regs_written(inst)),
inst->dst.type);
high.stride = inst->dst.stride;
high.offset = inst->dst.offset % REG_SIZE;
if (devinfo->gen >= 7) {
if (inst->src[1].file == IMM) {
@@ -3517,13 +3524,13 @@ fs_visitor::lower_integer_multiplication()
inst->src[1]);
}
ibld.ADD(subscript(inst->dst, BRW_REGISTER_TYPE_UW, 1),
ibld.ADD(subscript(low, BRW_REGISTER_TYPE_UW, 1),
subscript(low, BRW_REGISTER_TYPE_UW, 1),
subscript(high, BRW_REGISTER_TYPE_UW, 0));
if (needs_mov || inst->conditional_mod) {
set_condmod(inst->conditional_mod,
ibld.MOV(orig_dst, inst->dst));
ibld.MOV(orig_dst, low));
}
}

View File

@@ -82,7 +82,7 @@ EXTENSIONS = [
Extension('VK_KHR_wayland_surface', 6, 'VK_USE_PLATFORM_WAYLAND_KHR'),
Extension('VK_KHR_xcb_surface', 6, 'VK_USE_PLATFORM_XCB_KHR'),
Extension('VK_KHR_xlib_surface', 6, 'VK_USE_PLATFORM_XLIB_KHR'),
Extension('VK_KHX_multiview', 1, True),
Extension('VK_KHX_multiview', 1, False),
Extension('VK_EXT_debug_report', 8, True),
]

View File

@@ -44,4 +44,4 @@ if __name__ == '__main__':
}
with open(args.out, 'w') as f:
json.dump(json_data, f, indent = 4)
json.dump(json_data, f, indent = 4, sort_keys=True)

View File

@@ -3095,6 +3095,17 @@ genX(cmd_buffer_set_subpass)(struct anv_cmd_buffer *cmd_buffer,
if (GEN_GEN == 7)
cmd_buffer->state.vb_dirty |= ~0;
/* It is possible to start a render pass with an old pipeline. Because the
* render pass and subpass index are both baked into the pipeline, this is
* highly unlikely. In order to do so, it requires that you have a render
* pass with a single subpass and that you use that render pass twice
* back-to-back and use the same pipeline at the start of the second render
* pass as at the end of the first. In order to avoid unpredictable issues
* with this edge case, we just dirty the pipeline at the start of every
* subpass.
*/
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_PIPELINE;
/* Perform transitions to the subpass layout before any writes have
* occurred.
*/

View File

@@ -1371,10 +1371,10 @@ has_color_buffer_write_enabled(const struct anv_pipeline *pipeline,
if (binding->set != ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS)
continue;
const VkPipelineColorBlendAttachmentState *a =
&blend->pAttachments[binding->index];
if (binding->index == UINT32_MAX)
continue;
if (binding->index != UINT32_MAX && a->colorWriteMask != 0)
if (blend->pAttachments[binding->index].colorWriteMask != 0)
return true;
}
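
Note on the hunk above: the old code formed `&blend->pAttachments[binding->index]` before testing for the `UINT32_MAX` sentinel, while the new code skips unused bindings first. A minimal sketch of the corrected ordering follows. The `demo_*` types and names are hypothetical stand-ins for the Vulkan structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_attachment { uint32_t color_write_mask; };

/* Returns true if any bound attachment has a non-zero write mask.
 * An index of UINT32_MAX marks an unused binding and is skipped before
 * the attachment array is touched. */
static bool
demo_any_color_write(const uint32_t *indices, unsigned count,
                     const struct demo_attachment *attachments)
{
   assert(indices != NULL || count == 0);

   for (unsigned i = 0; i < count; i++) {
      if (indices[i] == UINT32_MAX)
         continue;
      if (attachments[indices[i]].color_write_mask != 0)
         return true;
   }
   return false;
}
```

Because the sentinel is checked first, the function never reads the attachment array when every binding is unused, even if the array pointer is NULL.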

View File

@@ -56,6 +56,7 @@ header = """/* GLXEXT is the define used in the xserver when the GLX extension i
#endif
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "main/glheader.h"
@@ -144,6 +145,19 @@ _glapi_create_table_from_handle(void *handle, const char *symbol_prefix) {
return disp;
}
void
_glapi_table_patch(struct _glapi_table *table, const char *name, void *wrapper)
{
for (int func_index = 0; func_index < GLAPI_TABLE_COUNT; ++func_index) {
if (!strcmp(_glapi_table_func_names[func_index], name)) {
((void **)table)[func_index] = wrapper;
return;
}
}
fprintf(stderr, "could not patch %s in dispatch table\\n", name);
}
"""

View File

@@ -161,6 +161,9 @@ _glapi_get_proc_name(unsigned int offset);
#if defined(GLX_USE_APPLEGL) || defined(GLX_USE_WINDOWSGL)
_GLAPI_EXPORT struct _glapi_table *
_glapi_create_table_from_handle(void *handle, const char *symbol_prefix);
_GLAPI_EXPORT void
_glapi_table_patch(struct _glapi_table *, const char *name, void *wrapper);
#endif
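
Note on `_glapi_table_patch`: the helper declared above finds a dispatch slot by comparing names and overwrites it, so callers no longer need the generated `SET_*` macros. The sketch below shows the same lookup on a hypothetical three-entry table. The `demo_*` names are not part of glapi. As in the generated code, it assumes the table is a struct made only of same-sized function pointers with no padding, so that it can be indexed like an array.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef void (*demo_proc)(void);

/* Hypothetical dispatch table; the real one is generated. */
struct demo_table {
   demo_proc ReadPixels;
   demo_proc DrawBuffer;
   demo_proc Viewport;
};

#define DEMO_TABLE_COUNT 3

/* Names in the same order as the struct members. */
static const char *const demo_func_names[DEMO_TABLE_COUNT] = {
   "ReadPixels", "DrawBuffer", "Viewport",
};

/* Returns 1 if the slot called name was replaced, 0 if no slot matched. */
static int
demo_table_patch(struct demo_table *table, const char *name, demo_proc wrapper)
{
   assert(table != NULL && name != NULL);
   demo_proc *slots = (demo_proc *)table;

   for (int i = 0; i < DEMO_TABLE_COUNT; ++i) {
      if (strcmp(demo_func_names[i], name) == 0) {
         slots[i] = wrapper;
         return 1;
      }
   }
   fprintf(stderr, "could not patch %s in dispatch table\n", name);
   return 0;
}

/* Empty wrapper used only to have something to install. */
static void demo_wrapper(void) { }
```

Unlike the helper in the diff, this variant reports success through its return value. With a name-based lookup, a misspelled entry point is only detected at run time, so a return value lets the caller check for it.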

View File

@@ -319,7 +319,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
enum isl_format dst_isl_format =
brw_blorp_to_isl_format(brw, dst_format, true);
enum isl_aux_usage dst_aux_usage =
intel_miptree_render_aux_usage(brw, dst_mt, dst_isl_format, false);
intel_miptree_render_aux_usage(brw, dst_mt, dst_isl_format,
false, false);
const bool dst_clear_supported = dst_aux_usage != ISL_AUX_USAGE_NONE;
intel_miptree_prepare_access(brw, dst_mt, dst_level, 1, dst_layer, 1,
dst_aux_usage, dst_clear_supported);
@@ -933,7 +934,8 @@ brw_blorp_upload_miptree(struct brw_context *brw,
brw, src_bo, src_format,
src_offset + i * src_image_stride,
width, height, 1,
src_row_stride, 0);
src_row_stride,
ISL_TILING_LINEAR, 0);
if (!src_mt) {
perf_debug("intel_texsubimage: miptree creation for src failed\n");
@@ -1054,7 +1056,8 @@ brw_blorp_download_miptree(struct brw_context *brw,
brw, dst_bo, dst_format,
dst_offset + i * dst_image_stride,
width, height, 1,
dst_row_stride, 0);
dst_row_stride,
ISL_TILING_LINEAR, 0);
if (!dst_mt) {
perf_debug("intel_texsubimage: miptree creation for src failed\n");
@@ -1264,9 +1267,10 @@ do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
irb->mt, irb->mt_level, irb->mt_layer, num_layers);
enum isl_aux_usage aux_usage =
intel_miptree_render_aux_usage(brw, irb->mt, isl_format, false);
intel_miptree_render_aux_usage(brw, irb->mt, isl_format,
false, false);
intel_miptree_prepare_render(brw, irb->mt, level, irb->mt_layer,
num_layers, isl_format, false);
num_layers, aux_usage);
struct isl_surf isl_tmp[2];
struct blorp_surf surf;
@@ -1285,7 +1289,7 @@ do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
blorp_batch_finish(&batch);
intel_miptree_finish_render(brw, irb->mt, level, irb->mt_layer,
num_layers, isl_format, false);
num_layers, aux_usage);
}
return;

View File

@@ -1106,13 +1106,13 @@ brw_bo_get_tiling(struct brw_bo *bo, uint32_t *tiling_mode,
return 0;
}
struct brw_bo *
brw_bo_gem_create_from_prime(struct brw_bufmgr *bufmgr, int prime_fd)
static struct brw_bo *
brw_bo_gem_create_from_prime_internal(struct brw_bufmgr *bufmgr, int prime_fd,
int tiling_mode, uint32_t stride)
{
int ret;
uint32_t handle;
struct brw_bo *bo;
struct drm_i915_gem_get_tiling get_tiling;
mtx_lock(&bufmgr->lock);
ret = drmPrimeFDToHandle(bufmgr->fd, prime_fd, &handle);
@@ -1158,14 +1158,17 @@ brw_bo_gem_create_from_prime(struct brw_bufmgr *bufmgr, int prime_fd)
bo->reusable = false;
bo->external = true;
memclear(get_tiling);
get_tiling.handle = bo->gem_handle;
if (drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_GET_TILING, &get_tiling))
goto err;
if (tiling_mode < 0) {
struct drm_i915_gem_get_tiling get_tiling = { .handle = bo->gem_handle };
if (drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_GET_TILING, &get_tiling))
goto err;
bo->tiling_mode = get_tiling.tiling_mode;
bo->swizzle_mode = get_tiling.swizzle_mode;
/* XXX stride is unknown */
bo->tiling_mode = get_tiling.tiling_mode;
bo->swizzle_mode = get_tiling.swizzle_mode;
/* XXX stride is unknown */
} else {
bo_set_tiling_internal(bo, tiling_mode, stride);
}
out:
mtx_unlock(&bufmgr->lock);
@@ -1177,6 +1180,24 @@ err:
return NULL;
}
struct brw_bo *
brw_bo_gem_create_from_prime(struct brw_bufmgr *bufmgr, int prime_fd)
{
return brw_bo_gem_create_from_prime_internal(bufmgr, prime_fd, -1, 0);
}
struct brw_bo *
brw_bo_gem_create_from_prime_tiled(struct brw_bufmgr *bufmgr, int prime_fd,
uint32_t tiling_mode, uint32_t stride)
{
assert(tiling_mode == I915_TILING_NONE ||
tiling_mode == I915_TILING_X ||
tiling_mode == I915_TILING_Y);
return brw_bo_gem_create_from_prime_internal(bufmgr, prime_fd,
tiling_mode, stride);
}
static void
brw_bo_make_external(struct brw_bo *bo)
{

View File

@@ -336,6 +336,10 @@ void brw_destroy_hw_context(struct brw_bufmgr *bufmgr, uint32_t ctx_id);
int brw_bo_gem_export_to_prime(struct brw_bo *bo, int *prime_fd);
struct brw_bo *brw_bo_gem_create_from_prime(struct brw_bufmgr *bufmgr,
int prime_fd);
struct brw_bo *brw_bo_gem_create_from_prime_tiled(struct brw_bufmgr *bufmgr,
int prime_fd,
uint32_t tiling_mode,
uint32_t stride);
uint32_t brw_bo_export_gem_handle(struct brw_bo *bo);

View File

@@ -1261,6 +1261,21 @@ intel_resolve_for_dri2_flush(struct brw_context *brw,
intel_miptree_prepare_external(brw, rb->mt);
} else {
intel_renderbuffer_downsample(brw, rb);
/* Call prepare_external on the single-sample miptree to do any
* needed resolves prior to handing it off to the window system.
* This is needed in the case that rb->singlesample_mt is Y-tiled
* with CCS_E enabled but without I915_FORMAT_MOD_Y_TILED_CCS_E. In
* this case, the MSAA resolve above will write compressed data into
* rb->singlesample_mt.
*
* TODO: Some day, if we decide to care about the tiny performance
* hit we're taking by doing the MSAA resolve and then a CCS resolve,
* we could detect this case and just allocate the single-sampled
* miptree without aux. However, that would be a lot of plumbing and
* this is a rather exotic case so it's not really worth it.
*/
intel_miptree_prepare_external(brw, rb->singlesample_mt);
}
}
}
@@ -1545,6 +1560,9 @@ intel_process_dri2_buffer(struct brw_context *brw,
return;
}
uint32_t tiling, swizzle;
brw_bo_get_tiling(bo, &tiling, &swizzle);
struct intel_mipmap_tree *mt =
intel_miptree_create_for_bo(brw,
bo,
@@ -1554,6 +1572,7 @@ intel_process_dri2_buffer(struct brw_context *brw,
drawable->h,
1,
buffer->pitch,
isl_tiling_from_i915_tiling(tiling),
MIPTREE_CREATE_DEFAULT);
if (!mt) {
brw_bo_unreference(bo);

View File

@@ -688,7 +688,14 @@ struct brw_context
* and would need flushing before being used from another cache domain that
* isn't coherent with it (i.e. the sampler).
*/
struct set *render_cache;
struct hash_table *render_cache;
/**
* Set of struct brw_bo * that have been used as a depth buffer within this
* batchbuffer and would need flushing before being used from another cache
* domain that isn't coherent with it (i.e. the sampler).
*/
struct set *depth_cache;
/**
* Number of resets observed in the system at context creation.

View File

@@ -426,7 +426,7 @@ brw_predraw_resolve_inputs(struct brw_context *brw)
min_layer, num_layers,
disable_aux);
brw_render_cache_set_check_flush(brw, tex_obj->mt->bo);
brw_cache_flush_for_read(brw, tex_obj->mt->bo);
if (tex_obj->base.StencilSampling ||
tex_obj->mt->format == MESA_FORMAT_S_UINT8) {
@@ -450,7 +450,7 @@ brw_predraw_resolve_inputs(struct brw_context *brw)
intel_miptree_prepare_image(brw, tex_obj->mt);
brw_render_cache_set_check_flush(brw, tex_obj->mt->bo);
brw_cache_flush_for_read(brw, tex_obj->mt->bo);
}
}
}
@@ -507,11 +507,18 @@ brw_predraw_resolve_framebuffer(struct brw_context *brw)
mesa_format mesa_format =
_mesa_get_render_format(ctx, intel_rb_format(irb));
enum isl_format isl_format = brw_isl_format_for_mesa_format(mesa_format);
bool blend_enabled = ctx->Color.BlendEnabled & (1 << i);
enum isl_aux_usage aux_usage =
intel_miptree_render_aux_usage(brw, irb->mt, isl_format,
blend_enabled,
brw->draw_aux_buffer_disabled[i]);
intel_miptree_prepare_render(brw, irb->mt, irb->mt_level,
irb->mt_layer, irb->layer_count,
isl_format,
ctx->Color.BlendEnabled & (1 << i));
aux_usage);
brw_cache_flush_for_render(brw, irb->mt->bo,
isl_format, aux_usage);
}
}
@@ -561,11 +568,11 @@ brw_postdraw_set_buffers_need_resolve(struct brw_context *brw)
depth_written);
}
if (depth_written)
brw_render_cache_set_add_bo(brw, depth_irb->mt->bo);
brw_depth_cache_add_bo(brw, depth_irb->mt->bo);
}
if (stencil_irb && brw->stencil_write_enabled)
brw_render_cache_set_add_bo(brw, stencil_irb->mt->bo);
brw_depth_cache_add_bo(brw, stencil_irb->mt->bo);
for (unsigned i = 0; i < fb->_NumColorDrawBuffers; i++) {
struct intel_renderbuffer *irb =
@@ -577,12 +584,17 @@ brw_postdraw_set_buffers_need_resolve(struct brw_context *brw)
mesa_format mesa_format =
_mesa_get_render_format(ctx, intel_rb_format(irb));
enum isl_format isl_format = brw_isl_format_for_mesa_format(mesa_format);
bool blend_enabled = ctx->Color.BlendEnabled & (1 << i);
enum isl_aux_usage aux_usage =
intel_miptree_render_aux_usage(brw, irb->mt, isl_format,
blend_enabled,
brw->draw_aux_buffer_disabled[i]);
brw_render_cache_add_bo(brw, irb->mt->bo, isl_format, aux_usage);
brw_render_cache_set_add_bo(brw, irb->mt->bo);
intel_miptree_finish_render(brw, irb->mt, irb->mt_level,
irb->mt_layer, irb->layer_count,
isl_format,
ctx->Color.BlendEnabled & (1 << i));
aux_usage);
}
}
@@ -593,7 +605,7 @@ intel_renderbuffer_move_temp_back(struct brw_context *brw,
if (irb->align_wa_mt == NULL)
return;
brw_render_cache_set_check_flush(brw, irb->align_wa_mt->bo);
brw_cache_flush_for_read(brw, irb->align_wa_mt->bo);
intel_miptree_copy_slice(brw, irb->align_wa_mt, 0, 0,
irb->mt,

View File

@@ -293,17 +293,6 @@ brw_is_color_fast_clear_compatible(struct brw_context *brw,
brw->mesa_to_isl_render_format[mt->format])
return false;
/* Gen9 doesn't support fast clear on single-sampled SRGB buffers. When
* GL_FRAMEBUFFER_SRGB is enabled any color renderbuffers will be
* resolved in intel_update_state. In that case it's pointless to do a
* fast clear because it's very likely to be immediately resolved.
*/
if (devinfo->gen >= 9 &&
mt->surf.samples == 1 &&
ctx->Color.sRGBEnabled &&
_mesa_get_srgb_format_linear(mt->format) != mt->format)
return false;
const mesa_format format = _mesa_get_render_format(ctx, mt->format);
if (_mesa_is_format_integer_color(format)) {
if (devinfo->gen >= 8) {

View File

@@ -333,9 +333,9 @@ brw_emit_depthbuffer(struct brw_context *brw)
}
if (depth_mt)
brw_render_cache_set_check_flush(brw, depth_mt->bo);
brw_cache_flush_for_depth(brw, depth_mt->bo);
if (stencil_mt)
brw_render_cache_set_check_flush(brw, stencil_mt->bo);
brw_cache_flush_for_depth(brw, stencil_mt->bo);
brw->vtbl.emit_depth_stencil_hiz(brw, depth_mt, depth_offset,
depthbuffer_format, depth_surface_type,

@@ -230,9 +230,9 @@ gen6_update_renderbuffer_surface(struct brw_context *brw,
enum isl_format isl_format = brw->mesa_to_isl_render_format[rb_format];
enum isl_aux_usage aux_usage =
brw->draw_aux_buffer_disabled[unit] ? ISL_AUX_USAGE_NONE :
intel_miptree_render_aux_usage(brw, mt, isl_format,
ctx->Color.BlendEnabled & (1 << unit));
ctx->Color.BlendEnabled & (1 << unit),
brw->draw_aux_buffer_disabled[unit]);
struct isl_view view = {
.format = isl_format,
@@ -441,23 +441,6 @@ swizzle_to_scs(GLenum swizzle, bool need_green_to_blue)
return (need_green_to_blue && scs == HSW_SCS_GREEN) ? HSW_SCS_BLUE : scs;
}
static bool
brw_aux_surface_disabled(const struct brw_context *brw,
const struct intel_mipmap_tree *mt)
{
const struct gl_framebuffer *fb = brw->ctx.DrawBuffer;
for (unsigned i = 0; i < fb->_NumColorDrawBuffers; i++) {
const struct intel_renderbuffer *irb =
intel_renderbuffer(fb->_ColorDrawBuffers[i]);
if (irb && irb->mt == mt)
return brw->draw_aux_buffer_disabled[i];
}
return false;
}
static void
brw_update_texture_surface(struct gl_context *ctx,
unsigned unit,
@@ -588,9 +571,6 @@ brw_update_texture_surface(struct gl_context *ctx,
enum isl_aux_usage aux_usage =
intel_miptree_texture_aux_usage(brw, mt, format);
if (brw_aux_surface_disabled(brw, mt))
aux_usage = ISL_AUX_USAGE_NONE;
brw_emit_surface_state(brw, mt, mt->target, view, aux_usage,
surf_offset, surf_index,
0);
@@ -1133,6 +1113,14 @@ const struct brw_tracked_state brw_renderbuffer_read_surfaces = {
.emit = update_renderbuffer_read_surfaces,
};
static bool
is_depth_texture(struct intel_texture_object *iobj)
{
GLenum base_format = _mesa_get_format_base_format(iobj->_Format);
return base_format == GL_DEPTH_COMPONENT ||
(base_format == GL_DEPTH_STENCIL && !iobj->base.StencilSampling);
}
static void
update_stage_texture_surfaces(struct brw_context *brw,
const struct gl_program *prog,
@@ -1159,9 +1147,32 @@ update_stage_texture_surfaces(struct brw_context *brw,
if (prog->SamplersUsed & (1 << s)) {
const unsigned unit = prog->SamplerUnits[s];
const bool used_by_txf = prog->info.textures_used_by_txf & (1 << s);
struct gl_texture_object *obj = ctx->Texture.Unit[unit]._Current;
struct intel_texture_object *iobj = intel_texture_object(obj);
/* _NEW_TEXTURE */
if (ctx->Texture.Unit[unit]._Current) {
if (!obj)
continue;
if ((prog->ShadowSamplers & (1 << s)) && !is_depth_texture(iobj)) {
/* A programming note for the sample_c message says:
*
* "The Surface Format of the associated surface must be
* indicated as supporting shadow mapping as indicated in the
* surface format table."
*
* Accessing non-depth textures via a sampler*Shadow type is
* undefined. GLSL 4.50 page 162 says:
*
* "If a shadow texture call is made to a sampler that does not
* represent a depth texture, then results are undefined."
*
* We give them a null surface (zeros) for undefined. We've seen
* GPU hangs with color buffers and sample_c, so we try and avoid
* those with this hack.
*/
emit_null_surface_state(brw, NULL, surf_offset + s);
} else {
brw_update_texture_surface(ctx, unit, surf_offset + s, for_gather,
used_by_txf, plane);
}
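
Reviewer note: the shadow-sampler branch added in this hunk reduces to one predicate. The sketch below restates it outside the driver so the cases are easy to check. All `toy_` names are hypothetical stand-ins, not Mesa API; the GL enum values are the standard ones.

```c
#include <assert.h>
#include <stdbool.h>

/* Standard GL enum values, renamed so the sketch needs no GL headers. */
#define TOY_GL_DEPTH_COMPONENT 0x1902
#define TOY_GL_RGBA            0x1908
#define TOY_GL_DEPTH_STENCIL   0x84F9

/* Same test as is_depth_texture() in the hunk above: a depth/stencil
 * texture only counts as depth when stencil sampling is off. */
static bool
toy_is_depth_texture(unsigned base_format, bool stencil_sampling)
{
   return base_format == TOY_GL_DEPTH_COMPONENT ||
          (base_format == TOY_GL_DEPTH_STENCIL && !stencil_sampling);
}

/* A null surface is bound only when a shadow sampler meets a
 * non-depth texture; every other combination gets a real surface. */
static bool
toy_wants_null_surface(bool is_shadow_sampler, unsigned base_format,
                       bool stencil_sampling)
{
   return is_shadow_sampler &&
          !toy_is_depth_texture(base_format, stencil_sampling);
}
```

With this reading, a `sampler2DShadow` bound to an RGBA texture gets the null surface, while an ordinary sampler on the same texture is unaffected.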

@@ -225,8 +225,16 @@ genX(blorp_exec)(struct blorp_batch *batch,
* data.
*/
if (params->src.enabled)
brw_render_cache_set_check_flush(brw, params->src.addr.buffer);
brw_render_cache_set_check_flush(brw, params->dst.addr.buffer);
brw_cache_flush_for_read(brw, params->src.addr.buffer);
if (params->dst.enabled) {
brw_cache_flush_for_render(brw, params->dst.addr.buffer,
params->dst.view.format,
params->dst.aux_usage);
}
if (params->depth.enabled)
brw_cache_flush_for_depth(brw, params->depth.addr.buffer);
if (params->stencil.enabled)
brw_cache_flush_for_depth(brw, params->stencil.addr.buffer);
brw_select_pipeline(brw, BRW_RENDER_PIPELINE);
@@ -292,10 +300,13 @@ retry:
!params->stencil.enabled;
brw->ib.index_size = -1;
if (params->dst.enabled)
brw_render_cache_set_add_bo(brw, params->dst.addr.buffer);
if (params->dst.enabled) {
brw_render_cache_add_bo(brw, params->dst.addr.buffer,
params->dst.view.format,
params->dst.aux_usage);
}
if (params->depth.enabled)
brw_render_cache_set_add_bo(brw, params->depth.addr.buffer);
brw_depth_cache_add_bo(brw, params->depth.addr.buffer);
if (params->stencil.enabled)
brw_render_cache_set_add_bo(brw, params->stencil.addr.buffer);
brw_depth_cache_add_bo(brw, params->stencil.addr.buffer);
}

@@ -367,11 +367,15 @@ is_passthru_format(uint32_t format)
}
UNUSED static int
uploads_needed(uint32_t format)
uploads_needed(uint32_t format,
bool is_dual_slot)
{
if (!is_passthru_format(format))
return 1;
if (is_dual_slot)
return 2;
switch (format) {
case ISL_FORMAT_R64_PASSTHRU:
case ISL_FORMAT_R64G64_PASSTHRU:
@@ -400,11 +404,19 @@ downsize_format_if_needed(uint32_t format,
if (!is_passthru_format(format))
return format;
/* ISL_FORMAT_R64_PASSTHRU and ISL_FORMAT_R64G64_PASSTHRU with an upload ==
* 1 means that we have been forced to do 2 uploads for a size <= 2. This
* happens with gen < 8 and dvec3 or dvec4 vertex shader input
* variables. In those cases, we return ISL_FORMAT_R32_FLOAT as a way of
* flagging that we want to fill with zeroes this second forced upload.
*/
switch (format) {
case ISL_FORMAT_R64_PASSTHRU:
return ISL_FORMAT_R32G32_FLOAT;
return !upload ? ISL_FORMAT_R32G32_FLOAT
: ISL_FORMAT_R32_FLOAT;
case ISL_FORMAT_R64G64_PASSTHRU:
return ISL_FORMAT_R32G32B32A32_FLOAT;
return !upload ? ISL_FORMAT_R32G32B32A32_FLOAT
: ISL_FORMAT_R32_FLOAT;
case ISL_FORMAT_R64G64B64_PASSTHRU:
return !upload ? ISL_FORMAT_R32G32B32A32_FLOAT
: ISL_FORMAT_R32G32_FLOAT;
@@ -423,6 +435,15 @@ static int
upload_format_size(uint32_t upload_format)
{
switch (upload_format) {
case ISL_FORMAT_R32_FLOAT:
/* downsized_format has returned this one in order to flag that we are
* performing a second upload which we want to have filled with
* zeroes. This happens with gen < 8, a size <= 2, and dvec3 or dvec4
* vertex shader input variables.
*/
return 0;
case ISL_FORMAT_R32G32_FLOAT:
return 2;
case ISL_FORMAT_R32G32B32A32_FLOAT:
@@ -520,7 +541,7 @@ genX(emit_vertices)(struct brw_context *brw)
struct brw_vertex_element *input = brw->vb.enabled[i];
uint32_t format = brw_get_vertex_surface_type(brw, input->glarray);
if (uploads_needed(format) > 1)
if (uploads_needed(format, input->is_dual_slot) > 1)
nr_elements++;
}
#endif
@@ -616,7 +637,8 @@ genX(emit_vertices)(struct brw_context *brw)
uint32_t comp1 = VFCOMP_STORE_SRC;
uint32_t comp2 = VFCOMP_STORE_SRC;
uint32_t comp3 = VFCOMP_STORE_SRC;
const unsigned num_uploads = GEN_GEN < 8 ? uploads_needed(format) : 1;
const unsigned num_uploads = GEN_GEN < 8 ?
uploads_needed(format, input->is_dual_slot) : 1;
#if GEN_GEN >= 8
/* From the BDW PRM, Volume 2d, page 588 (VERTEX_ELEMENT_STATE):
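
Reviewer note: a minimal sketch of how the three helpers changed in this file appear to cooperate for a dual-slot 64-bit attribute on gen < 8. It covers only the two formats whose cases are fully visible in the hunks. The `toy_` names are hypothetical, and two return values are assumptions because the hunks cut them off: the size 4 for the four-component float format, and the single upload for the non-dual-slot case.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_fmt {
   TOY_R64_PASSTHRU,
   TOY_R64G64_PASSTHRU,
   TOY_R32_FLOAT,
   TOY_R32G32_FLOAT,
   TOY_R32G32B32A32_FLOAT,
};

static bool
toy_is_passthru(enum toy_fmt format)
{
   return format == TOY_R64_PASSTHRU || format == TOY_R64G64_PASSTHRU;
}

/* A dual-slot input always takes two uploads, even for one or two doubles. */
static int
toy_uploads_needed(enum toy_fmt format, bool is_dual_slot)
{
   if (!toy_is_passthru(format))
      return 1;
   if (is_dual_slot)
      return 2;
   return 1; /* assumption: truncated in the hunk */
}

/* The forced second upload (upload == 1) is flagged with R32_FLOAT. */
static enum toy_fmt
toy_downsize(enum toy_fmt format, int upload)
{
   switch (format) {
   case TOY_R64_PASSTHRU:
      return !upload ? TOY_R32G32_FLOAT : TOY_R32_FLOAT;
   case TOY_R64G64_PASSTHRU:
      return !upload ? TOY_R32G32B32A32_FLOAT : TOY_R32_FLOAT;
   default:
      return format;
   }
}

/* Size 0 means "no source data, fill this upload with zeroes". */
static int
toy_upload_size(enum toy_fmt upload_format)
{
   switch (upload_format) {
   case TOY_R32_FLOAT:
      return 0;
   case TOY_R32G32_FLOAT:
      return 2;
   case TOY_R32G32B32A32_FLOAT:
      return 4; /* assumption: truncated in the hunk */
   default:
      assert(!"unexpected upload format");
      return 0;
   }
}
```

So a dual-slot `TOY_R64_PASSTHRU` input yields a first upload of size 2 and a second, zero-filled upload of size 0.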

@@ -212,7 +212,7 @@ static void
intel_batchbuffer_reset_and_clear_render_cache(struct brw_context *brw)
{
intel_batchbuffer_reset(brw);
brw_render_cache_set_clear(brw);
brw_cache_sets_clear(brw);
}
void

@@ -970,19 +970,15 @@ intel_renderbuffer_move_to_temp(struct brw_context *brw,
}
void
brw_render_cache_set_clear(struct brw_context *brw)
brw_cache_sets_clear(struct brw_context *brw)
{
struct set_entry *entry;
struct hash_entry *render_entry;
hash_table_foreach(brw->render_cache, render_entry)
_mesa_hash_table_remove(brw->render_cache, render_entry);
set_foreach(brw->render_cache, entry) {
_mesa_set_remove(brw->render_cache, entry);
}
}
void
brw_render_cache_set_add_bo(struct brw_context *brw, struct brw_bo *bo)
{
_mesa_set_add(brw->render_cache, bo);
struct set_entry *depth_entry;
set_foreach(brw->depth_cache, depth_entry)
_mesa_set_remove(brw->depth_cache, depth_entry);
}
/**
@@ -997,14 +993,11 @@ brw_render_cache_set_add_bo(struct brw_context *brw, struct brw_bo *bo)
* necessary is flushed before another use of that BO, but for reuse from
* different caches within a batchbuffer, it's all our responsibility.
*/
void
brw_render_cache_set_check_flush(struct brw_context *brw, struct brw_bo *bo)
static void
flush_depth_and_render_caches(struct brw_context *brw, struct brw_bo *bo)
{
const struct gen_device_info *devinfo = &brw->screen->devinfo;
if (!_mesa_set_search(brw->render_cache, bo))
return;
if (devinfo->gen >= 6) {
brw_emit_pipe_control_flush(brw,
PIPE_CONTROL_DEPTH_CACHE_FLUSH |
@@ -1018,7 +1011,89 @@ brw_render_cache_set_check_flush(struct brw_context *brw, struct brw_bo *bo)
brw_emit_mi_flush(brw);
}
brw_render_cache_set_clear(brw);
brw_cache_sets_clear(brw);
}
void
brw_cache_flush_for_read(struct brw_context *brw, struct brw_bo *bo)
{
if (_mesa_hash_table_search(brw->render_cache, bo) ||
_mesa_set_search(brw->depth_cache, bo))
flush_depth_and_render_caches(brw, bo);
}
static void *
format_aux_tuple(enum isl_format format, enum isl_aux_usage aux_usage)
{
return (void *)(uintptr_t)((uint32_t)format << 8 | aux_usage);
}
void
brw_cache_flush_for_render(struct brw_context *brw, struct brw_bo *bo,
enum isl_format format,
enum isl_aux_usage aux_usage)
{
if (_mesa_set_search(brw->depth_cache, bo))
flush_depth_and_render_caches(brw, bo);
/* Check to see if this bo has been used by a previous rendering operation
* but with a different format or aux usage. If it has, flush the render
* cache so we ensure that it's only in there with one format or aux usage
* at a time.
*
* Even though it's not obvious, this can easily happen in practice.
* Suppose a client is blending on a surface with sRGB encode enabled on
* gen9. This implies that you get AUX_USAGE_CCS_D at best. If the client
* then disables sRGB decode and continues blending we will flip on
* AUX_USAGE_CCS_E without doing any sort of resolve in-between (this is
* perfectly valid since CCS_E is a subset of CCS_D). However, this means
* that we have fragments in-flight which are rendering with UNORM+CCS_E
* and other fragments in-flight with SRGB+CCS_D on the same surface at the
* same time and the pixel scoreboard and color blender are trying to sort
* it all out. This ends badly (i.e. GPU hangs).
*
* To date, we have never observed GPU hangs or even corruption to be
* associated with switching the format, only the aux usage. However,
* there are comments in various docs which indicate that the render cache
* isn't 100% resilient to format changes. We may as well be conservative
* and flush on format changes too. We can always relax this later if we
* find it to be a performance problem.
*/
struct hash_entry *entry = _mesa_hash_table_search(brw->render_cache, bo);
if (entry && entry->data != format_aux_tuple(format, aux_usage))
flush_depth_and_render_caches(brw, bo);
}
void
brw_render_cache_add_bo(struct brw_context *brw, struct brw_bo *bo,
enum isl_format format,
enum isl_aux_usage aux_usage)
{
#ifndef NDEBUG
struct hash_entry *entry = _mesa_hash_table_search(brw->render_cache, bo);
if (entry) {
/* Otherwise, someone didn't do a flush_for_render and that would be
* very bad indeed.
*/
assert(entry->data == format_aux_tuple(format, aux_usage));
}
#endif
_mesa_hash_table_insert(brw->render_cache, bo,
format_aux_tuple(format, aux_usage));
}
void
brw_cache_flush_for_depth(struct brw_context *brw, struct brw_bo *bo)
{
if (_mesa_hash_table_search(brw->render_cache, bo))
flush_depth_and_render_caches(brw, bo);
}
void
brw_depth_cache_add_bo(struct brw_context *brw, struct brw_bo *bo)
{
_mesa_set_add(brw->depth_cache, bo);
}
/**
@@ -1038,6 +1113,8 @@ intel_fbo_init(struct brw_context *brw)
dd->EGLImageTargetRenderbufferStorage =
intel_image_target_renderbuffer_storage;
brw->render_cache = _mesa_set_create(brw, _mesa_hash_pointer,
_mesa_key_pointer_equal);
brw->render_cache = _mesa_hash_table_create(brw, _mesa_hash_pointer,
_mesa_key_pointer_equal);
brw->depth_cache = _mesa_set_create(brw, _mesa_hash_pointer,
_mesa_key_pointer_equal);
}
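
Reviewer note: the render cache now maps each BO to a packed (format, aux usage) value, and `brw_cache_flush_for_render` flushes when that value changes. The sketch below models only that packing and comparison, with a one-entry stand-in for the hash table. The `toy_` names and enum values are hypothetical, not ISL's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for enum isl_format and enum isl_aux_usage. */
enum toy_format { TOY_FORMAT_UNORM = 0x0c7, TOY_FORMAT_SRGB = 0x0c8 };
enum toy_aux    { TOY_AUX_NONE = 0, TOY_AUX_CCS_D = 3, TOY_AUX_CCS_E = 4 };

/* Pack both values into one pointer-sized cookie, as format_aux_tuple()
 * does: aux usage in the low 8 bits, format in the bits above. */
static void *
toy_format_aux_tuple(enum toy_format format, enum toy_aux aux)
{
   assert((unsigned)aux < 256); /* must fit in the low byte */
   return (void *)(uintptr_t)((uint32_t)format << 8 | (uint32_t)aux);
}

/* One-entry model of the render cache for a single BO. */
struct toy_render_entry {
   bool present; /* BO has been rendered to in this batch */
   void *data;   /* tuple it was rendered with */
};

/* Flush only if the BO is tracked AND was last used with a different
 * format or aux usage; an untracked BO needs no flush. */
static bool
toy_needs_flush_for_render(const struct toy_render_entry *entry,
                           enum toy_format format, enum toy_aux aux)
{
   return entry->present &&
          entry->data != toy_format_aux_tuple(format, aux);
}
```

Because the cookie is compared by value, switching from SRGB+CCS_D to UNORM+CCS_E on the same BO triggers a flush, which matches the scenario described in the comment in the hunk above.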

@@ -229,9 +229,16 @@ void
intel_renderbuffer_upsample(struct brw_context *brw,
struct intel_renderbuffer *irb);
void brw_render_cache_set_clear(struct brw_context *brw);
void brw_render_cache_set_add_bo(struct brw_context *brw, struct brw_bo *bo);
void brw_render_cache_set_check_flush(struct brw_context *brw, struct brw_bo *bo);
void brw_cache_sets_clear(struct brw_context *brw);
void brw_cache_flush_for_read(struct brw_context *brw, struct brw_bo *bo);
void brw_cache_flush_for_render(struct brw_context *brw, struct brw_bo *bo,
enum isl_format format,
enum isl_aux_usage aux_usage);
void brw_cache_flush_for_depth(struct brw_context *brw, struct brw_bo *bo);
void brw_render_cache_add_bo(struct brw_context *brw, struct brw_bo *bo,
enum isl_format format,
enum isl_aux_usage aux_usage);
void brw_depth_cache_add_bo(struct brw_context *brw, struct brw_bo *bo);
unsigned
intel_quantize_num_samples(struct intel_screen *intel, unsigned num_samples);

@@ -207,13 +207,7 @@ intel_miptree_supports_ccs(struct brw_context *brw,
if (!brw->mesa_format_supports_render[mt->format])
return false;
if (devinfo->gen >= 9) {
mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format);
const enum isl_format isl_format =
brw_isl_format_for_mesa_format(linear_format);
return isl_format_supports_ccs_e(&brw->screen->devinfo, isl_format);
} else
return true;
return true;
}
static bool
@@ -256,7 +250,7 @@ intel_miptree_supports_hiz(const struct brw_context *brw,
* our HW tends to support more linear formats than sRGB ones, we use this
* format variant for check for CCS_E compatibility.
*/
MAYBE_UNUSED static bool
static bool
format_ccs_e_compat_with_miptree(const struct gen_device_info *devinfo,
const struct intel_mipmap_tree *mt,
enum isl_format access_format)
@@ -290,12 +284,13 @@ intel_miptree_supports_ccs_e(struct brw_context *brw,
if (!intel_miptree_supports_ccs(brw, mt))
return false;
/* Fast clear can be also used to clear srgb surfaces by using equivalent
* linear format. This trick, however, can't be extended to be used with
* lossless compression and therefore a check is needed to see if the format
* really is linear.
/* Many window system buffers are sRGB even if they are never rendered as
* sRGB. For those, we want CCS_E for when sRGBEncode is false. When the
* surface is used as sRGB, we fall back to CCS_D.
*/
return _mesa_get_srgb_format_linear(mt->format) == mt->format;
mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format);
enum isl_format isl_format = brw_isl_format_for_mesa_format(linear_format);
return isl_format_supports_ccs_e(&brw->screen->devinfo, isl_format);
}
/**
@@ -805,11 +800,11 @@ intel_miptree_create_for_bo(struct brw_context *brw,
uint32_t height,
uint32_t depth,
int pitch,
enum isl_tiling tiling,
enum intel_miptree_create_flags flags)
{
const struct gen_device_info *devinfo = &brw->screen->devinfo;
struct intel_mipmap_tree *mt;
uint32_t tiling, swizzle;
const GLenum target = depth > 1 ? GL_TEXTURE_2D_ARRAY : GL_TEXTURE_2D;
const GLenum base_format = _mesa_get_format_base_format(format);
@@ -847,12 +842,10 @@ intel_miptree_create_for_bo(struct brw_context *brw,
return mt;
}
brw_bo_get_tiling(bo, &tiling, &swizzle);
/* Nothing will be able to use this miptree with the BO if the offset isn't
* aligned.
*/
if (tiling != I915_TILING_NONE)
if (tiling != ISL_TILING_LINEAR)
assert(offset % 4096 == 0);
/* miptrees can't handle negative pitch. If you need flipping of images,
@@ -867,7 +860,7 @@ intel_miptree_create_for_bo(struct brw_context *brw,
mt = make_surface(brw, target, format,
0, 0, width, height, depth, 1,
1lu << isl_tiling_from_i915_tiling(tiling),
1lu << tiling,
ISL_SURF_USAGE_RENDER_TARGET_BIT |
ISL_SURF_USAGE_TEXTURE_BIT,
0, pitch, bo);
@@ -892,7 +885,8 @@ intel_miptree_create_for_bo(struct brw_context *brw,
static struct intel_mipmap_tree *
miptree_create_for_planar_image(struct brw_context *brw,
__DRIimage *image, GLenum target)
__DRIimage *image, GLenum target,
enum isl_tiling tiling)
{
const struct intel_image_format *f = image->planar_format;
struct intel_mipmap_tree *planar_mt = NULL;
@@ -914,6 +908,7 @@ miptree_create_for_planar_image(struct brw_context *brw,
image->offsets[index],
width, height, 1,
image->strides[index],
tiling,
MIPTREE_CREATE_NO_AUX);
if (mt == NULL)
return NULL;
@@ -988,8 +983,17 @@ intel_miptree_create_for_dri_image(struct brw_context *brw,
mesa_format format,
bool is_winsys_image)
{
uint32_t bo_tiling, bo_swizzle;
brw_bo_get_tiling(image->bo, &bo_tiling, &bo_swizzle);
const struct isl_drm_modifier_info *mod_info =
isl_drm_modifier_get_info(image->modifier);
const enum isl_tiling tiling =
mod_info ? mod_info->tiling : isl_tiling_from_i915_tiling(bo_tiling);
if (image->planar_format && image->planar_format->nplanes > 1)
return miptree_create_for_planar_image(brw, image, target);
return miptree_create_for_planar_image(brw, image, target, tiling);
if (image->planar_format)
assert(image->planar_format->planes[0].dri_format == image->dri_format);
@@ -1010,9 +1014,6 @@ intel_miptree_create_for_dri_image(struct brw_context *brw,
if (!brw->ctx.TextureFormatSupported[format])
return NULL;
const struct isl_drm_modifier_info *mod_info =
isl_drm_modifier_get_info(image->modifier);
enum intel_miptree_create_flags mt_create_flags = 0;
/* If this image comes in from a window system, we have different
@@ -1038,7 +1039,7 @@ intel_miptree_create_for_dri_image(struct brw_context *brw,
struct intel_mipmap_tree *mt =
intel_miptree_create_for_bo(brw, image->bo, format,
image->offset, image->width, image->height, 1,
image->pitch, mt_create_flags);
image->pitch, tiling, mt_create_flags);
if (mt == NULL)
return NULL;
@@ -2682,38 +2683,42 @@ enum isl_aux_usage
intel_miptree_render_aux_usage(struct brw_context *brw,
struct intel_mipmap_tree *mt,
enum isl_format render_format,
bool blend_enabled)
bool blend_enabled,
bool draw_aux_disabled)
{
struct gen_device_info *devinfo = &brw->screen->devinfo;
if (draw_aux_disabled)
return ISL_AUX_USAGE_NONE;
switch (mt->aux_usage) {
case ISL_AUX_USAGE_MCS:
assert(mt->mcs_buf);
return ISL_AUX_USAGE_MCS;
case ISL_AUX_USAGE_CCS_D:
/* If FRAMEBUFFER_SRGB is used on Gen9+ then we need to resolve any of
* the single-sampled color renderbuffers because the CCS buffer isn't
* supported for SRGB formats. This only matters if FRAMEBUFFER_SRGB is
* enabled because otherwise the surface state will be programmed with
* the linear equivalent format anyway.
*/
if (isl_format_is_srgb(render_format) &&
_mesa_get_srgb_format_linear(mt->format) != mt->format) {
case ISL_AUX_USAGE_CCS_E:
if (!mt->mcs_buf) {
assert(mt->aux_usage == ISL_AUX_USAGE_CCS_D);
return ISL_AUX_USAGE_NONE;
} else if (!mt->mcs_buf) {
return ISL_AUX_USAGE_NONE;
} else {
return ISL_AUX_USAGE_CCS_D;
}
case ISL_AUX_USAGE_CCS_E: {
/* Lossless compression is not supported for SRGB formats, it
* should be impossible to get here with such surfaces.
/* gen9 hardware technically supports non-0/1 clear colors with sRGB
* formats. However, there are issues with blending where it doesn't
* properly apply the sRGB curve to the clear color when blending.
*/
assert(!isl_format_is_srgb(render_format) ||
_mesa_get_srgb_format_linear(mt->format) == mt->format);
if (devinfo->gen == 9 && blend_enabled &&
isl_format_is_srgb(render_format) &&
!isl_color_value_is_zero_one(mt->fast_clear_color, render_format))
return ISL_AUX_USAGE_NONE;
return ISL_AUX_USAGE_CCS_E;
}
if (mt->aux_usage == ISL_AUX_USAGE_CCS_E &&
format_ccs_e_compat_with_miptree(&brw->screen->devinfo,
mt, render_format))
return ISL_AUX_USAGE_CCS_E;
/* Otherwise, we have to fall back to CCS_D */
return ISL_AUX_USAGE_CCS_D;
default:
return ISL_AUX_USAGE_NONE;
@@ -2724,11 +2729,8 @@ void
intel_miptree_prepare_render(struct brw_context *brw,
struct intel_mipmap_tree *mt, uint32_t level,
uint32_t start_layer, uint32_t layer_count,
enum isl_format render_format,
bool blend_enabled)
enum isl_aux_usage aux_usage)
{
enum isl_aux_usage aux_usage =
intel_miptree_render_aux_usage(brw, mt, render_format, blend_enabled);
intel_miptree_prepare_access(brw, mt, level, 1, start_layer, layer_count,
aux_usage, aux_usage != ISL_AUX_USAGE_NONE);
}
@@ -2737,13 +2739,10 @@ void
intel_miptree_finish_render(struct brw_context *brw,
struct intel_mipmap_tree *mt, uint32_t level,
uint32_t start_layer, uint32_t layer_count,
enum isl_format render_format,
bool blend_enabled)
enum isl_aux_usage aux_usage)
{
assert(_mesa_is_format_color_format(mt->format));
enum isl_aux_usage aux_usage =
intel_miptree_render_aux_usage(brw, mt, render_format, blend_enabled);
intel_miptree_finish_write(brw, mt, level, start_layer, layer_count,
aux_usage);
}
@@ -3007,7 +3006,7 @@ intel_update_r8stencil(struct brw_context *brw,
}
}
brw_render_cache_set_check_flush(brw, dst->bo);
brw_cache_flush_for_read(brw, dst->bo);
src->r8stencil_needs_update = false;
}
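
Reviewer note: old and new lines are interleaved in the `intel_miptree_render_aux_usage` hunk above, so the post-patch control flow is hard to read. The sketch below is one reading of the new version, with driver state flattened into plain booleans. Every `toy_` name is hypothetical, and the real function queries the miptree and ISL instead.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_aux { TOY_AUX_NONE, TOY_AUX_MCS, TOY_AUX_CCS_D, TOY_AUX_CCS_E };

struct toy_miptree {
   enum toy_aux aux_usage;       /* what the miptree was created with */
   bool has_mcs_buf;             /* aux buffer actually allocated */
   bool clear_color_is_zero_one; /* fast-clear color is all 0s/1s */
};

struct toy_render_state {
   int gen;
   bool blend_enabled;
   bool render_format_is_srgb;
   bool format_ccs_e_compatible; /* format_ccs_e_compat_with_miptree() */
   bool draw_aux_disabled;
};

static enum toy_aux
toy_render_aux_usage(const struct toy_miptree *mt,
                     const struct toy_render_state *rs)
{
   if (rs->draw_aux_disabled)
      return TOY_AUX_NONE;

   switch (mt->aux_usage) {
   case TOY_AUX_MCS:
      assert(mt->has_mcs_buf);
      return TOY_AUX_MCS;
   case TOY_AUX_CCS_D:
   case TOY_AUX_CCS_E:
      if (!mt->has_mcs_buf) {
         assert(mt->aux_usage == TOY_AUX_CCS_D);
         return TOY_AUX_NONE;
      }
      /* gen9 sRGB blending with a non-0/1 clear color: no aux at all. */
      if (rs->gen == 9 && rs->blend_enabled &&
          rs->render_format_is_srgb && !mt->clear_color_is_zero_one)
         return TOY_AUX_NONE;
      if (mt->aux_usage == TOY_AUX_CCS_E && rs->format_ccs_e_compatible)
         return TOY_AUX_CCS_E;
      /* Otherwise fall back to CCS_D. */
      return TOY_AUX_CCS_D;
   default:
      return TOY_AUX_NONE;
   }
}
```

Callers then pass the resulting value to the prepare/finish render helpers, which after this patch take an `aux_usage` argument instead of recomputing it.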

@@ -401,6 +401,7 @@ intel_miptree_create_for_bo(struct brw_context *brw,
uint32_t height,
uint32_t depth,
int pitch,
enum isl_tiling tiling,
enum intel_miptree_create_flags flags);
struct intel_mipmap_tree *
@@ -651,19 +652,18 @@ enum isl_aux_usage
intel_miptree_render_aux_usage(struct brw_context *brw,
struct intel_mipmap_tree *mt,
enum isl_format render_format,
bool blend_enabled);
bool blend_enabled,
bool draw_aux_disabled);
void
intel_miptree_prepare_render(struct brw_context *brw,
struct intel_mipmap_tree *mt, uint32_t level,
uint32_t start_layer, uint32_t layer_count,
enum isl_format render_format,
bool blend_enabled);
enum isl_aux_usage aux_usage);
void
intel_miptree_finish_render(struct brw_context *brw,
struct intel_mipmap_tree *mt, uint32_t level,
uint32_t start_layer, uint32_t layer_count,
enum isl_format render_format,
bool blend_enabled);
enum isl_aux_usage aux_usage);
void
intel_miptree_prepare_depth(struct brw_context *brw,
struct intel_mipmap_tree *mt, uint32_t level,

@@ -118,6 +118,7 @@ do_blit_drawpixels(struct gl_context * ctx,
src_offset,
width, height, 1,
src_stride,
ISL_TILING_LINEAR,
MIPTREE_CREATE_DEFAULT);
if (!pbo_mt)
return false;

@@ -964,7 +964,16 @@ intel_create_image_from_fds_common(__DRIscreen *dri_screen,
image->planar_format = f;
image->bo = brw_bo_gem_create_from_prime(screen->bufmgr, fds[0]);
if (modifier != DRM_FORMAT_MOD_INVALID) {
const struct isl_drm_modifier_info *mod_info =
isl_drm_modifier_get_info(modifier);
uint32_t tiling = isl_tiling_to_i915_tiling(mod_info->tiling);
image->bo = brw_bo_gem_create_from_prime_tiled(screen->bufmgr, fds[0],
tiling, strides[0]);
} else {
image->bo = brw_bo_gem_create_from_prime(screen->bufmgr, fds[0]);
}
if (image->bo == NULL) {
free(image);
return NULL;
@@ -1664,8 +1673,8 @@ intel_init_bufmgr(struct intel_screen *screen)
return false;
}
if (!intel_get_boolean(screen, I915_PARAM_HAS_WAIT_TIMEOUT)) {
fprintf(stderr, "[%s: %u] Kernel 3.6 required.\n", __func__, __LINE__);
if (!intel_get_boolean(screen, I915_PARAM_HAS_EXEC_NO_RELOC)) {
fprintf(stderr, "[%s: %u] Kernel 3.9 required.\n", __func__, __LINE__);
return false;
}

@@ -480,6 +480,7 @@ intelSetTexBuffer2(__DRIcontext *pDRICtx, GLint target,
rb->Base.Base.Width,
rb->Base.Base.Height,
1, rb->mt->surf.row_pitch,
rb->mt->surf.tiling,
MIPTREE_CREATE_DEFAULT);
if (mt == NULL)
return;

@@ -334,12 +334,6 @@ r100CreateContext( gl_api api,
rmesa->radeon.do_usleeps = (fthrottle_mode == DRI_CONF_FTHROTTLE_USLEEPS);
#if DO_DEBUG
RADEON_DEBUG = parse_debug_string( getenv( "RADEON_DEBUG" ),
debug_control );
#endif
tcl_mode = driQueryOptioni(&rmesa->radeon.optionCache, "tcl_mode");
if (driQueryOptionb(&rmesa->radeon.optionCache, "no_rast")) {
fprintf(stderr, "disabling 3D acceleration\n");

@@ -734,6 +734,6 @@ void st_init_driver_functions(struct pipe_screen *screen,
functions->UpdateState = st_invalidate_state;
functions->QueryMemoryInfo = st_query_memory_info;
functions->SetBackgroundContext = st_set_background_context;
functions->GetDriverUuid = st_get_device_uuid;
functions->GetDeviceUuid = st_get_driver_uuid;
functions->GetDriverUuid = st_get_driver_uuid;
functions->GetDeviceUuid = st_get_device_uuid;
}

@@ -31,6 +31,7 @@ noinst_LTLIBRARIES = \
libxmlconfig.la
AM_CPPFLAGS = \
$(PTHREAD_CFLAGS) \
-I$(top_srcdir)/include
libmesautil_la_CPPFLAGS = \
@@ -50,6 +51,7 @@ libmesautil_la_SOURCES = \
$(MESA_UTIL_GENERATED_FILES)
libmesautil_la_LIBADD = \
$(PTHREAD_LIBS) \
$(CLOCK_LIB) \
$(ZLIB_LIBS) \
$(LIBATOMIC_LIBS)