Compare commits

...

60 Commits

Author SHA1 Message Date
Emil Velikov
17c0e248d7 Update version to 18.0.0-rc3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-29 11:27:18 +00:00
Emil Velikov
92a332ed1a cherry-ignore: add patches picked without -x
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-26 19:53:02 +00:00
Maxin B. John
74b39c0bbf anv_icd.py: improve reproducible builds
Sort the output to ensure build reproducibility

Signed-off-by: Maxin B. John <maxin.john@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Fixes: 0ab04ba979 ("anv: Use python to generate ICD json files")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 8116b9170b)
2018-01-26 19:53:02 +00:00
Bas Nieuwenhuizen
a5bdf2abf9 radeonsi: Export signalled sync file instead of -1.
-1 is considered an error for EGL_ANDROID_native_fence_sync, so
we need to actually create a sync file.

Fixes: f536f45250 "radeonsi: implement sync_file import/export"
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 5a3404d443)
2018-01-26 19:53:02 +00:00
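For illustration, one plausible shape of such a fix, sketched with libdrm's syncobj API (a minimal standalone sketch, not the exact winsys code — the real change adds the export_signalled_sync_file() hook shown in the radeon_winsys.h and si_fence.c diffs below):

    /* Create a syncobj that starts out signalled and export it as a
     * sync_file FD; waiting on that FD then returns immediately. */
    #include <xf86drm.h>

    int export_signalled_sync_file(int drm_fd)
    {
        uint32_t syncobj;
        int sync_file_fd = -1;

        if (drmSyncobjCreate(drm_fd, DRM_SYNCOBJ_CREATE_SIGNALED, &syncobj))
            return -1;

        /* The syncobj handle can be dropped once the FD exists; the FD
         * keeps the signalled fence state alive on its own. */
        drmSyncobjExportSyncFile(drm_fd, syncobj, &sync_file_fd);
        drmSyncobjDestroy(drm_fd, syncobj);
        return sync_file_fd;
    }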
Dave Airlie
305b0b1356 radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)
This seems to be broken; at least the CTS tests fail.

This fixes:
dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_4
dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_8

2 samples seem to pass fine; amdvlk doesn't appear to enable TC here,
possibly for some other reasons.

This is most likely a hack.

v1.1: add a bit of explanation text. (Samuel)
Fixes: ad3d98da9 (radv: enable tc compatible htile for d32s8 also.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit f4c534ef68)
2018-01-26 19:53:02 +00:00
Emil Velikov
28680e72b8 configure.ac: correct driglx-direct help text
The default was toggled a while back, but the text wasn't updated.

Fixes: bd526ec9e1 ("configure: Always default to
--enable-driglx-direct")
Cc: Jon TURNEY <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 6aeef54644)
2018-01-26 19:53:02 +00:00
Roland Scheidegger
32b2c0da59 gallivm: fix crash with seamless cube filtering with different min/mag filter
We are not allowed to modify the incoming coords values, or things may
crash (as we may be inside an LLVM conditional and the values may be used
in another branch).
I recently broke this when fixing an issue with NaNs and seamless cube
map filtering, and it causes crashes when doing cubemap filtering
if the min and mag filters are different.
Add const to the pointers passed in to prevent this mishap in the future.

Fixes: a485ad0bcd ("gallivm: fix an issue with NaNs with seamless cube filtering")

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 4fe662c58f)
2018-01-26 19:53:02 +00:00
Greg V
b01ea9701e meson: handle LLVM 'x.x.xgit-revision' versions
When LLVM is built inside a git repo (even one far above the build tree,
e.g. /usr/ports/.git exists and LLVM is built in /usr/ports/devel/llvm50/work),
its version becomes something like 5.0.0git-f8ab206b2176.

New meson versions already handle this, but we support older versions too.

Fixes: 673dda8330 ("meson: build "radv" vulkan driver for radeon hardware")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 8fae5eddd9)
2018-01-26 19:53:02 +00:00
Greg V
bf22d563f5 meson: fix getting cflags from pkg-config
get_pkgconfig_variable('cflags') always returns an empty list; it's a
function for getting *custom* variables.

Meson does not yet support asking for cflags, so explicitly invoke
pkg-config for now.

Fixes: 68076b8747 ("meson: build gallium vdpau state tracker")
Fixes: a817af8a89eb ("meson: build gallium xvmc state tracker")
Fixes: 1d36dc674d ("meson: build gallium omx state tracker")
Fixes: 5a785d51a6 ("meson: build gallium va state tracker")
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
(cherry picked from commit 53f9131205)
2018-01-26 19:53:02 +00:00
Greg V
af8c66ba6b meson: fix missing dependencies
Fixes: 66f97f6640 ("meson: build radeonsi")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
(cherry picked from commit 7c8cfe2d59)
2018-01-26 19:53:02 +00:00
Dylan Baker
a807ad2f7c meson: correctly set SYSCONFDIR for loading drirc
Fixes: d1992255bb ("meson: Add build Intel "anv" vulkan driver")
Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 5781c3d1db)
2018-01-26 19:53:01 +00:00
Dave Airlie
90e4f15053 radv: move spi_baryc_cntl to pipeline
We need to enable the pos float location 2 mode anytime we have persample,
not just when forced by the frag shader.

This fixes:
dEQP-VK.pipeline.multisample.min_sample_shading*

Fixes: 58c97a079 (radv: enable location at sample when persample is forced.)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 298554541d)
2018-01-26 19:53:01 +00:00
Scott D Phillips
12afb389d6 meson: Fix define for USE_SSE41
Before, we were adding -DHAVE_SSE41, which isn't what the code is
looking for, so some uses of the SSE4.1 code were always being
skipped.

v2: Don't add any compile check for the quite old -msse4.1 option (Dylan)

Fixes: 84486f6462 ("meson: Enable SSE4.1 optimizations")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 0b8d38bd48)
2018-01-26 19:53:01 +00:00
Brian Paul
2d7035ee48 vbo: fix incorrect min/max_index values in display list draw call
This fixes another regression from commit 8e4efdc895 ("vbo: optimize
some display list drawing").  The problem was that the min_index and
max_index values passed to the vbo drawing function were not adjusted to
compensate for the biased prim::start values.

https://bugs.freedesktop.org/show_bug.cgi?id=104746
https://bugs.freedesktop.org/show_bug.cgi?id=104742
https://bugs.freedesktop.org/show_bug.cgi?id=104690
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
Fixes: 8e4efdc895 ("vbo: optimize some display list drawing")
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
(cherry picked from commit 365a48abdd)
2018-01-26 19:53:01 +00:00
Dave Airlie
80ca933e68 radv: fix sample_mask_in loading. (v3.1)
This is ported from radeonsi and fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_mask.bit_*

v2: don't call this path for radeonsi (it does this in the epilog);
use the radeonsi code path.
v3: handle NULL pCreateInfo->pMultisampleState properly (Samuel)
v3.1: set ps_iter_samples default to 1 (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: bdcbe7c76 (radv: add sample mask input support)
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 766589d89a)
2018-01-26 19:53:01 +00:00
Dave Airlie
05e6e669bd radv: don't use hw resolves for r16g16 norm formats.
radeonsi has a workaround for this, but it uses an R16A16 format, which
Vulkan doesn't have.  We could probably come up with a workaround, but
for now just avoid hw resolves.

Fixes:
dEQP-VK.renderpass.suballocation.multisample.r16g16_*norm*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 2a04f5481d (radv/meta: select resolve paths)
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c727ea9370)
2018-01-26 19:53:01 +00:00
Dave Airlie
62803e022e radv: don't use hw resolve for integer image formats
From reading AMDVLK, it appears it currently never uses hw resolve paths.

This patch takes the approach from radeonsi, which doesn't use hw resolve
for integer formats, and does the same for radv.

This fixes:
dEQP-VK.renderpass.suballocation.multisample*uint tests.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 2a04f5481d (radv/meta: select resolve paths)
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4df414bbd2)
2018-01-26 19:53:01 +00:00
Dave Airlie
e76f0abed8 radv: add fs_key meta format support to resolve passes.
Some of the hw resolve passes need the SPI color format set up
correctly.

This fixes lots of 16-bit and 32-bit format tests in
dEQP-VK.renderpass.suballocation.multisample*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 316d762186)
2018-01-26 19:53:01 +00:00
Christoph Haag
eaf9500651 meson: remove lib prefix from libd3dadapter9.so
Fixes: 6b4c7047d5 ("meson: build gallium nine state_tracker")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
(cherry picked from commit 4b4d929c27)
2018-01-26 19:53:01 +00:00
Eric Engestrom
3ca5ace19d radeon: remove left over dead code
Fixes: 4e0d99a635 "r100: Use shared debug code"
Cc: Pauli Nieminen <suokkos@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit eee8dd7c33)
2018-01-26 19:53:01 +00:00
Rafael Antognolli
e1ac54507e i965/gen10: Re-enable push constants.
The GPU hang caused by push constants is apparently fixed, so let's
enable them again.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit bcfd78e448)
2018-01-26 19:53:01 +00:00
Rafael Antognolli
dcdeb6a33e anv/gen10: Ignore push constant packets during context restore.
Similar to the GL driver, ignore 3DSTATE_CONSTANT_* packets when doing a
context restore.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 78c125af39)
2018-01-26 19:53:01 +00:00
Rafael Antognolli
8452d0f466 i965/gen10: Ignore push constant packets during context restore.
These packets were causing GPU hangs when the context was restored,
possibly because they were pointing to BOs that were already
unreferenced.  So we tell the hardware to ignore such packets after the
batch buffer ends, since we know those BOs are not around anymore.

This change fixes GPU hangs on CNL. The (partial) solution to this
problem so far was to entirely disable push constants on this platform.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit ca19ee33d7)
2018-01-26 19:53:01 +00:00
Eleni Maria Stea
123a39cd6a mesa: Fix function pointer initialization in state tracker
We assigned the function that gets the device UUID to the GetDriverUuid
function pointer and the function that gets the driver UUID to the
GetDeviceUuid function pointer inside the state tracker.  Exchanged the
pointers.

cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 8096b558a7)
2018-01-26 19:53:01 +00:00
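A minimal sketch of the swap being described (the st_* helper names here are hypothetical stand-ins; the real vtable is Mesa's dd_function_table):

    typedef void (*uuid_fn)(char uuid[16]);

    struct driver_functions {
        uuid_fn GetDeviceUuid;
        uuid_fn GetDriverUuid;
    };

    static void st_get_device_uuid(char uuid[16]) { /* query device UUID */ }
    static void st_get_driver_uuid(char uuid[16]) { /* query driver UUID */ }

    static void init_functions(struct driver_functions *f)
    {
        /* The bug: these two assignments were crossed, so asking for the
         * device UUID returned the driver UUID and vice versa. */
        f->GetDeviceUuid = st_get_device_uuid;
        f->GetDriverUuid = st_get_driver_uuid;
    }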
Samuel Pitoiset
639d95e93f ac/nir: set amdgpu.uniform and invariant.load for UBOs
UBOs are constant buffers.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Fixes: 41c36c45 ("amd/common: use ac_build_buffer_load() for emitting UBO loads")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 49b0a140a7)
2018-01-26 19:53:01 +00:00
Jason Ekstrand
70814af14f anv/pipeline: Don't look at blend state unless we have an attachment
Without this, we may end up dereferencing blend before we check for
binding->index != UINT32_MAX.  However, Vulkan allows the blend state to
be NULL so long as you don't have any color attachments.  This fixes a
segfault when running The Talos Principle.

Fixes: 12f4e00b69
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit c8949e2498)
2018-01-26 19:53:01 +00:00
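A hedged sketch of the reordering (the binding array and its fields are simplified, hypothetical stand-ins; only the check order and the NULL-ability of the blend state come from the commit message):

    const VkPipelineColorBlendStateCreateInfo *blend =
        pCreateInfo->pColorBlendState;   /* may legally be NULL when the
                                          * subpass has no color attachments */

    for (uint32_t i = 0; i < num_bindings; i++) {
        /* Skip unused bindings *before* touching blend at all. */
        if (bindings[i].index == UINT32_MAX)
            continue;

        /* A color attachment exists, so blend must be non-NULL here. */
        const VkPipelineColorBlendAttachmentState *att =
            &blend->pAttachments[bindings[i].index];
        /* ... inspect att->colorWriteMask etc. ... */
    }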
Jason Ekstrand
ca6942c672 i965/fs: Reset the register file to VGRF in lower_integer_multiplication
18fde36ced changed the way temporary
registers were allocated in lower_integer_multiplication so that we
allocate regs_written(inst) space and keep the stride of the original
destination register.  This was to ensure that any MUL which originally
followed the CHV/BXT integer multiply regioning restrictions would
continue to follow those restrictions even after lowering.  This works
fine except that I forgot to reset the register file to VGRF so, even
though they were assigned a number from alloc.allocate(), they had the
wrong register file.  This caused some GLES 3.0 CTS tests to start
failing on Sandy Bridge due to attempted reads from the MRF:

    ES3-CTS.functional.shaders.precision.int.highp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.int.mediump_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.int.lowp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.highp_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.mediump_mul_fragment.snbm64
    ES3-CTS.functional.shaders.precision.uint.lowp_mul_fragment.snbm64

This commit remedies the problem: instead of copying inst->dst and
overwriting nr, we now make a new register and set its region to match
inst->dst.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103626
Fixes: 18fde36ced
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit db682b8f0e)
2018-01-26 19:53:01 +00:00
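An illustrative sketch of the register-file pitfall (types heavily simplified from i965's fs_reg; this is not the actual compiler code):

    struct reg {
        enum reg_file { BAD_FILE, ARF, FIXED_GRF, MRF, VGRF, IMM } file;
        unsigned nr;       /* register number within the file */
        unsigned stride;   /* region stride, kept for CHV/BXT rules */
    };

    /* Buggy pattern: the copy inherits dst.file, which may be MRF. */
    struct reg make_temp_buggy(struct reg dst, unsigned new_nr)
    {
        struct reg tmp = dst;
        tmp.nr = new_nr;
        return tmp;
    }

    /* Fixed pattern: build a fresh VGRF register, then copy the region. */
    struct reg make_temp_fixed(struct reg dst, unsigned new_nr)
    {
        struct reg tmp = { .file = VGRF, .nr = new_nr, .stride = dst.stride };
        return tmp;
    }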
Chuck Atkins
9550852086 configure.ac: add missing llvm dependencies to .pc files
v2: Only add as dependencies for gallium-osmesa and gallium-xlib

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 6ac5e851f1)
2018-01-26 19:53:01 +00:00
George Kyriazis
2594045132 swr/rast: support llvm 3.9 type declarations
LLVM 3.9 was not taken into account in the initial check-in.

Fixes: 01ab218bbc ("swr/rast: Initial work for debugging support.")
cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104749
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit 0e879aad2f)
2018-01-26 19:53:01 +00:00
Jason Ekstrand
b62cefdef8 i965/draw: Set NEW_AUX_STATE when draw aux changes
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104411
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104383
Fixes: ea0d2e98ec
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 20f70ae385)
2018-01-26 19:53:01 +00:00
Jason Ekstrand
521d5b4dcc i965: Replace draw_aux_buffer_disabled with draw_aux_usage
Instead of keeping an array of booleans, we now hang onto an array of
isl_aux_usage enums.  This means that what we pass from brw_draw.c to
surface state setup is the thing surface state setup actually needs,
rather than an input from which to compute it.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit e52a9f18d6)
2018-01-26 19:53:01 +00:00
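A sketch of the data-structure change (field names from the commit message, surrounding context simplified; the enumerators shown are ISL's):

    enum isl_aux_usage {
        ISL_AUX_USAGE_NONE,
        ISL_AUX_USAGE_CCS_D,
        ISL_AUX_USAGE_CCS_E,
        /* ... */
    };

    #define MAX_DRAW_BUFFERS 8

    struct brw_context_sketch {
        /* Before: only remembered that aux had been turned off. */
        bool draw_aux_buffer_disabled[MAX_DRAW_BUFFERS];

        /* After: remember the exact aux usage chosen at draw time, which
         * is what surface-state setup actually consumes. */
        enum isl_aux_usage draw_aux_usage[MAX_DRAW_BUFFERS];
    };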
Jason Ekstrand
85c18bb410 i965/surface_state: Drop brw_aux_surface_disabled
The only purpose of this function is to disable aux on texture surfaces
when the corresponding renderbuffer has aux disabled.  However, the act
of disabling aux on the renderbuffer will cause it to be resolved and
intel_miptree_texture_aux_usage will already check the resolved status
of a texture and return ISL_AUX_USAGE_NONE for it.  Even if we used CCS
for it, that wouldn't really be a problem because the CCS will be in the
pass-through state and so it would effectively be ignored.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 468ea3cc45)
2018-01-26 19:53:01 +00:00
Jason Ekstrand
d2e9fe8351 i965/miptree: Add an aux_disabled parameter to render_aux_usage
Only one of the callers of intel_miptree_render_aux_usage actually took
brw->draw_aux_buffer_disabled into account.  This caused us to ignore
draw_aux_buffer_disabled in intel_miptree_prepare_render.  That isn't a
problem in itself, because the draw_aux_buffer_disabled entry was set
during texture preparation and we had already done the resolve at that
time.  However, it also meant that the aux_usage we were passing to
brw_cache_flush_for_render and brw_render_cache_add_bo was wrong, so our
automatic cache flushing around aux_usage changes wasn't happening.  This
was causing GPU hangs in Oxenfree.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104711
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104411
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104383
Fixes: ea0d2e98ec
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d38ec24f53)
2018-01-26 19:53:01 +00:00
Jason Ekstrand
a6f4d96a1a i965/miptree: Take an aux_usage in prepare/finish_render
Both callers of intel_miptree_prepare/finish_render have to call
intel_miptree_render_aux_usage anyway for other reasons.  They may as
well pass the result in instead of us calling it again.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit dfe0217905)
2018-01-26 19:53:01 +00:00
Greg V
658e9e442c meson: fix BSD build
CC: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit c38c60a63c)
2018-01-26 19:53:01 +00:00
Marek Olšák
48510dccc4 radeonsi: don't ignore pitch for imported textures
Cc: 17.2 17.3 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 022c5b22fe)
2018-01-26 19:53:00 +00:00
Topi Pohjolainen
f6f43e6a4c i965: Don't try to disable render aux buffers for compute
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104546
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit ec4bb693a0)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
90b00bf766 anv/cmd_buffer: Move gen7 index buffer state to graphics state
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4064fe59e7)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
19b3e2b781 anv/cmd_buffer: Move num_workgroups to compute state
While we're here, make it an anv_address.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 38ec78049f)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
81a740b941 anv/cmd_buffer: Move dynamic state to graphics state
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 95ff232294)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
f310f42ed3 anv/cmd_buffer: Use a temporary variable for dynamic state
We were already doing this for some packets to keep the lines shorter.
We may as well just do it for all of them.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 24caee8975)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
8c93db854c anv/cmd_buffer: Move vb_dirty bits into anv_cmd_graphics_state
Vertex buffers are entirely a graphics pipeline thing.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8bd5ec5b86)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
76e7324b79 anv/cmd_buffer: Move dirty bits into anv_cmd_*_state
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e85aaec148)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
66d17b545f anv: Separate compute and graphics descriptor sets
The Vulkan spec says:

    "pipelineBindPoint is a VkPipelineBindPoint indicating whether the
    descriptors will be used by graphics pipelines or compute pipelines.
    There is a separate set of bind points for each of graphics and
    compute, so binding one does not disturb the other."

Up until now, we've been ignoring the pipeline bind point and had just
one bind point for everything.  This commit separates things out into
separate bind points.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102897
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 97f96610c8)
2018-01-26 19:53:00 +00:00
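A hedged sketch of the separation (struct layout simplified; only the existence of per-bind-point descriptor state is taken from the commit message):

    #include <vulkan/vulkan.h>

    #define MAX_SETS 8
    struct descriptor_set;   /* stand-in for anv_descriptor_set */

    struct cmd_pipeline_state {
        struct descriptor_set *descriptors[MAX_SETS];
        uint32_t descriptors_dirty;
    };

    struct cmd_state {
        struct cmd_pipeline_state gfx;      /* VK_PIPELINE_BIND_POINT_GRAPHICS */
        struct cmd_pipeline_state compute;  /* VK_PIPELINE_BIND_POINT_COMPUTE */
    };

    /* vkCmdBindDescriptorSets-style entry points pick the copy matching
     * the caller's pipelineBindPoint, so binding one no longer disturbs
     * the other. */
    static struct cmd_pipeline_state *
    state_for_bind_point(struct cmd_state *s, VkPipelineBindPoint bp)
    {
        return bp == VK_PIPELINE_BIND_POINT_GRAPHICS ? &s->gfx : &s->compute;
    }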
Jason Ekstrand
144a300204 anv/cmd_buffer: Use anv_descriptor_for_binding for samplers
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 31b2144c83)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
2dec9ce687 anv/cmd_buffer: Add a helper for binding descriptor sets
This lets us unify some code between push descriptors and regular
descriptors.  It doesn't do much for us yet but it will.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b9e1ca16f8)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
bde35c09de anv/cmd_buffer: Refactor ensure_push_descriptor_set
It's now a function which returns the push descriptor set.  Since we set
the error on the command buffer, returning the error is a little
redundant.  Returning the descriptor set (or NULL on error) is more
convenient.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 90cceaa9dd)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
064fbf7180 anv: Remove semicolons from vk_error[f] definitions
With the semicolons, they can't be used in a function argument without
throwing syntax errors.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d5592e2fda)
2018-01-26 19:53:00 +00:00
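A self-contained illustration of why the trailing semicolon breaks use in argument position (vk_errorf_impl is a simplified stand-in for anv's helper):

    #include <stdio.h>

    static int vk_errorf_impl(int err, const char *file, int line)
    {
        fprintf(stderr, "%s:%d: vk error %d\n", file, line, err);
        return err;
    }

    /* With the semicolon baked in, f(vk_error_bad(x)) expands to
     * f(vk_errorf_impl(...);) -- a syntax error. */
    #define vk_error_bad(err)  vk_errorf_impl(err, __FILE__, __LINE__);
    #define vk_error_good(err) vk_errorf_impl(err, __FILE__, __LINE__)

    static void report(int err) { (void)err; }

    int main(void)
    {
        report(vk_error_good(-4));      /* fine */
        /* report(vk_error_bad(-4)); */ /* would fail to compile */
        return 0;
    }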
Jason Ekstrand
cb5abcd715 anv/cmd_buffer: Add substructs to anv_cmd_state for graphics and compute
Initially, these just contain the pipeline in a base struct.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9af5379228)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
f0a1c2c69e anv/cmd_buffer: Use some pre-existing pipeline temporaries
There are several places where we'd already saved the pipeline off to a
temporary variable but, due to an artifact of history, weren't actually
using that temporary everywhere.  No functional change.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ddc2d28548)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
3c85e8c8e3 anv/cmd_buffer: Rework anv_cmd_state_reset
This splits anv_cmd_state_reset into separate init and finish functions.
This lets us share init code with cmd_buffer_create.  This potentially
fixes subtle bugs where we may have missed some bit of state that needs
to get initialized on command buffer creation.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cd3feea745)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
2ecc2f85fe anv/cmd_buffer: Get rid of the meta query workaround
Meta has been gone for a long time.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d6c9a89d13)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
f4f0838d31 anv/cmd_state: Drop the scratch_size field
This is a legacy leftover from the mechanism we previously used to handle
scratch.  The new (and better) mechanism doesn't use this.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bc0a21e348)
2018-01-26 19:53:00 +00:00
Jason Ekstrand
44b15816bb anv/pipeline: Don't assert on more than 32 samplers
This prevents an assert when running one unreleased Vulkan game.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4b69ba3817)
2018-01-26 19:53:00 +00:00
Marc Dietrich
4d0b43117d meson: fix some misspelled defines in meson.build
Defines
- HAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL
- HAVE_FUNC_ATTRIBUTE_VISIBILITY
were misspelled.

Signed-off-by: Marc Dietrich <marvin24@gmx.de>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 911ca587f8)
2018-01-26 19:53:00 +00:00
Emil Velikov
99a48002a2 Update version to 18.0.0-rc2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-23 18:07:37 +00:00
Emil Velikov
e91e68d6a8 Update version to 18.0.0-rc1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-23 16:39:33 +00:00
Emil Velikov
a9db8ac935 automake: small cleanup after the meson.build inclusion
Namely, extend the EXTRA_DIST list instead of re-assigning it, and bring
back a file dropped by mistake.

Fixes: 436ed65d38 ("autotools: include meson build files in tarball")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-23 14:25:34 +00:00
Emil Velikov
41e48eac87 automake: anv: ship anv_extensions_gen.py in the tarball
Fixes: dd088d4bec ("anv/extensions: Generate a header file with
extension tables")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-23 14:25:34 +00:00
Emil Velikov
90002ba41e automake: vc5: remove non-applicable v3dx_simulator.h
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-23 14:25:32 +00:00
54 changed files with 836 additions and 511 deletions

VERSION

@@ -1 +1 @@
17.4.0-devel
18.0.0-rc3

bin/.cherry-ignore (new file, +3)

@@ -0,0 +1,3 @@
# fixes: The following commits were applied without the "cherry-picked from" tag
50265cd9ee4caffee853700bdcd75b92eedc0e7b automake: anv: ship anv_extensions_gen.py in the tarball
ac4437b20b87c7285b89466f05b51518ae616873 automake: small cleanup after the meson.build inclusion

configure.ac

@@ -1598,7 +1598,7 @@ fi
AC_ARG_ENABLE([driglx-direct],
[AS_HELP_STRING([--disable-driglx-direct],
[disable direct rendering in GLX and EGL for DRI \
@<:@default=auto@:>@])],
@<:@default=enabled@:>@])],
[driglx_direct="$enableval"],
[driglx_direct="yes"])
@@ -2780,6 +2780,18 @@ if test "x$enable_llvm" = xyes; then
fi
fi
fi
dnl The gallium-xlib GLX and gallium OSMesa targets directly embed the
dnl swr/llvmpipe driver into the final binary. Adding LLVM_LIBS results in
dnl the LLVM library propagated in the Libs.private of the respective .pc
dnl file which ensures complete dependency information when statically
dnl linking.
if test "x$enable_glx" == xgallium-xlib; then
GL_PC_LIB_PRIV="$GL_PC_LIB_PRIV $LLVM_LIBS"
fi
if test "x$enable_gallium_osmesa" = xyes; then
OSMESA_PC_LIB_PRIV="$OSMESA_PC_LIB_PRIV $LLVM_LIBS"
fi
fi
AM_CONDITIONAL(HAVE_GALLIUM_SVGA, test "x$HAVE_GALLIUM_SVGA" = xyes)

meson.build

@@ -202,18 +202,20 @@ if with_dri_i915 or with_gallium_i915
dep_libdrm_intel = dependency('libdrm_intel', version : '>= 2.4.75')
endif
system_has_kms_drm = ['openbsd', 'netbsd', 'freebsd', 'dragonfly', 'linux'].contains(host_machine.system())
if host_machine.system() == 'darwin'
with_dri_platform = 'apple'
elif ['windows', 'cygwin'].contains(host_machine.system())
with_dri_platform = 'windows'
elif host_machine.system() == 'linux'
# FIXME: This should include BSD and possibly other systems
elif system_has_kms_drm
with_dri_platform = 'drm'
else
# FIXME: haiku doesn't use dri, and xlib doesn't use dri, probably should
# assert here that one of those cases has been met.
# FIXME: GNU (hurd) ends up here as well, but meson doesn't officially
# support Hurd at time of writing (2017/11)
# FIXME: illumos ends up here as well
with_dri_platform = 'none'
endif
@@ -225,7 +227,7 @@ with_platform_surfaceless = false
egl_native_platform = ''
_platforms = get_option('platforms')
if _platforms == 'auto'
if ['linux'].contains(host_machine.system())
if system_has_kms_drm
_platforms = 'x11,wayland,drm,surfaceless'
else
error('Unknown OS, no platforms enabled. Patches gladly accepted to fix this.')
@@ -272,10 +274,10 @@ endif
with_gbm = get_option('gbm')
if with_gbm == 'auto' and with_dri # TODO: or gallium
with_gbm = host_machine.system() == 'linux'
with_gbm = system_has_kms_drm
elif with_gbm == 'true'
if not ['linux', 'bsd'].contains(host_machine.system())
error('GBM only supports unix-like platforms')
if not system_has_kms_drm
error('GBM only supports DRM/KMS platforms')
endif
with_gbm = true
else
@@ -351,7 +353,7 @@ endif
with_dri2 = (with_dri or with_any_vk) and with_dri_platform == 'drm'
with_dri3 = get_option('dri3')
if with_dri3 == 'auto'
if host_machine.system() == 'linux' and with_dri2
if system_has_kms_drm and with_dri2
with_dri3 = true
else
with_dri3 = false
@@ -371,10 +373,12 @@ if with_dri or with_gallium
endif
endif
prog_pkgconfig = find_program('pkg-config')
dep_vdpau = []
_vdpau = get_option('gallium-vdpau')
if _vdpau == 'auto'
if not ['linux', 'bsd'].contains(host_machine.system())
if not system_has_kms_drm
with_gallium_vdpau = false
elif not with_platform_x11
with_gallium_vdpau = false
@@ -386,8 +390,8 @@ if _vdpau == 'auto'
with_gallium_vdpau = dep_vdpau.found()
endif
elif _vdpau == 'true'
if not ['linux', 'bsd'].contains(host_machine.system())
error('VDPAU state tracker can only be build on unix-like OSes.')
if not system_has_kms_drm
error('VDPAU state tracker can only be build on DRM/KMS OSes.')
elif not with_platform_x11
error('VDPAU state tracker requires X11 support.')
with_gallium_vdpau = false
@@ -402,7 +406,7 @@ else
endif
if with_gallium_vdpau
dep_vdpau = declare_dependency(
compile_args : dep_vdpau.get_pkgconfig_variable('cflags').split()
compile_args : run_command(prog_pkgconfig, ['vdpau', '--cflags']).stdout().split()
)
endif
@@ -417,7 +421,7 @@ endif
dep_xvmc = []
_xvmc = get_option('gallium-xvmc')
if _xvmc == 'auto'
if not ['linux', 'bsd'].contains(host_machine.system())
if not system_has_kms_drm
with_gallium_xvmc = false
elif not with_platform_x11
with_gallium_xvmc = false
@@ -428,8 +432,8 @@ if _xvmc == 'auto'
with_gallium_xvmc = dep_xvmc.found()
endif
elif _xvmc == 'true'
if not ['linux', 'bsd'].contains(host_machine.system())
error('XVMC state tracker can only be build on unix-like OSes.')
if not system_has_kms_drm
error('XVMC state tracker can only be build on DRM/KMS OSes.')
elif not with_platform_x11
error('XVMC state tracker requires X11 support.')
with_gallium_xvmc = false
@@ -443,7 +447,7 @@ else
endif
if with_gallium_xvmc
dep_xvmc = declare_dependency(
compile_args : dep_xvmc.get_pkgconfig_variable('cflags').split()
compile_args : run_command(prog_pkgconfig, ['xvmc', '--cflags']).stdout().split()
)
endif
@@ -455,7 +459,7 @@ endif
dep_omx = []
_omx = get_option('gallium-omx')
if _omx == 'auto'
if not ['linux', 'bsd'].contains(host_machine.system())
if not system_has_kms_drm
with_gallium_omx = false
elif not with_platform_x11
with_gallium_omx = false
@@ -466,8 +470,8 @@ if _omx == 'auto'
with_gallium_omx = dep_omx.found()
endif
elif _omx == 'true'
if not ['linux', 'bsd'].contains(host_machine.system())
error('OMX state tracker can only be built on unix-like OSes.')
if not system_has_kms_drm
error('OMX state tracker can only be built on DRM/KMS OSes.')
elif not (with_platform_x11 or with_platform_drm)
error('OMX state tracker requires X11 or drm platform support.')
with_gallium_omx = false
@@ -506,14 +510,14 @@ if with_gallium_omx
endif
if with_gallium_omx
dep_omx = declare_dependency(
compile_args : dep_omx.get_pkgconfig_variable('cflags').split()
compile_args : run_command(prog_pkgconfig, ['libomxil-bellagio', '--cflags']).stdout().split()
)
endif
dep_va = []
_va = get_option('gallium-va')
if _va == 'auto'
if not ['linux', 'bsd'].contains(host_machine.system())
if not system_has_kms_drm
with_gallium_va = false
elif not with_platform_x11
with_gallium_va = false
@@ -524,8 +528,8 @@ if _va == 'auto'
with_gallium_va = dep_va.found()
endif
elif _va == 'true'
if not ['linux', 'bsd'].contains(host_machine.system())
error('VA state tracker can only be built on unix-like OSes.')
if not system_has_kms_drm
error('VA state tracker can only be built on DRM/KMS OSes.')
elif not (with_platform_x11 or with_platform_drm)
error('VA state tracker requires X11 or drm or wayland platform support.')
with_gallium_va = false
@@ -539,7 +543,7 @@ else
endif
if with_gallium_va
dep_va = declare_dependency(
compile_args : dep_va.get_pkgconfig_variable('cflags').split()
compile_args : run_command(prog_pkgconfig, ['libva', '--cflags']).stdout().split()
)
endif
@@ -550,7 +554,7 @@ endif
_xa = get_option('gallium-xa')
if _xa == 'auto'
if not ['linux', 'bsd'].contains(host_machine.system())
if not system_has_kms_drm
with_gallium_xa = false
elif not (with_gallium_nouveau or with_gallium_freedreno or with_gallium_i915
or with_gallium_svga)
@@ -559,8 +563,8 @@ if _xa == 'auto'
with_gallium_xa = true
endif
elif _xa == 'true'
if not ['linux', 'bsd'].contains(host_machine.system())
error('XA state tracker can only be built on unix-like OSes.')
if not system_has_kms_drm
error('XA state tracker can only be built on DRM/KMS OSes.')
elif not (with_gallium_nouveau or with_gallium_freedreno or with_gallium_i915
or with_gallium_svga)
error('XA state tracker requires at least one of the following gallium drivers: nouveau, freedreno, i915, svga.')
@@ -692,14 +696,14 @@ if cc.compiles('struct __attribute__((packed)) foo { int bar; };',
endif
if cc.compiles('int *foo(void) __attribute__((returns_nonnull));',
name : '__attribute__((returns_nonnull))')
pre_args += '-DHAVE_FUNC_ATTRIBUTE_NONNULL'
pre_args += '-DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL'
endif
if cc.compiles('''int foo_def(void) __attribute__((visibility("default")));
int foo_hid(void) __attribute__((visibility("hidden")));
int foo_int(void) __attribute__((visibility("internal")));
int foo_pro(void) __attribute__((visibility("protected")));''',
name : '__attribute__((visibility(...)))')
pre_args += '-DHAVE_FUNC_ATTRIBUTE_VISBILITY'
pre_args += '-DHAVE_FUNC_ATTRIBUTE_VISIBILITY'
endif
if cc.compiles('int foo(void) { return 0; } int bar(void) __attribute__((alias("foo")));',
name : '__attribute__((alias(...)))')
@@ -772,7 +776,7 @@ foreach a : ['-Werror=pointer-arith', '-Werror=vla']
endforeach
if host_machine.cpu_family().startswith('x86')
pre_args += '-DHAVE_SSE41'
pre_args += '-DUSE_SSE41'
with_sse41 = true
sse41_args = ['-msse4.1']
@@ -820,23 +824,23 @@ with_asm_arch = ''
if with_asm
# TODO: SPARC and PPC
if host_machine.cpu_family() == 'x86'
if ['linux', 'bsd'].contains(host_machine.system()) # FIXME: hurd?
if system_has_kms_drm
with_asm_arch = 'x86'
pre_args += ['-DUSE_X86_ASM', '-DUSE_MMX_ASM', '-DUSE_3DNOW_ASM',
'-DUSE_SSE_ASM']
endif
elif host_machine.cpu_family() == 'x86_64'
if host_machine.system() == 'linux'
if system_has_kms_drm
with_asm_arch = 'x86_64'
pre_args += ['-DUSE_X86_64_ASM']
endif
elif host_machine.cpu_family() == 'arm'
if host_machine.system() == 'linux'
if system_has_kms_drm
with_asm_arch = 'arm'
pre_args += ['-DUSE_ARM_ASM']
endif
elif host_machine.cpu_family() == 'aarch64'
if host_machine.system() == 'linux'
if system_has_kms_drm
with_asm_arch = 'aarch64'
pre_args += ['-DUSE_AARCH64_ASM']
endif
@@ -1018,11 +1022,15 @@ else
endif
if with_llvm
_llvm_version = dep_llvm.version().split('.')
# Development versions of LLVM have an 'svn' suffix, we don't want that for
# our version checks.
# Development versions of LLVM have an 'svn' or 'git' suffix, we don't want
# that for our version checks.
# svn suffixes are stripped by meson as of 0.43, and git suffixes are
# strippped as of 0.44, but we support older meson versions.
_llvm_patch = _llvm_version[2]
if _llvm_patch.endswith('svn')
_llvm_patch = _llvm_patch.split('s')[0]
elif _llvm_patch.contains('git')
_llvm_patch = _llvm_patch.split('g')[0]
endif
pre_args += [
'-DHAVE_LLVM=0x0@0@@1@@2@'.format(_llvm_version[0], _llvm_version[1], _llvm_patch),

src/amd/common/ac_nir_to_llvm.c

@@ -4049,6 +4049,30 @@ static LLVMValueRef load_sample_pos(struct ac_nir_context *ctx)
return ac_build_gather_values(&ctx->ac, values, 2);
}
static LLVMValueRef load_sample_mask_in(struct ac_nir_context *ctx)
{
uint8_t log2_ps_iter_samples = ctx->nctx->shader_info->info.ps.force_persample ? ctx->nctx->options->key.fs.log2_num_samples : ctx->nctx->options->key.fs.log2_ps_iter_samples;
/* The bit pattern matches that used by fixed function fragment
* processing. */
static const uint16_t ps_iter_masks[] = {
0xffff, /* not used */
0x5555,
0x1111,
0x0101,
0x0001,
};
assert(log2_ps_iter_samples < ARRAY_SIZE(ps_iter_masks));
uint32_t ps_iter_mask = ps_iter_masks[log2_ps_iter_samples];
LLVMValueRef result, sample_id;
sample_id = unpack_param(&ctx->ac, ctx->abi->ancillary, 8, 4);
sample_id = LLVMBuildShl(ctx->ac.builder, LLVMConstInt(ctx->ac.i32, ps_iter_mask, false), sample_id, "");
result = LLVMBuildAnd(ctx->ac.builder, sample_id, ctx->abi->sample_coverage, "");
return result;
}
static LLVMValueRef visit_interp(struct nir_to_llvm_context *ctx,
const nir_intrinsic_instr *instr)
{
@@ -4353,7 +4377,10 @@ static void visit_intrinsic(struct ac_nir_context *ctx,
result = load_sample_pos(ctx);
break;
case nir_intrinsic_load_sample_mask_in:
result = ctx->abi->sample_coverage;
if (ctx->nctx)
result = load_sample_mask_in(ctx);
else
result = ctx->abi->sample_coverage;
break;
case nir_intrinsic_load_frag_coord: {
LLVMValueRef values[4] = {
@@ -4532,8 +4559,14 @@ static LLVMValueRef radv_load_ssbo(struct ac_shader_abi *abi,
static LLVMValueRef radv_load_ubo(struct ac_shader_abi *abi, LLVMValueRef buffer_ptr)
{
struct nir_to_llvm_context *ctx = nir_to_llvm_context_from_abi(abi);
LLVMValueRef result;
return LLVMBuildLoad(ctx->builder, buffer_ptr, "");
LLVMSetMetadata(buffer_ptr, ctx->ac.uniform_md_kind, ctx->ac.empty_md);
result = LLVMBuildLoad(ctx->builder, buffer_ptr, "");
LLVMSetMetadata(result, ctx->ac.invariant_load_md_kind, ctx->ac.empty_md);
return result;
}
static LLVMValueRef radv_get_sampler_desc(struct ac_shader_abi *abi,

src/amd/common/ac_nir_to_llvm.h

@@ -60,6 +60,8 @@ struct ac_tcs_variant_key {
struct ac_fs_variant_key {
uint32_t col_format;
uint8_t log2_ps_iter_samples;
uint8_t log2_num_samples;
uint32_t is_int8;
uint32_t is_int10;
uint32_t multisample : 1;

src/amd/vulkan/radv_cmd_buffer.c

@@ -990,7 +990,6 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
{
struct radv_shader_variant *ps;
uint64_t va;
unsigned spi_baryc_cntl = S_0286E0_FRONT_FACE_ALL_BITS(1);
struct radv_blend_state *blend = &pipeline->graphics.blend;
assert (pipeline->shaders[MESA_SHADER_FRAGMENT]);
@@ -1012,13 +1011,10 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
radeon_set_context_reg(cmd_buffer->cs, R_0286D0_SPI_PS_INPUT_ADDR,
ps->config.spi_ps_input_addr);
if (ps->info.info.ps.force_persample)
spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(2);
radeon_set_context_reg(cmd_buffer->cs, R_0286D8_SPI_PS_IN_CONTROL,
S_0286D8_NUM_INTERP(ps->info.fs.num_interp));
radeon_set_context_reg(cmd_buffer->cs, R_0286E0_SPI_BARYC_CNTL, spi_baryc_cntl);
radeon_set_context_reg(cmd_buffer->cs, R_0286E0_SPI_BARYC_CNTL, pipeline->graphics.spi_baryc_cntl);
radeon_set_context_reg(cmd_buffer->cs, R_028710_SPI_SHADER_Z_FORMAT,
pipeline->graphics.shader_z_format);

src/amd/vulkan/radv_image.c

@@ -116,7 +116,8 @@ radv_init_surface(struct radv_device *device,
pCreateInfo->mipLevels <= 1 &&
device->physical_device->rad_info.chip_class >= VI &&
((pCreateInfo->format == VK_FORMAT_D32_SFLOAT ||
pCreateInfo->format == VK_FORMAT_D32_SFLOAT_S8_UINT) ||
/* for some reason TC compat with 4/8 samples breaks some cts tests - disable for now */
(pCreateInfo->samples < 4 && pCreateInfo->format == VK_FORMAT_D32_SFLOAT_S8_UINT)) ||
(device->physical_device->rad_info.chip_class >= GFX9 &&
pCreateInfo->format == VK_FORMAT_D16_UNORM)))
surface->flags |= RADEON_SURF_TC_COMPATIBLE_HTILE;

src/amd/vulkan/radv_meta_resolve.c

@@ -26,6 +26,7 @@
#include "radv_meta.h"
#include "radv_private.h"
#include "vk_format.h"
#include "nir/nir_builder.h"
#include "sid.h"
@@ -50,7 +51,7 @@ build_nir_fs(void)
}
static VkResult
create_pass(struct radv_device *device)
create_pass(struct radv_device *device, VkFormat vk_format, VkRenderPass *pass)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
@@ -59,7 +60,7 @@ create_pass(struct radv_device *device)
int i;
for (i = 0; i < 2; i++) {
attachments[i].format = VK_FORMAT_UNDEFINED;
attachments[i].format = vk_format;
attachments[i].samples = 1;
attachments[i].loadOp = VK_ATTACHMENT_LOAD_OP_LOAD;
attachments[i].storeOp = VK_ATTACHMENT_STORE_OP_STORE;
@@ -99,14 +100,16 @@ create_pass(struct radv_device *device)
.dependencyCount = 0,
},
alloc,
&device->meta_state.resolve.pass);
pass);
return result;
}
static VkResult
create_pipeline(struct radv_device *device,
VkShaderModule vs_module_h)
VkShaderModule vs_module_h,
VkPipeline *pipeline,
VkRenderPass pass)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
@@ -129,12 +132,14 @@ create_pipeline(struct radv_device *device,
.pPushConstantRanges = NULL,
};
result = radv_CreatePipelineLayout(radv_device_to_handle(device),
&pl_create_info,
&device->meta_state.alloc,
&device->meta_state.resolve.p_layout);
if (result != VK_SUCCESS)
goto cleanup;
if (!device->meta_state.resolve.p_layout) {
result = radv_CreatePipelineLayout(radv_device_to_handle(device),
&pl_create_info,
&device->meta_state.alloc,
&device->meta_state.resolve.p_layout);
if (result != VK_SUCCESS)
goto cleanup;
}
result = radv_graphics_pipeline_create(device_h,
radv_pipeline_cache_to_handle(&device->meta_state.cache),
@@ -212,15 +217,14 @@ create_pipeline(struct radv_device *device,
},
},
.layout = device->meta_state.resolve.p_layout,
.renderPass = device->meta_state.resolve.pass,
.renderPass = pass,
.subpass = 0,
},
&(struct radv_graphics_pipeline_create_info) {
.use_rectlist = true,
.custom_blend_mode = V_028808_CB_RESOLVE,
},
&device->meta_state.alloc,
&device->meta_state.resolve.pipeline);
&device->meta_state.alloc, pipeline);
if (result != VK_SUCCESS)
goto cleanup;
@@ -236,19 +240,37 @@ radv_device_finish_meta_resolve_state(struct radv_device *device)
{
struct radv_meta_state *state = &device->meta_state;
radv_DestroyRenderPass(radv_device_to_handle(device),
state->resolve.pass, &state->alloc);
for (uint32_t j = 0; j < NUM_META_FS_KEYS; j++) {
radv_DestroyRenderPass(radv_device_to_handle(device),
state->resolve.pass[j], &state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->resolve.pipeline[j], &state->alloc);
}
radv_DestroyPipelineLayout(radv_device_to_handle(device),
state->resolve.p_layout, &state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->resolve.pipeline, &state->alloc);
}
static VkFormat pipeline_formats[] = {
VK_FORMAT_R8G8B8A8_UNORM,
VK_FORMAT_R8G8B8A8_UINT,
VK_FORMAT_R8G8B8A8_SINT,
VK_FORMAT_A2R10G10B10_UINT_PACK32,
VK_FORMAT_A2R10G10B10_SINT_PACK32,
VK_FORMAT_R16G16B16A16_UNORM,
VK_FORMAT_R16G16B16A16_SNORM,
VK_FORMAT_R16G16B16A16_UINT,
VK_FORMAT_R16G16B16A16_SINT,
VK_FORMAT_R32_SFLOAT,
VK_FORMAT_R32G32_SFLOAT,
VK_FORMAT_R32G32B32A32_SFLOAT
};
VkResult
radv_device_init_meta_resolve_state(struct radv_device *device)
{
VkResult res = VK_SUCCESS;
struct radv_meta_state *state = &device->meta_state;
struct radv_shader_module vs_module = { .nir = radv_meta_build_nir_vs_generate_vertices() };
if (!vs_module.nir) {
/* XXX: Need more accurate error */
@@ -256,14 +278,19 @@ radv_device_init_meta_resolve_state(struct radv_device *device)
goto fail;
}
res = create_pass(device);
if (res != VK_SUCCESS)
goto fail;
for (uint32_t i = 0; i < ARRAY_SIZE(pipeline_formats); ++i) {
VkFormat format = pipeline_formats[i];
unsigned fs_key = radv_format_meta_fs_key(format);
res = create_pass(device, format, &state->resolve.pass[fs_key]);
if (res != VK_SUCCESS)
goto fail;
VkShaderModule vs_module_h = radv_shader_module_to_handle(&vs_module);
res = create_pipeline(device, vs_module_h);
if (res != VK_SUCCESS)
goto fail;
VkShaderModule vs_module_h = radv_shader_module_to_handle(&vs_module);
res = create_pipeline(device, vs_module_h,
&state->resolve.pipeline[fs_key], state->resolve.pass[fs_key]);
if (res != VK_SUCCESS)
goto fail;
}
goto cleanup;
@@ -278,16 +305,18 @@ cleanup:
static void
emit_resolve(struct radv_cmd_buffer *cmd_buffer,
VkFormat vk_format,
const VkOffset2D *dest_offset,
const VkExtent2D *resolve_extent)
{
struct radv_device *device = cmd_buffer->device;
VkCommandBuffer cmd_buffer_h = radv_cmd_buffer_to_handle(cmd_buffer);
unsigned fs_key = radv_format_meta_fs_key(vk_format);
cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_FLUSH_AND_INV_CB;
radv_CmdBindPipeline(cmd_buffer_h, VK_PIPELINE_BIND_POINT_GRAPHICS,
device->meta_state.resolve.pipeline);
device->meta_state.resolve.pipeline[fs_key]);
radv_CmdSetViewport(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, &(VkViewport) {
.x = dest_offset->x,
@@ -323,6 +352,13 @@ static void radv_pick_resolve_method_images(struct radv_image *src_image,
uint32_t queue_mask = radv_image_queue_family_mask(dest_image,
cmd_buffer->queue_family_index,
cmd_buffer->queue_family_index);
if (src_image->vk_format == VK_FORMAT_R16G16_UNORM ||
src_image->vk_format == VK_FORMAT_R16G16_SNORM)
*method = RESOLVE_COMPUTE;
else if (vk_format_is_int(src_image->vk_format))
*method = RESOLVE_COMPUTE;
if (radv_layout_dcc_compressed(dest_image, dest_image_layout, queue_mask)) {
*method = RESOLVE_FRAGMENT;
} else if (dest_image->surface.micro_tile_mode != src_image->surface.micro_tile_mode) {
@@ -413,6 +449,7 @@ void radv_CmdResolveImage(
if (dest_image->surface.dcc_size) {
radv_initialize_dcc(cmd_buffer, dest_image, 0xffffffff);
}
unsigned fs_key = radv_format_meta_fs_key(dest_image->vk_format);
for (uint32_t r = 0; r < region_count; ++r) {
const VkImageResolve *region = &regions[r];
@@ -512,7 +549,7 @@ void radv_CmdResolveImage(
radv_CmdBeginRenderPass(cmd_buffer_h,
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.resolve.pass,
.renderPass = device->meta_state.resolve.pass[fs_key],
.framebuffer = fb_h,
.renderArea = {
.offset = {
@@ -530,6 +567,7 @@ void radv_CmdResolveImage(
VK_SUBPASS_CONTENTS_INLINE);
emit_resolve(cmd_buffer,
dest_iview.vk_format,
&(VkOffset2D) {
.x = dstOffset.x,
.y = dstOffset.y,
@@ -624,6 +662,7 @@ radv_cmd_buffer_resolve_subpass(struct radv_cmd_buffer *cmd_buffer)
radv_cmd_buffer_set_subpass(cmd_buffer, &resolve_subpass, false);
emit_resolve(cmd_buffer,
dst_img->vk_format,
&(VkOffset2D) { 0, 0 },
&(VkExtent2D) { fb->width, fb->height });
}

src/amd/vulkan/radv_pipeline.c

@@ -798,6 +798,18 @@ radv_pipeline_init_raster_state(struct radv_pipeline *pipeline,
}
static uint8_t radv_pipeline_get_ps_iter_samples(const VkPipelineMultisampleStateCreateInfo *vkms)
{
uint32_t num_samples = vkms->rasterizationSamples;
uint32_t ps_iter_samples = 1;
if (vkms->sampleShadingEnable) {
ps_iter_samples = ceil(vkms->minSampleShading * num_samples);
ps_iter_samples = util_next_power_of_two(ps_iter_samples);
}
return ps_iter_samples;
}
static void
radv_pipeline_init_multisample_state(struct radv_pipeline *pipeline,
const VkGraphicsPipelineCreateInfo *pCreateInfo)
@@ -813,9 +825,9 @@ radv_pipeline_init_multisample_state(struct radv_pipeline *pipeline,
else
ms->num_samples = 1;
if (vkms && vkms->sampleShadingEnable) {
ps_iter_samples = ceil(vkms->minSampleShading * ms->num_samples);
} else if (pipeline->shaders[MESA_SHADER_FRAGMENT]->info.info.ps.force_persample) {
if (vkms)
ps_iter_samples = radv_pipeline_get_ps_iter_samples(vkms);
if (vkms && !vkms->sampleShadingEnable && pipeline->shaders[MESA_SHADER_FRAGMENT]->info.info.ps.force_persample) {
ps_iter_samples = ms->num_samples;
}
@@ -838,7 +850,7 @@ radv_pipeline_init_multisample_state(struct radv_pipeline *pipeline,
if (ms->num_samples > 1) {
unsigned log_samples = util_logbase2(ms->num_samples);
unsigned log_ps_iter_samples = util_logbase2(util_next_power_of_two(ps_iter_samples));
unsigned log_ps_iter_samples = util_logbase2(ps_iter_samples);
ms->pa_sc_mode_cntl_0 |= S_028A48_MSAA_ENABLE(1);
ms->pa_sc_line_cntl |= S_028BDC_EXPAND_LINE_WIDTH(1); /* CM_R_028BDC_PA_SC_LINE_CNTL */
ms->db_eqaa |= S_028804_MAX_ANCHOR_SAMPLES(log_samples) |
@@ -849,6 +861,8 @@ radv_pipeline_init_multisample_state(struct radv_pipeline *pipeline,
S_028BE0_MAX_SAMPLE_DIST(radv_cayman_get_maxdist(log_samples)) |
S_028BE0_MSAA_EXPOSED_SAMPLES(log_samples); /* CM_R_028BE0_PA_SC_AA_CONFIG */
ms->pa_sc_mode_cntl_1 |= S_028A4C_PS_ITER_SAMPLE(ps_iter_samples > 1);
if (ps_iter_samples > 1)
pipeline->graphics.spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(2);
}
const struct VkPipelineRasterizationStateRasterizationOrderAMD *raster_order =
@@ -1745,8 +1759,13 @@ radv_generate_graphics_pipeline_key(struct radv_pipeline *pipeline,
if (pCreateInfo->pMultisampleState &&
pCreateInfo->pMultisampleState->rasterizationSamples > 1)
pCreateInfo->pMultisampleState->rasterizationSamples > 1) {
uint32_t num_samples = pCreateInfo->pMultisampleState->rasterizationSamples;
uint32_t ps_iter_samples = radv_pipeline_get_ps_iter_samples(pCreateInfo->pMultisampleState);
key.multisample = true;
key.log2_num_samples = util_logbase2(num_samples);
key.log2_ps_iter_samples = util_logbase2(ps_iter_samples);
}
key.col_format = pipeline->graphics.blend.spi_shader_col_format;
if (pipeline->device->physical_device->rad_info.chip_class < VI)
@@ -1784,6 +1803,8 @@ radv_fill_shader_keys(struct ac_shader_variant_key *keys,
keys[MESA_SHADER_FRAGMENT].fs.col_format = key->col_format;
keys[MESA_SHADER_FRAGMENT].fs.is_int8 = key->is_int8;
keys[MESA_SHADER_FRAGMENT].fs.is_int10 = key->is_int10;
keys[MESA_SHADER_FRAGMENT].fs.log2_ps_iter_samples = key->log2_ps_iter_samples;
keys[MESA_SHADER_FRAGMENT].fs.log2_num_samples = key->log2_num_samples;
}
static void
@@ -2430,6 +2451,7 @@ radv_pipeline_init(struct radv_pipeline *pipeline,
radv_generate_graphics_pipeline_key(pipeline, pCreateInfo, has_view_index),
pStages);
pipeline->graphics.spi_baryc_cntl = S_0286E0_FRONT_FACE_ALL_BITS(1);
radv_pipeline_init_depth_stencil_state(pipeline, pCreateInfo, extra);
radv_pipeline_init_raster_state(pipeline, pCreateInfo);
radv_pipeline_init_multisample_state(pipeline, pCreateInfo);

src/amd/vulkan/radv_private.h

@@ -331,6 +331,8 @@ struct radv_pipeline_key {
uint32_t col_format;
uint32_t is_int8;
uint32_t is_int10;
uint8_t log2_ps_iter_samples;
uint8_t log2_num_samples;
uint32_t multisample : 1;
uint32_t has_multiview_view_index : 1;
};
@@ -478,8 +480,8 @@ struct radv_meta_state {
struct {
VkPipelineLayout p_layout;
VkPipeline pipeline;
VkRenderPass pass;
VkPipeline pipeline[NUM_META_FS_KEYS];
VkRenderPass pass[NUM_META_FS_KEYS];
} resolve;
struct {
@@ -1237,6 +1239,7 @@ struct radv_pipeline {
struct radv_binning_state bin;
uint32_t db_shader_control;
uint32_t shader_z_format;
uint32_t spi_baryc_cntl;
unsigned prim;
unsigned gs_out;
uint32_t vgt_gs_mode;

src/compiler/Makefile.am

@@ -64,7 +64,6 @@ EXTRA_DIST = \
glsl/meson.build \
nir/meson.build \
meson.build
MKDIR_GEN = $(AM_V_at)$(MKDIR_P) $(@D)
PYTHON_GEN = $(AM_V_GEN)$(PYTHON2) $(PYTHON_FLAGS)

src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c

@@ -857,7 +857,7 @@ lp_build_sample_image_nearest(struct lp_build_sample_context *bld,
LLVMValueRef img_stride_vec,
LLVMValueRef data_ptr,
LLVMValueRef mipoffsets,
LLVMValueRef *coords,
const LLVMValueRef *coords,
const LLVMValueRef *offsets,
LLVMValueRef colors_out[4])
{
@@ -1004,7 +1004,7 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
LLVMValueRef img_stride_vec,
LLVMValueRef data_ptr,
LLVMValueRef mipoffsets,
LLVMValueRef *coords,
const LLVMValueRef *coords,
const LLVMValueRef *offsets,
LLVMValueRef colors_out[4])
{
@@ -1106,7 +1106,7 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
struct lp_build_if_state edge_if;
LLVMTypeRef int1t;
LLVMValueRef new_faces[4], new_xcoords[4][2], new_ycoords[4][2];
LLVMValueRef coord, have_edge, have_corner;
LLVMValueRef coord0, coord1, have_edge, have_corner;
LLVMValueRef fall_off_ym_notxm, fall_off_ym_notxp, fall_off_x, fall_off_y;
LLVMValueRef fall_off_yp_notxm, fall_off_yp_notxp;
LLVMValueRef x0, x1, y0, y1, y0_clamped, y1_clamped;
@@ -1130,20 +1130,20 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
* other values might be bogus in the end too).
* So kill off the NaNs here.
*/
coords[0] = lp_build_max_ext(coord_bld, coords[0], coord_bld->zero,
GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN);
coords[1] = lp_build_max_ext(coord_bld, coords[1], coord_bld->zero,
GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN);
coord = lp_build_mul(coord_bld, coords[0], flt_width_vec);
coord0 = lp_build_max_ext(coord_bld, coords[0], coord_bld->zero,
GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN);
coord0 = lp_build_mul(coord_bld, coord0, flt_width_vec);
/* instead of clamp, build mask if overflowed */
coord = lp_build_sub(coord_bld, coord, half);
coord0 = lp_build_sub(coord_bld, coord0, half);
/* convert to int, compute lerp weight */
/* not ideal with AVX (and no AVX2) */
lp_build_ifloor_fract(coord_bld, coord, &x0, &s_fpart);
lp_build_ifloor_fract(coord_bld, coord0, &x0, &s_fpart);
x1 = lp_build_add(ivec_bld, x0, ivec_bld->one);
coord = lp_build_mul(coord_bld, coords[1], flt_height_vec);
coord = lp_build_sub(coord_bld, coord, half);
lp_build_ifloor_fract(coord_bld, coord, &y0, &t_fpart);
coord1 = lp_build_max_ext(coord_bld, coords[1], coord_bld->zero,
GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN);
coord1 = lp_build_mul(coord_bld, coord1, flt_height_vec);
coord1 = lp_build_sub(coord_bld, coord1, half);
lp_build_ifloor_fract(coord_bld, coord1, &y0, &t_fpart);
y1 = lp_build_add(ivec_bld, y0, ivec_bld->one);
fall_off[0] = lp_build_cmp(ivec_bld, PIPE_FUNC_LESS, x0, ivec_bld->zero);
@@ -1747,7 +1747,7 @@ lp_build_sample_mipmap(struct lp_build_sample_context *bld,
unsigned img_filter,
unsigned mip_filter,
boolean is_gather,
LLVMValueRef *coords,
const LLVMValueRef *coords,
const LLVMValueRef *offsets,
LLVMValueRef ilevel0,
LLVMValueRef ilevel1,
@@ -1820,6 +1820,7 @@ lp_build_sample_mipmap(struct lp_build_sample_context *bld,
PIPE_FUNC_GREATER,
lod_fpart, bld->lodf_bld.zero);
need_lerp = lp_build_any_true_range(&bld->lodi_bld, bld->num_lods, need_lerp);
lp_build_name(need_lerp, "need_lerp");
}
lp_build_if(&if_ctx, bld->gallivm, need_lerp);
@@ -1888,7 +1889,7 @@ static void
lp_build_sample_mipmap_both(struct lp_build_sample_context *bld,
LLVMValueRef linear_mask,
unsigned mip_filter,
LLVMValueRef *coords,
const LLVMValueRef *coords,
const LLVMValueRef *offsets,
LLVMValueRef ilevel0,
LLVMValueRef ilevel1,
@@ -1945,6 +1946,7 @@ lp_build_sample_mipmap_both(struct lp_build_sample_context *bld,
* should be able to merge the branches in this case.
*/
need_lerp = lp_build_any_true_range(&bld->lodi_bld, bld->num_lods, lod_positive);
lp_build_name(need_lerp, "need_lerp");
lp_build_if(&if_ctx, bld->gallivm, need_lerp);
{
@@ -2422,7 +2424,7 @@ static void
lp_build_sample_general(struct lp_build_sample_context *bld,
unsigned sampler_unit,
boolean is_gather,
LLVMValueRef *coords,
const LLVMValueRef *coords,
const LLVMValueRef *offsets,
LLVMValueRef lod_positive,
LLVMValueRef lod_fpart,
@@ -2483,7 +2485,8 @@ lp_build_sample_general(struct lp_build_sample_context *bld,
struct lp_build_if_state if_ctx;
lod_positive = LLVMBuildTrunc(builder, lod_positive,
LLVMInt1TypeInContext(bld->gallivm->context), "");
LLVMInt1TypeInContext(bld->gallivm->context),
"lod_pos");
lp_build_if(&if_ctx, bld->gallivm, lod_positive);
{
@@ -2519,6 +2522,7 @@ lp_build_sample_general(struct lp_build_sample_context *bld,
}
need_linear = lp_build_any_true_range(&bld->lodi_bld, bld->num_lods,
linear_mask);
lp_build_name(need_linear, "need_linear");
if (bld->num_lods != bld->coord_type.length) {
linear_mask = lp_build_unpack_broadcast_aos_scalars(bld->gallivm,

@@ -46,4 +46,4 @@ ir3_compiler_LDADD = \
$(GALLIUM_COMMON_LIB_DEPS) \
$(FREEDRENO_LIBS)
EXTRA_DIST = meson.build
EXTRA_DIST += meson.build

@@ -30,4 +30,4 @@ noinst_LTLIBRARIES = libi915.la
libi915_la_SOURCES = $(C_SOURCES)
EXTRA_DIST = meson.build
EXTRA_DIST = TODO meson.build

@@ -298,11 +298,21 @@ static int r600_init_surface(struct si_screen *sscreen,
return r;
}
unsigned pitch = pitch_in_bytes_override / bpe;
if (sscreen->info.chip_class >= GFX9) {
assert(!pitch_in_bytes_override ||
pitch_in_bytes_override == surface->u.gfx9.surf_pitch * bpe);
if (pitch) {
surface->u.gfx9.surf_pitch = pitch;
surface->u.gfx9.surf_slice_size =
(uint64_t)pitch * surface->u.gfx9.surf_height * bpe;
}
surface->u.gfx9.surf_offset = offset;
} else {
if (pitch) {
surface->u.legacy.level[0].nblk_x = pitch;
surface->u.legacy.level[0].slice_size_dw =
((uint64_t)pitch * surface->u.legacy.level[0].nblk_y * bpe) / 4;
}
if (offset) {
for (i = 0; i < ARRAY_SIZE(surface->u.legacy.level); ++i)
surface->u.legacy.level[i].offset += offset;
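The override arithmetic above is easy to sanity-check by hand. With made-up numbers (not from the patch): a 4-bytes-per-element format and a 1024-byte pitch override give a pitch of 256 texels; on the legacy path the slice size is stored in dwords, hence the divide by 4:

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Hypothetical inputs, for illustration only. */
    unsigned pitch_in_bytes_override = 1024;
    unsigned bpe = 4;          /* bytes per element */
    unsigned nblk_y = 512;     /* rows of blocks */

    unsigned pitch = pitch_in_bytes_override / bpe;       /* 256 texels */
    uint64_t slice_size = (uint64_t)pitch * nblk_y * bpe; /* bytes */
    uint64_t slice_size_dw = slice_size / 4;              /* dwords */

    printf("pitch=%u texels, slice=%llu bytes (%llu dwords)\n",
           pitch, (unsigned long long)slice_size,
           (unsigned long long)slice_size_dw);
    return 0;
}
```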

@@ -610,6 +610,11 @@ struct radeon_winsys {
int (*fence_export_sync_file)(struct radeon_winsys *ws,
struct pipe_fence_handle *fence);
/**
* Return a sync file FD that is already signalled.
*/
int (*export_signalled_sync_file)(struct radeon_winsys *ws);
/**
* Initialize surface
*

@@ -77,7 +77,7 @@ libradeonsi = static_library(
],
c_args : [c_vis_args],
cpp_args : [cpp_vis_args],
dependencies : [dep_llvm, idep_nir_headers],
dependencies : [dep_llvm, dep_libdrm_radeon, idep_nir_headers],
)
driver_radeonsi = declare_dependency(

@@ -356,6 +356,8 @@ static int si_fence_get_fd(struct pipe_screen *screen,
/* If we don't have FDs at this point, it means we don't have fences
* either. */
if (sdma_fd == -1 && gfx_fd == -1)
return ws->export_signalled_sync_file(ws);
if (sdma_fd == -1)
return gfx_fd;
if (gfx_fd == -1)

@@ -249,9 +249,15 @@ DIType* JitManager::GetDebugType(Type* pTy)
switch (id)
{
case Type::VoidTyID: return builder.createUnspecifiedType("void"); break;
#if LLVM_VERSION_MAJOR >= 4
case Type::HalfTyID: return builder.createBasicType("float16", 16, dwarf::DW_ATE_float); break;
case Type::FloatTyID: return builder.createBasicType("float", 32, dwarf::DW_ATE_float); break;
case Type::DoubleTyID: return builder.createBasicType("double", 64, dwarf::DW_ATE_float); break;
#else
case Type::HalfTyID: return builder.createBasicType("float16", 16, 0, dwarf::DW_ATE_float); break;
case Type::FloatTyID: return builder.createBasicType("float", 32, 0, dwarf::DW_ATE_float); break;
case Type::DoubleTyID: return builder.createBasicType("double", 64, 0, dwarf::DW_ATE_float); break;
#endif
case Type::IntegerTyID: return GetDebugIntegerType(pTy); break;
case Type::StructTyID: return GetDebugStructType(pTy); break;
case Type::ArrayTyID: return GetDebugArrayType(pTy); break;
@@ -288,11 +294,19 @@ DIType* JitManager::GetDebugIntegerType(Type* pTy)
IntegerType* pIntTy = cast<IntegerType>(pTy);
switch (pIntTy->getBitWidth())
{
#if LLVM_VERSION_MAJOR >= 4
case 1: return builder.createBasicType("int1", 1, dwarf::DW_ATE_unsigned); break;
case 8: return builder.createBasicType("int8", 8, dwarf::DW_ATE_signed); break;
case 16: return builder.createBasicType("int16", 16, dwarf::DW_ATE_signed); break;
case 32: return builder.createBasicType("int", 32, dwarf::DW_ATE_signed); break;
case 64: return builder.createBasicType("int64", 64, dwarf::DW_ATE_signed); break;
#else
case 1: return builder.createBasicType("int1", 1, 0, dwarf::DW_ATE_unsigned); break;
case 8: return builder.createBasicType("int8", 8, 0, dwarf::DW_ATE_signed); break;
case 16: return builder.createBasicType("int16", 16, 0, dwarf::DW_ATE_signed); break;
case 32: return builder.createBasicType("int", 32, 0, dwarf::DW_ATE_signed); break;
case 64: return builder.createBasicType("int64", 64, 0, dwarf::DW_ATE_signed); break;
#endif
default: SWR_ASSERT(false, "Unimplemented integer bit width");
}
return nullptr;

@@ -29,7 +29,6 @@ VC5_PER_VERSION_SOURCES = \
v3dx_context.h \
v3dx_format_table.c \
v3dx_simulator.c \
v3dx_simulator.h \
vc5_draw.c \
vc5_emit.c \
vc5_rcl.c \

@@ -68,6 +68,7 @@ libgallium_nine = shared_library(
driver_swrast, driver_r300, driver_r600, driver_radeonsi, driver_nouveau,
driver_i915, driver_svga,
],
name_prefix : '',
version : '.'.join(nine_version),
install : true,
install_dir : d3d_drivers_path,

@@ -114,6 +114,27 @@ static int amdgpu_fence_export_sync_file(struct radeon_winsys *rws,
return fd;
}
static int amdgpu_export_signalled_sync_file(struct radeon_winsys *rws)
{
struct amdgpu_winsys *ws = amdgpu_winsys(rws);
uint32_t syncobj;
int fd = -1;
int r = amdgpu_cs_create_syncobj2(ws->dev, DRM_SYNCOBJ_CREATE_SIGNALED,
&syncobj);
if (r) {
return -1;
}
r = amdgpu_cs_syncobj_export_sync_file(ws->dev, syncobj, &fd);
if (r) {
fd = -1;
}
amdgpu_cs_destroy_syncobj(ws->dev, syncobj);
return fd;
}
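One way to convince yourself the returned FD really is a signalled sync file: sync_file descriptors report POLLIN once their fence has signalled, so an FD from this helper should poll ready even with a zero timeout. A hypothetical checker, not part of the patch:

```c
#include <poll.h>
#include <stdbool.h>

/* Returns true if the sync_file fd's fence has already signalled.
 * sync_file exposes fence completion through poll(2): POLLIN is
 * raised once the fence signals. */
static bool sync_file_is_signalled(int fd)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    return poll(&p, 1, /*timeout_ms=*/0) == 1 && (p.revents & POLLIN);
}
```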
static void amdgpu_fence_submitted(struct pipe_fence_handle *fence,
uint64_t seq_no,
uint64_t *user_fence_cpu_address)
@@ -1560,4 +1581,5 @@ void amdgpu_cs_init_functions(struct amdgpu_winsys *ws)
ws->base.fence_reference = amdgpu_fence_reference;
ws->base.fence_import_sync_file = amdgpu_fence_import_sync_file;
ws->base.fence_export_sync_file = amdgpu_fence_export_sync_file;
ws->base.export_signalled_sync_file = amdgpu_export_signalled_sync_file;
}

@@ -65,6 +65,7 @@ CLEANFILES += \
EXTRA_DIST += \
$(top_srcdir)/include/vulkan/vk_icd.h \
vulkan/anv_entrypoints_gen.py \
vulkan/anv_extensions_gen.py \
vulkan/anv_extensions.py \
vulkan/anv_icd.py \
vulkan/TODO

@@ -2096,15 +2096,6 @@ fs_visitor::assign_constant_locations()
if (subgroup_id_index >= 0)
max_push_components--; /* Save a slot for the thread ID */
/* FIXME: We currently have some GPU hangs that apparently happen when using
* push constants. Since we have no solution for such hangs yet, just
* go ahead and use pull constants for now.
*/
if (devinfo->gen == 10 && compiler->supports_pull_constants) {
compiler->shader_perf_log(log_data, "Disabling push constants.");
max_push_components = 0;
}
/* We push small arrays, but no bigger than 16 floats. This is big enough
* for a vec4 but hopefully not large enough to push out other stuff. We
* should probably use a better heuristic at some point.
@@ -3640,13 +3631,18 @@ fs_visitor::lower_integer_multiplication()
regions_overlap(inst->dst, inst->size_written,
inst->src[1], inst->size_read(1))) {
needs_mov = true;
low.nr = alloc.allocate(regs_written(inst));
low.offset = low.offset % REG_SIZE;
/* Get a new VGRF but keep the same stride as inst->dst */
low = fs_reg(VGRF, alloc.allocate(regs_written(inst)),
inst->dst.type);
low.stride = inst->dst.stride;
low.offset = inst->dst.offset % REG_SIZE;
}
fs_reg high = inst->dst;
high.nr = alloc.allocate(regs_written(inst));
high.offset = high.offset % REG_SIZE;
/* Get a new VGRF but keep the same stride as inst->dst */
fs_reg high(VGRF, alloc.allocate(regs_written(inst)),
inst->dst.type);
high.stride = inst->dst.stride;
high.offset = inst->dst.offset % REG_SIZE;
if (devinfo->gen >= 7) {
if (inst->src[1].file == IMM) {
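The subtlety the new comments call out: the old code recycled the existing register and patched only `.nr`, so a stale stride could disagree with `inst->dst`. Building a fresh register from the destination copies every field that must match. In miniature (an illustrative struct, not the i965 `fs_reg`):

```c
#include <stdint.h>

#define REG_SIZE 32  /* one GRF is 32 bytes on this hardware */

struct reg {
    int      nr;      /* register number */
    int      stride;  /* element stride */
    unsigned offset;  /* byte offset */
    int      type;
};

/* Allocate a register that is layout-compatible with dst: same
 * stride and type, offset reduced modulo REG_SIZE, fresh number. */
static struct reg new_vgrf_like(struct reg dst, int fresh_nr)
{
    return (struct reg) {
        .nr     = fresh_nr,
        .stride = dst.stride,            /* must match dst */
        .offset = dst.offset % REG_SIZE,
        .type   = dst.type,
    };
}
```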

@@ -113,47 +113,43 @@ anv_dynamic_state_copy(struct anv_dynamic_state *dest,
}
static void
anv_cmd_state_reset(struct anv_cmd_buffer *cmd_buffer)
anv_cmd_state_init(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_cmd_state *state = &cmd_buffer->state;
cmd_buffer->batch.status = VK_SUCCESS;
memset(state, 0, sizeof(*state));
memset(&state->descriptors, 0, sizeof(state->descriptors));
for (uint32_t i = 0; i < ARRAY_SIZE(state->push_descriptors); i++) {
vk_free(&cmd_buffer->pool->alloc, state->push_descriptors[i]);
state->push_descriptors[i] = NULL;
}
for (uint32_t i = 0; i < MESA_SHADER_STAGES; i++) {
vk_free(&cmd_buffer->pool->alloc, state->push_constants[i]);
state->push_constants[i] = NULL;
}
memset(state->binding_tables, 0, sizeof(state->binding_tables));
memset(state->samplers, 0, sizeof(state->samplers));
/* 0 isn't a valid config. This ensures that we always configure L3$. */
cmd_buffer->state.current_l3_config = 0;
state->dirty = 0;
state->vb_dirty = 0;
state->pending_pipe_bits = 0;
state->descriptors_dirty = 0;
state->push_constants_dirty = 0;
state->pipeline = NULL;
state->framebuffer = NULL;
state->pass = NULL;
state->subpass = NULL;
state->push_constant_stages = 0;
state->restart_index = UINT32_MAX;
state->dynamic = default_dynamic_state;
state->need_query_wa = true;
state->pma_fix_enabled = false;
state->hiz_enabled = false;
state->gfx.dynamic = default_dynamic_state;
}
static void
anv_cmd_pipeline_state_finish(struct anv_cmd_buffer *cmd_buffer,
struct anv_cmd_pipeline_state *pipe_state)
{
for (uint32_t i = 0; i < ARRAY_SIZE(pipe_state->push_descriptors); i++)
vk_free(&cmd_buffer->pool->alloc, pipe_state->push_descriptors[i]);
}
static void
anv_cmd_state_finish(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_cmd_state *state = &cmd_buffer->state;
anv_cmd_pipeline_state_finish(cmd_buffer, &state->gfx.base);
anv_cmd_pipeline_state_finish(cmd_buffer, &state->compute.base);
for (uint32_t i = 0; i < MESA_SHADER_STAGES; i++)
vk_free(&cmd_buffer->pool->alloc, state->push_constants[i]);
vk_free(&cmd_buffer->pool->alloc, state->attachments);
state->attachments = NULL;
}
state->gen7.index_buffer = NULL;
static void
anv_cmd_state_reset(struct anv_cmd_buffer *cmd_buffer)
{
anv_cmd_state_finish(cmd_buffer);
anv_cmd_state_init(cmd_buffer);
}
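The refactor above replaces one monolithic reset with the classic init/finish pair, so destruction can release resources without re-initializing, and reset becomes the composition of the two. The shape of the idiom (illustrative types, not the anv structs):

```c
#include <stdlib.h>
#include <string.h>

struct state {
    void *heap_buf;
    unsigned current_config; /* 0 is not a valid config here */
};

static void state_init(struct state *s)
{
    memset(s, 0, sizeof(*s));
    /* zeroing doubles as "force reconfiguration on first use" */
}

static void state_finish(struct state *s)
{
    free(s->heap_buf); /* release only; destroy paths stop here */
}

static void state_reset(struct state *s)
{
    state_finish(s);
    state_init(s);
}
```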
VkResult
@@ -198,14 +194,10 @@ static VkResult anv_create_cmd_buffer(
cmd_buffer->batch.status = VK_SUCCESS;
for (uint32_t i = 0; i < MESA_SHADER_STAGES; i++) {
cmd_buffer->state.push_constants[i] = NULL;
}
cmd_buffer->_loader_data.loaderMagic = ICD_LOADER_MAGIC;
cmd_buffer->device = device;
cmd_buffer->pool = pool;
cmd_buffer->level = level;
cmd_buffer->state.attachments = NULL;
result = anv_cmd_buffer_init_batch_bo_chain(cmd_buffer);
if (result != VK_SUCCESS)
@@ -216,8 +208,7 @@ static VkResult anv_create_cmd_buffer(
anv_state_stream_init(&cmd_buffer->dynamic_state_stream,
&device->dynamic_state_pool, 16384);
memset(cmd_buffer->state.push_descriptors, 0,
sizeof(cmd_buffer->state.push_descriptors));
anv_cmd_state_init(cmd_buffer);
if (pool) {
list_addtail(&cmd_buffer->pool_link, &pool->cmd_buffers);
@@ -276,7 +267,7 @@ anv_cmd_buffer_destroy(struct anv_cmd_buffer *cmd_buffer)
anv_state_stream_finish(&cmd_buffer->surface_state_stream);
anv_state_stream_finish(&cmd_buffer->dynamic_state_stream);
anv_cmd_state_reset(cmd_buffer);
anv_cmd_state_finish(cmd_buffer);
vk_free(&cmd_buffer->pool->alloc, cmd_buffer);
}
@@ -353,22 +344,22 @@ void anv_CmdBindPipeline(
switch (pipelineBindPoint) {
case VK_PIPELINE_BIND_POINT_COMPUTE:
cmd_buffer->state.compute_pipeline = pipeline;
cmd_buffer->state.compute_dirty |= ANV_CMD_DIRTY_PIPELINE;
cmd_buffer->state.compute.base.pipeline = pipeline;
cmd_buffer->state.compute.pipeline_dirty = true;
cmd_buffer->state.push_constants_dirty |= VK_SHADER_STAGE_COMPUTE_BIT;
cmd_buffer->state.descriptors_dirty |= VK_SHADER_STAGE_COMPUTE_BIT;
break;
case VK_PIPELINE_BIND_POINT_GRAPHICS:
cmd_buffer->state.pipeline = pipeline;
cmd_buffer->state.vb_dirty |= pipeline->vb_used;
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_PIPELINE;
cmd_buffer->state.gfx.base.pipeline = pipeline;
cmd_buffer->state.gfx.vb_dirty |= pipeline->vb_used;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_PIPELINE;
cmd_buffer->state.push_constants_dirty |= pipeline->active_stages;
cmd_buffer->state.descriptors_dirty |= pipeline->active_stages;
/* Apply the dynamic state from the pipeline */
cmd_buffer->state.dirty |= pipeline->dynamic_state_mask;
anv_dynamic_state_copy(&cmd_buffer->state.dynamic,
cmd_buffer->state.gfx.dirty |= pipeline->dynamic_state_mask;
anv_dynamic_state_copy(&cmd_buffer->state.gfx.dynamic,
&pipeline->dynamic_state,
pipeline->dynamic_state_mask);
break;
@@ -388,13 +379,13 @@ void anv_CmdSetViewport(
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
const uint32_t total_count = firstViewport + viewportCount;
if (cmd_buffer->state.dynamic.viewport.count < total_count)
cmd_buffer->state.dynamic.viewport.count = total_count;
if (cmd_buffer->state.gfx.dynamic.viewport.count < total_count)
cmd_buffer->state.gfx.dynamic.viewport.count = total_count;
memcpy(cmd_buffer->state.dynamic.viewport.viewports + firstViewport,
memcpy(cmd_buffer->state.gfx.dynamic.viewport.viewports + firstViewport,
pViewports, viewportCount * sizeof(*pViewports));
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_VIEWPORT;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_VIEWPORT;
}
void anv_CmdSetScissor(
@@ -406,13 +397,13 @@ void anv_CmdSetScissor(
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
const uint32_t total_count = firstScissor + scissorCount;
if (cmd_buffer->state.dynamic.scissor.count < total_count)
cmd_buffer->state.dynamic.scissor.count = total_count;
if (cmd_buffer->state.gfx.dynamic.scissor.count < total_count)
cmd_buffer->state.gfx.dynamic.scissor.count = total_count;
memcpy(cmd_buffer->state.dynamic.scissor.scissors + firstScissor,
memcpy(cmd_buffer->state.gfx.dynamic.scissor.scissors + firstScissor,
pScissors, scissorCount * sizeof(*pScissors));
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_SCISSOR;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_SCISSOR;
}
void anv_CmdSetLineWidth(
@@ -421,8 +412,8 @@ void anv_CmdSetLineWidth(
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
cmd_buffer->state.dynamic.line_width = lineWidth;
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_LINE_WIDTH;
cmd_buffer->state.gfx.dynamic.line_width = lineWidth;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_LINE_WIDTH;
}
void anv_CmdSetDepthBias(
@@ -433,11 +424,11 @@ void anv_CmdSetDepthBias(
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
cmd_buffer->state.dynamic.depth_bias.bias = depthBiasConstantFactor;
cmd_buffer->state.dynamic.depth_bias.clamp = depthBiasClamp;
cmd_buffer->state.dynamic.depth_bias.slope = depthBiasSlopeFactor;
cmd_buffer->state.gfx.dynamic.depth_bias.bias = depthBiasConstantFactor;
cmd_buffer->state.gfx.dynamic.depth_bias.clamp = depthBiasClamp;
cmd_buffer->state.gfx.dynamic.depth_bias.slope = depthBiasSlopeFactor;
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_DEPTH_BIAS;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_DEPTH_BIAS;
}
void anv_CmdSetBlendConstants(
@@ -446,10 +437,10 @@ void anv_CmdSetBlendConstants(
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
memcpy(cmd_buffer->state.dynamic.blend_constants,
memcpy(cmd_buffer->state.gfx.dynamic.blend_constants,
blendConstants, sizeof(float) * 4);
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS;
}
void anv_CmdSetDepthBounds(
@@ -459,10 +450,10 @@ void anv_CmdSetDepthBounds(
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
cmd_buffer->state.dynamic.depth_bounds.min = minDepthBounds;
cmd_buffer->state.dynamic.depth_bounds.max = maxDepthBounds;
cmd_buffer->state.gfx.dynamic.depth_bounds.min = minDepthBounds;
cmd_buffer->state.gfx.dynamic.depth_bounds.max = maxDepthBounds;
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_DEPTH_BOUNDS;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_DEPTH_BOUNDS;
}
void anv_CmdSetStencilCompareMask(
@@ -473,11 +464,11 @@ void anv_CmdSetStencilCompareMask(
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
if (faceMask & VK_STENCIL_FACE_FRONT_BIT)
cmd_buffer->state.dynamic.stencil_compare_mask.front = compareMask;
cmd_buffer->state.gfx.dynamic.stencil_compare_mask.front = compareMask;
if (faceMask & VK_STENCIL_FACE_BACK_BIT)
cmd_buffer->state.dynamic.stencil_compare_mask.back = compareMask;
cmd_buffer->state.gfx.dynamic.stencil_compare_mask.back = compareMask;
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_COMPARE_MASK;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_COMPARE_MASK;
}
void anv_CmdSetStencilWriteMask(
@@ -488,11 +479,11 @@ void anv_CmdSetStencilWriteMask(
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
if (faceMask & VK_STENCIL_FACE_FRONT_BIT)
cmd_buffer->state.dynamic.stencil_write_mask.front = writeMask;
cmd_buffer->state.gfx.dynamic.stencil_write_mask.front = writeMask;
if (faceMask & VK_STENCIL_FACE_BACK_BIT)
cmd_buffer->state.dynamic.stencil_write_mask.back = writeMask;
cmd_buffer->state.gfx.dynamic.stencil_write_mask.back = writeMask;
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_WRITE_MASK;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_WRITE_MASK;
}
void anv_CmdSetStencilReference(
@@ -503,11 +494,59 @@ void anv_CmdSetStencilReference(
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
if (faceMask & VK_STENCIL_FACE_FRONT_BIT)
cmd_buffer->state.dynamic.stencil_reference.front = reference;
cmd_buffer->state.gfx.dynamic.stencil_reference.front = reference;
if (faceMask & VK_STENCIL_FACE_BACK_BIT)
cmd_buffer->state.dynamic.stencil_reference.back = reference;
cmd_buffer->state.gfx.dynamic.stencil_reference.back = reference;
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE;
}
static void
anv_cmd_buffer_bind_descriptor_set(struct anv_cmd_buffer *cmd_buffer,
VkPipelineBindPoint bind_point,
struct anv_pipeline_layout *layout,
uint32_t set_index,
struct anv_descriptor_set *set,
uint32_t *dynamic_offset_count,
const uint32_t **dynamic_offsets)
{
struct anv_descriptor_set_layout *set_layout =
layout->set[set_index].layout;
struct anv_cmd_pipeline_state *pipe_state;
if (bind_point == VK_PIPELINE_BIND_POINT_COMPUTE) {
pipe_state = &cmd_buffer->state.compute.base;
} else {
assert(bind_point == VK_PIPELINE_BIND_POINT_GRAPHICS);
pipe_state = &cmd_buffer->state.gfx.base;
}
pipe_state->descriptors[set_index] = set;
if (dynamic_offsets) {
if (set_layout->dynamic_offset_count > 0) {
uint32_t dynamic_offset_start =
layout->set[set_index].dynamic_offset_start;
/* Assert that everything is in range */
assert(set_layout->dynamic_offset_count <= *dynamic_offset_count);
assert(dynamic_offset_start + set_layout->dynamic_offset_count <=
ARRAY_SIZE(pipe_state->dynamic_offsets));
typed_memcpy(&pipe_state->dynamic_offsets[dynamic_offset_start],
*dynamic_offsets, set_layout->dynamic_offset_count);
*dynamic_offsets += set_layout->dynamic_offset_count;
*dynamic_offset_count -= set_layout->dynamic_offset_count;
}
}
if (bind_point == VK_PIPELINE_BIND_POINT_COMPUTE) {
cmd_buffer->state.descriptors_dirty |= VK_SHADER_STAGE_COMPUTE_BIT;
} else {
assert(bind_point == VK_PIPELINE_BIND_POINT_GRAPHICS);
cmd_buffer->state.descriptors_dirty |=
set_layout->shader_stages & VK_SHADER_STAGE_ALL_GRAPHICS;
}
}
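The helper consumes dynamic offsets through a count/cursor pair passed by reference, which lets the caller's loop hand the same two pointers to every set while each call takes only its share. The pattern in isolation (plain integers, hypothetical names):

```c
#include <assert.h>
#include <stdint.h>

/* Each call validates and consumes `needed` entries from the
 * caller's array, advancing the cursor and shrinking the count so
 * the next call starts where this one stopped. */
static void consume_offsets(uint32_t needed,
                            uint32_t *remaining_count,
                            const uint32_t **cursor)
{
    assert(needed <= *remaining_count);
    /* ... copy (*cursor)[0 .. needed-1] into per-set storage ... */
    *cursor += needed;
    *remaining_count -= needed;
}
```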
void anv_CmdBindDescriptorSets(
@@ -522,35 +561,15 @@ void anv_CmdBindDescriptorSets(
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
ANV_FROM_HANDLE(anv_pipeline_layout, layout, _layout);
struct anv_descriptor_set_layout *set_layout;
assert(firstSet + descriptorSetCount < MAX_SETS);
uint32_t dynamic_slot = 0;
for (uint32_t i = 0; i < descriptorSetCount; i++) {
ANV_FROM_HANDLE(anv_descriptor_set, set, pDescriptorSets[i]);
set_layout = layout->set[firstSet + i].layout;
cmd_buffer->state.descriptors[firstSet + i] = set;
if (set_layout->dynamic_offset_count > 0) {
uint32_t dynamic_offset_start =
layout->set[firstSet + i].dynamic_offset_start;
/* Assert that everything is in range */
assert(dynamic_offset_start + set_layout->dynamic_offset_count <=
ARRAY_SIZE(cmd_buffer->state.dynamic_offsets));
assert(dynamic_slot + set_layout->dynamic_offset_count <=
dynamicOffsetCount);
typed_memcpy(&cmd_buffer->state.dynamic_offsets[dynamic_offset_start],
&pDynamicOffsets[dynamic_slot],
set_layout->dynamic_offset_count);
dynamic_slot += set_layout->dynamic_offset_count;
}
cmd_buffer->state.descriptors_dirty |= set_layout->shader_stages;
anv_cmd_buffer_bind_descriptor_set(cmd_buffer, pipelineBindPoint,
layout, firstSet + i, set,
&dynamicOffsetCount,
&pDynamicOffsets);
}
}
@@ -571,7 +590,7 @@ void anv_CmdBindVertexBuffers(
for (uint32_t i = 0; i < bindingCount; i++) {
vb[firstBinding + i].buffer = anv_buffer_from_handle(pBuffers[i]);
vb[firstBinding + i].offset = pOffsets[i];
cmd_buffer->state.vb_dirty |= 1 << (firstBinding + i);
cmd_buffer->state.gfx.vb_dirty |= 1 << (firstBinding + i);
}
}
@@ -653,14 +672,16 @@ struct anv_state
anv_cmd_buffer_push_constants(struct anv_cmd_buffer *cmd_buffer,
gl_shader_stage stage)
{
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
/* If we don't have this stage, bail. */
if (!anv_pipeline_has_stage(cmd_buffer->state.pipeline, stage))
if (!anv_pipeline_has_stage(pipeline, stage))
return (struct anv_state) { .offset = 0 };
struct anv_push_constants *data =
cmd_buffer->state.push_constants[stage];
const struct brw_stage_prog_data *prog_data =
cmd_buffer->state.pipeline->shaders[stage]->prog_data;
pipeline->shaders[stage]->prog_data;
/* If we don't actually have any push constants, bail. */
if (data == NULL || prog_data == NULL || prog_data->nr_params == 0)
@@ -686,7 +707,7 @@ anv_cmd_buffer_cs_push_constants(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_push_constants *data =
cmd_buffer->state.push_constants[MESA_SHADER_COMPUTE];
struct anv_pipeline *pipeline = cmd_buffer->state.compute_pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.compute.base.pipeline;
const struct brw_cs_prog_data *cs_prog_data = get_cs_prog_data(pipeline);
const struct brw_stage_prog_data *prog_data = &cs_prog_data->base;
@@ -850,12 +871,21 @@ anv_cmd_buffer_get_depth_stencil_view(const struct anv_cmd_buffer *cmd_buffer)
return iview;
}
static VkResult
anv_cmd_buffer_ensure_push_descriptor_set(struct anv_cmd_buffer *cmd_buffer,
uint32_t set)
static struct anv_push_descriptor_set *
anv_cmd_buffer_get_push_descriptor_set(struct anv_cmd_buffer *cmd_buffer,
VkPipelineBindPoint bind_point,
uint32_t set)
{
struct anv_cmd_pipeline_state *pipe_state;
if (bind_point == VK_PIPELINE_BIND_POINT_COMPUTE) {
pipe_state = &cmd_buffer->state.compute.base;
} else {
assert(bind_point == VK_PIPELINE_BIND_POINT_GRAPHICS);
pipe_state = &cmd_buffer->state.gfx.base;
}
struct anv_push_descriptor_set **push_set =
&cmd_buffer->state.push_descriptors[set];
&pipe_state->push_descriptors[set];
if (*push_set == NULL) {
*push_set = vk_alloc(&cmd_buffer->pool->alloc,
@@ -863,11 +893,11 @@ anv_cmd_buffer_ensure_push_descriptor_set(struct anv_cmd_buffer *cmd_buffer,
VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (*push_set == NULL) {
anv_batch_set_error(&cmd_buffer->batch, VK_ERROR_OUT_OF_HOST_MEMORY);
return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
return NULL;
}
}
return VK_SUCCESS;
return *push_set;
}
void anv_CmdPushDescriptorSetKHR(
@@ -881,17 +911,17 @@ void anv_CmdPushDescriptorSetKHR(
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
ANV_FROM_HANDLE(anv_pipeline_layout, layout, _layout);
assert(pipelineBindPoint == VK_PIPELINE_BIND_POINT_GRAPHICS ||
pipelineBindPoint == VK_PIPELINE_BIND_POINT_COMPUTE);
assert(_set < MAX_SETS);
const struct anv_descriptor_set_layout *set_layout =
layout->set[_set].layout;
if (anv_cmd_buffer_ensure_push_descriptor_set(cmd_buffer, _set) != VK_SUCCESS)
return;
struct anv_push_descriptor_set *push_set =
cmd_buffer->state.push_descriptors[_set];
anv_cmd_buffer_get_push_descriptor_set(cmd_buffer,
pipelineBindPoint, _set);
if (!push_set)
return;
struct anv_descriptor_set *set = &push_set->set;
set->layout = set_layout;
@@ -958,8 +988,8 @@ void anv_CmdPushDescriptorSetKHR(
}
}
cmd_buffer->state.descriptors[_set] = set;
cmd_buffer->state.descriptors_dirty |= set_layout->shader_stages;
anv_cmd_buffer_bind_descriptor_set(cmd_buffer, pipelineBindPoint,
layout, _set, set, NULL, NULL);
}
void anv_CmdPushDescriptorSetWithTemplateKHR(
@@ -979,10 +1009,12 @@ void anv_CmdPushDescriptorSetWithTemplateKHR(
const struct anv_descriptor_set_layout *set_layout =
layout->set[_set].layout;
if (anv_cmd_buffer_ensure_push_descriptor_set(cmd_buffer, _set) != VK_SUCCESS)
return;
struct anv_push_descriptor_set *push_set =
cmd_buffer->state.push_descriptors[_set];
anv_cmd_buffer_get_push_descriptor_set(cmd_buffer,
template->bind_point, _set);
if (!push_set)
return;
struct anv_descriptor_set *set = &push_set->set;
set->layout = set_layout;
@@ -996,6 +1028,6 @@ void anv_CmdPushDescriptorSetWithTemplateKHR(
template,
pData);
cmd_buffer->state.descriptors[_set] = set;
cmd_buffer->state.descriptors_dirty |= set_layout->shader_stages;
anv_cmd_buffer_bind_descriptor_set(cmd_buffer, template->bind_point,
layout, _set, set, NULL, NULL);
}

@@ -893,6 +893,8 @@ VkResult anv_CreateDescriptorUpdateTemplateKHR(
if (template == NULL)
return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
template->bind_point = pCreateInfo->pipelineBindPoint;
if (pCreateInfo->templateType == VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET_KHR)
template->set = pCreateInfo->set;

@@ -44,4 +44,4 @@ if __name__ == '__main__':
}
with open(args.out, 'w') as f:
json.dump(json_data, f, indent = 4)
json.dump(json_data, f, indent = 4, sort_keys=True)

@@ -313,10 +313,10 @@ VkResult __vk_errorf(struct anv_instance *instance, const void *object,
#ifdef DEBUG
#define vk_error(error) __vk_errorf(NULL, NULL,\
VK_DEBUG_REPORT_OBJECT_TYPE_UNKNOWN_EXT,\
error, __FILE__, __LINE__, NULL);
error, __FILE__, __LINE__, NULL)
#define vk_errorf(instance, obj, error, format, ...)\
__vk_errorf(instance, obj, REPORT_OBJECT_TYPE(obj), error,\
__FILE__, __LINE__, format, ## __VA_ARGS__);
__FILE__, __LINE__, format, ## __VA_ARGS__)
#else
#define vk_error(error) error
#define vk_errorf(instance, obj, error, format, ...) error
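Dropping the trailing semicolons from the macro bodies is a real fix, not cosmetics: with the `;` baked into the expansion, using the macro as the body of an `if` that has an `else` fails to compile, because the extra `;` terminates the if-branch and orphans the `else`. A minimal reproduction with hypothetical names:

```c
#include <stdio.h>

static int report(int e) { printf("error %d\n", e); return e; }

#define GOOD_ERR(e) report(e)
/* #define BAD_ERR(e) report(e);   -- with this version, the code
 * below expands to "report(1);;": the stray empty statement ends
 * the if-branch, so the else no longer parses. */

int main(void)
{
    int cond = 1;
    if (cond)
        GOOD_ERR(1);
    else
        GOOD_ERR(2);
    return 0;
}
```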
@@ -1306,6 +1306,8 @@ struct anv_descriptor_template_entry {
};
struct anv_descriptor_update_template {
VkPipelineBindPoint bind_point;
/* The descriptor set this template corresponds to. This value is only
* valid if the template was created with the templateType
* VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET_KHR.
@@ -1456,6 +1458,7 @@ enum anv_pipe_bits {
ANV_PIPE_CONSTANT_CACHE_INVALIDATE_BIT = (1 << 3),
ANV_PIPE_VF_CACHE_INVALIDATE_BIT = (1 << 4),
ANV_PIPE_DATA_CACHE_FLUSH_BIT = (1 << 5),
ANV_PIPE_ISP_DISABLE_BIT = (1 << 9),
ANV_PIPE_TEXTURE_CACHE_INVALIDATE_BIT = (1 << 10),
ANV_PIPE_INSTRUCTION_CACHE_INVALIDATE_BIT = (1 << 11),
ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT = (1 << 12),
@@ -1663,38 +1666,83 @@ struct anv_attachment_state {
bool clear_color_is_zero;
};
/** State tracking for particular pipeline bind point
*
* This struct is the base struct for anv_cmd_graphics_state and
* anv_cmd_compute_state. These are used to track state which is bound to a
* particular type of pipeline. Generic state that applies per-stage such as
* binding table offsets and push constants is tracked generically with a
* per-stage array in anv_cmd_state.
*/
struct anv_cmd_pipeline_state {
struct anv_pipeline *pipeline;
struct anv_descriptor_set *descriptors[MAX_SETS];
uint32_t dynamic_offsets[MAX_DYNAMIC_BUFFERS];
struct anv_push_descriptor_set *push_descriptors[MAX_SETS];
};
/** State tracking for graphics pipeline
*
* This has anv_cmd_pipeline_state as a base struct to track things which get
* bound to a graphics pipeline. Along with general pipeline bind point state
* which is in the anv_cmd_pipeline_state base struct, it also contains other
* state which is graphics-specific.
*/
struct anv_cmd_graphics_state {
struct anv_cmd_pipeline_state base;
anv_cmd_dirty_mask_t dirty;
uint32_t vb_dirty;
struct anv_dynamic_state dynamic;
struct {
struct anv_buffer *index_buffer;
uint32_t index_type; /**< 3DSTATE_INDEX_BUFFER.IndexFormat */
uint32_t index_offset;
} gen7;
};
/** State tracking for compute pipeline
*
* This has anv_cmd_pipeline_state as a base struct to track things which get
* bound to a compute pipeline. Along with general pipeline bind point state
* which is in the anv_cmd_pipeline_state base struct, it also contains other
* state which is compute-specific.
*/
struct anv_cmd_compute_state {
struct anv_cmd_pipeline_state base;
bool pipeline_dirty;
struct anv_address num_workgroups;
};
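The base-struct arrangement documented above is the usual C substitute for inheritance: because `base` is the first member, code that only needs pipeline-bind-point state can take a pointer to it regardless of whether the enclosing struct is the graphics or the compute variant. A stripped-down sketch (illustrative names):

```c
struct pipeline_state { int pipeline; };                 /* the base */

struct graphics_state { struct pipeline_state base; int vb_dirty; };
struct compute_state  { struct pipeline_state base; int pipeline_dirty; };

/* Works on either variant: callers pass &gfx.base or &compute.base,
 * exactly like the bind_point switches added in this patch. */
static void bind_pipeline(struct pipeline_state *ps, int pipeline)
{
    ps->pipeline = pipeline;
}
```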
/** State required while building cmd buffer */
struct anv_cmd_state {
/* PIPELINE_SELECT.PipelineSelection */
uint32_t current_pipeline;
const struct gen_l3_config * current_l3_config;
uint32_t vb_dirty;
anv_cmd_dirty_mask_t dirty;
anv_cmd_dirty_mask_t compute_dirty;
struct anv_cmd_graphics_state gfx;
struct anv_cmd_compute_state compute;
enum anv_pipe_bits pending_pipe_bits;
uint32_t num_workgroups_offset;
struct anv_bo *num_workgroups_bo;
VkShaderStageFlags descriptors_dirty;
VkShaderStageFlags push_constants_dirty;
uint32_t scratch_size;
struct anv_pipeline * pipeline;
struct anv_pipeline * compute_pipeline;
struct anv_framebuffer * framebuffer;
struct anv_render_pass * pass;
struct anv_subpass * subpass;
VkRect2D render_area;
uint32_t restart_index;
struct anv_vertex_binding vertex_bindings[MAX_VBS];
struct anv_descriptor_set * descriptors[MAX_SETS];
uint32_t dynamic_offsets[MAX_DYNAMIC_BUFFERS];
VkShaderStageFlags push_constant_stages;
struct anv_push_constants * push_constants[MESA_SHADER_STAGES];
struct anv_state binding_tables[MESA_SHADER_STAGES];
struct anv_state samplers[MESA_SHADER_STAGES];
struct anv_dynamic_state dynamic;
bool need_query_wa;
struct anv_push_descriptor_set * push_descriptors[MAX_SETS];
/**
* Whether or not the gen8 PMA fix is enabled. We ensure that, at the top
@@ -1728,12 +1776,6 @@ struct anv_cmd_state {
* is one of the states in render_pass_states.
*/
struct anv_state null_surface_state;
struct {
struct anv_buffer * index_buffer;
uint32_t index_type; /**< 3DSTATE_INDEX_BUFFER.IndexFormat */
uint32_t index_offset;
} gen7;
};
struct anv_cmd_pool {

@@ -48,8 +48,8 @@ clamp_int64(int64_t x, int64_t min, int64_t max)
void
gen7_cmd_buffer_emit_scissor(struct anv_cmd_buffer *cmd_buffer)
{
uint32_t count = cmd_buffer->state.dynamic.scissor.count;
const VkRect2D *scissors = cmd_buffer->state.dynamic.scissor.scissors;
uint32_t count = cmd_buffer->state.gfx.dynamic.scissor.count;
const VkRect2D *scissors = cmd_buffer->state.gfx.dynamic.scissor.scissors;
struct anv_state scissor_state =
anv_cmd_buffer_alloc_dynamic_state(cmd_buffer, count * 8, 32);
@@ -113,12 +113,12 @@ void genX(CmdBindIndexBuffer)(
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
ANV_FROM_HANDLE(anv_buffer, buffer, _buffer);
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_INDEX_BUFFER;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_INDEX_BUFFER;
if (GEN_IS_HASWELL)
cmd_buffer->state.restart_index = restart_index_for_type[indexType];
cmd_buffer->state.gen7.index_buffer = buffer;
cmd_buffer->state.gen7.index_type = vk_to_gen_index_type[indexType];
cmd_buffer->state.gen7.index_offset = offset;
cmd_buffer->state.gfx.gen7.index_buffer = buffer;
cmd_buffer->state.gfx.gen7.index_type = vk_to_gen_index_type[indexType];
cmd_buffer->state.gfx.gen7.index_offset = offset;
}
static uint32_t
@@ -154,38 +154,38 @@ get_depth_format(struct anv_cmd_buffer *cmd_buffer)
void
genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
struct anv_dynamic_state *d = &cmd_buffer->state.gfx.dynamic;
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_RENDER_TARGETS |
ANV_CMD_DIRTY_DYNAMIC_LINE_WIDTH |
ANV_CMD_DIRTY_DYNAMIC_DEPTH_BIAS)) {
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_RENDER_TARGETS |
ANV_CMD_DIRTY_DYNAMIC_LINE_WIDTH |
ANV_CMD_DIRTY_DYNAMIC_DEPTH_BIAS)) {
uint32_t sf_dw[GENX(3DSTATE_SF_length)];
struct GENX(3DSTATE_SF) sf = {
GENX(3DSTATE_SF_header),
.DepthBufferSurfaceFormat = get_depth_format(cmd_buffer),
.LineWidth = cmd_buffer->state.dynamic.line_width,
.GlobalDepthOffsetConstant = cmd_buffer->state.dynamic.depth_bias.bias,
.GlobalDepthOffsetScale = cmd_buffer->state.dynamic.depth_bias.slope,
.GlobalDepthOffsetClamp = cmd_buffer->state.dynamic.depth_bias.clamp
.LineWidth = d->line_width,
.GlobalDepthOffsetConstant = d->depth_bias.bias,
.GlobalDepthOffsetScale = d->depth_bias.slope,
.GlobalDepthOffsetClamp = d->depth_bias.clamp
};
GENX(3DSTATE_SF_pack)(NULL, sf_dw, &sf);
anv_batch_emit_merge(&cmd_buffer->batch, sf_dw, pipeline->gen7.sf);
}
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE)) {
struct anv_dynamic_state *d = &cmd_buffer->state.dynamic;
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE)) {
struct anv_state cc_state =
anv_cmd_buffer_alloc_dynamic_state(cmd_buffer,
GENX(COLOR_CALC_STATE_length) * 4,
64);
struct GENX(COLOR_CALC_STATE) cc = {
.BlendConstantColorRed = cmd_buffer->state.dynamic.blend_constants[0],
.BlendConstantColorGreen = cmd_buffer->state.dynamic.blend_constants[1],
.BlendConstantColorBlue = cmd_buffer->state.dynamic.blend_constants[2],
.BlendConstantColorAlpha = cmd_buffer->state.dynamic.blend_constants[3],
.BlendConstantColorRed = d->blend_constants[0],
.BlendConstantColorGreen = d->blend_constants[1],
.BlendConstantColorBlue = d->blend_constants[2],
.BlendConstantColorAlpha = d->blend_constants[3],
.StencilReferenceValue = d->stencil_reference.front & 0xff,
.BackfaceStencilReferenceValue = d->stencil_reference.back & 0xff,
};
@@ -197,12 +197,11 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
}
}
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_RENDER_TARGETS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_COMPARE_MASK |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_WRITE_MASK)) {
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_RENDER_TARGETS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_COMPARE_MASK |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_WRITE_MASK)) {
uint32_t depth_stencil_dw[GENX(DEPTH_STENCIL_STATE_length)];
struct anv_dynamic_state *d = &cmd_buffer->state.dynamic;
struct GENX(DEPTH_STENCIL_STATE) depth_stencil = {
.StencilTestMask = d->stencil_compare_mask.front & 0xff,
@@ -228,11 +227,11 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
}
}
if (cmd_buffer->state.gen7.index_buffer &&
cmd_buffer->state.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_INDEX_BUFFER)) {
struct anv_buffer *buffer = cmd_buffer->state.gen7.index_buffer;
uint32_t offset = cmd_buffer->state.gen7.index_offset;
if (cmd_buffer->state.gfx.gen7.index_buffer &&
cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_INDEX_BUFFER)) {
struct anv_buffer *buffer = cmd_buffer->state.gfx.gen7.index_buffer;
uint32_t offset = cmd_buffer->state.gfx.gen7.index_offset;
#if GEN_IS_HASWELL
anv_batch_emit(&cmd_buffer->batch, GEN75_3DSTATE_VF, vf) {
@@ -245,7 +244,7 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
#if !GEN_IS_HASWELL
ib.CutIndexEnable = pipeline->primitive_restart;
#endif
ib.IndexFormat = cmd_buffer->state.gen7.index_type;
ib.IndexFormat = cmd_buffer->state.gfx.gen7.index_type;
ib.MemoryObjectControlState = GENX(MOCS);
ib.BufferStartingAddress =
@@ -255,7 +254,7 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
}
}
cmd_buffer->state.dirty = 0;
cmd_buffer->state.gfx.dirty = 0;
}
void

@@ -36,8 +36,9 @@
void
gen8_cmd_buffer_emit_viewport(struct anv_cmd_buffer *cmd_buffer)
{
uint32_t count = cmd_buffer->state.dynamic.viewport.count;
const VkViewport *viewports = cmd_buffer->state.dynamic.viewport.viewports;
uint32_t count = cmd_buffer->state.gfx.dynamic.viewport.count;
const VkViewport *viewports =
cmd_buffer->state.gfx.dynamic.viewport.viewports;
struct anv_state sf_clip_state =
anv_cmd_buffer_alloc_dynamic_state(cmd_buffer, count * 64, 64);
@@ -79,8 +80,9 @@ void
gen8_cmd_buffer_emit_depth_viewport(struct anv_cmd_buffer *cmd_buffer,
bool depth_clamp_enable)
{
uint32_t count = cmd_buffer->state.dynamic.viewport.count;
const VkViewport *viewports = cmd_buffer->state.dynamic.viewport.viewports;
uint32_t count = cmd_buffer->state.gfx.dynamic.viewport.count;
const VkViewport *viewports =
cmd_buffer->state.gfx.dynamic.viewport.viewports;
struct anv_state cc_state =
anv_cmd_buffer_alloc_dynamic_state(cmd_buffer, count * 8, 32);
@@ -218,7 +220,7 @@ want_depth_pma_fix(struct anv_cmd_buffer *cmd_buffer)
return false;
/* 3DSTATE_PS_EXTRA::PixelShaderValid */
struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
if (!anv_pipeline_has_stage(pipeline, MESA_SHADER_FRAGMENT))
return false;
@@ -328,7 +330,7 @@ want_stencil_pma_fix(struct anv_cmd_buffer *cmd_buffer)
assert(ds_iview && ds_iview->image->planes[0].aux_usage == ISL_AUX_USAGE_HIZ);
/* 3DSTATE_PS_EXTRA::PixelShaderValid */
struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
if (!anv_pipeline_has_stage(pipeline, MESA_SHADER_FRAGMENT))
return false;
@@ -381,36 +383,36 @@ want_stencil_pma_fix(struct anv_cmd_buffer *cmd_buffer)
void
genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
struct anv_dynamic_state *d = &cmd_buffer->state.gfx.dynamic;
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_DYNAMIC_LINE_WIDTH)) {
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_DYNAMIC_LINE_WIDTH)) {
uint32_t sf_dw[GENX(3DSTATE_SF_length)];
struct GENX(3DSTATE_SF) sf = {
GENX(3DSTATE_SF_header),
};
#if GEN_GEN == 8
if (cmd_buffer->device->info.is_cherryview) {
sf.CHVLineWidth = cmd_buffer->state.dynamic.line_width;
sf.CHVLineWidth = d->line_width;
} else {
sf.LineWidth = cmd_buffer->state.dynamic.line_width;
sf.LineWidth = d->line_width;
}
#else
sf.LineWidth = cmd_buffer->state.dynamic.line_width,
sf.LineWidth = d->line_width,
#endif
GENX(3DSTATE_SF_pack)(NULL, sf_dw, &sf);
anv_batch_emit_merge(&cmd_buffer->batch, sf_dw,
cmd_buffer->state.pipeline->gen8.sf);
anv_batch_emit_merge(&cmd_buffer->batch, sf_dw, pipeline->gen8.sf);
}
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_DYNAMIC_DEPTH_BIAS)){
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_DYNAMIC_DEPTH_BIAS)){
uint32_t raster_dw[GENX(3DSTATE_RASTER_length)];
struct GENX(3DSTATE_RASTER) raster = {
GENX(3DSTATE_RASTER_header),
.GlobalDepthOffsetConstant = cmd_buffer->state.dynamic.depth_bias.bias,
.GlobalDepthOffsetScale = cmd_buffer->state.dynamic.depth_bias.slope,
.GlobalDepthOffsetClamp = cmd_buffer->state.dynamic.depth_bias.clamp
.GlobalDepthOffsetConstant = d->depth_bias.bias,
.GlobalDepthOffsetScale = d->depth_bias.slope,
.GlobalDepthOffsetClamp = d->depth_bias.clamp
};
GENX(3DSTATE_RASTER_pack)(NULL, raster_dw, &raster);
anv_batch_emit_merge(&cmd_buffer->batch, raster_dw,
@@ -423,18 +425,17 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
* using a big old #if switch here.
*/
#if GEN_GEN == 8
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE)) {
struct anv_dynamic_state *d = &cmd_buffer->state.dynamic;
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE)) {
struct anv_state cc_state =
anv_cmd_buffer_alloc_dynamic_state(cmd_buffer,
GENX(COLOR_CALC_STATE_length) * 4,
64);
struct GENX(COLOR_CALC_STATE) cc = {
.BlendConstantColorRed = cmd_buffer->state.dynamic.blend_constants[0],
.BlendConstantColorGreen = cmd_buffer->state.dynamic.blend_constants[1],
.BlendConstantColorBlue = cmd_buffer->state.dynamic.blend_constants[2],
.BlendConstantColorAlpha = cmd_buffer->state.dynamic.blend_constants[3],
.BlendConstantColorRed = d->blend_constants[0],
.BlendConstantColorGreen = d->blend_constants[1],
.BlendConstantColorBlue = d->blend_constants[2],
.BlendConstantColorAlpha = d->blend_constants[3],
.StencilReferenceValue = d->stencil_reference.front & 0xff,
.BackfaceStencilReferenceValue = d->stencil_reference.back & 0xff,
};
@@ -448,12 +449,11 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
}
}
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_RENDER_TARGETS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_COMPARE_MASK |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_WRITE_MASK)) {
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_RENDER_TARGETS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_COMPARE_MASK |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_WRITE_MASK)) {
uint32_t wm_depth_stencil_dw[GENX(3DSTATE_WM_DEPTH_STENCIL_length)];
struct anv_dynamic_state *d = &cmd_buffer->state.dynamic;
struct GENX(3DSTATE_WM_DEPTH_STENCIL) wm_depth_stencil = {
GENX(3DSTATE_WM_DEPTH_STENCIL_header),
@@ -478,16 +478,16 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
want_depth_pma_fix(cmd_buffer));
}
#else
if (cmd_buffer->state.dirty & ANV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS) {
if (cmd_buffer->state.gfx.dirty & ANV_CMD_DIRTY_DYNAMIC_BLEND_CONSTANTS) {
struct anv_state cc_state =
anv_cmd_buffer_alloc_dynamic_state(cmd_buffer,
GENX(COLOR_CALC_STATE_length) * 4,
64);
struct GENX(COLOR_CALC_STATE) cc = {
.BlendConstantColorRed = cmd_buffer->state.dynamic.blend_constants[0],
.BlendConstantColorGreen = cmd_buffer->state.dynamic.blend_constants[1],
.BlendConstantColorBlue = cmd_buffer->state.dynamic.blend_constants[2],
.BlendConstantColorAlpha = cmd_buffer->state.dynamic.blend_constants[3],
.BlendConstantColorRed = d->blend_constants[0],
.BlendConstantColorGreen = d->blend_constants[1],
.BlendConstantColorBlue = d->blend_constants[2],
.BlendConstantColorAlpha = d->blend_constants[3],
};
GENX(COLOR_CALC_STATE_pack)(NULL, cc_state.map, &cc);
@@ -499,13 +499,12 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
}
}
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_RENDER_TARGETS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_COMPARE_MASK |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_WRITE_MASK |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE)) {
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_RENDER_TARGETS |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_COMPARE_MASK |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_WRITE_MASK |
ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE)) {
uint32_t dwords[GENX(3DSTATE_WM_DEPTH_STENCIL_length)];
struct anv_dynamic_state *d = &cmd_buffer->state.dynamic;
struct GENX(3DSTATE_WM_DEPTH_STENCIL) wm_depth_stencil = {
GENX(3DSTATE_WM_DEPTH_STENCIL_header),
@@ -532,15 +531,15 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
}
#endif
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_INDEX_BUFFER)) {
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_PIPELINE |
ANV_CMD_DIRTY_INDEX_BUFFER)) {
anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_VF), vf) {
vf.IndexedDrawCutIndexEnable = pipeline->primitive_restart;
vf.CutIndex = cmd_buffer->state.restart_index;
}
}
cmd_buffer->state.dirty = 0;
cmd_buffer->state.gfx.dirty = 0;
}
void genX(CmdBindIndexBuffer)(
@@ -572,7 +571,7 @@ void genX(CmdBindIndexBuffer)(
ib.BufferSize = buffer->size - offset;
}
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_INDEX_BUFFER;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_INDEX_BUFFER;
}
/* Set of stage bits which are pipelined, i.e. they get queued by the

@@ -218,7 +218,7 @@ genX(blorp_exec)(struct blorp_batch *batch,
blorp_exec(batch, params);
cmd_buffer->state.vb_dirty = ~0;
cmd_buffer->state.dirty = ~0;
cmd_buffer->state.gfx.vb_dirty = ~0;
cmd_buffer->state.gfx.dirty = ~0;
cmd_buffer->state.push_constants_dirty = ~0;
}

@@ -1002,12 +1002,56 @@ genX(BeginCommandBuffer)(
}
}
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_RENDER_TARGETS;
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_RENDER_TARGETS;
}
return result;
}
/**
* From the PRM, Volume 2a:
*
* "Indirect State Pointers Disable
*
* At the completion of the post-sync operation associated with this pipe
* control packet, the indirect state pointers in the hardware are
* considered invalid; the indirect pointers are not saved in the context.
* If any new indirect state commands are executed in the command stream
* while the pipe control is pending, the new indirect state commands are
* preserved.
*
* [DevIVB+]: Using Invalidate State Pointer (ISP) only inhibits context
* restoring of Push Constant (3DSTATE_CONSTANT_*) commands. Push Constant
* commands are only considered as Indirect State Pointers. Once ISP is
* issued in a context, SW must initialize by programming push constant
* commands for all the shaders (at least to zero length) before attempting
* any rendering operation for the same context."
*
* 3DSTATE_CONSTANT_* packets are restored during a context restore,
* even though they point to a BO that has already been unreferenced at
* the end of the previous batch buffer. This has been fine so far since
* we are protected by the scratch page (every address not covered by
* a BO should be pointing to the scratch page). But on CNL, it is
* causing a GPU hang during context restore at the 3DSTATE_CONSTANT_*
* instruction.
*
* The flag "Indirect State Pointers Disable" in PIPE_CONTROL tells the
* hardware to ignore previous 3DSTATE_CONSTANT_* packets during a
* context restore, so the mentioned hang doesn't happen. However,
* software must program push constant commands for all stages prior to
* rendering anything, so we flag them as dirty.
*/
static void
emit_isp_disable(struct anv_cmd_buffer *cmd_buffer)
{
anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
pc.IndirectStatePointersDisable = true;
pc.PostSyncOperation = WriteImmediateData;
pc.Address =
(struct anv_address) { &cmd_buffer->device->workaround_bo, 0 };
}
}
VkResult
genX(EndCommandBuffer)(
VkCommandBuffer commandBuffer)
@@ -1024,6 +1068,9 @@ genX(EndCommandBuffer)(
genX(cmd_buffer_apply_pipe_flushes)(cmd_buffer);
if (GEN_GEN == 10)
emit_isp_disable(cmd_buffer);
anv_cmd_buffer_end_batch_buffer(cmd_buffer);
return VK_SUCCESS;
@@ -1398,7 +1445,8 @@ void genX(CmdPipelineBarrier)(
static void
cmd_buffer_alloc_push_constants(struct anv_cmd_buffer *cmd_buffer)
{
VkShaderStageFlags stages = cmd_buffer->state.pipeline->active_stages;
VkShaderStageFlags stages =
cmd_buffer->state.gfx.base.pipeline->active_stages;
/* In order to avoid thrash, we assume that vertex and fragment stages
* always exist. In the rare case where one is missing *and* the other
@@ -1462,32 +1510,32 @@ cmd_buffer_alloc_push_constants(struct anv_cmd_buffer *cmd_buffer)
}
static const struct anv_descriptor *
anv_descriptor_for_binding(const struct anv_cmd_buffer *cmd_buffer,
anv_descriptor_for_binding(const struct anv_cmd_pipeline_state *pipe_state,
const struct anv_pipeline_binding *binding)
{
assert(binding->set < MAX_SETS);
const struct anv_descriptor_set *set =
cmd_buffer->state.descriptors[binding->set];
pipe_state->descriptors[binding->set];
const uint32_t offset =
set->layout->binding[binding->binding].descriptor_index;
return &set->descriptors[offset + binding->index];
}
static uint32_t
dynamic_offset_for_binding(const struct anv_cmd_buffer *cmd_buffer,
dynamic_offset_for_binding(const struct anv_cmd_pipeline_state *pipe_state,
const struct anv_pipeline *pipeline,
const struct anv_pipeline_binding *binding)
{
assert(binding->set < MAX_SETS);
const struct anv_descriptor_set *set =
cmd_buffer->state.descriptors[binding->set];
pipe_state->descriptors[binding->set];
uint32_t dynamic_offset_idx =
pipeline->layout->set[binding->set].dynamic_offset_start +
set->layout->binding[binding->binding].dynamic_offset_index +
binding->index;
return cmd_buffer->state.dynamic_offsets[dynamic_offset_idx];
return pipe_state->dynamic_offsets[dynamic_offset_idx];
}
static VkResult
@@ -1496,19 +1544,21 @@ emit_binding_table(struct anv_cmd_buffer *cmd_buffer,
struct anv_state *bt_state)
{
struct anv_subpass *subpass = cmd_buffer->state.subpass;
struct anv_cmd_pipeline_state *pipe_state;
struct anv_pipeline *pipeline;
uint32_t bias, state_offset;
switch (stage) {
case MESA_SHADER_COMPUTE:
pipeline = cmd_buffer->state.compute_pipeline;
pipe_state = &cmd_buffer->state.compute.base;
bias = 1;
break;
default:
pipeline = cmd_buffer->state.pipeline;
pipe_state = &cmd_buffer->state.gfx.base;
bias = 0;
break;
}
pipeline = pipe_state->pipeline;
if (!anv_pipeline_has_stage(pipeline, stage)) {
*bt_state = (struct anv_state) { 0, };
@@ -1530,9 +1580,9 @@ emit_binding_table(struct anv_cmd_buffer *cmd_buffer,
return VK_ERROR_OUT_OF_DEVICE_MEMORY;
if (stage == MESA_SHADER_COMPUTE &&
get_cs_prog_data(cmd_buffer->state.compute_pipeline)->uses_num_work_groups) {
struct anv_bo *bo = cmd_buffer->state.num_workgroups_bo;
uint32_t bo_offset = cmd_buffer->state.num_workgroups_offset;
get_cs_prog_data(pipeline)->uses_num_work_groups) {
struct anv_bo *bo = cmd_buffer->state.compute.num_workgroups.bo;
uint32_t bo_offset = cmd_buffer->state.compute.num_workgroups.offset;
struct anv_state surface_state;
surface_state =
@@ -1593,7 +1643,7 @@ emit_binding_table(struct anv_cmd_buffer *cmd_buffer,
}
const struct anv_descriptor *desc =
anv_descriptor_for_binding(cmd_buffer, binding);
anv_descriptor_for_binding(pipe_state, binding);
switch (desc->type) {
case VK_DESCRIPTOR_TYPE_SAMPLER:
@@ -1669,7 +1719,7 @@ emit_binding_table(struct anv_cmd_buffer *cmd_buffer,
case VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC: {
/* Compute the offset within the buffer */
uint32_t dynamic_offset =
dynamic_offset_for_binding(cmd_buffer, pipeline, binding);
dynamic_offset_for_binding(pipe_state, pipeline, binding);
uint64_t offset = desc->offset + dynamic_offset;
/* Clamp to the buffer size */
offset = MIN2(offset, desc->buffer->size);
@@ -1725,12 +1775,10 @@ emit_samplers(struct anv_cmd_buffer *cmd_buffer,
gl_shader_stage stage,
struct anv_state *state)
{
struct anv_pipeline *pipeline;
if (stage == MESA_SHADER_COMPUTE)
pipeline = cmd_buffer->state.compute_pipeline;
else
pipeline = cmd_buffer->state.pipeline;
struct anv_cmd_pipeline_state *pipe_state =
stage == MESA_SHADER_COMPUTE ? &cmd_buffer->state.compute.base :
&cmd_buffer->state.gfx.base;
struct anv_pipeline *pipeline = pipe_state->pipeline;
if (!anv_pipeline_has_stage(pipeline, stage)) {
*state = (struct anv_state) { 0, };
@@ -1751,10 +1799,8 @@ emit_samplers(struct anv_cmd_buffer *cmd_buffer,
for (uint32_t s = 0; s < map->sampler_count; s++) {
struct anv_pipeline_binding *binding = &map->sampler_to_descriptor[s];
struct anv_descriptor_set *set =
cmd_buffer->state.descriptors[binding->set];
uint32_t offset = set->layout->binding[binding->binding].descriptor_index;
struct anv_descriptor *desc = &set->descriptors[offset + binding->index];
const struct anv_descriptor *desc =
anv_descriptor_for_binding(pipe_state, binding);
if (desc->type != VK_DESCRIPTOR_TYPE_SAMPLER &&
desc->type != VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER)
@@ -1780,8 +1826,10 @@ emit_samplers(struct anv_cmd_buffer *cmd_buffer,
static uint32_t
flush_descriptor_sets(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
VkShaderStageFlags dirty = cmd_buffer->state.descriptors_dirty &
cmd_buffer->state.pipeline->active_stages;
pipeline->active_stages;
VkResult result = VK_SUCCESS;
anv_foreach_stage(s, dirty) {
@@ -1807,7 +1855,7 @@ flush_descriptor_sets(struct anv_cmd_buffer *cmd_buffer)
genX(cmd_buffer_emit_state_base_address)(cmd_buffer);
/* Re-emit all active binding tables */
dirty |= cmd_buffer->state.pipeline->active_stages;
dirty |= pipeline->active_stages;
anv_foreach_stage(s, dirty) {
result = emit_samplers(cmd_buffer, s, &cmd_buffer->state.samplers[s]);
if (result != VK_SUCCESS) {
@@ -1876,7 +1924,8 @@ static void
cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
VkShaderStageFlags dirty_stages)
{
UNUSED const struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
const struct anv_cmd_graphics_state *gfx_state = &cmd_buffer->state.gfx;
const struct anv_pipeline *pipeline = gfx_state->base.pipeline;
static const uint32_t push_constant_opcodes[] = {
[MESA_SHADER_VERTEX] = 21,
@@ -1896,7 +1945,7 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CONSTANT_VS), c) {
c._3DCommandSubOpcode = push_constant_opcodes[stage];
if (anv_pipeline_has_stage(cmd_buffer->state.pipeline, stage)) {
if (anv_pipeline_has_stage(pipeline, stage)) {
#if GEN_GEN >= 8 || GEN_IS_HASWELL
const struct brw_stage_prog_data *prog_data =
pipeline->shaders[stage]->prog_data;
@@ -1929,7 +1978,7 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
&bind_map->surface_to_descriptor[surface];
const struct anv_descriptor *desc =
anv_descriptor_for_binding(cmd_buffer, binding);
anv_descriptor_for_binding(&gfx_state->base, binding);
struct anv_address read_addr;
uint32_t read_len;
@@ -1945,7 +1994,8 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
assert(desc->type == VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC);
uint32_t dynamic_offset =
dynamic_offset_for_binding(cmd_buffer, pipeline, binding);
dynamic_offset_for_binding(&gfx_state->base,
pipeline, binding);
uint32_t buf_offset =
MIN2(desc->offset + dynamic_offset, desc->buffer->size);
uint32_t buf_range =
@@ -2005,10 +2055,10 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
void
genX(cmd_buffer_flush_state)(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
uint32_t *p;
uint32_t vb_emit = cmd_buffer->state.vb_dirty & pipeline->vb_used;
uint32_t vb_emit = cmd_buffer->state.gfx.vb_dirty & pipeline->vb_used;
assert((pipeline->active_stages & VK_SHADER_STAGE_COMPUTE_BIT) == 0);
@@ -2059,16 +2109,15 @@ genX(cmd_buffer_flush_state)(struct anv_cmd_buffer *cmd_buffer)
}
}
cmd_buffer->state.vb_dirty &= ~vb_emit;
cmd_buffer->state.gfx.vb_dirty &= ~vb_emit;
if (cmd_buffer->state.dirty & ANV_CMD_DIRTY_PIPELINE) {
if (cmd_buffer->state.gfx.dirty & ANV_CMD_DIRTY_PIPELINE) {
anv_batch_emit_batch(&cmd_buffer->batch, &pipeline->batch);
/* The exact descriptor layout is pulled from the pipeline, so we need
* to re-emit binding tables on every pipeline change.
*/
cmd_buffer->state.descriptors_dirty |=
cmd_buffer->state.pipeline->active_stages;
cmd_buffer->state.descriptors_dirty |= pipeline->active_stages;
/* If the pipeline changed, we may need to re-allocate push constant
* space in the URB.
@@ -2099,7 +2148,7 @@ genX(cmd_buffer_flush_state)(struct anv_cmd_buffer *cmd_buffer)
#endif
/* Render targets live in the same binding table as fragment descriptors */
if (cmd_buffer->state.dirty & ANV_CMD_DIRTY_RENDER_TARGETS)
if (cmd_buffer->state.gfx.dirty & ANV_CMD_DIRTY_RENDER_TARGETS)
cmd_buffer->state.descriptors_dirty |= VK_SHADER_STAGE_FRAGMENT_BIT;
/* We emit the binding tables and sampler tables first, then emit push
@@ -2125,16 +2174,16 @@ genX(cmd_buffer_flush_state)(struct anv_cmd_buffer *cmd_buffer)
if (dirty)
cmd_buffer_emit_descriptor_pointers(cmd_buffer, dirty);
if (cmd_buffer->state.dirty & ANV_CMD_DIRTY_DYNAMIC_VIEWPORT)
if (cmd_buffer->state.gfx.dirty & ANV_CMD_DIRTY_DYNAMIC_VIEWPORT)
gen8_cmd_buffer_emit_viewport(cmd_buffer);
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_DYNAMIC_VIEWPORT |
if (cmd_buffer->state.gfx.dirty & (ANV_CMD_DIRTY_DYNAMIC_VIEWPORT |
ANV_CMD_DIRTY_PIPELINE)) {
gen8_cmd_buffer_emit_depth_viewport(cmd_buffer,
pipeline->depth_clamp_enable);
}
if (cmd_buffer->state.dirty & ANV_CMD_DIRTY_DYNAMIC_SCISSOR)
if (cmd_buffer->state.gfx.dirty & ANV_CMD_DIRTY_DYNAMIC_SCISSOR)
gen7_cmd_buffer_emit_scissor(cmd_buffer);
genX(cmd_buffer_flush_dynamic_state)(cmd_buffer);
@@ -2213,7 +2262,7 @@ void genX(CmdDraw)(
uint32_t firstInstance)
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
const struct brw_vs_prog_data *vs_prog_data = get_vs_prog_data(pipeline);
if (anv_batch_has_error(&cmd_buffer->batch))
@@ -2251,7 +2300,7 @@ void genX(CmdDrawIndexed)(
uint32_t firstInstance)
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
-   struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
+   struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
const struct brw_vs_prog_data *vs_prog_data = get_vs_prog_data(pipeline);
if (anv_batch_has_error(&cmd_buffer->batch))
@@ -2403,7 +2452,7 @@ void genX(CmdDrawIndirect)(
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
ANV_FROM_HANDLE(anv_buffer, buffer, _buffer);
-   struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
+   struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
const struct brw_vs_prog_data *vs_prog_data = get_vs_prog_data(pipeline);
if (anv_batch_has_error(&cmd_buffer->batch))
@@ -2441,7 +2490,7 @@ void genX(CmdDrawIndexedIndirect)(
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
ANV_FROM_HANDLE(anv_buffer, buffer, _buffer);
-   struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
+   struct anv_pipeline *pipeline = cmd_buffer->state.gfx.base.pipeline;
const struct brw_vs_prog_data *vs_prog_data = get_vs_prog_data(pipeline);
if (anv_batch_has_error(&cmd_buffer->batch))
@@ -2474,7 +2523,7 @@ void genX(CmdDrawIndexedIndirect)(
static VkResult
flush_compute_descriptor_set(struct anv_cmd_buffer *cmd_buffer)
{
-   struct anv_pipeline *pipeline = cmd_buffer->state.compute_pipeline;
+   struct anv_pipeline *pipeline = cmd_buffer->state.compute.base.pipeline;
struct anv_state surfaces = { 0, }, samplers = { 0, };
VkResult result;
@@ -2530,7 +2579,7 @@ flush_compute_descriptor_set(struct anv_cmd_buffer *cmd_buffer)
void
genX(cmd_buffer_flush_compute_state)(struct anv_cmd_buffer *cmd_buffer)
{
-   struct anv_pipeline *pipeline = cmd_buffer->state.compute_pipeline;
+   struct anv_pipeline *pipeline = cmd_buffer->state.compute.base.pipeline;
MAYBE_UNUSED VkResult result;
assert(pipeline->active_stages == VK_SHADER_STAGE_COMPUTE_BIT);
@@ -2539,7 +2588,7 @@ genX(cmd_buffer_flush_compute_state)(struct anv_cmd_buffer *cmd_buffer)
genX(flush_pipeline_select_gpgpu)(cmd_buffer);
-   if (cmd_buffer->state.compute_dirty & ANV_CMD_DIRTY_PIPELINE) {
+   if (cmd_buffer->state.compute.pipeline_dirty) {
/* From the Sky Lake PRM Vol 2a, MEDIA_VFE_STATE:
*
* "A stalling PIPE_CONTROL is required before MEDIA_VFE_STATE unless
@@ -2555,7 +2604,7 @@ genX(cmd_buffer_flush_compute_state)(struct anv_cmd_buffer *cmd_buffer)
}
if ((cmd_buffer->state.descriptors_dirty & VK_SHADER_STAGE_COMPUTE_BIT) ||
-       (cmd_buffer->state.compute_dirty & ANV_CMD_DIRTY_PIPELINE)) {
+       cmd_buffer->state.compute.pipeline_dirty) {
/* FIXME: figure out descriptors for gen7 */
result = flush_compute_descriptor_set(cmd_buffer);
if (result != VK_SUCCESS)
@@ -2576,7 +2625,7 @@ genX(cmd_buffer_flush_compute_state)(struct anv_cmd_buffer *cmd_buffer)
}
}
-   cmd_buffer->state.compute_dirty = 0;
+   cmd_buffer->state.compute.pipeline_dirty = false;
genX(cmd_buffer_apply_pipe_flushes)(cmd_buffer);
}
@@ -2607,7 +2656,7 @@ void genX(CmdDispatch)(
uint32_t z)
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
-   struct anv_pipeline *pipeline = cmd_buffer->state.compute_pipeline;
+   struct anv_pipeline *pipeline = cmd_buffer->state.compute.base.pipeline;
const struct brw_cs_prog_data *prog_data = get_cs_prog_data(pipeline);
if (anv_batch_has_error(&cmd_buffer->batch))
@@ -2621,9 +2670,10 @@ void genX(CmdDispatch)(
sizes[1] = y;
sizes[2] = z;
anv_state_flush(cmd_buffer->device, state);
-      cmd_buffer->state.num_workgroups_offset = state.offset;
-      cmd_buffer->state.num_workgroups_bo =
-         &cmd_buffer->device->dynamic_state_pool.block_pool.bo;
+      cmd_buffer->state.compute.num_workgroups = (struct anv_address) {
+         .bo = &cmd_buffer->device->dynamic_state_pool.block_pool.bo,
+         .offset = state.offset,
+      };
}
genX(cmd_buffer_flush_compute_state)(cmd_buffer);
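The hunks above replace the loose num_workgroups_bo/num_workgroups_offset pair with a single struct anv_address value. A minimal standalone sketch of the same pattern, using stand-in types rather than the real anv definitions:

/* Hypothetical sketch: grouping a buffer object and an offset into one
 * address value, as the num_workgroups change does with struct
 * anv_address, instead of carrying two loose fields. */
#include <stdint.h>
#include <stdio.h>

struct fake_bo { uint64_t gpu_va; };   /* stand-in for struct anv_bo */

struct fake_address {                  /* stand-in for struct anv_address */
   struct fake_bo *bo;
   uint32_t        offset;
};

int main(void)
{
   struct fake_bo bo = { .gpu_va = 0x100000 };

   /* One assignment replaces two independent stores, so the bo and the
    * offset can never get out of sync. */
   struct fake_address num_workgroups = (struct fake_address) {
      .bo     = &bo,
      .offset = 256,
   };

   printf("resolved VA: 0x%llx\n",
          (unsigned long long)(num_workgroups.bo->gpu_va +
                               num_workgroups.offset));
   return 0;
}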
@@ -2654,7 +2704,7 @@ void genX(CmdDispatchIndirect)(
{
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
ANV_FROM_HANDLE(anv_buffer, buffer, _buffer);
-   struct anv_pipeline *pipeline = cmd_buffer->state.compute_pipeline;
+   struct anv_pipeline *pipeline = cmd_buffer->state.compute.base.pipeline;
const struct brw_cs_prog_data *prog_data = get_cs_prog_data(pipeline);
struct anv_bo *bo = buffer->bo;
uint32_t bo_offset = buffer->offset + offset;
@@ -2670,8 +2720,10 @@ void genX(CmdDispatchIndirect)(
#endif
if (prog_data->uses_num_work_groups) {
-      cmd_buffer->state.num_workgroups_offset = bo_offset;
-      cmd_buffer->state.num_workgroups_bo = bo;
+      cmd_buffer->state.compute.num_workgroups = (struct anv_address) {
+         .bo = bo,
+         .offset = bo_offset,
+      };
}
genX(cmd_buffer_flush_compute_state)(cmd_buffer);
@@ -3138,7 +3190,7 @@ genX(cmd_buffer_set_subpass)(struct anv_cmd_buffer *cmd_buffer,
{
cmd_buffer->state.subpass = subpass;
-   cmd_buffer->state.dirty |= ANV_CMD_DIRTY_RENDER_TARGETS;
+   cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_RENDER_TARGETS;
/* Our implementation of VK_KHR_multiview uses instancing to draw the
* different views. If the client asks for instancing, we need to use the
@@ -3148,7 +3200,7 @@ genX(cmd_buffer_set_subpass)(struct anv_cmd_buffer *cmd_buffer,
* of each subpass.
*/
if (GEN_GEN == 7)
-      cmd_buffer->state.vb_dirty |= ~0;
+      cmd_buffer->state.gfx.vb_dirty |= ~0;
/* Perform transitions to the subpass layout before any writes have
* occurred.

View File

@@ -272,5 +272,5 @@ genX(cmd_buffer_so_memcpy)(struct anv_cmd_buffer *cmd_buffer,
prim.BaseVertexLocation = 0;
}
-   cmd_buffer->state.dirty |= ANV_CMD_DIRTY_PIPELINE;
+   cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_PIPELINE;
}

View File

@@ -1081,7 +1081,13 @@ emit_3dstate_streamout(struct anv_pipeline *pipeline,
static uint32_t
get_sampler_count(const struct anv_shader_bin *bin)
{
-   return DIV_ROUND_UP(bin->bind_map.sampler_count, 4);
+   uint32_t count_by_4 = DIV_ROUND_UP(bin->bind_map.sampler_count, 4);
+   /* We can potentially have way more than 32 samplers and that's ok.
+    * However, the 3DSTATE_XS packets only have 3 bits to specify how
+    * many to pre-fetch and all values above 4 are marked reserved.
+    */
+   return MIN2(count_by_4, 4);
}
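A standalone sketch of the clamp introduced above, with Mesa's DIV_ROUND_UP/MIN2 helpers re-defined locally since they live in Mesa headers; the 3-bit 3DSTATE_XS prefetch field caps the count of 4-sampler groups at 4:

/* Sketch of the prefetch clamp: counts are expressed in groups of four
 * samplers, and anything past four groups must still report four. */
#include <stdint.h>
#include <stdio.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define MIN2(a, b)         ((a) < (b) ? (a) : (b))

static uint32_t prefetch_sampler_count(uint32_t sampler_count)
{
   uint32_t count_by_4 = DIV_ROUND_UP(sampler_count, 4);
   return MIN2(count_by_4, 4);
}

int main(void)
{
   /* 0..16 samplers map to 0..4; anything above 16 still reports 4. */
   const uint32_t counts[] = { 0, 1, 4, 5, 16, 17, 100 };
   for (unsigned i = 0; i < sizeof(counts) / sizeof(counts[0]); i++)
      printf("%3u samplers -> prefetch field %u\n",
             (unsigned)counts[i],
             (unsigned)prefetch_sampler_count(counts[i]));
   return 0;
}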
static uint32_t
@@ -1345,10 +1351,10 @@ has_color_buffer_write_enabled(const struct anv_pipeline *pipeline,
if (binding->set != ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS)
continue;
-      const VkPipelineColorBlendAttachmentState *a =
-         &blend->pAttachments[binding->index];
-      if (binding->index != UINT32_MAX && a->colorWriteMask != 0)
+      if (binding->index == UINT32_MAX)
+         continue;
+      if (blend->pAttachments[binding->index].colorWriteMask != 0)
return true;
}
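The fix above moves the UINT32_MAX sentinel test ahead of the pAttachments indexing. A minimal sketch of the reordered check, using stand-in types rather than the Vulkan/anv ones:

/* Guard-before-index sketch: test the sentinel before using it as an
 * array index, so the unused-attachment case never touches
 * attachments[UINT32_MAX]. */
#include <stdbool.h>
#include <stdint.h>

struct attachment { uint32_t colorWriteMask; };

static bool
any_write_enabled(const struct attachment *attachments,
                  const uint32_t *indices, unsigned count)
{
   for (unsigned i = 0; i < count; i++) {
      /* Guard first: UINT32_MAX marks "no attachment bound". */
      if (indices[i] == UINT32_MAX)
         continue;
      if (attachments[indices[i]].colorWriteMask != 0)
         return true;
   }
   return false;
}

int main(void)
{
   const struct attachment atts[2] = { { 0 }, { 0xf } };
   const uint32_t indices[3] = { UINT32_MAX, 0, 1 };
   return any_write_enabled(atts, indices, 3) ? 0 : 1;
}

Even computing &blend->pAttachments[binding->index] with index UINT32_MAX, as the old code did, is out-of-bounds pointer arithmetic, so the guard has to come first.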

View File

@@ -409,20 +409,6 @@ void genX(CmdBeginQuery)(
ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
ANV_FROM_HANDLE(anv_query_pool, pool, queryPool);
-   /* Workaround: When meta uses the pipeline with the VS disabled, it seems
-    * that the pipelining of the depth write breaks. What we see is that
-    * samples from the render pass clear leaks into the first query
-    * immediately after the clear. Doing a pipecontrol with a post-sync
-    * operation and DepthStallEnable seems to work around the issue.
-    */
-   if (cmd_buffer->state.need_query_wa) {
-      cmd_buffer->state.need_query_wa = false;
-      anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
-         pc.DepthCacheFlushEnable = true;
-         pc.DepthStallEnable = true;
-      }
-   }
switch (pool->type) {
case VK_QUERY_TYPE_OCCLUSION:
emit_ps_depth_count(cmd_buffer, &pool->bo, query * pool->stride + 8);

View File

@@ -105,7 +105,6 @@ BUILT_SOURCES = $(i965_oa_GENERATED_FILES)
CLEANFILES = $(BUILT_SOURCES)
EXTRA_DIST = \
-	meson.build \
brw_oa_hsw.xml \
brw_oa_bdw.xml \
brw_oa_chv.xml \
@@ -118,7 +117,8 @@ EXTRA_DIST = \
brw_oa_glk.xml \
brw_oa_cflgt2.xml \
brw_oa_cflgt3.xml \
-	brw_oa.py
+	brw_oa.py \
+	meson.build
# Note: we avoid using a multi target rule here and outputting both the
# .c and .h files in one go so we don't hit problems with parallel

View File

@@ -320,7 +320,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
enum isl_format dst_isl_format =
brw_blorp_to_isl_format(brw, dst_format, true);
enum isl_aux_usage dst_aux_usage =
-      intel_miptree_render_aux_usage(brw, dst_mt, dst_isl_format, false);
+      intel_miptree_render_aux_usage(brw, dst_mt, dst_isl_format,
+                                     false, false);
const bool dst_clear_supported = dst_aux_usage != ISL_AUX_USAGE_NONE;
intel_miptree_prepare_access(brw, dst_mt, dst_level, 1, dst_layer, 1,
dst_aux_usage, dst_clear_supported);
@@ -1267,9 +1268,10 @@ do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
irb->mt, irb->mt_level, irb->mt_layer, num_layers);
enum isl_aux_usage aux_usage =
-      intel_miptree_render_aux_usage(brw, irb->mt, isl_format, false);
+      intel_miptree_render_aux_usage(brw, irb->mt, isl_format,
+                                     false, false);
intel_miptree_prepare_render(brw, irb->mt, level, irb->mt_layer,
-                                num_layers, isl_format, false);
+                                num_layers, aux_usage);
struct isl_surf isl_tmp[2];
struct blorp_surf surf;
@@ -1288,7 +1290,7 @@ do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
blorp_batch_finish(&batch);
intel_miptree_finish_render(brw, irb->mt, level, irb->mt_layer,
-                               num_layers, isl_format, false);
+                               num_layers, aux_usage);
}
return;

View File

@@ -177,7 +177,7 @@ brw_dispatch_compute_common(struct gl_context *ctx)
brw_validate_textures(brw);
-   brw_predraw_resolve_inputs(brw, false);
+   brw_predraw_resolve_inputs(brw, false, NULL);
/* Flush the batch if the batch/state buffers are nearly full. We can
* grow them if needed, but this is not free, so we'd like to avoid it.

View File

@@ -1290,15 +1290,11 @@ struct brw_context
struct brw_fast_clear_state *fast_clear_state;
-   /* Array of flags telling if auxiliary buffer is disabled for corresponding
-    * renderbuffer. If draw_aux_buffer_disabled[i] is set then use of
-    * auxiliary buffer for gl_framebuffer::_ColorDrawBuffers[i] is
-    * disabled.
-    * This is needed in case the same underlying buffer is also configured
-    * to be sampled but with a format that the sampling engine can't treat
-    * compressed or fast cleared.
+   /* Array of aux usages to use for drawing. Aux usage for render targets is
+    * a bit more complex than simply calling a single function so we need some
+    * way of passing it from brw_draw.c to surface state setup.
     */
-   bool draw_aux_buffer_disabled[MAX_DRAW_BUFFERS];
+   enum isl_aux_usage draw_aux_usage[MAX_DRAW_BUFFERS];
__DRIcontext *driContext;
struct intel_screen *screen;
@@ -1324,7 +1320,8 @@ void intel_update_renderbuffers(__DRIcontext *context,
__DRIdrawable *drawable);
void intel_prepare_render(struct brw_context *brw);
-void brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering);
+void brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering,
+                                bool *draw_aux_buffer_disabled);
void intel_resolve_for_dri2_flush(struct brw_context *brw,
__DRIdrawable *drawable);

View File

@@ -341,6 +341,7 @@ brw_merge_inputs(struct brw_context *brw,
*/
static bool
intel_disable_rb_aux_buffer(struct brw_context *brw,
+                            bool *draw_aux_buffer_disabled,
struct intel_mipmap_tree *tex_mt,
unsigned min_level, unsigned num_levels,
const char *usage)
@@ -360,7 +361,7 @@ intel_disable_rb_aux_buffer(struct brw_context *brw,
if (irb && irb->mt->bo == tex_mt->bo &&
irb->mt_level >= min_level &&
irb->mt_level < min_level + num_levels) {
-         found = brw->draw_aux_buffer_disabled[i] = true;
+         found = draw_aux_buffer_disabled[i] = true;
}
}
@@ -393,14 +394,12 @@ mark_textures_used_for_txf(BITSET_WORD *used_for_txf,
* enabled depth texture, and flush the render cache for any dirty textures.
*/
void
-brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering)
+brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering,
+                           bool *draw_aux_buffer_disabled)
{
struct gl_context *ctx = &brw->ctx;
struct intel_texture_object *tex_obj;
-   memset(brw->draw_aux_buffer_disabled, 0,
-          sizeof(brw->draw_aux_buffer_disabled));
BITSET_DECLARE(used_for_txf, MAX_COMBINED_TEXTURE_IMAGE_UNITS);
memset(used_for_txf, 0, sizeof(used_for_txf));
if (rendering) {
@@ -441,7 +440,8 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering)
}
const bool disable_aux = rendering &&
-      intel_disable_rb_aux_buffer(brw, tex_obj->mt, min_level, num_levels,
-                                  "for sampling");
+      intel_disable_rb_aux_buffer(brw, draw_aux_buffer_disabled,
+                                  tex_obj->mt, min_level, num_levels,
+                                  "for sampling");
intel_miptree_prepare_texture(brw, tex_obj->mt, view_format,
@@ -482,8 +482,11 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering)
tex_obj = intel_texture_object(u->TexObj);
if (tex_obj && tex_obj->mt) {
-         intel_disable_rb_aux_buffer(brw, tex_obj->mt, 0, ~0,
-                                     "as a shader image");
+         if (rendering) {
+            intel_disable_rb_aux_buffer(brw, draw_aux_buffer_disabled,
+                                        tex_obj->mt, 0, ~0,
+                                        "as a shader image");
+         }
intel_miptree_prepare_image(brw, tex_obj->mt);
@@ -495,7 +498,8 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering)
}
static void
-brw_predraw_resolve_framebuffer(struct brw_context *brw)
+brw_predraw_resolve_framebuffer(struct brw_context *brw,
+                                bool *draw_aux_buffer_disabled)
{
struct gl_context *ctx = &brw->ctx;
struct intel_renderbuffer *depth_irb;
@@ -547,11 +551,16 @@ brw_predraw_resolve_framebuffer(struct brw_context *brw)
bool blend_enabled = ctx->Color.BlendEnabled & (1 << i);
enum isl_aux_usage aux_usage =
intel_miptree_render_aux_usage(brw, irb->mt, isl_format,
-                                        blend_enabled);
+                                        blend_enabled,
+                                        draw_aux_buffer_disabled[i]);
+      if (brw->draw_aux_usage[i] != aux_usage) {
+         brw->ctx.NewDriverState |= BRW_NEW_AUX_STATE;
+         brw->draw_aux_usage[i] = aux_usage;
+      }
intel_miptree_prepare_render(brw, irb->mt, irb->mt_level,
irb->mt_layer, irb->layer_count,
-                                   isl_format, blend_enabled);
+                                   aux_usage);
brw_cache_flush_for_render(brw, irb->mt->bo,
isl_format, aux_usage);
@@ -620,16 +629,13 @@ brw_postdraw_set_buffers_need_resolve(struct brw_context *brw)
mesa_format mesa_format =
_mesa_get_render_format(ctx, intel_rb_format(irb));
enum isl_format isl_format = brw_isl_format_for_mesa_format(mesa_format);
-      bool blend_enabled = ctx->Color.BlendEnabled & (1 << i);
-      enum isl_aux_usage aux_usage =
-         intel_miptree_render_aux_usage(brw, irb->mt, isl_format,
-                                        blend_enabled);
+      enum isl_aux_usage aux_usage = brw->draw_aux_usage[i];
brw_render_cache_add_bo(brw, irb->mt->bo, isl_format, aux_usage);
intel_miptree_finish_render(brw, irb->mt, irb->mt_level,
irb->mt_layer, irb->layer_count,
-                                  isl_format, blend_enabled);
+                                  aux_usage);
}
}
@@ -732,8 +738,9 @@ brw_prepare_drawing(struct gl_context *ctx,
* and finalizing textures but before setting up any hardware state for
* this draw call.
*/
-   brw_predraw_resolve_inputs(brw, true);
-   brw_predraw_resolve_framebuffer(brw);
+   bool draw_aux_buffer_disabled[MAX_DRAW_BUFFERS] = { };
+   brw_predraw_resolve_inputs(brw, true, draw_aux_buffer_disabled);
+   brw_predraw_resolve_framebuffer(brw, draw_aux_buffer_disabled);
/* Bind all inputs, derive varying and size information:
*/
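A minimal sketch of the refactor applied across this file: the per-drawbuffer disable flags become a caller-owned array scoped to one draw call and threaded through the resolve helpers, instead of mutable state on brw_context (function names here are illustrative stand-ins):

#include <stdbool.h>

#define MAX_DRAW_BUFFERS 8

/* The callee records its findings in the array handed to it instead of
 * writing into a field on the context. */
static void resolve_inputs(bool *draw_aux_buffer_disabled)
{
   draw_aux_buffer_disabled[0] = true;   /* e.g. RT 0 is also sampled */
}

static void resolve_framebuffer(const bool *draw_aux_buffer_disabled)
{
   (void)draw_aux_buffer_disabled;       /* consumes the computed flags */
}

int main(void)
{
   /* The flags now live exactly as long as one draw-call setup, so no
    * stale state can leak from a previous draw. */
   bool draw_aux_buffer_disabled[MAX_DRAW_BUFFERS] = { false };
   resolve_inputs(draw_aux_buffer_disabled);
   resolve_framebuffer(draw_aux_buffer_disabled);
   return 0;
}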

View File

@@ -317,6 +317,55 @@ gen7_emit_vs_workaround_flush(struct brw_context *brw)
brw->workaround_bo, 0, 0);
}
+/**
+ * From the PRM, Volume 2a:
+ *
+ *    "Indirect State Pointers Disable
+ *
+ *    At the completion of the post-sync operation associated with this pipe
+ *    control packet, the indirect state pointers in the hardware are
+ *    considered invalid; the indirect pointers are not saved in the context.
+ *    If any new indirect state commands are executed in the command stream
+ *    while the pipe control is pending, the new indirect state commands are
+ *    preserved.
+ *
+ *    [DevIVB+]: Using Invalidate State Pointer (ISP) only inhibits context
+ *    restoring of Push Constant (3DSTATE_CONSTANT_*) commands. Push Constant
+ *    commands are only considered as Indirect State Pointers. Once ISP is
+ *    issued in a context, SW must initialize by programming push constant
+ *    commands for all the shaders (at least to zero length) before attempting
+ *    any rendering operation for the same context."
+ *
+ * 3DSTATE_CONSTANT_* packets are restored during a context restore,
+ * even though they point to a BO that has been already unreferenced at
+ * the end of the previous batch buffer. This has been fine so far since
+ * we are protected by the scratch page (every address not covered by
+ * a BO should be pointing to the scratch page). But on CNL, it is
+ * causing a GPU hang during context restore at the 3DSTATE_CONSTANT_*
+ * instruction.
+ *
+ * The flag "Indirect State Pointers Disable" in PIPE_CONTROL tells the
+ * hardware to ignore previous 3DSTATE_CONSTANT_* packets during a
+ * context restore, so the mentioned hang doesn't happen. However,
+ * software must program push constant commands for all stages prior to
+ * rendering anything, so we flag them as dirty.
+ */
+void
+gen10_emit_isp_disable(struct brw_context *brw)
+{
+   const struct gen_device_info *devinfo = &brw->screen->devinfo;
+
+   brw_emit_pipe_control_write(brw,
+                               PIPE_CONTROL_ISP_DIS |
+                               PIPE_CONTROL_WRITE_IMMEDIATE,
+                               brw->workaround_bo, 0, 0);
+
+   brw->vs.base.push_constants_dirty = true;
+   brw->tcs.base.push_constants_dirty = true;
+   brw->tes.base.push_constants_dirty = true;
+   brw->gs.base.push_constants_dirty = true;
+   brw->wm.base.push_constants_dirty = true;
+}
/**
* Emit a PIPE_CONTROL command for gen7 with the CS Stall bit set.

View File

@@ -85,5 +85,6 @@ void brw_emit_post_sync_nonzero_flush(struct brw_context *brw);
void brw_emit_depth_stall_flushes(struct brw_context *brw);
void gen7_emit_vs_workaround_flush(struct brw_context *brw);
void gen7_emit_cs_stall_flush(struct brw_context *brw);
+void gen10_emit_isp_disable(struct brw_context *brw);
#endif

View File

@@ -229,11 +229,6 @@ gen6_update_renderbuffer_surface(struct brw_context *brw,
}
enum isl_format isl_format = brw->mesa_to_isl_render_format[rb_format];
-   enum isl_aux_usage aux_usage =
-      brw->draw_aux_buffer_disabled[unit] ? ISL_AUX_USAGE_NONE :
-      intel_miptree_render_aux_usage(brw, mt, isl_format,
-                                     ctx->Color.BlendEnabled & (1 << unit));
struct isl_view view = {
.format = isl_format,
.base_level = irb->mt_level - irb->mt->first_level,
@@ -245,7 +240,8 @@ gen6_update_renderbuffer_surface(struct brw_context *brw,
};
uint32_t offset;
-   brw_emit_surface_state(brw, mt, mt->target, view, aux_usage,
+   brw_emit_surface_state(brw, mt, mt->target, view,
+                          brw->draw_aux_usage[unit],
&offset, surf_index,
RELOC_WRITE);
return offset;
@@ -441,25 +437,7 @@ swizzle_to_scs(GLenum swizzle, bool need_green_to_blue)
return (need_green_to_blue && scs == HSW_SCS_GREEN) ? HSW_SCS_BLUE : scs;
}
-static bool
-brw_aux_surface_disabled(const struct brw_context *brw,
-                         const struct intel_mipmap_tree *mt)
-{
-   const struct gl_framebuffer *fb = brw->ctx.DrawBuffer;
-   for (unsigned i = 0; i < fb->_NumColorDrawBuffers; i++) {
-      const struct intel_renderbuffer *irb =
-         intel_renderbuffer(fb->_ColorDrawBuffers[i]);
-      if (irb && irb->mt == mt)
-         return brw->draw_aux_buffer_disabled[i];
-   }
-   return false;
-}
static void
brw_update_texture_surface(struct gl_context *ctx,
unsigned unit,
uint32_t *surf_offset,
bool for_gather,
@@ -588,9 +566,6 @@ brw_update_texture_surface(struct gl_context *ctx,
enum isl_aux_usage aux_usage =
intel_miptree_texture_aux_usage(brw, mt, format);
-   if (brw_aux_surface_disabled(brw, mt))
-      aux_usage = ISL_AUX_USAGE_NONE;
brw_emit_surface_state(brw, mt, mt->target, view, aux_usage,
surf_offset, surf_index,
0);
@@ -1069,7 +1044,7 @@ update_renderbuffer_read_surfaces(struct brw_context *brw)
enum isl_aux_usage aux_usage =
intel_miptree_texture_aux_usage(brw, irb->mt, format);
-      if (brw->draw_aux_buffer_disabled[i])
+      if (brw->draw_aux_usage[i] == ISL_AUX_USAGE_NONE)
aux_usage = ISL_AUX_USAGE_NONE;
brw_emit_surface_state(brw, irb->mt, target, view, aux_usage,

View File

@@ -764,6 +764,10 @@ brw_finish_batch(struct brw_context *brw)
brw_emit_pipe_control_flush(brw, PIPE_CONTROL_RENDER_TARGET_FLUSH |
PIPE_CONTROL_CS_STALL);
}
+   /* Do not restore push constant packets during context restore. */
+   if (devinfo->gen == 10)
+      gen10_emit_isp_disable(brw);
}
/* Emit MI_BATCH_BUFFER_END to finish our batch. Note that execbuf2

View File

@@ -2682,10 +2682,14 @@ enum isl_aux_usage
intel_miptree_render_aux_usage(struct brw_context *brw,
struct intel_mipmap_tree *mt,
enum isl_format render_format,
-                               bool blend_enabled)
+                               bool blend_enabled,
+                               bool draw_aux_disabled)
{
struct gen_device_info *devinfo = &brw->screen->devinfo;
+   if (draw_aux_disabled)
+      return ISL_AUX_USAGE_NONE;
switch (mt->aux_usage) {
case ISL_AUX_USAGE_MCS:
assert(mt->mcs_buf);
@@ -2724,11 +2728,8 @@ void
intel_miptree_prepare_render(struct brw_context *brw,
struct intel_mipmap_tree *mt, uint32_t level,
uint32_t start_layer, uint32_t layer_count,
-                             enum isl_format render_format,
-                             bool blend_enabled)
+                             enum isl_aux_usage aux_usage)
{
-   enum isl_aux_usage aux_usage =
-      intel_miptree_render_aux_usage(brw, mt, render_format, blend_enabled);
intel_miptree_prepare_access(brw, mt, level, 1, start_layer, layer_count,
aux_usage, aux_usage != ISL_AUX_USAGE_NONE);
}
@@ -2737,13 +2738,10 @@ void
intel_miptree_finish_render(struct brw_context *brw,
struct intel_mipmap_tree *mt, uint32_t level,
uint32_t start_layer, uint32_t layer_count,
-                            enum isl_format render_format,
-                            bool blend_enabled)
+                            enum isl_aux_usage aux_usage)
{
assert(_mesa_is_format_color_format(mt->format));
-   enum isl_aux_usage aux_usage =
-      intel_miptree_render_aux_usage(brw, mt, render_format, blend_enabled);
intel_miptree_finish_write(brw, mt, level, start_layer, layer_count,
aux_usage);
}

View File

@@ -652,19 +652,18 @@ enum isl_aux_usage
intel_miptree_render_aux_usage(struct brw_context *brw,
struct intel_mipmap_tree *mt,
enum isl_format render_format,
-                               bool blend_enabled);
+                               bool blend_enabled,
+                               bool draw_aux_disabled);
void
intel_miptree_prepare_render(struct brw_context *brw,
struct intel_mipmap_tree *mt, uint32_t level,
uint32_t start_layer, uint32_t layer_count,
-                             enum isl_format render_format,
-                             bool blend_enabled);
+                             enum isl_aux_usage aux_usage);
void
intel_miptree_finish_render(struct brw_context *brw,
struct intel_mipmap_tree *mt, uint32_t level,
uint32_t start_layer, uint32_t layer_count,
-                            enum isl_format render_format,
-                            bool blend_enabled);
+                            enum isl_aux_usage aux_usage);
void
intel_miptree_prepare_depth(struct brw_context *brw,
struct intel_mipmap_tree *mt, uint32_t level,

View File

@@ -330,12 +330,6 @@ r100CreateContext( gl_api api,
rmesa->radeon.do_usleeps = (fthrottle_mode == DRI_CONF_FTHROTTLE_USLEEPS);
#if DO_DEBUG
RADEON_DEBUG = parse_debug_string( getenv( "RADEON_DEBUG" ),
debug_control );
#endif
tcl_mode = driQueryOptioni(&rmesa->radeon.optionCache, "tcl_mode");
if (driQueryOptionb(&rmesa->radeon.optionCache, "no_rast")) {
fprintf(stderr, "disabling 3D acceleration\n");

View File

@@ -721,7 +721,7 @@ libmesa_gallium = static_library(
cpp_args : [cpp_vis_args, cpp_msvc_compat_args],
include_directories : [inc_common, include_directories('main')],
link_with : [libglsl, libmesa_sse41],
-  dependencies : idep_nir_headers,
+  dependencies : [idep_nir_headers, dep_vdpau],
build_by_default : false,
)

View File

@@ -757,8 +757,8 @@ st_init_driver_functions(struct pipe_screen *screen,
functions->UpdateState = st_invalidate_state;
functions->QueryMemoryInfo = st_query_memory_info;
functions->SetBackgroundContext = st_set_background_context;
-   functions->GetDriverUuid = st_get_device_uuid;
-   functions->GetDeviceUuid = st_get_driver_uuid;
+   functions->GetDriverUuid = st_get_driver_uuid;
+   functions->GetDeviceUuid = st_get_device_uuid;
/* GL_ARB_get_program_binary */
functions->GetProgramBinaryDriverSHA1 = st_get_program_binary_driver_sha1;

View File

@@ -74,7 +74,8 @@ struct vbo_save_vertex_list {
GLuint current_size;
GLuint buffer_offset; /**< in bytes */
-   GLuint vertex_count;
+   GLuint start_vertex;   /**< first vertex used by any primitive */
+   GLuint vertex_count;   /**< number of vertices in this list */
GLuint wrap_count; /* number of copied vertices at start */
GLboolean dangling_attr_ref; /* current attr implicitly referenced
outside the list */

View File

@@ -558,6 +558,9 @@ compile_vertex_list(struct gl_context *ctx)
for (unsigned i = 0; i < save->prim_count; i++) {
save->prims[i].start += start_offset;
}
+      node->start_vertex = start_offset;
+   } else {
+      node->start_vertex = 0;
}
/* Reset our structures for the next run of vertices:

View File

@@ -325,13 +325,14 @@ vbo_save_playback_vertex_list(struct gl_context *ctx, void *data)
_mesa_update_state(ctx);
if (node->vertex_count > 0) {
+      GLuint min_index = node->start_vertex;
+      GLuint max_index = min_index + node->vertex_count - 1;
vbo_context(ctx)->draw_prims(ctx,
node->prims,
node->prim_count,
NULL,
GL_TRUE,
-                                   0, /* Node is a VBO, so this is ok */
-                                   node->vertex_count - 1,
+                                   min_index, max_index,
NULL, 0, NULL);
}
}
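A worked sketch of the index-range fix above, with illustrative values; a display list whose vertices were appended at start_vertex into a shared VBO must report the biased range, not [0, vertex_count - 1]:

/* Worked example of the min/max index computation. Values are made up. */
#include <stdio.h>

int main(void)
{
   unsigned start_vertex = 120;   /* first vertex used by the list */
   unsigned vertex_count = 50;    /* vertices in the list */

   unsigned min_index = start_vertex;
   unsigned max_index = min_index + vertex_count - 1;

   /* The old code passed [0, vertex_count - 1] = [0, 49], missing the
    * range [120, 169] this node actually references. */
   printf("index range: [%u, %u]\n", min_index, max_index);
   return 0;
}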

View File

@@ -112,8 +112,12 @@ libxmlconfig = static_library(
files_xmlconfig,
include_directories : inc_common,
dependencies : [dep_expat, dep_m],
-  c_args : [c_msvc_compat_args, c_vis_args,
-            '-DSYSCONFDIR="@0@"'.format(get_option('sysconfdir'))],
+  c_args : [
+    c_msvc_compat_args, c_vis_args,
+    '-DSYSCONFDIR="@0@"'.format(
+      join_paths(get_option('prefix'), get_option('sysconfdir'))
+    ),
+  ],
build_by_default : false,
)
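A minimal C sketch (illustrative names, not the meson implementation) of the join_paths behavior this fix relies on; sysconfdir is typically given relative to the install prefix, so it must be anchored before being baked into SYSCONFDIR:

/* Sketch: a relative path is anchored at the prefix, an absolute one is
 * kept as-is, mirroring meson's join_paths() semantics. */
#include <stdio.h>

static void join_paths(char *out, size_t n,
                       const char *prefix, const char *path)
{
   if (path[0] == '/')                 /* already absolute: use as-is */
      snprintf(out, n, "%s", path);
   else                                /* relative: anchor at the prefix */
      snprintf(out, n, "%s/%s", prefix, path);
}

int main(void)
{
   char sysconfdir[256];
   join_paths(sysconfdir, sizeof(sysconfdir), "/usr/local", "etc");
   printf("SYSCONFDIR=\"%s\"\n", sysconfdir);   /* /usr/local/etc */
   return 0;
}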