VK_KHR_sampler_ycbcr_conversion is a dependency from the
VK_EXT_image_drm_format_modifier spec. panvk arch < 10 still
doesn't support it, so VK_EXT_image_drm_format_modifier should
not be exposed.
Otherwise, a Vulkan validation error is triggered for users of
VK_EXT_image_drm_format_modifier and it may cause applications
to fail to create a device.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34458>
This verifies that the requested allocation doesn't exceed the maximum
in cases where the size passed to `clSVMAlloc()` isn't a multiple of the
provided alignment. It also clamps the maximum allocation to `i32::MAX`,
which prevents overflowing `pipe_box`'s `width` field.
Both of these changes prevent possible undefined behavior on 32-bit
systems due to violation of `Layout` prerequisites.
v2: use safe layout creation for maintainability, add a few comments
v3: use Layout utils for aligned size calc, split out max alloc changes
v4: use `checked_compare()` for alloc/size comparison
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34166>
In order to enable higher MSAA modes, we're going to have to perform
some calculations on how to budget the (sometimes) limited tile-buffer
space.
Due to limited tilebuffer space, we need to prioritize a bit here.
First, we reserve space for 4x MSAA for all formats. Then we try to fit
8 color attachments into the tile-buffer. And then finally, we calculate
how many extra multi-sample buffers we can fit into the rest.
The reason we reserve 4x MSAA first, is that this is required by all
Vulkan versions. It also prevents us from regressing existing features.
Then we try to pick 8 color attachments next, because that's required by
Vulkan 1.4 as well as Vulkan Roadmap 2024 and D3D12. Vulkan Roadmap 2022
requires 7 as well.
This adds helpers that implements this, which can be used by both the
Gallium and the Vulkan driver. It's really benefitial if both of these
drivers prioritize the same way here.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33925>
We also have a budget for the tile size for depth-buffers. It's
currently hard to trigger issues with this than for color-buffers,
but this becomes important when we support larger MSAA counts.
We also need to take a bit of care for stencil-only attachments, because
they also count against a limit here. We really only care about the
sample counts here, because the stencil buffer budget is always a
quarter of the depth-buffer budget, and always uses a single byte per
sample.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33925>
There's two limitations we have to cater to:
1. The HW needs at least one render-target. We can disable write-back for
it, but it needs to allocate tile-buffer space for it.
2. The HW can't have "holes" in the render-targets.
In both of those cases, we already set up dummy RGBA8 UNORM as the format,
and disable write-back. But we forgot to take this into account when
calculating the tile buffer allocation.
This makes what we program the HW to do consistent, meaning we don't end
up smashing the tile-buffer space. We might be able to do something
better by adjusting how we program these buffers, but let's leave that
for later.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33925>
These GPUs had their tilebuffer sizes listed at twice their actual
values. While that still works, it ends up disabling pipelining in some
cases. This gives a significant performance hit, compared to using the
correct values.
But, it turns out to be hard or impossible to trigger at the moment, due
to the limited number of MSAA samples we support. Once that changes,
this is a lot easier to trigger, so let's fix it up.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33925>
This is an n-queen pattern, where no two values should be on the same
row or column. But this and the second to last element has the same y
component, and neither has the negative one.
Let's fix this up by setting the first value to the negative value. This
matches the D3D 16x sample pattern.
Fixes: a61fb62966 ("panfrost: Upload sample positions on device init")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33925>
Fixes two known issues:
- We did not lower invalid swizzles for IADD.v4s8, triggered in the CTS by
enabling uniformAndStorageBuffer8BitAccess and storageBuffer8BitAccess in
panvk.
- We did not lower invalid swizzles for IMUL.v4i8, triggered by
dEQP-VK.spirv_assembly.instruction.compute.mul_extended.(un)signed_8bit
on bifrost.
The old logic was missing several other instructions, so there may be
additional bugs that we don't know about.
There are no cases where the new behavior will keep swizzles that would
have been lowered previously, so this change should not introduce any
new bugs with valhall.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33416>
Primary reason to do this is to make codegen using the swizzle names in
bifrost/ISA.xml simpler. A secondary benefit is that dependent code can
now use the swizzle name that matches the context, making things a
little more readable.
We may want to consider giving widens separate values later, so that
va_lower_constants and bi_opt_constant_fold can fold them correctly, but
I don't know of current bugs caused by this.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33416>
Valhall supports all combinations of ftz/preserve denorm behavior
between FP16 and FP32 except FP16=ftz, FP32=preserve. Because of this,
we can't advertise independent denorm behavior.
Even with INDEPENDENCE_NONE, it is still possible for shaders to set
denorm behavior for one size and leave the other size unspecified.
Previously we were defaulting to preserve for any unspecified size, but
with FP16=ftz, we need to default unspecified FP32 to preserve.
When advertising INDEPENDENCE_NONE, the CTS checks that the
shaderDenormFlushToZeroFloat* and shaderDenormPreserveFloat* features
are equal for all sizes, so we need to advertise the same supported
denorm behavior for FP64 even though we don't support FP64 at all.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33660>