sim_state.perfcnt_total provides the total number of counters
supported by the underlying simulated platform and is what we
use when we create a perform to validate that the counters
requested are valid, so we should use this.
V3D_PERFCNT_NUM is a fixed enum value that is only valid for
V3D 4.2 at present and is not sufficiently large for all the
counters available in V3D 7.x.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28870>
Check how much smaller can the weight+bias buffers be with different
amount of bits to encode runs of zeroes and choose the smallest one.
This reduces the bandwidth considerably, which is at present the
bottleneck with useful models.
On a Libre Computer Alta AML-A311D-CC, I see these improvements:
MobileNetV1: 15.650ms -> 9.991ms
SSDLite MobileDet: 56.149ms -> 32.692ms
Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27513>
This allows us to use perf_debug_ctx() instead of perf_debug(), which
will help make things a bit cleaner down the line.
In order to do this, we also need to make sure we always have access to
the context, so let's also pass ctx to panfrost_should_linear_convert
while we're at it.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28693>
This patch introduces code to enforce the pages-long regioning
restrictions introduced by Xe2 that apply to sub-dword integer
datatypes (See BSpec page 56640). They impose a number of
restrictions on what the regioning parameters of a source can be
depending on the source and destination datatypes as well as the
alignment of the destination. The tricky cases are when the
destination stride is smaller than 32 bits and the source stride is at
least 32 bits, since such cases require the destination and source
offsets to be in agreement based on an equation determined by the
source and destination strides. The second source of instructions
with multiple sources is even more restricted, and due to the
existence of hardware bug HSDES#16012383669 it basically requires the
source data to be packed in the GRF if the destination stride isn't
dword-aligned.
In order to address those restrictions this patch leverages the
existing infrastructure from brw_fs_lower_regioning.cpp. The same
general approach can be used to handle this restriction we were using
to handle restrictions of the floating-point pipeline in previous
generations: Unsupported source regions are lowered by emitting an
additional copy before the instruction that shuffles the data in a way
that allows using a valid region in the original instruction. The
main difficulty that wasn't encountered in previous platforms is that
it is non-trivial to come up with a copy instruction that doesn't
break the regioning restrictions itself, since on previous platforms
we could just bitcast floating-point data and use integer copies in
order to implement arbitrary regioning, which is unfortunately no
longer a choice lacking a magic third pipeline able to do the
regioning modes the integer pipeline is no longer able to do.
The required_src_byte_stride() and required_src_byte_offset() helpers
introduced here try to calculate parameters for both regions that
avoid that situation, but it isn't always possible, and actually in
some cases that involve the second source of ALU instructions a chain
of multiple copy instructions will be required, so the
lower_instruction() routine needs to be applied recursively to the
instructions emitted to lower the original instruction.
XXX - Allow more flexible regioning for the second source of an
instruction if bug HSDES#16012383669 is fixed in a future
hardware platform.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28698>
Improves the error logging in the LAVA job submitter by capturing and
logging the exception message rather than just the exception type when a
job fails to run.
Additionally, introduces a clearer script interruption
message to aid in debugging and immediate understanding of job
submission failures.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28778>
This commit refactors the exception hierarchy to differentiate between
retriable and fatal errors in the CI pipeline, specifically within the
LAVA job submission process.
A new base class, `MesaCIRetriableException`, is introduced for
exceptions that should trigger a retry of the CI job, while
`MesaCIFatalException` is added for non-recoverable errors that halt the
process immediately.
Additionally, the logic for deciding whether a job should be retried or
not is updated to check for instances of `MesaCIRetriableException`,
improving the robustness and reliability of the CI job execution
strategy.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28778>
This mirrors RADV's pipeline behavior (which is more performant
when programs like DXVK don't use the pipeline cache functionality)
Drivers need to set the implicit cache variable to use this though
(the next patch will enable this for NVK)
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28851>
We track stencil clears and writes to optimize them. Unfortunately, the
code for doing this tracks the whole resource, not individual layers or
levels within the resource, which can result in incorrect output when
different levels or layers are accessed. Modified to optimize only the first
layer/level; this will handle the common case of a single stencil texture
while allowing arrays or mipmaps to still work (albeit slightly slower).
The original optimization was introduced in a2463ec271 ("panfrost:
Constant stencil buffer tracking") but the code has been reformatted
since then, so this change won't apply as-is that far back (although it's
fairly obvious how to apply it by hand).
Fixes: a2463ec271 ("panfrost: Constant stencil value tracking")
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28832>