Don't try to chase across blocks to find a matching destination for a
given source. This can be prone to exponential blowup when there is a
complicated series of if-ladders and we have to crawl through every
possible path. With scalar ALU, this was causing timeouts on one test
when we stopped counting scalar ALU. Rather than adding yet more
band-aids, just switch to a different approach that most other backends
are using where we have a scoreboard of outstanding registers and we
keep track of the cycle when each register becomes "ready". This
integrates nicely into the pre-existing ir3 legalize infrastructure for
(ss) and (sy), although it does require duplicating the logic in
ir3_delayslots() in a different form.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28750>
At the beginning of each block we were taking the state for the current
block, which after traversing the block already will contain the state
at the *end* of the block, and combining it with the predecessors to get
the state for the *start* of the block. This will cause
over-synchronization.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28750>
begin() resets rate control state, calling it every frame will cause
issues such as not reaching the desired target bitrate.
Rate control only has to be reset when changing rate control mode,
otherwise it's enough to update the parameters.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28727>
We recently made some effort to reduce the LDS use of TCS:
The lowering no longer uses the same output location mapping when
storing TCS outputs to LDS and VRAM. This means that the same
patch will use a different amount of LDS and VRAM.
Therefore, we need to properly calculate the patch size in VRAM
when determining the number of output patches.
Fixes: 0e481a4adc
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28739>
The blob uses MME shadow scratch 26 to indicate whether or not it is running in
a simulated environment (not real hardware). If true, the SET_FALCON04
firmware method used to modify PRI registers isn't used.
As Nouveau only runs on real hardware, this is useless and can be removed.
Tested by running dEQP-VK.subgroups.vote.frag_helper.subgroupallequal_bvec2_fragment,
which exposes the original issue with a ~50% failure rate on RTX3080
(when disabling the register write altogether). With this change, the
success rate remains at 100%.
Signed-off-by: Arthur Huillet <ahuillet@nvidia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28714>
If an application tries to call
vkGetDeviceImageSparseMemoryRequirements() for an image that's not
supported by Sparse, then anv_image_init_from_create_info() will fail
and we will either hit the assertion in case of a debug build or just
pretend everything works in case of a release build. Properly return
no properties to signal the image is not supported.
The spec is not clear in specifying that this is what should be done
in this case, but this behavior should match the other
query-properties-from-sparse-images-we-didn't-create-yet functions
such as vkGetPhysicalDeviceSparseImageFormatProperties().
No known application outside my computer is tripping on this failure.
I discovered it when writing my own micro test cases for MSAA sparse.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28792>
While it lived inside anv_batch_chain.c it made sense to call this
function xe_exec_fill_sync() because, well, the function was used to
fill the sync objects for the xe execbuf ioctl. Now that the function
is exported to the .h file and accessible to the rest of the driver,
let's give it a name that reflects what it does instead of what it was
used for when it was static: call it vk_sync_to_drm_xe_sync() because
it converts a vk_sync to a drm_xe_sync.
Also let's bikeshed the implementation so that it returns the struct
it builds: this should make callers cleaner and easier to understand.
No functional changes, only bikeshedding.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28792>
The reason we made the vm_bind ioctl return DEVICE_LOST on error is
that we thought there wasn't anything we could do if the ioctl failed.
Thomas Hellstrom pointed us that in case the system is under memory
pressure ENOMEM will be returned and there are things we can try to do
to make it work: either free memory or do fewer bind operations per
ioctl. For now let's just return the appropriate error for the case
(OUT_OF_HOST_MEMORY) so that maybe applications can try to something
about it. In the next patch we'll implement an additional strategy to
try to make things work.
Due to an unrleated failure, our CI system has also pointed out that
we were submitting invalid arguments, so we were getting an EINVAL.
Although we fixed that, these situations really shouldn't happen and
they should be easy to detect, so just put an assertion there, which
will make it easier for us developers to spot any issues.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28792>