Since the group is created from the onset, we have to make
sure that four or eight src values don't have a readport
conflict, so force a pre-loading of the values to registers
evenly distributed over the channels and let copy-propagation
take care of cleaning up un-neccesary moves.
Fixes: 79ca456b48
r600/sfn: rewrite NIR backend
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28840>
getenv("HOME") is significantly faster than getpwuid_r(...)->pw_dir,
because the latter may require loading NSS libraries, reading
/etc/passwd, etc.
Furthermore, the Linux man pages for getpwuid_r recommend using
getenv("HOME") instead of getpwuid_r, because the user may wish to
change the value of HOME after logging in.
Signed-off-by: Manuel Stoeckl <code@mstoeckl.com>
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13614>
This isn't an exact translation. It's identical to the previous values
for all uncompressed items but I fixed some of the compressed ones
(which we don't currently use) based on the enum names.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24596>
This also involves adding array support to the struct parser.
Fortunately, the header files for QMDs are really consistent here and we
can make lots of assumptions like that i is always the index variable.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28756>
For integer types, the signedness is determined by flags on the muladd
instruction. The types of the sources play no role. Previously we were
using the signedness of the type and ignoring the mask.
Adjust the types passed to the dpas_intel intrinsic to match.
Fixes various
dEQP-VK.compute.*.cooperative_matrix.khr_*.matrixmuladd_cross.* tests on
different Intel platforms. Some platforms had failing tests, and some
platforms failed EU validation before the tests could fail.
Fixes: 6b14da33ad ("intel/fs: nir: Add nir_intrinsic_dpas_intel")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28822>
this fixes some cases with separate shaders where an output component
store was eliminated early but the input component load was not also
eliminated
some outputs can't yet be scalarized without exploding everything
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28723>
"adb remount -R" doesn't actually remount partitions read/write, it just
enables the overlayfs and it needs to be followed by "adb remount" after
rebooting to actually remount.
Also, patchelf seems to work for changing the soname and is probably
better than hacking meson.build.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28812>
The system needs to be android, or else we run into the libarchive build
error fixed in 735fe243a7:
In file included from ../subprojects/libarchive-3.7.2/libarchive/archive_write_open_memory.c:33: ../subprojects/libarchive-3.7.2/libarchive/archive.h:101:10: fatal error: 'android_lf.h' file not found
Also, it uses the aarch64 clang but "cpu = 'armv8'", which doesn't
work (armv8 is the 32-bit version). Use aarch64 as presumably intended.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28812>
Don't try to chase across blocks to find a matching destination for a
given source. This can be prone to exponential blowup when there is a
complicated series of if-ladders and we have to crawl through every
possible path. With scalar ALU, this was causing timeouts on one test
when we stopped counting scalar ALU. Rather than adding yet more
band-aids, just switch to a different approach that most other backends
are using where we have a scoreboard of outstanding registers and we
keep track of the cycle when each register becomes "ready". This
integrates nicely into the pre-existing ir3 legalize infrastructure for
(ss) and (sy), although it does require duplicating the logic in
ir3_delayslots() in a different form.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28750>
At the beginning of each block we were taking the state for the current
block, which after traversing the block already will contain the state
at the *end* of the block, and combining it with the predecessors to get
the state for the *start* of the block. This will cause
over-synchronization.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28750>