Compare commits

50 Commits

Author SHA1 Message Date
Eric Engestrom
22fafc9824 VERSION: bump for 24.2.0 2024-08-14 18:37:13 +02:00
WANG Xuerui
a2d4bd10c3 meson: Additionally probe -mtls-dialect=desc for TLSDESC support
Previously only `-mtls-dialect=gnu2` was probed, which was appropriate
for arm, x86 and x86_64, but not for newer architectures such as
aarch64, loongarch64 and riscv64 which all use `-mtls-dialect=desc`
instead. Because the driver option is not consistent across
architectures (and probably never will be), try both variants and choose
the first one that works.

While at it, rename "gnu2_*" variables to "tlsdesc_*" respectively, for
clarity.

Cc: mesa-stable
Reviewed-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Yukari Chiba <i@0x7f.cc>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30599>
(cherry picked from commit cc2dbb8ea5)
2024-08-14 17:45:45 +02:00
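The probing strategy described in the commit above (try each candidate option and keep the first that works) can be sketched outside Meson. This is only an illustration, not Mesa's build code: `pick_tlsdesc_flag` and the `accepts` callback are hypothetical names, with `accepts` standing in for the real compile-and-run check.

```python
from typing import Callable, Iterable, Optional

# Candidate driver options, in the order the commit message lists them.
TLSDESC_FLAGS = ("-mtls-dialect=gnu2", "-mtls-dialect=desc")


def pick_tlsdesc_flag(
    accepts: Callable[[str], bool],
    candidates: Iterable[str] = TLSDESC_FLAGS,
) -> Optional[str]:
    """Return the first candidate flag the toolchain accepts, or None.

    `accepts` is a hypothetical stand-in for the real probe.
    """
    for flag in candidates:
        if accepts(flag):
            return flag  # first working variant wins
    return None  # neither spelling works: keep the compiler default
```

Returning `None` when nothing matches mirrors the idea of falling back to the compiler default rather than failing the build.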
WANG Xuerui
14f6b72604 meson: Force use of LLVM ORCJIT for hosts without MCJIT support
Although the ORCJIT codepath is fresh and relatively less tested, this
is still better than no llvmpipe at all for those newer architectures
that will not gain MCJIT support, such as LoongArch or RISC-V.

Fixes: 6f02ec5ed1 ("llvmpipe: add an implementation with llvm orcjit")
Reviewed-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Yukari Chiba <i@0x7f.cc>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30599>
(cherry picked from commit 56f38672a2)
2024-08-14 17:45:45 +02:00
Hans-Kristian Arntzen
f11e04e331 wsi/x11: Bump maximum number of outstanding COMPLETE events.
Fixes a "regression" where comically large FPS tests regressed.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fixes: 19dba854 ("wsi/x11: Rewrite implementation to always use threads.")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30638>
(cherry picked from commit 5a97916fdc)
2024-08-14 17:45:41 +02:00
Eric Engestrom
a4b0f0f765 .pick_status.json: Update to cc2dbb8ea5 2024-08-14 17:45:37 +02:00
Antonio Ospite
5854ff2dd9 android: simplify building libgallium_dri on Android
The versioned libgallium library can be confusing on Android, and it is
probably not even needed there, so simplify the build on Android by
always building the unversioned `libgallium_dri.so`, overriding the
`-Dunversion-libgallium=true` option added in
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30579

Also remove all the bits that deal with the versioned library, which are
no longer needed.

Fixes: 9568976c52 ("android: fix build in multiple ways")
Acked-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Mauro Rossi <issor.oruam@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30641>
(cherry picked from commit 2d2bc5b307)
2024-08-14 16:07:01 +02:00
Rob Clark
9f8856c5af gallium: Add option to not add version to libgallium filename
This is unneeded in some environments, like ChromeOS and Android. For
CrOS in particular it causes problems with the GPU sandbox rules: we
don't want to have to update the sandbox rules for each new Mesa
version.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30579>
(cherry picked from commit 19ff16387a)
2024-08-14 16:07:01 +02:00
Eric Engestrom
10dfd5d13b .pick_status.json: Update to 214b6c3040 2024-08-14 16:07:01 +02:00
Icenowy Zheng
b01adb2118 gallivm: orcjit: use atexit to release LPJit singleton at exit
Valgrind will report some memory possibly lost because of this singleton
(it's dynamically allocated when it is first accessed).

Use atexit() to register a handler that releases this singleton.

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 5f22e152ad)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30645>
2024-08-14 16:06:41 +02:00
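The pattern in the commit above (a lazily created singleton whose cleanup is registered with an exit handler) can be sketched in Python as an analogy. The class name `Jit` is hypothetical, and Python's `atexit.register` stands in for C's `atexit()`; this is not the gallivm code.

```python
import atexit
from typing import Optional


class Jit:
    """Hypothetical stand-in for a lazily created singleton such as LPJit."""

    _instance: Optional["Jit"] = None

    @classmethod
    def get(cls) -> "Jit":
        if cls._instance is None:
            cls._instance = cls()
            # Register the cleanup once, the first time the singleton exists.
            atexit.register(cls.release)
        return cls._instance

    @classmethod
    def release(cls) -> None:
        # Drop the only reference so the object can be freed before exit.
        cls._instance = None
```

Registering the handler at first use, rather than at import time, means nothing is registered if the singleton is never created.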
Icenowy Zheng
27b6484317 gallivm: orcjit: keep the ownership of tm for LPJit
The ownership of the TargetMachine object is released when the LPJit
singleton is constructed, which leads to a small but detectable memory leak.

Keep the ownership by saving the unique pointer as another class member
named tm_unique.

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 3423e73cec)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30645>
2024-08-14 16:06:34 +02:00
Icenowy Zheng
df083003ab llvmpipe: add LoongArch support in ORCJIT
LoongArch is an architecture too new to have MCJIT support.

Add support for it to the ORCJIT code.

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e16a74c023)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30645>
2024-08-14 16:06:29 +02:00
Icenowy Zheng
89dbb1ca29 gallivm: add LoongArch support to the mattrs setting code
Currently the mattrs is set according to the softdev convention, with
LSX explicitly disabled because it's troublesome at least on LLVM 17.

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 979c364018)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30645>
2024-08-14 16:06:23 +02:00
Icenowy Zheng
a8ea86d2e8 util: detect LoongArch architecture
Only 64-bit is considered now because 32-bit LoongArch Linux support
doesn't exist in upstream yet.

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 08425d9aaf)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30645>
2024-08-14 16:06:05 +02:00
Matt Turner
56bef9de05 util: Force emission of stack frame in stack unit test
The `capture_not_overwritten` unit test captures and compares two
backtraces -- one from inside a call to `func_c` and one outside -- and
confirms that they are not identical. That is, that `func_c` is in the
backtrace.

On 32-bit x86, without `-fno-omit-frame-pointer`, the function will not
emit a stack frame. As a result, the unit test fails.

The fix is to compile `func_c` with the flag `-fno-omit-frame-pointer`
to prevent the compiler from optimizing out the stack frame which is
otherwise unneeded.

Bug: https://bugs.gentoo.org/823774
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4091
Fixes: d0d14f3f64 ("util: Add unit test for stack backtrace caputure")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30622>
(cherry picked from commit 05dc4eb536)
2024-08-14 11:53:31 +02:00
Matt Turner
7335dbb895 util: Add ATTRIBUTE_OPTIMIZE(flags)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30622>
(cherry picked from commit b3430a7bb8)
2024-08-14 11:53:31 +02:00
Mike Blumenkrantz
f6b2fe8455 zink: fix partial update handling
* the damage region was not being used correctly (this is a normal rect)
* use_damage was never unset at frame boundary
* original renderArea was never re-set

Fixes: 3d38c9597f ("zink: hook up KHR_partial_update")

Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30625>
(cherry picked from commit a7f64c6203)
2024-08-14 11:52:50 +02:00
Lionel Landwerlin
db297c6534 brw/rt: fix ray_object_(direction|origin) for closest-hit shaders
When a closest-hit shader is called, the BVH object-level origin/direction
loaded by brw_nir_rt_load_mem_ray is 0. What we should use instead is the
ray origin/direction with the transform of the current instance applied.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9ba7d459a3 ("intel/rt: Implement the new ray-tracing system values")
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30578>
(cherry picked from commit aaff191356)
2024-08-14 11:52:49 +02:00
Karol Herbst
46ad101f67 rusticl/memory: fix sampler argument size check
Not entirely sure why this hasn't caused any problems...

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30602>
(cherry picked from commit 0cfcd2ff83)
2024-08-14 11:52:48 +02:00
Pavel Ondračka
fbba6b7b8d r300: bias presubtract fix
We need to double check that the source is indeed constant before
looking at the constant type.

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Fixes: 0508db9155
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29893>
(cherry picked from commit 1cad339409)
2024-08-14 11:52:47 +02:00
Karol Herbst
fef78c34aa util/u_printf: do not double print format string with unused arguments
The CL CTS added a new test that calls printf("\n", "foo"), but we ended up
printing the newline twice. If we can't find a specifier anymore, ignore
the argument: after the loop that processes all arguments, we print the
remaining format string anyway.

Cc: mesa-stable
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30574>
(cherry picked from commit 4080269845)
2024-08-14 11:52:47 +02:00
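The behavior described in the commit above can be modeled with a small formatter. This is a simplified sketch, not the `u_printf` code: `format_with_args` is a hypothetical name, and only `%d` and `%s` are recognized (no `%%`, widths or flags).

```python
import re

# Simplified specifier set: only %d and %s are modeled here.
_SPEC = re.compile(r"%[ds]")


def format_with_args(fmt: str, args: list) -> str:
    out = []
    pos = 0
    for arg in args:
        match = _SPEC.search(fmt, pos)
        if match is None:
            # No specifier left: ignore the argument. The tail of the
            # format string is emitted once, after the loop.
            continue
        out.append(fmt[pos:match.start()])  # literal text before the specifier
        out.append(str(arg))                # the converted argument
        pos = match.end()
    out.append(fmt[pos:])  # remaining literal text, printed exactly once
    return "".join(out)
```

With this structure, `format_with_args("\n", ["foo"])` yields a single newline, because the unused argument never causes the tail to be emitted inside the loop.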
Valentine Burley
0ffd6a87d0 tu: Always report that we can present on kgsl
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8637
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9240
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9365
Fixes: 3e7f6c9aeb ("tu: implement wsi hook to decide if we can present directly on device")
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29627>
(cherry picked from commit 367191ff63)
2024-08-14 11:52:46 +02:00
Valentine Burley
3dc242cb5e vulkan/wsi: Refactor can_present_on_device
Make wsi_device_matches_drm_fd() a default helper that PCI based GPUs plug in to
wsi_dev->can_present_on_device. This is needed for devices without libdrm, where
wsi_device_matches_drm_fd was still being called causing an "undefined reference"
build error.

Suggested-by: Rob Clark <robdclark@chromium.org>
Fixes: baa38c144f ("vulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching")
Reviewed-by: Mark Collins <mark@igalia.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29627>
(cherry picked from commit 47289ebc8d)
2024-08-14 11:52:45 +02:00
Mike Blumenkrantz
cf393b4076 egl/wayland: bail on zink init in non-sw mode if extension check fails
cc: mesa-stable

Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30479>
(cherry picked from commit d120992e1a)
2024-08-14 11:52:44 +02:00
Georg Lehmann
6d680b5d39 aco/gfx10+: set lateKill for sgprs used by wave64 VALU writing a mask
RDNA2 ISA doc, 6.2.4. Wave64 Destination Restrictions:
The first pass of a wave64 VALU instruction may not overwrite a scalar value
used by the second half.

Foz-DB Navi31:
Totals from 5221 (6.58% of 79395) affected shaders:
Instrs: 9751484 -> 9752179 (+0.01%); split: -0.01%, +0.01%
CodeSize: 50624072 -> 50626088 (+0.00%); split: -0.00%, +0.01%
Latency: 85646450 -> 85647419 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 15039160 -> 15039277 (+0.00%); split: -0.00%, +0.00%
VClause: 200275 -> 200204 (-0.04%)
SClause: 248645 -> 248607 (-0.02%); split: -0.03%, +0.01%
Copies: 640802 -> 641413 (+0.10%); split: -0.01%, +0.11%
PreSGPRs: 236297 -> 236735 (+0.19%)
VALU: 5666449 -> 5666440 (-0.00%)
SALU: 967482 -> 968111 (+0.07%); split: -0.01%, +0.07%

Cc: mesa-stable

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30368>
(cherry picked from commit 510f5e55be)
2024-08-14 11:52:40 +02:00
Timothy Arceri
75a131315f glsl: always copy bindless sampler packing constructors to a temp
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11648
Fixes: 3cdcc5f02f ("glsl: implement ARB_bindless_texture conversions")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30586>
(cherry picked from commit 3da4b5eaa5)
2024-08-14 11:52:39 +02:00
David Heidelberg
88b8d72234 ci/alpine: use llvm variables
Fixes: da391650f5 ("ci: build a host version of mesa for cross builds")

Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30482>
(cherry picked from commit 34753cefd8)
2024-08-14 11:52:24 +02:00
David Heidelberg
3cf68b1295 llvmpipe: Silence "possibly uninitialized value" warning for ssbo_limit (cont)
Fixes: ce611935df ("llvmpipe: Silence "possibly uninitialized value" warning for ssbo_limit.")

Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30482>
(cherry picked from commit 9c8e75e256)
2024-08-14 11:52:19 +02:00
Faith Ekstrand
239fb0bdd2 zink: Align descriptor buffers to descriptorBufferOffsetAlignment
Instead of aligning offsets, we just align the size every time we query
it.  This simplifies our offset and size calculations later since we can
always just add up descriptor buffer sizes and know that we'll be okay.

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Fixes: 7ab5c5d36d ("zink: use EXT_descriptor_buffer with ZINK_DESCRIPTORS=db")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
(cherry picked from commit 0f8f407e57)
2024-08-14 11:52:19 +02:00
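The idea in the commit above, aligning each size when it is queried so that offsets become plain sums, can be sketched as follows. The helper names are hypothetical and the alignment is assumed to be a power of two; this is not the zink code.

```python
from typing import List


def align_up(value: int, alignment: int) -> int:
    """Round value up to a multiple of alignment (assumed a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def layout_offsets(sizes: List[int], alignment: int) -> List[int]:
    """Offsets obtained by summing sizes that are aligned as they are read."""
    offsets = []
    cursor = 0
    for size in sizes:
        offsets.append(cursor)
        cursor += align_up(size, alignment)
    return offsets
```

Because every size added to the running total is already a multiple of the alignment, every resulting offset is aligned without a separate offset-alignment step.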
Faith Ekstrand
cd2ea3a45c nvk: Support STORAGE_READ_WITHOUT_FORMAT on buffers
Fixes: fc19173014 ("nvk: Rework format features queries")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
(cherry picked from commit 8244b87822)
2024-08-14 11:52:17 +02:00
Faith Ekstrand
7ce99a3f63 nvk: Require color or depth/stencil attachment support for input attachments
Fixes: 20d8d1e239 ("nvk: Add a more competent GetPhysicalDeviceImageFormatProperties")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30580>
(cherry picked from commit 08f6066e87)
2024-08-14 11:52:16 +02:00
Ian Romanick
fef088fd5d intel/elk: Don't propagate saturate to an instruction that writes flags
There are two problems.

1. This is not NaN safe. 'add.le.sat dst F, Inf F, -Inf F' has a
   different result than 'add dst F, Inf F, -Inf F; cmp.le null, dst F, 0F'.

2. Ignoring the first problem, this only produces the desired flags
   for LE and G. All other cases can produce the wrong result.

shader-db:

All Intel platforms had similar results. (Broadwell shown)
total instructions in shared programs: 18282314 -> 18282316 (<.01%)
instructions in affected programs: 78 -> 80 (2.56%)
helped: 0
HURT: 2

total cycles in shared programs: 952924234 -> 952924252 (<.01%)
cycles in affected programs: 584 -> 602 (3.08%)
helped: 0
HURT: 2

Fixes: e6022281f2 ("intel/elk: Rename files to use elk prefix")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29774>
(cherry picked from commit 9125b7c1b4)
2024-08-14 11:52:16 +02:00
Ian Romanick
6f625b1b95 intel/brw: Don't propagate saturate to an instruction that writes flags
There are two problems.

1. This is not NaN safe. 'add.le.sat dst F, Inf F, -Inf F' has a
   different result than 'add dst F, Inf F, -Inf F; cmp.le null, dst F, 0F'.

2. Ignoring the first problem, this only produces the desired flags
   for LE and G. All other cases can produce the wrong result.

For example, batman_arkham_city_goty.foz 6a63c4caacaa0dae has the
following code:

    mad.ge.f0.0(8)  g51<1>F         g50<8,8,1>F     g46<8,8,1>F     g11<1,1,1>F
    mov.sat(8)      g52<1>F         g51<1,1,0>F
    ...
    (+f0.0) sel(8)  g54<1>UD        g53<8,8,1>UD    0x3f000000UD

Without this commit, the saturate is incorrectly propagated to the MAD.

A similar case exists in witcher_3_dxvk_g2.foz 5b03243be667a275.

There are even worse cases like total_war_warhammer3.dx12vk-g6.foz
78328466761ef7ab and ee920491573860fc. The former has the following
code (and the latter has very similar code):

    mad.l.f0.0(16)  g95<1>F         g93<8,8,1>F     g62<8,8,1>F     g68<1,1,1>F
    ...
    mov.sat(16)     g109<1>F        -g95<1,1,0>F
    ...
    (+f0.0) sel(16) g68<1>UD        g111<1,1,0>UD   g54<1,1,0>UD
    (+f0.0) sel(16) g70<1>UD        g113<1,1,0>UD   g56<1,1,0>UD
    (+f0.0) sel(16) g72<1>UD        g115<1,1,0>UD   g58<1,1,0>UD

Saturate propagation makes a hash of this code:

    mad.sat.l.f0.0(16) g106<1>F     -g93<8,8,1>F    -g62<8,8,1>F    g68<1,1,1>F
    ...
    (+f0.0) sel(16) g70<1>UD        g110<1,1,0>UD   g56<1,1,0>UD
    (+f0.0) sel(16) g72<1>UD        g112<1,1,0>UD   g58<1,1,0>UD
    (+f0.0) sel(16) g68<1>UD        g108<1,1,0>UD   g54<1,1,0>UD

Not only is the saturate incorrectly applied to the MAD, but the MAD
result is negated without changing the conditional modifier to G!

NOTE: Backports of this commit to stable branches may need to be more
like the following commit to elk.

shader-db:

All Intel platforms had similar results. (Meteor Lake shown)
total instructions in shared programs: 19729375 -> 19729377 (<.01%)
instructions in affected programs: 112 -> 114 (1.79%)
helped: 0
HURT: 2

total cycles in shared programs: 916234266 -> 916234288 (<.01%)
cycles in affected programs: 636 -> 658 (3.46%)
helped: 0
HURT: 2

fossil-db:

All Intel platforms had similar results. (Meteor Lake shown)
Totals:
Instrs: 151531594 -> 151531601 (+0.00%)
Cycle count: 17209107419 -> 17209107474 (+0.00%); split: -0.00%, +0.00%

Totals from 6 (0.00% of 630198) affected shaders:
Instrs: 4550 -> 4557 (+0.15%)
Cycle count: 194629 -> 194684 (+0.03%); split: -0.00%, +0.03%

Fixes: 947c828d5c ("i965/fs: Add a saturation propagation optimization pass.")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29774>
(cherry picked from commit 3d8fea0e09)
2024-08-14 11:52:15 +02:00
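Both problems listed in the commit above can be checked numerically. The sketch below rests on two assumptions that are not stated in the commit message: that a conditional modifier on a saturating instruction compares the saturated result against zero, and that saturating NaN yields 0. The condition names and helper functions are hypothetical.

```python
import math
import operator

# Conditional modifiers modeled as comparisons against 0.0.
CONDS = {
    "L": operator.lt, "LE": operator.le,
    "G": operator.gt, "GE": operator.ge,
    "Z": operator.eq, "NZ": operator.ne,
}


def saturate(x: float) -> float:
    """Clamp to [0, 1]; NaN is assumed to saturate to 0."""
    if math.isnan(x):
        return 0.0
    return min(max(x, 0.0), 1.0)


def flag_survives_saturate(cond: str, samples=(-2.0, -0.5, 0.0, 0.5, 2.0)) -> bool:
    """True if comparing sat(x) gives the same flag as comparing x."""
    cmp = CONDS[cond]
    return all(cmp(x, 0.0) == cmp(saturate(x), 0.0) for x in samples)
```

Under these assumptions only `LE` and `G` survive, which matches the second point of the commit message; the NaN case (`Inf + -Inf`) differs even for `LE`, which matches the first.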
Jesse Natalie
c9f1f288b6 meson: Add an error message for llvmpipe without llvm draw support
Fixes: 010b2f9497 ("gallium/meson: Deconflate swrast/softpipe/llvmpipe")
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30577>
(cherry picked from commit 169f8ec227)
2024-08-14 11:51:43 +02:00
Georg Lehmann
2ff3011b02 nir/lower_int64: replace uadd_sat with ior for find_lsb64 and ufind_msb64
Using ior here is equivalent to using uadd_sat, but works for every driver
and shouldn't hurt anywhere.

I forgot to fix this up when fixing up some vvl errors with zink.

Fixes crashes with the integer_ctz CL CTS tests in zink.

Fixes: 39ec184db6 ("zink: lower 64 bit find_lsb, ufind_msb and bit_count")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30535>
(cherry picked from commit 48acf9d358)
2024-08-14 11:51:29 +02:00
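One way to see why `ior` can replace `uadd_sat` here: a 32-bit `find_lsb` result is either in 0..31 (bit 5 clear) or 0xFFFFFFFF for "no bit found", and for exactly those values `v | 32` equals a saturating `v + 32`. The sketch below assumes the 64-bit lowering combines the two halves with a minimum and a constant of 32; that structure is an assumption for illustration, not quoted from the commit.

```python
U32_MAX = 0xFFFFFFFF


def uadd_sat32(a: int, b: int) -> int:
    """Unsigned 32-bit saturating add."""
    return min(a + b, U32_MAX)


def find_lsb32(x: int) -> int:
    """Index of the lowest set bit, or 0xFFFFFFFF if x is zero."""
    if x == 0:
        return U32_MAX
    return (x & -x).bit_length() - 1


def find_lsb64(x: int) -> int:
    """64-bit find_lsb built from two 32-bit halves, using ior."""
    lo = find_lsb32(x & U32_MAX)
    hi = find_lsb32((x >> 32) & U32_MAX)
    # hi is either 0..31 (so `| 32` equals `+ 32`) or 0xFFFFFFFF
    # (where `| 32` and a saturating add both leave it unchanged).
    return min(lo, hi | 32)
```

The equivalence only holds for this restricted input domain; for arbitrary 32-bit values `| 32` and a saturating add of 32 differ.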
Mike Blumenkrantz
8fd12baaaa dri: fix kms_swrast screen fail
this should match all the other screen init functions

cc: mesa-stable

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30524>
(cherry picked from commit d6ac254c01)
2024-08-14 11:51:29 +02:00
Mike Blumenkrantz
ec87b9cb5c egl: fix zink init
* close(fd) requires also resetting the fd=-1 or else boom
* checking just driver_name is broken because loader_get_driver_for_fd()
  uses MESA_LOADER_DRIVER_OVERRIDE, so there's no way to differentiate
  an inferred load

Fixes: b907eb4750 ("egl: don't bind zink under dri2/3")

Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30556>
(cherry picked from commit 1a579552af)
2024-08-14 11:51:28 +02:00
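The first bullet of the commit above describes a common file-descriptor pitfall: once a descriptor is closed, the stored value must be reset so later cleanup does not close it again. A minimal Python analogy, with a hypothetical `FdHolder` class (not the EGL code):

```python
import os


class FdHolder:
    """Owns a file descriptor and uses -1 to mean "nothing to close"."""

    def __init__(self, fd: int) -> None:
        self.fd = fd

    def close(self) -> None:
        if self.fd >= 0:
            os.close(self.fd)
            self.fd = -1  # mark as closed so later cleanup skips it
```

Calling `close()` twice is then harmless, whereas closing the same raw descriptor twice raises `OSError` in Python, or may close an unrelated descriptor that reused the number.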
Tapani Pälli
0a8b665c8f anv: fix a cmd_buffer reference in simple shader
In utrace timestamp copy case cmd_buffer is NULL.

Fixes: dbbcd5c32c ("anv: factor out generation kernel dispatch into helper")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30475>
(cherry picked from commit ff8953f666)
2024-08-14 11:51:18 +02:00
Karol Herbst
3fcb2db345 rusticl/queue: add clSetCommandQueueProperty
The CL CTS started to call this API. Luckily we don't have to actually
implement it, because we don't intend to support CL 1.0-only devices in
the first place (probably).

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30575>
(cherry picked from commit cd2dc4f70c)
2024-08-14 11:51:17 +02:00
Lionel Landwerlin
6e51ea0001 anv/blorp: force CC_VIEWPORT reallocation when programming 3DSTATE_VIEWPORT_STATE_POINTERS_CC
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11647
Fixes: fe1baa6481 ("anv: reduce blorp dynamic state emissions")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30532>
(cherry picked from commit 10533e7b4c)
2024-08-14 11:51:14 +02:00
Eric R. Smith
3f8d9068e1 panfrost: use RGB1 component ordering for R5G6B5 pixel formats
For some purposes (e.g. advanced blending) we need a non-zero alpha
value returned from reads. This is only guaranteed on Bifrost if
we explicitly request RGB1 component ordering. The default is to use
RGBA component ordering, which for R5G6B5 causes 0 to be read for
alpha.

A complication is that the Mali fixed function hardware requires
four components (which implies RGBA rather than RGB1). If fixed
function blending is in use, we modify the pixel format back to
RGBA when building the blend descriptor.

Cc: mesa-stable
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29606>
(cherry picked from commit 004e0eb3ab)
2024-08-14 11:51:13 +02:00
Eric R. Smith
3610b26a3e panfrost: fix texture.border_clamp regression for valhall
We have to swizzle the border color in order to offset the
automatic swizzling introduced to compensate for limited
component order support in AFBC/AFRC. However, the border color
format is only available if the `TEXTURE_BORDER_COLOR_QUIRK` is
enabled, so set that for v10 (it was already set for v7).

While testing, we uncovered another issue: valhall introduces a
swizzle for depth+stencil formats that isn't present for bifrost, and
also isn't needed (or wanted) for the border color. So ignore the
border color swizzle for depth+stencil on valhall (on bifrost the
swizzle is a no-op anyway).

Fixes: 87aad0a5e4 ("panfrost: encode component order as an inverted swizzle (v10)")
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30542>
(cherry picked from commit 3135f76331)
2024-08-14 11:51:12 +02:00
Marek Olšák
5c6f02805f ac/surface/gfx12: turn off HiZ for pre-production samples
Fixes: f703dfd1bb - radeonsi: add gfx12

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30525>
(cherry picked from commit 97d664b22f)
2024-08-14 11:51:11 +02:00
Eric Engestrom
4194c3b925 android: fix build in multiple ways
Includes the libgallium versioning, the megadriver hardlink -> symlink
change, and some fixes for things like abusing ls output.

backport-to: 24.2

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30470>
(cherry picked from commit 9568976c52)
2024-08-14 11:51:05 +02:00
Paulo Zanoni
c7dfe51eb3 intel: fix compute SLM sizes on Xe2 and newer
Before the patch, intel_device_info_get_max_preferred_slm_size()
returns values in kilobytes, but then
intel_device_info_get_max_slm_size() is multiplying it by 1024.
As a result, LNL is reporting maxComputeSharedMemorySize to be
134217728, which is 128 MiB.

Fix this by making intel_device_info_get_max_slm_size() not multiply
it by 1024.

This should fix at least the following dEQP tests:
    dEQP-VK.compute.pipeline.zero_initialize_workgroup_memory.max_workgroup_memory.1
    dEQP-VK.compute.pipeline.zero_initialize_workgroup_memory.max_workgroup_memory.128
    dEQP-VK.compute.pipeline.zero_initialize_workgroup_memory.max_workgroup_memory.16
    dEQP-VK.compute.pipeline.zero_initialize_workgroup_memory.max_workgroup_memory.2
    dEQP-VK.compute.pipeline.zero_initialize_workgroup_memory.max_workgroup_memory.4
    dEQP-VK.compute.pipeline.zero_initialize_workgroup_memory.max_workgroup_memory.64

Some tests were failing with:
    deqp-vk: ../../src/intel/common/intel_compute_slm.c:24: slm_encode_lookup: Assertion `kbytes <= table[table_len - 1].size_in_kb' failed.
while other tests were triggering the OOM.

v2:
 - Make everybody return sizes in bytes (José).
v3:
 - Rename variable to bytes (José, Jordan).

Fixes: fd368f5521 ("anv: Set maxComputeSharedMemorySize value for Xe2 platforms")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30541>
(cherry picked from commit 0e38b794e2)
2024-08-14 11:51:03 +02:00
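The reported figure in the commit above is consistent with a size being scaled by 1024 one time too many: 128 KiB is 131072 bytes, and 131072 * 1024 = 134217728. The sketch below illustrates that arithmetic only; the function names are hypothetical and do not correspond to the real Mesa helpers.

```python
KIB = 1024


def max_slm_size_double_scaled(size_in_bytes: int) -> int:
    """Scales a value that is already in bytes a second time (the bug shape)."""
    return size_in_bytes * KIB


def max_slm_size_bytes(size_in_bytes: int) -> int:
    """Everything stays in bytes, so no further scaling is applied."""
    return size_in_bytes
```

Keeping a single unit (bytes) across all helpers, as the v2 note in the commit message suggests, removes the need to remember which caller is responsible for the conversion.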
Sil Vilerino
54386fe91a Revert "d3d12: Video Encode - Remove PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE as not supported"
This reverts commit d6bb4ddc63.
Fixes: d6bb4ddc63 ("d3d12: Video Encode - Remove PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE as not supported")

PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE is necessary for some scenarios like the example below
described in https://github.com/microsoft/WSL/issues/11838

gst-launch-1.0 -v videotestsrc num-buffers=250 !
    video/x-raw,width=1920,height=1200 !
    vaapipostproc !
    vaapih264enc !
    filesink location=~/wsl_test.h264

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30548>
(cherry picked from commit a0f1a708c4)
2024-08-14 11:50:40 +02:00
Karol Herbst
7880933b15 rusticl/image: properly sync mappings content for 1Dbuffer images
This fixes clFillImage 1Dbuffer use_pitches CL CTS tests.

Fixes: 7b22bc617b ("rusticl/memory: complete rework on how mapping is implemented")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30528>
(cherry picked from commit 012323a1d1)
2024-08-14 11:50:06 +02:00
Karol Herbst
34ad0f8bd7 rusticl/image: take pitches into account when allocating memory for maps
This is more correct than the previous code and the CL CTS relies on edge
case behavior here, e.g. for 1Dbuffer images.

I think part of that is not actually required by the spec, but whatever.

Fixes: 7b22bc617b ("rusticl/memory: complete rework on how mapping is implemented")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30528>
(cherry picked from commit 2484331e82)
2024-08-14 11:50:06 +02:00
Karol Herbst
19e29aca4b rusticl/memory: Fix memory unmaps after rework
An application could map and unmap a host ptr allocation multiple times,
but because of how the refcounting works, we might never end up syncing the
written data to the mapped region.

This moves the refcounting out of the event processing.

Fixes: 7b22bc617b ("rusticl/memory: complete rework on how mapping is implemented")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30528>
(cherry picked from commit 1fa288b224)
2024-08-14 11:50:06 +02:00
Eric Engestrom
f55d119d6a ci: pass MESA_SPIRV_LOG_LEVEL from job to the test
Fixes: 4b8735cd4e ("ci: raise the log level threshold of spirv logs")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30546>
(cherry picked from commit b6d8459e3a)
2024-08-14 11:14:57 +02:00
Eric Engestrom
30645ecbf8 .pick_status.json: Update to c90e2bccf7 2024-08-14 11:01:39 +02:00
55 changed files with 4195 additions and 198 deletions

View File

@@ -84,6 +84,7 @@ VARS=(
 MESA_IMAGE_PATH
 MESA_IMAGE_TAG
 MESA_LOADER_DRIVER_OVERRIDE
+MESA_SPIRV_LOG_LEVEL
 MESA_TEMPLATES_COMMIT
 MESA_VK_ABORT_ON_DEVICE_LOSS
 MESA_VK_IGNORE_CONFORMANCE_WARNING

View File

@@ -18,7 +18,7 @@ DEPS=(
 bash
 bison
 ccache
-clang16-dev
+clang${LLVM_VERSION}-dev
 cmake
 clang-dev
 coreutils
@@ -31,8 +31,8 @@ DEPS=(
 glslang
 graphviz
 linux-headers
-llvm16-static
-llvm16-dev
+llvm${LLVM_VERSION}-static
+llvm${LLVM_VERSION}-dev
 meson
 mold
 musl-dev

File diff suppressed because it is too large

View File

@@ -1 +1 @@
-24.2.0-rc4
+24.2.0

View File

@@ -157,9 +157,9 @@ endif
 endef
 ifneq ($(strip $(BOARD_MESA3D_GALLIUM_DRIVERS)),)
-# Module 'libgallium_dri', produces '/vendor/lib{64}/dri/libgallium_dri.so'
+# Module 'libgallium_dri', produces '/vendor/lib{64}/libgallium_dri.so'
 # This module also trigger DRI symlinks creation process
-$(eval $(call mesa3d-lib,libgallium_dri,dri,MESA3D_GALLIUM_DRI_BIN))
+$(eval $(call mesa3d-lib,libgallium_dri,,MESA3D_GALLIUM_BIN))
 # Module 'libglapi', produces '/vendor/lib{64}/libglapi.so'
 $(eval $(call mesa3d-lib,libglapi,,MESA3D_LIBGLAPI_BIN))

View File

@@ -63,8 +63,8 @@ MESON_OUT_DIR := $($(M_TARGET_PREFIX)TARGET_OUT_INTER
 MESON_GEN_DIR := $(MESON_OUT_DIR)_GEN
 MESON_GEN_FILES_TARGET := $(MESON_GEN_DIR)/.timestamp
-MESA3D_GALLIUM_DRI_DIR := $(MESON_OUT_DIR)/install/usr/local/lib/dri
-$(M_TARGET_PREFIX)MESA3D_GALLIUM_DRI_BIN := $(MESON_OUT_DIR)/install/usr/local/lib/libgallium_dri.so
+MESA3D_GALLIUM_DIR := $(MESON_OUT_DIR)/install/usr/local/lib
+$(M_TARGET_PREFIX)MESA3D_GALLIUM_BIN := $(MESON_OUT_DIR)/install/usr/local/lib/libgallium_dri.so
 $(M_TARGET_PREFIX)MESA3D_LIBEGL_BIN := $(MESON_OUT_DIR)/install/usr/local/lib/libEGL.so
 $(M_TARGET_PREFIX)MESA3D_LIBGLESV1_BIN := $(MESON_OUT_DIR)/install/usr/local/lib/libGLESv1_CM.so
 $(M_TARGET_PREFIX)MESA3D_LIBGLESV2_BIN := $(MESON_OUT_DIR)/install/usr/local/lib/libGLESv2.so
@@ -73,6 +73,7 @@ $(M_TARGET_PREFIX)MESA3D_LIBGBM_BIN := $(MESON_OUT_DIR)/install/usr/local/l
 MESA3D_GLES_BINS := \
+$($(M_TARGET_PREFIX)MESA3D_GALLIUM_BIN) \
 $($(M_TARGET_PREFIX)MESA3D_LIBEGL_BIN) \
 $($(M_TARGET_PREFIX)MESA3D_LIBGLESV1_BIN) \
 $($(M_TARGET_PREFIX)MESA3D_LIBGLESV2_BIN) \
@@ -284,16 +285,11 @@ endif
 $(MESON_BUILD)
 touch $@
-MESON_COPY_LIBGALLIUM := \
-cp `ls -1 $(MESA3D_GALLIUM_DRI_DIR)/* | head -1` $($(M_TARGET_PREFIX)MESA3D_GALLIUM_DRI_BIN)
-$(MESON_OUT_DIR)/install/.install.timestamp: MESON_COPY_LIBGALLIUM:=$(MESON_COPY_LIBGALLIUM)
 $(MESON_OUT_DIR)/install/.install.timestamp: MESON_BUILD:=$(MESON_BUILD)
 $(MESON_OUT_DIR)/install/.install.timestamp: $(MESON_OUT_DIR)/.build.timestamp
 rm -rf $(dir $@)
 mkdir -p $(dir $@)
 DESTDIR=$(call relative-to-absolute,$(dir $@)) $(MESON_BUILD) install
-$(if $(BOARD_MESA3D_GALLIUM_DRIVERS),$(MESON_COPY_LIBGALLIUM))
 touch $@
$($(M_TARGET_PREFIX)MESA3D_LIBGBM_BIN) $(MESA3D_GLES_BINS): $(MESON_OUT_DIR)/install/.install.timestamp
@@ -308,14 +304,3 @@ $(MESON_OUT_DIR)/install/usr/local/lib/libvulkan_$(MESA_VK_LIB_SUFFIX_$1).so: $(
 endef
 $(foreach driver,$(BOARD_MESA3D_VULKAN_DRIVERS), $(eval $(call vulkan_target,$(driver))))
-$($(M_TARGET_PREFIX)TARGET_OUT_VENDOR_SHARED_LIBRARIES)/dri/.symlinks.timestamp: MESA3D_GALLIUM_DRI_DIR:=$(MESA3D_GALLIUM_DRI_DIR)
-$($(M_TARGET_PREFIX)TARGET_OUT_VENDOR_SHARED_LIBRARIES)/dri/.symlinks.timestamp: $(MESON_OUT_DIR)/install/.install.timestamp
-# Create Symlinks
-mkdir -p $(dir $@)
-ls -1 $(MESA3D_GALLIUM_DRI_DIR)/ | PATH=/usr/bin:$$PATH xargs -I{} ln -s -f libgallium_dri.so $(dir $@)/{}
-touch $@
-$($(M_TARGET_PREFIX)MESA3D_GALLIUM_DRI_BIN): $(TARGET_OUT_VENDOR)/$(MESA3D_LIB_DIR)/dri/.symlinks.timestamp
-echo "Build $@"
-touch $@

@@ -501,22 +501,28 @@ if not have_mtls_dialect
if meson.is_cross_build() and not meson.can_run_host_binaries()
warning('cannot auto-detect -mtls-dialect when cross-compiling, using compiler default')
else
# -fpic to force dynamic tls, otherwise TLS relaxation defeats check
gnu2_test = cc.run('int __thread x; int main() { return x; }',
args: ['-mtls-dialect=gnu2', '-fpic'],
name: '-mtls-dialect=gnu2')
if gnu2_test.returncode() == 0 and (
# check for lld 13 bug: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5665
host_machine.cpu_family() != 'x86_64' or
# get_linker_id misses LDFLAGS=-fuse-ld=lld: https://github.com/mesonbuild/meson/issues/6377
#cc.get_linker_id() != 'ld.lld' or
cc.links('''int __thread x; int y; int main() { __asm__(
"leaq x@TLSDESC(%rip), %rax\n"
"movq y@GOTPCREL(%rip), %rdx\n"
"call *x@TLSCALL(%rax)\n"); }''', name: 'split TLSDESC')
)
c_cpp_args += '-mtls-dialect=gnu2'
endif
# The way to specify the TLSDESC dialect is architecture-specific.
# We probe both because there is not a fallback guaranteed to work for all
# future architectures.
foreach tlsdesc_arg : ['-mtls-dialect=gnu2', '-mtls-dialect=desc']
# -fpic to force dynamic tls, otherwise TLS relaxation defeats check
tlsdesc_test = cc.run('int __thread x; int main() { return x; }',
args: [tlsdesc_arg, '-fpic'],
name: tlsdesc_arg)
if tlsdesc_test.returncode() == 0 and (
# check for lld 13 bug: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5665
host_machine.cpu_family() != 'x86_64' or
# get_linker_id misses LDFLAGS=-fuse-ld=lld: https://github.com/mesonbuild/meson/issues/6377
#cc.get_linker_id() != 'ld.lld' or
cc.links('''int __thread x; int y; int main() { __asm__(
"leaq x@TLSDESC(%rip), %rax\n"
"movq y@GOTPCREL(%rip), %rdx\n"
"call *x@TLSCALL(%rax)\n"); }''', name: 'split TLSDESC')
)
c_cpp_args += tlsdesc_arg
break
endif
endforeach
endif
endif
@@ -1013,6 +1019,7 @@ endforeach
_attributes = [
'const', 'flatten', 'malloc', 'pure', 'unused', 'warn_unused_result',
'weak', 'format', 'packed', 'returns_nonnull', 'alias', 'noreturn',
'optimize',
]
foreach a : cc.get_supported_function_attributes(_attributes)
pre_args += '-DHAVE_FUNC_ATTRIBUTE_@0@'.format(a.to_upper())
@@ -1753,7 +1760,6 @@ if with_clc
llvm_optional_modules += ['all-targets', 'windowsdriver', 'frontendhlsl', 'frontenddriver']
endif
draw_with_llvm = get_option('draw-use-llvm')
llvm_with_orcjit = get_option('llvm-orcjit')
if draw_with_llvm
llvm_modules += 'native'
# lto is needed with LLVM>=15, but we don't know what LLVM version we are using yet
@@ -1761,6 +1767,12 @@ if draw_with_llvm
endif
amd_with_llvm = get_option('amd-use-llvm')
# MCJIT is deprecated in LLVM and will not accept new architecture ports,
# so any architecture not in the exhaustive list will have to rely on LLVM
# ORCJIT for llvmpipe functionality.
llvm_has_mcjit = host_machine.cpu_family() in ['aarch64', 'arm', 'ppc', 'ppc64', 's390x', 'x86', 'x86_64']
llvm_with_orcjit = get_option('llvm-orcjit') or not llvm_has_mcjit
if with_amd_vk or with_gallium_radeonsi or with_clc or llvm_with_orcjit
_llvm_version = '>= 15.0.0'
elif with_gallium_clover
@@ -1797,8 +1809,8 @@ if with_llvm
pre_args += '-DMESA_LLVM_VERSION_STRING="@0@"'.format(dep_llvm.version())
pre_args += '-DLLVM_IS_SHARED=@0@'.format(_shared_llvm.to_int())
if with_swrast_vk and not draw_with_llvm
error('Lavapipe requires LLVM draw support.')
if (with_swrast_vk or with_gallium_llvmpipe) and not draw_with_llvm
error('Lavapipe and llvmpipe require LLVM draw support.')
endif
if with_gallium_r600 and not amd_with_llvm

@@ -65,6 +65,14 @@ option(
description : 'Location to install dri drivers. Default: $libdir/dri.'
)
option(
'unversion-libgallium',
type : 'boolean',
value : false,
description : 'Do not include mesa version in libgallium DSO filename. ' +
'Do not enable unless you know what you are doing. Default: false'
)
option(
'dri-search-path',
type : 'string',
@@ -436,7 +444,10 @@ option (
'llvm-orcjit',
type : 'boolean',
value : false,
description: 'Build llvmpipe with LLVM ORCJIT support.'
description: 'Build llvmpipe with LLVM ORCJIT support. Has no effect when ' +
'building for architectures without LLVM MCJIT support -- ' +
'ORCJIT is the only choice on such architectures and will ' +
'always be enabled.'
)
option(

@@ -3015,7 +3015,7 @@ static bool gfx12_compute_hiz_his_info(struct ac_addrlib *addrlib, const struct
{
assert(surf_in->flags.depth != surf_in->flags.stencil);
if (surf->flags & RADEON_SURF_NO_HTILE)
if (surf->flags & RADEON_SURF_NO_HTILE || (info->gfx_level == GFX12 && info->chip_rev == 0))
return true;
ADDR3_COMPUTE_SURFACE_INFO_OUTPUT out = {0};

@@ -199,6 +199,21 @@ process_live_temps_per_block(live_ctx& ctx, Block* block)
}
}
if (ctx.program->gfx_level >= GFX10 && insn->isVALU() &&
insn->definitions.back().regClass() == s2) {
/* RDNA2 ISA doc, 6.2.4. Wave64 Destination Restrictions:
* The first pass of a wave64 VALU instruction may not overwrite a scalar value used by
* the second half.
*/
bool carry_in = insn->opcode == aco_opcode::v_addc_co_u32 ||
insn->opcode == aco_opcode::v_subb_co_u32 ||
insn->opcode == aco_opcode::v_subbrev_co_u32;
for (unsigned op_idx = 0; op_idx < (carry_in ? 2 : insn->operands.size()); op_idx++) {
if (insn->operands[op_idx].isOfType(RegType::sgpr))
insn->operands[op_idx].setLateKill(true);
}
}
/* we need to do this in a separate loop because the next one can
* setKill() for several operands at once and we don't want to
* overwrite that in a later iteration */

@@ -2398,6 +2398,24 @@ ast_function_expression::hir(exec_list *instructions,
ir_rvalue *result = convert_component(ir, desired_type);
/* If the bindless packing constructors are used directly as function
* params to builtin functions the compiler doesn't know what to do
* with them. To avoid this make sure we always copy the results from
* the pack to a temp first.
*/
if (result->as_expression() &&
result->as_expression()->operation == ir_unop_pack_sampler_2x32) {
ir_variable *var =
new(ctx) ir_variable(desired_type, "sampler_ctor",
ir_var_temporary);
instructions->push_tail(var);
ir_dereference *lhs = new(ctx) ir_dereference_variable(var);
ir_instruction *assignment = new(ctx) ir_assignment(lhs, result);
instructions->push_tail(assignment);
result = lhs;
}
/* Attempt to convert the parameter to a constant valued expression.
* After doing so, track whether or not all the parameters to the
* constructor are trivially constant valued expressions.

@@ -683,24 +683,13 @@ lower_ufind_msb64(nir_builder *b, nir_def *x)
nir_def *lo_count = nir_ufind_msb(b, x_lo);
nir_def *hi_count = nir_ufind_msb(b, x_hi);
if (b->shader->options->lower_uadd_sat) {
nir_def *valid_hi_bits = nir_ine_imm(b, x_hi, 0);
nir_def *hi_res = nir_iadd_imm(b, hi_count, 32);
return nir_bcsel(b, valid_hi_bits, hi_res, lo_count);
} else {
/* If hi_count was -1, it will still be -1 after this uadd_sat. As a
* result, hi_count is either -1 or the correct return value for 64-bit
* ufind_msb.
*/
nir_def *hi_res = nir_uadd_sat(b, nir_imm_intN_t(b, 32, 32), hi_count);
/* hi_res is either -1 or a value in the range [63, 32]. lo_count is
* either -1 or a value in the range [31, 0]. The imax will pick
* lo_count only when hi_res is -1. In those cases, lo_count is
* guaranteed to be the correct answer.
*/
return nir_imax(b, hi_res, lo_count);
}
/* hi_count is either -1 or a value in the range [31, 0]. lo_count is
* the same. The imax will pick lo_count only when hi_count is -1. In those
* cases, lo_count is guaranteed to be the correct answer.
* The ior 32 is always safe here as with -1 the value won't change,
* otherwise it adds 32, which is what we want anyway.
*/
return nir_imax(b, lo_count, nir_ior_imm(b, hi_count, 32));
}
static nir_def *
@@ -713,11 +702,9 @@ lower_find_lsb64(nir_builder *b, nir_def *x)
/* Use umin so that -1 (no bits found) becomes larger (0xFFFFFFFF)
* than any actual bit position, so we return a found bit instead.
* This is similar to the ufind_msb lowering. If you need this lowering
* without uadd_sat, add code like in lower_ufind_msb64.
* This is similar to the ufind_msb lowering.
*/
assert(!b->shader->options->lower_uadd_sat);
return nir_umin(b, lo_lsb, nir_uadd_sat(b, hi_lsb, nir_imm_int(b, 32)));
return nir_umin(b, lo_lsb, nir_ior_imm(b, hi_lsb, 32));
}
static nir_def *
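
The `ior 32` trick used in both 64-bit lowerings above can be checked on plain integers. The following is a minimal sketch, not Mesa code: the helper names (`ufind_msb32`, `find_lsb32`, `ufind_msb64_sketch`, `find_lsb64_sketch`) are made up for illustration. It relies on the fact stated in the comment: OR-ing 32 into -1 leaves -1 unchanged, while OR-ing 32 into a value in [0, 31] is the same as adding 32.

```c
#include <assert.h>
#include <stdint.h>

/* Reference 32-bit helpers (hypothetical names): return -1 when no bit is set. */
static int32_t ufind_msb32(uint32_t v)
{
   for (int32_t i = 31; i >= 0; i--) {
      if (v & (UINT32_C(1) << i))
         return i;
   }
   return -1;
}

static int32_t find_lsb32(uint32_t v)
{
   for (int32_t i = 0; i < 32; i++) {
      if (v & (UINT32_C(1) << i))
         return i;
   }
   return -1;
}

/* imax(lo_count, hi_count | 32): -1 | 32 stays -1, [0, 31] | 32 becomes [32, 63]. */
static int32_t ufind_msb64_sketch(uint64_t x)
{
   int32_t lo_count = ufind_msb32((uint32_t)x);
   int32_t hi_count = ufind_msb32((uint32_t)(x >> 32));
   int32_t hi_res = (int32_t)((uint32_t)hi_count | 32u);
   return lo_count > hi_res ? lo_count : hi_res;
}

/* umin(lo_lsb, hi_lsb | 32): as unsigned, -1 is 0xFFFFFFFF and loses to any real bit. */
static int32_t find_lsb64_sketch(uint64_t x)
{
   uint32_t lo_lsb = (uint32_t)find_lsb32((uint32_t)x);
   uint32_t hi_lsb = (uint32_t)find_lsb32((uint32_t)(x >> 32)) | 32u;
   return (int32_t)(lo_lsb < hi_lsb ? lo_lsb : hi_lsb);
}
```

Because neither function needs a saturating add, the same code path works whether or not the backend lowers `uadd_sat`, which is presumably why the `lower_uadd_sat` special case could be dropped.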

@@ -2976,8 +2976,10 @@ dri2_initialize_wayland_swrast(_EGLDisplay *disp)
dri2_dpy->formats.num_formats))
goto cleanup;
if (disp->Options.Zink)
dri2_initialize_wayland_drm_extensions(dri2_dpy);
if (disp->Options.Zink) {
if (!dri2_initialize_wayland_drm_extensions(dri2_dpy) && !disp->Options.ForceSoftware)
goto cleanup;
}
dri2_dpy->driver_name = strdup(disp->Options.Zink ? "zink" : "swrast");
if (!dri2_load_driver_swrast(disp))

@@ -1778,7 +1778,7 @@ dri2_initialize_x11_swrast(_EGLDisplay *disp)
if (disp->Options.Zink &&
!debug_get_bool_option("LIBGL_DRI3_DISABLE", false) &&
!debug_get_bool_option("LIBGL_KOPPER_DRI2", false))
dri3_x11_connect(dri2_dpy, disp->Options.ForceSoftware);
dri3_x11_connect(dri2_dpy, disp->Options.Zink, disp->Options.ForceSoftware);
#endif
if (!dri2_load_driver_swrast(disp))
goto cleanup;
@@ -1863,7 +1863,7 @@ dri2_initialize_x11_dri3(_EGLDisplay *disp)
if (!dri2_get_xcb_connection(disp, dri2_dpy))
goto cleanup;
status = dri3_x11_connect(dri2_dpy, disp->Options.ForceSoftware);
status = dri3_x11_connect(dri2_dpy, disp->Options.Zink, disp->Options.ForceSoftware);
if (status != DRI2_EGL_DRIVER_LOADED)
goto cleanup;

@@ -527,7 +527,7 @@ struct dri2_egl_display_vtbl dri3_x11_display_vtbl = {
};
enum dri2_egl_driver_fail
dri3_x11_connect(struct dri2_egl_display *dri2_dpy, bool swrast)
dri3_x11_connect(struct dri2_egl_display *dri2_dpy, bool zink, bool swrast)
{
dri2_dpy->fd_render_gpu =
loader_dri3_open(dri2_dpy->conn, dri2_dpy->screen->root, 0);
@@ -549,15 +549,16 @@ dri3_x11_connect(struct dri2_egl_display *dri2_dpy, bool swrast)
if (!dri2_dpy->driver_name)
dri2_dpy->driver_name = loader_get_driver_for_fd(dri2_dpy->fd_render_gpu);
if (!strcmp(dri2_dpy->driver_name, "zink") &&
!debug_get_bool_option("LIBGL_KOPPER_DISABLE", false)) {
if (!zink && !strcmp(dri2_dpy->driver_name, "zink")) {
close(dri2_dpy->fd_render_gpu);
dri2_dpy->fd_render_gpu = -1;
return DRI2_EGL_DRIVER_PREFER_ZINK;
}
if (!dri2_dpy->driver_name) {
_eglLog(_EGL_WARNING, "DRI3: No driver found");
close(dri2_dpy->fd_render_gpu);
dri2_dpy->fd_render_gpu = -1;
return DRI2_EGL_DRIVER_FAILED;
}

@@ -36,6 +36,6 @@ extern const __DRIimageLoaderExtension dri3_image_loader_extension;
extern struct dri2_egl_display_vtbl dri3_x11_display_vtbl;
enum dri2_egl_driver_fail
dri3_x11_connect(struct dri2_egl_display *dri2_dpy, bool swrast);
dri3_x11_connect(struct dri2_egl_display *dri2_dpy, bool zink, bool swrast);
#endif

@@ -24,9 +24,12 @@ tu_wsi_proc_addr(VkPhysicalDevice physicalDevice, const char *pName)
static bool
tu_wsi_can_present_on_device(VkPhysicalDevice physicalDevice, int fd)
{
#ifdef HAVE_LIBDRM
VK_FROM_HANDLE(tu_physical_device, pdevice, physicalDevice);
return wsi_common_drm_devices_equal(fd, pdevice->local_fd);
#else
return true;
#endif
}
VkResult

@@ -10,6 +10,7 @@
#include <string>
#include <vector>
#include <mutex>
#include <cstdlib>
#include "lp_bld.h"
#include "lp_bld_debug.h"
#include "lp_bld_init.h"
@@ -57,7 +58,7 @@
/* conflict with ObjectLinkingLayer.h */
#include "util/u_memory.h"
#if DETECT_ARCH_RISCV64 == 1 || DETECT_ARCH_RISCV32 == 1 || (defined(_WIN32) && LLVM_VERSION_MAJOR >= 15)
#if DETECT_ARCH_RISCV64 == 1 || DETECT_ARCH_RISCV32 == 1 || DETECT_ARCH_LOONGARCH64 == 1 || (defined(_WIN32) && LLVM_VERSION_MAJOR >= 15)
/* use ObjectLinkingLayer (JITLINK backend) */
#define USE_JITLINK
#endif
@@ -102,6 +103,8 @@ public:
class LPJit;
void lpjit_exit();
class LLVMEnsureMultithreaded {
public:
LLVMEnsureMultithreaded()
@@ -270,15 +273,19 @@ private:
LPJit(const LPJit&) = delete;
LPJit& operator=(const LPJit&) = delete;
friend void lpjit_exit();
static void init_native_targets();
llvm::orc::JITTargetMachineBuilder create_jtdb();
static void init_lpjit() {
jit = new LPJit;
std::atexit(lpjit_exit);
}
static LPJit* jit;
std::unique_ptr<llvm::orc::LLJIT> lljit;
std::unique_ptr<llvm::TargetMachine> tm_unique;
/* avoid name conflict */
unsigned jit_dylib_count;
@@ -292,6 +299,11 @@ private:
LPJit* LPJit::jit = NULL;
void lpjit_exit()
{
delete LPJit::jit;
}
LLVMErrorRef module_transform(void *Ctx, LLVMModuleRef mod) {
struct lp_passmgr *mgr;
@@ -318,7 +330,8 @@ LPJit::LPJit() :jit_dylib_count(0) {
init_native_targets();
JITTargetMachineBuilder JTMB = create_jtdb();
tm = wrap(ExitOnErr(JTMB.createTargetMachine()).release());
tm_unique = ExitOnErr(JTMB.createTargetMachine());
tm = wrap(tm_unique.get());
/* Create an LLJIT instance with an ObjectLinkingLayer (JITLINK)
* or RuntimeDyld as the base layer.
@@ -410,6 +423,14 @@ llvm::orc::JITTargetMachineBuilder LPJit::create_jtdb() {
#else
#error "GALLIVM: unknown target riscv float abi"
#endif
#endif
#if DETECT_ARCH_LOONGARCH64 == 1
#if defined(__loongarch_lp64) && defined(__loongarch_double_float)
options.MCOptions.ABIName = "lp64d";
#else
#error "GALLIVM: unknown target loongarch float abi"
#endif
#endif
JTMB.setOptions(options);

@@ -414,6 +414,24 @@ lp_build_fill_mattrs(std::vector<std::string> &MAttrs)
*/
MAttrs = {"+m","+c","+a","+d","+f"};
#endif
#if DETECT_ARCH_LOONGARCH64 == 1
/*
* TODO: Implement util_get_cpu_caps()
*
* No FPU-less LoongArch64 systems have shipped so far, and LP64D is
* the default ABI, so FPU is enabled here.
*
* The Software development convention defaults to have "128-bit
* vector", so LSX is enabled here, see
* https://github.com/loongson/la-softdev-convention/releases/download/v0.1/la-softdev-convention.pdf
*/
MAttrs = {"+f","+d"};
#if LLVM_VERSION_MAJOR == 17
/* LLVM 17's LSX support is incomplete, so explicitly mask it */
MAttrs.push_back("-lsx");
#endif
#endif
}
void

@@ -3848,7 +3848,7 @@ atomic_emit(
LLVMValueRef atom_res = lp_build_alloca(gallivm,
uint_bld->vec_type, "");
LLVMValueRef ssbo_limit;
LLVMValueRef ssbo_limit = NULL;
if (!is_shared) {
ssbo_limit = LLVMBuildAShr(gallivm->builder, bld->ssbo_sizes[buf], lp_build_const_int32(gallivm, 2), "");
ssbo_limit = lp_build_broadcast_scalar(uint_bld, ssbo_limit);

@@ -1090,6 +1090,7 @@ d3d12_video_encoder_convert_profile_to_d3d12_enc_profile_h264(enum pipe_video_pr
{
switch (profile) {
case PIPE_VIDEO_PROFILE_MPEG4_AVC_CONSTRAINED_BASELINE:
case PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE:
case PIPE_VIDEO_PROFILE_MPEG4_AVC_MAIN:
{
return D3D12_VIDEO_ENCODER_PROFILE_H264_MAIN;

@@ -873,6 +873,7 @@ d3d12_has_video_encode_support(struct pipe_screen *pscreen,
switch (profile) {
#if VIDEO_CODEC_H264ENC
case PIPE_VIDEO_PROFILE_MPEG4_AVC_CONSTRAINED_BASELINE:
case PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE:
case PIPE_VIDEO_PROFILE_MPEG4_AVC_MAIN:
case PIPE_VIDEO_PROFILE_MPEG4_AVC_HIGH:
case PIPE_VIDEO_PROFILE_MPEG4_AVC_HIGH10:

@@ -47,6 +47,7 @@
#include "pan_cmdstream.h"
#include "pan_context.h"
#include "pan_csf.h"
#include "pan_format.h"
#include "pan_indirect_dispatch.h"
#include "pan_jm.h"
#include "pan_job.h"
@@ -195,7 +196,12 @@ panfrost_create_sampler_state(struct pipe_context *pctx,
* swizzle derived from the format, to allow more formats than the
* hardware otherwise supports. When packing border colours, we need to
* undo this bijection, by swizzling with its inverse.
* On v10+, watch out for depth+stencil formats, because those have a
* swizzle that doesn't really apply to the border color
*/
#if PAN_ARCH >= 10
if (!util_format_is_depth_and_stencil(cso->border_color_format)) {
#endif
unsigned mali_format =
GENX(panfrost_format_from_pipe_format)(cso->border_color_format)->hw;
enum mali_rgb_component_order order = mali_format & BITFIELD_MASK(12);
@@ -207,6 +213,10 @@ panfrost_create_sampler_state(struct pipe_context *pctx,
util_format_apply_color_swizzle(&so->base.border_color, &cso->border_color,
inverted_swizzle,
false /* is_integer (irrelevant) */);
#if PAN_ARCH >= 10
}
#endif
#endif
bool using_nearest = cso->min_img_filter == PIPE_TEX_MIPFILTER_NEAREST;
@@ -378,6 +388,17 @@ panfrost_emit_blend(struct panfrost_batch *batch, void *rts,
panfrost_dithered_format_from_pipe_format)(format, dithered);
cfg.fixed_function.rt = i;
#if PAN_ARCH >= 7
if (cfg.mode == MALI_BLEND_MODE_FIXED_FUNCTION &&
(cfg.fixed_function.conversion.memory_format & 0xff) ==
MALI_RGB_COMPONENT_ORDER_RGB1) {
/* fixed function does not like RGB1 as the component order */
/* force this field to be the default 0 (RGBA) */
cfg.fixed_function.conversion.memory_format &= ~0xff;
cfg.fixed_function.conversion.memory_format |=
MALI_RGB_COMPONENT_ORDER_RGBA;
}
#endif
#if PAN_ARCH <= 7
if (!info.opaque) {
cfg.fixed_function.alpha_zero_nop = info.alpha_zero_nop;

@@ -208,7 +208,7 @@ panfrost_get_param(struct pipe_screen *screen, enum pipe_cap param)
* handles this but we need to fix up the border colour.
*/
case PIPE_CAP_TEXTURE_BORDER_COLOR_QUIRK:
if (dev->arch == 7)
if (dev->arch == 7 || dev->arch >= 10)
return PIPE_QUIRK_TEXTURE_BORDER_COLOR_SWIZZLE_FREEDRENO;
else
return 0;

@@ -627,7 +627,10 @@ static int peephole_mad_presub_bias(
if (rc_inline_to_float(src1_reg.Index) != 2.0f)
return 0;
} else {
struct rc_constant *constant = &c->Program.Constants.Constants[src1_reg.Index];
if (src1_reg.File != RC_FILE_CONSTANT)
return 0;
struct rc_constant *constant = &c->Program.Constants.Constants[src1_reg.Index];
if (constant->Type != RC_CONSTANT_IMMEDIATE)
return 0;
for (i = 0; i < 4; i++) {

@@ -2924,8 +2924,14 @@ begin_rendering(struct zink_context *ctx, bool check_msaa_expand)
if (has_swapchain) {
ASSERTED struct zink_resource *res = zink_resource(ctx->fb_state.cbufs[0]->texture);
zink_render_fixup_swapchain(ctx);
if (res->use_damage)
if (res->use_damage) {
ctx->dynamic_fb.info.renderArea = res->damage;
} else {
ctx->dynamic_fb.info.renderArea.offset.x = 0;
ctx->dynamic_fb.info.renderArea.offset.y = 0;
ctx->dynamic_fb.info.renderArea.extent.width = ctx->fb_state.width;
ctx->dynamic_fb.info.renderArea.extent.height = ctx->fb_state.height;
}
/* clamp for late swapchain resize */
if (res->base.b.width0 < ctx->dynamic_fb.info.renderArea.extent.width)
ctx->dynamic_fb.info.renderArea.extent.width = res->base.b.width0;

@@ -416,7 +416,7 @@ init_program_db(struct zink_screen *screen, struct zink_program *pg, enum zink_d
{
VkDeviceSize val;
VKSCR(GetDescriptorSetLayoutSizeEXT)(screen->dev, dsl, &val);
pg->dd.db_size[type] = val;
pg->dd.db_size[type] = align64(val, screen->info.db_props.descriptorBufferOffsetAlignment);
pg->dd.db_offset[type] = rzalloc_array(pg, uint32_t, num_bindings);
for (unsigned i = 0; i < num_bindings; i++) {
VKSCR(GetDescriptorSetLayoutBindingOffsetEXT)(screen->dev, dsl, bindings[i].binding, &val);
@@ -740,7 +740,7 @@ zink_descriptor_shader_init(struct zink_screen *screen, struct zink_shader *shad
shader->precompile.num_bindings = num_bindings;
VkDeviceSize val;
VKSCR(GetDescriptorSetLayoutSizeEXT)(screen->dev, shader->precompile.dsl, &val);
shader->precompile.db_size = val;
shader->precompile.db_size = align64(val, screen->info.db_props.descriptorBufferOffsetAlignment);
shader->precompile.db_offset = rzalloc_array(shader, uint32_t, num_bindings);
for (unsigned i = 0; i < num_bindings; i++) {
VKSCR(GetDescriptorSetLayoutBindingOffsetEXT)(screen->dev, shader->precompile.dsl, bindings[i].binding, &val);
@@ -1146,6 +1146,7 @@ update_separable(struct zink_context *ctx, struct zink_program *pg)
}
bs->dd.cur_db_offset[use_buffer] = bs->dd.db_offset;
bs->dd.db_offset += zs->precompile.db_size;
/* TODO: maybe compile multiple variants for different set counts for compact mode? */
int set_idx = screen->info.have_EXT_shader_object ? j : j == MESA_SHADER_FRAGMENT;
VKCTX(CmdSetDescriptorBufferOffsetsEXT)(bs->cmdbuf, VK_PIPELINE_BIND_POINT_GRAPHICS, pg->layout, set_idx, 1, &use_buffer, &offset);
@@ -1633,7 +1634,7 @@ zink_descriptors_init(struct zink_context *ctx)
VkDeviceSize val;
for (unsigned i = 0; i < 2; i++) {
VKSCR(GetDescriptorSetLayoutSizeEXT)(screen->dev, ctx->dd.push_dsl[i]->layout, &val);
ctx->dd.db_size[i] = val;
ctx->dd.db_size[i] = align64(val, screen->info.db_props.descriptorBufferOffsetAlignment);
}
for (unsigned i = 0; i < ZINK_GFX_SHADER_COUNT; i++) {
VKSCR(GetDescriptorSetLayoutBindingOffsetEXT)(screen->dev, ctx->dd.push_dsl[0]->layout, i, &val);
@@ -1709,7 +1710,7 @@ zink_descriptor_util_init_fbfetch(struct zink_context *ctx)
if (zink_descriptor_mode == ZINK_DESCRIPTOR_MODE_DB) {
VkDeviceSize val;
VKSCR(GetDescriptorSetLayoutSizeEXT)(screen->dev, ctx->dd.push_dsl[0]->layout, &val);
ctx->dd.db_size[0] = val;
ctx->dd.db_size[0] = align64(val, screen->info.db_props.descriptorBufferOffsetAlignment);
for (unsigned i = 0; i < ARRAY_SIZE(ctx->dd.db_offset); i++) {
VKSCR(GetDescriptorSetLayoutBindingOffsetEXT)(screen->dev, ctx->dd.push_dsl[0]->layout, i, &val);
ctx->dd.db_offset[i] = val;
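
The hunks above round every descriptor set layout size up to `descriptorBufferOffsetAlignment` before it is used to advance a buffer offset. A minimal sketch of why that matters, assuming a power-of-two alignment; `align_up_pot` and `next_set_offset` are hypothetical names standing in for `align64()` and the `db_offset += db_size` bookkeeping:

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of a power-of-two alignment. */
static uint64_t align_up_pot(uint64_t value, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Advance a running descriptor-buffer offset by one (aligned) set size. */
static uint64_t next_set_offset(uint64_t cur, uint64_t layout_size, uint64_t alignment)
{
   return cur + align_up_pot(layout_size, alignment);
}
```

With an alignment of 64 and raw layout sizes of 72 and 8 bytes, the unaligned running offsets would be 72 and 80; with the rounding they are 128 and 192, so every offset handed to the driver stays a multiple of the alignment.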

@@ -887,6 +887,8 @@ zink_kopper_present_queue(struct zink_screen *screen, struct zink_resource *res,
kopper_present(cpi, screen, -1);
}
res->obj->indefinite_acquire = false;
res->use_damage = false;
memset(&res->damage, 0, sizeof(res->damage));
cdt->swapchain->images[res->obj->dt_idx].acquired = NULL;
res->obj->dt_idx = UINT32_MAX;
}

@@ -1542,10 +1542,25 @@ zink_set_damage_region(struct pipe_screen *pscreen, struct pipe_resource *pres,
for (unsigned i = 0; i < nrects; i++) {
int y = pres->height0 - rects[i].y - rects[i].height;
res->damage.extent.width = MAX2(res->damage.extent.width, rects[i].x + rects[i].width);
res->damage.extent.height = MAX2(res->damage.extent.height, y + rects[i].height);
res->damage.offset.x = MIN2(res->damage.offset.x, rects[i].x);
res->damage.offset.y = MIN2(res->damage.offset.y, y);
/* convert back to coord-based rects to use coordinate calcs */
struct u_rect currect = {
.x0 = res->damage.offset.x,
.y0 = res->damage.offset.y,
.x1 = res->damage.offset.x + res->damage.extent.width,
.y1 = res->damage.offset.y + res->damage.extent.height,
};
struct u_rect newrect = {
.x0 = rects[i].x,
.y0 = y,
.x1 = rects[i].x + rects[i].width,
.y1 = y + rects[i].height,
};
struct u_rect u;
u_rect_union(&u, &currect, &newrect);
res->damage.extent.width = u.x1 - u.x0;
res->damage.extent.height = u.y1 - u.y0;
res->damage.offset.x = u.x0;
res->damage.offset.y = u.y0;
}
res->use_damage = nrects > 0;
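
The damage accumulation above converts offset/extent rectangles to corner coordinates, takes the union, and converts back. A minimal self-contained sketch of that round trip follows; the struct and function names (`rect_oe`, `rect_c`, `flip_y`, `damage_union`) are hypothetical and only mirror the shape of `VkRect2D` and `u_rect`. The width comes from the x span and the height from the y span.

```c
#include <assert.h>
#include <stdint.h>

struct rect_oe { int32_t x, y; uint32_t width, height; }; /* offset + extent */
struct rect_c  { int x0, y0, x1, y1; };                   /* corner coordinates */

static int min_i(int a, int b) { return a < b ? a : b; }
static int max_i(int a, int b) { return a > b ? a : b; }

/* Flip a rectangle's y origin between bottom-left and top-left conventions. */
static int flip_y(int surface_height, int y, int height)
{
   return surface_height - y - height;
}

/* Union of two offset/extent rectangles, computed in corner coordinates. */
static struct rect_oe damage_union(struct rect_oe cur, struct rect_oe add)
{
   struct rect_c a = { cur.x, cur.y, cur.x + (int)cur.width, cur.y + (int)cur.height };
   struct rect_c b = { add.x, add.y, add.x + (int)add.width, add.y + (int)add.height };
   struct rect_c u = {
      min_i(a.x0, b.x0), min_i(a.y0, b.y0),
      max_i(a.x1, b.x1), max_i(a.y1, b.y1),
   };
   struct rect_oe out = {
      .x = u.x0,
      .y = u.y0,
      .width = (uint32_t)(u.x1 - u.x0),
      .height = (uint32_t)(u.y1 - u.y0),
   };
   return out;
}
```

Taking `MAX2` of extents and `MIN2` of offsets independently, as the removed lines did, mixes absolute coordinates with sizes; going through corner coordinates avoids that.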

@@ -2346,7 +2346,7 @@ dri_swrast_kms_init_screen(struct dri_screen *screen, bool driver_name_is_inferr
#endif
if (!pscreen)
goto fail;
return NULL;
dri_init_options(screen);
dri2_init_screen_extensions(screen, pscreen, true);
@@ -2364,7 +2364,7 @@ dri_swrast_kms_init_screen(struct dri_screen *screen, bool driver_name_is_inferr
return configs;
fail:
dri_release_screen(screen);
pipe_loader_release(&screen->dev, 1);
#endif // HAVE_SWRAST
return NULL;

@@ -35,7 +35,7 @@ pub static DISPATCH: cl_icd_dispatch = cl_icd_dispatch {
clRetainCommandQueue: Some(clRetainCommandQueue),
clReleaseCommandQueue: Some(clReleaseCommandQueue),
clGetCommandQueueInfo: Some(clGetCommandQueueInfo),
clSetCommandQueueProperty: None,
clSetCommandQueueProperty: Some(clSetCommandQueueProperty),
clCreateBuffer: Some(clCreateBuffer),
clCreateImage2D: Some(clCreateImage2D),
clCreateImage3D: Some(clCreateImage3D),

@@ -367,7 +367,14 @@ fn set_kernel_arg(
return Err(CL_INVALID_ARG_SIZE);
}
}
_ => {
KernelArgType::Sampler => {
if arg_size != std::mem::size_of::<cl_sampler>() {
return Err(CL_INVALID_ARG_SIZE);
}
}
KernelArgType::Constant => {
if arg.size != arg_size {
return Err(CL_INVALID_ARG_SIZE);
}

@@ -2192,13 +2192,20 @@ fn enqueue_unmap_mem_object(
// SAFETY: it's required that applications do not cause data races
let mapped_ptr = unsafe { MutMemoryPtr::from_ptr(mapped_ptr) };
let needs_sync = m.unmap(mapped_ptr)?;
create_and_queue(
q,
CL_COMMAND_UNMAP_MEM_OBJECT,
evs,
event,
false,
Box::new(move |q, ctx| m.unmap(q, ctx, mapped_ptr)),
Box::new(move |q, ctx| {
if needs_sync {
m.sync_unmap(q, ctx, mapped_ptr)
} else {
Ok(())
}
}),
)
}

@@ -41,6 +41,22 @@ impl CLInfo<cl_command_queue_info> for cl_command_queue {
}
}
#[cl_entrypoint(clSetCommandQueueProperty)]
fn set_command_queue_property(
_command_queue: cl_command_queue,
_properties: cl_command_queue_properties,
_enable: cl_bool,
_old_properties: *mut cl_command_queue_properties,
) -> CLResult<()> {
// clSetCommandQueueProperty may unconditionally return an error if no devices in the context
// associated with command_queue support modifying the properties of a command-queue. Support
// for modifying the properties of a command-queue is required only for OpenCL 1.0 devices.
//
// CL_INVALID_OPERATION if no devices in the context associated with command_queue support
// modifying the properties of a command-queue.
Err(CL_INVALID_OPERATION)
}
fn valid_command_queue_properties(properties: cl_command_queue_properties) -> bool {
let valid_flags = cl_bitfield::from(
CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE

@@ -39,6 +39,8 @@ struct Mapping<T> {
layout: Layout,
writes: bool,
ptr: Option<MutMemoryPtr>,
/// Reference count from the API perspective. Once it reaches 0, we need to write back the
/// mapping's content to the GPU resource.
count: u32,
inner: T,
}
@@ -152,10 +154,17 @@ impl Mem {
}
}
pub fn unmap(&self, q: &Queue, ctx: &PipeContext, ptr: MutMemoryPtr) -> CLResult<()> {
pub fn sync_unmap(&self, q: &Queue, ctx: &PipeContext, ptr: MutMemoryPtr) -> CLResult<()> {
match self {
Self::Buffer(b) => b.unmap(q, ctx, ptr),
Self::Image(i) => i.unmap(q, ctx, ptr),
Self::Buffer(b) => b.sync_unmap(q, ctx, ptr),
Self::Image(i) => i.sync_unmap(q, ctx, ptr),
}
}
pub fn unmap(&self, ptr: MutMemoryPtr) -> CLResult<bool> {
match self {
Self::Buffer(b) => b.unmap(ptr),
Self::Image(i) => i.unmap(ptr),
}
}
}
@@ -712,7 +721,9 @@ impl MemBase {
fn is_pure_user_memory(&self, d: &Device) -> CLResult<bool> {
let r = self.get_res_of_dev(d)?;
Ok(r.is_user())
// 1Dbuffer objects are weird. The parent memory object can be a host_ptr thing, but we are
// not allowed to actually return a pointer based on the host_ptr when mapping.
Ok(r.is_user() && !self.host_ptr().is_null())
}
fn map<T>(
@@ -912,7 +923,9 @@ impl Buffer {
}
fn is_mapped_ptr(&self, ptr: *mut c_void) -> bool {
self.maps.lock().unwrap().contains_key(ptr as usize)
let mut maps = self.maps.lock().unwrap();
let entry = maps.entry(ptr as usize);
matches!(entry, Entry::Occupied(entry) if entry.get().count > 0)
}
pub fn map(&self, size: usize, offset: usize, writes: bool) -> CLResult<MutMemoryPtr> {
@@ -993,6 +1006,31 @@ impl Buffer {
self.read(q, ctx, mapping.offset, ptr, mapping.size())
}
pub fn sync_unmap(&self, q: &Queue, ctx: &PipeContext, ptr: MutMemoryPtr) -> CLResult<()> {
// no need to update
if self.is_pure_user_memory(q.device)? {
return Ok(());
}
match self.maps.lock().unwrap().entry(ptr.as_ptr() as usize) {
Entry::Vacant(_) => Err(CL_INVALID_VALUE),
Entry::Occupied(entry) => {
let mapping = entry.get();
if mapping.writes {
self.write(q, ctx, mapping.offset, ptr.into(), mapping.size())?;
}
// only remove if the mapping wasn't reused in the meantime
if mapping.count == 0 {
entry.remove();
}
Ok(())
}
}
}
fn tx<'a>(
&self,
q: &Queue,
@@ -1014,22 +1052,16 @@ impl Buffer {
}
// TODO: only sync on unmap when the memory is not mapped for writing
pub fn unmap(&self, q: &Queue, ctx: &PipeContext, ptr: MutMemoryPtr) -> CLResult<()> {
let mapping = match self.maps.lock().unwrap().entry(ptr.as_ptr() as usize) {
Entry::Vacant(_) => return Err(CL_INVALID_VALUE),
pub fn unmap(&self, ptr: MutMemoryPtr) -> CLResult<bool> {
match self.maps.lock().unwrap().entry(ptr.as_ptr() as usize) {
Entry::Vacant(_) => Err(CL_INVALID_VALUE),
Entry::Occupied(mut entry) => {
entry.get_mut().count -= 1;
(entry.get().count == 0).then(|| entry.remove())
let entry = entry.get_mut();
debug_assert!(entry.count > 0);
entry.count -= 1;
Ok(entry.count == 0)
}
};
if let Some(mapping) = mapping {
if mapping.writes && !self.is_pure_user_memory(q.device)? {
self.write(q, ctx, mapping.offset, ptr.into(), mapping.size())?;
}
};
Ok(())
}
}
pub fn write(
@@ -1289,7 +1321,9 @@ impl Image {
}
fn is_mapped_ptr(&self, ptr: *mut c_void) -> bool {
self.maps.lock().unwrap().contains_key(ptr as usize)
let mut maps = self.maps.lock().unwrap();
let entry = maps.entry(ptr as usize);
matches!(entry, Entry::Occupied(entry) if entry.get().count > 0)
}
pub fn is_parent_buffer(&self) -> bool {
@@ -1309,8 +1343,33 @@ impl Image {
*row_pitch = self.image_desc.row_pitch()? as usize;
*slice_pitch = self.image_desc.slice_pitch();
let (offset, size) =
CLVec::calc_offset_size(origin, region, [pixel_size, *row_pitch, *slice_pitch]);
let offset = CLVec::calc_offset(origin, [pixel_size, *row_pitch, *slice_pitch]);
// From the CL Spec:
//
// The pointer returned maps a 1D, 2D or 3D region starting at origin and is at least
// region[0] pixels in size for a 1D image, 1D image buffer or 1D image array,
// (image_row_pitch × region[1]) pixels in size for a 2D image or 2D image array, and
// (image_slice_pitch × region[2]) pixels in size for a 3D image. The result of a memory
// access outside this region is undefined.
//
// It's not guaranteed that the row_pitch is taken into account for 1D images, but the CL
// CTS relies on this behavior.
//
// Also note, that the spec wording is wrong in regards to arrays, which need to take the
// image_slice_pitch into account.
let size = if self.image_desc.is_array() || self.image_desc.dims() == 3 {
debug_assert_ne!(*slice_pitch, 0);
// the slice count is in region[1] for 1D array images
if self.mem_type == CL_MEM_OBJECT_IMAGE1D_ARRAY {
region[1] * *slice_pitch
} else {
region[2] * *slice_pitch
}
} else {
debug_assert_ne!(*row_pitch, 0);
region[1] * *row_pitch
};
let layout;
unsafe {
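
The size rule described in the comment above can be written out on plain integers. This is a sketch under assumptions, not rusticl's code: `image_kind` and `mapped_region_size` are hypothetical names, and `region` follows the OpenCL convention where unused dimensions are 1 and a 1D array keeps its slice count in `region[1]`.

```c
#include <assert.h>
#include <stddef.h>

enum image_kind { IMAGE_1D, IMAGE_1D_ARRAY, IMAGE_2D, IMAGE_2D_ARRAY, IMAGE_3D };

/* Bytes covered by a mapped region, given the pitches reported to the caller. */
static size_t mapped_region_size(enum image_kind kind, const size_t region[3],
                                 size_t row_pitch, size_t slice_pitch)
{
   switch (kind) {
   case IMAGE_1D_ARRAY:
      /* the slice count lives in region[1] for 1D arrays */
      return region[1] * slice_pitch;
   case IMAGE_2D_ARRAY:
   case IMAGE_3D:
      return region[2] * slice_pitch;
   default:
      /* 1D and 2D images: whole rows, including padding up to the row pitch */
      return region[1] * row_pitch;
   }
}
```

For a 16x4 region of a 2D image with a 64-byte row pitch this gives 256 bytes, so the mapping covers full rows rather than only the tightly packed pixels.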
@@ -1418,6 +1477,41 @@ impl Image {
)
}
pub fn sync_unmap(&self, q: &Queue, ctx: &PipeContext, ptr: MutMemoryPtr) -> CLResult<()> {
// no need to update
if self.is_pure_user_memory(q.device)? {
return Ok(());
}
match self.maps.lock().unwrap().entry(ptr.as_ptr() as usize) {
Entry::Vacant(_) => Err(CL_INVALID_VALUE),
Entry::Occupied(entry) => {
let mapping = entry.get();
let row_pitch = self.image_desc.row_pitch()? as usize;
let slice_pitch = self.image_desc.slice_pitch();
if mapping.writes {
self.write(
ptr.into(),
q,
ctx,
&mapping.region,
row_pitch,
slice_pitch,
&mapping.origin,
)?;
}
// only remove if the mapping wasn't reused in the meantime
if mapping.count == 0 {
entry.remove();
}
Ok(())
}
}
}
fn tx_image<'a>(
&self,
q: &Queue,
@@ -1430,33 +1524,16 @@ impl Image {
}
// TODO: only sync on unmap when the memory is not mapped for writing
pub fn unmap(&self, q: &Queue, ctx: &PipeContext, ptr: MutMemoryPtr) -> CLResult<()> {
let mapping = match self.maps.lock().unwrap().entry(ptr.as_ptr() as usize) {
Entry::Vacant(_) => return Err(CL_INVALID_VALUE),
pub fn unmap(&self, ptr: MutMemoryPtr) -> CLResult<bool> {
match self.maps.lock().unwrap().entry(ptr.as_ptr() as usize) {
Entry::Vacant(_) => Err(CL_INVALID_VALUE),
Entry::Occupied(mut entry) => {
entry.get_mut().count -= 1;
(entry.get().count == 0).then(|| entry.remove())
}
};
let row_pitch = self.image_desc.row_pitch()? as usize;
let slice_pitch = self.image_desc.slice_pitch();
if let Some(mapping) = mapping {
if mapping.writes && !self.is_pure_user_memory(q.device)? {
self.write(
ptr.into(),
q,
ctx,
&mapping.region,
row_pitch,
slice_pitch,
&mapping.origin,
)?;
let entry = entry.get_mut();
debug_assert!(entry.count > 0);
entry.count -= 1;
Ok(entry.count == 0)
}
}
Ok(())
}
pub fn write(
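
The split between `unmap` (runs at API call time) and `sync_unmap` (runs later on the queue) can be modelled with a single counter. The sketch below restates that logic in C purely for illustration; `struct mapping`, `mapping_unmap`, and `mapping_sync_unmap` are hypothetical, and error handling is reduced to a boolean where the Rust code returns `CL_INVALID_VALUE` or asserts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct mapping {
   uint32_t count; /* outstanding maps from the API's point of view */
   bool writes;    /* mapped for writing, so contents must be written back */
   bool present;   /* entry still exists in the map table */
};

/* API-time unmap: drop one reference and report whether a write-back is due. */
static bool mapping_unmap(struct mapping *m, bool *needs_sync)
{
   if (!m->present || m->count == 0)
      return false;
   m->count--;
   *needs_sync = (m->count == 0);
   return true;
}

/* Queue-time sync: write back if needed, drop the entry only if nobody remapped it. */
static bool mapping_sync_unmap(struct mapping *m, unsigned *writebacks)
{
   if (!m->present)
      return false;
   if (m->writes)
      (*writebacks)++;
   if (m->count == 0)
      m->present = false;
   return true;
}
```

Keeping the entry alive until the queued sync runs is what lets a pointer that was unmapped and immediately mapped again survive: the count is back above zero by the time the sync executes, so the entry is not removed.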

@@ -22,8 +22,14 @@ if with_ld_dynamic_list
gallium_dri_link_depends += files('../dri.dyn')
endif
if get_option('unversion-libgallium') or with_platform_android
libgallium_name = 'gallium_dri'
else
libgallium_name = 'gallium-@0@'.format(meson.project_version())
endif
libgallium_dri = shared_library(
'gallium-@0@'.format(meson.project_version()),
libgallium_name,
files('dri_target.c'),
include_directories : [
inc_include, inc_src, inc_mapi, inc_mesa, inc_gallium, inc_gallium_aux, inc_util, inc_gallium_drivers,

@@ -593,7 +593,10 @@ blorp_emit_cc_viewport(struct blorp_batch *batch)
{
uint32_t cc_vp_offset;
if (batch->blorp->config.use_cached_dynamic_states) {
/* Somehow reusing CC_VIEWPORT on Gfx9 is causing issues :
* https://gitlab.freedesktop.org/mesa/mesa/-/issues/11647
*/
if (GFX_VER != 9 && batch->blorp->config.use_cached_dynamic_states) {
cc_vp_offset = blorp_get_dynamic_state(batch, BLORP_DYNAMIC_STATE_CC_VIEWPORT);
} else {
blorp_emit_dynamic(batch, GENX(CC_VIEWPORT), vp, 32, &cc_vp_offset) {

@@ -75,6 +75,9 @@ opt_saturate_propagation_local(fs_visitor &s, bblock_t *block)
!scan_inst->can_change_types()))
break;
if (scan_inst->flags_written(s.devinfo) != 0)
break;
if (scan_inst->saturate) {
inst->saturate = false;
progress = true;

@@ -24,6 +24,24 @@
#include "brw_nir_rt.h"
#include "brw_nir_rt_builder.h"
static nir_def *
nir_build_vec3_mat_mult_col_major(nir_builder *b, nir_def *vec,
nir_def *matrix[], bool translation)
{
nir_def *result_components[3] = {
nir_channel(b, matrix[3], 0),
nir_channel(b, matrix[3], 1),
nir_channel(b, matrix[3], 2),
};
for (unsigned i = 0; i < 3; ++i) {
for (unsigned j = 0; j < 3; ++j) {
nir_def *v = nir_fmul(b, nir_channels(b, vec, 1 << j), nir_channels(b, matrix[j], 1 << i));
result_components[i] = (translation || j) ? nir_fadd(b, result_components[i], v) : v;
}
}
return nir_vec(b, result_components, 3);
}
static nir_def *
build_leaf_is_procedural(nir_builder *b, struct brw_nir_rt_mem_hit_defs *hit)
{
@@ -163,11 +181,27 @@ lower_rt_intrinsics_impl(nir_function_impl *impl,
break;
case nir_intrinsic_load_ray_object_origin:
sysval = object_ray_in.orig;
if (stage == MESA_SHADER_CLOSEST_HIT) {
struct brw_nir_rt_bvh_instance_leaf_defs leaf;
brw_nir_rt_load_bvh_instance_leaf(b, &leaf, hit_in.inst_leaf_ptr);
sysval = nir_build_vec3_mat_mult_col_major(
b, world_ray_in.orig, leaf.world_to_object, true);
} else {
sysval = object_ray_in.orig;
}
break;
case nir_intrinsic_load_ray_object_direction:
sysval = object_ray_in.dir;
if (stage == MESA_SHADER_CLOSEST_HIT) {
struct brw_nir_rt_bvh_instance_leaf_defs leaf;
brw_nir_rt_load_bvh_instance_leaf(b, &leaf, hit_in.inst_leaf_ptr);
sysval = nir_build_vec3_mat_mult_col_major(
b, world_ray_in.dir, leaf.world_to_object, false);
} else {
sysval = object_ray_in.dir;
}
break;
case nir_intrinsic_load_ray_t_min:
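The arithmetic that `nir_build_vec3_mat_mult_col_major` emits as NIR instructions can be written out on plain floats. The sketch below assumes the same layout the helper indexes, namely `m[j]` is column `j` of the transform and `m[3]` is the translation column; the function name and the float types are illustrative only. With `translation` set the translation column is added (a point, such as the ray origin), without it only the 3x3 part applies (a direction).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* out[i] = (translation ? m[3][i] : 0) + sum over j of vec[j] * m[j][i] */
static void
vec3_mat_mult_col_major(const float vec[3], const float m[4][3],
                        bool translation, float out[3])
{
   assert(vec != NULL && m != NULL && out != NULL);

   for (unsigned i = 0; i < 3; ++i) {
      /* Start from the translation component, or from zero for directions. */
      float acc = translation ? m[3][i] : 0.0f;

      for (unsigned j = 0; j < 3; ++j)
         acc += vec[j] * m[j][i];

      out[i] = acc;
   }
}
```

For an identity rotation with translation column (1, 2, 3), the point (4, 5, 6) maps to (5, 7, 9), while the same vector treated as a direction stays (4, 5, 6).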

@@ -45,7 +45,8 @@ using namespace elk;
*/
static bool
opt_saturate_propagation_local(const fs_live_variables &live, elk_bblock_t *block)
opt_saturate_propagation_local(const intel_device_info *devinfo,
const fs_live_variables &live, elk_bblock_t *block)
{
bool progress = false;
int ip = block->end_ip + 1;
@@ -74,6 +75,16 @@ opt_saturate_propagation_local(const fs_live_variables &live, elk_bblock_t *bloc
!scan_inst->can_change_types()))
break;
/* min and max pseudo ops modify the flags on Gfx4 and Gfx5, but
* it's not based on the result of the operation. This is the one
* case where it is always safe to propagate a saturate to an
* instruction that writes the flags.
*/
if (scan_inst->flags_written(devinfo) != 0 &&
scan_inst->opcode != ELK_OPCODE_SEL) {
break;
}
if (scan_inst->saturate) {
inst->saturate = false;
progress = true;
@@ -156,7 +167,7 @@ elk_fs_visitor::opt_saturate_propagation()
bool progress = false;
foreach_block (block, cfg) {
progress = opt_saturate_propagation_local(live, block) || progress;
progress = opt_saturate_propagation_local(devinfo, live, block) || progress;
}
/* Live intervals are still valid. */

@@ -2023,15 +2023,15 @@ intel_device_info_wa_stepping(struct intel_device_info *devinfo)
uint32_t
intel_device_info_get_max_slm_size(const struct intel_device_info *devinfo)
{
uint32_t k_bytes = 0;
uint32_t bytes = 0;
if (devinfo->verx10 >= 200) {
k_bytes = intel_device_info_get_max_preferred_slm_size(devinfo);
bytes = intel_device_info_get_max_preferred_slm_size(devinfo);
} else {
k_bytes = 64;
bytes = 64 * 1024;
}
return k_bytes * 1024;
return bytes;
}
uint32_t

@@ -1825,7 +1825,13 @@ cmd_buffer_gfx_state_emission(struct anv_cmd_buffer *cmd_buffer)
}
}
if (BITSET_TEST(hw_state->dirty, ANV_GFX_STATE_VIEWPORT_CC)) {
/* Force CC_VIEWPORT reallocation on Gfx9 when reprogramming
* 3DSTATE_VIEWPORT_STATE_POINTERS_CC :
* https://gitlab.freedesktop.org/mesa/mesa/-/issues/11647
*/
if (BITSET_TEST(hw_state->dirty, ANV_GFX_STATE_VIEWPORT_CC) ||
(GFX_VER == 9 &&
BITSET_TEST(hw_state->dirty, ANV_GFX_STATE_VIEWPORT_CC_PTR))) {
hw_state->vp_cc.state =
anv_cmd_buffer_alloc_dynamic_state(cmd_buffer,
hw_state->vp_cc.count * 8, 32);

@@ -110,7 +110,8 @@ genX(emit_simpler_shader_init_fragment)(struct anv_simple_shader *state)
genX(emit_l3_config)(batch, device, state->l3_config);
state->cmd_buffer->state.current_l3_config = state->l3_config;
if (state->cmd_buffer)
state->cmd_buffer->state.current_l3_config = state->l3_config;
enum intel_urb_deref_block_size deref_block_size;
genX(emit_urb_setup)(device, batch, state->l3_config,

@@ -13,6 +13,8 @@
#include "vk_format.h"
#include "clb097.h"
VkFormatFeatureFlags2
nvk_get_buffer_format_features(struct nvk_physical_device *pdev,
VkFormat vk_format)
@@ -29,6 +31,8 @@ nvk_get_buffer_format_features(struct nvk_physical_device *pdev,
if (nil_format_supports_storage(&pdev->info, p_format)) {
features |= VK_FORMAT_FEATURE_2_STORAGE_TEXEL_BUFFER_BIT |
VK_FORMAT_FEATURE_2_STORAGE_WRITE_WITHOUT_FORMAT_BIT;
if (pdev->info.cls_eng3d >= MAXWELL_A)
features |= VK_FORMAT_FEATURE_2_STORAGE_READ_WITHOUT_FORMAT_BIT;
}
if (p_format == PIPE_FORMAT_R32_UINT || p_format == PIPE_FORMAT_R32_SINT)

@@ -267,6 +267,9 @@ vk_image_usage_to_format_features(VkImageUsageFlagBits usage_flag)
return VK_FORMAT_FEATURE_2_COLOR_ATTACHMENT_BIT;
case VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT:
return VK_FORMAT_FEATURE_2_DEPTH_STENCIL_ATTACHMENT_BIT;
case VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT:
return VK_FORMAT_FEATURE_2_COLOR_ATTACHMENT_BIT |
VK_FORMAT_FEATURE_2_DEPTH_STENCIL_ATTACHMENT_BIT;
default:
return 0;
}

@@ -30,18 +30,26 @@
/* Convenience */
#define MALI_BLEND_AU_R8G8B8A8 (MALI_RGBA8_TB << 12)
#define MALI_BLEND_PU_R8G8B8A8 (MALI_RGBA8_TB << 12)
#define MALI_BLEND_AU_R10G10B10A2 (MALI_RGB10_A2_TB << 12)
#define MALI_BLEND_PU_R10G10B10A2 (MALI_RGB10_A2_TB << 12)
#define MALI_BLEND_AU_R8G8B8A2 (MALI_RGB8_A2_AU << 12)
#define MALI_BLEND_PU_R8G8B8A2 (MALI_RGB8_A2_PU << 12)
#define MALI_BLEND_AU_R4G4B4A4 (MALI_RGBA4_AU << 12)
#define MALI_BLEND_PU_R4G4B4A4 (MALI_RGBA4_PU << 12)
#define MALI_BLEND_AU_R5G6B5A0 (MALI_R5G6B5_AU << 12)
#define MALI_BLEND_PU_R5G6B5A0 (MALI_R5G6B5_PU << 12)
#define MALI_BLEND_AU_R5G5B5A1 (MALI_RGB5_A1_AU << 12)
#define MALI_BLEND_PU_R5G5B5A1 (MALI_RGB5_A1_PU << 12)
#if PAN_ARCH == 6
#define MALI_RGBA_SWIZZLE PAN_V6_SWIZZLE(R, G, B, A)
#define MALI_RGB1_SWIZZLE PAN_V6_SWIZZLE(R, G, B, A)
#else
#define MALI_RGBA_SWIZZLE MALI_RGB_COMPONENT_ORDER_RGBA
#define MALI_RGB1_SWIZZLE MALI_RGB_COMPONENT_ORDER_RGB1
#endif
#define MALI_BLEND_AU_R8G8B8A8 (MALI_RGBA8_TB << 12) | MALI_RGBA_SWIZZLE
#define MALI_BLEND_PU_R8G8B8A8 (MALI_RGBA8_TB << 12) | MALI_RGBA_SWIZZLE
#define MALI_BLEND_AU_R10G10B10A2 (MALI_RGB10_A2_TB << 12) | MALI_RGBA_SWIZZLE
#define MALI_BLEND_PU_R10G10B10A2 (MALI_RGB10_A2_TB << 12) | MALI_RGBA_SWIZZLE
#define MALI_BLEND_AU_R8G8B8A2 (MALI_RGB8_A2_AU << 12) | MALI_RGBA_SWIZZLE
#define MALI_BLEND_PU_R8G8B8A2 (MALI_RGB8_A2_PU << 12) | MALI_RGBA_SWIZZLE
#define MALI_BLEND_AU_R4G4B4A4 (MALI_RGBA4_AU << 12) | MALI_RGBA_SWIZZLE
#define MALI_BLEND_PU_R4G4B4A4 (MALI_RGBA4_PU << 12) | MALI_RGBA_SWIZZLE
#define MALI_BLEND_AU_R5G6B5A0 (MALI_R5G6B5_AU << 12) | MALI_RGB1_SWIZZLE
#define MALI_BLEND_PU_R5G6B5A0 (MALI_R5G6B5_PU << 12) | MALI_RGB1_SWIZZLE
#define MALI_BLEND_AU_R5G5B5A1 (MALI_RGB5_A1_AU << 12) | MALI_RGBA_SWIZZLE
#define MALI_BLEND_PU_R5G5B5A1 (MALI_RGB5_A1_PU << 12) | MALI_RGBA_SWIZZLE
#if PAN_ARCH <= 5
#define BFMT2(pipe, internal, writeback, srgb) \
@@ -50,18 +58,6 @@
MALI_COLOR_FORMAT_##writeback, \
{ 0, 0 }, \
}
#elif PAN_ARCH == 6
#define BFMT2(pipe, internal, writeback, srgb) \
[PIPE_FORMAT_##pipe] = { \
MALI_COLOR_BUFFER_INTERNAL_FORMAT_##internal, \
MALI_COLOR_FORMAT_##writeback, \
{ \
MALI_BLEND_PU_##internal | (srgb ? (1 << 20) : 0) | \
PAN_V6_SWIZZLE(R, G, B, A), \
MALI_BLEND_AU_##internal | (srgb ? (1 << 20) : 0) | \
PAN_V6_SWIZZLE(R, G, B, A), \
}, \
}
#else
#define BFMT2(pipe, internal, writeback, srgb) \
[PIPE_FORMAT_##pipe] = { \

@@ -112,6 +112,14 @@
#endif
#endif
#if defined(__loongarch__)
#ifdef __loongarch_lp64
#define DETECT_ARCH_LOONGARCH64 1
#else
#error "detect_arch: unknown target loongarch base ABI type"
#endif
#endif
#ifndef DETECT_ARCH_X86
#define DETECT_ARCH_X86 0
#endif
@@ -168,4 +176,8 @@
#define DETECT_ARCH_RISCV64 0
#endif
#ifndef DETECT_ARCH_LOONGARCH64
#define DETECT_ARCH_LOONGARCH64 0
#endif
#endif /* UTIL_DETECT_ARCH_H_ */

@@ -240,6 +240,12 @@ do { \
# endif
#endif
#ifdef HAVE_FUNC_ATTRIBUTE_OPTIMIZE
#define ATTRIBUTE_OPTIMIZE(flags) __attribute__((__optimize__((flags))))
#else
#define ATTRIBUTE_OPTIMIZE(flags)
#endif
#ifdef __cplusplus
/**
* Macro function that evaluates to true if T is a trivially

@@ -49,7 +49,8 @@ func_b(void)
debug_backtrace_dump(backtrace, 16);
}
static void ATTRIBUTE_NOINLINE
/* This function must emit a stack frame for the unit test to work */
static void ATTRIBUTE_NOINLINE ATTRIBUTE_OPTIMIZE("no-omit-frame-pointer")
func_c(struct debug_stack_frame *frames)
{
debug_backtrace_capture(frames, 0, 16);
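A standalone sketch of the same pattern: an attribute macro that expands to GCC's `optimize` function attribute when available and to nothing otherwise, applied to a function that should keep its frame pointer. The diff relies on a build-system probe (`HAVE_FUNC_ATTRIBUTE_OPTIMIZE`); here `__has_attribute` is used as a stand-in for that probe, which is an assumption of this sketch, and all `SKETCH_`/`sketch_` names are hypothetical.

```c
/* Stand-in for the build-system probe: ask the compiler directly. */
#if defined(__has_attribute)
#  if __has_attribute(optimize)
#    define SKETCH_ATTRIBUTE_OPTIMIZE(flags) \
        __attribute__((__optimize__((flags))))
#  endif
#endif
#ifndef SKETCH_ATTRIBUTE_OPTIMIZE
#  define SKETCH_ATTRIBUTE_OPTIMIZE(flags) /* unsupported: expands to nothing */
#endif

#if defined(__GNUC__)
#  define SKETCH_NOINLINE __attribute__((noinline))
#else
#  define SKETCH_NOINLINE
#endif

/* Where the attribute is supported, this function is compiled as if
 * -fno-omit-frame-pointer were given, regardless of the global flags. */
static int SKETCH_NOINLINE SKETCH_ATTRIBUTE_OPTIMIZE("no-omit-frame-pointer")
sketch_func(int value)
{
   volatile int local = value; /* forces a stack slot */
   return local + 1;
}
```

On compilers without the attribute the macro is empty and the function is built with whatever frame-pointer setting the rest of the translation unit uses, so a test that depends on the frame may still need a global flag there.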

@@ -166,10 +166,9 @@ u_printf_impl(FILE *out, const char *buffer, size_t buffer_size,
int arg_size = fmt->arg_sizes[i];
size_t spec_pos = util_printf_next_spec_pos(format, 0);
if (spec_pos == -1) {
u_printf_plain(out, format);
continue;
}
/* If we hit an unused argument we skip all remaining ones */
if (spec_pos == -1)
break;
const char *token = util_printf_prev_tok(&format[spec_pos]);
const char *next_format = &format[spec_pos + 1];

@@ -58,6 +58,10 @@ static const struct debug_control debug_control[] = {
{ NULL, },
};
static bool present_false(VkPhysicalDevice pdevice, int fd) {
return false;
}
VkResult
wsi_device_init(struct wsi_device *wsi,
VkPhysicalDevice pdevice,
@@ -270,6 +274,21 @@ wsi_device_init(struct wsi_device *wsi,
}
}
/* can_present_on_device is a function pointer used to determine if images
* can be presented directly on a given device file descriptor (fd).
* If HAVE_LIBDRM is defined, it will be initialized to a platform-specific
* function (wsi_device_matches_drm_fd). Otherwise, it is initialized to
* present_false to ensure that it always returns false, preventing potential
* segmentation faults from unchecked calls.
* Drivers for non-PCI based GPUs are expected to override this after calling
* wsi_device_init().
*/
#ifdef HAVE_LIBDRM
wsi->can_present_on_device = wsi_device_matches_drm_fd;
#else
wsi->can_present_on_device = present_false;
#endif
return VK_SUCCESS;
fail:
wsi_device_finish(wsi, alloc);
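The idea described in the comment, a callback that is always initialized to a safe default so call sites never need a NULL check, can be sketched in isolation. The struct and function names below (`sketch_device`, `sketch_present_false`, `sketch_present_on_fd_7`) are hypothetical simplifications, not the WSI types, and the physical device handle is reduced to a `void *`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the device structure. */
struct sketch_device {
   void *pdevice;
   bool (*can_present_on_device)(void *pdevice, int fd);
};

/* Safe default: never claims that presenting on the fd is possible. */
static bool sketch_present_false(void *pdevice, int fd)
{
   (void)pdevice;
   (void)fd;
   return false;
}

/* Init always leaves the callback non-NULL. */
static void sketch_device_init(struct sketch_device *dev, void *pdevice)
{
   dev->pdevice = pdevice;
   dev->can_present_on_device = sketch_present_false;
}

/* Example of a driver-specific override installed after init. */
static bool sketch_present_on_fd_7(void *pdevice, int fd)
{
   (void)pdevice;
   return fd == 7;
}
```

A caller can then invoke `dev.can_present_on_device(dev.pdevice, fd)` unconditionally; a driver that knows better replaces the pointer after `sketch_device_init()` returns.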

@@ -1100,7 +1100,7 @@ wsi_display_surface_get_present_rectangles(VkIcdSurfaceBase *surface_base,
wsi_display_mode *mode = wsi_display_mode_from_handle(surface->displayMode);
VK_OUTARRAY_MAKE_TYPED(VkRect2D, out, pRects, pRectCount);
if (wsi_device_matches_drm_fd(wsi_device, mode->connector->wsi->fd)) {
if (wsi_device->can_present_on_device(wsi_device->pdevice, mode->connector->wsi->fd)) {
vk_outarray_append_typed(VkRect2D, &out, rect) {
*rect = (VkRect2D) {
.offset = { 0, 0 },
@@ -3114,7 +3114,7 @@ wsi_AcquireDrmDisplayEXT(VkPhysicalDevice physicalDevice,
VK_FROM_HANDLE(vk_physical_device, pdevice, physicalDevice);
struct wsi_device *wsi_device = pdevice->wsi_device;
if (!wsi_device_matches_drm_fd(wsi_device, drmFd))
if (!wsi_device->can_present_on_device(wsi_device->pdevice, drmFd))
return VK_ERROR_UNKNOWN;
struct wsi_display *wsi =
@@ -3148,7 +3148,7 @@ wsi_GetDrmDisplayEXT(VkPhysicalDevice physicalDevice,
VK_FROM_HANDLE(vk_physical_device, pdevice, physicalDevice);
struct wsi_device *wsi_device = pdevice->wsi_device;
if (!wsi_device_matches_drm_fd(wsi_device, drmFd)) {
if (!wsi_device->can_present_on_device(wsi_device->pdevice, drmFd)) {
*pDisplay = VK_NULL_HANDLE;
return VK_ERROR_UNKNOWN;
}

@@ -440,10 +440,10 @@ wsi_common_drm_devices_equal(int fd_a, int fd_b)
}
bool
wsi_device_matches_drm_fd(const struct wsi_device *wsi, int drm_fd)
wsi_device_matches_drm_fd(VkPhysicalDevice physicalDevice, int drm_fd)
{
if (wsi->can_present_on_device)
return wsi->can_present_on_device(wsi->pdevice, drm_fd);
VK_FROM_HANDLE(vk_physical_device, pdevice, physicalDevice);
const struct wsi_device *wsi = pdevice->wsi_device;
drmDevicePtr fd_device;
int ret = drmGetDevice2(drm_fd, 0, &fd_device);

@@ -225,7 +225,7 @@ struct wsi_swapchain {
};
bool
wsi_device_matches_drm_fd(const struct wsi_device *wsi, int drm_fd);
wsi_device_matches_drm_fd(VkPhysicalDevice pdevice, int drm_fd);
void
wsi_wl_surface_destroy(VkIcdSurfaceBase *icd_surface, VkInstance _instance,

@@ -160,7 +160,7 @@ wsi_x11_check_dri3_compatible(const struct wsi_device *wsi_dev,
if (dri3_fd == -1)
return true;
bool match = wsi_device_matches_drm_fd(wsi_dev, dri3_fd);
bool match = wsi_dev->can_present_on_device(wsi_dev->pdevice, dri3_fd);
close(dri3_fd);
@@ -1071,9 +1071,11 @@ struct x11_image {
* We need to keep track of them when considering present ID. */
/* This is arbitrarily chosen. With IMMEDIATE on a 3 deep swapchain,
* we allow up to 48 outstanding presentations per vblank, which is more than enough
* for any reasonable application. */
#define X11_SWAPCHAIN_MAX_PENDING_COMPLETIONS 16
* we allow over 300 outstanding presentations per vblank, which is more than enough
* for any reasonable application.
* This used to be 16, but it regressed benchmarks that did 15k+ FPS.
* This should allow over 25k FPS on a 60 Hz monitor. Any more than this is comical. */
#define X11_SWAPCHAIN_MAX_PENDING_COMPLETIONS 128
uint32_t present_queued_count;
struct x11_image_pending_completion pending_completions[X11_SWAPCHAIN_MAX_PENDING_COMPLETIONS];
#ifdef HAVE_DRI3_EXPLICIT_SYNC