Compare commits

49 Commits

Author SHA1 Message Date
Dylan Baker
65d255cd1e VERSION: bump version for 19.2.7 2019-12-04 13:48:11 -08:00
Dylan Baker
d8e767ede8 docs: Add release notes for 19.2.7 2019-12-04 13:47:44 -08:00
Rhys Perry
4a0199b6e4 radv: set writes_memory for global memory stores/atomics
Fixes: 13ab63bb62 ('radv: Implement VK_EXT_buffer_device_address.')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 35fab1ba33)
2019-12-04 13:43:32 -08:00
Samuel Pitoiset
3ed8c94244 radv: fix compute pipeline keys when optimizations are disabled
If an app first creates a compute pipeline with
VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT set, then creates it again
without that flag, the driver should re-compile the compute shader.
Otherwise, it will return the unoptimized one.

Fixes: ce188813bf ("radv: add initial support for VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 9ab27647ff)
2019-12-04 13:43:32 -08:00
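
To illustrate why the flag has to be part of the key in the compute pipeline fix above, here is a minimal sketch. The `pipeline_key` struct and both comparison helpers are hypothetical; radv's real key layout is not shown in the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cache key, for illustration only. */
struct pipeline_key {
    uint64_t shader_hash;            /* hash of the shader being compiled */
    bool     optimisations_disabled; /* mirrors the DISABLE_OPTIMIZATION flag */
};

/* Buggy comparison: ignores the flag, so the optimized and unoptimized
 * variants of the same shader share one cache entry. */
static bool key_equal_buggy(const struct pipeline_key *a,
                            const struct pipeline_key *b)
{
    assert(a && b);
    return a->shader_hash == b->shader_hash;
}

/* Fixed comparison: the flag takes part in the key, so changing it
 * forces a cache miss and therefore a re-compile. */
static bool key_equal_fixed(const struct pipeline_key *a,
                            const struct pipeline_key *b)
{
    assert(a && b);
    return a->shader_hash == b->shader_hash &&
           a->optimisations_disabled == b->optimisations_disabled;
}
```

With the buggy comparison, a lookup for the optimized pipeline matches the entry stored for the unoptimized one, which is the behaviour the commit describes.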
Samuel Pitoiset
5c98b36577 radv/gfx10: fix implementation of exclusive scans
This implementation is loosely based on ROCm.
https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl

This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive* on GFX10.

Fixes: 227c29a80d ("amd/common/gfx10: implement scan & reduce operations")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit c9aa843961)
Conflicts resolved by Dylan Baker
2019-12-04 13:43:32 -08:00
Samuel Pitoiset
a3869c14c0 radv: fix enabling sample shading with SampleID/SamplePosition
When a fragment shader includes an input variable decorated with
SampleId or SamplePosition, sample shading should be enabled
because minSampleShadingFactor is expected to be 1.0.

Cc: 19.2, 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 86a5fbfd4a)
Conflicts resolved by Dylan Baker
2019-12-04 13:43:32 -08:00
Yevhenii Kolesnikov
bda6890f58 meson: Fix linkage of libgallium_nine with libgalliumvl
Do not link libgallium_nine with libgalliumvl_stub if it's already
linked with libgalliumvl. Linking with stub leads to "duplicate
symbol" errors.

Fixes: 6b4c7047d5
       ("meson: build gallium nine state_tracker")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2040

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 9af22ccddc)
Conflicts resolved by Dylan Baker
2019-12-04 13:43:32 -08:00
Jason Ekstrand
2bf47550ce anv: Set up SBE_SWIZ properly for gl_Viewport
gl_Viewport is also in the VUE header so we need to whack the read
offset to 0 and emit a default (no overrides) SBE_SWIZ entry in that
case as well.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit b1f37688ba)
2019-12-04 13:43:32 -08:00
Jonathan Gray
b4e83559cb i965: update Makefile.sources for perf changes
brw_performance_query_metrics.h was removed in
134e750e16 and
brw_performance_query.h was removed in
8ae6667992

remove reference to these files from Makefile.sources

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Fixes: 134e750e16 ("i965: extract performance query metrics")
Fixes: 8ae6667992 ("intel/perf: move query_object into perf")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 34dda0ca65)
2019-12-04 13:43:32 -08:00
Boris Brezillon
a39d364af3 gallium: Fix the ->set_damage_region() implementation
BACK_LEFT attachment can be outdated when the user calls
KHR_partial_update() (->lastStamp != ->texture_stamp), leading to a
damage region update on the wrong pipe_resource object.
Let's delay the ->set_damage_region() call until the attachments are
updated when we're in that case.

Reported-by: Carsten Haitzler <raster@rasterman.com>
Fixes: 492ffbed63 ("st/dri2: Implement DRI2bufferDamageExtension")
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit b196e1a8cf)
2019-12-04 13:43:32 -08:00
Jonathan Gray
c166435dc2 winsys/amdgpu: avoid double simple_mtx_unlock()
Calling pthread_mutex_unlock() on an unlocked mutex is documented by POSIX as
undefined behaviour.  On OpenBSD pthread_mutex_unlock() will call
abort(3) if this happens.

This occurs in amdgpu_winsys_create() after
cb446dc0fa
winsys/amdgpu: Add amdgpu_screen_winsys

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 3fe3bde4f2)
2019-12-04 13:43:32 -08:00
Bas Nieuwenhuizen
52a8d43a24 radv: Unify max_descriptor_set_size.
They were out of sync. Besides syncing, let's ensure they never diverge
again.

Fixes: 8d2654a419 "radv: Support VK_EXT_inline_uniform_block."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 4cde0e04e3)
2019-12-04 13:43:32 -08:00
Bas Nieuwenhuizen
2e379b0a65 radv: Allocate cmdbuffer space for buffer marker write.
Fixes: 946193ae00 "radv: add support for VK_AMD_buffer_marker"
Reviewed-by:  Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 25bc9102d8)
2019-12-04 13:43:32 -08:00
Zebediah Figura
336f59f8ba Revert "draw: revert using correct order for prim decomposition."
This reverts commit f97b731c82.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/250

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit a3c8bc10aa)
2019-12-04 13:43:32 -08:00
Ian Romanick
38c8af9e2a intel/fs: Disable conditional discard optimization on Gen4 and Gen5
The CMP instruction on Gen4 and Gen5 generates one bit (the LSB) of
valid data and 31 bits of junk.  Results of comparisons that are used as
Boolean values need to have a fixup applied to generate the proper 0/~0
values.

Calling fs_visitor::nir_emit_alu with need_dest=false prevents the fixup
code from being generated.  This results in a sequence like:

        cmp.l.f0.0(16)  g8<1>F          g14<8,8,1>F     0x0F  /* 0F */
        ...
        cmp.l.f0.0(16)  g4<1>F          g6<8,8,1>F      0x0F  /* 0F */
(+f0.1) or.z.f0.1(16) null<1>UD g4<8,8,1>UD     g8<8,8,1>UD

instead of

        cmp.l.f0.0(16)  g8<1>F          g14<8,8,1>F     0x0F  /* 0F */
        ...
        cmp.l.f0.0(16)  g4<1>F          g6<8,8,1>F      0x0F  /* 0F */
        or(16) g4<1>UD g4<8,8,1>UD     g8<8,8,1>UD
(+f0.1) and.z.f0.1(16) null<1>UD g4<8,8,1>UD     1UD

I examined a couple of the shaders hurt by this change, and ALL of them
would have been affected by this bug. :(

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1836
Fixes: 0ba9497e66 ("intel/fs: Improve discard_if code generation")

Iron Lake
total instructions in shared programs: 8122757 -> 8122957 (<.01%)
instructions in affected programs: 8307 -> 8507 (2.41%)
helped: 0
HURT: 100
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.84% max: 6.67% x̄: 2.81% x̃: 2.76%
95% mean confidence interval for instructions value: 2.00 2.00
95% mean confidence interval for instructions %-change: 2.58% 3.03%
Instructions are HURT.

total cycles in shared programs: 188510100 -> 188510376 (<.01%)
cycles in affected programs: 76018 -> 76294 (0.36%)
helped: 0
HURT: 55
HURT stats (abs)   min: 2 max: 12 x̄: 5.02 x̃: 4
HURT stats (rel)   min: 0.07% max: 3.75% x̄: 0.86% x̃: 0.56%
95% mean confidence interval for cycles value: 4.33 5.71
95% mean confidence interval for cycles %-change: 0.60% 1.12%
Cycles are HURT.

GM45
total instructions in shared programs: 4994403 -> 4994503 (<.01%)
instructions in affected programs: 4212 -> 4312 (2.37%)
helped: 0
HURT: 50
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.84% max: 6.25% x̄: 2.76% x̃: 2.72%
95% mean confidence interval for instructions value: 2.00 2.00
95% mean confidence interval for instructions %-change: 2.45% 3.07%
Instructions are HURT.

total cycles in shared programs: 128928750 -> 128928982 (<.01%)
cycles in affected programs: 67442 -> 67674 (0.34%)
helped: 0
HURT: 47
HURT stats (abs)   min: 2 max: 12 x̄: 4.94 x̃: 4
HURT stats (rel)   min: 0.09% max: 3.75% x̄: 0.75% x̃: 0.53%
95% mean confidence interval for cycles value: 4.19 5.68
95% mean confidence interval for cycles %-change: 0.50% 1.00%
Cycles are HURT.

(cherry picked from commit e51eda99df)
2019-12-04 13:43:32 -08:00
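
The Gen4/Gen5 commit above turns on the fact that a comparison result has one valid bit and 31 junk bits. The sketch below models that in plain C. The function names are made up for illustration, and the code mirrors the idea of the two instruction listings rather than the compiler's actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Turn a raw comparison result (valid LSB, junk in the other 31 bits)
 * into a canonical Boolean: 0 for false, ~0 for true. */
static uint32_t fixup_bool(uint32_t raw)
{
    uint32_t fixed = 0u - (raw & 1u);
    assert(fixed == 0u || fixed == ~0u);
    return fixed;
}

/* Buggy "are both comparisons false?" test: checks all 32 bits of the
 * OR, so junk bits make a false result look true. */
static bool both_false_buggy(uint32_t a, uint32_t b)
{
    return (a | b) == 0u;
}

/* Fixed test: masks down to the valid bit before checking for zero,
 * like the AND with 1 in the second listing. */
static bool both_false_fixed(uint32_t a, uint32_t b)
{
    return ((a | b) & 1u) == 0u;
}
```

For two raw results whose LSB is 0 but whose upper bits hold junk, `both_false_buggy` reports that at least one comparison was true, while `both_false_fixed` gives the right answer.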
Dylan Baker
5836dd66e0 VERSION: bumpre to 19.2.6 2019-11-21 16:04:47 -08:00
Dylan Baker
264d1187df docs: Add release notes for 19.2.6 2019-11-21 16:04:11 -08:00
Dylan Baker
aa620fdf8e meson: generate .pc files for gles and gles2 with old glvnd
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1921
2019-11-21 17:28:23 +00:00
Yevhenii Kolesnikov
05d5784ea6 glsl: Enable textureSize for samplerExternalOES
From OES_EGL_image_external_essl3

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1901

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
2019-11-21 09:26:26 -08:00
Dave Airlie
b1f505463d llvmpipe/ppc: fix if/ifdef confusion in backport.
Fixes: 32aba91c07 (llvmpipe: use ppc64le/ppc64 Large code model for JIT-compiled shaders)
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2131
2019-11-21 09:26:26 -08:00
Hyunjun Ko
c2488d810b freedreno/ir3: fix printing output registers of FS.
Fixes: cea39af2fb ("freedreno/ir3: Generalize ir3_shader_disasm()")

Reviewed-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit d0f38394b1)
2019-11-21 09:26:26 -08:00
Alejandro Piñeiro
e594e4cefd v3d: adds an extra MOV for any sig.ld*
Specifically when we are in non-uniform control flow, as we would need
to set the condition for the last instruction. If (for example) an
image atomic load stores its value directly in a NIR register,
last_inst would be a nop, and setting the condition on it would fail.

Fixes piglit test:
spec/glsl-es-3.10/execution/cs-ssbo-atomic-if-else-2.shader_test

Fixes: 6281f26f06 ("v3d: Add support for shader_image_load_store.")

v2: (Changes suggested by Eric Anholt)
   * Cover all sig.ld* signals, not just ldunif and ldtmu, as all of
     them have the same restriction.
   * Update comment explaining why we add a MOV in that case
   * Tweak commit message.

v3:
   * Drop extra set of parens (Eric)
   * Add missing ld signal to is_ld_signal to fix shader-db regression.

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit b4bc59e37e)
2019-11-21 09:26:26 -08:00
Jose Maria Casanova Crespo
8f526ee7cd v3d: Fix predication with atomic image operations
Fixes dEQP test:
dEQP-GLES31.functional.synchronization.inter_call.with_memory_barrier.image_atomic_multiple_interleaved_write_read

Fixes piglit test:
spec/glsl-es-3.10/execution/cs-image-atomic-if-else.shader_test

Fixes: 6281f26f06 ("v3d: Add support for shader_image_load_store.")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit d983055184)
2019-11-21 09:26:26 -08:00
Eric Engestrom
be8a46d064 vulkan: delete typo'd header
Two files exist in that directory:
- vulkan_xlib_randr.h
- vulkan_xlib_xrandr.h

Both were imported in 205c271562 ("vulkan: Update the XML and
headers to 1.1.70") with identical contents (ie. the
VK_EXT_acquire_xlib_display extension), but the former was never
included anywhere and can't be found upstream [1], while the latter is
included in vulkan.h and found upstream.

[1] https://github.com/KhronosGroup/Vulkan-Headers/tree/master/include/vulkan

Fixes: 205c271562 ("vulkan: Update the XML and headers to 1.1.70")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 344859c32d)
2019-11-21 09:26:26 -08:00
Dylan Baker
4c86eda2c2 docs/relnotes/19.2.5: Add SHA256 sum 2019-11-20 09:18:11 -08:00
Dylan Baker
f418e9231a VERSION: bump for 19.2.5 2019-11-20 08:54:30 -08:00
Dylan Baker
9e0a0d2ebb docs: Add relnotes for 19.2.5 2019-11-20 08:54:10 -08:00
Jason Ekstrand
e10851ff34 anv: Stop bounds-checking pushed UBOs
The bounds checking is actually less safe than just pushing the data.
If the bounds checking actually ever kicks in and it's not on the last
UBO push range, then the shrinking will cause all subsequent ranges to
be pushed to the wrong place in the GRF.  One of the behaviors we
definitely don't want is for OOB UBO access to result in completely
unrelated UBOs returning garbage values.  It's safer to just push the
UBOs as-requested.  If we're really concerned about robustness, we can
emit shader code to do bounds checking which should be stupid cheap (a
CMP followed by SEL).

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-20 08:24:09 -08:00
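
A rough model of the failure described in the pushed-UBO commit above, assuming that each push range is placed directly after the previous one and that lengths are counted in registers. `range_starts` is a hypothetical helper, not anv code.

```c
#include <assert.h>
#include <stddef.h>

/* Compute where each pushed range starts when ranges are packed back to
 * back: start[i] is the sum of the lengths of all earlier ranges. */
static void range_starts(const unsigned *lengths, size_t count,
                         unsigned *starts)
{
    unsigned next = 0;

    assert(count == 0 || (lengths && starts));
    for (size_t i = 0; i < count; i++) {
        starts[i] = next;
        next += lengths[i];
    }
}
```

If the shader was compiled for lengths {4, 2, 3}, it expects the ranges at registers 0, 4 and 6. Shrinking the first range to 1 at draw time packs them at 0, 1 and 3 instead, so the second and third UBOs are read from the wrong place.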
Brian Paul
023ddb01b5 Call shmget() with permission 0600 instead of 0777
A security advisory (TALOS-2019-0857/CVE-2019-5068) found that
creating shared memory regions with permission mode 0777 could allow
any user to access that memory.  Several Mesa drivers use shared-
memory XImages to implement back buffers for improved performance.

This patch changes the shmget() calls to use 0600 (user r/w).

Tested with legacy Xlib driver and llvmpipe.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
(cherry picked from commit 02c3dad0f3)
2019-11-20 08:24:09 -08:00
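
A minimal sketch of the shmget() change described above, using only the standard System V calls `shmget()` and `shmctl()`. The helper names are invented for this example and do not come from the Mesa sources.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Create a private segment that only the owning user can read or write.
 * Returns the segment id, or -1 on failure. */
static int create_private_segment(size_t size)
{
    return shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
}

/* Return the permission bits of a segment, or -1 on failure. */
static int segment_mode(int shmid)
{
    struct shmid_ds ds;

    assert(shmid >= 0);
    if (shmctl(shmid, IPC_STAT, &ds) == -1)
        return -1;
    return (int)(ds.shm_perm.mode & 0777);
}

/* Mark the segment for destruction once the last user detaches. */
static int destroy_segment(int shmid)
{
    return shmctl(shmid, IPC_RMID, NULL);
}
```

Passing 0777 instead of 0600 in `create_private_segment` would reproduce the world-accessible mode that the advisory flagged.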
Danylo Piliaiev
3199172eaa i965: Unify CC_STATE and BLEND_STATE atoms on Haswell as a workaround
Re-emitting 3DSTATE_CC_STATE_POINTERS after emitting
3DSTATE_BLEND_STATE_POINTERS fixes the shadow flickering in
SuperTuxKart and Tropico 6 which was seen only on Haswell.
The reason for this is unknown and the fix was found empirically.

The closest mention in PRM is that it should improve performance.
From the HSW PRM, volume 2b, page 823 (3DSTATE_BLEND_STATE_POINTERS):
 "When the BLEND_STATE pointer changes but not the CC_STATE pointer,
  driver needs to force a CC_STATE pointer change to improve
  blend performance in pixel backend."

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1834
Fixes: eca4a654 ("i965: Disable dual source blending when shader doesn't support it on gen8+")
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 6f17fe0606)
2019-11-20 08:24:09 -08:00
Ben Crocker
ae071434e9 llvmpipe: use ppc64le/ppc64 Large code model for JIT-compiled shaders
Large programs, e.g. gnome-shell and firefox, may tax the
addressability of the Medium code model once a (potentially unbounded)
number of dynamically generated JIT-compiled shader programs are
linked in and relocated.  Yet the default code model as of LLVM 8 is
Medium or even Small.

The cost of changing from Medium to Large is negligible:
- an additional 8-byte pointer stored immediately before the shader entrypoint;
- change an add-immediate (addis) instruction to a load (ld).

Testing with WebGL Conformance
(https://www.khronos.org/registry/webgl/sdk/tests/webgl-conformance-tests.html)
yields clean runs with this change (and crashes without it).

Testing with glxgears shows no detectable performance difference.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1753327, 1753789, 1543572, 1747110, and 1582226

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/223

Co-authored by: Nemanja Ivanovic <nemanjai@ca.ibm.com>, Tom Stellard <tstellar@redhat.com>

CC: mesa-stable@lists.freedesktop.org

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
(cherry picked from commit 9c3be6d21f)
Conflicts resolved by Dylan (PIPE_ARCH -> UTIL_ARCH rename)
2019-11-20 08:24:09 -08:00
Pierre-Eric Pelloux-Prayer
60c299c542 radeonsi: fix shader disk cache key
Use unsigned values, otherwise sign extension will produce a 64-bit value where
the 32 left-most bits are 1.

Fixes: 2afeed3010 ("radeonsi: tell the shader disk cache what IR is used")
2019-11-20 08:24:09 -08:00
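
The sign-extension problem named in the disk cache key fix above can be shown in a few lines of C. The two helpers are hypothetical and only demonstrate the language rule; they are not the radeonsi code.

```c
#include <assert.h>
#include <stdint.h>

/* Widening a signed 32-bit value copies the sign bit into the upper
 * 32 bits of the 64-bit result. */
static uint64_t widen_signed(int32_t flags)
{
    return (uint64_t)flags;
}

/* Widening an unsigned 32-bit value fills the upper 32 bits with zeros. */
static uint64_t widen_unsigned(uint32_t flags)
{
    uint64_t wide = (uint64_t)flags;
    assert((wide >> 32) == 0);
    return wide;
}
```

With bit 31 set, `widen_signed(INT32_MIN)` yields 0xFFFFFFFF80000000, while `widen_unsigned(0x80000000u)` yields 0x0000000080000000. Any flags stored in the upper half of a 64-bit key would be overwritten in the signed case.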
Pierre-Eric Pelloux-Prayer
b5b09acb74 radeonsi: tell the shader disk cache what IR is used
Until 8bef4df196 the IR (TGSI or NIR) was used in disk_cache driver_flags.
This commit restores this feature to avoid crashing when switching from
one IR to the other.

As radeonsi's default is TGSI, I used "driver_flags & 0x8000000 = 0" for TGSI
to keep the same driver_flags.

Fixes: 8bef4df196 ("radeonsi: add si_debug_options for convenient adding/removing of options")

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-20 08:24:09 -08:00
Pierre-Eric Pelloux-Prayer
38bd621f0d radeonsi: disable sdma for gfx10
Disable sdma on gfx10 until all timeout bugs are fixed.

See:
    https://gitlab.freedesktop.org/mesa/mesa/issues/1907
    https://bugs.freedesktop.org/show_bug.cgi?id=111481

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-20 08:24:09 -08:00
Marek Olšák
0e7e56aa2f tgsi_to_nir: handle PIPE_FORMAT_NONE in image opcodes
radeonsi doesn't use the format and internal shaders don't set it.

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
(cherry picked from commit f704fb7f0b)
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2112
2019-11-20 08:23:55 -08:00
Marek Olšák
2353a63a58 tgsi_to_nir: fix masked out image loads
This caused a failure in NIR validation.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 3906fce88b)
2019-11-19 16:56:01 -08:00
Illia Iorin
856c7eddf7 mesa/main: Ignore filter state for MS texture completeness
After the discussion in
https://github.com/KhronosGroup/OpenGL-API/issues/45
the section 8.17 (texture completeness) of the OpenGL 4.6 core profile
was changed to explicitly say that multisample texture completeness
ignores filter state of the texture.

"Using the preceding definitions, a texture is complete unless any of the
 following conditions hold true:
   ...
  - The minification filter requires a mipmap (is neither NEAREST nor LINEAR),
    the texture is not multisample, and the texture is not mipmap complete.
  - The texture is not multisample; either the magnification filter is not
    NEAREST, or the minification filter is neither NEAREST nor NEAREST_-
    MIPMAP_NEAREST; and any of
    – The internal format of the texture is integer (see table 8.12).
    – The internal format is STENCIL_INDEX.
    – The internal format is DEPTH_STENCIL, and the value of DEPTH_-
      STENCIL_TEXTURE_MODE for the texture is STENCIL_INDEX."

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Signed-off-by: Illia Iorin <illia.iorin@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 6b672e342a)
2019-11-19 16:56:00 -08:00
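
The two quoted completeness conditions from the commit above can be written as a predicate. This is a sketch under stated assumptions: the `texture` struct and `FILTER_*` enum are hypothetical stand-ins, the three format sub-conditions are collapsed into one flag, and the conditions elided by "..." in the quote are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

enum filter {
    FILTER_NEAREST,
    FILTER_LINEAR,
    FILTER_NEAREST_MIPMAP_NEAREST,
    FILTER_LINEAR_MIPMAP_NEAREST,
    FILTER_NEAREST_MIPMAP_LINEAR,
    FILTER_LINEAR_MIPMAP_LINEAR
};

/* Hypothetical texture description, for illustration only. */
struct texture {
    bool multisample;
    bool mipmap_complete;
    bool integer_or_stencil_format; /* any of the three format cases */
    enum filter min_filter;
    enum filter mag_filter;
};

/* Returns false if one of the two quoted filter-related conditions
 * makes the texture incomplete. */
static bool filter_state_complete(const struct texture *t)
{
    assert(t);

    /* Multisample textures ignore filter state entirely. */
    if (t->multisample)
        return true;

    bool needs_mipmap = t->min_filter != FILTER_NEAREST &&
                        t->min_filter != FILTER_LINEAR;
    if (needs_mipmap && !t->mipmap_complete)
        return false;

    bool filtering = t->mag_filter != FILTER_NEAREST ||
                     (t->min_filter != FILTER_NEAREST &&
                      t->min_filter != FILTER_NEAREST_MIPMAP_NEAREST);
    if (filtering && t->integer_or_stencil_format)
        return false;

    return true;
}
```

The early return for multisample textures is the part that corresponds to the spec change: the same filter state that makes a single-sample texture incomplete no longer affects a multisample one.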
Ian Romanick
4c82d426bd nir/algebraic: Mark other comparison exact when removing a == a
This prevents some additional optimizations that would change the
original result.  This includes things like (b < a && b < c) => b <
min(a, c) and !(a < b) => a >= b.  Both of these optimizations were
specifically observed in the piglit tests added in piglit!160.

This was discovered while investigating
https://gitlab.freedesktop.org/mesa/mesa/issues/1958.  However, the
problem in that issue was Chrome or Angle is replacing calls to isnan()
with some stuff that we (correctly) optimize to false.  If they had left
the calls to isnan() alone, everything would have just worked.

No shader-db changes on any Intel platform.

I also tried marking the comparison generated by the isnan() function
precise.  The precise marker "infects" every computation involved in
calculating the parameter to the isnan() function, and this severely
hurt all of the (few) shaders in shader-db that use isnan().

I also considered adding a new ir_unop_isnan opcode that would implement
the functionality.  During GLSL IR-to-NIR translation, the resulting
comparison operation would be marked exact (and the same thing would need
to happen in SPIR-V translation).

The approach taken by this patch seemed easier, but we may want to do
the ir_unop_isnan thing anyway.

Fixes: d55835b8bd ("nir/algebraic: Add optimizations for "a == a && a CMP b"")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit 9be4a422a0)
2019-11-19 16:56:00 -08:00
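
The reason the comparison in the commit above has to be marked exact is that such rewrites are only valid when no operand is NaN. A small C sketch, relying on standard IEEE 754 comparison semantics; the function names are invented for this example.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* The original form: true when a is not less than b. */
static bool not_less(float a, float b)
{
    return !(a < b);
}

/* The "optimized" form: equivalent only when neither operand is NaN. */
static bool greater_equal(float a, float b)
{
    return a >= b;
}

/* a == a is false exactly when a is NaN, which is why it works as an
 * isnan() replacement and must not be folded to true. */
static bool is_self_equal(float a)
{
    return a == a;
}

/* For ordinary numbers the rewrite is harmless. */
static void check_ordered_case(void)
{
    assert(not_less(2.0f, 1.0f) == greater_equal(2.0f, 1.0f));
}
```

With `a = NAN` and `b = 1.0f`, `not_less` returns true (every ordered comparison with NaN is false, and the negation flips it) while `greater_equal` returns false, so replacing one with the other changes the result.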
Ian Romanick
d8a37880b5 nir/algebraic: Add the ability to mark a replacement as exact
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit ea19f2fb68)
2019-11-19 16:56:00 -08:00
Paulo Zanoni
bd2f6150ca intel/compiler: fix nir_op_{i,u}*32 on ICL
On ICL we have the src1 restriction which is applied through
fix_byte_src() and potentially changes the type of the operands from 8
to 32 bits. When this change happens, we fall into the "else if
(bit_size < 32)" case and miscompute src_type because it takes into
consideration bit_size (8) instead of the adjusted size of temp_op
(32). This results in the shader reading unused memory, giving us
mostly failures, but occasional passes due to whatever was already in
the registers we were reading.

This commit fixes a lot of dEQP subgroup i8vec2 tests on ICL, such as:
    dEQP-VK.subgroups.arithmetic.compute.subgroupadd_i8vec2

This can also be verified by simply changing fix_byte_src() to apply
on all platforms.

Fixes: 5847de6e9a ("intel/compiler: don't use byte operands for src1 on ICL")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit eb6352162d)
2019-11-19 16:56:00 -08:00
Marek Olšák
9ffb0176e2 st/mesa: fix Sanctuary and Tropics by disabling ARB_gpu_shader5 for them
They use the "sample" keyword as a variable name.

Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit e00791c552)
2019-11-19 16:56:00 -08:00
Lionel Landwerlin
a390cf739f anv/wsi: signal the semaphore in the acquireNextImage
We seem to have forgotten about the semaphore in the
acquireNextImageInfo.

v2: Signal semaphore/fence regardless of presentation status (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit edc6606d4e)
2019-11-19 16:56:00 -08:00
Lionel Landwerlin
a993dc20b6 anv: remove list items on batch fini
This doesn't seem to fix anything because those destroy() calls happen
right before the command buffer object & its list of batch_bo is also
destroyed. Still looks a bit cleaner.

v2: Found a second occurrence

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
Fixes: 26ba0ad54d ("vk: Re-name command buffer implementation files")
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 935f8f0e56)
2019-11-19 16:56:00 -08:00
Lionel Landwerlin
2281258a7b anv: invalidate file descriptor of semaphore sync fd at vkQueueSubmit
We always close the in_fence at the end of anv_cmd_buffer_execbuf()
so when we take it from the semaphore, let's not forget to invalidate
it.

Note that the code leaks the fence_in if we get any error before
reaching the close(). Let's fix that in another patch or better,
rewrite the whole thing!

v2: drop redundant fd = -1 (Jason)

v3: Update commit message (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 048f0690ee)
2019-11-19 16:56:00 -08:00
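
The ownership rule behind the sync fd commit above can be sketched as follows. `semaphore_state`, `take_sync_fd` and `semaphore_finish` are hypothetical names for illustration; the real anv structures are not shown in the commit.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical semaphore holding a sync file descriptor, -1 if none. */
struct semaphore_state {
    int fd;
};

/* Hand the descriptor to the caller and invalidate the stored copy, so
 * that exactly one party is responsible for closing it. */
static int take_sync_fd(struct semaphore_state *sem)
{
    assert(sem);
    int fd = sem->fd;
    sem->fd = -1;
    return fd;
}

/* Tear-down closes the descriptor only if the semaphore still owns it. */
static void semaphore_finish(struct semaphore_state *sem)
{
    assert(sem);
    if (sem->fd >= 0) {
        close(sem->fd);
        sem->fd = -1;
    }
}
```

Without the `sem->fd = -1` step, both the submit path and the semaphore tear-down would close the same descriptor number, and the second close could hit an unrelated file that has since reused it.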
Dylan Baker
a89d4090bc cherry-ignore: Update for 19.2.4 cycle 2019-11-19 16:56:00 -08:00
Eric Engestrom
cb835d281b egl: fix _EGL_NATIVE_PLATFORM fallback
When the X11 or Haiku platforms were compiled in, they would bypass the
`_EGL_NATIVE_PLATFORM` fallback by always returning themselves instead.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 86d3a346f1)
2019-11-13 11:10:13 -08:00
Caio Marcelo de Oliveira Filho
6e98a923cb spirv: Don't leak GS initialization to other stages
The stage specific fields of shader_info are in a union.  We've
likely been lucky that this value was either overwritten or ignored by
other stages.  The recent change in shader_info layout in commit
84a1a2578d ("compiler: pack shader_info from 160 bytes to 96 bytes")
made this issue visible.

Fixes: cf2257069c ("nir/spirv: Set a default number of invocations for geometry shaders")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 087ecd9ca5)
2019-11-13 11:10:13 -08:00
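
To show how a geometry-shader default can leak into other stages through a union, as the spirv commit above describes, here is a reduced sketch. `stage_info` is a hypothetical two-member stand-in for shader_info, and both init functions are invented for this example.

```c
#include <assert.h>

enum stage { STAGE_VERTEX, STAGE_GEOMETRY, STAGE_FRAGMENT };

/* Hypothetical reduced shader info: the per-stage fields share storage. */
struct stage_info {
    enum stage stage;
    union {
        struct { unsigned invocations; } gs;
        struct { unsigned flags; } fs;
    } u;
};

/* Buggy: writes the GS default for every stage, which also overwrites
 * whatever other stages keep in the same bytes. */
static void init_buggy(struct stage_info *info, enum stage stage)
{
    assert(info);
    *info = (struct stage_info){ .stage = stage };
    info->u.gs.invocations = 1;
}

/* Fixed: only geometry shaders get the GS default. */
static void init_fixed(struct stage_info *info, enum stage stage)
{
    assert(info);
    *info = (struct stage_info){ .stage = stage };
    if (stage == STAGE_GEOMETRY)
        info->u.gs.invocations = 1;
}
```

After `init_buggy(&info, STAGE_FRAGMENT)`, `info.u.fs.flags` reads back as 1 even though nothing set a fragment flag. Whether such a stray value matters depends on the field layout, which is why a layout change could expose the bug.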
Lepton Wu
2b89e68a2e gallium: dri2: Use index as plane number.
This fixes wrong colors when playing video under an Android + virgl
configuration.

Fixes: 2decad495f ("gallium/dri2: Support images with multiple planes for modifiers")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Lepton Wu <lepton@chromium.org>
(cherry picked from commit 5a40e153fd)
2019-11-13 11:10:13 -08:00
Dylan Baker
9cbffed5d0 docs: Add SHA256 sum for for 19.2.4 2019-11-13 11:09:32 -08:00
49 changed files with 748 additions and 222 deletions

@@ -1 +1 @@
19.2.4
19.2.7

@@ -27,3 +27,14 @@ bcd9224728dcb8d8fe4bcddc4bd9b2c36fcfe9dd
869e32593a9096b845dd6106f8f86e1c41fac968
a2c3c65a31de90fdb55f76f2894860dfbafe2043
bb0c5c487e63e88acbb792f092dd8f392bad8540
# This is reverted shortly after it was landed
4432a2d14d80081d062f7939a950d65ea3a16eed
# These aren't relevant for 19.2
1a05811936dd8d0c3a367c6f00629624ef39d537
911a8261419f48dcd756f78832fa5a5f4c5b8d93
# This was manuall backported
2afeed301010917c4eae55dcd2544f9d329df934
4b392ced2d744fccffe95490ff57e6b41033c266

@@ -35,7 +35,7 @@ depends on the particular driver being used.
<h2>SHA256 checksum</h2>
<pre>
TBD.
09000a0f7dbbd82e193b81a8f1bf0c118eab7ca975c0329181968596e548e30f mesa-19.2.4.tar.xz
</pre>

115
docs/relnotes/19.2.5.html Normal file
@@ -0,0 +1,115 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.2.5 Release Notes / 2019-11-20</h1>
<p>
Mesa 19.2.5 is a bug fix release which fixes bugs found since the 19.2.4 release.
</p>
<p>
Mesa 19.2.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 19.2.5 implements the Vulkan 1.1 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksum</h2>
<pre>
3d010a366b28d10bdd71e32091d8684baf1522e6466c5c5703667091b2108c8b mesa-19.2.5.tar.xz
</pre>
<h2>New features</h2>
<ul>
<li>None</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>HSW. Tropico 6 and SuperTuxKart have shadows flickering</li>
<li>glxgears segfaults on POWER / Xvnc</li>
<li>Cannot start Civ6 with AMD GPU on Linux</li>
</ul>
<h2>Changes</h2>
<ul>
<p>Ben Crocker (1):</p>
<li> llvmpipe: use ppc64le/ppc64 Large code model for JIT-compiled shaders</li>
<p></p>
<p>Brian Paul (1):</p>
<li> Call shmget() with permission 0600 instead of 0777</li>
<p></p>
<p>Caio Marcelo de Oliveira Filho (1):</p>
<li> spirv: Don&#x27;t leak GS initialization to other stages</li>
<p></p>
<p>Danylo Piliaiev (1):</p>
<li> i965: Unify CC_STATE and BLEND_STATE atoms on Haswell as a workaround</li>
<p></p>
<p>Dylan Baker (2):</p>
<li> docs: Add SHA256 sum for for 19.2.4</li>
<li> cherry-ignore: Update for 19.2.4 cycle</li>
<p></p>
<p>Eric Engestrom (1):</p>
<li> egl: fix _EGL_NATIVE_PLATFORM fallback</li>
<p></p>
<p>Ian Romanick (2):</p>
<li> nir/algebraic: Add the ability to mark a replacement as exact</li>
<li> nir/algebraic: Mark other comparison exact when removing a == a</li>
<p></p>
<p>Illia Iorin (1):</p>
<li> mesa/main: Ignore filter state for MS texture completeness</li>
<p></p>
<p>Jason Ekstrand (1):</p>
<li> anv: Stop bounds-checking pushed UBOs</li>
<p></p>
<p>Lepton Wu (1):</p>
<li> gallium: dri2: Use index as plane number.</li>
<p></p>
<p>Lionel Landwerlin (3):</p>
<li> anv: invalidate file descriptor of semaphore sync fd at vkQueueSubmit</li>
<li> anv: remove list items on batch fini</li>
<li> anv/wsi: signal the semaphore in the acquireNextImage</li>
<p></p>
<p>Marek Olšák (3):</p>
<li> st/mesa: fix Sanctuary and Tropics by disabling ARB_gpu_shader5 for them</li>
<li> tgsi_to_nir: fix masked out image loads</li>
<li> tgsi_to_nir: handle PIPE_FORMAT_NONE in image opcodes</li>
<p></p>
<p>Paulo Zanoni (1):</p>
<li> intel/compiler: fix nir_op_{i,u}*32 on ICL</li>
<p></p>
<p>Pierre-Eric Pelloux-Prayer (3):</p>
<li> radeonsi: disable sdma for gfx10</li>
<li> radeonsi: tell the shader disk cache what IR is used</li>
<li> radeonsi: fix shader disk cache key</li>
<p></p>
<p></p>
</ul>
</div>
</body>
</html>

87
docs/relnotes/19.2.6.html Normal file
@@ -0,0 +1,87 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.2.6 Release Notes / 2019-11-21</h1>
<p>
Mesa 19.2.6 is a bug fix release which fixes bugs found since the 19.2.5 release.
</p>
<p>
Mesa 19.2.6 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 19.2.6 implements the Vulkan 1.1 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksum</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<ul>
<li>None</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>glesv2.pc is not built since fafd20f67dec9f589</li>
<li>textureSize(samplerExternalOES, int) missing in desktop mesa 19.1.7 implementation</li>
<li>[19.2.5] lp_bld_misc: broken #if PIPE_ARCH_LITTLE_ENDIAN on ppc64l</li>
</ul>
<h2>Changes</h2>
<ul>
<p>Alejandro Piñeiro (1):</p>
<li> v3d: adds an extra MOV for any sig.ld*</li>
<p></p>
<p>Dave Airlie (1):</p>
<li> llvmpipe/ppc: fix if/ifdef confusion in backport.</li>
<p></p>
<p>Dylan Baker (2):</p>
<li> docs/relnotes/19.2.5: Add SHA256 sum</li>
<li> meson: generate .pc files for gles and gles2 with old glvnd</li>
<p></p>
<p>Eric Engestrom (1):</p>
<li> vulkan: delete typo&#x27;d header</li>
<p></p>
<p>Hyunjun Ko (1):</p>
<li> freedreno/ir3: fix printing output registers of FS.</li>
<p></p>
<p>Jose Maria Casanova Crespo (1):</p>
<li> v3d: Fix predication with atomic image operations</li>
<p></p>
<p>Yevhenii Kolesnikov (1):</p>
<li> glsl: Enable textureSize for samplerExternalOES</li>
<p></p>
<p></p>
</ul>
</div>
</body>
</html>

docs/relnotes/19.2.7.html (new file, 96 lines)

@@ -0,0 +1,96 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.2.7 Release Notes / 2019-12-04</h1>
<p>
Mesa 19.2.7 is a bug fix release which fixes bugs found since the 19.2.6 release.
</p>
<p>
Mesa 19.2.7 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 19.2.7 implements the Vulkan 1.1 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksum</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<ul>
<li>None</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>ld.lld: error: duplicate symbol (mesa-19.3.0-rc1)</li>
<li>triangle strip clipping with GL_FIRST_VERTEX_CONVENTION causes wrong vertex&#x27;s attribute to be broadcasted for flat interpolation</li>
<li>[bisected][regression][g45,g965,ilk] piglit arb_fragment_program kil failures</li>
</ul>
<h2>Changes</h2>
<ul>
<p>Bas Nieuwenhuizen (2):</p>
<li> radv: Allocate cmdbuffer space for buffer marker write.</li>
<li> radv: Unify max_descriptor_set_size.</li>
<p></p>
<p>Boris Brezillon (1):</p>
<li> gallium: Fix the -&gt;set_damage_region() implementation</li>
<p></p>
<p>Ian Romanick (1):</p>
<li> intel/fs: Disable conditional discard optimization on Gen4 and Gen5</li>
<p></p>
<p>Jason Ekstrand (1):</p>
<li> anv: Set up SBE_SWIZ properly for gl_Viewport</li>
<p></p>
<p>Jonathan Gray (2):</p>
<li> winsys/amdgpu: avoid double simple_mtx_unlock()</li>
<li> i965: update Makefile.sources for perf changes</li>
<p></p>
<p>Rhys Perry (1):</p>
<li> radv: set writes_memory for global memory stores/atomics</li>
<p></p>
<p>Samuel Pitoiset (3):</p>
<li> radv: fix enabling sample shading with SampleID/SamplePosition</li>
<li> radv/gfx10: fix implementation of exclusive scans</li>
<li> radv: fix compute pipeline keys when optimizations are disabled</li>
<p></p>
<p>Yevhenii Kolesnikov (1):</p>
<li> meson: Fix linkage of libgallium_nine with libgalliumvl</li>
<p></p>
<p>Zebediah Figura (1):</p>
<li> Revert &quot;draw: revert using correct order for prim decomposition.&quot;</li>
<p></p>
<p></p>
</ul>
</div>
</body>
</html>

View File

@@ -1,54 +0,0 @@
#ifndef VULKAN_XLIB_RANDR_H_
#define VULKAN_XLIB_RANDR_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2017 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_EXT_acquire_xlib_display 1
#define VK_EXT_ACQUIRE_XLIB_DISPLAY_SPEC_VERSION 1
#define VK_EXT_ACQUIRE_XLIB_DISPLAY_EXTENSION_NAME "VK_EXT_acquire_xlib_display"
typedef VkResult (VKAPI_PTR *PFN_vkAcquireXlibDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, VkDisplayKHR display);
typedef VkResult (VKAPI_PTR *PFN_vkGetRandROutputDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, RROutput rrOutput, VkDisplayKHR* pDisplay);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkAcquireXlibDisplayEXT(
VkPhysicalDevice physicalDevice,
Display* dpy,
VkDisplayKHR display);
VKAPI_ATTR VkResult VKAPI_CALL vkGetRandROutputDisplayEXT(
VkPhysicalDevice physicalDevice,
Display* dpy,
RROutput rrOutput,
VkDisplayKHR* pDisplay);
#endif
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -4218,8 +4218,43 @@ ac_build_scan(struct ac_llvm_context *ctx, nir_op op, LLVMValueRef src, LLVMValu
{
LLVMValueRef result, tmp;
if (ctx->chip_class >= GFX10) {
result = inclusive ? src : identity;
if (inclusive) {
result = src;
} else if (ctx->chip_class >= GFX10) {
/* wavefront shift_right by 1 on GFX10 (emulate dpp_wf_sr1) */
LLVMValueRef active, tmp1, tmp2;
LLVMValueRef tid = ac_get_thread_id(ctx);
tmp1 = ac_build_dpp(ctx, identity, src, dpp_row_sr(1), 0xf, 0xf, false);
tmp2 = ac_build_permlane16(ctx, src, (uint64_t)~0, true, false);
if (maxprefix > 32) {
active = LLVMBuildICmp(ctx->builder, LLVMIntEQ, tid,
LLVMConstInt(ctx->i32, 32, false), "");
tmp2 = LLVMBuildSelect(ctx->builder, active,
ac_build_readlane(ctx, src,
LLVMConstInt(ctx->i32, 31, false)),
tmp2, "");
active = LLVMBuildOr(ctx->builder, active,
LLVMBuildICmp(ctx->builder, LLVMIntEQ,
LLVMBuildAnd(ctx->builder, tid,
LLVMConstInt(ctx->i32, 0x1f, false), ""),
LLVMConstInt(ctx->i32, 0x10, false), ""), "");
src = LLVMBuildSelect(ctx->builder, active, tmp2, tmp1, "");
} else if (maxprefix > 16) {
active = LLVMBuildICmp(ctx->builder, LLVMIntEQ, tid,
LLVMConstInt(ctx->i32, 16, false), "");
src = LLVMBuildSelect(ctx->builder, active, tmp2, tmp1, "");
}
result = src;
} else if (ctx->chip_class >= GFX8) {
src = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 0xf, 0xf, false);
result = src;
} else {
if (!inclusive)
src = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 0xf, 0xf, false);
@@ -4249,33 +4284,31 @@ ac_build_scan(struct ac_llvm_context *ctx, nir_op op, LLVMValueRef src, LLVMValu
return result;
if (ctx->chip_class >= GFX10) {
/* dpp_row_bcast{15,31} are not supported on gfx10. */
LLVMBuilderRef builder = ctx->builder;
LLVMValueRef tid = ac_get_thread_id(ctx);
LLVMValueRef cc;
/* TODO-GFX10: Can we get better code-gen by putting this into
* a branch so that LLVM generates EXEC mask manipulations? */
if (inclusive)
tmp = result;
else
tmp = ac_build_alu_op(ctx, result, src, op);
tmp = ac_build_permlane16(ctx, tmp, ~(uint64_t)0, true, false);
tmp = ac_build_alu_op(ctx, result, tmp, op);
cc = LLVMBuildAnd(builder, tid, LLVMConstInt(ctx->i32, 16, false), "");
cc = LLVMBuildICmp(builder, LLVMIntNE, cc, ctx->i32_0, "");
result = LLVMBuildSelect(builder, cc, tmp, result, "");
LLVMValueRef active;
tmp = ac_build_permlane16(ctx, result, ~(uint64_t)0, true, false);
active = LLVMBuildICmp(ctx->builder, LLVMIntNE,
LLVMBuildAnd(ctx->builder, tid,
LLVMConstInt(ctx->i32, 16, false), ""),
ctx->i32_0, "");
tmp = LLVMBuildSelect(ctx->builder, active, tmp, identity, "");
result = ac_build_alu_op(ctx, result, tmp, op);
if (maxprefix <= 32)
return result;
if (inclusive)
tmp = result;
else
tmp = ac_build_alu_op(ctx, result, src, op);
tmp = ac_build_readlane(ctx, tmp, LLVMConstInt(ctx->i32, 31, false));
tmp = ac_build_alu_op(ctx, result, tmp, op);
cc = LLVMBuildICmp(builder, LLVMIntUGE, tid,
LLVMConstInt(ctx->i32, 32, false), "");
result = LLVMBuildSelect(builder, cc, tmp, result, "");
tmp = ac_build_readlane(ctx, result, LLVMConstInt(ctx->i32, 31, false));
active = LLVMBuildICmp(ctx->builder, LLVMIntUGE, tid,
LLVMConstInt(ctx->i32, 32, false), "");
tmp = LLVMBuildSelect(ctx->builder, active, tmp, identity, "");
result = ac_build_alu_op(ctx, result, tmp, op);
return result;
}
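
Reviewer note: the exclusive path in the hunk above first shifts the whole wave right by one lane (feeding `identity` into lane 0) and then runs the usual inclusive scan. A minimal CPU-side sketch of that relationship for integer add follows. The `ref_*` names are hypothetical and are not part of Mesa; the sketch models only the arithmetic, not DPP or permlane behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU reference, not Mesa code: shift every lane right by one
 * and feed the identity into lane 0, which is the effect the GFX10 path
 * emulates because dpp_wf_sr1 is only used on the GFX8 branch. */
static void
ref_wave_shift_right_1(const uint32_t *src, uint32_t *dst, size_t lanes,
                       uint32_t identity)
{
   assert(src != dst);
   for (size_t lane = 0; lane < lanes; lane++)
      dst[lane] = lane == 0 ? identity : src[lane - 1];
}

/* In-place inclusive integer-add scan across all lanes. */
static void
ref_inclusive_iadd(uint32_t *v, size_t lanes)
{
   for (size_t lane = 1; lane < lanes; lane++)
      v[lane] += v[lane - 1];
}

/* Exclusive scan expressed as an inclusive scan of the shifted input. */
static void
ref_exclusive_iadd(const uint32_t *src, uint32_t *dst, size_t lanes)
{
   ref_wave_shift_right_1(src, dst, lanes, 0 /* identity of iadd */);
   ref_inclusive_iadd(dst, lanes);
}
```

A reference like this could be compared lane by lane against GPU results when checking the `dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*` failures mentioned in the commit message.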

View File

@@ -6001,6 +6001,8 @@ void radv_CmdWriteBufferMarkerAMD(
si_emit_cache_flush(cmd_buffer);
ASSERTED unsigned cdw_max = radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 12);
if (!(pipelineStage & ~VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT)) {
radeon_emit(cs, PKT3(PKT3_COPY_DATA, 4, 0));
radeon_emit(cs, COPY_DATA_SRC_SEL(COPY_DATA_IMM) |
@@ -6020,4 +6022,6 @@ void radv_CmdWriteBufferMarkerAMD(
va, marker,
cmd_buffer->gfx9_eop_bug_va);
}
assert(cmd_buffer->cs->cdw <= cdw_max);
}

View File

@@ -1075,6 +1075,24 @@ void radv_GetPhysicalDeviceFeatures2(
return radv_GetPhysicalDeviceFeatures(physicalDevice, &pFeatures->features);
}
static size_t
radv_max_descriptor_set_size()
{
/* make sure that the entire descriptor set is addressable with a signed
* 32-bit int. So the sum of all limits scaled by descriptor size has to
* be at most 2 GiB. the combined image & samples object count as one of
* both. This limit is for the pipeline layout, not for the set layout, but
* there is no set limit, so we just set a pipeline limit. I don't think
* any app is going to hit this soon. */
return ((1ull << 31) - 16 * MAX_DYNAMIC_BUFFERS
- MAX_INLINE_UNIFORM_BLOCK_SIZE * MAX_INLINE_UNIFORM_BLOCK_COUNT) /
(32 /* uniform buffer, 32 due to potential space wasted on alignment */ +
32 /* storage buffer, 32 due to potential space wasted on alignment */ +
32 /* sampler, largest when combined with image */ +
64 /* sampled image */ +
64 /* storage image */);
}
void radv_GetPhysicalDeviceProperties(
VkPhysicalDevice physicalDevice,
VkPhysicalDeviceProperties* pProperties)
@@ -1082,18 +1100,7 @@ void radv_GetPhysicalDeviceProperties(
RADV_FROM_HANDLE(radv_physical_device, pdevice, physicalDevice);
VkSampleCountFlags sample_counts = 0xf;
/* make sure that the entire descriptor set is addressable with a signed
* 32-bit int. So the sum of all limits scaled by descriptor size has to
* be at most 2 GiB. the combined image & samples object count as one of
* both. This limit is for the pipeline layout, not for the set layout, but
* there is no set limit, so we just set a pipeline limit. I don't think
* any app is going to hit this soon. */
size_t max_descriptor_set_size = ((1ull << 31) - 16 * MAX_DYNAMIC_BUFFERS) /
(32 /* uniform buffer, 32 due to potential space wasted on alignment */ +
32 /* storage buffer, 32 due to potential space wasted on alignment */ +
32 /* sampler, largest when combined with image */ +
64 /* sampled image */ +
64 /* storage image */);
size_t max_descriptor_set_size = radv_max_descriptor_set_size();
VkPhysicalDeviceLimits limits = {
.maxImageDimension1D = (1 << 14),
@@ -1362,13 +1369,7 @@ void radv_GetPhysicalDeviceProperties2(
properties->robustBufferAccessUpdateAfterBind = false;
properties->quadDivergentImplicitLod = false;
size_t max_descriptor_set_size = ((1ull << 31) - 16 * MAX_DYNAMIC_BUFFERS -
MAX_INLINE_UNIFORM_BLOCK_SIZE * MAX_INLINE_UNIFORM_BLOCK_COUNT) /
(32 /* uniform buffer, 32 due to potential space wasted on alignment */ +
32 /* storage buffer, 32 due to potential space wasted on alignment */ +
32 /* sampler, largest when combined with image */ +
64 /* sampled image */ +
64 /* storage image */);
size_t max_descriptor_set_size = radv_max_descriptor_set_size();
properties->maxPerStageDescriptorUpdateAfterBindSamplers = max_descriptor_set_size;
properties->maxPerStageDescriptorUpdateAfterBindUniformBuffers = max_descriptor_set_size;
properties->maxPerStageDescriptorUpdateAfterBindStorageBuffers = max_descriptor_set_size;
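
Reviewer note: the two expressions removed above differed only in the inline-uniform-block term, which is why unifying them changes the reported limit. The sketch below redoes the arithmetic with the limits passed in as parameters. The function name is hypothetical, and any concrete values supplied for `MAX_DYNAMIC_BUFFERS`, `MAX_INLINE_UNIFORM_BLOCK_SIZE` and `MAX_INLINE_UNIFORM_BLOCK_COUNT` are assumptions for illustration, since their definitions are not part of this diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for radv_max_descriptor_set_size(); the three
 * limits are parameters because their real values are not shown here. */
static uint64_t
ex_max_descriptor_set_size(uint64_t max_dynamic_buffers,
                           uint64_t inline_block_size,
                           uint64_t inline_block_count)
{
   /* Bytes taken out of the 2 GiB budget before dividing. */
   const uint64_t reserved = 16 * max_dynamic_buffers +
                             inline_block_size * inline_block_count;
   /* Worst-case bytes per descriptor, summed as in the driver comment. */
   const uint64_t per_descriptor = 32 /* uniform buffer */ +
                                   32 /* storage buffer */ +
                                   32 /* sampler */ +
                                   64 /* sampled image */ +
                                   64 /* storage image */;

   assert(reserved < (UINT64_C(1) << 31));
   return ((UINT64_C(1) << 31) - reserved) / per_descriptor;
}
```

Calling it once with a zero inline-uniform product and once with a non-zero one reproduces the discrepancy between the old `radv_GetPhysicalDeviceProperties` and `radv_GetPhysicalDeviceProperties2` values.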

View File

@@ -1122,15 +1122,32 @@ radv_pipeline_init_multisample_state(struct radv_pipeline *pipeline,
int ps_iter_samples = 1;
uint32_t mask = 0xffff;
if (vkms)
if (vkms) {
ms->num_samples = vkms->rasterizationSamples;
else
ms->num_samples = 1;
if (vkms)
ps_iter_samples = radv_pipeline_get_ps_iter_samples(vkms);
if (vkms && !vkms->sampleShadingEnable && pipeline->shaders[MESA_SHADER_FRAGMENT]->info.info.ps.force_persample) {
ps_iter_samples = ms->num_samples;
/* From the Vulkan 1.1.129 spec, 26.7. Sample Shading:
*
* "Sample shading is enabled for a graphics pipeline:
*
* - If the interface of the fragment shader entry point of the
* graphics pipeline includes an input variable decorated
* with SampleId or SamplePosition. In this case
* minSampleShadingFactor takes the value 1.0.
* - Else if the sampleShadingEnable member of the
* VkPipelineMultisampleStateCreateInfo structure specified
* when creating the graphics pipeline is set to VK_TRUE. In
* this case minSampleShadingFactor takes the value of
* VkPipelineMultisampleStateCreateInfo::minSampleShading.
*
* Otherwise, sample shading is considered disabled."
*/
if (pipeline->shaders[MESA_SHADER_FRAGMENT]->info.info.ps.force_persample) {
ps_iter_samples = ms->num_samples;
} else {
ps_iter_samples = radv_pipeline_get_ps_iter_samples(vkms);
}
} else {
ms->num_samples = 1;
}
const struct VkPipelineRasterizationStateRasterizationOrderAMD *raster_order =
@@ -4738,6 +4755,19 @@ radv_compute_generate_pm4(struct radv_pipeline *pipeline)
assert(pipeline->cs.cdw <= pipeline->cs.max_dw);
}
static struct radv_pipeline_key
radv_generate_compute_pipeline_key(struct radv_pipeline *pipeline,
const VkComputePipelineCreateInfo *pCreateInfo)
{
struct radv_pipeline_key key;
memset(&key, 0, sizeof(key));
if (pCreateInfo->flags & VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT)
key.optimisations_disabled = 1;
return key;
}
static VkResult radv_compute_pipeline_create(
VkDevice _device,
VkPipelineCache _cache,
@@ -4770,7 +4800,11 @@ static VkResult radv_compute_pipeline_create(
stage_feedbacks[MESA_SHADER_COMPUTE] = &creation_feedback->pPipelineStageCreationFeedbacks[0];
pStages[MESA_SHADER_COMPUTE] = &pCreateInfo->stage;
radv_create_shaders(pipeline, device, cache, &(struct radv_pipeline_key) {0}, pStages, pCreateInfo->flags, pipeline_feedback, stage_feedbacks);
struct radv_pipeline_key key =
radv_generate_compute_pipeline_key(pipeline, pCreateInfo);
radv_create_shaders(pipeline, device, cache, &key, pStages, pCreateInfo->flags, pipeline_feedback, stage_feedbacks);
pipeline->user_data_0[MESA_SHADER_COMPUTE] = radv_pipeline_stage_to_user_data_0(pipeline, MESA_SHADER_COMPUTE, device->physical_device->rad_info.chip_class);
pipeline->need_indirect_descriptor_sets |= pipeline->shaders[MESA_SHADER_COMPUTE]->info.need_indirect_descriptor_sets;
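
Reviewer note: the compute-key change matters because a zeroed key was previously passed regardless of the create flags, so an optimized and an unoptimized build of the same shader could not be told apart. A minimal sketch of the idea follows. `struct ex_pipeline_key` and the `ex_*` helpers are hypothetical simplifications, not the radv types; the flag value 0x00000001 is the one the Vulkan headers assign to `VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same numeric value as VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT. */
#define EX_DISABLE_OPTIMIZATION_BIT 0x00000001u

/* Hypothetical, reduced stand-in for struct radv_pipeline_key. */
struct ex_pipeline_key {
   uint32_t optimisations_disabled;
};

/* Zero the key first so that unused bytes never influence comparisons. */
static void
ex_generate_compute_key(struct ex_pipeline_key *key, uint32_t create_flags)
{
   assert(key != NULL);
   memset(key, 0, sizeof(*key));
   if (create_flags & EX_DISABLE_OPTIMIZATION_BIT)
      key->optimisations_disabled = 1;
}

/* Byte-wise comparison, the way a hash or cache lookup would see the key. */
static int
ex_keys_equal(const struct ex_pipeline_key *a, const struct ex_pipeline_key *b)
{
   return memcmp(a, b, sizeof(*a)) == 0;
}
```

With the flag folded into the key, the two variants compare unequal, so a lookup for the optimized variant should no longer return the unoptimized one.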

View File

@@ -151,6 +151,13 @@ set_output_usage_mask(const nir_shader *nir, const nir_intrinsic_instr *instr,
((wrmask >> (i * 4)) & 0xf) << comp;
}
static void
set_writes_memory(const nir_shader *nir, struct radv_shader_info *info)
{
if (nir->info.stage == MESA_SHADER_FRAGMENT)
info->ps.writes_memory = true;
}
static void
gather_intrinsic_store_deref_info(const nir_shader *nir,
const nir_intrinsic_instr *instr,
@@ -304,8 +311,7 @@ gather_intrinsic_info(const nir_shader *nir, const nir_intrinsic_instr *instr,
instr->intrinsic == nir_intrinsic_image_deref_atomic_xor ||
instr->intrinsic == nir_intrinsic_image_deref_atomic_exchange ||
instr->intrinsic == nir_intrinsic_image_deref_atomic_comp_swap) {
if (nir->info.stage == MESA_SHADER_FRAGMENT)
info->ps.writes_memory = true;
set_writes_memory(nir, info);
}
break;
}
@@ -320,15 +326,28 @@ gather_intrinsic_info(const nir_shader *nir, const nir_intrinsic_instr *instr,
case nir_intrinsic_ssbo_atomic_xor:
case nir_intrinsic_ssbo_atomic_exchange:
case nir_intrinsic_ssbo_atomic_comp_swap:
if (nir->info.stage == MESA_SHADER_FRAGMENT)
info->ps.writes_memory = true;
set_writes_memory(nir, info);
break;
case nir_intrinsic_load_deref:
gather_intrinsic_load_deref_info(nir, instr, info);
break;
case nir_intrinsic_store_deref:
gather_intrinsic_store_deref_info(nir, instr, info);
/* fallthrough */
case nir_intrinsic_deref_atomic_add:
case nir_intrinsic_deref_atomic_imin:
case nir_intrinsic_deref_atomic_umin:
case nir_intrinsic_deref_atomic_imax:
case nir_intrinsic_deref_atomic_umax:
case nir_intrinsic_deref_atomic_and:
case nir_intrinsic_deref_atomic_or:
case nir_intrinsic_deref_atomic_xor:
case nir_intrinsic_deref_atomic_exchange:
case nir_intrinsic_deref_atomic_comp_swap: {
if (nir_src_as_deref(instr->src[0])->mode & (nir_var_mem_global | nir_var_mem_ssbo))
set_writes_memory(nir, info);
break;
}
default:
break;
}

View File

@@ -406,6 +406,20 @@ ntq_init_ssa_def(struct v3d_compile *c, nir_ssa_def *def)
return qregs;
}
static bool
is_ld_signal(const struct v3d_qpu_sig *sig)
{
return (sig->ldunif ||
sig->ldunifa ||
sig->ldunifrf ||
sig->ldunifarf ||
sig->ldtmu ||
sig->ldvary ||
sig->ldvpm ||
sig->ldtlb ||
sig->ldtlbu);
}
/**
* This function is responsible for getting VIR results into the associated
* storage for a NIR instruction.
@@ -453,11 +467,12 @@ ntq_store_dest(struct v3d_compile *c, nir_dest *dest, int chan,
_mesa_hash_table_search(c->def_ht, reg);
struct qreg *qregs = entry->data;
/* Insert a MOV if the source wasn't an SSA def in the
* previous instruction.
/* If the previous instruction can't be predicated for
* the store into the nir_register, then emit a MOV
* that can be.
*/
if ((vir_in_nonuniform_control_flow(c) &&
c->defs[last_inst->dst.index]->qpu.sig.ldunif)) {
if (vir_in_nonuniform_control_flow(c) &&
is_ld_signal(&c->defs[last_inst->dst.index]->qpu.sig)) {
result = vir_MOV(c, result);
last_inst = c->defs[result.index];
}

View File

@@ -388,9 +388,21 @@ v3d40_vir_emit_image_load_store(struct v3d_compile *c,
}
}
if (vir_in_nonuniform_control_flow(c) &&
instr->intrinsic != nir_intrinsic_image_deref_load) {
vir_set_pf(vir_MOV_dest(c, vir_nop_reg(), c->execute),
V3D_QPU_PF_PUSHZ);
}
vir_TMU_WRITE(c, V3D_QPU_WADDR_TMUSF, ntq_get_src(c, instr->src[1], 0),
&tmu_writes);
if (vir_in_nonuniform_control_flow(c) &&
instr->intrinsic != nir_intrinsic_image_deref_load) {
struct qinst *last_inst= (struct qinst *)c->cur_block->instructions.prev;
vir_set_cond(last_inst, V3D_QPU_COND_IFA);
}
vir_emit_thrsw(c);
/* The input FIFO has 16 slots across all threads, so make sure we

View File

@@ -2094,6 +2094,8 @@ builtin_builder::create_builtins()
_textureSize(texture_multisample_array, glsl_type::ivec3_type, glsl_type::sampler2DMSArray_type),
_textureSize(texture_multisample_array, glsl_type::ivec3_type, glsl_type::isampler2DMSArray_type),
_textureSize(texture_multisample_array, glsl_type::ivec3_type, glsl_type::usampler2DMSArray_type),
_textureSize(texture_external_es3, glsl_type::ivec2_type, glsl_type::samplerExternalOES_type),
NULL);
add_function("textureSize1D",

View File

@@ -200,7 +200,7 @@ class Value(object):
${val.cond if val.cond else 'NULL'},
${val.swizzle()},
% elif isinstance(val, Expression):
${'true' if val.inexact else 'false'},
${'true' if val.inexact else 'false'}, ${'true' if val.exact else 'false'},
${val.comm_expr_idx}, ${val.comm_exprs},
${val.c_opcode()},
{ ${', '.join(src.c_value_ptr(cache) for src in val.sources)} },
@@ -348,7 +348,7 @@ class Variable(Value):
return '{' + ', '.join([str(swizzles[c]) for c in self.swiz[1:]]) + '}'
return '{0, 1, 2, 3}'
_opcode_re = re.compile(r"(?P<inexact>~)?(?P<opcode>\w+)(?:@(?P<bits>\d+))?"
_opcode_re = re.compile(r"(?P<inexact>~)?(?P<exact>!)?(?P<opcode>\w+)(?:@(?P<bits>\d+))?"
r"(?P<cond>\([^\)]+\))?")
class Expression(Value):
@@ -362,8 +362,12 @@ class Expression(Value):
self.opcode = m.group('opcode')
self._bit_size = int(m.group('bits')) if m.group('bits') else None
self.inexact = m.group('inexact') is not None
self.exact = m.group('exact') is not None
self.cond = m.group('cond')
assert not self.inexact or not self.exact, \
'Expression cannot be both exact and inexact.'
# "many-comm-expr" isn't really a condition. It's notification to the
# generator that this pattern is known to have too many commutative
# expressions, and an error should not be generated for this case.

View File

@@ -69,6 +69,9 @@ e = 'e'
# expression this indicates that the constructed value should have that
# bit-size.
#
# If the opcode in a replacement expression is prefixed by a '!' character,
# this indicates that the new expression will be marked exact.
#
# A special condition "many-comm-expr" can be used with expressions to note
# that the expression and its subexpressions have more commutative expressions
# than nir_replace_instr can handle. If this special condition is needed with
@@ -1327,8 +1330,8 @@ optimizations += [(bitfield_reverse('x@32'), ('bitfield_reverse', 'x'), '!option
# and, if a is a NaN then the second comparison will fail anyway.
for op in ['flt', 'fge', 'feq']:
optimizations += [
(('iand', ('feq', a, a), (op, a, b)), (op, a, b)),
(('iand', ('feq', a, a), (op, b, a)), (op, b, a)),
(('iand', ('feq', a, a), (op, a, b)), ('!' + op, a, b)),
(('iand', ('feq', a, a), (op, b, a)), ('!' + op, b, a)),
]
# Add optimizations to handle the case where the result of a ternary is

View File

@@ -472,7 +472,7 @@ construct_value(nir_builder *build,
* expression we are replacing has any exact values, the entire
* replacement should be exact.
*/
alu->exact = state->has_exact_alu;
alu->exact = state->has_exact_alu || expr->exact;
for (unsigned i = 0; i < nir_op_infos[op].num_inputs; i++) {
/* If the source is an explicitly sized source, then we need to reset

View File

@@ -138,6 +138,9 @@ typedef struct {
*/
bool inexact;
/** In a replacement, requests that the instruction be marked exact. */
bool exact;
/* Commutative expression index. This is assigned by opt_algebraic.py when
* search structures are constructed and is a unique (to this structure)
* index within the commutative operation bitfield used for searching for

View File

@@ -4647,7 +4647,8 @@ spirv_to_nir(const uint32_t *words, size_t word_count,
}
/* Set shader info defaults */
b->shader->info.gs.invocations = 1;
if (stage == MESA_SHADER_GEOMETRY)
b->shader->info.gs.invocations = 1;
b->specializations = spec;
b->num_specializations = num_spec;

View File

@@ -135,15 +135,6 @@ _eglNativePlatformDetectNativeDisplay(void *nativeDisplay)
if (first_pointer == gbm_create_device)
return _EGL_PLATFORM_DRM;
#endif
#ifdef HAVE_X11_PLATFORM
/* If not matched to any other platform, fallback to x11. */
return _EGL_PLATFORM_X11;
#endif
#ifdef HAVE_HAIKU_PLATFORM
return _EGL_PLATFORM_HAIKU;
#endif
}
return _EGL_INVALID_PLATFORM;

View File

@@ -405,8 +405,9 @@ ir3_shader_disasm(struct ir3_shader_variant *so, uint32_t *bin, FILE *out)
fprintf(out, "; %s: outputs:", type);
for (i = 0; i < so->outputs_count; i++) {
uint8_t regid = so->outputs[i].regid;
fprintf(out, " r%d.%c (%s)",
(regid >> 2), "xyzw"[regid & 0x3],
const char *reg_type = so->outputs[i].half ? "hr" : "r";
fprintf(out, " %s%d.%c (%s)",
reg_type, (regid >> 2), "xyzw"[regid & 0x3],
output_name(so, i));
}
fprintf(out, "\n");
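
Reviewer note: the register naming used in the hunk above packs the register number into the upper bits of `regid` and the component into the low two bits, with the new `hr` prefix marking half-precision outputs. A small standalone sketch of that formatting follows; `ex_format_reg` is a hypothetical helper, not a function in ir3.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a register the way the disassembly line
 * above does, e.g. "r1.y" for full and "hr3.z" for half registers.
 * Returns the snprintf() result. */
static int
ex_format_reg(char *buf, size_t size, uint8_t regid, int half)
{
   const char *reg_type = half ? "hr" : "r";

   assert(buf != NULL && size > 0);
   return snprintf(buf, size, "%s%d.%c",
                   reg_type, regid >> 2, "xyzw"[regid & 0x3]);
}
```

For example, `regid` 5 decodes to register 1, component `y`, and `regid` 14 decodes to register 3, component `z`.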

View File

@@ -3,8 +3,6 @@
const boolean quads_flatshade_last = \
draw->quads_always_flatshade_last; \
const boolean last_vertex_last = \
!(draw->rasterizer->flatshade && \
draw->rasterizer->flatshade_first);
/* FIXME: the draw->rasterizer->flatshade part is really wrong */
!draw->rasterizer->flatshade_first;
#include "draw_decompose_tmp.h"

View File

@@ -692,7 +692,20 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
* when not using MCJIT so no instructions are generated which the old JIT
* can't handle. Not entirely sure if we really need to do anything yet.
*/
#if defined(PIPE_ARCH_LITTLE_ENDIAN) && defined(PIPE_ARCH_PPC_64)
#ifdef PIPE_ARCH_PPC_64
/*
* Large programs, e.g. gnome-shell and firefox, may tax the addressability
* of the Medium code model once dynamically generated JIT-compiled shader
* programs are linked in and relocated. Yet the default code model as of
* LLVM 8 is Medium or even Small.
* The cost of changing from Medium to Large is negligible:
* - an additional 8-byte pointer stored immediately before the shader entrypoint;
* - change an add-immediate (addis) instruction to a load (ld).
*/
builder.setCodeModel(CodeModel::Large);
#ifdef PIPE_ARCH_LITTLE_ENDIAN
/*
* Versions of LLVM prior to 4.0 lacked a table entry for "POWER8NVL",
* resulting in (big-endian) "generic" being returned on
@@ -704,6 +717,7 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
*/
if (MCPU == "generic")
MCPU = "pwr8";
#endif
#endif
builder.setMCPU(MCPU);
if (gallivm_debug & (GALLIVM_DEBUG_IR | GALLIVM_DEBUG_ASM | GALLIVM_DEBUG_DUMP_BC)) {

View File

@@ -1727,6 +1727,9 @@ static GLenum
get_image_format(struct tgsi_full_instruction *tgsi_inst)
{
switch (tgsi_inst->Memory.Format) {
case PIPE_FORMAT_NONE:
return GL_NONE;
case PIPE_FORMAT_R8_UNORM:
return GL_R8;
case PIPE_FORMAT_R8G8_UNORM:
@@ -1922,8 +1925,7 @@ ttn_mem(struct ttn_compile *c, nir_alu_dest dest, nir_ssa_def **src)
if (tgsi_inst->Instruction.Opcode == TGSI_OPCODE_LOAD) {
nir_ssa_dest_init(&instr->instr, &instr->dest,
util_last_bit(tgsi_inst->Dst[0].Register.WriteMask),
nir_ssa_dest_init(&instr->instr, &instr->dest, instr->num_components,
32, NULL);
nir_builder_instr_insert(b, &instr->instr);
ttn_move_dest(b, dest, &instr->dest.ssa);

View File

@@ -20,6 +20,7 @@ DRI_CONF_SECTION_DEBUG
DRI_CONF_FORCE_GLSL_EXTENSIONS_WARN("false")
DRI_CONF_DISABLE_GLSL_LINE_CONTINUATIONS("false")
DRI_CONF_DISABLE_BLEND_FUNC_EXTENDED("false")
DRI_CONF_DISABLE_ARB_GPU_SHADER5("false")
DRI_CONF_FORCE_GLSL_VERSION(0)
DRI_CONF_ALLOW_GLSL_EXTENSION_DIRECTIVE_MIDSHADER("false")
DRI_CONF_ALLOW_GLSL_BUILTIN_CONST_EXPRESSION("false")

View File

@@ -460,7 +460,13 @@ static struct pipe_context *si_create_context(struct pipe_screen *screen,
if (!sctx->ctx)
goto fail;
if (sscreen->info.num_sdma_rings && !(sscreen->debug_flags & DBG(NO_ASYNC_DMA))) {
if (sscreen->info.num_sdma_rings &&
!(sscreen->debug_flags & DBG(NO_ASYNC_DMA)) &&
/* SDMA sometimes times out on gfx10, so disable it for now. See:
* https://bugs.freedesktop.org/show_bug.cgi?id=111481
* https://gitlab.freedesktop.org/mesa/mesa/issues/1907
*/
(sctx->chip_class != GFX10 || sscreen->debug_flags & DBG(FORCE_DMA))) {
sctx->dma_cs = sctx->ws->cs_create(sctx->ctx, RING_DMA,
(void*)si_flush_dma_cs,
sctx, stop_exec_on_failure);
@@ -871,6 +877,10 @@ static void si_disk_cache_create(struct si_screen *sscreen)
/* These flags affect shader compilation. */
#define ALL_FLAGS (DBG(SI_SCHED) | DBG(GISEL))
uint64_t shader_debug_flags = sscreen->debug_flags & ALL_FLAGS;
/* Reserve left-most bit for tgsi/nir selector */
assert(!(shader_debug_flags & (1u << 31)));
shader_debug_flags |= (uint32_t)
((sscreen->options.enable_nir & 0x1) << 31);
/* Add the high bits of 32-bit addresses, which affects
* how 32-bit addresses are expanded to 64 bits.
@@ -993,6 +1003,13 @@ radeonsi_screen_create_impl(struct radeon_winsys *ws,
return NULL;
}
{
#define OPT_BOOL(name, dflt, description) \
sscreen->options.name = \
driQueryOptionb(config->options, "radeonsi_"#name);
#include "si_debug_options.h"
}
si_disk_cache_create(sscreen);
/* Determine the number of shader compiler threads. */
@@ -1119,13 +1136,6 @@ radeonsi_screen_create_impl(struct radeon_winsys *ws,
sscreen->commutative_blend_add =
driQueryOptionb(config->options, "radeonsi_commutative_blend_add");
{
#define OPT_BOOL(name, dflt, description) \
sscreen->options.name = \
driQueryOptionb(config->options, "radeonsi_"#name);
#include "si_debug_options.h"
}
sscreen->has_gfx9_scissor_bug = sscreen->info.family == CHIP_VEGA10 ||
sscreen->info.family == CHIP_RAVEN;
sscreen->has_msaa_sample_loc_bug = (sscreen->info.family >= CHIP_POLARIS10 &&

View File

@@ -219,6 +219,7 @@ struct st_config_options
{
bool disable_blend_func_extended;
bool disable_glsl_line_continuations;
bool disable_arb_gpu_shader5;
bool force_glsl_extensions_warn;
unsigned force_glsl_version;
bool allow_glsl_extension_directive_midshader;

View File

@@ -907,7 +907,7 @@ dri2_create_image_from_fd(__DRIscreen *_screen,
whandles[i].stride = (unsigned)strides[index];
whandles[i].offset = (unsigned)offsets[index];
whandles[i].modifier = modifier;
whandles[i].plane = i;
whandles[i].plane = index;
}
img = dri2_create_image_from_winsys(_screen, width, height, map,
@@ -1861,8 +1861,6 @@ static void
dri2_set_damage_region(__DRIdrawable *dPriv, unsigned int nrects, int *rects)
{
struct dri_drawable *drawable = dri_drawable(dPriv);
struct pipe_resource *resource = drawable->textures[ST_ATTACHMENT_BACK_LEFT];
struct pipe_screen *screen = resource->screen;
struct pipe_box *boxes = NULL;
if (nrects) {
@@ -1876,8 +1874,25 @@ dri2_set_damage_region(__DRIdrawable *dPriv, unsigned int nrects, int *rects)
}
}
screen->set_damage_region(screen, resource, nrects, boxes);
FREE(boxes);
FREE(drawable->damage_rects);
drawable->damage_rects = boxes;
drawable->num_damage_rects = nrects;
/* Only apply the damage region if the BACK_LEFT texture is up-to-date. */
if (drawable->texture_stamp == drawable->dPriv->lastStamp &&
(drawable->texture_mask & (1 << ST_ATTACHMENT_BACK_LEFT))) {
struct pipe_screen *screen = drawable->screen->base.screen;
struct pipe_resource *resource;
if (drawable->stvis.samples > 1)
resource = drawable->msaa_textures[ST_ATTACHMENT_BACK_LEFT];
else
resource = drawable->textures[ST_ATTACHMENT_BACK_LEFT];
screen->set_damage_region(screen, resource,
drawable->num_damage_rects,
drawable->damage_rects);
}
}
static __DRI2bufferDamageExtension dri2BufferDamageExtension = {

View File

@@ -95,6 +95,18 @@ dri_st_framebuffer_validate(struct st_context_iface *stctx,
}
} while (lastStamp != drawable->dPriv->lastStamp);
/* Flush the pending set_damage_region request. */
struct pipe_screen *pscreen = screen->base.screen;
if (new_mask & (1 << ST_ATTACHMENT_BACK_LEFT) &&
pscreen->set_damage_region) {
struct pipe_resource *resource = textures[ST_ATTACHMENT_BACK_LEFT];
pscreen->set_damage_region(pscreen, resource,
drawable->num_damage_rects,
drawable->damage_rects);
}
if (!out)
return true;
@@ -202,6 +214,7 @@ dri_destroy_buffer(__DRIdrawable * dPriv)
/* Notify the st manager that this drawable is no longer valid */
stapi->destroy_drawable(stapi, &drawable->base);
FREE(drawable->damage_rects);
FREE(drawable);
}

View File

@@ -56,6 +56,9 @@ struct dri_drawable
unsigned old_w;
unsigned old_h;
struct pipe_box *damage_rects;
unsigned int num_damage_rects;
struct pipe_resource *textures[ST_ATTACHMENT_COUNT];
struct pipe_resource *msaa_textures[ST_ATTACHMENT_COUNT];
unsigned int texture_mask, texture_stamp;

View File

@@ -65,6 +65,8 @@ dri_fill_st_options(struct dri_screen *screen)
options->disable_blend_func_extended =
driQueryOptionb(optionCache, "disable_blend_func_extended");
options->disable_arb_gpu_shader5 =
driQueryOptionb(optionCache, "disable_arb_gpu_shader5");
options->disable_glsl_line_continuations =
driQueryOptionb(optionCache, "disable_glsl_line_continuations");
options->force_glsl_extensions_warn =

View File

@@ -28,12 +28,24 @@ nine_version = ['1', '0', '0']
gallium_nine_c_args = []
gallium_nine_ld_args = []
gallium_nine_link_depends = []
gallium_nine_link_with = [
libgallium, libnine_st,
libpipe_loader_static, libws_null, libwsw, libswdri,
libswkmsdri,
]
if with_ld_version_script
gallium_nine_ld_args += ['-Wl,--version-script', join_paths(meson.current_source_dir(), 'd3dadapter9.sym')]
gallium_nine_link_depends += files('d3dadapter9.sym')
endif
if (with_gallium_va or with_gallium_vdpau or with_gallium_omx != 'disabled' or
with_gallium_xvmc or with_dri)
gallium_nine_link_with += libgalliumvl
else
gallium_nine_link_with += libgalliumvl_stub
endif
libgallium_nine = shared_library(
'd3dadapter9',
files('description.c', 'getproc.c', 'drm.c'),
@@ -47,11 +59,7 @@ libgallium_nine = shared_library(
cpp_args : [cpp_vis_args],
link_args : [ld_args_gc_sections, gallium_nine_ld_args],
link_depends : gallium_nine_link_depends,
link_with : [
libgalliumvl_stub, libgallium, libnine_st,
libpipe_loader_static, libws_null, libwsw, libswdri,
libswkmsdri, libnir,
],
link_with : gallium_nine_link_with,
dependencies : [
dep_selinux, dep_libdrm, dep_llvm, dep_thread, idep_xmlconfig, idep_mesautil,
driver_swrast, driver_r300, driver_r600, driver_radeonsi, driver_nouveau,


@@ -326,7 +326,6 @@ amdgpu_winsys_create(int fd, const struct pipe_screen_config *config,
aws = util_hash_table_get(dev_tab, dev);
if (aws) {
pipe_reference(NULL, &aws->reference);
simple_mtx_unlock(&dev_tab_mutex);
/* Release the device handle, because we don't need it anymore.
* This function is returning an existing winsys instance, which


@@ -92,7 +92,8 @@ alloc_shm(struct dri_sw_displaytarget *dri_sw_dt, unsigned size)
{
char *addr;
dri_sw_dt->shmid = shmget(IPC_PRIVATE, size, IPC_CREAT|0777);
/* 0600 = user read+write */
dri_sw_dt->shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
if (dri_sw_dt->shmid < 0)
return NULL;
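
Note on the shmget() change above: the mode moves from 0777 to 0600, so only the owning user can attach the segment. The following is a minimal sketch of how that could be checked on a System V shared-memory system. The helper names (create_private_shm, shm_permission_bits, remove_shm) are hypothetical and do not appear in Mesa.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>

/* Hypothetical helper: create a private segment that only the owning
 * user can read and write (0600), mirroring the flags used above. */
int create_private_shm(size_t size)
{
   return shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
}

/* Hypothetical helper: return the low nine permission bits of a segment,
 * or -1 if the segment cannot be queried. */
int shm_permission_bits(int shmid)
{
   struct shmid_ds ds;

   if (shmctl(shmid, IPC_STAT, &ds) < 0)
      return -1;
   return ds.shm_perm.mode & 0777;
}

/* Hypothetical helper: mark the segment for removal. */
int remove_shm(int shmid)
{
   return shmctl(shmid, IPC_RMID, NULL);
}
```

With these helpers, a segment created through create_private_shm() should report 0600 from shm_permission_bits(), assuming the platform allows System V shared memory at all.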


@@ -126,7 +126,8 @@ alloc_shm(struct xlib_displaytarget *buf, unsigned size)
shminfo->shmid = -1;
shminfo->shmaddr = (char *) -1;
shminfo->shmid = shmget(IPC_PRIVATE, size, IPC_CREAT|0777);
/* 0600 = user read+write */
shminfo->shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
if (shminfo->shmid < 0) {
return NULL;
}


@@ -1329,7 +1329,7 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr,
temp_op[0] = bld.fix_byte_src(op[0]);
temp_op[1] = bld.fix_byte_src(op[1]);
const uint32_t bit_size = nir_src_bit_size(instr->src[0].src);
const uint32_t bit_size = type_sz(temp_op[0].type) * 8;
if (bit_size != 32)
dest = bld.vgrf(temp_op[0].type, 1);
@@ -3368,7 +3368,14 @@ fs_visitor::nir_emit_fs_intrinsic(const fs_builder &bld,
if (alu != NULL &&
alu->op != nir_op_bcsel &&
alu->op != nir_op_inot) {
alu->op != nir_op_inot &&
(devinfo->gen > 5 ||
(alu->instr.pass_flags & BRW_NIR_BOOLEAN_MASK) != BRW_NIR_BOOLEAN_NEEDS_RESOLVE ||
alu->op == nir_op_fne32 || alu->op == nir_op_feq32 ||
alu->op == nir_op_flt32 || alu->op == nir_op_fge32 ||
alu->op == nir_op_ine32 || alu->op == nir_op_ieq32 ||
alu->op == nir_op_ilt32 || alu->op == nir_op_ige32 ||
alu->op == nir_op_ult32 || alu->op == nir_op_uge32)) {
/* Re-emit the instruction that generated the Boolean value, but
* do not store it. Since this instruction will be conditional,
* other instructions that want to use the real Boolean value may


@@ -478,8 +478,10 @@ anv_batch_bo_list_clone(const struct list_head *list,
}
if (result != VK_SUCCESS) {
list_for_each_entry_safe(struct anv_batch_bo, bbo, new_list, link)
list_for_each_entry_safe(struct anv_batch_bo, bbo, new_list, link) {
list_del(&bbo->link);
anv_batch_bo_destroy(bbo, cmd_buffer);
}
}
return result;
@@ -804,6 +806,7 @@ anv_cmd_buffer_fini_batch_bo_chain(struct anv_cmd_buffer *cmd_buffer)
/* Destroy all of the batch buffers */
list_for_each_entry_safe(struct anv_batch_bo, bbo,
&cmd_buffer->batch_bos, link) {
list_del(&bbo->link);
anv_batch_bo_destroy(bbo, cmd_buffer);
}
}
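
Note on the two list_del() additions above: each batch BO is now unlinked from its list before it is destroyed. The following is a minimal sketch of that pattern, using a hand-rolled intrusive list with hypothetical names (struct node, struct batch, destroy_all). It is not Mesa's util/list.h and only illustrates the "save successor, unlink, then free" order.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an intrusive doubly linked list node. */
struct node {
   struct node *prev, *next;
};

/* Hypothetical payload; the link stays first so the cast below is valid. */
struct batch {
   struct node link;
   int id;
};

void node_init(struct node *head) { head->prev = head->next = head; }

void node_add_tail(struct node *item, struct node *head)
{
   item->prev = head->prev;
   item->next = head;
   head->prev->next = item;
   head->prev = item;
}

void node_del(struct node *item)
{
   item->prev->next = item->next;
   item->next->prev = item->prev;
   item->prev = item->next = NULL;
}

int node_is_empty(const struct node *head) { return head->next == head; }

struct batch *batch_create(int id, struct node *head)
{
   struct batch *b = malloc(sizeof(*b));

   if (b == NULL)
      return NULL;
   b->id = id;
   node_add_tail(&b->link, head);
   return b;
}

/* Destroy every batch on the list and return how many were destroyed.
 * The successor is saved before the current entry is touched, and each
 * entry is unlinked before it is freed, so the head is left empty and
 * never points at freed memory. */
int destroy_all(struct node *head)
{
   int destroyed = 0;
   struct node *cur = head->next;

   while (cur != head) {
      struct node *next = cur->next;
      node_del(cur);
      free((struct batch *)cur);
      destroyed++;
      cur = next;
   }
   return destroyed;
}
```

After destroy_all() returns, the head is a valid empty list and can be reused or inspected safely.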
@@ -1620,6 +1623,9 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
assert(!pdevice->has_syncobj);
if (in_fence == -1) {
in_fence = impl->fd;
if (in_fence == -1)
return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
impl->fd = -1;
} else {
int merge = anv_gem_sync_file_merge(device, in_fence, impl->fd);
if (merge == -1)
@@ -1627,10 +1633,9 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
close(impl->fd);
close(in_fence);
impl->fd = -1;
in_fence = merge;
}
impl->fd = -1;
break;
case ANV_SEMAPHORE_TYPE_DRM_SYNCOBJ:


@@ -247,12 +247,28 @@ VkResult anv_AcquireNextImage2KHR(
pAcquireInfo,
pImageIndex);
/* Thanks to implicit sync, the image is ready immediately. However, we
* should wait for the current GPU state to finish.
/* Thanks to implicit sync, the image is ready immediately. However, we
* should wait for the current GPU state to finish. Regardless of the
* result of the presentation, we need to signal the semaphore & fence.
*/
if (pAcquireInfo->semaphore != VK_NULL_HANDLE) {
/* Put a dummy semaphore in temporary; this is the fastest way to avoid
* any kind of work yet still provide some kind of synchronization. This
* only works because the Mesa WSI code always returns an image
* immediately if available.
*/
ANV_FROM_HANDLE(anv_semaphore, semaphore, pAcquireInfo->semaphore);
anv_semaphore_reset_temporary(device, semaphore);
struct anv_semaphore_impl *impl = &semaphore->temporary;
impl->type = ANV_SEMAPHORE_TYPE_DUMMY;
}
if (pAcquireInfo->fence != VK_NULL_HANDLE) {
anv_QueueSubmit(anv_queue_to_handle(&device->queue), 0, NULL,
pAcquireInfo->fence);
result = anv_QueueSubmit(anv_queue_to_handle(&device->queue),
0, NULL, pAcquireInfo->fence);
}
return result;


@@ -2553,20 +2553,12 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
const struct anv_pipeline_binding *binding =
&bind_map->surface_to_descriptor[surface];
struct anv_address read_addr;
uint32_t read_len;
struct anv_address addr;
if (binding->set == ANV_DESCRIPTOR_SET_SHADER_CONSTANTS) {
struct anv_address constant_data = {
addr = (struct anv_address) {
.bo = pipeline->device->dynamic_state_pool.block_pool.bo,
.offset = pipeline->shaders[stage]->constant_data.offset,
};
unsigned constant_data_size =
pipeline->shaders[stage]->constant_data_size;
read_len = MIN2(range->length,
DIV_ROUND_UP(constant_data_size, 32) - range->start);
read_addr = anv_address_add(constant_data,
range->start * 32);
} else if (binding->set == ANV_DESCRIPTOR_SET_DESCRIPTORS) {
/* This is a descriptor set buffer so the set index is
* actually given by binding->binding. (Yes, that's
@@ -2574,45 +2566,27 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
*/
struct anv_descriptor_set *set =
gfx_state->base.descriptors[binding->binding];
struct anv_address desc_buffer_addr =
anv_descriptor_set_address(cmd_buffer, set);
const unsigned desc_buffer_size = set->desc_mem.alloc_size;
read_len = MIN2(range->length,
DIV_ROUND_UP(desc_buffer_size, 32) - range->start);
read_addr = anv_address_add(desc_buffer_addr,
range->start * 32);
addr = anv_descriptor_set_address(cmd_buffer, set);
} else {
const struct anv_descriptor *desc =
anv_descriptor_for_binding(&gfx_state->base, binding);
if (desc->type == VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER) {
read_len = MIN2(range->length,
DIV_ROUND_UP(desc->buffer_view->range, 32) - range->start);
read_addr = anv_address_add(desc->buffer_view->address,
range->start * 32);
addr = desc->buffer_view->address;
} else {
assert(desc->type == VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC);
uint32_t dynamic_offset =
dynamic_offset_for_binding(&gfx_state->base, binding);
uint32_t buf_offset =
MIN2(desc->offset + dynamic_offset, desc->buffer->size);
uint32_t buf_range =
MIN2(desc->range, desc->buffer->size - buf_offset);
read_len = MIN2(range->length,
DIV_ROUND_UP(buf_range, 32) - range->start);
read_addr = anv_address_add(desc->buffer->address,
buf_offset + range->start * 32);
addr = anv_address_add(desc->buffer->address,
desc->offset + dynamic_offset);
}
}
if (read_len > 0) {
c.ConstantBody.Buffer[n] = read_addr;
c.ConstantBody.ReadLength[n] = read_len;
n--;
}
c.ConstantBody.Buffer[n] =
anv_address_add(addr, range->start * 32);
c.ConstantBody.ReadLength[n] = range->length;
n--;
}
struct anv_state state =


@@ -369,8 +369,8 @@ emit_3dstate_sbe(struct anv_pipeline *pipeline)
if (input_index < 0)
continue;
/* gl_Layer is stored in the VUE header */
if (attr == VARYING_SLOT_LAYER) {
/* gl_Viewport and gl_Layer are stored in the VUE header */
if (attr == VARYING_SLOT_VIEWPORT || attr == VARYING_SLOT_LAYER) {
urb_entry_read_offset = 0;
continue;
}


@@ -1,4 +1,4 @@
# Copyright © 2017 Intel Corporation
# Copyright © 2017-2019 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
@@ -35,11 +35,31 @@ if with_shared_glapi
else
libglapi = []
endif
if not with_glvnd
if with_gles1
if with_gles1
if not with_glvnd
subdir('es1api')
endif
if with_gles2
subdir('es2api')
elif not glvnd_has_headers_and_pc_files
pkg.generate(
name : 'glesv1_cm',
filebase : 'glesv1_cm',
description : 'Mesa OpenGL ES 1.1 CM library',
version : meson.project_version(),
libraries : '-L${libdir} -lGLESv1_CM',
libraries_private : gl_priv_libs,
)
endif
endif
if with_gles2
if not with_glvnd
subdir('es2api')
elif not glvnd_has_headers_and_pc_files
pkg.generate(
name : 'glesv2',
filebase : 'glesv2',
description : 'Mesa OpenGL ES 2.0 library',
version : meson.project_version(),
libraries : '-L${libdir} -lGLESv2',
libraries_private : gl_priv_libs,
)
endif
endif


@@ -35,9 +35,7 @@ i965_FILES = \
brw_object_purgeable.c \
brw_pipe_control.c \
brw_pipe_control.h \
brw_performance_query.h \
brw_performance_query.c \
brw_performance_query_metrics.h \
brw_program.c \
brw_program.h \
brw_program_binary.c \


@@ -3051,7 +3051,7 @@ genX(upload_blend_state)(struct brw_context *brw)
#endif
}
static const struct brw_tracked_state genX(blend_state) = {
UNUSED static const struct brw_tracked_state genX(blend_state) = {
.dirty = {
.mesa = _NEW_BUFFERS |
_NEW_COLOR |
@@ -3412,7 +3412,7 @@ genX(upload_color_calc_state)(struct brw_context *brw)
#endif
}
static const struct brw_tracked_state genX(color_calc_state) = {
UNUSED static const struct brw_tracked_state genX(color_calc_state) = {
.dirty = {
.mesa = _NEW_COLOR |
_NEW_STENCIL |
@@ -3430,6 +3430,35 @@ static const struct brw_tracked_state genX(color_calc_state) = {
};
/* ---------------------------------------------------------------------- */
#if GEN_IS_HASWELL
static void
genX(upload_color_calc_and_blend_state)(struct brw_context *brw)
{
genX(upload_blend_state)(brw);
genX(upload_color_calc_state)(brw);
}
/* On Haswell, CC_STATE should also be re-emitted whenever BLEND_STATE is
* emitted; this works around flickering shadows in several games.
*/
static const struct brw_tracked_state genX(cc_and_blend_state) = {
.dirty = {
.mesa = _NEW_BUFFERS |
_NEW_COLOR |
_NEW_STENCIL |
_NEW_MULTISAMPLE,
.brw = BRW_NEW_BATCH |
BRW_NEW_BLORP |
BRW_NEW_CC_STATE |
BRW_NEW_FS_PROG_DATA |
BRW_NEW_STATE_BASE_ADDRESS,
},
.emit = genX(upload_color_calc_and_blend_state),
};
#endif
/* ---------------------------------------------------------------------- */
#if GEN_GEN >= 7
@@ -5697,8 +5726,12 @@ genX(init_atoms)(struct brw_context *brw)
&gen7_l3_state,
&gen7_push_constant_space,
&gen7_urb,
#if GEN_IS_HASWELL
&genX(cc_and_blend_state),
#else
&genX(blend_state), /* must do before cc unit */
&genX(color_calc_state), /* must do before cc unit */
#endif
&genX(depth_stencil_state), /* must do before cc unit */
&brw_vs_image_surfaces, /* Before vs push/pull constants and binding table */


@@ -89,8 +89,9 @@ alloc_back_shm_ximage(XMesaBuffer b, GLuint width, GLuint height)
return GL_FALSE;
}
/* 0600 = user read+write */
b->shminfo.shmid = shmget(IPC_PRIVATE, b->backxrb->ximage->bytes_per_line
* b->backxrb->ximage->height, IPC_CREAT|0777);
* b->backxrb->ximage->height, IPC_CREAT | 0600);
if (b->shminfo.shmid < 0) {
_mesa_warning(NULL, "shmget failed while allocating back buffer.\n");
XDestroyImage(b->backxrb->ximage);


@@ -124,14 +124,28 @@ static inline GLboolean
_mesa_is_texture_complete(const struct gl_texture_object *texObj,
const struct gl_sampler_object *sampler)
{
struct gl_texture_image *img = texObj->Image[0][texObj->BaseLevel];
bool isMultisample = img && img->NumSamples >= 2;
/*
* According to ARB_stencil_texturing, NEAREST_MIPMAP_NEAREST would
* be forbidden, however it is allowed per GL 4.5 rules, allow it
* even without GL 4.5 since it was a spec mistake.
*/
if ((texObj->_IsIntegerFormat ||
/* Section 8.17 (texture completeness) of the OpenGL 4.6 core profile spec:
*
* "The texture is not multisample; either the magnification filter is not
* NEAREST, or the minification filter is neither NEAREST nor
* NEAREST_MIPMAP_NEAREST; and any of
* The internal format of the texture is integer.
* The internal format is STENCIL_INDEX.
* The internal format is DEPTH_STENCIL, and the value of
* DEPTH_STENCIL_TEXTURE_MODE for the texture is STENCIL_INDEX."
*/
if (!isMultisample &&
(texObj->_IsIntegerFormat ||
(texObj->StencilSampling &&
texObj->Image[0][texObj->BaseLevel]->_BaseFormat == GL_DEPTH_STENCIL)) &&
img->_BaseFormat == GL_DEPTH_STENCIL)) &&
(sampler->MagFilter != GL_NEAREST ||
(sampler->MinFilter != GL_NEAREST &&
sampler->MinFilter != GL_NEAREST_MIPMAP_NEAREST))) {
@@ -139,7 +153,12 @@ _mesa_is_texture_complete(const struct gl_texture_object *texObj,
return GL_FALSE;
}
if (_mesa_is_mipmap_filter(sampler))
/* Section 8.17 (texture completeness) of the OpenGL 4.6 core profile spec:
*
* "The minification filter requires a mipmap (is neither NEAREST nor LINEAR),
* the texture is not multisample, and the texture is not mipmap complete."
*/
if (!isMultisample && _mesa_is_mipmap_filter(sampler))
return texObj->_MipmapComplete;
else
return texObj->_BaseComplete;
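
Note on the completeness change above: both filter-related checks are now skipped for multisample textures. The following is a minimal sketch of that decision logic, using simplified, hypothetical types (struct tex, struct sampler, enum filter) in place of gl_texture_object and gl_sampler_object. It models only the checks visible in this hunk.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the Mesa structures. */
enum filter { NEAREST, LINEAR, NEAREST_MIPMAP_NEAREST, LINEAR_MIPMAP_LINEAR };

struct tex {
   bool multisample;      /* base level has two or more samples */
   bool integer_format;   /* integer format, or stencil sampling in use */
   bool mipmap_complete;
   bool base_complete;
};

struct sampler {
   enum filter mag, min;
};

/* A minification filter needs mipmaps if it is neither NEAREST nor LINEAR. */
bool filter_needs_mipmap(enum filter f)
{
   return f != NEAREST && f != LINEAR;
}

bool texture_is_complete(const struct tex *t, const struct sampler *s)
{
   /* Integer or stencil formats with a filtering mode other than NEAREST
    * (or NEAREST_MIPMAP_NEAREST for minification) are incomplete, unless
    * the texture is multisample. */
   if (!t->multisample && t->integer_format &&
       (s->mag != NEAREST ||
        (s->min != NEAREST && s->min != NEAREST_MIPMAP_NEAREST)))
      return false;

   /* Mipmap completeness is only consulted for non-multisample textures. */
   if (!t->multisample && filter_needs_mipmap(s->min))
      return t->mipmap_complete;
   return t->base_complete;
}
```

In this sketch a multisample integer texture sampled with LINEAR_MIPMAP_LINEAR is judged only on base_complete, while the same sampler on a non-multisample integer texture is rejected.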


@@ -1091,7 +1091,7 @@ void st_init_extensions(struct pipe_screen *screen,
if (api == API_OPENGLES2 && ESSLVersion >= 320)
extensions->ARB_gpu_shader5 = GL_TRUE;
if (GLSLVersion >= 400)
if (GLSLVersion >= 400 && !options->disable_arb_gpu_shader5)
extensions->ARB_gpu_shader5 = GL_TRUE;
if (GLSLVersion >= 410)
extensions->ARB_shader_precision = GL_TRUE;


@@ -56,11 +56,13 @@ TODO: document the other workarounds.
<application name="Unigine Sanctuary" executable="Sanctuary">
<option name="force_glsl_extensions_warn" value="true" />
<option name="disable_blend_func_extended" value="true" />
<option name="disable_arb_gpu_shader5" value="true" />
</application>
<application name="Unigine Tropics" executable="Tropics">
<option name="force_glsl_extensions_warn" value="true" />
<option name="disable_blend_func_extended" value="true" />
<option name="disable_arb_gpu_shader5" value="true" />
</application>
<application name="Unigine Heaven (32-bit)" executable="heaven_x86">


@@ -80,6 +80,11 @@ DRI_CONF_OPT_BEGIN_B(disable_blend_func_extended, def) \
DRI_CONF_DESC(en,gettext("Disable dual source blending")) \
DRI_CONF_OPT_END
#define DRI_CONF_DISABLE_ARB_GPU_SHADER5(def) \
DRI_CONF_OPT_BEGIN_B(disable_arb_gpu_shader5, def) \
DRI_CONF_DESC(en,"Disable GL_ARB_gpu_shader5") \
DRI_CONF_OPT_END
#define DRI_CONF_DUAL_COLOR_BLEND_BY_LOCATION(def) \
DRI_CONF_OPT_BEGIN_B(dual_color_blend_by_location, def) \
DRI_CONF_DESC(en,gettext("Identify dual color blending sources by location rather than index")) \