Compare commits

51 Commits

Author SHA1 Message Date
Dylan Baker
2964ee3ad0 docs: Add release notes for 19.0.2 2019-04-10 20:34:09 -07:00
Dylan Baker
349759165c VERSION: bump version for 19.0.2 2019-04-10 20:30:30 -07:00
Boyuan Zhang
20db3b0e46 st/va: reverse qt matrix back to its original order
The quantiser matrix that VAAPI provides has been applied with inverse z-scan.
However, what we expect in MPEG2 picture description is the original order.
Therefore, we need to reverse it back to its original order.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110257
Cc: mesa-stable@lists.freedesktop.org

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d507bcdcf2)
2019-04-09 08:36:40 -07:00
Lionel Landwerlin
57b7dbbb21 intel: add dependency on genxml generated files
Drivers using genxml will start compilation before generated files are
created, so add a dependency to it.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 48e48b8560)
Conflicts resolved by Dylan

Conflicts:
	src/gallium/drivers/iris/meson.build
2019-04-09 08:35:49 -07:00
Caio Marcelo de Oliveira Filho
b493686860 nir: Take if_uses into account when repairing SSA
If a def is used as a condition before its definition, we should also
consider this a case to repair.  When repairing, make sure we rewrite
any if conditions too.

Found while inspecting a SPIR-V conversion from a 'continue block'
that contains a conditional branch.  We pull the continue block up to
the beginning of the loop, and the condition in the branch ends up
defined afterwards.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 364212f1ed "nir: Add a pass to repair SSA form"
(cherry picked from commit c037dbb0ef)
2019-04-08 09:30:03 -07:00
Eric Anholt
73bc3248f4 v3d: Don't try to use the TFU blit path if a scissor is enabled.
We'll need to do a render-based blit for scissors, since the TFU (as seen
in this conditional) can only update a whole surface.

Fixes: 976ea90bdc ("v3d: Add support for using the TFU to do some blits.")
Fixes piglit fbo-scissor-blit.

(cherry picked from commit 4c70f276bc)
2019-04-05 09:08:03 -07:00
Eric Anholt
d1f4c96919 v3d: Bump the maximum texture size to 4k for V3D 4.x.
4.1 and 4.2 both have the same 16k limit, but I'm seeing GPU hangs in
the CTS at 8k and 16k.  4k at least lets us get one 4k display working.

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 62360e92ec)
2019-04-05 09:07:57 -07:00
Eric Anholt
b7769cdfb7 dri3: Return the current swap interval from glXGetSwapIntervalMESA().
We were caching only the value set with glXSwapIntervalSGI(), missing out
on the default setting of the swap interval by the loader.  This fixes
glxgears's warning about being vblank synchronized by default.

Fixes: 9777c4234b ("loader: drop the [gs]et_swap_interval callbacks")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit edc7deec42)
2019-04-02 09:14:20 -07:00
Marek Olšák
e46e3bfd13 radeonsi: fix assertion failure by using the correct type
src/gallium/drivers/radeonsi/si_state_viewport.c:196: si_emit_guardband:
Assertion `vp_as_scissor.maxx <= max_viewport_size[vp_as_scissor.quant_mode]
&& vp_as_scissor.maxy <= max_viewport_size[vp_as_scissor.quant_mode]' failed.

The comparison was unsigned, so negative maxx or maxy would fail.

Fixes: 3c540e0a74 "radeonsi: Fix guardband computation for large render targets"
(cherry picked from commit 3ad2a9b3fa)
2019-04-01 09:47:45 -07:00
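The failure mode described in this message can be illustrated with a small sketch. It is not the driver's code: `as_u32`, the two comparison helpers, and the limit of 16384 are assumptions made for the example, and Python integers are masked by hand to mimic a C conversion to a 32-bit unsigned type.

```python
# Illustrative only: mimic C's int -> uint32_t conversion in Python.
U32_MASK = 0xFFFFFFFF

LIMIT = 16384  # hypothetical maximum viewport size


def as_u32(value):
    """Reinterpret a (possibly negative) integer as a 32-bit unsigned value."""
    return value & U32_MASK


def within_limit_unsigned(maxx, limit):
    # Unsigned comparison: a negative maxx wraps to a huge value first,
    # so the check fails even though the coordinate is below the limit.
    return as_u32(maxx) <= limit


def within_limit_signed(maxx, limit):
    # Signed comparison: a negative maxx stays below any positive limit.
    return maxx <= limit
```

With these helpers, a negative coordinate such as -1 passes the signed check but not the unsigned one, which matches the assertion failure quoted in the commit message.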
Leo Liu
a4d5161d42 radeon/vcn/vp9: search the render target from the whole list
The number of render targets could be more than max of references,
so we search the full list of the render pictures for the current
render target index

https://bugs.freedesktop.org/show_bug.cgi?id=109648

Signed-off-by: Leo Liu <leo.liu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Acked-by: James Zhu <James.Zhu@amd.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d4e0fbc92f)
2019-04-01 09:47:39 -07:00
Eric Engestrom
a1c30b8b78 meson: strip rpath from megadrivers
More specifically, use the library file that has been post-processed by Meson
when creating the hardlinks.

Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=108766
Fixes: 3218056e0e "meson: Build i965 and dri stack"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit aa7afe324c)
2019-04-01 09:47:34 -07:00
Karol Herbst
9987a3d448 nir/print: fix printing the image_array intrinsic index
Fixes: 0de003be03 ("nir: Add handle/index-based image intrinsics")

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 6ffc72472c)
2019-03-29 08:32:00 -07:00
Samuel Pitoiset
891c4ff633 radv: do not always initialize HTILE in compressed state
Especially when performing a transition from UNDEFINED->GENERAL,
the driver shouldn't initialize HTILE metadata in compressed
state because it doesn't decompress when the src layout is
GENERAL.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110259
Fixes: 3a2e93147f ("radv: always initialize HTILE when the src layout is UNDEFINED")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 62a9d757e6)
2019-03-29 08:31:53 -07:00
Samuel Pitoiset
a175dffe84 radv: skip updating depth/color metadata for conditional rendering
I don't think we should update metadata when conditional rendering
is enabled. For some reason, some CTS tests break only on SI.

This fixes the following CTS on SI:
dEQP-VK.conditional_rendering.draw_clear.clear.depth.*

Cc: 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 6596eb2b30)
2019-03-28 12:14:46 -07:00
Leo Liu
29bfb1af10 radeon/vcn: add H.264 constrained baseline support
VCN supports this profile as well as UVD, so add it

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f8ef8b56a6)
2019-03-28 12:14:39 -07:00
Jason Ekstrand
dc6f00d53e Revert "anv/radv: release memory allocated by glsl types during spirv_to_nir"
This reverts commit 4e1bbb000c.  It turns
out that some DXVK apps, due to some implementation detail of DXVK or
other, create and destroy instances in an interleaved way.  Freeing the
glsl_type memory without being a bit more careful causes use-after-free
issues.  Looks like we need to try again.

(cherry picked from commit ce47999cee)
2019-03-27 11:49:05 -07:00
Dylan Baker
ba3eb3c938 docs: Add SHA256 sums for mesa 19.0.1 2019-03-27 10:10:37 -07:00
Dylan Baker
08fbf25ce1 Add release notes for 19.0.1 2019-03-27 10:02:21 -07:00
Dylan Baker
499053e5d7 bump version for 19.0.1 2019-03-27 09:56:53 -07:00
Bas Nieuwenhuizen
bb66e61727 ac/nir: Return frag_coord as integer.
To preserve the invariant that nir ssa defs are integers or pointers
in LLVM.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 82075e3c42)
2019-03-26 12:14:04 -07:00
Dylan Baker
964f7a7063 bin/install_megadrivers.py: Fix regression for set DESTDIR
The previous patch tried to address a bug when DESTDIR is '', however,
it introduces a bug when DESTDIR is not '', and fakeroot is used. This
patch does fix that, and has been tested with the arch pkg-build to
ensure it isn't regressed.

Fixes: 093a1ade4e24b7dd701a093d30a71efd669fe9c8
       ("bin/install_megadrivers.py: Correctly handle DESTDIR=''")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110221
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit ed96038e55)
2019-03-25 09:44:28 -07:00
Dylan Baker
561fd519a7 bin/install_megadrivers.py: Correctly handle DESTDIR=''
Currently if destdir is set to '' then the resulting libdir will have
its first character replaced by / instead of / being prepended to the
string. This was the result of ensuring that DESTDIR wouldn't be
ignored if libdir was absolute. Since the only case in which meson allows
the libdir to be absolute is when the prefix is /, this won't be a
problem.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110211
Fixes: ae3f45c11e
       ("bin/install_megadrivers: fix DESTDIR and -D*-path")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 4188dd7879)
2019-03-25 09:43:24 -07:00
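The DESTDIR handling discussed in the two install_megadrivers.py commits above can be sketched as follows. This is a minimal illustration, not the script itself: `old_target` and `new_target` are hypothetical helpers that take the environment as a dictionary, they assume an absolute libdir, and `posixpath` is used so the result does not depend on the host platform. `old_target` mirrors the 19.0.0 line shown in the install_megadrivers.py diff further down, and `new_target` mirrors the lines that replace it.

```python
import posixpath


def old_target(environ, libdir):
    # DESTDIR falls back to '/' only when the variable is unset, so an
    # explicitly empty DESTDIR produces a relative path.
    return posixpath.join(environ.get('DESTDIR', '/'), libdir[1:])


def new_target(environ, libdir):
    # An unset or empty DESTDIR leaves the absolute libdir untouched;
    # a non-empty DESTDIR is prepended to it.
    destdir = environ.get('DESTDIR')
    if destdir:
        return posixpath.join(destdir, libdir[1:])
    return libdir
```

For example, with `{'DESTDIR': ''}` and a libdir of `/usr/lib/dri`, the old expression yields the relative path `usr/lib/dri`, while the new one keeps `/usr/lib/dri`. With a non-empty DESTDIR both produce the same staged path.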
Józef Kucia
db6c05f5db mesa: Fix GL_NUM_DEVICE_UUIDS_EXT
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 1d996ef714)
2019-03-22 10:44:58 -07:00
Tapani Pälli
a9a600f216 anv/radv: release memory allocated by glsl types during spirv_to_nir
Fixes leaks for each glsl_type generated:

   ==32470== 384 bytes in 3 blocks are possibly lost in loss record 18 of 18
   ==32470==    at 0x483880B: malloc (vg_replace_malloc.c:309)
   ==32470==    by 0x4C43F4A: ralloc_size (ralloc.c:119)
   ==32470==    by 0x4C44014: rzalloc_size (ralloc.c:151)
   ==32470==    by 0x4C44258: rzalloc_array_size (ralloc.c:215)
   ==32470==    by 0x4D38957: glsl_type::glsl_type(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:114)
   ==32470==    by 0x4D3BEED: glsl_type::get_struct_instance(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:1146)
   ==32470==    by 0x4D42ECC: glsl_struct_type (nir_types.cpp:501)
   ==32470==    by 0x4CDB5A1: vtn_handle_type (spirv_to_nir.c:1269)
   ==32470==    by 0x4CE53DD: vtn_handle_variable_or_type_instruction (spirv_to_nir.c:4018)
   ==32470==    by 0x4CD8CFF: vtn_foreach_instruction (spirv_to_nir.c:365)
   ==32470==    by 0x4CE5E6B: spirv_to_nir (spirv_to_nir.c:4490)
   ==32470==    by 0x497AF10: anv_shader_compile_to_nir (anv_pipeline.c:173)

v2: move release call to vkDestroyInstance
v3: apply fix also to radv driver

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 4e1bbb000c)
2019-03-22 10:44:50 -07:00
Józef Kucia
96b0478c41 radv: Fix driverUUID
Fixes: 14cad8786a ("radv: generate the same driver UUID as radeonsi")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit c077d5d7de)
2019-03-22 10:44:43 -07:00
Danylo Piliaiev
764131ff0a glsl: Cross validate variable's invariance by explicit invariance only
'invariant' qualifier is propagated on variables which are used
to calculate other invariant variables, however when we are matching
variable's declarations we should take into account only explicitly
declared invariance because invariance propagation is an implementation
specific detail.

Thus new flag is added to ir_variable_data which indicates 'invariant'
qualifier being explicitly set in the shader.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100316
Fixes: 89b60492 ('glsl: Add a pass to propagate the "invariant" and
  "precise" qualifiers')

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit ea9bde151f)
2019-03-22 10:44:25 -07:00
Dave Airlie
09f08a2fce softpipe: fix texture view crashes
I noticed we crashed piglit arb_texture_view-rendering-formats
when run on softpipe.

This fixes the clear tiles to use the surface format not the
underlying storage format.

This fixes a bunch of srgb piglits as well.

Fixes: 396ac41fc2 (softpipe: add integer support)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 04189565a0)
2019-03-22 10:44:16 -07:00
Jason Ekstrand
d3941aa8e7 intel/nir: Lower array-deref-of-vector UBO and SSBO loads
This fixes a serious performance issue with DXVK:

https://github.com/doitsujin/dxvk/issues/937

This was caused by a recent change to improve performance on RADV
that backfired on ANV and killed performance for some apps:

e5a06d3f4a

Throwing in this bit of lowering lets us come along and CSE those UBO
loads (or copy-prop for SSBO load) and get one load where we previously
would have gotten several.

VkPipeline-db results on Kaby Lake:

    total instructions in shared programs: 5115361 -> 5073185 (-0.82%)
    instructions in affected programs: 1754333 -> 1712157 (-2.40%)
    helped: 5331
    HURT: 63

    total cycles in shared programs: 2544501169 -> 2481144545 (-2.49%)
    cycles in affected programs: 2531058653 -> 2467702029 (-2.50%)
    helped: 9202
    HURT: 4323

    total loops in shared programs: 3340 -> 3331 (-0.27%)
    loops in affected programs: 9 -> 0
    helped: 9
    HURT: 0

    total spills in shared programs: 3246 -> 3053 (-5.95%)
    spills in affected programs: 384 -> 191 (-50.26%)
    helped: 10
    HURT: 5

    total fills in shared programs: 4626 -> 4452 (-3.76%)
    fills in affected programs: 439 -> 265 (-39.64%)
    helped: 10
    HURT: 5

All of the shaders with hurt spilling were in Rise of the Tomb Raider
which also had shaders solidly helped in the spilling department.  Not
shown in those results (because I've not had success dumping the
shaders) is Witcher 3 where this reduces spilling and improves over-all
perf by around 20-25%.  There were no shader-db changes.  Apparently,
this just isn't a pattern that happens in OpenGL.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: "19.0" mesa-stable@lists.freedesktop.org
(cherry picked from commit d3386e73c5)
Conflicts resolved by Dylan
2019-03-20 08:51:33 -07:00
Samuel Pitoiset
62b2aea3ee radv: fix binding transform feedback buffers
The mask should be accumulated if two calls are used for
binding two buffers at different indexes. Otherwise, the
driver only accounts for the last one.

Noticed while glancing at this code.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 4fa61273a8)
2019-03-20 08:44:50 -07:00
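The difference between overwriting and accumulating the binding mask can be shown with a short sketch. The function names and the one-bit-per-buffer-index layout are assumptions for illustration; this is not the radv code.

```python
def bind_overwrite(mask, first_binding, count):
    # Buggy pattern: the previous mask is ignored, so each call forgets
    # buffers bound by earlier calls.
    new_bits = 0
    for i in range(count):
        new_bits |= 1 << (first_binding + i)
    return new_bits


def bind_accumulate(mask, first_binding, count):
    # Fixed pattern: bits for the newly bound buffers are OR-ed into the
    # existing mask, so earlier bindings are preserved.
    for i in range(count):
        mask |= 1 << (first_binding + i)
    return mask
```

Binding index 0 and then index 1 in two separate calls leaves only bit 1 set with the overwriting variant, but both bits set with the accumulating one.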
Andres Gomez
062d464c4c Revert "glsl: relax input->output validation for SSO programs"
This reverts commit 1aa5738e66.

This patch incorrectly assumed that for SSOs no inner interface
matching check was needed.

From the ARB_separate_shader_objects spec v.25:

  " With separable program objects, interfaces between shader stages
    may involve the outputs from one program object and the inputs
    from a second program object.  For such interfaces, it is not
    possible to detect mismatches at link time, because the programs
    are linked separately.  When each such program is linked, all
    inputs or outputs interfacing with another program stage are
    treated as active.  The linker will generate an executable that
    assumes the presence of a compatible program on the other side of
    the interface.  If a mismatch between programs occurs, no GL error
    will be generated, but some or all of the inputs on the interface
    will be undefined."

This completes the fix from commit:
3be05dd267 ("glsl/linker: don't fail non static used inputs without matching outputs")

Fixes: 1aa5738e66 ("glsl: relax input->output validation for SSO programs")
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit ab28dca033)
2019-03-19 10:53:45 -07:00
Andres Gomez
33d331859a glsl/linker: simplify xfb_offset vs xfb_stride overflow check
Current implementation uses a complicated calculation which relies on
an implicit conversion to check the integral part of 2 division
results.

However, the calculation actually checks that the xfb_offset is
smaller than or a multiple of the xfb_stride. For example, while this is
expected to fail, it actually succeeds:

  "

    ...

    layout(xfb_buffer = 2, xfb_stride = 12) out block3 {
      layout(xfb_offset = 0) vec3 c;
      layout(xfb_offset = 12) vec3 d; // ERROR, requires stride of 24
    };

    ...

  "

Fixes: 2fab85aaea ("glsl: add xfb_stride link time validation")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 422882e78f)
2019-03-19 10:53:40 -07:00
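A minimal sketch of the kind of check the GLSL example in this message calls for, assuming the simplified rule is that a member's offset plus its size must not exceed the declared stride. `xfb_member_fits` and `VEC3_SIZE` are hypothetical names; the linker's real code is not reproduced here.

```python
VEC3_SIZE = 12  # three 4-byte floats


def xfb_member_fits(xfb_offset, size, xfb_stride):
    # The member occupies bytes [xfb_offset, xfb_offset + size), which must
    # lie entirely within one stride.
    return xfb_offset + size <= xfb_stride
```

Applied to the block above, `c` at offset 0 fits in a stride of 12, while `d` at offset 12 does not and would need a stride of 24.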
Andres Gomez
068e9a8f45 glsl/linker: don't fail non static used inputs without matching outputs
If there is no Static Use of an input variable, the linker shouldn't
fail whenever there is no defined matching output variable in the
previous stage.

From page 47 (page 51 of the PDF) of the GLSL 4.60 v.5 spec:

  " Only the input variables that are statically read need to be
    written by the previous stage; it is allowed to have superfluous
    declarations of input variables."

Now, we complete this exception whenever the input variable has an
explicit location. Previously, 18004c338f ("glsl: fail when a
shader's input var has not an equivalent out var in previous") took
care of the cases in which the input variable didn't have an explicit
location.

v2: do the location based interface matching check regardless of
    whether it is a separable program or not (Ilia).

Fixes: 1aa5738e66 ("glsl: relax input->output validation for SSO programs")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 3be05dd267)
2019-03-19 10:53:33 -07:00
Andres Gomez
1b4712719e glsl: correctly validate component layout qualifier for dvec{3,4}
From page 62 (page 68 of the PDF) of the GLSL 4.50 v.7 spec:

  " A dvec3 or dvec4 can only be declared without specifying a
    component."

Therefore, using the "component" qualifier with a dvec3 or dvec4
should result in a compile error.

v2: enhance the error message (Timothy).

Fixes: 94438578d2 ("glsl: validate and store component layout qualifier in GLSL IR")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit a96093136b)
2019-03-19 10:53:27 -07:00
Bas Nieuwenhuizen
da17740ea7 radv: Use correct image view comparison for fast clears.
The if is actually returning true on success, enabling fast clears, so we
need to have the test succeed when the iview dimensions are right.

Fixes: d5400a5ec2 "radv: provide a helper for comparing an image extents."
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a777c3d7cb)
2019-03-19 10:53:20 -07:00
Jason Ekstrand
cf2e4490c3 nir: Add a new pass to lower array dereferences on vectors
This pass was originally written for lowering TCS output reads and
writes but it is also applicable to just about anything including UBOs,
SSBOs, and shared variables.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 35b8f6f40b)
2019-03-18 11:36:41 -07:00
Jason Ekstrand
fa137cd655 nir/builder: Add a vector extract helper
This one's a tiny bit better than what we had in spirv_to_nir because it
emits a binary tree rather than a linear walk.  It also doesn't leave
around unneeded bcsel instructions for a constant index and returns an
undef for constant OOB access.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit fe9a6c0f14)
2019-03-18 11:36:26 -07:00
Dylan Baker
12745f5dc0 cherry-ignore: Add commit that doesn't apply 2019-03-18 11:34:41 -07:00
Danylo Piliaiev
ddea2a99c5 anv: Treat zero size XFB buffer as disabled
Vulkan spec doesn't explicitly forbid zero size transform
feedback buffers.
Having zero size xfb caused SurfaceSize overflow and
triggered assert in debug build.

The only way to have zero size SO_BUFFER is to disable
SO_BUFFER as stated in hardware spec.

From SKL PRM, Vol 2a, "3DSTATE_SO_BUFFER":
  "If set, stream output to SO Buffer is enabled,
  if 3DSTATE_STREAMOUT::SO Function ENABLE is also enabled.
  If clear, the SO Buffer is considered "not bound" and effectively
  treated as a zero- length buffer for the purposes of SO output and
  overflow detection. If an enabled stream's Stream to Buffer Selects
  includes this buffer it is by definition an overflow condition.
  That stream will cause no writes to occur,
  and only SO_PRIM_STORAGE_NEEDED[<stream>] will increment."

Fixes: 36ee2fd61c "anv: Implement the basic form of VK_EXT_transform_feedback"

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit ecb98c6898)
2019-03-18 10:19:46 -07:00
Tapani Pälli
f028945c01 isl: fix automake build when sse41 is not supported
Fixes: 864cc419eb "intel/isl: move tiled_memcpy static libs from i965 to isl"
Cc: mesa-stable@lists.freedesktop.org
Reported-by: Milav Soni <milav.soni@teqdiligent.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit a1cd0040b6)
2019-03-18 10:19:41 -07:00
Mark Janes
6c7f03bb5b mesa: properly report the length of truncated log messages
_mesa_log_msg must provide the length of the string passed into the
KHR_debug api.  When the string formatted by _mesa_gl_vdebugf exceeds
MAX_DEBUG_MESSAGE_LENGTH, the length is incorrectly set to the number
of characters that would have been written if enough space had been
available.

Fixes: 3025680578
       ("mesa: Add support for GL_ARB_debug_output with dynamic ID allocation.")

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit eb1a869a5d)
2019-03-15 14:58:39 -07:00
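The length bug described in this message follows from how C's snprintf family reports the number of characters that would have been written, not the number actually stored. The sketch below imitates that behaviour in Python. `format_message`, `reported_length`, and the `capacity` parameter (standing in for MAX_DEBUG_MESSAGE_LENGTH) are assumptions for illustration, not Mesa functions.

```python
def format_message(text, capacity):
    """Mimic snprintf into a buffer of `capacity` bytes (one reserved for NUL).

    Returns (stored_text, would_have_written), where the second value is the
    untruncated length, like the return value of C's snprintf.
    """
    stored = text[:capacity - 1]
    return stored, len(text)


def reported_length(text, capacity):
    stored, would_have_written = format_message(text, capacity)
    # Clamp so the reported length never exceeds what was actually stored.
    return min(would_have_written, capacity - 1)
```

Passing `would_have_written` straight through would overstate the length of a truncated message. Clamping it to `capacity - 1` keeps the reported length equal to the length of the stored string.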
Sergii Romantsov
ee18a3ec10 d3d: meson: do not prefix user provided d3d-drivers-path
The user can select the location where the d3d drivers
are installed by the d3d-drivers-path meson option.

By default path will be $prefix/$libdir/d3d.

Currently we add $prefix to the user provided path.
Resulting in an incorrect or even missing path.

Based on logic of
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109698
CC: Kenneth Graunke <kenneth@whitecape.org>
CC: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit dcc4866419)
2019-03-14 10:38:20 -07:00
Samuel Pitoiset
06787d23cb radv: always initialize HTILE when the src layout is UNDEFINED
HTILE should always be initialized when transitioning from
VK_IMAGE_LAYOUT_UNDEFINED to other image layouts. Otherwise,
if an app does a transition from UNDEFINED to GENERAL, the
driver doesn't initialize HTILE and it tries to decompress
the depth surface. For some reason, this results in VM faults.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107563
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 3a2e93147f)
2019-03-14 09:50:51 -07:00
Plamena Manolova
35029d4361 i965: Disable ARB_fragment_shader_interlock for platforms prior to GEN9
ARB_fragment_shader_interlock depends on memory fences to
ensure fragment ordering and this ordering guarantee is
only supported from GEN9 onwards.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109980
Fixes: 939312702e "i965: Add ARB_fragment_shader_interlock support."
Signed-off-by: Plamena Manolova <plamena.n.manolova@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 19ab082001)
2019-03-14 09:50:44 -07:00
Jason Ekstrand
c4f8fb1749 anv/pass: Flag the need for a RT flush for resolve attachments
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 489bf2de23)
2019-03-14 09:50:39 -07:00
Kevin Strasser
0dd88cf9ae egl/dri: Avoid out of bounds array access
indexConfigAttrib iterates over every index in the dri driver, possibly
exceeding __DRI_ATTRIB_MAX. In other words, if the dri driver has newer
attributes libEGL will end up reading from uninitialized memory through
dri2_to_egl_attribute_map[].

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 70b36c0ef9)
2019-03-13 14:26:36 -07:00
Jason Ekstrand
3a18f13ba5 glsl/list: Add a list variant of insert_after
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

(cherry picked from commit 20c4578c55)
2019-03-13 14:26:36 -07:00
Jason Ekstrand
95b001cb19 glsl/lower_vector_derefs: Don't use a temporary for TCS outputs
Tessellation control shader outputs act as if they have memory backing
them and you can have multiple writes to different components of the
same vector in-flight at the same time.  When this happens, the load vec
store pattern that gets used by ir_triop_vector_insert doesn't yield the
correct results.  Instead, just emit a sequence of conditional
assignments.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit bd17bdc56b)
2019-03-13 14:26:36 -07:00
Kenneth Graunke
f2e5ca1d81 intel/fs: Fix opt_peephole_csel to not throw away saturates.
We were not copying the saturate bit from the original instruction
to the new replacement instruction.  This caused major misrendering
in DiRT Rally on iris, where comparisons leading to discards failed
due to the missing saturate, causing lots of extra garbage pixels to
be drawn in text rendering, trees, and so on.

This did not show up on i965 because st/nir performs a more aggressive
version of nir_opt_peephole_select, yielding more b32csel operations.

Fixes: 52c7df1643 i965/fs: Merge CMP and SEL into CSEL on Gen8+

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 3570d15b6d)
2019-03-13 14:26:36 -07:00
Eric Anholt
f953d0f52f v3d: Fix leak of the renderonly struct on screen destruction.
This makes v3d match vc4's destroy path.

Fixes: e113b21cb7 ("v3d: Add renderonly support.")
(cherry picked from commit 486b181fd7)
2019-03-13 14:26:36 -07:00
Samuel Pitoiset
93386fbc5e radv: set the maximum number of IBs per submit to 192
This fixes random SteamVR corruption, see
https://github.com/ValveSoftware/SteamVR-for-Linux/issues/181

Fixes: 4d30f2c6f4 ("radv/winsys: remove the max IBs per submit limit for the fallback path")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit ae77f12368)
2019-03-13 14:26:36 -07:00
Dylan Baker
142e37ab34 docs: Add SHA256 sums for 19.0.0 2019-03-13 12:09:08 -07:00
63 changed files with 881 additions and 87 deletions

@@ -1 +1 @@
-19.0.0
+19.0.2

@@ -11,4 +11,7 @@ b031c643491a92a5574c7a4bd659df33f2d89bb6
# These were manually rebased by Jason, thanks!
8ab95b849e66f3221d80a67eef2ec6e3730901a8
5c30fffeec1732c21d600c036f95f8cdb1bb5487
5c30fffeec1732c21d600c036f95f8cdb1bb5487
# This doesn't actually appliy to 19.0
29179f58c6ba8099859ea25900214dbbd3814a92

@@ -35,7 +35,11 @@ def main():
     args = parser.parse_args()
     if os.path.isabs(args.libdir):
-        to = os.path.join(os.environ.get('DESTDIR', '/'), args.libdir[1:])
+        destdir = os.environ.get('DESTDIR')
+        if destdir:
+            to = os.path.join(destdir, args.libdir[1:])
+        else:
+            to = args.libdir
     else:
         to = os.path.join(os.environ['MESON_INSTALL_DESTDIR_PREFIX'], args.libdir)
@@ -45,7 +49,6 @@ def main():
if os.path.lexists(to):
os.unlink(to)
os.makedirs(to)
shutil.copy(args.megadriver, master)
for driver in args.drivers:
abs_driver = os.path.join(to, driver)

View File

@@ -32,7 +32,8 @@ Compatibility contexts may report a lower version depending on each driver.
<h2>SHA256 checksums</h2>
<pre>
-TBD.
+4c5b9c5227d37c1f6bdc786a6fa7ee7fbce40b2e8a87340c7d3234534ece3304 mesa-19.0.0.tar.gz
+5a549dfb40ec31e5c36c47aadac04554cb2e2a8d144a046a378fc16da57e38f8 mesa-19.0.0.tar.xz
</pre>

159
docs/relnotes/19.0.1.html Normal file

@@ -0,0 +1,159 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.0.1 Release Notes / March 27, 2019</h1>
<p>
Mesa 19.0.1 is a bug fix release which fixes bugs found since the 19.0.0 release.
</p>
<p>
Mesa 19.0.1 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
f1dd1980ed628edea3935eed7974fbc5d8353e9578c562728b880d63ac613dbd mesa-19.0.1.tar.gz
6884163c0ea9e4c98378ab8fecd72fe7b5f437713a14471beda378df247999d4 mesa-19.0.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100316">Bug 100316</a> - Linking GLSL 1.30 shaders with invariant and deprecated variables triggers an 'mismatching invariant qualifiers' error</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107563">Bug 107563</a> - [RADV] Broken rendering in Unity demos</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109698">Bug 109698</a> - dri.pc contents invalid when built with meson</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109980">Bug 109980</a> - [i915 CI][HSW] spec&#64;arb_fragment_shader_interlock&#64;arb_fragment_shader_interlock-image-load-store - fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110211">Bug 110211</a> - If DESTDIR is set to an empty string, the dri drivers are not installed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110221">Bug 110221</a> - build error with meson</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (4):</p>
<ul>
<li>glsl: correctly validate component layout qualifier for dvec{3,4}</li>
<li>glsl/linker: don't fail non static used inputs without matching outputs</li>
<li>glsl/linker: simplify xfb_offset vs xfb_stride overflow check</li>
<li>Revert "glsl: relax input-&gt;output validation for SSO programs"</li>
</ul>
<p>Bas Nieuwenhuizen (2):</p>
<ul>
<li>radv: Use correct image view comparison for fast clears.</li>
<li>ac/nir: Return frag_coord as integer.</li>
</ul>
<p>Danylo Piliaiev (2):</p>
<ul>
<li>anv: Treat zero size XFB buffer as disabled</li>
<li>glsl: Cross validate variable's invariance by explicit invariance only</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>softpipe: fix texture view crashes</li>
</ul>
<p>Dylan Baker (5):</p>
<ul>
<li>docs: Add SHA256 sums for 19.0.0</li>
<li>cherry-ignore: Add commit that doesn't apply</li>
<li>bin/install_megadrivers.py: Correctly handle DESTDIR=''</li>
<li>bin/install_megadrivers.py: Fix regression for set DESTDIR</li>
<li>bump version for 19.0.1</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>v3d: Fix leak of the renderonly struct on screen destruction.</li>
</ul>
<p>Jason Ekstrand (6):</p>
<ul>
<li>glsl/lower_vector_derefs: Don't use a temporary for TCS outputs</li>
<li>glsl/list: Add a list variant of insert_after</li>
<li>anv/pass: Flag the need for a RT flush for resolve attachments</li>
<li>nir/builder: Add a vector extract helper</li>
<li>nir: Add a new pass to lower array dereferences on vectors</li>
<li>intel/nir: Lower array-deref-of-vector UBO and SSBO loads</li>
</ul>
<p>Józef Kucia (2):</p>
<ul>
<li>radv: Fix driverUUID</li>
<li>mesa: Fix GL_NUM_DEVICE_UUIDS_EXT</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>intel/fs: Fix opt_peephole_csel to not throw away saturates.</li>
</ul>
<p>Kevin Strasser (1):</p>
<ul>
<li>egl/dri: Avoid out of bounds array access</li>
</ul>
<p>Mark Janes (1):</p>
<ul>
<li>mesa: properly report the length of truncated log messages</li>
</ul>
<p>Plamena Manolova (1):</p>
<ul>
<li>i965: Disable ARB_fragment_shader_interlock for platforms prior to GEN9</li>
</ul>
<p>Samuel Pitoiset (3):</p>
<ul>
<li>radv: set the maximum number of IBs per submit to 192</li>
<li>radv: always initialize HTILE when the src layout is UNDEFINED</li>
<li>radv: fix binding transform feedback buffers</li>
</ul>
<p>Sergii Romantsov (1):</p>
<ul>
<li>d3d: meson: do not prefix user provided d3d-drivers-path</li>
</ul>
<p>Tapani Pälli (2):</p>
<ul>
<li>isl: fix automake build when sse41 is not supported</li>
<li>anv/radv: release memory allocated by glsl types during spirv_to_nir</li>
</ul>
</div>
</body>
</html>

docs/relnotes/19.0.2.html

@@ -0,0 +1,121 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.0.2 Release Notes / April 10, 2019</h1>
<p>
Mesa 19.0.2 is a bug fix release which fixes bugs found since the 19.0.1 release.
</p>
<p>
Mesa 19.0.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD
</pre>
<h2>New features</h2>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108766">Bug 108766</a> - Mesa built with meson has RPATH entries</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109648">Bug 109648</a> - AMD Raven hang during va-api decoding</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110257">Bug 110257</a> - Major artifacts in mpeg2 vaapi hw decoding</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=110259">Bug 110259</a> - radv: Sampling depth-stencil image in GENERAL layout returns nothing but zero (regression, bisected)</li>
</ul>
<h2>Changes</h2>
<p>Boyuan Zhang (1):</p>
<ul>
<li>st/va: reverse qt matrix back to its original order</li>
</ul>
<p>Caio Marcelo de Oliveira Filho (1):</p>
<ul>
<li>nir: Take if_uses into account when repairing SSA</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>docs: Add SHA256 sums for mesa 19.0.1</li>
<li>VERSION: bump version for 19.0.2</li>
</ul>
<p>Eric Anholt (3):</p>
<ul>
<li>dri3: Return the current swap interval from glXGetSwapIntervalMESA().</li>
<li>v3d: Bump the maximum texture size to 4k for V3D 4.x.</li>
<li>v3d: Don't try to use the TFU blit path if a scissor is enabled.</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>meson: strip rpath from megadrivers</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>Revert "anv/radv: release memory allocated by glsl types during spirv_to_nir"</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nir/print: fix printing the image_array intrinsic index</li>
</ul>
<p>Leo Liu (2):</p>
<ul>
<li>radeon/vcn: add H.264 constrained baseline support</li>
<li>radeon/vcn/vp9: search the render target from the whole list</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>intel: add dependency on genxml generated files</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi: fix assertion failure by using the correct type</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>radv: skip updating depth/color metadata for conditional rendering</li>
<li>radv: do not always initialize HTILE in compressed state</li>
</ul>
</div>
</body>
</html>

@@ -608,7 +608,7 @@ with_gallium_xa = _xa != 'false'
d3d_drivers_path = get_option('d3d-drivers-path')
if d3d_drivers_path == ''
d3d_drivers_path = join_paths(get_option('libdir'), 'd3d')
d3d_drivers_path = join_paths(get_option('prefix'), get_option('libdir'), 'd3d')
endif
with_gallium_st_nine = get_option('gallium-nine')

@@ -3093,7 +3093,8 @@ static void visit_intrinsic(struct ac_nir_context *ctx,
ctx->abi->frag_pos[2],
ac_build_fdiv(&ctx->ac, ctx->ac.f32_1, ctx->abi->frag_pos[3])
};
result = ac_build_gather_values(&ctx->ac, values, 4);
result = ac_to_integer(&ctx->ac,
ac_build_gather_values(&ctx->ac, values, 4));
break;
}
case nir_intrinsic_load_front_face:

@@ -1258,7 +1258,7 @@ radv_set_ds_clear_metadata(struct radv_cmd_buffer *cmd_buffer,
if (aspects & VK_IMAGE_ASPECT_DEPTH_BIT)
++reg_count;
radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 2 + reg_count, 0));
radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 2 + reg_count, cmd_buffer->state.predicating));
radeon_emit(cs, S_370_DST_SEL(V_370_MEM) |
S_370_WR_CONFIRM(1) |
S_370_ENGINE_SEL(V_370_PFP));
@@ -1282,7 +1282,7 @@ radv_set_tc_compat_zrange_metadata(struct radv_cmd_buffer *cmd_buffer,
uint64_t va = radv_buffer_get_va(image->bo);
va += image->offset + image->tc_compat_zrange_offset;
radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 3, 0));
radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 3, cmd_buffer->state.predicating));
radeon_emit(cs, S_370_DST_SEL(V_370_MEM) |
S_370_WR_CONFIRM(1) |
S_370_ENGINE_SEL(V_370_PFP));
@@ -1476,7 +1476,7 @@ radv_set_color_clear_metadata(struct radv_cmd_buffer *cmd_buffer,
assert(radv_image_has_cmask(image) || radv_image_has_dcc(image));
radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 4, 0));
radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 4, cmd_buffer->state.predicating));
radeon_emit(cs, S_370_DST_SEL(V_370_MEM) |
S_370_WR_CONFIRM(1) |
S_370_ENGINE_SEL(V_370_PFP));
@@ -4406,10 +4406,15 @@ static void radv_handle_depth_image_transition(struct radv_cmd_buffer *cmd_buffe
if (!radv_image_has_htile(image))
return;
if (src_layout == VK_IMAGE_LAYOUT_UNDEFINED &&
radv_layout_has_htile(image, dst_layout, dst_queue_mask)) {
/* TODO: merge with the clear if applicable */
radv_initialize_htile(cmd_buffer, image, range, 0);
if (src_layout == VK_IMAGE_LAYOUT_UNDEFINED) {
uint32_t clear_value = vk_format_is_stencil(image->vk_format) ? 0xfffff30f : 0xfffc000f;
if (radv_layout_is_htile_compressed(image, dst_layout,
dst_queue_mask)) {
clear_value = 0;
}
radv_initialize_htile(cmd_buffer, image, range, clear_value);
} else if (!radv_layout_is_htile_compressed(image, src_layout, src_queue_mask) &&
radv_layout_is_htile_compressed(image, dst_layout, dst_queue_mask)) {
uint32_t clear_value = vk_format_is_stencil(image->vk_format) ? 0xfffff30f : 0xfffc000f;
@@ -4906,7 +4911,7 @@ void radv_CmdBindTransformFeedbackBuffersEXT(
enabled_mask |= 1 << idx;
}
cmd_buffer->state.streamout.enabled_mask = enabled_mask;
cmd_buffer->state.streamout.enabled_mask |= enabled_mask;
cmd_buffer->state.dirty |= RADV_CMD_DIRTY_STREAMOUT_BUFFER;
}
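
Note on the `enabled_mask` hunk above: the change replaces a plain assignment with an OR, presumably so that a later bind call covering other slots does not disable slots bound earlier. A minimal sketch of that difference in plain C, where `streamout_state` and `bind_buffers` are hypothetical stand-ins and not radv code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the streamout state; not the radv struct. */
struct streamout_state {
    uint8_t enabled_mask;
};

/* Bind `count` buffers starting at slot `first`, keeping earlier bindings. */
static void bind_buffers(struct streamout_state *so, unsigned first,
                         unsigned count)
{
    uint8_t mask = 0;

    assert(first + count <= 8);
    for (unsigned i = 0; i < count; i++)
        mask |= (uint8_t)(1u << (first + i));

    /* A plain '=' here would drop every slot bound by a previous call. */
    so->enabled_mask |= mask;
}
```

Binding slot 0 and then slot 2 in two separate calls leaves both bits set (mask `0x5`); with assignment only bit 2 would survive.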

@@ -337,7 +337,7 @@ radv_physical_device_init(struct radv_physical_device *device,
device->rad_info.chip_class > GFX9)
fprintf(stderr, "WARNING: radv is not a conformant vulkan implementation, testing use only.\n");
radv_get_driver_uuid(&device->device_uuid);
radv_get_driver_uuid(&device->driver_uuid);
radv_get_device_uuid(&device->rad_info, &device->device_uuid);
if (device->rad_info.family == CHIP_STONEY ||
@@ -2794,7 +2794,7 @@ VkResult radv_QueueSubmit(
struct radeon_winsys_fence *base_fence = fence ? fence->fence : NULL;
struct radeon_winsys_ctx *ctx = queue->hw_ctx;
int ret;
uint32_t max_cs_submission = queue->device->trace_bo ? 1 : UINT32_MAX;
uint32_t max_cs_submission = queue->device->trace_bo ? 1 : RADV_MAX_IBS_PER_SUBMIT;
uint32_t scratch_size = 0;
uint32_t compute_scratch_size = 0;
uint32_t esgs_ring_size = 0, gsvs_ring_size = 0;

@@ -651,7 +651,7 @@ static bool depth_view_can_fast_clear(struct radv_cmd_buffer *cmd_buffer,
iview->base_mip == 0 &&
iview->base_layer == 0 &&
radv_layout_is_htile_compressed(iview->image, layout, queue_mask) &&
!radv_image_extent_compare(iview->image, &iview->extent))
radv_image_extent_compare(iview->image, &iview->extent))
return true;
return false;
}

@@ -29,6 +29,13 @@
#ifndef RADV_AMDGPU_WINSYS_PUBLIC_H
#define RADV_AMDGPU_WINSYS_PUBLIC_H
/* The number of IBs per submit isn't infinite, it depends on the ring type
* (ie. some initial setup needed for a submit) and the number of IBs (4 DW).
* This limit is arbitrary but should be safe for now. Ideally, we should get
* this limit from the KMD.
*/
#define RADV_MAX_IBS_PER_SUBMIT 192
struct radeon_winsys *radv_amdgpu_winsys_create(int fd, uint64_t debug_flags,
uint64_t perftest_flags);

@@ -820,8 +820,8 @@
<packet code="120" name="Tile Binning Mode Cfg" min_ver="41">
<field name="Height (in pixels)" size="12" start="48" type="uint" minus_one="true"/>
<field name="Width (in pixels)" size="12" start="32" type="uint" minus_one="true"/>
<field name="Height (in pixels)" size="16" start="48" type="uint" minus_one="true"/>
<field name="Width (in pixels)" size="16" start="32" type="uint" minus_one="true"/>
<field name="Double-buffer in non-ms mode" size="1" start="15" type="bool"/>
<field name="Multisample Mode (4x)" size="1" start="14" type="bool"/>

@@ -32,7 +32,8 @@
*/
#define V3D_MAX_TEXTURE_SAMPLERS 16
#define V3D_MAX_MIP_LEVELS 12
/* The HW can do 16384 (15), but we run into hangs when we expose that. */
#define V3D_MAX_MIP_LEVELS 13
#define V3D_MAX_SAMPLES 4
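
Note on the mip-level bump above: assuming a full mip chain that halves the edge length down to 1, a chain of `levels` levels covers a maximum edge of 2^(levels - 1). The sketch below works through that arithmetic; `max_size_for_mip_levels` is a hypothetical helper, not a v3d function.

```c
#include <assert.h>

/* Largest texture edge addressable with a full mip chain of `levels` levels,
 * assuming each level halves the previous one down to 1 texel. */
static unsigned max_size_for_mip_levels(unsigned levels)
{
    assert(levels >= 1 && levels <= 31);
    return 1u << (levels - 1);
}
```

Under that assumption 12 levels give 2048, 13 levels give 4096 (the "4k" in the commit title), and 15 levels give 16384, which matches the "16384 (15)" figure in the comment.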

@@ -229,6 +229,7 @@ NIR_FILES = \
nir/nir_lower_alpha_test.c \
nir/nir_lower_alu.c \
nir/nir_lower_alu_to_scalar.c \
nir/nir_lower_array_deref_of_vec.c \
nir/nir_lower_atomics_to_ssbo.c \
nir/nir_lower_bitmap.c \
nir/nir_lower_bit_size.c \

@@ -3698,6 +3698,10 @@ apply_layout_qualifier_to_variable(const struct ast_type_qualifier *qual,
"cannot be applied to a matrix, a structure, "
"a block, or an array containing any of "
"these.");
} else if (components > 4 && type->is_64bit()) {
_mesa_glsl_error(loc, state, "component layout qualifier "
"cannot be applied to dvec%u.",
components / 2);
} else if (qual_component != 0 &&
(qual_component + components - 1) > 3) {
_mesa_glsl_error(loc, state, "component overflow (%u > 3)",
@@ -3940,7 +3944,8 @@ apply_type_qualifier_to_variable(const struct ast_type_qualifier *qual,
"`invariant' after being used",
var->name);
} else {
var->data.invariant = 1;
var->data.explicit_invariant = true;
var->data.invariant = true;
}
}
@@ -4148,8 +4153,10 @@ apply_type_qualifier_to_variable(const struct ast_type_qualifier *qual,
}
}
if (state->all_invariant && var->data.mode == ir_var_shader_out)
if (state->all_invariant && var->data.mode == ir_var_shader_out) {
var->data.explicit_invariant = true;
var->data.invariant = true;
}
var->data.interpolation =
interpret_interpolation_qualifier(qual, var->type,
@@ -4857,6 +4864,7 @@ ast_declarator_list::hir(exec_list *instructions,
"`invariant' after being used",
earlier->name);
} else {
earlier->data.explicit_invariant = true;
earlier->data.invariant = true;
}
}

@@ -1734,6 +1734,7 @@ ir_variable::ir_variable(const struct glsl_type *type, const char *name,
this->data.centroid = false;
this->data.sample = false;
this->data.patch = false;
this->data.explicit_invariant = false;
this->data.invariant = false;
this->data.how_declared = ir_var_declared_normally;
this->data.mode = mode;

@@ -657,6 +657,19 @@ public:
unsigned centroid:1;
unsigned sample:1;
unsigned patch:1;
/**
* Was an 'invariant' qualifier explicitly set in the shader?
*
* This is used to cross validate qualifiers.
*/
unsigned explicit_invariant:1;
/**
* Is the variable invariant?
*
* It can happen either by having the 'invariant' qualifier
* explicitly set in the shader or by being used in calculations
* of other invariant variables.
*/
unsigned invariant:1;
unsigned precise:1;

@@ -199,6 +199,7 @@ void ir_print_visitor::visit(ir_variable *ir)
const char *const samp = (ir->data.sample) ? "sample " : "";
const char *const patc = (ir->data.patch) ? "patch " : "";
const char *const inv = (ir->data.invariant) ? "invariant " : "";
const char *const explicit_inv = (ir->data.explicit_invariant) ? "explicit_invariant " : "";
const char *const prec = (ir->data.precise) ? "precise " : "";
const char *const bindless = (ir->data.bindless) ? "bindless " : "";
const char *const bound = (ir->data.bound) ? "bound " : "";
@@ -215,11 +216,11 @@ void ir_print_visitor::visit(ir_variable *ir)
const char *const interp[] = { "", "smooth", "flat", "noperspective" };
STATIC_ASSERT(ARRAY_SIZE(interp) == INTERP_MODE_COUNT);
fprintf(f, "(%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s) ",
fprintf(f, "(%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s) ",
binding, loc, component, cent, bindless, bound,
image_format, memory_read_only, memory_write_only,
memory_coherent, memory_volatile, memory_restrict,
samp, patc, inv, prec, mode[ir->data.mode],
samp, patc, inv, explicit_inv, prec, mode[ir->data.mode],
stream,
interp[ir->data.interpolation]);

@@ -419,8 +419,10 @@ ir_reader::read_declaration(s_expression *expr)
var->data.sample = 1;
} else if (strcmp(qualifier->value(), "patch") == 0) {
var->data.patch = 1;
} else if (strcmp(qualifier->value(), "explicit_invariant") == 0) {
var->data.explicit_invariant = true;
} else if (strcmp(qualifier->value(), "invariant") == 0) {
var->data.invariant = 1;
var->data.invariant = true;
} else if (strcmp(qualifier->value(), "uniform") == 0) {
var->data.mode = ir_var_uniform;
} else if (strcmp(qualifier->value(), "shader_storage") == 0) {

@@ -309,16 +309,16 @@ cross_validate_types_and_qualifiers(struct gl_context *ctx,
* "The invariance of varyings that are declared in both the vertex
* and fragment shaders must match."
*/
if (input->data.invariant != output->data.invariant &&
if (input->data.explicit_invariant != output->data.explicit_invariant &&
prog->data->Version < (prog->IsES ? 300 : 430)) {
linker_error(prog,
"%s shader output `%s' %s invariant qualifier, "
"but %s shader input %s invariant qualifier\n",
_mesa_shader_stage_to_string(producer_stage),
output->name,
(output->data.invariant) ? "has" : "lacks",
(output->data.explicit_invariant) ? "has" : "lacks",
_mesa_shader_stage_to_string(consumer_stage),
(input->data.invariant) ? "has" : "lacks");
(input->data.explicit_invariant) ? "has" : "lacks");
return;
}
@@ -773,8 +773,20 @@ cross_validate_outputs_to_inputs(struct gl_context *ctx,
output = explicit_locations[idx][input->data.location_frac].var;
if (output == NULL ||
input->data.location != output->data.location) {
if (output == NULL) {
/* A linker failure should only happen when there is no
* output declaration and there is Static Use of the
* declared input.
*/
if (input->data.used) {
linker_error(prog,
"%s shader input `%s' with explicit location "
"has no matching output\n",
_mesa_shader_stage_to_string(consumer->Stage),
input->name);
break;
}
} else if (input->data.location != output->data.location) {
linker_error(prog,
"%s shader input `%s' with explicit location "
"has no matching output\n",
@@ -804,7 +816,7 @@ cross_validate_outputs_to_inputs(struct gl_context *ctx,
*/
assert(!input->data.assigned);
if (input->data.used && !input->get_interface_type() &&
!input->data.explicit_location && !prog->SeparateShader)
!input->data.explicit_location)
linker_error(prog,
"%s shader input `%s' "
"has no matching output in the previous stage\n",
@@ -1166,8 +1178,7 @@ tfeedback_decl::store(struct gl_context *ctx, struct gl_shader_program *prog,
return false;
}
if ((this->offset / 4) / info->Buffers[buffer].Stride !=
(xfb_offset - 1) / info->Buffers[buffer].Stride) {
if (xfb_offset > info->Buffers[buffer].Stride) {
linker_error(prog, "xfb_offset (%d) overflows xfb_stride (%d) for "
"buffer (%d)", xfb_offset * 4,
info->Buffers[buffer].Stride * 4, buffer);
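
Note on the `xfb_offset` versus `xfb_stride` check above: the two conditions can disagree. The sketch below compares them on plain integers. It assumes that `xfb_offset` is the end of the varying in dwords and that the stride is also in dwords, which the hunk suggests (both are multiplied by 4 in the error message) but does not show. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Old condition: the first and last dword fall into different stride-sized
 * windows. Requires end_dw >= 1 and stride_dw > 0. */
static bool old_check_overflows(unsigned start_dw, unsigned end_dw,
                                unsigned stride_dw)
{
    assert(end_dw >= 1 && stride_dw > 0);
    return start_dw / stride_dw != (end_dw - 1) / stride_dw;
}

/* New condition: the varying ends past the stride. */
static bool new_check_overflows(unsigned end_dw, unsigned stride_dw)
{
    return end_dw > stride_dw;
}
```

With a stride of 4 dwords and a one-dword varying starting at dword 4 (so it ends at dword 5), the old condition reports no overflow because both ends land in the same window, while the new one flags it.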

@@ -1090,7 +1090,7 @@ cross_validate_globals(struct gl_context *ctx, struct gl_shader_program *prog,
}
}
if (existing->data.invariant != var->data.invariant) {
if (existing->data.explicit_invariant != var->data.explicit_invariant) {
linker_error(prog, "declarations for %s `%s' have "
"mismatching invariant qualifiers\n",
mode_string(var), var->name);
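
Note on the `explicit_invariant` comparisons above: per the new field comments, `invariant` can also become set by propagation, so comparing it across declarations can report a mismatch the shader author never wrote. A minimal sketch of the two comparisons, where `var_flags` and both helpers are hypothetical and only mirror the two bitfields:

```c
#include <stdbool.h>

/* Hypothetical mirror of the two bitfields from the diff. */
struct var_flags {
    unsigned explicit_invariant:1; /* 'invariant' written in the shader */
    unsigned invariant:1;          /* explicit or derived invariance */
};

/* Old cross-validation: compares effective invariance. */
static bool old_check_matches(const struct var_flags *a,
                              const struct var_flags *b)
{
    return a->invariant == b->invariant;
}

/* New cross-validation: compares only what was explicitly declared. */
static bool new_check_matches(const struct var_flags *a,
                              const struct var_flags *b)
{
    return a->explicit_invariant == b->explicit_invariant;
}
```

A variable that only became invariant by propagation and an undeclared counterpart fail the old comparison but pass the new one. A genuinely declared `invariant` against an undeclared one still mismatches.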

@@ -81,6 +81,12 @@ struct exec_node {
* Insert a node in the list after the current node
*/
void insert_after(exec_node *after);
/**
* Insert another list in the list after the current node
*/
void insert_after(struct exec_list *after);
/**
* Insert a node in the list before the current node
*/
@@ -507,6 +513,21 @@ exec_list_append(struct exec_list *list, struct exec_list *source)
exec_list_make_empty(source);
}
static inline void
exec_node_insert_list_after(struct exec_node *n, struct exec_list *after)
{
if (exec_list_is_empty(after))
return;
after->tail_sentinel.prev->next = n->next;
after->head_sentinel.next->prev = n;
n->next->prev = after->tail_sentinel.prev;
n->next = after->head_sentinel.next;
exec_list_make_empty(after);
}
static inline void
exec_list_prepend(struct exec_list *list, struct exec_list *source)
{
@@ -635,6 +656,11 @@ inline void exec_list::append_list(exec_list *source)
exec_list_append(this, source);
}
inline void exec_node::insert_after(exec_list *after)
{
exec_node_insert_list_after(this, after);
}
inline void exec_list::prepend_list(exec_list *source)
{
exec_list_prepend(this, source);
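
Note on `exec_node_insert_list_after` above: the four pointer updates splice a whole sentinel-terminated list in after one node and then empty the source list. The standalone sketch below repeats the same steps on a simplified list. `struct node`, `struct list` and the helpers are hypothetical stand-ins that only assume a head and a tail sentinel, not Mesa's actual definitions.

```c
#include <stddef.h>

struct node { struct node *next, *prev; int value; };
struct list { struct node head, tail; };   /* head and tail sentinels */

static void list_init(struct list *l)
{
    l->head.next = &l->tail;  l->head.prev = NULL;
    l->tail.prev = &l->head;  l->tail.next = NULL;
}

static int list_is_empty(const struct list *l)
{
    return l->head.next == &l->tail;
}

static void list_push_tail(struct list *l, struct node *n)
{
    n->next = &l->tail;
    n->prev = l->tail.prev;
    l->tail.prev->next = n;
    l->tail.prev = n;
}

/* Splice every node of `after` in right behind `n`; `n` must be linked. */
static void node_insert_list_after(struct node *n, struct list *after)
{
    if (list_is_empty(after))
        return;

    after->tail.prev->next = n->next;   /* last spliced node -> old successor */
    after->head.next->prev = n;         /* first spliced node <- n */
    n->next->prev = after->tail.prev;   /* old successor <- last spliced node */
    n->next = after->head.next;         /* n -> first spliced node */

    list_init(after);                   /* source list is now empty */
}
```

Splicing a list `b1, b2` after `a1` in `a1, a2` yields `a1, b1, b2, a2` and leaves the source list empty.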

@@ -32,8 +32,9 @@ namespace {
class vector_deref_visitor : public ir_rvalue_enter_visitor {
public:
vector_deref_visitor()
: progress(false)
vector_deref_visitor(void *mem_ctx, gl_shader_stage shader_stage)
: progress(false), shader_stage(shader_stage),
factory(&factory_instructions, mem_ctx)
{
}
@@ -45,6 +46,9 @@ public:
virtual ir_visitor_status visit_enter(ir_assignment *ir);
bool progress;
gl_shader_stage shader_stage;
exec_list factory_instructions;
ir_factory factory;
};
} /* anonymous namespace */
@@ -65,13 +69,63 @@ vector_deref_visitor::visit_enter(ir_assignment *ir)
ir_constant *old_index_constant =
deref->array_index->constant_expression_value(mem_ctx);
if (!old_index_constant) {
ir->rhs = new(mem_ctx) ir_expression(ir_triop_vector_insert,
new_lhs->type,
new_lhs->clone(mem_ctx, NULL),
ir->rhs,
deref->array_index);
ir->write_mask = (1 << new_lhs->type->vector_elements) - 1;
ir->set_lhs(new_lhs);
if (shader_stage == MESA_SHADER_TESS_CTRL &&
deref->variable_referenced()->data.mode == ir_var_shader_out) {
/* Tessellation control shader outputs act as if they have memory
* backing them and if we have writes from multiple threads
* targeting the same vec4 (this can happen for patch outputs), the
* load-vec-store pattern of ir_triop_vector_insert doesn't work.
* Instead, we have to lower to a series of conditional write-masked
* assignments.
*/
ir_variable *const src_temp =
factory.make_temp(ir->rhs->type, "scalar_tmp");
/* The newly created variable declaration goes before the assignment
* because we're going to set it as the new LHS.
*/
ir->insert_before(factory.instructions);
ir->set_lhs(new(mem_ctx) ir_dereference_variable(src_temp));
ir_variable *const arr_index =
factory.make_temp(deref->array_index->type, "index_tmp");
factory.emit(assign(arr_index, deref->array_index));
for (unsigned i = 0; i < new_lhs->type->vector_elements; i++) {
ir_constant *const cmp_index =
ir_constant::zero(factory.mem_ctx, deref->array_index->type);
cmp_index->value.u[0] = i;
ir_rvalue *const lhs_clone = new_lhs->clone(factory.mem_ctx, NULL);
ir_dereference_variable *const src_temp_deref =
new(mem_ctx) ir_dereference_variable(src_temp);
if (new_lhs->ir_type != ir_type_swizzle) {
assert(lhs_clone->as_dereference());
ir_assignment *cond_assign =
new(mem_ctx) ir_assignment(lhs_clone->as_dereference(),
src_temp_deref,
equal(arr_index, cmp_index),
WRITEMASK_X << i);
factory.emit(cond_assign);
} else {
ir_assignment *cond_assign =
new(mem_ctx) ir_assignment(swizzle(lhs_clone, i, 1),
src_temp_deref,
equal(arr_index, cmp_index));
factory.emit(cond_assign);
}
}
ir->insert_after(factory.instructions);
} else {
ir->rhs = new(mem_ctx) ir_expression(ir_triop_vector_insert,
new_lhs->type,
new_lhs->clone(mem_ctx, NULL),
ir->rhs,
deref->array_index);
ir->write_mask = (1 << new_lhs->type->vector_elements) - 1;
ir->set_lhs(new_lhs);
}
} else if (new_lhs->ir_type != ir_type_swizzle) {
ir->set_lhs(new_lhs);
ir->write_mask = 1 << old_index_constant->get_uint_component(0);
@@ -105,7 +159,7 @@ vector_deref_visitor::handle_rvalue(ir_rvalue **rv)
bool
lower_vector_derefs(gl_linked_shader *shader)
{
vector_deref_visitor v;
vector_deref_visitor v(shader->ir, shader->Stage);
visit_list_elements(&v, shader->ir);

@@ -112,6 +112,7 @@ files_libnir = files(
'nir_lower_alu.c',
'nir_lower_alu_to_scalar.c',
'nir_lower_alpha_test.c',
'nir_lower_array_deref_of_vec.c',
'nir_lower_atomics_to_ssbo.c',
'nir_lower_bitmap.c',
'nir_lower_bool_to_float.c',

@@ -2910,6 +2910,16 @@ void nir_fixup_deref_modes(nir_shader *shader);
bool nir_lower_global_vars_to_local(nir_shader *shader);
typedef enum {
nir_lower_direct_array_deref_of_vec_load = (1 << 0),
nir_lower_indirect_array_deref_of_vec_load = (1 << 1),
nir_lower_direct_array_deref_of_vec_store = (1 << 2),
nir_lower_indirect_array_deref_of_vec_store = (1 << 3),
} nir_lower_array_deref_of_vec_options;
bool nir_lower_array_deref_of_vec(nir_shader *shader, nir_variable_mode modes,
nir_lower_array_deref_of_vec_options options);
bool nir_lower_indirect_derefs(nir_shader *shader, nir_variable_mode modes);
bool nir_lower_locals_to_regs(nir_shader *shader);

@@ -560,6 +560,35 @@ nir_channels(nir_builder *b, nir_ssa_def *def, nir_component_mask_t mask)
return nir_swizzle(b, def, swizzle, num_channels, false);
}
static inline nir_ssa_def *
_nir_vector_extract_helper(nir_builder *b, nir_ssa_def *vec, nir_ssa_def *c,
unsigned start, unsigned end)
{
if (start == end - 1) {
return nir_channel(b, vec, start);
} else {
unsigned mid = start + (end - start) / 2;
return nir_bcsel(b, nir_ilt(b, c, nir_imm_int(b, mid)),
_nir_vector_extract_helper(b, vec, c, start, mid),
_nir_vector_extract_helper(b, vec, c, mid, end));
}
}
static inline nir_ssa_def *
nir_vector_extract(nir_builder *b, nir_ssa_def *vec, nir_ssa_def *c)
{
nir_src c_src = nir_src_for_ssa(c);
if (nir_src_is_const(c_src)) {
unsigned c_const = nir_src_as_uint(c_src);
if (c_const < vec->num_components)
return nir_channel(b, vec, c_const);
else
return nir_ssa_undef(b, 1, vec->bit_size);
} else {
return _nir_vector_extract_helper(b, vec, c, 0, vec->num_components);
}
}
static inline nir_ssa_def *
nir_i2i(nir_builder *build, nir_ssa_def *x, unsigned dest_bit_size)
{
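
Note on `_nir_vector_extract_helper` above: it selects a component by recursively halving the index range, so the selects form a balanced tree rather than a linear chain. The sketch below applies the same recursion to a plain C array. `vector_extract` is a hypothetical scalar analogue, with an ordinary conditional standing in for `nir_bcsel`.

```c
#include <assert.h>

/* Pick vec[index] from the half-open range [start, end) by binary selection.
 * An out-of-range index falls through to the nearest end of the range. */
static int vector_extract(const int *vec, unsigned start, unsigned end,
                          int index)
{
    assert(start < end);
    if (start == end - 1)
        return vec[start];

    unsigned mid = start + (end - start) / 2;
    return index < (int)mid ? vector_extract(vec, start, mid, index)
                            : vector_extract(vec, mid, end, index);
}
```

For a 4-component vector this takes two comparisons on any path. In this scalar version, an index past the end returns the last component.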

@@ -0,0 +1,190 @@
/*
* Copyright © 2019 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#include "nir.h"
#include "nir_builder.h"
static void
build_write_masked_store(nir_builder *b, nir_deref_instr *vec_deref,
nir_ssa_def *value, unsigned component)
{
assert(value->num_components == 1);
unsigned num_components = glsl_get_components(vec_deref->type);
assert(num_components > 1 && num_components <= NIR_MAX_VEC_COMPONENTS);
nir_ssa_def *u = nir_ssa_undef(b, 1, value->bit_size);
nir_ssa_def *comps[NIR_MAX_VEC_COMPONENTS];
for (unsigned i = 0; i < num_components; i++)
comps[i] = (i == component) ? value : u;
nir_ssa_def *vec = nir_vec(b, comps, num_components);
nir_store_deref(b, vec_deref, vec, (1u << component));
}
static void
build_write_masked_stores(nir_builder *b, nir_deref_instr *vec_deref,
nir_ssa_def *value, nir_ssa_def *index,
unsigned start, unsigned end)
{
if (start == end - 1) {
build_write_masked_store(b, vec_deref, value, start);
} else {
unsigned mid = start + (end - start) / 2;
nir_push_if(b, nir_ilt(b, index, nir_imm_int(b, mid)));
build_write_masked_stores(b, vec_deref, value, index, start, mid);
nir_push_else(b, NULL);
build_write_masked_stores(b, vec_deref, value, index, mid, end);
nir_pop_if(b, NULL);
}
}
static bool
nir_lower_array_deref_of_vec_impl(nir_function_impl *impl,
nir_variable_mode modes,
nir_lower_array_deref_of_vec_options options)
{
bool progress = false;
nir_builder b;
nir_builder_init(&b, impl);
nir_foreach_block(block, impl) {
nir_foreach_instr_safe(instr, block) {
if (instr->type != nir_instr_type_intrinsic)
continue;
nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
assert(intrin->intrinsic != nir_intrinsic_copy_deref);
if (intrin->intrinsic != nir_intrinsic_load_deref &&
intrin->intrinsic != nir_intrinsic_interp_deref_at_centroid &&
intrin->intrinsic != nir_intrinsic_interp_deref_at_sample &&
intrin->intrinsic != nir_intrinsic_interp_deref_at_offset &&
intrin->intrinsic != nir_intrinsic_store_deref)
continue;
nir_deref_instr *deref = nir_src_as_deref(intrin->src[0]);
if (!(deref->mode & modes))
continue;
/* We only care about array derefs that act on vectors */
if (deref->deref_type != nir_deref_type_array)
continue;
nir_deref_instr *vec_deref = nir_deref_instr_parent(deref);
if (!glsl_type_is_vector(vec_deref->type))
continue;
assert(intrin->num_components == 1);
unsigned num_components = glsl_get_components(vec_deref->type);
assert(num_components > 1 && num_components <= NIR_MAX_VEC_COMPONENTS);
b.cursor = nir_after_instr(&intrin->instr);
if (intrin->intrinsic == nir_intrinsic_store_deref) {
assert(intrin->src[1].is_ssa);
nir_ssa_def *value = intrin->src[1].ssa;
if (nir_src_is_const(deref->arr.index)) {
if (!(options & nir_lower_direct_array_deref_of_vec_store))
continue;
unsigned index = nir_src_as_uint(deref->arr.index);
/* If index is OOB, we throw the old store away and don't
* replace it with anything.
*/
if (index < num_components)
build_write_masked_store(&b, vec_deref, value, index);
} else {
if (!(options & nir_lower_indirect_array_deref_of_vec_store))
continue;
nir_ssa_def *index = nir_ssa_for_src(&b, deref->arr.index, 1);
build_write_masked_stores(&b, vec_deref, value, index,
0, num_components);
}
nir_instr_remove(&intrin->instr);
progress = true;
} else {
if (nir_src_is_const(deref->arr.index)) {
if (!(options & nir_lower_direct_array_deref_of_vec_load))
continue;
} else {
if (!(options & nir_lower_indirect_array_deref_of_vec_load))
continue;
}
/* Turn the load into a vector load */
nir_instr_rewrite_src(&intrin->instr, &intrin->src[0],
nir_src_for_ssa(&vec_deref->dest.ssa));
intrin->dest.ssa.num_components = num_components;
intrin->num_components = num_components;
nir_ssa_def *index = nir_ssa_for_src(&b, deref->arr.index, 1);
nir_ssa_def *scalar =
nir_vector_extract(&b, &intrin->dest.ssa, index);
if (scalar->parent_instr->type == nir_instr_type_ssa_undef) {
nir_ssa_def_rewrite_uses(&intrin->dest.ssa,
nir_src_for_ssa(scalar));
nir_instr_remove(&intrin->instr);
} else {
nir_ssa_def_rewrite_uses_after(&intrin->dest.ssa,
nir_src_for_ssa(scalar),
scalar->parent_instr);
}
progress = true;
}
}
}
if (progress) {
nir_metadata_preserve(impl, nir_metadata_block_index |
nir_metadata_dominance);
}
return progress;
}
/* Lowers away array dereferences on vectors
*
* These are allowed on certain variable types such as SSBOs and TCS outputs.
* However, not everyone can actually handle them everywhere. There are also
* cases where we want to lower them for performance reasons.
*
* This pass assumes that copy_deref instructions have already been lowered.
*/
bool
nir_lower_array_deref_of_vec(nir_shader *shader, nir_variable_mode modes,
nir_lower_array_deref_of_vec_options options)
{
bool progress = false;
nir_foreach_function(function, shader) {
if (function->impl &&
nir_lower_array_deref_of_vec_impl(function->impl, modes, options))
progress = true;
}
return progress;
}
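
Note on `build_write_masked_store` above: it writes a full vector whose unused lanes are undefined and relies on the write mask so that only one component reaches memory. A plain C sketch of that idea follows. `store_masked` and `store_component` are hypothetical helpers, and zero stands in for the undefined lanes.

```c
#include <assert.h>

#define MAX_COMPONENTS 4

/* Copy src[i] to dst[i] only for the lanes whose bit is set in write_mask. */
static void store_masked(int *dst, const int *src, unsigned n,
                         unsigned write_mask)
{
    for (unsigned i = 0; i < n; i++)
        if (write_mask & (1u << i))
            dst[i] = src[i];
}

/* Write `value` into one component, leaving the other lanes untouched. */
static void store_component(int *dst, unsigned n, int value,
                            unsigned component)
{
    int comps[MAX_COMPONENTS] = {0};   /* other lanes are don't-care values */

    assert(n > 1 && n <= MAX_COMPONENTS && component < n);
    comps[component] = value;
    store_masked(dst, comps, n, 1u << component);
}
```

Because the untouched lanes are never read back and rewritten, this avoids the load-modify-store of the whole vector.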

@@ -812,8 +812,8 @@ print_intrinsic_instr(nir_intrinsic_instr *instr, print_state *state)
assert(dim < ARRAY_SIZE(dim_name) && dim_name[dim]);
fprintf(fp, " image_dim=%s", dim_name[dim]);
} else if (idx == NIR_INTRINSIC_IMAGE_ARRAY) {
bool array = nir_intrinsic_image_dim(instr);
fprintf(fp, " image_dim=%s", array ? "true" : "false");
bool array = nir_intrinsic_image_array(instr);
fprintf(fp, " image_array=%s", array ? "true" : "false");
} else if (idx == NIR_INTRINSIC_DESC_TYPE) {
VkDescriptorType desc_type = nir_intrinsic_desc_type(instr);
fprintf(fp, " desc_type=%s", vulkan_descriptor_type_name(desc_type));

@@ -77,6 +77,15 @@ repair_ssa_def(nir_ssa_def *def, void *void_state)
}
}
nir_foreach_if_use(src, def) {
nir_block *block_before_if =
nir_cf_node_as_block(nir_cf_node_prev(&src->parent_if->cf_node));
if (!nir_block_dominates(def->parent_instr->block, block_before_if)) {
is_valid = false;
break;
}
}
if (is_valid)
return true;
@@ -98,6 +107,15 @@ repair_ssa_def(nir_ssa_def *def, void *void_state)
}
}
nir_foreach_if_use_safe(src, def) {
nir_block *block_before_if =
nir_cf_node_as_block(nir_cf_node_prev(&src->parent_if->cf_node));
if (!nir_block_dominates(def->parent_instr->block, block_before_if)) {
nir_if_rewrite_condition(src->parent_if, nir_src_for_ssa(
nir_phi_builder_value_get_block_def(val, block_before_if)));
}
}
return true;
}

@@ -3045,12 +3045,7 @@ nir_ssa_def *
vtn_vector_extract_dynamic(struct vtn_builder *b, nir_ssa_def *src,
nir_ssa_def *index)
{
nir_ssa_def *dest = vtn_vector_extract(b, src, 0);
for (unsigned i = 1; i < src->num_components; i++)
dest = nir_bcsel(&b->nb, nir_ieq_imm(&b->nb, index, i),
vtn_vector_extract(b, src, i), dest);
return dest;
return nir_vector_extract(&b->nb, src, nir_i2i(&b->nb, index, 32));
}
nir_ssa_def *

@@ -199,8 +199,10 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id,
bind_to_texture_rgb = 0;
bind_to_texture_rgba = 0;
for (int i = 0; dri2_dpy->core->indexConfigAttrib(dri_config, i, &attrib,
&value); ++i) {
for (int i = 0; i < __DRI_ATTRIB_MAX; ++i) {
if (!dri2_dpy->core->indexConfigAttrib(dri_config, i, &attrib, &value))
break;
switch (attrib) {
case __DRI_ATTRIB_RENDER_TYPE:
if (value & __DRI_ATTRIB_RGBA_BIT)


@@ -64,6 +64,7 @@ static rvcn_dec_message_avc_t get_h264_msg(struct radeon_decoder *dec,
memset(&result, 0, sizeof(result));
switch (pic->base.profile) {
case PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE:
case PIPE_VIDEO_PROFILE_MPEG4_AVC_CONSTRAINED_BASELINE:
result.profile = RDECODE_H264_PROFILE_BASELINE;
break;
@@ -490,7 +491,7 @@ static rvcn_dec_message_vp9_t get_vp9_msg(struct radeon_decoder *dec,
assert(dec->base.max_references + 1 <= 16);
for (i = 0 ; i < dec->base.max_references + 1 ; ++i) {
for (i = 0 ; i < 16 ; ++i) {
if (dec->render_pic_list[i] && dec->render_pic_list[i] == target) {
result.curr_pic_idx =
(uintptr_t)vl_video_buffer_get_associated_data(target, &dec->base);


@@ -186,7 +186,7 @@ static void si_emit_guardband(struct si_context *ctx)
ctx->chip_class >= VI ? 16 : MAX2(ctx->screen->se_tile_repeat, 16);
/* Indexed by quantization modes */
static unsigned max_viewport_size[] = {65535, 16383, 4095};
static int max_viewport_size[] = {65535, 16383, 4095};
/* Ensure that the whole viewport stays representable in
* absolute coordinates.


@@ -373,17 +373,18 @@ sp_tile_cache_flush_clear(struct softpipe_tile_cache *tc, int layer)
if (util_format_is_pure_uint(tc->surface->format)) {
pipe_put_tile_ui_format(pt, tc->transfer_map[layer],
x, y, TILE_SIZE, TILE_SIZE,
pt->resource->format,
tc->surface->format,
(unsigned *) tc->tile->data.colorui128);
} else if (util_format_is_pure_sint(tc->surface->format)) {
pipe_put_tile_i_format(pt, tc->transfer_map[layer],
x, y, TILE_SIZE, TILE_SIZE,
pt->resource->format,
tc->surface->format,
(int *) tc->tile->data.colori128);
} else {
pipe_put_tile_rgba(pt, tc->transfer_map[layer],
x, y, TILE_SIZE, TILE_SIZE,
(float *) tc->tile->data.color);
pipe_put_tile_rgba_format(pt, tc->transfer_map[layer],
x, y, TILE_SIZE, TILE_SIZE,
tc->surface->format,
(float *) tc->tile->data.color);
}
}
numCleared++;


@@ -491,7 +491,8 @@ v3d_tfu_blit(struct pipe_context *pctx, const struct pipe_blit_info *info)
if ((info->mask & PIPE_MASK_RGBA) == 0)
return false;
if (info->dst.box.x != 0 ||
if (info->scissor_enable ||
info->dst.box.x != 0 ||
info->dst.box.y != 0 ||
info->dst.box.width != dst_width ||
info->dst.box.height != dst_height ||


@@ -70,6 +70,7 @@ v3d_screen_destroy(struct pipe_screen *pscreen)
util_hash_table_destroy(screen->bo_handles);
v3d_bufmgr_destroy(pscreen);
slab_destroy_parent(&screen->transfer_pool);
free(screen->ro);
if (using_v3d_simulator)
v3d_simulator_destroy(screen);
@@ -184,7 +185,10 @@ v3d_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_TEXTURE_2D_LEVELS:
case PIPE_CAP_MAX_TEXTURE_CUBE_LEVELS:
case PIPE_CAP_MAX_TEXTURE_3D_LEVELS:
return V3D_MAX_MIP_LEVELS;
if (screen->devinfo.ver < 40)
return 12;
else
return V3D_MAX_MIP_LEVELS;
case PIPE_CAP_MAX_TEXTURE_ARRAY_LAYERS:
return 2048;


@@ -55,7 +55,28 @@ v3d_start_draw(struct v3d_context *v3d)
job->submit.bcl_start = job->bcl.bo->offset;
v3d_job_add_bo(job, job->bcl.bo);
job->tile_alloc = v3d_bo_alloc(v3d->screen, 1024 * 1024, "tile_alloc");
/* The PTB will request the tile alloc initial size per tile at start
* of tile binning.
*/
uint32_t tile_alloc_size = (job->draw_tiles_x *
job->draw_tiles_y) * 64;
/* The PTB allocates in aligned 4k chunks after the initial setup. */
tile_alloc_size = align(tile_alloc_size, 4096);
/* Include the first two chunk allocations that the PTB does so that
* we definitely clear the OOM condition before triggering one (the HW
* won't trigger OOM during the first allocations).
*/
tile_alloc_size += 8192;
/* For performance, allocate some extra initial memory after the PTB's
* minimal allocations, so that we hopefully don't have to block the
* GPU on the kernel handling an OOM signal.
*/
tile_alloc_size += 512 * 1024;
job->tile_alloc = v3d_bo_alloc(v3d->screen, tile_alloc_size,
"tile_alloc");
uint32_t tsda_per_tile_size = v3d->screen->devinfo.ver >= 40 ? 256 : 64;
job->tile_state = v3d_bo_alloc(v3d->screen,
job->draw_tiles_y *
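
The sizing arithmetic in the hunk above can be checked in isolation. The sketch below is not part of the patch: `align_pot` and `initial_tile_alloc_size` are hypothetical names, and `align_pot` stands in for Mesa's `align()` under the assumption that it rounds up to a power-of-two multiple.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of a. Assumption: a is a power of two,
 * which is how the hunk uses its alignment helper (4096). */
static uint32_t align_pot(uint32_t v, uint32_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (v + a - 1) & ~(a - 1);
}

/* Hypothetical helper that repeats the steps from the hunk:
 * 64 bytes per tile, rounded to 4k, plus two 4k chunks, plus 512 KiB. */
static uint32_t initial_tile_alloc_size(uint32_t tiles_x, uint32_t tiles_y)
{
    uint32_t size = tiles_x * tiles_y * 64; /* initial request per tile */
    size = align_pot(size, 4096);           /* PTB allocates in 4k chunks */
    size += 8192;                           /* first two chunk allocations */
    size += 512 * 1024;                     /* extra headroom for performance */
    return size;
}
```

For a 30x17 tile grid this gives 32640 bytes of per-tile requests, rounded up to 32768, then 40960 after the two chunks and 565248 after the headroom, which is well under the fixed 1 MiB that the removed line allocated.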


@@ -846,6 +846,9 @@ v3d_setup_texture_shader_state(struct V3DX(TEXTURE_SHADER_STATE) *tex,
prsc->target == PIPE_TEXTURE_1D_ARRAY) {
tex->image_height = tex->image_width >> 14;
}
tex->image_width &= (1 << 14) - 1;
tex->image_height &= (1 << 14) - 1;
#endif
if (prsc->target == PIPE_TEXTURE_3D) {


@@ -27,6 +27,19 @@
#include "va_private.h"
const int reverse_inverse_zscan[] =
{
/* Reverse inverse z scan pattern */
0, 2, 3, 9, 10, 20, 21, 35,
1, 4, 8, 11, 19, 22, 34, 36,
5, 7, 12, 18, 23, 33, 37, 48,
6, 13, 17, 24, 32, 38, 47, 49,
14, 16, 25, 31, 39, 46, 50, 57,
15, 26, 30, 40, 45, 51, 56, 58,
27, 29, 41, 44, 52, 55, 59, 62,
28, 42, 43, 53, 54, 60, 61, 63,
};
void vlVaHandlePictureParameterBufferMPEG12(vlVaDriver *drv, vlVaContext *context, vlVaBuffer *buf)
{
VAPictureParameterBufferMPEG2 *mpeg2 = buf->data;
@@ -66,16 +79,29 @@ void vlVaHandlePictureParameterBufferMPEG12(vlVaDriver *drv, vlVaContext *contex
void vlVaHandleIQMatrixBufferMPEG12(vlVaContext *context, vlVaBuffer *buf)
{
VAIQMatrixBufferMPEG2 *mpeg2 = buf->data;
static uint8_t temp_intra_matrix[64];
static uint8_t temp_nonintra_matrix[64];
assert(buf->size >= sizeof(VAIQMatrixBufferMPEG2) && buf->num_elements == 1);
if (mpeg2->load_intra_quantiser_matrix)
context->desc.mpeg12.intra_matrix = mpeg2->intra_quantiser_matrix;
else
if (mpeg2->load_intra_quantiser_matrix) {
/* The quantiser matrix that VAAPI provides has been applied
with inverse z-scan. However, what we expect in MPEG2
picture description is the original order. Therefore,
we need to reverse it back to its original order.
*/
for (int i = 0; i < 64; i++)
temp_intra_matrix[i] =
mpeg2->intra_quantiser_matrix[reverse_inverse_zscan[i]];
context->desc.mpeg12.intra_matrix = temp_intra_matrix;
} else
context->desc.mpeg12.intra_matrix = NULL;
if (mpeg2->load_non_intra_quantiser_matrix)
context->desc.mpeg12.non_intra_matrix = mpeg2->non_intra_quantiser_matrix;
else
if (mpeg2->load_non_intra_quantiser_matrix) {
for (int i = 0; i < 64; i++)
temp_nonintra_matrix[i] =
mpeg2->non_intra_quantiser_matrix[reverse_inverse_zscan[i]];
context->desc.mpeg12.non_intra_matrix = temp_nonintra_matrix;
} else
context->desc.mpeg12.non_intra_matrix = NULL;
}
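
The reordering loop in the hunk above is a plain table-driven permutation. The sketch below is not the driver code: `reorder_quant_matrix` is a hypothetical name, the table is copied from the hunk, and the caller-owned destination buffer is an assumption made here to keep the example free of static state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Table copied from the hunk above. */
static const uint8_t reverse_inverse_zscan[64] = {
    0,  2,  3,  9,  10, 20, 21, 35,
    1,  4,  8,  11, 19, 22, 34, 36,
    5,  7,  12, 18, 23, 33, 37, 48,
    6,  13, 17, 24, 32, 38, 47, 49,
    14, 16, 25, 31, 39, 46, 50, 57,
    15, 26, 30, 40, 45, 51, 56, 58,
    27, 29, 41, 44, 52, 55, 59, 62,
    28, 42, 43, 53, 54, 60, 61, 63,
};

/* Hypothetical helper: dst[i] takes the source entry selected by the table.
 * The buffers must not alias, because the permutation is not done in place. */
static void reorder_quant_matrix(uint8_t dst[64], const uint8_t src[64])
{
    assert(dst != src);
    for (size_t i = 0; i < 64; i++)
        dst[i] = src[reverse_inverse_zscan[i]];
}
```

Feeding in a source where `src[i] == i` makes the output equal to the table itself, which is a quick way to confirm the indexing direction.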


@@ -68,5 +68,5 @@ pkg.generate(
description : 'Native D3D driver modules',
version : '.'.join(nine_version),
requires_private : 'libdrm >= ' + dep_libdrm.version(),
variables : ['moduledir=${prefix}/@0@'.format(d3d_drivers_path)],
variables : ['moduledir=@0@'.format(d3d_drivers_path)],
)


@@ -60,6 +60,9 @@ libgallium_dri = shared_library(
driver_tegra, driver_i915, driver_svga, driver_virgl,
driver_swr,
],
# Will be deleted during installation, see install_megadrivers.py
install : true,
install_dir : dri_drivers_path,
)
foreach d : [[with_gallium_kmsro, 'pl111_dri.so'],


@@ -49,6 +49,7 @@ libva_gallium = shared_library(
dep_libdrm, dep_thread, driver_r600, driver_radeonsi, driver_nouveau,
],
link_depends : va_link_depends,
# Will be deleted during installation, see install_megadrivers.py
install : true,
install_dir : va_drivers_path,
)


@@ -55,6 +55,9 @@ libvdpau_gallium = shared_library(
],
link_depends : vdpau_link_depends,
soversion : '@0@.@1@.0'.format(VDPAU_MAJOR, VDPAU_MINOR),
# Will be deleted during installation, see install_megadrivers.py
install : true,
install_dir : vdpau_drivers_path,
)
foreach d : [[with_gallium_r300, 'r300'],
[with_gallium_r600, 'r600'],


@@ -47,6 +47,9 @@ libxvmc_gallium = shared_library(
],
dependencies : [dep_thread, driver_r600, driver_nouveau],
link_depends : xvmc_link_depends,
# Will be deleted during installation, see install_megadrivers.py
install : true,
install_dir : xvmc_drivers_path,
)
foreach d : [[with_gallium_r600, 'r600'], [with_gallium_nouveau, 'nouveau']]


@@ -642,7 +642,6 @@ dri3_set_swap_interval(__GLXDRIdrawable *pdraw, int interval)
break;
}
priv->swap_interval = interval;
loader_dri3_set_swap_interval(&priv->loader_drawable, interval);
return 0;
@@ -659,7 +658,7 @@ dri3_get_swap_interval(__GLXDRIdrawable *pdraw)
struct dri3_drawable *priv = (struct dri3_drawable *) pdraw;
return priv->swap_interval;
return priv->loader_drawable.swap_interval;
}
static void


@@ -117,7 +117,6 @@ struct dri3_context
struct dri3_drawable {
__GLXDRIdrawable base;
struct loader_dri3_drawable loader_drawable;
int swap_interval;
/* LIBGL_SHOW_FPS support */
uint64_t previous_ust;


@@ -33,12 +33,15 @@ ISL_GEN_LIBS = \
noinst_LTLIBRARIES += $(ISL_GEN_LIBS) \
isl/libisl.la \
libisl_tiled_memcpy.la \
libisl_tiled_memcpy_sse41.la
libisl_tiled_memcpy.la
isl_libisl_la_LIBADD = $(ISL_GEN_LIBS) \
libisl_tiled_memcpy.la \
libisl_tiled_memcpy_sse41.la
libisl_tiled_memcpy.la
if SSE41_SUPPORTED
isl_libisl_la_LIBADD += libisl_tiled_memcpy_sse41.la
noinst_LTLIBRARIES += libisl_tiled_memcpy_sse41.la
endif
isl_libisl_la_SOURCES = $(ISL_FILES) $(ISL_GENERATED_FILES)


@@ -33,5 +33,5 @@ libblorp = static_library(
files_libblorp,
include_directories : [inc_common, inc_intel],
c_args : [c_vis_args, no_override_init_args],
dependencies : idep_nir_headers,
dependencies : [idep_nir_headers, idep_genxml],
)


@@ -43,5 +43,5 @@ libintel_common = static_library(
include_directories : [inc_common, inc_intel],
c_args : [c_vis_args, no_override_init_args],
link_with : [libisl],
dependencies : [dep_expat, dep_libdrm, dep_thread],
dependencies : [dep_expat, dep_libdrm, dep_thread, idep_genxml],
)


@@ -3117,6 +3117,7 @@ fs_visitor::opt_peephole_csel()
if (csel_inst != NULL) {
progress = true;
csel_inst->saturate = inst->saturate;
inst->remove(block);
}


@@ -2100,6 +2100,7 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
break;
case SHADER_OPCODE_INTERLOCK:
assert(devinfo->gen >= 9);
/* The interlock is basically a memory fence issued via sendc */
brw_memory_fence(p, dst, BRW_OPCODE_SENDC);
break;


@@ -781,6 +781,17 @@ brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir)
OPT(brw_nir_lower_mem_access_bit_sizes);
/* Lower array derefs of vectors for SSBO and UBO loads. For both UBOs and
* SSBOs, our back-end is capable of loading an entire vec4 at a time and
* we would like to take advantage of that whenever possible regardless of
* whether or not the app gives us full loads. This should allow the
* optimizer to combine UBO and SSBO load operations and save us some send
* messages.
*/
OPT(nir_lower_array_deref_of_vec,
nir_var_mem_ubo | nir_var_mem_ssbo,
nir_lower_direct_array_deref_of_vec_load);
/* Get rid of split copies */
nir = brw_nir_optimize(nir, compiler, is_scalar, false);


@@ -57,3 +57,5 @@ foreach f : gen_xml_files
capture : true,
)
endforeach
idep_genxml = declare_dependency(sources : [gen_xml_pack, genX_bits_h, genX_xml_h])


@@ -21,9 +21,9 @@
c_sse2_args = ['-msse2', '-mstackrealign']
inc_intel = include_directories('.')
subdir('genxml')
subdir('blorp')
subdir('dev')
subdir('genxml')
subdir('isl')
subdir('common')
subdir('compiler')


@@ -178,12 +178,28 @@ anv_render_pass_compile(struct anv_render_pass *pass)
* subpasses and checking to see if any of them don't have an external
* dependency. Or, we could just be lazy and add a couple extra flushes.
* We choose to be lazy.
*
* From the documentation for vkCmdNextSubpass:
*
* "Moving to the next subpass automatically performs any multisample
* resolve operations in the subpass being ended. End-of-subpass
* multisample resolves are treated as color attachment writes for the
* purposes of synchronization. This applies to resolve operations for
* both color and depth/stencil attachments. That is, they are
* considered to execute in the
* VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT pipeline stage and
* their writes are synchronized with
* VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT."
*
* Therefore, the above flags concerning color attachments also apply to
* color and depth/stencil resolve attachments.
*/
if (all_usage & VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT) {
pass->subpass_flushes[0] |=
ANV_PIPE_TEXTURE_CACHE_INVALIDATE_BIT;
}
if (all_usage & VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT) {
if (all_usage & (VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT |
VK_IMAGE_USAGE_TRANSFER_DST_BIT)) {
pass->subpass_flushes[pass->subpass_count] |=
ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT;
}


@@ -2653,7 +2653,7 @@ genX(cmd_buffer_flush_state)(struct anv_cmd_buffer *cmd_buffer)
anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_SO_BUFFER), sob) {
sob.SOBufferIndex = idx;
if (cmd_buffer->state.xfb_enabled && xfb->buffer) {
if (cmd_buffer->state.xfb_enabled && xfb->buffer && xfb->size != 0) {
sob.SOBufferEnable = true;
sob.MOCS = cmd_buffer->device->default_mocs,
sob.StreamOffsetWriteEnable = false;


@@ -203,7 +203,7 @@ libvulkan_intel = shared_library(
libvulkan_util, libvulkan_wsi, libmesa_util,
],
dependencies : [
dep_thread, dep_dl, dep_m, anv_deps, idep_nir,
dep_thread, dep_dl, dep_m, anv_deps, idep_nir, idep_genxml,
],
c_args : anv_flags,
link_args : ['-Wl,--build-id=sha1', ld_args_bsymbolic, ld_args_gc_sections],


@@ -253,7 +253,6 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.EXT_shader_samples_identical = true;
ctx->Extensions.OES_primitive_bounding_box = true;
ctx->Extensions.OES_texture_buffer = true;
ctx->Extensions.ARB_fragment_shader_interlock = true;
if (can_do_pipelined_register_writes(brw->screen)) {
ctx->Extensions.ARB_draw_indirect = true;
@@ -318,6 +317,30 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.KHR_blend_equation_advanced_coherent = true;
ctx->Extensions.KHR_texture_compression_astc_ldr = true;
ctx->Extensions.KHR_texture_compression_astc_sliced_3d = true;
/*
* From the Skylake PRM Vol. 7 (Memory Fence Message, page 221):
* "A memory fence message issued by a thread causes further messages
* issued by the thread to be blocked until all previous data port
* messages have completed, or the results can be globally observed from
* the point of view of other threads in the system."
*
* From the Haswell PRM Vol. 7 (Memory Fence, page 256):
* "A memory fence message issued by a thread causes further messages
* issued by the thread to be blocked until all previous messages issued
* by the thread to that data port (data cache or render cache) have
* been globally observed from the point of view of other threads in the
* system."
*
* Summarized: For ARB_fragment_shader_interlock to work, we need to
* ensure memory access ordering for all messages to the dataport from
* all threads. Memory fence messages prior to SKL only provide memory
* access ordering for messages from the same thread, so we can only
* support the feature from Gen9 onwards.
*
*/
ctx->Extensions.ARB_fragment_shader_interlock = true;
}
if (gen_device_info_is_9lp(devinfo))


@@ -187,7 +187,7 @@ libi965 = static_library(
i965_gen_libs, libintel_common, libintel_dev, libisl, libintel_compiler,
libblorp
],
dependencies : [dep_libdrm, dep_valgrind, idep_nir_headers],
dependencies : [dep_libdrm, dep_valgrind, idep_nir_headers, idep_genxml],
)
dri_drivers += libi965


@@ -54,6 +54,9 @@ if dri_drivers != []
dep_selinux, dep_libdrm, dep_expat, dep_m, dep_thread, dep_dl, idep_nir,
],
link_args : [ld_args_build_id, ld_args_bsymbolic, ld_args_gc_sections],
# Will be deleted during installation, see install_megadrivers.py
install : true,
install_dir : dri_drivers_path,
)
meson.add_install_script(


@@ -231,6 +231,9 @@ _mesa_gl_vdebug(struct gl_context *ctx,
_mesa_debug_get_id(id);
len = _mesa_vsnprintf(s, MAX_DEBUG_MESSAGE_LENGTH, fmtString, args);
if (len >= MAX_DEBUG_MESSAGE_LENGTH)
/* message was truncated */
len = MAX_DEBUG_MESSAGE_LENGTH - 1;
_mesa_log_msg(ctx, source, type, *id, severity, len, s);
}
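
The clamp added in the hunk above guards against the length that a `vsnprintf`-style call reports on truncation. The sketch below illustrates that with the standard C99 `vsnprintf`; `format_clamped` is a hypothetical name, and it is an assumption here that `_mesa_vsnprintf` follows the same return-value convention.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: format into buf and return the number of characters
 * actually stored, excluding the terminating NUL. */
static int format_clamped(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int len;

    assert(buf != NULL && size > 0);

    va_start(args, fmt);
    len = vsnprintf(buf, size, fmt, args);
    va_end(args);

    if (len < 0) {
        /* Output error: report an empty string. */
        buf[0] = '\0';
        return 0;
    }

    /* C99 vsnprintf returns the length the full message would have had,
     * so a value >= size means the stored text was truncated. */
    if ((size_t)len >= size)
        len = (int)size - 1;

    return len;
}
```

Without the clamp, a caller that trusts the returned length would read past the end of the stored string whenever the message did not fit.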


@@ -906,6 +906,9 @@ find_custom_value(struct gl_context *ctx, const struct value_desc *d, union valu
break;
/* GL_EXT_external_objects */
case GL_NUM_DEVICE_UUIDS_EXT:
v->value_int = 1;
break;
case GL_DRIVER_UUID_EXT:
_mesa_get_driver_uuid(ctx, v->value_int_4);
break;