Compare commits

70 Commits

Author SHA1 Message Date
Juan A. Suarez Romero
3b49ab6219 docs: add release notes for 18.0.4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-05-17 18:15:18 +00:00
Juan A. Suarez Romero
a7f75b9487 Update version to 18.0.4
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-05-17 18:09:37 +00:00
Kai Wasserbäch
0f9bd67c4b opencl: autotools: Fix linking order for OpenCL target
Otherwise the build fails with an undefined reference to
clang::FrontendTimesIsEnabled.

Bugzilla: https://bugs.freedesktop.org/106209
Cc: Jan Vesely <jan.vesely@rutgers.edu>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Acked-by: Jan Vesely <jan.vesely@rutgers.edu>
Tested-by: Aaron Watry <awatry@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
(cherry picked from commit b691d9192c)
2018-05-15 11:15:21 +02:00
Bas Nieuwenhuizen
0fa8cdfd13 radv: Disable texel buffers with A2 SNORM/SSCALED/SINT for pre-vega.
The hardware always interprets the alpha as unsigned and fixing it
in the shader is going to add unacceptable overheads.

CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106480
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit f944a59996)
2018-05-15 11:15:21 +02:00
Bas Nieuwenhuizen
9a4b915517 radv: Fix up 2_10_10_10 alpha sign.
Pre-Vega HW always interprets the alpha for this format as unsigned,
so we have to implement a fixup to do the sign correctly for signed
formats.

v2: Improve indexing mess.

CC: 18.0 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106480
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(Backport of 3d4d388e39 "radv: Fix up 2_10_10_10 alpha sign.")
2018-05-15 11:15:21 +02:00
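Sketch for 9a4b915517 (radv: Fix up 2_10_10_10 alpha sign.): the message says pre-Vega hardware returns the 2-bit alpha as unsigned even for signed formats. The snippet below only illustrates the arithmetic of such a fixup for the integer (SINT) case. It is not the shader code from the commit, and the helper name sign_extend_alpha2 is hypothetical.

```c
#include <stdint.h>

/* Illustrative helper, not Mesa code: reinterpret a 2-bit field that was
 * read back as unsigned (0..3) as a two's complement value (-2..1). */
int32_t sign_extend_alpha2(uint32_t raw)
{
   raw &= 0x3u;                      /* keep only the two alpha bits */
   return (int32_t)(raw ^ 0x2u) - 2; /* flip the sign bit, then re-bias */
}
```

Under these assumptions the unsigned readings 2 and 3 map to -2 and -1, while 0 and 1 are unchanged. The SNORM and SSCALED cases would need a further conversion that is not shown here.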
Bas Nieuwenhuizen
1b0406f465 radv: Translate logic ops.
radeonsi could pass them through but the enum changed between
Gallium and Vulkan, so we have to translate.

In the process I made the register defines a bit more readable.

CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100430
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit dd102405de)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/amd/vulkan/radv_pipeline.c
2018-05-15 11:15:21 +02:00
Dave Airlie
33a8aad459 radv: use compute path for multi-layer images.
I don't think the hw resolve path can handle multi-layer images.

This fixes all the:
dEQP-VK.renderpass.multisample_resolve.layers_*
tests on my VI card.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5978d54a09)
2018-05-15 11:14:49 +02:00
Dave Airlie
4a4a51bdfb radv: resolve all layers in compute resolve path.
This path should iterate across all layers. I have some ideas
for doing this in a single pass, but this is simpler for now.

This passes the tests because we don't use the fragment path
unless we have DCC, and we don't have DCC on layered images.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 98dbaa445a)
2018-05-15 11:14:49 +02:00
Juan A. Suarez Romero
9a537aad11 cherry-ignore: radv/resolve: do fmask decompress on all layers.
stable: The commit requires earlier commits ab0e625a67 and 62510846b6
which did not land in branch.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-05-15 11:14:49 +02:00
Jan Vesely
538022adf8 winsys/amdgpu: Destroy dev_hash table when the last winsys is removed.
Fixes memory leak on module unload.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 58272c1ad7)
2018-05-15 11:14:49 +02:00
Juan A. Suarez Romero
4788977798 cherry-ignore: mesa: revert GL_[SECONDARY_]COLOR_ARRAY_SIZE glGet type to TYPE_INT
stable: The commit fixes earlier commit d07466fe18 which did not land
in branch.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-05-15 11:14:49 +02:00
Juan A. Suarez Romero
5cd442e589 cherry-ignore: mesa: fix glGetInteger/Float/etc queries for vertex arrays attribs
stable: The commit fixes earlier commit d5f42f96e1 which did not land
in branch.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-05-15 11:14:49 +02:00
Jason Ekstrand
81a733214a i965,anv: Set the CS stall bit on the ISP disable PIPE_CONTROL
From the bspec docs for "Indirect State Pointers Disable":

    "At the completion of the post-sync operation associated with this
    pipe control packet, the indirect state pointers in the hardware are
    considered invalid"

So the ISP disable is a post-sync type of operation which means that it
should be combined with a CS stall.  Without this, the simulator throws
an error.

Fixes: 766d801ca "anv: emit pixel scoreboard stall before ISP disable"
Fixes: f536097f6 "i965: require pixel scoreboard stall prior to ISP disable"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit a8a740f272)
2018-05-15 11:14:49 +02:00
Lionel Landwerlin
c78a265f75 anv: emit pixel scoreboard stall before ISP disable
We want to make sure that all indirect state data has been loaded into
the EUs before disabling the pointers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Fixes: 78c125af39 ("anv/gen10: Ignore push constant packets during context restore.")
(cherry picked from commit 766d801ca3)
2018-05-15 11:14:49 +02:00
Lionel Landwerlin
430bca7d89 i965: require pixel scoreboard stall prior to ISP disable
Invalidating the indirect state pointers might affect a previously
scheduled & still running 3DPRIMITIVE (causing page fault). So stall
on pixel scoreboard before that.

v2: Fix compile issue :(

v3: Stall on pixel scoreboard

v4: Drop the post sync operation (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Fixes: ca19ee33d7 ("i965/gen10: Ignore push constant packets during context restore.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106243
(cherry picked from commit f536097f67)
2018-05-15 11:14:49 +02:00
Jan Vesely
876c7c7006 winsys/radeon: Destroy fd_hash table when the last winsys is removed.
Fixes memory leak on module unload.
v2: Use util_hash_table helper function

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
(cherry picked from commit 45dfa6f4e7)
2018-05-15 11:14:49 +02:00
Jan Vesely
d2632fc765 gallium/auxiliary: Add helper function to count the number of entries in hash table
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
(cherry picked from commit d146768d13)
2018-05-15 11:14:49 +02:00
Dave Airlie
83e543e9fa r600: fix constant buffer bounds.
If you have an indirect access to a constant buffer on r600/eg, we
use a vertex fetch in the shader. However, apps expect specific
behaviour on out-of-bounds accesses (even if they are illegal).

If the constants were being uploaded as part of a larger
upload buffer, we'd set the range of allowed access to a lot
larger than required so apps would get values back from
other parts of the upload buffer instead of the expected out
of bounds access.

This fixes rendering bugs in Trine and Witcher 1, thanks
to iive for nagging me effectively until I figured it out :-)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91808
Cc: <mesa-stable@lists.freedesktop.org>

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit ce027ac5c7)
2018-05-15 11:14:49 +02:00
Ross Burton
3c0ca29ff0 src/intel/Makefile.vulkan.am: add missing MKDIR_GEN
Out of tree builds can try to write into a directory that doesn't exist yet:

| Traceback (most recent call last):
|   File "../../../mesa-18.0.2/src/intel/vulkan/anv_icd.py", line 46, in <module>
|     with open(args.out, 'w') as f:
| IOError: [Errno 2] No such file or directory: 'vulkan/intel_icd.x86_64.json'
| Makefile:4882: recipe for target 'vulkan/intel_icd.x86_64.json' failed

Add missing MKDIR_GEN calls to solve this.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 1755654d9f)
2018-05-15 11:14:49 +02:00
Rhys Perry
4368854260 mesa: fix error handling in get_framebuffer_parameteriv
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 5ac16ed047)
2018-05-15 11:14:49 +02:00
Jan Vesely
5f0c3879e6 pipe-loader: Free driver_name in error path
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 0783399d79)
2018-05-15 11:14:49 +02:00
Juan A. Suarez Romero
eeaad26ff2 cherry-ignore: glsl: change ast_type_qualifier bitset size to work around GCC 5.4 bug
stable: The commit requires earlier commit ba79a90fb5 which did not
land in branch.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-05-15 11:14:45 +02:00
Jan Vesely
42229106b3 eg/compute: Drop reference to kernel_param bo in destructor
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a9e4be9212)
2018-05-15 11:14:11 +02:00
Jan Vesely
c013960afd r600: Cleanup constant buffers on context destruction
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a1e8fcce3e)
2018-05-15 11:14:11 +02:00
Jan Vesely
8d728f903e eg/compute: Drop reference on code_bo in destructor.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit ea1fff4416)
2018-05-15 11:14:11 +02:00
Kenneth Graunke
7d0f1d676a i965: Don't leak blorp on Gen4-5.
We used to only initialize BLORP on Gen6+.  When we added it on Gen4-5,
we forgot to destroy it unconditionally.

Fixes: 752d7af77a (i965: Add blorp support for gen4-5)
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 2dc29e095f)

Squashed with:

i965: silence unused variable

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2dc29e095f ("i965: Don't leak blorp on Gen4-5.")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 3853f1c6f4)
2018-05-15 11:14:11 +02:00
Jan Vesely
54c208e48e clover: Add explicit virtual destructor to argument class
It is needed to destroy the v vector in scalar_argument.
Fixes memory leaks on parameter set/bind.

v2: Drop redundant scalar_argument destructor

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 2f1ad72ac1)
2018-05-15 11:14:11 +02:00
Neil Roberts
23cd0c1598 spirv: Apply OriginUpperLeft to FragCoord
This behaviour was changed in 1e5b09f42f. The commit message
for that says it is just a “tidy up” so my assumption is that the
behaviour change was a mistake. It’s a little hard to decipher looking
at the diff, but the previous code before that patch was:

  if (builtin == SpvBuiltInFragCoord || builtin == SpvBuiltInSamplePosition)
     nir_var->data.origin_upper_left = b->origin_upper_left;

  if (builtin == SpvBuiltInFragCoord)
     nir_var->data.pixel_center_integer = b->pixel_center_integer;

After the patch the code was:

  case SpvBuiltInSamplePosition:
     nir_var->data.origin_upper_left = b->origin_upper_left;
     /* fallthrough */
  case SpvBuiltInFragCoord:
     nir_var->data.pixel_center_integer = b->pixel_center_integer;
     break;

Before the patch origin_upper_left affected both builtins and
pixel_center_integer only affected FragCoord. After the patch
origin_upper_left only affects SamplePosition and pixel_center_integer
affects both variables.

This patch tries to restore the previous behaviour by changing the
code to:

  case SpvBuiltInFragCoord:
     nir_var->data.pixel_center_integer = b->pixel_center_integer;
     /* fallthrough */
  case SpvBuiltInSamplePosition:
     nir_var->data.origin_upper_left = b->origin_upper_left;
     break;

This change will be important for ARB_gl_spirv which is meant to
support OriginLowerLeft.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Fixes: 1e5b09f42f "spirv: Tidy some repeated if checks..."
(cherry picked from commit e17d0ccbbd)
2018-05-15 11:14:11 +02:00
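Sketch for 23cd0c1598 (spirv: Apply OriginUpperLeft to FragCoord): the restored switch ordering can be exercised on its own. This is a minimal stand-in, not the vtn code. The enum, the struct and the function apply_builtin are hypothetical names introduced for illustration, and only the case ordering is taken from the commit message.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the SPIR-V builtins named in the message. */
enum builtin { BUILTIN_FRAG_COORD, BUILTIN_SAMPLE_POSITION, BUILTIN_OTHER };

struct var_data {
   bool origin_upper_left;
   bool pixel_center_integer;
};

/* Same case ordering as the restored code: FragCoord sets
 * pixel_center_integer and then falls through, so both builtins end up
 * with origin_upper_left applied. */
struct var_data apply_builtin(enum builtin b, bool origin_upper_left,
                              bool pixel_center_integer)
{
   struct var_data d = { false, false };

   switch (b) {
   case BUILTIN_FRAG_COORD:
      d.pixel_center_integer = pixel_center_integer;
      /* fallthrough */
   case BUILTIN_SAMPLE_POSITION:
      d.origin_upper_left = origin_upper_left;
      break;
   default:
      break;
   }
   return d;
}
```

With both inputs set to true, the FragCoord case sets both fields, while the SamplePosition case sets only origin_upper_left, which matches the behaviour described as existing before 1e5b09f42f.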
Ian Romanick
aecf2e1319 mesa: Add missing support for glFogiv(GL_FOG_DISTANCE_MODE_NV)
Found by inspection, so I made a piglit test too.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f2db3be620)
2018-05-15 11:14:11 +02:00
Deepak Rawat
749626c473 egl/x11: Send invalidate to driver on copy_region path in swap_buffer
Similar to the swap_available path, send invalidate to the driver because
egl/X11 is not watching for the server's invalidate events. The
dri2_copy_region path is triggered when the server supports DRI2 version
minor 1.

Tested with piglit egl tests for regression.

V2: Move invalidate from dri2_copy_region to swap_buffer common.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 9a21c96126)
2018-05-15 11:14:11 +02:00
Jose Maria Casanova Crespo
ee99f7deaf intel/compiler: fix brw_imm_w for negative 16-bit integers
16-bit immediates need to replicate the 16-bit immediate value
in both words of the 32-bit value. This needs to be careful
to avoid sign-extension, which the previous implementation was
not handling properly.

For example, with the previous implementation, storing the value
-3 would generate imm.d = 0xfffffffd due to signed integer sign
extension, which is not correct. Instead, we should cast to
uint16_t, which gives us the correct result: imm.ud = 0xfffdfffd.

We only had a couple of cases hitting this path in the driver
until now, one with value -1, which would work since all bits are
one in this case, and another with value -2 in brw_clip_tri(),
which would hit the aforementioned issue (this case only affects
gen4 although we are not aware of whether this was causing an
actual bug somewhere).

v2: Make explicit uint32_t casting for left shift (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f0e6dacee5)
2018-05-15 11:14:11 +02:00
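Sketch for ee99f7deaf (intel/compiler: fix brw_imm_w for negative 16-bit integers): the sign-extension problem can be reproduced in isolation. This is not the Mesa source. The function names replicate_w_buggy and replicate_w_fixed are hypothetical, and the buggy variant is only assumed to be equivalent to the old behaviour; the values come from the commit message.

```c
#include <stdint.h>

/* Assumed equivalent of the old behaviour: the int16_t operand is
 * sign-extended to 32 bits before being OR-ed in, so for negative values
 * the high word is all sign bits rather than a copy of the low word. */
uint32_t replicate_w_buggy(int16_t w)
{
   return (uint32_t)(w | ((uint32_t)w << 16));
}

/* Going through uint16_t first drops the sign extension, so the same
 * 16-bit pattern lands in both words. */
uint32_t replicate_w_fixed(int16_t w)
{
   uint16_t u = (uint16_t)w;
   return (uint32_t)u | ((uint32_t)u << 16);
}
```

For -3 the first function yields 0xfffffffd and the second 0xfffdfffd, the two values quoted in the message. For -1 both give 0xffffffff, which is why that case happened to work.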
Jose Maria Casanova Crespo
bbd5c75d7d intel/compiler: fix 16-bit int brw_negate_immediate and brw_abs_immediate
From Intel Skylake PRM, vol 07, "Immediate" section (page 768):

"For a word, unsigned word, or half-float immediate data,
software must replicate the same 16-bit immediate value to both
the lower word and the high word of the 32-bit immediate field
in a GEN instruction."

This fixes the int16/uint16 negate and abs immediates that weren't
taking into account the replication in lower and upper words.

v2: Integer cases are different to Float cases. (Jason Ekstrand)
    Included reference to PRM (Jose Maria Casanova)
v3: Make explicit uint32_t casting for left shift (Jason Ekstrand)
    Split half float implementation. (Jason Ekstrand)
    Fix brw_abs_immediate (Jose Maria Casanova)

Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 2a76f03c90)
2018-05-15 11:14:11 +02:00
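Sketch for bbd5c75d7d (intel/compiler: fix 16-bit int brw_negate_immediate and brw_abs_immediate): the PRM rule quoted above means negate and abs have to operate on the 16-bit value and then replicate it again. The helpers below are hypothetical and show only the arithmetic. The plain 32-bit negate is included for contrast, and whether the old code did exactly that is an assumption.

```c
#include <stdint.h>

/* Copy one 16-bit pattern into both words of a 32-bit immediate field. */
uint32_t replicate16(uint16_t u)
{
   return (uint32_t)u | ((uint32_t)u << 16);
}

/* For contrast: negating the whole 32-bit field breaks the replication. */
uint32_t negate_d_naive(uint32_t imm)
{
   return 0u - imm;
}

/* Negate the low word modulo 2^16, then replicate the result. */
uint32_t negate_w_imm(uint32_t imm)
{
   uint16_t lo = (uint16_t)(imm & 0xffffu);
   return replicate16((uint16_t)(0u - lo));
}

/* Absolute value of the low word as a signed 16-bit integer, replicated.
 * 0x8000 has no positive counterpart and is returned unchanged. */
uint32_t abs_w_imm(uint32_t imm)
{
   uint16_t lo = (uint16_t)(imm & 0xffffu);
   if (lo & 0x8000u)
      lo = (uint16_t)(0u - lo);
   return replicate16(lo);
}
```

Starting from 0x00030003 (the word value 3), the 32-bit negate gives 0xfffcfffd, whose two words differ, while negate_w_imm gives 0xfffdfffd.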
Juan A. Suarez Romero
6ca758f6b6 cherry-ignore: add explicit 18.1 only nominations
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-05-15 11:14:09 +02:00
Matthew Nicholls
2e97e1ea02 radv: fix multisample image copies
Before fb077b0728, the LOD parameter was being used in place of the
sample index, which would only copy the first sample to all samples in the
destination image. After that multisample image copies wouldn't copy anything
from my observations.

This fixes some copy_and_blit CTS tests.

v3.1: - set lod to 0 for nir_txf_ms (Samuel)
v2: - use GLSL_SAMPLER_DIM_MS instead of 2D (Samuel)
    - updated commit description (Samuel)

Fix this properly by copying each sample in a separate radv_CmdDraw and using a
pipeline with the correct rasterizationSamples for the destination image.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 97d57ef917)
2018-05-07 16:32:54 +02:00
Juan A. Suarez Romero
ae12c5e990 docs: add sha256 checksums for 18.0.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-05-07 11:18:19 +00:00
Juan A. Suarez Romero
6dc2658fd6 docs: add sha256 checksums for 18.0.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-05-07 10:19:36 +00:00
Juan A. Suarez Romero
5831836987 Update version to 18.0.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-05-07 10:09:56 +00:00
Boyuan Zhang
5d3caa1ca4 radeon/vcn: fix mpeg4 msg buffer settings
The previous bit-field assignments are incorrect and cause certain mpeg4
decodes to fail because of wrong flag values. This patch fixes these assignments.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
(cherry picked from commit deba56accf)
2018-05-02 12:15:05 +02:00
Nanley Chery
8f97e56947 i965/tex_image: Avoid the ASTC LDR workaround on gen9lp
Both the internal documentation and the results of testing this in the
CI suggest that this is unnecessary. Add the fixes tag because this
reduces an internal benchmark's startup time by about 17 seconds
(reported by Eero).

Fixes: 710b1d2e66 "i965/tex_image: Flush certain subnormal ASTC channel values"
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 3e56e4642f)
2018-05-02 12:15:05 +02:00
Samuel Pitoiset
97841a8f02 radv: compute the number of subpass attachments correctly
Only count color attachments twice if resolves are used, also
account for the depth stencil attachment if present.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit d8db5986ce)
2018-05-02 12:15:05 +02:00
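Sketch for 97841a8f02 (radv: compute the number of subpass attachments correctly): the counting rule in the message can be written as a one-line formula. The function below is hypothetical and covers only the two terms the message names (color attachments and the depth stencil attachment). Any other attachment kinds handled by the real code are left out.

```c
#include <stdbool.h>

/* Hypothetical helper: color attachments count twice only when resolve
 * attachments are in use, and the depth stencil attachment adds one more
 * when present. */
unsigned count_subpass_attachments(unsigned color_count, bool has_resolve,
                                   bool has_depth_stencil)
{
   unsigned count = color_count;

   if (has_resolve)
      count += color_count;
   if (has_depth_stencil)
      count += 1;
   return count;
}
```

For example, four color attachments with resolves and a depth stencil attachment give 9 under this rule, and four color attachments alone give 4.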
Andres Rodriguez
5a7de46492 radv/winsys: fix leaking resources from bo's imported by fd
A bo's ref_count was not being initialized when imported from an fd.
Therefore, we would fail to free the resource during VkFreeMemory().

This patch fixes applications like hifi VR in threaded mode, which
perform frequent imports/releases of IPC shared memory.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f56e22e496)
2018-05-02 12:15:05 +02:00
Leo Liu
1a23971b49 st/omx/enc: fix blit setup for YUV LoadImage
The blit here involves scaling since it's copying from I8 format to R8G8 format.
Half of the source will be filtered out by PIPE_TEX_FILTER_NEAREST, and it
looks like the GPU always uses the second half as the source. Currently we use
"1" as the start point of x for R, which shifts the U component one source
pixel to the right. So "-1" should be the start point for the U component.

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 1c5f4f4e17)
[Juan A. Suarez: apply patch in
src/gallium/state_trackers/omx_bellagio/vid_enc.c]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/gallium/state_trackers/omx/vid_enc_common.c
2018-04-30 18:42:39 +02:00
Eric Anholt
c0aeac1536 gallium/util: Fix incorrect refcounting of separate stencil.
The driver may have a reference on the separate stencil buffer for some
reason (like an unflushed job using it), so we can't directly free the
resource and should instead just decrement the refcount that we own.
Fixes double-free in KHR-GLES3.packed_depth_stencil.blit.depth32f_stencil8
on vc5.

Fixes: e94eb5e600 ("gallium/util: add u_transfer_helper")
Reviewed-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 069c409f43)
2018-04-30 18:42:39 +02:00
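Sketch for c0aeac1536 (gallium/util: Fix incorrect refcounting of separate stencil.): the difference between freeing a shared object directly and dropping only the reference one owns can be shown with a toy reference count. The struct and functions here are hypothetical, are not the Gallium helpers, and use a plain int where real code would need an atomic counter.

```c
#include <stdlib.h>

/* Toy reference-counted object standing in for a shared resource. */
struct resource {
   int refcount;
};

struct resource *resource_create(void)
{
   struct resource *r = malloc(sizeof(*r));
   if (r)
      r->refcount = 1;
   return r;
}

void resource_ref(struct resource *r)
{
   r->refcount++;
}

/* Drop the caller's reference and clear its pointer. The object is freed
 * only when the last holder lets go. Returns 1 if it was freed. */
int resource_unref(struct resource **ptr)
{
   struct resource *r = *ptr;
   int destroyed = 0;

   *ptr = NULL;
   if (r && --r->refcount == 0) {
      free(r);
      destroyed = 1;
   }
   return destroyed;
}
```

If a second holder (the driver, in the commit's scenario) has taken its own reference, the first resource_unref call leaves the object alive and only the second one frees it, so no double free can occur.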
Marek Olšák
1fccd6736a radeonsi/gfx9: workaround for INTERP with indirect indexing
and clean up the conditions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6d19120da8)
2018-04-30 18:42:39 +02:00
Marek Olšák
001f7ac65c util/u_queue: fix a deadlock in util_queue_finish
Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 7083ac7290)
2018-04-30 18:42:39 +02:00
Jason Ekstrand
3e5dfc0537 anv/allocator: Don't shrink either end of the block pool
Previously, we only tried to ensure that we didn't shrink either end
below what was already handed out.  However, due to the way we handle
relocations with block pools, we can't shrink the back end at all.  It's
probably best to not shrink in either direction.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105374
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106147
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 3db93f9128)
2018-04-30 18:42:38 +02:00
Juan A. Suarez Romero
16a3264c32 cherry-ignore: add explicit 18.1 only nominations
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-04-30 18:42:38 +02:00
Juan A. Suarez Romero
b3eed3ad03 docs: add sha256 checksums for 18.0.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-04-28 16:57:30 +00:00
Juan A. Suarez Romero
d38da7bd2d docs: add release notes for 18.0.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-04-28 16:22:11 +00:00
Juan A. Suarez Romero
ff629ffcd3 Update version to 18.0.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-04-28 16:16:16 +00:00
Dylan Baker
53ff157c33 meson: don't build classic mesa tests without dri_drivers
Since mesa_classic is build-on-demand the tests will create a demand and
add a bunch of extra compilation.

Fixes: 43a6e84927
       ("meson: build mesa test.")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit aaab624245)
2018-04-25 14:04:53 +02:00
Samuel Pitoiset
3b9b66560a radv/winsys: allow to submit up to 4 IBs for chips without chaining
The SI family doesn't support chaining which means the maximum
size in dwords per CS is limited. When that limit was reached
we failed to submit the CS and the application crashed.

This patch allows submitting up to 4 IBs, which is currently the
limit, although recent amdgpu supports more than that.

Please note that we can reach the limit of 4 IBs per submit
but currently we can't improve that. The only solution is to
upgrade libdrm. That will be improved later but for now this
should fix crashes on SI or when using RADV_DEBUG=noibs.

Fixes: 36cb5508e8 ("radv/winsys: Fail early on overgrown cs.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105775
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-04-25 14:04:53 +02:00
Ian Romanick
60c5cf011d intel/compiler: Add scheduler deps for instructions that implicitly read g0
Otherwise the scheduler can move the writes after the reads.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95009
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95012
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Cc: Clayton A Craft <clayton.a.craft@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 0d5ce25c1c)
2018-04-25 14:04:53 +02:00
Dylan Baker
95d88ba0da bin/install_megadrivers: fix DESTDIR and -D*-path
This fixes -Ddri-drivers-path, -Dvdpau-libs-path, etc. with DESTDIR when
those paths are absolute. Currently due to the way python's os.path.join
handles absolute paths these will ignore DESTDIR, which is bad. This
fixes them to be relative to DESTDIR if that is set.

Fixes: 3218056e0e
       ("meson: Build i965 and dri stack")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
(cherry picked from commit ae3f45c11e)
2018-04-24 11:02:38 +02:00
Marek Olšák
6bd2fba19d Revert "st/dri: Fix dangling pointer to a destroyed dri_drawable"
This reverts commit dab02dea34.

It causes crashes of qtcreator and firefox.

Fixes: dab02de "st/dri: Fix dangling pointer to a destroyed dri_drawable"

Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4559aefb5c)
2018-04-24 11:02:38 +02:00
Jason Ekstrand
ead5bf4f6a i965/fs: Return mlen * 8 for size_read() for INTERPOLATE_AT_*
They are send messages and this makes size_read() and mlen agree.  For
both of these opcodes, the payload is just a dummy so mlen == 1 and this
should decrease register pressure a bit.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit de1f22d595)
2018-04-24 11:02:38 +02:00
Juan A. Suarez Romero
d45bb9f505 cherry-ignore: add explicit 18.1 only nominations
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-04-24 11:02:37 +02:00
Johan Klokkhammer Helsing
d75054d0d0 st/dri: Fix dangling pointer to a destroyed dri_drawable
If an EGLSurface is created, made current and destroyed, and then a second
EGLSurface is created, the second malloc in driCreateNewDrawable may
return the same pointer address the first surface's drawable had.
Consequently, when dri_make_current later tries to determine if it should
update the texture_stamp it compares the surface's drawable pointer against
the drawable in the last call to dri_make_current and assumes it's the same
surface (which it isn't).

When texture_stamp is left unset, then dri_st_framebuffer_validate thinks
it has already called update_drawable_info for that drawable, leaving it
unvalidated, and this is when bad things start to happen. In my case it
manifested itself by the width and height of the surface being unset.

This is fixed by setting the pointer to NULL before freeing the
surface.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106126
Signed-off-by: Johan Klokkhammer Helsing <johan.helsing@qt.io>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dab02dea34)
2018-04-23 12:18:53 +02:00
Lucas Stach
7673c72f3d etnaviv: fix texture_format_needs_swiz
memcmp returns 0 when both swizzles are the same, which means we don't
need any hardware swizzling. texture_format_needs_swiz should return
true when the return value of the memcmp is non-zero.

Fixes: 751ae6afbe ("etnaviv: add support for swizzled texture formats")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
(cherry picked from commit 52e93e309f)
2018-04-23 12:18:53 +02:00
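Sketch for 7673c72f3d (etnaviv: fix texture_format_needs_swiz): memcmp returns 0 on equality, so a "needs swizzle" predicate has to test for a non-zero result. The example below is self-contained and does not use the etnaviv tables. The identity_swiz array and the needs_swiz function are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical identity swizzle: channel i reads from channel i. */
static const unsigned char identity_swiz[4] = { 0, 1, 2, 3 };

/* True when the given swizzle differs from the identity, i.e. when
 * memcmp reports a difference with a non-zero return value. */
bool needs_swiz(const unsigned char swiz[4])
{
   return memcmp(swiz, identity_swiz, sizeof(identity_swiz)) != 0;
}
```

Passing the identity order returns false, and a reordered swizzle such as { 2, 1, 0, 3 } returns true. Returning the raw memcmp equality instead would invert both answers.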
Bas Nieuwenhuizen
264cda58ab radv: Mark GTT memory as device local for APUs.
Otherwise a lot of games complain about not having enough memory,
and it is sort of local so this seems reasonable to me.

CC: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit e1df849c3c)
2018-04-23 12:18:53 +02:00
Juan A. Suarez Romero
40ed4b0285 travis: radv needs LLVM 4.0
This is a backport for 18.0 from 6ce400782c ("travis: radeonsi and radv
need LLVM 4.0") that fixes Travis build with meson + vulkan.

CC: 18.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2018-04-23 09:56:53 +00:00
Kenneth Graunke
251a36d629 i965: Fix shadow batches to be the same size as the real BO.
brw_bo_alloc may round up our allocation size to the next bucket size.
In this case, we would malloc a shadow buffer that was the original
intended size, but use bo->size (the larger size) for all of our checks.

This could cause us to run off the end of the shadow buffer.

v2: Actually use the new BO size (caught by Lionel)

Reported-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c7dcee58b5 (i965: Avoid problems from referencing orphaned BOs after growing.)
(cherry picked from commit da25ae92be)
2018-04-23 09:56:53 +00:00
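Sketch for 251a36d629 (i965: Fix shadow batches to be the same size as the real BO.): the overrun comes from allocating the shadow copy with the requested size while bounds checks use the rounded-up BO size. The code below is a toy model. The bucket policy in round_to_bucket (powers of two from 4096) and all struct and function names are assumptions for illustration, not the brw_bo_alloc behaviour.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed bucket policy for illustration only: round the request up to
 * the next power of two, starting at 4096 bytes. */
uint64_t round_to_bucket(uint64_t requested)
{
   uint64_t size = 4096;

   while (size < requested)
      size *= 2;
   return size;
}

struct fake_bo {
   uint64_t size; /* size after rounding, used for all bounds checks */
};

struct fake_batch {
   struct fake_bo bo;
   void *shadow; /* CPU-side copy that must cover bo.size bytes */
};

/* Allocate the shadow buffer from the final BO size rather than from the
 * requested size, so both always agree. Returns 1 on success. */
int batch_init(struct fake_batch *batch, uint64_t requested)
{
   batch->bo.size = round_to_bucket(requested);
   batch->shadow = malloc((size_t)batch->bo.size);
   return batch->shadow != NULL;
}
```

With this toy policy a request for 5000 bytes produces an 8192-byte BO, and a shadow buffer sized from the request would be 3192 bytes short of what the checks allow.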
Samuel Pitoiset
b62b3eb259 radv: fix scissor computation when using half-pixel viewport offset
'scale[i]' can be non-integer.

Original patch by Philip Rebohle.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106074
Fixes: 0f3de89a56 ("radv: Use the guard band.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 893e19efb7)
2018-04-23 09:56:53 +00:00
Lionel Landwerlin
f581dc608b anv: fix number of planes for depth & stencil
We're not counting correctly with depth & stencil images.

Additionally we need to move an assert that is meant just for color
attachments.

v2: Move an assert() (Reported by Craig)
    Change aspect mask checks (Francesco)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a62a979335 ("anv: enable multiple planes per image/imageView")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105994
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 0a6547014f)
2018-04-23 09:56:53 +00:00
Timothy Arceri
e1b87631a9 mesa: free debug messages when destroying the debug state
Fixes: 04a8baad37 "mesa: refactor _mesa_PopDebugGroup and _mesa_free_errors_data"

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98281
(cherry picked from commit a63e69f5f0)
2018-04-23 09:56:53 +00:00
Thomas Hellstrom
279c628560 svga: Fix incorrect advertizing of EGL_KHR_gl_colorspace
When advertizing this extension, egl_dri2 uses the DRI2_RENDERER_QUERY
extension to query whether an sRGB format is supported. That extension will
query our driver with the BIND flag PIPE_BIND_RENDER_TARGET rather than
PIPE_BIND_DISPLAY_TARGET which is used when building the configs.
We only return the correct value for PIPE_BIND_DISPLAY_TARGET.

The inconsistency causes EGL to crash at surface initialization if sRGB is
not supported. Fix this by supporting both bind flags.

Testing done:
piglit egl_gl_colorspace srgb

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit e0c08183fb)
2018-04-23 09:56:53 +00:00
Marek Olšák
e7709adf7a glsl_to_tgsi: try harder to lower unsupported ir_binop_vector_extract
This fixes some piglits.

Cc: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 7bd24d951a)
2018-04-23 09:56:53 +00:00
Marek Olšák
cd52573fac radeonsi/gfx9: fix a hang with an empty first IB
This packet causes the no-op IB detection to fail, so the IB is always
submitted. Also fix the no-op IB detection by moving the begin call.

Cc: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-23 09:53:45 +00:00
Bas Nieuwenhuizen
5edd3192e7 ac/nir: Make the GFX9 buffer size fix apply to image loads/atomics too.
No clue how I missed those ...

Fixes: 4503ff760c "ac/nir: Add workaround for GFX9 buffer views."
CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105320
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit b0e3a9b19f)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/amd/common/ac_nir_to_llvm.c
2018-04-23 11:19:32 +02:00
Juan A. Suarez Romero
a1c421c638 docs: add sha256 checksums for 18.0.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-04-18 15:25:00 +00:00
62 changed files with 1339 additions and 365 deletions

@@ -39,12 +39,12 @@ matrix:
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
- llvm-toolchain-trusty-4.0
packages:
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
- llvm-4.0-dev
# Common
- xz-utils
- libexpat1-dev

@@ -1 +1 @@
18.0.1
18.0.4

@@ -21,3 +21,47 @@ ac4437b20b87c7285b89466f05b51518ae616873 automake: small cleanup after the meson
# b2f2236dc565dd1460f0 and c62cf1f165919bc74296 which did not land in
# branch.
880c1718b6d14b33fe5ba918af70fea5be890c6b omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA}
# stable: There is a specific port for this patch for stable branch.
d15fb766aa3c98ffbe16d050b2af4804e4b12c57 radeonsi/gfx9: fix a hang with an empty first IB
# stable: Explicit 18.1 only nominations
0e945fdf23bac5a62c15edfcbfd9d6ac4eee592f nir: Do not use progress for unreachable code in return lowering.
84fef802fb16cef68ec358cbfed1cac9c3bfa410 ac/nir: add missing round_slice for 1D arrays
d136a5fad9c7e67c1362453388914ecc60420883 ac: fix the number of coordinates for ac_image_get_lod and arrays
# stable: There is a specific port for this patch for stable branch.
fedd0a4215bcd387525000d76b77993ca38916ae radv/winsys: allow to submit up to 4 IBs for chips without chaining
# stable: Explicit 18.1 only nominations
413c5ca3727898fdb4fa1d2849d0c2defdd77b48 travis: update libva required version
a6fbefa67b5b0ed1ee42a9034ee74dfaed1c389a radv: fix DCC enablement since partial MSAA implementation
d7ffe3b384f4d1c15a9364768cf405d416522e60 radv: set ac_surf_info::num_channels correctly
d38425ce872c4a00cfb691ae9dceca6a07afc516 ac: fix texture query LOD for 1D textures on GFX9
4d449c94e450c33d7b2b09c1c263322042503893 autotools, meson: bump up required VA version
# stable: Explicit 18.1 only nominations
9267ff9883f749dd1708c573c0df4b46687ff973 radv: Allow vkEnumerateInstanceVersion ProcAddr without instance.
467c562a292b4424f24381932b90bcb9869c3d73 radv: Don't check the incoming apiVersion on CreateInstance.
b17cfb08a3fc9a599eff64fffe48daba398a672f vulkan/wsi: Only use LINEAR modifier for prime if supported.
597b9e881083533b987dbcbb8f679ca1eefff974 radeonsi/gfx9: work around a GPU hang due to broken indirect indexing in LLVM
62f50df7b79c273a0eb9bf769eded76933bddc3a radv: Fix multiview queries.
# stable: The commit requires earlier commit ba79a90fb52 which did not land in
# branch.
901db25d5b7cd2ac2dd648b370c4bddf23dd5c44 glsl: change ast_type_qualifier bitset size to work around GCC 5.4 bug
# stable: The commit fixes earlier commit d5f42f96e16 which did not land in
# branch.
d07466fe18522cde1acadfc597583f80b69c15b7 mesa: fix glGetInteger/Float/etc queries for vertex arrays attribs
# stable: The commit fixes earlier commit d07466fe18522 which did not land in
# branch.
e4211b36bba4acde3e56ce1e22b12759e820a241 mesa: revert GL_[SECONDARY_]COLOR_ARRAY_SIZE glGet type to TYPE_INT
# stable: The commit requires earlier commits ab0e625a671 and 62510846b6e which
# did not land in branch.
b16fc6cda11576a4dd6c8d95f7bee94121c4b8e7 radv/resolve: do fmask decompress on all layers.
# stable: There is a specific port for this patch for stable branch.
3d4d388e3929d7948b62d90867357aecbfba5aeb radv: Fix up 2_10_10_10 alpha sign.

@@ -1,6 +1,6 @@
#!/usr/bin/env python
# encoding=utf-8
# Copyright © 2017 Intel Corporation
# Copyright © 2017-2018 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
@@ -35,7 +35,11 @@ def main():
parser.add_argument('drivers', nargs='+')
args = parser.parse_args()
to = os.path.join(os.environ.get('MESON_INSTALL_DESTDIR_PREFIX'), args.libdir)
if os.path.isabs(args.libdir):
to = os.path.join(os.environ.get('DESTDIR', '/'), args.libdir[1:])
else:
to = os.path.join(os.environ['MESON_INSTALL_DESTDIR_PREFIX'], args.libdir)
master = os.path.join(to, os.path.basename(args.megadriver))
if not os.path.exists(to):

@@ -31,7 +31,8 @@ because compatibility contexts are not supported.
<h2>SHA256 checksums</h2>
<pre>
TBD
0c93ba892c0610f5dd87f2e2673b9445187995c395b3ddb33fd4260bfb291e89 mesa-18.0.1.tar.gz
b2d2f5b5dbaab13e15cb0dcb5ec81887467f55ebc9625945b303a3647cd87954 mesa-18.0.1.tar.xz
</pre>

docs/relnotes/18.0.2.html (new file, 144 lines)

@@ -0,0 +1,144 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.0.2 Release Notes / April 28, 2018</h1>
<p>
Mesa 18.0.2 is a bug fix release which fixes bugs found since the 18.0.1 release.
</p>
<p>
Mesa 18.0.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
SHA256: ffd8dfe3337b474a3baa085f0e7ef1a32c7cdc3bed1ad810b2633919a9324840 mesa-18.0.2.tar.gz
SHA256: 98fa159768482dc568b9f8bf0f36c7acb823fa47428ffd650b40784f16b9e7b3 mesa-18.0.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95009">Bug 95009</a> - [SNB] amd_shader_trinary_minmax.execution.built-in-functions.gs-mid3-ivec2-ivec2-ivec2 intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95012">Bug 95012</a> - [SNB] glsl-1_50.execution.built-in-functions.gs-op tests intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98281">Bug 98281</a> - 'message's in ctx-&gt;Debug.LogMessages[] seem to leak.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105320">Bug 105320</a> - Storage texel buffer access produces wrong results (RX Vega)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105775">Bug 105775</a> - SI reaches the maximum IB size in dwords and fail to submit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105994">Bug 105994</a> - surface state leak when creating and destroying image views with aspectMask depth and stencil</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106074">Bug 106074</a> - radv: si_scissor_from_viewport returns incorrect result when using half-pixel viewport offset</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106126">Bug 106126</a> - eglMakeCurrent does not always ensure dri_drawable-&gt;update_drawable_info has been called for a new EGLSurface if another has been created and destroyed first</li>
</ul>
<h2>Changes</h2>
<p>Bas Nieuwenhuizen (2):</p>
<ul>
<li>ac/nir: Make the GFX9 buffer size fix apply to image loads/atomics too.</li>
<li>radv: Mark GTT memory as device local for APUs.</li>
</ul>
<p>Dylan Baker (2):</p>
<ul>
<li>bin/install_megadrivers: fix DESTDIR and -D*-path</li>
<li>meson: don't build classic mesa tests without dri_drivers</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>intel/compiler: Add scheduler deps for instructions that implicitly read g0</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965/fs: Return mlen * 8 for size_read() for INTERPOLATE_AT_*</li>
</ul>
<p>Johan Klokkhammer Helsing (1):</p>
<ul>
<li>st/dri: Fix dangling pointer to a destroyed dri_drawable</li>
</ul>
<p>Juan A. Suarez Romero (4):</p>
<ul>
<li>docs: add sha256 checksums for 18.0.1</li>
<li>travis: radv needs LLVM 4.0</li>
<li>cherry-ignore: add explicit 18.1 only nominations</li>
<li>Update version to 18.0.2</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Fix shadow batches to be the same size as the real BO.</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>anv: fix number of planes for depth &amp; stencil</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>etnaviv: fix texture_format_needs_swiz</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>radeonsi/gfx9: fix a hang with an empty first IB</li>
<li>glsl_to_tgsi: try harder to lower unsupported ir_binop_vector_extract</li>
<li>Revert "st/dri: Fix dangling pointer to a destroyed dri_drawable"</li>
</ul>
<p>Samuel Pitoiset (2):</p>
<ul>
<li>radv: fix scissor computation when using half-pixel viewport offset</li>
<li>radv/winsys: allow to submit up to 4 IBs for chips without chaining</li>
</ul>
<p>Thomas Hellstrom (1):</p>
<ul>
<li>svga: Fix incorrect advertizing of EGL_KHR_gl_colorspace</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>mesa: free debug messages when destroying the debug state</li>
</ul>
</div>
</body>
</html>

docs/relnotes/18.0.3.html (new file, 107 lines)

@@ -0,0 +1,107 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.0.3 Release Notes / May 7, 2018</h1>
<p>
Mesa 18.0.3 is a bug fix release which fixes bugs found since the 18.0.2 release.
</p>
<p>
Mesa 18.0.3 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
58cc5c5b1ab2a44e6e47f18ef6c29836ad06f95450adce635ce3c317507a171b mesa-18.0.3.tar.gz
099d9667327a76a61741a533f95067d76ea71a656e66b91507b3c0caf1d49e30 mesa-18.0.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105374">Bug 105374</a> - texture3d, a SaschaWillems demo, assert fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106147">Bug 106147</a> - SIGBUS in write_reloc() when Sacha Willems' &quot;texture3d&quot; Vulkan demo starts</li>
</ul>
<h2>Changes</h2>
<p>Andres Rodriguez (1):</p>
<ul>
<li>radv/winsys: fix leaking resources from bo's imported by fd</li>
</ul>
<p>Boyuan Zhang (1):</p>
<ul>
<li>radeon/vcn: fix mpeg4 msg buffer settings</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>gallium/util: Fix incorrect refcounting of separate stencil.</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>anv/allocator: Don't shrink either end of the block pool</li>
</ul>
<p>Juan A. Suarez Romero (3):</p>
<ul>
<li>docs: add sha256 checksums for 18.0.2</li>
<li>cherry-ignore: add explicit 18.1 only nominations</li>
<li>Update version to 18.0.3</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>st/omx/enc: fix blit setup for YUV LoadImage</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>util/u_queue: fix a deadlock in util_queue_finish</li>
<li>radeonsi/gfx9: workaround for INTERP with indirect indexing</li>
</ul>
<p>Nanley Chery (1):</p>
<ul>
<li>i965/tex_image: Avoid the ASTC LDR workaround on gen9lp</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: compute the number of subpass attachments correctly</li>
</ul>
</div>
</body>
</html>

docs/relnotes/18.0.4.html (new file, 156 lines)

@@ -0,0 +1,156 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 18.0.4 Release Notes / May 17, 2018</h1>
<p>
Mesa 18.0.4 is a bug fix release which fixes bugs found since the 18.0.3 release.
</p>
<p>
Mesa 18.0.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91808">Bug 91808</a> - trine1 misrender r600g</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100430">Bug 100430</a> - [radv] graphical glitches on dolphin emulator</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106243">Bug 106243</a> - [kbl] GPU HANG: 9:0:0x85dffffb, in Cinnamon</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106480">Bug 106480</a> - A2B10G10R10_SNORM vertex attribute doesn't work.</li>
</ul>
<h2>Changes</h2>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>radv: Translate logic ops.</li>
<li>radv: Fix up 2_10_10_10 alpha sign.</li>
<li>radv: Disable texel buffers with A2 SNORM/SSCALED/SINT for pre-vega.</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>r600: fix constant buffer bounds.</li>
<li>radv: resolve all layers in compute resolve path.</li>
<li>radv: use compute path for multi-layer images.</li>
</ul>
<p>Deepak Rawat (1):</p>
<ul>
<li>egl/x11: Send invalidate to driver on copy_region path in swap_buffer</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>mesa: Add missing support for glFogiv(GL_FOG_DISTANCE_MODE_NV)</li>
</ul>
<p>Jan Vesely (8):</p>
<ul>
<li>clover: Add explicit virtual destructor to argument class</li>
<li>eg/compute: Drop reference on code_bo in destructor.</li>
<li>r600: Cleanup constant buffers on context destruction</li>
<li>eg/compute: Drop reference to kernel_param bo in destructor</li>
<li>pipe-loader: Free driver_name in error path</li>
<li>gallium/auxiliary: Add helper function to count the number of entries in hash table</li>
<li>winsys/radeon: Destroy fd_hash table when the last winsys is removed.</li>
<li>winsys/amdgpu: Destroy dev_hash table when the last winsys is removed.</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965,anv: Set the CS stall bit on the ISP disable PIPE_CONTROL</li>
</ul>
<p>Jose Maria Casanova Crespo (2):</p>
<ul>
<li>intel/compiler: fix 16-bit int brw_negate_immediate and brw_abs_immediate</li>
<li>intel/compiler: fix brw_imm_w for negative 16-bit integers</li>
</ul>
<p>Juan A. Suarez Romero (7):</p>
<ul>
<li>docs: add sha256 checksums for 18.0.3</li>
<li>cherry-ignore: add explicit 18.1 only nominations</li>
<li>cherry-ignore: glsl: change ast_type_qualifier bitset size to work around GCC 5.4 bug</li>
<li>cherry-ignore: mesa: fix glGetInteger/Float/etc queries for vertex arrays attribs</li>
<li>cherry-ignore: mesa: revert GL_[SECONDARY_]COLOR_ARRAY_SIZE glGet type to TYPE_INT</li>
<li>cherry-ignore: radv/resolve: do fmask decompress on all layers.</li>
<li>Update version to 18.0.4</li>
</ul>
<p>Kai Wasserbäch (1):</p>
<ul>
<li>opencl: autotools: Fix linking order for OpenCL target</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Don't leak blorp on Gen4-5.</li>
</ul>
<p>Lionel Landwerlin (2):</p>
<ul>
<li>i965: require pixel scoreboard stall prior to ISP disable</li>
<li>anv: emit pixel scoreboard stall before ISP disable</li>
</ul>
<p>Matthew Nicholls (1):</p>
<ul>
<li>radv: fix multisample image copies</li>
</ul>
<p>Neil Roberts (1):</p>
<ul>
<li>spirv: Apply OriginUpperLeft to FragCoord</li>
</ul>
<p>Rhys Perry (1):</p>
<ul>
<li>mesa: fix error handling in get_framebuffer_parameteriv</li>
</ul>
<p>Ross Burton (1):</p>
<ul>
<li>src/intel/Makefile.vulkan.am: add missing MKDIR_GEN</li>
</ul>
</div>
</body>
</html>

@@ -3617,6 +3617,25 @@ static LLVMValueRef get_image_coords(struct ac_nir_context *ctx,
return res;
}
static LLVMValueRef get_image_buffer_descriptor(struct ac_nir_context *ctx,
const nir_intrinsic_instr *instr, bool write)
{
LLVMValueRef rsrc = get_sampler_desc(ctx, instr->variables[0], AC_DESC_BUFFER, NULL, true, write);
if (ctx->abi->gfx9_stride_size_workaround) {
LLVMValueRef elem_count = LLVMBuildExtractElement(ctx->ac.builder, rsrc, LLVMConstInt(ctx->ac.i32, 2, 0), "");
LLVMValueRef stride = LLVMBuildExtractElement(ctx->ac.builder, rsrc, LLVMConstInt(ctx->ac.i32, 1, 0), "");
stride = LLVMBuildLShr(ctx->ac.builder, stride, LLVMConstInt(ctx->ac.i32, 16, 0), "");
LLVMValueRef new_elem_count = LLVMBuildSelect(ctx->ac.builder,
LLVMBuildICmp(ctx->ac.builder, LLVMIntUGT, elem_count, stride, ""),
elem_count, stride, "");
rsrc = LLVMBuildInsertElement(ctx->ac.builder, rsrc, new_elem_count,
LLVMConstInt(ctx->ac.i32, 2, 0), "");
}
return rsrc;
}
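
The descriptor fixup in `get_image_buffer_descriptor` can be read as plain integer arithmetic: take dword 2 as the element count, take dword 1 shifted right by 16 as the stride, and store the larger of the two back into dword 2. The sketch below models the descriptor as a list of four 32-bit integers; that representation and the function name `apply_gfx9_stride_workaround` are assumptions made for illustration, not part of the driver.

```python
def apply_gfx9_stride_workaround(rsrc):
    """Return a copy of a 4-dword buffer descriptor with dword 2 fixed up.

    Mirrors the select in the diff: keep elem_count when it is (unsigned)
    greater than the stride, otherwise use the stride.
    """
    elem_count = rsrc[2] & 0xFFFFFFFF
    # Same shift as the LLVMBuildLShr call: dword 1 >> 16, no extra masking.
    stride = (rsrc[1] & 0xFFFFFFFF) >> 16
    fixed = list(rsrc)
    fixed[2] = elem_count if elem_count > stride else stride
    return fixed
```

For a descriptor whose dword 1 encodes a stride of 16 and whose dword 2 is 4, the result carries 16 in dword 2; a descriptor whose count already exceeds the stride is returned unchanged.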
static LLVMValueRef visit_image_load(struct ac_nir_context *ctx,
const nir_intrinsic_instr *instr)
{
@@ -3631,7 +3650,7 @@ static LLVMValueRef visit_image_load(struct ac_nir_context *ctx,
type = glsl_without_array(type);
if (glsl_get_sampler_dim(type) == GLSL_SAMPLER_DIM_BUF) {
params[0] = get_sampler_desc(ctx, instr->variables[0], AC_DESC_BUFFER, NULL, true, false);
params[0] = get_image_buffer_descriptor(ctx, instr, false);
params[1] = LLVMBuildExtractElement(ctx->ac.builder, get_src(ctx, instr->src[0]),
ctx->ac.i32_0, ""); /* vindex */
params[2] = ctx->ac.i32_0; /* voffset */
@@ -3693,20 +3712,7 @@ static void visit_image_store(struct ac_nir_context *ctx,
glc = ctx->ac.i1true;
if (glsl_get_sampler_dim(type) == GLSL_SAMPLER_DIM_BUF) {
LLVMValueRef rsrc = get_sampler_desc(ctx, instr->variables[0], AC_DESC_BUFFER, NULL, true, true);
if (ctx->abi->gfx9_stride_size_workaround) {
LLVMValueRef elem_count = LLVMBuildExtractElement(ctx->ac.builder, rsrc, LLVMConstInt(ctx->ac.i32, 2, 0), "");
LLVMValueRef stride = LLVMBuildExtractElement(ctx->ac.builder, rsrc, LLVMConstInt(ctx->ac.i32, 1, 0), "");
stride = LLVMBuildLShr(ctx->ac.builder, stride, LLVMConstInt(ctx->ac.i32, 16, 0), "");
LLVMValueRef new_elem_count = LLVMBuildSelect(ctx->ac.builder,
LLVMBuildICmp(ctx->ac.builder, LLVMIntUGT, elem_count, stride, ""),
elem_count, stride, "");
rsrc = LLVMBuildInsertElement(ctx->ac.builder, rsrc, new_elem_count,
LLVMConstInt(ctx->ac.i32, 2, 0), "");
}
LLVMValueRef rsrc = get_image_buffer_descriptor(ctx, instr, true);
params[0] = ac_to_float(&ctx->ac, get_src(ctx, instr->src[2])); /* data */
params[1] = rsrc;
@@ -3801,8 +3807,7 @@ static LLVMValueRef visit_image_atomic(struct ac_nir_context *ctx,
params[param_count++] = get_src(ctx, instr->src[2]);
if (glsl_get_sampler_dim(type) == GLSL_SAMPLER_DIM_BUF) {
params[param_count++] = get_sampler_desc(ctx, instr->variables[0], AC_DESC_BUFFER,
NULL, true, true);
params[param_count++] = get_image_buffer_descriptor(ctx, instr, true);
params[param_count++] = LLVMBuildExtractElement(ctx->ac.builder, get_src(ctx, instr->src[0]),
ctx->ac.i32_0, ""); /* vindex */
params[param_count++] = ctx->ac.i32_0; /* voffset */
@@ -5330,6 +5335,48 @@ static void visit_cf_list(struct ac_nir_context *ctx,
}
}
/* For 2_10_10_10 formats the alpha is handled as unsigned by pre-vega HW,
* so we may need to fix it up. */
static LLVMValueRef
adjust_vertex_fetch_alpha(struct nir_to_llvm_context *ctx,
unsigned adjustment,
LLVMValueRef alpha)
{
if (adjustment == RADV_ALPHA_ADJUST_NONE)
return alpha;
LLVMValueRef c30 = LLVMConstInt(ctx->ac.i32, 30, 0);
if (adjustment == RADV_ALPHA_ADJUST_SSCALED)
alpha = LLVMBuildFPToUI(ctx->ac.builder, alpha, ctx->ac.i32, "");
else
alpha = ac_to_integer(&ctx->ac, alpha);
/* For the integer-like cases, do a natural sign extension.
*
* For the SNORM case, the values are 0.0, 0.333, 0.666, 1.0
* and happen to contain 0, 1, 2, 3 as the two LSBs of the
* exponent.
*/
alpha = LLVMBuildShl(ctx->ac.builder, alpha,
adjustment == RADV_ALPHA_ADJUST_SNORM ?
LLVMConstInt(ctx->ac.i32, 7, 0) : c30, "");
alpha = LLVMBuildAShr(ctx->ac.builder, alpha, c30, "");
/* Convert back to the right type. */
if (adjustment == RADV_ALPHA_ADJUST_SNORM) {
LLVMValueRef clamp;
LLVMValueRef neg_one = LLVMConstReal(ctx->ac.f32, -1.0);
alpha = LLVMBuildSIToFP(ctx->ac.builder, alpha, ctx->ac.f32, "");
clamp = LLVMBuildFCmp(ctx->ac.builder, LLVMRealULT, alpha, neg_one, "");
alpha = LLVMBuildSelect(ctx->ac.builder, clamp, neg_one, alpha, "");
} else if (adjustment == RADV_ALPHA_ADJUST_SSCALED) {
alpha = LLVMBuildSIToFP(ctx->ac.builder, alpha, ctx->ac.f32, "");
}
return alpha;
}
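
The shift trick in `adjust_vertex_fetch_alpha` may be easier to follow with concrete values. The sketch below emulates the 32-bit shift-left and arithmetic shift-right pair: for the integer-like cases the raw 2-bit alpha is moved to the top of the word with a shift of 30, and for SNORM the float's bit pattern is shifted by 7 so that the two low exponent bits land in bits 31 and 30. The helper names are assumptions for illustration only.

```python
import struct


def float_bits(x):
    """Bit pattern of x as an IEEE 754 single-precision float."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def sign_extend_top2(value, shift):
    """Emulate a 32-bit shl by `shift` followed by an arithmetic shr by 30."""
    shifted = (value << shift) & 0xFFFFFFFF
    if shifted & 0x80000000:
        shifted -= 1 << 32  # reinterpret as a signed 32-bit integer
    return shifted >> 30  # Python's >> is arithmetic on negative ints


def fix_alpha_int(raw):
    """SINT/SSCALED case: raw is the alpha read as an unsigned 0..3."""
    return sign_extend_top2(raw, 30)


def fix_alpha_snorm(alpha):
    """SNORM case: alpha is the unsigned-normalised float the HW returned."""
    value = sign_extend_top2(float_bits(alpha), 7)
    return max(float(value), -1.0)  # clamp -2 to -1, as the select does
```

Under these assumptions the unsigned readings 0, 1, 2, 3 become 0, 1, -2, -1, and the normalised readings 0.0, 1/3, 2/3, 1.0 become 0.0, 1.0, -1.0, -1.0.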
static void
handle_vs_input_decl(struct nir_to_llvm_context *ctx,
struct nir_variable *variable)
@@ -5339,14 +5386,15 @@ handle_vs_input_decl(struct nir_to_llvm_context *ctx,
LLVMValueRef t_list;
LLVMValueRef input;
LLVMValueRef buffer_index;
int index = variable->data.location - VERT_ATTRIB_GENERIC0;
int idx = variable->data.location;
unsigned attrib_count = glsl_count_attribute_slots(variable->type, true);
variable->data.driver_location = idx * 4;
variable->data.driver_location = variable->data.location * 4;
for (unsigned i = 0; i < attrib_count; ++i, ++idx) {
if (ctx->options->key.vs.instance_rate_inputs & (1u << (index + i))) {
for (unsigned i = 0; i < attrib_count; ++i) {
LLVMValueRef output[4];
unsigned attrib_index = variable->data.location + i - VERT_ATTRIB_GENERIC0;
if (ctx->options->key.vs.instance_rate_inputs & (1u << attrib_index)) {
buffer_index = LLVMBuildAdd(ctx->builder, ctx->abi.instance_id,
ctx->abi.start_instance, "");
if (ctx->options->key.vs.as_ls) {
@@ -5359,7 +5407,7 @@ handle_vs_input_decl(struct nir_to_llvm_context *ctx,
} else
buffer_index = LLVMBuildAdd(ctx->builder, ctx->abi.vertex_id,
ctx->abi.base_vertex, "");
t_offset = LLVMConstInt(ctx->ac.i32, index + i, false);
t_offset = LLVMConstInt(ctx->ac.i32, attrib_index, false);
t_list = ac_build_load_to_sgpr(&ctx->ac, t_list_ptr, t_offset);
@@ -5370,9 +5418,15 @@ handle_vs_input_decl(struct nir_to_llvm_context *ctx,
for (unsigned chan = 0; chan < 4; chan++) {
LLVMValueRef llvm_chan = LLVMConstInt(ctx->ac.i32, chan, false);
ctx->inputs[radeon_llvm_reg_index_soa(idx, chan)] =
ac_to_integer(&ctx->ac, LLVMBuildExtractElement(ctx->builder,
input, llvm_chan, ""));
output[chan] = LLVMBuildExtractElement(ctx->builder, input, llvm_chan, "");
}
unsigned alpha_adjust = (ctx->options->key.vs.alpha_adjust >> (attrib_index * 2)) & 3;
output[3] = adjust_vertex_fetch_alpha(ctx, alpha_adjust, output[3]);
for (unsigned chan = 0; chan < 4; chan++) {
ctx->inputs[radeon_llvm_reg_index_soa(variable->data.location + i, chan)] =
ac_to_integer(&ctx->ac, output[chan]);
}
}
}

@@ -39,8 +39,20 @@ struct radv_pipeline_layout;
struct ac_llvm_context;
struct ac_shader_abi;
enum {
RADV_ALPHA_ADJUST_NONE = 0,
RADV_ALPHA_ADJUST_SNORM = 1,
RADV_ALPHA_ADJUST_SINT = 2,
RADV_ALPHA_ADJUST_SSCALED = 3,
};
struct ac_vs_variant_key {
uint32_t instance_rate_inputs;
/* For 2_10_10_10 formats the alpha is handled as unsigned by pre-vega HW,
* so we may need to fix it up. */
uint64_t alpha_adjust;
uint32_t as_es:1;
uint32_t as_ls:1;
uint32_t export_prim_id:1;
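
The 64-bit `alpha_adjust` field holds one 2-bit `RADV_ALPHA_ADJUST_*` code per vertex attribute, which the shader side reads back with `(alpha_adjust >> (attrib_index * 2)) & 3`. A small sketch of that layout follows; the unpacking expression comes from the diff, while `pack_alpha_adjust` is a hypothetical helper shown only to make the layout concrete (the code that fills the key is not part of this excerpt).

```python
RADV_ALPHA_ADJUST_NONE = 0
RADV_ALPHA_ADJUST_SNORM = 1
RADV_ALPHA_ADJUST_SINT = 2
RADV_ALPHA_ADJUST_SSCALED = 3


def pack_alpha_adjust(adjust_by_attrib):
    """Hypothetical packer: {attrib_index: adjust code} -> 64-bit key field."""
    key = 0
    for attrib_index, adjust in adjust_by_attrib.items():
        if not 0 <= attrib_index < 32:
            raise ValueError("only 32 two-bit slots fit in 64 bits")
        key |= (adjust & 3) << (attrib_index * 2)
    return key


def unpack_alpha_adjust(key, attrib_index):
    """Same expression as in handle_vs_input_decl."""
    return (key >> (attrib_index * 2)) & 3
```

Attributes that need no fixup simply leave their two bits at zero, which is `RADV_ALPHA_ADJUST_NONE`.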

@@ -6892,34 +6892,22 @@
#define S_028808_ROP3(x) (((unsigned)(x) & 0xFF) << 16)
#define G_028808_ROP3(x) (((x) >> 16) & 0xFF)
#define C_028808_ROP3 0xFF00FFFF
#define V_028808_X_0X00 0x00
#define V_028808_X_0X05 0x05
#define V_028808_X_0X0A 0x0A
#define V_028808_X_0X0F 0x0F
#define V_028808_X_0X11 0x11
#define V_028808_X_0X22 0x22
#define V_028808_X_0X33 0x33
#define V_028808_X_0X44 0x44
#define V_028808_X_0X50 0x50
#define V_028808_X_0X55 0x55
#define V_028808_X_0X5A 0x5A
#define V_028808_X_0X5F 0x5F
#define V_028808_X_0X66 0x66
#define V_028808_X_0X77 0x77
#define V_028808_X_0X88 0x88
#define V_028808_X_0X99 0x99
#define V_028808_X_0XA0 0xA0
#define V_028808_X_0XA5 0xA5
#define V_028808_X_0XAA 0xAA
#define V_028808_X_0XAF 0xAF
#define V_028808_X_0XBB 0xBB
#define V_028808_X_0XCC 0xCC
#define V_028808_X_0XDD 0xDD
#define V_028808_X_0XEE 0xEE
#define V_028808_X_0XF0 0xF0
#define V_028808_X_0XF5 0xF5
#define V_028808_X_0XFA 0xFA
#define V_028808_X_0XFF 0xFF
#define V_028808_ROP3_CLEAR 0x00
#define V_028808_ROP3_NOR 0x11
#define V_028808_ROP3_AND_INVERTED 0x22
#define V_028808_ROP3_COPY_INVERTED 0x33
#define V_028808_ROP3_AND_REVERSE 0x44
#define V_028808_ROP3_INVERT 0x55
#define V_028808_ROP3_XOR 0x66
#define V_028808_ROP3_NAND 0x77
#define V_028808_ROP3_AND 0x88
#define V_028808_ROP3_EQUIVALENT 0x99
#define V_028808_ROP3_NO_OP 0xaa
#define V_028808_ROP3_OR_INVERTED 0xbb
#define V_028808_ROP3_COPY 0xcc
#define V_028808_ROP3_OR_REVERSE 0xdd
#define V_028808_ROP3_OR 0xee
#define V_028808_ROP3_SET 0xff
#define R_02880C_DB_SHADER_CONTROL 0x02880C
#define S_02880C_Z_EXPORT_ENABLE(x) (((unsigned)(x) & 0x1) << 0)
#define G_02880C_Z_EXPORT_ENABLE(x) (((x) >> 0) & 0x1)
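
The renamed `V_028808_ROP3_*` values are consistent with the usual ROP3 truth-table encoding, in which the source operand is represented by 0xCC and the destination by 0xAA, so applying a logic op bitwise to those two constants yields the register value. The sketch below checks that reading against the values in the diff; the dictionary layout and the name `rop3_code` are illustrative assumptions, not driver code.

```python
SRC = 0xCC  # truth-table column for the source (fragment) value
DST = 0xAA  # truth-table column for the destination (framebuffer) value

LOGIC_OPS = {
    "CLEAR": lambda s, d: 0,
    "NOR": lambda s, d: ~(s | d),
    "AND_INVERTED": lambda s, d: ~s & d,
    "COPY_INVERTED": lambda s, d: ~s,
    "AND_REVERSE": lambda s, d: s & ~d,
    "INVERT": lambda s, d: ~d,
    "XOR": lambda s, d: s ^ d,
    "NAND": lambda s, d: ~(s & d),
    "AND": lambda s, d: s & d,
    "EQUIVALENT": lambda s, d: ~(s ^ d),
    "NO_OP": lambda s, d: d,
    "OR_INVERTED": lambda s, d: ~s | d,
    "COPY": lambda s, d: s,
    "OR_REVERSE": lambda s, d: s | ~d,
    "OR": lambda s, d: s | d,
    "SET": lambda s, d: 0xFF,
}


def rop3_code(name):
    """Evaluate a logic op on the two truth-table columns, keeping 8 bits."""
    return LOGIC_OPS[name](SRC, DST) & 0xFF
```

For instance, `rop3_code("AND")` evaluates `0xCC & 0xAA`, which matches `V_028808_ROP3_AND`.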

@@ -141,7 +141,7 @@ radv_physical_device_init_mem_types(struct radv_physical_device *device)
gart_index = device->memory_properties.memoryHeapCount++;
device->memory_properties.memoryHeaps[gart_index] = (VkMemoryHeap) {
.size = device->rad_info.gart_size,
.flags = 0,
.flags = device->rad_info.has_dedicated_vram ? 0 : VK_MEMORY_HEAP_DEVICE_LOCAL_BIT,
};
}
@@ -158,7 +158,8 @@ radv_physical_device_init_mem_types(struct radv_physical_device *device)
device->mem_type_indices[type_count] = RADV_MEM_TYPE_GTT_WRITE_COMBINE;
device->memory_properties.memoryTypes[type_count++] = (VkMemoryType) {
.propertyFlags = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT |
(device->rad_info.has_dedicated_vram ? 0 : VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT),
.heapIndex = gart_index,
};
}
@@ -176,7 +177,8 @@ radv_physical_device_init_mem_types(struct radv_physical_device *device)
device->memory_properties.memoryTypes[type_count++] = (VkMemoryType) {
.propertyFlags = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT |
VK_MEMORY_PROPERTY_HOST_CACHED_BIT,
VK_MEMORY_PROPERTY_HOST_CACHED_BIT |
(device->rad_info.has_dedicated_vram ? 0 : VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT),
.heapIndex = gart_index,
};
}

@@ -619,6 +619,25 @@ radv_physical_device_get_format_properties(struct radv_physical_device *physical
tiled |= VK_FORMAT_FEATURE_STORAGE_IMAGE_ATOMIC_BIT;
}
switch(format) {
case VK_FORMAT_A2R10G10B10_SNORM_PACK32:
case VK_FORMAT_A2B10G10R10_SNORM_PACK32:
case VK_FORMAT_A2R10G10B10_SSCALED_PACK32:
case VK_FORMAT_A2B10G10R10_SSCALED_PACK32:
case VK_FORMAT_A2R10G10B10_SINT_PACK32:
case VK_FORMAT_A2B10G10R10_SINT_PACK32:
if (physical_device->rad_info.chip_class <= VI &&
physical_device->rad_info.family != CHIP_STONEY) {
buffer &= ~(VK_FORMAT_FEATURE_UNIFORM_TEXEL_BUFFER_BIT |
VK_FORMAT_FEATURE_STORAGE_TEXEL_BUFFER_BIT);
linear = 0;
tiled = 0;
}
break;
default:
break;
}
out_properties->linearTilingFeatures = linear;
out_properties->optimalTilingFeatures = tiled;
out_properties->bufferFeatures = buffer;

@@ -100,7 +100,8 @@ blit2d_bind_src(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_blit2d_buffer *src_buf,
struct blit2d_src_temps *tmp,
enum blit2d_src_type src_type, VkFormat depth_format,
VkImageAspectFlagBits aspects)
VkImageAspectFlagBits aspects,
uint32_t log2_samples)
{
struct radv_device *device = cmd_buffer->device;
@@ -108,7 +109,7 @@ blit2d_bind_src(struct radv_cmd_buffer *cmd_buffer,
create_bview(cmd_buffer, src_buf, &tmp->bview, depth_format);
radv_meta_push_descriptor_set(cmd_buffer, VK_PIPELINE_BIND_POINT_GRAPHICS,
device->meta_state.blit2d.p_layouts[src_type],
device->meta_state.blit2d[log2_samples].p_layouts[src_type],
0, /* set */
1, /* descriptorWriteCount */
(VkWriteDescriptorSet[]) {
@@ -123,7 +124,7 @@ blit2d_bind_src(struct radv_cmd_buffer *cmd_buffer,
});
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.blit2d.p_layouts[src_type],
device->meta_state.blit2d[log2_samples].p_layouts[src_type],
VK_SHADER_STAGE_FRAGMENT_BIT, 16, 4,
&src_buf->pitch);
} else {
@@ -131,12 +132,12 @@ blit2d_bind_src(struct radv_cmd_buffer *cmd_buffer,
if (src_type == BLIT2D_SRC_TYPE_IMAGE_3D)
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.blit2d.p_layouts[src_type],
device->meta_state.blit2d[log2_samples].p_layouts[src_type],
VK_SHADER_STAGE_FRAGMENT_BIT, 16, 4,
&src_img->layer);
radv_meta_push_descriptor_set(cmd_buffer, VK_PIPELINE_BIND_POINT_GRAPHICS,
device->meta_state.blit2d.p_layouts[src_type],
device->meta_state.blit2d[log2_samples].p_layouts[src_type],
0, /* set */
1, /* descriptorWriteCount */
(VkWriteDescriptorSet[]) {
@@ -190,10 +191,11 @@ blit2d_bind_dst(struct radv_cmd_buffer *cmd_buffer,
static void
bind_pipeline(struct radv_cmd_buffer *cmd_buffer,
enum blit2d_src_type src_type, unsigned fs_key)
enum blit2d_src_type src_type, unsigned fs_key,
uint32_t log2_samples)
{
VkPipeline pipeline =
cmd_buffer->device->meta_state.blit2d.pipelines[src_type][fs_key];
cmd_buffer->device->meta_state.blit2d[log2_samples].pipelines[src_type][fs_key];
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
@@ -201,10 +203,11 @@ bind_pipeline(struct radv_cmd_buffer *cmd_buffer,
static void
bind_depth_pipeline(struct radv_cmd_buffer *cmd_buffer,
enum blit2d_src_type src_type)
enum blit2d_src_type src_type,
uint32_t log2_samples)
{
VkPipeline pipeline =
cmd_buffer->device->meta_state.blit2d.depth_only_pipeline[src_type];
cmd_buffer->device->meta_state.blit2d[log2_samples].depth_only_pipeline[src_type];
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
@@ -212,10 +215,11 @@ bind_depth_pipeline(struct radv_cmd_buffer *cmd_buffer,
static void
bind_stencil_pipeline(struct radv_cmd_buffer *cmd_buffer,
enum blit2d_src_type src_type)
enum blit2d_src_type src_type,
uint32_t log2_samples)
{
VkPipeline pipeline =
cmd_buffer->device->meta_state.blit2d.stencil_only_pipeline[src_type];
cmd_buffer->device->meta_state.blit2d[log2_samples].stencil_only_pipeline[src_type];
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
@@ -227,7 +231,8 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_blit2d_buffer *src_buf,
struct radv_meta_blit2d_surf *dst,
unsigned num_rects,
struct radv_meta_blit2d_rect *rects, enum blit2d_src_type src_type)
struct radv_meta_blit2d_rect *rects, enum blit2d_src_type src_type,
uint32_t log2_samples)
{
struct radv_device *device = cmd_buffer->device;
@@ -241,7 +246,7 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
else if (aspect_mask == VK_IMAGE_ASPECT_DEPTH_BIT)
depth_format = vk_format_depth_only(dst->image->vk_format);
struct blit2d_src_temps src_temps;
blit2d_bind_src(cmd_buffer, src_img, src_buf, &src_temps, src_type, depth_format, aspect_mask);
blit2d_bind_src(cmd_buffer, src_img, src_buf, &src_temps, src_type, depth_format, aspect_mask, log2_samples);
struct blit2d_dst_temps dst_temps;
blit2d_bind_dst(cmd_buffer, dst, rects[r].dst_x + rects[r].width,
@@ -255,7 +260,7 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
};
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.blit2d.p_layouts[src_type],
device->meta_state.blit2d[log2_samples].p_layouts[src_type],
VK_SHADER_STAGE_VERTEX_BIT, 0, 16,
vertex_push_constants);
@@ -266,7 +271,7 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
radv_CmdBeginRenderPass(radv_cmd_buffer_to_handle(cmd_buffer),
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.blit2d.render_passes[fs_key][dst_layout],
.renderPass = device->meta_state.blit2d_render_passes[fs_key][dst_layout],
.framebuffer = dst_temps.fb,
.renderArea = {
.offset = { rects[r].dst_x, rects[r].dst_y, },
@@ -277,13 +282,13 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
}, VK_SUBPASS_CONTENTS_INLINE);
bind_pipeline(cmd_buffer, src_type, fs_key);
bind_pipeline(cmd_buffer, src_type, fs_key, log2_samples);
} else if (aspect_mask == VK_IMAGE_ASPECT_DEPTH_BIT) {
enum radv_blit_ds_layout ds_layout = radv_meta_blit_ds_to_type(dst->current_layout);
radv_CmdBeginRenderPass(radv_cmd_buffer_to_handle(cmd_buffer),
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.blit2d.depth_only_rp[ds_layout],
.renderPass = device->meta_state.blit2d_depth_only_rp[ds_layout],
.framebuffer = dst_temps.fb,
.renderArea = {
.offset = { rects[r].dst_x, rects[r].dst_y, },
@@ -294,14 +299,14 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
}, VK_SUBPASS_CONTENTS_INLINE);
bind_depth_pipeline(cmd_buffer, src_type);
bind_depth_pipeline(cmd_buffer, src_type, log2_samples);
} else if (aspect_mask == VK_IMAGE_ASPECT_STENCIL_BIT) {
enum radv_blit_ds_layout ds_layout = radv_meta_blit_ds_to_type(dst->current_layout);
radv_CmdBeginRenderPass(radv_cmd_buffer_to_handle(cmd_buffer),
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.blit2d.stencil_only_rp[ds_layout],
.renderPass = device->meta_state.blit2d_stencil_only_rp[ds_layout],
.framebuffer = dst_temps.fb,
.renderArea = {
.offset = { rects[r].dst_x, rects[r].dst_y, },
@@ -312,7 +317,7 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
}, VK_SUBPASS_CONTENTS_INLINE);
bind_stencil_pipeline(cmd_buffer, src_type);
bind_stencil_pipeline(cmd_buffer, src_type, log2_samples);
} else
unreachable("Processing blit2d with multiple aspects.");
@@ -332,7 +337,24 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
radv_CmdDraw(radv_cmd_buffer_to_handle(cmd_buffer), 3, 1, 0, 0);
if (log2_samples > 0) {
for (uint32_t sample = 0; sample < src_img->image->info.samples; sample++) {
uint32_t sample_mask = 1 << sample;
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.blit2d[log2_samples].p_layouts[src_type],
VK_SHADER_STAGE_FRAGMENT_BIT, 20, 4,
&sample);
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.blit2d[log2_samples].p_layouts[src_type],
VK_SHADER_STAGE_FRAGMENT_BIT, 24, 4,
&sample_mask);
radv_CmdDraw(radv_cmd_buffer_to_handle(cmd_buffer), 3, 1, 0, 0);
}
}
else
radv_CmdDraw(radv_cmd_buffer_to_handle(cmd_buffer), 3, 1, 0, 0);
radv_CmdEndRenderPass(radv_cmd_buffer_to_handle(cmd_buffer));
/* At the point where we emit the draw call, all data from the
@@ -358,7 +380,8 @@ radv_meta_blit2d(struct radv_cmd_buffer *cmd_buffer,
enum blit2d_src_type src_type = src_buf ? BLIT2D_SRC_TYPE_BUFFER :
use_3d ? BLIT2D_SRC_TYPE_IMAGE_3D : BLIT2D_SRC_TYPE_IMAGE;
radv_meta_blit2d_normal_dst(cmd_buffer, src_img, src_buf, dst,
num_rects, rects, src_type);
num_rects, rects, src_type,
src_img ? util_logbase2(src_img->image->info.samples) : 0);
}
static nir_shader *
@@ -421,13 +444,14 @@ build_nir_vertex_shader(void)
typedef nir_ssa_def* (*texel_fetch_build_func)(struct nir_builder *,
struct radv_device *,
nir_ssa_def *, bool);
nir_ssa_def *, bool, bool);
static nir_ssa_def *
build_nir_texel_fetch(struct nir_builder *b, struct radv_device *device,
nir_ssa_def *tex_pos, bool is_3d)
nir_ssa_def *tex_pos, bool is_3d, bool is_multisampled)
{
enum glsl_sampler_dim dim = is_3d ? GLSL_SAMPLER_DIM_3D : GLSL_SAMPLER_DIM_2D;
enum glsl_sampler_dim dim =
is_3d ? GLSL_SAMPLER_DIM_3D : is_multisampled ? GLSL_SAMPLER_DIM_MS : GLSL_SAMPLER_DIM_2D;
const struct glsl_type *sampler_type =
glsl_sampler_type(dim, false, false, GLSL_TYPE_UINT);
nir_variable *sampler = nir_variable_create(b->shader, nir_var_uniform,
@@ -436,6 +460,7 @@ build_nir_texel_fetch(struct nir_builder *b, struct radv_device *device,
sampler->data.binding = 0;
nir_ssa_def *tex_pos_3d = NULL;
nir_intrinsic_instr *sample_idx = NULL;
if (is_3d) {
nir_intrinsic_instr *layer = nir_intrinsic_instr_create(b->shader, nir_intrinsic_load_push_constant);
nir_intrinsic_set_base(layer, 16);
@@ -451,13 +476,26 @@ build_nir_texel_fetch(struct nir_builder *b, struct radv_device *device,
chans[2] = &layer->dest.ssa;
tex_pos_3d = nir_vec(b, chans, 3);
}
nir_tex_instr *tex = nir_tex_instr_create(b->shader, 2);
if (is_multisampled) {
sample_idx = nir_intrinsic_instr_create(b->shader, nir_intrinsic_load_push_constant);
nir_intrinsic_set_base(sample_idx, 20);
nir_intrinsic_set_range(sample_idx, 4);
sample_idx->src[0] = nir_src_for_ssa(nir_imm_int(b, 0));
sample_idx->num_components = 1;
nir_ssa_dest_init(&sample_idx->instr, &sample_idx->dest, 1, 32, "sample_idx");
nir_builder_instr_insert(b, &sample_idx->instr);
}
nir_tex_instr *tex = nir_tex_instr_create(b->shader, is_multisampled ? 3 : 2);
tex->sampler_dim = dim;
tex->op = nir_texop_txf;
tex->op = is_multisampled ? nir_texop_txf_ms : nir_texop_txf;
tex->src[0].src_type = nir_tex_src_coord;
tex->src[0].src = nir_src_for_ssa(is_3d ? tex_pos_3d : tex_pos);
tex->src[1].src_type = nir_tex_src_lod;
tex->src[1].src = nir_src_for_ssa(nir_imm_int(b, 0));
tex->src[1].src_type = is_multisampled ? nir_tex_src_ms_index : nir_tex_src_lod;
tex->src[1].src = nir_src_for_ssa(is_multisampled ? &sample_idx->dest.ssa : nir_imm_int(b, 0));
if (is_multisampled) {
tex->src[2].src_type = nir_tex_src_lod;
tex->src[2].src = nir_src_for_ssa(nir_imm_int(b, 0));
}
tex->dest_type = nir_type_uint;
tex->is_array = false;
tex->coord_components = is_3d ? 3 : 2;
@@ -473,7 +511,7 @@ build_nir_texel_fetch(struct nir_builder *b, struct radv_device *device,
static nir_ssa_def *
build_nir_buffer_fetch(struct nir_builder *b, struct radv_device *device,
nir_ssa_def *tex_pos, bool is_3d)
nir_ssa_def *tex_pos, bool is_3d, bool is_multisampled)
{
const struct glsl_type *sampler_type =
glsl_sampler_type(GLSL_SAMPLER_DIM_BUF, false, false, GLSL_TYPE_UINT);
@@ -519,9 +557,31 @@ static const VkPipelineVertexInputStateCreateInfo normal_vi_create_info = {
.vertexAttributeDescriptionCount = 0,
};
static void
build_nir_store_sample_mask(struct nir_builder *b)
{
nir_intrinsic_instr *sample_mask = nir_intrinsic_instr_create(b->shader, nir_intrinsic_load_push_constant);
nir_intrinsic_set_base(sample_mask, 24);
nir_intrinsic_set_range(sample_mask, 4);
sample_mask->src[0] = nir_src_for_ssa(nir_imm_int(b, 0));
sample_mask->num_components = 1;
nir_ssa_dest_init(&sample_mask->instr, &sample_mask->dest, 1, 32, "sample_mask");
nir_builder_instr_insert(b, &sample_mask->instr);
const struct glsl_type *sample_mask_out_type = glsl_uint_type();
nir_variable *sample_mask_out =
nir_variable_create(b->shader, nir_var_shader_out,
sample_mask_out_type, "sample_mask_out");
sample_mask_out->data.location = FRAG_RESULT_SAMPLE_MASK;
nir_store_var(b, sample_mask_out, &sample_mask->dest.ssa, 0x1);
}
static nir_shader *
build_nir_copy_fragment_shader(struct radv_device *device,
texel_fetch_build_func txf_func, const char* name, bool is_3d)
texel_fetch_build_func txf_func, const char* name, bool is_3d,
bool is_multisampled)
{
const struct glsl_type *vec4 = glsl_vec4_type();
const struct glsl_type *vec2 = glsl_vector_type(GLSL_TYPE_FLOAT, 2);
@@ -538,11 +598,15 @@ build_nir_copy_fragment_shader(struct radv_device *device,
vec4, "f_color");
color_out->data.location = FRAG_RESULT_DATA0;
if (is_multisampled) {
build_nir_store_sample_mask(&b);
}
nir_ssa_def *pos_int = nir_f2i32(&b, nir_load_var(&b, tex_pos_in));
unsigned swiz[4] = { 0, 1 };
nir_ssa_def *tex_pos = nir_swizzle(&b, pos_int, swiz, 2, false);
nir_ssa_def *color = txf_func(&b, device, tex_pos, is_3d);
nir_ssa_def *color = txf_func(&b, device, tex_pos, is_3d, is_multisampled);
nir_store_var(&b, color_out, color, 0xf);
return b.shader;
@@ -550,7 +614,8 @@ build_nir_copy_fragment_shader(struct radv_device *device,
static nir_shader *
build_nir_copy_fragment_shader_depth(struct radv_device *device,
texel_fetch_build_func txf_func, const char* name, bool is_3d)
texel_fetch_build_func txf_func, const char* name, bool is_3d,
bool is_multisampled)
{
const struct glsl_type *vec4 = glsl_vec4_type();
const struct glsl_type *vec2 = glsl_vector_type(GLSL_TYPE_FLOAT, 2);
@@ -567,11 +632,15 @@ build_nir_copy_fragment_shader_depth(struct radv_device *device,
vec4, "f_color");
color_out->data.location = FRAG_RESULT_DEPTH;
if (is_multisampled) {
build_nir_store_sample_mask(&b);
}
nir_ssa_def *pos_int = nir_f2i32(&b, nir_load_var(&b, tex_pos_in));
unsigned swiz[4] = { 0, 1 };
nir_ssa_def *tex_pos = nir_swizzle(&b, pos_int, swiz, 2, false);
nir_ssa_def *color = txf_func(&b, device, tex_pos, is_3d);
nir_ssa_def *color = txf_func(&b, device, tex_pos, is_3d, is_multisampled);
nir_store_var(&b, color_out, color, 0x1);
return b.shader;
@@ -579,7 +648,8 @@ build_nir_copy_fragment_shader_depth(struct radv_device *device,
static nir_shader *
build_nir_copy_fragment_shader_stencil(struct radv_device *device,
texel_fetch_build_func txf_func, const char* name, bool is_3d)
texel_fetch_build_func txf_func, const char* name, bool is_3d,
bool is_multisampled)
{
const struct glsl_type *vec4 = glsl_vec4_type();
const struct glsl_type *vec2 = glsl_vector_type(GLSL_TYPE_FLOAT, 2);
@@ -596,11 +666,15 @@ build_nir_copy_fragment_shader_stencil(struct radv_device *device,
vec4, "f_color");
color_out->data.location = FRAG_RESULT_STENCIL;
if (is_multisampled) {
build_nir_store_sample_mask(&b);
}
nir_ssa_def *pos_int = nir_f2i32(&b, nir_load_var(&b, tex_pos_in));
unsigned swiz[4] = { 0, 1 };
nir_ssa_def *tex_pos = nir_swizzle(&b, pos_int, swiz, 2, false);
nir_ssa_def *color = txf_func(&b, device, tex_pos, is_3d);
nir_ssa_def *color = txf_func(&b, device, tex_pos, is_3d, is_multisampled);
nir_store_var(&b, color_out, color, 0x1);
return b.shader;
@@ -614,45 +688,48 @@ radv_device_finish_meta_blit2d_state(struct radv_device *device)
for(unsigned j = 0; j < NUM_META_FS_KEYS; ++j) {
for (unsigned k = 0; k < RADV_META_DST_LAYOUT_COUNT; ++k) {
radv_DestroyRenderPass(radv_device_to_handle(device),
state->blit2d.render_passes[j][k],
&state->alloc);
state->blit2d_render_passes[j][k],
&state->alloc);
}
}
for (enum radv_blit_ds_layout j = RADV_BLIT_DS_LAYOUT_TILE_ENABLE; j < RADV_BLIT_DS_LAYOUT_COUNT; j++) {
radv_DestroyRenderPass(radv_device_to_handle(device),
state->blit2d.depth_only_rp[j], &state->alloc);
state->blit2d_depth_only_rp[j], &state->alloc);
radv_DestroyRenderPass(radv_device_to_handle(device),
state->blit2d.stencil_only_rp[j], &state->alloc);
state->blit2d_stencil_only_rp[j], &state->alloc);
}
for (unsigned src = 0; src < BLIT2D_NUM_SRC_TYPES; src++) {
radv_DestroyPipelineLayout(radv_device_to_handle(device),
state->blit2d.p_layouts[src],
&state->alloc);
radv_DestroyDescriptorSetLayout(radv_device_to_handle(device),
state->blit2d.ds_layouts[src],
&state->alloc);
for (unsigned log2_samples = 0; log2_samples < 1 + MAX_SAMPLES_LOG2; ++log2_samples) {
for (unsigned src = 0; src < BLIT2D_NUM_SRC_TYPES; src++) {
radv_DestroyPipelineLayout(radv_device_to_handle(device),
state->blit2d[log2_samples].p_layouts[src],
&state->alloc);
radv_DestroyDescriptorSetLayout(radv_device_to_handle(device),
state->blit2d[log2_samples].ds_layouts[src],
&state->alloc);
for (unsigned j = 0; j < NUM_META_FS_KEYS; ++j) {
radv_DestroyPipeline(radv_device_to_handle(device),
state->blit2d[log2_samples].pipelines[src][j],
&state->alloc);
}
for (unsigned j = 0; j < NUM_META_FS_KEYS; ++j) {
radv_DestroyPipeline(radv_device_to_handle(device),
state->blit2d.pipelines[src][j],
state->blit2d[log2_samples].depth_only_pipeline[src],
&state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->blit2d[log2_samples].stencil_only_pipeline[src],
&state->alloc);
}
radv_DestroyPipeline(radv_device_to_handle(device),
state->blit2d.depth_only_pipeline[src],
&state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->blit2d.stencil_only_pipeline[src],
&state->alloc);
}
}
static VkResult
blit2d_init_color_pipeline(struct radv_device *device,
enum blit2d_src_type src_type,
VkFormat format)
VkFormat format,
uint32_t log2_samples)
{
VkResult result;
unsigned fs_key = radv_format_meta_fs_key(format);
@@ -681,7 +758,7 @@ blit2d_init_color_pipeline(struct radv_device *device,
struct radv_shader_module fs = { .nir = NULL };
fs.nir = build_nir_copy_fragment_shader(device, src_func, name, src_type == BLIT2D_SRC_TYPE_IMAGE_3D);
fs.nir = build_nir_copy_fragment_shader(device, src_func, name, src_type == BLIT2D_SRC_TYPE_IMAGE_3D, log2_samples > 0);
vi_create_info = &normal_vi_create_info;
struct radv_shader_module vs = {
@@ -705,7 +782,7 @@ blit2d_init_color_pipeline(struct radv_device *device,
};
for (unsigned dst_layout = 0; dst_layout < RADV_META_DST_LAYOUT_COUNT; ++dst_layout) {
if (!device->meta_state.blit2d.render_passes[fs_key][dst_layout]) {
if (!device->meta_state.blit2d_render_passes[fs_key][dst_layout]) {
VkImageLayout layout = radv_meta_dst_layout_to_layout(dst_layout);
result = radv_CreateRenderPass(radv_device_to_handle(device),
@@ -737,7 +814,7 @@ blit2d_init_color_pipeline(struct radv_device *device,
.pPreserveAttachments = (uint32_t[]) { 0 },
},
.dependencyCount = 0,
}, &device->meta_state.alloc, &device->meta_state.blit2d.render_passes[fs_key][dst_layout]);
}, &device->meta_state.alloc, &device->meta_state.blit2d_render_passes[fs_key][dst_layout]);
}
}
@@ -765,7 +842,7 @@ blit2d_init_color_pipeline(struct radv_device *device,
},
.pMultisampleState = &(VkPipelineMultisampleStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO,
.rasterizationSamples = 1,
.rasterizationSamples = 1 << log2_samples,
.sampleShadingEnable = false,
.pSampleMask = (VkSampleMask[]) { UINT32_MAX },
},
@@ -796,8 +873,8 @@ blit2d_init_color_pipeline(struct radv_device *device,
},
},
.flags = 0,
.layout = device->meta_state.blit2d.p_layouts[src_type],
.renderPass = device->meta_state.blit2d.render_passes[fs_key][0],
.layout = device->meta_state.blit2d[log2_samples].p_layouts[src_type],
.renderPass = device->meta_state.blit2d_render_passes[fs_key][0],
.subpass = 0,
};
@@ -809,7 +886,7 @@ blit2d_init_color_pipeline(struct radv_device *device,
radv_pipeline_cache_to_handle(&device->meta_state.cache),
&vk_pipeline_info, &radv_pipeline_info,
&device->meta_state.alloc,
&device->meta_state.blit2d.pipelines[src_type][fs_key]);
&device->meta_state.blit2d[log2_samples].pipelines[src_type][fs_key]);
ralloc_free(vs.nir);
@@ -820,7 +897,8 @@ blit2d_init_color_pipeline(struct radv_device *device,
static VkResult
blit2d_init_depth_only_pipeline(struct radv_device *device,
enum blit2d_src_type src_type)
enum blit2d_src_type src_type,
uint32_t log2_samples)
{
VkResult result;
const char *name;
@@ -847,7 +925,7 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
const VkPipelineVertexInputStateCreateInfo *vi_create_info;
struct radv_shader_module fs = { .nir = NULL };
fs.nir = build_nir_copy_fragment_shader_depth(device, src_func, name, src_type == BLIT2D_SRC_TYPE_IMAGE_3D);
fs.nir = build_nir_copy_fragment_shader_depth(device, src_func, name, src_type == BLIT2D_SRC_TYPE_IMAGE_3D, log2_samples > 0);
vi_create_info = &normal_vi_create_info;
struct radv_shader_module vs = {
@@ -871,7 +949,7 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
};
for (enum radv_blit_ds_layout ds_layout = RADV_BLIT_DS_LAYOUT_TILE_ENABLE; ds_layout < RADV_BLIT_DS_LAYOUT_COUNT; ds_layout++) {
if (!device->meta_state.blit2d.depth_only_rp[ds_layout]) {
if (!device->meta_state.blit2d_depth_only_rp[ds_layout]) {
VkImageLayout layout = radv_meta_blit_ds_to_layout(ds_layout);
result = radv_CreateRenderPass(radv_device_to_handle(device),
&(VkRenderPassCreateInfo) {
@@ -899,7 +977,7 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
.pPreserveAttachments = (uint32_t[]) { 0 },
},
.dependencyCount = 0,
}, &device->meta_state.alloc, &device->meta_state.blit2d.depth_only_rp[ds_layout]);
}, &device->meta_state.alloc, &device->meta_state.blit2d_depth_only_rp[ds_layout]);
}
}
@@ -927,7 +1005,7 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
},
.pMultisampleState = &(VkPipelineMultisampleStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO,
.rasterizationSamples = 1,
.rasterizationSamples = 1 << log2_samples,
.sampleShadingEnable = false,
.pSampleMask = (VkSampleMask[]) { UINT32_MAX },
},
@@ -958,8 +1036,8 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
},
},
.flags = 0,
.layout = device->meta_state.blit2d.p_layouts[src_type],
.renderPass = device->meta_state.blit2d.depth_only_rp[0],
.layout = device->meta_state.blit2d[log2_samples].p_layouts[src_type],
.renderPass = device->meta_state.blit2d_depth_only_rp[0],
.subpass = 0,
};
@@ -971,7 +1049,7 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
radv_pipeline_cache_to_handle(&device->meta_state.cache),
&vk_pipeline_info, &radv_pipeline_info,
&device->meta_state.alloc,
&device->meta_state.blit2d.depth_only_pipeline[src_type]);
&device->meta_state.blit2d[log2_samples].depth_only_pipeline[src_type]);
ralloc_free(vs.nir);
@@ -982,7 +1060,8 @@ blit2d_init_depth_only_pipeline(struct radv_device *device,
static VkResult
blit2d_init_stencil_only_pipeline(struct radv_device *device,
enum blit2d_src_type src_type)
enum blit2d_src_type src_type,
uint32_t log2_samples)
{
VkResult result;
const char *name;
@@ -1009,7 +1088,7 @@ blit2d_init_stencil_only_pipeline(struct radv_device *device,
const VkPipelineVertexInputStateCreateInfo *vi_create_info;
struct radv_shader_module fs = { .nir = NULL };
fs.nir = build_nir_copy_fragment_shader_stencil(device, src_func, name, src_type == BLIT2D_SRC_TYPE_IMAGE_3D);
fs.nir = build_nir_copy_fragment_shader_stencil(device, src_func, name, src_type == BLIT2D_SRC_TYPE_IMAGE_3D, log2_samples > 0);
vi_create_info = &normal_vi_create_info;
struct radv_shader_module vs = {
@@ -1033,7 +1112,7 @@ blit2d_init_stencil_only_pipeline(struct radv_device *device,
};
for (enum radv_blit_ds_layout ds_layout = RADV_BLIT_DS_LAYOUT_TILE_ENABLE; ds_layout < RADV_BLIT_DS_LAYOUT_COUNT; ds_layout++) {
if (!device->meta_state.blit2d.stencil_only_rp[ds_layout]) {
if (!device->meta_state.blit2d_stencil_only_rp[ds_layout]) {
VkImageLayout layout = radv_meta_blit_ds_to_layout(ds_layout);
result = radv_CreateRenderPass(radv_device_to_handle(device),
&(VkRenderPassCreateInfo) {
@@ -1061,7 +1140,7 @@ blit2d_init_stencil_only_pipeline(struct radv_device *device,
.pPreserveAttachments = (uint32_t[]) { 0 },
},
.dependencyCount = 0,
}, &device->meta_state.alloc, &device->meta_state.blit2d.stencil_only_rp[ds_layout]);
}, &device->meta_state.alloc, &device->meta_state.blit2d_stencil_only_rp[ds_layout]);
}
}
@@ -1089,7 +1168,7 @@ blit2d_init_stencil_only_pipeline(struct radv_device *device,
},
.pMultisampleState = &(VkPipelineMultisampleStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO,
.rasterizationSamples = 1,
.rasterizationSamples = 1 << log2_samples,
.sampleShadingEnable = false,
.pSampleMask = (VkSampleMask[]) { UINT32_MAX },
},
@@ -1136,8 +1215,8 @@ blit2d_init_stencil_only_pipeline(struct radv_device *device,
},
},
.flags = 0,
.layout = device->meta_state.blit2d.p_layouts[src_type],
.renderPass = device->meta_state.blit2d.stencil_only_rp[0],
.layout = device->meta_state.blit2d[log2_samples].p_layouts[src_type],
.renderPass = device->meta_state.blit2d_stencil_only_rp[0],
.subpass = 0,
};
@@ -1149,7 +1228,7 @@ blit2d_init_stencil_only_pipeline(struct radv_device *device,
radv_pipeline_cache_to_handle(&device->meta_state.cache),
&vk_pipeline_info, &radv_pipeline_info,
&device->meta_state.alloc,
&device->meta_state.blit2d.stencil_only_pipeline[src_type]);
&device->meta_state.blit2d[log2_samples].stencil_only_pipeline[src_type]);
ralloc_free(vs.nir);
@@ -1175,15 +1254,16 @@ static VkFormat pipeline_formats[] = {
static VkResult
meta_blit2d_create_pipe_layout(struct radv_device *device,
int idx)
int idx,
uint32_t log2_samples)
{
VkResult result;
VkDescriptorType desc_type = (idx == BLIT2D_SRC_TYPE_BUFFER) ? VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER : VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE;
const VkPushConstantRange push_constant_ranges[] = {
{VK_SHADER_STAGE_VERTEX_BIT, 0, 16},
{VK_SHADER_STAGE_FRAGMENT_BIT, 16, 4},
{VK_SHADER_STAGE_FRAGMENT_BIT, 16, 12},
};
int num_push_constant_range = (idx != BLIT2D_SRC_TYPE_IMAGE) ? 2 : 1;
int num_push_constant_range = (idx != BLIT2D_SRC_TYPE_IMAGE || log2_samples > 0) ? 2 : 1;
result = radv_CreateDescriptorSetLayout(radv_device_to_handle(device),
&(VkDescriptorSetLayoutCreateInfo) {
@@ -1199,7 +1279,7 @@ meta_blit2d_create_pipe_layout(struct radv_device *device,
.pImmutableSamplers = NULL
},
}
}, &device->meta_state.alloc, &device->meta_state.blit2d.ds_layouts[idx]);
}, &device->meta_state.alloc, &device->meta_state.blit2d[log2_samples].ds_layouts[idx]);
if (result != VK_SUCCESS)
goto fail;
@@ -1207,11 +1287,11 @@ meta_blit2d_create_pipe_layout(struct radv_device *device,
&(VkPipelineLayoutCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
.setLayoutCount = 1,
.pSetLayouts = &device->meta_state.blit2d.ds_layouts[idx],
.pSetLayouts = &device->meta_state.blit2d[log2_samples].ds_layouts[idx],
.pushConstantRangeCount = num_push_constant_range,
.pPushConstantRanges = push_constant_ranges,
},
&device->meta_state.alloc, &device->meta_state.blit2d.p_layouts[idx]);
&device->meta_state.alloc, &device->meta_state.blit2d[log2_samples].p_layouts[idx]);
if (result != VK_SUCCESS)
goto fail;
return VK_SUCCESS;
@@ -1225,27 +1305,33 @@ radv_device_init_meta_blit2d_state(struct radv_device *device)
VkResult result;
bool create_3d = device->physical_device->rad_info.chip_class >= GFX9;
for (unsigned src = 0; src < BLIT2D_NUM_SRC_TYPES; src++) {
if (src == BLIT2D_SRC_TYPE_IMAGE_3D && !create_3d)
continue;
for (unsigned log2_samples = 0; log2_samples < 1 + MAX_SAMPLES_LOG2; log2_samples++) {
for (unsigned src = 0; src < BLIT2D_NUM_SRC_TYPES; src++) {
if (src == BLIT2D_SRC_TYPE_IMAGE_3D && !create_3d)
continue;
result = meta_blit2d_create_pipe_layout(device, src);
if (result != VK_SUCCESS)
goto fail;
/* Don't need to handle copies between buffers and multisample images. */
if (src == BLIT2D_SRC_TYPE_BUFFER && log2_samples > 0)
continue;
for (unsigned j = 0; j < ARRAY_SIZE(pipeline_formats); ++j) {
result = blit2d_init_color_pipeline(device, src, pipeline_formats[j]);
result = meta_blit2d_create_pipe_layout(device, src, log2_samples);
if (result != VK_SUCCESS)
goto fail;
for (unsigned j = 0; j < ARRAY_SIZE(pipeline_formats); ++j) {
result = blit2d_init_color_pipeline(device, src, pipeline_formats[j], log2_samples);
if (result != VK_SUCCESS)
goto fail;
}
result = blit2d_init_depth_only_pipeline(device, src, log2_samples);
if (result != VK_SUCCESS)
goto fail;
result = blit2d_init_stencil_only_pipeline(device, src, log2_samples);
if (result != VK_SUCCESS)
goto fail;
}
result = blit2d_init_depth_only_pipeline(device, src);
if (result != VK_SUCCESS)
goto fail;
result = blit2d_init_stencil_only_pipeline(device, src);
if (result != VK_SUCCESS)
goto fail;
}
return VK_SUCCESS;
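
Note on the multisample path in this file: the pipeline set is selected by `log2_samples`, and a multisampled source gets one draw per sample, with the sample index pushed at byte offset 20 and a one-bit sample mask at offset 24. Below is a minimal standalone sketch of that bookkeeping. It assumes a hypothetical `log2_u32` helper in place of `util_logbase2` and a hypothetical `blit2d_push_constants` struct mirroring the two push-constant ranges; neither name appears in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for util_logbase2(): floor(log2(n)) for n >= 1. */
static uint32_t log2_u32(uint32_t n)
{
    assert(n >= 1);
    uint32_t r = 0;
    while (n >>= 1)
        r++;
    return r;
}

/* Hypothetical mirror of the push-constant ranges in the diff:
 * {VERTEX, 0, 16} followed by {FRAGMENT, 16, 12}. */
struct blit2d_push_constants {
    float    vertex[4];   /* bytes 0..15, vertex stage */
    uint32_t layer;       /* byte 16: layer for 3D sources */
    uint32_t sample_idx;  /* byte 20: sample to fetch */
    uint32_t sample_mask; /* byte 24: sample to write */
};

/* Draws issued per rectangle: one per sample when the source is
 * multisampled, otherwise a single draw. */
static uint32_t draws_per_rect(uint32_t samples)
{
    return log2_u32(samples) > 0 ? samples : 1;
}

/* Mask that restricts a draw to exactly one destination sample. */
static uint32_t sample_mask_for(uint32_t sample)
{
    assert(sample < 32);
    return UINT32_C(1) << sample;
}
```

With a 4x source, `draws_per_rect(4)` gives four draws and `sample_mask_for` yields 1, 2, 4, 8 in turn, which is why the fragment range has to grow from 4 to 12 bytes.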


@@ -358,6 +358,8 @@ static void radv_pick_resolve_method_images(struct radv_image *src_image,
*method = RESOLVE_COMPUTE;
else if (vk_format_is_int(src_image->vk_format))
*method = RESOLVE_COMPUTE;
else if (src_image->info.array_size > 1)
*method = RESOLVE_COMPUTE;
if (radv_layout_dcc_compressed(dest_image, dest_image_layout, queue_mask)) {
*method = RESOLVE_FRAGMENT;


@@ -536,12 +536,48 @@ radv_cmd_buffer_resolve_subpass_cs(struct radv_cmd_buffer *cmd_buffer)
if (dest_att.attachment == VK_ATTACHMENT_UNUSED)
continue;
emit_resolve(cmd_buffer,
src_iview,
dst_iview,
&(VkOffset2D) { 0, 0 },
&(VkOffset2D) { 0, 0 },
&(VkExtent2D) { fb->width, fb->height });
struct radv_image *src_image = src_iview->image;
struct radv_image *dst_image = dst_iview->image;
for (uint32_t layer = 0; layer < src_image->info.array_size; layer++) {
struct radv_image_view tsrc_iview;
radv_image_view_init(&tsrc_iview, cmd_buffer->device,
&(VkImageViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = radv_image_to_handle(src_image),
.viewType = radv_meta_get_view_type(src_image),
.format = src_image->vk_format,
.subresourceRange = {
.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
.baseMipLevel = src_iview->base_mip,
.levelCount = 1,
.baseArrayLayer = layer,
.layerCount = 1,
},
});
struct radv_image_view tdst_iview;
radv_image_view_init(&tdst_iview, cmd_buffer->device,
&(VkImageViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = radv_image_to_handle(dst_image),
.viewType = radv_meta_get_view_type(dst_image),
.format = vk_to_non_srgb_format(dst_image->vk_format),
.subresourceRange = {
.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
.baseMipLevel = dst_iview->base_mip,
.levelCount = 1,
.baseArrayLayer = layer,
.layerCount = 1,
},
});
emit_resolve(cmd_buffer,
&tsrc_iview,
&tdst_iview,
&(VkOffset2D) { 0, 0 },
&(VkOffset2D) { 0, 0 },
&(VkExtent2D) { fb->width, fb->height });
}
}
radv_meta_restore(&saved_state, cmd_buffer);


@@ -87,8 +87,8 @@ VkResult radv_CreateRenderPass(
subpass_attachment_count +=
desc->inputAttachmentCount +
desc->colorAttachmentCount +
/* Count colorAttachmentCount again for resolve_attachments */
desc->colorAttachmentCount;
(desc->pResolveAttachments ? desc->colorAttachmentCount : 0) +
(desc->pDepthStencilAttachment != NULL);
}
if (subpass_attachment_count) {
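
Note on the counting change in this hunk: resolve attachments are now counted only when the subpass provides them, and the depth/stencil attachment adds one more slot. A minimal sketch of the new arithmetic, using a hypothetical trimmed-down `subpass_desc` struct rather than the real `VkSubpassDescription`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for VkSubpassDescription: only the fields
 * that affect the attachment count. */
struct subpass_desc {
    uint32_t input_count;
    uint32_t color_count;
    const uint32_t *resolve_attachments;      /* NULL: no resolve */
    const uint32_t *depth_stencil_attachment; /* NULL: no depth/stencil */
};

/* Same arithmetic as the updated expression in the diff. */
static uint32_t subpass_attachment_count(const struct subpass_desc *desc)
{
    assert(desc != NULL);
    return desc->input_count +
           desc->color_count +
           (desc->resolve_attachments ? desc->color_count : 0) +
           (desc->depth_stencil_attachment != NULL);
}
```

For one input and two color attachments this gives 3 without resolve, 5 with resolve, and 6 once a depth/stencil attachment is present.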


@@ -142,6 +142,47 @@ radv_pipeline_scratch_init(struct radv_device *device,
return VK_SUCCESS;
}
static uint32_t si_translate_blend_logic_op(VkLogicOp op)
{
switch (op) {
case VK_LOGIC_OP_CLEAR:
return V_028808_ROP3_CLEAR;
case VK_LOGIC_OP_AND:
return V_028808_ROP3_AND;
case VK_LOGIC_OP_AND_REVERSE:
return V_028808_ROP3_AND_REVERSE;
case VK_LOGIC_OP_COPY:
return V_028808_ROP3_COPY;
case VK_LOGIC_OP_AND_INVERTED:
return V_028808_ROP3_AND_INVERTED;
case VK_LOGIC_OP_NO_OP:
return V_028808_ROP3_NO_OP;
case VK_LOGIC_OP_XOR:
return V_028808_ROP3_XOR;
case VK_LOGIC_OP_OR:
return V_028808_ROP3_OR;
case VK_LOGIC_OP_NOR:
return V_028808_ROP3_NOR;
case VK_LOGIC_OP_EQUIVALENT:
return V_028808_ROP3_EQUIVALENT;
case VK_LOGIC_OP_INVERT:
return V_028808_ROP3_INVERT;
case VK_LOGIC_OP_OR_REVERSE:
return V_028808_ROP3_OR_REVERSE;
case VK_LOGIC_OP_COPY_INVERTED:
return V_028808_ROP3_COPY_INVERTED;
case VK_LOGIC_OP_OR_INVERTED:
return V_028808_ROP3_OR_INVERTED;
case VK_LOGIC_OP_NAND:
return V_028808_ROP3_NAND;
case VK_LOGIC_OP_SET:
return V_028808_ROP3_SET;
default:
unreachable("Unhandled logic op");
}
}
static uint32_t si_translate_blend_function(VkBlendOp op)
{
switch (op) {
@@ -532,9 +573,9 @@ radv_pipeline_init_blend_state(struct radv_pipeline *pipeline,
}
blend->cb_color_control = 0;
if (vkblend->logicOpEnable)
blend->cb_color_control |= S_028808_ROP3(vkblend->logicOp | (vkblend->logicOp << 4));
blend->cb_color_control |= S_028808_ROP3(si_translate_blend_logic_op(vkblend->logicOp));
else
blend->cb_color_control |= S_028808_ROP3(0xcc);
blend->cb_color_control |= S_028808_ROP3(V_028808_ROP3_COPY);
blend->db_alpha_to_mask = S_028B70_ALPHA_TO_MASK_OFFSET0(2) |
S_028B70_ALPHA_TO_MASK_OFFSET1(2) |
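
Note on the logic-op hunk above: the removed formula `logicOp | (logicOp << 4)` reuses the API enum value as the ROP nibble, which only works if the enum order matches the truth-table encoding. The sketch below derives the nibble from a truth table instead. The bit-index convention (`src * 2 + dst`) is an assumption, chosen because it reproduces the `0xcc` value that the diff replaces with `V_028808_ROP3_COPY`; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* VK_LOGIC_OP_COPY is 3 in the Vulkan headers. */
enum { LOGIC_OP_COPY_VALUE = 3 };

typedef unsigned (*logic_fn)(unsigned s, unsigned d);

static unsigned op_copy(unsigned s, unsigned d) { (void)d; return s; }
static unsigned op_and(unsigned s, unsigned d)  { return s & d; }
static unsigned op_xor(unsigned s, unsigned d)  { return s ^ d; }

/* Build the 4-bit truth table (assumed bit index = s * 2 + d) and
 * duplicate it into both nibbles of the byte. */
static uint32_t rop3_from_truth_table(logic_fn fn)
{
    uint32_t nibble = 0;
    for (unsigned s = 0; s <= 1; s++)
        for (unsigned d = 0; d <= 1; d++)
            nibble |= (fn(s, d) & 1u) << (s * 2 + d);
    return nibble | (nibble << 4);
}

/* The formula the diff removes: the enum value is used as the nibble. */
static uint32_t rop3_from_enum_value(uint32_t logic_op)
{
    return logic_op | (logic_op << 4);
}
```

Under these assumptions the truth table for COPY yields `0xcc`, while the removed formula yields `0x33` for `VK_LOGIC_OP_COPY`, so an explicit translation table is needed.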
@@ -1755,10 +1796,34 @@ radv_generate_graphics_pipeline_key(struct radv_pipeline *pipeline,
}
for (unsigned i = 0; i < input_state->vertexAttributeDescriptionCount; ++i) {
unsigned binding;
binding = input_state->pVertexAttributeDescriptions[i].binding;
unsigned location = input_state->pVertexAttributeDescriptions[i].location;
unsigned binding = input_state->pVertexAttributeDescriptions[i].binding;
if (binding_input_rate & (1u << binding))
key.instance_rate_inputs |= 1u << input_state->pVertexAttributeDescriptions[i].location;
if (pipeline->device->physical_device->rad_info.chip_class <= VI &&
pipeline->device->physical_device->rad_info.family != CHIP_STONEY) {
VkFormat format = input_state->pVertexAttributeDescriptions[i].format;
uint64_t adjust;
switch(format) {
case VK_FORMAT_A2R10G10B10_SNORM_PACK32:
case VK_FORMAT_A2B10G10R10_SNORM_PACK32:
adjust = RADV_ALPHA_ADJUST_SNORM;
break;
case VK_FORMAT_A2R10G10B10_SSCALED_PACK32:
case VK_FORMAT_A2B10G10R10_SSCALED_PACK32:
adjust = RADV_ALPHA_ADJUST_SSCALED;
break;
case VK_FORMAT_A2R10G10B10_SINT_PACK32:
case VK_FORMAT_A2B10G10R10_SINT_PACK32:
adjust = RADV_ALPHA_ADJUST_SINT;
break;
default:
adjust = 0;
break;
}
key.vertex_alpha_adjust |= adjust << (2 * location);
}
}
if (pCreateInfo->pTessellationState)
@@ -1787,6 +1852,7 @@ radv_fill_shader_keys(struct ac_shader_variant_key *keys,
nir_shader **nir)
{
keys[MESA_SHADER_VERTEX].vs.instance_rate_inputs = key->instance_rate_inputs;
keys[MESA_SHADER_VERTEX].vs.alpha_adjust = key->vertex_alpha_adjust;
if (nir[MESA_SHADER_TESS_CTRL]) {
keys[MESA_SHADER_VERTEX].vs.as_ls = true;
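
Note on `vertex_alpha_adjust`: the key stores a 2-bit code per vertex attribute location, shifted by `2 * location`, so a `uint64_t` covers 32 locations. A minimal sketch of packing and unpacking such a key follows. The numeric values of the codes are assumptions (the diff shows only the `RADV_ALPHA_ADJUST_*` names and a default of 0), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 2-bit codes; only "0 means no adjustment" is taken
 * from the diff. */
enum alpha_adjust {
    ALPHA_ADJUST_NONE    = 0,
    ALPHA_ADJUST_SNORM   = 1,
    ALPHA_ADJUST_SINT    = 2,
    ALPHA_ADJUST_SSCALED = 3,
};

/* OR the code for one location into the key. The cast to uint64_t
 * before shifting matters: locations 16..31 need bits 32..63. */
static uint64_t pack_alpha_adjust(uint64_t key, unsigned location,
                                  enum alpha_adjust adjust)
{
    assert(location < 32);
    return key | ((uint64_t)adjust << (2 * location));
}

/* Read back the 2-bit code for one location. */
static enum alpha_adjust unpack_alpha_adjust(uint64_t key, unsigned location)
{
    assert(location < 32);
    return (enum alpha_adjust)((key >> (2 * location)) & 3);
}
```

A consumer such as the vertex shader key can then look up the adjustment per input without carrying the full `VkFormat`.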


@@ -329,6 +329,7 @@ struct radv_pipeline_cache {
struct radv_pipeline_key {
uint32_t instance_rate_inputs;
uint64_t vertex_alpha_adjust;
unsigned tess_input_vertices;
uint32_t col_format;
uint32_t is_int8;
@@ -442,18 +443,18 @@ struct radv_meta_state {
} blit;
struct {
VkRenderPass render_passes[NUM_META_FS_KEYS][RADV_META_DST_LAYOUT_COUNT];
VkPipelineLayout p_layouts[5];
VkDescriptorSetLayout ds_layouts[5];
VkPipeline pipelines[5][NUM_META_FS_KEYS];
VkPipelineLayout p_layouts[3];
VkDescriptorSetLayout ds_layouts[3];
VkPipeline pipelines[3][NUM_META_FS_KEYS];
VkPipeline depth_only_pipeline[5];
VkRenderPass depth_only_rp[RADV_BLIT_DS_LAYOUT_COUNT];
VkPipeline depth_only_pipeline[3];
VkPipeline stencil_only_pipeline[5];
} blit2d[1 + MAX_SAMPLES_LOG2];
VkRenderPass stencil_only_rp[RADV_BLIT_DS_LAYOUT_COUNT];
VkPipeline stencil_only_pipeline[3];
} blit2d;
VkRenderPass blit2d_render_passes[NUM_META_FS_KEYS][RADV_META_DST_LAYOUT_COUNT];
VkRenderPass blit2d_depth_only_rp[RADV_BLIT_DS_LAYOUT_COUNT];
VkRenderPass blit2d_stencil_only_rp[RADV_BLIT_DS_LAYOUT_COUNT];
struct {
VkPipelineLayout img_p_layout;


@@ -647,10 +647,10 @@ static VkRect2D si_scissor_from_viewport(const VkViewport *viewport)
get_viewport_xform(viewport, scale, translate);
rect.offset.x = translate[0] - abs(scale[0]);
rect.offset.y = translate[1] - abs(scale[1]);
rect.extent.width = ceilf(translate[0] + abs(scale[0])) - rect.offset.x;
rect.extent.height = ceilf(translate[1] + abs(scale[1])) - rect.offset.y;
rect.offset.x = translate[0] - fabs(scale[0]);
rect.offset.y = translate[1] - fabs(scale[1]);
rect.extent.width = ceilf(translate[0] + fabs(scale[0])) - rect.offset.x;
rect.extent.height = ceilf(translate[1] + fabs(scale[1])) - rect.offset.y;
return rect;
}
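
Note on the `abs` to `fabs` change: `abs()` takes an `int`, so a floating-point scale is truncated before the absolute value is taken, and the scissor rectangle can come out too small. A minimal sketch of one axis of the offset computation, with hypothetical helper names; the explicit `(int)` cast only makes visible the conversion that the old code performed implicitly.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>

/* Old behaviour: abs() works on int, so the fractional part of the
 * scale is lost before the subtraction. */
static int scissor_offset_abs(float translate, float scale)
{
    return (int)(translate - abs((int)scale));
}

/* New behaviour: fabs() keeps the fractional part. */
static int scissor_offset_fabs(float translate, float scale)
{
    return (int)(translate - fabs(scale));
}
```

With `translate = 10.0` and `scale = 0.75`, the `abs` variant returns 10 while the `fabs` variant returns 9, so the old code would clip one pixel column off the viewport.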


@@ -450,6 +450,7 @@ radv_amdgpu_winsys_bo_from_fd(struct radeon_winsys *_ws,
bo->size = result.alloc_size;
bo->is_shared = true;
bo->ws = ws;
bo->ref_count = 1;
radv_amdgpu_add_buffer_to_global_list(bo);
return (struct radeon_winsys_bo *)bo;
error_va_map:


@@ -66,6 +66,10 @@ struct radv_amdgpu_cs {
struct radeon_winsys_bo **virtual_buffers;
uint8_t *virtual_buffer_priorities;
int *virtual_buffer_hash_table;
/* For chips that don't support chaining. */
struct radeon_winsys_cs *old_cs_buffers;
unsigned num_old_cs_buffers;
};
static inline struct radv_amdgpu_cs *
@@ -166,6 +170,12 @@ static void radv_amdgpu_cs_destroy(struct radeon_winsys_cs *rcs)
for (unsigned i = 0; i < cs->num_old_ib_buffers; ++i)
cs->ws->base.buffer_destroy(cs->old_ib_buffers[i]);
for (unsigned i = 0; i < cs->num_old_cs_buffers; ++i) {
struct radeon_winsys_cs *rcs = &cs->old_cs_buffers[i];
free(rcs->buf);
}
free(cs->old_cs_buffers);
free(cs->old_ib_buffers);
free(cs->virtual_buffers);
free(cs->virtual_buffer_priorities);
@@ -251,9 +261,46 @@ static void radv_amdgpu_cs_grow(struct radeon_winsys_cs *_cs, size_t min_size)
/* The total ib size cannot exceed limit_dws dwords. */
if (ib_dws > limit_dws)
{
cs->failed = true;
/* The maximum size in dwords has been reached,
* try to allocate a new one.
*/
if (cs->num_old_cs_buffers + 1 >= AMDGPU_CS_MAX_IBS_PER_SUBMIT) {
/* TODO: Allow to submit more than 4 IBs. */
fprintf(stderr, "amdgpu: Maximum number of IBs "
"per submit reached.\n");
cs->failed = true;
cs->base.cdw = 0;
return;
}
cs->old_cs_buffers =
realloc(cs->old_cs_buffers,
(cs->num_old_cs_buffers + 1) * sizeof(*cs->old_cs_buffers));
if (!cs->old_cs_buffers) {
cs->failed = true;
cs->base.cdw = 0;
return;
}
/* Store the current one for submitting it later. */
cs->old_cs_buffers[cs->num_old_cs_buffers].cdw = cs->base.cdw;
cs->old_cs_buffers[cs->num_old_cs_buffers].max_dw = cs->base.max_dw;
cs->old_cs_buffers[cs->num_old_cs_buffers].buf = cs->base.buf;
cs->num_old_cs_buffers++;
/* Reset the cs, it will be re-allocated below. */
cs->base.cdw = 0;
return;
cs->base.buf = NULL;
/* Re-compute the number of dwords to allocate. */
ib_dws = MAX2(cs->base.cdw + min_size,
MIN2(cs->base.max_dw * 2, limit_dws));
if (ib_dws > limit_dws) {
fprintf(stderr, "amdgpu: Too high number of "
"dwords to allocate\n");
cs->failed = true;
return;
}
}
uint32_t *new_buf = realloc(cs->base.buf, ib_dws * 4);
@@ -365,6 +412,15 @@ static void radv_amdgpu_cs_reset(struct radeon_winsys_cs *_cs)
cs->ib.ib_mc_address = radv_amdgpu_winsys_bo(cs->ib_buffer)->base.va;
cs->ib_size_ptr = &cs->ib.size;
cs->ib.size = 0;
} else {
for (unsigned i = 0; i < cs->num_old_cs_buffers; ++i) {
struct radeon_winsys_cs *rcs = &cs->old_cs_buffers[i];
free(rcs->buf);
}
free(cs->old_cs_buffers);
cs->old_cs_buffers = NULL;
cs->num_old_cs_buffers = 0;
}
}
@@ -515,7 +571,8 @@ static void radv_amdgpu_cs_execute_secondary(struct radeon_winsys_cs *_parent,
static int radv_amdgpu_create_bo_list(struct radv_amdgpu_winsys *ws,
struct radeon_winsys_cs **cs_array,
unsigned count,
struct radv_amdgpu_winsys_bo *extra_bo,
struct radv_amdgpu_winsys_bo **extra_bo_array,
unsigned num_extra_bo,
struct radeon_winsys_cs *extra_cs,
amdgpu_bo_list_handle *bo_list)
{
@@ -544,7 +601,7 @@ static int radv_amdgpu_create_bo_list(struct radv_amdgpu_winsys *ws,
bo_list);
free(handles);
pthread_mutex_unlock(&ws->global_bo_list_lock);
} else if (count == 1 && !extra_bo && !extra_cs &&
} else if (count == 1 && !num_extra_bo && !extra_cs &&
!radv_amdgpu_cs(cs_array[0])->num_virtual_buffers) {
struct radv_amdgpu_cs *cs = (struct radv_amdgpu_cs*)cs_array[0];
if (cs->num_buffers == 0) {
@@ -554,8 +611,8 @@ static int radv_amdgpu_create_bo_list(struct radv_amdgpu_winsys *ws,
r = amdgpu_bo_list_create(ws->dev, cs->num_buffers, cs->handles,
cs->priorities, bo_list);
} else {
unsigned total_buffer_count = !!extra_bo;
unsigned unique_bo_count = !!extra_bo;
unsigned total_buffer_count = num_extra_bo;
unsigned unique_bo_count = num_extra_bo;
for (unsigned i = 0; i < count; ++i) {
struct radv_amdgpu_cs *cs = (struct radv_amdgpu_cs*)cs_array[i];
total_buffer_count += cs->num_buffers;
@@ -578,9 +635,9 @@ static int radv_amdgpu_create_bo_list(struct radv_amdgpu_winsys *ws,
return -ENOMEM;
}
if (extra_bo) {
handles[0] = extra_bo->bo;
priorities[0] = 8;
for (unsigned i = 0; i < num_extra_bo; i++) {
handles[i] = extra_bo_array[i]->bo;
priorities[i] = 8;
}
for (unsigned i = 0; i < count + !!extra_cs; ++i) {
@@ -710,7 +767,8 @@ static int radv_amdgpu_winsys_cs_submit_chained(struct radeon_winsys_ctx *_ctx,
}
}
r = radv_amdgpu_create_bo_list(cs0->ws, cs_array, cs_count, NULL, initial_preamble_cs, &bo_list);
r = radv_amdgpu_create_bo_list(cs0->ws, cs_array, cs_count, NULL, 0, initial_preamble_cs,
&bo_list);
if (r) {
fprintf(stderr, "amdgpu: buffer list creation failed for the "
"chained submission(%d)\n", r);
@@ -777,7 +835,7 @@ static int radv_amdgpu_winsys_cs_submit_fallback(struct radeon_winsys_ctx *_ctx,
memset(&request, 0, sizeof(request));
r = radv_amdgpu_create_bo_list(cs0->ws, &cs_array[i], cnt, NULL,
r = radv_amdgpu_create_bo_list(cs0->ws, &cs_array[i], cnt, NULL, 0,
preamble_cs, &bo_list);
if (r) {
fprintf(stderr, "amdgpu: buffer list creation failed "
@@ -857,68 +915,127 @@ static int radv_amdgpu_winsys_cs_submit_sysmem(struct radeon_winsys_ctx *_ctx,
assert(cs_count);
for (unsigned i = 0; i < cs_count;) {
struct amdgpu_cs_ib_info ib = {0};
struct radeon_winsys_bo *bo = NULL;
struct amdgpu_cs_ib_info ibs[AMDGPU_CS_MAX_IBS_PER_SUBMIT] = {0};
unsigned number_of_ibs = 1;
struct radeon_winsys_bo *bos[AMDGPU_CS_MAX_IBS_PER_SUBMIT] = {0};
struct radeon_winsys_cs *preamble_cs = i ? continue_preamble_cs : initial_preamble_cs;
struct radv_amdgpu_cs *cs = radv_amdgpu_cs(cs_array[i]);
uint32_t *ptr;
unsigned cnt = 0;
unsigned size = 0;
unsigned pad_words = 0;
if (preamble_cs)
size += preamble_cs->cdw;
while (i + cnt < cs_count && 0xffff8 - size >= radv_amdgpu_cs(cs_array[i + cnt])->base.cdw) {
size += radv_amdgpu_cs(cs_array[i + cnt])->base.cdw;
++cnt;
if (cs->num_old_cs_buffers > 0) {
/* Special path when the maximum size in dwords has
* been reached because we need to handle more than one
* IB per submit.
*/
unsigned new_cs_count = cs->num_old_cs_buffers + 1;
struct radeon_winsys_cs *new_cs_array[AMDGPU_CS_MAX_IBS_PER_SUBMIT];
unsigned idx = 0;
for (unsigned j = 0; j < cs->num_old_cs_buffers; j++)
new_cs_array[idx++] = &cs->old_cs_buffers[j];
new_cs_array[idx++] = cs_array[i];
for (unsigned j = 0; j < new_cs_count; j++) {
struct radeon_winsys_cs *rcs = new_cs_array[j];
bool needs_preamble = preamble_cs && j == 0;
unsigned size = 0;
if (needs_preamble)
size += preamble_cs->cdw;
size += rcs->cdw;
assert(size < 0xffff8);
while (!size || (size & 7)) {
size++;
pad_words++;
}
bos[j] = ws->buffer_create(ws, 4 * size, 4096,
RADEON_DOMAIN_GTT,
RADEON_FLAG_CPU_ACCESS |
RADEON_FLAG_NO_INTERPROCESS_SHARING |
RADEON_FLAG_READ_ONLY);
ptr = ws->buffer_map(bos[j]);
if (needs_preamble) {
memcpy(ptr, preamble_cs->buf, preamble_cs->cdw * 4);
ptr += preamble_cs->cdw;
}
memcpy(ptr, rcs->buf, 4 * rcs->cdw);
ptr += rcs->cdw;
for (unsigned k = 0; k < pad_words; ++k)
*ptr++ = pad_word;
ibs[j].size = size;
ibs[j].ib_mc_address = radv_buffer_get_va(bos[j]);
}
number_of_ibs = new_cs_count;
cnt++;
} else {
if (preamble_cs)
size += preamble_cs->cdw;
while (i + cnt < cs_count && 0xffff8 - size >= radv_amdgpu_cs(cs_array[i + cnt])->base.cdw) {
size += radv_amdgpu_cs(cs_array[i + cnt])->base.cdw;
++cnt;
}
while (!size || (size & 7)) {
size++;
pad_words++;
}
assert(cnt);
bos[0] = ws->buffer_create(ws, 4 * size, 4096,
RADEON_DOMAIN_GTT,
RADEON_FLAG_CPU_ACCESS |
RADEON_FLAG_NO_INTERPROCESS_SHARING |
RADEON_FLAG_READ_ONLY);
ptr = ws->buffer_map(bos[0]);
if (preamble_cs) {
memcpy(ptr, preamble_cs->buf, preamble_cs->cdw * 4);
ptr += preamble_cs->cdw;
}
for (unsigned j = 0; j < cnt; ++j) {
struct radv_amdgpu_cs *cs = radv_amdgpu_cs(cs_array[i + j]);
memcpy(ptr, cs->base.buf, 4 * cs->base.cdw);
ptr += cs->base.cdw;
}
for (unsigned j = 0; j < pad_words; ++j)
*ptr++ = pad_word;
ibs[0].size = size;
ibs[0].ib_mc_address = radv_buffer_get_va(bos[0]);
}
while(!size || (size & 7)) {
size++;
pad_words++;
}
assert(cnt);
bo = ws->buffer_create(ws, 4 * size, 4096, RADEON_DOMAIN_GTT,
RADEON_FLAG_CPU_ACCESS |
RADEON_FLAG_NO_INTERPROCESS_SHARING |
RADEON_FLAG_READ_ONLY);
ptr = ws->buffer_map(bo);
if (preamble_cs) {
memcpy(ptr, preamble_cs->buf, preamble_cs->cdw * 4);
ptr += preamble_cs->cdw;
}
for (unsigned j = 0; j < cnt; ++j) {
struct radv_amdgpu_cs *cs = radv_amdgpu_cs(cs_array[i + j]);
memcpy(ptr, cs->base.buf, 4 * cs->base.cdw);
ptr += cs->base.cdw;
}
for (unsigned j = 0; j < pad_words; ++j)
*ptr++ = pad_word;
memset(&request, 0, sizeof(request));
r = radv_amdgpu_create_bo_list(cs0->ws, &cs_array[i], cnt,
(struct radv_amdgpu_winsys_bo*)bo,
preamble_cs, &bo_list);
(struct radv_amdgpu_winsys_bo **)bos,
number_of_ibs, preamble_cs,
&bo_list);
if (r) {
fprintf(stderr, "amdgpu: buffer list creation failed "
"for the sysmem submission (%d)\n", r);
return r;
}
ib.size = size;
ib.ib_mc_address = radv_buffer_get_va(bo);
memset(&request, 0, sizeof(request));
request.ip_type = cs0->hw_ip;
request.ring = queue_idx;
request.resources = bo_list;
request.number_of_ibs = 1;
request.ibs = &ib;
request.number_of_ibs = number_of_ibs;
request.ibs = ibs;
request.fence_info = radv_set_cs_fence(ctx, cs0->hw_ip, queue_idx);
sem_info->cs_emit_signal = (i == cs_count - cnt) ? emit_signal_sem : false;
@@ -934,9 +1051,11 @@ static int radv_amdgpu_winsys_cs_submit_sysmem(struct radeon_winsys_ctx *_ctx,
if (bo_list)
amdgpu_bo_list_destroy(bo_list);
ws->buffer_destroy(bo);
if (r)
return r;
for (unsigned j = 0; j < number_of_ibs; j++) {
ws->buffer_destroy(bos[j]);
if (r)
return r;
}
i += cnt;
}

@@ -1363,11 +1363,11 @@ apply_var_decoration(struct vtn_builder *b, nir_variable *nir_var,
case SpvBuiltInTessLevelInner:
nir_var->data.compact = true;
break;
case SpvBuiltInSamplePosition:
nir_var->data.origin_upper_left = b->origin_upper_left;
/* fallthrough */
case SpvBuiltInFragCoord:
nir_var->data.pixel_center_integer = b->pixel_center_integer;
/* fallthrough */
case SpvBuiltInSamplePosition:
nir_var->data.origin_upper_left = b->origin_upper_left;
break;
default:
break;

@@ -864,19 +864,22 @@ dri2_x11_swap_buffers_msc(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw,
if (draw->Type == EGL_PIXMAP_BIT || draw->Type == EGL_PBUFFER_BIT)
return 0;
if (draw->SwapBehavior == EGL_BUFFER_PRESERVED || !dri2_dpy->swap_available)
return dri2_copy_region(drv, disp, draw, dri2_surf->region) ? 0 : -1;
if (draw->SwapBehavior == EGL_BUFFER_PRESERVED || !dri2_dpy->swap_available) {
swap_count = dri2_copy_region(drv, disp, draw, dri2_surf->region) ? 0 : -1;
} else {
dri2_flush_drawable_for_swapbuffers(disp, draw);
dri2_flush_drawable_for_swapbuffers(disp, draw);
cookie = xcb_dri2_swap_buffers_unchecked(dri2_dpy->conn,
dri2_surf->drawable, msc_hi,
msc_lo, divisor_hi, divisor_lo,
remainder_hi, remainder_lo);
cookie = xcb_dri2_swap_buffers_unchecked(dri2_dpy->conn, dri2_surf->drawable,
msc_hi, msc_lo, divisor_hi, divisor_lo, remainder_hi, remainder_lo);
reply = xcb_dri2_swap_buffers_reply(dri2_dpy->conn, cookie, NULL);
reply = xcb_dri2_swap_buffers_reply(dri2_dpy->conn, cookie, NULL);
if (reply) {
swap_count = (((int64_t)reply->swap_hi) << 32) | reply->swap_lo;
free(reply);
if (reply) {
swap_count = (((int64_t)reply->swap_hi) << 32) | reply->swap_lo;
free(reply);
}
}
/* Since we aren't watching for the server's invalidate events like we're

@@ -202,6 +202,7 @@ pipe_loader_drm_probe_fd(struct pipe_loader_device **dev, int fd)
if (ddev->lib)
util_dl_close(ddev->lib);
#endif
FREE(ddev->base.driver_name);
FREE(ddev);
return false;
}

@@ -270,6 +270,23 @@ util_hash_table_foreach(struct util_hash_table *ht,
}
static enum pipe_error
util_hash_inc(void *k, void *v, void *d)
{
++*(size_t *)d;
return PIPE_OK;
}
size_t
util_hash_table_count(struct util_hash_table *ht)
{
size_t count = 0;
util_hash_table_foreach(ht, util_hash_inc, &count);
return count;
}
void
util_hash_table_destroy(struct util_hash_table *ht)
{

@@ -85,6 +85,11 @@ util_hash_table_foreach(struct util_hash_table *ht,
(void *key, void *value, void *data),
void *data);
size_t
util_hash_table_count(struct util_hash_table *ht);
void
util_hash_table_destroy(struct util_hash_table *ht);
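
`util_hash_table_count` above counts entries by passing a counter through the `void *data` argument of the existing foreach callback, so no new traversal code is needed. The same pattern can be sketched over a plain array; every name below (`for_each_item`, `count_visit`, `count_items`, `SAMPLE_ITEMS`) is hypothetical and only illustrates the idea, not the Gallium hash table API.

```c
#include <stddef.h>

/* Hypothetical visitor signature: returns 0 to continue, non-zero to stop. */
typedef int (*visit_fn)(const void *item, void *data);

/* Hypothetical sample data used only for illustration. */
static const int SAMPLE_ITEMS[3] = { 7, 8, 9 };

/* Generic traversal: calls fn on every element, forwarding the user pointer. */
static int for_each_item(const int *items, size_t n, visit_fn fn, void *data)
{
    for (size_t i = 0; i < n; i++) {
        int err = fn(&items[i], data);
        if (err)
            return err;
    }
    return 0;
}

/* Visitor that ignores the item and bumps the counter behind data. */
static int count_visit(const void *item, void *data)
{
    (void)item;
    ++*(size_t *)data;
    return 0;
}

/* Counting is just a traversal with the counting visitor. */
static size_t count_items(const int *items, size_t n)
{
    size_t count = 0;
    for_each_item(items, n, count_visit, &count);
    return count;
}
```

The cost is a full traversal per call, which seems acceptable for the winsys teardown paths that call it once per unref.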

@@ -138,8 +138,7 @@ u_transfer_helper_resource_destroy(struct pipe_screen *pscreen,
if (helper->vtbl->get_stencil) {
struct pipe_resource *stencil = helper->vtbl->get_stencil(prsc);
if (stencil)
helper->vtbl->resource_destroy(pscreen, stencil);
pipe_resource_reference(&stencil, NULL);
}
helper->vtbl->resource_destroy(pscreen, prsc);

@@ -302,7 +302,7 @@ texture_format_needs_swiz(enum pipe_format fmt)
bool swiz = false;
if (formats[fmt].present)
swiz = !memcmp(def, formats[fmt].tex_swiz, sizeof(formats[fmt].tex_swiz));
swiz = !!memcmp(def, formats[fmt].tex_swiz, sizeof(formats[fmt].tex_swiz));
return swiz;
}
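
The one-character change above flips the result: `memcmp` returns 0 when the two buffers are equal, so `!memcmp(...)` is true for the identity swizzle, whereas a "needs swizzle" predicate should be true when the buffers differ. A small sketch of the corrected predicate follows; the function and constant names are hypothetical and the four-byte swizzle layout is an assumption for illustration only.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical swizzles: identity (x, y, z, w) and a red/blue swap. */
static const unsigned char SWIZ_IDENTITY[4] = { 0, 1, 2, 3 };
static const unsigned char SWIZ_SWAP_RB[4]  = { 2, 1, 0, 3 };

/* True only when swiz differs from the identity swizzle.
 * memcmp() == 0 means "equal", so the comparison must be != 0
 * (equivalent to the !!memcmp(...) spelling used in the diff). */
static bool needs_swizzle(const unsigned char swiz[4])
{
    return memcmp(SWIZ_IDENTITY, swiz, sizeof(SWIZ_IDENTITY)) != 0;
}
```

Writing the test as `!= 0` rather than `!!` is a stylistic choice; both give the same result.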

@@ -461,11 +461,10 @@ static void evergreen_delete_compute_state(struct pipe_context *ctx, void *state
} else {
#ifdef HAVE_OPENCL
radeon_shader_binary_clean(&shader->binary);
pipe_resource_reference(&shader->code_bo, NULL);
pipe_resource_reference(&shader->kernel_param, NULL);
#endif
r600_destroy_shader(&shader->bc);
/* TODO destroy shader->code_bo, shader->const_bo
* we'll need something like r600_buffer_free */
}
FREE(shader);
}

@@ -2202,7 +2202,7 @@ static void evergreen_emit_constant_buffers(struct r600_context *rctx,
radeon_emit(cs, PKT3(PKT3_SET_RESOURCE, 8, 0) | pkt_flags);
radeon_emit(cs, (buffer_id_base + buffer_index) * 8);
radeon_emit(cs, va); /* RESOURCEi_WORD0 */
radeon_emit(cs, rbuffer->b.b.width0 - cb->buffer_offset - 1); /* RESOURCEi_WORD1 */
radeon_emit(cs, cb->buffer_size -1); /* RESOURCEi_WORD1 */
radeon_emit(cs, /* RESOURCEi_WORD2 */
S_030008_ENDIAN_SWAP(gs_ring_buffer ? ENDIAN_NONE : r600_endian_swap(32)) |
S_030008_STRIDE(gs_ring_buffer ? 4 : 16) |

@@ -65,7 +65,7 @@ static const struct debug_named_value r600_debug_options[] = {
static void r600_destroy_context(struct pipe_context *context)
{
struct r600_context *rctx = (struct r600_context *)context;
unsigned sh;
unsigned sh, i;
r600_isa_destroy(rctx->isa);
@@ -101,6 +101,10 @@ static void r600_destroy_context(struct pipe_context *context)
}
util_unreference_framebuffer_state(&rctx->framebuffer.state);
for (sh = 0; sh < PIPE_SHADER_TYPES; ++sh)
for (i = 0; i < PIPE_MAX_CONSTANT_BUFFERS; ++i)
rctx->b.b.set_constant_buffer(context, sh, i, NULL);
if (rctx->blitter) {
util_blitter_destroy(rctx->blitter);
}

@@ -1724,7 +1724,7 @@ static void r600_emit_constant_buffers(struct r600_context *rctx,
radeon_emit(cs, PKT3(PKT3_SET_RESOURCE, 7, 0));
radeon_emit(cs, (buffer_id_base + buffer_index) * 7);
radeon_emit(cs, offset); /* RESOURCEi_WORD0 */
radeon_emit(cs, rbuffer->b.b.width0 - offset - 1); /* RESOURCEi_WORD1 */
radeon_emit(cs, cb->buffer_size - 1); /* RESOURCEi_WORD1 */
radeon_emit(cs, /* RESOURCEi_WORD2 */
S_038008_ENDIAN_SWAP(gs_ring_buffer ? ENDIAN_NONE : r600_endian_swap(32)) |
S_038008_STRIDE(gs_ring_buffer ? 4 : 16));

@@ -554,15 +554,15 @@ static rvcn_dec_message_mpeg4_asp_vld_t get_mpeg4_msg(struct radeon_decoder *dec
result.vop_time_increment_resolution = pic->vop_time_increment_resolution;
result.short_video_header |= pic->short_video_header << 0;
result.interlaced |= pic->interlaced << 2;
result.load_intra_quant_mat |= 1 << 3;
result.load_nonintra_quant_mat |= 1 << 4;
result.quarter_sample |= pic->quarter_sample << 5;
result.complexity_estimation_disable |= 1 << 6;
result.resync_marker_disable |= pic->resync_marker_disable << 7;
result.newpred_enable |= 0 << 10; //
result.reduced_resolution_vop_enable |= 0 << 11;
result.short_video_header = pic->short_video_header;
result.interlaced = pic->interlaced;
result.load_intra_quant_mat = 1;
result.load_nonintra_quant_mat = 1;
result.quarter_sample = pic->quarter_sample;
result.complexity_estimation_disable = 1;
result.resync_marker_disable = pic->resync_marker_disable;
result.newpred_enable = 0;
result.reduced_resolution_vop_enable = 0;
result.quant_type = pic->quant_type;

@@ -470,12 +470,19 @@ static int si_get_shader_param(struct pipe_screen* pscreen,
case PIPE_SHADER_CAP_INDIRECT_INPUT_ADDR:
/* TODO: Indirect indexing of GS inputs is unimplemented. */
return shader != PIPE_SHADER_GEOMETRY &&
(sscreen->llvm_has_working_vgpr_indexing ||
/* TCS and TES load inputs directly from LDS or
* offchip memory, so indirect indexing is trivial. */
shader == PIPE_SHADER_TESS_CTRL ||
shader == PIPE_SHADER_TESS_EVAL);
if (shader == PIPE_SHADER_GEOMETRY)
return 0;
if (shader == PIPE_SHADER_VERTEX &&
!sscreen->llvm_has_working_vgpr_indexing)
return 0;
/* TCS and TES load inputs directly from LDS or offchip
* memory, so indirect indexing is always supported.
* PS has to support indirect indexing, because we can't
* lower that to TEMPs for INTERP instructions.
*/
return 1;
case PIPE_SHADER_CAP_INDIRECT_OUTPUT_ADDR:
return sscreen->llvm_has_working_vgpr_indexing ||

@@ -26,6 +26,7 @@
#include "si_shader_internal.h"
#include "sid.h"
#include "radeon/r600_cs.h"
#include "radeon/radeon_uvd.h"
#include "util/hash_table.h"
#include "util/u_log.h"
@@ -333,9 +334,6 @@ static struct pipe_context *si_create_context(struct pipe_screen *screen,
sctx->sample_mask.sample_mask = 0xffff;
/* these must be last */
si_begin_new_cs(sctx);
if (sctx->b.chip_class >= GFX9) {
sctx->wait_mem_scratch = (struct r600_resource*)
pipe_buffer_create(screen, 0, PIPE_USAGE_DEFAULT, 4);
@@ -351,6 +349,9 @@ static struct pipe_context *si_create_context(struct pipe_screen *screen,
radeon_emit(cs, sctx->wait_mem_scratch->gpu_address);
radeon_emit(cs, sctx->wait_mem_scratch->gpu_address >> 32);
radeon_emit(cs, sctx->wait_mem_number);
radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
sctx->wait_mem_scratch,
RADEON_USAGE_WRITE, RADEON_PRIO_FENCE);
}
/* CIK cannot unbind a constant buffer (S_BUFFER_LOAD doesn't skip loads
@@ -423,6 +424,8 @@ static struct pipe_context *si_create_context(struct pipe_screen *screen,
util_dynarray_init(&sctx->resident_img_needs_color_decompress, NULL);
util_dynarray_init(&sctx->resident_tex_needs_depth_decompress, NULL);
/* this must be last */
si_begin_new_cs(sctx);
return &sctx->b.b;
fail:
fprintf(stderr, "radeonsi: Failed to create a context.\n");

@@ -2106,7 +2106,7 @@ svga_is_format_supported(struct pipe_screen *screen,
if (!ss->sws->have_vgpu10 &&
util_format_is_srgb(format) &&
(bindings & PIPE_BIND_DISPLAY_TARGET)) {
(bindings & (PIPE_BIND_DISPLAY_TARGET | PIPE_BIND_RENDER_TARGET))) {
/* We only support sRGB rendering with vgpu10 */
return FALSE;
}

@@ -93,6 +93,7 @@ namespace clover {
/// Free any resources that were allocated in bind().
virtual void unbind(exec_context &ctx) = 0;
virtual ~argument() {};
protected:
argument();

@@ -934,7 +934,7 @@ static OMX_ERRORTYPE enc_LoadImage(omx_base_PortType *port, OMX_BUFFERHEADERTYPE
blit.src.resource = inp->resource;
blit.src.format = inp->resource->format;
blit.src.box.x = 0;
blit.src.box.x = -1;
blit.src.box.y = def->nFrameHeight;
blit.src.box.width = def->nFrameWidth;
blit.src.box.height = def->nFrameHeight / 2 ;
@@ -948,11 +948,11 @@ static OMX_ERRORTYPE enc_LoadImage(omx_base_PortType *port, OMX_BUFFERHEADERTYPE
blit.dst.box.depth = 1;
blit.filter = PIPE_TEX_FILTER_NEAREST;
blit.mask = PIPE_MASK_G;
blit.mask = PIPE_MASK_R;
priv->s_pipe->blit(priv->s_pipe, &blit);
blit.src.box.x = 1;
blit.mask = PIPE_MASK_R;
blit.src.box.x = 0;
blit.mask = PIPE_MASK_G;
priv->s_pipe->blit(priv->s_pipe, &blit);
priv->s_pipe->flush(priv->s_pipe, NULL, 0);

@@ -23,11 +23,10 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \
$(LIBELF_LIBS) \
$(DLOPEN_LIBS) \
-lclangCodeGen \
-lclangFrontendTool \
-lclangFrontend \
-lclangFrontendTool \
-lclangDriver \
-lclangSerialization \
-lclangCodeGen \
-lclangParse \
-lclangSema \
-lclangAnalysis \

@@ -220,8 +220,13 @@ static bool amdgpu_winsys_unref(struct radeon_winsys *rws)
simple_mtx_lock(&dev_tab_mutex);
destroy = pipe_reference(&ws->reference, NULL);
if (destroy && dev_tab)
if (destroy && dev_tab) {
util_hash_table_remove(dev_tab, ws->dev);
if (util_hash_table_count(dev_tab) == 0) {
util_hash_table_destroy(dev_tab);
dev_tab = NULL;
}
}
simple_mtx_unlock(&dev_tab_mutex);
return destroy;

@@ -716,8 +716,13 @@ static bool radeon_winsys_unref(struct radeon_winsys *ws)
mtx_lock(&fd_tab_mutex);
destroy = pipe_reference(&rws->reference, NULL);
if (destroy && fd_tab)
if (destroy && fd_tab) {
util_hash_table_remove(fd_tab, intptr_to_pointer(rws->fd));
if (util_hash_table_count(fd_tab) == 0) {
util_hash_table_destroy(fd_tab);
fd_tab = NULL;
}
}
mtx_unlock(&fd_tab_mutex);
return destroy;

@@ -71,10 +71,12 @@ EXTRA_DIST += \
vulkan/TODO
vulkan/dev_icd.json : vulkan/anv_extensions.py vulkan/anv_icd.py
$(MKDIR_GEN)
$(AM_V_GEN)$(PYTHON2) $(srcdir)/vulkan/anv_icd.py \
--lib-path="${abs_top_builddir}/${LIB_DIR}" --out $@
vulkan/intel_icd.@host_cpu@.json : vulkan/anv_extensions.py vulkan/anv_icd.py
$(MKDIR_GEN)
$(AM_V_GEN)$(PYTHON2) $(srcdir)/vulkan/anv_icd.py \
--lib-path="${libdir}" --out $@

@@ -816,6 +816,8 @@ fs_inst::size_read(int arg) const
case SHADER_OPCODE_TYPED_ATOMIC:
case SHADER_OPCODE_TYPED_SURFACE_READ:
case SHADER_OPCODE_TYPED_SURFACE_WRITE:
case FS_OPCODE_INTERPOLATE_AT_SAMPLE:
case FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET:
case FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET:
case SHADER_OPCODE_BYTE_SCATTERED_WRITE:
case SHADER_OPCODE_BYTE_SCATTERED_READ:

@@ -332,6 +332,31 @@ public:
opcode != BRW_OPCODE_IF &&
opcode != BRW_OPCODE_WHILE));
}
bool reads_g0_implicitly() const
{
switch (opcode) {
case SHADER_OPCODE_TEX:
case SHADER_OPCODE_TXL:
case SHADER_OPCODE_TXD:
case SHADER_OPCODE_TXF:
case SHADER_OPCODE_TXF_CMS_W:
case SHADER_OPCODE_TXF_CMS:
case SHADER_OPCODE_TXF_MCS:
case SHADER_OPCODE_TXS:
case SHADER_OPCODE_TG4:
case SHADER_OPCODE_TG4_OFFSET:
case SHADER_OPCODE_SAMPLEINFO:
case VS_OPCODE_PULL_CONSTANT_LOAD:
case GS_OPCODE_SET_PRIMITIVE_ID:
case GS_OPCODE_GET_INSTANCE_ID:
case SHADER_OPCODE_GEN4_SCRATCH_READ:
case SHADER_OPCODE_GEN4_SCRATCH_WRITE:
return true;
default:
return false;
}
}
};
/**

@@ -647,7 +647,7 @@ static inline struct brw_reg
brw_imm_w(int16_t w)
{
struct brw_reg imm = brw_imm_reg(BRW_REGISTER_TYPE_W);
imm.d = w | (w << 16);
imm.ud = (uint16_t)w | (uint32_t)(uint16_t)w << 16;
return imm;
}
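
The `brw_imm_w` fix above is about integer promotion: a negative `int16_t` is sign-extended to `int` before the OR, so its upper 16 bits are already all ones and the copy shifted into the high half has no effect. The sketch below contrasts the two behaviours using hypothetical helper names; the "sign-extended" variant models the old expression with unsigned arithmetic so that it avoids the undefined left shift of a negative value.

```c
#include <stdint.h>

/* Models the old expression `w | (w << 16)`: w is promoted with sign
 * extension, so negative values set every upper bit. Unsigned arithmetic
 * is used here only to keep the shift well defined. */
static uint32_t replicate_w_sign_extended(int16_t w)
{
    uint32_t promoted = (uint32_t)(int32_t)w;
    return promoted | (promoted << 16);
}

/* Models the fixed expression: truncate to 16 bits first, then copy the
 * same 16-bit pattern into both halves of the 32-bit immediate. */
static uint32_t replicate_w(int16_t w)
{
    uint16_t bits = (uint16_t)w;
    return (uint32_t)bits | ((uint32_t)bits << 16);
}
```

For `w = -2` the old form gives `0xFFFFFFFE` while the fixed form gives `0xFFFEFFFE`; for non-negative values both agree.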

@@ -1267,6 +1267,9 @@ vec4_instruction_scheduler::calculate_deps()
}
}
if (inst->reads_g0_implicitly())
add_dep(last_fixed_grf_write, n);
if (!inst->is_send_from_grf()) {
for (int i = 0; i < inst->mlen; i++) {
/* It looks like the MRF regs are released in the send

@@ -566,9 +566,11 @@ brw_negate_immediate(enum brw_reg_type type, struct brw_reg *reg)
reg->d = -reg->d;
return true;
case BRW_REGISTER_TYPE_W:
case BRW_REGISTER_TYPE_UW:
reg->d = -(int16_t)reg->ud;
case BRW_REGISTER_TYPE_UW: {
uint16_t value = -(int16_t)reg->ud;
reg->ud = value | (uint32_t)value << 16;
return true;
}
case BRW_REGISTER_TYPE_F:
reg->f = -reg->f;
return true;
@@ -602,9 +604,11 @@ brw_abs_immediate(enum brw_reg_type type, struct brw_reg *reg)
case BRW_REGISTER_TYPE_D:
reg->d = abs(reg->d);
return true;
case BRW_REGISTER_TYPE_W:
reg->d = abs((int16_t)reg->ud);
case BRW_REGISTER_TYPE_W: {
uint16_t value = abs((int16_t)reg->ud);
reg->ud = value | (uint32_t)value << 16;
return true;
}
case BRW_REGISTER_TYPE_F:
reg->f = fabsf(reg->f);
return true;

@@ -508,12 +508,12 @@ anv_block_pool_grow(struct anv_block_pool *pool, struct anv_block_state *state)
assert(center_bo_offset >= back_used);
/* Make sure we don't shrink the back end of the pool */
if (center_bo_offset < pool->back_state.end)
center_bo_offset = pool->back_state.end;
if (center_bo_offset < back_required)
center_bo_offset = back_required;
/* Make sure that we don't shrink the front end of the pool */
if (size - center_bo_offset < pool->state.end)
center_bo_offset = size - pool->state.end;
if (size - center_bo_offset < front_required)
center_bo_offset = size - front_required;
}
assert(center_bo_offset % PAGE_SIZE == 0);

@@ -2282,6 +2282,10 @@ anv_image_aspect_get_planes(VkImageAspectFlags aspect_mask)
if (aspect_mask & VK_IMAGE_ASPECT_PLANE_2_BIT_KHR)
planes++;
if ((aspect_mask & VK_IMAGE_ASPECT_DEPTH_BIT) != 0 &&
(aspect_mask & VK_IMAGE_ASPECT_STENCIL_BIT) != 0)
planes++;
return planes;
}

@@ -875,10 +875,10 @@ genX(cmd_buffer_setup_attachments)(struct anv_cmd_buffer *cmd_buffer,
struct anv_image_view *iview = framebuffer->attachments[i];
anv_assert(iview->vk_format == att->format);
anv_assert(iview->n_planes == 1);
union isl_color_value clear_color = { .u32 = { 0, } };
if (att_aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV) {
anv_assert(iview->n_planes == 1);
assert(att_aspects == VK_IMAGE_ASPECT_COLOR_BIT);
color_attachment_compute_aux_usage(cmd_buffer->device,
state, i, begin->renderArea,
@@ -1048,10 +1048,18 @@ genX(BeginCommandBuffer)(
* context restore, so the mentioned hang doesn't happen. However,
* software must program push constant commands for all stages prior to
* rendering anything. So we flag them dirty in BeginCommandBuffer.
*
* Finally, we also make sure to stall at pixel scoreboard to make sure the
* constants have been loaded into the EUs prior to disable the push constants
* so that it doesn't hang a previous 3DPRIMITIVE.
*/
static void
emit_isp_disable(struct anv_cmd_buffer *cmd_buffer)
{
anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
pc.StallAtPixelScoreboard = true;
pc.CommandStreamerStallEnable = true;
}
anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
pc.IndirectStatePointersDisable = true;
pc.CommandStreamerStallEnable = true;

@@ -1077,7 +1077,6 @@ intelDestroyContext(__DRIcontext * driContextPriv)
struct brw_context *brw =
(struct brw_context *) driContextPriv->driverPrivate;
struct gl_context *ctx = &brw->ctx;
const struct gen_device_info *devinfo = &brw->screen->devinfo;
_mesa_meta_free(&brw->ctx);
@@ -1089,8 +1088,7 @@ intelDestroyContext(__DRIcontext * driContextPriv)
brw_destroy_shader_time(brw);
}
if (devinfo->gen >= 6)
blorp_finish(&brw->blorp);
blorp_finish(&brw->blorp);
brw_destroy_state(brw);
brw_draw_destroy(brw);

@@ -349,10 +349,18 @@ gen7_emit_vs_workaround_flush(struct brw_context *brw)
* context restore, so the mentioned hang doesn't happen. However,
* software must program push constant commands for all stages prior to
* rendering anything, so we flag them as dirty.
*
* Finally, we also make sure to stall at pixel scoreboard to make sure the
* constants have been loaded into the EUs prior to disable the push constants
* so that it doesn't hang a previous 3DPRIMITIVE.
*/
void
gen10_emit_isp_disable(struct brw_context *brw)
{
brw_emit_pipe_control(brw,
PIPE_CONTROL_STALL_AT_SCOREBOARD |
PIPE_CONTROL_CS_STALL,
NULL, 0, 0);
brw_emit_pipe_control(brw,
PIPE_CONTROL_ISP_DIS |
PIPE_CONTROL_CS_STALL,

@@ -339,8 +339,11 @@ grow_buffer(struct brw_context *brw,
/* We can't safely use realloc, as it may move the existing buffer,
* breaking existing pointers the caller may still be using. Just
* malloc a new copy and memcpy it like the normal BO path.
*
* Use bo->size rather than new_size because the bufmgr may have
* rounded up the size, and we want the shadow size to match.
*/
grow->map = malloc(new_size);
grow->map = malloc(new_bo->size);
} else {
grow->map = brw_bo_map(brw, new_bo, MAP_READ | MAP_WRITE);
}

@@ -871,7 +871,7 @@ intelCompressedTexSubImage(struct gl_context *ctx, GLuint dims,
!_mesa_is_srgb_format(gl_format);
struct brw_context *brw = (struct brw_context*) ctx;
const struct gen_device_info *devinfo = &brw->screen->devinfo;
if (devinfo->gen == 9 && is_linear_astc)
if (devinfo->gen == 9 && !gen_device_info_is_9lp(devinfo) && is_linear_astc)
flush_astc_denorms(ctx, dims, texImage,
xoffset, yoffset, zoffset,
width, height, depth);

@@ -501,6 +501,28 @@ debug_clear_group(struct gl_debug_state *debug)
debug->Groups[gstack] = NULL;
}
/**
* Delete the oldest debug messages out of the log.
*/
static void
debug_delete_messages(struct gl_debug_state *debug, int count)
{
struct gl_debug_log *log = &debug->Log;
if (count > log->NumMessages)
count = log->NumMessages;
while (count--) {
struct gl_debug_message *msg = &log->Messages[log->NextMessage];
debug_message_clear(msg);
log->NumMessages--;
log->NextMessage++;
log->NextMessage %= MAX_DEBUG_LOGGED_MESSAGES;
}
}
/**
* Loop through debug group stack tearing down states for
* filtering debug messages. Then free debug output state.
@@ -514,6 +536,7 @@ debug_destroy(struct gl_debug_state *debug)
}
debug_clear_group(debug);
debug_delete_messages(debug, debug->Log.NumMessages);
free(debug);
}
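
`debug_delete_messages`, which `debug_destroy` now calls, drops the oldest entries of a fixed-size circular log by advancing the read index modulo the capacity. A stand-alone sketch of that bookkeeping is shown below. The struct, the capacity of 4, and the function names are all hypothetical, and the early return for non-positive counts is an extra guard that the original function does not have.

```c
#define LOG_CAPACITY 4 /* hypothetical; stands in for MAX_DEBUG_LOGGED_MESSAGES */

struct msg_log {
    int messages[LOG_CAPACITY];
    int next;  /* index of the oldest stored message */
    int count; /* number of stored messages */
};

/* Appends a message after the newest one; returns 0 if the log is full. */
static int log_push(struct msg_log *log, int value)
{
    if (log->count == LOG_CAPACITY)
        return 0;
    log->messages[(log->next + log->count) % LOG_CAPACITY] = value;
    log->count++;
    return 1;
}

/* Drops up to n of the oldest messages, wrapping the read index. */
static void log_drop_oldest(struct msg_log *log, int n)
{
    if (n <= 0)
        return;
    if (n > log->count)
        n = log->count;
    while (n--) {
        log->messages[log->next] = 0; /* stands in for debug_message_clear() */
        log->count--;
        log->next = (log->next + 1) % LOG_CAPACITY;
    }
}
```

Passing the current message count as `n`, as the destroy path does, empties the log regardless of where the read index currently sits.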
@@ -648,28 +671,6 @@ debug_fetch_message(const struct gl_debug_state *debug)
return (log->NumMessages) ? &log->Messages[log->NextMessage] : NULL;
}
/**
* Delete the oldest debug messages out of the log.
*/
static void
debug_delete_messages(struct gl_debug_state *debug, int count)
{
struct gl_debug_log *log = &debug->Log;
if (count > log->NumMessages)
count = log->NumMessages;
while (count--) {
struct gl_debug_message *msg = &log->Messages[log->NextMessage];
debug_message_clear(msg);
log->NumMessages--;
log->NextMessage++;
log->NextMessage %= MAX_DEBUG_LOGGED_MESSAGES;
}
}
static struct gl_debug_message *
debug_get_group_message(struct gl_debug_state *debug)
{

@@ -1488,45 +1488,66 @@ _mesa_FramebufferParameteri(GLenum target, GLenum pname, GLint param)
}
static bool
_pname_valid_for_default_framebuffer(struct gl_context *ctx,
GLenum pname)
validate_get_framebuffer_parameteriv_pname(struct gl_context *ctx,
struct gl_framebuffer *fb,
GLuint pname, const char *func)
{
if (!_mesa_is_desktop_gl(ctx))
return false;
bool cannot_be_winsys_fbo = true;
switch (pname) {
case GL_FRAMEBUFFER_DEFAULT_LAYERS:
/*
* According to the OpenGL ES 3.1 specification section 9.2.3, the
* GL_FRAMEBUFFER_LAYERS parameter name is not supported.
*/
if (_mesa_is_gles31(ctx) && !ctx->Extensions.OES_geometry_shader) {
_mesa_error(ctx, GL_INVALID_ENUM, "%s(pname=0x%x)", func, pname);
return false;
}
break;
case GL_FRAMEBUFFER_DEFAULT_WIDTH:
case GL_FRAMEBUFFER_DEFAULT_HEIGHT:
case GL_FRAMEBUFFER_DEFAULT_SAMPLES:
case GL_FRAMEBUFFER_DEFAULT_FIXED_SAMPLE_LOCATIONS:
break;
case GL_DOUBLEBUFFER:
case GL_IMPLEMENTATION_COLOR_READ_FORMAT:
case GL_IMPLEMENTATION_COLOR_READ_TYPE:
case GL_SAMPLES:
case GL_SAMPLE_BUFFERS:
case GL_STEREO:
return true;
/* From OpenGL 4.5 spec, section 9.2.3 "Framebuffer Object Queries:
*
* "An INVALID_OPERATION error is generated by GetFramebufferParameteriv
* if the default framebuffer is bound to target and pname is not one
* of the accepted values from table 23.73, other than
* SAMPLE_POSITION."
*
* For OpenGL ES, using default framebuffer raises INVALID_OPERATION
* for any pname.
*/
cannot_be_winsys_fbo = !_mesa_is_desktop_gl(ctx);
break;
default:
_mesa_error(ctx, GL_INVALID_ENUM, "%s(pname=0x%x)", func, pname);
return false;
}
if (cannot_be_winsys_fbo && _mesa_is_winsys_fbo(fb)) {
_mesa_error(ctx, GL_INVALID_OPERATION,
"%s(invalid pname=0x%x for default framebuffer)", func, pname);
return false;
}
return true;
}
static void
get_framebuffer_parameteriv(struct gl_context *ctx, struct gl_framebuffer *fb,
GLenum pname, GLint *params, const char *func)
{
/* From OpenGL 4.5 spec, section 9.2.3 "Framebuffer Object Queries:
*
* "An INVALID_OPERATION error is generated by GetFramebufferParameteriv
* if the default framebuffer is bound to target and pname is not one
* of the accepted values from table 23.73, other than
* SAMPLE_POSITION."
*
* For OpenGL ES, using default framebuffer still raises INVALID_OPERATION
* for any pname.
*/
if (_mesa_is_winsys_fbo(fb) &&
!_pname_valid_for_default_framebuffer(ctx, pname)) {
_mesa_error(ctx, GL_INVALID_OPERATION,
"%s(invalid pname=0x%x for default framebuffer)", func, pname);
if (!validate_get_framebuffer_parameteriv_pname(ctx, fb, pname, func))
return;
}
switch (pname) {
case GL_FRAMEBUFFER_DEFAULT_WIDTH:
@@ -1536,14 +1557,6 @@ get_framebuffer_parameteriv(struct gl_context *ctx, struct gl_framebuffer *fb,
*params = fb->DefaultGeometry.Height;
break;
case GL_FRAMEBUFFER_DEFAULT_LAYERS:
/*
* According to the OpenGL ES 3.1 specification section 9.2.3, the
* GL_FRAMEBUFFER_LAYERS parameter name is not supported.
*/
if (_mesa_is_gles31(ctx) && !ctx->Extensions.OES_geometry_shader) {
_mesa_error(ctx, GL_INVALID_ENUM, "%s(pname=0x%x)", func, pname);
break;
}
*params = fb->DefaultGeometry.Layers;
break;
case GL_FRAMEBUFFER_DEFAULT_SAMPLES:
@@ -1570,9 +1583,6 @@ get_framebuffer_parameteriv(struct gl_context *ctx, struct gl_framebuffer *fb,
case GL_STEREO:
*params = fb->Visual.stereoMode;
break;
default:
_mesa_error(ctx, GL_INVALID_ENUM,
"%s(pname=0x%x)", func, pname);
}
}
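
The validation order in the hunk above (an unknown pname gives INVALID_ENUM, then a pname not allowed on the default framebuffer gives INVALID_OPERATION) can be sketched in isolation. This is a minimal illustration, not Mesa's code: the `sketch_*` enums and function are hypothetical stand-ins, and the initial value of `cannot_be_winsys_fbo` is an assumption, since its declaration is not visible in this diff.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a few GL pnames and error codes (not Mesa's). */
enum sketch_pname {
    SKETCH_DEFAULT_WIDTH,
    SKETCH_DEFAULT_SAMPLES,
    SKETCH_DOUBLEBUFFER,
    SKETCH_STEREO,
    SKETCH_UNKNOWN
};

enum sketch_error {
    SKETCH_NO_ERROR,
    SKETCH_INVALID_ENUM,
    SKETCH_INVALID_OPERATION
};

enum sketch_error
sketch_validate_pname(enum sketch_pname pname, bool is_desktop_gl,
                      bool is_default_fb)
{
    /* Assumption: a pname is rejected on the default framebuffer unless
     * a case below relaxes that. */
    bool cannot_be_winsys_fbo = true;

    switch (pname) {
    case SKETCH_DEFAULT_WIDTH:
    case SKETCH_DEFAULT_SAMPLES:
        /* Only meaningful for user-created framebuffer objects. */
        break;
    case SKETCH_DOUBLEBUFFER:
    case SKETCH_STEREO:
        /* Desktop GL allows these on the default framebuffer; ES does not. */
        cannot_be_winsys_fbo = !is_desktop_gl;
        break;
    default:
        /* Unknown pname: reported before any framebuffer check. */
        return SKETCH_INVALID_ENUM;
    }

    if (cannot_be_winsys_fbo && is_default_fb)
        return SKETCH_INVALID_OPERATION;

    return SKETCH_NO_ERROR;
}
```

With this shape, `sketch_validate_pname(SKETCH_STEREO, true, true)` passes, while the same query with `is_desktop_gl` set to false is rejected with the INVALID_OPERATION stand-in.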


@@ -62,6 +62,7 @@ _mesa_Fogiv(GLenum pname, const GLint *params )
case GL_FOG_END:
case GL_FOG_INDEX:
case GL_FOG_COORDINATE_SOURCE_EXT:
case GL_FOG_DISTANCE_MODE_NV:
p[0] = (GLfloat) *params;
break;
case GL_FOG_COLOR:


@@ -732,6 +732,6 @@ endif
if with_glx == 'xlib'
subdir('drivers/x11')
endif
if with_tests
if with_tests and dri_drivers != []
subdir('main/tests')
endif


@@ -7049,6 +7049,11 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
} while (progress);
}
/* Do this again to lower ir_binop_vector_extract introduced
* by optimization passes.
*/
do_vec_index_to_cond_assign(ir);
validate_ir_tree(ir);
}


@@ -311,6 +311,7 @@ util_queue_init(struct util_queue *queue,
goto fail;
(void) mtx_init(&queue->lock, mtx_plain);
(void) mtx_init(&queue->finish_lock, mtx_plain);
queue->num_queued = 0;
cnd_init(&queue->has_queued_cond);
@@ -398,6 +399,7 @@ util_queue_destroy(struct util_queue *queue)
cnd_destroy(&queue->has_space_cond);
cnd_destroy(&queue->has_queued_cond);
mtx_destroy(&queue->finish_lock);
mtx_destroy(&queue->lock);
free(queue->jobs);
free(queue->threads);
@@ -529,6 +531,12 @@ util_queue_finish(struct util_queue *queue)
util_barrier_init(&barrier, queue->num_threads);
/* If 2 threads were adding jobs for 2 different barriers at the same time,
* a deadlock would happen, because 1 barrier requires that all threads
* wait for it exclusively.
*/
mtx_lock(&queue->finish_lock);
for (unsigned i = 0; i < queue->num_threads; ++i) {
util_queue_fence_init(&fences[i]);
util_queue_add_job(queue, &barrier, &fences[i], util_queue_finish_execute, NULL);
@@ -538,6 +546,7 @@ util_queue_finish(struct util_queue *queue)
util_queue_fence_wait(&fences[i]);
util_queue_fence_destroy(&fences[i]);
}
mtx_unlock(&queue->finish_lock);
util_barrier_destroy(&barrier);
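
The locking pattern in this hunk, a dedicated mutex held across the whole "add one barrier job per thread, then wait" sequence, can be sketched with plain pthreads. This is a simplified illustration under stated assumptions, not the util_queue implementation: `mini_queue` and its functions are hypothetical, the barrier jobs are reduced to a counter, and the `assert` only documents the invariant that at most one finish is in flight.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical, heavily reduced stand-in for a job queue. */
struct mini_queue {
    pthread_mutex_t finish_lock; /* held for all of mini_queue_finish */
    pthread_mutex_t lock;        /* protects the counters below */
    unsigned num_threads;
    unsigned active_finishes;    /* finish calls currently in flight */
    unsigned barrier_jobs_added; /* total barrier jobs "queued" so far */
};

void mini_queue_init(struct mini_queue *q, unsigned num_threads)
{
    pthread_mutex_init(&q->finish_lock, NULL);
    pthread_mutex_init(&q->lock, NULL);
    q->num_threads = num_threads;
    q->active_finishes = 0;
    q->barrier_jobs_added = 0;
}

void mini_queue_finish(struct mini_queue *q)
{
    /* Serialize whole finish operations so barrier jobs belonging to two
     * different callers can never interleave in the queue. */
    pthread_mutex_lock(&q->finish_lock);

    pthread_mutex_lock(&q->lock);
    q->active_finishes++;
    assert(q->active_finishes == 1); /* invariant provided by finish_lock */
    pthread_mutex_unlock(&q->lock);

    /* Stand-in for adding one barrier job per worker thread. */
    for (unsigned i = 0; i < q->num_threads; ++i) {
        pthread_mutex_lock(&q->lock);
        q->barrier_jobs_added++;
        pthread_mutex_unlock(&q->lock);
    }

    /* A real queue would wait on one fence per job here. */

    pthread_mutex_lock(&q->lock);
    q->active_finishes--;
    pthread_mutex_unlock(&q->lock);

    pthread_mutex_unlock(&q->finish_lock);
}

void mini_queue_destroy(struct mini_queue *q)
{
    pthread_mutex_destroy(&q->finish_lock);
    pthread_mutex_destroy(&q->lock);
}
```

Keeping `finish_lock` separate from the general `lock` means ordinary job submission does not have to wait for a finish in progress; only concurrent finish calls are serialized against each other.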


@@ -200,6 +200,7 @@ struct util_queue_job {
/* Put this into your context. */
struct util_queue {
const char *name;
mtx_t finish_lock; /* only for util_queue_finish */
mtx_t lock;
cnd_t has_queued_cond;
cnd_t has_space_cond;