Compare commits

...

379 Commits

Author SHA1 Message Date
Andres Gomez
ae52410bf0 docs: add release notes for 17.2.5
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-11-10 15:33:58 +02:00
Andres Gomez
2c582b4cf7 Update version to 17.2.5
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-11-10 15:24:51 +02:00
Andres Gomez
0195b78aa7 cherry-ignore: automake: include git_sha1.h.in in release tarball
fixes: This commit has more than one Fixes tag but the commit it
addresses didn't land in branch.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-11-07 18:45:28 +02:00
Andres Gomez
a23bd4ea87 cherry-ignore: added 17.3 nominations.
stable: 17.3 nominations only.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-11-07 18:44:48 +02:00
Andres Gomez
27cd0abe8e cherry-ignore: intel/fs: Alloc pull constants off mem_ctx
stable: This commit addressed earlier commit 8d90e28839 which did not
land in branch.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-11-07 18:44:44 +02:00
Andres Gomez
b5dc8c43aa cherry-ignore: etnaviv: don't do resolve-in-place without valid TS
stable: This commit addressed earlier commit 78ade65956 which did not
land in branch.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-11-07 18:44:38 +02:00
Andres Gomez
526ecbe417 cherry-ignore: i965: fix blorp stage_prog_data->param leak
stable: This commit addressed earlier commit 8d90e28839 which did not
land in branch.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-11-07 18:42:55 +02:00
Andres Gomez
0d91d13564 cherry-ignore: radv: copy indirect lowering settings from radeonsi
fixes: remove 6ce550453f and 059434e1763, which were depending in now
backported 087e010b2b.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-11-07 18:35:54 +02:00
Tomasz Figa
fd0cf2c946 glsl: Allow precision mismatch on dead data with GLSL ES 1.00
Commit 259fc50545 added linker error for
mismatching uniform precision, as required by GLES 3.0 specification and
conformance test-suite.

Several Android applications, including Forge of Empires, have shaders
which violate this rule, on a dead varying that will be eliminated.
The problem affects a big number of applications using Cocos2D engine
and other GLES implementations accept this, this poses a serious
application compatibility issue.

Starting from GLSL ES 3.0, declarations with conflicting precision
qualifiers are explicitly prohibited. However GLSL ES 1.00 does not
clearly specify the behavior, except that

  "Uniforms are defined to behave as if they are using the same storage in
  the vertex and fragment processors and may be implemented this way.
  If uniforms are used in both the vertex and fragment shaders, developers
  should be warned if the precisions are different. Conversion of
  precision should never be implicit."

The word "used" is not clear in this context and might refer to
 1) declared (same as GLES 3.x)
 2) referred after post-processing, or
 3) linked after all optimizations are done.

Looking at existing applications, 2) or 3) seems to be widely adopted.
To avoid compatibility issues, turn the error into a warning if GLSL ES
version is lower than 3.0 and the data is dead in at least one of the
shaders.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97532
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 0886be093f)
2017-11-07 18:35:54 +02:00
Bas Nieuwenhuizen
6a73458510 radv: Disallow indirect outputs for GS on GFX9 as well.
Since it also uses the output vector before writing to memory.

Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit c07d719e8b)
[Bas Nieuwenhuizen: resolve conflicts]

Conflicts:
        src/amd/vulkan/radv_shader.c
2017-11-07 18:35:54 +02:00
Bas Nieuwenhuizen
9ba45e7d33 radv: Don't use vgpr indexing for outputs on GFX9.
Due to LLVM bugs. Fixes a bunch of dEQP-VK.glsl.indexing.*
tests.

Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 6ce550453f)
[Bas Nieuwenhuizen: resolve conflicts]

Conflicts:
        src/amd/vulkan/radv_shader.c
2017-11-07 18:35:54 +02:00
Timothy Arceri
662cff8fe4 radv: copy indirect lowering settings from radeonsi
It looks the original indirect mask was probably copied from
ANV.

Sascha Willems demo results:

tessellation ~4000 -> ~4200 fps

V2: continue lowering local indirects due to llvm deficiencies.

(cherry picked from commit 087e010b2b)
[Bas Nieuwenhuizen: patch is a backport for 17.2 of the cherry-pick above]
2017-11-07 18:35:54 +02:00
Dave Airlie
0b0b7f1833 radv: add initial copy descriptor support. (v2)
It appears the latest dota2 vulkan uses this,
and we get a hang in VR mode without it.

v2: remove finishme I left in after finishing.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4bcb48b831)
2017-11-07 18:35:54 +02:00
Dave Airlie
23eaeeb88a radv: free attachments on end command buffer.
If we allocate attachments in the begin command buffer due to the
render pass continue bit, we were leaking them.

Since renderpasses inside a cmd buffer malloc/free these properly,
and set to NULL, we just need to call free at end.

Fixes a memory leak with multithreading demo.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit f0ae06a13c)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/amd/vulkan/radv_cmd_buffer.c
2017-11-07 18:35:54 +02:00
Dave Airlie
d8e0f66b55 i915g: make gears run again.
We need to validate some structs exist before we dirty the states, and
avoid the problem in some other places.

Fixes: e027935a7 ("st/mesa: don't update unrelated states in non-draw calls such as Clear")
(cherry picked from commit cc69f2385e)
2017-11-07 18:35:54 +02:00
Bas Nieuwenhuizen
bd2037da82 radv: Don't expose heaps with 0 memory.
It confuses CTS. This pregenerates the heap info into the
physical device, so we can use it for translating contiguous
indices into our "standard" ones.

This also makes the WSI a bit smarter in case the first preferred
heap does not exist.

Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 806721429a)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/amd/vulkan/radv_device.c
	src/amd/vulkan/radv_private.h
	src/amd/vulkan/radv_wsi.c
2017-11-07 18:35:54 +02:00
Eric Engestrom
388b68a61e vc4: fix release build
Mesa's DEBUG and assert's NDEBUG are not tied to each other, so we need
to explicitly compile this code out.

Fixes: 3df7892878 "vc4: Drop reloc_count tracking for debug
       asserts on non-debug builds."
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 5d44e35a8f)
2017-11-07 18:35:54 +02:00
Gert Wollny
71a33028ba r600/sb: bail out if prepare_alu_group() doesn't find a proper scheduling
It is possible that the optimizer ends up in an infinite loop in
post_scheduler::schedule_alu(), because post_scheduler::prepare_alu_group()
does not find a proper scheduling. This can be deducted from
pending.count() being larger than zero and not getting smaller.

This patch works around this problem by signalling this failure so that the
optimizers bails out and the un-optimized shader is used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103142
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 69eee511c6)
2017-11-07 18:35:53 +02:00
Neil Roberts
f33f5e9ad6 nir/opt_intrinsics: Fix values for gl_SubGroupG{e,t}MaskARB
Previously the values were calculated by just shifting ~0 by the
invocation ID. This would end up including bits that are higher than
gl_SubGroupSizeARB. The corresponding CTS test effectively requires that
these high bits be zero so it was failing. There is a Piglit test as
well but this appears to checking the wrong values so it passes.

For the two greater-than bitmasks, this patch adds an extra mask with
(~0>>(64-gl_SubGroupSizeARB)) to force these bits to zero.

Fixes: KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102680#c3
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Neil Roberts <nroberts@igalia.com>
(cherry picked from commit b697ece10a)
2017-11-07 18:35:53 +02:00
Nanley Chery
af1dc1cf87 i965: Check CCS_E compatibility for texture view rendering
Only use CCS_E to render to a texture that is CCS_E-compatible with the
original texture's miptree (linear) format. This prevents render
operations from writing data that can't be decoded with the original
miptree format.

On Gen10, with the new CCS_E-enabled formats handled, this enables the
driver to pass the arb_texture_view-rendering-formats piglit test.

v2. Add a TODO for texturing. (Jason)

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 9e849eb8bb)
2017-11-07 18:35:53 +02:00
Topi Pohjolainen
3d7546a312 intel/compiler/gen9: Pixel shader header only workaround
Fixes intermittent GPU hangs on Broxton with an Intel internal
test case.

There are plenty of similar fragment shaders in piglit that do
not use any varyings and any uniforms. According to the
documentation special timing is needed between pipeline stages.
Apparently we just don't hit that with piglit. Even with the
failing test case one doesn't always get the hang.

Moreover, according to the error states the hang happens
significantly later than the execution of the problematic shader.
There are multiple render cycles (primitive submissions) in between.
I've also seen error states where the ACTHD points outside the
batch. Almost as if the hardware writes somewhere that gets used
later on. That would also explain why piglit doesn't suffer from
this - most tests kick off one render cycle and any corruption
is left unseen.

v2 (Ken): Instead of enabling push constants, enable one of the
          inputs (PSIZ).
v3 (Ken, Jason): Use LAYER instead making vulkan emit_3dstate_sbe()
                 happy.

Cc: "17.3 17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 97e01adfd5)
2017-11-07 18:35:53 +02:00
Kenneth Graunke
67fc6d43bd mesa: Accept GL_BACK in get_fb0_attachment with ARB_ES3_1_compatibility.
According to the ARB_ES3_1_compatibility specification,
glGetFramebufferAttachmentParameteriv is supposed to accept BACK,
and it behaves exactly like BACK_LEFT.

Fixes a GL error in GFXBench 5 Aztec Ruins.

Cc: "17.3 17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 4f538c3f99)
2017-11-07 18:35:53 +02:00
Tapani Pälli
71ea414782 i965: unref push_const_bo in intelDestroyContext
Valgrind shows that leak is caused by gen6_upload_push_constant, add
unref push_const_bo per stage to destructor to fix this (like done for
scratch_bo).

   ==10952== 144 bytes in 1 blocks are definitely lost in loss record 44 of 66
   ==10952==    at 0x4C30A1E: calloc (vg_replace_malloc.c:711)
   ==10952==    by 0x8C02847: bo_alloc_internal.constprop.10 (brw_bufmgr.c:344)
   ==10952==    by 0x8C425C4: intel_upload_space (intel_upload.c:101)
   ==10952==    by 0x8C22ED0: gen6_upload_push_constants (gen6_constant_state.c:154)

v2: remove if conditions, brw_bo_unreference handles NULL (Ken, Emil)

Fixes: 24891d7c05 ("i965: Store per-stage push constant BO pointers.")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 0b131ca427)
2017-11-07 18:35:53 +02:00
Jason Ekstrand
e7ddddc989 i965/miptree: Take an isl_format in render_aux_usage
Not all rendering matches the miptree format.  We allow rendering to
texture views so there are cases where it may not match.  In those
cases, our current scheme of just passing the value of ctx->sRGBEnabled
isn't viable.  Instead, just do what we do for texturing and pass the
view format in directly.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 39c5c12f8f)
[Andres Gomez: remove code which was trivially modified previously]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/mesa/drivers/dri/i965/brw_draw.c
	src/mesa/drivers/dri/i965/brw_wm_surface_state.c
2017-11-07 18:35:53 +02:00
Jason Ekstrand
6dc8e69da6 i965/blorp: Use more temporary isl_format variables
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 78e50185d6)
2017-11-07 18:35:53 +02:00
Jason Ekstrand
78ff359d04 i965/blorp: Use blorp_to_isl_format for src_isl_format in blit_miptrees
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 94389943b6)
2017-11-07 18:35:53 +02:00
Jason Ekstrand
9e27cdc771 spirv: Claim support for the simple memory model
It's rather surprising that we've never actually hit this before.
Aparently, Ian's SPIR-V generator currently claims the Simple when you
don't do anything complex.  We really shouldn't assert-fail on it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8ab9820d34)
2017-11-07 18:35:53 +02:00
Leo Liu
1317bdc3a1 radeon/video: add gfx9 offsets when rejoin the video surface
For CPU access.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit ea3dc75d72)
2017-11-07 18:35:53 +02:00
Nicolai Hähnle
e95c4af12d amd/common/gfx9: workaround DCC corruption more conservatively
Fixes KHR-GL45.texture_swizzle.smoke and others on Vega.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102809
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit f9ccfda9bc)
2017-11-07 18:35:53 +02:00
Marek Olšák
d132c0d792 ac/surface/gfx9: don't allow DCC for the smallest mipmap levels
This fixes garbage there if we don't flush TC L2 after rendering.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 759526813b)
2017-11-07 18:35:53 +02:00
Marek Olšák
e28870b9fe st/dri: don't expose modifiers in EGL if the driver doesn't implement them
This unbreaks waffle/gbm (piglit/gbm) which fails initialization.

v2: also don't set queryDmaBufFormats

Reviewed-by: Daniel Stone <daniel@fooishbar.org>
(cherry picked from commit a65db0ad1c)
2017-11-07 18:35:53 +02:00
Andres Gomez
1b10d0851a docs: add sha256 checksums for 17.2.4
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-10-30 16:53:44 +02:00
Andres Gomez
a4b72e2643 docs: add release notes for 17.2.4
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-10-30 16:46:20 +02:00
Andres Gomez
fe9dc0fad6 Update version to 17.2.4
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-10-30 16:37:54 +02:00
Andres Gomez
162fa27b12 cherry-ignore: broadcom/vc5: Propagate vc4 aliasing fix to vc5.
extra: Commit is not applicable when ade416d023 is missing.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-10-27 18:07:38 +03:00
Andres Gomez
b3207d9ff7 cherry-ignore: mesa/bufferobj: don't double negate the range
fixes: This commit addressed earlier commit 35ac13ed3 which did not
land in branch.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-10-27 18:07:38 +03:00
Andres Gomez
6cab29e973 cherry-ignore: radv: Disallow indirect outputs for GS on GFX9 as well.
fixes: Commit is not applicable when 6ce550453f is missing.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-10-27 18:07:38 +03:00
Andres Gomez
1c2f79066c cherry-ignore: radv: Don't use vgpr indexing for outputs on GFX9.
fixes: Commit is not applicable when 087e010b2b is missing.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-10-27 18:07:38 +03:00
Andres Gomez
0c63d53765 cherry-ignore: added 17.3 nominations.
stable: 17.3 nominations only.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-10-27 18:07:38 +03:00
Andres Gomez
52019bad1d cherry-ignore: glsl: fix derived cs variables
stable: Commit is too big for stable at this point.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-10-27 18:07:38 +03:00
Andres Gomez
9bdd943ff2 cherry-ignore: configure.ac: rework llvm detection and handling
stable: Commits are too invasive for 17.2.

Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-10-27 18:07:38 +03:00
Jason Ekstrand
1393d37d7b intel/eu: Use EXECUTE_1 for JMPI
The PRM says "The execution size must be 1."  In 73137997e2, the
execution size was set to 1 when it should have been BRW_EXECUTE_1
(which maps to 0).  Later, in dc2d3a7f5c, JMPI was used for
line AA on gen6 and earlier and we started manually stomping the
exeution size to BRW_EXECUTE_1 in the generator.  This commit fixes the
original bug and makes brw_JMPI just do the right thing.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 73137997e2
(cherry picked from commit 562b8d458c)
2017-10-27 18:07:38 +03:00
Jason Ekstrand
74f1903234 anv/pipeline: Call nir_lower_system_valaues after brw_preprocess_nir
We currently have a bug where nir_lower_system_values gets called before
nir_lower_var_copies so it will miss any system value uses which come
from a copy_var intrinsic.  Moving it to after brw_preprocess_nir fixes
this problem.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 279f8fb69c)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/intel/vulkan/anv_pipeline.c
2017-10-27 18:07:37 +03:00
Jason Ekstrand
aac0807f48 intel/fs: Handle flag read/write aliasing in needs_src_copy
In order to implement the ballot intrinsic, we do a MOV from flag
register to some GRF.  If that GRF is used in a SEL, cmod propagation
helpfully changes it into a MOV from the flag register with a cmod.
This is perfectly valid but when lower_simd_width comes along, it simply
splits into two instructions which both have conditional modifiers.
This is a problem since we're reading the flag register.  This commit
makes us check whether or not flags_written() overlaps with the flag
values that we are reading via the instruction source and, if we have
any interference, will force us to emit a copy of the source.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit fa6e74e33e)
2017-10-27 18:07:37 +03:00
Jan Vesely
03f3899f99 clover: Fix compilation after clang r315871
v2: use a more generic compat function
v3: rename and formatting cleanup

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103388
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a6d38f476b)
2017-10-27 18:07:37 +03:00
Jason Ekstrand
87a9a989ee nir/intrinsics: Set the correct num_indices for load_output
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit c1b84256cc)
2017-10-27 18:07:37 +03:00
Matthew Nicholls
ce725baa7c ac/nir: generate correct instruction for atomic min/max on unsigned images
v2: fix silly typo

Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 27a0b24bf2)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/amd/common/ac_nir_to_llvm.c
2017-10-27 18:07:37 +03:00
Bas Nieuwenhuizen
f46ba9ee35 ac/nir: Fix nir_texop_lod on GFX for 1D arrays.
Fixes: 1bcb953e16 'radv: handle GFX9 1D textures'
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 2c5b43c87f)
2017-10-27 18:07:37 +03:00
Stefan Schake
2d51d41865 broadcom/vc4: Fix aliasing issue
This was causing Android clang version 3.8.256229 to miscompile,
presumably due to strict aliasing.

Fixes: 14dc281c13 ("vc4: Enforce one-uniform-per-instruction after optimization.")
(cherry picked from commit e5fea0d621)
2017-10-27 18:07:37 +03:00
Kenneth Graunke
cbc081b871 i965: Revert absolute mode for constant buffer pointers.
The kernel doesn't initialize the value of the INSTPM or CS_DEBUG_MODE2
registers at context initialization time.  Instead, they're inherited
from whatever happened to be running on the GPU prior to first run of a
new context.  So, when we started setting these, other contexts in the
system started inheriting our values.  Since this controls whether
3DSTATE_CONSTANT_* takes a pointer or an offset, getting the wrong
setting is fatal for almost any process which isn't expecting this.

Unfortunately, VA-API and Beignet don't initialize this (nor does older
Mesa), so they will die horribly if we start doing this.  UXA and SNA
don't use any push constants, so they are unaffected.

Until we have some kind of solution to this problem, I'm going to revert
this patch and abandon using the feature for now.  It will lead to fewer
pushed UBO ranges on Broadwell+, which may lead to lower performance,
though I don't have any data on the impact.

Cc: "17.3 17.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102774
(cherry picked from commit 013d331220)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/mesa/drivers/dri/i965/brw_state_upload.c
	src/mesa/drivers/dri/i965/intel_screen.c
2017-10-27 18:07:24 +03:00
Michel Dänzer
546b4d455a st/mesa: Initialize textures array in st_framebuffer_validate
And just reference pipe_resources to it in the validate callbacks.

Avoids pipe_resource leaks when st_framebuffer_validate ends up calling
the validate callback multiple times, e.g. when a window is resized.

v2:
* Use generic stable tag instead of Fixes: tag, since the problem could
  already happen before the commit referenced in v1 (Thomas Hellstrom)
* Use memset to initialize the array on the stack instead of allocating
  the array with os_calloc.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
(cherry picked from commit 7561da367b)

Squashed with:

st/osmesa: include u_inlines.h for pipe_resource_reference

Fixes build failure due to unresolved symbol.

Fixes: 7561da367b "st/mesa: Initialize textures array in
                     st_framebuffer_validate"

Trivial.

(cherry picked from commit 8c9e7c9638)
2017-10-27 18:04:59 +03:00
Henri Verbeet
9cbf8c910e vulkan/wsi: Free the event in x11_manage_fifo_queues().
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Henri Verbeet <hverbeet@gmail.com>
Fixes: e73d136a02 ("vulkan/wsi/x11: Implement FIFO mode.")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com
(cherry picked from commit 3de87f7cd7)
2017-10-27 18:04:59 +03:00
Dave Airlie
1eb4cbc934 radv/image: bump all the offset to uint64_t.
So one of the CTS tests tries to allocate a 16384x1 2048 array
texture. This overflows a bunch of calculations when we want it
tiled as the heights goes to 128.

addrlib returns us the correct size (16GB or so), but we mangle
it in the htile calcs due to the 32-bit offset fields, then
userspace gives us the reduced number and we try to allocate
it on a heap and things blow up.

We really need to give the app back the correct size for the
image so we can blow up properly in memory allocation later.

This should fix hangs in
dEQP-VK.pipeline.render_to_image.core.1d_array.huge.width_layers.r8g8b8a8_unorm_d32_sfloat_s8_uint
since
Fixes: ad3d98da9f (radv: enable tc compatible htile for d32s8 also.)

Now there's an open question if we should be enabling tc-compat
htile at all for shallow textures like the above.

This might cause some other wierd side effects in CTS even
without the tc compat so:
Cc: "17.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 35c66f3e40)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <agomez@igalia.com>

Conflicts:
	src/amd/vulkan/radv_private.h
2017-10-27 18:04:59 +03:00
Marek Olšák
fba44d91d0 Revert "mesa: fix texture updates for ATI_fragment_shader"
This reverts commit 9d54025cd1.

It breaks KOTOR.

Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5d071bf04b)
2017-10-27 18:04:59 +03:00
Samuel Pitoiset
9cba4d491c radv: add the draw count buffer to the list of buffers
My guess is that the GPU is going to report VM faults if
vkCmdDrawIndirectCountAMD() (and friends) are used.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 3e5f27faf3)
2017-10-27 18:04:59 +03:00
Emil Velikov
facc851818 docs: add sha256 checksums for 17.2.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-19 13:28:13 +01:00
Emil Velikov
28dc4b64f2 docs: add release notes for 17.2.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-19 13:10:20 +01:00
Emil Velikov
ea38f4c33a Update version to 17.2.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-19 13:02:58 +01:00
Emil Velikov
23c08dabc3 eglmesaext: add forward declaration for struct wl_buffers
The user does not need to know the specifics of the struct, as only a
pointer to it is used.

Just forward declare the struct making the header self-contained.

v2: Remove deprecation warning text/bugzilla - patch does no help there.

Cc: Greg V <greg@unrelenting.technology>
Fixes: 5cddb1ce3c ("wayland: Add an extension to create wl_buffers from
EGLImages")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
(cherry picked from commit 66ebdfbd44)
2017-10-17 16:59:31 +01:00
Emil Velikov
dc9bd1dade wayland-drm: use a copy of the wayland_drm_callbacks struct
The callbacks may be called even when they are no longer valid.
Say, the user is dlclose(ing) libEGL while the buffers are being
destroyed.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Tested-by: Derek Foreman <derekf@osg.samsung.com>
(cherry picked from commit 0cfd6f6cfc)
2017-10-17 16:59:31 +01:00
Jason Ekstrand
d001ff1267 nir: Get rid of the variable on vote intrinsics
This looks like a copy+paste error.  They don't actually write into that
variable as would be implied by putting the return there.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 3442c9fc3e)
2017-10-17 16:59:31 +01:00
Jason Ekstrand
88a16c895b nir/opcodes: Fix constant-folding of ufind_msb
We didn't fold correctly in the case of 0x1 because we never let the
loop counter hit 0.  Switching it to bit >= 0 solves this problem.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit a0947921eb)
2017-10-17 16:59:31 +01:00
Jason Ekstrand
b640bf38ca glsl/blob: Return false from grow_to_fit if we've ever failed
Otherwise we could have a failure followed by a smaller write that
succeeds and get a corrupted blob.  If we ever OOM, we should stop.

v2 (Jason Ekstrand):
 - Initialize the new boolean member in create_blob

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit e03717efbd)
2017-10-17 16:59:31 +01:00
Jason Ekstrand
4d1ae3283c glsl/blob: Return false from ensure_can_read on overrun
Otherwise, if you have a large read fail and then try to do a small
read, the small read may succeed even though it's at the wrong offset.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 7118851374)
2017-10-17 16:59:31 +01:00
Eric Engestrom
d56aa9fe43 scons: use python3-compatible print()
These changes were generated using python's `2to3` tool.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102852
Reported-by: Alex Granni <liviuprodea@yahoo.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 7d48219b3a)
2017-10-17 16:59:31 +01:00
Bas Nieuwenhuizen
3b657e4ff5 radv: Only set the MTYPE flags on GFX9+.
Older kernels fail the va_op with this flag set. If the kernel
supports GFX9 usefully, it will also support this flag.

Fixes: e8d57802fe "radv/gfx9: allocate events from uncached VA space"
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 96f80c8d4d)
2017-10-17 16:59:31 +01:00
Daniel Stone
99d3661bce egl/wayland: Don't use dmabuf with no modifiers
The dmabuf interface requires a valid modifier to be sent. If we don't
explicitly get a modifier from the driver, we can't know what to send;
it must be inferred from legacy side-channels (or assumed to linear, if
none exists).

If we have no modifier, then we can only have a single-plane format
anyway, so fall back to the old wl_drm buffer import path.

Fixes: a65db0ad1c ("st/dri: don't expose modifiers in EGL if the driver doesn't implement them")
Fixes: 02cc359372 ("egl/wayland: Use linux-dmabuf interface for buffers")
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Andy Furniss <adf.lists@gmail.com>
Cc: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit b65d6dafd6)
2017-10-17 16:59:31 +01:00
Daniel Stone
0f6e89dfe0 egl/wayland: Check queryImage return for wl_buffer
When creating a wl_buffer from a DRIImage, we extract all the DRIImage
information via queryImage. Check whether or not it actually succeeds,
either bailing out if the query was critical, or providing sensible
fallbacks for information which was not available in older DRIImage
versions.

Fixes: a65db0ad1c ("st/dri: don't expose modifiers in EGL if the driver doesn't implement them")
Fixes: 02cc359372 ("egl/wayland: Use linux-dmabuf interface for buffers")
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Andy Furniss <adf.lists@gmail.com>
Cc: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 6273d2f269)
2017-10-17 16:59:31 +01:00
Emil Velikov
bee97ec32e swr/rast: do not crash on NULL strings returned by getenv
The current convenience function GetEnv feeds the results of getenv
directly into std::string(). That is a bad idea, since the variable
may be unset, thus we feed NULL into the C++ construct.

The latter of which is not allowed and leads to a crash.

v2: Better variable name, implicit char* -> std::string conversion (Eric)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101832
Fixes: a25093de71 ("swr/rast: Implement JIT shader caching to disk")
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Laurent Carlier <lordheavym@gmail.com>
Cc: Bernhard Rosenkraenzer <bero@lindev.ch>
[Emil Velikov: make an actual commit from the misc diff]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Laurent Carlier <lordheavym@gmail.com> (v1)
(cherry picked from commit 21e271024d)
2017-10-17 16:59:31 +01:00
Nicolai Hähnle
9f4b0a336c radeonsi: clamp border colors for upgraded depth textures
The hardware does this automatically for unorm formats, but we need to
do it manually for unorm depth formats that have been upgraded to
Z32_FLOAT.

Fixes dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_depth
and others.

Fixes: d4d9ec55c5 ("radeonsi: implement TC-compatible HTILE")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
(cherry picked from commit 6eb9483912)
2017-10-17 16:59:31 +01:00
Nicolai Hähnle
74a28d85de radeonsi: clamp depth comparison value only for fixed point formats
The hardware usually does this automatically. However, we upgrade
depth to Z32_FLOAT to enable TC-compatible HTILE, which means the
hardware no longer clamps the comparison value for us.

The only way to tell in the shader whether a clamp is required
seems to be to communicate an additional bit in the descriptor
table. While VI has some unused bits in the resource descriptor,
those bits have unfortunately all been used in gfx9. So we use
an unused bit in the sampler state instead.

Fixes dEQP-GLES3.functional.texture.shadow.2d.linear.equal_depth_component32f
and many other tests in dEQP-GLES3.functional.texture.shadow.*

Fixes: d4d9ec55c5 ("radeonsi: implement TC-compatible HTILE")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
(cherry picked from commit 4c56e07029)
[Emil Velikov: handle lack of dirty_mask in original patch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_descriptors.c
2017-10-17 16:59:31 +01:00
Nicolai Hähnle
f805a61e04 st/glsl_to_tgsi: fix a use-after-free in merge_two_dsts
Found by address sanitizer.

The loop here tries to be safe, but in doing so, it ends up doing
exactly the wrong thing: the safe foreach is for when the loop
variable (inst) could be deleted and nothing else. However, this
particular can delete inst's successor, but not inst itself.

Fixes: 8c6a0ebaad ("st/mesa: add st fp64 support (v7.1)")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
(cherry picked from commit 2703fa613b)
2017-10-17 16:59:31 +01:00
Lionel Landwerlin
6957dfb0d8 anv: bo_cache: allow importing a BO larger than needed
It's not a problem if a BO has been allocated larger than we need it
to be.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102940
Fixes: 818b857914 ("anv: Use the BO cache for DeviceMemory allocations")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c0a4f56fb9)
2017-10-17 16:59:31 +01:00
Nicolai Hähnle
410f4dbcb1 st/glsl_to_tgsi: fix indirect access to 64-bit integer
Make sure we actually allocate two adjacent TGSI temporaries. The
current code fails e.g. when an arithmetic operation has two
operands with indirect accesses.

I will send out a new piglit test
(arb_gpu_shader_int64/execution/indirect-array-two-accesses.shader_test)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 541208cf13)
2017-10-17 16:59:31 +01:00
Ilia Mirkin
d22e779d6a nv50,nvc0: fix push hint logic in presence of a start offset
Previously buffer offsets were passed in explicitly as an offset, which
had to be added to the resource address. Now they are passed in via an
increased 'start' parameter. As a result, we were double-adding the
start offset in this kind of situation.

This condition was triggered by piglit's draw-elements test which has a
requisite glMultiDrawElements in combination with a small enough number
of vertices to go through the immediate push path.

Fixes: 330d0607ed ("gallium: remove pipe_index_buffer and set_index_buffer")
Reported-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit b20bccbcac)
2017-10-17 16:59:31 +01:00
Dave Airlie
41ec2af2a8 radv: lower ffma in nir.
So it appears the Vulkan SPIR-V fma opcode can be equivalent to a
mad operation, and the fma hw opcode on AMD hw is issued like a double
opcode so is slower. Also the radeonsi stack does this.

This appears to improve performance on a number of games from Feral,
and thanks to Feral for noticing the problem.

I'm reposting this one as Marek indicated he thinks this is what
we should be doing on AMD hw.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 2c61594d84)
[Emil Velikov: use correct file radv_shader.c -> radv_pipeline.c]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/amd/vulkan/radv_shader.c
2017-10-17 16:59:31 +01:00
Alex Smith
0bd7be0142 radv: Add R16G16B16A16_SNORM fast clear support
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 25d76fd658)
2017-10-17 16:59:30 +01:00
Nicolai Hähnle
4bcadb533f st/mesa: don't clobber glGetInternalformat* buffer for GL_NUM_SAMPLE_COUNTS
Applications might pass in a buffer that is sized too large and rely
on the extra space of the buffer not being overwritten.

Fixes dEQP-GLES31.functional.state_query.internal_format.partial_query.num_sample_counts

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 9a8f13a33b)
2017-10-17 16:59:30 +01:00
Ilia Mirkin
2eae2a6f0e nv50/ir: fix 64-bit integer shifts
TGSI was adjusted to always pass in 64-bit integers but nouveau was left
with the old semantics. Update to the new thing.

Fixes: d10fbe5159 (st/glsl_to_tgsi: fix 64-bit integer bit shifts)
Reported-by: Karol Herbst <karolherbst@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ce6da2a026)
2017-10-17 16:59:30 +01:00
Józef Kucia
077f925473 anv: Do not assert() on VK_ATTACHMENT_UNUSED
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 91ba331ef4)
2017-10-17 16:59:30 +01:00
Józef Kucia
2e92d16f9d spirv: Fix SpvOpAtomicISub
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit e0acb630a5)
2017-10-17 16:59:30 +01:00
Emil Velikov
83dcf9dc33 cherry-ignore: add "anv/wsi: Allocate enough memory for the entire image"
Addresses bug introduced with a feature patch, which is not in branch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-17 16:59:28 +01:00
Lionel Landwerlin
406e7e0e17 anv/cmd_buffer: Reset state in cmd_buffer_destroy
This ensures that everything gets cleaned up properly. In particular,
it fixes a memory leak where we were leaking the push constants
structs.

Valgrind stats on
dEQP-VK.pipeline.push_constant.graphics_pipeline.range_size_128 :

Before:
HEAP SUMMARY:
    in use at exit: 2,467,513 bytes in 1,305 blocks
  total heap usage: 697,853 allocs, 696,530 frees, 138,466,600 bytes allocated

LEAK SUMMARY:
   definitely lost: 1,068 bytes in 11 blocks
   indirectly lost: 24,669 bytes in 412 blocks
     possibly lost: 0 bytes in 0 blocks
   still reachable: 2,441,776 bytes in 882 blocks
        suppressed: 0 bytes in 0 blocks

After:
HEAP SUMMARY:
    in use at exit: 2,467,381 bytes in 1,304 blocks
  total heap usage: 697,853 allocs, 696,531 frees, 138,466,600 bytes allocated

LEAK SUMMARY:
   definitely lost: 936 bytes in 10 blocks
   indirectly lost: 24,669 bytes in 412 blocks
     possibly lost: 0 bytes in 0 blocks
   still reachable: 2,441,776 bytes in 882 blocks
        suppressed: 0 bytes in 0 blocks

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0763f814d7)
2017-10-17 16:59:02 +01:00
Lionel Landwerlin
60466859fa anv/cmd_buffer: fix push descriptors with set > 0
When writing to set > 0, we were just wrongly writing to set 0. This
commit fixes this by lazily allocating each set as we write to them.

We didn't go for having them directly into the command buffer as this
would require an additional ~45Kb per command buffer.

v2: Allocate push descriptors from system memory rather than in BO
    streams. (Lionel)

Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org>
Fixes: 9f60ed98e5 ("anv: add VK_KHR_push_descriptor support")
Reported-by: Daniel Ribeiro Maciel <daniel.maciel@gmail.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit d296dea54e)
2017-10-17 16:59:02 +01:00
Marek Olšák
014b5a7209 glsl_to_tgsi: fix instruction order for bindless textures
We emitted instructions loading the bindless handle after the memory
instruction.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 985338e2cb)
2017-10-17 16:59:02 +01:00
Jason Ekstrand
f9c4f22f5a intel/compiler: Don't propagate cmod into integer multiplies
No shader-db change on Sky Lake.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 7463d50580)
2017-10-17 16:59:02 +01:00
Jason Ekstrand
4ae1a62b26 intel/compiler: Don't cmod propagate into a saturated operation
Shader-db results on Sky Lake:

    total instructions in shared programs: 12954445 -> 12955125 (0.01%)
    instructions in affected programs: 141862 -> 142542 (0.48%)
    helped: 0
    HURT: 626

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit b91ecee04a)
2017-10-17 16:59:02 +01:00
Ben Crocker
ea3ad52ad3 gallivm/ppc64le: allow environmental control of Altivec code generation
In check_os_altivec_support(), allow control of Altivec (first PPC vector
instruction set) code generation via a new environmental control,
GALLIVM_ALTIVEC, which is expected to take on a value of 1 or 0.
The default is to enable Altivec code generation.

This environmental control of Altivec code generation is initially
available only #ifdef DEBUG.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 1359af930e)
2017-10-17 16:59:02 +01:00
Ben Crocker
3f5d2b768c gallivm/ppc64le: adjust VSX code generation control.
In lp_build_create_jit_compiler_for_module(), advance the minimum
version of LLVM for VSX code generation to 4.0; this is the minimum
revision at which several known VSX code generation bugs are fixed:

  https://llvm.org/bugs/show_bug.cgi?id=25503 (fixed in 3.8.1)
  https://llvm.org/bugs/show_bug.cgi?id=26775 (fixed in 3.8.1)
  https://llvm.org/bugs/show_bug.cgi?id=33531 (fixed in 4.0)

An llc performance bug introduced in LLVM 4.0,

  https://llvm.org/bugs/show_bug.cgi?id=34647

is still pending as of LLVM 5.0, but only has a pronounced effect on
one of the Piglit tests: ext_transform_feedback-max-varyings.

All changes tested via Piglit.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit e93f056a4e)
2017-10-17 16:59:02 +01:00
Ben Crocker
070d2dcfac gallivm: allow additional llc options
In init_native_targets, allow the passing of additional options to
the LLC compiler via new GALLIVM_LLC_OPTIONS environmental control.
This option is available only #ifdef DEBUG, initially.
At top, add #include <llvm-c/Support.h> for LLVMParseCommandLineOptions()
declaration.

v2: Fix compile error with old llvm versions (sroland)

Cc: "17.2" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 5c75f0c8bb)
2017-10-17 16:59:02 +01:00
Ben Crocker
1a8ccdc6e9 gallivm: fix typo in debug_printf message
In gallivm_compile_module, fix a typo in the
debug_printf("Invoke as \"llc ..." message.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 3a9feb4db8)
2017-10-17 16:59:02 +01:00
Leo Liu
608bea62ca st/va: don't re-allocate interlaced buffer with pakced format
It caused corruption, when vlVaPutImage putting raw data to the fields

v2: add RGB formats since it got uploaded here as well

Cc: mesa-stable@lists.freedesktop.org
Cc: Andy Furniss <adf.lists@gmail.com>
Tested-by: Andy Furniss <adf.lists@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 0fa950ecd3)
2017-10-17 16:59:02 +01:00
Leo Liu
dc47b179ed st/vdpau: don't re-allocate interlaced buffer with packed YUV format
It caused corruption, when vlVdpVideoSurfacePutBitsYCbCr putting YUV to the fields

Cc: mesa-stable@lists.freedesktop.org
Cc: Andy Furniss <adf.lists@gmail.com>
Tested-by: Andy Furniss <adf.lists@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 327480d10f)
2017-10-17 16:59:02 +01:00
Dave Airlie
1146591d4b radv: emit fmuladd instead of fma to llvm.
For Vulkan SPIR-V the spec states
fma() Inherited from OpFMul followed by OpFAdd.

Matt says the backend will do the right thing depending on the
hardware being compiled for, if you use the fmuladd intrinsic.

Using the Mad Max pts test, on high settings at 4K:
CHP: 55->60
HGDD: 46->50
LM: 55->60
No change on Stronghold.

Thanks to Feral for spending the time to track this down.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4e93d6baae)
[Emil Velikov: s/ac_to_float_type/to_float_type/]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/amd/common/ac_nir_to_llvm.c
2017-10-17 16:59:02 +01:00
Emil Velikov
76c6bbca7c cherry-ignore: add "anv: Remove unreachable cases from isl_format_for_size"
The commit causes a number of regressions with dEQP

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-17 16:57:45 +01:00
Lionel Landwerlin
3315bb4f08 intel: compiler: vec4: add missing default 0 lod
We set a similar default value for LOD in the fs backend for TXS/TXL.
Without this we end up generating invalid MOV with a null src.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit d3acc240d0)
2017-10-17 16:53:25 +01:00
Józef Kucia
cb72969d8b anv: Fix vkCmdFillBuffer()
The vkCmdFillBuffer() command fills a buffer with an uint32_t value.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.1 17.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 15fdbf9c39)
2017-10-11 17:44:36 +01:00
Marek Olšák
1093207463 st/mesa: don't use pipe_surface for passing information about EGLImage
Use st_egl_image instead. radeonsi doesn't like when we create
a pipe_surface with PIPE_FORMAT_NV12.

This fixes NV12 texturing on radeonsi using kmscube.

Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 2d62817da9)
2017-10-11 17:44:36 +01:00
Bas Nieuwenhuizen
f5f515a023 nir/spirv: Allow loop breaks in a switch body.
Per the SPIR-V spec 2.11 Structured Control Flow:

"The only blocks in a construct that can branch outside the construct are

...
- a break block for the innermost loop it is inside of.
..."

With

"Break block: A block containing a branch to the Merge Block of a loop header's merge instruction."

Note that it puts no restriction on not being in an if or switch within the innermost loop.

This passes the loop_break block to the switch body so it can properly detect loop breaks.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit ef61d09d5b)
2017-10-11 17:44:36 +01:00
Rob Clark
18db986354 freedreno/a5xx: fix missing restore state
RB_CLEAR_CNTL seems to be in a funny state after boot (at least on
8x96/a530).

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 7f3eab03fe)
2017-10-11 17:44:36 +01:00
Rob Clark
c386538036 freedreno/a5xx: align height to GMEM
Similar to the way width/pitch alignment works, it seems like we need to
do similar for height.  Otherwise the BLIT from system memory to GMEM
can over-fetch beyond the end of the buffer, triggering a fault.

I'm not sure if there is a better solution yet.  Possibly we could fall
back to pre-a5xx style DRAW packets for cases where BLIT might over-
fetch.  (We in theory have that problem already with rendering to higher
mipmap levels, although fortunately those tend to use GMEM bypass.)

This fixes issues reported with glamor.

Reported-by: don.harbin@linaro.org
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 16ac70bdcf)
2017-10-11 17:44:36 +01:00
Nicolai Hähnle
d0b52003d0 radeonsi: fix maximum advertised point size / line width
The hardware registers store the half-size/width in 12.4 fixed point
format, so 8192 is the maximum.

Fixes dEQP-GLES3.functional.rasterization.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 30e37289ea)
2017-10-11 17:44:36 +01:00
Nicolai Hähnle
6e903ae7d5 radeonsi: deduce rast_prim correctly for tessellation point mode
Together with the previous patches, this fixes
dEQP-GLES31.functional.primitive_bounding_box.wide_points.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a3fa3b2e02)
2017-10-11 17:44:36 +01:00
Nicolai Hähnle
6e3788e3b1 radeonsi: don't discard points and lines
This is a bit conservative, but a more precise solution requires access
to the rasterizer state. This is something to tackle after the fork between
r600 and radeonsi.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 4d74432dd3)
2017-10-11 17:44:36 +01:00
Nicolai Hähnle
05e2ed7889 radeonsi: move current_rast_prim to r600_common_context
We'll use it in the scissors / clip / guardband state.

v2: avoid a performance regression on r600 when applied to
    (pre-fork) stable branches

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit f86a112b07)
2017-10-11 17:44:36 +01:00
Nicolai Hähnle
f2fc168517 glsl/lower_instruction: handle denorms and overflow in ldexp correctly
GLSL ES requires both, and while GLSL explicitly doesn't require correct
overflow handling, it does appear to require handling input inf/denorms
correctly.

Fixes dEQP-GLES31.functional.shaders.builtin_functions.precision.ldexp.*

Cc: mesa-stable@lists.freedesktop.org
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
(cherry picked from commit 93bf9c114b)
[Emil Velikov: init_num_operands() does not exist on branch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/compiler/glsl/lower_instructions.cpp
2017-10-11 17:44:06 +01:00
Nicolai Hähnle
eac6932c66 util/queue: fix a race condition in the fence code
A tempting alternative fix would be adding a lock/unlock pair in
util_queue_fence_is_signalled. However, that wouldn't actually
improve anything in the semantics of util_queue_fence_is_signalled,
while making that test much more heavy-weight. So this lock/unlock
pair in util_queue_fence_destroy for "flushing out" other threads
that may still be in util_queue_fence_signal looks like the better
fix.

v2: rephrase the comment

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
(cherry picked from commit a208cd7ae4)
2017-10-11 17:22:55 +01:00
Nicolai Hähnle
7bedb1fd12 radeonsi/gfx9: fix geometry shaders without output vertices
Not that those are super common or useful, but hey! Fun corner cases
of the API...

Fixes dEQP-GLES31.functional.geometry_shading.emit.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
(cherry picked from commit 7dfa891f32)
2017-10-11 17:22:55 +01:00
Nicolai Hähnle
332f6e9b3b amd/common: fix build_cube_select
Fix the custom cube coord selection sequence to be identical to
the hardware v_cubesc/tc and OpenGL spec. Affects texture sampling
with user-provided derivatives.

Fixes dEQP-GLES3.functional.shaders.texture_functions.texturegrad.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
(cherry picked from commit 5be5c1e0fa)
2017-10-11 17:22:55 +01:00
Nicolai Hähnle
f5cbe1f9b8 st/glsl_to_tgsi: fix conditional assignments to packed shader outputs
Overriding the default (no-op) swizzle is clearly counter-productive,
since the whole point is putting the destination register as one of
the source operands so that it remains unmodified when the assignment
condition is false.

Fragment depth and stencil outputs are a special case due to how their
source swizzles are manipulated in translate_src when compiling to
TGSI.

Fixes dEQP-GLES2.functional.shaders.conditionals.if.*_vertex
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

(cherry picked from commit 8ea7d3a5c8)
2017-10-11 17:22:55 +01:00
Marek Olšák
f6d64c7dc1 mesa: fix texture updates for ATI_fragment_shader
Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 9d54025cd1)
2017-10-11 17:22:55 +01:00
Leo Liu
80db5b3420 st/va: use pipe transfer_map to map upload buffer
The function pipe_buffer_map() is only for linear pipe buffer,
with height as 0, and it's not for any 2D textures.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 6ed61b8d3f)
2017-10-11 17:22:55 +01:00
Juan A. Suarez Romero
5a71ed6fa5 docs: add sha256 checksums for 17.2.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-10-02 18:10:02 +02:00
Juan A. Suarez Romero
bc12538a8e docs: add release notes for 17.2.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-10-02 17:26:10 +02:00
Juan A. Suarez Romero
98dce315c8 Update version to 17.2.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-10-02 17:15:13 +02:00
Bas Nieuwenhuizen
1535e8a5d4 radv: Check for GFX9 for 1D arrays in image_size intrinsic.
Only on GFX9 we implement them as 2D images.

This fixes:
dEQP-VK.image.image_size.1d_array.readonly_12x34
dEQP-VK.image.image_size.1d_array.readonly_1x1
dEQP-VK.image.image_size.1d_array.readonly_32x32
dEQP-VK.image.image_size.1d_array.readonly_7x1
dEQP-VK.image.image_size.1d_array.readonly_writeonly_12x34
dEQP-VK.image.image_size.1d_array.readonly_writeonly_1x1
dEQP-VK.image.image_size.1d_array.readonly_writeonly_32x32
dEQP-VK.image.image_size.1d_array.readonly_writeonly_7x1
dEQP-VK.image.image_size.1d_array.writeonly_12x34
dEQP-VK.image.image_size.1d_array.writeonly_1x1
dEQP-VK.image.image_size.1d_array.writeonly_32x32
dEQP-VK.image.image_size.1d_array.writeonly_7x1

Fixes: 1bcb953e16 "radv: handle GFX9 1D textures"
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 979978ee06)
2017-10-02 17:12:17 +02:00
Nicolai Hähnle
a66a70480f radeonsi: fix a regression in integer cube map handling
A recent commit fixed the case of 8888 integer cube maps, which need the
workaround of replacing the data format with USCALED/SSCALED. However,
this broke the case of non-8888 integer cube maps; those still need the
fix of shifting the texture coordinates.

Fixes KHR-GL45.texture_gather.plain-gather-int-cube-array and similar.

Fixes: 6fb0c1013b ("radeonsi: workaround for gather4 on integer cube maps")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 6d23f7c65d)
2017-10-02 17:12:17 +02:00
Nicolai Hähnle
bd522741c4 amd/common: move ac_build_phi from radeonsi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 052b974fed)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/amd/common/ac_llvm_build.c
	src/amd/common/ac_llvm_build.h
2017-10-02 17:12:17 +02:00
Jason Ekstrand
b78c664115 vulkan/wsi/wayland: Return better error messages
Reviewed-by: Daniel Stone <daniels@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 4fe3913b96)
2017-09-28 13:54:58 +00:00
Jason Ekstrand
2467b50e45 vulkan/wsi/wayland: Copy wl_proxy objects from oldSwapchain if available
This should save us some round trips while resizing.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 537b9bc3e4)
2017-09-28 13:54:58 +00:00
Jason Ekstrand
b5a70210af vulkan/wsi/wayland: Stop caching Wayland displays
We originally implemented caching to avoid unneeded round-trips to the
compositor when querying surface capabilities etc. to set up the
swapchain.  Unfortunately, this doesn't work if vkDestroyInstance is
called after the Wayland connection has been dropped.  In this case, we
end up trying to clean up already destroyed wl_proxy objects which leads
to crashes.  In particular most of dEQP-VK.wsi.wayland is crashing
thanks to this problem.

This commit gets rid of the cache and simply embeds the wsi_wl_display
struct in the swapchain.  While we're at it, we can get rid of the
wl_event_queue that we were storing in the swapchain because we can just
use the one in the embedded wsi_wl_display.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Bugzilla: https://bugs.freedesktop.org/102578
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 4369102498)
2017-09-28 13:54:58 +00:00
Jason Ekstrand
5edc4dc080 vulkan/wsi/wayland: Refactor wsi_wl_display code
We convert it over to an inti/finish model and make create/destroy
wrappers for the former.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 77181d9580)
2017-09-28 13:54:57 +00:00
Ian Romanick
6587cbfca2 nv20: Fix GL_CLAMP
v2: Force T and R wrap modes to GL_CLAMP_TO_EDGE for 1D textures.
This fixes a regression in tex1d-2dborder.  The test uses a 1D texture
but it provides S and T texture coordinates.  Since the T wrap mode
would (correctly) be set to GL_CLAMP, the texture would gradually
blend (incorrectly) with the border color.

I also tried setting NV20_3D_TEX_FORMAT_DIMS_1D instead of
NV20_3D_TEX_FORMAT_DIMS_2D for 1D textures, but that did not help.

It is possible that the same problem exists for 2D textures with the
R-wrap mode, but I don't think there are any piglit tests for that.

No test changes on NV20 (10de:0201).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 953a3cf0fd)
2017-09-28 13:54:57 +00:00
Tomasz Figa
6b0ded06b1 egl/dri2: Implement swapInterval fallback in a conformant way
dri2_fallback_swap_interval() currently used to stub out swap interval
support in Android backend does nothing besides returning EGL_FALSE.
This causes at least one known application (Android Snapchat) to fail
due to an unexpected error and my loose interpretation of the EGL 1.5
specification justifies it. Relevant quote below:

    The function

        EGLBoolean eglSwapInterval(EGLDisplay dpy, EGLint interval);

    specifies the minimum number of video frame periods per buffer swap
    for the draw surface of the current context, for the current rendering
    API. [...]

    The parameter interval specifies the minimum number of video frames
    that are displayed before a buffer swap will occur. The interval
    specified by the function applies to the draw surface bound to the
    context that is current on the calling thread. [...] interval is
    silently clamped to minimum and maximum implementation dependent
    values before being stored; these values are defined by EGLConfig
    attributes EGL_MIN_SWAP_INTERVAL and EGL_MAX_SWAP_INTERVAL
    respectively.

    The default swap interval is 1.

Even though it does not specify the exact behavior if the platform does
not support changing the swap interval, the default assumed state is the
swap interval of 1, which I interpret as a value that eglSwapInterval()
should succeed if called with, even if there is no ability to change the
interval (but there is no change requested). Moreover, since the
behavior is defined to clamp the requested value to minimum and maximum
and at least the default value of 1 must be present in the range, the
implementation might be expected to have a valid range, which in case of
the feature being unsupported, would correspond to {1} and any request
might be expected to be clamped to this value.

This is further confirmed by the code in _eglSwapInterval() in
src/egl/main/eglsurface.c, which is the default fallback implementation
for EGL drivers not implementing its own. The problem with it is that
the DRI2 EGL driver provides its own implementation that calls into
platform backends, so we cannot just simply fall back to it.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-09-28 13:54:57 +00:00
Boris Brezillon
d9aba007c0 broadcom/vc4: Fix infinite retry in vc4_bo_alloc()
cleared_and_retried is always reset to false when jumping to the retry
label, thus leading to an infinite retry loop.

Fix that by moving the cleared_and_retried variable definitions at the
beginning of the function.  While we're at it, move the create variable
with the other local variables and explicitly reset its content in the
retry path.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Fixes: 78087676c9 "vc4: Restructure the simulator mode."
(cherry picked from commit ef578906d8)
2017-09-28 13:54:57 +00:00
Eric Anholt
e22ab89f9f broadcom/vc4: Keep pipe_sampler_view->texture matching the original texture.
I was overwriting view->texture with the shadow resource when we need to
do shadow copies (retiling or baselevel rebase), but that tripped up some
critical new sanity checking in state_tracker (making sure that stObj->pt
hasn't changed from view->texture through TexImage-related paths).

To avoid that, move the shadow resource to the vc4_sampler_view struct.

Fixes: f0ecd36ef8 ("st/mesa: add an entirely separate codepath for setting up buffer views")
(cherry picked from commit 68c91a87d7)
2017-09-28 13:54:57 +00:00
Jason Ekstrand
1b100aae7b vulkan/wsi/wayland: Stop printing out the DRM device
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 016de7e155)
2017-09-28 13:54:57 +00:00
Kenneth Graunke
ec89ab30c9 i965/vec4: Fix swizzles on atomic sources.
Atomic operation sources are scalar values, but we were failing to
select the .x component of the second operand.  For example,

   atomicCounterCompSwapARB(counter, 5u, 10u)

would generate

   mov(8) vgrf4.x:D, 5D
   mov(8) vgrf5.x:D, 10D

   mov(8) vgrf9.x:UD, vgrf4.xyzw:D
   mov(8) vgrf9.y:UD, vgrf5.xyzw:D

which wrongly selects the .y component of vgrf5, so the actual 10u value
would get dead code eliminated.  The swizzle works for the other source,
but both of them ought to be .xxxx.

Fixes the compare and swap CTS tests in:
KHR-GL45.shader_atomic_counter_ops_tests.ShaderAtomicCounterOpsExchangeTestCase

Cc: "17.2 17.1 17.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 66342c997f)
2017-09-28 13:54:57 +00:00
Kenneth Graunke
8619a6b1cc i965/vec4: Actually handle atomic op intrinsics.
Embarassingly, someone enabled the ARB_shader_atomic_counter_ops
extension for Gen7+ but never added the intrinsics to the switch
statement in the vec4 backend, so they just hit an unreachable()
call and died.

Fixes: 40dd45d0c6 (i965: Enable ARB_shader_atomic_counter_ops)
Cc: "17.2 17.1 17.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit a62fe34098)
2017-09-28 13:54:57 +00:00
Samuel Pitoiset
47c8ffe719 radv: fix saved compute state when doing statistics/occlusion queries
We are pushing 16-bytes of constants, so we have to save/restore
the same amount of data to avoid data corruption.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4b407a62c7)
2017-09-28 13:54:57 +00:00
Grazvydas Ignotas
84f3374f1c configure: check if -latomic is needed for __atomic_*
On some platforms, gcc generates library calls when __atomic_* functions
are used, but does not link the required library (libatomic) automatically
(supposedly to allow the app to use some other atomics implementation?).

Detect this at configure time and add the library when needed. Tested
on armel (library was added) and on x86_64 (was not, as expected).

Some documentation on this is provided in GCC wiki:
https://gcc.gnu.org/wiki/Atomic/GCCMM

Fixes: 8915f0c0 "util: use GCC atomic intrinsics with explicit memory model"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102573
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 2ef7f23820)
2017-09-28 13:54:57 +00:00
Nicolai Hähnle
5e5882a7aa amd/addrlib: fix missing va_end() after va_copy()
There's no reason to use va_copy here.

CID: 1418113
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Fixes: e7fc664b91 ("winsys/amdgpu: add addrlib - texture
                              addressing and alignment calculator")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 34126ed248)
2017-09-28 13:54:57 +00:00
Juan A. Suarez Romero
89db1532fc cherry-ignore: add "radv: copy the number of viewports/scissors at pipeline bind time"
fixes: This commit addressed earlier commits dcf46e99 and 60878dd0 which
did not land in branch.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-09-28 13:54:57 +00:00
Nicolai Hähnle
5e8ddba33f radeonsi: set MIP_POINT_PRECLAMP to 0
This fixes a bug with nearest ("point") mip selection when the fractional
part of max_lod is in (0.5,1). In this case, the spec mandates that
we still select the mip level ceil(max_lod) in the clamping case. However,
MIP_POINT_PRECLAMP will clamp before the mip selection, which is wrong.

Supposedly this setting was originally copied from the closed Vulkan
driver, but as far as I can tell, closed Vulkan was actually changed back
recently :)

Fixes dEQP-GLES3.functional.texture.mipmap.2d.max_lod.{nearest,linear}_nearest

Fixes: f7420ef5b4 ("radeonsi: enable some sampler fields to match the closed driver")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 704ddbcdf6)
2017-09-28 13:54:57 +00:00
Eric Anholt
fc2b01adc9 broadcom/vc4: Fix use-after-free when deleting a program.
By leaving the compiled shader in the context's stage state, the next
compile of a new FS would look in the old compiled FS for figuring out
whether to set various dirty flags for the VS compile.  Clear out the
pointer when deleting the program, and make sure that we always mark the
state as dirty if the previous program had been lost.  Fixes valgrind
warnings on glsl-max-varyings.

Fixes: 2350569a78 ("vc4: Avoid VS shader recompiles by keeping a set of FS inputs seen so far.")
(cherry picked from commit 3752ad28f2)
2017-09-28 13:54:57 +00:00
Eric Anholt
fe53c8bc21 broadcom/vc4: Fix use-after-free trying to mix a quad and tile clear.
The blitter will bind just the depth buffer, which flushes the current job
if we had both a color and depth/stencil.  If the clear was doing partial
depth/stencil (quad-based) and color (tile-based), we'd go on to try to
set up the rest of the tile clear in the now flushed job.

Instead, move the partial clear up before we start setting up the job for
the current FBO state, and re-fetch the job if we're continuing on to a
tile-based clear.  Fixes valgrind failures in fbo-depthtex.

Fixes: 9421a6065c ("vc4: Fix fallback to quad clears of depth in GLX.")
(cherry picked from commit 9940fb4205)
2017-09-28 13:54:57 +00:00
Eric Anholt
60efffcec0 broadcom/vc4: Fix use-after-free for flushing when writing to a texture.
I was trying to continue the hash table loop, not the inner loop.  This
tended to work out, because we would have *just* freed the job struct.
Fixes some valgrind failures in fbo-depthtex.

Fixes: f597ac3966 ("vc4: Implement job shuffling")
(cherry picked from commit d88a75182d)
2017-09-28 13:54:56 +00:00
Dave Airlie
5d1deeeab3 st/glsl->tgsi: fix u64 to bool comparisons.
Otherwise we end up using a 32-bit comparison which didn't end well.

Timothy caught this while playing around with some opt passes.

Fixes: 278580729a (st/glsl_to_tgsi: add support for 64-bit integers)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit a7a7bf21bd)
2017-09-28 13:54:56 +00:00
Juan A. Suarez Romero
dbccd8465d cherry-ignore: add "radv: Check for GFX9 for 1D arrays in image_size intrinsic."
fixes: This commit addressed earlier commits 61ad2f13 and 6dcc54b4 which
did not land in branch

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-09-28 13:54:56 +00:00
Juan A. Suarez Romero
de44476572 cherry-ignore: add "radeonsi/gfx9: proper workaround for LS/HS VGPR initialization bug"
fixes: This commit addressed an earlier commit c7e9ebb3ab which did not
land in branch

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-09-28 13:54:56 +00:00
Leo Liu
2712e59538 st/va/postproc: use video original size for postprocessing
Otherwise the aligned size will make video scaled

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit eb51838771)
2017-09-28 13:54:56 +00:00
Samuel Iglesias Gonsálvez
e16c31f3fe anv: fix viewport transformation for z component
In Vulkan, for 'z' (depth) component, the scale and translate values
for the viewport transformation are:

pz = maxDepth - minDepth
oz = minDepth

zf = pz × zd + oz

Being zd, the third component in vertex's normalized device coordinates.

Fixes: dEQP-VK.draw.inverted_depth_ranges.*

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit d2cd9deeb8)
2017-09-28 13:54:56 +00:00
David Airlie
dbf33a12e1 radv: add gfx9 scissor workaround
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3e54493265)
2017-09-28 13:54:56 +00:00
Lucas Stach
cd9d26c3ef etnaviv: fix 16bpp clears
util_pack_color may leave undefined values in the upper half of the packed
integer. As our hardware needs the upper 16 bits to mirror the lower 16bits,
this breaks clears of those formats if the undefined values aren't masked off.

I've only observed the issue with R5G6B5_UNORM surfaces, other 16bpp
formats seem to work fine.

Fixes: d6aa2ba2b2 (etnaviv: replace translate_clear_color with util_pack_color)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit e9d37d68cf)
2017-09-28 13:54:56 +00:00
Tim Rowley
b378fd5cf3 swr/rast: remove llvm fence/atomics from generated files
We currently don't use these instructions, and since their API
changed in llvm-5.0 having them in the autogen files broke the mesa
release tarballs which ship with generated autogen files.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102847
CC: mesa-stable@lists.freedesktop.org
Tested-by: Laurent Carlier <lordheavym@gmail.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit 066d1dc951)
2017-09-28 13:54:56 +00:00
Tapani Pälli
caff607bcb mesa: free current ComputeProgram state in _mesa_free_context_data
This is already done for other programs stages, fixes a leak when using
compute programs.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102844
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 589457d97f)
2017-09-28 13:54:56 +00:00
Nicolai Hähnle
737e2245ce radeonsi: fix array textures layer coordinate
Like for cube map (array) gather, we need to round to nearest on <= VI.

Fixes tests in dEQP-GLES3.functional.shaders.texture_functions.texture.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 87f7c7bd65)
2017-09-28 13:54:56 +00:00
Nicolai Hähnle
46cfefc5be glsl/linker: fix output variable overlap check
Prevent an overflow caused by too many output variables. To limit the
scope of the issue, write to the assigned array only for the non-ES
fragment shader path, which is the only place where it's needed.

Since the function will bail with an error when output variables with
overlapping components are found, (max # of FS outputs) * 4 is an upper
limit to the space we need.

Found by address sanitizer.

Fixes dEQP-GLES3.functional.attribute_location.bind_aliasing.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 15cae12804)

Squashed with commit:

glsl/linker: properly fix output variable overlap check

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102904
Fixes: 15cae12804 ("glsl/linker: fix output variable overlap check")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit df8767a14e)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-09-28 13:54:56 +00:00
Józef Kucia
6121bc3150 anv: Fix descriptors copying
Trivial.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 65a09f98ad)
2017-09-28 13:54:56 +00:00
Dave Airlie
488460b1ef ac/surface: handle S8 on gfx9
If we don't have a depth piece, we don't get a correct
swizzle mode and we hit an assert in addrlib.

In case of no depth get the preferrred swizzle mode for
stencil alone.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c4ac522511)

Squashed with commit:

ac/surface: handle error when choosing preferred swizzle mode

CID: 1418140
Fixes: c4ac522511 ("ac/surface: handle S8 on gfx9")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit eb71394ff3)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-09-28 13:54:56 +00:00
Alexandru-Liviu Prodea
3c54a77d08 Scons: Add LLVM 5.0 support
1 new required library - LLVMBinaryFormat

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org=/show_bug.cgi?id=3D102318
Signed-off-by: Alexandru-Liviu Prodea <liviuprodea@yahoo.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit c1b0137048)
2017-09-28 13:54:56 +00:00
Nicolai Hähnle
891627ec0f amd/common: add workaround for cube map array layer clamping
Fixes dEQP-GLES31.functional.texture.filtering.cube_array.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 94736d31c3)

Squashed with commit:

amd/common: add chip_class to ac_llvm_context

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 3db86d86ed)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-09-28 13:54:56 +00:00
Nicolai Hähnle
688f8415f7 amd/common: round cube array slice in ac_prepare_cube_coords
The NIR-to-LLVM pass already does this; now the same fix covers
radeonsi as well.

Fixes various tests of
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e0af3bed2c)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/amd/common/ac_nir_to_llvm.c
2017-09-28 13:54:55 +00:00
Nicolai Hähnle
54915ee52f radeonsi: workaround for gather4 on integer cube maps
This is the same workaround that radv already applied in commit
3ece76f03d ("radv/ac: gather4 cube workaround integer").

Fixes dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i/ui.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 6fb0c1013b)
2017-09-28 13:54:55 +00:00
Jason Ekstrand
da517431f6 i965/blorp: Set r8stencil_needs_update when writing stencil
This fixes a crash on Haswell when we try to upload a stencil texture
with blorp.  It would also be a problem if someone tried to texture from
stencil after glBlitFramebuffers.

Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit a43d379000)
2017-09-28 13:54:55 +00:00
Matt Turner
3b4a6dbe93 util/u_atomic: Add implementation of __sync_val_compare_and_swap_8
Needed for 32-bit PowerPC.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Fixes: a6a38a038b ("util/u_atomic: provide 64bit atomics where
they're missing")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

(cherry picked from commit 1bbe180873)
2017-09-28 13:54:55 +00:00
Matt Turner
d2dca92a6c util: Link libmesautil into u_atomic_test
Platforms without particular atomic operations require the
implementations in u_atomic.c

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Fixes: a6a38a038b ("util/u_atomic: provide 64bit atomics where
they're missing")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d075a4089e)
[Juan A. Suarez: resolve trivial conflicts]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/util/Makefile.am
2017-09-28 13:54:55 +00:00
Emil Velikov
d41d54b061 automake: enable libunwind in `make distcheck'
Enable the toggle to catch when the library is missing from the link
path. Better to test, fail and address before releasing Mesa ;-)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 9aba643e3c)
2017-09-28 13:54:55 +00:00
Gert Wollny
5d16640ea1 travis: Add libunwind-dev to gallium/make builds
libunwind is a optional dependency used by the gallium aux module
(libgallium) and consequently the final binaries must be linked against
it. To test whether the library is properly specified in the link pass
add it to the travis-ci build environment and force its use.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 39fe51c1e3)
2017-09-26 12:17:33 +02:00
Gert Wollny
6024a95198 travis: force llvm-3.3 for "make Gallium ST Other"
In Ubuntu Trusty the default version of llvm is 3.4 and the build was
actually randomly picking 3.5 or 3.9. Adding libunwind would then result
is build success or failure depending of what version was picked.

Install the llvm-3.3-dev package and force its use: On one hand it is
the minimum required version we want to the build test against, and on
the other hand forcing the version stabilizes the build.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d3675812b5)
2017-09-26 12:17:33 +02:00
Dave Airlie
996bf9c1cd radv/nir: call opt_remove_phis after trivial continues.
With the shaders in the ssao demo, the nir_opt_if wasn't
working properly without this, after this the if gets optimised
so that loop unrolling gets called.

(loop unrolling fails due to instruction count, but at least
it gets to do that.)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 64d9bd149a)
[Juan A. Suarez: apply patch over src/amd/vulkan/radv_pipeline.c]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>

Conflicts:
	src/amd/vulkan/radv_shader.c
2017-09-26 12:17:33 +02:00
Emil Velikov
bd903d4ee1 docs: add sha256 checksums for 17.2.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-18 00:12:36 +01:00
Emil Velikov
d6d2b6b5ec docs: add release notes for 17.2.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-17 23:57:32 +01:00
Emil Velikov
cb778d563f Update version to 17.2.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-17 23:53:08 +01:00
Bas Nieuwenhuizen
7c3bd519e7 radv: Don't allocate CMASK for linear images.
We can't use it anyway in fast clears, and on GFX9 it seems to
actually hange the card if we specify it.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
(cherry picked from commit 1a172fb113)
2017-09-13 21:52:42 +01:00
Bas Nieuwenhuizen
c6b3732967 radv: Disable multilayer & multilevel DCC.
The current DCC init routine doesn't account for initializing a
single layer or level. Multilayer seems hard for small textures on
pre-GFX9 as tre metadata for the layers can be interleaved. For
GFX9 multilevel textures are a problem for similar reasons.

So just disable this for now, until we handle the texture modes
correctly.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
(cherry picked from commit bee83b2661)
2017-09-13 21:52:42 +01:00
Eric Engestrom
e012ade1cf docs/egl: remove reference to EGL_DRIVERS_PATH
Support for external egl drivers was dropped a few years ago.

Fixes: 209360bbb9 "egl/main: drop support for external egl drivers"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 85b66d2096)
2017-09-13 21:52:42 +01:00
Dave Airlie
41e691d605 radv/winsys: fix flags vs va_flags thinko.
Fixes: e8d57802f (radv/gfx9: allocate events from uncached VA space)
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a5add6fb30)
2017-09-13 21:52:42 +01:00
Emil Velikov
b6ae4400fc egl/x11/dri3: adding missing __DRI_BACKGROUND_CALLABLE extension
Fixes: 3b7b6adf3a ("egl: Implement __DRI_BACKGROUND_CALLABLE")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit f24bc18162)
2017-09-13 21:52:42 +01:00
Bas Nieuwenhuizen
c47914276e radv: Fix vkCopyImage with both depth and stencil aspects.
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ff23e03d60)
2017-09-13 21:52:42 +01:00
Bas Nieuwenhuizen
536f852d42 radv: Actually set the cmd_buffer usage_flags.
Otherwise, the simultaneous uage bit doesn't get set from the begin
info, which we need for batchchaining.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit dec7b38fe6)
2017-09-13 21:52:42 +01:00
Grazvydas Ignotas
fdc4f6e684 radv: don't assert on empty hash table
Currently if table_size is 0, it's falling through to:

unreachable("hash table should never be full");

But table_size can be 0 when RADV_DEBUG=nocache is set, or when the
table allocation fails (which is not considered an error).

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit b8dd69e1b4)
2017-09-13 21:52:42 +01:00
Eric Engestrom
8cf9e5ab56 mesa/st: remove unwanted backup file
Fixes: 0ac78dc925 "util: move string_to_uint_map to glsl"
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ac0d8dc3fa)
2017-09-13 21:52:42 +01:00
Samuel Pitoiset
d5c71e44c4 radeonsi: update dirty_level_mask before dispatching
This fixes a rendering issue with Hitman when bindless textures
are enabled.

Fixes: 2263610827 ("radeonsi: flush DB caches only when transitioning from DB to texturing")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 59101e771d)
2017-09-13 21:52:42 +01:00
Nicolai Hähnle
53190187cc glsl: fix glsl_struct_field size calculations for shader cache
Found by address sanitizer:

==22621==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61400000cbd8 at pc 0x7f561610a4ff bp 0x7ffca85f9d50 sp 0x7ffca85f94f8
READ of size 344 at 0x61400000cbd8 thread T0
    #0 0x7f561610a4fe  (/usr/lib/x86_64-linux-gnu/libasan.so.3+0x5f4fe)
    #1 0x7f560bb305a5 in memcpy /usr/include/x86_64-linux-gnu/bits/string3.h:53
    #2 0x7f560bb305a5 in blob_write_bytes ../../../mesa-src/src/compiler/glsl/blob.c:136
    #3 0x7f560be7d7ff in encode_type_to_blob ../../../mesa-src/src/compiler/glsl/shader_cache.cpp:153
    #4 0x7f560be81222 in write_program_resource_data ../../../mesa-src/src/compiler/glsl/shader_cache.cpp:950
    #5 0x7f560be81222 in write_program_resource_list ../../../mesa-src/src/compiler/glsl/shader_cache.cpp:1118
    #6 0x7f560be81222 in shader_cache_write_program_metadata(gl_context*, gl_shader_program*) ../../../mesa-src/src/compiler/glsl/shader_cache.cpp:1407
    #7 0x7f560b825fdb in link_program ../../../mesa-src/src/mesa/main/shaderapi.c:1163

Fixes: 073a84ff60 ("glsl: stop adding pointers from glsl_struct_field to the cache")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 4da6cf6c98)
2017-09-13 21:52:42 +01:00
Emil Velikov
366e02f992 cherry-ignore: add EGL+gbm swast patches
They depend on an earlier commit which did not land in branch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-13 21:52:41 +01:00
Nicolai Hähnle
5cec093773 ac/surface: match Z and stencil tile config
Fixes various piglit tests on Stoney, see the comment.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit cffc0ae0d9)
[Emil Velikov: attribute for the different gfx6_surface_settings()]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/amd/common/ac_surface.c
2017-09-13 21:52:02 +01:00
Nicolai Hähnle
69394e3517 radeonsi: apply a mask to gl_SampleMaskIn in the PS prolog
gl_SampleMaskIn is supposed to contain set bits only for the samples that
are covered by the current fragment shader invocation, but the VGPR
initialization hardware loads the set of all bits that are covered at the
current pixel.

Fixes various tests in
dEQP-GLES31.functional.shaders.sample_variables.sample_mask_in.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 92c4277990)
[Emil Velikov: attribute for the lack of add_arg*checked() API]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_shader.c
2017-09-13 21:52:02 +01:00
Timothy Arceri
c0c1e05fb6 glsl: stop adding pointers from bindless structs to the cache
This is so we always create reproducible cache entries. Consistency
is required for verification of any third party distributed shaders.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 4009370232)
2017-09-13 21:52:02 +01:00
Timothy Arceri
94b656cb5e glsl: stop adding pointers from shader_info to the cache
This is so we always create reproducible cache entries. Consistency
is required for verification of any third party distributed shaders.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit a6618afd27)
2017-09-13 21:52:02 +01:00
Timothy Arceri
5e0b556abc compiler: move pointers to the start of shader_info
This will allow us to easily skip them when writting the struct
to disk cache.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 3ea3f75723)
2017-09-13 21:52:02 +01:00
Timothy Arceri
f087dad40d glsl: always write a name/label string to the cache
In the following patch we will stop writing the pointer to cache.

Unfortunately adding empty strings to that cache seems to be the
only thing we can do here once we no longer have the pointers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 44918a1979)
2017-09-13 21:52:02 +01:00
Timothy Arceri
83bbfaddf1 glsl: don't write uniform storage offset if there isn't one
This is so we always create reproducible cache entries. Consistency
is required for verification of any third party distributed shaders.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 22154823d2)
2017-09-13 21:52:02 +01:00
Timothy Arceri
a8eea040cd glsl: add has_uniform_storage() helper to shader cache
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 2662269ad7)
2017-09-13 21:52:02 +01:00
Timothy Arceri
15ecbb2c07 glsl: stop adding pointers from glsl_struct_field to the cache
This is so we always create reproducible cache entries. Consistency
is required for verification of any third party distributed shaders.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 073a84ff60)
2017-09-13 21:52:02 +01:00
Timothy Arceri
7fef8d436a glsl: stop adding pointers from gl_shader_variable to the cache
This is so we always create reproducible cache entries. Consistency
is required for verification of any third party distributed shaders.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 37d453b55a)
2017-09-13 21:52:02 +01:00
Timothy Arceri
ea1bcfc063 glsl: allow NULL to be passed to encode_type_to_blob()
This will be used by the following commit.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 37eb67714e)
2017-09-13 21:52:01 +01:00
Eric Engestrom
6e2b9fba21 util: improve compiler guard
Glibc 2.26 has dropped xlocale.h, but the functions needed (strtod_l()
and strdof_l()) can be found in stdlib.h.
Improve the detection method to allow newer builds to still make use of
the locale-setting.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102454
Cc: Laurent Carlier <lordheavym@gmail.com>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Laurent Carlier <lordheavym@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 49b428470e)
2017-09-13 21:52:01 +01:00
Kenneth Graunke
dcb634df92 i965: Set "Subslice Hashing Mode" to 16x16 on Apollolake.
As of 4.11, the kernel isn't bothering to set the subslice hashing mode
on Apollolake, leaving it at the default of 8x8.  (It initializes it to
16x4 on most platforms.)

Performance data for GPUTest Triangle on Apollolake at 1024x640:

   X-tiled RT:
   -----------
   8x8 -> 16x4:   2.4325%  +/- 0.383683% (n=107)
   8x8 -> 8x4:   -3.75105% +/- 0.592491% (n=40)
   8x8 -> 16x16:  6.17238% +/- 0.67157%  (n=30)

   Y-tiled RT:
   -----------
   8x8 -> 16x4:   1.30307%  +/- 0.297292% (n=205)
   8x8 -> 8x4:   -0.769282% +/- 0.729557% (n=35)
   8x8 -> 16x16:  3.00254%  +/- 0.715503% (n=40)

   8x MSAA RT (INTEL_FORCE_MSAA=8):
   --------------------------------
   8x8 -> 16x4:   1.38889% +/- 0.93729%  (n=7)
   8x8 -> 8x4:   -2.10643% +/- 1.15153%  (n=3)
   8x8 -> 16x16:  3.87183% +/- 1.08851%  (n=5)

Based on this, we choose 16x16 for Apollolake.

Skylake GT2 with X-tiled buffers appears to be a toss-up between 16x4
and 16x16, and with Y-tiled buffers it doesn't seem to really matter.
So we'll leave Skylake alone for now.

The hashing mode doesn't seem to make a measurable impact on more
complex benchmarks.

Acked-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit ebd2fd6ef3)
2017-09-13 21:52:01 +01:00
Dave Airlie
20be71ba7c radv/gfx9: fix image resource handling.
GFX9 changes how images are layed out, so this needs updating.

Fixes: dEQP-VK.query_pool.statistics_query.*

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 3633bae36b)
2017-09-13 21:52:01 +01:00
Dave Airlie
a3d1ea347a radv/ac: bump params array for image atomic comp swap
For the comp_swap case this was overflowing and crashing
sometimes.

Fixes:
dEQP-VK.image.atomic_operations.compare_exchange.*

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit aba441be44)
2017-09-13 21:52:01 +01:00
Dave Airlie
dc640aab63 radv/gfx9: set mip0-depth correctly for 2d arrays/3d images
This field covers the whole resource.

Fixes:
dEQP-VK.pipeline.image.suballocation.sampling_type.combined.view_type.3d.format.*
dEQP-VK.texture.filtering.3d.combinations.*

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ebd2a5354d)
2017-09-13 21:52:01 +01:00
Dave Airlie
c8076c8ea1 radv: handle GFX9 1D textures
As GFX9 can't handle 1D depth textures, radeonsi and
apparantly pro just update all 1D textures to 2D,
and work around it.

This ports the workarounds from radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1bcb953e16)

Conflicts:
	src/amd/common/ac_nir_to_llvm.c
2017-09-13 21:52:01 +01:00
Dave Airlie
62f0bb2a87 radv: don't use iview for meta image width/height.
Work out the width/height from the level manually, as on GFX9
we won't minify the iview width/height.

This fixes:
dEQP-VK.api.image_clearing.core.clear_color_image* on gfx9

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 2f5b4490b5)
2017-09-13 21:52:01 +01:00
Emil Velikov
40f06286f9 cherry-ignore: add execution_type() fix to the list
As mentioned by Matt:
 "Without commit 4fab67a441, this patch isn't  needed at all."

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-13 21:50:49 +01:00
Nicolai Hähnle
0f79eb7abe st/glsl_to_tgsi: only the first (inner-most) array reference can be a 2D index
Don't get distracted by record dereferences between array references.

Fixes dEQP-GLES31.functional.tessellation.user_defined_io.per_vertex_block.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 03203b7448)
2017-09-13 21:34:03 +01:00
Dave Airlie
54c5568fa9 radv: use simpler indirect packet 3 if possible.
This fixes some observed hangs on CIK GPUs.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 219d29e4d8)
[Emil Velikov: add the hunk in radv_emit_indirect_draw]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/amd/vulkan/radv_cmd_buffer.c
2017-09-13 21:34:03 +01:00
Dave Airlie
c2f314d721 radv/gfx9: allocate events from uncached VA space
This copies what amdgpu-pro does, and allocates the memory
for an event with an uncached mtype.

This fixes hangs with:
dEQP-VK.api.command_buffers.record_simul_use_primary

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e8d57802fe)
2017-09-13 21:34:02 +01:00
Dave Airlie
fe04abc6e9 radv/winsys: use amdgpu_bo_va_op_raw.
This is a precursor to the gfx9 fix to use uncached for the event
memory. Move to the interface which allows setting the flags,
but wrap it to avoid having to copy it around the place.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 76ac8fafad)
2017-09-13 21:34:02 +01:00
Marek Olšák
0e305f0518 st/mesa: skip draw calls with pipe_draw_info::count == 0
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102502

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit e4018fdd85)
2017-09-13 21:34:02 +01:00
Nicolai Hähnle
cb9dae484a radeonsi/gfx9: always flush DB metadata on framebuffer changes
This fixes GL45-CTS.shader_image_load_store.basic-glsl-earlyFragTests.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 34124e412f)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_pipe.h
	src/gallium/drivers/radeonsi/si_state.c
2017-09-13 21:34:02 +01:00
Dave Airlie
57ecf28668 Revert "radv: disable support for VEGA for now."
This reverts commit 611076a41a.

With the two previous commits, vega shouldn't be unstable,
doesn't pass CTS, but can do a complete run, and games shouldn't
hang anymore, so bring it back online.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e38685cc62)
2017-09-13 21:34:02 +01:00
Dave Airlie
78e2b539a4 radv/gfx9: set descriptor up for base_mip to level range.
This is required on GFX9, fixes a bug in Talos where all the
mipmaps overlay each other.

Just pushing this as well as it fixes Talos.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 6d929d3f85)
2017-09-13 21:34:02 +01:00
Dave Airlie
86d4c203bd radv: disable 1d/2d linear optimisation on gfx9.
This causes hangs in some of the CTS tests with a 2d
1536x2 texture.

This fixes hangs with:
dEQP-VK.pipeline.image.suballocation.sampling_type.combined.iew_type.1d_aray.format.r4g4b4a4_unorm_pack16.count_1.size.512x1_array_of_3
if we reenable it, make sure these don't regress.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d118ff8765)
2017-09-13 21:34:02 +01:00
Jason Ekstrand
bda2975eef spirv: Add support for the HelperInvocation builtin
I have no idea how this got missed but it's been missing since forever.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit e439908af9)
2017-09-13 21:34:02 +01:00
Roland Scheidegger
a4dc18efca st/mesa: fix view template initialization in try_pbo_readpixels
I think this is what the code was meant to do, albeit as far as I can tell
the redundant initialization some analyzers complain about should work as
well just fine (only the first layer will be used, if the view contains one
or more layers doesn't really matter).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102467
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 2b2c61f0df)
2017-09-13 21:34:02 +01:00
Kenneth Graunke
8f77409e2b i965: Fix crash in fallback GTT mapping.
We can't perf_debug without a context.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 52b65dfda8)
2017-09-13 21:34:02 +01:00
Rob Clark
458f52618a freedreno: skip batch-cache for compute shaders
It is kind of pointless for compute, and avoids issues with apps kicking
off more than 32 compute shaders at once.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dc9e08b0c3)
2017-09-13 21:34:02 +01:00
Karol Herbst
402b8073ad nvc0: write 0 to pipeline_statistics.cs_invocations
cs_invocations are currently unsupported, but leaving the field uninitialized
is even worse.

fixes on nvc0:
 * KHR-GL45.pipeline_statistics_query_tests_ARB.functional_default_qo_values
 * KHR-GL45.pipeline_statistics_query_tests_ARB.functional_non_rendering_commands_do_not_affect_queries

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit b672c3833b)
2017-09-13 21:34:02 +01:00
Ben Crocker
6fe4852f59 llvmpipe: lp_build_gather_elem_vec BE fix for 3x16 load
Fix loading of a 3x16 vector as a single 48-bit load
on big-endian systems (PPC64, S390).

Roland Scheidegger's commit e827d91756
plus Ray Strode's patch reduce pre-Roland Piglit failures from ~4000 to ~2000.  This patch fixes
three of the four regressions observed by Ray:

- draw-vertices
- draw-vertices-half-float
- draw-vertices-half-float_gles2

One regression remains:
- draw-vertices-2101010

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
Cc: "17.2" "17.1" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 57c8ead0cd)
2017-09-13 21:34:02 +01:00
Ray Strode
d080f10fdf gallivm: correct channel shift logic on big endian
lp_build_fetch_rgba_soa fetches a texel from a texture.
Part of that process involves first gathering the element
together from memory into a packed format, and then breaking
out the individual color channels into separate, parallel
arrays.

The code fails to account for endianess when reading the packed
values.

This commit attempts to correct the problem by reversing the order
the packed values are read on big endian systems.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
Cc: "17.2" "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ray Strode <rstrode@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 75cb6e3617)
2017-09-13 21:34:02 +01:00
Jason Ekstrand
93b794f094 anv/formats: Nicely handle unknown VkFormat enums
This fixes some crashes in the dEQP-VK.memory.requirements.core.* tests.
I'm not sure whether or not passing out-of-bound formats into the query
is supposed to be allowed but there's no harm in protecting ourselves
from it.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/101956
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 242211933a)

Squashed with:

anv: fix off by one in array check

`anv_formats[ARRAY_SIZE(anv_formats)]` is already one too far.
Spotted by Coverity.

CovID: 1417259
Fixes: 242211933a "anv/formats: Nicely handle unknown VkFormat enums"
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
(cherry picked from commit 0c7272a66c)
2017-09-13 21:33:55 +01:00
Charmaine Lee
520786586d vbo: fix offset in minmax cache key
Instead of saving primitive offset in the minmax cache key,
save the actual buffer offset which is used in the cache lookup.

Fixes rendering artifact seen with GoogleEarth when run with
VMware driver.

v2: Per Brian's comment, initialize offset to avoid compiler warning.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 2d93b462b4)

Squashed with:

vbo: fix build errors on android

incompatible pointer to integer conversion assigning to 'GLintptr' (aka 'int')
from 'const char *' [-Werror,-Wint-conversion]

      offset = indices;
             ^ ~~~~~~~

Fixes: 2d93b462b4 ("vbo: fix offset in minmax cache key")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 0986f68632)
2017-09-13 21:32:19 +01:00
Michael Olbrich
9ffe4cc6c0 egl/dri2: only destroy created objects
dri2_display_destroy may be called by dri2_initialize_wayland_drm() if
initialization fails. In this case, these objects may not be initialized.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Michael Olbrich <m.olbrich@pengutronix.de>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 81d5c31631)
2017-09-13 21:05:57 +01:00
Brian Paul
2b0e70e2ae llvmpipe: initialize llvmpipe->dirty with LP_NEW_SCISSOR
If llvmpipe_set_scissor_states() is never called, we still need to be sure
that derived scissor/clip state is updated.  As of commit 743ad599a9
that function might not be called.

Fixes regressed Piglit gl-1.0-scissor-offscreen -fbo -auto test.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101709
Fixes: 743ad599a9 ("st/mesa: don't set 16 scissors and 16 viewports
if they're unused")
Cc: "17.2" <mesa-stable@lists.freedesktop.org>

(cherry picked from commit 88cdf16871)
2017-09-13 21:05:57 +01:00
Emil Velikov
3a9b2e0e7d cherry-ignore: ignore gfx9 tile swizzle fix
As mentioned by Dave and Bas, the patch is not applicable for 17.2

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-13 21:05:56 +01:00
Emil Velikov
188010c68d cherry-ignore: add getCapability patches
Adding extra infra is not acceptable for stable. Offending code has been
wrapped in a Android compile guard with an earlier commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-13 21:05:33 +01:00
Emil Velikov
b4473dd519 docs: add sha256 checksums for 17.2.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-04 18:24:29 +01:00
Emil Velikov
f5925b2897 docs: Update 17.2.0 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-04 18:16:16 +01:00
Emil Velikov
702950d1ad Update version to 17.2.0(final)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-04 15:16:21 +01:00
Emil Velikov
909f2b6aa2 Update version to 17.2.0-rc6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-31 10:06:29 +01:00
Emil Velikov
3666d9ab99 egl/wayland: make sure HAS_$FORMAT is set for wl_dmabuf
Otherwise eglCreateWaylandBufferFromImageWL will fail, since we
have no "supported" format.

Fixes: 02cc359372 ("egl/wayland: Use linux-dmabuf interface for buffers")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 9e07005e87)
2017-08-29 19:14:14 +01:00
Emil Velikov
5303f0c246 egl/wayland: set correct format with wl_dmabuf as wl_drm is missing
For most/all cases today, we have wl_drm available alongside wl_dmabuf.
Yet in the long run, we want to make sure the latter can operate without
any traces of the former.

Fixes: 02cc359372 ("egl/wayland: Use linux-dmabuf interface for buffers")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit da100fe697)
2017-08-29 19:14:13 +01:00
Emil Velikov
8d51747dc0 egl/wayland: polish object teardown in dri2_wl_destroy_surface
The wl_drm wrapper is created before the wl display/surface ones.
Thus make sure we destroy it after them. In reality it should not make
any difference either way.

Fixes: 03dd9a88b0 ("egl/wayland: Use per-surface event queues")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 1a8015e753)
2017-08-29 19:14:13 +01:00
Emil Velikov
3b41bef9a4 egl/wayland: plug leaks in dri2_wl_create_window_surface() error path
We forgot to teardown the wl display/surface wrappers.

Fixes: 03dd9a88b0 ("egl/wayland: Use per-surface event queues")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 83442112d7)
2017-08-29 19:14:13 +01:00
Grazvydas Ignotas
e040bdbc8a radv: clear dynamic_shader_stages on create
Valgrind reports it's being used uninitialized.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 7780374833)
2017-08-29 19:14:13 +01:00
Dave Airlie
e4c4b0dbcb radv/wsi: Compute correct row_pitch for GFX9.
(commit split out by Bas Nieuwenhuizen)

Fixes: 65477bae9c "radv: enable GFX9 on radv"
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 9573bd70e1)
2017-08-29 19:14:13 +01:00
Emil Velikov
ca9817a6e8 egl: don't NULL deref the .get_capabilities function pointer
One could easily introduce version 3 of the DRI2fenceExtension,
extending the struct, while not implementing the above function.

Thus we'll end up with NULL pointer, and dereferencing it won't fare
too well.

Fixes: 0201f01dc4 ("egl: add EGL_ANDROID_native_fence_sync")
Cc: Rob Clark <robclark@freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit f0d053cb6d)
2017-08-29 19:14:13 +01:00
Bas Nieuwenhuizen
df8e65feb7 radv: Fix sparse BO mapping merging.
If we merge a mapping with the mapping before it, we also need
to not only change the offset, but also the bo offset.

Fixes: 715df30a4e "radv/amdgpu: Add winsys implementation of virtual buffers."
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 9b7e663da1)
2017-08-29 19:14:13 +01:00
Bas Nieuwenhuizen
ec94b874c4 radv: Fix off by one in MAX_VBS assert.
e.g. 0 + 32 <= 32 should be valid.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit fba0e07869)
2017-08-29 19:14:13 +01:00
Bas Nieuwenhuizen
17e5b0f1db radv: Don't set a new subpass on compute resolve.
We don't use the render path so totally unneeded.

Fixes: 19be95f71e "radv: add subpass resolve compute path"
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit bd81cb3206)
2017-08-29 19:14:13 +01:00
Bas Nieuwenhuizen
69c793c0ce radv: Remove some intel comments from the resolve code.
These are clearly not applicable to radv.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit e5c4e10769)
2017-08-29 19:14:13 +01:00
Dave Airlie
b29987ace4 radv: don't crash if we have no framebuffer
Recording secondaries with no framebuffer attachment may
make this happen, though this might not be the complete solution.

(esp if someone does meta stuff in there, would we have to
save things, not sure).

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4a091b0788)
2017-08-29 19:14:13 +01:00
Dave Airlie
22c1e2e966 radv: fix predication on gfx9
When I added gfx9 I did it wrong, this fixes it.

Fixes: 5247b311e9 "radv/gfx9: fix set predication packet."
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 12fd0f8dc1)
2017-08-29 19:14:13 +01:00
Timothy Arceri
f01cd4feeb mesa: fix ES only draw if we have vertex positions
This code was separated from the validation code so it could
use used with KHR_no_error paths. The return values were inverted
to reflect the name of the helper, but here the condtion was
mistakenly inverted rather than the return value.

Fixes: 4df2931a87 (mesa/vbo: move some Draw checks out of validation)

Reported-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a4635c84dc)
2017-08-29 19:14:13 +01:00
Daniel Stone
c77ae9744e egl: Add dma_buf_import_modifiers for glvnd
Make sure we advertise the new entrypoints to libglvnd's EGL dispatch.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reported-by: Emmanuel Gil Peyrot <emmanuel.peyrot@collabora.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101982
Fixes: 4c412293d0 ("egl: advertise EGL_EXT_image_dma_buf_import_modifiers")
(cherry picked from commit 85ef0215dd)
2017-08-29 19:14:13 +01:00
Daniel Stone
5fdf832235 egl: Update headers from Khronos
Taken from egl-registry 7d68647c4dab.

Signed-off-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 2eee03b7a1)
2017-08-29 19:14:13 +01:00
Emil Velikov
46573f1028 util: move string_to_uint_map to glsl
The functionality is used by glsl and mesa. With the latter already
depending on the former.

With this in place the src/util/ static library libmesautil.la no longer
has a C++ dependency. Thus objects which use it (like libEGL) don't need
the C++ link.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Fixes: 02cc359372 ("egl/wayland: Use linux-dmabuf interface for buffers")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101851
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: James Harvey <lothmordor@gmail.com>
(cherry picked from commit 0ac78dc925)
2017-08-29 19:14:13 +01:00
Jason Ekstrand
1dfbfad0b9 nir: Fix system_value_from_intrinsic for subgroups
A couple of the cases were backwards

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 63e79a8a77)
2017-08-29 19:14:13 +01:00
Ilia Mirkin
289b2a8788 st/mesa: fix handling of vertex array double inputs
The is_double_vertex_input needs to be set for arrays of doubles as
well.

Fixes KHR-GL45.enhanced_layouts.varying_array_locations

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ae53bff8b1)
2017-08-29 19:14:13 +01:00
Ilia Mirkin
1960d8f0dd glsl: fix counting of vertex shader output slots used by explicit vars
The argument to count_attribute_slots should only be set to true for
vertex inputs, not for all vertex shader varyings.

Fixes KHR-GL45.enhanced_layouts.varying_locations

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit eefeff09a7)
2017-08-29 19:14:13 +01:00
Christian Gmeiner
2c964f68a8 etnaviv: use correct param for etna_compatible_rs_format(..)
Found by code inspection.

Fixes: c9e8b49b88 ("etnaviv: gallium driver for Vivante GPUs")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 67fc3e37a7)
2017-08-29 19:14:13 +01:00
Kai Chen
5d5a13e375 egl/wayland: Use roundtrips when awaiting buffer release
In get_back_bo, we use wl_display_dispatch_queue() to block and wait for
a buffer release event. However, not all Wayland compositors flush the
client socket on posting a buffer-release event, so by only blocking
client-side, we may block indefinitely, or at least need to wait for an
input event / frame completion to arrive for the compositor to flush.

We now use dispatch_queue as a first pass, but if our entire buffer pool
is exhausted, use a roundtrip (an immediately-triggered wl_callback) to
ensure that the compositor flushes out our release event immediately.

[daniels: Modified comment and commit message.]

Signed-off-by: Kai Chen <kai.chen@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 151188d1e3)
2017-08-29 19:14:13 +01:00
Dave Airlie
9d6d567046 radv/gfx9: gfx9 has buffer sizing rules like pre-VI.
This fixes:
dEQP-VK.robustness.buffer_access.* on GFX9.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 19f6906c1e)
2017-08-29 19:14:13 +01:00
Bas Nieuwenhuizen
6f0d910617 ac/nir: Cast sources of integer ops to int.
The int32->float semantic conversion got dropped in a testcase,
because the src was already float. On closer inspection I decided
to add a few more casts for integer op operands to be safe too.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 43595db302)
[Emil Velikov: squash trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/amd/common/ac_nir_to_llvm.c
2017-08-29 19:14:13 +01:00
Ilia Mirkin
a0b3095bb6 nv50/ir: properly set sType for TXF ops to U32
All of the coordinates and LOD args are integers for TXF. This mostly
doesn't matter, except for converting into a levelZero=true operation by
removing an explicit zero LOD. For the comparison against zero to work
properly, the sType of the instruction has to be set correctly.

Fixes: KHR-GL45.robust_buffer_access_behavior.texel_fetch
Reported-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 96be442b77)
2017-08-29 19:14:13 +01:00
Dave Airlie
c25e896810 radv/gfx9: don't expose linear depth on vega.
This just zeros out the linear flags for gfx9 + depth formats.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8985ad494b)
2017-08-29 19:14:13 +01:00
Dave Airlie
9172c7744a radv: don't degrade tiling mode for small compressed or depth texture.
This is what radeonsi does, so we should do the same, also vega
doesn't support linear depth textures anyways.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 5d26e0baf2)
2017-08-29 19:14:12 +01:00
Dave Airlie
adce678550 radv/gfx9: only minify image view width/height/depth before gfx9.
For gfx9 the addressing for images has changed, so we need to
provide the hw with the level0, however we still need to scale
for format block differences (so our compressed upload paths still
work).

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit bae7723e13)
2017-08-29 19:14:12 +01:00
Dave Airlie
7f3555951e radv/image: don't rescale width/height if the format isn't changing
If the image view has the same format, we don't need to rescale
the w/h.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a74d987431)
2017-08-29 19:14:12 +01:00
Dave Airlie
bc39869414 radv: cleanup some image view descriptor setup.
Avoid passing the vulkan image creation into the image view descriptor
setup. This cleans up the usage of range inside the init, instead
using the properly inited values in the image view.

This is just a cleanup but some future vega changes will depend on it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 5378b5d071)
2017-08-29 19:14:12 +01:00
Dave Airlie
93aee55a7f radv/gfx9: emit sx_mrt_blend registers
GFX9 needs the SX MRT blend registers programmed, port over
the code from radeonsi to workout the values from the blend
state, and program the registers on rbplus systems.

This fixes lots of:
dEQP-VK.pipeline.blend.*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 9c080100d3)
2017-08-29 19:14:12 +01:00
Dave Airlie
996e3239c9 radv: bump space check for indexed draw.
For the GFX9 packet we need one more dword.

Fixes an assert in:
dEQP-VK.draw.shader_draw_parameters.base_vertex.draw_indexed

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 864eb18527)
2017-08-29 19:14:12 +01:00
Dave Airlie
9b856ea323 radv/gfx9: fixup db/stencil disable.
This fixes disabled Z/stencil.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d987b4ab9e)
2017-08-29 19:14:12 +01:00
Dave Airlie
d8f8689101 radv/gfx9: fix level count in color register setup.
There was an off by one here.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 11834195e9)
2017-08-29 19:14:12 +01:00
Dave Airlie
dbca342014 radv/gfx9: use total levels in texture descriptor
We need to use all the levels when filling out the gfx9
descriptor.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit df09f1f3cd)
2017-08-29 19:14:12 +01:00
Kenneth Graunke
c0c03d0003 i965: Make a BRW_NEW_FAST_CLEAR_COLOR dirty bit.
When changing fast clear colors, we need to emit new SURFACE_STATE
with the updated color at the next draw call.

Most things work today because the atoms that handle SURFACE_STATE
for images (mutable images, textures, render targets) also listen to
BRW_NEW_BLORP, causing us to re-emit these on every BLORP operation.
However, this is overkill - most BLORP operations don't require us
to re-emit SURFACE_STATE.

One case where this is broken today is a fast clear to a different
color followed by a non-coherent framebuffer fetch.  The renderbuffer
read atom doesn't listen to BRW_NEW_BLORP, and would not get the new
fast clear color.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 54c41af0aa)
[Emil Velikov: squash trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/drivers/dri/i965/brw_tcs_surface_state.c
2017-08-29 19:14:12 +01:00
Marek Olšák
9013f80f67 radeonsi: emit VGT_REUSE_OFF in the right place
clip_regs aren't marked dirty when writes_viewport_index is changed.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 8dadb07790)
2017-08-29 19:14:12 +01:00
Marek Olšák
d93e6a0409 radeonsi/gfx9: properly handle imported textures with unexpected swizzle mode
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit fdef2f0fd1)
[Emil Velikov: r600_surface_import_metadata is not separate function,
tile_swizzle isn't in branch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeon/r600_texture.c
2017-08-29 19:14:12 +01:00
Marek Olšák
cf2f05ff91 radeonsi/gfx9: add a temporary workaround for a tessellation driver bug
The workaround will do for now. The root cause is still unknown.

This fixes new piglit: 16in-1out

Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 166823bfd2)
2017-08-29 18:37:57 +01:00
Jason Ekstrand
91715a3681 i965/clear: Quantize the depth clear value based on the format
In f9fd976e8a we changed the clear value to be stored as an
isl_color_value.  This had the side-effect same clear value check is now
happening directly between the f32[0] field of the isl_color_value and
ctx->Depth.Clear.  This isn't what we want for two reasons.  One is that
the comparison happens in floating point even for Z16 and Z24 formats.
Worse than that, ctx->Depth.Clear is a double so, even for 32-bit float
formats, we were comparing as doubles and not floats.  This means that
the test basically always fails for anything other than 0.0f and 1.0f.
This caused a slight performance regression in Lightsmark 2008 because
it was using a depth clear value of 0.999 which can't be stored in a
32-bit float so we were doing unneeded resolves.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/101678
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0ae9ce0f29)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
[Emil Velikov: squash trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/drivers/dri/i965/brw_clear.c
2017-08-29 18:37:57 +01:00
Rob Herring
9ba7693a9a Android: Fix LLVM duplicated symbols linking for N and M
Both statically linking libLLVMCore and dynamically linking libLLVM causes
duplicated symbols in gallium_dri.so and it fails to dlopen. We don't
really need to link libLLVMCore, but just need generated headers to be
built first. Dynamically linking to libLLVM instead is enough to do
that. Thanks to Qiang Yu for finding the root cause.

With this change, we can align all versions and just have libLLVM as a
shared lib dependency.

This also requires changes in the M and N versions of LLVM to export the
include paths for libLLVM. AOSP master is okay.

Fixes: 26aee6f4d5 ("Android: rework LLVM build support")
Reported-by: Mauro Rossi <issor.oruam@gmail.com>
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Qiang Yu <Qiang.Yu@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
(cherry picked from commit 4734bfc02a)
2017-08-29 18:37:57 +01:00
Samuel Pitoiset
1f8cb4c243 radeonsi: update non-resident bindless descriptors if needed
Only resident bindless descriptors are currently updated and
re-uploaded, this makes sure that the non-resident ones are
also updated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 2843c5d15c)
2017-08-29 18:37:57 +01:00
Topi Pohjolainen
da4d1fd084 intel/blorp: Adjust intra-tile x when faking rgb with red-only
v2 (Jason): Adjust directly in surf_fake_rgb_with_red()

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101910

CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 393ec1a507)
2017-08-29 18:37:57 +01:00
Dave Airlie
7789126929 ac/nir: fixup layer/viewport export for GFX9.
GFX9 moved where the viewport index export goes.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b040f51b61)
2017-08-29 18:37:57 +01:00
Christoph Haag
6389526ce5 mesa: only copy requested compressed teximage cubemap faces
This is analogous to commit 2259b11 which only fixed the regular case

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102308
Signed-off-by: Christoph Haag <haagch+mesadev@frickel.club>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 87556a650a)
2017-08-29 18:37:57 +01:00
Jason Ekstrand
2813e83f6c i965/tex: Don't pass samples to miptree_create_for_teximage
In 76e2f390f9, when Topi switched num_samples from 0 to 1 for
single-sampled, he accidentally switched the last parameter in the call
to miptree_create_for_teximage from 0 to 1 thinking it was num_samples
when it was actually layout_flags.  Switching from 0 to 1 added the
MIPTREE_LAYOUT_ACCELERATED_UPLOAD flag which causes us to allocate a
busy BO instead of an idle one.  This caused the subsequent CPU upload
to consistently stall.  The end result was a 15% performance drop in the
SynMark v7 DrvRes microbenchmark.  This restores the old behavior and
fixes the performance regression.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Fixes: 76e2f390f9
Bugzilla: https://bugs.freedesktop.org/102260
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit f24cf82d6d)
2017-08-29 18:37:57 +01:00
Jason Ekstrand
ea5c466c91 i965/miptree: Return NONE from texture_aux_usage when fully resolved
This little optimization improves the performance of SynMark v7
TexFilterTri by almost 10% on Sky Lake GT4 among other improvements.
We've been doing it for some time but somehow it got dropped during
the miptree refactoring.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/102258
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 61d2f3f1c2)
2017-08-29 18:37:57 +01:00
Jason Ekstrand
d174214218 i965: Stop looking at NewDriverState when emitting 3DSTATE_URB
Looking at NewDriverState is not safe in general.  The state atom system
is set up to ensure that new bits that get added to NewDriverState get
accumulated into the set of bits used when emitting atoms but it doesn't
go the other way.  If we read NewDriverState, we may not get the full
picture because the per-pipeline state (3D or compute) does not get
added to NewDriverState before state emit is done.  It's especially
dangerous to do this from BLORP (either explicitly or implicitly when
BLORP calls gen7_upload_urb) because that does not happen during one of
the normal state upload paths.

This commit solves the problem by whacking all of the per-shader-stage
URB sizes to zero whenever we change the total URB size.  We still have
to flag BRW_NEW_URB_SIZE to ensure that the gen7_urb atom triggers but
the actual decision in gen7_upload_urb can now be based entirely on URB
sizes rather than on state atoms.  This also makes BLORP correct because
it just asks for a new URB config whenever the vsize is too small and so
any change to the total URB size will trigger blorp to re-emit as well
because 0 < vs_entry_size.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Bugzilla: https://bugs.freedesktop.org/102289
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit d5e217dbfd)
2017-08-29 18:37:57 +01:00
Kenneth Graunke
a14dd9fff6 i965: Mark all EGLimages as non-coherent.
EGLimages are shared with external users, and we don't know what they're
going to do with them.  They might scan them out.  They might access
them in a way that doesn't work with our explicit clflushing.

It's safest to simply mark them non-coherent.

Chris Wilson caught this problem and wrote a similar (though less
aggressive) patch to solve it; the miptree code has since undergone
a lot of refactoring so I had to rewrite it.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit bc56dfbf3f)
[Emil Velikov: squash trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/drivers/dri/i965/intel_mipmap_tree.c
2017-08-29 18:37:57 +01:00
Jason Ekstrand
3e62ba0907 i965/miptree: Rework create flags
The only one of the three remaining flags that has anything whatsoever
to do with layout is TILING_NONE.  This commit renames them to
MIPTREE_CREATE_*, documents the meaning of each flag, and makes the
create functions take an actual enum type so GDB will print them nicely.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 24a0da338f)
2017-08-29 18:37:57 +01:00
Jason Ekstrand
1011fd22ae i965/miptree: Delete MIPTREE_LAYOUT_TILING_(Y|ANY)
The only force tiling flag we really care about is LAYOUT_TILING_NONE.
The others don't actually do anything but add confusion.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 55116839d9)
2017-08-29 18:37:57 +01:00
Jason Ekstrand
dd03cb5c5e i965/miptree: Delete MIPTREE_LAYOUT_FOR_SCANOUT
The flag hasn't affected actual surface layout for some time.  The only
purpose it served was to set bo->cache_coherent = false on the BO used
to create the miptree.  This is fairly silly because we can just set
that directly from the caller where it makes much more sense.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit a5a673dfa7)
[Emil Velikov: squash trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/drivers/dri/i965/intel_mipmap_tree.c
2017-08-29 18:37:57 +01:00
Jason Ekstrand
046730b8a3 i965/miptree: Delete some unused layout flags
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 2bca18be44)
2017-08-29 18:37:57 +01:00
Jason Ekstrand
f83d979759 i965/miptree Remove layout_flags parameter form is_mcs_supported
The one caller of is_mcs_supported passes 0 in as the layout_flags
unconditionally.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 0e4d9a4b37)
2017-08-29 18:36:14 +01:00
Dave Airlie
2cdb51e76d radv: disable texture gather workaround on gfx9.
Not required anymore.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4c02e2bd95)
[Emil Velikov: chip_class is in ctx::options, not ctx::abi]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-29 16:18:49 +01:00
Emil Velikov
15f23fb855 Update version to 17.2.0-rc5
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-08-21 11:49:19 +01:00
Emil Velikov
dd4fafcfda cherry-ignore: ignore storage offset fixes
For stable the regressions were addressed by reverting the offending
commits - see previous commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-19 02:57:19 +01:00
Samuel Pitoiset
17a3e4891b Revert "mesa: stop assigning unused storage for non-bindless opaque types"
This reverts commit fcbb93e860 and
also  commit 7c5b204e38 to avoid
compilation errors.

Basically, the parameter indexes look wrong when a non-bindless
sampler is declared inside a nested struct (because it is skipped).
I think it's safer to just restore the previous behaviour which is
here since ages and also because the initial attempt is only a
little performance improvement.

This fixes a regression with
ES2-CTS.functional.shaders.struct.uniform.sampler_nested*.

Note: this patch is only for 17.2 since the fixes in master are rather
invasive. Namely:

365d34540f mesa: correctly calculate the storage offset for i915
de0e62e106 st/mesa: correctly calculate the storage offset

Fixes: fcbb93e860 ("mesa: stop assigning unused storage for non-bindless
opaque types")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101983
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-19 02:57:19 +01:00
Jason Ekstrand
85cc3533fa i965/miptree: Set supports_fast_clear = false in make_shareable
The make_shareable function deletes the aux buffer and then whacks
aux_usage to ISL_AUX_USAGE_NONE but not unsetting supports_fast_clear.
Since we only look at supports_fast_clear to decide whether or not to do
fast clears, this was causing assertion failures.

Reported-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101925
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit e7a52cc381)
2017-08-19 02:57:19 +01:00
Eric Anholt
93bd5fbfe1 broadcom/vc4: Build the vc4_tiling_lt_neon.c with -mfpu=neon on ARM.
If you don't pass this, the compiler refuses to compile the assembly for
pre-v7 CPUs.  This also keeps us from building identical, non-NEON code on
aarch64 and x86.

Fixes: a373f77662 ("vc4: Use a wrapper file to set VC4_BUILD_NEON instead of CFLAGS.")

v2: Fix Android build by just appending NEON_C_SOURCES when
    ARCH_ARM_HAVE_NEON.

Tested-by: Rob Herring <robh@kernel.org>
(cherry picked from commit bd5efbd70b)
[Emil Velikov: address libvc4_la_LIBADD conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/vc4/Makefile.am
2017-08-19 02:57:19 +01:00
Eric Anholt
833f12abdf configure.ac: Introduce HAVE_ARM_ASM/HAVE_AARCH64_ASM and the -D flags.
I've been trying to get away without these conditionals in vc4's NEON
code, but it meant compiling extra unused code on x86, and build failing
on ARMv6.

v2: Use the _arm/_arm64 flags to simplify detection (suggested by Rob),
    but hide the _arm version under ARCH_ARM_HAVE_NEON to keep from trying
    to build this stuff for armv5te.

Tested-by: Rob Herring <robh@kernel.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit ba8533b6ea)
2017-08-19 02:57:19 +01:00
Eric Anholt
1fc44a3ba0 util: Fix build on old glibc.
We need to link librt for u_thread.h's clock_gettime() call.

Fixes: b822d9dd67 ("gallium/util: move u_queue.{c,h} to src/util")
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b94ddc181b)
2017-08-19 02:57:19 +01:00
Scott D Phillips
6ccb75e869 intel/genxml: Fix gen10 BLEND_STATE variable length packing
BLEND_STATE packing was modified to be variable-length in:

 9670124e31 genxml: Make BLEND_STATE command support variable length array.

The initial gen10.xml still had the old, fixed-length style
definition for BLEND_STATE. So gen10_upload_blend_state would
overwrite the packed BLEND_STATE_ENTRYs with its own fixed array
of all-zero entries when packing BLEND_STATE. This caused
BLEND_STATE upload to not work at all.

Fixes: aa416f515a ("i965/genxml: Add gen10.xml")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit d6539608a4)
2017-08-19 02:57:19 +01:00
Chris Wilson
b4fec4271b i965: Always allow CPU readback of the scanout on LLC platforms
LLC platforms are magic in that reads from the CPU are always cache
coherent, or rather GPU writes that bypass LLC do still invalidate the
appropriate cache line.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 49eda75df6)
2017-08-19 02:57:19 +01:00
Ilia Mirkin
efb9b5740e glsl: add a few missing int64 constant propagation cases
Fixes KHR-GL45.shader_ballot_tests.ShaderBallotAvailability, which
causes some silly swizzles to appear, triggering this optimization to
get hit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 9c8f017f77)
2017-08-19 02:57:19 +01:00
Dave Airlie
e44393cb23 radv: disable support for VEGA for now.
I'm working on this, but I'm not sure I'll make 17.2 at this stage,
maybe 17.2.1.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 611076a41a)
2017-08-19 02:57:19 +01:00
Ilia Mirkin
81b5a6a85b nv50/ir: fix TXQ srcMask
src0.x is always read for the LOD, irrespective of which outputs are
read.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 934511d1f3)
2017-08-19 02:57:19 +01:00
Ilia Mirkin
5795e42116 nv50/ir: fix srcMask computation for TG4 and TXF
This affects which inputs are marked as used. In a situation where only
the texture instruction uses an input, it might have been ignored as
unused due to input masks.

Affects subtests of KHR-GL45.texture_cube_map_array.sampling

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 054c54d1be)
2017-08-19 02:57:19 +01:00
Frank Richter
42e8f3ccdf gallium/os: fix os_time_get_nano() to roll over less
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102241
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 7fb7287ce7)
2017-08-19 01:46:08 +01:00
Frank Richter
e452ed26ff st/wgl: check for negative delta in wait_swap_interval()
This can happen because of rollover.  See bug report for details.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102241
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit d90e05ad48)
2017-08-19 01:46:08 +01:00
Frank Richter
c7c6ba44ca st/mesa: fix a null pointer access
Fixes crash with llvmpipe on Windows.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102148
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 496a691e35)
2017-08-19 01:46:08 +01:00
Tim Rowley
bdab7c69f5 swr/rast: Fix invalid casting for calls to Interlocked* functions
CID: 1416243, 1416244, 1416255
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit b333bc753e)
[Emil Velikov: resolve trivial conflicts - LONG -> long]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/swr/rasterizer/core/api.cpp
	src/gallium/drivers/swr/rasterizer/core/threads.cpp
2017-08-19 01:45:47 +01:00
Ilia Mirkin
47507ec1fd glsl/ast: update rhs in addition to the var's constant_value
We continue in the code to do some more things with the rhs, including
setting a constant initializer. If the type is wrong, this causes some
confusion down the line, leading to assertions. This makes sure that the
rhs processing continues to flow as-if the type was correct to start
with (even though the state has been marked as an error state).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101766
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 978c4c597a)
2017-08-19 01:42:08 +01:00
Dave Airlie
3fb5cc2637 radv/gfx9: for fast clear use is_linear flag.
The legacy test won't work on gfx9.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 694d59fbaf)
2017-08-19 01:42:08 +01:00
David Airlie
42d6e7eba2 radv/gfx9: handle GFX9 opaque metadata
port the opaque metadata changes from radeonsi for gfx9.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e43cc3e3af)
2017-08-19 01:36:25 +01:00
David Airlie
f60ff71b6f radv: emit db_htile_surface reg on gfx9 as well
This is also a GFX9 register.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 674ecbfef2)
2017-08-19 01:36:25 +01:00
Dave Airlie
f70f216564 radv/gfx9: remove some leftover gfx6 descriptor setup.
We set this later in the non-gfx9 path, just remove these
bits from here.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit fc600eb98d)
2017-08-19 01:36:25 +01:00
Dave Airlie
827bf79052 radv/gfx9: fix set predication packet.
The predication packet changed format on GFX9, update the driver.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 5247b311e9)
2017-08-19 01:36:25 +01:00
Marek Olšák
2b10b3a417 radeonsi: disable CE by default
It makes performance worse by a very small (hard to measure) amount.
We've done extensive profiling of this feature internally.

Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 1ab7fed707)
2017-08-19 01:36:25 +01:00
Scott D Phillips
27e7a3a5ef i965/blorp: Correct type of src_format in call to intel_miptree_texture_aux_usage
intel_miptree_texture_aux_usage() takes an isl_format, but we are
passing a mesa_format. clang warns:

 brw_blorp.c:305:52: warning: implicit conversion from enumeration
    type 'mesa_format' to different enumeration type
    'enum isl_format' [-Wenum-conversion]
       intel_miptree_texture_aux_usage(brw, src_mt, src_format);
       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~              ^~~~~~~~~~

Fixes: fc1639e46d ("i965/blorp: Use texture/render_aux_usage for blits")
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit f7dfc44c61)
2017-08-19 01:36:25 +01:00
Ilia Mirkin
da005f5566 nv50/ir: clean up saturated values immediately
Since we don't iterate to a fixed point, we can end up in situations
where we have a SAT instruction + a long immediate. This is not legal.
However since it's immediately computable, just run unary straight away
to handle the situation.

Fixes: 24a799ad35 ("nv50/ir: fix ConstantFolding with saturation")
Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 165e18dd21)
2017-08-19 01:36:25 +01:00
Jeremy Huddleston Sequoia
cc8ae8842b glxcmds: Fix a typo in the __APPLE__ codepath
s/DummyContext/dummyContext/

Regressed-in: 5d9b50e596
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
(cherry picked from commit c1c4c18a80)
2017-08-17 15:48:20 -07:00
Emil Velikov
3165f9877e Update version to 17.2.0-rc4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-12 17:04:27 +01:00
Dave Airlie
1e11687029 radv: force cs/ps/l2 flush at end of command stream. (v2)
This seems like a workaround, but we don't see the bug on CIK/VI.

On SI with the dEQP-VK.memory.pipeline_barrier.host_read_transfer_dst.*
tests, when one tests complete, the first flush at the start of the next
test causes a VM fault as we've destroyed the VM, but we end up flushing
the compute shader then, and it must still be in the process of doing
something.

Could also be a kernel difference between SI and CIK.

v2: hit this with a bigger hammer. This fixes a bunch of hangs
in the vk cts with the robustness tests.

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101334
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 82ba384c10)
2017-08-11 21:00:49 +01:00
Dave Airlie
ea595756f8 radv: fix MSAA on SI gpus.
This ports the workaround from radeonsi, that was missing in radv.

This fixes Talos rendering when MSAA is enabled on my Tahiti card.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8bf3930751)
2017-08-11 21:00:46 +01:00
Dave Airlie
ffd8120284 radv: fix f16->f32 denorm handling for SI/CIK. (v2)
This just copies the code from the -pro shaders,
and fixes the tests on CIK.

With this CIK passes the same set of conformance
tests as VI.

Fixes: 83e58b03 (radv: flush f32->f16 conversion denormals to zero. (v2))
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 3f389f75b6)
2017-08-11 21:00:43 +01:00
Bas Nieuwenhuizen
9d65214f3d radv: Use the correct channel for alpha in resolve srgb conversion.
The argument here is a bitmask, so the old code selected .xy, which
got silently truncated to .x when constructing the vec4 from components,
instead of using .w.

Fixes: 588185eb6b "radv/meta: add srgb conversion to end of resolve shader."
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit acba3a3151)
2017-08-11 21:00:40 +01:00
Bas Nieuwenhuizen
a57390cee0 radv: Only convert linear->srgb in compute resolves.
It justs works with the fragment shader resolve, so no need to do
a custom conversion. In fact with SRGB dest, it actually gives
wrong results.

Fixes: 69136f4e63 "radv/meta: add resolve pass using fragment/vertex shaders"
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 15e5a7a683)
2017-08-11 21:00:37 +01:00
Bas Nieuwenhuizen
8b706102eb radv: Don't use SRGB format for image stores during resolve.
These seem to store very bogus results. Luckily there is some code
that converts srgb->linear already, so just making the descriptor
format UNORM should work.

Fixes: 588185eb6b "radv/meta: add srgb conversion to end of resolve shader."
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8286c3a49f)
2017-08-11 21:00:34 +01:00
Marek Olšák
7f5d86ebaa radeonsi/gfx9: use the VI codepath for clamping Z
This fixes corrupted shadows in Unigine Valley.
The corruption disappeared when I stopped setting IMG_DATA_FORMAT_24_8
for depth.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 27fef5d52d)
2017-08-11 20:59:31 +01:00
Kenneth Graunke
1c1653d7b0 isl: Validate row pitch of stencil surfaces.
Also, silence an obnoxious finishme that started occurring for all
GL applications which use stencil after the i965 ISL conversion.

v2: Check against 3DSTATE_STENCIL_BUFFER's pitch bits when using
    separate stencil, and 3DSTATE_DEPTH_BUFFER's bits when using
    combined depth-stencil.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 5563872dbf)
2017-08-11 20:59:28 +01:00
Emil Velikov
d4100b0d09 egl: avoid eglCreatePlatform*Surface{EXT,} crash with invalid dpy
If we have an invalid display fed into the functions, the display lookup
will return NULL. Thus as we attempt to get the platform type, we'll
deref. it leading to a crash.

Keep in mind that this will not happen if Mesa is built without X11 or
when the legacy eglCreate*Surface codepaths are used.

A similar check was added with earlier commit 5e97b8f5ce ("egl: Fix
crashes in eglCreate*Surface), although it was only applicable when the
surfaceless platform is built.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 26fbb9eacd)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/egl/main/eglapi.c
2017-08-11 20:59:02 +01:00
Tim Rowley
bb6e5e5476 configure: remove trailing "-a" in swr architecture test
Fixes "configure: line 27326: test: argument expected"

CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 4d9b0dcccb)
2017-08-11 20:58:19 +01:00
Marek Olšák
08d49e074d ac: fail shader compilation if libelf is replaced by an incompatible version
UE4Editor has this issue.

This commit prevents hangs (release build) or assertion failures (debug
build). It doesn't fix the editor, but catastrophic scenarios are
prevented.

Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 4630ede102)
2017-08-11 20:58:14 +01:00
Karol Herbst
75f5abb82f nv50/ir: fix ConstantFolding with saturation
For mul(a, +-1) codegen can generate OP_MOV with a saturation flag
set which is ignored at emission. The same can happen with add(a, 0),
and others.

Adding an assert for detecting more of such issues.

Fixes wrongly rendered water in Hitman Absolution running under wine.
Also a few shaders in Mad Max and Alien Isolation produce such MOVs.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
[imirkin: generalize the fix for other cases]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 24a799ad35)
2017-08-11 20:57:54 +01:00
Alex Smith
f0b6298c05 radv: Fix decompression on multisampled depth buffers
Need to take the sample count into account in the depth decompress and
resummarize pipelines and render pass.

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2e9a13bf22)
2017-08-11 20:57:50 +01:00
Jason Ekstrand
ac04187e33 i965/miptree: Call alloc_aux in create_for_bo
Originally, I had moved it to the caller to make some things easier when
adding the CCS modifier.  However, this broke DRI2 because
intel_process_dri2_buffer calls intel_miptree_create_for_bo but never
calls intel_miptree_alloc_aux.  Also, in hindsight, it should be pretty
easy to make the CCS modifier stuff work even if create_for_bo allocates
the CCS when DISABLE_AUX is not set.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8e5808fc0c)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/drivers/dri/i965/intel_mipmap_tree.c
2017-08-11 20:57:36 +01:00
Jason Ekstrand
b1514579c2 intel/isl: Don't align the height of the last array slice
We were calculating the total height of 2D surfaces by multiplying the
row pitch by the number of slices.  This means that we actually request
slightly more space than actually needed since the padding on the last
slice is unnecessary.  For tiled surfaces this is not likely to make a
difference.  For linear surfaces, on the other hand, this means we may
require additional memory.  In particular, this makes the i965 driver
reject EGL imports of buffers which do not have this extra padding.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4d27c6095e)
2017-08-11 20:52:22 +01:00
Jason Ekstrand
dc63c715cb intel/isl: Stop padding surfaces
The docs contain a bunch of commentary about the need to pad various
surfaces out to multiples of something or other.  However, all of those
requirements are about avoiding GTT errors due to missing pages when the
data port or sampler accesses slightly out-of-bounds.  However, because
the kernel already fills all the empty space in our GTT with the scratch
page, we never have to worry about faulting due to OOB reads.  There are
two caveats to this:

 1) There is some potential for issues with caches here if extra data
    ends up in a cache we don't expect due to OOB reads.  However,
    because we always trash the entire cache whenever we need to move
    anything between cache domains, this shouldn't be an issue.

 2) There is a potential issue if a surface gets placed at the very top
    of the GTT by the kernel.  In this case, the hardware could
    potentially end up trying to read past the top of the GTT.  If it
    nicely wraps around at the 48-bit (or 32-bit) boundary, then this
    shouldn't be an issue thanks to the scratch page.  If it doesn't,
    then we need to come up with something to handle it.

Up until some of the GL move to ISL, having the padding code in there
just caused us to harmlessly use a bit more memory in Vulkan.  However,
now that we're using ISL sizes to validate external dma-buf images,
these padding requirements are causing us to reject otherwise valid
images due to the size of the BO being too small.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c15b92ce11)
2017-08-11 20:52:19 +01:00
Jason Ekstrand
9d9ea2c5a4 anv/formats: Allow sampling on depth-only formats on gen7
We can't sample from depth-stencil formats but on gen7 but we can sample
from depth-only formats.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102024
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 06d3115bb9)
2017-08-11 20:52:16 +01:00
Dave Airlie
07b5c78836 radv: avoid GPU hangs if someone does a resolve with non-multisample src (v2)
This is a bug in the app, but I'd rather avoid hanging the GPU,
esp if someone is running in validation and it takes out their
development environment.

v2: get it right, reverse the polarity.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 36a1b61321)
2017-08-11 20:52:13 +01:00
Emil Velikov
59f7fdb85e egl/x11: don't leak xfixes_query in the error path
If we get a xfixes v1.x we'll error out, without freeing the
xfixes_query reply.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit c961b679fe)
2017-08-11 20:52:08 +01:00
Emil Velikov
29df4deef2 Update version to 17.2.0-rc3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-07 12:45:40 +01:00
Jason Ekstrand
e4371d14f1 anv: Stop advertising VK_KHX_multiview
We don't want to advertise experimental extensions in actual releases.
However, there's no harm in leaving the code lying around in the tree.
2017-08-05 00:09:26 +01:00
Tomasz Figa
0b2c034f64 st/dri: enable 32-bit RGBX/RGBA formats only on Android
X/GLX can't handle them. This removes almost 500 GLX visuals that were
incorrectly exposed.

This is a less invasive version of Marek's .getCapability series.
Note: the patch is not applicable for master, but only for the 17.2
branch.

Suggested-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Fixes: f33d8af7aa "st/dri: add 32-bit RGBX/RGBA formats"
[Emil Velikov: commit message polish]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-05 00:09:26 +01:00
Tim Rowley
f4e163094d swr/rast: fix scons gen_knobs.h dependency
Copy/paste error was duplicating a gen_knobs.cpp rule.

Fixes: 5079c277b5 ("swr: [scons] Fix windows build")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit e4a6ae06cf)
2017-08-05 00:09:26 +01:00
Timothy Arceri
4a181e6244 mesa/st: fix conditional jump depends on uninitialised value
Reported by valgrind at:
glsl_to_tgsi_visitor::visit(ir_expression*) (st_glsl_to_tgsi.cpp:1560)

When compiling the Deus Ex shaders.

Fixes: 28a5e7104 ("st/glsl_to_tgsi: handle precise modifier")
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 06237fc9e1)
2017-08-05 00:09:26 +01:00
Scott D Phillips
8ef9fe7229 gles: Restore some lost typedefs
GLES/gl.h has historically provided some typedefs that are not
used in the API itself. Restore these typedefs that were lost to
avoid breaking applications.

These seem to be the only typedefs removed in the update.

Fixes: 7fd0817 "Update Khronos-supplied headers"

[Eric: added a big warning to revert this patch when pulling the updated header]
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 3db05ed1d1)
2017-08-05 00:09:26 +01:00
Dave Airlie
4f872e62c2 Revert "st_glsl_to_tgsi: rewrite rename registers to use array fully."
This reverts commit 3008161d28,
which caused a regression for VMWare.

The initial code had some recursion in it, that I removed by accident
trying to add back the recursion broke lots of things, take the high
road and revert for now.

Fixes: 3008161d (st_glsl_to_tgsi: rewrite rename registers to use array fully.)
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b8bea9a050)
2017-08-05 00:09:26 +01:00
Dave Airlie
c5fac38ced radv: handle 10-bit format clamping workaround.
This fixes:
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.*
for a2r10g10b10 formats as destination on SI/CIK hardware.

This adds support to the meta program for emitting 10-bit
outputs, and adds 10-bit support to the fragment shader key.

It also only does the int8/10 on SI/CIK.

Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit df61a05019)
2017-08-05 00:09:26 +01:00
Bas Nieuwenhuizen
579ecfd91e radv: Don't underflow non-visible VRAM size.
In some APU situations the reported visible size can be larger than
VRAM size. This properly clamps the value.

Surprisingly both CTS and spec seem to allow a heap type with size 0,
so this seemed like the easiest option to me.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: 4ae84efbc5 "radv: Use enum for memory heaps."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 8229706ad8)
2017-08-05 00:09:26 +01:00
Bruce Cherniak
6b279b3271 st/osmesa: add osmesa framebuffer iface hash table per st manager
Commit bbc29393d3 didn't include osmesa state_tracker.  This patch adds
necessary initialization.

Fixes crash in OSMesa initialization.

Created-by: Charmaine Lee <charmainel@vmware.com>
Tested-by: Bruce Cherniak <bruce.cherniak@intel.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9966c85e01)
2017-08-05 00:09:26 +01:00
Dave Airlie
6efb8d79a9 intel/vec4/gs: reset nr_pull_param if DUAL_INSTANCED compile failed.
If dual object compile fails (as seems to happen with virgl a
fair bit, and does piglit even have any tests for it?), we end up
not restarting the pull params, so we call
vec4_visitor::move_uniform_array_access_to_pull_constant
a second time and it runs over the ends of the alloc.

Fixes: tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test
running inside virgl on ivybridge.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 271fa3a684)
2017-08-05 00:09:25 +01:00
Chris Wilson
795b712bd7 i965/blit: Remember to include miptree buffer offset in relocs
Remember to add the offset to the start of the buffer in the relocation
or else we write 0xff into random bytes elsewhere.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit fb63c43fd1)
2017-08-05 00:09:25 +01:00
Kenneth Graunke
43a2b178c2 i965: Delete pitch alignment assertion in get_blit_intratile_offset_el.
The cacheline alignment restriction is on the base address; the pitch
can be anything.

Fixes assertion failures when using primus (say, on glxgears, which
creates a 300x300 linear BGRX surface with a pitch of 1200):

intel_blit.c:190: get_blit_intratile_offset_el: Assertion `mt->surf.row_pitch % 64 == 0' failed.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 595a47b829)
2017-08-05 00:09:25 +01:00
Jason Ekstrand
d5def4f5a9 spirv: Fix SpvImageFormatR16ui
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.1 17.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 95c6a97464)
2017-08-05 00:09:25 +01:00
Thomas Hellstrom
f2a60ff20a dri3: Wait for all pending swapbuffers to be scheduled before touching the front
This implements a wait for glXWaitGL, glXCopySubBuffer, dri flush_front and
creation of fake front until all pending SwapBuffers have been committed to
hardware. Among other things this fixes piglit glx-copy-sub-buffers on dri3.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 185ef06fd2)
2017-08-05 00:09:25 +01:00
Nicolai Hähnle
3c8673d420 gallium/radeon: fix ARB_query_buffer_object conversion to boolean
The issue here is that the immediate is treated as a 64-bit value,
and fetching it does not work reliably with swizzles that are different
from xy and zw.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit da83687c4b)
2017-08-05 00:09:25 +01:00
Dave Airlie
381ccaa1cb radeon/ac: use ds_swizzle for derivs on si/cik.
This looks like it's supported since llvm 3.9 at least,
so switch over radeonsi and radv to using it, -pro also
uses this. We can now drop creating lds for these operations
as the ds_swizzle operation doesn't actually write to lds at all.

Acked-by: Marek Olšák <marek.olsak@amd.com>
(stable requested due to fixing radv CIK conformance tests)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit cb6f16dce9)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/amd/common/ac_nir_to_llvm.c
2017-08-05 00:09:25 +01:00
Connor Abbott
2dd6030fbb ac/nir: fix lsb emission
This makes it match radeonsi. The LLVM backend itself will emit the
correct instruction, but LLVM might do incorrect optimizations since it
thinks the output is undefined when the input is 0, even though it's not
supposed to be. We really need a new intrinsic, or for the backend to
become smarter and recognize this pattern.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <basni@google.com>
(cherry picked from commit 6d731c5651)
2017-08-05 00:09:25 +01:00
Connor Abbott
a90d99f7a5 nir: fix algebraic optimizations
The optimizations are only valid for 32-bit integers. They were
mistakenly firing for 64-bit integers as well.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit de91461575)
2017-08-05 00:09:25 +01:00
Marek Olšák
6806773905 Revert "st/mesa: release sampler views when redefining a texture in st_context_teximage"
This reverts commit 5c1241268b.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101961

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d85802e501)
2017-08-05 00:09:25 +01:00
Nicolai Hähnle
3e872f4b53 radeonsi: ensure that temp array allocas are in the entry block
Otherwise, code generation fails. This has become necessary since some
shaders are wrapped in control flow.

Fixes: 081ac6e5c6 ("radeonsi/gfx9: always wrap GS and TCS in an if-block (v2)")
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 2879a602dd)
2017-08-05 00:09:25 +01:00
Samuel Pitoiset
21ea75b3e9 mesa: fix mismatch when returning 64-bit bindless uniform handles
The slower convert-and-copy process performs a bad conversion
because it converts the value to signed 64-bit integer, but
bindless uniform handles are considered unsigned 64-bit.

This fixes "Check glUniform*() with mixed texture units/handles"
from arb_bindless_texture-uniform piglit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit b38c9c57f2)
2017-08-05 00:09:25 +01:00
Emil Velikov
58fe86a6d6 Update version to 17.2.0-rc2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-31 10:52:13 +01:00
Samuel Pitoiset
d466a70532 st/glsl_to_tgsi: fix getting the image type for array of structs
Since array splitting for AoA is disabled, we have to retrieve
the type of the first non-array type when an array of images is
declared inside a structure. Otherwise, it will hit an assert
in glsl_type::sampler_index() because it expects either a sampler
or an image type.

This fixes a regression in the following piglit test:
arb_bindless_texture/compiler/images/arrays-of-struct.frag

Fixes: 57165f2ef8 ("glsl: disable array splitting for AoA")
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit f99e9335e2)
2017-07-31 10:26:27 +01:00
Marek Olšák
e62eddcdbe st/mesa: release sampler views when redefining a texture in st_context_teximage
Noticed randomly.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 5c1241268b)
2017-07-31 10:25:55 +01:00
Dave Airlie
6d07e58afb radv: for stencil only set Z tile mode index to same value
On SI this was causing a hang in
dEQP-VK.pipeline.render_to_image.core.2d_array.mipmap.r16g16_sint_s8_uint

This was due to not handling the tile mode index for depth like
I fixed previously for new GPUs.

Fixes: 01d0c5a9 (radv: fix stencil regression since new addrlib import)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 800d162209)
2017-07-31 10:24:42 +01:00
Dave Airlie
f9e563597d virgl: drop precise modifier.
The host doesn't understand this yet, so drop it for now.

Fixes: virgl regressions.

Fixes: af22adee4f (tgsi: add precise flag to tgsi_instruction)
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 554aa09440)
2017-07-31 10:23:09 +01:00
Marek Olšák
5b61ba4432 radeonsi: update dirty_level_mask only when flushing or unbinding framebuffer
This fixes corruption with bindless textures in Dawn Of War 3.

The do_update_surf_dirtiness mechanism was complicated and dirty_level_mask
was only updated after the first draw call. The problem is bindless textures
are checked for decompression every draw call and we would only decompress
after the first draw call. The solution is to set dirtiness after the last
draw call to the framebuffer, so the (unconditional) decompression of
bindless textures happens at the right time.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit f4d095cc65)
2017-07-31 10:21:17 +01:00
Marek Olšák
6625382b1c st/mesa: always unconditionally revalidate main framebuffer after SwapBuffers
This fixes the black Feral launcher window.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101867

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
(cherry picked from commit 7257c171e9)
2017-07-31 10:20:44 +01:00
Nicolai Hähnle
2bca74253d radeonsi/gfx9: always wrap GS and TCS in an if-block (v2)
With merged ESGS shaders, the GS part of a wave may be empty, and the
hardware gets confused if any GS messages are sent from that wave. Since
S_SENDMSG is executed even when EXEC = 0, we have to wrap even
non-monolithic GS shaders in an if-block, so that the entire shader and
hence the S_SENDMSG instructions are skipped in empty waves.

This change is not required for TCS/HS, but applying it there as well
simplifies the logic a bit.

Fixes GL45-CTS.geometry_shader.rendering.rendering.*

v2: ensure that the TCS epilog doesn't run for non-existing patches

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 081ac6e5c6)
2017-07-31 10:20:20 +01:00
Nicolai Hähnle
b36ff2d1f2 radeonsi/gfx9: fix vertex idx in ES with multiple waves per threadgroup
Cc: mesa-stable@lists.freedesktop.org
Reviewed: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 873789002f)
2017-07-31 10:20:12 +01:00
George Kyriazis
99b2613ce1 swr: fix transform feedback logic
The shader that is used to copy vertex data out of the vs/gs shaders to
the user-specified buffer (streamout or SO shader) was not using the
correct offsets.

Adjust the offsets that are used just for the SO shader:
- Make sure that position is handled in the same special way
  as in the vs/gs shaders
- Use the correct offset to be passed in the core
- consolidate register slot mapping logic into one function, since it's
  been calculated in 2 different places (one for calcuating the slot mask,
  and one for the register offsets themselves

Also make room for all attibutes in the backend vertex area.

Fixes:
- all vtk GL2PS tests
- 18 piglit tests (16 ext_transform_feedback tests,
  arb-quads-follow-provoking-vertex and primitive-type gl_points

v2:

- take care of more SGV slots in slot mapping logic
- trim feState.vsVertexSize
- fix GS interface and incorporate GS while calculating vsVertexSize

Note that vsVertexSize is used in the core as the one parameter that
controls vertex size between all stages, so it has to be adjusted appropriately
for the whole vs/gs/fs pipeline.

Also note that GS and SO is not fully implemented.  This will be addressed
later.

fixes:
- fixes total of 20 piglit tests

CC: 17.2 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit 194ff5eed1)
2017-07-31 10:19:55 +01:00
Dave Airlie
f9c7549605 radv/ac: port SI TC L1 write corruption fix.
This ports 72e46c988 to radv.
    radeonsi: apply a TC L1 write corruption workaround for SI

Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e77ff11ffe)
2017-07-27 19:54:53 +01:00
Dave Airlie
2ce4f0afd3 radv/ac: realign SI workaround with radeonsi.
This ports: da7453666a
radeonsi: don't apply the Z export bug workaround to Hainan
to radv.

Just noticed in passing.

Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a81e99f50a)
2017-07-27 19:54:52 +01:00
Dave Airlie
546282e8bc radv: only report external semaphore info for opaque fd.
Until we support sync fd, don't report the info.

Fixes CTS dEQP-VK.api.external.semaphore.sync_fd.* from crashing.

Fixes: eaa56eab6 (radv: initial support for shared semaphores (v2))
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 6cbc8cf178)
2017-07-27 19:54:52 +01:00
Dave Airlie
de55bc8f49 radv: fix buffer views on SI/CIK.
Fixes CTS dEQP-VK.memory.pipeline_barrier.host_write_uniform_texel_buffer.1024
on SI/CIK with radv.

Fixes: f4e499ec (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ca82ef5ac7)
2017-07-27 19:54:52 +01:00
Daniel Stone
f0b7563a36 st/dri2: Return invalid modifier when no driver support
Always initialise whandle.modifier for DRIImage modifier queries, so if
the driver doesn't support it then we return false for the query.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Fixes: d33fe8b84e ("st/dri: enable DRIimage modifier queries")
(cherry picked from commit 45383d32d4)
2017-07-27 19:54:52 +01:00
Daniel Stone
5bee196840 st/dri: Check get-handle return value in queryImage
In the DRIImage queryImage hook, check if resource_get_handle() failed
and return FALSE if so.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit b4a18f13ce)
2017-07-27 19:54:52 +01:00
Daniel Stone
2a1792981c egl/wayland: Ignore invalid modifiers
If the underlying driver does not support modifiers, dmabuf will still
advertise formats through the 'modifier' event, but send them with an
invalid modifier. Ignore them if this is the case, rather than passing
them through to the driver.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Fixes: 02cc359372 ("egl/wayland: Use linux-dmabuf interface for buffers")
(cherry picked from commit dd072cf4b1)
2017-07-27 19:54:52 +01:00
Tim Rowley
a2b7477603 swr/rast: non-regex knob fallback code for gcc < 4.9
gcc prior to 4.9 didn't implement <regex>, causing a startup crash
in the swr knob parameter reading code.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit e21fc2c625)
2017-07-27 19:54:52 +01:00
Dave Airlie
3e777d5cab virgl: encode index buffer offset.
Fixes arb_vertex_buffer_object-combined-vertex-index

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c4652a0a5b)
2017-07-27 19:54:52 +01:00
Marek Olšák
9bb6aa5794 ac/surface: fix hybrid graphics where APU=GFX9, dGPU=older
v2: don't do it for compressed textures (bpp = 0)

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
(cherry picked from commit 5e81df0f10)
2017-07-27 19:54:52 +01:00
Marek Olšák
90bbcb93b1 radeonsi: decrease the number of compiler threads
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit ed2b3f5c81)
2017-07-27 19:54:52 +01:00
Marek Olšák
529c440dd3 gallium/radeon: make S_FIXED function signed and move it to shared code
This fixes a bug uncovered by:
    2412c4c81e
    util: Make CLAMP turn NaN into MIN.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 433f6f7ac9)
2017-07-27 19:54:52 +01:00
Charmaine Lee
94e0de90ee st/mesa: create framebuffer iface hash table per st manager
With commit 5124bf9823, a framebuffer interface hash table is
created in st_gl_api_create(), which is called in
dri_init_screen_helper() for each screen. When the hash table is
overwritten with multiple calls to st_gl_api_create(), it can cause
race condition. This patch fixes the problem by creating a
framebuffer interface hash table per state tracker manager.

Fixes crash with steam.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101876
Fixes: 5124bf9823 ("st/mesa: add destroy_drawable interface")
Tested-by: Christoph Haag <haagch@frickel.club>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit bbc29393d3)

Squashed with:

st/mesa: fix unconditional return in st_framebuffer_iface_remove

Noticed by James Legg @ Feral.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 914f11e75b)

Squashed with:

st/mesa: Fix inversed test in st_api_destroy_drawable

Fixes a drawable leak.

Fixes: bbc29393d3 ("st/mesa: create framebuffer iface hash table per
                      st manager")
Bugzilla: https://bugs.freedesktop.org/101930
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 57132d126f)
2017-07-27 19:54:24 +01:00
Grigori Goronzy
3180f0fa0d egl: move KHR_no_error vs debug/robustness check further down
We'll fail to flag an error if the context flags appear after the
no-error attribute in the context attribute list.

Delay the check to after attribute parsing to fix this.

Fixes: 4909519a66 ("egl: Add EGL_KHR_create_context_no_error support")
Cc: mesa-stable@lists.freedesktop.org
[Emil Velikov: add fixes/stable tags, commit message polish]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

(cherry picked from commit 39bf7756b9)
2017-07-27 18:56:45 +01:00
Nicolai Hähnle
e7f14a8b52 radeonsi/gfx9: reduce max threads per block to 1024 on gfx9+
The number of supported waves per thread group has been reduced to 16
with gfx9. Trying to use 32 waves causes hangs, and barriers might
not work correctly with > 16 waves.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a0e6b9a2db)
2017-07-27 18:56:45 +01:00
Nicolai Hähnle
7f5d9d7a6d radeonsi: fix detection of DRAW_INDIRECT_MULTI on SI
The firmware version numbers for SI were wrong. The new numbers are probably
too conservative (we don't have a definitive answer by the firmware team),
but DRAW_INDIRECT_MULTI has been confirmed to work with these versions on
Tahiti (by Gustaw) and on Verde (by myself).

While this is technically adding a feature, it's a feature we thought we had
for a long time. The change is small enough and we're early enough in the 17.2
release cycle that it should still go in.

Reported-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 65fbaab0b7)
2017-07-27 18:56:45 +01:00
Iago Toral Quiroga
3d0960e761 anv: only expose up to 28 vertex attributes
The EU limit of 128 GRFs should allow 32 vertex elements of 4 GRFs.
However, the maximum allowed value of "Vertex URB Entry Read Length"
in SIMD8 is 15. And 15 * 8 = 120 gives us a limit of 30 vertex elements.
Because we also need to reserve a vertex buffer to upload
VertexIndex/InstanceIndex and another to upload DrawID when needed,
we can only expose 28.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 31f1863ace)
2017-07-27 18:56:45 +01:00
Iago Toral Quiroga
bdbd8ab517 anv/cmd_buffer: fix off by one error in assertion
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit a848e693ef)
2017-07-27 18:56:44 +01:00
Kenneth Graunke
47bca2cfa7 i965: Fix = vs == in MCS aux usage assert.
Caught by Coverity (CID 1415680).

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 698636cc97)
Fixes: 0f9b609cf4 ("i965/blorp: Do prepare/finish manually")
2017-07-27 18:54:41 +01:00
Kenneth Graunke
04bb687f04 i965: Fix offset addition in get_isl_surf.
Increase the value, not the pointer to the stack variable.

Caught by Coverity (CID 1415574).  Not shipped in a real release.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit f6e674fa51)
Fixes: 63a43f4161 ("i965: Refactor miptree to isl converter and adjustment")
2017-07-27 18:54:13 +01:00
Eric Anholt
4e0f29ed0b broadcom/vc4: Prefer blit via rendering to the software fallback.
I don't know how I managed to leave this here for so long.  Found when
working on a 1:1 overlapping blit extension for X11.

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 93fec49a75)
2017-07-27 18:21:27 +01:00
Lionel Landwerlin
0ccb853cc0 i965: perf: flush batchbuffers at the beginning of queries
As Chris commented, it makes more sense to have batch buffer flushes
before the query. Usually applications like frame_retrace do a series
of queries and in that case, with flushes at the end of the queries,
we might still have the first query contained in 2 different batchs.
More generally it would be quite usual to have the query contained in
2 batch buffers because we never now what's the fill rate of the
current batch buffer.

If we move the flushing at the beginning of the queries, it's pretty
much guaranteed that queries will be contained in a single batch
buffer (unless the amount of commands is huge, but then it's only fair
to include reloading request times in the measurements).

Fixes: adafe4b733 ("i965: perf: minimize the chances to spread queries across batchbuffers")
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 9f439ae120)
2017-07-27 18:21:22 +01:00
Emil Velikov
a455f594bb Update version to 17.2.0-rc1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-24 16:59:32 +01:00
Emil Velikov
a955622c1a intel/blorp: ship blorp_genX_exec.h within the tarball
Fixes: c9cb37b2a6 ("intel/blorp: Add a partial resolve pass for MCS")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 5d47dd9c2a)
2017-07-24 16:59:32 +01:00
265 changed files with 5981 additions and 1976 deletions

View File

@@ -40,6 +40,7 @@ matrix:
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx --disable-gallium-osmesa"
- GALLIUM_DRIVERS=""
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--disable-libunwind"
addons:
apt:
packages:
@@ -66,6 +67,7 @@ matrix:
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx --disable-gallium-osmesa"
- GALLIUM_DRIVERS="swr"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
@@ -81,6 +83,7 @@ matrix:
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
- LABEL="make Gallium Drivers Other"
- BUILD=make
@@ -93,6 +96,7 @@ matrix:
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx --disable-gallium-osmesa"
- GALLIUM_DRIVERS="i915,nouveau,pl111,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,etnaviv,imx"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
@@ -108,6 +112,7 @@ matrix:
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
# NOTE: Analogous to SWR above, building Clover is quite slow.
- LABEL="make Gallium ST Clover"
@@ -125,6 +130,7 @@ matrix:
# Regardless - we're doing a quick build test here.
- GALLIUM_DRIVERS="i915"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
@@ -144,11 +150,14 @@ matrix:
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
- LABEL="make Gallium ST Other"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=3.3
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --disable-opencl --enable-xa --enable-nine --enable-xvmc --enable-vdpau --enable-va --enable-omx --enable-gallium-osmesa"
@@ -157,9 +166,12 @@ matrix:
# Regardless - we're doing a quick build test here.
- GALLIUM_DRIVERS="i915,swrast"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
packages:
# We actually want to test against llvm-3.3
- llvm-3.3-dev
# Nine requires gcc 4.6... which is the one we have right ?
- libxvmc-dev
# Build locally, for now.
@@ -174,6 +186,7 @@ matrix:
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
- LABEL="make Vulkan"
- BUILD=make
@@ -186,6 +199,7 @@ matrix:
- GALLIUM_ST="--enable-dri --enable-dri3 --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx --disable-gallium-osmesa"
- GALLIUM_DRIVERS=""
- VULKAN_DRIVERS="intel,radeon"
- LIBUNWIND_FLAGS="--disable-libunwind"
addons:
apt:
sources:
@@ -367,6 +381,7 @@ script:
export CC="$CC -isystem`pwd`";
./autogen.sh --enable-debug
$LIBUNWIND_FLAGS
$DRI_LOADERS
--with-dri-drivers=$DRI_DRIVERS
$GALLIUM_ST

View File

@@ -88,6 +88,10 @@ LOCAL_CFLAGS += \
endif
endif
ifeq ($(ARCH_ARM_HAVE_NEON),true)
LOCAL_CFLAGS_arm += -DUSE_ARM_ASM
endif
LOCAL_CFLAGS_arm64 += -DUSE_AARCH64_ASM
ifneq ($(LOCAL_IS_HOST_MODULE),true)
LOCAL_CFLAGS += -DHAVE_LIBDRM

View File

@@ -92,16 +92,12 @@ define mesa-build-with-llvm
$(if $(filter $(MESA_ANDROID_MAJOR_VERSION), 4 5), \
$(warning Unsupported LLVM version in Android $(MESA_ANDROID_MAJOR_VERSION)),) \
$(if $(filter 6,$(MESA_ANDROID_MAJOR_VERSION)), \
$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0307 -DMESA_LLVM_VERSION_PATCH=0) \
$(eval LOCAL_STATIC_LIBRARIES += libLLVMCore) \
$(eval LOCAL_C_INCLUDES += external/llvm/include external/llvm/device/include),) \
$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0307 -DMESA_LLVM_VERSION_PATCH=0)) \
$(if $(filter 7,$(MESA_ANDROID_MAJOR_VERSION)), \
$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0308 -DMESA_LLVM_VERSION_PATCH=0) \
$(eval LOCAL_STATIC_LIBRARIES += libLLVMCore) \
$(eval LOCAL_C_INCLUDES += external/llvm/include external/llvm/device/include),) \
$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0308 -DMESA_LLVM_VERSION_PATCH=0)) \
$(if $(filter O,$(MESA_ANDROID_MAJOR_VERSION)), \
$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 -DMESA_LLVM_VERSION_PATCH=0) \
$(eval LOCAL_HEADER_LIBRARIES += llvm-headers),)
$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 -DMESA_LLVM_VERSION_PATCH=0)) \
$(eval LOCAL_SHARED_LIBRARIES += libLLVM)
endef
# add subdirectories

View File

@@ -41,6 +41,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \
--enable-xa \
--enable-xvmc \
--enable-llvm-shared-libs \
--enable-libunwind \
--with-platforms=x11,wayland,drm,surfaceless \
--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \
--with-gallium-drivers=i915,nouveau,r300,pl111,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,swr,etnaviv,imx \

View File

@@ -50,10 +50,10 @@ except KeyError:
pass
else:
targets = targets.split(',')
print 'scons: warning: targets option is deprecated; pass the targets on their own such as'
print
print ' scons %s' % ' '.join(targets)
print
print('scons: warning: targets option is deprecated; pass the targets on their own such as')
print()
print(' scons %s' % ' '.join(targets))
print()
COMMAND_LINE_TARGETS.append(targets)

View File

@@ -1 +1 @@
17.2.0-devel
17.2.5

86
bin/.cherry-ignore Normal file
View File

@@ -0,0 +1,86 @@
# fixes: The commits are too invasive for stable. Instead the offending patches
# causing regressions have been reverted.
365d34540f331df57780dddf8da87235be0a6bcb mesa: correctly calculate the storage offset for i915
de0e62e1065e2d9172acf3ab7c70bba0160125c8 st/mesa: correctly calculate the storage offset
# stable: Add loader::getCapability patches. It's rather invasive infra
# not suitable as a bugfix.
1bf703e4ea5c4f742bc7ba55d01e5afc3f4e11f9 dri_interface,egl,gallium: only expose RGBA visuals on Android
be5773fa8dfe9255d9abaf5c7d5bbbd2d922da08 Android: fix compile error for DRI2 loader getCapability
31a6750988d7dd431f72ff1ff11bfca83bde5d8c st/dri: NULL check before deref DRI loader .getCapability
# stable: The commit addresses code that did not land in the stable branch
31bb8517a194af733deefe2d821537d994d39365 radv/gfx9: fix tile swizzle handling for gfx9
# stable: Commit is not applicable when 4fab67a4415 is missing.
d496780fb2c7f2cf0e32b6a79dc528e5156dfcb3 intel/eu/validate: Look up types on demand in execution_type()
# fixes: Depend on preseding commit which adds new public GBM API
3a5e3aa5a53cff55a5e31766d713a41ffa5a93d7 egl/drm: Fix misused x and y offsets in swrast_put_image2()
fe2a6281b3b299998fe7399e7dbcc2077d773824 egl/drm: Fix misused x and y offsets in swrast_get_image()
# fixes: This commit addressed an earlier commit c7e9ebb3ab8 which did not
# land in branch
45c5c444518b7e83d9accd9f44702fa49282a3b8 radeonsi/gfx9: proper workaround for LS/HS VGPR initialization bug
# fixes: This commit addressed earlier commits 61ad2f13 and 6dcc54b4 which did
# not land in branch
979978ee06867a531b8d56cee252f5c83920a339 radv: Check for GFX9 for 1D arrays in image_size intrinsic.
# fixes: This commit addressed earlier commits dcf46e99 and 60878dd0 which did
# not land in branch
8e9e339c530c7b82b5a29d4b3183e8f5a01eae28 radv: copy the number of viewports/scissors at pipeline bind time
# stable: The commit regresses a few dEQP tests. Namely:
# dEQP-VK.api.copy_and_blit.core.buffer_to_buffer.partial
# dEQP-VK.api.copy_and_blit.dedicated_allocation.buffer_to_buffer.partial
14555d0b7a51bd3701764fd213c2459410143431 anv: Remove unreachable cases from isl_format_for_size()
# stable: The commit addresses earlier commit a62a9793357 which is no applicable
# for the stable branch
6c7720ed78db754d52f204cbb74897aa9e65ea7e anv/wsi: Allocate enough memory for the entire image
# stable: Commits are too invasive for 17.2.
98fdff7247b6877d028d33284f9cc63189ee204e configure.ac: factor out detection for old and buggy llvm
13a53c4f5cdd664fd155c9e78fb46a4387af006c configure.ac: rework llvm libs handling for 3.9+
a7ecf7b86f4eae59f3ceac2125e5d1725c403c07 Travis: add binutils 2.26 for a few more LLVM 3.9 builds
36d6d1e931936a80da327889862ba02942ac427b configure.ac: add llvm_add_optional_component helper
df3a43018020c16c1dfa88a76c9a84c9fb85be38 configure.ac: add missing LLVM components for OpenCL
# stable: Commit is too big for stable at this point.
4d24a7cb97641cacecd371d1968f6964785822e4 glsl: fix derived cs variables
# stable: 17.3 nomination only.
fee9d05e2136b2b7c5a1ad2be7180b99f733f539 radv: Update code pointer correctly if a variant is already created
# stable: 17.3 nomination only.
d8cefaa197f02944812ef535b1b303dd5bf26848 radv: use device name in cache creation like radeonsi.
# fixes: This commit addressed earlier commit 35ac13ed3 which did not
# land in branch.
11d688d9f0d2ee4d0178d1807c0075e5e8364b1d mesa/bufferobj: don't double negate the range
# extra: Commit is not applicable when ade416d0236 is missing.
07bfdb478bf844a0ac9cf3679f51f83c4abea5a1 broadcom/vc5: Propagate vc4 aliasing fix to vc5.
# stable: This commit addressed earlier commit 8d90e28839 which did
# not land in branch.
446c5726ecb968d06a6607e0df42be1cb74948c4 i965: fix blorp stage_prog_data->param leak
# stable: This commit addressed earlier commit 78ade659569 which did
# not land in branch.
8fbd82f464f26a56167f7962174b2b69756a105a etnaviv: don't do resolve-in-place without valid TS
# stable: This commit addressed earlier commit 8d90e28839 which did
# not land in branch.
7b4387519c382cffef9c62bbbbefcfe71cfde905 intel/fs: Alloc pull constants off mem_ctx
# stable: 17.3 nomination only.
3f8e3c2bd8f54ae6817f7496be47f4e1a8860d9c radeonsi: add a workaround for weird s_buffer_load_dword behavior on SI
# stable: 17.3 nomination only.
7dae419aa7c34af820c08896acef3b65d855188e Android: move drivers' symlinks to /vendor (v2)
# fixes: This commit has more than one Fixes tag but the commit it
# addresses didn't land in branch.
e17e8934f9e4b008bdfb4f9abd8ed4faa604c7d9 automake: include git_sha1.h.in in release tarball

View File

@@ -410,8 +410,21 @@ int main() {
}]])], GCC_ATOMIC_BUILTINS_SUPPORTED=1)
if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then
DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
dnl On some platforms, new-style atomics need a helper library
AC_MSG_CHECKING(whether -latomic is needed)
AC_LINK_IFELSE([AC_LANG_SOURCE([[
#include <stdint.h>
uint64_t v;
int main() {
return (int)__atomic_load_n(&v, __ATOMIC_ACQUIRE);
}]])], GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC=no, GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC=yes)
AC_MSG_RESULT($GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC)
if test "x$GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC" = xyes; then
LIBATOMIC_LIBS="-latomic"
fi
fi
AM_CONDITIONAL([GCC_ATOMIC_BUILTINS_SUPPORTED], [test x$GCC_ATOMIC_BUILTINS_SUPPORTED = x1])
AC_SUBST([LIBATOMIC_LIBS])
dnl Check if host supports 64-bit atomics
dnl note that lack of support usually results in link (not compile) error
@@ -773,6 +786,20 @@ if test "x$enable_asm" = xyes; then
;;
esac
;;
aarch64)
case "$host_os" in
linux*)
asm_arch=aarch64
;;
esac
;;
arm)
case "$host_os" in
linux*)
asm_arch=arm
;;
esac
;;
esac
case "$asm_arch" in
@@ -792,6 +819,14 @@ if test "x$enable_asm" = xyes; then
DEFINES="$DEFINES -DUSE_PPC64LE_ASM"
AC_MSG_RESULT([yes, ppc64le])
;;
aarch64)
DEFINES="$DEFINES -DUSE_AARCH64_ASM"
AC_MSG_RESULT([yes, aarch64])
;;
arm)
DEFINES="$DEFINES -DUSE_ARM_ASM"
AC_MSG_RESULT([yes, arm])
;;
*)
AC_MSG_RESULT([no, platform not supported])
;;
@@ -804,6 +839,27 @@ AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"])
AC_MSG_CHECKING([whether strtod has locale support])
AC_LINK_IFELSE([AC_LANG_SOURCE([[
#define _GNU_SOURCE
#include <stdlib.h>
#include <locale.h>
#ifdef HAVE_XLOCALE_H
#include <xlocale.h>
#endif
int main() {
locale_t loc = newlocale(LC_CTYPE_MASK, "C", NULL);
const char *s = "1.0";
char *end;
double d = strtod_l(s, end, loc);
float f = strtof_l(s, end, loc);
freelocale(loc);
return 0;
}]])],
[DEFINES="$DEFINES -DHAVE_STRTOD_L"];
AC_MSG_RESULT([yes]),
AC_MSG_RESULT([no]))
dnl Check to see if dlopen is in default libraries (like Solaris, which
dnl has it in libc), or if libdl is needed to get it.
AC_CHECK_FUNC([dlopen], [DEFINES="$DEFINES -DHAVE_DLOPEN"],
@@ -2551,7 +2607,7 @@ if test -n "$with_gallium_drivers"; then
if test "x$HAVE_SWR_AVX" != xyes -a \
"x$HAVE_SWR_AVX2" != xyes -a \
"x$HAVE_SWR_KNL" != xyes -a \
"x$HAVE_SWR_SKX" != xyes -a; then
"x$HAVE_SWR_SKX" != xyes; then
AC_MSG_ERROR([swr enabled but no swr architectures selected])
fi
@@ -2735,6 +2791,8 @@ AM_CONDITIONAL(HAVE_X86_ASM, test "x$asm_arch" = xx86 -o "x$asm_arch" = xx86_64)
AM_CONDITIONAL(HAVE_X86_64_ASM, test "x$asm_arch" = xx86_64)
AM_CONDITIONAL(HAVE_SPARC_ASM, test "x$asm_arch" = xsparc)
AM_CONDITIONAL(HAVE_PPC64LE_ASM, test "x$asm_arch" = xppc64le)
AM_CONDITIONAL(HAVE_AARCH64_ASM, test "x$asm_arch" = xaarch64)
AM_CONDITIONAL(HAVE_ARM_ASM, test "x$asm_arch" = xarm)
AC_SUBST([NINE_MAJOR], 1)
AC_SUBST([NINE_MINOR], 0)

View File

@@ -130,27 +130,6 @@ mesa/demos repository.</p>
runtime</p>
<dl>
<dt><code>EGL_DRIVERS_PATH</code></dt>
<dd>
<p>By default, the main library will look for drivers in the directory where
the drivers are installed to. This variable specifies a list of
colon-separated directories where the main library will look for drivers, in
addition to the default directory. This variable is ignored for setuid/setgid
binaries.</p>
<p>This variable is usually set to test an uninstalled build. For example, one
may set</p>
<pre>
$ export LD_LIBRARY_PATH=$mesa/lib
$ export EGL_DRIVERS_PATH=$mesa/lib/egl
</pre>
<p>to test a build without installation</p>
</dd>
<dt><code>EGL_DRIVER</code></dt>
<dd>

View File

@@ -14,7 +14,7 @@
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.2.0 Release Notes / TBD</h1>
<h1>Mesa 17.2.0 Release Notes / September 4, 2017</h1>
<p>
Mesa 17.2.0 is a new development release.
@@ -33,7 +33,8 @@ because compatibility contexts are not supported.
<h2>SHA256 checksums</h2>
<pre>
TBD.
9484ad96b4bb6cda5bbf1aef52dfa35183dc21aa6258a2991c245996c2fdaf85 mesa-17.2.0.tar.gz
3123448f770eae58bc73e15480e78909defb892f10ab777e9116c9b218094943 mesa-17.2.0.tar.xz
</pre>
@@ -56,9 +57,156 @@ Note: some of the new features are only available with certain drivers.
<h2>Bug fixes</h2>
<ul>
TBD
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68365">Bug 68365</a> - [SNB Bisected]Piglit spec_ARB_framebuffer_object_fbo-blit-stretch fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77240">Bug 77240</a> - khrplatform.h not installed if EGL is disabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95530">Bug 95530</a> - Stellaris - colored overlay of sectors doesn't render on i965</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96449">Bug 96449</a> - Dying Light reports OpenGL version 3.0 with mesa-git</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96958">Bug 96958</a> - [SKL] Improper rendering in Europa Universalis IV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97524">Bug 97524</a> - Samplers referring to the same texture unit with different types should raise GL_INVALID_OPERATION</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97957">Bug 97957</a> - Awful screen tearing in a separate X server with DRI3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98238">Bug 98238</a> - Witcher 2: objects are black when changing lod on Radeon Pitcairn</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98428">Bug 98428</a> - Undefined non-weak-symbol in dri-drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98833">Bug 98833</a> - [REGRESSION, bisected] Wayland revert commit breaks non-Vsync fullscreen frame updates</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99467">Bug 99467</a> - [radv] DOOM 2016 + wine. Green screen everywhere (but can be started)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100070">Bug 100070</a> - Rocket League: grass gets rendered incorrectly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100242">Bug 100242</a> - radeon buffer allocation failure during startup of Factorio</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100620">Bug 100620</a> - [SKL] 48-bit addresses break DOOM</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100690">Bug 100690</a> - [Regression, bisected] TotalWar: Warhammer corrupted graphics</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100741">Bug 100741</a> - Chromium - Memory leak</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100785">Bug 100785</a> - [regression, bisected] arb_gpu_shader5 piglit fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100854">Bug 100854</a> - YUV to RGB Color Space Conversion result is not precise</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100871">Bug 100871</a> - gles cts hangs mesa indefinitely</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100877">Bug 100877</a> - vulkan/tests/block_pool_no_free regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100892">Bug 100892</a> - Polaris 12: winsys init bad switch (missing break) initializing addrlib</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100925">Bug 100925</a> - [HSW/BSW/BDW/SKL] Google Earth is not resolving all the details in the map correctly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100937">Bug 100937</a> - Mesa fails to build with GCC 4.8</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100945">Bug 100945</a> - Build failure in GNOME Continuous</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100988">Bug 100988</a> - glXGetCurrentDisplay() no longer works for FakeGLX contexts?</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101071">Bug 101071</a> - compiling glsl fails with undefined reference to `pthread_create'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101088">Bug 101088</a> - `gallium: remove pipe_index_buffer and set_index_buffer` causes glitches and crash in gallium nine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101110">Bug 101110</a> - Build failure in GNOME Continuous</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101189">Bug 101189</a> - Latest git fails to compile with radeon</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101252">Bug 101252</a> - eglGetDisplay() is not thread safe</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101254">Bug 101254</a> - VDPAU videos don't start playing with r600 gallium driver</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101283">Bug 101283</a> - skylake: page fault accessing address 0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101284">Bug 101284</a> - [G45] ES2-CTS.functional.texture.specification.basic_copytexsubimage2d.cube_rgba</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101294">Bug 101294</a> - radeonsi minecraft forge splash freeze since 17.1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101306">Bug 101306</a> - [BXT] gles asserts in cts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101326">Bug 101326</a> - gallium/wgl: Allow context creation without prior SetPixelFormat()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101334">Bug 101334</a> - AMD SI cards: Some vulkan apps freeze the system</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101336">Bug 101336</a> - glcpp-test.sh regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101340">Bug 101340</a> - i915_surface.c:108:4: error: too few arguments to function util_blitter_default_src_texture</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101360">Bug 101360</a> - Assertion failure comparing result of ballotARB</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101401">Bug 101401</a> - [REGRESSION][BISECTED] GDM fails to start after 8ec4975cd83365c791a1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101418">Bug 101418</a> - Build failure in GNOME Continuous</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101451">Bug 101451</a> - [G33] ES2-CTS.functional.clipping.polygon regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101464">Bug 101464</a> - PrimitiveRestartNV inside a render list causes a crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101471">Bug 101471</a> - Mesa fails to build: unknown typename bool</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101535">Bug 101535</a> - [bisected] [Skylake] Kwin won't start and glxgears coredumps</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101538">Bug 101538</a> - From &quot;Use isl for hiz layouts&quot; commit onwards, everything crashes with Mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101539">Bug 101539</a> - [Regresion] [IVB] Segment fault in recent commit in intel_miptree_level_has_hiz under Ivy bridge</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101558">Bug 101558</a> - [regression][bisected] MPV playing video via opengl &quot;randomly&quot; results in only part of the window / screen being rendered with Mesa GIT.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101596">Bug 101596</a> - Blender renders black UI elements</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101607">Bug 101607</a> - Regression in anisotropic filtering from &quot;i965: Convert fs sampler state to use genxml&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101657">Bug 101657</a> - strtod.c:32:10: fatal error: xlocale.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101666">Bug 101666</a> - bitfieldExtract is marked as a built-in function on OpenGL ES 3.0, but was added in OpenGL ES 3.1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101683">Bug 101683</a> - Some games hang while loading when compositing is shut off or absent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101703">Bug 101703</a> - No stencil buffer allocated when requested by GLUT</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101704">Bug 101704</a> - [regression][bisected] glReadPixels() from pbuffer failing in Android CTS camera tests</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101766">Bug 101766</a> - Assertion `!&quot;invalid type&quot;' failed when constant expression involves literal of different type</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101774">Bug 101774</a> - gen_clflush.h:37:7: error: implicit declaration of function __builtin_ia32_clflush</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101775">Bug 101775</a> - Xorg segfault since 147d7fb &quot;st/mesa: add a winsys buffers list in st_context&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101829">Bug 101829</a> - read-after-free in st_framebuffer_validate</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101831">Bug 101831</a> - Build failure in GNOME Continuous</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101851">Bug 101851</a> - [regression] libEGL_common.a undefined reference to '__gxx_personality_v0'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101867">Bug 101867</a> - Launch options window renders black in Feral Games in current Mesa trunk</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101876">Bug 101876</a> - SIGSEGV when launching Steam</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101910">Bug 101910</a> - [BYT] ES31-CTS.functional.copy_image.non_compressed.viewclass_96_bits.rgb32f_rgb32f</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101925">Bug 101925</a> - playstore/webview crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101961">Bug 101961</a> - Serious Sam Fusion hangs system completely</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101982">Bug 101982</a> - Weston crashes when running an OpenGL program on i965</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101983">Bug 101983</a> - [G33] ES2-CTS.functional.shaders.struct.uniform.sampler_nested* regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102024">Bug 102024</a> - FORMAT_FEATURE_SAMPLED_IMAGE_BIT not supported for D16_UNORM and D32_SFLOAT</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102148">Bug 102148</a> - Crash when running qopenglwidget example on mesa llvmpipe win32</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102241">Bug 102241</a> - gallium/wgl: SwapBuffers freezing regularly with swap interval enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102308">Bug 102308</a> - segfault in glCompressedTextureSubImage3D</li>
</ul>
<h2>Changes</h2>
<ul>

200
docs/relnotes/17.2.1.html Normal file
View File

@@ -0,0 +1,200 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.2.1 Release Notes / September 17, 2017</h1>
<p>
Mesa 17.2.1 is a bug fix release which fixes bugs found since the 17.2.0 release.
</p>
<p>
Mesa 17.2.1 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
c902d8dc2540195bc570d88af1a8fd8a1774373660a27bb1d539551f46824bc1 mesa-17.2.1.tar.gz
77385d17827cff24a3bae134342234f2efe7f7f990e778109682571dbbc9ba1e mesa-17.2.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100613">Bug 100613</a> - Regression in Mesa 17 on s390x (zSystems)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101709">Bug 101709</a> - [llvmpipe] piglit gl-1.0-scissor-offscreen regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102454">Bug 102454</a> - glibc 2.26 doesn't provide anymore xlocale.h</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102467">Bug 102467</a> - src/mesa/state_tracker/st_cb_readpixels.c:178]: (warning) Redundant assignment</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102502">Bug 102502</a> - [bisected] Kodi crashes since commit 707d2e8b - gallium: fold u_trim_pipe_prim call from st/mesa to drivers</li>
</ul>
<h2>Changes</h2>
<p>Bas Nieuwenhuizen (4):</p>
<ul>
<li>radv: Actually set the cmd_buffer usage_flags.</li>
<li>radv: Fix vkCopyImage with both depth and stencil aspects.</li>
<li>radv: Disable multilayer &amp; multilevel DCC.</li>
<li>radv: Don't allocate CMASK for linear images.</li>
</ul>
<p>Ben Crocker (1):</p>
<ul>
<li>llvmpipe: lp_build_gather_elem_vec BE fix for 3x16 load</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>llvmpipe: initialize llvmpipe-&gt;dirty with LP_NEW_SCISSOR</li>
</ul>
<p>Charmaine Lee (1):</p>
<ul>
<li>vbo: fix offset in minmax cache key</li>
</ul>
<p>Dave Airlie (12):</p>
<ul>
<li>radv: disable 1d/2d linear optimisation on gfx9.</li>
<li>radv/gfx9: set descriptor up for base_mip to level range.</li>
<li>Revert "radv: disable support for VEGA for now."</li>
<li>radv/winsys: use amdgpu_bo_va_op_raw.</li>
<li>radv/gfx9: allocate events from uncached VA space</li>
<li>radv: use simpler indirect packet 3 if possible.</li>
<li>radv: don't use iview for meta image width/height.</li>
<li>radv: handle GFX9 1D textures</li>
<li>radv/gfx9: set mip0-depth correctly for 2d arrays/3d images</li>
<li>radv/ac: bump params array for image atomic comp swap</li>
<li>radv/gfx9: fix image resource handling.</li>
<li>radv/winsys: fix flags vs va_flags thinko.</li>
</ul>
<p>Emil Velikov (7):</p>
<ul>
<li>docs: add sha256 checksums for 17.2.0</li>
<li>cherry-ignore: add getCapability patches</li>
<li>cherry-ignore: ignore gfx9 tile swizzle fix</li>
<li>cherry-ignore: add execution_type() fix to the list</li>
<li>cherry-ignore: add EGL+gbm swast patches</li>
<li>egl/x11/dri3: adding missing __DRI_BACKGROUND_CALLABLE extension</li>
<li>Update version to 17.2.1</li>
</ul>
<p>Eric Engestrom (3):</p>
<ul>
<li>util: improve compiler guard</li>
<li>mesa/st: remove unwanted backup file</li>
<li>docs/egl: remove reference to EGL_DRIVERS_PATH</li>
</ul>
<p>Grazvydas Ignotas (1):</p>
<ul>
<li>radv: don't assert on empty hash table</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>anv/formats: Nicely handle unknown VkFormat enums</li>
<li>spirv: Add support for the HelperInvocation builtin</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nvc0: write 0 to pipeline_statistics.cs_invocations</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Fix crash in fallback GTT mapping.</li>
<li>i965: Set "Subslice Hashing Mode" to 16x16 on Apollolake.</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>st/mesa: skip draw calls with pipe_draw_info::count == 0</li>
</ul>
<p>Michael Olbrich (1):</p>
<ul>
<li>egl/dri2: only destroy created objects</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>radeonsi: apply a mask to gl_SampleMaskIn in the PS prolog</li>
</ul>
<p>Nicolai Hähnle (4):</p>
<ul>
<li>radeonsi/gfx9: always flush DB metadata on framebuffer changes</li>
<li>st/glsl_to_tgsi: only the first (inner-most) array reference can be a 2D index</li>
<li>ac/surface: match Z and stencil tile config</li>
<li>glsl: fix glsl_struct_field size calculations for shader cache</li>
</ul>
<p>Ray Strode (1):</p>
<ul>
<li>gallivm: correct channel shift logic on big endian</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>freedreno: skip batch-cache for compute shaders</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>st/mesa: fix view template initialization in try_pbo_readpixels</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radeonsi: update dirty_level_mask before dispatching</li>
</ul>
<p>Timothy Arceri (9):</p>
<ul>
<li>glsl: allow NULL to be passed to encode_type_to_blob()</li>
<li>glsl: stop adding pointers from gl_shader_variable to the cache</li>
<li>glsl: stop adding pointers from glsl_struct_field to the cache</li>
<li>glsl: add has_uniform_storage() helper to shader cache</li>
<li>glsl: don't write uniform storage offset if there isn't one</li>
<li>glsl: always write a name/label string to the cache</li>
<li>compiler: move pointers to the start of shader_info</li>
<li>glsl: stop adding pointers from shader_info to the cache</li>
<li>glsl: stop adding pointers from bindless structs to the cache</li>
</ul>
</div>
</body>
</html>

203
docs/relnotes/17.2.2.html Normal file
View File

@@ -0,0 +1,203 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.2.2 Release Notes / October 2, 2017</h1>
<p>
Mesa 17.2.2 is a bug fix release which fixes bugs found since the 17.2.1 release.
</p>
<p>
Mesa 17.2.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
8242256f3243ed3f35184ed7bf0a9070439ccdf477a3bd9cfd2437c0b2f9bc7f mesa-17.2.2.tar.gz
cf522244d6a5a1ecde3fc00e7c96935253fe22f808f064cab98be6f3faa65782 mesa-17.2.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102573">Bug 102573</a> - fails to build on armel</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102844">Bug 102844</a> - memory leak with glDeleteProgram for shader program type GL_COMPUTE_SHADER</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102847">Bug 102847</a> - swr fail to build with llvm-5.0.0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102904">Bug 102904</a> - piglit and gl45 cts linker tests regressed</li>
</ul>
<h2>Changes</h2>
<p>Alexandru-Liviu Prodea (1):</p>
<ul>
<li>Scons: Add LLVM 5.0 support</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>radv: Check for GFX9 for 1D arrays in image_size intrinsic.</li>
</ul>
<p>Boris Brezillon (1):</p>
<ul>
<li>broadcom/vc4: Fix infinite retry in vc4_bo_alloc()</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>radv/nir: call opt_remove_phis after trivial continues.</li>
<li>ac/surface: handle S8 on gfx9</li>
<li>st/glsl-&gt;tgsi: fix u64 to bool comparisons.</li>
</ul>
<p>David Airlie (1):</p>
<ul>
<li>radv: add gfx9 scissor workaround</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: add sha256 checksums for 17.2.1</li>
<li>automake: enable libunwind in `make distcheck'</li>
</ul>
<p>Eric Anholt (4):</p>
<ul>
<li>broadcom/vc4: Fix use-after-free for flushing when writing to a texture.</li>
<li>broadcom/vc4: Fix use-after-free trying to mix a quad and tile clear.</li>
<li>broadcom/vc4: Fix use-after-free when deleting a program.</li>
<li>broadcom/vc4: Keep pipe_sampler_view-&gt;texture matching the original texture.</li>
</ul>
<p>Gert Wollny (2):</p>
<ul>
<li>travis: force llvm-3.3 for "make Gallium ST Other"</li>
<li>travis: Add libunwind-dev to gallium/make builds</li>
</ul>
<p>Grazvydas Ignotas (1):</p>
<ul>
<li>configure: check if -latomic is needed for __atomic_*</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>nv20: Fix GL_CLAMP</li>
</ul>
<p>Jason Ekstrand (6):</p>
<ul>
<li>i965/blorp: Set r8stencil_needs_update when writing stencil</li>
<li>vulkan/wsi/wayland: Stop printing out the DRM device</li>
<li>vulkan/wsi/wayland: Refactor wsi_wl_display code</li>
<li>vulkan/wsi/wayland: Stop caching Wayland displays</li>
<li>vulkan/wsi/wayland: Copy wl_proxy objects from oldSwapchain if available</li>
<li>vulkan/wsi/wayland: Return better error messages</li>
</ul>
<p>Juan A. Suarez Romero (4):</p>
<ul>
<li>cherry-ignore: add "radeonsi/gfx9: proper workaround for LS/HS VGPR initialization bug"</li>
<li>cherry-ignore: add "radv: Check for GFX9 for 1D arrays in image_size intrinsic."</li>
<li>cherry-ignore: add "radv: copy the number of viewports/scissors at pipeline bind time"</li>
<li>Update version to 17.2.2</li>
</ul>
<p>Józef Kucia (1):</p>
<ul>
<li>anv: Fix descriptors copying</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965/vec4: Actually handle atomic op intrinsics.</li>
<li>i965/vec4: Fix swizzles on atomic sources.</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>st/va/postproc: use video original size for postprocessing</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>etnaviv: fix 16bpp clears</li>
</ul>
<p>Matt Turner (2):</p>
<ul>
<li>util: Link libmesautil into u_atomic_test</li>
<li>util/u_atomic: Add implementation of __sync_val_compare_and_swap_8</li>
</ul>
<p>Nicolai Hähnle (9):</p>
<ul>
<li>radeonsi: workaround for gather4 on integer cube maps</li>
<li>amd/common: round cube array slice in ac_prepare_cube_coords</li>
<li>amd/common: add workaround for cube map array layer clamping</li>
<li>glsl/linker: fix output variable overlap check</li>
<li>radeonsi: fix array textures layer coordinate</li>
<li>radeonsi: set MIP_POINT_PRECLAMP to 0</li>
<li>amd/addrlib: fix missing va_end() after va_copy()</li>
<li>amd/common: move ac_build_phi from radeonsi</li>
<li>radeonsi: fix a regression in integer cube map handling</li>
</ul>
<p>Samuel Iglesias Gonsálvez (1):</p>
<ul>
<li>anv: fix viewport transformation for z component</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: fix saved compute state when doing statistics/occlusion queries</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>mesa: free current ComputeProgram state in _mesa_free_context_data</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>swr/rast: remove llvm fence/atomics from generated files</li>
</ul>
<p>Tomasz Figa (1):</p>
<ul>
<li>egl/dri2: Implement swapInterval fallback in a conformant way</li>
</ul>
</div>
</body>
</html>

181
docs/relnotes/17.2.3.html Normal file
View File

@@ -0,0 +1,181 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.2.3 Release Notes / October 19, 2017</h1>
<p>
Mesa 17.2.3 is a bug fix release which fixes bugs found since the 17.2.2 release.
</p>
<p>
Mesa 17.2.3 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
fb305eecfeec1fd771fdc96fff973c51871f7bd35fd2bd56cacc27b4b8823220 mesa-17.2.3.tar.gz
a0b0ec8f7b24dd044d7ab30a8c7e6d3767521e245f88d4ed5dd93315dc56f837 mesa-17.2.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101832">Bug 101832</a> - [PATCH][regression][bisect] Xorg fails to start after f50aa21456d82c8cb6fbaa565835f1acc1720a5d</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102852">Bug 102852</a> - Scons: Support the new Scons 3.0.0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102940">Bug 102940</a> - Regression: Vulkan KMS rendering crashes since 17.2</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (1):</p>
<ul>
<li>radv: Add R16G16B16A16_SNORM fast clear support</li>
</ul>
<p>Bas Nieuwenhuizen (2):</p>
<ul>
<li>nir/spirv: Allow loop breaks in a switch body.</li>
<li>radv: Only set the MTYPE flags on GFX9+.</li>
</ul>
<p>Ben Crocker (4):</p>
<ul>
<li>gallivm: fix typo in debug_printf message</li>
<li>gallivm: allow additional llc options</li>
<li>gallivm/ppc64le: adjust VSX code generation control.</li>
<li>gallivm/ppc64le: allow environmental control of Altivec code generation</li>
</ul>
<p>Daniel Stone (2):</p>
<ul>
<li>egl/wayland: Check queryImage return for wl_buffer</li>
<li>egl/wayland: Don't use dmabuf with no modifiers</li>
</ul>
<p>Dave Airlie (2):</p>
<ul>
<li>radv: emit fmuladd instead of fma to llvm.</li>
<li>radv: lower ffma in nir.</li>
</ul>
<p>Emil Velikov (6):</p>
<ul>
<li>cherry-ignore: add "anv: Remove unreachable cases from isl_format_for_size"</li>
<li>cherry-ignore: add "anv/wsi: Allocate enough memory for the entire image"</li>
<li>swr/rast: do not crash on NULL strings returned by getenv</li>
<li>wayland-drm: use a copy of the wayland_drm_callbacks struct</li>
<li>eglmesaext: add forward declaration for struct wl_buffers</li>
<li>Update version to 17.2.3</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>scons: use python3-compatible print()</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>nv50/ir: fix 64-bit integer shifts</li>
<li>nv50,nvc0: fix push hint logic in presence of a start offset</li>
</ul>
<p>Jason Ekstrand (6):</p>
<ul>
<li>intel/compiler: Don't cmod propagate into a saturated operation</li>
<li>intel/compiler: Don't propagate cmod into integer multiplies</li>
<li>glsl/blob: Return false from ensure_can_read on overrun</li>
<li>glsl/blob: Return false from grow_to_fit if we've ever failed</li>
<li>nir/opcodes: Fix constant-folding of ufind_msb</li>
<li>nir: Get rid of the variable on vote intrinsics</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.2.2</li>
</ul>
<p>Józef Kucia (3):</p>
<ul>
<li>anv: Fix vkCmdFillBuffer()</li>
<li>spirv: Fix SpvOpAtomicISub</li>
<li>anv: Do not assert() on VK_ATTACHMENT_UNUSED</li>
</ul>
<p>Leo Liu (3):</p>
<ul>
<li>st/va: use pipe transfer_map to map upload buffer</li>
<li>st/vdpau: don't re-allocate interlaced buffer with packed YUV format</li>
<li>st/va: don't re-allocate interlaced buffer with pakced format</li>
</ul>
<p>Lionel Landwerlin (4):</p>
<ul>
<li>intel: compiler: vec4: add missing default 0 lod</li>
<li>anv/cmd_buffer: fix push descriptors with set &gt; 0</li>
<li>anv/cmd_buffer: Reset state in cmd_buffer_destroy</li>
<li>anv: bo_cache: allow importing a BO larger than needed</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>mesa: fix texture updates for ATI_fragment_shader</li>
<li>st/mesa: don't use pipe_surface for passing information about EGLImage</li>
<li>glsl_to_tgsi: fix instruction order for bindless textures</li>
</ul>
<p>Nicolai Hähnle (14):</p>
<ul>
<li>st/glsl_to_tgsi: fix conditional assignments to packed shader outputs</li>
<li>amd/common: fix build_cube_select</li>
<li>radeonsi/gfx9: fix geometry shaders without output vertices</li>
<li>util/queue: fix a race condition in the fence code</li>
<li>glsl/lower_instruction: handle denorms and overflow in ldexp correctly</li>
<li>radeonsi: move current_rast_prim to r600_common_context</li>
<li>radeonsi: don't discard points and lines</li>
<li>radeonsi: deduce rast_prim correctly for tessellation point mode</li>
<li>radeonsi: fix maximum advertised point size / line width</li>
<li>st/mesa: don't clobber glGetInternalformat* buffer for GL_NUM_SAMPLE_COUNTS</li>
<li>st/glsl_to_tgsi: fix indirect access to 64-bit integer</li>
<li>st/glsl_to_tgsi: fix a use-after-free in merge_two_dsts</li>
<li>radeonsi: clamp depth comparison value only for fixed point formats</li>
<li>radeonsi: clamp border colors for upgraded depth textures</li>
</ul>
<p>Rob Clark (2):</p>
<ul>
<li>freedreno/a5xx: align height to GMEM</li>
<li>freedreno/a5xx: fix missing restore state</li>
</ul>
</div>
</body>
</html>

132
docs/relnotes/17.2.4.html Normal file
View File

@@ -0,0 +1,132 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.2.4 Release Notes / October 30, 2017</h1>
<p>
Mesa 17.2.4 is a bug fix release which fixes bugs found since the 17.2.3 release.
</p>
<p>
Mesa 17.2.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
cb266edc5cf7226219ebaf556ca2e03dff282e0324d20afd80423a5754d1272c mesa-17.2.4.tar.gz
5ba408fecd6e1132e5490eec1a2f04466214e4c65c8b89b331be844768c2e550 mesa-17.2.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102774">Bug 102774</a> - [BDW] [Bisected] Absolute constant buffers break VAAPI in mpv</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103388">Bug 103388</a> - Linking libcltgsi.la (llvm/codegen/libclllvm_la-common.lo) fails with &quot;error: no match for 'operator-'&quot; with GCC-7, Mesa from Git and current LLVM revisions</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (8):</p>
<ul>
<li>cherry-ignore: configure.ac: rework llvm detection and handling</li>
<li>cherry-ignore: glsl: fix derived cs variables</li>
<li>cherry-ignore: added 17.3 nominations.</li>
<li>cherry-ignore: radv: Don't use vgpr indexing for outputs on GFX9.</li>
<li>cherry-ignore: radv: Disallow indirect outputs for GS on GFX9 as well.</li>
<li>cherry-ignore: mesa/bufferobj: don't double negate the range</li>
<li>cherry-ignore: broadcom/vc5: Propagate vc4 aliasing fix to vc5.</li>
<li>Update version to 17.2.4</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>ac/nir: Fix nir_texop_lod on GFX for 1D arrays.</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>radv/image: bump all the offset to uint64_t.</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.2.3</li>
</ul>
<p>Henri Verbeet (1):</p>
<ul>
<li>vulkan/wsi: Free the event in x11_manage_fifo_queues().</li>
</ul>
<p>Jan Vesely (1):</p>
<ul>
<li>clover: Fix compilation after clang r315871</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>nir/intrinsics: Set the correct num_indices for load_output</li>
<li>intel/fs: Handle flag read/write aliasing in needs_src_copy</li>
<li>anv/pipeline: Call nir_lower_system_valaues after brw_preprocess_nir</li>
<li>intel/eu: Use EXECUTE_1 for JMPI</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Revert absolute mode for constant buffer pointers.</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>Revert "mesa: fix texture updates for ATI_fragment_shader"</li>
</ul>
<p>Matthew Nicholls (1):</p>
<ul>
<li>ac/nir: generate correct instruction for atomic min/max on unsigned images</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>st/mesa: Initialize textures array in st_framebuffer_validate</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: add the draw count buffer to the list of buffers</li>
</ul>
<p>Stefan Schake (1):</p>
<ul>
<li>broadcom/vc4: Fix aliasing issue</li>
</ul>
</div>
</body>
</html>

155
docs/relnotes/17.2.5.html Normal file
View File

@@ -0,0 +1,155 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.2.5 Release Notes / November 10, 2017</h1>
<p>
Mesa 17.2.5 is a bug fix release which fixes bugs found since the 17.2.4 release.
</p>
<p>
Mesa 17.2.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97532">Bug 97532</a> - Regression: GLB 2.7 &amp; Glmark-2 GLES versions segfault due to linker precision error (259fc505) on dead variable</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102680">Bug 102680</a> - [OpenGL CTS] KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102809">Bug 102809</a> - Rust shadows(?) flash random colours</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103142">Bug 103142</a> - R600g+sb: optimizer apparently stuck in an endless loop</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (8):</p>
<ul>
<li>docs: add sha256 checksums for 17.2.4</li>
<li>cherry-ignore: radv: copy indirect lowering settings from radeonsi</li>
<li>cherry-ignore: i965: fix blorp stage_prog_data-&gt;param leak</li>
<li>cherry-ignore: etnaviv: don't do resolve-in-place without valid TS</li>
<li>cherry-ignore: intel/fs: Alloc pull constants off mem_ctx</li>
<li>cherry-ignore: added 17.3 nominations.</li>
<li>cherry-ignore: automake: include git_sha1.h.in in release tarball</li>
<li>Update version to 17.2.5</li>
</ul>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>radv: Don't expose heaps with 0 memory.</li>
<li>radv: Don't use vgpr indexing for outputs on GFX9.</li>
<li>radv: Disallow indirect outputs for GS on GFX9 as well.</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>i915g: make gears run again.</li>
<li>radv: free attachments on end command buffer.</li>
<li>radv: add initial copy descriptor support. (v2)</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>vc4: fix release build</li>
</ul>
<p>Gert Wollny (1):</p>
<ul>
<li>r600/sb: bail out if prepare_alu_group() doesn't find a proper scheduling</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>spirv: Claim support for the simple memory model</li>
<li>i965/blorp: Use blorp_to_isl_format for src_isl_format in blit_miptrees</li>
<li>i965/blorp: Use more temporary isl_format variables</li>
<li>i965/miptree: Take an isl_format in render_aux_usage</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>mesa: Accept GL_BACK in get_fb0_attachment with ARB_ES3_1_compatibility.</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>radeon/video: add gfx9 offsets when rejoin the video surface</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>st/dri: don't expose modifiers in EGL if the driver doesn't implement them</li>
<li>ac/surface/gfx9: don't allow DCC for the smallest mipmap levels</li>
</ul>
<p>Nanley Chery (1):</p>
<ul>
<li>i965: Check CCS_E compatibility for texture view rendering</li>
</ul>
<p>Neil Roberts (1):</p>
<ul>
<li>nir/opt_intrinsics: Fix values for gl_SubGroupG{e,t}MaskARB</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>amd/common/gfx9: workaround DCC corruption more conservatively</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>i965: unref push_const_bo in intelDestroyContext</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>radv: copy indirect lowering settings from radeonsi</li>
</ul>
<p>Tomasz Figa (1):</p>
<ul>
<li>glsl: Allow precision mismatch on dead data with GLSL ES 1.00</li>
</ul>
<p>Topi Pohjolainen (1):</p>
<ul>
<li>intel/compiler/gen9: Pixel shader header only workaround</li>
</ul>
</div>
</body>
</html>

View File

@@ -31,14 +31,14 @@ extern "C" {
** This header is generated from the Khronos OpenGL / OpenGL ES XML
** API Registry. The current version of the Registry, generator scripts
** used to make the header, and the header can be found at
** http://www.opengl.org/registry/egl
** http://www.khronos.org/registry/egl
**
** Khronos $Revision$ on $Date$
** Khronos $Git commit SHA1: a732b061e7 $ on $Git commit date: 2017-06-17 23:27:53 +0100 $
*/
#include <EGL/eglplatform.h>
/* Generated on date 20161230 */
/* Generated on date 20170627 */
/* Generated C header for:
* API: egl

View File

@@ -31,14 +31,14 @@ extern "C" {
** This header is generated from the Khronos OpenGL / OpenGL ES XML
** API Registry. The current version of the Registry, generator scripts
** used to make the header, and the header can be found at
** http://www.opengl.org/registry/egl
** http://www.khronos.org/registry/egl
**
** Khronos $Revision$ on $Date$
** Khronos $Git commit SHA1: a732b061e7 $ on $Git commit date: 2017-06-17 23:27:53 +0100 $
*/
#include <EGL/eglplatform.h>
#define EGL_EGLEXT_VERSION 20161230
#define EGL_EGLEXT_VERSION 20170627
/* Generated C header for:
* API: egl
@@ -133,6 +133,15 @@ EGLAPI EGLint EGLAPIENTRY eglLabelObjectKHR (EGLDisplay display, EGLenum objectT
#endif
#endif /* EGL_KHR_debug */
#ifndef EGL_KHR_display_reference
#define EGL_KHR_display_reference 1
#define EGL_TRACK_REFERENCES_KHR 0x3352
typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYDISPLAYATTRIBKHRPROC) (EGLDisplay dpy, EGLint name, EGLAttrib *value);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglQueryDisplayAttribKHR (EGLDisplay dpy, EGLint name, EGLAttrib *value);
#endif
#endif /* EGL_KHR_display_reference */
#ifndef EGL_KHR_fence_sync
#define EGL_KHR_fence_sync 1
typedef khronos_utime_nanoseconds_t EGLTimeKHR;
@@ -555,6 +564,11 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQuerySurfacePointerANGLE (EGLDisplay dpy, EGLSu
#define EGL_DISCARD_SAMPLES_ARM 0x3286
#endif /* EGL_ARM_pixmap_multisample_discard */
#ifndef EGL_EXT_bind_to_front
#define EGL_EXT_bind_to_front 1
#define EGL_FRONT_BUFFER_EXT 0x3464
#endif /* EGL_EXT_bind_to_front */
#ifndef EGL_EXT_buffer_age
#define EGL_EXT_buffer_age 1
#define EGL_BUFFER_AGE_EXT 0x313D
@@ -564,6 +578,30 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQuerySurfacePointerANGLE (EGLDisplay dpy, EGLSu
#define EGL_EXT_client_extensions 1
#endif /* EGL_EXT_client_extensions */
#ifndef EGL_EXT_compositor
#define EGL_EXT_compositor 1
#define EGL_PRIMARY_COMPOSITOR_CONTEXT_EXT 0x3460
#define EGL_EXTERNAL_REF_ID_EXT 0x3461
#define EGL_COMPOSITOR_DROP_NEWEST_FRAME_EXT 0x3462
#define EGL_COMPOSITOR_KEEP_NEWEST_FRAME_EXT 0x3463
typedef EGLBoolean (EGLAPIENTRYP PFNEGLCOMPOSITORSETCONTEXTLISTEXTPROC) (const EGLint *external_ref_ids, EGLint num_entries);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLCOMPOSITORSETCONTEXTATTRIBUTESEXTPROC) (EGLint external_ref_id, const EGLint *context_attributes, EGLint num_entries);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLCOMPOSITORSETWINDOWLISTEXTPROC) (EGLint external_ref_id, const EGLint *external_win_ids, EGLint num_entries);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLCOMPOSITORSETWINDOWATTRIBUTESEXTPROC) (EGLint external_win_id, const EGLint *window_attributes, EGLint num_entries);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLCOMPOSITORBINDTEXWINDOWEXTPROC) (EGLint external_win_id);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLCOMPOSITORSETSIZEEXTPROC) (EGLint external_win_id, EGLint width, EGLint height);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLCOMPOSITORSWAPPOLICYEXTPROC) (EGLint external_win_id, EGLint policy);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglCompositorSetContextListEXT (const EGLint *external_ref_ids, EGLint num_entries);
EGLAPI EGLBoolean EGLAPIENTRY eglCompositorSetContextAttributesEXT (EGLint external_ref_id, const EGLint *context_attributes, EGLint num_entries);
EGLAPI EGLBoolean EGLAPIENTRY eglCompositorSetWindowListEXT (EGLint external_ref_id, const EGLint *external_win_ids, EGLint num_entries);
EGLAPI EGLBoolean EGLAPIENTRY eglCompositorSetWindowAttributesEXT (EGLint external_win_id, const EGLint *window_attributes, EGLint num_entries);
EGLAPI EGLBoolean EGLAPIENTRY eglCompositorBindTexWindowEXT (EGLint external_win_id);
EGLAPI EGLBoolean EGLAPIENTRY eglCompositorSetSizeEXT (EGLint external_win_id, EGLint width, EGLint height);
EGLAPI EGLBoolean EGLAPIENTRY eglCompositorSwapPolicyEXT (EGLint external_win_id, EGLint policy);
#endif
#endif /* EGL_EXT_compositor */
#ifndef EGL_EXT_create_context_robustness
#define EGL_EXT_create_context_robustness 1
#define EGL_CONTEXT_OPENGL_ROBUST_ACCESS_EXT 0x30BF
@@ -618,6 +656,21 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQueryDisplayAttribEXT (EGLDisplay dpy, EGLint a
#define EGL_GL_COLORSPACE_BT2020_PQ_EXT 0x3340
#endif /* EGL_EXT_gl_colorspace_bt2020_pq */
#ifndef EGL_EXT_gl_colorspace_display_p3
#define EGL_EXT_gl_colorspace_display_p3 1
#define EGL_GL_COLORSPACE_DISPLAY_P3_EXT 0x3363
#endif /* EGL_EXT_gl_colorspace_display_p3 */
#ifndef EGL_EXT_gl_colorspace_display_p3_linear
#define EGL_EXT_gl_colorspace_display_p3_linear 1
#define EGL_GL_COLORSPACE_DISPLAY_P3_LINEAR_EXT 0x3362
#endif /* EGL_EXT_gl_colorspace_display_p3_linear */
#ifndef EGL_EXT_gl_colorspace_scrgb
#define EGL_EXT_gl_colorspace_scrgb 1
#define EGL_GL_COLORSPACE_SCRGB_EXT 0x3351
#endif /* EGL_EXT_gl_colorspace_scrgb */
#ifndef EGL_EXT_gl_colorspace_scrgb_linear
#define EGL_EXT_gl_colorspace_scrgb_linear 1
#define EGL_GL_COLORSPACE_SCRGB_LINEAR_EXT 0x3350
@@ -670,6 +723,13 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQueryDmaBufModifiersEXT (EGLDisplay dpy, EGLint
#endif
#endif /* EGL_EXT_image_dma_buf_import_modifiers */
#ifndef EGL_EXT_image_implicit_sync_control
#define EGL_EXT_image_implicit_sync_control 1
#define EGL_IMPORT_SYNC_TYPE_EXT 0x3470
#define EGL_IMPORT_IMPLICIT_SYNC_EXT 0x3471
#define EGL_IMPORT_EXPLICIT_SYNC_EXT 0x3472
#endif /* EGL_EXT_image_implicit_sync_control */
#ifndef EGL_EXT_multiview_window
#define EGL_EXT_multiview_window 1
#define EGL_MULTIVIEW_VIEW_COUNT_EXT 0x3134
@@ -769,6 +829,12 @@ EGLAPI EGLBoolean EGLAPIENTRY eglStreamConsumerOutputEXT (EGLDisplay dpy, EGLStr
#endif
#endif /* EGL_EXT_stream_consumer_egloutput */
#ifndef EGL_EXT_surface_CTA861_3_metadata
#define EGL_EXT_surface_CTA861_3_metadata 1
#define EGL_CTA861_3_MAX_CONTENT_LIGHT_LEVEL_EXT 0x3360
#define EGL_CTA861_3_MAX_FRAME_AVERAGE_LEVEL_EXT 0x3361
#endif /* EGL_EXT_surface_CTA861_3_metadata */
#ifndef EGL_EXT_surface_SMPTE2086_metadata
#define EGL_EXT_surface_SMPTE2086_metadata 1
#define EGL_SMPTE2086_DISPLAY_PRIMARY_RX_EXT 0x3341
@@ -781,6 +847,7 @@ EGLAPI EGLBoolean EGLAPIENTRY eglStreamConsumerOutputEXT (EGLDisplay dpy, EGLStr
#define EGL_SMPTE2086_WHITE_POINT_Y_EXT 0x3348
#define EGL_SMPTE2086_MAX_LUMINANCE_EXT 0x3349
#define EGL_SMPTE2086_MIN_LUMINANCE_EXT 0x334A
#define EGL_METADATA_SCALING_EXT 50000
#endif /* EGL_EXT_surface_SMPTE2086_metadata */
#ifndef EGL_EXT_swap_buffers_with_damage

View File

@@ -70,6 +70,7 @@ typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYWAYLANDBUFFERWL) (EGLDisplay dpy, st
#ifndef EGL_WL_create_wayland_buffer_from_image
#define EGL_WL_create_wayland_buffer_from_image 1
struct wl_buffer;
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI struct wl_buffer * EGLAPIENTRY eglCreateWaylandBufferFromImageWL(EGLDisplay dpy, EGLImageKHR image);
#endif

View File

@@ -50,9 +50,22 @@ extern "C" {
#ifndef GL_VERSION_ES_CM_1_0
#define GL_VERSION_ES_CM_1_0 1
/*
* XXX: Temporary fix; needs to be reverted as part of the next
* header update.
* For more details:
* https://github.com/KhronosGroup/OpenGL-Registry/pull/76
* https://lists.freedesktop.org/archives/mesa-dev/2017-June/161647.html
*/
#include <KHR/khrplatform.h>
typedef khronos_int8_t GLbyte;
typedef khronos_float_t GLclampf;
typedef short GLshort;
typedef unsigned short GLushort;
typedef void GLvoid;
typedef unsigned int GLenum;
#include <KHR/khrplatform.h>
typedef khronos_float_t GLfloat;
typedef khronos_int32_t GLfixed;
typedef unsigned int GLuint;

View File

@@ -104,7 +104,6 @@ GL_API void GL_APIENTRY glBlendEquationOES (GLenum mode);
#ifndef GL_OES_byte_coordinates
#define GL_OES_byte_coordinates 1
typedef khronos_int8_t GLbyte;
#endif /* GL_OES_byte_coordinates */
#ifndef GL_OES_compressed_ETC1_RGB8_sub_texture
@@ -128,7 +127,6 @@ typedef khronos_int8_t GLbyte;
#ifndef GL_OES_draw_texture
#define GL_OES_draw_texture 1
typedef short GLshort;
#define GL_TEXTURE_CROP_RECT_OES 0x8B9D
typedef void (GL_APIENTRYP PFNGLDRAWTEXSOESPROC) (GLshort x, GLshort y, GLshort z, GLshort width, GLshort height);
typedef void (GL_APIENTRYP PFNGLDRAWTEXIOESPROC) (GLint x, GLint y, GLint z, GLint width, GLint height);
@@ -409,7 +407,6 @@ GL_API GLbitfield GL_APIENTRY glQueryMatrixxOES (GLfixed *mantissa, GLint *expon
#ifndef GL_OES_single_precision
#define GL_OES_single_precision 1
typedef khronos_float_t GLclampf;
typedef void (GL_APIENTRYP PFNGLCLEARDEPTHFOESPROC) (GLclampf depth);
typedef void (GL_APIENTRYP PFNGLCLIPPLANEFOESPROC) (GLenum plane, const GLfloat *equation);
typedef void (GL_APIENTRYP PFNGLDEPTHRANGEFOESPROC) (GLclampf n, GLclampf f);

View File

@@ -26,7 +26,7 @@
/* Khronos platform-specific types and definitions.
*
* $Revision: 23298 $ on $Date: 2013-09-30 17:07:13 -0700 (Mon, 30 Sep 2013) $
* $Revision: 32517 $ on $Date: 2016-03-11 02:41:19 -0800 (Fri, 11 Mar 2016) $
*
* Adopters may modify this file to suit their platform. Adopters are
* encouraged to submit platform specific modifications to the Khronos
@@ -98,11 +98,7 @@
* This precedes the return type of the function in the function prototype.
*/
#if defined(_WIN32) && !defined(__SCITECH_SNAP__)
# if defined(KHRONOS_DLL_EXPORTS)
# define KHRONOS_APICALL __declspec(dllexport)
# else
# define KHRONOS_APICALL __declspec(dllimport)
# endif
# define KHRONOS_APICALL __declspec(dllimport)
#elif defined (__SYMBIAN32__)
# define KHRONOS_APICALL IMPORT_C
#elif (defined(__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__) >= 303) \
@@ -231,7 +227,7 @@ typedef signed short int khronos_int16_t;
typedef unsigned short int khronos_uint16_t;
/*
* Types that differ between LLP64 and LP64 architectures - in LLP64,
* Types that differ between LLP64 and LP64 architectures - in LLP64,
* pointers are 64 bits, but 'long' is still 32 bits. Win64 appears
* to be the only LLP64 architecture in current use.
*/

View File

@@ -157,6 +157,19 @@ def check_header(env, header):
env = conf.Finish()
return have_header
def check_functions(env, functions):
'''Check if all of the functions exist'''
conf = SCons.Script.Configure(env)
have_functions = True
for function in functions:
if not conf.CheckFunc(function):
have_functions = False
env = conf.Finish()
return have_functions
def check_prog(env, prog):
"""Check whether this program exists."""
@@ -244,16 +257,16 @@ def generate(env):
# Backwards compatability with the debug= profile= options
if env['build'] == 'debug':
if not env['debug']:
print 'scons: warning: debug option is deprecated and will be removed eventually; use instead'
print
print ' scons build=release'
print
print('scons: warning: debug option is deprecated and will be removed eventually; use instead')
print('')
print(' scons build=release')
print('')
env['build'] = 'release'
if env['profile']:
print 'scons: warning: profile option is deprecated and will be removed eventually; use instead'
print
print ' scons build=profile'
print
print('scons: warning: profile option is deprecated and will be removed eventually; use instead')
print('')
print(' scons build=profile')
print('')
env['build'] = 'profile'
if False:
# Enforce SConscripts to use the new build variable
@@ -287,7 +300,7 @@ def generate(env):
env['build_dir'] = build_dir
env.SConsignFile(os.path.join(build_dir, '.sconsign'))
if 'SCONS_CACHE_DIR' in os.environ:
print 'scons: Using build cache in %s.' % (os.environ['SCONS_CACHE_DIR'],)
print('scons: Using build cache in %s.' % (os.environ['SCONS_CACHE_DIR'],))
env.CacheDir(os.environ['SCONS_CACHE_DIR'])
env['CONFIGUREDIR'] = os.path.join(build_dir, 'conf')
env['CONFIGURELOG'] = os.path.join(os.path.abspath(build_dir), 'config.log')
@@ -339,6 +352,9 @@ def generate(env):
if check_header(env, 'xlocale.h'):
cppdefines += ['HAVE_XLOCALE_H']
if check_functions(env, ['strtod_l', 'strtof_l']):
cppdefines += ['HAVE_STRTOD_L']
if platform == 'windows':
cppdefines += [
'WIN32',
@@ -369,8 +385,8 @@ def generate(env):
if env['embedded']:
cppdefines += ['PIPE_SUBSYSTEM_EMBEDDED']
if env['texture_float']:
print 'warning: Floating-point textures enabled.'
print 'warning: Please consult docs/patents.txt with your lawyer before building Mesa.'
print('warning: Floating-point textures enabled.')
print('warning: Please consult docs/patents.txt with your lawyer before building Mesa.')
cppdefines += ['TEXTURE_FLOAT_ENABLED']
env.Append(CPPDEFINES = cppdefines)

View File

@@ -68,13 +68,13 @@ def generate(env):
if env['platform'] == 'windows':
# XXX: There is no llvm-config on Windows, so assume a standard layout
if llvm_dir is None:
print 'scons: LLVM environment variable must be specified when building for windows'
print('scons: LLVM environment variable must be specified when building for windows')
return
# Try to determine the LLVM version from llvm/Config/config.h
llvm_config = os.path.join(llvm_dir, 'include/llvm/Config/llvm-config.h')
if not os.path.exists(llvm_config):
print 'scons: could not find %s' % llvm_config
print('scons: could not find %s' % llvm_config)
return
llvm_version_major_re = re.compile(r'^#define LLVM_VERSION_MAJOR ([0-9]+)')
llvm_version_minor_re = re.compile(r'^#define LLVM_VERSION_MINOR ([0-9]+)')
@@ -92,10 +92,10 @@ def generate(env):
llvm_version = distutils.version.LooseVersion('%s.%s' % (llvm_version_major, llvm_version_minor))
if llvm_version is None:
print 'scons: could not determine the LLVM version from %s' % llvm_config
print('scons: could not determine the LLVM version from %s' % llvm_config)
return
if llvm_version < distutils.version.LooseVersion(required_llvm_version):
print 'scons: LLVM version %s found, but %s is required' % (llvm_version, required_llvm_version)
print('scons: LLVM version %s found, but %s is required' % (llvm_version, required_llvm_version))
return
env.Prepend(CPPPATH = [os.path.join(llvm_dir, 'include')])
@@ -104,7 +104,26 @@ def generate(env):
])
env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')])
# LIBS should match the output of `llvm-config --libs engine mcjit bitwriter x86asmprinter irreader`
if llvm_version >= distutils.version.LooseVersion('4.0'):
if llvm_version >= distutils.version.LooseVersion('5.0'):
env.Prepend(LIBS = [
'LLVMX86Disassembler', 'LLVMX86AsmParser',
'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
'LLVMDebugInfoCodeView', 'LLVMCodeGen',
'LLVMScalarOpts', 'LLVMInstCombine',
'LLVMTransformUtils',
'LLVMBitWriter', 'LLVMX86Desc',
'LLVMMCDisassembler', 'LLVMX86Info',
'LLVMX86AsmPrinter', 'LLVMX86Utils',
'LLVMMCJIT', 'LLVMExecutionEngine', 'LLVMTarget',
'LLVMAnalysis', 'LLVMProfileData',
'LLVMRuntimeDyld', 'LLVMObject', 'LLVMMCParser',
'LLVMBitReader', 'LLVMMC', 'LLVMCore',
'LLVMSupport',
'LLVMIRReader', 'LLVMAsmParser',
'LLVMDemangle', 'LLVMGlobalISel', 'LLVMDebugInfoMSF',
'LLVMBinaryFormat',
])
elif llvm_version >= distutils.version.LooseVersion('4.0'):
env.Prepend(LIBS = [
'LLVMX86Disassembler', 'LLVMX86AsmParser',
'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
@@ -212,14 +231,14 @@ def generate(env):
else:
llvm_config = os.environ.get('LLVM_CONFIG', 'llvm-config')
if not env.Detect(llvm_config):
print 'scons: %s script not found' % llvm_config
print('scons: %s script not found' % llvm_config)
return
llvm_version = env.backtick('%s --version' % llvm_config).rstrip()
llvm_version = distutils.version.LooseVersion(llvm_version)
if llvm_version < distutils.version.LooseVersion(required_llvm_version):
print 'scons: LLVM version %s found, but %s is required' % (llvm_version, required_llvm_version)
print('scons: LLVM version %s found, but %s is required' % (llvm_version, required_llvm_version))
return
try:
@@ -245,13 +264,13 @@ def generate(env):
env.ParseConfig('%s --system-libs' % llvm_config)
env.Append(CXXFLAGS = ['-std=c++11'])
except OSError:
print 'scons: llvm-config version %s failed' % llvm_version
print('scons: llvm-config version %s failed' % llvm_version)
return
assert llvm_version is not None
env['llvm'] = True
print 'scons: Found LLVM version %s' % llvm_version
print('scons: Found LLVM version %s' % llvm_version)
env['LLVM_VERSION'] = llvm_version
# Define HAVE_LLVM macro with the major/minor version number (e.g., 0x0206 for 2.6)

View File

@@ -28,7 +28,7 @@ def write_git_sha1_h_file(filename):
try:
subprocess.Popen(args, stdout=f).wait()
except:
print "Warning: exception in write_git_sha1_h_file()"
print("Warning: exception in write_git_sha1_h_file()")
return
if not os.path.exists(filename) or not filecmp.cmp(tempfile, filename):

View File

@@ -54,9 +54,7 @@ LOCAL_C_INCLUDES := \
$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_nir,,)/nir \
$(MESA_TOP)/src/gallium/include \
$(MESA_TOP)/src/gallium/auxiliary \
$(intermediates)/common \
external/llvm/include \
external/llvm/device/include
$(intermediates)/common
LOCAL_EXPORT_C_INCLUDE_DIRS := \
$(LOCAL_PATH)/common

View File

@@ -216,20 +216,16 @@ VOID Object::DebugPrint(
#if DEBUG
if (m_client.callbacks.debugPrint != NULL)
{
va_list ap;
va_start(ap, pDebugString);
ADDR_DEBUGPRINT_INPUT debugPrintInput = {0};
debugPrintInput.size = sizeof(ADDR_DEBUGPRINT_INPUT);
debugPrintInput.pDebugString = const_cast<CHAR*>(pDebugString);
debugPrintInput.hClient = m_client.handle;
va_copy(debugPrintInput.ap, ap);
va_start(debugPrintInput.ap, pDebugString);
m_client.callbacks.debugPrint(&debugPrintInput);
va_end(ap);
va_end(debugPrintInput.ap);
}
#endif
}

View File

@@ -109,7 +109,7 @@ static void parse_relocs(Elf *elf, Elf_Data *relocs, Elf_Data *symbols,
}
}
void ac_elf_read(const char *elf_data, unsigned elf_size,
bool ac_elf_read(const char *elf_data, unsigned elf_size,
struct ac_shader_binary *binary)
{
char *elf_buffer;
@@ -118,6 +118,7 @@ void ac_elf_read(const char *elf_data, unsigned elf_size,
Elf_Data *symbols = NULL, *relocs = NULL;
size_t section_str_index;
unsigned symbol_sh_link = 0;
bool success = true;
/* One of the libelf implementations
* (http://www.mr511.de/software/english.htm) requires calling
@@ -137,7 +138,8 @@ void ac_elf_read(const char *elf_data, unsigned elf_size,
GElf_Shdr section_header;
if (gelf_getshdr(section, &section_header) != &section_header) {
fprintf(stderr, "Failed to read ELF section header\n");
return;
success = false;
break;
}
name = elf_strptr(elf, section_str_index, section_header.sh_name);
if (!strcmp(name, ".text")) {
@@ -148,6 +150,11 @@ void ac_elf_read(const char *elf_data, unsigned elf_size,
} else if (!strcmp(name, ".AMDGPU.config")) {
section_data = elf_getdata(section, section_data);
binary->config_size = section_data->d_size;
if (!binary->config_size) {
fprintf(stderr, ".AMDGPU.config is empty!\n");
success = false;
break;
}
binary->config = MALLOC(binary->config_size * sizeof(unsigned char));
memcpy(binary->config, section_data->d_buf, binary->config_size);
} else if (!strcmp(name, ".AMDGPU.disasm")) {
@@ -186,6 +193,7 @@ void ac_elf_read(const char *elf_data, unsigned elf_size,
binary->global_symbol_count = 1;
binary->config_size_per_symbol = binary->config_size;
}
return success;
}
const unsigned char *ac_shader_binary_config_start(

View File

@@ -83,7 +83,7 @@ struct ac_shader_config {
* Parse the elf binary stored in \p elf_data and create a
* ac_shader_binary object.
*/
void ac_elf_read(const char *elf_data, unsigned elf_size,
bool ac_elf_read(const char *elf_data, unsigned elf_size,
struct ac_shader_binary *binary);
/**

View File

@@ -45,10 +45,13 @@
* The caller is responsible for initializing ctx::module and ctx::builder.
*/
void
ac_llvm_context_init(struct ac_llvm_context *ctx, LLVMContextRef context)
ac_llvm_context_init(struct ac_llvm_context *ctx, LLVMContextRef context,
enum chip_class chip_class)
{
LLVMValueRef args[1];
ctx->chip_class = chip_class;
ctx->context = context;
ctx->module = NULL;
ctx->builder = NULL;
@@ -176,6 +179,20 @@ void ac_build_type_name_for_intr(LLVMTypeRef type, char *buf, unsigned bufsize)
}
}
/**
* Helper function that builds an LLVM IR PHI node and immediately adds
* incoming edges.
*/
LLVMValueRef
ac_build_phi(struct ac_llvm_context *ctx, LLVMTypeRef type,
unsigned count_incoming, LLVMValueRef *values,
LLVMBasicBlockRef *blocks)
{
LLVMValueRef phi = LLVMBuildPhi(ctx->builder, type, "");
LLVMAddIncoming(phi, values, blocks, count_incoming);
return phi;
}
LLVMValueRef
ac_build_gather_values_extended(struct ac_llvm_context *ctx,
LLVMValueRef *values,
@@ -290,15 +307,15 @@ static void build_cube_select(LLVMBuilderRef builder,
is_ma_x = LLVMBuildAnd(builder, is_not_ma_z, LLVMBuildNot(builder, is_ma_y, ""), "");
/* Select sc */
tmp = LLVMBuildSelect(builder, is_ma_z, coords[2], coords[0], "");
tmp = LLVMBuildSelect(builder, is_ma_x, coords[2], coords[0], "");
sgn = LLVMBuildSelect(builder, is_ma_y, LLVMConstReal(f32, 1.0),
LLVMBuildSelect(builder, is_ma_x, sgn_ma,
LLVMBuildSelect(builder, is_ma_z, sgn_ma,
LLVMBuildFNeg(builder, sgn_ma, ""), ""), "");
out_st[0] = LLVMBuildFMul(builder, tmp, sgn, "");
/* Select tc */
tmp = LLVMBuildSelect(builder, is_ma_y, coords[2], coords[1], "");
sgn = LLVMBuildSelect(builder, is_ma_y, LLVMBuildFNeg(builder, sgn_ma, ""),
sgn = LLVMBuildSelect(builder, is_ma_y, sgn_ma,
LLVMConstReal(f32, -1.0), "");
out_st[1] = LLVMBuildFMul(builder, tmp, sgn, "");
@@ -312,7 +329,7 @@ static void build_cube_select(LLVMBuilderRef builder,
void
ac_prepare_cube_coords(struct ac_llvm_context *ctx,
bool is_deriv, bool is_array,
bool is_deriv, bool is_array, bool is_lod,
LLVMValueRef *coords_arg,
LLVMValueRef *derivs_arg)
{
@@ -322,6 +339,38 @@ ac_prepare_cube_coords(struct ac_llvm_context *ctx,
LLVMValueRef coords[3];
LLVMValueRef invma;
if (is_array && !is_lod) {
LLVMValueRef tmp = coords_arg[3];
tmp = ac_build_intrinsic(ctx, "llvm.rint.f32", ctx->f32, &tmp, 1, 0);
/* Section 8.9 (Texture Functions) of the GLSL 4.50 spec says:
*
* "For Array forms, the array layer used will be
*
* max(0, min(d1, floor(layer+0.5)))
*
* where d is the depth of the texture array and layer
* comes from the component indicated in the tables below.
* Workaroudn for an issue where the layer is taken from a
* helper invocation which happens to fall on a different
* layer due to extrapolation."
*
* VI and earlier attempt to implement this in hardware by
* clamping the value of coords[2] = (8 * layer) + face.
* Unfortunately, this means that the we end up with the wrong
* face when clamping occurs.
*
* Clamp the layer earlier to work around the issue.
*/
if (ctx->chip_class <= VI) {
LLVMValueRef ge0;
ge0 = LLVMBuildFCmp(builder, LLVMRealOGE, tmp, ctx->f32_0, "");
tmp = LLVMBuildSelect(builder, ge0, tmp, ctx->f32_0, "");
}
coords_arg[3] = tmp;
}
build_cube_intrinsic(ctx, coords_arg, &selcoords);
invma = ac_build_intrinsic(ctx, "llvm.fabs.f32",
@@ -795,21 +844,21 @@ ac_build_ddxy(struct ac_llvm_context *ctx,
bool has_ds_bpermute,
uint32_t mask,
int idx,
LLVMValueRef lds,
LLVMValueRef val)
{
LLVMValueRef thread_id, tl, trbl, tl_tid, trbl_tid, args[2];
LLVMValueRef tl, trbl, args[2];
LLVMValueRef result;
thread_id = ac_get_thread_id(ctx);
tl_tid = LLVMBuildAnd(ctx->builder, thread_id,
LLVMConstInt(ctx->i32, mask, false), "");
trbl_tid = LLVMBuildAdd(ctx->builder, tl_tid,
LLVMConstInt(ctx->i32, idx, false), "");
if (has_ds_bpermute) {
LLVMValueRef thread_id, tl_tid, trbl_tid;
thread_id = ac_get_thread_id(ctx);
tl_tid = LLVMBuildAnd(ctx->builder, thread_id,
LLVMConstInt(ctx->i32, mask, false), "");
trbl_tid = LLVMBuildAdd(ctx->builder, tl_tid,
LLVMConstInt(ctx->i32, idx, false), "");
args[0] = LLVMBuildMul(ctx->builder, tl_tid,
LLVMConstInt(ctx->i32, 4, false), "");
args[1] = val;
@@ -827,15 +876,42 @@ ac_build_ddxy(struct ac_llvm_context *ctx,
AC_FUNC_ATTR_READNONE |
AC_FUNC_ATTR_CONVERGENT);
} else {
LLVMValueRef store_ptr, load_ptr0, load_ptr1;
uint32_t masks[2];
store_ptr = ac_build_gep0(ctx, lds, thread_id);
load_ptr0 = ac_build_gep0(ctx, lds, tl_tid);
load_ptr1 = ac_build_gep0(ctx, lds, trbl_tid);
switch (mask) {
case AC_TID_MASK_TOP_LEFT:
masks[0] = 0x8000;
if (idx == 1)
masks[1] = 0x8055;
else
masks[1] = 0x80aa;
LLVMBuildStore(ctx->builder, val, store_ptr);
tl = LLVMBuildLoad(ctx->builder, load_ptr0, "");
trbl = LLVMBuildLoad(ctx->builder, load_ptr1, "");
break;
case AC_TID_MASK_TOP:
masks[0] = 0x8044;
masks[1] = 0x80ee;
break;
case AC_TID_MASK_LEFT:
masks[0] = 0x80a0;
masks[1] = 0x80f5;
break;
}
args[0] = val;
args[1] = LLVMConstInt(ctx->i32, masks[0], false);
tl = ac_build_intrinsic(ctx,
"llvm.amdgcn.ds.swizzle", ctx->i32,
args, 2,
AC_FUNC_ATTR_READNONE |
AC_FUNC_ATTR_CONVERGENT);
args[1] = LLVMConstInt(ctx->i32, masks[1], false);
trbl = ac_build_intrinsic(ctx,
"llvm.amdgcn.ds.swizzle", ctx->i32,
args, 2,
AC_FUNC_ATTR_READNONE |
AC_FUNC_ATTR_CONVERGENT);
}
tl = LLVMBuildBitCast(ctx->builder, tl, ctx->f32, "");

View File

@@ -28,6 +28,8 @@
#include <stdbool.h>
#include <llvm-c/TargetMachine.h>
#include "amd_family.h"
#ifdef __cplusplus
extern "C" {
#endif
@@ -61,10 +63,13 @@ struct ac_llvm_context {
unsigned fpmath_md_kind;
LLVMValueRef fpmath_md_2p5_ulp;
LLVMValueRef empty_md;
enum chip_class chip_class;
};
void
ac_llvm_context_init(struct ac_llvm_context *ctx, LLVMContextRef context);
ac_llvm_context_init(struct ac_llvm_context *ctx, LLVMContextRef context,
enum chip_class chip_class);
LLVMValueRef
ac_build_intrinsic(struct ac_llvm_context *ctx, const char *name,
@@ -73,6 +78,11 @@ ac_build_intrinsic(struct ac_llvm_context *ctx, const char *name,
void ac_build_type_name_for_intr(LLVMTypeRef type, char *buf, unsigned bufsize);
LLVMValueRef
ac_build_phi(struct ac_llvm_context *ctx, LLVMTypeRef type,
unsigned count_incoming, LLVMValueRef *values,
LLVMBasicBlockRef *blocks);
LLVMValueRef
ac_build_gather_values_extended(struct ac_llvm_context *ctx,
LLVMValueRef *values,
@@ -91,7 +101,7 @@ ac_build_fdiv(struct ac_llvm_context *ctx,
void
ac_prepare_cube_coords(struct ac_llvm_context *ctx,
bool is_deriv, bool is_array,
bool is_deriv, bool is_array, bool is_lod,
LLVMValueRef *coords_arg,
LLVMValueRef *derivs_arg);
@@ -173,7 +183,6 @@ ac_build_ddxy(struct ac_llvm_context *ctx,
bool has_ds_bpermute,
uint32_t mask,
int idx,
LLVMValueRef lds,
LLVMValueRef val);
#define AC_SENDMSG_GS 2

View File

@@ -1178,7 +1178,17 @@ static LLVMValueRef emit_find_lsb(struct ac_llvm_context *ctx,
*/
LLVMConstInt(ctx->i1, 1, false),
};
return ac_build_intrinsic(ctx, "llvm.cttz.i32", ctx->i32, params, 2, AC_FUNC_ATTR_READNONE);
LLVMValueRef lsb = ac_build_intrinsic(ctx, "llvm.cttz.i32", ctx->i32,
params, 2,
AC_FUNC_ATTR_READNONE);
/* TODO: We need an intrinsic to skip this conditional. */
/* Check for zero: */
return LLVMBuildSelect(ctx->builder, LLVMBuildICmp(ctx->builder,
LLVMIntEQ, src0,
ctx->i32_0, ""),
LLVMConstInt(ctx->i32, -1, 0), lsb, "");
}
static LLVMValueRef emit_ifind_msb(struct ac_llvm_context *ctx,
@@ -1305,7 +1315,6 @@ static LLVMValueRef emit_f2f16(struct nir_to_llvm_context *ctx,
src0 = to_float(&ctx->ac, src0);
result = LLVMBuildFPTrunc(ctx->builder, src0, ctx->f16, "");
/* TODO SI/CIK options here */
if (ctx->options->chip_class >= VI) {
LLVMValueRef args[2];
/* Check if the result is a denormal - and flush to 0 if so. */
@@ -1319,7 +1328,22 @@ static LLVMValueRef emit_f2f16(struct nir_to_llvm_context *ctx,
if (ctx->options->chip_class >= VI)
result = LLVMBuildSelect(ctx->builder, cond, ctx->f32zero, result, "");
else {
/* for SI/CIK */
/* 0x38800000 is smallest half float value (2^-14) in 32-bit float,
* so compare the result and flush to 0 if it's smaller.
*/
LLVMValueRef temp, cond2;
temp = emit_intrin_1f_param(&ctx->ac, "llvm.fabs",
ctx->f32, result);
cond = LLVMBuildFCmp(ctx->builder, LLVMRealUGT,
LLVMBuildBitCast(ctx->builder, LLVMConstInt(ctx->i32, 0x38800000, false), ctx->f32, ""),
temp, "");
cond2 = LLVMBuildFCmp(ctx->builder, LLVMRealUNE,
temp, ctx->f32zero, "");
cond = LLVMBuildAnd(ctx->builder, cond, cond2, "");
result = LLVMBuildSelect(ctx->builder, cond, ctx->f32zero, result, "");
}
return result;
}
@@ -1443,11 +1467,6 @@ static LLVMValueRef emit_ddxy(struct nir_to_llvm_context *ctx,
int idx;
LLVMValueRef result;
if (!ctx->lds && !ctx->has_ds_bpermute)
ctx->lds = LLVMAddGlobalInAddressSpace(ctx->module,
LLVMArrayType(ctx->i32, 64),
"ddxy_lds", LOCAL_ADDR_SPACE);
if (op == nir_op_fddx_fine || op == nir_op_fddx)
mask = AC_TID_MASK_LEFT;
else if (op == nir_op_fddy_fine || op == nir_op_fddy)
@@ -1464,7 +1483,7 @@ static LLVMValueRef emit_ddxy(struct nir_to_llvm_context *ctx,
idx = 2;
result = ac_build_ddxy(&ctx->ac, ctx->has_ds_bpermute,
mask, idx, ctx->lds,
mask, idx,
src0);
return result;
}
@@ -1742,7 +1761,7 @@ static void visit_alu(struct nir_to_llvm_context *ctx, const nir_alu_instr *inst
result);
break;
case nir_op_ffma:
result = emit_intrin_3f_param(&ctx->ac, "llvm.fma",
result = emit_intrin_3f_param(&ctx->ac, "llvm.fmuladd",
to_float_type(&ctx->ac, def_type), src[0], src[1], src[2]);
break;
case nir_op_ibitfield_extract:
@@ -1779,10 +1798,12 @@ static void visit_alu(struct nir_to_llvm_context *ctx, const nir_alu_instr *inst
break;
case nir_op_i2f32:
case nir_op_i2f64:
src[0] = to_integer(&ctx->ac, src[0]);
result = LLVMBuildSIToFP(ctx->builder, src[0], to_float_type(&ctx->ac, def_type), "");
break;
case nir_op_u2f32:
case nir_op_u2f64:
src[0] = to_integer(&ctx->ac, src[0]);
result = LLVMBuildUIToFP(ctx->builder, src[0], to_float_type(&ctx->ac, def_type), "");
break;
case nir_op_f2f64:
@@ -1793,6 +1814,7 @@ static void visit_alu(struct nir_to_llvm_context *ctx, const nir_alu_instr *inst
break;
case nir_op_u2u32:
case nir_op_u2u64:
src[0] = to_integer(&ctx->ac, src[0]);
if (get_elem_bits(&ctx->ac, LLVMTypeOf(src[0])) < get_elem_bits(&ctx->ac, def_type))
result = LLVMBuildZExt(ctx->builder, src[0], def_type, "");
else
@@ -1800,6 +1822,7 @@ static void visit_alu(struct nir_to_llvm_context *ctx, const nir_alu_instr *inst
break;
case nir_op_i2i32:
case nir_op_i2i64:
src[0] = to_integer(&ctx->ac, src[0]);
if (get_elem_bits(&ctx->ac, LLVMTypeOf(src[0])) < get_elem_bits(&ctx->ac, def_type))
result = LLVMBuildSExt(ctx->builder, src[0], def_type, "");
else
@@ -1809,18 +1832,25 @@ static void visit_alu(struct nir_to_llvm_context *ctx, const nir_alu_instr *inst
result = emit_bcsel(&ctx->ac, src[0], src[1], src[2]);
break;
case nir_op_find_lsb:
src[0] = to_integer(&ctx->ac, src[0]);
result = emit_find_lsb(&ctx->ac, src[0]);
break;
case nir_op_ufind_msb:
src[0] = to_integer(&ctx->ac, src[0]);
result = emit_ufind_msb(&ctx->ac, src[0]);
break;
case nir_op_ifind_msb:
src[0] = to_integer(&ctx->ac, src[0]);
result = emit_ifind_msb(&ctx->ac, src[0]);
break;
case nir_op_uadd_carry:
src[0] = to_integer(&ctx->ac, src[0]);
src[1] = to_integer(&ctx->ac, src[1]);
result = emit_uint_carry(&ctx->ac, "llvm.uadd.with.overflow.i32", src[0], src[1]);
break;
case nir_op_usub_borrow:
src[0] = to_integer(&ctx->ac, src[0]);
src[1] = to_integer(&ctx->ac, src[1]);
result = emit_uint_carry(&ctx->ac, "llvm.usub.with.overflow.i32", src[0], src[1]);
break;
case nir_op_b2f:
@@ -1833,15 +1863,20 @@ static void visit_alu(struct nir_to_llvm_context *ctx, const nir_alu_instr *inst
result = emit_b2i(&ctx->ac, src[0]);
break;
case nir_op_i2b:
src[0] = to_integer(&ctx->ac, src[0]);
result = emit_i2b(&ctx->ac, src[0]);
break;
case nir_op_fquantize2f16:
result = emit_f2f16(ctx, src[0]);
break;
case nir_op_umul_high:
src[0] = to_integer(&ctx->ac, src[0]);
src[1] = to_integer(&ctx->ac, src[1]);
result = emit_umul_high(&ctx->ac, src[0], src[1]);
break;
case nir_op_imul_high:
src[0] = to_integer(&ctx->ac, src[0]);
src[1] = to_integer(&ctx->ac, src[1]);
result = emit_imul_high(&ctx->ac, src[0], src[1]);
break;
case nir_op_pack_half_2x16:
@@ -2158,7 +2193,7 @@ static LLVMValueRef build_tex_intrinsic(struct nir_to_llvm_context *ctx,
break;
}
if (instr->op == nir_texop_tg4) {
if (instr->op == nir_texop_tg4 && ctx->options->chip_class <= VI) {
enum glsl_base_type stype = glsl_get_sampler_result_type(instr->texture->var->type);
if (stype == GLSL_TYPE_UINT || stype == GLSL_TYPE_INT) {
return radv_lower_gather4_integer(ctx, args, instr);
@@ -3274,13 +3309,13 @@ static LLVMValueRef get_image_coords(struct nir_to_llvm_context *ctx,
int count;
enum glsl_sampler_dim dim = glsl_get_sampler_dim(type);
bool is_array = glsl_sampler_type_is_array(type);
bool add_frag_pos = (dim == GLSL_SAMPLER_DIM_SUBPASS ||
dim == GLSL_SAMPLER_DIM_SUBPASS_MS);
bool is_ms = (dim == GLSL_SAMPLER_DIM_MS ||
dim == GLSL_SAMPLER_DIM_SUBPASS_MS);
count = image_type_to_components_count(dim,
glsl_sampler_type_is_array(type));
bool gfx9_1d = ctx->options->chip_class >= GFX9 && dim == GLSL_SAMPLER_DIM_1D;
count = image_type_to_components_count(dim, is_array);
if (is_ms) {
LLVMValueRef fmask_load_address[3];
@@ -3288,7 +3323,7 @@ static LLVMValueRef get_image_coords(struct nir_to_llvm_context *ctx,
fmask_load_address[0] = LLVMBuildExtractElement(ctx->builder, src0, masks[0], "");
fmask_load_address[1] = LLVMBuildExtractElement(ctx->builder, src0, masks[1], "");
if (glsl_sampler_type_is_array(type))
if (is_array)
fmask_load_address[2] = LLVMBuildExtractElement(ctx->builder, src0, masks[2], "");
else
fmask_load_address[2] = NULL;
@@ -3303,7 +3338,7 @@ static LLVMValueRef get_image_coords(struct nir_to_llvm_context *ctx,
sample_index,
get_sampler_desc(ctx, instr->variables[0], DESC_FMASK));
}
if (count == 1) {
if (count == 1 && !gfx9_1d) {
if (instr->src[0].ssa->num_components)
res = LLVMBuildExtractElement(ctx->builder, src0, masks[0], "");
else
@@ -3313,13 +3348,22 @@ static LLVMValueRef get_image_coords(struct nir_to_llvm_context *ctx,
if (is_ms)
count--;
for (chan = 0; chan < count; ++chan) {
coords[chan] = LLVMBuildExtractElement(ctx->builder, src0, masks[chan], "");
coords[chan] = llvm_extract_elem(ctx, src0, chan);
}
if (add_frag_pos) {
for (chan = 0; chan < count; ++chan)
coords[chan] = LLVMBuildAdd(ctx->builder, coords[chan], LLVMBuildFPToUI(ctx->builder, ctx->frag_pos[chan], ctx->i32, ""), "");
}
if (gfx9_1d) {
if (is_array) {
coords[2] = coords[1];
coords[1] = ctx->ac.i32_0;
} else
coords[1] = ctx->ac.i32_0;
count++;
}
if (is_ms) {
coords[count] = sample_index;
count++;
@@ -3400,7 +3444,10 @@ static void visit_image_store(struct nir_to_llvm_context *ctx,
char intrinsic_name[64];
const nir_variable *var = instr->variables[0]->var;
const struct glsl_type *type = glsl_without_array(var->type);
LLVMValueRef glc = ctx->i1false;
bool force_glc = ctx->options->chip_class == SI;
if (force_glc)
glc = ctx->i1true;
if (ctx->stage == MESA_SHADER_FRAGMENT)
ctx->shader_info->fs.writes_memory = true;
@@ -3410,7 +3457,7 @@ static void visit_image_store(struct nir_to_llvm_context *ctx,
params[2] = LLVMBuildExtractElement(ctx->builder, get_src(ctx, instr->src[0]),
LLVMConstInt(ctx->i32, 0, false), ""); /* vindex */
params[3] = LLVMConstInt(ctx->i32, 0, false); /* voffset */
params[4] = ctx->i1false; /* glc */
params[4] = glc; /* glc */
params[5] = ctx->i1false; /* slc */
ac_build_intrinsic(&ctx->ac, "llvm.amdgcn.buffer.store.format.v4f32", ctx->voidt,
params, 6, 0);
@@ -3418,7 +3465,6 @@ static void visit_image_store(struct nir_to_llvm_context *ctx,
bool is_da = glsl_sampler_type_is_array(type) ||
glsl_get_sampler_dim(type) == GLSL_SAMPLER_DIM_CUBE;
LLVMValueRef da = is_da ? ctx->i1true : ctx->i1false;
LLVMValueRef glc = ctx->i1false;
LLVMValueRef slc = ctx->i1false;
params[0] = to_float(&ctx->ac, get_src(ctx, instr->src[2]));
@@ -3453,7 +3499,7 @@ static void visit_image_store(struct nir_to_llvm_context *ctx,
static LLVMValueRef visit_image_atomic(struct nir_to_llvm_context *ctx,
const nir_intrinsic_instr *instr)
{
LLVMValueRef params[6];
LLVMValueRef params[7];
int param_count = 0;
const nir_variable *var = instr->variables[0]->var;
@@ -3465,15 +3511,17 @@ static LLVMValueRef visit_image_atomic(struct nir_to_llvm_context *ctx,
if (ctx->stage == MESA_SHADER_FRAGMENT)
ctx->shader_info->fs.writes_memory = true;
bool is_unsigned = glsl_get_sampler_result_type(type) == GLSL_TYPE_UINT;
switch (instr->intrinsic) {
case nir_intrinsic_image_atomic_add:
atomic_name = "add";
break;
case nir_intrinsic_image_atomic_min:
atomic_name = "smin";
atomic_name = is_unsigned ? "umin" : "smin";
break;
case nir_intrinsic_image_atomic_max:
atomic_name = "smax";
atomic_name = is_unsigned ? "umax" : "smax";
break;
case nir_intrinsic_image_atomic_and:
atomic_name = "and";
@@ -3554,14 +3602,23 @@ static LLVMValueRef visit_image_size(struct nir_to_llvm_context *ctx,
res = ac_build_image_opcode(&ctx->ac, &args);
LLVMValueRef two = LLVMConstInt(ctx->i32, 2, false);
if (glsl_get_sampler_dim(type) == GLSL_SAMPLER_DIM_CUBE &&
glsl_sampler_type_is_array(type)) {
LLVMValueRef two = LLVMConstInt(ctx->i32, 2, false);
LLVMValueRef six = LLVMConstInt(ctx->i32, 6, false);
LLVMValueRef z = LLVMBuildExtractElement(ctx->builder, res, two, "");
z = LLVMBuildSDiv(ctx->builder, z, six, "");
res = LLVMBuildInsertElement(ctx->builder, res, z, two, "");
}
if (ctx->options->chip_class >= GFX9 &&
glsl_get_sampler_dim(type) == GLSL_SAMPLER_DIM_1D &&
glsl_sampler_type_is_array(type)) {
LLVMValueRef layers = LLVMBuildExtractElement(ctx->builder, res, two, "");
res = LLVMBuildInsertElement(ctx->builder, res, layers,
ctx->ac.i32_1, "");
}
return res;
}
@@ -4418,36 +4475,50 @@ static void visit_tex(struct nir_to_llvm_context *ctx, nir_tex_instr *instr)
/* pack derivatives */
if (ddx || ddy) {
int num_src_deriv_channels, num_dest_deriv_channels;
switch (instr->sampler_dim) {
case GLSL_SAMPLER_DIM_3D:
case GLSL_SAMPLER_DIM_CUBE:
num_deriv_comp = 3;
num_src_deriv_channels = 3;
num_dest_deriv_channels = 3;
break;
case GLSL_SAMPLER_DIM_2D:
default:
num_src_deriv_channels = 2;
num_dest_deriv_channels = 2;
num_deriv_comp = 2;
break;
case GLSL_SAMPLER_DIM_1D:
num_deriv_comp = 1;
num_src_deriv_channels = 1;
if (ctx->options->chip_class >= GFX9) {
num_dest_deriv_channels = 2;
num_deriv_comp = 2;
} else {
num_dest_deriv_channels = 1;
num_deriv_comp = 1;
}
break;
}
for (unsigned i = 0; i < num_deriv_comp; i++) {
for (unsigned i = 0; i < num_src_deriv_channels; i++) {
derivs[i] = to_float(&ctx->ac, llvm_extract_elem(ctx, ddx, i));
derivs[num_deriv_comp + i] = to_float(&ctx->ac, llvm_extract_elem(ctx, ddy, i));
derivs[num_dest_deriv_channels + i] = to_float(&ctx->ac, llvm_extract_elem(ctx, ddy, i));
}
for (unsigned i = num_src_deriv_channels; i < num_dest_deriv_channels; i++) {
derivs[i] = ctx->ac.f32_0;
derivs[num_dest_deriv_channels + i] = ctx->ac.f32_0;
}
}
if (instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE && coord) {
if (instr->is_array && instr->op != nir_texop_lod)
coords[3] = apply_round_slice(ctx, coords[3]);
for (chan = 0; chan < instr->coord_components; chan++)
coords[chan] = to_float(&ctx->ac, coords[chan]);
if (instr->coord_components == 3)
coords[3] = LLVMGetUndef(ctx->f32);
ac_prepare_cube_coords(&ctx->ac,
instr->op == nir_texop_txd, instr->is_array,
coords, derivs);
instr->op == nir_texop_lod, coords, derivs);
if (num_deriv_comp)
num_deriv_comp--;
}
@@ -4475,6 +4546,25 @@ static void visit_tex(struct nir_to_llvm_context *ctx, nir_tex_instr *instr)
}
address[count++] = coords[2];
}
if (ctx->options->chip_class >= GFX9) {
LLVMValueRef filler;
if (instr->op == nir_texop_txf)
filler = ctx->ac.i32_0;
else
filler = LLVMConstReal(ctx->f32, 0.5);
if (instr->sampler_dim == GLSL_SAMPLER_DIM_1D) {
/* No nir_texop_lod, because it does not take a slice
* even with array textures. */
if (instr->is_array && instr->op != nir_texop_lod ) {
address[count] = address[count - 1];
address[count - 1] = filler;
count++;
} else
address[count++] = filler;
}
}
}
/* Pack LOD */
@@ -4569,6 +4659,14 @@ static void visit_tex(struct nir_to_llvm_context *ctx, nir_tex_instr *instr)
LLVMValueRef z = LLVMBuildExtractElement(ctx->builder, result, two, "");
z = LLVMBuildSDiv(ctx->builder, z, six, "");
result = LLVMBuildInsertElement(ctx->builder, result, z, two, "");
} else if (ctx->options->chip_class >= GFX9 &&
instr->op == nir_texop_txs &&
instr->sampler_dim == GLSL_SAMPLER_DIM_1D &&
instr->is_array) {
LLVMValueRef two = LLVMConstInt(ctx->i32, 2, false);
LLVMValueRef layers = LLVMBuildExtractElement(ctx->builder, result, two, "");
result = LLVMBuildInsertElement(ctx->builder, result, layers,
ctx->ac.i32_1, "");
} else if (instr->dest.ssa.num_components != 4)
result = trim_vector(ctx, result, instr->dest.ssa.num_components);
@@ -5178,6 +5276,7 @@ si_llvm_init_export_args(struct nir_to_llvm_context *ctx,
unsigned index = target - V_008DFC_SQ_EXP_MRT;
unsigned col_format = (ctx->options->key.fs.col_format >> (4 * index)) & 0xf;
bool is_int8 = (ctx->options->key.fs.is_int8 >> index) & 1;
bool is_int10 = (ctx->options->key.fs.is_int10 >> index) & 1;
switch(col_format) {
case V_028714_SPI_SHADER_ZERO:
@@ -5255,11 +5354,13 @@ si_llvm_init_export_args(struct nir_to_llvm_context *ctx,
break;
case V_028714_SPI_SHADER_UINT16_ABGR: {
LLVMValueRef max = LLVMConstInt(ctx->i32, is_int8 ? 255 : 65535, 0);
LLVMValueRef max_rgb = LLVMConstInt(ctx->i32,
is_int8 ? 255 : is_int10 ? 1023 : 65535, 0);
LLVMValueRef max_alpha = !is_int10 ? max_rgb : LLVMConstInt(ctx->i32, 3, 0);
for (unsigned chan = 0; chan < 4; chan++) {
val[chan] = to_integer(&ctx->ac, values[chan]);
val[chan] = emit_minmax_int(&ctx->ac, LLVMIntULT, val[chan], max);
val[chan] = emit_minmax_int(&ctx->ac, LLVMIntULT, val[chan], chan == 3 ? max_alpha : max_rgb);
}
args->compr = 1;
@@ -5269,14 +5370,18 @@ si_llvm_init_export_args(struct nir_to_llvm_context *ctx,
}
case V_028714_SPI_SHADER_SINT16_ABGR: {
LLVMValueRef max = LLVMConstInt(ctx->i32, is_int8 ? 127 : 32767, 0);
LLVMValueRef min = LLVMConstInt(ctx->i32, is_int8 ? -128 : -32768, 0);
LLVMValueRef max_rgb = LLVMConstInt(ctx->i32,
is_int8 ? 127 : is_int10 ? 511 : 32767, 0);
LLVMValueRef min_rgb = LLVMConstInt(ctx->i32,
is_int8 ? -128 : is_int10 ? -512 : -32768, 0);
LLVMValueRef max_alpha = !is_int10 ? max_rgb : ctx->i32one;
LLVMValueRef min_alpha = !is_int10 ? min_rgb : LLVMConstInt(ctx->i32, -2, 0);
/* Clamp. */
for (unsigned chan = 0; chan < 4; chan++) {
val[chan] = to_integer(&ctx->ac, values[chan]);
val[chan] = emit_minmax_int(&ctx->ac, LLVMIntSLT, val[chan], max);
val[chan] = emit_minmax_int(&ctx->ac, LLVMIntSGT, val[chan], min);
val[chan] = emit_minmax_int(&ctx->ac, LLVMIntSLT, val[chan], chan == 3 ? max_alpha : max_rgb);
val[chan] = emit_minmax_int(&ctx->ac, LLVMIntSGT, val[chan], chan == 3 ? min_alpha : min_rgb);
}
args->compr = 1;
@@ -5367,11 +5472,11 @@ handle_vs_outputs_post(struct nir_to_llvm_context *ctx,
ctx->outputs[radeon_llvm_reg_index_soa(VARYING_SLOT_VIEWPORT, 0)], "");
}
uint32_t mask = ((outinfo->writes_pointsize == true ? 1 : 0) |
(outinfo->writes_layer == true ? 4 : 0) |
(outinfo->writes_viewport_index == true ? 8 : 0));
if (mask) {
pos_args[1].enabled_channels = mask;
if (outinfo->writes_pointsize ||
outinfo->writes_layer ||
outinfo->writes_viewport_index) {
pos_args[1].enabled_channels = ((outinfo->writes_pointsize == true ? 1 : 0) |
(outinfo->writes_layer == true ? 4 : 0));
pos_args[1].valid_mask = 0;
pos_args[1].done = 0;
pos_args[1].target = V_008DFC_SQ_EXP_POS + 1;
@@ -5385,8 +5490,26 @@ handle_vs_outputs_post(struct nir_to_llvm_context *ctx,
pos_args[1].out[0] = psize_value;
if (outinfo->writes_layer == true)
pos_args[1].out[2] = layer_value;
if (outinfo->writes_viewport_index == true)
pos_args[1].out[3] = viewport_index_value;
if (outinfo->writes_viewport_index == true) {
if (ctx->options->chip_class >= GFX9) {
/* GFX9 has the layer in out.z[10:0] and the viewport
* index in out.z[19:16].
*/
LLVMValueRef v = viewport_index_value;
v = to_integer(&ctx->ac, v);
v = LLVMBuildShl(ctx->builder, v,
LLVMConstInt(ctx->i32, 16, false),
"");
v = LLVMBuildOr(ctx->builder, v,
to_integer(&ctx->ac, pos_args[1].out[2]), "");
pos_args[1].out[2] = to_float(&ctx->ac, v);
pos_args[1].enabled_channels |= 1 << 2;
} else {
pos_args[1].out[3] = viewport_index_value;
pos_args[1].enabled_channels |= 1 << 3;
}
}
}
for (i = 0; i < 4; i++) {
if (pos_args[i].out[0])
@@ -5815,10 +5938,11 @@ si_export_mrt_z(struct nir_to_llvm_context *ctx,
args.enabled_channels |= 0x4;
}
/* SI (except OLAND) has a bug that it only looks
/* SI (except OLAND and HAINAN) has a bug that it only looks
* at the X writemask component. */
if (ctx->options->chip_class == SI &&
ctx->options->family != CHIP_OLAND)
ctx->options->family != CHIP_OLAND &&
ctx->options->family != CHIP_HAINAN)
args.enabled_channels |= 0x1;
ac_build_export(&ctx->ac, &args);
@@ -6041,7 +6165,7 @@ LLVMModuleRef ac_translate_nir_to_llvm(LLVMTargetMachineRef tm,
ctx.context = LLVMContextCreate();
ctx.module = LLVMModuleCreateWithNameInContext("shader", ctx.context);
ac_llvm_context_init(&ctx.ac, ctx.context);
ac_llvm_context_init(&ctx.ac, ctx.context, options->chip_class);
ctx.ac.module = ctx.module;
ctx.has_ds_bpermute = ctx.options->chip_class >= VI;
@@ -6375,7 +6499,7 @@ void ac_create_gs_copy_shader(LLVMTargetMachineRef tm,
ctx.options = options;
ctx.shader_info = shader_info;
ac_llvm_context_init(&ctx.ac, ctx.context);
ac_llvm_context_init(&ctx.ac, ctx.context, options->chip_class);
ctx.ac.module = ctx.module;
ctx.is_gs_copy_shader = true;

View File

@@ -57,6 +57,7 @@ struct ac_tcs_variant_key {
struct ac_fs_variant_key {
uint32_t col_format;
uint32_t is_int8;
uint32_t is_int10;
};
union ac_shader_variant_key {

View File

@@ -257,6 +257,18 @@ static int gfx6_compute_level(ADDR_HANDLE addrlib,
AddrSurfInfoIn->width = u_minify(config->info.width, level);
AddrSurfInfoIn->height = u_minify(config->info.height, level);
/* Make GFX6 linear surfaces compatible with GFX9 for hybrid graphics,
* because GFX9 needs linear alignment of 256 bytes.
*/
if (config->info.levels == 1 &&
AddrSurfInfoIn->tileMode == ADDR_TM_LINEAR_ALIGNED &&
AddrSurfInfoIn->bpp) {
unsigned alignment = 256 / (AddrSurfInfoIn->bpp / 8);
assert(util_is_power_of_two(AddrSurfInfoIn->bpp));
AddrSurfInfoIn->width = align(AddrSurfInfoIn->width, alignment);
}
if (config->is_3d)
AddrSurfInfoIn->numSlices = u_minify(config->info.depth, level);
else if (config->is_cube)
@@ -541,15 +553,35 @@ static int gfx6_compute_surface(ADDR_HANDLE addrlib,
AddrSurfInfoIn.flags.noStencil = (surf->flags & RADEON_SURF_SBUFFER) == 0;
AddrSurfInfoIn.flags.compressZ = AddrSurfInfoIn.flags.depth;
/* noStencil = 0 can result in a depth part that is incompatible with
* mipmapped texturing. So set noStencil = 1 when mipmaps are requested (in
* this case, we may end up setting stencil_adjusted).
/* On CI/VI, the DB uses the same pitch and tile mode (except tilesplit)
* for Z and stencil. This can cause a number of problems which we work
* around here:
*
* TODO: update addrlib to a newer version, remove this, and
* use flags.matchStencilTileCfg = 1 as an alternative fix.
* - a depth part that is incompatible with mipmapped texturing
* - at least on Stoney, entirely incompatible Z/S aspects (e.g.
* incorrect tiling applied to the stencil part, stencil buffer
* memory accesses that go out of bounds) even without mipmapping
*
* Some piglit tests that are prone to different types of related
* failures:
* ./bin/ext_framebuffer_multisample-upsample 2 stencil
* ./bin/framebuffer-blit-levels {draw,read} stencil
* ./bin/ext_framebuffer_multisample-unaligned-blit N {depth,stencil} {msaa,upsample,downsample}
* ./bin/fbo-depth-array fs-writes-{depth,stencil} / {depth,stencil}-{clear,layered-clear,draw}
* ./bin/depthstencil-render-miplevels 1024 d=s=z24_s8
*/
if (config->info.levels > 1)
int stencil_tile_idx = -1;
if (AddrSurfInfoIn.flags.depth && !AddrSurfInfoIn.flags.noStencil &&
(config->info.levels > 1 || info->family == CHIP_STONEY)) {
/* Compute stencilTileIdx that is compatible with the (depth)
* tileIdx. This degrades the depth surface if necessary to
* ensure that a matching stencilTileIdx exists. */
AddrSurfInfoIn.flags.matchStencilTileCfg = 1;
/* Keep the depth mip-tail compatible with texturing. */
AddrSurfInfoIn.flags.noStencil = 1;
}
/* Set preferred macrotile parameters. This is usually required
* for shared resources. This is for 2D tiling only. */
@@ -631,12 +663,33 @@ static int gfx6_compute_surface(ADDR_HANDLE addrlib,
if (level > 0)
continue;
/* Check that we actually got a TC-compatible HTILE if
* we requested it (only for level 0, since we're not
* supporting HTILE on higher mip levels anyway). */
assert(AddrSurfInfoOut.tcCompatible ||
!AddrSurfInfoIn.flags.tcCompatible ||
AddrSurfInfoIn.flags.matchStencilTileCfg);
if (AddrSurfInfoIn.flags.matchStencilTileCfg) {
if (!AddrSurfInfoOut.tcCompatible) {
AddrSurfInfoIn.flags.tcCompatible = 0;
surf->flags &= ~RADEON_SURF_TC_COMPATIBLE_HTILE;
}
AddrSurfInfoIn.flags.matchStencilTileCfg = 0;
AddrSurfInfoIn.tileIndex = AddrSurfInfoOut.tileIndex;
stencil_tile_idx = AddrSurfInfoOut.stencilTileIdx;
assert(stencil_tile_idx >= 0);
}
gfx6_surface_settings(info, &AddrSurfInfoOut, surf);
}
}
/* Calculate texture layout information for stencil. */
if (surf->flags & RADEON_SURF_SBUFFER) {
AddrSurfInfoIn.tileIndex = stencil_tile_idx;
AddrSurfInfoIn.bpp = 8;
AddrSurfInfoIn.flags.depth = 0;
AddrSurfInfoIn.flags.stencil = 1;
@@ -835,9 +888,11 @@ static int gfx9_compute_miptree(ADDR_HANDLE addrlib,
in->numSamples == 1) {
ADDR2_COMPUTE_DCCINFO_INPUT din = {0};
ADDR2_COMPUTE_DCCINFO_OUTPUT dout = {0};
ADDR2_META_MIP_INFO meta_mip_info[RADEON_SURF_MAX_LEVELS] = {};
din.size = sizeof(ADDR2_COMPUTE_DCCINFO_INPUT);
dout.size = sizeof(ADDR2_COMPUTE_DCCINFO_OUTPUT);
dout.pMipInfo = meta_mip_info;
din.dccKeyFlags.pipeAligned = 1;
din.dccKeyFlags.rbAligned = 1;
@@ -861,6 +916,39 @@ static int gfx9_compute_miptree(ADDR_HANDLE addrlib,
surf->u.gfx9.dcc_pitch_max = dout.pitch - 1;
surf->dcc_size = dout.dccRamSize;
surf->dcc_alignment = dout.dccRamBaseAlign;
surf->num_dcc_levels = in->numMipLevels;
/* Disable DCC for levels that are in the mip tail.
*
* There are two issues that this is intended to
* address:
*
* 1. Multiple mip levels may share a cache line. This
* can lead to corruption when switching between
* rendering to different mip levels because the
* RBs don't maintain coherency.
*
* 2. Texturing with metadata after rendering sometimes
* fails with corruption, probably for a similar
* reason.
*
* Working around these issues for all levels in the
* mip tail may be overly conservative, but it's what
* Vulkan does.
*
* Alternative solutions that also work but are worse:
* - Disable DCC entirely.
* - Flush TC L2 after rendering.
*/
for (unsigned i = 0; i < in->numMipLevels; i++) {
if (meta_mip_info[i].inMiptail) {
surf->num_dcc_levels = i;
break;
}
}
if (!surf->num_dcc_levels)
surf->dcc_size = 0;
}
/* FMASK */
@@ -997,6 +1085,11 @@ static int gfx9_compute_surface(ADDR_HANDLE addrlib,
case RADEON_SURF_MODE_1D:
case RADEON_SURF_MODE_2D:
if (surf->flags & RADEON_SURF_IMPORTED) {
AddrSurfInfoIn.swizzleMode = surf->u.gfx9.surf.swizzle_mode;
break;
}
r = gfx9_get_preferred_swizzle_mode(addrlib, &AddrSurfInfoIn, false,
&AddrSurfInfoIn.swizzleMode);
if (r)
@@ -1009,6 +1102,7 @@ static int gfx9_compute_surface(ADDR_HANDLE addrlib,
surf->u.gfx9.resource_type = AddrSurfInfoIn.resourceType;
surf->num_dcc_levels = 0;
surf->surf_size = 0;
surf->dcc_size = 0;
surf->htile_size = 0;
@@ -1025,9 +1119,16 @@ static int gfx9_compute_surface(ADDR_HANDLE addrlib,
/* Calculate texture layout information for stencil. */
if (surf->flags & RADEON_SURF_SBUFFER) {
AddrSurfInfoIn.bpp = 8;
AddrSurfInfoIn.flags.depth = 0;
AddrSurfInfoIn.flags.stencil = 1;
AddrSurfInfoIn.bpp = 8;
if (!AddrSurfInfoIn.flags.depth) {
r = gfx9_get_preferred_swizzle_mode(addrlib, &AddrSurfInfoIn, false,
&AddrSurfInfoIn.swizzleMode);
if (r)
return r;
} else
AddrSurfInfoIn.flags.depth = 0;
r = gfx9_compute_miptree(addrlib, surf, compressed, &AddrSurfInfoIn);
if (r)
@@ -1035,7 +1136,6 @@ static int gfx9_compute_surface(ADDR_HANDLE addrlib,
}
surf->is_linear = surf->u.gfx9.surf.swizzle_mode == ADDR_SW_LINEAR;
surf->num_dcc_levels = surf->dcc_size ? config->info.levels : 0;
switch (surf->u.gfx9.surf.swizzle_mode) {
/* S = standard. */

View File

@@ -2453,6 +2453,8 @@
#define S_008F3C_BORDER_COLOR_PTR(x) (((unsigned)(x) & 0xFFF) << 0)
#define G_008F3C_BORDER_COLOR_PTR(x) (((x) >> 0) & 0xFFF)
#define C_008F3C_BORDER_COLOR_PTR 0xFFFFF000
/* The UPGRADED_DEPTH field is driver-specific and does not exist in hardware. */
#define S_008F3C_UPGRADED_DEPTH(x) (((unsigned)(x) & 0x1) << 29)
#define S_008F3C_BORDER_COLOR_TYPE(x) (((unsigned)(x) & 0x03) << 30)
#define G_008F3C_BORDER_COLOR_TYPE(x) (((x) >> 30) & 0x03)
#define C_008F3C_BORDER_COLOR_TYPE 0x3FFFFFFF

View File

@@ -368,6 +368,10 @@ radv_emit_graphics_blend_state(struct radv_cmd_buffer *cmd_buffer,
radeon_set_context_reg(cmd_buffer->cs, R_028B70_DB_ALPHA_TO_MASK, pipeline->graphics.blend.db_alpha_to_mask);
if (cmd_buffer->device->physical_device->has_rbplus) {
radeon_set_context_reg_seq(cmd_buffer->cs, R_028760_SX_MRT0_BLEND_OPT, 8);
radeon_emit_array(cmd_buffer->cs, pipeline->graphics.blend.sx_mrt_blend_opt, 8);
radeon_set_context_reg_seq(cmd_buffer->cs, R_028754_SX_PS_DOWNCONVERT, 3);
radeon_emit(cmd_buffer->cs, 0); /* R_028754_SX_PS_DOWNCONVERT */
radeon_emit(cmd_buffer->cs, 0); /* R_028758_SX_BLEND_OPT_EPSILON */
@@ -934,6 +938,11 @@ static void
radv_emit_scissor(struct radv_cmd_buffer *cmd_buffer)
{
uint32_t count = cmd_buffer->state.dynamic.scissor.count;
if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) {
cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_PS_PARTIAL_FLUSH;
si_emit_cache_flush(cmd_buffer);
}
si_write_scissors(cmd_buffer->cs, 0, count,
cmd_buffer->state.dynamic.scissor.scissors,
cmd_buffer->state.dynamic.viewport.viewports,
@@ -1007,6 +1016,8 @@ radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer,
}
radeon_set_context_reg(cmd_buffer->cs, R_028008_DB_DEPTH_VIEW, ds->db_depth_view);
radeon_set_context_reg(cmd_buffer->cs, R_028ABC_DB_HTILE_SURFACE, ds->db_htile_surface);
if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) {
radeon_set_context_reg_seq(cmd_buffer->cs, R_028014_DB_HTILE_DATA_BASE, 3);
@@ -1043,7 +1054,6 @@ radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer,
radeon_emit(cmd_buffer->cs, ds->db_depth_size); /* R_028058_DB_DEPTH_SIZE */
radeon_emit(cmd_buffer->cs, ds->db_depth_slice); /* R_02805C_DB_DEPTH_SLICE */
radeon_set_context_reg(cmd_buffer->cs, R_028ABC_DB_HTILE_SURFACE, ds->db_htile_surface);
}
radeon_set_context_reg(cmd_buffer->cs, R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL,
@@ -1208,6 +1218,10 @@ radv_emit_framebuffer_state(struct radv_cmd_buffer *cmd_buffer)
struct radv_framebuffer *framebuffer = cmd_buffer->state.framebuffer;
const struct radv_subpass *subpass = cmd_buffer->state.subpass;
/* this may happen for inherited secondary recording */
if (!framebuffer)
return;
for (i = 0; i < 8; ++i) {
if (i >= subpass->color_count || subpass->color_attachments[i].attachment == VK_ATTACHMENT_UNUSED) {
radeon_set_context_reg(cmd_buffer->cs, R_028C70_CB_COLOR0_INFO + i * 0x3C,
@@ -1247,9 +1261,13 @@ radv_emit_framebuffer_state(struct radv_cmd_buffer *cmd_buffer)
}
radv_load_depth_clear_regs(cmd_buffer, image);
} else {
radeon_set_context_reg_seq(cmd_buffer->cs, R_028040_DB_Z_INFO, 2);
radeon_emit(cmd_buffer->cs, S_028040_FORMAT(V_028040_Z_INVALID)); /* R_028040_DB_Z_INFO */
radeon_emit(cmd_buffer->cs, S_028044_FORMAT(V_028044_STENCIL_INVALID)); /* R_028044_DB_STENCIL_INFO */
if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9)
radeon_set_context_reg_seq(cmd_buffer->cs, R_028038_DB_Z_INFO, 2);
else
radeon_set_context_reg_seq(cmd_buffer->cs, R_028040_DB_Z_INFO, 2);
radeon_emit(cmd_buffer->cs, S_028040_FORMAT(V_028040_Z_INVALID)); /* DB_Z_INFO */
radeon_emit(cmd_buffer->cs, S_028044_FORMAT(V_028044_STENCIL_INVALID)); /* DB_STENCIL_INFO */
}
radeon_set_context_reg(cmd_buffer->cs, R_028208_PA_SC_WINDOW_SCISSOR_BR,
S_028208_BR_X(framebuffer->width) |
@@ -1971,6 +1989,7 @@ VkResult radv_BeginCommandBuffer(
memset(&cmd_buffer->state, 0, sizeof(cmd_buffer->state));
cmd_buffer->state.last_primitive_reset_en = -1;
cmd_buffer->usage_flags = pBeginInfo->flags;
/* setup initial configuration into command buffer */
if (cmd_buffer->level == VK_COMMAND_BUFFER_LEVEL_PRIMARY) {
@@ -2016,7 +2035,7 @@ void radv_CmdBindVertexBuffers(
/* We have to defer setting up vertex buffer since we need the buffer
* stride from the pipeline. */
assert(firstBinding + bindingCount < MAX_VBS);
assert(firstBinding + bindingCount <= MAX_VBS);
for (uint32_t i = 0; i < bindingCount; i++) {
vb[firstBinding + i].buffer = radv_buffer_from_handle(pBuffers[i]);
vb[firstBinding + i].offset = pOffsets[i];
@@ -2233,8 +2252,13 @@ VkResult radv_EndCommandBuffer(
{
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
if (cmd_buffer->queue_family_index != RADV_QUEUE_TRANSFER)
if (cmd_buffer->queue_family_index != RADV_QUEUE_TRANSFER) {
if (cmd_buffer->device->physical_device->rad_info.chip_class == SI)
cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_CS_PARTIAL_FLUSH | RADV_CMD_FLAG_PS_PARTIAL_FLUSH | RADV_CMD_FLAG_WRITEBACK_GLOBAL_L2;
si_emit_cache_flush(cmd_buffer);
}
vk_free(&cmd_buffer->pool->alloc, cmd_buffer->state.attachments);
if (!cmd_buffer->device->ws->cs_finalize(cmd_buffer->cs) ||
cmd_buffer->record_fail)
@@ -2701,7 +2725,7 @@ void radv_CmdDrawIndexed(
radv_cmd_buffer_flush_state(cmd_buffer, true, (instanceCount > 1), false, indexCount);
MAYBE_UNUSED unsigned cdw_max = radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 15);
MAYBE_UNUSED unsigned cdw_max = radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 16);
if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) {
radeon_set_uconfig_reg_idx(cmd_buffer->cs, R_03090C_VGT_INDEX_TYPE,
@@ -2757,6 +2781,8 @@ radv_emit_indirect_draw(struct radv_cmd_buffer *cmd_buffer,
if (count_buffer) {
count_va = cmd_buffer->device->ws->buffer_get_va(count_buffer->bo);
count_va += count_offset + count_buffer->offset;
cmd_buffer->device->ws->cs_add_buffer(cs, count_buffer->bo, 8);
}
if (!draw_count)
@@ -2772,20 +2798,30 @@ radv_emit_indirect_draw(struct radv_cmd_buffer *cmd_buffer,
radeon_emit(cs, indirect_va);
radeon_emit(cs, indirect_va >> 32);
radeon_emit(cs, PKT3(indexed ? PKT3_DRAW_INDEX_INDIRECT_MULTI :
PKT3_DRAW_INDIRECT_MULTI,
8, false));
radeon_emit(cs, 0);
radeon_emit(cs, (base_reg - SI_SH_REG_OFFSET) >> 2);
radeon_emit(cs, ((base_reg + 4) - SI_SH_REG_OFFSET) >> 2);
radeon_emit(cs, (((base_reg + 8) - SI_SH_REG_OFFSET) >> 2) |
S_2C3_DRAW_INDEX_ENABLE(draw_id_enable) |
S_2C3_COUNT_INDIRECT_ENABLE(!!count_va));
radeon_emit(cs, draw_count); /* count */
radeon_emit(cs, count_va); /* count_addr */
radeon_emit(cs, count_va >> 32);
radeon_emit(cs, stride); /* stride */
radeon_emit(cs, di_src_sel);
if (draw_count == 1 && !count_va && !draw_id_enable) {
radeon_emit(cs, PKT3(indexed ? PKT3_DRAW_INDEX_INDIRECT :
PKT3_DRAW_INDIRECT, 3, false));
radeon_emit(cs, 0);
radeon_emit(cs, (base_reg - SI_SH_REG_OFFSET) >> 2);
radeon_emit(cs, ((base_reg + 4) - SI_SH_REG_OFFSET) >> 2);
radeon_emit(cs, di_src_sel);
} else {
radeon_emit(cs, PKT3(indexed ? PKT3_DRAW_INDEX_INDIRECT_MULTI :
PKT3_DRAW_INDIRECT_MULTI,
8, false));
radeon_emit(cs, 0);
radeon_emit(cs, (base_reg - SI_SH_REG_OFFSET) >> 2);
radeon_emit(cs, ((base_reg + 4) - SI_SH_REG_OFFSET) >> 2);
radeon_emit(cs, (((base_reg + 8) - SI_SH_REG_OFFSET) >> 2) |
S_2C3_DRAW_INDEX_ENABLE(draw_id_enable) |
S_2C3_COUNT_INDIRECT_ENABLE(!!count_va));
radeon_emit(cs, draw_count); /* count */
radeon_emit(cs, count_va); /* count_addr */
radeon_emit(cs, count_va >> 32);
radeon_emit(cs, stride); /* stride */
radeon_emit(cs, di_src_sel);
}
radv_cmd_buffer_trace_emit(cmd_buffer);
}

View File

@@ -66,6 +66,7 @@ VkResult radv_CreateDescriptorSetLayout(
set_layout->binding_count = max_binding + 1;
set_layout->shader_stages = 0;
set_layout->dynamic_shader_stages = 0;
set_layout->size = 0;
memset(set_layout->binding, 0, size - sizeof(struct radv_descriptor_set_layout));
@@ -734,8 +735,59 @@ void radv_update_descriptor_sets(
}
}
if (descriptorCopyCount)
radv_finishme("copy descriptors");
for (i = 0; i < descriptorCopyCount; i++) {
const VkCopyDescriptorSet *copyset = &pDescriptorCopies[i];
RADV_FROM_HANDLE(radv_descriptor_set, src_set,
copyset->srcSet);
RADV_FROM_HANDLE(radv_descriptor_set, dst_set,
copyset->dstSet);
const struct radv_descriptor_set_binding_layout *src_binding_layout =
src_set->layout->binding + copyset->srcBinding;
const struct radv_descriptor_set_binding_layout *dst_binding_layout =
dst_set->layout->binding + copyset->dstBinding;
uint32_t *src_ptr = src_set->mapped_ptr;
uint32_t *dst_ptr = dst_set->mapped_ptr;
struct radeon_winsys_bo **src_buffer_list = src_set->descriptors;
struct radeon_winsys_bo **dst_buffer_list = dst_set->descriptors;
src_ptr += src_binding_layout->offset / 4;
dst_ptr += dst_binding_layout->offset / 4;
src_ptr += src_binding_layout->size * copyset->srcArrayElement / 4;
dst_ptr += dst_binding_layout->size * copyset->dstArrayElement / 4;
src_buffer_list += src_binding_layout->buffer_offset;
src_buffer_list += copyset->srcArrayElement;
dst_buffer_list += dst_binding_layout->buffer_offset;
dst_buffer_list += copyset->dstArrayElement;
for (j = 0; j < copyset->descriptorCount; ++j) {
switch (src_binding_layout->type) {
case VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC:
case VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC: {
unsigned src_idx = copyset->srcArrayElement + j;
unsigned dst_idx = copyset->dstArrayElement + j;
struct radv_descriptor_range *src_range, *dst_range;
src_idx += src_binding_layout->dynamic_offset_offset;
dst_idx += dst_binding_layout->dynamic_offset_offset;
src_range = src_set->dynamic_descriptors + src_idx;
dst_range = dst_set->dynamic_descriptors + dst_idx;
*dst_range = *src_range;
break;
}
default:
memcpy(dst_ptr, src_ptr, src_binding_layout->size);
}
src_ptr += src_binding_layout->size / 4;
dst_ptr += dst_binding_layout->size / 4;
dst_buffer_list[j] = src_buffer_list[j];
++src_buffer_list;
++dst_buffer_list;
}
}
}
void radv_UpdateDescriptorSets(

View File

@@ -263,6 +263,75 @@ get_chip_name(enum radeon_family family)
}
}
static void
radv_physical_device_init_mem_types(struct radv_physical_device *device)
{
STATIC_ASSERT(RADV_MEM_HEAP_COUNT <= VK_MAX_MEMORY_HEAPS);
uint64_t visible_vram_size = MIN2(device->rad_info.vram_size,
device->rad_info.vram_vis_size);
int vram_index = -1, visible_vram_index = -1, gart_index = -1;
device->memory_properties.memoryHeapCount = 0;
if (device->rad_info.vram_size - visible_vram_size > 0) {
vram_index = device->memory_properties.memoryHeapCount++;
device->memory_properties.memoryHeaps[vram_index] = (VkMemoryHeap) {
.size = device->rad_info.vram_size - visible_vram_size,
.flags = VK_MEMORY_HEAP_DEVICE_LOCAL_BIT,
};
}
if (visible_vram_size) {
visible_vram_index = device->memory_properties.memoryHeapCount++;
device->memory_properties.memoryHeaps[visible_vram_index] = (VkMemoryHeap) {
.size = visible_vram_size,
.flags = VK_MEMORY_HEAP_DEVICE_LOCAL_BIT,
};
}
if (device->rad_info.gart_size > 0) {
gart_index = device->memory_properties.memoryHeapCount++;
device->memory_properties.memoryHeaps[gart_index] = (VkMemoryHeap) {
.size = device->rad_info.gart_size,
.flags = 0,
};
}
STATIC_ASSERT(RADV_MEM_TYPE_COUNT <= VK_MAX_MEMORY_TYPES);
unsigned type_count = 0;
if (vram_index >= 0) {
device->mem_type_indices[type_count] = RADV_MEM_TYPE_VRAM;
device->memory_properties.memoryTypes[type_count++] = (VkMemoryType) {
.propertyFlags = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT,
.heapIndex = vram_index,
};
}
if (gart_index >= 0) {
device->mem_type_indices[type_count] = RADV_MEM_TYPE_GTT_WRITE_COMBINE;
device->memory_properties.memoryTypes[type_count++] = (VkMemoryType) {
.propertyFlags = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
.heapIndex = gart_index,
};
}
if (visible_vram_index >= 0) {
device->mem_type_indices[type_count] = RADV_MEM_TYPE_VRAM_CPU_ACCESS;
device->memory_properties.memoryTypes[type_count++] = (VkMemoryType) {
.propertyFlags = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT |
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
.heapIndex = visible_vram_index,
};
}
if (gart_index >= 0) {
device->mem_type_indices[type_count] = RADV_MEM_TYPE_GTT_CACHED;
device->memory_properties.memoryTypes[type_count++] = (VkMemoryType) {
.propertyFlags = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT |
VK_MEMORY_PROPERTY_HOST_CACHED_BIT,
.heapIndex = gart_index,
};
}
device->memory_properties.memoryTypeCount = type_count;
}
static VkResult
radv_physical_device_init(struct radv_physical_device *device,
struct radv_instance *instance,
@@ -346,6 +415,7 @@ radv_physical_device_init(struct radv_physical_device *device,
device->rbplus_allowed = device->rad_info.family == CHIP_STONEY;
}
radv_physical_device_init_mem_types(device);
return VK_SUCCESS;
fail:
@@ -902,47 +972,7 @@ void radv_GetPhysicalDeviceMemoryProperties(
{
RADV_FROM_HANDLE(radv_physical_device, physical_device, physicalDevice);
STATIC_ASSERT(RADV_MEM_TYPE_COUNT <= VK_MAX_MEMORY_TYPES);
pMemoryProperties->memoryTypeCount = RADV_MEM_TYPE_COUNT;
pMemoryProperties->memoryTypes[RADV_MEM_TYPE_VRAM] = (VkMemoryType) {
.propertyFlags = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT,
.heapIndex = RADV_MEM_HEAP_VRAM,
};
pMemoryProperties->memoryTypes[RADV_MEM_TYPE_GTT_WRITE_COMBINE] = (VkMemoryType) {
.propertyFlags = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
.heapIndex = RADV_MEM_HEAP_GTT,
};
pMemoryProperties->memoryTypes[RADV_MEM_TYPE_VRAM_CPU_ACCESS] = (VkMemoryType) {
.propertyFlags = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT |
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
.heapIndex = RADV_MEM_HEAP_VRAM_CPU_ACCESS,
};
pMemoryProperties->memoryTypes[RADV_MEM_TYPE_GTT_CACHED] = (VkMemoryType) {
.propertyFlags = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT |
VK_MEMORY_PROPERTY_HOST_CACHED_BIT,
.heapIndex = RADV_MEM_HEAP_GTT,
};
STATIC_ASSERT(RADV_MEM_HEAP_COUNT <= VK_MAX_MEMORY_HEAPS);
pMemoryProperties->memoryHeapCount = RADV_MEM_HEAP_COUNT;
pMemoryProperties->memoryHeaps[RADV_MEM_HEAP_VRAM] = (VkMemoryHeap) {
.size = physical_device->rad_info.vram_size -
physical_device->rad_info.vram_vis_size,
.flags = VK_MEMORY_HEAP_DEVICE_LOCAL_BIT,
};
pMemoryProperties->memoryHeaps[RADV_MEM_HEAP_VRAM_CPU_ACCESS] = (VkMemoryHeap) {
.size = physical_device->rad_info.vram_vis_size,
.flags = VK_MEMORY_HEAP_DEVICE_LOCAL_BIT,
};
pMemoryProperties->memoryHeaps[RADV_MEM_HEAP_GTT] = (VkMemoryHeap) {
.size = physical_device->rad_info.gart_size,
.flags = 0,
};
*pMemoryProperties = physical_device->memory_properties;
}
void radv_GetPhysicalDeviceMemoryProperties2KHR(
@@ -2231,6 +2261,7 @@ VkResult radv_AllocateMemory(
VkResult result;
enum radeon_bo_domain domain;
uint32_t flags = 0;
enum radv_mem_type mem_type_index = device->physical_device->mem_type_indices[pAllocateInfo->memoryTypeIndex];
assert(pAllocateInfo->sType == VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO);
@@ -2273,18 +2304,18 @@ VkResult radv_AllocateMemory(
}
uint64_t alloc_size = align_u64(pAllocateInfo->allocationSize, 4096);
if (pAllocateInfo->memoryTypeIndex == RADV_MEM_TYPE_GTT_WRITE_COMBINE ||
pAllocateInfo->memoryTypeIndex == RADV_MEM_TYPE_GTT_CACHED)
if (mem_type_index == RADV_MEM_TYPE_GTT_WRITE_COMBINE ||
mem_type_index == RADV_MEM_TYPE_GTT_CACHED)
domain = RADEON_DOMAIN_GTT;
else
domain = RADEON_DOMAIN_VRAM;
if (pAllocateInfo->memoryTypeIndex == RADV_MEM_TYPE_VRAM)
if (mem_type_index == RADV_MEM_TYPE_VRAM)
flags |= RADEON_FLAG_NO_CPU_ACCESS;
else
flags |= RADEON_FLAG_CPU_ACCESS;
if (pAllocateInfo->memoryTypeIndex == RADV_MEM_TYPE_GTT_WRITE_COMBINE)
if (mem_type_index == RADV_MEM_TYPE_GTT_WRITE_COMBINE)
flags |= RADEON_FLAG_GTT_WC;
mem->bo = device->ws->buffer_create(device->ws, alloc_size, device->physical_device->rad_info.max_alignment,
@@ -2294,7 +2325,7 @@ VkResult radv_AllocateMemory(
result = VK_ERROR_OUT_OF_DEVICE_MEMORY;
goto fail;
}
mem->type_index = pAllocateInfo->memoryTypeIndex;
mem->type_index = mem_type_index;
out_success:
*pMem = radv_device_memory_to_handle(mem);
@@ -2378,13 +2409,14 @@ VkResult radv_InvalidateMappedMemoryRanges(
}
void radv_GetBufferMemoryRequirements(
VkDevice device,
VkDevice _device,
VkBuffer _buffer,
VkMemoryRequirements* pMemoryRequirements)
{
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(radv_buffer, buffer, _buffer);
pMemoryRequirements->memoryTypeBits = (1u << RADV_MEM_TYPE_COUNT) - 1;
pMemoryRequirements->memoryTypeBits = (1u << device->physical_device->memory_properties.memoryTypeCount) - 1;
if (buffer->flags & VK_BUFFER_CREATE_SPARSE_BINDING_BIT)
pMemoryRequirements->alignment = 4096;
@@ -2418,13 +2450,14 @@ void radv_GetBufferMemoryRequirements2KHR(
}
void radv_GetImageMemoryRequirements(
VkDevice device,
VkDevice _device,
VkImage _image,
VkMemoryRequirements* pMemoryRequirements)
{
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(radv_image, image, _image);
pMemoryRequirements->memoryTypeBits = (1u << RADV_MEM_TYPE_COUNT) - 1;
pMemoryRequirements->memoryTypeBits = (1u << device->physical_device->memory_properties.memoryTypeCount) - 1;
pMemoryRequirements->size = image->size;
pMemoryRequirements->alignment = image->alignment;
@@ -2811,7 +2844,7 @@ VkResult radv_CreateEvent(
event->bo = device->ws->buffer_create(device->ws, 8, 8,
RADEON_DOMAIN_GTT,
RADEON_FLAG_CPU_ACCESS);
RADEON_FLAG_VA_UNCACHED | RADEON_FLAG_CPU_ACCESS);
if (!event->bo) {
vk_free2(&device->alloc, pAllocator, event);
return VK_ERROR_OUT_OF_DEVICE_MEMORY;
@@ -3077,9 +3110,13 @@ radv_initialise_color_surface(struct radv_device *device,
format != V_028C70_COLOR_24_8) |
S_028C70_NUMBER_TYPE(ntype) |
S_028C70_ENDIAN(endian);
if (iview->image->info.samples > 1)
if (iview->image->fmask.size)
cb->cb_color_info |= S_028C70_COMPRESSION(1);
if ((iview->image->info.samples > 1) && iview->image->fmask.size) {
cb->cb_color_info |= S_028C70_COMPRESSION(1);
if (device->physical_device->rad_info.chip_class == SI) {
unsigned fmask_bankh = util_logbase2(iview->image->fmask.bank_height);
cb->cb_color_attrib |= S_028C74_FMASK_BANK_HEIGHT(fmask_bankh);
}
}
if (iview->image->cmask.size &&
!(device->debug_flags & RADV_DEBUG_NO_FAST_CLEARS))
@@ -3109,15 +3146,15 @@ radv_initialise_color_surface(struct radv_device *device,
}
if (device->physical_device->rad_info.chip_class >= GFX9) {
uint32_t max_slice = radv_surface_layer_count(iview);
unsigned mip0_depth = iview->base_layer + max_slice - 1;
unsigned mip0_depth = iview->image->type == VK_IMAGE_TYPE_3D ?
(iview->extent.depth - 1) : (iview->image->info.array_size - 1);
cb->cb_color_view |= S_028C6C_MIP_LEVEL(iview->base_mip);
cb->cb_color_attrib |= S_028C74_MIP0_DEPTH(mip0_depth) |
S_028C74_RESOURCE_TYPE(iview->image->surface.u.gfx9.resource_type);
cb->cb_color_attrib2 = S_028C68_MIP0_WIDTH(iview->image->info.width - 1) |
S_028C68_MIP0_HEIGHT(iview->image->info.height - 1) |
S_028C68_MAX_MIP(iview->image->info.levels);
cb->cb_color_attrib2 = S_028C68_MIP0_WIDTH(iview->extent.width - 1) |
S_028C68_MIP0_HEIGHT(iview->extent.height - 1) |
S_028C68_MAX_MIP(iview->image->info.levels - 1);
cb->gfx9_epitch = S_0287A0_EPITCH(iview->image->surface.u.gfx9.surf.epitch);
@@ -3246,6 +3283,8 @@ radv_initialise_ds_surface(struct radv_device *device,
ds->db_z_info |= S_028040_TILE_MODE_INDEX(tile_mode_index);
tile_mode_index = si_tile_mode_index(iview->image, level, true);
ds->db_stencil_info |= S_028044_TILE_MODE_INDEX(tile_mode_index);
if (stencil_only)
ds->db_z_info |= S_028040_TILE_MODE_INDEX(tile_mode_index);
}
ds->db_depth_size = S_028058_PITCH_TILE_MAX((level_info->nblk_x / 8) - 1) |
@@ -3624,9 +3663,14 @@ void radv_GetPhysicalDeviceExternalSemaphorePropertiesKHR(
const VkPhysicalDeviceExternalSemaphoreInfoKHR* pExternalSemaphoreInfo,
VkExternalSemaphorePropertiesKHR* pExternalSemaphoreProperties)
{
pExternalSemaphoreProperties->exportFromImportedHandleTypes = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR;
pExternalSemaphoreProperties->compatibleHandleTypes = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR;
pExternalSemaphoreProperties->externalSemaphoreFeatures = VK_EXTERNAL_SEMAPHORE_FEATURE_EXPORTABLE_BIT_KHR |
VK_EXTERNAL_SEMAPHORE_FEATURE_IMPORTABLE_BIT_KHR;
if (pExternalSemaphoreInfo->handleType == VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR) {
pExternalSemaphoreProperties->exportFromImportedHandleTypes = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR;
pExternalSemaphoreProperties->compatibleHandleTypes = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR;
pExternalSemaphoreProperties->externalSemaphoreFeatures = VK_EXTERNAL_SEMAPHORE_FEATURE_EXPORTABLE_BIT_KHR |
VK_EXTERNAL_SEMAPHORE_FEATURE_IMPORTABLE_BIT_KHR;
} else {
pExternalSemaphoreProperties->exportFromImportedHandleTypes = 0;
pExternalSemaphoreProperties->compatibleHandleTypes = 0;
pExternalSemaphoreProperties->externalSemaphoreFeatures = 0;
}
}

View File

@@ -578,6 +578,10 @@ radv_physical_device_get_format_properties(struct radv_physical_device *physical
VK_FORMAT_FEATURE_BLIT_DST_BIT;
tiled |= VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR |
VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR;
/* GFX9 doesn't support linear depth surfaces */
if (physical_device->rad_info.chip_class >= GFX9)
linear = 0;
}
} else {
bool linear_sampling;
@@ -958,6 +962,12 @@ bool radv_format_pack_clear_color(VkFormat format,
clear_vals[1] = ((uint16_t)util_iround(CLAMP(value->float32[2], 0.0f, 1.0f) * 0xffff)) & 0xffff;
clear_vals[1] |= ((uint16_t)util_iround(CLAMP(value->float32[3], 0.0f, 1.0f) * 0xffff)) << 16;
break;
case VK_FORMAT_R16G16B16A16_SNORM:
clear_vals[0] = ((uint16_t)util_iround(CLAMP(value->float32[0], -1.0f, 1.0f) * 0x7fff)) & 0xffff;
clear_vals[0] |= ((uint16_t)util_iround(CLAMP(value->float32[1], -1.0f, 1.0f) * 0x7fff)) << 16;
clear_vals[1] = ((uint16_t)util_iround(CLAMP(value->float32[2], -1.0f, 1.0f) * 0x7fff)) & 0xffff;
clear_vals[1] |= ((uint16_t)util_iround(CLAMP(value->float32[3], -1.0f, 1.0f) * 0x7fff)) << 16;
break;
case VK_FORMAT_A2B10G10R10_UNORM_PACK32:
clear_vals[0] = ((uint16_t)util_iround(CLAMP(value->float32[0], 0.0f, 1.0f) * 0x3ff)) & 0x3ff;
clear_vals[0] |= (((uint16_t)util_iround(CLAMP(value->float32[1], 0.0f, 1.0f) * 0x3ff)) & 0x3ff) << 10;

View File

@@ -34,7 +34,7 @@
#include "util/debug.h"
#include "util/u_atomic.h"
static unsigned
radv_choose_tiling(struct radv_device *Device,
radv_choose_tiling(struct radv_device *device,
const struct radv_image_create_info *create_info)
{
const VkImageCreateInfo *pCreateInfo = create_info->vk_info;
@@ -44,12 +44,17 @@ radv_choose_tiling(struct radv_device *Device,
return RADEON_SURF_MODE_LINEAR_ALIGNED;
}
/* Textures with a very small height are recommended to be linear. */
if (pCreateInfo->imageType == VK_IMAGE_TYPE_1D ||
/* Only very thin and long 2D textures should benefit from
* linear_aligned. */
(pCreateInfo->extent.width > 8 && pCreateInfo->extent.height <= 2))
return RADEON_SURF_MODE_LINEAR_ALIGNED;
if (!vk_format_is_compressed(pCreateInfo->format) &&
!vk_format_is_depth_or_stencil(pCreateInfo->format)
&& device->physical_device->rad_info.chip_class <= VI) {
/* this causes hangs in some VK CTS tests on GFX9. */
/* Textures with a very small height are recommended to be linear. */
if (pCreateInfo->imageType == VK_IMAGE_TYPE_1D ||
/* Only very thin and long 2D textures should benefit from
* linear_aligned. */
(pCreateInfo->extent.width > 8 && pCreateInfo->extent.height <= 2))
return RADEON_SURF_MODE_LINEAR_ALIGNED;
}
/* MSAA resources must be 2D tiled. */
if (pCreateInfo->samples > 1)
@@ -115,6 +120,7 @@ radv_init_surface(struct radv_device *device,
VK_IMAGE_USAGE_STORAGE_BIT)) ||
(pCreateInfo->flags & VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT) ||
(pCreateInfo->tiling == VK_IMAGE_TILING_LINEAR) ||
pCreateInfo->mipLevels > 1 || pCreateInfo->arrayLayers > 1 ||
device->physical_device->rad_info.chip_class < VI ||
create_info->scanout || (device->debug_flags & RADV_DEBUG_NO_DCC) ||
!radv_is_colorbuffer_format_supported(pCreateInfo->format, &blendable))
@@ -181,6 +187,11 @@ radv_make_buffer_descriptor(struct radv_device *device,
state[0] = va;
state[1] = S_008F04_BASE_ADDRESS_HI(va >> 32) |
S_008F04_STRIDE(stride);
if (device->physical_device->rad_info.chip_class != VI && stride) {
range /= stride;
}
state[2] = range;
state[3] = S_008F0C_DST_SEL_X(radv_map_swizzle(desc->swizzle[0])) |
S_008F0C_DST_SEL_Y(radv_map_swizzle(desc->swizzle[1])) |
@@ -200,7 +211,6 @@ si_set_mutable_tex_desc_fields(struct radv_device *device,
{
uint64_t gpu_address = image->bo ? device->ws->buffer_get_va(image->bo) + image->offset : 0;
uint64_t va = gpu_address;
unsigned pitch = base_level_info->nblk_x * block_width;
enum chip_class chip_class = device->physical_device->rad_info.chip_class;
uint64_t meta_va = 0;
if (chip_class >= GFX9) {
@@ -216,9 +226,6 @@ si_set_mutable_tex_desc_fields(struct radv_device *device,
state[0] |= image->surface.u.legacy.tile_swizzle;
state[1] &= C_008F14_BASE_ADDRESS_HI;
state[1] |= S_008F14_BASE_ADDRESS_HI(va >> 40);
state[3] |= S_008F1C_TILING_INDEX(si_tile_mode_index(image, base_level,
is_stencil));
state[4] |= S_008F20_PITCH_GFX6(pitch - 1);
if (chip_class >= VI) {
state[6] &= C_008F28_COMPRESSION_EN;
@@ -274,10 +281,14 @@ si_set_mutable_tex_desc_fields(struct radv_device *device,
}
static unsigned radv_tex_dim(VkImageType image_type, VkImageViewType view_type,
unsigned nr_layers, unsigned nr_samples, bool is_storage_image)
unsigned nr_layers, unsigned nr_samples, bool is_storage_image, bool gfx9)
{
if (view_type == VK_IMAGE_VIEW_TYPE_CUBE || view_type == VK_IMAGE_VIEW_TYPE_CUBE_ARRAY)
return is_storage_image ? V_008F1C_SQ_RSRC_IMG_2D_ARRAY : V_008F1C_SQ_RSRC_IMG_CUBE;
/* GFX9 allocates 1D textures as 2D. */
if (gfx9 && image_type == VK_IMAGE_TYPE_1D)
image_type = VK_IMAGE_TYPE_2D;
switch (image_type) {
case VK_IMAGE_TYPE_1D:
return nr_layers > 1 ? V_008F1C_SQ_RSRC_IMG_1D_ARRAY : V_008F1C_SQ_RSRC_IMG_1D;
@@ -368,7 +379,7 @@ si_make_texture_descriptor(struct radv_device *device,
}
type = radv_tex_dim(image->type, view_type, image->info.array_size, image->info.samples,
is_storage_image);
is_storage_image, device->physical_device->rad_info.chip_class >= GFX9);
if (type == V_008F1C_SQ_RSRC_IMG_1D_ARRAY) {
height = 1;
depth = image->info.array_size;
@@ -414,7 +425,7 @@ si_make_texture_descriptor(struct radv_device *device,
state[4] |= S_008F20_BC_SWIZZLE(bc_swizzle);
state[5] |= S_008F24_MAX_MIP(image->info.samples > 1 ?
util_logbase2(image->info.samples) :
last_level);
image->info.levels - 1);
} else {
state[3] |= S_008F1C_POW2_PAD(image->info.levels > 1);
state[4] |= S_008F20_DEPTH(depth - 1);
@@ -489,7 +500,7 @@ si_make_texture_descriptor(struct radv_device *device,
S_008F1C_DST_SEL_Y(V_008F1C_SQ_SEL_X) |
S_008F1C_DST_SEL_Z(V_008F1C_SQ_SEL_X) |
S_008F1C_DST_SEL_W(V_008F1C_SQ_SEL_X) |
S_008F1C_TYPE(radv_tex_dim(image->type, view_type, 1, 0, false));
S_008F1C_TYPE(radv_tex_dim(image->type, view_type, 1, 0, false, false));
fmask_state[4] = 0;
fmask_state[5] = S_008F24_BASE_ARRAY(first_layer);
fmask_state[6] = 0;
@@ -554,10 +565,11 @@ radv_query_opaque_metadata(struct radv_device *device,
memcpy(&md->metadata[2], desc, sizeof(desc));
/* Dwords [10:..] contain the mipmap level offsets. */
for (i = 0; i <= image->info.levels - 1; i++)
md->metadata[10+i] = image->surface.u.legacy.level[i].offset >> 8;
md->size_metadata = (11 + image->info.levels - 1) * 4;
if (device->physical_device->rad_info.chip_class <= VI) {
for (i = 0; i <= image->info.levels - 1; i++)
md->metadata[10+i] = image->surface.u.legacy.level[i].offset >> 8;
md->size_metadata = (11 + image->info.levels - 1) * 4;
}
}
void
@@ -826,8 +838,10 @@ radv_image_create(VkDevice _device,
if ((pCreateInfo->usage & VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT) &&
pCreateInfo->mipLevels == 1 &&
!image->surface.dcc_size && image->info.depth == 1 && can_cmask_dcc)
!image->surface.dcc_size && image->info.depth == 1 && can_cmask_dcc &&
!image->surface.is_linear)
radv_image_alloc_cmask(device, image);
if (image->info.samples > 1 && vk_format_is_color(pCreateInfo->format)) {
radv_image_alloc_fmask(device, image);
} else if (vk_format_is_depth(pCreateInfo->format)) {
@@ -856,15 +870,15 @@ radv_image_create(VkDevice _device,
static void
radv_image_view_make_descriptor(struct radv_image_view *iview,
struct radv_device *device,
const VkImageViewCreateInfo* pCreateInfo,
const VkComponentMapping *components,
bool is_storage_image)
{
RADV_FROM_HANDLE(radv_image, image, pCreateInfo->image);
const VkImageSubresourceRange *range = &pCreateInfo->subresourceRange;
struct radv_image *image = iview->image;
bool is_stencil = iview->aspect_mask == VK_IMAGE_ASPECT_STENCIL_BIT;
uint32_t blk_w;
uint32_t *descriptor;
uint32_t *fmask_descriptor;
uint32_t hw_level = 0;
if (is_storage_image) {
descriptor = iview->storage_descriptor;
@@ -877,23 +891,32 @@ radv_image_view_make_descriptor(struct radv_image_view *iview,
assert(image->surface.blk_w % vk_format_get_blockwidth(image->vk_format) == 0);
blk_w = image->surface.blk_w / vk_format_get_blockwidth(image->vk_format) * vk_format_get_blockwidth(iview->vk_format);
if (device->physical_device->rad_info.chip_class >= GFX9)
hw_level = iview->base_mip;
si_make_texture_descriptor(device, image, is_storage_image,
iview->type,
iview->vk_format,
&pCreateInfo->components,
0, radv_get_levelCount(image, range) - 1,
range->baseArrayLayer,
range->baseArrayLayer + radv_get_layerCount(image, range) - 1,
components,
hw_level, hw_level + iview->level_count - 1,
iview->base_layer,
iview->base_layer + iview->layer_count - 1,
iview->extent.width,
iview->extent.height,
iview->extent.depth,
descriptor,
fmask_descriptor);
const struct legacy_surf_level *base_level_info = NULL;
if (device->physical_device->rad_info.chip_class <= GFX9) {
if (is_stencil)
base_level_info = &image->surface.u.legacy.stencil_level[iview->base_mip];
else
base_level_info = &image->surface.u.legacy.level[iview->base_mip];
}
si_set_mutable_tex_desc_fields(device, image,
is_stencil ? &image->surface.u.legacy.stencil_level[range->baseMipLevel]
: &image->surface.u.legacy.level[range->baseMipLevel],
range->baseMipLevel,
range->baseMipLevel,
base_level_info,
iview->base_mip,
iview->base_mip,
blk_w, is_stencil, descriptor);
}
@@ -929,23 +952,34 @@ radv_image_view_init(struct radv_image_view *iview,
iview->vk_format = vk_format_depth_only(iview->vk_format);
}
iview->extent = (VkExtent3D) {
.width = radv_minify(image->info.width , range->baseMipLevel),
.height = radv_minify(image->info.height, range->baseMipLevel),
.depth = radv_minify(image->info.depth , range->baseMipLevel),
};
if (device->physical_device->rad_info.chip_class >= GFX9) {
iview->extent = (VkExtent3D) {
.width = image->info.width,
.height = image->info.height,
.depth = image->info.depth,
};
} else {
iview->extent = (VkExtent3D) {
.width = radv_minify(image->info.width , range->baseMipLevel),
.height = radv_minify(image->info.height, range->baseMipLevel),
.depth = radv_minify(image->info.depth , range->baseMipLevel),
};
}
iview->extent.width = round_up_u32(iview->extent.width * vk_format_get_blockwidth(iview->vk_format),
vk_format_get_blockwidth(image->vk_format));
iview->extent.height = round_up_u32(iview->extent.height * vk_format_get_blockheight(iview->vk_format),
vk_format_get_blockheight(image->vk_format));
if (iview->vk_format != image->vk_format) {
iview->extent.width = round_up_u32(iview->extent.width * vk_format_get_blockwidth(iview->vk_format),
vk_format_get_blockwidth(image->vk_format));
iview->extent.height = round_up_u32(iview->extent.height * vk_format_get_blockheight(iview->vk_format),
vk_format_get_blockheight(image->vk_format));
}
iview->base_layer = range->baseArrayLayer;
iview->layer_count = radv_get_layerCount(image, range);
iview->base_mip = range->baseMipLevel;
iview->level_count = radv_get_levelCount(image, range);
radv_image_view_make_descriptor(iview, device, pCreateInfo, false);
radv_image_view_make_descriptor(iview, device, pCreateInfo, true);
radv_image_view_make_descriptor(iview, device, &pCreateInfo->components, false);
radv_image_view_make_descriptor(iview, device, &pCreateInfo->components, true);
}
bool radv_layout_has_htile(const struct radv_image *image,
@@ -1020,23 +1054,34 @@ radv_DestroyImage(VkDevice _device, VkImage _image,
}
void radv_GetImageSubresourceLayout(
VkDevice device,
VkDevice _device,
VkImage _image,
const VkImageSubresource* pSubresource,
VkSubresourceLayout* pLayout)
{
RADV_FROM_HANDLE(radv_image, image, _image);
RADV_FROM_HANDLE(radv_device, device, _device);
int level = pSubresource->mipLevel;
int layer = pSubresource->arrayLayer;
struct radeon_surf *surface = &image->surface;
pLayout->offset = surface->u.legacy.level[level].offset + surface->u.legacy.level[level].slice_size * layer;
pLayout->rowPitch = surface->u.legacy.level[level].nblk_x * surface->bpe;
pLayout->arrayPitch = surface->u.legacy.level[level].slice_size;
pLayout->depthPitch = surface->u.legacy.level[level].slice_size;
pLayout->size = surface->u.legacy.level[level].slice_size;
if (image->type == VK_IMAGE_TYPE_3D)
pLayout->size *= u_minify(image->info.depth, level);
if (device->physical_device->rad_info.chip_class >= GFX9) {
pLayout->offset = surface->u.gfx9.offset[level] + surface->u.gfx9.surf_slice_size * layer;
pLayout->rowPitch = surface->u.gfx9.surf_pitch * surface->bpe;
pLayout->arrayPitch = surface->u.gfx9.surf_slice_size;
pLayout->depthPitch = surface->u.gfx9.surf_slice_size;
pLayout->size = surface->u.gfx9.surf_slice_size;
if (image->type == VK_IMAGE_TYPE_3D)
pLayout->size *= u_minify(image->info.depth, level);
} else {
pLayout->offset = surface->u.legacy.level[level].offset + surface->u.legacy.level[level].slice_size * layer;
pLayout->rowPitch = surface->u.legacy.level[level].nblk_x * surface->bpe;
pLayout->arrayPitch = surface->u.legacy.level[level].slice_size;
pLayout->depthPitch = surface->u.legacy.level[level].slice_size;
pLayout->size = surface->u.legacy.level[level].slice_size;
if (image->type == VK_IMAGE_TYPE_3D)
pLayout->size *= u_minify(image->info.depth, level);
}
}

View File

@@ -477,48 +477,8 @@ radv_meta_build_nir_fs_noop(void)
return b.shader;
}
static nir_ssa_def *radv_meta_build_resolve_srgb_conversion(nir_builder *b,
nir_ssa_def *input)
{
nir_const_value v;
unsigned i;
v.u32[0] = 0x3b4d2e1c; // 0.00313080009
nir_ssa_def *cmp[3];
for (i = 0; i < 3; i++)
cmp[i] = nir_flt(b, nir_channel(b, input, i),
nir_build_imm(b, 1, 32, v));
nir_ssa_def *ltvals[3];
v.f32[0] = 12.92;
for (i = 0; i < 3; i++)
ltvals[i] = nir_fmul(b, nir_channel(b, input, i),
nir_build_imm(b, 1, 32, v));
nir_ssa_def *gtvals[3];
for (i = 0; i < 3; i++) {
v.f32[0] = 1.0/2.4;
gtvals[i] = nir_fpow(b, nir_channel(b, input, i),
nir_build_imm(b, 1, 32, v));
v.f32[0] = 1.055;
gtvals[i] = nir_fmul(b, gtvals[i],
nir_build_imm(b, 1, 32, v));
v.f32[0] = 0.055;
gtvals[i] = nir_fsub(b, gtvals[i],
nir_build_imm(b, 1, 32, v));
}
nir_ssa_def *comp[4];
for (i = 0; i < 3; i++)
comp[i] = nir_bcsel(b, cmp[i], ltvals[i], gtvals[i]);
comp[3] = nir_channels(b, input, 3);
return nir_vec(b, comp, 4);
}
void radv_meta_build_resolve_shader_core(nir_builder *b,
bool is_integer,
bool is_srgb,
int samples,
nir_variable *input_img,
nir_variable *color,
@@ -596,10 +556,4 @@ void radv_meta_build_resolve_shader_core(nir_builder *b,
if (outer_if)
b->cursor = nir_after_cf_node(&outer_if->cf_node);
if (is_srgb) {
nir_ssa_def *newv = nir_load_var(b, color);
newv = radv_meta_build_resolve_srgb_conversion(b, newv);
nir_store_var(b, color, newv, 0xf);
}
}

View File

@@ -234,7 +234,6 @@ nir_shader *radv_meta_build_nir_fs_noop(void);
void radv_meta_build_resolve_shader_core(nir_builder *b,
bool is_integer,
bool is_srgb,
int samples,
nir_variable *input_img,
nir_variable *color,

View File

@@ -275,15 +275,20 @@ meta_emit_blit(struct radv_cmd_buffer *cmd_buffer,
VkFilter blit_filter)
{
struct radv_device *device = cmd_buffer->device;
uint32_t src_width = radv_minify(src_iview->image->info.width, src_iview->base_mip);
uint32_t src_height = radv_minify(src_iview->image->info.height, src_iview->base_mip);
uint32_t src_depth = radv_minify(src_iview->image->info.depth, src_iview->base_mip);
uint32_t dst_width = radv_minify(dest_iview->image->info.width, dest_iview->base_mip);
uint32_t dst_height = radv_minify(dest_iview->image->info.height, dest_iview->base_mip);
assert(src_image->info.samples == dest_image->info.samples);
float vertex_push_constants[5] = {
(float)src_offset_0.x / (float)src_iview->extent.width,
(float)src_offset_0.y / (float)src_iview->extent.height,
(float)src_offset_1.x / (float)src_iview->extent.width,
(float)src_offset_1.y / (float)src_iview->extent.height,
(float)src_offset_0.z / (float)src_iview->extent.depth,
(float)src_offset_0.x / (float)src_width,
(float)src_offset_0.y / (float)src_height,
(float)src_offset_1.x / (float)src_width,
(float)src_offset_1.y / (float)src_height,
(float)src_offset_0.z / (float)src_depth,
};
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
@@ -310,8 +315,8 @@ meta_emit_blit(struct radv_cmd_buffer *cmd_buffer,
.pAttachments = (VkImageView[]) {
radv_image_view_to_handle(dest_iview),
},
.width = dest_iview->extent.width,
.height = dest_iview->extent.height,
.width = dst_width,
.height = dst_height,
.layers = 1,
}, &cmd_buffer->pool->alloc, &fb);
VkPipeline pipeline;
@@ -695,6 +700,8 @@ static VkFormat pipeline_formats[] = {
VK_FORMAT_R8G8B8A8_UNORM,
VK_FORMAT_R8G8B8A8_UINT,
VK_FORMAT_R8G8B8A8_SINT,
VK_FORMAT_A2R10G10B10_UINT_PACK32,
VK_FORMAT_A2R10G10B10_SINT_PACK32,
VK_FORMAT_R16G16B16A16_UNORM,
VK_FORMAT_R16G16B16A16_SNORM,
VK_FORMAT_R16G16B16A16_UINT,

View File

@@ -53,7 +53,8 @@ enum blit2d_src_type {
static void
create_iview(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_blit2d_surf *surf,
struct radv_image_view *iview, VkFormat depth_format)
struct radv_image_view *iview, VkFormat depth_format,
VkImageAspectFlagBits aspects)
{
VkFormat format;
@@ -69,7 +70,7 @@ create_iview(struct radv_cmd_buffer *cmd_buffer,
.viewType = VK_IMAGE_VIEW_TYPE_2D,
.format = format,
.subresourceRange = {
.aspectMask = surf->aspect_mask,
.aspectMask = aspects,
.baseMipLevel = surf->level,
.levelCount = 1,
.baseArrayLayer = surf->layer,
@@ -111,7 +112,8 @@ blit2d_bind_src(struct radv_cmd_buffer *cmd_buffer,
struct radv_meta_blit2d_surf *src_img,
struct radv_meta_blit2d_buffer *src_buf,
struct blit2d_src_temps *tmp,
enum blit2d_src_type src_type, VkFormat depth_format)
enum blit2d_src_type src_type, VkFormat depth_format,
VkImageAspectFlagBits aspects)
{
struct radv_device *device = cmd_buffer->device;
@@ -138,7 +140,7 @@ blit2d_bind_src(struct radv_cmd_buffer *cmd_buffer,
VK_SHADER_STAGE_FRAGMENT_BIT, 16, 4,
&src_buf->pitch);
} else {
create_iview(cmd_buffer, src_img, &tmp->iview, depth_format);
create_iview(cmd_buffer, src_img, &tmp->iview, depth_format, aspects);
radv_meta_push_descriptor_set(cmd_buffer, VK_PIPELINE_BIND_POINT_GRAPHICS,
device->meta_state.blit2d.p_layouts[src_type],
@@ -175,9 +177,10 @@ blit2d_bind_dst(struct radv_cmd_buffer *cmd_buffer,
uint32_t width,
uint32_t height,
VkFormat depth_format,
struct blit2d_dst_temps *tmp)
struct blit2d_dst_temps *tmp,
VkImageAspectFlagBits aspects)
{
create_iview(cmd_buffer, dst, &tmp->iview, depth_format);
create_iview(cmd_buffer, dst, &tmp->iview, depth_format, aspects);
radv_CreateFramebuffer(radv_device_to_handle(cmd_buffer->device),
&(VkFramebufferCreateInfo) {
@@ -250,106 +253,111 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
struct radv_device *device = cmd_buffer->device;
for (unsigned r = 0; r < num_rects; ++r) {
VkFormat depth_format = 0;
if (dst->aspect_mask == VK_IMAGE_ASPECT_STENCIL_BIT)
depth_format = vk_format_stencil_only(dst->image->vk_format);
else if (dst->aspect_mask == VK_IMAGE_ASPECT_DEPTH_BIT)
depth_format = vk_format_depth_only(dst->image->vk_format);
struct blit2d_src_temps src_temps;
blit2d_bind_src(cmd_buffer, src_img, src_buf, &src_temps, src_type, depth_format);
unsigned i;
for_each_bit(i, dst->aspect_mask) {
unsigned aspect_mask = 1u << i;
VkFormat depth_format = 0;
if (aspect_mask == VK_IMAGE_ASPECT_STENCIL_BIT)
depth_format = vk_format_stencil_only(dst->image->vk_format);
else if (aspect_mask == VK_IMAGE_ASPECT_DEPTH_BIT)
depth_format = vk_format_depth_only(dst->image->vk_format);
struct blit2d_src_temps src_temps;
blit2d_bind_src(cmd_buffer, src_img, src_buf, &src_temps, src_type, depth_format, aspect_mask);
struct blit2d_dst_temps dst_temps;
blit2d_bind_dst(cmd_buffer, dst, rects[r].dst_x + rects[r].width,
rects[r].dst_y + rects[r].height, depth_format, &dst_temps);
struct blit2d_dst_temps dst_temps;
blit2d_bind_dst(cmd_buffer, dst, rects[r].dst_x + rects[r].width,
rects[r].dst_y + rects[r].height, depth_format, &dst_temps, aspect_mask);
float vertex_push_constants[4] = {
rects[r].src_x,
rects[r].src_y,
rects[r].src_x + rects[r].width,
rects[r].src_y + rects[r].height,
};
float vertex_push_constants[4] = {
rects[r].src_x,
rects[r].src_y,
rects[r].src_x + rects[r].width,
rects[r].src_y + rects[r].height,
};
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.blit2d.p_layouts[src_type],
VK_SHADER_STAGE_VERTEX_BIT, 0, 16,
vertex_push_constants);
radv_CmdPushConstants(radv_cmd_buffer_to_handle(cmd_buffer),
device->meta_state.blit2d.p_layouts[src_type],
VK_SHADER_STAGE_VERTEX_BIT, 0, 16,
vertex_push_constants);
if (dst->aspect_mask == VK_IMAGE_ASPECT_COLOR_BIT) {
unsigned fs_key = radv_format_meta_fs_key(dst_temps.iview.vk_format);
if (aspect_mask == VK_IMAGE_ASPECT_COLOR_BIT) {
unsigned fs_key = radv_format_meta_fs_key(dst_temps.iview.vk_format);
radv_CmdBeginRenderPass(radv_cmd_buffer_to_handle(cmd_buffer),
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.blit2d.render_passes[fs_key],
.framebuffer = dst_temps.fb,
.renderArea = {
.offset = { rects[r].dst_x, rects[r].dst_y, },
.extent = { rects[r].width, rects[r].height },
},
.clearValueCount = 0,
.pClearValues = NULL,
}, VK_SUBPASS_CONTENTS_INLINE);
radv_CmdBeginRenderPass(radv_cmd_buffer_to_handle(cmd_buffer),
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.blit2d.render_passes[fs_key],
.framebuffer = dst_temps.fb,
.renderArea = {
.offset = { rects[r].dst_x, rects[r].dst_y, },
.extent = { rects[r].width, rects[r].height },
},
.clearValueCount = 0,
.pClearValues = NULL,
}, VK_SUBPASS_CONTENTS_INLINE);
bind_pipeline(cmd_buffer, src_type, fs_key);
} else if (dst->aspect_mask == VK_IMAGE_ASPECT_DEPTH_BIT) {
radv_CmdBeginRenderPass(radv_cmd_buffer_to_handle(cmd_buffer),
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.blit2d.depth_only_rp,
.framebuffer = dst_temps.fb,
.renderArea = {
.offset = { rects[r].dst_x, rects[r].dst_y, },
.extent = { rects[r].width, rects[r].height },
},
.clearValueCount = 0,
.pClearValues = NULL,
}, VK_SUBPASS_CONTENTS_INLINE);
bind_pipeline(cmd_buffer, src_type, fs_key);
} else if (aspect_mask == VK_IMAGE_ASPECT_DEPTH_BIT) {
radv_CmdBeginRenderPass(radv_cmd_buffer_to_handle(cmd_buffer),
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.blit2d.depth_only_rp,
.framebuffer = dst_temps.fb,
.renderArea = {
.offset = { rects[r].dst_x, rects[r].dst_y, },
.extent = { rects[r].width, rects[r].height },
},
.clearValueCount = 0,
.pClearValues = NULL,
}, VK_SUBPASS_CONTENTS_INLINE);
bind_depth_pipeline(cmd_buffer, src_type);
bind_depth_pipeline(cmd_buffer, src_type);
} else if (dst->aspect_mask == VK_IMAGE_ASPECT_STENCIL_BIT) {
radv_CmdBeginRenderPass(radv_cmd_buffer_to_handle(cmd_buffer),
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.blit2d.stencil_only_rp,
.framebuffer = dst_temps.fb,
.renderArea = {
.offset = { rects[r].dst_x, rects[r].dst_y, },
.extent = { rects[r].width, rects[r].height },
},
.clearValueCount = 0,
.pClearValues = NULL,
}, VK_SUBPASS_CONTENTS_INLINE);
} else if (aspect_mask == VK_IMAGE_ASPECT_STENCIL_BIT) {
radv_CmdBeginRenderPass(radv_cmd_buffer_to_handle(cmd_buffer),
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = device->meta_state.blit2d.stencil_only_rp,
.framebuffer = dst_temps.fb,
.renderArea = {
.offset = { rects[r].dst_x, rects[r].dst_y, },
.extent = { rects[r].width, rects[r].height },
},
.clearValueCount = 0,
.pClearValues = NULL,
}, VK_SUBPASS_CONTENTS_INLINE);
bind_stencil_pipeline(cmd_buffer, src_type);
bind_stencil_pipeline(cmd_buffer, src_type);
} else
unreachable("Processing blit2d with multiple aspects.");
radv_CmdSetViewport(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, &(VkViewport) {
.x = rects[r].dst_x,
.y = rects[r].dst_y,
.width = rects[r].width,
.height = rects[r].height,
.minDepth = 0.0f,
.maxDepth = 1.0f
});
radv_CmdSetScissor(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, &(VkRect2D) {
.offset = (VkOffset2D) { rects[r].dst_x, rects[r].dst_y },
.extent = (VkExtent2D) { rects[r].width, rects[r].height },
});
radv_CmdDraw(radv_cmd_buffer_to_handle(cmd_buffer), 3, 1, 0, 0);
radv_CmdEndRenderPass(radv_cmd_buffer_to_handle(cmd_buffer));
/* At the point where we emit the draw call, all data from the
* descriptor sets, etc. has been used. We are free to delete it.
*/
blit2d_unbind_dst(cmd_buffer, &dst_temps);
}
radv_CmdSetViewport(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, &(VkViewport) {
.x = rects[r].dst_x,
.y = rects[r].dst_y,
.width = rects[r].width,
.height = rects[r].height,
.minDepth = 0.0f,
.maxDepth = 1.0f
});
radv_CmdSetScissor(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, &(VkRect2D) {
.offset = (VkOffset2D) { rects[r].dst_x, rects[r].dst_y },
.extent = (VkExtent2D) { rects[r].width, rects[r].height },
});
radv_CmdDraw(radv_cmd_buffer_to_handle(cmd_buffer), 3, 1, 0, 0);
radv_CmdEndRenderPass(radv_cmd_buffer_to_handle(cmd_buffer));
/* At the point where we emit the draw call, all data from the
* descriptor sets, etc. has been used. We are free to delete it.
*/
blit2d_unbind_dst(cmd_buffer, &dst_temps);
}
}
@@ -1134,6 +1142,8 @@ static VkFormat pipeline_formats[] = {
VK_FORMAT_R8G8B8A8_UNORM,
VK_FORMAT_R8G8B8A8_UINT,
VK_FORMAT_R8G8B8A8_SINT,
VK_FORMAT_A2R10G10B10_UINT_PACK32,
VK_FORMAT_A2R10G10B10_SINT_PACK32,
VK_FORMAT_R16G16B16A16_UNORM,
VK_FORMAT_R16G16B16A16_SNORM,
VK_FORMAT_R16G16B16A16_UINT,

View File

@@ -754,6 +754,8 @@ static VkFormat pipeline_formats[] = {
VK_FORMAT_R8G8B8A8_UNORM,
VK_FORMAT_R8G8B8A8_UINT,
VK_FORMAT_R8G8B8A8_SINT,
VK_FORMAT_A2R10G10B10_UINT_PACK32,
VK_FORMAT_A2R10G10B10_SINT_PACK32,
VK_FORMAT_R16G16B16A16_UNORM,
VK_FORMAT_R16G16B16A16_SNORM,
VK_FORMAT_R16G16B16A16_UINT,
@@ -977,7 +979,7 @@ emit_fast_color_clear(struct radv_cmd_buffer *cmd_buffer,
if (iview->image->info.levels > 1)
goto fail;
if (iview->image->surface.u.legacy.level[0].mode < RADEON_SURF_MODE_1D)
if (iview->image->surface.is_linear)
goto fail;
if (!radv_image_extent_compare(iview->image, &iview->extent))
goto fail;
@@ -1174,6 +1176,9 @@ radv_clear_image_layer(struct radv_cmd_buffer *cmd_buffer,
{
VkDevice device_h = radv_device_to_handle(cmd_buffer->device);
struct radv_image_view iview;
uint32_t width = radv_minify(image->info.width, range->baseMipLevel + level);
uint32_t height = radv_minify(image->info.height, range->baseMipLevel + level);
radv_image_view_init(&iview, cmd_buffer->device,
&(VkImageViewCreateInfo) {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
@@ -1197,9 +1202,9 @@ radv_clear_image_layer(struct radv_cmd_buffer *cmd_buffer,
.pAttachments = (VkImageView[]) {
radv_image_view_to_handle(&iview),
},
.width = iview.extent.width,
.height = iview.extent.height,
.layers = 1
.width = width,
.height = height,
.layers = 1
},
&cmd_buffer->pool->alloc,
&fb);
@@ -1255,8 +1260,8 @@ radv_clear_image_layer(struct radv_cmd_buffer *cmd_buffer,
.renderArea = {
.offset = { 0, 0, },
.extent = {
.width = iview.extent.width,
.height = iview.extent.height,
.width = width,
.height = height,
},
},
.renderPass = pass,
@@ -1275,7 +1280,7 @@ radv_clear_image_layer(struct radv_cmd_buffer *cmd_buffer,
VkClearRect clear_rect = {
.rect = {
.offset = { 0, 0 },
.extent = { iview.extent.width, iview.extent.height },
.extent = { width, height },
},
.baseArrayLayer = range->baseArrayLayer,
.layerCount = 1, /* FINISHME: clear multi-layer framebuffer */

View File

@@ -29,7 +29,9 @@
#include "sid.h"
static VkResult
create_pass(struct radv_device *device)
create_pass(struct radv_device *device,
uint32_t samples,
VkRenderPass *pass)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
@@ -37,7 +39,7 @@ create_pass(struct radv_device *device)
VkAttachmentDescription attachment;
attachment.format = VK_FORMAT_D32_SFLOAT_S8_UINT;
attachment.samples = 1;
attachment.samples = samples;
attachment.loadOp = VK_ATTACHMENT_LOAD_OP_LOAD;
attachment.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
attachment.initialLayout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL;
@@ -65,14 +67,18 @@ create_pass(struct radv_device *device)
.dependencyCount = 0,
},
alloc,
&device->meta_state.depth_decomp.pass);
pass);
return result;
}
static VkResult
create_pipeline(struct radv_device *device,
VkShaderModule vs_module_h)
VkShaderModule vs_module_h,
uint32_t samples,
VkRenderPass pass,
VkPipeline *decompress_pipeline,
VkPipeline *resummarize_pipeline)
{
VkResult result;
VkDevice device_h = radv_device_to_handle(device);
@@ -129,7 +135,7 @@ create_pipeline(struct radv_device *device,
},
.pMultisampleState = &(VkPipelineMultisampleStateCreateInfo) {
.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO,
.rasterizationSamples = 1,
.rasterizationSamples = samples,
.sampleShadingEnable = false,
.pSampleMask = NULL,
.alphaToCoverageEnable = false,
@@ -156,7 +162,7 @@ create_pipeline(struct radv_device *device,
VK_DYNAMIC_STATE_SCISSOR,
},
},
.renderPass = device->meta_state.depth_decomp.pass,
.renderPass = pass,
.subpass = 0,
};
@@ -169,7 +175,7 @@ create_pipeline(struct radv_device *device,
.db_flush_stencil_inplace = true,
},
&device->meta_state.alloc,
&device->meta_state.depth_decomp.decompress_pipeline);
decompress_pipeline);
if (result != VK_SUCCESS)
goto cleanup;
@@ -183,7 +189,7 @@ create_pipeline(struct radv_device *device,
.db_resummarize = true,
},
&device->meta_state.alloc,
&device->meta_state.depth_decomp.resummarize_pipeline);
resummarize_pipeline);
if (result != VK_SUCCESS)
goto cleanup;
@@ -199,29 +205,31 @@ radv_device_finish_meta_depth_decomp_state(struct radv_device *device)
{
struct radv_meta_state *state = &device->meta_state;
VkDevice device_h = radv_device_to_handle(device);
VkRenderPass pass_h = device->meta_state.depth_decomp.pass;
const VkAllocationCallbacks *alloc = &device->meta_state.alloc;
if (pass_h)
radv_DestroyRenderPass(device_h, pass_h,
&device->meta_state.alloc);
VkPipeline pipeline_h = state->depth_decomp.decompress_pipeline;
if (pipeline_h) {
radv_DestroyPipeline(device_h, pipeline_h, alloc);
}
pipeline_h = state->depth_decomp.resummarize_pipeline;
if (pipeline_h) {
radv_DestroyPipeline(device_h, pipeline_h, alloc);
for (uint32_t i = 0; i < ARRAY_SIZE(state->depth_decomp); ++i) {
VkRenderPass pass_h = state->depth_decomp[i].pass;
if (pass_h) {
radv_DestroyRenderPass(device_h, pass_h, alloc);
}
VkPipeline pipeline_h = state->depth_decomp[i].decompress_pipeline;
if (pipeline_h) {
radv_DestroyPipeline(device_h, pipeline_h, alloc);
}
pipeline_h = state->depth_decomp[i].resummarize_pipeline;
if (pipeline_h) {
radv_DestroyPipeline(device_h, pipeline_h, alloc);
}
}
}
VkResult
radv_device_init_meta_depth_decomp_state(struct radv_device *device)
{
struct radv_meta_state *state = &device->meta_state;
VkResult res = VK_SUCCESS;
zero(device->meta_state.depth_decomp);
zero(state->depth_decomp);
struct radv_shader_module vs_module = { .nir = radv_meta_build_nir_vs_generate_vertices() };
if (!vs_module.nir) {
@@ -230,14 +238,22 @@ radv_device_init_meta_depth_decomp_state(struct radv_device *device)
goto fail;
}
res = create_pass(device);
if (res != VK_SUCCESS)
goto fail;
VkShaderModule vs_module_h = radv_shader_module_to_handle(&vs_module);
res = create_pipeline(device, vs_module_h);
if (res != VK_SUCCESS)
goto fail;
for (uint32_t i = 0; i < ARRAY_SIZE(state->depth_decomp); ++i) {
uint32_t samples = 1 << i;
res = create_pass(device, samples, &state->depth_decomp[i].pass);
if (res != VK_SUCCESS)
goto fail;
res = create_pipeline(device, vs_module_h, samples,
state->depth_decomp[i].pass,
&state->depth_decomp[i].decompress_pipeline,
&state->depth_decomp[i].resummarize_pipeline);
if (res != VK_SUCCESS)
goto fail;
}
goto cleanup;
@@ -283,10 +299,15 @@ emit_depth_decomp(struct radv_cmd_buffer *cmd_buffer,
}
enum radv_depth_op {
DEPTH_DECOMPRESS,
DEPTH_RESUMMARIZE,
};
static void radv_process_depth_image_inplace(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image,
VkImageSubresourceRange *subresourceRange,
VkPipeline pipeline_h)
enum radv_depth_op op)
{
struct radv_meta_saved_state saved_state;
struct radv_meta_saved_pass_state saved_pass_state;
@@ -296,6 +317,9 @@ static void radv_process_depth_image_inplace(struct radv_cmd_buffer *cmd_buffer,
subresourceRange->baseMipLevel);
uint32_t height = radv_minify(image->info.height,
subresourceRange->baseMipLevel);
uint32_t samples = image->info.samples;
uint32_t samples_log2 = ffs(samples) - 1;
struct radv_meta_state *meta_state = &cmd_buffer->device->meta_state;
if (!image->surface.htile_size)
return;
@@ -339,7 +363,7 @@ static void radv_process_depth_image_inplace(struct radv_cmd_buffer *cmd_buffer,
radv_CmdBeginRenderPass(cmd_buffer_h,
&(VkRenderPassBeginInfo) {
.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
.renderPass = cmd_buffer->device->meta_state.depth_decomp.pass,
.renderPass = meta_state->depth_decomp[samples_log2].pass,
.framebuffer = fb_h,
.renderArea = {
.offset = {
@@ -356,6 +380,18 @@ static void radv_process_depth_image_inplace(struct radv_cmd_buffer *cmd_buffer,
},
VK_SUBPASS_CONTENTS_INLINE);
VkPipeline pipeline_h;
switch (op) {
case DEPTH_DECOMPRESS:
pipeline_h = meta_state->depth_decomp[samples_log2].decompress_pipeline;
break;
case DEPTH_RESUMMARIZE:
pipeline_h = meta_state->depth_decomp[samples_log2].resummarize_pipeline;
break;
default:
unreachable("unknown operation");
}
emit_depth_decomp(cmd_buffer, &(VkOffset2D){0, 0 }, &(VkExtent2D){width, height}, pipeline_h);
radv_CmdEndRenderPass(cmd_buffer_h);
@@ -371,8 +407,7 @@ void radv_decompress_depth_image_inplace(struct radv_cmd_buffer *cmd_buffer,
VkImageSubresourceRange *subresourceRange)
{
assert(cmd_buffer->queue_family_index == RADV_QUEUE_GENERAL);
radv_process_depth_image_inplace(cmd_buffer, image, subresourceRange,
cmd_buffer->device->meta_state.depth_decomp.decompress_pipeline);
radv_process_depth_image_inplace(cmd_buffer, image, subresourceRange, DEPTH_DECOMPRESS);
}
void radv_resummarize_depth_image_inplace(struct radv_cmd_buffer *cmd_buffer,
@@ -380,6 +415,5 @@ void radv_resummarize_depth_image_inplace(struct radv_cmd_buffer *cmd_buffer,
VkImageSubresourceRange *subresourceRange)
{
assert(cmd_buffer->queue_family_index == RADV_QUEUE_GENERAL);
radv_process_depth_image_inplace(cmd_buffer, image, subresourceRange,
cmd_buffer->device->meta_state.depth_decomp.resummarize_pipeline);
radv_process_depth_image_inplace(cmd_buffer, image, subresourceRange, DEPTH_RESUMMARIZE);
}

View File

@@ -382,6 +382,11 @@ void radv_CmdResolveImage(
radv_meta_save_graphics_reset_vport_scissor_novertex(&saved_state, cmd_buffer);
assert(src_image->info.samples > 1);
if (src_image->info.samples <= 1) {
/* this causes GPU hangs if we get past here */
fprintf(stderr, "radv: Illegal resolve operation (src not multisampled), will hang GPU.");
return;
}
assert(dest_image->info.samples == 1);
if (src_image->info.samples >= 16) {
@@ -607,13 +612,6 @@ radv_cmd_buffer_resolve_subpass(struct radv_cmd_buffer *cmd_buffer)
radv_cmd_buffer_set_subpass(cmd_buffer, &resolve_subpass, false);
/* Subpass resolves must respect the render area. We can ignore the
* render area here because vkCmdBeginRenderPass set the render area
* with 3DSTATE_DRAWING_RECTANGLE.
*
* XXX(chadv): Does the hardware really respect
* 3DSTATE_DRAWING_RECTANGLE when draing a 3DPRIM_RECTLIST?
*/
emit_resolve(cmd_buffer,
&(VkOffset2D) { 0, 0 },
&(VkExtent2D) { fb->width, fb->height });

View File

@@ -31,6 +31,45 @@
#include "sid.h"
#include "vk_format.h"
static nir_ssa_def *radv_meta_build_resolve_srgb_conversion(nir_builder *b,
nir_ssa_def *input)
{
nir_const_value v;
unsigned i;
v.u32[0] = 0x3b4d2e1c; // 0.00313080009
nir_ssa_def *cmp[3];
for (i = 0; i < 3; i++)
cmp[i] = nir_flt(b, nir_channel(b, input, i),
nir_build_imm(b, 1, 32, v));
nir_ssa_def *ltvals[3];
v.f32[0] = 12.92;
for (i = 0; i < 3; i++)
ltvals[i] = nir_fmul(b, nir_channel(b, input, i),
nir_build_imm(b, 1, 32, v));
nir_ssa_def *gtvals[3];
for (i = 0; i < 3; i++) {
v.f32[0] = 1.0/2.4;
gtvals[i] = nir_fpow(b, nir_channel(b, input, i),
nir_build_imm(b, 1, 32, v));
v.f32[0] = 1.055;
gtvals[i] = nir_fmul(b, gtvals[i],
nir_build_imm(b, 1, 32, v));
v.f32[0] = 0.055;
gtvals[i] = nir_fsub(b, gtvals[i],
nir_build_imm(b, 1, 32, v));
}
nir_ssa_def *comp[4];
for (i = 0; i < 3; i++)
comp[i] = nir_bcsel(b, cmp[i], ltvals[i], gtvals[i]);
comp[3] = nir_channels(b, input, 1 << 3);
return nir_vec(b, comp, 4);
}
static nir_shader *
build_resolve_compute_shader(struct radv_device *dev, bool is_integer, bool is_srgb, int samples)
{
@@ -88,10 +127,13 @@ build_resolve_compute_shader(struct radv_device *dev, bool is_integer, bool is_s
nir_ssa_def *img_coord = nir_channels(&b, nir_iadd(&b, global_id, &src_offset->dest.ssa), 0x3);
nir_variable *color = nir_local_variable_create(b.impl, glsl_vec4_type(), "color");
radv_meta_build_resolve_shader_core(&b, is_integer, is_srgb, samples,
input_img, color, img_coord);
radv_meta_build_resolve_shader_core(&b, is_integer, samples, input_img,
color, img_coord);
nir_ssa_def *outval = nir_load_var(&b, color);
if (is_srgb)
outval = radv_meta_build_resolve_srgb_conversion(&b, outval);
nir_ssa_def *coord = nir_iadd(&b, global_id, &dst_offset->dest.ssa);
nir_intrinsic_instr *store = nir_intrinsic_instr_create(b.shader, nir_intrinsic_image_store);
store->src[0] = nir_src_for_ssa(coord);
@@ -402,7 +444,7 @@ void radv_meta_resolve_compute_image(struct radv_cmd_buffer *cmd_buffer,
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = radv_image_to_handle(dest_image),
.viewType = radv_meta_get_view_type(dest_image),
.format = dest_image->vk_format,
.format = vk_to_non_srgb_format(dest_image->vk_format),
.subresourceRange = {
.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
.baseMipLevel = region->dstSubresource.mipLevel,
@@ -479,21 +521,6 @@ radv_cmd_buffer_resolve_subpass_cs(struct radv_cmd_buffer *cmd_buffer)
if (dest_att.attachment == VK_ATTACHMENT_UNUSED)
continue;
struct radv_subpass resolve_subpass = {
.color_count = 1,
.color_attachments = (VkAttachmentReference[]) { dest_att },
.depth_stencil_attachment = { .attachment = VK_ATTACHMENT_UNUSED },
};
radv_cmd_buffer_set_subpass(cmd_buffer, &resolve_subpass, false);
/* Subpass resolves must respect the render area. We can ignore the
* render area here because vkCmdBeginRenderPass set the render area
* with 3DSTATE_DRAWING_RECTANGLE.
*
* XXX(chadv): Does the hardware really respect
* 3DSTATE_DRAWING_RECTANGLE when draing a 3DPRIM_RECTLIST?
*/
emit_resolve(cmd_buffer,
src_iview,
dst_iview,

View File

@@ -51,7 +51,7 @@ build_nir_vertex_shader(void)
}
static nir_shader *
build_resolve_fragment_shader(struct radv_device *dev, bool is_integer, bool is_srgb, int samples)
build_resolve_fragment_shader(struct radv_device *dev, bool is_integer, int samples)
{
nir_builder b;
char name[64];
@@ -62,7 +62,7 @@ build_resolve_fragment_shader(struct radv_device *dev, bool is_integer, bool is_
false,
GLSL_TYPE_FLOAT);
snprintf(name, 64, "meta_resolve_fs-%d-%s", samples, is_integer ? "int" : (is_srgb ? "srgb" : "float"));
snprintf(name, 64, "meta_resolve_fs-%d-%s", samples, is_integer ? "int" : "float");
nir_builder_init_simple_shader(&b, NULL, MESA_SHADER_FRAGMENT, NULL);
b.shader->info.name = ralloc_strdup(b.shader, name);
@@ -92,8 +92,8 @@ build_resolve_fragment_shader(struct radv_device *dev, bool is_integer, bool is_
nir_ssa_def *img_coord = nir_channels(&b, nir_iadd(&b, pos_int, &src_offset->dest.ssa), 0x3);
nir_variable *color = nir_local_variable_create(b.impl, glsl_vec4_type(), "color");
radv_meta_build_resolve_shader_core(&b, is_integer, is_srgb,samples,
input_img, color, img_coord);
radv_meta_build_resolve_shader_core(&b, is_integer, samples, input_img,
color, img_coord);
nir_ssa_def *outval = nir_load_var(&b, color);
nir_store_var(&b, color_out, outval, 0xf);
@@ -160,6 +160,8 @@ static VkFormat pipeline_formats[] = {
VK_FORMAT_R8G8B8A8_UNORM,
VK_FORMAT_R8G8B8A8_UINT,
VK_FORMAT_R8G8B8A8_SINT,
VK_FORMAT_A2R10G10B10_UINT_PACK32,
VK_FORMAT_A2R10G10B10_SINT_PACK32,
VK_FORMAT_R16G16B16A16_UNORM,
VK_FORMAT_R16G16B16A16_SNORM,
VK_FORMAT_R16G16B16A16_UINT,
@@ -175,31 +177,25 @@ create_resolve_pipeline(struct radv_device *device,
VkFormat format)
{
VkResult result;
bool is_integer = false, is_srgb = false;
bool is_integer = false;
uint32_t samples = 1 << samples_log2;
unsigned fs_key = radv_format_meta_fs_key(format);
const VkPipelineVertexInputStateCreateInfo *vi_create_info;
vi_create_info = &normal_vi_create_info;
if (vk_format_is_int(format))
is_integer = true;
else if (vk_format_is_srgb(format))
is_srgb = true;
struct radv_shader_module fs = { .nir = NULL };
fs.nir = build_resolve_fragment_shader(device, is_integer, is_srgb, samples);
fs.nir = build_resolve_fragment_shader(device, is_integer, samples);
struct radv_shader_module vs = {
.nir = build_nir_vertex_shader(),
};
VkRenderPass *rp = is_srgb ?
&device->meta_state.resolve_fragment.rc[samples_log2].srgb_render_pass :
&device->meta_state.resolve_fragment.rc[samples_log2].render_pass[fs_key];
VkRenderPass *rp = &device->meta_state.resolve_fragment.rc[samples_log2].render_pass[fs_key];
assert(!*rp);
VkPipeline *pipeline = is_srgb ?
&device->meta_state.resolve_fragment.rc[samples_log2].srgb_pipeline :
&device->meta_state.resolve_fragment.rc[samples_log2].pipeline[fs_key];
VkPipeline *pipeline = &device->meta_state.resolve_fragment.rc[samples_log2].pipeline[fs_key];
assert(!*pipeline);
VkPipelineShaderStageCreateInfo pipeline_shader_stages[] = {
@@ -348,8 +344,6 @@ radv_device_init_meta_resolve_fragment_state(struct radv_device *device)
for (unsigned j = 0; j < ARRAY_SIZE(pipeline_formats); ++j) {
res = create_resolve_pipeline(device, i, pipeline_formats[j]);
}
res = create_resolve_pipeline(device, i, VK_FORMAT_R8G8B8A8_SRGB);
}
return res;
@@ -368,12 +362,6 @@ radv_device_finish_meta_resolve_fragment_state(struct radv_device *device)
state->resolve_fragment.rc[i].pipeline[j],
&state->alloc);
}
radv_DestroyRenderPass(radv_device_to_handle(device),
state->resolve_fragment.rc[i].srgb_render_pass,
&state->alloc);
radv_DestroyPipeline(radv_device_to_handle(device),
state->resolve_fragment.rc[i].srgb_pipeline,
&state->alloc);
}
radv_DestroyDescriptorSetLayout(radv_device_to_handle(device),
@@ -430,9 +418,7 @@ emit_resolve(struct radv_cmd_buffer *cmd_buffer,
push_constants);
unsigned fs_key = radv_format_meta_fs_key(dest_iview->vk_format);
VkPipeline pipeline_h = vk_format_is_srgb(dest_iview->vk_format) ?
device->meta_state.resolve_fragment.rc[samples_log2].srgb_pipeline :
device->meta_state.resolve_fragment.rc[samples_log2].pipeline[fs_key];
VkPipeline pipeline_h = device->meta_state.resolve_fragment.rc[samples_log2].pipeline[fs_key];
radv_CmdBindPipeline(cmd_buffer_h, VK_PIPELINE_BIND_POINT_GRAPHICS,
pipeline_h);
@@ -483,9 +469,7 @@ void radv_meta_resolve_fragment_image(struct radv_cmd_buffer *cmd_buffer,
radv_fast_clear_flush_image_inplace(cmd_buffer, src_image, &range);
}
rp = vk_format_is_srgb(dest_image->vk_format) ?
device->meta_state.resolve_fragment.rc[samples_log2].srgb_render_pass :
device->meta_state.resolve_fragment.rc[samples_log2].render_pass[fs_key];
rp = device->meta_state.resolve_fragment.rc[samples_log2].render_pass[fs_key];
radv_meta_save_graphics_reset_vport_scissor_novertex(&saved_state, cmd_buffer);
for (uint32_t r = 0; r < region_count; ++r) {
@@ -649,13 +633,6 @@ radv_cmd_buffer_resolve_subpass_fs(struct radv_cmd_buffer *cmd_buffer)
radv_cmd_buffer_set_subpass(cmd_buffer, &resolve_subpass, false);
/* Subpass resolves must respect the render area. We can ignore the
* render area here because vkCmdBeginRenderPass set the render area
* with 3DSTATE_DRAWING_RECTANGLE.
*
* XXX(chadv): Does the hardware really respect
* 3DSTATE_DRAWING_RECTANGLE when draing a 3DPRIM_RECTLIST?
*/
emit_resolve(cmd_buffer,
src_iview,
dest_iview,

View File

@@ -65,6 +65,7 @@ static const struct nir_shader_compiler_options nir_options = {
.lower_unpack_unorm_4x8 = true,
.lower_extract_byte = true,
.lower_extract_word = true,
.lower_ffma = true,
.max_unroll_iterations = 32
};
@@ -161,6 +162,7 @@ radv_optimize_nir(struct nir_shader *shader)
if (nir_opt_trivial_continues(shader)) {
progress = true;
NIR_PASS(progress, shader, nir_copy_prop);
NIR_PASS(progress, shader, nir_opt_remove_phis);
NIR_PASS(progress, shader, nir_opt_dce);
}
NIR_PASS(progress, shader, nir_opt_if);
@@ -273,8 +275,32 @@ radv_shader_compile_to_nir(struct radv_device *device,
nir_shader_gather_info(nir, entry_point->impl);
/* While it would be nice not to have this flag, we are constrained
* by the reality that LLVM 5.0 doesn't have working VGPR indexing
* on GFX9.
*/
bool llvm_has_working_vgpr_indexing =
device->physical_device->rad_info.chip_class <= VI;
/* TODO: Indirect indexing of GS inputs is unimplemented.
*
* TCS and TES load inputs directly from LDS or offchip memory, so
* indirect indexing is trivial.
*/
nir_variable_mode indirect_mask = 0;
indirect_mask |= nir_var_shader_in;
if (!llvm_has_working_vgpr_indexing &&
nir->info.stage != MESA_SHADER_TESS_CTRL)
indirect_mask |= nir_var_shader_out;
/* TODO: We shouldn't need to do this, however LLVM isn't currently
* smart enough to handle indirects without causing excess spilling
* causing the gpu to hang.
*
* See the following thread for more details of the problem:
* https://lists.freedesktop.org/archives/mesa-dev/2017-July/162106.html
*/
indirect_mask |= nir_var_local;
nir_lower_indirect_derefs(nir, indirect_mask);
@@ -852,6 +878,79 @@ static uint32_t si_translate_blend_factor(VkBlendFactor factor)
}
}
static uint32_t si_translate_blend_opt_function(VkBlendOp op)
{
switch (op) {
case VK_BLEND_OP_ADD:
return V_028760_OPT_COMB_ADD;
case VK_BLEND_OP_SUBTRACT:
return V_028760_OPT_COMB_SUBTRACT;
case VK_BLEND_OP_REVERSE_SUBTRACT:
return V_028760_OPT_COMB_REVSUBTRACT;
case VK_BLEND_OP_MIN:
return V_028760_OPT_COMB_MIN;
case VK_BLEND_OP_MAX:
return V_028760_OPT_COMB_MAX;
default:
return V_028760_OPT_COMB_BLEND_DISABLED;
}
}
static uint32_t si_translate_blend_opt_factor(VkBlendFactor factor, bool is_alpha)
{
switch (factor) {
case VK_BLEND_FACTOR_ZERO:
return V_028760_BLEND_OPT_PRESERVE_NONE_IGNORE_ALL;
case VK_BLEND_FACTOR_ONE:
return V_028760_BLEND_OPT_PRESERVE_ALL_IGNORE_NONE;
case VK_BLEND_FACTOR_SRC_COLOR:
return is_alpha ? V_028760_BLEND_OPT_PRESERVE_A1_IGNORE_A0
: V_028760_BLEND_OPT_PRESERVE_C1_IGNORE_C0;
case VK_BLEND_FACTOR_ONE_MINUS_SRC_COLOR:
return is_alpha ? V_028760_BLEND_OPT_PRESERVE_A0_IGNORE_A1
: V_028760_BLEND_OPT_PRESERVE_C0_IGNORE_C1;
case VK_BLEND_FACTOR_SRC_ALPHA:
return V_028760_BLEND_OPT_PRESERVE_A1_IGNORE_A0;
case VK_BLEND_FACTOR_ONE_MINUS_SRC_ALPHA:
return V_028760_BLEND_OPT_PRESERVE_A0_IGNORE_A1;
case VK_BLEND_FACTOR_SRC_ALPHA_SATURATE:
return is_alpha ? V_028760_BLEND_OPT_PRESERVE_ALL_IGNORE_NONE
: V_028760_BLEND_OPT_PRESERVE_NONE_IGNORE_A0;
default:
return V_028760_BLEND_OPT_PRESERVE_NONE_IGNORE_NONE;
}
}
/**
* Get rid of DST in the blend factors by commuting the operands:
* func(src * DST, dst * 0) ---> func(src * 0, dst * SRC)
*/
static void si_blend_remove_dst(unsigned *func, unsigned *src_factor,
unsigned *dst_factor, unsigned expected_dst,
unsigned replacement_src)
{
if (*src_factor == expected_dst &&
*dst_factor == VK_BLEND_FACTOR_ZERO) {
*src_factor = VK_BLEND_FACTOR_ZERO;
*dst_factor = replacement_src;
/* Commuting the operands requires reversing subtractions. */
if (*func == VK_BLEND_OP_SUBTRACT)
*func = VK_BLEND_OP_REVERSE_SUBTRACT;
else if (*func == VK_BLEND_OP_REVERSE_SUBTRACT)
*func = VK_BLEND_OP_SUBTRACT;
}
}
static bool si_blend_factor_uses_dst(unsigned factor)
{
return factor == VK_BLEND_FACTOR_DST_COLOR ||
factor == VK_BLEND_FACTOR_DST_ALPHA ||
factor == VK_BLEND_FACTOR_SRC_ALPHA_SATURATE ||
factor == VK_BLEND_FACTOR_ONE_MINUS_DST_ALPHA ||
factor == VK_BLEND_FACTOR_ONE_MINUS_DST_COLOR;
}
static bool is_dual_src(VkBlendFactor factor)
{
switch (factor) {
@@ -1067,20 +1166,37 @@ format_is_int8(VkFormat format)
desc->channel[channel].size == 8;
}
static bool
format_is_int10(VkFormat format)
{
const struct vk_format_description *desc = vk_format_description(format);
if (desc->nr_channels != 4)
return false;
for (unsigned i = 0; i < 4; i++) {
if (desc->channel[i].pure_integer && desc->channel[i].size == 10)
return true;
}
return false;
}
unsigned radv_format_meta_fs_key(VkFormat format)
{
unsigned col_format = si_choose_spi_color_format(format, false, false) - 1;
bool is_int8 = format_is_int8(format);
bool is_int10 = format_is_int10(format);
return col_format + (is_int8 ? 3 : 0);
return col_format + (is_int8 ? 3 : is_int10 ? 5 : 0);
}
static unsigned
radv_pipeline_compute_is_int8(const VkGraphicsPipelineCreateInfo *pCreateInfo)
static void
radv_pipeline_compute_get_int_clamp(const VkGraphicsPipelineCreateInfo *pCreateInfo,
unsigned *is_int8, unsigned *is_int10)
{
RADV_FROM_HANDLE(radv_render_pass, pass, pCreateInfo->renderPass);
struct radv_subpass *subpass = pass->subpasses + pCreateInfo->subpass;
unsigned is_int8 = 0;
*is_int8 = 0;
*is_int10 = 0;
for (unsigned i = 0; i < subpass->color_count; ++i) {
struct radv_render_pass_attachment *attachment;
@@ -1091,10 +1207,10 @@ radv_pipeline_compute_is_int8(const VkGraphicsPipelineCreateInfo *pCreateInfo)
attachment = pass->attachments + subpass->color_attachments[i].attachment;
if (format_is_int8(attachment->format))
is_int8 |= 1 << i;
*is_int8 |= 1 << i;
if (format_is_int10(attachment->format))
*is_int10 |= 1 << i;
}
return is_int8;
}
static void
@@ -1132,6 +1248,7 @@ radv_pipeline_init_blend_state(struct radv_pipeline *pipeline,
for (i = 0; i < vkblend->attachmentCount; i++) {
const VkPipelineColorBlendAttachmentState *att = &vkblend->pAttachments[i];
unsigned blend_cntl = 0;
unsigned srcRGB_opt, dstRGB_opt, srcA_opt, dstA_opt;
VkBlendOp eqRGB = att->colorBlendOp;
VkBlendFactor srcRGB = att->srcColorBlendFactor;
VkBlendFactor dstRGB = att->dstColorBlendFactor;
@@ -1139,7 +1256,7 @@ radv_pipeline_init_blend_state(struct radv_pipeline *pipeline,
VkBlendFactor srcA = att->srcAlphaBlendFactor;
VkBlendFactor dstA = att->dstAlphaBlendFactor;
blend->sx_mrt0_blend_opt[i] = S_028760_COLOR_COMB_FCN(V_028760_OPT_COMB_BLEND_DISABLED) | S_028760_ALPHA_COMB_FCN(V_028760_OPT_COMB_BLEND_DISABLED);
blend->sx_mrt_blend_opt[i] = S_028760_COLOR_COMB_FCN(V_028760_OPT_COMB_BLEND_DISABLED) | S_028760_ALPHA_COMB_FCN(V_028760_OPT_COMB_BLEND_DISABLED);
if (!att->colorWriteMask)
continue;
@@ -1163,6 +1280,50 @@ radv_pipeline_init_blend_state(struct radv_pipeline *pipeline,
dstA = VK_BLEND_FACTOR_ONE;
}
/* Blending optimizations for RB+.
* These transformations don't change the behavior.
*
* First, get rid of DST in the blend factors:
* func(src * DST, dst * 0) ---> func(src * 0, dst * SRC)
*/
si_blend_remove_dst(&eqRGB, &srcRGB, &dstRGB,
VK_BLEND_FACTOR_DST_COLOR,
VK_BLEND_FACTOR_SRC_COLOR);
si_blend_remove_dst(&eqA, &srcA, &dstA,
VK_BLEND_FACTOR_DST_COLOR,
VK_BLEND_FACTOR_SRC_COLOR);
si_blend_remove_dst(&eqA, &srcA, &dstA,
VK_BLEND_FACTOR_DST_ALPHA,
VK_BLEND_FACTOR_SRC_ALPHA);
/* Look up the ideal settings from tables. */
srcRGB_opt = si_translate_blend_opt_factor(srcRGB, false);
dstRGB_opt = si_translate_blend_opt_factor(dstRGB, false);
srcA_opt = si_translate_blend_opt_factor(srcA, true);
dstA_opt = si_translate_blend_opt_factor(dstA, true);
/* Handle interdependencies. */
if (si_blend_factor_uses_dst(srcRGB))
dstRGB_opt = V_028760_BLEND_OPT_PRESERVE_NONE_IGNORE_NONE;
if (si_blend_factor_uses_dst(srcA))
dstA_opt = V_028760_BLEND_OPT_PRESERVE_NONE_IGNORE_NONE;
if (srcRGB == VK_BLEND_FACTOR_SRC_ALPHA_SATURATE &&
(dstRGB == VK_BLEND_FACTOR_ZERO ||
dstRGB == VK_BLEND_FACTOR_SRC_ALPHA ||
dstRGB == VK_BLEND_FACTOR_SRC_ALPHA_SATURATE))
dstRGB_opt = V_028760_BLEND_OPT_PRESERVE_NONE_IGNORE_A0;
/* Set the final value. */
blend->sx_mrt_blend_opt[i] =
S_028760_COLOR_SRC_OPT(srcRGB_opt) |
S_028760_COLOR_DST_OPT(dstRGB_opt) |
S_028760_COLOR_COMB_FCN(si_translate_blend_opt_function(eqRGB)) |
S_028760_ALPHA_SRC_OPT(srcA_opt) |
S_028760_ALPHA_DST_OPT(dstA_opt) |
S_028760_ALPHA_COMB_FCN(si_translate_blend_opt_function(eqA));
blend_cntl |= S_028780_ENABLE(1);
blend_cntl |= S_028780_COLOR_COMB_FCN(si_translate_blend_function(eqRGB));
@@ -1186,8 +1347,14 @@ radv_pipeline_init_blend_state(struct radv_pipeline *pipeline,
dstRGB == VK_BLEND_FACTOR_ONE_MINUS_SRC_ALPHA)
blend_need_alpha |= 1 << i;
}
for (i = vkblend->attachmentCount; i < 8; i++)
for (i = vkblend->attachmentCount; i < 8; i++) {
blend->cb_blend_control[i] = 0;
blend->sx_mrt_blend_opt[i] = S_028760_COLOR_COMB_FCN(V_028760_OPT_COMB_BLEND_DISABLED) | S_028760_ALPHA_COMB_FCN(V_028760_OPT_COMB_BLEND_DISABLED);
}
/* disable RB+ for now */
if (pipeline->device->physical_device->has_rbplus)
blend->cb_color_control |= S_028808_DISABLE_DUAL_QUAD(1);
if (blend->cb_target_mask)
blend->cb_color_control |= S_028808_MODE(mode);
@@ -2053,9 +2220,11 @@ radv_pipeline_init(struct radv_pipeline *pipeline,
}
if (modules[MESA_SHADER_FRAGMENT]) {
union ac_shader_variant_key key;
union ac_shader_variant_key key = {0};
key.fs.col_format = pipeline->graphics.blend.spi_shader_col_format;
key.fs.is_int8 = radv_pipeline_compute_is_int8(pCreateInfo);
if (pipeline->device->physical_device->rad_info.chip_class < VI)
radv_pipeline_compute_get_int_clamp(pCreateInfo, &key.fs.is_int8, &key.fs.is_int10);
const VkPipelineShaderStageCreateInfo *stage = pStages[MESA_SHADER_FRAGMENT];
@@ -2180,6 +2349,9 @@ radv_pipeline_init(struct radv_pipeline *pipeline,
S_02880C_EXEC_ON_HIER_FAIL(ps->info.fs.writes_memory) |
S_02880C_EXEC_ON_NOOP(ps->info.fs.writes_memory);
if (pipeline->device->physical_device->has_rbplus)
pipeline->graphics.db_shader_control |= S_02880C_DUAL_QUAD_DISABLE(1);
pipeline->graphics.shader_z_format =
ps->info.fs.writes_sample_mask ? V_028710_SPI_SHADER_32_ABGR :
ps->info.fs.writes_stencil ? V_028710_SPI_SHADER_32_GR :

View File

@@ -118,6 +118,9 @@ radv_pipeline_cache_search_unlocked(struct radv_pipeline_cache *cache,
const uint32_t mask = cache->table_size - 1;
const uint32_t start = (*(uint32_t *) sha1);
if (cache->table_size == 0)
return NULL;
for (uint32_t i = 0; i < cache->table_size; i++) {
const uint32_t index = (start + i) & mask;
struct cache_entry *entry = cache->hash_table[index];

View File

@@ -84,7 +84,7 @@ typedef uint32_t xcb_window_t;
#define MAX_PUSH_DESCRIPTORS 32
#define MAX_DYNAMIC_BUFFERS 16
#define MAX_SAMPLES_LOG2 4
#define NUM_META_FS_KEYS 11
#define NUM_META_FS_KEYS 13
#define RADV_MAX_DRM_DEVICES 8
#define NUM_DEPTH_CLEAR_PIPELINES 3
@@ -276,6 +276,9 @@ struct radv_physical_device {
bool has_rbplus; /* if RB+ register exist */
bool rbplus_allowed; /* if RB+ is allowed */
VkPhysicalDeviceMemoryProperties memory_properties;
enum radv_mem_type mem_type_indices[RADV_MEM_TYPE_COUNT];
};
struct radv_instance {
@@ -433,8 +436,6 @@ struct radv_meta_state {
VkPipelineLayout p_layout;
struct {
VkRenderPass srgb_render_pass;
VkPipeline srgb_pipeline;
VkRenderPass render_pass[NUM_META_FS_KEYS];
VkPipeline pipeline[NUM_META_FS_KEYS];
} rc[MAX_SAMPLES_LOG2];
@@ -444,7 +445,7 @@ struct radv_meta_state {
VkPipeline decompress_pipeline;
VkPipeline resummarize_pipeline;
VkRenderPass pass;
} depth_decomp;
} depth_decomp[1 + MAX_SAMPLES_LOG2];
struct {
VkPipeline cmask_eliminate_pipeline;
@@ -1002,7 +1003,7 @@ struct radv_depth_stencil_state {
struct radv_blend_state {
uint32_t cb_color_control;
uint32_t cb_target_mask;
uint32_t sx_mrt0_blend_opt[8];
uint32_t sx_mrt_blend_opt[8];
uint32_t cb_blend_control[8];
uint32_t spi_shader_col_format;
@@ -1215,14 +1216,14 @@ struct radv_image {
/* Set when bound */
struct radeon_winsys_bo *bo;
VkDeviceSize offset;
uint32_t dcc_offset;
uint32_t htile_offset;
uint64_t dcc_offset;
uint64_t htile_offset;
struct radeon_surf surface;
struct radv_fmask_info fmask;
struct radv_cmask_info cmask;
uint32_t clear_value_offset;
uint32_t dcc_pred_offset;
uint64_t clear_value_offset;
uint64_t dcc_pred_offset;
};
/* Whether the image has a htile that is known consistent with the contents of
@@ -1280,6 +1281,7 @@ struct radv_image_view {
uint32_t base_layer;
uint32_t layer_count;
uint32_t base_mip;
uint32_t level_count;
VkExtent3D extent; /**< Extent of VkImageViewCreateInfo::baseMipLevel. */
uint32_t descriptor[8];

View File

@@ -653,7 +653,7 @@ static void radv_query_shader(struct radv_cmd_buffer *cmd_buffer,
struct radv_device *device = cmd_buffer->device;
struct radv_meta_saved_compute_state saved_state;
radv_meta_save_compute(&saved_state, cmd_buffer, 4);
radv_meta_save_compute(&saved_state, cmd_buffer, 16);
struct radv_buffer dst_buffer = {
.bo = dst_bo,
@@ -737,7 +737,7 @@ static void radv_query_shader(struct radv_cmd_buffer *cmd_buffer,
RADV_CMD_FLAG_INV_VMEM_L1 |
RADV_CMD_FLAG_CS_PARTIAL_FLUSH;
radv_meta_restore_compute(&saved_state, cmd_buffer, 4);
radv_meta_restore_compute(&saved_state, cmd_buffer, 16);
}
VkResult radv_CreateQueryPool(

View File

@@ -51,7 +51,8 @@ enum radeon_bo_flag { /* bitfield */
RADEON_FLAG_GTT_WC = (1 << 0),
RADEON_FLAG_CPU_ACCESS = (1 << 1),
RADEON_FLAG_NO_CPU_ACCESS = (1 << 2),
RADEON_FLAG_VIRTUAL = (1 << 3)
RADEON_FLAG_VIRTUAL = (1 << 3),
RADEON_FLAG_VA_UNCACHED = (1 << 4),
};
enum radeon_bo_usage { /* bitfield */

View File

@@ -154,6 +154,7 @@ radv_wsi_image_create(VkDevice device_h,
VkImage image_h;
struct radv_image *image;
int fd;
RADV_FROM_HANDLE(radv_device, device, device_h);
result = radv_image_create(device_h,
&(struct radv_image_create_info) {
@@ -192,12 +193,26 @@ radv_wsi_image_create(VkDevice device_h,
.image = image_h
};
/* Find the first VRAM memory type, or GART for PRIME images. */
int memory_type_index = -1;
for (int i = 0; i < device->physical_device->memory_properties.memoryTypeCount; ++i) {
bool is_local = !!(device->physical_device->memory_properties.memoryTypes[i].propertyFlags & VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);
if ((linear && !is_local) || (!linear && is_local)) {
memory_type_index = i;
break;
}
}
/* fallback */
if (memory_type_index == -1)
memory_type_index = 0;
result = radv_AllocateMemory(device_h,
&(VkMemoryAllocateInfo) {
.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
.pNext = &ded_alloc,
.allocationSize = image->size,
.memoryTypeIndex = linear ? 1 : 0,
.memoryTypeIndex = memory_type_index,
},
NULL /* XXX: pAllocator */,
&memory_h);
@@ -211,7 +226,6 @@ radv_wsi_image_create(VkDevice device_h,
* or the fd for the linear image if a copy is required.
*/
if (!needs_linear_copy || (needs_linear_copy && linear)) {
RADV_FROM_HANDLE(radv_device, device, device_h);
RADV_FROM_HANDLE(radv_device_memory, memory, memory_h);
if (!radv_get_memory_fd(device, memory, &fd))
goto fail_alloc_memory;
@@ -224,7 +238,11 @@ radv_wsi_image_create(VkDevice device_h,
*memory_p = memory_h;
*size = image->size;
*offset = image->offset;
*row_pitch = surface->u.legacy.level[0].nblk_x * surface->bpe;
if (device->physical_device->rad_info.chip_class >= GFX9)
*row_pitch = surface->u.gfx9.surf_pitch * surface->bpe;
else
*row_pitch = surface->u.legacy.level[0].nblk_x * surface->bpe;
return VK_SUCCESS;
fail_alloc_memory:
radv_FreeMemory(device_h, memory_h, pAllocator);

View File

@@ -1133,15 +1133,20 @@ si_emit_cache_flush(struct radv_cmd_buffer *cmd_buffer)
void
si_emit_set_predication_state(struct radv_cmd_buffer *cmd_buffer, uint64_t va)
{
uint32_t val = 0;
uint32_t op = 0;
if (va)
val = (((va >> 32) & 0xff) |
PRED_OP(PREDICATION_OP_BOOL64)|
PREDICATION_DRAW_VISIBLE);
radeon_emit(cmd_buffer->cs, PKT3(PKT3_SET_PREDICATION, 1, 0));
radeon_emit(cmd_buffer->cs, va);
radeon_emit(cmd_buffer->cs, val);
op = PRED_OP(PREDICATION_OP_BOOL64) | PREDICATION_DRAW_VISIBLE;
if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) {
radeon_emit(cmd_buffer->cs, PKT3(PKT3_SET_PREDICATION, 2, 0));
radeon_emit(cmd_buffer->cs, op);
radeon_emit(cmd_buffer->cs, va);
radeon_emit(cmd_buffer->cs, va >> 32);
} else {
radeon_emit(cmd_buffer->cs, PKT3(PKT3_SET_PREDICATION, 1, 0));
radeon_emit(cmd_buffer->cs, va);
radeon_emit(cmd_buffer->cs, op | ((va >> 32) & 0xFF));
}
}
/* Set this if you want the 3D engine to wait until CP DMA is done.

View File

@@ -465,4 +465,27 @@ vk_format_get_component_bits(VkFormat format,
}
}
static inline VkFormat
vk_to_non_srgb_format(VkFormat format)
{
switch(format) {
case VK_FORMAT_R8_SRGB :
return VK_FORMAT_R8_UNORM;
case VK_FORMAT_R8G8_SRGB:
return VK_FORMAT_R8G8_UNORM;
case VK_FORMAT_R8G8B8_SRGB:
return VK_FORMAT_R8G8B8_UNORM;
case VK_FORMAT_B8G8R8_SRGB:
return VK_FORMAT_B8G8R8_UNORM;
case VK_FORMAT_R8G8B8A8_SRGB :
return VK_FORMAT_R8G8B8A8_UNORM;
case VK_FORMAT_B8G8R8A8_SRGB:
return VK_FORMAT_B8G8R8A8_UNORM;
case VK_FORMAT_A8B8G8R8_SRGB_PACK32:
return VK_FORMAT_A8B8G8R8_UNORM_PACK32;
default:
return format;
}
}
#endif /* VK_FORMAT_H */

View File

@@ -39,6 +39,23 @@
static void radv_amdgpu_winsys_bo_destroy(struct radeon_winsys_bo *_bo);
static int
radv_amdgpu_bo_va_op(amdgpu_device_handle dev,
amdgpu_bo_handle bo,
uint64_t offset,
uint64_t size,
uint64_t addr,
uint64_t flags,
uint32_t ops)
{
size = ALIGN(size, getpagesize());
flags |= (AMDGPU_VM_PAGE_READABLE |
AMDGPU_VM_PAGE_WRITEABLE |
AMDGPU_VM_PAGE_EXECUTABLE);
return amdgpu_bo_va_op_raw(dev, bo, offset, size, addr,
flags, ops);
}
static void
radv_amdgpu_winsys_virtual_map(struct radv_amdgpu_winsys_bo *bo,
const struct radv_amdgpu_map_range *range)
@@ -49,8 +66,8 @@ radv_amdgpu_winsys_virtual_map(struct radv_amdgpu_winsys_bo *bo,
return; /* TODO: PRT mapping */
p_atomic_inc(&range->bo->ref_count);
int r = amdgpu_bo_va_op(range->bo->bo, range->bo_offset, range->size,
range->offset + bo->va, 0, AMDGPU_VA_OP_MAP);
int r = radv_amdgpu_bo_va_op(bo->ws->dev, range->bo->bo, range->bo_offset, range->size,
range->offset + bo->va, 0, AMDGPU_VA_OP_MAP);
if (r)
abort();
}
@@ -64,8 +81,8 @@ radv_amdgpu_winsys_virtual_unmap(struct radv_amdgpu_winsys_bo *bo,
if (!range->bo)
return; /* TODO: PRT mapping */
int r = amdgpu_bo_va_op(range->bo->bo, range->bo_offset, range->size,
range->offset + bo->va, 0, AMDGPU_VA_OP_UNMAP);
int r = radv_amdgpu_bo_va_op(bo->ws->dev, range->bo->bo, range->bo_offset, range->size,
range->offset + bo->va, 0, AMDGPU_VA_OP_UNMAP);
if (r)
abort();
radv_amdgpu_winsys_bo_destroy((struct radeon_winsys_bo *)range->bo);
@@ -149,6 +166,7 @@ radv_amdgpu_winsys_bo_virtual_bind(struct radeon_winsys_bo *_parent,
if (parent->ranges[first].bo == bo && (!bo || offset - bo_offset == parent->ranges[first].offset - parent->ranges[first].bo_offset)) {
size += offset - parent->ranges[first].offset;
offset = parent->ranges[first].offset;
bo_offset = parent->ranges[first].bo_offset;
remove_first = true;
}
@@ -234,7 +252,7 @@ static void radv_amdgpu_winsys_bo_destroy(struct radeon_winsys_bo *_bo)
bo->ws->num_buffers--;
pthread_mutex_unlock(&bo->ws->global_bo_list_lock);
}
amdgpu_bo_va_op(bo->bo, 0, bo->size, bo->va, 0, AMDGPU_VA_OP_UNMAP);
radv_amdgpu_bo_va_op(bo->ws->dev, bo->bo, 0, bo->size, bo->va, 0, AMDGPU_VA_OP_UNMAP);
amdgpu_bo_free(bo->bo);
}
amdgpu_va_range_free(bo->va_handle);
@@ -322,7 +340,11 @@ radv_amdgpu_winsys_bo_create(struct radeon_winsys *_ws,
goto error_bo_alloc;
}
r = amdgpu_bo_va_op(buf_handle, 0, size, va, 0, AMDGPU_VA_OP_MAP);
uint32_t va_flags = 0;
if ((flags & RADEON_FLAG_VA_UNCACHED) && ws->info.chip_class >= GFX9)
va_flags |= AMDGPU_VM_MTYPE_UC;
r = radv_amdgpu_bo_va_op(ws->dev, buf_handle, 0, size, va, va_flags, AMDGPU_VA_OP_MAP);
if (r)
goto error_va_map;
@@ -398,7 +420,7 @@ radv_amdgpu_winsys_bo_from_fd(struct radeon_winsys *_ws,
if (r)
goto error_query;
r = amdgpu_bo_va_op(result.buf_handle, 0, result.alloc_size, va, 0, AMDGPU_VA_OP_MAP);
r = radv_amdgpu_bo_va_op(ws->dev, result.buf_handle, 0, result.alloc_size, va, 0, AMDGPU_VA_OP_MAP);
if (r)
goto error_va_map;

View File

@@ -140,7 +140,9 @@ LIBGLSL_FILES = \
glsl/program.h \
glsl/propagate_invariance.cpp \
glsl/s_expression.cpp \
glsl/s_expression.h
glsl/s_expression.h \
glsl/string_to_uint_map.cpp \
glsl/string_to_uint_map.h
LIBGLSL_SHADER_CACHE_FILES = \
glsl/shader_cache.cpp \

View File

@@ -4495,7 +4495,7 @@ process_initializer(ir_variable *var, ast_declaration *decl,
} else {
if (var->type->is_numeric()) {
/* Reduce cascading errors. */
var->constant_value = type->qualifier.flags.q.constant
rhs = var->constant_value = type->qualifier.flags.q.constant
? ir_constant::zero(state, var->type) : NULL;
}
}

View File

@@ -46,6 +46,9 @@ grow_to_fit(struct blob *blob, size_t additional)
size_t to_allocate;
uint8_t *new_data;
if (blob->out_of_memory)
return false;
if (blob->size + additional <= blob->allocated)
return true;
@@ -57,8 +60,10 @@ grow_to_fit(struct blob *blob, size_t additional)
to_allocate = MAX2(to_allocate, blob->allocated + additional);
new_data = realloc(blob->data, to_allocate);
if (new_data == NULL)
if (new_data == NULL) {
blob->out_of_memory = true;
return false;
}
blob->data = new_data;
blob->allocated = to_allocate;
@@ -104,6 +109,7 @@ blob_create()
blob->data = NULL;
blob->allocated = 0;
blob->size = 0;
blob->out_of_memory = false;
return blob;
}
@@ -207,6 +213,9 @@ blob_reader_init(struct blob_reader *blob, uint8_t *data, size_t size)
static bool
ensure_can_read(struct blob_reader *blob, size_t size)
{
if (blob->overrun)
return false;
if (blob->current < blob->end && blob->end - blob->current >= size)
return true;

View File

@@ -55,6 +55,12 @@ struct blob {
/** The number of bytes that have actual data written to them. */
size_t size;
/**
* True if we've ever failed to realloc or if we go pas the end of a fixed
* allocation blob.
*/
bool out_of_memory;
};
/* When done reading, the caller can ensure that everything was consumed by

View File

@@ -1158,8 +1158,6 @@ nir_visitor::visit(ir_call *ir)
case nir_intrinsic_vote_eq: {
nir_ssa_dest_init(&instr->instr, &instr->dest, 1, 32, NULL);
instr->variables[0] = evaluate_deref(&instr->instr, ir->return_deref);
ir_rvalue *value = (ir_rvalue *) ir->actual_parameters.get_head();
instr->src[0] = nir_src_for_ssa(evaluate_rvalue(value));

View File

@@ -725,6 +725,8 @@ ir_swizzle::constant_expression_value(struct hash_table *variable_context)
case GLSL_TYPE_FLOAT: data.f[i] = v->value.f[swiz_idx[i]]; break;
case GLSL_TYPE_BOOL: data.b[i] = v->value.b[swiz_idx[i]]; break;
case GLSL_TYPE_DOUBLE:data.d[i] = v->value.d[swiz_idx[i]]; break;
case GLSL_TYPE_UINT64:data.u64[i] = v->value.u64[swiz_idx[i]]; break;
case GLSL_TYPE_INT64: data.i64[i] = v->value.i64[swiz_idx[i]]; break;
default: assert(!"Should not get here."); break;
}
}

View File

@@ -25,7 +25,7 @@
#include "ir.h"
#include "linker.h"
#include "ir_uniform.h"
#include "util/string_to_uint_map.h"
#include "string_to_uint_map.h"
/* These functions are put in a "private" namespace instead of being marked
* static so that the unit tests can access them. See

View File

@@ -27,7 +27,7 @@
#include "ir_uniform.h"
#include "glsl_symbol_table.h"
#include "program.h"
#include "util/string_to_uint_map.h"
#include "string_to_uint_map.h"
#include "ir_array_refcount.h"
/**

View File

@@ -2072,7 +2072,8 @@ reserved_varying_slot(struct gl_linked_shader *stage,
var_slot = var->data.location - VARYING_SLOT_VAR0;
unsigned num_elements = get_varying_type(var, stage->Stage)
->count_attribute_slots(stage->Stage == MESA_SHADER_VERTEX);
->count_attribute_slots(io_mode == ir_var_shader_in &&
stage->Stage == MESA_SHADER_VERTEX);
for (unsigned i = 0; i < num_elements; i++) {
if (var_slot >= 0 && var_slot < MAX_VARYINGS_INCL_PATCH)
slots |= UINT64_C(1) << var_slot;

View File

@@ -75,7 +75,7 @@
#include "program/program.h"
#include "util/mesa-sha1.h"
#include "util/set.h"
#include "util/string_to_uint_map.h"
#include "string_to_uint_map.h"
#include "linker.h"
#include "link_varyings.h"
#include "ir_optimization.h"
@@ -1128,10 +1128,16 @@ cross_validate_globals(struct gl_shader_program *prog,
if (prog->IsES && (prog->data->Version != 310 ||
!var->get_interface_type()) &&
existing->data.precision != var->data.precision) {
linker_error(prog, "declarations for %s `%s` have "
"mismatching precision qualifiers\n",
mode_string(var), var->name);
return;
if ((existing->data.used && var->data.used) || prog->data->Version >= 300) {
linker_error(prog, "declarations for %s `%s` have "
"mismatching precision qualifiers\n",
mode_string(var), var->name);
return;
} else {
linker_warning(prog, "declarations for %s `%s` have "
"mismatching precision qualifiers\n",
mode_string(var), var->name);
}
}
} else
variables->add_variable(var);
@@ -2661,12 +2667,14 @@ assign_attribute_or_color_locations(void *mem_ctx,
} to_assign[32];
assert(max_index <= 32);
/* Temporary array for the set of attributes that have locations assigned.
/* Temporary array for the set of attributes that have locations assigned,
* for the purpose of checking overlapping slots/components of (non-ES)
* fragment shader outputs.
*/
ir_variable *assigned[16];
ir_variable *assigned[12 * 4]; /* (max # of FS outputs) * # components */
unsigned assigned_attr = 0;
unsigned num_attr = 0;
unsigned assigned_attr = 0;
foreach_in_list(ir_instruction, node, sh->ir) {
ir_variable *const var = node->as_variable();
@@ -2905,6 +2913,18 @@ assign_attribute_or_color_locations(void *mem_ctx,
}
}
if (target_index == MESA_SHADER_FRAGMENT && !prog->IsES) {
/* Only track assigned variables for non-ES fragment shaders
* to avoid overflowing the array.
*
* At most one variable per fragment output component should
* reach this.
*/
assert(assigned_attr < ARRAY_SIZE(assigned));
assigned[assigned_attr] = var;
assigned_attr++;
}
used_locations |= (use_mask << attr);
/* From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes):
@@ -2931,9 +2951,6 @@ assign_attribute_or_color_locations(void *mem_ctx,
double_storage_locations |= (use_mask << attr);
}
assigned[assigned_attr] = var;
assigned_attr++;
continue;
}

View File

@@ -358,13 +358,21 @@ lower_instructions_visitor::ldexp_to_arith(ir_expression *ir)
* into
*
* extracted_biased_exp = rshift(bitcast_f2i(abs(x)), exp_shift);
* resulting_biased_exp = extracted_biased_exp + exp;
* resulting_biased_exp = min(extracted_biased_exp + exp, 255);
*
* if (resulting_biased_exp < 1 || x == 0.0f) {
* return copysign(0.0, x);
* if (extracted_biased_exp >= 255)
* return x; // +/-inf, NaN
*
* sign_mantissa = bitcast_f2u(x) & sign_mantissa_mask;
*
* if (min(resulting_biased_exp, extracted_biased_exp) < 1)
* resulting_biased_exp = 0;
* if (resulting_biased_exp >= 255 ||
* min(resulting_biased_exp, extracted_biased_exp) < 1) {
* sign_mantissa &= sign_mask;
* }
*
* return bitcast_u2f((bitcast_f2u(x) & sign_mantissa_mask) |
* return bitcast_u2f(sign_mantissa |
* lshift(i2u(resulting_biased_exp), exp_shift));
*
* which we can't actually implement as such, since the GLSL IR doesn't
@@ -372,45 +380,58 @@ lower_instructions_visitor::ldexp_to_arith(ir_expression *ir)
* using conditional-select:
*
* extracted_biased_exp = rshift(bitcast_f2i(abs(x)), exp_shift);
* resulting_biased_exp = extracted_biased_exp + exp;
* resulting_biased_exp = min(extracted_biased_exp + exp, 255);
*
* is_not_zero_or_underflow = logic_and(nequal(x, 0.0f),
* gequal(resulting_biased_exp, 1);
* x = csel(is_not_zero_or_underflow, x, copysign(0.0f, x));
* resulting_biased_exp = csel(is_not_zero_or_underflow,
* resulting_biased_exp, 0);
* sign_mantissa = bitcast_f2u(x) & sign_mantissa_mask;
*
* return bitcast_u2f((bitcast_f2u(x) & sign_mantissa_mask) |
* lshift(i2u(resulting_biased_exp), exp_shift));
* flush_to_zero = lequal(min(resulting_biased_exp, extracted_biased_exp), 0);
* resulting_biased_exp = csel(flush_to_zero, 0, resulting_biased_exp)
* zero_mantissa = logic_or(flush_to_zero,
* gequal(resulting_biased_exp, 255));
* sign_mantissa = csel(zero_mantissa, sign_mantissa & sign_mask, sign_mantissa);
*
* result = sign_mantissa |
* lshift(i2u(resulting_biased_exp), exp_shift));
*
* return csel(extracted_biased_exp >= 255, x, bitcast_u2f(result));
*
* The definition of ldexp in the GLSL spec says:
*
* "If this product is too large to be represented in the
* floating-point type, the result is undefined."
*
* However, the definition of ldexp in the GLSL ES spec does not contain
* this sentence, so we do need to handle overflow correctly.
*
* There is additional language limiting the defined range of exp, but this
* is merely to allow implementations that store 2^exp in a temporary
* variable.
*/
const unsigned vec_elem = ir->type->vector_elements;
/* Types */
const glsl_type *ivec = glsl_type::get_instance(GLSL_TYPE_INT, vec_elem, 1);
const glsl_type *uvec = glsl_type::get_instance(GLSL_TYPE_UINT, vec_elem, 1);
const glsl_type *bvec = glsl_type::get_instance(GLSL_TYPE_BOOL, vec_elem, 1);
/* Constants */
ir_constant *zeroi = ir_constant::zero(ir, ivec);
ir_constant *sign_mask = new(ir) ir_constant(0x80000000u, vec_elem);
ir_constant *exp_shift = new(ir) ir_constant(23, vec_elem);
/* Temporary variables */
ir_variable *x = new(ir) ir_variable(ir->type, "x", ir_var_temporary);
ir_variable *exp = new(ir) ir_variable(ivec, "exp", ir_var_temporary);
ir_variable *zero_sign_x = new(ir) ir_variable(ir->type, "zero_sign_x",
ir_var_temporary);
ir_variable *result = new(ir) ir_variable(uvec, "result", ir_var_temporary);
ir_variable *extracted_biased_exp =
new(ir) ir_variable(ivec, "extracted_biased_exp", ir_var_temporary);
ir_variable *resulting_biased_exp =
new(ir) ir_variable(ivec, "resulting_biased_exp", ir_var_temporary);
ir_variable *is_not_zero_or_underflow =
new(ir) ir_variable(bvec, "is_not_zero_or_underflow", ir_var_temporary);
ir_variable *sign_mantissa =
new(ir) ir_variable(uvec, "sign_mantissa", ir_var_temporary);
ir_variable *flush_to_zero =
new(ir) ir_variable(bvec, "flush_to_zero", ir_var_temporary);
ir_variable *zero_mantissa =
new(ir) ir_variable(bvec, "zero_mantissa", ir_var_temporary);
ir_instruction &i = *base_ir;
@@ -423,58 +444,82 @@ lower_instructions_visitor::ldexp_to_arith(ir_expression *ir)
/* Extract the biased exponent from <x>. */
i.insert_before(extracted_biased_exp);
i.insert_before(assign(extracted_biased_exp,
rshift(bitcast_f2i(abs(x)), exp_shift)));
rshift(bitcast_f2i(abs(x)),
new(ir) ir_constant(23, vec_elem))));
/* The definition of ldexp in the GLSL 4.60 spec says:
*
* "If exp is greater than +128 (single-precision) or +1024
* (double-precision), the value returned is undefined. If exp is less
* than -126 (single-precision) or -1022 (double-precision), the value
* returned may be flushed to zero."
*
* So we do not have to guard against the possibility of addition overflow,
* which could happen when exp is close to INT_MAX. Addition underflow
* cannot happen (the worst case is 0 + (-INT_MAX)).
*/
i.insert_before(resulting_biased_exp);
i.insert_before(assign(resulting_biased_exp,
add(extracted_biased_exp, exp)));
min2(add(extracted_biased_exp, exp),
new(ir) ir_constant(255, vec_elem))));
/* Test if result is ±0.0, subnormal, or underflow by checking if the
* resulting biased exponent would be less than 0x1. If so, the result is
* 0.0 with the sign of x. (Actually, invert the conditions so that
* immediate values are the second arguments, which is better for i965)
*/
i.insert_before(zero_sign_x);
i.insert_before(assign(zero_sign_x,
bitcast_u2f(bit_and(bitcast_f2u(x), sign_mask))));
i.insert_before(sign_mantissa);
i.insert_before(assign(sign_mantissa,
bit_and(bitcast_f2u(x),
new(ir) ir_constant(0x807fffffu, vec_elem))));
i.insert_before(is_not_zero_or_underflow);
i.insert_before(assign(is_not_zero_or_underflow,
logic_and(nequal(x, new(ir) ir_constant(0.0f, vec_elem)),
gequal(resulting_biased_exp,
new(ir) ir_constant(0x1, vec_elem)))));
i.insert_before(assign(x, csel(is_not_zero_or_underflow,
x, zero_sign_x)));
i.insert_before(assign(resulting_biased_exp,
csel(is_not_zero_or_underflow,
resulting_biased_exp, zeroi)));
/* We could test for overflows by checking if the resulting biased exponent
* would be greater than 0xFE. Turns out we don't need to because the GLSL
* spec says:
/* We flush to zero if the original or resulting biased exponent is 0,
* indicating a +/-0.0 or subnormal input or output.
*
* "If this product is too large to be represented in the
* floating-point type, the result is undefined."
* The mantissa is set to 0 if the resulting biased exponent is 255, since
* an overflow should produce a +/-inf result.
*
* Note that NaN inputs are handled separately.
*/
i.insert_before(flush_to_zero);
i.insert_before(assign(flush_to_zero,
lequal(min2(resulting_biased_exp,
extracted_biased_exp),
ir_constant::zero(ir, ivec))));
i.insert_before(assign(resulting_biased_exp,
csel(flush_to_zero,
ir_constant::zero(ir, ivec),
resulting_biased_exp)));
ir_constant *exp_shift_clone = exp_shift->clone(ir, NULL);
i.insert_before(zero_mantissa);
i.insert_before(assign(zero_mantissa,
logic_or(flush_to_zero,
equal(resulting_biased_exp,
new(ir) ir_constant(255, vec_elem)))));
i.insert_before(assign(sign_mantissa,
csel(zero_mantissa,
bit_and(sign_mantissa,
new(ir) ir_constant(0x80000000u, vec_elem)),
sign_mantissa)));
/* Don't generate new IR that would need to be lowered in an additional
* pass.
*/
i.insert_before(result);
if (!lowering(INSERT_TO_SHIFTS)) {
ir_constant *exp_width = new(ir) ir_constant(8, vec_elem);
ir->operation = ir_unop_bitcast_i2f;
ir->operands[0] = bitfield_insert(bitcast_f2i(x), resulting_biased_exp,
exp_shift_clone, exp_width);
ir->operands[1] = NULL;
i.insert_before(assign(result,
bitfield_insert(sign_mantissa,
i2u(resulting_biased_exp),
new(ir) ir_constant(23u, vec_elem),
new(ir) ir_constant(8u, vec_elem))));
} else {
ir_constant *sign_mantissa_mask = new(ir) ir_constant(0x807fffffu, vec_elem);
ir->operation = ir_unop_bitcast_u2f;
ir->operands[0] = bit_or(bit_and(bitcast_f2u(x), sign_mantissa_mask),
lshift(i2u(resulting_biased_exp), exp_shift_clone));
i.insert_before(assign(result,
bit_or(sign_mantissa,
lshift(i2u(resulting_biased_exp),
new(ir) ir_constant(23, vec_elem)))));
}
ir->operation = ir_triop_csel;
ir->operands[0] = gequal(extracted_biased_exp,
new(ir) ir_constant(255, vec_elem));
ir->operands[1] = new(ir) ir_dereference_variable(x);
ir->operands[2] = bitcast_u2f(result);
this->progress = true;
}

View File

@@ -237,6 +237,12 @@ ir_constant_propagation_visitor::constant_propagation(ir_rvalue **rvalue) {
case GLSL_TYPE_BOOL:
data.b[i] = found->constant->value.b[rhs_channel];
break;
case GLSL_TYPE_UINT64:
data.u64[i] = found->constant->value.u64[rhs_channel];
break;
case GLSL_TYPE_INT64:
data.i64[i] = found->constant->value.i64[rhs_channel];
break;
default:
assert(!"not reached");
break;

View File

@@ -59,7 +59,7 @@
#include "program.h"
#include "shader_cache.h"
#include "util/mesa-sha1.h"
#include "util/string_to_uint_map.h"
#include "string_to_uint_map.h"
extern "C" {
#include "main/enums.h"
@@ -74,11 +74,26 @@ compile_shaders(struct gl_context *ctx, struct gl_shader_program *prog) {
}
}
static void
get_struct_type_field_and_pointer_sizes(size_t *s_field_size,
size_t *s_field_ptrs)
{
*s_field_size = sizeof(glsl_struct_field);
*s_field_ptrs =
sizeof(((glsl_struct_field *)0)->type) +
sizeof(((glsl_struct_field *)0)->name);
}
static void
encode_type_to_blob(struct blob *blob, const glsl_type *type)
{
uint32_t encoding;
if (!type) {
blob_write_uint32(blob, 0);
return;
}
switch (type->base_type) {
case GLSL_TYPE_UINT:
case GLSL_TYPE_INT:
@@ -122,11 +137,18 @@ encode_type_to_blob(struct blob *blob, const glsl_type *type)
blob_write_uint32(blob, (type->base_type) << 24);
blob_write_string(blob, type->name);
blob_write_uint32(blob, type->length);
blob_write_bytes(blob, type->fields.structure,
sizeof(glsl_struct_field) * type->length);
size_t s_field_size, s_field_ptrs;
get_struct_type_field_and_pointer_sizes(&s_field_size, &s_field_ptrs);
for (unsigned i = 0; i < type->length; i++) {
encode_type_to_blob(blob, type->fields.structure[i].type);
blob_write_string(blob, type->fields.structure[i].name);
/* Write the struct field skipping the pointers */
blob_write_bytes(blob,
((char *)&type->fields.structure[i]) + s_field_ptrs,
s_field_size - s_field_ptrs);
}
if (type->is_interface()) {
@@ -149,6 +171,11 @@ static const glsl_type *
decode_type_from_blob(struct blob_reader *blob)
{
uint32_t u = blob_read_uint32(blob);
if (u == 0) {
return NULL;
}
glsl_base_type base_type = (glsl_base_type) (u >> 24);
switch (base_type) {
@@ -182,22 +209,33 @@ decode_type_from_blob(struct blob_reader *blob)
case GLSL_TYPE_INTERFACE: {
char *name = blob_read_string(blob);
unsigned num_fields = blob_read_uint32(blob);
glsl_struct_field *fields = (glsl_struct_field *)
blob_read_bytes(blob, sizeof(glsl_struct_field) * num_fields);
size_t s_field_size, s_field_ptrs;
get_struct_type_field_and_pointer_sizes(&s_field_size, &s_field_ptrs);
glsl_struct_field *fields =
(glsl_struct_field *) malloc(s_field_size * num_fields);
for (unsigned i = 0; i < num_fields; i++) {
fields[i].type = decode_type_from_blob(blob);
fields[i].name = blob_read_string(blob);
blob_copy_bytes(blob, ((uint8_t *) &fields[i]) + s_field_ptrs,
s_field_size - s_field_ptrs);
}
const glsl_type *t;
if (base_type == GLSL_TYPE_INTERFACE) {
enum glsl_interface_packing packing =
(glsl_interface_packing) blob_read_uint32(blob);
bool row_major = blob_read_uint32(blob);
return glsl_type::get_interface_instance(fields, num_fields,
packing, row_major, name);
t = glsl_type::get_interface_instance(fields, num_fields, packing,
row_major, name);
} else {
return glsl_type::get_record_instance(fields, num_fields, name);
t = glsl_type::get_record_instance(fields, num_fields, name);
}
free(fields);
return t;
}
case GLSL_TYPE_VOID:
case GLSL_TYPE_ERROR:
@@ -555,6 +593,17 @@ read_xfb(struct blob_reader *metadata, struct gl_shader_program *shProg)
MAX_FEEDBACK_BUFFERS);
}
static bool
has_uniform_storage(struct gl_shader_program *prog, unsigned idx)
{
if (!prog->data->UniformStorage[idx].builtin &&
!prog->data->UniformStorage[idx].is_shader_storage &&
prog->data->UniformStorage[idx].block_index == -1)
return true;
return false;
}
static void
write_uniforms(struct blob *metadata, struct gl_shader_program *prog)
{
@@ -566,8 +615,6 @@ write_uniforms(struct blob *metadata, struct gl_shader_program *prog)
encode_type_to_blob(metadata, prog->data->UniformStorage[i].type);
blob_write_uint32(metadata, prog->data->UniformStorage[i].array_elements);
blob_write_string(metadata, prog->data->UniformStorage[i].name);
blob_write_uint32(metadata, prog->data->UniformStorage[i].storage -
prog->data->UniformDataSlots);
blob_write_uint32(metadata, prog->data->UniformStorage[i].builtin);
blob_write_uint32(metadata, prog->data->UniformStorage[i].remap_location);
blob_write_uint32(metadata, prog->data->UniformStorage[i].block_index);
@@ -586,6 +633,12 @@ write_uniforms(struct blob *metadata, struct gl_shader_program *prog)
prog->data->UniformStorage[i].top_level_array_size);
blob_write_uint32(metadata,
prog->data->UniformStorage[i].top_level_array_stride);
if (has_uniform_storage(prog, i)) {
blob_write_uint32(metadata, prog->data->UniformStorage[i].storage -
prog->data->UniformDataSlots);
}
blob_write_bytes(metadata, prog->data->UniformStorage[i].opaque,
sizeof(prog->data->UniformStorage[i].opaque));
}
@@ -597,9 +650,7 @@ write_uniforms(struct blob *metadata, struct gl_shader_program *prog)
*/
blob_write_uint32(metadata, prog->data->NumHiddenUniforms);
for (unsigned i = 0; i < prog->data->NumUniformStorage; i++) {
if (!prog->data->UniformStorage[i].builtin &&
!prog->data->UniformStorage[i].is_shader_storage &&
prog->data->UniformStorage[i].block_index == -1) {
if (has_uniform_storage(prog, i)) {
unsigned vec_size =
prog->data->UniformStorage[i].type->component_slots() *
MAX2(prog->data->UniformStorage[i].array_elements, 1);
@@ -633,7 +684,6 @@ read_uniforms(struct blob_reader *metadata, struct gl_shader_program *prog)
uniforms[i].type = decode_type_from_blob(metadata);
uniforms[i].array_elements = blob_read_uint32(metadata);
uniforms[i].name = ralloc_strdup(prog, blob_read_string (metadata));
uniforms[i].storage = data + blob_read_uint32(metadata);
uniforms[i].builtin = blob_read_uint32(metadata);
uniforms[i].remap_location = blob_read_uint32(metadata);
uniforms[i].block_index = blob_read_uint32(metadata);
@@ -651,6 +701,10 @@ read_uniforms(struct blob_reader *metadata, struct gl_shader_program *prog)
uniforms[i].top_level_array_stride = blob_read_uint32(metadata);
prog->UniformHash->put(i, uniforms[i].name);
if (has_uniform_storage(prog, i)) {
uniforms[i].storage = data + blob_read_uint32(metadata);
}
memcpy(uniforms[i].opaque,
blob_read_bytes(metadata, sizeof(uniforms[i].opaque)),
sizeof(uniforms[i].opaque));
@@ -659,9 +713,7 @@ read_uniforms(struct blob_reader *metadata, struct gl_shader_program *prog)
/* Restore uniform values. */
prog->data->NumHiddenUniforms = blob_read_uint32(metadata);
for (unsigned i = 0; i < prog->data->NumUniformStorage; i++) {
if (!prog->data->UniformStorage[i].builtin &&
!prog->data->UniformStorage[i].is_shader_storage &&
prog->data->UniformStorage[i].block_index == -1) {
if (has_uniform_storage(prog, i)) {
unsigned vec_size =
prog->data->UniformStorage[i].type->component_slots() *
MAX2(prog->data->UniformStorage[i].array_elements, 1);
@@ -867,6 +919,18 @@ write_shader_subroutine_index(struct blob *metadata,
}
}
static void
get_shader_var_and_pointer_sizes(size_t *s_var_size, size_t *s_var_ptrs,
const gl_shader_variable *var)
{
*s_var_size = sizeof(gl_shader_variable);
*s_var_ptrs =
sizeof(var->type) +
sizeof(var->interface_type) +
sizeof(var->outermost_struct_type) +
sizeof(var->name);
}
static void
write_program_resource_data(struct blob *metadata,
struct gl_shader_program *prog,
@@ -878,16 +942,19 @@ write_program_resource_data(struct blob *metadata,
case GL_PROGRAM_INPUT:
case GL_PROGRAM_OUTPUT: {
const gl_shader_variable *var = (gl_shader_variable *)res->Data;
blob_write_bytes(metadata, var, sizeof(gl_shader_variable));
encode_type_to_blob(metadata, var->type);
if (var->interface_type)
encode_type_to_blob(metadata, var->interface_type);
if (var->outermost_struct_type)
encode_type_to_blob(metadata, var->outermost_struct_type);
encode_type_to_blob(metadata, var->interface_type);
encode_type_to_blob(metadata, var->outermost_struct_type);
blob_write_string(metadata, var->name);
size_t s_var_size, s_var_ptrs;
get_shader_var_and_pointer_sizes(&s_var_size, &s_var_ptrs, var);
/* Write gl_shader_variable skipping over the pointers */
blob_write_bytes(metadata, ((char *)var) + s_var_ptrs,
s_var_size - s_var_ptrs);
break;
}
case GL_UNIFORM_BLOCK:
@@ -978,17 +1045,18 @@ read_program_resource_data(struct blob_reader *metadata,
case GL_PROGRAM_OUTPUT: {
gl_shader_variable *var = ralloc(prog, struct gl_shader_variable);
blob_copy_bytes(metadata, (uint8_t *) var, sizeof(gl_shader_variable));
var->type = decode_type_from_blob(metadata);
if (var->interface_type)
var->interface_type = decode_type_from_blob(metadata);
if (var->outermost_struct_type)
var->outermost_struct_type = decode_type_from_blob(metadata);
var->interface_type = decode_type_from_blob(metadata);
var->outermost_struct_type = decode_type_from_blob(metadata);
var->name = ralloc_strdup(prog, blob_read_string(metadata));
size_t s_var_size, s_var_ptrs;
get_shader_var_and_pointer_sizes(&s_var_size, &s_var_ptrs, var);
blob_copy_bytes(metadata, ((uint8_t *) var) + s_var_ptrs,
s_var_size - s_var_ptrs);
res->Data = var;
break;
}
@@ -1148,18 +1216,20 @@ write_shader_metadata(struct blob *metadata, gl_linked_shader *shader)
blob_write_bytes(metadata, glprog->sh.ImageUnits,
sizeof(glprog->sh.ImageUnits));
size_t ptr_size = sizeof(GLvoid *);
blob_write_uint32(metadata, glprog->sh.NumBindlessSamplers);
blob_write_uint32(metadata, glprog->sh.HasBoundBindlessSampler);
for (i = 0; i < glprog->sh.NumBindlessSamplers; i++) {
blob_write_bytes(metadata, &glprog->sh.BindlessSamplers[i],
sizeof(struct gl_bindless_sampler));
sizeof(struct gl_bindless_sampler) - ptr_size);
}
blob_write_uint32(metadata, glprog->sh.NumBindlessImages);
blob_write_uint32(metadata, glprog->sh.HasBoundBindlessImage);
for (i = 0; i < glprog->sh.NumBindlessImages; i++) {
blob_write_bytes(metadata, &glprog->sh.BindlessImages[i],
sizeof(struct gl_bindless_image));
sizeof(struct gl_bindless_image) - ptr_size);
}
write_shader_parameters(metadata, glprog->Parameters);
@@ -1187,6 +1257,8 @@ read_shader_metadata(struct blob_reader *metadata,
blob_copy_bytes(metadata, (uint8_t *) glprog->sh.ImageUnits,
sizeof(glprog->sh.ImageUnits));
size_t ptr_size = sizeof(GLvoid *);
glprog->sh.NumBindlessSamplers = blob_read_uint32(metadata);
glprog->sh.HasBoundBindlessSampler = blob_read_uint32(metadata);
if (glprog->sh.NumBindlessSamplers > 0) {
@@ -1196,7 +1268,7 @@ read_shader_metadata(struct blob_reader *metadata,
for (i = 0; i < glprog->sh.NumBindlessSamplers; i++) {
blob_copy_bytes(metadata, (uint8_t *) &glprog->sh.BindlessSamplers[i],
sizeof(struct gl_bindless_sampler));
sizeof(struct gl_bindless_sampler) - ptr_size);
}
}
@@ -1209,7 +1281,7 @@ read_shader_metadata(struct blob_reader *metadata,
for (i = 0; i < glprog->sh.NumBindlessImages; i++) {
blob_copy_bytes(metadata, (uint8_t *) &glprog->sh.BindlessImages[i],
sizeof(struct gl_bindless_image));
sizeof(struct gl_bindless_image) - ptr_size);
}
}
@@ -1224,6 +1296,14 @@ create_binding_str(const char *key, unsigned value, void *closure)
ralloc_asprintf_append(bindings_str, "%s:%u,", key, value);
}
static void
get_shader_info_and_pointer_sizes(size_t *s_info_size, size_t *s_info_ptrs,
shader_info *info)
{
*s_info_size = sizeof(shader_info);
*s_info_ptrs = sizeof(info->name) + sizeof(info->label);
}
static void
create_linked_shader_and_program(struct gl_context *ctx,
gl_shader_stage stage,
@@ -1242,12 +1322,16 @@ create_linked_shader_and_program(struct gl_context *ctx,
read_shader_metadata(metadata, glprog, linked);
glprog->info.name = ralloc_strdup(glprog, blob_read_string(metadata));
glprog->info.label = ralloc_strdup(glprog, blob_read_string(metadata));
size_t s_info_size, s_info_ptrs;
get_shader_info_and_pointer_sizes(&s_info_size, &s_info_ptrs,
&glprog->info);
/* Restore shader info */
blob_copy_bytes(metadata, (uint8_t *) &glprog->info, sizeof(shader_info));
if (glprog->info.name)
glprog->info.name = ralloc_strdup(glprog, blob_read_string(metadata));
if (glprog->info.label)
glprog->info.label = ralloc_strdup(glprog, blob_read_string(metadata));
blob_copy_bytes(metadata, ((uint8_t *) &glprog->info) + s_info_ptrs,
s_info_size - s_info_ptrs);
_mesa_reference_shader_program_data(ctx, &glprog->sh.data, prog->data);
_mesa_reference_program(ctx, &linked->Program, glprog);
@@ -1286,14 +1370,24 @@ shader_cache_write_program_metadata(struct gl_context *ctx,
if (sh) {
write_shader_metadata(metadata, sh);
/* Store nir shader info */
blob_write_bytes(metadata, &sh->Program->info, sizeof(shader_info));
if (sh->Program->info.name)
blob_write_string(metadata, sh->Program->info.name);
else
blob_write_string(metadata, "");
if (sh->Program->info.label)
blob_write_string(metadata, sh->Program->info.label);
else
blob_write_string(metadata, "");
size_t s_info_size, s_info_ptrs;
get_shader_info_and_pointer_sizes(&s_info_size, &s_info_ptrs,
&sh->Program->info);
/* Store shader info */
blob_write_bytes(metadata,
((char *) &sh->Program->info) + s_info_ptrs,
s_info_size - s_info_ptrs);
}
}

View File

@@ -36,7 +36,7 @@
#include "loop_analysis.h"
#include "standalone_scaffolding.h"
#include "standalone.h"
#include "util/string_to_uint_map.h"
#include "string_to_uint_map.h"
#include "util/set.h"
#include "linker.h"
#include "glsl_parser_extras.h"

View File

@@ -25,7 +25,7 @@
#include "main/mtypes.h"
#include "main/macros.h"
#include "util/ralloc.h"
#include "util/string_to_uint_map.h"
#include "string_to_uint_map.h"
#include "uniform_initializer_utils.h"
namespace linker {

View File

@@ -1975,10 +1975,10 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin)
return SYSTEM_VALUE_HELPER_INVOCATION;
case nir_intrinsic_load_view_index:
return SYSTEM_VALUE_VIEW_INDEX;
case SYSTEM_VALUE_SUBGROUP_SIZE:
return nir_intrinsic_load_subgroup_size;
case SYSTEM_VALUE_SUBGROUP_INVOCATION:
return nir_intrinsic_load_subgroup_invocation;
case nir_intrinsic_load_subgroup_size:
return SYSTEM_VALUE_SUBGROUP_SIZE;
case nir_intrinsic_load_subgroup_invocation:
return SYSTEM_VALUE_SUBGROUP_INVOCATION;
case nir_intrinsic_load_subgroup_eq_mask:
return SYSTEM_VALUE_SUBGROUP_EQ_MASK;
case nir_intrinsic_load_subgroup_ge_mask:

View File

@@ -121,9 +121,9 @@ BARRIER(memory_barrier_shared)
INTRINSIC(discard_if, 1, ARR(1), false, 0, 0, 0, xx, xx, xx, 0)
/** ARB_shader_group_vote intrinsics */
INTRINSIC(vote_any, 1, ARR(1), true, 1, 1, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE)
INTRINSIC(vote_all, 1, ARR(1), true, 1, 1, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE)
INTRINSIC(vote_eq, 1, ARR(1), true, 1, 1, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE)
INTRINSIC(vote_any, 1, ARR(1), true, 1, 0, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE)
INTRINSIC(vote_all, 1, ARR(1), true, 1, 0, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE)
INTRINSIC(vote_eq, 1, ARR(1), true, 1, 0, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE)
/**
* Basic Geometry Shader intrinsics.
@@ -433,7 +433,7 @@ INTRINSIC(load_interpolated_input, 2, ARR(2, 1), true, 0, 0,
/* src[] = { buffer_index, offset }. No const_index */
LOAD(ssbo, 2, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE)
/* src[] = { offset }. const_index[] = { base, component } */
LOAD(output, 1, 1, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE)
LOAD(output, 1, 2, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE)
/* src[] = { vertex, offset }. const_index[] = { base, component } */
LOAD(per_vertex_output, 2, 1, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE)
/* src[] = { offset }. const_index[] = { base } */

View File

@@ -308,7 +308,7 @@ for (unsigned bit = 0; bit < 32; bit++) {
unop_convert("ufind_msb", tint32, tuint32, """
dst = -1;
for (int bit = 31; bit > 0; bit--) {
for (int bit = 31; bit >= 0; bit--) {
if ((src0 >> bit) & 1) {
dst = bit;
break;

View File

@@ -250,8 +250,8 @@ optimizations = [
(('ishr', a, 0), a),
(('ushr', 0, a), 0),
(('ushr', a, 0), a),
(('iand', 0xff, ('ushr', a, 24)), ('ushr', a, 24)),
(('iand', 0xffff, ('ushr', a, 16)), ('ushr', a, 16)),
(('iand', 0xff, ('ushr@32', a, 24)), ('ushr', a, 24)),
(('iand', 0xffff, ('ushr@32', a, 16)), ('ushr', a, 16)),
# Exponential/logarithmic identities
(('~fexp2', ('flog2', a)), a), # 2^lg2(a) = a
(('~flog2', ('fexp2', a)), a), # lg2(2^a) = a

View File

@@ -28,6 +28,26 @@
* \file nir_opt_intrinsics.c
*/
static nir_ssa_def *
high_subgroup_mask(nir_builder *b,
nir_ssa_def *count,
uint64_t base_mask)
{
/* group_mask could probably be calculated more efficiently but we want to
* be sure not to shift by 64 if the subgroup size is 64 because the GLSL
* shift operator is undefined in that case. In any case if we were worried
* about efficency this should probably be done further down because the
* subgroup size is likely to be known at compile time.
*/
nir_ssa_def *subgroup_size = nir_load_subgroup_size(b);
nir_ssa_def *all_bits = nir_imm_int64(b, ~0ull);
nir_ssa_def *shift = nir_isub(b, nir_imm_int(b, 64), subgroup_size);
nir_ssa_def *group_mask = nir_ushr(b, all_bits, shift);
nir_ssa_def *higher_bits = nir_ishl(b, nir_imm_int64(b, base_mask), count);
return nir_iand(b, higher_bits, group_mask);
}
static bool
opt_intrinsics_impl(nir_function_impl *impl)
{
@@ -95,10 +115,10 @@ opt_intrinsics_impl(nir_function_impl *impl)
replacement = nir_ishl(&b, nir_imm_int64(&b, 1ull), count);
break;
case nir_intrinsic_load_subgroup_ge_mask:
replacement = nir_ishl(&b, nir_imm_int64(&b, ~0ull), count);
replacement = high_subgroup_mask(&b, count, ~0ull);
break;
case nir_intrinsic_load_subgroup_gt_mask:
replacement = nir_ishl(&b, nir_imm_int64(&b, ~1ull), count);
replacement = high_subgroup_mask(&b, count, ~1ull);
break;
case nir_intrinsic_load_subgroup_le_mask:
replacement = nir_inot(&b, nir_ishl(&b, nir_imm_int64(&b, ~1ull), count));

View File

@@ -32,14 +32,14 @@ extern "C" {
#endif
typedef struct shader_info {
/** The shader stage, such as MESA_SHADER_VERTEX. */
gl_shader_stage stage;
const char *name;
/* Descriptive name provided by the client; may be NULL */
const char *label;
/** The shader stage, such as MESA_SHADER_VERTEX. */
gl_shader_stage stage;
/* Number of textures used by this shader */
unsigned num_textures;
/* Number of uniform buffers used by this shader */

View File

@@ -721,7 +721,7 @@ translate_image_format(SpvImageFormat format)
case SpvImageFormatRg32ui: return 0x823C; /* GL_RG32UI */
case SpvImageFormatRg16ui: return 0x823A; /* GL_RG16UI */
case SpvImageFormatRg8ui: return 0x8238; /* GL_RG8UI */
case SpvImageFormatR16ui: return 0x823A; /* GL_RG16UI */
case SpvImageFormatR16ui: return 0x8234; /* GL_R16UI */
case SpvImageFormatR8ui: return 0x8232; /* GL_R8UI */
default:
assert(!"Invalid image format");
@@ -2034,6 +2034,7 @@ vtn_handle_image(struct vtn_builder *b, SpvOp opcode,
case SpvOpAtomicIDecrement:
case SpvOpAtomicExchange:
case SpvOpAtomicIAdd:
case SpvOpAtomicISub:
case SpvOpAtomicSMin:
case SpvOpAtomicUMin:
case SpvOpAtomicSMax:
@@ -2801,7 +2802,8 @@ vtn_handle_preamble_instruction(struct vtn_builder *b, SpvOp opcode,
case SpvOpMemoryModel:
assert(w[1] == SpvAddressingModelLogical);
assert(w[2] == SpvMemoryModelGLSL450);
assert(w[2] == SpvMemoryModelSimple ||
w[2] == SpvMemoryModelGLSL450);
break;
case SpvOpEntryPoint: {

View File

@@ -427,7 +427,7 @@ vtn_cfg_walk_blocks(struct vtn_builder *b, struct list_head *cf_list,
list_for_each_entry(struct vtn_case, cse, &swtch->cases, link) {
assert(cse->start_block != break_block);
vtn_cfg_walk_blocks(b, &cse->body, cse->start_block, cse,
break_block, NULL, loop_cont, NULL);
break_block, loop_break, loop_cont, NULL);
}
/* Finally, we walk over all of the cases one more time and put

View File

@@ -1121,6 +1121,10 @@ vtn_get_builtin_location(struct vtn_builder *b,
*location = FRAG_RESULT_DEPTH;
assert(*mode == nir_var_shader_out);
break;
case SpvBuiltInHelperInvocation:
*location = SYSTEM_VALUE_HELPER_INVOCATION;
set_mode_system_value(mode);
break;
case SpvBuiltInNumWorkgroups:
*location = SYSTEM_VALUE_NUM_WORK_GROUPS;
set_mode_system_value(mode);
@@ -1161,7 +1165,6 @@ vtn_get_builtin_location(struct vtn_builder *b,
*location = SYSTEM_VALUE_VIEW_INDEX;
set_mode_system_value(mode);
break;
case SpvBuiltInHelperInvocation:
default:
unreachable("unsupported builtin");
}

View File

@@ -629,6 +629,18 @@ dri2_setup_screen(_EGLDisplay *disp)
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
unsigned int api_mask;
/*
* EGL 1.5 specification defines the default value to 1. Moreover,
* eglSwapInterval() is required to clamp requested value to the supported
* range. Since the default value is implicitly assumed to be supported,
* use it as both minimum and maximum for the platforms that do not allow
* changing the interval. Platforms, which allow it (e.g. x11, wayland)
* override these values already.
*/
dri2_dpy->min_swap_interval = 1;
dri2_dpy->max_swap_interval = 1;
dri2_dpy->default_swap_interval = 1;
if (dri2_dpy->image_driver) {
api_mask = dri2_dpy->image_driver->getAPIMask(dri2_dpy->dri_screen);
} else if (dri2_dpy->dri2) {
@@ -676,7 +688,8 @@ dri2_setup_screen(_EGLDisplay *disp)
disp->Extensions.KHR_wait_sync = EGL_TRUE;
if (dri2_dpy->fence->get_fence_from_cl_event)
disp->Extensions.KHR_cl_event2 = EGL_TRUE;
if (dri2_dpy->fence->base.version >= 2) {
if (dri2_dpy->fence->base.version >= 2 &&
dri2_dpy->fence->get_capabilities) {
unsigned capabilities =
dri2_dpy->fence->get_capabilities(dri2_dpy->dri_screen);
disp->Extensions.ANDROID_native_fence_sync =
@@ -944,9 +957,12 @@ dri2_display_destroy(_EGLDisplay *disp)
zwp_linux_dmabuf_v1_destroy(dri2_dpy->wl_dmabuf);
if (dri2_dpy->wl_shm)
wl_shm_destroy(dri2_dpy->wl_shm);
wl_registry_destroy(dri2_dpy->wl_registry);
wl_event_queue_destroy(dri2_dpy->wl_queue);
wl_proxy_wrapper_destroy(dri2_dpy->wl_dpy_wrapper);
if (dri2_dpy->wl_registry)
wl_registry_destroy(dri2_dpy->wl_registry);
if (dri2_dpy->wl_queue)
wl_event_queue_destroy(dri2_dpy->wl_queue);
if (dri2_dpy->wl_dpy_wrapper)
wl_proxy_wrapper_destroy(dri2_dpy->wl_dpy_wrapper);
u_vector_finish(&dri2_dpy->wl_modifiers.argb8888);
u_vector_finish(&dri2_dpy->wl_modifiers.xrgb8888);
u_vector_finish(&dri2_dpy->wl_modifiers.rgb565);

View File

@@ -59,7 +59,14 @@ static inline EGLBoolean
dri2_fallback_swap_interval(_EGLDriver *drv, _EGLDisplay *dpy,
_EGLSurface *surf, EGLint interval)
{
return EGL_FALSE;
if (interval > surf->Config->MaxSwapInterval)
interval = surf->Config->MaxSwapInterval;
else if (interval < surf->Config->MinSwapInterval)
interval = surf->Config->MinSwapInterval;
surf->SwapInterval = interval;
return EGL_TRUE;
}
static inline EGLBoolean

View File

@@ -149,7 +149,7 @@ dri2_wl_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
if (!_eglInitSurface(&dri2_surf->base, disp, EGL_WINDOW_BIT, conf, attrib_list))
goto cleanup_surf;
if (dri2_dpy->wl_drm) {
if (dri2_dpy->wl_dmabuf || dri2_dpy->wl_drm) {
if (conf->RedSize == 5)
dri2_surf->format = WL_DRM_FORMAT_RGB565;
else if (conf->AlphaSize == 0)
@@ -199,7 +199,7 @@ dri2_wl_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
dri2_surf->wl_surface_wrapper = get_wl_surface_proxy(window);
if (!dri2_surf->wl_surface_wrapper) {
_eglError(EGL_BAD_ALLOC, "dri2_create_surface");
goto cleanup_drm;
goto cleanup_dpy_wrapper;
}
wl_proxy_set_queue((struct wl_proxy *)dri2_surf->wl_surface_wrapper,
dri2_surf->wl_queue);
@@ -227,7 +227,7 @@ dri2_wl_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
dri2_surf);
if (dri2_surf->dri_drawable == NULL) {
_eglError(EGL_BAD_ALLOC, "createNewDrawable");
goto cleanup_surf;
goto cleanup_surf_wrapper;
}
dri2_wl_swap_interval(drv, disp, &dri2_surf->base,
@@ -235,6 +235,10 @@ dri2_wl_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
return &dri2_surf->base;
cleanup_surf_wrapper:
wl_proxy_wrapper_destroy(dri2_surf->wl_surface_wrapper);
cleanup_dpy_wrapper:
wl_proxy_wrapper_destroy(dri2_surf->wl_dpy_wrapper);
cleanup_drm:
if (dri2_surf->wl_drm_wrapper)
wl_proxy_wrapper_destroy(dri2_surf->wl_drm_wrapper);
@@ -304,10 +308,10 @@ dri2_wl_destroy_surface(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf)
dri2_surf->wl_win->destroy_window_callback = NULL;
}
if (dri2_surf->wl_drm_wrapper)
wl_proxy_wrapper_destroy(dri2_surf->wl_drm_wrapper);
wl_proxy_wrapper_destroy(dri2_surf->wl_surface_wrapper);
wl_proxy_wrapper_destroy(dri2_surf->wl_dpy_wrapper);
if (dri2_surf->wl_drm_wrapper)
wl_proxy_wrapper_destroy(dri2_surf->wl_drm_wrapper);
wl_event_queue_destroy(dri2_surf->wl_queue);
free(surf);
@@ -403,8 +407,13 @@ get_back_bo(struct dri2_egl_surface *dri2_surf)
break;
/* If we don't have a buffer, then block on the server to release one for
* us, and try again. */
if (wl_display_dispatch_queue(dri2_dpy->wl_dpy, dri2_surf->wl_queue) < 0)
* us, and try again. wl_display_dispatch_queue will process any pending
* events, however not all servers flush on issuing a buffer release
* event. So, we spam the server with roundtrips as they always cause a
* client flush.
*/
if (wl_display_roundtrip_queue(dri2_dpy->wl_dpy,
dri2_surf->wl_queue) < 0)
return -1;
}
@@ -709,23 +718,41 @@ create_wl_buffer(struct dri2_egl_display *dri2_dpy,
__DRIimage *image)
{
struct wl_buffer *ret;
EGLBoolean query;
int width, height, fourcc, num_planes;
uint64_t modifier = DRM_FORMAT_MOD_INVALID;
dri2_dpy->image->queryImage(image, __DRI_IMAGE_ATTRIB_WIDTH, &width);
dri2_dpy->image->queryImage(image, __DRI_IMAGE_ATTRIB_HEIGHT, &height);
dri2_dpy->image->queryImage(image, __DRI_IMAGE_ATTRIB_FOURCC, &fourcc);
dri2_dpy->image->queryImage(image, __DRI_IMAGE_ATTRIB_NUM_PLANES,
&num_planes);
query = dri2_dpy->image->queryImage(image, __DRI_IMAGE_ATTRIB_WIDTH, &width);
query &= dri2_dpy->image->queryImage(image, __DRI_IMAGE_ATTRIB_HEIGHT,
&height);
query &= dri2_dpy->image->queryImage(image, __DRI_IMAGE_ATTRIB_FOURCC,
&fourcc);
if (!query)
return NULL;
if (dri2_dpy->wl_dmabuf && dri2_dpy->image->base.version >= 15) {
struct zwp_linux_buffer_params_v1 *params;
query = dri2_dpy->image->queryImage(image, __DRI_IMAGE_ATTRIB_NUM_PLANES,
&num_planes);
if (!query)
num_planes = 1;
if (dri2_dpy->image->base.version >= 15) {
int mod_hi, mod_lo;
int i;
dri2_dpy->image->queryImage(image, __DRI_IMAGE_ATTRIB_MODIFIER_UPPER,
&mod_hi);
dri2_dpy->image->queryImage(image, __DRI_IMAGE_ATTRIB_MODIFIER_LOWER,
&mod_lo);
query = dri2_dpy->image->queryImage(image,
__DRI_IMAGE_ATTRIB_MODIFIER_UPPER,
&mod_hi);
query &= dri2_dpy->image->queryImage(image,
__DRI_IMAGE_ATTRIB_MODIFIER_LOWER,
&mod_lo);
if (query) {
modifier = (uint64_t) mod_hi << 32;
modifier |= (uint64_t) (mod_lo & 0xffffffff);
}
}
if (dri2_dpy->wl_dmabuf && modifier != DRM_FORMAT_MOD_INVALID) {
struct zwp_linux_buffer_params_v1 *params;
int i;
/* We don't need a wrapper for wl_dmabuf objects, because we have to
* create the intermediate params object; we can set the queue on this,
@@ -736,7 +763,8 @@ create_wl_buffer(struct dri2_egl_display *dri2_dpy,
for (i = 0; i < num_planes; i++) {
__DRIimage *p_image;
int stride, offset, fd;
int stride, offset;
int fd = -1;
if (i == 0)
p_image = image;
@@ -747,16 +775,27 @@ create_wl_buffer(struct dri2_egl_display *dri2_dpy,
return NULL;
}
dri2_dpy->image->queryImage(p_image, __DRI_IMAGE_ATTRIB_FD, &fd);
dri2_dpy->image->queryImage(p_image, __DRI_IMAGE_ATTRIB_STRIDE,
&stride);
dri2_dpy->image->queryImage(p_image, __DRI_IMAGE_ATTRIB_OFFSET,
&offset);
query = dri2_dpy->image->queryImage(p_image,
__DRI_IMAGE_ATTRIB_FD,
&fd);
query &= dri2_dpy->image->queryImage(p_image,
__DRI_IMAGE_ATTRIB_STRIDE,
&stride);
query &= dri2_dpy->image->queryImage(p_image,
__DRI_IMAGE_ATTRIB_OFFSET,
&offset);
if (image != p_image)
dri2_dpy->image->destroyImage(p_image);
if (!query) {
if (fd >= 0)
close(fd);
zwp_linux_buffer_params_v1_destroy(params);
return NULL;
}
zwp_linux_buffer_params_v1_add(params, fd, i, offset, stride,
mod_hi, mod_lo);
modifier >> 32, modifier & 0xffffffff);
close(fd);
}
@@ -1083,15 +1122,22 @@ dmabuf_handle_modifier(void *data, struct zwp_linux_dmabuf_v1 *dmabuf,
struct dri2_egl_display *dri2_dpy = data;
uint64_t *mod = NULL;
if (modifier_hi == (DRM_FORMAT_MOD_INVALID >> 32) &&
modifier_lo == (DRM_FORMAT_MOD_INVALID & 0xffffffff))
return;
switch (format) {
case WL_DRM_FORMAT_ARGB8888:
mod = u_vector_add(&dri2_dpy->wl_modifiers.argb8888);
dri2_dpy->formats |= HAS_ARGB8888;
break;
case WL_DRM_FORMAT_XRGB8888:
mod = u_vector_add(&dri2_dpy->wl_modifiers.xrgb8888);
dri2_dpy->formats |= HAS_XRGB8888;
break;
case WL_DRM_FORMAT_RGB565:
mod = u_vector_add(&dri2_dpy->wl_modifiers.rgb565);
dri2_dpy->formats |= HAS_RGB565;
break;
default:
break;

View File

@@ -646,6 +646,7 @@ dri2_x11_connect(struct dri2_egl_display *dri2_dpy)
error != NULL || xfixes_query->major_version < 2) {
_eglLog(_EGL_WARNING, "DRI2: failed to query xfixes version");
free(error);
free(xfixes_query);
return EGL_FALSE;
}
free(xfixes_query);
@@ -1277,6 +1278,7 @@ dri2_x11_setup_swap_interval(struct dri2_egl_display *dri2_dpy)
*/
dri2_dpy->min_swap_interval = 0;
dri2_dpy->max_swap_interval = 0;
dri2_dpy->default_swap_interval = 0;
if (!dri2_dpy->swap_available)
return;
@@ -1318,6 +1320,7 @@ static const __DRIextension *dri3_image_loader_extensions[] = {
&dri3_image_loader_extension.base,
&image_lookup_extension.base,
&use_invalidate.base,
&background_callable_extension.base,
NULL,
};

View File

@@ -1,7 +1,7 @@
<?xml version="1.0" encoding="UTF-8"?>
<registry>
<!--
Copyright (c) 2013-2014 The Khronos Group Inc.
Copyright (c) 2013-2017 The Khronos Group Inc.
Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and/or associated documentation files (the
@@ -29,7 +29,7 @@
together with documentation, schema, and Python generator scripts used
to generate C header files for EGL, can be found in the Khronos Registry
at
http://www.opengl.org/registry/
https://www.github.com/KhronosGroup/EGL-Registry
-->
<!-- SECTION: EGL type definitions. Does not include GL types. -->
@@ -76,6 +76,7 @@
<type requires="khrplatform">typedef khronos_utime_nanoseconds_t <name>EGLTimeNV</name>;</type>
<type requires="khrplatform">typedef khronos_utime_nanoseconds_t <name>EGLuint64NV</name>;</type>
<type requires="khrplatform">typedef khronos_uint64_t <name>EGLuint64KHR</name>;</type>
<type requires="khrplatform">typedef khronos_stime_nanoseconds_t <name>EGLnsecsANDROID</name>;</type>
<type>typedef int <name>EGLNativeFileDescriptorKHR</name>;</type>
<type requires="khrplatform">typedef khronos_ssize_t <name>EGLsizeiANDROID</name>;</type>
<type requires="EGLsizeiANDROID">typedef void (*<name>EGLSetBlobFuncANDROID</name>) (const void *key, EGLsizeiANDROID keySize, const void *value, EGLsizeiANDROID valueSize);</type>
@@ -112,6 +113,7 @@
<!--
<enum value="0x0800" name="EGL_STREAM_BIT_NV" comment="Draft EGL_NV_stream_producer_eglsurface extension (bug 8064)"/>
-->
<enum value="0x1000" name="EGL_MUTABLE_RENDER_BUFFER_BIT_KHR"/>
</enums>
<enums namespace="EGLRenderableTypeMask" type="bitmask" comment="EGL_RENDERABLE_TYPE bits">
@@ -130,6 +132,12 @@
<enum value="0x0002" name="EGL_WRITE_SURFACE_BIT_KHR"/>
</enums>
<enums namespace="EGLNativeBufferUsageFlags" type="bitmask" comment="EGL_NATIVE_BUFFER_USAGE_ANDROID bits">
<enum value="0x00000001" name="EGL_NATIVE_BUFFER_USAGE_PROTECTED_BIT_ANDROID"/>
<enum value="0x00000002" name="EGL_NATIVE_BUFFER_USAGE_RENDERBUFFER_BIT_ANDROID"/>
<enum value="0x00000004" name="EGL_NATIVE_BUFFER_USAGE_TEXTURE_BIT_ANDROID"/>
</enums>
<enums namespace="EGLSyncFlagsKHR" type="bitmask" comment="Fence/reusable sync wait bits">
<enum value="0x0001" name="EGL_SYNC_FLUSH_COMMANDS_BIT"/>
<enum value="0x0001" name="EGL_SYNC_FLUSH_COMMANDS_BIT_KHR" alias="EGL_SYNC_FLUSH_COMMANDS_BIT"/>
@@ -165,7 +173,11 @@
tokens are reused for different purposes in different
extensions and API versions). -->
<enums namespace="EGL" start="0x0000" end="0x2FFF" vendor="ARB"/>
<enums namespace="EGL" start="0x0000" end="0x2FFF" vendor="KHR" comment="Reserved for enumerants shared with WGL, GLX, and GL">
<enum value="0" name="EGL_CONTEXT_RELEASE_BEHAVIOR_NONE_KHR"/>
<enum value="0x2097" name="EGL_CONTEXT_RELEASE_BEHAVIOR_KHR"/>
<enum value="0x2098" name="EGL_CONTEXT_RELEASE_BEHAVIOR_FLUSH_KHR"/>
</enums>
<enums namespace="EGL" group="Boolean" vendor="ARB">
<enum value="0" name="EGL_FALSE"/>
@@ -173,24 +185,25 @@
</enums>
<enums namespace="EGL" group="SpecialNumbers" vendor="ARB" comment="Tokens whose numeric value is intrinsically meaningful">
<enum value="((EGLint)-1)" name="EGL_DONT_CARE"/>
<enum value="((EGLint)-1)" name="EGL_UNKNOWN"/>
<enum value="EGL_CAST(EGLint,-1)" name="EGL_DONT_CARE"/>
<enum value="EGL_CAST(EGLint,-1)" name="EGL_UNKNOWN"/>
<enum value="-1" name="EGL_NO_NATIVE_FENCE_FD_ANDROID"/>
<enum value="0" name="EGL_DEPTH_ENCODING_NONE_NV"/>
<enum value="((EGLContext)0)" name="EGL_NO_CONTEXT"/>
<enum value="((EGLDeviceEXT)(0))" name="EGL_NO_DEVICE_EXT"/>
<enum value="((EGLDisplay)0)" name="EGL_NO_DISPLAY"/>
<enum value="((EGLImage)0)" name="EGL_NO_IMAGE"/>
<enum value="((EGLImageKHR)0)" name="EGL_NO_IMAGE_KHR"/>
<enum value="((EGLNativeDisplayType)0)" name="EGL_DEFAULT_DISPLAY"/>
<enum value="((EGLNativeFileDescriptorKHR)(-1))" name="EGL_NO_FILE_DESCRIPTOR_KHR"/>
<enum value="((EGLOutputLayerEXT)0)" name="EGL_NO_OUTPUT_LAYER_EXT"/>
<enum value="((EGLOutputPortEXT)0)" name="EGL_NO_OUTPUT_PORT_EXT"/>
<enum value="((EGLStreamKHR)0)" name="EGL_NO_STREAM_KHR"/>
<enum value="((EGLSurface)0)" name="EGL_NO_SURFACE"/>
<enum value="((EGLSync)0)" name="EGL_NO_SYNC"/>
<enum value="((EGLSyncKHR)0)" name="EGL_NO_SYNC_KHR" alias="EGL_NO_SYNC"/>
<enum value="((EGLSyncNV)0)" name="EGL_NO_SYNC_NV" alias="EGL_NO_SYNC"/>
<enum value="EGL_CAST(EGLContext,0)" name="EGL_NO_CONTEXT"/>
<enum value="EGL_CAST(EGLDeviceEXT,0)" name="EGL_NO_DEVICE_EXT"/>
<enum value="EGL_CAST(EGLDisplay,0)" name="EGL_NO_DISPLAY"/>
<enum value="EGL_CAST(EGLImage,0)" name="EGL_NO_IMAGE"/>
<enum value="EGL_CAST(EGLImageKHR,0)" name="EGL_NO_IMAGE_KHR"/>
<enum value="EGL_CAST(EGLNativeDisplayType,0)" name="EGL_DEFAULT_DISPLAY"/>
<enum value="EGL_CAST(EGLNativeFileDescriptorKHR,-1)" name="EGL_NO_FILE_DESCRIPTOR_KHR"/>
<enum value="EGL_CAST(EGLOutputLayerEXT,0)" name="EGL_NO_OUTPUT_LAYER_EXT"/>
<enum value="EGL_CAST(EGLOutputPortEXT,0)" name="EGL_NO_OUTPUT_PORT_EXT"/>
<enum value="EGL_CAST(EGLStreamKHR,0)" name="EGL_NO_STREAM_KHR"/>
<enum value="EGL_CAST(EGLSurface,0)" name="EGL_NO_SURFACE"/>
<enum value="EGL_CAST(EGLSync,0)" name="EGL_NO_SYNC"/>
<enum value="EGL_CAST(EGLSyncKHR,0)" name="EGL_NO_SYNC_KHR" alias="EGL_NO_SYNC"/>
<enum value="EGL_CAST(EGLSyncNV,0)" name="EGL_NO_SYNC_NV" alias="EGL_NO_SYNC"/>
<enum value="EGL_CAST(EGLConfig,0)" name="EGL_NO_CONFIG_KHR"/>
<enum value="10000" name="EGL_DISPLAY_SCALING"/>
<enum value="0xFFFFFFFFFFFFFFFF" name="EGL_FOREVER" type="ull"/>
<enum value="0xFFFFFFFFFFFFFFFF" name="EGL_FOREVER_KHR" type="ull" alias="EGL_FOREVER"/>
@@ -356,7 +369,7 @@
<enum value="0x30BD" name="EGL_GL_TEXTURE_ZOFFSET"/>
<enum value="0x30BD" name="EGL_GL_TEXTURE_ZOFFSET_KHR" alias="EGL_GL_TEXTURE_ZOFFSET"/>
<enum value="0x30BE" name="EGL_POST_SUB_BUFFER_SUPPORTED_NV"/>
<enum value="0x30BF" name="EGL_CONTEXT_OPENGL_ROBUST_ACCESS_EXT" alias="EGL_CONTEXT_OPENGL_ROBUST_ACCESS"/>
<enum value="0x30BF" name="EGL_CONTEXT_OPENGL_ROBUST_ACCESS_EXT"/>
</enums>
<enums namespace="EGL" start="0x30C0-0x30CF" vendor="KHR">
@@ -441,7 +454,10 @@
<enum value="0x3101" name="EGL_CONTEXT_PRIORITY_HIGH_IMG"/>
<enum value="0x3102" name="EGL_CONTEXT_PRIORITY_MEDIUM_IMG"/>
<enum value="0x3103" name="EGL_CONTEXT_PRIORITY_LOW_IMG"/>
<unused start="0x3104" end="0x310F"/>
<unused start="0x3104"/>
<enum value="0x3105" name="EGL_NATIVE_BUFFER_MULTIPLANE_SEPARATE_IMG"/>
<enum value="0x3106" name="EGL_NATIVE_BUFFER_PLANE_OFFSET_IMG"/>
<unused start="0x3107" end="0x310F"/>
</enums>
<enums namespace="EGL" start="0x3110" end="0x311F" vendor="ATX" comment="Reserved for Tim Renouf, Antix (Khronos bug 4949)">
@@ -474,12 +490,14 @@
<enum value="0x3140" name="EGL_NATIVE_BUFFER_ANDROID"/>
<enum value="0x3141" name="EGL_PLATFORM_ANDROID_KHR"/>
<enum value="0x3142" name="EGL_RECORDABLE_ANDROID"/>
<unused start="0x3143"/>
<enum value="0x3143" name="EGL_NATIVE_BUFFER_USAGE_ANDROID"/>
<enum value="0x3144" name="EGL_SYNC_NATIVE_FENCE_ANDROID"/>
<enum value="0x3145" name="EGL_SYNC_NATIVE_FENCE_FD_ANDROID"/>
<enum value="0x3146" name="EGL_SYNC_NATIVE_FENCE_SIGNALED_ANDROID"/>
<enum value="0x3147" name="EGL_FRAMEBUFFER_TARGET_ANDROID"/>
<unused start="0x3148" end="0x314F"/>
<unused start="0x3148" end="0x314B"/>
<enum value="0x314C" name="EGL_FRONT_BUFFER_AUTO_REFRESH_ANDROID"/>
<unused start="0x314D" end="0x314F"/>
</enums>
<enums namespace="EGL" start="0x3150" end="0x315F" vendor="NOK" comment="Reserved for Robert Palmer (Khronos bug 5368)">
@@ -532,7 +550,9 @@
<enum value="0x31D7" name="EGL_PLATFORM_GBM_MESA" alias="EGL_PLATFORM_GBM_KHR"/>
<enum value="0x31D8" name="EGL_PLATFORM_WAYLAND_KHR"/>
<enum value="0x31D8" name="EGL_PLATFORM_WAYLAND_EXT" alias="EGL_PLATFORM_WAYLAND_KHR"/>
<unused start="0x31D9" end="0x31DF"/>
<unused start="0x31D9" end="0x31DC"/>
<enum value="0x31DD" name="EGL_PLATFORM_SURFACELESS_MESA"/>
<unused start="0x31DE" end="0x31DF"/>
</enums>
<enums namespace="EGL" start="0x31E0" end="0x31EF" vendor="HI" comment="Reserved for Mark Callow (Khronos bug 6799)">
@@ -605,7 +625,37 @@
<enum value="0x323B" name="EGL_CUDA_EVENT_HANDLE_NV"/>
<enum value="0x323C" name="EGL_SYNC_CUDA_EVENT_NV"/>
<enum value="0x323D" name="EGL_SYNC_CUDA_EVENT_COMPLETE_NV"/>
<unused start="0x323E" end="0x325F"/>
<unused start="0x323E"/>
<enum value="0x323F" name="EGL_STREAM_CROSS_PARTITION_NV"/>
<enum value="0x3240" name="EGL_STREAM_STATE_INITIALIZING_NV"/>
<enum value="0x3241" name="EGL_STREAM_TYPE_NV"/>
<enum value="0x3242" name="EGL_STREAM_PROTOCOL_NV"/>
<enum value="0x3243" name="EGL_STREAM_ENDPOINT_NV"/>
<enum value="0x3244" name="EGL_STREAM_LOCAL_NV"/>
<enum value="0x3245" name="EGL_STREAM_CROSS_PROCESS_NV"/>
<enum value="0x3246" name="EGL_STREAM_PROTOCOL_FD_NV"/>
<enum value="0x3247" name="EGL_STREAM_PRODUCER_NV"/>
<enum value="0x3248" name="EGL_STREAM_CONSUMER_NV"/>
<unused start="0x3239" end="0x324A"/>
<enum value="0x324B" name="EGL_STREAM_PROTOCOL_SOCKET_NV"/>
<enum value="0x324C" name="EGL_SOCKET_HANDLE_NV"/>
<enum value="0x324D" name="EGL_SOCKET_TYPE_NV"/>
<enum value="0x324E" name="EGL_SOCKET_TYPE_UNIX_NV"/>
<enum value="0x324F" name="EGL_SOCKET_TYPE_INET_NV"/>
<enum value="0x3250" name="EGL_MAX_STREAM_METADATA_BLOCKS_NV"/>
<enum value="0x3251" name="EGL_MAX_STREAM_METADATA_BLOCK_SIZE_NV"/>
<enum value="0x3252" name="EGL_MAX_STREAM_METADATA_TOTAL_SIZE_NV"/>
<enum value="0x3253" name="EGL_PRODUCER_METADATA_NV"/>
<enum value="0x3254" name="EGL_CONSUMER_METADATA_NV"/>
<enum value="0x3255" name="EGL_METADATA0_SIZE_NV"/>
<enum value="0x3256" name="EGL_METADATA1_SIZE_NV"/>
<enum value="0x3257" name="EGL_METADATA2_SIZE_NV"/>
<enum value="0x3258" name="EGL_METADATA3_SIZE_NV"/>
<enum value="0x3259" name="EGL_METADATA0_TYPE_NV"/>
<enum value="0x325A" name="EGL_METADATA1_TYPE_NV"/>
<enum value="0x325B" name="EGL_METADATA2_TYPE_NV"/>
<enum value="0x325C" name="EGL_METADATA3_TYPE_NV"/>
<unused start="0x325D" end="0x325F"/>
</enums>
<enums namespace="EGL" start="0x3260" end="0x326F" vendor="BCOM" comment="Reserved for Gary Sweet, Broadcom (Public bug 620)">
@@ -636,7 +686,9 @@
<enum value="0x3284" name="EGL_YUV_CHROMA_SITING_0_EXT"/>
<enum value="0x3285" name="EGL_YUV_CHROMA_SITING_0_5_EXT"/>
<enum value="0x3286" name="EGL_DISCARD_SAMPLES_ARM"/>
<unused start="0x3287" end="0x328F"/>
<unused start="0x3287" end="0x3289"/>
<enum value="0x328A" name="EGL_SYNC_PRIOR_COMMANDS_IMPLICIT_EXTERNAL_ARM"/>
<unused start="0x328B" end="0x328F"/>
</enums>
<enums namespace="EGL" start="0x3290" end="0x329F" vendor="MESA" comment="Reserved for John K&#229;re Alsaker (Public bug 757)">
@@ -699,7 +751,51 @@
</enums>
<enums namespace="EGL" start="0x3320" end="0x339F" vendor="NV" comment="Reserved for James Jones (Bug 13209)">
<unused start="0x3320" end="0x339F"/>
<unused start="0x3320" end="0x3327"/>
<enum value="0x3328" name="EGL_PENDING_METADATA_NV"/>
<enum value="0x3329" name="EGL_PENDING_FRAME_NV"/>
<enum value="0x332A" name="EGL_STREAM_TIME_PENDING_NV"/>
<unused start="0x332B"/>
<enum value="0x332C" name="EGL_YUV_PLANE0_TEXTURE_UNIT_NV"/>
<enum value="0x332D" name="EGL_YUV_PLANE1_TEXTURE_UNIT_NV"/>
<enum value="0x332E" name="EGL_YUV_PLANE2_TEXTURE_UNIT_NV"/>
<unused start="0x332F" end="0x3333"/>
<enum value="0x3334" name="EGL_SUPPORT_RESET_NV"/>
<enum value="0x3335" name="EGL_SUPPORT_REUSE_NV"/>
<enum value="0x3336" name="EGL_STREAM_FIFO_SYNCHRONOUS_NV"/>
<enum value="0x3337" name="EGL_PRODUCER_MAX_FRAME_HINT_NV"/>
<enum value="0x3338" name="EGL_CONSUMER_MAX_FRAME_HINT_NV"/>
<enum value="0x3339" name="EGL_COLOR_COMPONENT_TYPE_EXT"/>
<enum value="0x333A" name="EGL_COLOR_COMPONENT_TYPE_FIXED_EXT"/>
<enum value="0x333B" name="EGL_COLOR_COMPONENT_TYPE_FLOAT_EXT"/>
<unused start="0x333C" end="0x333E"/>
<enum value="0x333F" name="EGL_GL_COLORSPACE_BT2020_LINEAR_EXT"/>
<enum value="0x3340" name="EGL_GL_COLORSPACE_BT2020_PQ_EXT"/>
<enum value="0x3341" name="EGL_SMPTE2086_DISPLAY_PRIMARY_RX_EXT"/>
<enum value="0x3342" name="EGL_SMPTE2086_DISPLAY_PRIMARY_RY_EXT"/>
<enum value="0x3343" name="EGL_SMPTE2086_DISPLAY_PRIMARY_GX_EXT"/>
<enum value="0x3344" name="EGL_SMPTE2086_DISPLAY_PRIMARY_GY_EXT"/>
<enum value="0x3345" name="EGL_SMPTE2086_DISPLAY_PRIMARY_BX_EXT"/>
<enum value="0x3346" name="EGL_SMPTE2086_DISPLAY_PRIMARY_BY_EXT"/>
<enum value="0x3347" name="EGL_SMPTE2086_WHITE_POINT_X_EXT"/>
<enum value="0x3348" name="EGL_SMPTE2086_WHITE_POINT_Y_EXT"/>
<enum value="0x3349" name="EGL_SMPTE2086_MAX_LUMINANCE_EXT"/>
<enum value="0x334A" name="EGL_SMPTE2086_MIN_LUMINANCE_EXT"/>
<enum value="50000" name="EGL_METADATA_SCALING_EXT"/>
<unused start="0x334B"/>
<enum value="0x334C" name="EGL_GENERATE_RESET_ON_VIDEO_MEMORY_PURGE_NV"/>
<enum value="0x334D" name="EGL_STREAM_CROSS_OBJECT_NV"/>
<enum value="0x334E" name="EGL_STREAM_CROSS_DISPLAY_NV"/>
<enum value="0x334F" name="EGL_STREAM_CROSS_SYSTEM_NV"/>
<enum value="0x3350" name="EGL_GL_COLORSPACE_SCRGB_LINEAR_EXT"/>
<enum value="0x3351" name="EGL_GL_COLORSPACE_SCRGB_EXT"/>
<enum value="0x3352" name="EGL_TRACK_REFERENCES_KHR"/>
<unused start="0x3353" end="0x335F"/>
<enum value="0x3360" name="EGL_CTA861_3_MAX_CONTENT_LIGHT_LEVEL_EXT"/>
<enum value="0x3361" name="EGL_CTA861_3_MAX_FRAME_AVERAGE_LEVEL_EXT"/>
<enum value="0x3362" name="EGL_GL_COLORSPACE_DISPLAY_P3_LINEAR_EXT"/>
<enum value="0x3363" name="EGL_GL_COLORSPACE_DISPLAY_P3_EXT"/>
<unused start="0x3364" end="0x339F"/>
</enums>
<enums namespace="EGL" start="0x33A0" end="0x33AF" vendor="ANGLE" comment="Reserved for Shannon Woods (Bug 13175)">
@@ -733,6 +829,44 @@
<unused start="0x33E0" end="0x342F"/>
</enums>
<enums namespace="EGL" start="0x3430" end="0x343F" vendor="ANDROID" comment="Reserved for Pablo Ceballos (Bug 15874)">
<unused start="0x3430" end="0x343F"/>
</enums>
<enums namespace="EGL" start="0x3440" end="0x344F" vendor="ANDROID" comment="Reserved for Kristian Kristensen (Bug 16033)">
<enum value="0x3440" name="EGL_DMA_BUF_PLANE3_FD_EXT"/>
<enum value="0x3441" name="EGL_DMA_BUF_PLANE3_OFFSET_EXT"/>
<enum value="0x3442" name="EGL_DMA_BUF_PLANE3_PITCH_EXT"/>
<enum value="0x3443" name="EGL_DMA_BUF_PLANE0_MODIFIER_LO_EXT"/>
<enum value="0x3444" name="EGL_DMA_BUF_PLANE0_MODIFIER_HI_EXT"/>
<enum value="0x3445" name="EGL_DMA_BUF_PLANE1_MODIFIER_LO_EXT"/>
<enum value="0x3446" name="EGL_DMA_BUF_PLANE1_MODIFIER_HI_EXT"/>
<enum value="0x3447" name="EGL_DMA_BUF_PLANE2_MODIFIER_LO_EXT"/>
<enum value="0x3448" name="EGL_DMA_BUF_PLANE2_MODIFIER_HI_EXT"/>
<enum value="0x3449" name="EGL_DMA_BUF_PLANE3_MODIFIER_LO_EXT"/>
<enum value="0x344A" name="EGL_DMA_BUF_PLANE3_MODIFIER_HI_EXT"/>
<unused start="0x344B" end="0x344F"/>
</enums>
<enums namespace="EGL" start="0x3450" end="0x345F" vendor="ANGLE" comment="Reserved for Shannon Woods (Bug 16106)">
<unused start="0x3450" end="0x345F"/>
</enums>
<enums namespace="EGL" start="0x3460" end="0x346F" vendor="COREAVI" comment="Reserved for Daniel Herring (Bug 16162)">
<enum value="0x3460" name="EGL_PRIMARY_COMPOSITOR_CONTEXT_EXT"/>
<enum value="0x3461" name="EGL_EXTERNAL_REF_ID_EXT"/>
<enum value="0x3462" name="EGL_COMPOSITOR_DROP_NEWEST_FRAME_EXT"/>
<enum value="0x3463" name="EGL_COMPOSITOR_KEEP_NEWEST_FRAME_EXT"/>
<enum value="0x3464" name="EGL_FRONT_BUFFER_EXT"/>
<unused start="0x3465" end="0x346F"/>
</enums>
<enums namespace="EGL" start="0x3470" end="0x347F" vendor="EXT" comment="Reserved for Daniel Stone (PR 14)">
<enum value="0x3470" name="EGL_IMPORT_SYNC_TYPE_EXT"/>
<enum value="0x3471" name="EGL_IMPORT_IMPLICIT_SYNC_EXT"/>
<enum value="0x3472" name="EGL_IMPORT_EXPLICIT_SYNC_EXT"/>
</enums>
<!-- Please remember that new enumerant allocations must be obtained by
request to the Khronos API registrar (see comments at the top of this
file) File requests in the Khronos Bugzilla, EGL project, Registry
@@ -742,8 +876,8 @@
<!-- Reservable for future use. To generate a new range, allocate multiples
of 16 starting at the lowest available point in this block. -->
<enums namespace="EGL" start="0x3420" end="0x3FFF" vendor="KHR">
<unused start="0x3420" end="0x3FFF" comment="Reserved for future use"/>
<enums namespace="EGL" start="0x3480" end="0x3FFF" vendor="KHR" comment="Reserved for future use">
<unused start="0x3480" end="0x3FFF"/>
</enums>
<enums namespace="EGL" start="0x8F70" end="0x8F7F" vendor="HI" comment="For Mark Callow, Khronos bug 4055. Shared with GL.">
@@ -835,6 +969,10 @@
<param><ptype>EGLClientBuffer</ptype> <name>buffer</name></param>
<param>const <ptype>EGLint</ptype> *<name>attrib_list</name></param>
</command>
<command>
<proto><ptype>EGLClientBuffer</ptype> <name>eglCreateNativeClientBufferANDROID</name></proto>
<param>const <ptype>EGLint</ptype> *<name>attrib_list</name></param>
</command>
<command>
<proto><ptype>EGLSurface</ptype> <name>eglCreatePbufferFromClientBuffer</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
@@ -900,6 +1038,11 @@
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param>const <ptype>EGLint</ptype> *<name>attrib_list</name></param>
</command>
<command>
<proto><ptype>EGLStreamKHR</ptype> <name>eglCreateStreamAttribKHR</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param>const <ptype>EGLAttrib</ptype> *<name>attrib_list</name></param>
</command>
<command>
<proto><ptype>EGLSurface</ptype> <name>eglCreateStreamProducerSurfaceKHR</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
@@ -1162,6 +1305,12 @@
<param><ptype>EGLint</ptype> <name>width</name></param>
<param><ptype>EGLint</ptype> <name>height</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglPresentationTimeANDROID</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param><ptype>EGLSurface</ptype> <name>surface</name></param>
<param><ptype>EGLnsecsANDROID</ptype> <name>time</name></param>
</command>
<command>
<proto><ptype>EGLenum</ptype> <name>eglQueryAPI</name></proto>
</command>
@@ -1199,6 +1348,36 @@
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param><ptype>EGLint</ptype> <name>attribute</name></param>
<param><ptype>EGLAttrib</ptype> *<name>value</name></param>
<alias name="eglQueryDisplayAttribKHR"/>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglQueryDisplayAttribKHR</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param><ptype>EGLint</ptype> <name>name</name></param>
<param><ptype>EGLAttrib</ptype> *<name>value</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglQueryDisplayAttribNV</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param><ptype>EGLint</ptype> <name>attribute</name></param>
<param><ptype>EGLAttrib</ptype> *<name>value</name></param>
<alias name="eglQueryDisplayAttribKHR"/>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglQueryDmaBufFormatsEXT</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param><ptype>EGLint</ptype> <name>max_formats</name></param>
<param><ptype>EGLint</ptype> *<name>formats</name></param>
<param><ptype>EGLint</ptype> *<name>num_formats</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglQueryDmaBufModifiersEXT</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param><ptype>EGLint</ptype> <name>format</name></param>
<param><ptype>EGLint</ptype> <name>max_modifiers</name></param>
<param><ptype>EGLuint64KHR</ptype> *<name>modifiers</name></param>
<param><ptype>EGLBoolean</ptype> *<name>external_only</name></param>
<param><ptype>EGLint</ptype> *<name>num_modifiers</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglQueryNativeDisplayNV</name></proto>
@@ -1250,6 +1429,23 @@
<param><ptype>EGLenum</ptype> <name>attribute</name></param>
<param><ptype>EGLint</ptype> *<name>value</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglQueryStreamAttribKHR</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param><ptype>EGLStreamKHR</ptype> <name>stream</name></param>
<param><ptype>EGLenum</ptype> <name>attribute</name></param>
<param><ptype>EGLAttrib</ptype> *<name>value</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglQueryStreamMetadataNV</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param><ptype>EGLStreamKHR</ptype> <name>stream</name></param>
<param><ptype>EGLenum</ptype> <name>name</name></param>
<param><ptype>EGLint</ptype> <name>n</name></param>
<param><ptype>EGLint</ptype> <name>offset</name></param>
<param><ptype>EGLint</ptype> <name>size</name></param>
<param>void *<name>data</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglQueryStreamTimeKHR</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
@@ -1299,6 +1495,11 @@
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglReleaseThread</name></proto>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglResetStreamNV</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param><ptype>EGLStreamKHR</ptype> <name>stream</name></param>
</command>
<command>
<proto>void <name>eglSetBlobCacheFuncsANDROID</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
@@ -1312,6 +1513,22 @@
<param><ptype>EGLint</ptype> *<name>rects</name></param>
<param><ptype>EGLint</ptype> <name>n_rects</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglSetStreamAttribKHR</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param><ptype>EGLStreamKHR</ptype> <name>stream</name></param>
<param><ptype>EGLenum</ptype> <name>attribute</name></param>
<param><ptype>EGLAttrib</ptype> <name>value</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglSetStreamMetadataNV</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param><ptype>EGLStreamKHR</ptype> <name>stream</name></param>
<param><ptype>EGLint</ptype> <name>n</name></param>
<param><ptype>EGLint</ptype> <name>offset</name></param>
<param><ptype>EGLint</ptype> <name>size</name></param>
<param>const void *<name>data</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglSignalSyncKHR</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
@@ -1335,11 +1552,23 @@
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param><ptype>EGLStreamKHR</ptype> <name>stream</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglStreamConsumerAcquireAttribKHR</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param><ptype>EGLStreamKHR</ptype> <name>stream</name></param>
<param>const <ptype>EGLAttrib</ptype> *<name>attrib_list</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglStreamConsumerGLTextureExternalKHR</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param><ptype>EGLStreamKHR</ptype> <name>stream</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglStreamConsumerGLTextureExternalAttribsNV</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param><ptype>EGLStreamKHR</ptype> <name>stream</name></param>
<param><ptype>EGLAttrib</ptype> *<name>attrib_list</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglStreamConsumerOutputEXT</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
@@ -1351,6 +1580,12 @@
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param><ptype>EGLStreamKHR</ptype> <name>stream</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglStreamConsumerReleaseAttribKHR</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
<param><ptype>EGLStreamKHR</ptype> <name>stream</name></param>
<param>const <ptype>EGLAttrib</ptype> *<name>attrib_list</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglSurfaceAttrib</name></proto>
<param><ptype>EGLDisplay</ptype> <name>dpy</name></param>
@@ -1427,6 +1662,44 @@
<param><ptype>EGLSyncKHR</ptype> <name>sync</name></param>
<param><ptype>EGLint</ptype> <name>flags</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglCompositorSetContextListEXT</name></proto>
<param>const <ptype>EGLint</ptype> *<name>external_ref_ids</name></param>
<param><ptype>EGLint</ptype> <name>num_entries</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglCompositorSetContextAttributesEXT</name></proto>
<param><ptype>EGLint</ptype> <name>external_ref_id</name></param>
<param>const <ptype>EGLint</ptype> *<name>context_attributes</name></param>
<param><ptype>EGLint</ptype> <name>num_entries</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglCompositorSetWindowListEXT</name></proto>
<param><ptype>EGLint</ptype> <name>external_ref_id</name></param>
<param>const <ptype>EGLint</ptype> *<name>external_win_ids</name></param>
<param><ptype>EGLint</ptype> <name>num_entries</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglCompositorSetWindowAttributesEXT</name></proto>
<param><ptype>EGLint</ptype> <name>external_win_id</name></param>
<param>const <ptype>EGLint</ptype> *<name>window_attributes</name></param>
<param><ptype>EGLint</ptype> <name>num_entries</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglCompositorBindTexWindowEXT</name></proto>
<param><ptype>EGLint</ptype> <name>external_win_id</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglCompositorSetSizeEXT</name></proto>
<param><ptype>EGLint</ptype> <name>external_win_id</name></param>
<param><ptype>EGLint</ptype> <name>width</name></param>
<param><ptype>EGLint</ptype> <name>height</name></param>
</command>
<command>
<proto><ptype>EGLBoolean</ptype> <name>eglCompositorSwapPolicyEXT</name></proto>
<param><ptype>EGLint</ptype> <name>external_win_id</name></param>
<param><ptype>EGLint</ptype> <name>policy</name></param>
</command>
</commands>
<!-- SECTION: EGL API interface definitions. -->
@@ -1699,11 +1972,25 @@
<command name="eglSetBlobCacheFuncsANDROID"/>
</require>
</extension>
<extension name="EGL_ANDROID_create_native_client_buffer" supported="egl">
<require>
<enum name="EGL_NATIVE_BUFFER_USAGE_ANDROID"/>
<enum name="EGL_NATIVE_BUFFER_USAGE_PROTECTED_BIT_ANDROID"/>
<enum name="EGL_NATIVE_BUFFER_USAGE_RENDERBUFFER_BIT_ANDROID"/>
<enum name="EGL_NATIVE_BUFFER_USAGE_TEXTURE_BIT_ANDROID"/>
<command name="eglCreateNativeClientBufferANDROID"/>
</require>
</extension>
<extension name="EGL_ANDROID_framebuffer_target" supported="egl">
<require>
<enum name="EGL_FRAMEBUFFER_TARGET_ANDROID"/>
</require>
</extension>
<extension name="EGL_ANDROID_front_buffer_auto_refresh" supported="egl">
<require>
<enum name="EGL_FRONT_BUFFER_AUTO_REFRESH_ANDROID"/>
</require>
</extension>
<extension name="EGL_ANDROID_image_native_buffer" supported="egl">
<require>
<enum name="EGL_NATIVE_BUFFER_ANDROID"/>
@@ -1718,6 +2005,11 @@
<command name="eglDupNativeFenceFDANDROID"/>
</require>
</extension>
<extension name="EGL_ANDROID_presentation_time" supported="egl">
<require>
<command name="eglPresentationTimeANDROID"/>
</require>
</extension>
<extension name="EGL_ANDROID_recordable" supported="egl">
<require>
<enum name="EGL_RECORDABLE_ANDROID"/>
@@ -1749,6 +2041,11 @@
<enum name="EGL_FIXED_SIZE_ANGLE"/>
</require>
</extension>
<extension name="EGL_ARM_implicit_external_sync" supported="egl">
<require>
<enum name="EGL_SYNC_PRIOR_COMMANDS_IMPLICIT_EXTERNAL_ARM"/>
</require>
</extension>
<extension name="EGL_ARM_pixmap_multisample_discard" supported="egl">
<require>
<enum name="EGL_DISCARD_SAMPLES_ARM"/>
@@ -1804,6 +2101,36 @@
<command name="eglQueryDisplayAttribEXT"/>
</require>
</extension>
<extension name="EGL_EXT_gl_colorspace_bt2020_linear" supported="egl">
<require>
<enum name="EGL_GL_COLORSPACE_BT2020_LINEAR_EXT"/>
</require>
</extension>
<extension name="EGL_EXT_gl_colorspace_bt2020_pq" supported="egl">
<require>
<enum name="EGL_GL_COLORSPACE_BT2020_PQ_EXT"/>
</require>
</extension>
<extension name="EGL_EXT_gl_colorspace_scrgb" supported="egl">
<require>
<enum name="EGL_GL_COLORSPACE_SCRGB_EXT"/>
</require>
</extension>
<extension name="EGL_EXT_gl_colorspace_scrgb_linear" supported="egl">
<require>
<enum name="EGL_GL_COLORSPACE_SCRGB_LINEAR_EXT"/>
</require>
</extension>
<extension name="EGL_EXT_gl_colorspace_display_p3_linear" supported="egl">
<require>
<enum name="EGL_GL_COLORSPACE_DISPLAY_P3_LINEAR_EXT"/>
</require>
</extension>
<extension name="EGL_EXT_gl_colorspace_display_p3" supported="egl">
<require>
<enum name="EGL_GL_COLORSPACE_DISPLAY_P3_EXT"/>
</require>
</extension>
<extension name="EGL_EXT_image_dma_buf_import" supported="egl">
<require>
<enum name="EGL_LINUX_DMA_BUF_EXT"/>
@@ -1830,6 +2157,23 @@
<enum name="EGL_YUV_CHROMA_SITING_0_5_EXT"/>
</require>
</extension>
<extension name="EGL_EXT_image_dma_buf_import_modifiers" supported="egl">
<require>
<enum name="EGL_DMA_BUF_PLANE3_FD_EXT"/>
<enum name="EGL_DMA_BUF_PLANE3_OFFSET_EXT"/>
<enum name="EGL_DMA_BUF_PLANE3_PITCH_EXT"/>
<enum name="EGL_DMA_BUF_PLANE0_MODIFIER_LO_EXT"/>
<enum name="EGL_DMA_BUF_PLANE0_MODIFIER_HI_EXT"/>
<enum name="EGL_DMA_BUF_PLANE1_MODIFIER_LO_EXT"/>
<enum name="EGL_DMA_BUF_PLANE1_MODIFIER_HI_EXT"/>
<enum name="EGL_DMA_BUF_PLANE2_MODIFIER_LO_EXT"/>
<enum name="EGL_DMA_BUF_PLANE2_MODIFIER_HI_EXT"/>
<enum name="EGL_DMA_BUF_PLANE3_MODIFIER_LO_EXT"/>
<enum name="EGL_DMA_BUF_PLANE3_MODIFIER_HI_EXT"/>
<command name="eglQueryDmaBufFormatsEXT"/>
<command name="eglQueryDmaBufModifiersEXT"/>
</require>
</extension>
<extension name="EGL_EXT_multiview_window" supported="egl">
<require>
<enum name="EGL_MULTIVIEW_VIEW_COUNT_EXT"/>
@@ -1867,6 +2211,13 @@
<enum name="EGL_OPENWF_PORT_ID_EXT"/>
</require>
</extension>
<extension name="EGL_EXT_pixel_format_float" supported="egl">
<require>
<enum name="EGL_COLOR_COMPONENT_TYPE_EXT"/>
<enum name="EGL_COLOR_COMPONENT_TYPE_FIXED_EXT"/>
<enum name="EGL_COLOR_COMPONENT_TYPE_FLOAT_EXT"/>
</require>
</extension>
<extension name="EGL_EXT_platform_base" supported="egl">
<require>
<command name="eglGetPlatformDisplayEXT"/>
@@ -1890,6 +2241,11 @@
<enum name="EGL_PLATFORM_X11_SCREEN_EXT"/>
</require>
</extension>
<extension name="EGL_EXT_protected_content" supported="egl">
<require>
<enum name="EGL_PROTECTED_CONTENT_EXT"/>
</require>
</extension>
<extension name="EGL_EXT_protected_surface" supported="egl">
<require>
<enum name="EGL_PROTECTED_CONTENT_EXT"/>
@@ -1900,6 +2256,21 @@
<command name="eglStreamConsumerOutputEXT"/>
</require>
</extension>
<extension name="EGL_EXT_surface_SMPTE2086_metadata" supported="egl">
<require>
<enum name="EGL_SMPTE2086_DISPLAY_PRIMARY_RX_EXT"/>
<enum name="EGL_SMPTE2086_DISPLAY_PRIMARY_RY_EXT"/>
<enum name="EGL_SMPTE2086_DISPLAY_PRIMARY_GX_EXT"/>
<enum name="EGL_SMPTE2086_DISPLAY_PRIMARY_GY_EXT"/>
<enum name="EGL_SMPTE2086_DISPLAY_PRIMARY_BX_EXT"/>
<enum name="EGL_SMPTE2086_DISPLAY_PRIMARY_BY_EXT"/>
<enum name="EGL_SMPTE2086_WHITE_POINT_X_EXT"/>
<enum name="EGL_SMPTE2086_WHITE_POINT_Y_EXT"/>
<enum name="EGL_SMPTE2086_MAX_LUMINANCE_EXT"/>
<enum name="EGL_SMPTE2086_MIN_LUMINANCE_EXT"/>
<enum name="EGL_METADATA_SCALING_EXT"/>
</require>
</extension>
<extension name="EGL_EXT_swap_buffers_with_damage" supported="egl">
<require>
<command name="eglSwapBuffersWithDamageEXT"/>
@@ -1956,6 +2327,12 @@
<enum name="EGL_CONTEXT_PRIORITY_LOW_IMG"/>
</require>
</extension>
<extension name="EGL_IMG_image_plane_attribs" supported="egl">
<require>
<enum name="EGL_NATIVE_BUFFER_MULTIPLANE_SEPARATE_IMG"/>
<enum name="EGL_NATIVE_BUFFER_PLANE_OFFSET_IMG"/>
</require>
</extension>
<extension name="EGL_KHR_cl_event" supported="egl">
<require>
<enum name="EGL_CL_EVENT_HANDLE_KHR"/>
@@ -1979,6 +2356,13 @@
</require>
</extension>
<extension name="EGL_KHR_client_get_all_proc_addresses" supported="egl" comment="Alias of EGL_KHR_get_all_proc_addresses"/>
<extension name="EGL_KHR_context_flush_control" supported="egl">
<require>
<enum name="EGL_CONTEXT_RELEASE_BEHAVIOR_NONE_KHR"/>
<enum name="EGL_CONTEXT_RELEASE_BEHAVIOR_KHR"/>
<enum name="EGL_CONTEXT_RELEASE_BEHAVIOR_FLUSH_KHR"/>
</require>
</extension>
<extension name="EGL_KHR_create_context" supported="egl">
<require>
<enum name="EGL_CONTEXT_MAJOR_VERSION_KHR"/>
@@ -2024,6 +2408,12 @@
<command name="eglLabelObjectKHR"/>
</require>
</extension>
<extension name="EGL_KHR_display_reference" supported="egl">
<require>
<enum name="EGL_TRACK_REFERENCES_KHR"/>
<command name="eglQueryDisplayAttribKHR"/>
</require>
</extension>
<extension name="EGL_KHR_fence_sync" protect="KHRONOS_SUPPORT_INT64" supported="egl">
<require>
<!-- Most interfaces also defined by EGL_KHR_reusable sync -->
@@ -2153,6 +2543,16 @@
<command name="eglQuerySurface64KHR"/>
</require>
</extension>
<extension name="EGL_KHR_mutable_render_buffer" supported="egl">
<require>
<enum name="EGL_MUTABLE_RENDER_BUFFER_BIT_KHR"/>
</require>
</extension>
<extension name="EGL_KHR_no_config_context" supported="egl">
<require>
<enum name="EGL_NO_CONFIG_KHR"/>
</require>
</extension>
<extension name="EGL_KHR_partial_update" supported="egl">
<require>
<enum name="EGL_BUFFER_AGE_KHR"/>
@@ -2221,6 +2621,19 @@
<command name="eglQueryStreamu64KHR"/>
</require>
</extension>
<extension name="EGL_KHR_stream_attrib" protect="KHRONOS_SUPPORT_INT64" supported="egl">
<require>
<enum name="EGL_CONSUMER_LATENCY_USEC_KHR"/>
<enum name="EGL_STREAM_STATE_KHR"/>
<enum name="EGL_STREAM_STATE_CREATED_KHR"/>
<enum name="EGL_STREAM_STATE_CONNECTING_KHR"/>
<command name="eglCreateStreamAttribKHR"/>
<command name="eglSetStreamAttribKHR"/>
<command name="eglQueryStreamAttribKHR"/>
<command name="eglStreamConsumerAcquireAttribKHR"/>
<command name="eglStreamConsumerReleaseAttribKHR"/>
</require>
</extension>
<extension name="EGL_KHR_stream_consumer_gltexture" protect="EGL_KHR_stream" supported="egl">
<require>
<enum name="EGL_CONSUMER_ACQUIRE_TIMEOUT_USEC_KHR"/>
@@ -2293,6 +2706,11 @@
<enum name="EGL_PLATFORM_GBM_MESA"/>
</require>
</extension>
<extension name="EGL_MESA_platform_surfaceless" supported="egl">
<require>
<enum name="EGL_PLATFORM_SURFACELESS_MESA"/>
</require>
</extension>
<extension name="EGL_NOK_swap_region" supported="egl">
<require>
<command name="eglSwapBuffersRegionNOK"/>
@@ -2362,6 +2780,122 @@
<command name="eglPostSubBufferNV"/>
</require>
</extension>
<extension name="EGL_NV_robustness_video_memory_purge" supported="egl">
<require>
<enum name="EGL_GENERATE_RESET_ON_VIDEO_MEMORY_PURGE_NV"/>
</require>
</extension>
<extension name="EGL_NV_stream_consumer_gltexture_yuv" supported="egl">
<require>
<enum name="EGL_YUV_PLANE0_TEXTURE_UNIT_NV"/>
<enum name="EGL_YUV_PLANE1_TEXTURE_UNIT_NV"/>
<enum name="EGL_YUV_PLANE2_TEXTURE_UNIT_NV"/>
<enum name="EGL_YUV_NUMBER_OF_PLANES_EXT"/>
<enum name="EGL_YUV_BUFFER_EXT"/>
<command name="eglStreamConsumerGLTextureExternalAttribsNV"/>
</require>
</extension>
<extension name="EGL_NV_stream_cross_object" supported="egl">
<require>
<enum name="EGL_STREAM_CROSS_OBJECT_NV"/>
</require>
</extension>
<extension name="EGL_NV_stream_cross_display" supported="egl">
<require>
<enum name="EGL_STREAM_CROSS_DISPLAY_NV"/>
</require>
</extension>
<extension name="EGL_NV_stream_cross_partition" supported="egl">
<require>
<enum name="EGL_STREAM_CROSS_PARTITION_NV"/>
</require>
</extension>
<extension name="EGL_NV_stream_cross_process" supported="egl">
<require>
<enum name="EGL_STREAM_CROSS_PROCESS_NV"/>
</require>
</extension>
<extension name="EGL_NV_stream_cross_system" supported="egl">
<require>
<enum name="EGL_STREAM_CROSS_SYSTEM_NV"/>
</require>
</extension>
<extension name="EGL_NV_stream_fifo_next" supported="egl">
<require>
<enum name="EGL_PENDING_FRAME_NV"/>
<enum name="EGL_STREAM_TIME_PENDING_NV"/>
</require>
</extension>
<extension name="EGL_NV_stream_fifo_synchronous" supported="egl">
<require>
<enum name="EGL_STREAM_FIFO_SYNCHRONOUS_NV"/>
</require>
</extension>
<extension name="EGL_NV_stream_frame_limits" supported="egl">
<require>
<enum name="EGL_PRODUCER_MAX_FRAME_HINT_NV"/>
<enum name="EGL_CONSUMER_MAX_FRAME_HINT_NV"/>
</require>
</extension>
<extension name="EGL_NV_stream_metadata" supported="egl">
<require>
<enum name="EGL_MAX_STREAM_METADATA_BLOCKS_NV"/>
<enum name="EGL_MAX_STREAM_METADATA_BLOCK_SIZE_NV"/>
<enum name="EGL_MAX_STREAM_METADATA_TOTAL_SIZE_NV"/>
<enum name="EGL_PRODUCER_METADATA_NV"/>
<enum name="EGL_CONSUMER_METADATA_NV"/>
<enum name="EGL_PENDING_METADATA_NV"/>
<enum name="EGL_METADATA0_SIZE_NV"/>
<enum name="EGL_METADATA1_SIZE_NV"/>
<enum name="EGL_METADATA2_SIZE_NV"/>
<enum name="EGL_METADATA3_SIZE_NV"/>
<enum name="EGL_METADATA0_TYPE_NV"/>
<enum name="EGL_METADATA1_TYPE_NV"/>
<enum name="EGL_METADATA2_TYPE_NV"/>
<enum name="EGL_METADATA3_TYPE_NV"/>
<command name="eglQueryDisplayAttribNV"/>
<command name="eglSetStreamMetadataNV"/>
<command name="eglQueryStreamMetadataNV"/>
</require>
</extension>
<extension name="EGL_NV_stream_reset" supported="egl">
<require>
<enum name="EGL_SUPPORT_RESET_NV"/>
<enum name="EGL_SUPPORT_REUSE_NV"/>
<command name="eglResetStreamNV"/>
</require>
</extension>
<extension name="EGL_NV_stream_remote" supported="egl">
<require>
<enum name="EGL_STREAM_STATE_INITIALIZING_NV"/>
<enum name="EGL_STREAM_TYPE_NV"/>
<enum name="EGL_STREAM_PROTOCOL_NV"/>
<enum name="EGL_STREAM_ENDPOINT_NV"/>
<enum name="EGL_STREAM_LOCAL_NV"/>
<enum name="EGL_STREAM_PRODUCER_NV"/>
<enum name="EGL_STREAM_CONSUMER_NV"/>
</require>
<require comment="Supported only if EGL_KHR_stream_cross_process_fd is supported">
<enum name="EGL_STREAM_PROTOCOL_FD_NV"/>
</require>
</extension>
<extension name="EGL_NV_stream_socket" supported="egl">
<require>
<enum name="EGL_STREAM_PROTOCOL_SOCKET_NV"/>
<enum name="EGL_SOCKET_HANDLE_NV"/>
<enum name="EGL_SOCKET_TYPE_NV"/>
</require>
</extension>
<extension name="EGL_NV_stream_socket_inet" supported="egl">
<require>
<enum name="EGL_SOCKET_TYPE_INET_NV"/>
</require>
</extension>
<extension name="EGL_NV_stream_socket_unix" supported="egl">
<require>
<enum name="EGL_SOCKET_TYPE_UNIX_NV"/>
</require>
</extension>
<extension name="EGL_NV_stream_sync" supported="egl">
<require>
<enum name="EGL_SYNC_TYPE_KHR"/>
@@ -2408,5 +2942,39 @@
<enum name="EGL_NATIVE_SURFACE_TIZEN"/>
</require>
</extension>
<extension name="EGL_EXT_compositor" supported="egl">
<require>
<enum name="EGL_PRIMARY_COMPOSITOR_CONTEXT_EXT"/>
<enum name="EGL_EXTERNAL_REF_ID_EXT"/>
<enum name="EGL_COMPOSITOR_DROP_NEWEST_FRAME_EXT"/>
<enum name="EGL_COMPOSITOR_KEEP_NEWEST_FRAME_EXT"/>
<command name="eglCompositorSetContextListEXT"/>
<command name="eglCompositorSetContextAttributesEXT"/>
<command name="eglCompositorSetWindowListEXT"/>
<command name="eglCompositorSetWindowAttributesEXT"/>
<command name="eglCompositorBindTexWindowEXT"/>
<command name="eglCompositorSetSizeEXT"/>
<command name="eglCompositorSwapPolicyEXT"/>
</require>
</extension>
<extension name="EGL_EXT_surface_CTA861_3_metadata" supported="egl">
<require>
<enum name="EGL_CTA861_3_MAX_CONTENT_LIGHT_LEVEL_EXT"/>
<enum name="EGL_CTA861_3_MAX_FRAME_AVERAGE_LEVEL_EXT"/>
</require>
</extension>
<extension name="EGL_EXT_image_implicit_sync_control" supported="egl">
<require>
<enum name="EGL_IMPORT_SYNC_TYPE_EXT"/>
<enum name="EGL_IMPORT_IMPLICIT_SYNC_EXT"/>
<enum name="EGL_IMPORT_EXPLICIT_SYNC_EXT"/>
</require>
</extension>
<extension name="EGL_EXT_bind_to_front" supported="egl">
<require>
<enum name="EGL_FRONT_BUFFER_EXT"/>
</require>
</extension>
</extensions>
</registry>

View File

@@ -195,5 +195,9 @@ EGL_FUNCTIONS = (
# EGL_ANDROID_native_fence_sync
_eglFunc("eglDupNativeFenceFDANDROID", "display"),
# EGL_EXT_image_dma_buf_import_modifiers
_eglFunc("eglQueryDmaBufFormatsEXT", "display"),
_eglFunc("eglQueryDmaBufModifiersEXT", "display"),
)

View File

@@ -923,7 +923,7 @@ static void *
_fixupNativeWindow(_EGLDisplay *disp, void *native_window)
{
#ifdef HAVE_X11_PLATFORM
if (disp->Platform == _EGL_PLATFORM_X11 && native_window != NULL) {
if (disp && disp->Platform == _EGL_PLATFORM_X11 && native_window != NULL) {
/* The `native_window` parameter for the X11 platform differs between
* eglCreateWindowSurface() and eglCreatePlatformPixmapSurfaceEXT(). In
* eglCreateWindowSurface(), the type of `native_window` is an Xlib
@@ -985,7 +985,7 @@ _fixupNativePixmap(_EGLDisplay *disp, void *native_pixmap)
* `Pixmap*`. Convert `Pixmap*` to `Pixmap` because that's what
* dri2_x11_create_pixmap_surface() expects.
*/
if (disp->Platform == _EGL_PLATFORM_X11 && native_pixmap != NULL)
if (disp && disp->Platform == _EGL_PLATFORM_X11 && native_pixmap != NULL)
return (void *)(* (Pixmap*) native_pixmap);
#endif
return native_pixmap;

View File

@@ -328,17 +328,6 @@ _eglParseContextAttribList(_EGLContext *ctx, _EGLDisplay *dpy,
break;
}
/* The EGL_KHR_create_context_no_error spec says:
*
* "BAD_MATCH is generated if the EGL_CONTEXT_OPENGL_NO_ERROR_KHR is TRUE at
* the same time as a debug or robustness context is specified."
*/
if (ctx->Flags & EGL_CONTEXT_OPENGL_DEBUG_BIT_KHR ||
ctx->Flags & EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR) {
err = EGL_BAD_MATCH;
break;
}
/* Canonicalize value to EGL_TRUE/EGL_FALSE definitions */
ctx->NoError = !!val;
break;
@@ -489,6 +478,16 @@ _eglParseContextAttribList(_EGLContext *ctx, _EGLDisplay *dpy,
break;
}
/* The EGL_KHR_create_context_no_error spec says:
*
* "BAD_MATCH is generated if the EGL_CONTEXT_OPENGL_NO_ERROR_KHR is TRUE at
* the same time as a debug or robustness context is specified."
*/
if (ctx->NoError && (ctx->Flags & EGL_CONTEXT_OPENGL_DEBUG_BIT_KHR ||
ctx->Flags & EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR)) {
err = EGL_BAD_MATCH;
}
if ((ctx->Flags & ~(EGL_CONTEXT_OPENGL_DEBUG_BIT_KHR
| EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE_BIT_KHR
| EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR)) != 0) {

View File

@@ -47,7 +47,7 @@ struct wl_drm {
char *device_name;
uint32_t flags;
struct wayland_drm_callbacks *callbacks;
struct wayland_drm_callbacks callbacks;
struct wl_buffer_interface buffer_interface;
};
@@ -58,7 +58,7 @@ destroy_buffer(struct wl_resource *resource)
struct wl_drm_buffer *buffer = resource->data;
struct wl_drm *drm = buffer->drm;
drm->callbacks->release_buffer(drm->user_data, buffer);
drm->callbacks.release_buffer(drm->user_data, buffer);
free(buffer);
}
@@ -97,7 +97,7 @@ create_buffer(struct wl_client *client, struct wl_resource *resource,
buffer->offset[2] = offset2;
buffer->stride[2] = stride2;
drm->callbacks->reference_buffer(drm->user_data, name, fd, buffer);
drm->callbacks.reference_buffer(drm->user_data, name, fd, buffer);
if (buffer->driver_buffer == NULL) {
wl_resource_post_error(resource,
WL_DRM_ERROR_INVALID_NAME,
@@ -189,7 +189,7 @@ drm_authenticate(struct wl_client *client,
{
struct wl_drm *drm = resource->data;
if (drm->callbacks->authenticate(drm->user_data, id) < 0)
if (drm->callbacks.authenticate(drm->user_data, id) < 0)
wl_resource_post_error(resource,
WL_DRM_ERROR_AUTHENTICATE_FAIL,
"authenicate failed");
@@ -270,7 +270,7 @@ wayland_drm_init(struct wl_display *display, char *device_name,
drm->display = display;
drm->device_name = strdup(device_name);
drm->callbacks = callbacks;
drm->callbacks = *callbacks;
drm->user_data = user_data;
drm->flags = flags;

View File

@@ -650,7 +650,13 @@ lp_build_fetch_rgba_soa(struct gallivm_state *gallivm,
for (i = 0; i < format_desc->nr_channels; i++) {
struct util_format_channel_description chan_desc = format_desc->channel[i];
unsigned blockbits = type.width;
unsigned vec_nr = chan_desc.shift / type.width;
unsigned vec_nr;
#ifdef PIPE_ARCH_BIG_ENDIAN
vec_nr = (format_desc->block.bits - (chan_desc.shift + chan_desc.size)) / type.width;
#else
vec_nr = chan_desc.shift / type.width;
#endif
chan_desc.shift %= type.width;
output[i] = lp_build_extract_soa_chan(&bld,

View File

@@ -234,13 +234,39 @@ lp_build_gather_elem_vec(struct gallivm_state *gallivm,
*/
res = LLVMBuildZExt(gallivm->builder, res, dst_elem_type, "");
if (vector_justify) {
#ifdef PIPE_ARCH_BIG_ENDIAN
if (vector_justify) {
res = LLVMBuildShl(gallivm->builder, res,
LLVMConstInt(dst_elem_type,
dst_type.width - src_width, 0), "");
#endif
}
if (src_width == 48) {
/* Load 3x16 bit vector.
* The sequence of loads on big-endian hardware proceeds as follows.
* 16-bit fields are denoted by X, Y, Z, and 0. In memory, the sequence
* of three fields appears in the order X, Y, Z.
*
* Load 32-bit word: 0.0.X.Y
* Load 16-bit halfword: 0.0.0.Z
* Rotate left: 0.X.Y.0
* Bitwise OR: 0.X.Y.Z
*
* The order in which we need the fields in the result is 0.Z.Y.X,
* the same as on little-endian; permute 16-bit fields accordingly
* within 64-bit register:
*/
LLVMValueRef shuffles[4] = {
lp_build_const_int32(gallivm, 2),
lp_build_const_int32(gallivm, 1),
lp_build_const_int32(gallivm, 0),
lp_build_const_int32(gallivm, 3),
};
res = LLVMBuildBitCast(gallivm->builder, res,
lp_build_vec_type(gallivm, lp_type_uint_vec(16, 4*16)), "");
res = LLVMBuildShuffleVector(gallivm->builder, res, res, LLVMConstVector(shuffles, 4), "");
res = LLVMBuildBitCast(gallivm->builder, res, dst_elem_type, "");
}
#endif
}
}
return res;

View File

@@ -606,7 +606,7 @@ gallivm_compile_module(struct gallivm_state *gallivm)
LLVMWriteBitcodeToFile(gallivm->module, filename);
debug_printf("%s written\n", filename);
debug_printf("Invoke as \"llc %s%s -o - %s\"\n",
(HAVE_LLVM >= 0x0305) ? "[-mcpu=<-mcpu option] " : "",
(HAVE_LLVM >= 0x0305) ? "[-mcpu=<-mcpu option>] " : "",
"[-mattr=<-mattr option(s)>]",
filename);
}

View File

@@ -49,6 +49,9 @@
#endif
#include <llvm-c/Core.h>
#if HAVE_LLVM >= 0x0306
#include <llvm-c/Support.h>
#endif
#include <llvm-c/ExecutionEngine.h>
#include <llvm/Target/TargetOptions.h>
#include <llvm/ExecutionEngine/ExecutionEngine.h>
@@ -122,6 +125,26 @@ static void init_native_targets()
llvm::InitializeNativeTargetAsmPrinter();
llvm::InitializeNativeTargetDisassembler();
#if DEBUG && HAVE_LLVM >= 0x0306
{
char *env_llc_options = getenv("GALLIVM_LLC_OPTIONS");
if (env_llc_options) {
char *option;
char *options[64] = {(char *) "llc"}; // Warning without cast
int n;
for (n = 0, option = strtok(env_llc_options, " "); option; n++, option = strtok(NULL, " ")) {
options[n + 1] = option;
}
if (gallivm_debug & (GALLIVM_DEBUG_IR | GALLIVM_DEBUG_ASM | GALLIVM_DEBUG_DUMP_BC)) {
debug_printf("llc additional options (%d):\n", n);
for (int i = 1; i <= n; i++)
debug_printf("\t%s\n", options[i]);
debug_printf("\n");
}
LLVMParseCommandLineOptions(n + 1, options, NULL);
}
}
#endif
}
extern "C" void
@@ -607,23 +630,46 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
#if defined(PIPE_ARCH_PPC)
MAttrs.push_back(util_cpu_caps.has_altivec ? "+altivec" : "-altivec");
#if (HAVE_LLVM >= 0x0304)
#if (HAVE_LLVM <= 0x0307) || (HAVE_LLVM == 0x0308 && MESA_LLVM_VERSION_PATCH == 0)
#if (HAVE_LLVM < 0x0400)
/*
* Make sure VSX instructions are disabled
* See LLVM bug https://llvm.org/bugs/show_bug.cgi?id=25503#c7
* See LLVM bugs:
* https://llvm.org/bugs/show_bug.cgi?id=25503#c7 (fixed in 3.8.1)
* https://llvm.org/bugs/show_bug.cgi?id=26775 (fixed in 3.8.1)
* https://llvm.org/bugs/show_bug.cgi?id=33531 (fixed in 4.0)
* https://llvm.org/bugs/show_bug.cgi?id=34647 (llc performance on certain unusual shader IR; intro'd in 4.0, pending as of 5.0)
*/
if (util_cpu_caps.has_altivec) {
MAttrs.push_back("-vsx");
}
#else
/*
* However, bug 25503 is fixed, by the same fix that fixed
* bug 26775, in versions of LLVM later than 3.8 (starting with 3.8.1):
* Make sure VSX instructions are ENABLED
* See LLVM bug https://llvm.org/bugs/show_bug.cgi?id=26775
* Bug 25503 is fixed, by the same fix that fixed
* bug 26775, in versions of LLVM later than 3.8 (starting with 3.8.1).
* BZ 33531 actually comprises more than one bug, all of
* which are fixed in LLVM 4.0.
*
* With LLVM 4.0 or higher:
* Make sure VSX instructions are ENABLED, unless
* a) the entire -mattr option is overridden via GALLIVM_MATTRS, or
* b) VSX instructions are explicitly enabled/disabled via GALLIVM_VSX=1 or 0.
*/
if (util_cpu_caps.has_altivec) {
MAttrs.push_back("+vsx");
char *env_mattrs = getenv("GALLIVM_MATTRS");
if (env_mattrs) {
MAttrs.push_back(env_mattrs);
}
else {
boolean enable_vsx = true;
char *env_vsx = getenv("GALLIVM_VSX");
if (env_vsx && env_vsx[0] == '0') {
enable_vsx = false;
}
if (enable_vsx)
MAttrs.push_back("+vsx");
else
MAttrs.push_back("-vsx");
}
}
#endif
#endif

View File

@@ -69,10 +69,17 @@ os_time_get_nano(void)
static LARGE_INTEGER frequency;
LARGE_INTEGER counter;
int64_t secs, nanosecs;
if(!frequency.QuadPart)
QueryPerformanceFrequency(&frequency);
QueryPerformanceCounter(&counter);
return counter.QuadPart*INT64_C(1000000000)/frequency.QuadPart;
/* Compute seconds and nanoseconds parts separately to
* reduce severity of precision loss.
*/
secs = counter.QuadPart / frequency.QuadPart;
nanosecs = (counter.QuadPart % frequency.QuadPart) * INT64_C(1000000000)
/ frequency.QuadPart;
return secs*INT64_C(1000000000) + nanosecs;
#else

View File

@@ -132,16 +132,32 @@ check_os_altivec_support(void)
if (setjmp(__lv_powerpc_jmpbuf)) {
signal(SIGILL, SIG_DFL);
} else {
__lv_powerpc_canjump = 1;
boolean enable_altivec = TRUE; /* Default: enable if available, and if not overridden */
#ifdef DEBUG
/* Disabling Altivec code generation is not the same as disabling VSX code generation,
* which can be done simply by passing -mattr=-vsx to the LLVM compiler; cf.
* lp_build_create_jit_compiler_for_module().
* If you want to disable Altivec code generation, the best place to do it is here.
*/
char *env_control = getenv("GALLIVM_ALTIVEC"); /* 1=enable (default); 0=disable */
if (env_control && env_control[0] == '0') {
enable_altivec = FALSE;
}
#endif
if (enable_altivec) {
__lv_powerpc_canjump = 1;
__asm __volatile
("mtspr 256, %0\n\t"
"vand %%v0, %%v0, %%v0"
:
: "r" (-1));
__asm __volatile
("mtspr 256, %0\n\t"
"vand %%v0, %%v0, %%v0"
:
: "r" (-1));
signal(SIGILL, SIG_DFL);
util_cpu_caps.has_altivec = 1;
signal(SIGILL, SIG_DFL);
util_cpu_caps.has_altivec = 1;
} else {
util_cpu_caps.has_altivec = 0;
}
}
#endif /* !PIPE_OS_APPLE */
}

View File

@@ -106,7 +106,7 @@ pack_rgba(enum pipe_format format, const float *rgba)
union util_color uc;
util_pack_color(rgba, format, &uc);
if (util_format_get_blocksize(format) == 2)
return uc.ui[0] << 16 | uc.ui[0];
return uc.ui[0] << 16 | (uc.ui[0] & 0xffff);
else
return uc.ui[0];
}
@@ -396,7 +396,7 @@ etna_try_rs_blit(struct pipe_context *pctx,
}
unsigned src_format = etna_compatible_rs_format(blit_info->src.format);
unsigned dst_format = etna_compatible_rs_format(blit_info->src.format);
unsigned dst_format = etna_compatible_rs_format(blit_info->dst.format);
if (translate_rs_format(src_format) == ETNA_NO_MATCH ||
translate_rs_format(dst_format) == ETNA_NO_MATCH ||
blit_info->scissor_enable || blit_info->src.box.x != 0 ||

View File

@@ -1048,6 +1048,9 @@ t7 opcode: CP_WAIT_FOR_IDLE (26) (1 dwords)
OUT_RING(ring, 0x00000000);
OUT_RING(ring, 0x00000000);
OUT_RING(ring, 0x00000000);
OUT_PKT4(ring, REG_A5XX_RB_CLEAR_CNTL, 1);
OUT_RING(ring, 0x00000000);
}
static void

Some files were not shown because too many files have changed in this diff Show More