This helps avoid compiler warningss in the next commit - everything
was initialized, but it wasn't obvious to static analysis.
Suggested-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit d6d16c0218)
If an array is accessed within an if block, then currently it is not known
whether the value in the address register is involved in the evaluation of the
if condition, and converting the if condition may actually result in
out-of-bounds array access. Consequently, if blocks that contain indirect array
access should not be converted.
Fixes piglits on r600/BARTS:
spec/glsl-1.10/execution/variable-indexing/
vs-output-array-float-index-wr
vs-output-array-vec3-index-wr
vs-output-array-vec4-index-wr
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104143
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 6c268ea79a)
This fixes hangs on cayman with
tests/spec/arb_tessellation_shader/execution/trivial-tess-gs_no-gs-inputs.shader_test
This has a single if/else in it, and when this peephole activated,
it would set the jump target to NULL if there was no instruction
after the final POP. This adds a NOP if we get a jump in this case,
and seems to fix the hangs, so we have a valid target for the ELSE
instruction to go to, instead of 0 (which causes infinite loops).
v2: update last_cf correctly. (I had some other patches hide this)
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 579ec9c311)
When floating point textures are created on OpenGL ES 2.0, driver
is free to choose used internal format. Mesa makes this decision in
adjust_for_oes_float_texture. Error checking for glTexImage2D properly
checks that sized formats are not used. We use same error checking
path for glTexSubImage2D (since there is lot of overlap), however since
those checks include internalFormat checks, we need to pass original
internalFormat passed by the client. Patch adds oes_float_internal_format
that does reverse adjust_for_oes_float_texture to get that format.
Fixes following test failure:
ES2-CTS.gtf.GL2ExtensionTests.texture_float.texture_float
(when running test with MESA_GLES_VERSION_OVERRIDE=2.0)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103227
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 1e508e10d9)
So far on pre-cayman chipsets the CF instructions CF_OP_LOOP_END,
CF_OP_CALL_FS, CF_OP_POP, and CF_OP_GDS an extra CF_NOP instruction
was added to add the EOP flag, even though this is not actually
needed, because all these instrutions support the EOP flag.
This patch removes the fixup code, adds setting the EOP flag for the
according instructions as well as others like CF_OP_TEX and CF_OP_VTX,
and adds writing out EOP for this type of instruction in the disassembler.
This also fixes a bug where shaders were created that didn't actually have
the EOP flag set in the last CF instruction, which might have resulted
in GPU lockups.
[airlied: cleaned up a little]
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1d076aafbc)
There are two issues with the current implementation. First, it relies
on the layout(local_size_*) happening in the same shader as the main
function, and secondly it doesn't work for variable group sizes.
In both cases, the simplest fix is to move the setup of these derived
values to a later time, similar to how the gl_VertexID workarounds are
done. There already exist system values defined for both of the derived
values, so we use them unconditionally, and lower them after linking is
performed.
While we're at it, we move to using gl_LocalGroupSizeARB instead of
gl_WorkGroupSize for variable group sizes.
Also the dead code elimination avoidance can be removed, since there
can be situations where gl_LocalGroupSizeARB is needed but has not been
inserted for the shader with main function. As a result, the lowering
code has to insert its own copies of the system values if needed.
Reported-by: Stephane Chevigny <stephane.chevigny@polymtl.ca>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103393
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 4d24a7cb97)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/compiler/glsl/meson.build
generate_array_index fails to check whether the target of a subroutine
call exists in the AST, potentially passing around null ir_rvalue
pointers eventuating in abort/segfault.
Fixes: fd01840c0b ("glsl: add AoA support to subroutines")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100438
(cherry picked from commit f09c2cefdd)
We want to emit invariant state at the start of a render batch. In the
past, this more or less happened: a new batch flagged BRW_NEW_CONTEXT
(because we don't have hardware contexts), which triggered the
brw_invariant_state atom. So, it would be emitted before any 3D
drawing. (Technically, there might be some BLT commands in the batch
because Gen4-5 have a single combined render/BLT ring, but that should
be harmless).
With the advent of BLORP, this broke. The first item in a batch might
be a BLORP operation, which bypasses the normal draw upload path. So,
we need to ensure invariant state happens first. To do that, we just
upload it when creating a new batch. On Gen6+ we'd need to worry about
whether it's a RENDER or BLT batch, but because we have a combined ring,
this approach should work fine on Gen4-5.
Seems to fix GPU hangs when playing hardware accelerated video with
mpv -hwdec=vaapi on Ironlake.
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103529
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 8f91aa35a5)
...and provide a better citation for the existing one.
v2:
- Apply the workaround to Gen8 too, as intended (caught by Topi).
- Restructure to add bits instead of an extra flush (based on a similar
patch by Rafael Antognolli).
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
(cherry picked from commit 8d48671492)
[Andres Gomez: brw->gen not yet dropped in favor of devinfo->gen]
Signed-off-by: Andres Gomez <agomez@igalia.com>
Conflicts:
src/mesa/drivers/dri/i965/brw_pipe_control.c
Squashed with:
i965: Revert Gen8 aspect of VF PIPE_CONTROL workaround.
This apparently causes hangs on Broadwell, so let's back it out for now.
I think there are other PIPE_CONTROL workarounds that we're missing.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103787
(cherry picked from commit a01ba366e0)
[Andres Gomez: brw->gen not yet dropped in favor of devinfo->gen]
Signed-off-by: Andres Gomez <agomez@igalia.com>
Conflicts:
src/mesa/drivers/dri/i965/brw_pipe_control.c
Fixes the following tests on CHV, BXT, and GLK:
KHR-GL46.shader_ballot_tests.ShaderBallotFunctionBallot
dEQP-VK.spirv_assembly.instruction.compute.uconvert.uint32_to_int64
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103115
(cherry picked from commit cfcfa0b9cd)
The MOV instruction can extract bytes to words/double words, and
words/double words to quadwords, but not byte to quadwords.
For unsigned byte to quadword, we can read them as words and AND off the
high byte and extract to quadword in one instruction. For signed bytes,
we need to first sign extend to word and the sign extend that word to a
quadword.
Fixes the following test on CHV, BXT, and GLK:
KHR-GL46.shader_ballot_tests.ShaderBallotBitmasks
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103628
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 6ac2d16901)
Number of dwords in MI_FLUSH_DW changed from 4 to 5 in gen8+.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1dc45d75bb)
[Andres Gomez: brw->gen not yet dropped in favor of devinfo->gen]
Signed-off-by: Andres Gomez <agomez@igalia.com>
Conflicts:
src/mesa/drivers/dri/i965/brw_pipe_control.c
src/mesa/drivers/dri/i965/intel_blit.c
Otherwise we leak it.
Fixes: eaa56eab6d "radv: initial support for shared semaphores (v2)"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 7c25578863)
As per radeonsi, the tess factor components for isolines
are reversed.
Fixes: tests/spec/arb_tessellation_shader/execution/isoline.shader_test
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit f3f8615d76)
We renamed "Function Enable" to "Enable", which broke our detection
of whether shaders are enabled or not. So, we'd see a bunch of HS/DS
packets with program offsets of 0, and think that was a valid TCS/TES.
Fixes: c032cae9ff (genxml: Rename "Function Enable" to "Enable".)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 9a0465b3a3)
The L3 configuration code already considers the TCS and TES programs,
but failed to listen for TCS/TES program changes.
This was somehow missing.
Fixes: e9644cb1f9 ("i965: Consider tessellation in get_pipeline_state_l3_weights.")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit b8d42cccd0)
These flags are set for C sources, but not C++. This causes symbol
visibility leaks from the C++ parts of the Intel compiler.
Fixes: 700bebb958 ("i965: Move the back-end compiler to src/intel/compiler")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 854455498c)