The Makefile.am doesn't work. I tried fixing it but gave up because
I don't understand Autotools. I strongly suspect the Android.mk also
doesn't work.
Rather than maintain the broken build files, let's delete them and
re-add working build files if-and-when we need them. (Maybe we'll be
lucky and turnip will never need to support Autotools!)
Meson files have been updated; Autotools and Android build files still
need updating.
Only build tested.
v2 (chadv):
- Rebase onto master.
- Fix build breakage in Python scripts.
- Drop the WSI code. The internal WSI apis have changed recently, and
will likely change again before the driver goes upstream. To avoid
unnecessary rebase work, let's drop the WSI code and re-add it when
we're ready to really use WSI.
(olv, after rebase) do not enable freedreno by default on ARM
Which also requires uadd_carry lowering
Until recently this was lowered in glsl ir so it went unnoticed that we
weren't lowering it.
Fixes: 1d8994a63b glsl: [u/i]mulExtended optimization for GLSL
Signed-off-by: Rob Clark <robdclark@gmail.com>
Not a perfect solution, and the "pressure" target is hard-coded. But it
doesn't really seem to hurt much in the common case, and avoids exploding
register usage in dEQP ssbo tests.
So this should serve as a stop-gap solution until I have time to
rewrite the scheduler.
Slightly hurts instruction count, but slightly reduces register usage
in shader-db. Fixes ~150 dEQP-GLES31.functional.ssbo.* tests that were
failing due to RA failure.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Array index should come before sample-id. Also exclude all isam variants
(which take integer texel coords) from having the offset added.
Fixes dEQP-GLES31.functional.texture.multisample.samples_1.use_texture_*_2d_array
Signed-off-by: Rob Clark <robdclark@gmail.com>
We also need to put in the output mov. Possibly we could just fix up the
output register to read it directly from the dummy, but that is more
work and I guess dEQP is probably the only time you encounter this.
Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.atomic_counter.const_literal_fragment
Signed-off-by: Rob Clark <robdclark@gmail.com>
We weren't propagating the array info for cases where the result of an
atomic is an array/reg. This can happen, for example, if the result is
part of a phi web lowered to regs.
Fixes dEQP-GLES31.functional.ssbo.atomic.compswap.*
Signed-off-by: Rob Clark <robdclark@gmail.com>
Use the (nopN) encoding for slightly denser shaders. This lets us fold
nop instructions into the previous alu instruction in certain cases.
Shouldn't change the # of cycles a shader takes to execute, but reduces
the size. (ex: glmark2 refract goes from 168 to 116 instructions)
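A minimal sketch of the folding idea, not the actual ir3 code. The
instruction model, the ALU_OPS set and the MAX_FOLDED_NOPS limit are all
assumptions made up for illustration:

```python
# Hypothetical instruction model: plain opcode strings in, dicts out.
ALU_OPS = {"add", "mul", "mad"}   # assumption: which ops can carry (nopN)
MAX_FOLDED_NOPS = 3               # assumption: illustrative limit only


def fold_nops(instrs):
    """Fold nops into the nop count of the preceding ALU instruction."""
    out = []
    for op in instrs:
        prev = out[-1] if out else None
        can_fold = (
            op == "nop"
            and prev is not None
            and prev["alu"]
            and prev["nop"] < MAX_FOLDED_NOPS
        )
        if can_fold:
            prev["nop"] += 1      # becomes e.g. "(nop2) add ..."
        else:
            out.append({"op": op, "alu": op in ALU_OPS, "nop": 0})
    return out


def cycles(folded):
    """Issue slots used: one per instruction plus its folded nops."""
    return len(folded) + sum(i["nop"] for i in folded)
```

In this model, folding ["add", "nop", "nop", "mul", "nop", "ldg", "nop"]
leaves four instructions while cycles() still reports seven, which is the
"smaller but not faster" effect described above.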
Currently only enabled for a6xx, but I think we could enable this for
a5xx and possibly a4xx.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Fixes nearly all of dEQP-GLES31.functional.texture.border_clamp.* when
run after a test that binds textures used in a vertex shader.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler.const_literal.vertex.samplercubeshadow
and a few other similar tests that do multiple texture fetches into
individual components of a packet output. Mostly works around the
issue mentioned in ra_block_find_definers().
Signed-off-by: Rob Clark <robdclark@gmail.com>
Turns out we can write to tiled images as well as read. This avoids
having to linearize or do the tiling in the shader.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Use the 'UNK31' bit (which should probably be called 'BUFFER') for the
samplerBuffer case, which increases the size of supported buffer
textures beyond 2^15 elements.
Also need to fix the 2nd coord injected to handle the tex instructions
that take integer coords.
Fixes dEQP-GLES31.functional.texture.texture_buffer.render.as_fragment_texture.buffer_size_131071
and similar
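For scale, a quick arithmetic check of the sizes involved. The constant
names are made up here; the numbers come from the 2^15 figure above and
from the dEQP test name:

```python
OLD_LIMIT = 2 ** 15     # 32768 elements, the figure quoted above
TEST_SIZE = 131071      # from the test name buffer_size_131071


def exceeds_old_limit(num_elements):
    """True if a buffer texture of this size is past the 2^15 figure."""
    return num_elements > OLD_LIMIT
```

Note that 131071 is 2^17 - 1, so the test buffer is roughly four times
the 2^15 figure.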
Signed-off-by: Rob Clark <robdclark@gmail.com>
... instead of isam. It seems like when using isam, plus atomics, we
can have the problem of old data being in the texture cache. Plus this
way we don't have to load a component at a time.
Note that blob still seems to use isam in some cases. I suppose it might
be preferable in the case of loading a single component, when atomics
are not in the picture (or that the ssbo does not need to otherwise be
coherent).
Signed-off-by: Rob Clark <robdclark@gmail.com>
Resync disasm and instr header from envytools, and add ldib encoding.
This replaces an opcode from a3xx which was never seen in practice,
since that seemed easier than dealing with the same opc # meaning a
different thing on a6xx. (Not really sure if 'sti' was actually a
real thing, I think it was only seen in fuzzing.)
Signed-off-by: Rob Clark <robdclark@gmail.com>
Fixes
dEQP-GLES3.functional.shaders.indexing.varying_array.vec3_dynamic_write_dynamic_loop_read
regression.
Fixes: c1a27ba9ba freedreno/ir3: HIGH reg w/a for a6xx
Signed-off-by: Rob Clark <robdclark@gmail.com>
The variant will be NULL if RA failed. Which isn't ideal, but at least
let's not segfault and bring down the rest of the dEQP run with us.
Signed-off-by: Rob Clark <robdclark@gmail.com>
The wrmask is handled in regmask_get()/regmask_set(), but it wasn't
being propagated from SSA src to dst. So for example, an SSBO read
value that is passed in as the src2.y component to an atomic op wasn't
getting the (sy) flag set, causing lots of failures.
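A toy model of why a dropped wrmask hides the dependency. This is one
reading of the bug, not the real regmask code: the RegMask class and its
flat register numbering are assumptions for illustration only.

```python
class RegMask:
    """Toy register mask: a set of scalar register numbers."""

    def __init__(self):
        self.regs = set()

    def set(self, num, wrmask=0x1):
        # Mark every component selected by wrmask, starting at num.
        n = num
        while wrmask:
            if wrmask & 1:
                self.regs.add(n)
            wrmask >>= 1
            n += 1

    def get(self, num, wrmask=0x1):
        # True if any component selected by wrmask is marked.
        n = num
        while wrmask:
            if (wrmask & 1) and n in self.regs:
                return True
            wrmask >>= 1
            n += 1
        return False
```

If a four-component load into register 0 is recorded with the default
wrmask of 0x1, a later read of component .y (register 1) is not seen as
pending; recording it with 0xf makes get(1) return True, which is the
case where (sy) would be needed.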
Signed-off-by: Rob Clark <robdclark@gmail.com>
The new encoding returns a value via the 2nd src. The legalize pass
needs to be aware of this to set the correct needs_sy flag. Otherwise,
in cases where the atomic dst is not used, we can overwrite the register
that the hardware will asynchronously load the result into without a
(sy) flag, so the new value gets clobbered by the atomic result.
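A rough sketch of the tracking described above, under stated assumptions:
instructions are plain dicts, register names are strings, and the
atomic_writes_src2 switch is a made-up knob to show the before/after
behavior. It is not the real legalize pass.

```python
def legalize(instrs, atomic_writes_src2=True):
    """Return indices of instructions that would need the (sy) flag."""
    needs_sy = set()    # registers with an asynchronous result pending
    flagged = []
    for i, ins in enumerate(instrs):
        srcs = list(ins.get("srcs", ()))
        dsts = list(ins.get("dsts", ()))
        # Reading or overwriting a pending register requires a sync.
        if (set(srcs) | set(dsts)) & needs_sy:
            flagged.append(i)
            needs_sy.clear()   # assumption: (sy) waits for everything
        if ins.get("atomic"):
            needs_sy.update(dsts)
            if atomic_writes_src2 and len(srcs) > 1:
                needs_sy.add(srcs[1])   # result lands in the 2nd src
    return flagged
```

With an atomic whose srcs are ["r0.x", "r2.x"] followed by a mov that
writes r2.x, the mov is flagged only when atomic_writes_src2 is True.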
This fixes a whole lot of rando ssbo+atomic fails, like
dEQP-GLES31.functional.ssbo.layout.single_basic_type.packed.highp_vec4.
Signed-off-by: Rob Clark <robdclark@gmail.com>
It seems like some instructions (noticed this w/ cat3) cannot read HIGH
regs. cat1 (mov/cov) can, and possibly some/all of cat2.
The blob seems to stick w/ an extra mov into low regs. So let's do the
same.
This fixes WGID on a6xx, which unsurprisingly is related to a lot of
dEQP compute fails.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Images and SSBOs don't map directly to the hw. They end up being part
texture and part something else. Starting with a6xx, the hack used for
a5xx to smash the image tex state into hw texture state starting from
MAX counting down won't work, because we start using tex state also for
SSBO read.
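For reference, the a5xx-style "count down from MAX" layout can be
pictured roughly like this. The slot count and the choice of MAX - 1 as
the first image slot are assumptions for illustration, not hardware
values:

```python
MAX_TEX_SLOTS = 16   # assumption: illustrative size only


def a5xx_style_slots(num_textures, num_images, max_slots=MAX_TEX_SLOTS):
    """Textures count up from 0, image tex state counts down from the top."""
    tex = list(range(num_textures))
    img = [max_slots - 1 - i for i in range(num_images)]
    return tex, img
```

With two users of the slot space, each can grow from its own end. The
sketch only shows that layout; it does not model how the a6xx code
assigns slots once SSBO reads also need tex state.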
Signed-off-by: Rob Clark <robdclark@gmail.com>
Note that image/ssbo support is currently only implemented for a5xx.
But the instruction encoding is the same for a4xx.
Signed-off-by: Rob Clark <robdclark@gmail.com>