During anv_physical_device_init(), we opend the DRM device to do some
queries, then promptly closed it. Now we keep it open for the lifetime
of the anv_physical_device so that we can query it some more during
vkGetPhysicalDevice*Properties() [which will happen in follow-up
commits].
Because in a follow-up patch I need to do some non-trival teardown on
anv_physical_device. Currently, however, anv_physical_device_finish() is
currently a no-op that's just called in the right place.
Also, rename function fill_physical_device -> anv_physical_device_init
for symmetry.
Don't assume that $(top_srcdir)/.git is a directory. It may be a
gitlink file [1] if $(top_srcdir) is a submodule checkout or a linked
worktree [2].
[1] A "gitlink" is a text file that specifies the real location of
the gitdir.
[2] Linked worktrees are a new feature in Git 2.5.
Cc: "10.6, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 75784243df)
The Vulkan spec says that pPhysicalDeviceCount is an out parameter if
pPhysicalDevices is NULL; otherwise it's an inout parameter.
Mesa incorrectly treated it unconditionally as an inout parameter, which
could have lead to reading unitialized data.
Don't enumerate devices in vkCreateInstance(). That's where global,
device-independent initialization should happen. Move device enumeration
to the more logical location, vkEnumeratePhysicalDevices().
Function fill_physical_device() has a 'path' parameter, and struct
anv_physical_device has a 'path' member. Sometimes these are used;
sometimes hardcoded "/dev/dri/renderD128" is used instead.
Be consistent. Hardcode "/dev/dri/renderD128" in exactly one location,
during initialization of the physical device.
Before values are pushed or annotated with a name, decoration, etc.,
they need to have an invalid type, NULL name, NULL decoration, etc.
ralloc zero's everything by accident, so this wasn't an issue in
practice, but we should be explicitly zero'ing it.
Unfortunately, this requires some non-trivial changes to the driver. Now
that the primitive restart index isn't given explicitly by the client, we
always use ~0 for everything like D3D does. Unfortunately, our hardware is
awesome and a 32-bit version of ~0 doesn't match any 16-bit values. This
means, we have to set it to either UINT16_MAX or UINT32_MAX depending on
the size of the index type. Since we get the index type from
CmdBindIndexBuffer and the rest of the VF packet from the pipeline, we need
to lazy-emit the VF packet.
This was missing and was causing the driver to not work with
execlists. Presumably we get a different initial hw context with
execlists enabled, that has sample mask 0 initially.
Set this to 0xffff for now. When we add MS support, we need to take the
value from VkPipelineMsStateCreateInfo::sampleMask.
The new headers use stdbool for enable/disable fields which
implicitly converts expressions like (flags & 8) to 0 or 1.
Also handles MBO (must-be-one) fields by setting them to one,
corrects a bspec typo (_3DPRIM_LISTSTRIP_ADJ -> LINESTRIP) and
makes a few enum values less clashy.
Convert the table from the direct mapping
VkImageViewType -> SurfaceType
into a mapping to an info struct
VkImageViewType -> struct anv_image_view_info
This accounts for a number differences between the generated headers and
the hand-written header. Not all reformatting is done in this commit but
it does make the headers much more diffable.
In theory, no functional change.
Structs in the old version were specified as
typedef struct VkSomeThing_
{
type field; // comment
} VkSomeThing;
However, in the generated headers, you have
typedef struct {
type field;
} VkSomeThing;
This commit also removes some unneeded whitespaces.
We should be asserting that the parent decoration didn't hand us
a member if the child decoration did, but different child decorations
may obviously have different members.
Since SSA values now have their own types, it's more convenient to make
'type' only used when we want to look up an actual SPIR-V type, since
we're going to change its type soon to support various decorations that
are handled at the SPIR-V -> NIR level.
Previously, the caller of vtn_handle_type had to handle actually inserting
the type. However, this didn't really work if the type was decorated in
any way.
The way the code currently works is that anv_format::cpp is the cpp of
anv_format::surface_format.
Me and Kristian disagree about how the code *should* work. Despite that,
I think it's in our discussion's best interest to document how the code
*currently* works. That should eliminate confusion.
If and when the code begins to work differently, then we'll update the
anv_format comments.
This prepares for upcoming miptree support.
anv_surface is a proxy for color surfaces, depth surfaces, and stencil
surfaces. Embed two instances of anv_surface into anv_image: the
primary surface (color or depth), and an optional stencil surface.
All function signatures that matched this pattern,
old: f(const VkImageCreateInfo *, const struct anv_image_create_info *)
were rewritten as
new: f(const struct anv_image_create_info *)
The format table defined cpp = 0 for stencil-only formats. The real cpp
is 1.
When code begins to lie, especially about stencil buffers, code becomes
increasingly fragile as time progresses, and the damage becomes
increasingly hard to undo. (For precedent, see the painful history of
stencil buffer cpp in the git log for gen6 and gen7 in the i965 driver).
Let's undo the stencil buffer cpp lie now to avoid future pain.
In the format table, set cpp = 1 for VK_FORMAT_S8; replace checks for
cpp == 0; and delete all comments about the hack.
From my experience with intel_mipmap_tree.c, I learned that for struct's
like anv_image and intel_mipmap_tree, which have sprawling
multi-function construction codepaths, it's easy to mistakenly use
unitialized struct members during construction.
Let's eliminate the risk of using unitialized anv_image members during
construction. Fill the struct at the function bottom instead of
piecemeal throughout the constructor.
anv_format::surface_format was incorrect for Vulkan depth formats.
For example, the format table mapped
VK_FORMAT_D24_UNORM -> .surface_format = D24_UNORM_X8_UINT
VK_FORMAT_D32_FLOAT -> .surface_format = D32_FLOAT
but should have mapped
VK_FORMAT_D24_UNORM -> .surface_format = R24_UNORM_X8_TYPELESS
VK_FORMAT_D32_FLOAT -> .surface_format = R32_FLOAT
The Crucible test func.depthstencil.basic passed despite the bug, but
only because it did not attempt to texture from the depth surface.
The core problem is that RENDER_SURFACE_STATE.SurfaceFormat and
3DSTATE_DEPTH_BUFFER.SurfaceFormat are distinct types. Considering them
as enum spaces, the two enum spaces have incompatible collisions.
Fix this by adding a new field 'depth_format' to struct anv_format.
Refer to brw_surface_formats.c:translate_tex_format() for precedent.
I misinterpreted anv_format::format as a VkFormat. Instead, it is
a hardware surface format (RENDER_SURFACE_STATE.SurfaceFormat). Rename
the field to 'surface_format' to make it unambiguous.
The brw_create_nir function takes a GLSL or ARB shader and turns it into a
NIR shader. The guts of the optimization and lowering code is now split
into a new brw_process_shader function.
As of this commit, nothing actually needs the brw_context.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
This way we can stop doing is_gles3 checks inside of the compiler.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Previously, these were pulled out of the GL context conditionally based on
whether we were running ff/ARB or a GLSL program. Now, we just pass them
in so that the visitor doesn't have to grab them itself.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Previously, each shader took 3 shader time indices which were potentially
at arbirary points in the shader time buffer. Now, each shader gets a
single index which refers to 3 consecutive locations in the buffer. This
simplifies some of the logic at the cost of having a magic 3 a few places.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
This creates the options at screen cration time and then we just copy them
into the context at context creation time. We also move is_scalar to the
brw_compiler structure.
We also end up manually setting some values that the core would have set by
default for us. Fortunately, there are only two non-zero shader compiler
option defaults that we aren't overriding anyway so this isn't a big deal.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
While we're at it, we'll drop the note about 10-20% performance loss.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
v2 (Ken): Make shader_debug_log a printf-like function.
v3 (Jason): Add a void * to pass the brw_context through
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Coverity sees the if (mode >= BRW_PRIM_OFFSET (128)) test and assumes
that the else-branch might execute for mode to up 127, which out be out
of bounds.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Coverity sees that the functions immediately below the new assertions
dereference these pointers, but is unaware that an ENDIF always follows
an IF, etc.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
On gen9+ MOCS is an index into a table. It is 7 bits, and AFAICT, bit 0 is for
doing encrypted reads.
I don't recall how I decided to do this for BXT. I don't know this patch was
ever needed, since it seems nothing is broken today on SKL. Furthermore, this
patch may no longer be needed because of the ongoing changes with MOCS setup. It
is what is being used/tested, so it's included in the series.
The chosen values are the old values left shifted. That was also an arbitrary
choice.
v2: Use shift in MOCS to make it clear what we're doing. (Ken)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Without first running the bo through pushbuf_refn, the nouveau drm
library will have uninitialized structures regarding this bo, and will
insert incorrect data.
This fixes supertuxkart 0.9 crash on start (where it ends up doing a lot
of indirect draws).
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Since we clear the TFB bufctx binding point above, we need to put all of
the active tfb's back in, even if they haven't changed since last time.
Otherwise the tfb may get moved into sysmem and the underlying mapping
will generate write errors.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
The only reason we touch glapi is to dlopen it in order to:
- make sure that the unresolved _glapi* symbols in the dri modules are
provided.
- fetch glFlush() and use it at various stages in the dri2 driver.
Cc: Chih-Wei Huang <cwhuang@linux.org.tw>
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
The whole of GBM does not rely on even a single symbol from the GL
dispatch library, unsuprisingly. The only need for it comes from the
unresolved symbols in the DRI modules, which are now correctly handled
with Frank's commit.
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Dri driver libs are not linked to pull in libglapi so gbm_create_device()
fails when it tries to dlopen them (unless the application is linked
with something that does pull in libglapi, like libGL).
Until dri drivers can be fixed properly, dlopen libglapi before trying
to dlopen them.
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Frank Henigman <fjhenigman@google.com>
[Emil Velikov: Drop misleading bugzilla link, mention that libname differs]
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Already handled in the Makefile which includes the drivers/x11 subdir.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
We've moved the open with CLOEXEC idiom into a helper function, so
call it instead of duplicating the code.
This also replaces a couple of opens that didn't properly do CLOEXEC.
Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
We've moved the open with CLOEXEC idiom into a helper function, so
call it instead of duplicating the code here.
Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
This is already our common idiom for opening files with CLOEXEC and
it's a little ugly, so let's share this one implementation.
Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Replacing dup() with fcntl F_DUPFD_CLOEXEC creates the duplicate
file descriptor with CLOEXEC so it won't be leaked to child
processes if the process fork()s later.
Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
If the shader doesn't specify number of invocations, assume one.
This fixes geometry shaders on state trackers other than Mesa (and
probably graw tests too.)
Trivial.
This extends the draw code to add support for invocations.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is just for softpipe, llvmpipe won't work without
some changes.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is required for ARB_gpu_shader5 support in softpipe.
v2: add support to txd/txf/txq paths.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
These are basically just moves, so they should be safe as well.
When disabling i965's GLSL IR level scalarizer (channel expressions)
pass, I started seeing NIR code like this:
if ssa_21 {
block block_1:
/* preds: block_0 */
vec4 ssa_120 = vec4 ssa_82, ssa_83, ssa_84, ssa_30
/* succs: block_3 */
} else {
block block_2:
/* preds: block_0 */
/* succs: block_3 */
}
block block_3:
/* preds: block_1 block_2 */
vec4 ssa_33 = phi block_1: ssa_120, block_2: ssa_2
Previously, the GLSL IR scalarizer pass would break the vec4 into a
series of fmovs, which were allowed by the peephole pass. But with
the vec4 operation, they were not. We want to keep getting selects.
Normal i965 on Broadwell:
instructions in affected programs: 200 -> 176 (-12.00%)
helped: 4
With brw_fs_channel_expressions() disabled:
instructions in affected programs: 1832 -> 1646 (-10.15%)
helped: 30
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
This was originally only used by the vertex shader, but it's now used by
the geometry shader as well, and will also eventually be used for
tessellation control and evaluation shaders.
I suspect it will be easier to find in a file named after the concept.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
When we're done emitting the code for a loop, we need to visit the new
break block, which is the merge block of the current loop, rather than
the old merge block, which is the merge block of the loop containing the
one we just emitted code for.
This implements a workaround (exact excerpt as a comment in the code). The docs
specify [clearly, after you struggle for a while] that the offset isn't relative
to state base. This actually makes sense. This fixes hangs on SKL.
Buffer #0 is meant to be used for normal uniforms.
Buffer #1 is typically used for gather constants when using RS.
Buffer #1-#3 could be used to push a bunch of UBO data which would just be
somewhere in memory, and not relative to the dynamic state.
NOTE: I've moved away from the ternary operator for the new gen9 conditions.
Admittedly it's probably not great to do this, but I really want to fix this all
up in the subsequent patch and doing it here makes that diff a lot nicer. I want
to split out the gen8/9 code to make the function a bit more readable, but to
keep this easily cherry-pickable I am doing this fix first. If we decide not to
merge the cleanup patch then I can revisit this.
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Valtteri Rantala <Valtteri.rantala@intel.com>
It allows us to remove ilo_ib_state::draw_start_offset and
ILO_PRIM_RECTANGLES. gen6_3d_translate_pipe_prim() is also replaced by
ilo_translate_draw_mode().
Use the newly-introduced NV_VRAM_DOMAIN() macro to support alternative
VRAM domains for chips that do not have dedicated video memory.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Some GPUs (e.g. GK20A, GM20B) do not embed VRAM of their own and use
the system memory as a backend instead. For such systems, allocating
objects in VRAM results in errors since the kernel will not allow
VRAM objects allocations.
This patch adds a vram_domain member to struct nouveau_screen that can
optionally be initialized to an alternative domain to use for VRAM
allocations. If left untouched, NOUVEAU_BO_VRAM will be used for
systems that embed VRAM, and NOUVEAU_BO_GART will be used for VRAM-less
systems.
Code that uses GPU objects is then expected to use the NV_VRAM_DOMAIN()
macro in place of NOUVEAU_BO_VRAM to ensure correct behavior on
VRAM-less chips.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Martin Peres <martin.peres@free.fr>
For query_levels, we generate a getinfo with writemask of (z), which RA
will consider as size==3. But we were still generating four fanouts.
Which meant that RA would see it as two different register classes,
depending on the path to definer. Ie. on the getinfo instruction itself
it would see size==3, but when chasing back through the fanouts it would
see size==4.
Easiest way to solve that is to just generate the chain of neighboring
fanouts to have the correct size in the first place.
Note: we may eventually want split_dest() to take start/end or wrmask
instead, since really we only need size==1. But RA is not clever enough
for that, query_levels is not that common, and the other two registers
that get allocated are never used so those register slots can be
immediately re-used. So bunch of work for probably no real gain.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
We get this information from NIR (which gets it from sview decl in tgsi
when translating from tgsi), so no need to maintain shader variants for
this.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
This shuffles things around to allow the shader to have multiple basic
blocks. We drop the entire CFG structure from nir and just preserve the
blocks. At scheduling we know whether to schedule conditional branches
or unconditional jumps at the end of the block based on the # of block
successors. (Dropping jumps to the following instruction, etc.)
One slight complication is that variables (load_var/store_var, ie.
arrays) are not in SSA form, so we have to figure out where to put the
phi's ourself. For this, we use the predecessor set information from
nir_block. (We could perhaps use NIR's dominance frontier information
to help with this?)
Signed-off-by: Rob Clark <robclark@freedesktop.org>
These belong in the shader, rather than the block. Mostly a lot of
churn and nothing too interesting. But splitting this out from the
rest of ir3_block reshuffling to cut down the noise in the later
patch.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Right now, just provides a cleaner way to get at the gpu-id, given the
separation between compiler and context. But we will need this also to
hold the reg-set for new register allocation.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Use a more standard priority-queue based scheduling algo. It is simpler
and will make things easier once we have multiple basic blocks and flow
control.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Use standard list_head double-linked list and related iterators,
helpers, etc, rather than weird combo of instruction array and next
pointers depending on stage. Now block has an instrs_list. In
certain stages where we want to remove and re-add to the blocks list
we just use list_replace() to copy the list to a new list_head.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
At least for now.. right now the instruction and instruction list
printing should suffice, and the re-working of ir3_block would require
a lot of changes in that code.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Use ir3_MOV() builder in a couple of spots, rather than open-coding the
instruction construction. Also add ir3_NOP() builder and use that
instead of open coding.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Freedreno needs sampler type information to deal with int/uint textures.
To accomplish this, start creating sampler-view declarations, as
suggested here:
http://lists.freedesktop.org/archives/mesa-dev/2014-November/071583.html
create a sampler-view with index matching the sampler, to encode the
texture type (ie. SINT/UINT/FLOAT). Ie:
DCL SVIEW[n], 2D, UINT
DCL SAMP[n]
TEX OUT[1], IN[1], SAMP[n]
For tgsi texture instructions which do not take an explicit SVIEW
argument, the SVIEW index is implied by the SAMP index.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Some hardware needs to know the sampler type. Update the blit related
shaders to include SVIEW decl.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
To allow for shaders which use SVIEW decls for TEX* instructions, we
need to preserve the constraint that the shader either has no SVIEW's or
it has one matching SVIEW for each SAMP.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
To allow for shaders which use SVIEW decls for TEX* instructions, we
need to preserve the constraint that the shader either has no SVIEW's or
it has one matching SVIEW for each SAMP.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This was a hack as part of debugging some glamor-on-GLES2 behavior that
ended up being an xserver bug. I suspect we can just flip this extension
on for GLES2, but the spec says it requires 3.1.
We need to make sure that when we store the aligned box, we've got
initialized contents in the border. We could potentially just load the
border area, but for now let's get text rendering working in X (and fix
the GL_TEXTURE_2D errors in piglit's texsubimage test and
gl-2.1-pbo/test_tex_image)
Being a parameter-like state, we may want to get rid of
ilo_state_vertex_buffer_info or ilo_state_vertex_buffer eventually. But we
want them now as they are how we do cross-validation right now.
3DSTATE_VF_INSTANCING specifies instancing enable and step rate. They are
specified along with 3DSTATE_VERTEX_BUFFERS instead prior to Gen8. Both
commands are added.
3DSTATE_VF specifies cut index enable and cut index. Cut index enable is
specified in 3DSTATE_INDEX_BUFFER instead prior to Gen7.5. Both commands are
added.
ilo_shader.c: In function ‘ilo_shader_select_kernel_sbe’:
ilo_shader.c:1140:27: warning: ‘src_skip’ may be used uninitialized in this
function [-Wmaybe-uninitialized]
The original code meant to do this, but was only checking num_samples == 1 to
figure out if a surface was fast clear capable. However, we can allocate single
sample miptrees with num_samples == 0 (when it's an internally created buffer).
This fixes a bunch of the piglit tests on gen8. Other gens should have been
fine.
Here is the order of events that allowed this to slip through:
t0: I wrote halign patches and tested them. These alignment assertions are for
gen8 fast clear surfaces, basically.
t1: I pushed bogus perf patch which made fast clears never happen
t2: Reworked halign patches based on Chad's feedback and introduced the bug this
patch fixes.
t2.5: I tested reworked patches, but assertion wasn't hit because of t1.
t3. Matt fixed issue in t1 which made fast clears happen here:
commit 22af95af83
Author: Matt Turner <mattst88@gmail.com>
Date: Thu Jun 18 16:14:50 2015 -0700
i965: Add missing braces around if-statement.
This logic should match that of the v1 of my halign patch series.
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Matt Turner <mattst88@gmail.com>
Reported-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
When adding EXT_polygon_offset_clamp, I first made it core-only, and
never moved the enum getter back to the GL/GL_CORE section. Similarly,
ARB_gs5 is a core-only extension, so move its getters to the GL_CORE
section.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
If the driver says PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY=1,
the driver should never receive a pipe_vertex_element::src_offset value
that's not a multiple of four. But the vbuf code wasn't actually adjusting
the src_offset value when creating the vertex element state object.
We just need to align the src_offset values put in the driver_attribs[]
array.
See the piglit gl-1.5-vertex-buffer-offsets test.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
There are three possible return values (not two): WGL_SWAP_COPY_ARB,
WGL_SWAP_EXCHANGE_EXT and WGL_SWAP_UNDEFINED_ARB.
VMware bug 1431184
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Also, print a warning if we do return NULL from wglGetProcAddress() to
help spot this sort of problem in the future.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Viewperf 12 calls wglGetProcAddress() to get pointers to some unsupported
DSA and half-float functions. We return NULL but Viewperf doesn't check
for null before trying to jump through the pointer. That causes a crash.
This patch adds no-op functions to call instead (used by the next patch).
This avoids the crash but the rendering is incorrect.
Some DSA functions are being added to Mesa at this time so we may be
able to remove some of these no-ops in the future.
More no-op functions may be added as needed.
VMware PR1383421
Reviewed-by: José Fonseca <jfonseca@vmware.com>
WGL_CONTEXT_PROFILE_MASK_ARB doesn't apply to desktop OpenGL versions
less than 3.2 -- applications can't specify whether they want a core or
a compat 3.1 context -- instead they are supposed the check whether the
returned context advertises GL_ARB_compatibility extension.
Mesa doesn't support compatability contexts for version higher than 3.1,
so we used to return core profile context, but this makes several Windows
applications unhappy, because they just assume they got a compatability
context without checking.
So it seems safer to on Windows to never return core profile for 3.1,
ie, just fail the context creation.
VMware PR1365920.
Reviewed-by: Brian Paul <brianp@vmware.com>
Create pixel formats with 0, 4, 8 and 16 samples per pixel.
Add a SVGA_FORCE_MSAA env var to force creating all pixel formats
with a particular sample count. This is useful for testing Mesa/GLUT/
etc. programs which don't ordinarily use multisample.
Reviewed-by: Matthew McClure <mcclurem@vmware.com>
Update piglit link to the current Piglit website.
Add note about updating patchwork when sending patch revisions.
Acked-by: Matt Turner <mattst88@gmail.com>
Tested with Ilia Mirkin's gzdoom.trace and
"arb_uniform_buffer_object-maxuniformblocksize fsexceed" piglit test
without my earlier fix to fail linkage when UBO exceeds
GL_MAX_UNIFORM_BLOCK_SIZE.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
It's not totally clear whether other Mesa drivers can safely cope with
over-sized UBOs, but at least for llvmpipe receiving a UBO larger than
its limit causes problems, as it won't fit into its internal display
lists.
This fixes piglit "arb_uniform_buffer_object-maxuniformblocksize
fsexceed" without regressions for llvmpipe.
NVIDIA driver also fails to link the shader from
"arb_uniform_buffer_object-maxuniformblocksize fsexceed".
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65525
PS: I don't recommend cherry-picking this for Mesa stable, as some app
might inadvertently been relying on UBOs larger than
GL_MAX_UNIFORM_BLOCK_SIZE to work on other drivers, so even if this
commit is universally accepted it's probably best to let it mature in
master for a while.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
gl_NumSamples should only be enabled when ARB_sample_shading is enabled.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Although we don't support SIMD32, krh pointed out that the left shift
by 32 is undefined by C/C++ for 32-bit integers.
Suggested-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This was apparently missed when ARB_sso support was added.
Add label support to pipeline objects just like all the other
debug-related objects.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
This avoids a security issue where userspace could have written the tile
state/tile alloc behind the GPU's back, and will apparently be necessary
for fixing stability bugs (tile state buffers are missing some top bits
for the tile alloc's address).
We can't use sampler messages with gradient information (like
sample_g or sample_d) to deal with this scenario because according
to the PRM:
"The r coordinate and its gradients are required only for surface
types that use the third coordinate. Usage of this message type on
cube surfaces assumes that the u, v, and gradients have already been
transformed onto the appropriate face, but still in [-1,+1] range.
The r coordinate contains the faceid, and the r gradients are ignored
by hardware."
Instead, we should lower this to compute the LOD manually based on the
gradients and use a different sample message that takes the computed
LOD instead of the gradients. This is already being done in
brw_lower_texture_gradients.cpp, but it is restricted to shadow
samplers only, although there is a comment stating that we should
probably do this also for samplerCube and samplerCubeArray.
Because of this, both dEQP and Piglit test cases for textureGrad with
cube maps currently fail.
This patch does two things:
1) Activates the texturegrad lowering pass for all cube samplers.
2) Corrects the computation of the LOD value for cube samplers.
I had to do 2) because for cube maps the calculations implemented
in the lowering pass always compute a value of rho that is twice
the value we want (so we get a LOD value one unit larger than we
want). This only happens for cube map samplers (all kinds). I am
not sure about why we need to do this, but I suspect that it is
related to the fact that cube map coordinates, when transported
to a specific face in the cube, are in the range [-1, 1] instead of
[0, 1] so we probably need to divide the derivatives by 2 when
we compute the LOD. Doing that would produce the same result as
dividing the final rho computation by 2 (or removing a unit
from the computed LOD, which is what we are doing here).
Fixes the following piglit tests:
bin/tex-miplevel-selection textureGrad Cube -auto -fbo
bin/tex-miplevel-selection textureGrad CubeArray -auto -fbo
bin/tex-miplevel-selection textureGrad CubeShadow -auto -fbo
Fixes 10 dEQP tests in the following category:
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.*cube*
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Triggers an INVALID_OPCODE warning on GK208. Seems rare enough to not
warrant verification on other chips. Fixes the new piglits:
ubo_array_indexing/fs-nonuniform-control-flow.shader_test
ubo_array_indexing/vs-nonuniform-control-flow.shader_test
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Ensure that the GPU spawns the fragment shader thread for those
fragment shaders with atomic buffer access.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Add helper function that checks if current fragment shader active
of gl_context has atomic buffer access.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Change references to gl_framebuffer::Width, Height, MaxNumLayers
and Visual::samples to use the _mesa_geometry_ convenience functions
for those places where the geometry of the gl_framebuffer is needed
(in contrast to the geometry of the intersection of the attachments
of the gl_framebuffer).
This patch is to pave the way to enable GL_ARB_framebuffer_no_attachments
on Gen7 and higher in i965.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Add helper convenience function that intersects the scissor values
against a passed bounding box. In addition, to avoid replicated code,
make the function _mesa_scissor_bounding_box() use this new function.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Add convenience helper functions for fetching geometry of gl_framebuffer
that return the geometry of the gl_framebuffer instead of the geometry of
the buffers of the gl_framebuffer when then the gl_framebuffer has no
attachments.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Implement GL_ARB_framebuffer_no_attachments in Mesa core
- changes to conditions for framebuffer completenss
- implement set/get functions for framebuffers for
new functions in GL_ARB_framebuffer_no_attachments
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Define the enumeration constants, function entry points and
glGet for the GL_ARB_framebuffer_no_attachments.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Define the infrastructure for the extension GL_ARB_framebuffer_no_attachments:
- extension table
- additions to gl_framebuffer
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
I was thinking of the MIN opcode in terms of unsigned math, but it's
signed, so if you used a negative array index, you could read before the
UBO. Fixes segfaults under simulation in piglit array indexing tests with
mprotect-based guard pages.
I wanted to assert that src1 came from a non-unspilled register in shader
validation, and this easily gets us that. And, as a bonus:
total instructions in shared programs: 93347 -> 92723 (-0.67%)
instructions in affected programs: 60524 -> 59900 (-1.03%)
Disabling miptails fixed the buffer corruption happening in FBO
which use YF/YS tiled renderbuffer or texture as color attachment.
Spec recommends disabling mip tails only for non-mip-mapped surfaces.
But, without disabling miptails I couldn't get correct data out of
mipmapped YF/YS tiled surface.
We need better understanding of miptails before start using them.
For now this patch helps move things forward.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Patch sets the alignments for texture and renderbuffer surfaces.
V3: Make changes inside horizontal_alignment() and
vertical_alignment() (Topi)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
This function will be utilised in later patches.
V2: Make both pointers constants (Topi)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
This patch sets the tiled resource mode for texture and renderbuffer
surfaces.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
The surfaceless platform is for off-screen rendering only. Render node support
is required.
Only consider the render nodes. Do not use normal nodes as they require
auth hooks.
v3: change platform_null to platform_surfaceless
v4: make libdrm required for surfaceless
v5: remove modified include guards with defined(HAVE_SURFACELESS_PLATFORM)
v6: use O_CLOEXEC for drm fd
Signed-off-by: Haixia Shi <hshi@chromium.org>
Signed-off-by: Zach Reizner <zachr@google.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Previously when setting up the sample instruction for an indirect
sampler the vec4 backend was directly passing the pseudo opcode's
src0. However vec4_visitor::visit(ir_texture *) doesn't set the
texture operation's src0 -- it's left as BAD_FILE, which when
translated into a brw_reg gives the null register. In brw_SAMPLE,
gen6_resolve_implied_move() inserts a MOV from the inst->base_mrf and
sets the src0 appropriately. The indirect sampler case did not have a
call to gen6_resolve_implied_move().
The fs backend avoids this because the platforms that support dynamic
indexing of samplers (IVB+) have been converted to not use the
fake-MRF hack, and instead send from proper GRFs.
This patch makes it call gen6_resolve_implied_move before setting up
the indirect message. This is similar to what is done for constant
sampler numbers in brw_SAMPLE.
The Piglit tests for sampler array indexing didn't pick this up
because they were using a texture with a solid colour so it didn't
matter what texture coordinates were actually used. The tests have now
been changed to be more thorough in this commit:
http://cgit.freedesktop.org/piglit/commit/?id=4f9caf084eda7
With that patch the tests for gs and vs are currently failing on
Ivybridge, but this patch fixes them. There are no other changes to a
Piglit run on Ivybridge.
On Skylake the gs tests were failing even without the Piglit patch
because Skylake needs the source registers to work correctly in order
to send a message header to select SIMD4x2 mode.
(The explanation in the commit message is partially written by Matt
Turner)
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
This reverts commit adee54f826.
Further down in the GLSL ES 3.10 spec it say:
"If an array is declared as the last member of a shader storage block
and the size is not specified at compile-time, it is sized at run-time.
In all other cases, arrays are sized only at compile-time."
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
This function was trying to align the width and height to a multiple
of the block size for compressed textures. It was using align_w/h as a
shortcut to get the block size as up until Gen9 this always happens to
match. However in Gen9+ the alignment values are expressed as
multiples of the block size so in effect the alignment values are
always 4 for compressed textures as that is the minimum value we can
pick. This happened to work for most compressed formats because the
block size is also 4, but for FXT1 this was breaking because it has a
block width of 8.
This fixes some Piglit tests testing FXT1 such as
spec@3dfx_texture_compression_fxt1@fbo-generatemipmap-formats
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
The state tracker will pass through requests from buggy applications
which will have the buffer size larger than the max allowed (64k). Clamp
the size to 64k so that we don't get errors when uploading the constbuf
data.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
One of the places we have to insert texbars is in situations where the
result of the tex gets overwritten by a different instruction (e.g. in a
conditional statement). However in some situations it can actually
appear as though the original tex itself is an overwriting instruction.
This can naturally never really happen, so just ignore the tex
instruction when it comes up.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90347
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Back in 2013, a patch was added (with 2 reviewers!) at the end of the
block to early exit the loop in this case, without noticing that the loop
already did. I added another early exit case, again without noticing, but
Rob caught me. Just drop the loop condition that apparently surprises
most of us, instead of leaving the end of the loop conspicuously not
exiting on success.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
This was part of gallium_egl, and we now have the normal libEGL Android
winsys support to handle it.
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
v2: Add a comment explaining why we link libmesa_glsl. Drop warning
option from freedreno. Add vc4 to the documentation for
BOARD_GPU_DRIVERS.
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
in case of glTexImage{1,2,3}D(). Texture has already been allocated
at this point and we have no data to upload. With out this patch,
with create_pbo = true, we end up creating a temporary pbo and then
uploading uninitialzed texture data.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
We have an assert() in intel_miptree_map_movntdqa() which expects
the pitch to be 16 byte aligned.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
OP_JOIN instructions are assumed to be flow instructions and mercilessly
casted to FlowInstruction.
This patch fixes an instance where an OP_JOIN is created as a plain
instruction. This can cause crashes in the ir printer.
[imirkin: add ->fixed = 1]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
We want to replace ilo_zs_surface with ilo_state_zs. One noteworthy
difference is that ilo_state_zs always aligns level 0 to 8x4 when HiZ is
enabled. HiZ will not be enabled for 1D surfaces as a result.
It needs be set to R/W only when using certain messages via DP render cache.
Since we only use RT wrties with the render cache, we never need to set it.
The only use of lp_profile() is wrapped in #if defined(PROFILE),
so there is no reason to build it unless this macro is defined.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
We compute the right mask and thread width max parameters as part of
pipeline creation and set them accordingly at vkCmdDispatch() and
vkCmdDispatchIndirect() time. These parameters depend only on the local
group size and the dispatch width of the program so we can figure this
out at pipeline create time.
This helped find the incorrect HALIGN values from the previous patches.
v2: Add PRM references for assertions (Chad)
v3: Remove duplicated part of commit message, assert num_samples > 1, instead of
num_samples > 0. (Chad)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Just like the previous patch, but for the GEN9 constraints.
v2:
bugfix: Gen9 HALIGN was being set for all miptree buffers (Chad). To address
this, move the check to where the gen8 check is, and do the appropriate
conditional there.
v3:
Remove stray whitespace introduced in v2 (Chad)
Rework comment to show AUX_CCS and AUX_MCS specifically. Remove misworded part
about gen7 (Chad).
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (v1)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Chad Versace <chad.versace@intel.com>
This restriction was attempted in this commit:
commit 4705346463
Author: Anuj Phogat <anuj.phogat@gmail.com>
Date: Fri Feb 13 11:21:21 2015 -0800
i965/gen8: Use HALIGN_16 if MCS is enabled for non-MSRT
However, the commit itself doesn't achieve the desired goal as determined by the
asserts which the next patch adds. mcs_mt is NULL (never set) we're in the
process of allocating the mcs_mt miptree when we get to this function. I didn't
check, but perhaps this would work with blorp, however, meta clears allocate the
miptree structure (which AFAICT needs the alignment also) way before it
allocates using meta clears where the renderbuffer is allocated way before the
aux buffer.
The restriction is referenced in a few places, but the most concise one [IMO]
from the spec is for Gen9. Gen8 loosens the restriction in that it only requires
this for non-msrt surface.
When Auxiliary Surface Mode is set to AUX_CCS_D or AUX_CCS_E, HALIGN 16 must
be used.
With the code before the miptree layout flag rework (patches preceding this),
accomplishing this workaround is very difficult.
v2:
bugfix: Don't set HALIGN16 for gens before 8 (Chad)
v3:
non-trivial rebase
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Cc: Neil Roberts <neil@linux.intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
There are several constraints when determining if one can fast clear a surface.
Some of these are alignment, pixel density, tiling formats, and others that vary
by generation. The helper function which exists today does a suitable job,
however it conflates "BO properties" with "Miptree properties" when using
tiling. I consider the former to be attributes of the physical surface, things
which are determined through BO allocation, and the latter being attributes
which are derived from the API, and having nothing to do with the underlying
surface.
Determining tiling properties and creating miptrees are related operations
(when we allocate a BO for a miptree) with some disjoint constraints. By
extracting the decisions into two distinct choices (tiling vs. miptree
properties), we gain flexibility throughout the code to make determinations
about when we can or cannot fast clear strictly on the miptree.
To signify this change, I've also renamed the function to indicate it is a
distinction made on the miptree. I am torn as to whether or not it was a good
idea to remove "non_msrt" since it's a really nice thing for grep.
v2:
Reword some comments (Chad)
intel_is_non_msrt_mcs_tile_supported->intel_tiling_supports_non_msrt_mcs (Chad)
Make full if ladder for gens in above function (Chad)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Cc: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
For GEN9, much of the logic to use X-Tiled buffers has been stripped out. It is
still supported in some places, but it's never desirable. Unfortunately we don't
yet have the ability to have Y-Tiled scanout (see:
http://patchwork.freedesktop.org/patch/46984/),
NOTE: This patch shouldn't actually do anything since SKL doesn't yet use fast
clears (they are disabled because they are causing regressions). THerefore, the
only case we can get to this function on SKL is by way of
intel_update_winsys_renderbuffer_miptree.
v2: Update commit message to be more clear that the NOTE is for SKL only.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
I think pretty much everyone agrees that having more than a single bool as a
function argument is bordering on a bad idea. What sucks about the current
code is in several instances it's necessary to propagate these boolean
selections down to lower layers of the code. This requires plumbing (mechanical,
but still churn) pretty much all of the miptree functions each time. By
introducing the flags paramater, it is possible to add miptree constraints very
easily.
The use of this, as is already the case, is sometimes we have some information
at the time we create the miptree that needs to be known all the way at the
lowest levels of the create/allocation, disable_aux_buffers is currently one
such example. There will be another example coming up in a few patches.
v2:
Tab fix. (Ben)
Long line fixes (Topi)
Use anonymous enum instead of #define for layout flags (Chad)
Use 'X != 0' instead of !!X (everyone except Chad)
v3:
Some non-trivial conflict resolution on top of Anuj's patches.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Cc: "Pohjolainen, Topi" <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
This will be used to implement the Gateway Barrier SEND needed to implement
the barrier function.
v2:
* notify => gateway_notify (Ken)
* combine short lines of brw_barrier proto/decl (mattst88)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
This will be used to implement the barrier function.
v2:
* Rename to brw_WAIT (mattst88)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
This will be used by the wait instruction when implementing the barrier()
function.
v2:
* Changes suggested by mattst88
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
These fields will be used when emitting a send for the barrier function.
Reference: IVB PRM Volume 4, Part 2, Section 1.1.1 Message Descriptor
v2:
* notify => gateway_notify (Ken)
* define bits for gen4-gen6 (bwidawsk, Ken)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This prevents an assertion from being hit with SIMD16:
Assertion `inst->exec_size == dispatch_width() || force_writemask_all' failed.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Silence the warnings about the future incompatibility with automake 2.0
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Rather than forcing everyone to provide their own definition of the symbol
provide a common (dummy) one.
This helps us resolve the build of the standalone pipe-drivers (amongst
others), which are missing the symbol.
Cc: Rob Clark <robclark@freedesktop.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Drop the stub/unused function haiku_create_surface() and add some basic implementation for destroy_surface()
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Acked-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
It serves little to no purpose. As the driver gets updated, one can
look at the existing implementation (dri2) for reference rather than
letting the commented functions bitrot.
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Acked-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Earlier commit folded the two separate variables into one, but forgot to
update the haiku driver.
Fixes: 0e4b564ef28(egl: combine VersionMajor and VersionMinor into one
variable)
Cc: Marek Olšák <marek.olsak@amd.com>>
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Acked-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Instead use fs_builder::null_reg_f() which has the correct register
width. Avoids the assertion failure in fs_builder::emit() hit by the
"ES3-CTS.shaders.loops.for_dynamic_iterations.unconditional_break_fragment"
GLES3 conformance test introduced by 4af4cfba9e.
Reported-and-reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
This reverts commits
e17ed04 * vk/image: Don't double-allocate stencil buffers
1ee2d1c * vk/image: Teach anv_image_choose_tile_mode about WMAJOR
and instead adds a comment to describe the subtlety of how we create
images for stencil only formats.
This reverts commit f3b709c0ac.
The "dEQP-GLES3.functional.rasterization.fbo.rbo_multisample_4.
interpolation.lines_wide" test appears to be broken on Cherryview when
we expose line widths greater than 12.0. I'm not sure why.
For now, just go back to the limits we used on older platforms.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90902
Acked-by: Matt Turner <mattst88@gmail.com>
This makes the SSA definitions use sequential numbers (0, 1, 2, ...)
instead of seemingly random ones. There's not much point normally,
but it makes debug output much easier to read.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
The other PIPE_CAPF_ and PIPE_SHADER_CAP_ enums don't have explicit values.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
It should only be used as an initializer expression.
Trivial, and fixes Windows builds.
Nevertheless, overwriting an once_flag like this seems dangerous and
should be revised.
The same fix Marius implemented for gen6 (commit a9b04d8a) and
gen7 (commit 24ecf37a).
Also, we need the same code to handle special cases of line width
in gen6, gen7 and now gen8, so put that in the helper function
we use to compute the line width.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Without this patch, the following constructs (not an extensive list)
would crash mesa:
- mat2 foo = mat2(1); vec4 bar = vec4(foo);
- mat3 foo = mat3(1); vec4 bar = vec4(foo);
- mat3 foo = mat3(1); ivec4 bar = ivec4(foo);
The first case is explicitely allowed by the GLSL spec, as seen on
page 101 of the GLSL 4.40 spec:
"vec4(mat2) // the vec4 is column 0 followed by column 1"
The other cases are implicitely allowed also.
The actual changes are quite minimal. We first split each column of
the matrix to a list of vectors and then use them to initialize the
vector. An additional check to make sure that we are not trying to
copy 0 elements of a vector fix the (i)vec4(mat3) case as the last
vector (3rd column) is not needed at all.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
If _mesa_hash_table_create failed we'd get null pointer. Report
error and go away.
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
In commit fe74fee8fa we rounded the line width to the nearest integer to
match the GLES3 spec requirements stated in section 13.4.2.1, but that seems
to break a dEQP test that renders wide lines in some multisampling scenarios.
Ian noted that the Open 4.4 spec has the following similar text:
"The actual width of non-antialiased lines is determined by rounding the
supplied width to the nearest integer, then clamping it to the
implementation-dependent maximum non-antialiased line width."
and suggested that when ES removed antialiased lines, they removed
"non-antialised" from that paragraph but probably should not have.
Going by that note, this patch restricts the quantization implemented in
fe74fee8fa only to regular aliased lines. This seems to keep the
tests fixed with that commit passing while fixing the broken test.
v2:
- Drop one of the clamps (Ken, Marius)
- Add a rule to prevent advertising line widths that when rounded go beyond
the limits allowed by the hardware (Ken)
- Update comments in the code accordingly (Ian)
- Put the code in a utility function (Ian)
Fixes:
dEQP-GLES3.functional.rasterization.fbo.rbo_multisample_max.primitives.lines_wide
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90749
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
Broadwell's stencil blitting code attempts to bind a renderbuffer as a
texture, using dd->BindRenderbufferTexImage().
This calls _mesa_init_teximage_fields(), which then attempts to set
img->_BaseFormat = _mesa_base_tex_format(ctx, internalFormat), which
assert fails if internalFormat is GL_STENCIL_INDEX8 but
ARB_texture_stencil8 is unsupported.
To work around this, just pretend to support the extension momentarily,
during the blit. Meta has already munged a variety of other things in
the context (including the API!), so it's not that much worse than what
we're already doing.
Fixes regressions since commit f7aad9da20
(mesa/teximage: use correct extension for accept stencil texture.).
v2: Add an XXX comment explaining the situation (requested by Jason
Ekstrand and Martin Peres), and an assert that we don't support
the extension so we remember to remove this hack (requested by
Neil Roberts).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Just implement it in terms of util_resource_copy_region(). Both the
original code and util_resource_copy_region() boil down to mapping,
calling util_copy_box() and unmapping.
No piglit regressions. This will also help to implement GL_ARB_copy_image.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Mesa supports EXT_texture_rg and OES_texture_float. This patch adds
support for using unsized enums GL_RED and GL_RG for floating point
targets and writes proper checks for internalformat when format is
GL_RED or GL_RG and type is of GL_FLOAT or GL_HALF_FLOAT.
Later, internalformat will get adjusted by adjust_for_oes_float_texture
after these checks.
v2: simplify to check vs supported enums
v3: follow the style and break out if internalFormat ok (Kenneth)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90748
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The check in batch_bo_finish should catch any undefined values in the batch
but isn't that great for debugging. The checks in the various emit
functions will help get better granularity.
we don't check the validity of pscreen until dri_init_screen_helper
hit this trying to init glamor on a device with no driver (udl).
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
When we import a dma-buf fd from another driver the kernel
gives us the right info, and this trashes it.
Convert the kernel bo flags into the domain flags.
This helps getting reverse prime and glamor working.
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
- Reduce the number of table lookups in anv_image_create from 4 to 1.
- Add field for surface alignment.
- Shorten field names tile_width, tile_height -> width, height.
On Lollipop, apparently stlport is gone and libcxx must be used instead.
We still support stlport when building on earlier android releases.
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Eric Anholt <eric@anholt.net>
Based on the nice work of Paulo Sergio Travaglia <pstglia@gmail.com>.
The main modifications are:
- Include paths for LLVM header files and shared/static libraries
- Set C++ flag "c++11" to avoid compiling errors on LLVM header files
- Set defines for LLVM
- Add GALLIVM source files
- Changes path of libelf library for lollipop
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Anholt <eric@anholt.net>
Use the pre-defined macro es-gen to generate new added files
instead of writing new rules manually. The handmade rules
that may generate the files before the directory is created
result in such an error:
/bin/bash: out/target/product/x86/gen/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/main/format_pack.c: No such file or directory
make: *** [out/target/product/x86/gen/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/main/format_pack.c] Error 1
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Eric Anholt <eric@anholt.net>
This avoids needing hardlinks between all of the DRI driver .so names,
since we're the only loader on the system.
v2: Add early exit on success (like previous block) and log message on
failure.
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Eric Anholt <eric@anholt.net>
The Android gallium build used to use gallium_egl, which was removed back
in March. Instead, we will now use a normal Mesa libEGL loader with
dlopen()ing of a DRI module.
v2: add a clean step to rebuild all dri modules properly.
v3: Squish the 2 patches doing this together (change by anholt).
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Eric Anholt <eric@anholt.net>
This single .so includes all of the enabled gallium drivers.
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Eric Anholt <eric@anholt.net>
The include paths of libmesa_dri_common are also used by modules
that need libmesa_dri_common.
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Eric Anholt <eric@anholt.net>
Previously the number needed to be divided by 4 to get the proper results. Now
the hardware does the right thing. Through experimentation it seems Braswell
(CHV) does also need the division by 4.
Fixes piglit test:
arb_pipeline_statistics_query-frag
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Previously, we just put the message for the EOT send as high in the file as
it would go. This is because the register pre-filling hardware will stop
all over the early registers in the file in preparation for the next thread
while you're still sending the last message. However, if something happens
to spill, then the MRF hack interferes with the EOT send message and, if
things aren't scheduled nicely, will stomp on it.
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90520
Reviewed-by: Neil Roberts <neil@linux.intel.com>
The explicit call to fs_builder::group() in emit_single_fb_write() is
required by the builder (otherwise the assertion in fs_builder::emit()
would fail) because the subsequent LOAD_PAYLOAD and FB_WRITE
instructions are in some cases emitted with a non-native execution
width. The previous code would always use the channel enables for the
first quarter, which is dubious but probably worked in practice
because FB writes are never emitted inside non-uniform control flow
and we don't pass the kill-pixel mask via predication in the cases
where we have to fall-back to SIMD8 writes.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Yes, it's incorrect to use the 0-th channel enable group
unconditionally without considering the execution and regioning
controls of the instruction that uses the spilled value, but it
matches the previous behaviour exactly, the builder just makes the
preexisting problem more obvious because emitting an instruction of
non-native SIMD width without having called .group() or .exec_all()
explicitly would have led to an assertion failure.
I'll fix the problem in a follow-up series, as the solution is going
to be non-trivial.
Reviewed-by: Matt Turner <mattst88@gmail.com>
This simplifies opt_peephole_sel() slightly by emitting the SEL
instructions immediately after they are created, what makes the
sel_inst and mov_imm_inst arrays unnecessary and will make it possible
to get rid of the explicit inserts when the pass is migrated to the IR
builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
LOAD_PAYLOAD instructions need the same treatment as any other
generator instructions, at least FB writes and typed surface messages
will need a payload built with non-zero execution controls.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Most of these fields affect the behaviour of the instruction, but
apparently we currently don't CSE the kind of instructions for which
these fields could make a difference in the VEC4 back-end. That's
likely to change soon though when we start using send-from-GRF for
texture sampling and surface access messages.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Most of these fields affect the behaviour of the instruction so it
could actually break the program if we CSE a pair of otherwise
matching instructions with different values of these fields.
Reviewed-by: Matt Turner <mattst88@gmail.com>
The purpose of this change is threefold: First, it improves the
modularity of the compiler back-end by separating the functionality
required to construct an i965 IR program from the rest of the visitor
god-object, what in turn will reduce the coupling between other
components and the visitor allowing a more modular design. This patch
doesn't yet remove the equivalent functionality from the visitor
classes, as it involves major back-end surgery.
Second, it improves consistency between the scalar and vector
back-ends. The FS and VEC4 builders can both be used to generate
scalar code with a compatible interface or they can be used to
generate natural vector width code -- 1 or 4 components respectively.
Third, the approach to IR construction is somewhat different to what
the visitor classes currently do. All parameters affecting code
generation (execution size, half control, point in the program where
new instructions are inserted, etc.) are encapsulated in a stand-alone
object rather than being quasi-global state (yes, anything defined in
one of the visitor classes is effectively global due to the tight
coupling with virtually everything else in the compiler back-end).
This object is lightweight and can be copied, mutated and passed
around, making helper IR-building functions more flexible because they
can now simply take a builder object as argument and will inherit its
IR generation properties in exactly the same way that a discrete
instruction would from the same builder object.
The emit_typed_write() function from my image-load-store branch is an
example that illustrates the usefulness of the latter point: Due to
hardware limitations the function may have to split the untyped
surface message in 8-wide chunks. That means that the several
functions called to help with the construction of the message payload
are themselves required to set the execution width and half control
correctly on the instructions they emit, and to allocate all registers
with half the default width. With the previous approach this would
require the used helper functions to be aware of the parameters that
might differ from the default state and explicitly set the instruction
bits accordingly. With the new approach they would get a modified
builder object as argument that would influence all instructions
emitted by the helper function as if it were the default state.
Another example is the fs_visitor::VARYING_PULL_CONSTANT_LOAD()
method. It doesn't actually emit any instructions, they are simply
created and inserted into an exec_list which is returned for the
caller to emit at some location of the program. This sort of two-step
emission becomes unnecessary with the builder interface because the
insertion point is one more of the code generation parameters which
are part of the builder object. The caller can simply pass
VARYING_PULL_CONSTANT_LOAD() a modified builder object pointing at the
location of the program where the effect of the constant load is
desired. This two-step emission (which pervades the compiler back-end
and is in most cases redundant) goes away: E.g. ADD() now actually
adds two registers rather than just creating an ADD instruction in
memory, emit(ADD()) is no longer necessary.
v2: Drop scalarizing VEC4 builder.
v3: Take a backend_shader as constructor argument. Improve handling
of debug annotations and execution control flags.
v4: Drop Gen6 IF with inline comparison. Rename "instr" variable.
Initialize cursor to NULL by default and add method to explicitly
point the builder at the end of the program.
Reviewed-by: Matt Turner <mattst88@gmail.com>
simple_list.h defines a number of macros with short non-namespaced
names that can easily collide with other declarations (first_elem,
last_elem, next_elem, prev_elem, at_end), and according to the comment
it was only being included because of struct simple_node, which is no
longer used in this file.
Reviewed-by: Brian Paul <brianp@vmware.com>
v3: Use ffs() and a switch loop in
tr_mode_horizontal_texture_alignment() (Ben)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
v3: Use ffs() and a switch loop in
tr_mode_vertical_texture_alignment() (Ben)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
and change the name to brw_miptree_choose_tiling().
V3: Remove redundant function parameters. (Topi)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Since the introduction of
commit 536003c11e
Author: Boyan Ding <boyan.j.ding@gmail.com>
Date: Wed Mar 25 19:36:54 2015 +0800
i965: Add XRGB8888 format to intel_screen_make_configs
winsys buffers no longer have an alpha channel. This causes
_mesa_format_matches_format_and_type() to reject previously working BGRA
uploads from using the BLT fast path. Instead of using the generic
routine for matching formats exactly, export the slightly more relaxed
check from intel_miptree_blit() which importantly allows the blitter
routine to apply a small number of format conversions.
References: https://bugs.freedesktop.org/show_bug.cgi?id=90839
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Alexander Monakov <amonakov@gmail.com>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
The blitter already has code to accommodate filling in the alpha channel
for BGRX destination formats, so expand this to also allow filling the
alpha channgel in RGBX formats.
More importantly for the next patch is moving the test into its own
function for the purpose of exporting the check to the callers.
v2: Fix alpha expansion as spotted by Alexander with the fix suggested by
Kenneth
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Alexander Monakov <amonakov@gmail.com>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
The BLT pitch is specified in bytes for linear surfaces and in dwords
for tiled surfaces. In both cases the programmable limit is 32,767, so
adjust the check to compensate for the effect of tiling.
v2: Tweak whitespace for functions (Kenneth)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
This avoids the full brw context initialization and just sets up context
constants, initializes extensions and sets a few driver vfuncs for the
front-end GLSL compiler.
Based on the corresponding SI support. Same as that, this is currently
only enabled for one-dimensional buffer copies due to issues with
multi-dimensional SDMA copies.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
In the ARB_fragment_program specification, the result.depth output
variable is treated as a vec4, where the fragment depth is stored in the
.z component, and the other three components are undefined.
This is different than GLSL, which uses a scalar value (gl_FragDepth).
To make this consistent for driver backends, this patch makes
prog_to_nir use a scalar output variable for FRAG_RESULT_DEPTH,
moving result.depth.z into the first component.
Fixes Glean's fragProg1 "Z-write test" subtest.
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90000
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Previously we were leaving this at the default of 64K, which meets the
spec but is too small for some real uses. The hardware can handle up to
128M.
User was complaining about this on freenode ##OpenGL today.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
In an ideal world I would just implement this instead of adding the perf debug.
There are some errata involved which lead me to believe it won't be so simple as
flipping a few bits.
There is room to add a thing for Gen9s flexibility, but since I am actively
working on that I have opted to ignore it.
Example:
Multi-LOD fast clear - giving up (256x128x8).
v2: Use braces for if statements because they are multiple lines (Ken)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
When we cannot do the optimized fast clear it's important to know the buffer
size since a small buffer will have much less performance impact.
A follow-on patch could restrict printing the message to only certain sizes.
Example:
Failed to fast clear 1400x1056 depth because of scissors. Possible 5% performance win if avoided.
Recommended-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
These are just wrappers around the existing extension functions.
v2: return BAD_ALLOC if _eglConvertAttribsToInt fails
Reviewed-by: Chad Versace <chad.versace@intel.com>
v2: - don't modify "value" in eglGetSyncAttribKHR after an error
- rename _egl_api::GetSyncAttribKHR -> GetSyncAttrib
- rename GetSyncAttribKHR_t -> GetSyncAttrib_t
- rename _eglGetSyncAttribKHR to _eglGetSyncAttrib
Reviewed-by: Chad Versace <chad.versace@intel.com>
v2: include mesa and chromium extensions in eglext.h so as not to break
existing users
v3: keep PFNEGLSWAPBUFFERSREGIONNOK because piglit uses it
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
There's no reason to use our own definition.
Tessellation will use the NV definitions too.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
This is for user assembly shaders only (not GLSL). We won't support those.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
These states are for GS assembly shaders only. We don't support those.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Softpipe, llvmpipe, r300g, and radeonsi pass tests. Other drivers need testing.
Freedreno and nv30 are definitely broken. Other drivers seem to be alright.
Patch fixes special cases with gl_VertexID and sets all builtin
variables locations as '-1' as specified by the extension spec.
Fixes ES 3.1 conformance test failure:
ES31-CTS.program_interface_query.input-built-in
v2: comments + use is_gl_identifier() (Martin)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
Flagged by Oracle's parfait static analyzer:
Error: Format string argument mismatch (CWE 628)
In call to printf with format string "usage: %s [options] <file.vert | file.geom | file.frag>\n\nPossible options are:\n"
Too many arguments for format string (got more than 1 arguments)
at line 285 of src/glsl/main.cpp in function 'usage_fail'.
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
This probably got broken when the samplers were converted to be indexed
by shader type.
Seen when looking at bug 89819 though I'm not sure if that really was what
the bug was about...
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
I just botched this when writing the original code.
From the ARB_vertex_program specification:
"The RSQ instruction approximates the reciprocal of the square root of
the absolute value of the scalar operand and replicates it to all four
components of the result vector."
Fixes a Glean vertProg1 subtest:
RSQ test 2 (reciprocal square root of negative value)
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90547
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
This change introduces a new field in gl_uniform_storage to
explicitely say that a uniform is built-in. In the case where it is,
no storage is defined to make it clear that it is read-only from the
mesa side. I fixed all the places in the code that made use of the
structure that I changed. Any place making a wrong assumption and using
the storage straight away will just crash.
This patch seems to implement the path of least resistance towards
listing built-in uniforms in GL_ACTIVE_UNIFORM (and other APIs).
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Pretty trivial, fixes the issue that we're expected to be able to blit
stencil surfaces (as the blit just relies on util blitter code which needs
stencil export to do it).
2 piglits skip->pass, 11 fail->pass
v2: prettify, keep different stencil ref value handling out of depth/stencil
test itself.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Some hardware reads only the low 16-bits even if the type is UD, but
other hardware like Cherryview can't handle this.
Fixes spec@arb_gpu_shader5@execution@sampler_array_indexing@fs-simple on
Cherryview.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90830
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
It was 2 bits to accommodate SATURATE_PLUS_MINUS_ONE (removed by commit
09b566e1). A similar change was made to TGSI recently in commit
e1c4e8aa.
Reducing the size from 2 bits to 1 reduces the size of the bit fields
from 17 bits to 16, which is a much nicer number.
Reviewed-by: Brian Paul <brianp@vmware.com>
If the incoming func matches the current state it must be a legal
value so we can do this before the switch statement.
Signed-off-by: Brian Paul <brianp@vmware.com>
If the glPushAttrib() mask value was zero we didn't actually push
anything onto the attribute stack. A subsequent glPopAttrib() call
would generate a GL_STACK_UNDERFLOW error. Now push a dummy attribute
in that case to prevent the error.
Mesa now matches nvidia's behavior.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Dynamic color/blend state can be NULL in case we're not rendering to
color targets (only output to depth and/or stencil). Initialize
cmd_buffer->cb_state to NULL so we can reliably detect whether it's been
set or not.
lower_phis_to_scalar() pass recurses the instruction dependence graph to
determine if all the sources of a given instruction are scalarizable.
To prevent cycles, it temporary marks the phi instruction before recursing in,
then updates the entry with the resulting value. However, it does not consider
that the entry value may have changed after a recursion pass, hence causing
a use-after-free situation and a crash.
This patch fixes this by reloading the entry corresponding to the 'phi'
after recursing and before updating its value.
The crash can be reproduced ~20% of times with the dEQP test:
dEQP-GLES3.functional.shaders.loops.while_constant_iterations.nested_sequence_fragment
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Now that Jason's LOAD_PAYLOAD improvements have landed, we don't need
this. Passing 1 for the number of header registers already takes care
of setting force_writemask_all on the header copy.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
See the corresponding code in brw_vs_surface_state.c.
v2: const more things (requested by Topi Pohjolainen)
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
We used to store the GS dispatch mode in brw_gs_prog_data while
separately storing the VS dispatch mode in brw_vue_prog_data::simd8.
This patch introduces an enum to represent all possible dispatch modes,
and stores it in brw_vue_prog_data::dispatch_mode, unifying the two.
Based on a suggestion by Matt Turner.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
The documentation makes it pretty clear that we shouldn't use this:
"Under normal conditions SW shall specify DMask, as the GS stage
will provide a Dispatch Mask appropriate to SIMD4x2 or SIMD8 thread
execution (as a function of dispatch mode). E.g., for SIMD4x2
execution, the GS stage will generate a Dispatch Mask that is equal
to what the EU would use as the Vector Mask. For SIMD8 execution
there is no known usage model for use of Vector Mask (as there is
for PS shaders)."
I also managed to find descriptions of DMask and VMask, in the "State
Register" (sr0.2/3) field descriptions:
"Dispatch Mask (DMask). This 32-bit field specifies which channels
are active at Dispatch time."
"Vector Mask (VMask). This 32-bit field contains, for each 4-bit
group, the OR of the corresponding 4-bit group in the dispatch
mask."
SIMD4x2 shaders process one or two vec4 values, with each 4-bit group
corresponding to xyzw channel enables (either all on, or all off).
Thus, DMask = VMask in SIMD4x2 mode. But in SIMD8 mode, 4-bit groups
are meaningless, so it just messes up your values.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
total instructions in shared programs: 2742062 -> 2681339 (-2.21%)
instructions in affected programs: 1514770 -> 1454047 (-4.01%)
helped: 5813
HURT: 1120
The gained programs are ARB vertext programs that were previously going
through the vec4 backend. Now that we have prog_to_nir, ARB vertex
programs can go through the scalar backend so they show up as "gained" in
the shader-db results.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
It's incompletete -- it wasn't filling ReferenceType so it was causing
garbagge on the disassembly. Furthermore it seems impossible to get the
jump information through this interface.
The solution for function size problem is to effectively book-keep the
machine code start and end address while JIT'ing.
This supports the three Vulkan border color types for float color
formats. The support for integer formats is a little trickier, as we
don't know the format of the texture at this time.
When calculating the binding table index for non-constant sampler
array indexing it needs to add the base binding table index which is a
constant within the generated code. Often this base is zero so we can
avoid a redundant instruction in that case.
It looks like nothing in shader-db is doing non-constant sampler array
indexing so this patch doesn't make any difference but it might be
worth having anyway.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Previously when generating the send instruction for a sample
instruction with an indirect sampler it would use the destination
register as a temporary store. This breaks when used in combination
with the opt_sampler_eot optimisation because that forces the
destination to be null. This patch fixes that by avoiding the temp
register altogether.
The reason the temporary register was needed was because it was trying
to ensure the binding table index doesn't overflow a byte by and'ing
it with 0xff. The result is then or'd with samper_index<<8. This patch
instead just and's the whole thing by 0xfff. This will ensure that a
bogus sampler index won't overflow into the rest of the message
descriptor but unlike the previous code it won't ensure that the
binding table index doesn't overflow into the sampler index. It
doesn't seem like that should matter very much though because if the
shader is generating a bogus sampler index then it's going to just get
garbage out either way.
Instead of doing sampler_index<<8|(sampler_index+base_table_index) the
new code avoids one operation by doing
sampler_index*0x101+base_table_index which should be equivalent.
However if we wanted to avoid the multiply for some reason we could do
this by adding an extra or instruction still without needing the
temporary register.
This fixes a number of Piglit tests on Skylake that were using
indirect samplers such as:
spec@arb_gpu_shader5@execution@sampler_array_indexing@fs-simple
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
According to the bspec, you're supposed to emit a PIPE_CONTROL with a CS
stall and a render target flush prior to chainging STATE_BASE_ADDRESS. A
little experimentation, however, shows that this is not enough. It also
appears as if you have to flush the texture cache after chainging base
address or things won't propagate properly.
We were returning the most recently freed BO, without checking if it
was idle yet. This meant that we generally stalled immediately on the
previous frame when generating a new one. Instead, allocate new BOs
when the *oldest* BO is still busy, so that the cache scales with how
much is needed to keep some frames outstanding, as originally
intended.
Note that if you don't have some throttling happening, this means that
you can accidentally run the system out of memory. The kernel is now
applying some throttling on all execs, to hopefully avoid this.
This commit allows for us to create a whole new surface state buffer when
the old one runs out of room. We simply re-emit the state base address for
the new state, re-emit binding tables, and keep going.
Before, we were emitting surface states up-front when binding tables were
updated. Now, we wait to emit the surface states until we emit the binding
table. This makes meta simpler and should make it easier to deal with
swapping out the surface state buffer.
AFAICT, there is no real way to make sure a send message with EOT is properly
ignored from compact, nor can I see a way to actually encode EOT while
compacting. Before the single send optimization we'd always bail because we hit
the is_immediate && !is_compactable_immediate case. However, with single send,
is_immediate is not true, and so we end up trying to compact the un-compactible.
Without this, any compacting single send instruction will hang because the EOT
isn't there. I am not sure how I didn't hit this when I originally enabled the
optimization. I didn't check if some surrounding code changed.
I know Neil and Matt were both looking into this. I did a quick search and
didn't see any patches out there to handle this. Please ignore if this has
already been sent by someone. (Direct me to it and I will review it).
Reported-by: Neil Roberts <neil@linux.intel.com>
Reported-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
We need to make sure we use the right index into dynamic offset
array. Dynamic descriptors can be present or not in different stages and
to get the right offset, we need to compute the index at
vkCreateDescriptorSetLayout time.
Pure integer formats cannot be sampled with linear tex / mip filters. In GL
such a setup would make the texture incomplete.
We shouldn't rely on the state tracker though to filter that out, just return
all zeros instead of dying in the lerp.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
gallivm now depends on it. And depending on particular LLVM version /
configure options, the build can fail without this change due to
undefined reference to `LLVM*Disasm*' symbols.
Trivial.
It doesn't do everything we want. In particular it doesn't allow to
detect jumps or return opcodes. Currently we detect the x86's RET
opcode.
Even though it's worse for LLVM 3.3, it's an improvement for LLVM 3.7,
which was totally busted.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
We do this by creating a surface state on the fly that incorporates the
dynamic offset. This patch also refactor the descriptor set layout
constructor a bit to be less clever with switch statement fall
through. Instead of duplicating the subtle code to update the sampler
and surface slot map, we just use two switch statements.
Roland pointed out my previous attempt was lacking, so I enhanced the
texwrap piglit test, and tested them. This fixes the offset calculations
in a number of areas by adding the offset first, it also fixes the fastpaths,
which I forgot to address in the previous commit.
v2: try and avoid divides in most paths, the repeat mirror path
really was ugly no matter which way I went, so I left it having
the divide.
Also fix the gather lod calculation bug.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Whether or not to use NIR is now equivalent to brw->scalar_vs. We can
simplify the logic and make it far less confusing.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Now that everything is running through NIR, this is all dead.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Now that everything is running through NIR, this is all dead.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This is using multiple inheritance in C++. However, ir_visitor is really
just an interface with no data so it shouldn't be so bad.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The backend_shader class really is a representation of a shader. The fact
that it inherits from ir_visitor is somewhat immaterial.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Currently on the functions that are exclusive to core-profile are
implemented. The remainder continue to live in the XML. Additional
functions can be moved later.
The functions for GL_ARB_draw_indirect and GL_ARB_multi_draw_indirect
are put in the dispatch table inside the VBO module, so they do not need
to be moved over.
The diff of src/mesa/main/api_exec.c before and after this patch is as
expected. All of the functions listed in apiexec.py moved out of a 'if
(_mesa_is_desktop(ctx))' block into a new 'if (ctx->API ==
API_OPENGL_CORE)' block.
v2: Remove stray shebang line in apiexec.py. Suggested by Ilia.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Dylan Baker <baker.dylan.c@gmail.com>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
The string is only applied when the context is API_OPENGLES2.
The bulk of the change is to prevent overriding the context to
API_OPENGL_CORE based on the requested version. If the context is
API_OPENGL_ES2, don't change it.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Remove _mesa_get_gl_version_override. We don't need two functions that
do basically the same thing. This change seemed easier (esp. with the
next patch) than going the other way.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
This is a bit of a hack for now. Several of the extensions required for
OpenGL ES 3.1 have no support, at all, in Mesa. However, with this
patch and a patch to allow MESA_GL_VERSION_OVERRIDE to work with ES
contexts, people can begin testing the ES "version" of the functionality
that is supported.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
A couple functions are missing because there are no implementations of
them yet. These are:
glFramebufferParameteri (from GL_ARB_framebuffer_no_attachments)
glGetFramebufferParameteriv (from GL_ARB_framebuffer_no_attachments)
glMemoryBarrierByRegion
v2: Rebase on updated dispatch_sanity.cpp test.
v3: Add support for glDraw{Arrays,Elements}Indirect in vbo_exec_array.c.
The updated dispatch_sanity.cpp test discovered this omission.
v4: Rebase on glapi changes.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
If the multiplication's result is unused, except by a conditional_mod,
the destination will be null. Since the final instruction in the lowered
sequence is a partial-write, we can't put the conditional mod on it and
we have to store the full result to a register and do a MOV with a
conditional mod.
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90580
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
When we compute the output swizzle we want to consider the number of
components in the add operation. So far we were using the writemask
of the multiplication for this instead, which is not correct.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Temporarily undefine DEBUG macro while including LLVM C++ headers,
leveraging the push/pop_macro pragmas, which are supported both by GCC
and MSVC.
https://bugs.freedesktop.org/show_bug.cgi?id=90621
Trivial.
We rely on these being initialized to NULL so meta can reliably detect
whether or not they've been set. ds_state is also allowed to not be
present so we need a well-defined value for that.
The idea I had when I wrote the original shadow code was that you'd see a
set_index_buffer to the IB, then a bunch of draws out of it. What's
actually happening in openarena is that set_index_buffer occurs at every
draw, so we end up making a new shadow BO every time, and converting more
of the BO than is actually used in the draw.
While I could maybe come up with a better caching scheme, for now just
do the simple thing that doesn't result in a new shadow IB allocation
per draw.
Improves performance of isosurf in drawelements mode by 58.7967% +/-
3.86152% (n=8).
We'd sometimes try to reallocate something that X was using as a new
pipe_resource, and potentially conflict in our rendering. But even
worse, if we reallocated the BO as a shader, the kernel would reject
rendering using the shader.
Not sure what happened in my testing that made the previous shadow
code fix glxgears swapbuffering, but this also fixes lots of CopyArea
in X (like dragging xlogo around in metacity).
I forgot to make the change in 96f164f6f0.
This fixes a warning with GCC and probably an error with Clang.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Starting with GEN8, there is documentation that the multisample state command
must be emitted before the 3DSTATE_WM_HZ_OP command any time the multisample
count changes. The 3DSTATE_WM_HZ_OP packet gets emitted as a result of a
intel_hix_exec(), which is called upon a fast clear and/or a resolve. This can
happen before the state atoms are checked, and so the multisample state must be
put directly in the function.
v1:
- In v0, I was always emitting the command, but Ken came up with the condition to
determine whether or not the sample count actually changed.
- Ken's recommendation was to set brw->num_multisamples after emitting
3DSTATE_MULTISAMPLE. This doesn't work. I put my best guess as to why in the XXX
(it was causing 7 regressions on BDW).
v2:
Flag NEW_MULTISAMPLE state. As Ken found, in state upload we check for the
multisample change to determine whether or not to emit certain packets. Since
the hiz code doesn't actually care about the number of multisamples, set the
flag and let the later code take care of it.
Jenkins results:
http://otc-mesa-ci.jf.intel.com/view/dev/job/bwidawsk/136/
Fixes around 200 piglit tests on SKL. I'm somewhat surprised that it seems to
have no impact on BDW as the restriction is needed there as well.
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Neil Roberts <neil@linux.intel.com> (v0)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2)
TargetOptions::NoFramePointerElim was removed in llvm-3.7.0svn r238244
"Remove NoFramePointerElim and NoFramePointerElimOverride from
TargetOptions and remove ExecutionEngine's dependence on CodeGen. NFC."
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
This workaround is documented in the 3DSTATE_GS documentation. It
appears to only apply to early steppings of Broadwell and Skylake.
I don't think it ever affected production hardware, so at this point it
probably makes sense to delete it.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
This mega-commit primarily does two things. First, is to turn anv_batch
into a better abstraction of a batch. Instead of actually having a BO, it
now has a few pointers to some piece of memory that are used to add data to
the "batch". If it gets to the end, there is a function pointer that it
can call to attempt to grow the batch.
The second change is to start using chained batch buffers. When the end of
the current batch BO is reached, it automatically creates a new one and
ineserts an MI_BATCH_BUFFER_START command to chain to it. In this way, our
batch buffers are effectively infinite in length.
This also drops the share create_surface_state helper and moves filling
out SURFACE_STATE directly into anv_image_view_init() and
anv_color_attachment_view_init().
Commit 4bdbb588a9 introduced new _glapi_new_nop_table() and
_glapi_set_nop_handler() functions in the glapi dispatcher (which
live in libGL.so). The calls to those functions from context.c
would be undefined (i.e. an ABI break) if the libGL used at runtime
was older.
For the time being, use the old single generic_nop() function for
non-Windows builds to avoid this problem. At some point in the future
it should be safe to remove this work-around. See comments for more
details.
v2: Incorporate feedback from Emil. Use _WIN32 instead of
GLX_DIRECT_RENDERING to control behavior, move comments.
Cc: 10.6 <mesa-stable@lists.freedesktop.org>
Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
This hasn't been updated in a long time and from recent discussion on
the mailing list, it's not always clear what's expected. Hopefully,
this will help a bit.
v2: document function brace placement, per Thomas Helland.
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
In case the glproto.h file isn't up to date, we provide the #define
for X_GLXCreateContextAttribsARB.
v2: fix other occurances, improve #ifndef test, per Jose.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
argparse type is a nice type saver for simple data types, but it doesn't
look a good fit for the input XML file:
- Certain implementations of argparse (particularly python 2.7.3's)
invoke the type constructor for the default argument even when an
option is passed in the command line. Causing `No such file or
directory: 'gl_API.xml'` when the current dir is not
src/mapi/glapi/gen.
- The parser takes multiple arguments. This is currently worked around
using lambdas, but that unnecessarily complex and hard to read.
Furthermore it's odd to have a side-effect as heavy as parsing XML
happening deep inside the argument parsing.
https://bugs.freedesktop.org/show_bug.cgi?id=90600
Reviewed-by: Brian Paul <brianp@vmware.com>
Without it, texcoords are mapped to GENERIC[0..7], PointCoord is mapped to
GENERIC[8], and user-defined varyings start from GENERIC[9]. Since texcoords
can only be used between VS and PS, and PointCoord is PS-only, it's silly to
always start from GENERIC[9] in all other shaders (such as LS, HS, ES, GS).
This adds support for TEXCOORD and PCOORD semantics. As a result, st/mesa
will use GENERIC[0] as a base for user-defined varyings, which should make
linking ES and GS as well as tessellation shaders at runtime easier.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
When using SIMD4x2 on Skylake, the sampler instructions need a message
header to select the correct mode. This was added for most sample
instructions in 0ac4c2727 but the TXF_MCS instruction is emitted
separately and it was missed.
This fixes a bunch of Piglit tests which test texelFetch in a geometry
shader, for example:
spec/arb_texture_multisample/texelfetch/2-gs-sampler2dms
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The problem is that the EDGEFLAG has to be toggled at vertex submission
time. This can be done from either the draw or the regular paths. Avoid
falling back to draw just because there's an edgeflag.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Commit 8acaf862df switched things over to use TEXCOORD instead of
GENERIC, but did not update the nv30 swtnl draw paths. This teaches the
draw logic about TEXCOORD.
Among other things, this fixes a crash in demos/arbocclude when using
swtnl. Curiously enough, the point-sprite piglit works without this.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
These are only used once per draw, so it makes sense to keep them in
GART. Also take this opportunity to modernize the buffer mapping API
usage.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
If a meta operation is called before the pipeline is set, this can cause
uses of undefined values. They *should* be harmless, but we might as well
shut up valgrind on this one too.
Apparently some compilers think we probably wanted to do !(x == y) instead
and issue a warning, so just shut it up... No functional change, obviously.
Cc: <mesa-stable@lists.freedesktop.org>
This fixes glxgears with NV30_SWTNL=1 forced on. Probably fixes a bunch
of other situations where we fall back to the swtnl path.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
nv30_validate_clip depends on the rasterizer state. Also we should
upload all the new clip planes on change since next time the plane data
won't have changed, but the enables might.
This fixes fixed-clip-enables and vs-clip-vertex-enables shader tests.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Clearing can happen at a time when various state objects are incoherent
and not ready for a draw. Some of the validation functions don't handle
this well, so only flush the framebuffer state. This has the advantage
of also not doing extra work.
This works around some crashes that can happen when clearing.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
There can be scenarios where the "indirect" arg of a PFETCH becomes
known, and so the code will attempt to propagate it. Use this
opportunity to just fold it into the first argument, and prevent the
load propagation pass from touching PFETCH further.
This fixes gs-input-array-vec4-index-rd.shader_test and
vs-output-array-vec4-index-wr-before-gs.shader_test on nvc0 at least.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
According to spec, CL_MEM_USE_HOST_PTR should directly use host memory,
if possible. This is just what userptr is for, so use it.
In case the memory cannot be mapped, a fallback similar to
CL_MEM_COPY_HOST_PTR is used.
v2: constify, drop unneeded cast
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
This flag is typically used to request pinned host memory, to avoid
any copies between GPU and CPU.
This improves throughput with an older OpenCL app which I unfortunately
can't publish due to its licensing.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Previously, the prog_to_nir pass was directly generating uniform load/store
intrinsics. This converts it to use a single giant "parameters" variable
and we now depend on lowering to get the uniform load/store intrinsics.
One advantage of this is that we now have one code-path after we do the
initial conversion into NIR.
No shader-db changes.
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
PIPE_QUERY_TIMESTAMP_DISJOINT could not work because q->ready was always
set to FALSE. To fix this issue, add more different states for queries
according to nvc0.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
We forgot to convert to VFETCH in case of indirect access. Fix that.
This avoids crashes on the new gs-input-array-vec4-index-rd and
vs-output-array-vec4-index-wr-before-gs but they still fail.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
When we get something like IN[ADDR[0].x+5], we will now guess that we
should look at IN[5] for the "base" information.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
This covers the pattern where a KILL_IF is used, which triggers a
comparison of -x to 0. This can usually be folded into the comparison whose
result is being compared to 0, however it may, itself, have already been
combined with another comparison. That shouldn't impact the logic of
this pass however. With this and the & 1.0 change, code like
00000020: 001c0001 80081df4 set b32 $r0 lt f32 $r0 0x3e800000
00000028: 001c0000 201fc000 and b32 $r0 $r0 0x3f800000
00000030: 7f9c001e dd885c00 set $p0 0x1 lt f32 neg $r0 0x0
00000038: 0000003c 19800000 $p0 discard
becomes
00000020: 001c001d b5881df4 set $p0 0x1 lt f32 $r0 0x3e800000
00000028: 0000003c 19800000 $p0 discard
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
This is roughly equivalent to the original getopt, except that it
removes the '-h' short option, which argparse reserves for
auto-generated help messages. It does retain the long option specified
by the getopt version, and changes the makefile to use that.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Also drop -m switch, which only accepted a single value or raised an
error, and was unused in the makefile.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Also removes the redundant -m argument, which could only be set to
'generic', or it would raise an exception. This option wasn't used in
the makefile.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
This makes the tools shut up about a bunch of problems, making them more
useful for catching actual problems.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
This results in slightly less code, but code that is much more readable.
It has the advantage of putting everything together in one place, all of
the code is self documenting, help messages are auto-generated, choices
are automatically enforced, and the syntax is much less C like, taking
advantage of python features and idioms.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Making the tools shut up about worthless errors so you can see real ones
is very useful
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Some shaders in Civilization V and Beyond Earth do
pow(pow(x, 2.2), 0.454545)
which is converting to and from sRGB colorspace.
A more general rule that replaces pow(pow(a, b), c) with pow(a, b * c)
actually regresses two shaders in Sun Temple in which the result of the
inner pow is used twice, once by another pow and once by another
instruction. Also, since 2.2 * 0.454545 isn't exactly one, the more
general pattern would have still left us with a pow, and I'm 2.2 *
0.454545 percent sure that's not what they want.
instructions in affected programs: 934 -> 886 (-5.14%)
helped: 16
A sequence number is written for 32-bits queries to make sure they are
ready, but not for 64-bits queries. Instead, we have to use a fence in
order to fix the HUD because it doesn't wait until the result is ready.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
v2: make this also compatible with original released firmware
v3 (chk): switch to original idea of separate files for fw versions
Signed-off-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v2)
Previously, we just blasted out whatever VB's we had marked as "dirty"
regardless of which ones were used by the pipeline. Given that the stride
of the VB is embedded in the pipeline this can cause problems. One problem
is if the pipeline doesn't use the given VB binding we emit a bogus stride.
Another problem is that we weren't properly resetting the dirty bits when
the pipeline changed.
Previously, we waited until later and did a pass through the used surfaces
and did the relocations then. This lead to doing double-relocations which
was causing us to get bogus surface offsets.
We now have is_array() and without_array() that make the
code much clearer and remove the need for this.
For all remaining calls to this we already knew that
the type was an array so returning a null wasn't adding any value.
v2: use without_array() in _mesa_ast_array_index_to_hir() and don't use
without_array() in lower_clip_distance_visitor() as we want to make sure the
array is 2D.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Since the binding table pointer is only 16 bits, we can only have 64kb
of binding table state allocated at any given time. With a block size of
1kb, that amounts to just 64 command buffers, which is not enough.
get_immediate will return a const reference, the requested immediate
isn't necessarily in the x slot. Make sure to use the swizzle.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
This commit adds support for the LunarG GLSL back-door as well as detecting
regular GLSL and SPIR-V. The SPIR-V path doesn't exist yet, so that will
cause an assert-fail.
Passing -module to glibtool causes the resulting library to be called
libSomething.so rather than libSomething.dylib on darwin.
Regardless if libOSMesa is a library or a module, it has been used as
the former for quite some time. Update the build to reflect that and
resolve the naming issue.
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
[Emil Velikov: Tweak the commit message.]
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Previously, we used intrinsic->const_index[1] to represent "the number of
array elements to load" for load/store intrinsics. However, this set to 1
by every pass that ever creates a load/store intrinsic. Also, while it
might make some sense for registers, it makes no sense whatsoever in SSA.
On top of that, the i965 backend was the only backend to ever support it;
freedreno and vc4 just assert that it's always 1. Let's just delete it.
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
It's a remnant of some old NV extension. Unused.
I also have a patch that removes predicates if anyone is interested.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
From ARB_program_interface_query:
"Note that if an interface enumerates a single active resource list
entry for an array variable (e.g., "a[0]"), a <name> identifying
any array element other than the first (e.g., "a[1]") is not
considered to match."
It doesn't apply to arrays of interface blocks but just to array
variables.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
This adds both ARB_texture_gather and the enhanced gather
for ARB_gpu_shader5.
This passes all the piglit tests, it relies on the GLSL
lowering pass to make textureGatherOffsets work.
v2: use inline to get gather component (Brian)
fix function name, add asserts (Brian)
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is a prep change for gather, and it makes more sense
to use an array in these cases.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This just adds a new modifier interface for drivers to implement.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This was an oversight when GLSL1.30 was enabled, I think my
misunderstanding.
This fixes a bunch of tex-miplevel-selection tests under softpipe,
and is required for textureGather support.
I'm not sure this won't make sampling slowering, but its softpipe,
correctness first and all that.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This moves some of the image filter args into a struct,
and passes that instead, this is prep work for adding texture
gather support which needs new arguments.
review: make filter args const.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The *_init functions work basically the same as the Vulkan entrypoints
except that they act on an already-created view and take an optional
command buffer option. If a command buffer is given, the surface state is
allocated out of the command buffer's state stream.
This reverts commit b6ab076d6b.
It turns out setting USE_MEMFD to 0 is really bad because it means we can't
resize the pool. Besides, valgrind SVN handles memfd so we really don't
need this fallback for valgrind anymore.
The binding table pointers packet only allows for a 16-bit binding table
address so all binding tables have to be in the first 64 KB of the surface
state BO. We solve this by adding a slave block pool that pulls off the
first 64 KB worth of blocks and reserves them for binding tables.
Clear inherits the render targets from the current render pass. This
means we need to fill out the binding table after switching to meta
bindings. However, meta copies etc happen outside a render pass and
break when we try to fill in the render targets. This change fills the
render targets only for meta clear.
We leave the block pool untracked so that reads/writes to freed blocks
will get caught and do the tracking at the state pool/stream level. We
have to do a few extra gymnastics for streams because valgrind works in
terms of poitners and we work in terms of separate map and offset.
Fortunately, the users of the state pool and stream should always be using
the map pointer provided in the anv_state structure. We just have to
track, per block, the map that was used when we initially got the block.
Then we can make sure we always use that map and valgrind should stay
happy.
We now always copy the entire struct unless pData is NULL and
unconditionally write back the struct size. It's not clear this is
useful if the structs may grow over time, but it seems to be the
expected behaviour for now.
I've been promissed in a bug that this will be fixed in a future version of
the header. However, in the interest of my branch building, I'm adding
these changes in myself for the moment.
We don't actually use it to create all the instructions but we do use it
for insertion always. This should make things far more consistent for
implementing extended instructions.
We do control-flow handling as a two-step process. The first step is to
walk the instructions list and record various information about blocks and
functions. This is where the acutal nir_function_overload objects get
created. We also record the start/stop instruction for each block. Then
a second pass walks over each of the functions and over the blocks in each
function in a way that's NIR-friendly and actually parses the instructions.
Instead of having functions to add values and set various things, we just
have a function that does a few asserts and then returns the value. The
caller is then responsible for setting the various fields.
If we don't do this then recursive meta is completely broken. What happens
is that the outer meta call may change the bindings pointer and the inner
meta call will change it again and, when it exits set it back to the
default. However, the outer meta call may be relying on it being left
alone so it uses the non-meta descriptor sets instead of its own.
Previously, the GLSL_VK_SHADER macro didn't work if the shader contained
commas outside of parentheses due to the way the C preprocessor works.
This commit fixes this by making it variadic again and doing it correctly
this time.
This changes the way descriptor sets and layouts work so that we fill
out binding table contents at the time we bind descriptor sets. We
manipulate the binding table contents and sampler state in a shadow-copy
in anv_cmd_buffer. At draw time, we allocate the actual binding table
and sampler state and flush the anv_cmd_buffer copies.
This new utility, glsl_scraper.py scrapes C files for instances of the
GLSL_VK_SHADER macro, pulls out the shader source, and compiles it to
SPIR-V. The compilation is done using glslValidator. The result is then
placed into another C file as arrays of dwords that can be easiliy handed
to a Vulkan driver.
This way we can pass in a vertex shader and yet have the pipeline emit an
empty 3DSTATE_VS packet. We need this for meta because we need to trick
the compiler into not deleting our inputs but at the same time disable the
VS so that we can use a rectlist. This should go away once we actually get
SPIR-V.
This is rather crude but it at least makes sure that all the render targets
get flushed at the end of the pass. We probably actually want to do
somthing based on image layout traansitions, but this will work for now.
This is by no means a complete solution to the binding table problems.
However, it does make texturing actually work. Before, we were texturing
from the render target since they were both starting at 0.
We don't need to point back to the memory object the bo came from.
Pointing directly to a bo lets us bind images and buffers to other
bos - like our allocator bos.
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=86792">Bug 86792</a> - [NVC0] Portal 2 Crashes in Wine</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=90147">Bug 90147</a> - swrast: build error undeclared _SC_PHYS_PAGES on osx</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=90350">Bug 90350</a> - [G96] Portal's portal are incorrectly rendered</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=90363">Bug 90363</a> - [nv50] HW state is not reset correctly when using a new GL context</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=90310">Bug 90310</a> - Fails to build gallium_dri.so at linking stage with clang because of multiple redefinitions</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=90347">Bug 90347</a> - [NVE0+] Failure to insert texbar under some circumstances (causing bad colors in Terasology)</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=90520">Bug 90520</a> - Register spilling clobbers registers used elsewhere in the shader</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=60797">Bug 60797</a> - 1px lines in octave plot aliased to 0</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=67564">Bug 67564</a> - HiZ buffers are much larger than necessary</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=69226">Bug 69226</a> - Cannot enable basic shaders with Second Life aborts attempt</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=71591">Bug 71591</a> - Second Life shaders fail to compile (extension declared in middle of shader)</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=79202">Bug 79202</a> - valgrind errors in glsl-fs-uniform-array-loop-unroll.shader_test; random code generation</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=86837">Bug 86837</a> - kodi segfault since auxiliary/vl: rework the build of the VL code</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=86944">Bug 86944</a> - glsl_parser_extras.cpp", line 1455: Error: Badly formed expression. (Oracle Studio)</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=86974">Bug 86974</a> - INTEL_DEBUG=shader_time always asserts in fs_generator::generate_code() when Mesa is built with --enable-debug (= with asserts)</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=88246">Bug 88246</a> - Commit 2881b12 causes 43 DrawElements test regressions</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=88248">Bug 88248</a> - Calling glClear while there is an occlusion query in progress messes up the results</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=88521">Bug 88521</a> - GLBenchmark 2.7 TRex renders with artifacts on Gen8 with !UXA</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=88534">Bug 88534</a> - include/c11/threads_posix.h PTHREAD_MUTEX_RECURSIVE_NP not defined</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=88561">Bug 88561</a> - [radeonsi][regression,bisected] Depth test/buffer issues in Portal</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=88815">Bug 88815</a> - Incorrect handling of GLSL #line directive</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=88883">Bug 88883</a> - ir-a2xx.c: variable changed in assert statement</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=88885">Bug 88885</a> - Transform feedback uses incorrect interleaving if a previous draw did not write gl_Position</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89039">Bug 89039</a> - [SKL]etqw system hang</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89058">Bug 89058</a> - [SKL]Render error in some games (etqw-demo, nexuiz, portal)</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89068">Bug 89068</a> - glTexImage2D regression by texstore_rgba switch to _mesa_format_convert</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89069">Bug 89069</a> - Lack of grass in The Talos Principle on radeonsi (native\wine\nine)</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89131">Bug 89131</a> - [Bisected] Graphical corruption in Weston, shows old framebuffer pieces</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89156">Bug 89156</a> - r300g: GL_COMPRESSED_RED_RGTC1 / ATI1N support broken</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89180">Bug 89180</a> - [IVB regression] Rendering issues in Mass Effect through VMware Workstation</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89210">Bug 89210</a> - GS statistics fail on SNB</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89224">Bug 89224</a> - Incorrect rendering of Unigine Valley running in VM on VMware Workstation</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89260">Bug 89260</a> - macros.h:34:25: fatal error: util/u_math.h: No such file or directory</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89292">Bug 89292</a> - [regression,bisected] incomplete screenshots in some cases</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89328">Bug 89328</a> - python required to build Mesa release tarballs</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89342">Bug 89342</a> - main/light.c:159:62: error: 'M_PI' undeclared (first use in this function)</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89343">Bug 89343</a> - compiler/tests/radeon_compiler_optimize_tests.c:43:3: error: implicit declaration of function ‘fprintf’ [-Werror=implicit-function-declaration]</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89345">Bug 89345</a> - imports.h:452:58: error: expected declaration specifiers or '...' before 'va_list'</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89364">Bug 89364</a> - c99_alloca.h:40:22: fatal error: alloca.h: No such file or directory</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89679">Bug 89679</a> - [NV50] Portal/Half-Life 2 will not start (native Steam)</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89689">Bug 89689</a> - [Regression] Weston on DRM backend won't start with new version of mesa</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89726">Bug 89726</a> - [Bisected] dEQP-GLES3: uniform linking logic in the presence of structs</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89746">Bug 89746</a> - Mesa and LLVM 3.6+ break opengl for genymotion</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89754">Bug 89754</a> - vertexAttrib fails WebGL Conformance test with mesa drivers</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89758">Bug 89758</a> - pow WebGL Conformance test with mesa drivers</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89759">Bug 89759</a> - WebGL OGL ES GLSL conformance test with mesa drivers fails</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89963">Bug 89963</a> - lp_bld_debug.cpp:100:31: error: no matching function for call to ‘llvm::raw_ostream::raw_ostream()’</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=90310">Bug 90310</a> - Fails to build gallium_dri.so at linking stage with clang because of multiple redefinitions</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=90350">Bug 90350</a> - [G96] Portal's portal are incorrectly rendered</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=90363">Bug 90363</a> - [nv50] HW state is not reset correctly when using a new GL context</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=90397">Bug 90397</a> - ARB_program_interface_query: glGetProgramResourceiv() returns wrong value for GL_REFERENCED_BY_*_SHADER prop for GL_UNIFORM for members of an interface block with an instance name</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=90466">Bug 90466</a> - arm: linker error ndefined reference to `nir_metadata_preserve'</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=90520">Bug 90520</a> - Register spilling clobbers registers used elsewhere in the shader</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=90830">Bug 90830</a> - [bsw bisected regression] GPU hang for spec.arb_gpu_shader5.execution.sampler_array_indexing.vs-nonzero-base</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=90839">Bug 90839</a> - [10.5.5/10.6 regression, bisected] PBO glDrawPixels no longer using blit fastpath</li>
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.