Need to flush before updating the buffer to ensure that the copy is
ordered after previous accesses (assuming the app has performed the
appropriate barriers).
This fixes potential issues due to draws prior to an update reading
the new buffer content, despite having the necessary barriers between
them.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e0cc32b85b)
Ported from radeonsi, pointed out by Tom.
"This prevents LLVM from using sext instructions for local memory
offsets and allows the backend to fold immediate offsets into the
instruction. This also prevents some incorrect code generation for
ptrtoint and inttoptr instructions."
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tom Stellard <tstellar@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b8ee70384a)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/amd/common/ac_nir_to_llvm.c
Reenable the PPC64LE Vector-Scalar Extension for LLVM versions >= 3.8.1,
now that LLVM bug 26775 and its corollary, 25503, are fixed.
Amendment: remove extraneous spaces in macro def & invocations.
We would prefer a runtime check, e.g. via an LLVMQueryString
(analogous to glGetString, eglQueryString) or LLVMGetVersion API,
but no such API exists at this time.
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
[Emil Velikov: remove LLVM_VERSION macro]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 3f1b6ef2aa)
When binding as textures, the alignment can be 16. However when binding
as an image, the address has to be aligned to 256. (Also when binding as
an RT, but that can't happen with GL or current gallium APIs.)
Reported-by: Roy Spliet <nouveau@spliet.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 32dd8d59b6)
If i-th thread could not be created it means we have i threads,
not i+1, because we start from 0.
Fixes: 404d0d5 "gallium/u_queue: add an option to have multiple worker threads"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 7f268cf12b)
Commit 4aea8fe ("gallium/u_queue: fix random crashes when the app calls
exit()") added a atexit handler which calls
util_queue_killall_and_wait() for each queue to stop the threads.
However the app is also free to use atexit handlers to clean up things,
leading to util_queue_destroy() call which will also call
util_queue_killall_and_wait() for the same queue again, causing threads
being joined twice, and that is undefined. This happens with libglut,
for example. A simple fix is to just set num_threads to 0 as there are
no more valid threads after util_queue_killall_and_wait() returns.
Fixes: 4aea8fe "gallium/u_queue: fix random crashes when the app calls exit()"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 9936121935)
The number of dynamic descriptors is limited by both the number of
descriptors and the total number of dynamic things. Because there isn't
a single "maximum dynamic things" limit, we need to divide by two so
that they can create the maximum of both UBOs and SSBOs.
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5e44ef4a76)
The dynamic_offset_offset in the descriptor set binding layout is
relative to the dynamic_offset_start for the set in the pipeline
layout.
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 162beb2abb)
If we have any pending flushes on the primary command buffer, these
must be performed before executing the secondary buffer.
This fixes potential corruption when the contents of a subpass which
clears any of its render targets are given in a secondary buffer: the
flushes after a fast clear would not have been performed until the
vkCmdEndRenderPass call.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 290d7e892d)
See detailed explanation of why this is needed in commit eb60a89bc3.
This spot was missed/overlooked. Basically as a result of the fact
that BEGIN_* ends up calling PUSH_SPACE, which in turn adds an extra 8
to the requested amount, we have to be mindful of that when doing bare
nouveau_pushbuf_space calls.
Reportedly this fixes some crashes when replaying a hitman trace taken
on radeonsi.
Fixes: eb60a89bc3 ("nouveau: take extra push space into account for pushbuf_space calls")
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reported-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 8e6d67685e)
The header of ralloc needs to be aligned, because the compiler assumes
that malloc returns will be aligned to 8/16 bytes depending on the
platform, leading to degraded performance or alignment faults with ralloc.
Fixes SIGBUS on Raspberry Pi at high optimization levels.
This patch is not perfect for MSVC, as maybe in the future the alignment
for the most demanding data type might change to more than 8.
v2: Commit message reword/typo fix, and add a bigger explanation in the
code (by anholt)
Signed-off-by: Jonas Pfeil <pfeiljonas@gmx.de>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit cd2b55e536)
Squashed with commit:
ralloc: don't leave out the alignment factor
Experimentation shows that without alignment factor gcc and clang choose
a factor of 16 even on IA-32, which doesn't match what malloc() uses (8).
The problem is it makes gcc assume the pointer is 16 byte aligned, so
with -O3 it starts using aligned SSE instructions that later fault,
so always specify a suitable alignment factor.
Cc: Jonas Pfeil <pfeiljonas@gmx.de>
Fixes: cd2b55e5 "ralloc: Make sure ralloc() allocations match malloc()'s alignment."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100049
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Tested by: Mike Lothian <mike@fireburn.co.uk>
Tested by: Jonas Pfeil <pfeiljonas@gmx.de>
(cherry picked from commit ff494fe999)
Even though compute shaders cannot access the framebuffer, there is a
synchronization issue when a compute dispatch accesses a texture that
was previously bound and drawn to as a framebuffer.
Section 9.3 (Feedback Loops Between Textures and the Framebuffer) of
the OpenGL 4.5 spec rather implicitly clarifies that undefined behavior
results if the texture is still attached to the currently bound
framebuffer. However, the feedback loop is broken when the application
changes the framebuffer binding before a compute dispatch, and the
state tracker needs to let the driver known about this.
Fixes GL45-CTS.compute_shader.pipeline-post-fs on SI family Radeons.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 40c77bbf83)
exec_node::get_prev() does not guard against going past the beginning
of the list, so we need to add explicit checks here.
Found by ASAN in piglit arb_shader_storage_buffer_object-rendering.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 911391bd70)
When generating the MOV INDIRECT instruction, the source type is ignored
and it is set to destination's type. However, this is going to change in a
later patch, so we need to explicitly set the proper source type.
brw_vec8_grf() creates an float type's fs_reg by default, when the
ICP handle is actually unsigned. This patch fixes these cases before
applying the aforementioned patch.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit d8122128bc)
The lowered BSW/BXT indirect move instructions had incorrect
source types, which luckily wasn't causing incorrect assembly to be
generated due to the bug fixed in the next patch, but would have
confused the remaining back-end IR infrastructure due to the mismatch
between the IR source types and the emitted machine code.
v2:
- Improve commit log (Curro)
- Fix read_size (Curro)
- Fix DF uniform array detection in assign_constant_locations() when
it is acceded with 32-bit MOV_INDIRECTs in BSW/BXT.
v3:
- Move changes in assign_constant_locations() to other patch.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 56266df7ed)
Previously, if we had accesses with different sizes to the same uniform, we might not
push it aligned with the bigger one. This is a problem in BSW/BXT when we access
an array of DF uniform with both direct and indirect addressing because for the latter
we use 32-bit MOV INDIRECT instructions. However this problem can happen with other
generations and bitsizes.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit a497ab6838)
For blitting we need to use the depth or stencil format, never
the combined.
This fixes:
dEQP-VK.texture.shadow.2d.nearest.less_or_equal_d32_sfloat_s8_uint
and a few others.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 800b82ea13)
Per spec, VK_QUERY_RESULT_64_BIT specifies the integer size and the
availability flag is an integer. We apparently handled this correctly
already for the copy to buffer case.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 43d833ae97)
PKT3_OCCLUSION_QUERY hangs when used in a nested IB. This only
calls it when in a primary command buffer and we change
GetQueryPoolResults to not need it. CmdCopyQueryPoolResults
still needs it so we break that behavior for secondary command buffers.
However, that would hang already and using an unitialized value is
better than a hang.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8ea34a98c0)
The get_variable_being_redeclared() function can free 'var' because
a re-declaration of an unsized array variable can establish the size, so
we set the array type to the 'earlier' declaration and free 'var' as it is
not needed anymore.
However, the same 'var' is referenced later in ast_declarator_list::hir().
This patch fixes it by picking the ir_variable_mode from the proper
ir_variable.
This error was detected by Address Sanitizer.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99677
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a73a618933)
This fixes:
vdpauinfo: ../lib/CodeGen/TargetPassConfig.cpp:579: virtual void
llvm::TargetPassConfig::addMachinePasses(): Assertion `TPI && IPI &&
"Pass ID not registered!"' failed.
v2: use list_head, switch the call order in destroy
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 4aea8fe7e0)
Some things may not be floats and intel CPUs are known for mangling bits
when a float type is used for copying integers.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 8c8095c260)
The same PS epilog workaround as for 8-bit integer formats is required,
since the CB doesn't do clamping.
Fixes GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels*.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 066a117be7)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/gallium/drivers/radeonsi/si_shader.c
src/gallium/drivers/radeonsi/si_state_shaders.c
Also handle the GL_ARB_indirect_parameters case where the count itself
is in a buffer.
Use transfers rather than mapping the buffers directly. This anticipates
the possibility that the buffers are sparse (once ARB_sparse_buffer is
implemented), in which case they cannot be mapped directly.
Fixes GL45-CTS.gtf43.GL3Tests.multi_draw_indirect.multi_draw_indirect_type
on <= CIK.
v2:
- unmap the indirect buffer correctly
- handle the corner case where we have indirect draws, but all of them
have count 0.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
(cherry picked from commit 6a1d9684f4)
Allocating huge buffers in VRAM is not a problem, but when those buffers
start being migrated, the kernel runs into errors because it cannot split
those buffer up for moving through GTT.
This should fix intermittent failures of
GL45-CTS.texture_buffer.texture_buffer_max_size
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 550125e1e7)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
If llvm::sys::getHostCPUName() returns "generic", override
it with "pwr8" (on PPC64LE).
This is a work-around for a bug in LLVM: a table entry for "POWER8NVL"
is missing, resulting in (big-endian) "generic" being returned on
little-endian Power8NVL systems. The result is that code that
attempts to load the least significant 32 bits of a 64-bit quantity in
memory loads the wrong half.
This omission should be fixed in the next version of LLVM (4.0),
but this work-around should be left in place in case some
future version of POWER<n> also ends up unrepresented in LLVM's table.
This workaround fixes failures in the Piglit arb_gpu_shader_fp64 conversion
tests on POWER8NVL processors.
(V4: add similar comment in the code.)
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Cc: 12.0 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b934aae364)
start can only be non-zero with MultiDrawElements, which is unlikely
to occur with UNSIGNED_BYTE indices.
v2: Also fix the util_shorten_ubyte_elts_to_userptr call.
Tested with the new piglit.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit a264fee624)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Conflicts:
src/gallium/drivers/radeonsi/si_state_draw.c
We only use the freed ones after all free space has been used. If
the app only allocates small descriptor sets, we might go over
max_sets before the memory is full.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
Fixes: f4e499ec79
(cherry picked from commit f448701622)
It's broken in a number of ways. In particular, a bunch of the
conditions are backwards so it doesn't actually detect what it's
supposed to detect. Since it's been broken, it hasn't actually been
helping anything so just deleting it isn't a regression.
This (and removing another optimization) were done on master in commit
b073811617.
Cc: "Kenneth Grunke" <kenneth@whitecape.org>
Cc: "Mark Janes" <mark.a.janes@intel.com>
[Emil Velikov: patch is a backport of the below "cherry pick"]
Fixes: a4393bd97f ("i965/fs: Fix the inline nir_op_pack_double optimization")
(cherry picked from commit b073811617)
vkQueuePresentKHR() takes VkPresentInfoKHR pointer and includes a
pResults fields which must holds the results of all the images
requested to be presented. Currently we're not filling this field.
Also as a side effect we probably want to go through all the images
rather than stopping on the first error.
This commit also makes the QueuePresentKHR() implementation return the
first error encountered.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0fcb92c17d)
Even though we supported both coherent and non-coherent memory types, we
effectively forced apps to use the coherent types by accident. Found by
inspection, only compile tested.
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6319bfc2a6)
It's trivial to swizzle clear colors on the CPU, easily deals with the
hardware restrictions for render target swizzles, and makes swizzled
clears work on all hardware as opposed to just HSW+.
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e233db6e93)
Now that we have OES_tessellation_shader, the same situation can occur
in ES too, not just GL core profile.
Having a TCS but no TES may confuse drivers - i965 crashes, for example.
This prevents regressions in
ES31-CTS.core.tessellation_shader.single.xfb_captures_data_from_correct_stage
with some SSO pipeline validation changes I'm making.
v2: Add an ES spec citation (suggested by Alejandro)
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
(cherry picked from commit 05a56893aa)
From GLSL ES 3.10 spec, section 4.1.9 "Arrays":
"If an array is declared as the last member of a shader storage block
and the size is not specified at compile-time, it is sized at run-time.
In all other cases, arrays are sized only at compile-time."
In desktop GLSL it is allowed to have unsized-arrays that are
not last, as long as we can determine that they are implicitly
sized, which is detected at link-time.
With this patch Mesa reports a compilation error as glslang does with
the following shader:
buffer SSBO { vec4 data[]; vec4 moreData;};
void main (void)
{
}
Fixes:
dEQP-GLES31.functional.debug.negative_coverage.log.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.callbacks.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.compile_compute_shader
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5bc222ebaf)
We just increased the max UBO, so we should also increase the clamp that
we do for robustness. Similarly, as we're including the fileIndex in the
new indirect value, we should reset fileIndex to 0 so that it is not
added in a second time.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c95f821cb4)
Kepler and up unfortunately only support up to 8 constbufs. We work
around this by loading from constbufs as if they were storage buffers.
However we were not consistently applying limits to loads from these
buffers. Make sure to do the same thing we do for storage buffers.
Fixes GL45-CTS.robust_buffer_access_behavior.uniform_buffer
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 1acdd62847)
Apparently GL 4.5 requires 14 of these (there's a "*" in the spec, but
it's unclear what it refers to). We need to expose an extra binding
point for the "program parameters", which means this must be 15. Remove
the last vestige of the "use c14 for immediates" idea.
Fixes GL45-CTS.shading_language_420pack.binding_uniform_block_array
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 59ca352fc5)
As per the spec -
"The functions memoryBarrierShared() and groupMemoryBarrier() are
available only in compute shaders; the other functions are available
in all shader types."
Conform to this by adding another delegate to check for compute
shader support instead of only whether the current stage is compute
This allows some fragment shaders in Dirt Rally to compile
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 21efe2528c)
(Patch co-authored by Jason and Ken.)
We scaled the guardband based on the viewport size, but failed to
take into account the translation portion of the viewport transform.
This meant the guardband was always centered around the origin.
We want it to be centered around the screen-space drawing area,
which is the intersection of the viewport and the render target.
At best, getting this wrong would reduce the guardband's effectiveness
in some cases. At worst, it might break things - objects outside of the
guardband are trivially rejected, so getting the guardband in the wrong
place and leaving guardband clipping enabled could cause problems.
v2: drop clamping of positive maximums.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f3c068c5c8)
The next patch will make the guardband calculation dependent on the
transformation matrix. Instead of computing it in both atoms, just
combine them into a single atom.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 89ad7f1be6)
CMASK alignment can be greater than image data alignment, so pass
it to the app so that it knows what alignment to backing memory
should have.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit eb01b20cc4)
The GLX specification says about glXDestroyPixmap:
"The storage for the GLX pixmap will be freed when it is not current
to any client."
We're not really following this language to the letter: some of the storage
is freed immediately (in particular, the dri3_drawable, which contains both
GLXDRIdrawable and loader_dri3_drawable). So we NULL out the pointers to
that freed storage; the previous patches added the corresponding NULL-pointer
checks.
This fixes memory corruption in piglit
./bin/glx-visuals-depth/stencil -pixmap -auto
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 7be0e602ed)
The GLX specification says about glXDestroyPixmap:
"The storage for the GLX pixmap will be freed when it is not current
to any client."
So arguably, functions like glXSwapIntervalMESA can be called after
glXDestroyPixmap has been called for the currently bound GLXPixmap.
In that case, the GLXDRIDrawable no longer exists, and so we just skip
those calls.
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f446f3fb33)
With a subsequent patch, we might see NULL loaderPrivates, e.g. when
a DRIdrawable is flushed whose corresponding GLXDRIdrawable was destroyed.
This resulted in a crash, since the loader vs. DRI3 drawable structures
have a non-zero offset.
Fixes glx-visuals-{depth,stencil} -pixmap
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 40c304fc06)
cs can be NULL when it comes from r600_buffer_map_sync_with_rings()
to avoid doing the same checks. It was checked for write mappings
but not for read mappings.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit af303abcdb)
This fixes a bunch of buffer related:
dEQP-VK.memory.pipeline_barrier.*
tests, that were crashing in LLVM due to this being missing.
Reviewed-by: Andres Rodriguez<andresx7@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 0ecd426490)
This fixes a bug uncovered by the 17-part patch series, specifically:
"gallium/radeon: merge dirty_fb_counter and dirty_tex_descriptor_counter"
If dirty_tex_counter has been updated and set_shader_image invokes DCC
decompression, the DCC decompression itself checks the counter and updates
descriptors, which in turn invokes the same DCC decompression. The blitter
can't handle the recursion and the driver eventually crashes.
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit a0740d59aa)
At this point, the pitch is in bytes. We haven't yet divided the pitch
by 4 for tiled surfaces, so abs(pitch) may be larger than 32K. This
means the bit 15 trick won't work.
The caller now has signed integers anyway, so just pass those through
and do the obvious check.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 02216a1ddf)
SIMD16 compute shaders use a send(16) with mlen 1 for the EOT message,
using a source of g127 for the single register. With a UD type, this
supposedly could read g128, which doesn't exist, causing the simulator
to get cranky. Use a UW type to avoid this.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit fcf723b647)
I hadn't bothered to set this bit because I figured it would just
paper over us getting the rectangle wrong. But it turns out that
there is a legitimate reason to use it, so let's do so.
The alternative would be to chop up 16k clears to multiple 8k clears,
which is pointlessly painful.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 5106df85da)
Along the lines of what
3b804819 anv: Default PointSize to 1.0 if not written by the shader
does for anv, program a default point size in the hw of 1.0.
This preempt fixes a bunch of geom shader tests.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 2ab2be092d)
If radeonsi starts compiling an optimized shader variant asynchronously
with a GL debug callback set and the application destroys the GL context,
radeonsi crashes when trying to write shader stats into the debug output
of a non-existent context after compilation, because st/mesa was destroyed
before pipe_context.
Firefox with WebGL2 enabled hits this bug.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99456
v2: protect against a double destroy in st_create_context_priv and callers.
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit d9ef549238)
OpenGL ES implementations are not allowed to ship ARB extensions, and
OpenGL implementations are not allowed to ship OES extensions.
The functionality is also included in GL_ARB_ES2_compatibility. Ever
OpenGL core-profile driver currently exposes both extensions. I don't
know of any applications that explicitly check for GL_OES_read_format,
so removing it seems very unlikely to cause problems. No functionality
is removed.
I have left this extension in place for compatibility profile. There
are still OpenGL 1.x drivers in Mesa, and adding code to check for
compatibility profile and not GL_ARB_ES2_compatibility for
GL_IMPLEMENTATION_COLOR_READ_TYPE and GL_IMPLEMENTATION_COLOR_READ_FORMAT
just feels dumb.
Three other other alternatives considered:
- Remove the string from compatibility profile drivers but leave the
functionality in place.
- Add a flag to expose the extension string, and set it in every OpenGL
driver that does not expose GL_ARB_ES2_compatibility (and those
drivers only). I tried this. You can't have two instances of an
extension in the extension table (one dummy_true for ES1 and one with
a flag for compatibility profile), so the implementation requires a
bit of effort.
- Only expose the extension in compatibility if the version is less
than 2.0. I didn't see an easy way to do this.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c4a0c1efff)
Some query results struct contents are declared as cache line aligned.
Use aligned malloc, and align the whole struct, to be safe.
Fixes crash when compiling with clang.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit 00847e4f14)
CalculateProcessorTopology tries to figure out system topology by
parsing /proc/cpuinfo to determine the number of threads, cores, and
NUMA nodes. There are some architectures where the "physical id" begins
with 1 rather than 0, which was creating and empty "0" node and causing a
crash in CreateThreadPool.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97102
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b829206b07)
At least on VI, texture gather doesn't work with a 24_8 data format, so
use 8_8_8_8 and a modified swizzle instead.
A bit of background: When creating a GL_STENCIL_INDEX8 texture, we select
the X24S8 pipe format because we don't support stencil-only render targets
properly. With mip-mapping this can lead to a setup where the tiling is
incompatible with stencil texturing, and a flushed stencil texture is
used. For the flushed stencil, a literal X24S8 is used because there were
issues with an 8bpp DB->CB copy.
Longer term, it would be good if we could get away from these workarounds,
i.e. properly support an S8 format for stencil-only rendering and flushed
stencil. Since stencil texturing is somewhat rare, it's not a high
priority.
Fixes GL45-CTS.texture_cube_map_array.sampling.
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
(cherry picked from commit 3cd092c415)
This reverts commit 9185a3385b.
As per the updated documentation - see commit 9e4248b206
("docs/submittingpatches.html: remove version tag for nominations")
the explicit version tag is no longer needed.
Conflicts:
bin/get-pick-list.sh
before commit f871946594
image_loader_extension was always present in dri2_dpy->extensions,
after that commit it is only present for render nodes.
Its removal broke partial render based on buffer age on (at least)
raspberry pi.
Fixes: f871946594 "egl/dri2: rework dri2_egl_display::extensions storage"
Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 534ea2b5ba)
configure.ac already requires 2.4.66.
Fix SCons build. drmDevicePtr is not available until libdrm 2.4.65.
Compiling src/loader/loader.c ...
src/loader/loader.c:111:40: error: unknown type name ‘drmDevicePtr’
static char *drm_construct_id_path_tag(drmDevicePtr device)
^
Fixes: 4a183f4d06 ("scons: loader: use libdrm when available")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98421
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Vedran Miletić <vedran@miletic.net>
(cherry picked from commit f2770fb3d5)
The currently used range HEAD..origin/master is far too broad. It looks
for nominations within the already_landed list (branchpoint..HEAD).
Similarly we look for already_landed whiting the [possible] nominations
Rand branchpoint..origin/master.
Improve things by limiting the look ups to the branch point.
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit d292f12d94)
Currently we loop (git log --grep) to check if the fix has landed. We
can simplify and make things faster by storing the already_picked list
and grep ping through it.
Slim down the message while we're here.
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 71e00d62ed)
Commit 8bca8d89ef ("glx/glvnd: Fix dispatch function names and indices")
fixed the sorting of the array initializers in g_glxglvnddispatchfuncs.c
because FindGLXFunction's binary search needs these to be sorted
alphabetically.
That commit also mostly fixed the sorting of the DI_foo defines in
g_glxglvnddispatchindices.h, which is what actually matters as the
arrays are initialized using "[DI_foo] = glXfoo," but a small error
crept in which at least causes glXGetVisualFromFBConfigSGIX to not
resolve, breaking games such as "The Binding of Isaac: Rebirth" and
"Crypt of the NecroDancer" from Steam not working and possible causes
other problems too.
This commit fixes the last of the sorting errors, fixing these mentioned
games not working.
Fixes: 8bca8d89ef ("glx/glvnd: Fix dispatch function names and indices")
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Cc: Adam Jackson <ajax@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 4c66f529a8)
The kernel will reject our shader if we emit one here, and having 4, 8, or
12 as the top end of our UBO clamp rare is enough that it's not worth
making the kernel let us.
Fixes piglit fs-const-array-of-struct and
fs-const-array-of-struct-of-array since recent GLSL linking changes made
us get this as an indirect load of a uniform, instead of a tempoary.
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b230939303)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/gallium/drivers/vc4/vc4_opt_small_immediates.c
We want cached GTT for all non-persistent read mappings.
Set level = 0 on purpose.
Use dma_copy, because resource_copy_region causes a failure in the PBO
read of piglit/getteximage-luminance.
If Rocket League used the READ flag, it should get cached GTT.
v2: mask out UNSYNCHRONIZED
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit d86099df0a)
The instruction has an associated label when Instruction.Label == 1,
as can be seen in ureg_emit_label() or tgsi_build_full_instruction().
This fixes dump generating extra :0 labels on conditionals, and virgl
parsing more than the expected tokens and eventually reaching "Illegal
command buffer" (when parsing more than a safety margin of 10 we
currently have).
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit dc2d9b8da1)
Now that information about which array-of-arrays elements are accessed
is tracked, use that information to only mark an instance array element
as used-by-stage if, in fact, it is.
Fixes GL45-CTS.program_interface_query.uniform-block-types.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ceea514d91)
Very soon this visitor will get more complicated. The users of the
existing ir_variable_refcount visitor won't need the coming
functionality, and this use doesn't need much of the functionality of
ir_variable_refcount.
v2: ir_array_refcount_visitor::get_variable_entry cannot return NULL, so
don't check it. Suggested by Timothy.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5085b64031)
One for the array parts and one for the leaf members. This will
simplify later changes.
The indentation is wonkey after this patch. This was done to make it
more obvious that the function is just getting split. The next patch
will fix the indentation.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 8862fefba0)
As the linked per-stage shaders are processed, mark any block that has a
field that is accessed as referenced. When combining all the linked
shaders, combine the per-stage stageref masks.
This fixes a number of GLES CTS tests:
ES31-CTS.core.geometry_shader.program_resource.program_resource
ES32-CTS.core.geometry_shader.program_resource.program_resource
ESEXT-CTS.geometry_shader.program_resource.program_resource
piglit.gl45-cts.geometry_shader.program_resource.program_resource
However, it makes quite a few more fail:
ES31-CTS.functional.program_interface_query.buffer_variable.random.6
ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.compute.unnamed_block.float
ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.separable_fragment.unnamed_block.float
ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_fragment_only_fragment.unnamed_block.float
ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_fragment.unnamed_block.float
ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_geo_fragment_only_fragment.unnamed_block.float
ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_geo_fragment.unnamed_block.float
ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_fragment_only_fragment.unnamed_block.float
ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_fragment.unnamed_block.float
ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_geo_fragment_only_fragment.unnamed_block.float
ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_geo_fragment.unnamed_block.float
ES32-CTS.functional.program_interface_query.buffer_variable.random.6
ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.compute.unnamed_block.float
ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.separable_fragment.unnamed_block.float
ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_fragment_only_fragment.unnamed_block.float
ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_fragment.unnamed_block.float
ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_geo_fragment_only_fragment.unnamed_block.float
ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_geo_fragment.unnamed_block.float
ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_fragment_only_fragment.unnamed_block.float
ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_fragment.unnamed_block.float
ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_geo_fragment_only_fragment.unnamed_block.float
ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_geo_fragment.unnamed_block.float
I have diagnosed the failures, but I'm not sure whether we or the
tests are wrong. After optimizations are applied, all of the tests
are of the form:
buffer X {
float f;
} x;
void main()
{
x.f = x.f;
}
The test then queries that x is referenced by that shader stage. We
eliminate the assignment of x.f to itself, and that removes the last
reference to x. We report that x is not referenced, and the test fails.
I do not know whether or not we are allowed to eliminate that assignment
of x.f to itself.
After discussions with the OpenGL ES group in Khronos, we believe that
Mesa's behavior is correct. I will provide patches to the CTS tests
to Khronos.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 084105c213)
[Emil Velikov: nominate considering the above fixes and it's a
requirement for the following nine patches]
Nominated-by: Emil Velikov <emil.velikov@collabora.com>
intel_miptree_make_shareable() discarded and disabled CCS. Fix it so
that it discards and disables HiZ too.
Fixes dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer
on Skylake.
v2: Actually do what the commit message says. Discard the HiZ buffer.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98329
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: Nanley Chery <nanley.g.chery@intel.com
Cc: Haixia Shi <hshi@chromium.org>
(cherry picked from commit 42011be1e2)
[Emil Velikov: patch is a backport by Chad of above commit]
Current blorp logic issues unconditional "flush everything"
(see brw_emit_mi_flush()) after each render. For example, all
blits issue this unconditionally which shouldn't be needed if
they set render cache properly so that subsequent renders do
necessary flushing before drawing.
In case of piglit:
ext_framebuffer_multisample-accuracy all_samples depth_draw small
intel_hiz_exec() is always preceded by blorb blit and the
unconditional flush looks to hide the lack of stall and flushes
in depth clears. By removing the brw_emit_mi_flush() I get gpu
hangs.
This patch adds the stalls and flushes mandated by the spec
and gets rid of those hangs.
v2 (Jason, Ken): Document the rational for separating
depth cache flush and stall on Gen7.
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit e6da6943fe)
This fixes
GL45-CTS.tessellation_shader.tessellation_shader_tessellation.max_in_out_attributes
on nouveau. We only support 30 patch varyings (as 2 vec4 slots end up
being used for tess level settings), but were getting 32 exposed.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7d3f9ed71c)
It is not clear from the docs exactly how pipelined STATE_BASE_ADDRESS
actually is. We know from experimentation that we need to flush the
render cache prior to emitting STATE_BASE_ADDRESS and invalidate the
texture cache afterwards. The only thing the PRM says is that, on gen8+
we're supposed to invalidate the state cache after STATE_BASE_ADDRESS
but experimentation has indicated that doing so does nothing whatsoever.
Since we don't really know, let's do just a bit more flushing in the
hopes that this won't be a problem again. In particular:
1) Do a CS stall before we emit STATE_BASE_ADDRESS since we don't
really know whether or not it's pipelined.
2) Do a data cache flush in case what runs before STATE_BASE_ADDRESS
is a compute shader.
3) Invalidate the state and constant caches after STATE_BASE_ADDRESS
because the state may be getting cached there (we don't really know).
Reported-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 92128590bc)
We had no good reason for *not* doing this on gen7 before but we didn't
know it was needed. Recently, when trying update to Vulkan CTS version
1.0.2 in our CI system, Mark discovered GPU hangs on Haswell that appear
to be STATE_BASE_ADDRESS related. This commit fixes them.
Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f1f9794118)
Commit 7b5878ee04 increased number of
outputs to 64, but left output array intact. This caused stack overflow
when number of outputs is bigger then 32. Found by ASAN.
Cc: "12.0 13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit a41f2527ae)
Applications may delete a shader program, create a new one, and bind it
before the next draw. With terrible luck, malloc may randomly return a
chunk of memory for the new gl_program that happened to be the exact
same pointer as our previously bound gl_program. In this case, our
logic to detect new programs in brw_upload_pipeline_state() would break:
if (brw->vertex_program != ctx->VertexProgram._Current) {
brw->vertex_program = ctx->VertexProgram._Current;
brw->ctx.NewDriverState |= BRW_NEW_VERTEX_PROGRAM;
}
Because the pointer is the same, we'd think it was the same program.
But it could be wildly different - a different stage altogether,
different sets of resources, and so on. This causes utter chaos.
As unlikely as this seems, I believe I hit this when running a subset
of the CTS in a loop, in a group of tests that churns through simple
programs, deleting and rebuilding them. Presumably malloc uses a
bucketing cache of sorts, and so freeing up a gl_program and allocating
a new one fairly quickly causes it to reuse that memory.
The result was that brw->vertex_program->info.num_ssbos claimed the
program had SSBOs, while brw->vs.base.prog_data.binding_table claimed
that there were none. This was crazy, because the binding table is
calculated from info.num_ssbos - the shader info appeared to change
between shader compile time and draw time. Careful use of watchpoints
revealed that it was being clobbered by rzalloc's memset when building
an entirely different program...
Fortunately, our 0xd0d0d0d0 canary for unused binding table entries
caused us to crash out of bounds when trying to upload SSBOs, or we
may have never discovered this heisenbug.
Fixes crashes in GL45-CTS.compute_shader.sso-case2 when using a hacked
cts-runner that only runs GL45-CTS.compute_shader.s* in EGL config ID 5
at 64x64 in a loop with 100 iterations.
Cc: "17.0 13.0 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 7c5629a269)
The size of the pool is slightly smaller than the size of the
structure containing the whole pool. We need to take that into account
on when setting up the internals.
Fixes a crash due to out of bound memory access in:
dEQP-VK.api.descriptor_pool.out_of_pool_memory
v2: Drop debug traces (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c3421106ec)
The spec section 5.2 says:
"vkAllocateCommandBuffers can be used to create multiple command
buffers. If the creation of any of those command buffers fails, the
implementation must destroy all successfully created command buffer
objects from this command, set all entries of the pCommandBuffers
array to VK_NULL_HANDLE and return the error."
Fixes:
dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_primary
dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_secondary
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 25e21cb8d0)
In brw_blorp_copyteximage, we use the format from the render buffer.
This could be a combined depth/stencil format. In this case, we handle
stencil properly but we give blorp the wrong ISL format. Specifically,
we would give blorp ISL_FORMAT_R32G32B32A32_FLOAT which is the wrong
size was causing GPU hangs.
Fixes: GL45-CTS.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_copyteximage
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4c180f9633)
Some CIK-VI docs say this is the default behavior on SI. That doesn't
answer whether it's also the default behavior on CIK-VI.
Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 573bf0940a)
The previous code always compared integers as 64-bit. Due to variations
in sign-extension in the code generated by nir_opt_algebraic.py, this
meant that nir_search doesn't always do what you want. Instead, 32-bit
values should be matched as 32-bit and 64-bit values should be matched
as 64-bit. While we're here we unify the unsigned and signed paths.
Now that we're using the right bit size, they should be the same since
the only difference we had before was sign extension.
This gets the UE4 bitfield_extract optimization working again. It had
stopped working due to the constant 0xff00ff00 getting sign-extended
when it shouldn't have.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bb96b03461)
Blorp can deal with depth/stencil surfaces blits/copies without the
render target requirement. Also having both render target and
depth/stencil requirement is incompatible from isl's point of view.
This fixes an image creation issue in the high level quality settings
of the Unity3D player, which requires a depth texture with src/dst
transfer & 4x multisampling.
v2: Simply aspect checking condition (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 74c23bde5b)
A nomination unadorned with a specific version is now interpreted as
being aimed at the 17.0 branch, which was recently opened.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
If begin_frame is called before setting intra_matrix and
non_intra_matrix it leads to segmentation faults when
vl_mpeg12_decoder.c is used.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92634
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 4b0e9babc6)
Squashed with commit:
st/va: make sure that we call begin_frame() only once v2
This fixes "st/va: delay calling begin_frame until we have all parameters".
v2: call begin frame after decoder (re)creation as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Tested-by: Andy Furniss <adf.lists@gmail.com>
(cherry picked from commit 1338d912f5)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/gallium/state_trackers/va/va_private.h
Make sure unused ops and their references are removed, prior to entering
the GCM (global code motion) pass, to stop GCM from breaking the loop
logic and thus hanging the GPU.
Turns out, that sb has problems with loops and node optimizations
regarding associative folding:
- the global code motion (gcm) pass moves ops up a loop level/basic block
until they've fulfilled their total usage count
- if there are ops folded into others, the usage count won't be
fulfilled and thus the op moved way up to the top
- within GCM the op would be visited and their deps would be moved
alongside it, to fulfill the src constaints
- in a loop, an unused op is moved out of the loop and GCM would move
the src value ops up as well
- now here arises the problem: if the loop counter is one of the src
values it would get moved up as well, the loop break condition would
never get hit and the shader turn into an endless loop, resulting in the
GPU hanging and being reset
A reduced (albeit nonsense) piglit example would be:
[require]
GLSL >= 1.20
[fragment shader]
uniform int SIZE;
uniform vec4 lights[512];
void main()
{
float x = 0;
for(int i = 0; i < SIZE; i++)
x += lights[2*i+1].x;
}
[test]
uniform int SIZE 1
draw rect -1 -1 2 2
Which gets optimized to:
===== SHADER #12 OPT ================================== PS/BARTS/EVERGREEN =====
===== 42 dw ===== 1 gprs ===== 2 stack =========================================
ALU 3 @24
1 y: MOV R0.y, 0
t: MULLO_UINT R0.w, [0x00000002 2.8026e-45].x, R0.z
LOOP_START_DX10 @22
PUSH @6
ALU 1 @30 KC0[CB0:0-15]
2 M x: PRED_SETGE_INT __.x, R0.z, KC0[0].x
JUMP @14 POP:1
LOOP_BREAK @20
POP @14 POP:1
ALU 2 @32
3 x: ADD_INT R0.x, R0.w, [0x00000002 2.8026e-45].x
TEX 1 @36
VFETCH R0.x___, R0.x, RID:0 MFC:16 UCF:0 FMT[..]
ALU 1 @40
4 y: ADD R0.y, R0.y, R0.x
LOOP_END @4
EXPORT_DONE PIXEL 0 R0.____ EOP
===== SHADER_END ===============================================================
Notice R0.z being the loop counter/break condition relevant register
and being never incremented at all. Also some of the loop content
has been moved out of it, to fulfill the requirements for the one unused
op.
With a debug build of mesa this would produce an error like
error at : PRED_SETGE_INT __, __, EM.2, R1.x.2||FP@R0.z, C0.x
: operand value R1.x.2||FP@R0.z was not previously written to its gpr
and the compilation would fail due to this. On a release build it gets
passed to the GPU.
When using this patch, the loop remains intact:
===== SHADER #12 OPT ================================== PS/BARTS/EVERGREEN =====
===== 48 dw ===== 1 gprs ===== 2 stack =========================================
ALU 2 @24
1 y: MOV R0.y, 0
z: MOV R0.z, 0
LOOP_START_DX10 @22
PUSH @6
ALU 1 @28 KC0[CB0:0-15]
2 M x: PRED_SETGE_INT __.x, R0.z, KC0[0].x
JUMP @14 POP:1
LOOP_BREAK @20
POP @14 POP:1
ALU 4 @30
3 t: MULLO_UINT T0.x, [0x00000002 2.8026e-45].x, R0.z
4 x: ADD_INT R0.x, T0.x, [0x00000002 2.8026e-45].x
TEX 1 @40
VFETCH R0.x___, R0.x, RID:0 MFC:16 UCF:0 FMT[..]
ALU 2 @44
5 y: ADD R0.y, R0.y, R0.x
z: ADD_INT R0.z, R0.z, 1
LOOP_END @4
EXPORT_DONE PIXEL 0 R0.____ EOP
===== SHADER_END ===============================================================
Piglit: ./piglit summary console -d results/*_gpu_noglx
name: unpatched_gpu_noglx patched_gpu_noglx
---- ------------------- -----------------
pass: 18016 18021
fail: 748 743
crash: 7 7
skip: 1124 1124
timeout: 0 0
warn: 13 13
incomplete: 0 0
dmesg-warn: 0 0
dmesg-fail: 0 0
changes: 0 5
fixes: 0 5
regressions: 0 0
total: 19908 19908
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94900
Tested-by: Heiko Przybyl <lil_tux@web.de>
Tested-on: Barts PRO HD6850
Signed-off-by: Heiko Przybyl <lil_tux@web.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit e933246013)
Nominated-by: Andreas Boll <andreas.boll.dev@gmail.com>
Fixes a number of transform feedback tests when run with Linux 4.8,
which allows us to use the MI_LOAD_REGISTER_REG command, at which point
we started using this new broken path.
ES3-CTS.functional.transform_feedback.array_element.interleaved.lines.*
and Piglit's arb_transform_feedback2/draw-auto are both fixed by this
patch, for example.
Thanks to Chris Wilson for catching this mistake!
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99030
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 2138347a45)
In situations where libdrm_amdgpu and mesa are installed to the same
location, the mesa installed headers will take precedence over the git
source headers.
This is due to the AMDGPU_CFLAGS containing the install directory.
This situation can cause build errors if the git version of a header is
newer than the currently installed version of a header (e.g. git pull
updates vulkan.h)
Note: using the same install prefix for mesa and libdrm is probably a
common occurrence since it is described in the radeonBuildHowTo wiki:
https://www.x.org/wiki/radeonBuildHowTo/
v2: added sign-off
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit a3ad6a34c6)
Nominated-by: Emil Velikov <emil.velikov@collabora.com>
Prep-work for next patch, mostly move to tracking last_fence as a
pipe_fence_handle (created now only in fd_gmem_render_tiles()), and a
bit of superficial renaming.
Signed-off-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 16f6ceaca9)
Fixes a glxgears issues.
Reported-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Nominated-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
We were using ir_var_auto for the inlined function parameter variables,
which is wrong, as it suggests that those are real variables declared
by the program.
Normally this doesn't matter. However, if you called built-ins at
global scope, it would pollute the global variable namespace with
these new parameter temporaries. If the shader already had variables
with those names, the linker might see contradictory global variable
declarations and raise an error.
Making them temporaries indicates that these are just things generated
by the compiler internally. This avoids confusing the linker.
Fixes a new Piglit test: glsl-fs-multiple-builtins.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99154
Reported-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 62b8bcda1c)
As described in commit 690ead4a13 ("egl/wayland-egl: Fix for segfault
in dri2_wl_destroy_surface.") if we attempt to destroy a EGL surface
attached to already destroyed Wayland window we'll get a segfault.
v2: set the correct callback alongside the window->private. (Dan)
Cc: Daniel Stone <daniels@collabora.com>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit bfd6314350)
The commit resolves the issue by using new Wayland API thus bumping the
Wayland build requirement from 1.2 to 1.11. Allowing requirement changes
[for stable] is a rare exception and in this case we jump over 3 years
worth, which is not cool.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Typos do happen as people nominate patches for stable. This script aims
to catch most of those.
Due to the subtle nature of things, one has to pay special attention to
the output, similar to get-extra-pick-list.sh.
At the moment only the following is handled:
grep -i "CC:.*mesa-dev"
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f0bdd13fdb)
Ever since a long time ago when I messed around with fences, I ensure
that after a PUSH_SPACE call there is enough space to write a fence out
into the pushbuf.
However the PUSH_SPACE macro is not all-knowing, and so sometimes we
have to invoke nouveau_pushbuf_space manually with the relocs/pushes
args set. If we don't take the extra allocation from PUSH_SPACE into
account, then we will end up accidentally flushing when the code was not
expecting a flush. This can lead to various runtime and rendering
failures.
The amount of extra allocation isn't that important - it has to be at
least 8 based on the current nouveau_winsys.h setting, but even more
won't hurt. I just rounded up to powers of 2.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99354
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Ben Skeggs <bskeggs@redhat.com>
(cherry picked from commit eb60a89bc3)
This patch implements vk_icdNegotiateLoaderICDInterfaceVersion(), which
brings us to loader interface v3.
v2:
- Drop the pragmas. [emil]
- Advertise v3 instead of v2. Anvil supported more than I
thought. [jason]
- s/Surface/SurfaceKHR/ in comments. [emil]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 1e41d7f7b0)
We can't import the latest vk_icd.h because the new header breaks the
Mesa build. This patch defines new casting macros,
ICD_DEFINE_NONDISP_HANDLE_CASTS() and ICD_FROM_HANDLE(), which can
handle both the old and new vk_icd.h, and will prevent the build from
breaking when we update the header.
In the old vk_icd.h, types were defined as:
typedef struct _VkIcdFoo {
...
} VkIcdFoo;
Commit 6ebba1f6 in the Vulkan loader changed the above to
typedef {
...
} VkIcdFoo;
because the old definitions violated the C and C++ specs. According to
the specs, identifiers that begins with an underscore followed by an
uppercase letter are reserved. (It's pedantic, I know), See the Github
issue referenced below.
References: https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers/issues/7
References: 6ebba1f630
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c085bfcec9)
Currently its dependant on the user calling and checking the result
of list_empty() before using the result of list_is_singular().
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 0252ba26c5)
vtn_ssa_value() can produce variable loads, and the cursor might
be after a return statement, causing nir_builder assert failures
about not inserting instructions after a jump.
This fixes:
dEQP-VK.spirv_assembly.instruction.graphics.barrier.in_if
dEQP-VK.spirv_assembly.instruction.graphics.barrier.in_switch
Cc: "13.0 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 203c128781)
Marking operations as redundant if they are equal to the base
range is fine when the tree structure is something like this:
max
/ \
max b
/ \
3 max
/ \
3 a
But the opt falls apart with a tree like this:
max
/ \
max max
/ \ / \
3 a b 3
The problem is that both branches are treated the same: descending in
the left branch will prune the constant, and then descending the right
branch will prune the constant there as well, because limits[0] wasn't
updated to take the change on the left branch into account, and so we
still get [3,\infty) as baserange.
In order to fix the bug we just disable the marking of redundant expressions
when they match the baserange.
NIR algebraic opt will clean up the first tree for anyway, hopefully
other backends are smart enough to do this also.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 1edc53a66b)
Because border color is handled pre-swizzle, when we move the alpha
channel around in the format, the OPAQUE_BLACK border colors don't work
correctly on B4G4R4A4_UNORM_PACK16 with the hack. This fixes the
following Vulkan CTS tests on Broadwell:
dEQP-VK.pipeline.sampler.view_type.2d_array.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
dEQP-VK.pipeline.sampler.view_type.1d_array.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
dEQP-VK.pipeline.sampler.view_type.2d.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
dEQP-VK.pipeline.sampler.view_type.1d.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
dEQP-VK.pipeline.sampler.view_type.3d.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2d7bed6158)
We were failing to zero m0.2 of the sampler message header for TCS and
GS messages in the simple case. fs_generator has done this for about
a year now, but we missed it in vec4_generator.
Fixes ES31-CTS.core.texture_cube_map_array.sampling,
GL45-CTS.texture_cube_map_array.sampling, and many
dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler subtests:
- dynamically_uniform.tessellation_control.isampler3d
- dynamically_uniform.tessellation_control.isamplercube
- dynamically_uniform.tessellation_control.sampler2d
- dynamically_uniform.tessellation_control.usamplercube
- dynamically_uniform.tessellation_control.sampler2darray
- dynamically_uniform.tessellation_control.isampler2darray
- dynamically_uniform.tessellation_control.usampler3d
- dynamically_uniform.tessellation_control.usampler2darray
- dynamically_uniform.tessellation_control.usampler2d
- dynamically_uniform.tessellation_control.sampler3d
- dynamically_uniform.tessellation_control.samplercube
- dynamically_uniform.tessellation_control.isampler2d
- uniform.tessellation_control.isampler3d
- uniform.tessellation_control.isamplercube
- uniform.tessellation_control.usampler2d
- uniform.tessellation_control.usampler3d
- uniform.tessellation_control.sampler2darray
- uniform.tessellation_control.isampler2darray
- uniform.tessellation_control.usampler2darray
- uniform.tessellation_control.sampler2d
- uniform.tessellation_control.usamplercube
- uniform.tessellation_control.sampler3d
- uniform.tessellation_control.samplercube
- uniform.tessellation_control.isampler2d
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 4295af646f)
In OpenGL 3.0 and later it is legal to make a context current without
a default framebuffer.
This has been broken since DRI3 support was introduced.
Cc: "13.0 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit b6670157d7)
If the VUE map has slots at the end which the shader does not write,
then we'd "flush" (constructing an URB write) on the last output it
actually wrote. Then, we'd construct another SEND with EOT, but with
no actual payload data. That's not legal.
For example, SSO programs have clip distance slots allocated no matter
what, but the shader may not write them. If it doesn't write any user
defined varyings, then the clip distance slots will be the last ones.
Found while debugging
dEQP-VK.tessellation.shader_input_output.gl_position_vs_to_tcs_to_tes
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 480d6c1653)
Fixes tests 'dEQP-GLES3.functional.texture.mipmap.*.generate.rgba5551*' on
Intel Broadwell 0x1616.
The GL 4.5 spec describes the algorithm of glGenerateMipmap as:
The contents of the derived images are computed by repeated, filtered
reduction of the level base image. [...] No particular filter algorithm is
required, though a box filter is recommended as the default filter.
Consider a texture for which all pixels are identical at level 0.
From the spec's description above, one may reasonably assume that the "filtered
reduction" of level 0 produces a new miplevel for which again all pixels are
identical. For any 2x2 subspan of identical pixels, it is difficult to see how
the "filtered reduction" of that subspan can produce a pixel that differs from
the source pixels.
Dithering during _mesa_meta_GenerateMipmap() violated that reasonable
assumption.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99210
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c4b87f129e)
Use atomic ops when updating gl_shader::RefCount.
Fixes intermittent failures and crashes in
'dEQP-EGL.functional.sharing.gles2.multithread.*'.
All tests in that group now pass except
'dEQP-EGL.functional.sharing.gles2.multithread.simple_egl_server_sync.textures.copyteximage2d_texsubimage2d_render'.
Tested with:
mesa: branch 'master' at d6545f2
deqp: branch 'nougat-cts-dev' at 4acf725 with additional local fixes
DEQP_TARGET: x11_egl
hw: Intel Broadwell 0x1616
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99085
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: Mark Janes <mark.a.janes@intel.com>
Cc: Haixia Shi <hshi@chromium.org>
(cherry picked from commit 464b23b1f2)
[Emil Velikov: add u_atomic.h include to resolve the build]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
The spec implicitly allows the incoming count to be 0. From the Vulkan
1.0.38 spec, Section 4.1 Physical Devices:
If the value referenced by pQueueFamilyPropertyCount is not 0 [then
do stuff].
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit d6545f2345)
Fixes dEQP-EGL.functional.create_context_ext.robust_*
on Intel with GBM.
If the user sets the EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR in
EGL_CONTEXT_FLAGS_KHR when creating an OpenGL ES context, then
EGL_KHR_create_context spec requires that we unconditionally emit
EGL_BAD_ATTRIBUTE because that flag does not exist for OpenGL ES. When
creating an OpenGL context, the spec requires that we emit EGL_BAD_MATCH
if we can't support the request; that error is generated in the egl_dri2
layer where the driver capability is actually checked.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99188
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit b85c0b569f)
The Vulkan spec indicates that
vkGetPhysicalDeviceQueueFamilyProperties() should overwrite
pQueueFamilyPropertyCount with the number of structures actually
written to pQueueFamilyProperties.
Signed-off-by: Damien Grassart <damien@grassart.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 75252826e8)
_mesa_choose_tex_format() already handles GL_RGBA + GL_UNSIGNED_SHORT_1_5_5_5_REV
by converting it to MESA_FORMAT_B5G5R5A1_UNORM. Teach it do the same for
the non-reversed type. Otherwise, the switch's fallthrough converts it
to an 8888 format, which has incompatible precision in the alpha
channel.
Patch 2/2 to fix dEQP-EGL.functional.image.modify.tex_rgb5_a1_tex_subimage_rgba8
on Intel.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99185
Cc: Haixia Shi <hshi@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f3739810e3)
In this case we were dying when we tried to do SHL addr sampler imm(8)
because that puts an immediate in src0 of a two source instruction. This
fixes 2704 of the new separate sampler Vulkan CTS tests on Sky Lake.
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 88b5acfa09)
When application window closed unexpectedly due to
lost window visualtypes getting invlaid parameters
which is causing a crash. Necessary check is added
to prevent the crash.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 01dd363e67)
Add the index to the location when assigning driver locations for
output variables.
Otherwise two fragment shader outputs declared as:
layout (location = 0, index = 0) out vec4 output1;
layout (location = 0, index = 1) out vec4 output2;
will end up aliasing one another.
Note that this patch will make the second output variable in the above
example alias a possible third output variable with location = 1 and
index = 0. But this shouldn't be a problem in practice since only one
color attachment is supported when dual-source blending is used.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 27a8aab882)
If the provided EGLConfig does not support the requested surface type,
then emit EGL_BAD_MATCH.
Fixes dEQP-EGL.functional.negative_api.create_pbuffer_surface
on GBM.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit fbb4af96c6)
If info->nr_samplers > ctx->nr_fragment_samplers_saved, the assignment
would prevent cso_single_sampler_done from unbinding the no longer used
samplers from the driver, which could result in use-after-free. This is
probably unlikely to happen in practice though.
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 3d661a12be)
This fixes a regression in a bunch of image store vulkan CTS tests
from commit ad38ba1134, which started
using OWORD block read messages to implement UBO loads. The reason
for the failure is that we were giving bogus buffer alignment limits
to the application (1B), so the CTS would happily come back with
descriptor sets pointing at not even word-aligned uniform buffer
addresses.
Surprisingly the sampler messages used to fetch pull constants before
that commit were able to cope with the non-texel aligned addresses,
but the dataport messages used to fetch pull constants after that
commit and the ones used to access storage buffers (before and after
the same commit) aren't as permissive with unaligned addresses.
Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99097
Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 79d08ed3d2)
The commits which introduced intel_miptree_copy are invasive/large to be
considered for stable.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Fixes the following compile error, present when the SHA1 library is libgcrypt:
CCLD glsl/tests/cache-test
glsl/.libs/libglsl.a(libmesautil_la-mesa-sha1.o): In function `call_once':
/mesa/src/util/../../include/c11/threads_posix.h:96: undefined reference to `pthread_once'
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 5c73ecaac4)
We shouldn't ever see a SEL with conditional mod other than GE (for max)
or L (for min), but we might see one with predication and no conditional
mod.
total instructions in shared programs: 8241806 -> 8241902 (0.00%)
instructions in affected programs: 13284 -> 13380 (0.72%)
HURT: 62
total cycles in shared programs: 84165104 -> 84166244 (0.00%)
cycles in affected programs: 75364 -> 76504 (1.51%)
helped: 10
HURT: 34
Fixes generated code in at least Sanctum 2, Borderlands 2, Goat
Simulator, XCOM: Enemy Unknown, and Shogun 2.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92234
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 7bed52bb5f)
Pretty basic, but it's a start.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 091a8a04ad)
[Emil Velikov: nir_shader_create() has only three arguments]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Matches the vec4 backend, cmod propagation, and saturate propagation.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 6014da50ec)
This reverts commit 6aa730000f.
This was changing the size of the undef to always be 1 (the number of inputs
to imov and fmov) which is wrong, we could be moving a vec4 for example.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a5502a721f)
Framebuffer attachments can be specified through FramebufferTexture*
calls. Upon specifying a depth (or stencil) framebuffer attachment that
internally reuses a texture, the cube map face of the new attachment
would not be updated (defaulting to TEXTURE_CUBE_MAP_POSITIVE_X).
Fix this issue by actually updating the CubeMapFace field.
This bug manifested itself in BindFramebuffer calls performed on
framebuffers whose stencil attachments internally reused a depth
texture. When binding a framebuffer, we walk through the framebuffer's
attachments and update each one's corresponding gl_renderbuffer. Since
the framebuffer's depth and stencil attachments may share a
gl_renderbuffer and the walk visits the stencil attachment after
the depth attachment, the uninitialized CubeMapFace forced rendering
to TEXTURE_CUBE_MAP_POSITIVE_X.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77662
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 63318d34ac)
The new implementation is more correct because it clamps the incoming value
to 10 to avoid floating-point overflow. It also uses a much reduced
version of the formula which only requires 1 exp() rather than 2. This
fixes all of the dEQP-VK.glsl.builtin.precision.tanh.* tests.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit da1c49171d)
Clamp input scalar value to range [-10, +10] to avoid precision problems
when the absolute value of input is too large.
Fixes dEQP-GLES3.functional.shaders.builtin_functions.precision.tanh.* test
failures.
v2: added more explanation in the comment.
v3: fixed a typo in the comment.
Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit d4983390a8)
When Kristian reworked descriptor set allocation, somehow he forgot to
actually store the offset in the free list. Somehow, this completely
missed CTS testing until now... This fixes all 2744 of the new
'dEQP-VK.texture.filtering.* tests in the latest CTS.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 37537b7d86)
Vertex emits beyond the specified maximum number of vertices are supposed to
have no effect, which is why we used to always kill GS that reached the limit.
However, if the GS also writes to memory (SSBO, atomics, shader images), then
we must keep going and only skip the vertex emit itself.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 7655bccce8)
This fixes random radeonsi GPU hangs in Batman Arkham: Origins (Wine) and
probably many other games too.
cso_cache deletes sampler states when the cache size is too big and doesn't
check which sampler states are bound, causing use-after-free in drivers.
Because of that, radeonsi uploaded garbage sampler states and the hardware
went bananas. Other drivers may have experienced similar issues.
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
(cherry picked from commit 6dc96de303)
We would really like it to be false as that's what you get on hardware that
doesn't have RegisterPoleMode (Sky Lake for example). While we're at it,
we change it to a boolean. This fixes dEQP-VK.synchronization.smoke.events
on Broxton.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit eb7b51d62a)
Allocating zero URB space is a really bad idea. The hardware has to
give threads a handle to their URB space, and threads have to use that
to terminate the thread. Having it be an empty region just breaks a
lot of assumptions. Hence, why we asserted that it isn't possible.
Unfortunately, it /is/ possible prior to Gen8, if max_vertices = 0.
In theory a geometry shader could do SSBO/image access and maybe
still accomplish something. In reality, this is tripped up by
conformance tests.
Gen8+ already avoids this problem by placing the vertex count DWord
in the URB entry header. This fixes things on earlier generations.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit a41f5dcb14)
We were previously also verifying that no backing buffers were available
when an array wasn't enabled. This is has no basis in the spec, and it
causes GLupeN64 to fail as a result.
Fixes: c2e146f487 ("mesa: error out in indirect draw when vertex bindings mismatch")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 7c16552f8d)
This should be a win for most loops, which tend to have uniform control
flow.
More importantly, it exposes important information to live variables: that
the break/continue here means that our jump target may have access to
values that were live on our input. Previously, we were just setting the
exec mask and letting control flow fall through, so an intervening def
between the break and the end of the loop would appear to live variables
as if it screened off the variable, when it didn't actually.
Fixes a regression in glsl-vs-loop-redundant-condition.shader_test when a
perturbing of register allocation caused a live variable to get stomped.
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8e5ec33f11)
Internal docs don't mention it, but they also don't mention that the bug
has been fixed (like other CI bugs fixed in VI).
Vulkan does this too.
v2: also update r600_gfx_write_fence_dwords
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
(cherry picked from commit bacf9b4e73)
This fixes dual source blending on Stoney. The fix was copied from Vulkan.
The problem was discovered during internal testing.
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 5e5573b1bf)
07fe2d565b introduced a big hack in order to return
NumSubroutineUniforms when querying ACTIVE_RESOURCES for
<shader>_SUBROUTINE_UNIFORM interfaces. However this is the
wrong fix we are meant to be returning the number of active
resources i.e. the count of subroutine uniforms in the
resource list which is what the code was previously doing,
anything else will cause trouble when trying to retrieve
the resource properties based on the ACTIVE_RESOURCES count.
The real problem is that NumSubroutineUniforms was counting
array elements as separate uniforms but the innermost array
is always considered a single uniform so we fix that count
instead which was counted incorrectly in 7fa0250f9.
Idealy we could probably completely remove
NumSubroutineUniforms and just compute its value when needed
from the resource list but this works for now.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0303201dfb)
[Emil Velikov: LinkStatus is in gl_shader_program]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/mesa/main/program_resource.c
The 1-D special case doesn't actually apply to depth or HiZ. I discovered
this while converting BLORP over to genxml and ISL. The reason is that the
1-D special case only applies to the new Sky Lake 1-D layout which is only
used for LINEAR 1-D images. For tiled 1-D images, such as depth buffers,
the old gen4 2-D layout is used and the QPitch should be in rows.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f469235a6e)
When the memfd_create() and u_vector_init() fail on anv_block_pool_init(),
this patch makes to return VK_ERROR_INITIALIZATION_FAILED.
All of initialization success on anv_block_pool_init(), it makes to return
VK_SUCCESS.
CID 1394319
v2: Fixes from Emil's review:
a) Add the return type for propagating the return value to caller.
b) Changed anv_block_pool_init() to return VK_ERROR_INITIALIZATION_FAILED
on failure of initialization.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit ecc618b0d8)
This can happen even if the binding table isn't changed. For instance, you
could have dynamic offsets with your descriptor set. This fixes the new
stress.lots-of-surface-state.cs.dynamic cricible test.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 054e48ee0e)
Conflicts:
src/intel/vulkan/genX_cmd_buffer.c
Squashed with commit:
anv/cmd_buffer: Emit CS push constants after binding tables
Emitting binding tables can cause push constants to be dirtied if the
shader uses images so we need to handle push constants later.
(cherry picked from commit 7a2cfd4adb)
This fixes:
dEQP-VK.api.image_clearing.clear_color_image.3d*
These were hitting an assert as the code wasn't taking the
baseMipLevel into account when minify the image depth.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 09c0c17bc3)
The intrinsic engine asserts in llvm due to this.
Reported-by: Christoph Haag <haagch+mesadev@frickel.club>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b56b54cbf1)
Squashed with commit:
radv/ac/llvm: fix regression with shadow samplers fix
This fixes b56b54cbf1:
radv/ac/llvm: shadow samplers only return one value
It makes sure we only do that for shadow sampling, as
opposed to sizing requests.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b2e217369e)
Squashed with commit:
radv: brown-paper bag for a forgotten else.
This fixes the fix:
radv/ac/llvm: fix regression with shadow samplers fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 020978af12)
The code didn't limit the offsets to the number supplied, so
if we expected 3 but only got 2 we were accessing undefined memory.
This fixes random failures in:
dEQP-VK.glsl.texture_functions.texelfetchoffset.sampler2darray_*
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit bb8ac18340)
In order to support FIFO mode without blocking the application on calls
to vkQueuePresentKHR it is necessary to enqueue the request and defer
calling the server until the next vblank period. The xcb present api
doesn't offer a way to register a callback, so we will have to spawn a
worker thread that will wait for a request to be added to the queue, call
to the server, and then make the image available for reuse. This commit
introduces the queue data structure needed to implement this.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 932bb3f0dd)
We shouldn't be using ASYNC here, that would be used
for immediate mode, so let's implement that.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ca035006c8)
This just moves this up a level as x11 will need it to
implement things properly.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1cdca1eb16)
For 0 timeout, just poll for an event, and if none, return
For UINT64_MAX timeout, just wait for special event blocked
For other timeouts get the xcb fd and block on it, decreasing
the timeout if we get woken up for non-special events.
v1.1: return VK_TIMEOUT for poll timeouts.
handle timeout going negative.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 787c172aed)
x11_surface_get_present_modes() is currently asserting that the number of
elements in pPresentModeCount must be greater than or equal to the number
of present modes available. This is buggy because pPresentModeCount
elements are later copied from the internal modes' array, so if
pPresentModeCount is greater, it will overflow it.
On top of that, this assertion violates the spec. From the Vulkan 1.0
(revision 32, with KHR extensions), page 581 of the PDF:
"If the value of pPresentModeCount is less than the number of
presentation modes supported, at most pPresentModeCount values will be
written. If pPresentModeCount is smaller than the number of
presentation modes supported for the given surface, VK_INCOMPLETE
will be returned instead of VK_SUCCESS to indicate that not all the
available values were returned."
So, the correct behavior is: if pPresentModeCount is greater than the
internal number of formats, it is clamped to that many present modes. But
if it is lesser than that, then pPresentModeCount elements are copied,
and the call returns VK_INCOMPLETE.
This fix is similar (but simpler and more readable) than the one I provided
in 750d8cad72 for vkGetPhysicalDeviceSurfaceFormatsKHR, which was suffering
from the same problem.
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit b677b99db5)
Nominated-by: Emil Velikov <emil.velikov@collabora.com>
x11_surface_get_formats() is currently asserting that the number of
elements in pSurfaceFormats must be greater than or equal to the number
of formats available. This is buggy because pSurfaceFormatsCount
elements are later copied from the internal formats' array, so if
pSurfaceFormatCount is greater, it will overflow it.
On top of that, this assertion violates the spec. From the Vulkan 1.0
(revision 32, with KHR extensions), page 579 of the PDF:
"If pSurfaceFormats is NULL, then the number of format pairs supported
for the given surface is returned in pSurfaceFormatCount. Otherwise,
pSurfaceFormatCount must point to a variable set by the user to the
number of elements in the pSurfaceFormats array, and on return the
variable is overwritten with the number of structures actually written
to pSurfaceFormats. If the value of pSurfaceFormatCount is less than
the number of format pairs supported, at most pSurfaceFormatCount
structures will be written. If pSurfaceFormatCount is smaller than
the number of format pairs supported for the given surface,
VK_INCOMPLETE will be returned instead of VK_SUCCESS to indicate that
not all the available values were returned."
So, the correct behavior is: if pSurfaceFormatCount is greater than the
internal number of formats, it is clamped to that many formats. But
if it is lesser than that, then pSurfaceFormatCount elements are copied,
and the call returns VK_INCOMPLETE.
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 750d8cad72)
Nominated-by: Emil Velikov <emil.velikov@collabora.com>
Squashed with commit:
vulkan/wsi/x11: Smplify implementation of vkGetPhysicalDeviceSurfaceFormatsKHR
This patch simplifies x11_surface_get_formats(). It is actually just a
readability improvement over the patch I provided earlier this week
(750d8cad72).
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 129da27426)
According to the spec for vkGetPhysicalDeviceImageFormatProperties:
"If format is not a supported image format, or if the combination of format,
type, tiling, usage, and flags is not supported for images, then
vkGetPhysicalDeviceImageFormatProperties returns VK_ERROR_FORMAT_NOT_SUPPORTED."
Makes the following Vulkan CTS tests report 'Not Supported' instead of crashing:
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_unorm
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_snorm
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_uscaled
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_sscaled
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_uint
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_sint
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_srgb
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_unorm
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_snorm
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_uscaled
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_sscaled
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_uint
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_sint
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_srgb
dEQP-VK.api.image_clearing.clear_color_image.1d_r4g4_unorm_pack8
dEQP-VK.api.image_clearing.clear_color_image.1d_r8_srgb
dEQP-VK.api.image_clearing.clear_color_image.1d_r8g8_srgb
dEQP-VK.api.image_clearing.clear_color_image.1d_r8g8b8_srgb
dEQP-VK.api.image_clearing.clear_color_image.1d_b5g5r5a1_unorm_pack16
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 35deeda66f)
Squashed with:
anv/format: handle unsupported formats earlier
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 277f868e66)
Driver should enumerate only up-to min2(num_available, num_requested)
properties and return VK_INCOMPLETE if the # of requested props is
smaller than the ones available.
Presently we assert out in such cases.
Inspired by a similar fix for RADV.
v2: Use MIN2 + typed_memcpy (Jason).
Should fix: dEQP-VK.api.info.device.extensions
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 5cc07d854c)
If we try to allocate a binding table and fail, we have to get a new
binding table block, re-emit STATE_BASE_ADDRESS, and then try again. We
already handle this correctly for 3D and blorp but it never got handled for
CS. This fixes the new stress.lots-of-surface-state.cs.static crucible test.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 722ab3de9f)
Since both pCreateInfo->strideInBytes and pCreateInfo->extent.height
are of uint32_t type 32-bit arithmetic will be used.
Fix unintentional integer overflow by casting to uint64_t before
multifying.
CID 1394321
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
[Emil Velikov: cast only of the arguments]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit e074a08a6d)
Consider a geometry shader that contains code like this:
some_out = expr;
if (cond) {
...
EmitVertex();
} else {
...
EmitVertex();
}
Both branches should see the correct value of some_out.
Since this is a rather subtle and rare case, I'm submitting a piglit test
for this as well.
GLSL says that the values of output variables are undefined after
EmitVertex(). With this change, the values will now be defined and
unmodified. This may reduce optimization opportunities in the probably
quite rare case where subsequent compiler passes cannot prove that the
value of the output variable is overwritten.
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 0d383a79a8)
For compute shaders, we free the selector after the shader has been
compiled, so we need to save this bit somewhere else. Also, make sure that
this type of bug cannot re-appear, by NULL-ing the selector pointer after
we're done with it.
This bug has been there since the feature was added, but was only exposed
in piglit arb_compute_variable_group_size-local-size by commit
9bfee7047b (which is totally unrelated).
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 42d5e91a2a)
This fixes the image view for sampling just the depth.
It removes some pointless swizzle code, and adds
a missing case for the x8_d24 format.
Fixes:
dEQP-VK.renderpass.formats.d32_sfloat_s8_uint.input.*
dEQP-VK.renderpass.formats.d24_unorm_s8_uint.input.*
dEQP-VK.renderpass.formats.x8_d24_unorm_pack32.input.*
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 6d7be52d90)
We weren't taking first_component into account when handling GS push
inputs. We hardly ever push GS inputs, so this was not caught by
existing tests. When I started using component qualifiers for the
gl_ClipDistance arrays, glsl-1.50-transform-feedback-type-and-size
started catching this.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit c4be6e0b8d)
In case we have empty log (""), we should return 0. This fixes
Khronos WebGL conformance test 'program-infolog'.
From OpenGL ES 3.1 (and OpenGL 4.5 Core) spec:
"If pname is INFO_LOG_LENGTH , the length of the info log, including
a null terminator, is returned. If there is no info log, zero is
returned."
v2: apply same fix for get_shaderiv and _mesa_GetProgramPipelineiv (Ian)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97321
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ec4e71f75e)
GNU/Hurd does not define PATH_MAX since it doesn't have such arbitrary
limitation, so this failed to compile. Apparently glibc does not
enforce PATH_MAX restrictions anyway, so it's kind of a hoax:
https://www.gnu.org/software/libc/manual/html_node/Limits-for-Files.html
MSVC uses a different name (_MAX_PATH) as well, which is annoying.
We don't really need it. We can simply asprintf() the filenames.
If the filename exceeds an OS path limit, presumably fopen() will
fail, and we already check that. (We actually use ralloc_asprintf
because Mesa provides that everywhere, and it doesn't look like we've
provided an implementation of GNU's asprintf() for all platforms.)
Fixes the build on GNU/Hurd.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98632
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 9bfee7047b)
[Emil Velikov: s|prog->Id|base->Id|]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/mesa/main/arbprogram.c
Otherwise, we'll try to clear it the first time it's used as a draw so if
you do some multisampled rendering, resolve to an attachment, and then draw
on top of the single-sampled attachment, we might accidentally clear it.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ccdf9af392)
Before, we were always treating it as an output which bogus. The only
stage in which this it can be an output is the geometry stage. In all
other stages, it's an input which, in the back-end, we actually want to be
a system value.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 9557147592)
This fixes a bunch of new CTS tests which look for exactly this. Even in
the cases where we just call vk_free to free a CPU data structure, we still
handle NULL explicitly. This way we're less likely to forget to handle
NULL later should we actually do something less trivial.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 49f08ad77f)
[Emil Velikov: color_rt_surface_state is still around]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/intel/vulkan/anv_image.c
We got a couple for products that exist on ark.intel.com, so let's just
put them in now.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit b8509c8936)
Squashed with commit:
i965: Fix KBL typo in string
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 19a01f8139)
This fixes a number of CTS tests like:
dEQP-VK.glsl.texture_gather.basic.2d.rgba8ui.size_npot.clamp_to_edge_repeat
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 713522fb8d)
A commit from the CTS suite on the 1.0-dev branch started using
VK_REMAINING_MIP_LEVELS, we're not dealing with it properly for clears.
Fixes:
dEQP-VK.api.image_clearing.clear_color_image.*
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a46bc3f70a)
I had this exactly backwards, but apparently the piglit tests were all
landing in r0-r3 anyway.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 977d8b526b)
There is currently no protection against walking a hash (using
_mesa_HashWalk()) and modifying it at the same time, for instance by inserting
or deleting elements. This leads to segfaults in multithreaded code if e.g.
someone calls glTexImage2D (which may have to walk the list of FBOs) while
another thread is calling glDeleteFramebuffers on another thread with the two
contexts sharing lists.
The reason for this is that _mesa_HashWalk() doesn't actually take the mutex
that normally protects the hash; it takes an entirely different mutex.
Thus, walks are only protected against other walks, and there is also no
outer lock taking this. There is an old comment saying that this is to fix
problems with deadlock if the callback needs to take a mutex; we solve this
by changing the mutex to be recursive.
A demonstration Helgrind hit from a real application:
==13412== Possible data race during write of size 8 at 0x3498C6A8 by thread #1
==13412== Locks held: 2, at addresses 0x1AF09530 0x2B3DF400
==13412== at 0x1F040C99: _mesa_hash_table_remove (hash_table.c:395)
==13412== by 0x1EE98174: _mesa_HashRemove_unlocked (hash.c:350)
==13412== by 0x1EE98174: _mesa_HashRemove (hash.c:365)
==13412== by 0x1EE2372D: _mesa_DeleteFramebuffers (fbobject.c:2669)
==13412== by 0x6105AA4: movit::ResourcePool::cleanup_unlinked_fbos(void*) (resource_pool.cpp:473)
==13412== by 0x610615B: movit::ResourcePool::release_fbo(unsigned int) (resource_pool.cpp:442)
[...]
==13412== This conflicts with a previous read of size 8 by thread #20
==13412== Locks held: 2, at addresses 0x1AF09558 0x1AF73318
==13412== at 0x1F040CD9: _mesa_hash_table_next_entry (hash_table.c:415)
==13412== by 0x1EE982A8: _mesa_HashWalk (hash.c:426)
==13412== by 0x1EED6DFD: _mesa_update_fbo_texture.part.33 (teximage.c:2683)
==13412== by 0x1EED9410: _mesa_update_fbo_texture (teximage.c:3043)
==13412== by 0x1EED9410: teximage (teximage.c:3073)
==13412== by 0x1EEDA28F: _mesa_TexImage2D (teximage.c:3105)
==13412== by 0x166A68: operator() (mixer.cpp:454)
There are many more interactions than just these two possible.
Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Steinar H. Gunderson <steinar+mesa@gunderson.no>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 2e2562cabb)
This allows for gl_PrimitiveId to come in as a system value rather than as
an input. This is the way it will come in from SPIR-V. We keeps the input
path working for now so we don't break GL.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a5e88e66e6)
[Emil Velikov: nir_shader::info is not a pointer in branch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
From the Vulkan spec 1.0.32 section 29.6 docs for vkAcquireNextImageKHR:
"Let n be the total number of images in the swapchain, m be the value of
VkSurfaceCapabilitiesKHR::minImageCount, and a be the number of
presentable images that the application has currently acquired (i.e.
images acquired with vkAcquireNextImageKHR, but not yet presented with
vkQueuePresentKHR). vkAcquireNextImageKHR can always succeed if a ≤ n -
m at the time vkAcquireNextImageKHR is called. vkAcquireNextImageKHR
should not be called if a > n - m with a timeout of UINT64_MAX; in such
a case, vkAcquireNextImageKHR may block indefinitely."
With minImageCount == 2 (as it was previously, the client is allowed to
acquire all but one image withoutblocking. If we really need 4 images for
mailbox mode + pageflipping, then we need to request a minimum of 4 images
up-front. This is a bit unfortunate because it means we will always
consume 4 images. In the future, we may be able to optimize this a bit by
waiting until the server starts to flip and returning OUT_OF_DATE to get
the client to re-allocate with more images or something like that.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4fa0ca80ee)
We can only read the valid samples if this is an MSAA
texture, which means the type field must be 0x14 or 0x15.
This fixes:
dEQP-VK.glsl.texture_functions.query.texturesamples.*
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 2de85eb97a)
The #version directive can only handle decimal constants. Enforce that
the value is a decimal constant.
Section 3.3 (Preprocessor) of the GLSL 4.50 spec says:
The language version a shader is written to is specified by
#version number profile opt
where number must be a version of the language, following the same
convention as __VERSION__ above.
The same section also says:
__VERSION__ will substitute a decimal integer reflecting the version
number of the OpenGL shading language.
Use a separate flag to track whether or not the #version line has been
encountered. Any possible sentinel (0 is currently used) could be
specified in a #version directive. This would lead to trying to
(internally) redefine __VERSION__. Since there is no parser location
for this addition, NULL is passed. This eventually results in a NULL
dereference and a segfault.
Attempts to use -1 as the sentinel would also fail if '#version
4294967295' or '#version 18446744073709551615' were used. We should
have piglit tests for both of these.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97420
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Karol Herbst <karolherbst@gmail.com>
(cherry picked from commit e85a747e29)
Our previous fence implementation was very simple. Fences had two states:
signaled and unsignaled. However, this didn't properly handle all of the
edge-cases that we need to handle. In order to handle the case where the
client calls vkGetFenceStatus on a fence that has not yet been submitted
via vkQueueSubmit, we need a three-status system. In order to handle the
case where the client calls vkWaitForFences on fences which have not yet
been submitted, we need more complex logic and a condition variable. It's
rather annoying but, so long as the client doesn't do that, we should still
hit the fast path and use i915_gem_wait to do all our waiting.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 843775bab7)
It's much better to just skip the draw call entirely. Getting this
information out of register allocation will also be useful for
implementing threaded fragment shaders, which will need to retry
non-threaded if RA fails.
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4d019bd703)
This is a port of commit a4a5917248:
Add guards to prevent dereferencing NULL dynamic pipeline state. Asserts
of pCreateInfo members are moved to the earliest points at which they
should not be NULL.
This fixes a segfault, related to pColorBlendState, seen in Talos Principle
which I've observed after startup is completed and when exiting the menus,
depending on when Vulkan rendering is selected.
v2: moved the NULL check in radv_pipeline_init_blend_state to after the
declarations.
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 9b121512ac)
In the event that multiple threads attempt to install a graph
concurrently, protect the shared list.
Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 381edca826)
Instead of trying to maintain a reference counted list of valid HUD
objects, and freeing them accordingly, creating race conditions
between unanticipated multiple threads, simply accept they're
allocated once and never released until the process terminates.
They're a shared resource between multiple threads, so accept
they're always available for use.
Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 6ffed08679)
We had missed a bit of errata - PS scratch needs to be computed as if
there were 4 subslices per slice, rather than 3.
Skylake Broxton Kabylake
GT1 GT2 GT3 GT4 2x6 3x6 GT1 GT1.5 GT2 GT3 GT4
Actual Slices 1 1 2 3 1 1 1 1 1 2 3
Total Subslices 3 3 6 9 2 3 2 3 3 6 9
Subsl. for PS Scratch 4 4 8 12 4 4 4 4 4 8 12
Note that Skylake GT1-3 already worked because we allocated 64 * 9
(trying to use a value that would work on GT4, with 9 subslices),
and the actual required values were 64 * 4 or 64 * 8. However, all
others (Skylake GT4, Broxton, and Kabylake GT1-4) underallocated,
which can lead to scratch writes trashing random process memory,
and rendering corruption or GPU hangs.
Fixes GPU hangs and rendering corruption on Skylake GT4 in shaders that
spill. Particularly, dEQP-GLES31.functional.ubo.all_per_block_buffers.*
now runs successfully with no hangs and renders correctly. This may
fix problems on Broxton and Kabylake as well.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit aaee3daa90)
Port of the anv commit d96345de98 ("anv: Suffix the intel_icd file with
the host CPU").
v2: s/intel_icd/radeon_icd/ in commit summary (Gražvydas)
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com> (IRC)
(cherry picked from commit 0f434a68a3)
Squashed with commit:
radv: automake: list correct file in the EXTRA_DIST
Earlier commit renamed the file radeon_icd.json{,.in} but missed one
reference of the file - in EXTRA_DIST.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Fixes: 0f434a68a ("radv: Suffix the radeon_icd file with the host CPU")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b359f62456)
Vulkan has introduced the consept of .specVersion which can be used to
attribute changes of the said extension.
The current loader does not check the value, thus it have gone unnoticed
that the driver exposes an old version of the following extensions:
VK_KHR_xcb_surface (Rev 6)
VK_KHR_xlib_surface (Rev 6)
VK_KHR_wayland_surface (Rev 5)
- Updated the surface create function to take a pCreateInfo structure
VK_KHR_swapchain (Rev 68)
- Moved the "validity" include for vkAcquireNextImage to be in its proper
place, after the prototype and list of parameters.
...
According to the documentation:
* pname:specVersion is the version of this extension.
It is an integer, incremented with backward compatible changes.
Based on the history of vk.xml the above (latest) revision has been
available since Vulkan 1.0 so even if they were any backwards
incompatible change(s) [as hinted by the revision log] those should be
safe.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit f373a91a52)
This fixes a bunch of GPU hangs introduced in some CTS
tests like
dEQP-VK.memory.pipeline_barrier.host_write_uniform_buffer.65536
It works around an issue seen in the LLVM backend, but
also makes the radv code work more like the radeonsi stack.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 3c9af7578f)
This is ported from GLSL and converts
if (cond)
discard;
into
discard_if(cond);
This removes a block, but also is needed by radv
to workaround a bug in the LLVM backend.
v2: handle if (a) discard_if(b) (nha)
cleanup and drop pointless loop (Matt)
make sure there are no dependent phis (Eric)
v3: make sure only one instruction in the then block.
v4: remove sneaky tabs, add cursor init (Eric)
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b16dff2d88)
We are going to start lowering to this in NIR code,
so prepare radv for it.
v2: handle conversion to kilp properly (nha)
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit dd77faeca2)
Since our surface state buffer is shared by all batches, the kernel does a
full stall and sync with the CPU between batches every time we call
execbuf2 because it refuses to do relocations on an active buffer. Doing
them in userspace and passing the NO_RELOC flag to the kernel allows us to
perform the relocations without stalling.
This improves the performance of Dota 2 by around 30% on a Sky Lake GT2.
v2 (Jason Ekstrand):
- Better comments (Chris Wilson)
- Fixed write_reloc for correct canonical form (Chris Wilson)
v3 (Jason Ekstrand):
- Skip relocations which aren't needed
- Provide an environment variable to always use the kernel
- More comments about correctness (Chris Wilson)
v4 (Jason Ekstrand):
- More comments (Chris Wilson)
v5 (Jason Ekstrand):
- Rebase on top of moving execbuf2 setup go QueueSubmit
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b3a29f2e9e)
Ever since the early days of the Vulkan driver, we've been setting up the
lists of relocations at EndCommandBuffer time. The idea behind this was to
move some of the CPU load out of QueueSubmit which the client is required
to lock around and into command buffer building which could be done in
parallel. Then QueueSubmit basically just becomes a bunch of execbuf2
calls.
Technically, this works. However, when you start to do more in QueueSubmit
than just execbuf2, you start to run into problems. In particular, if a
block pool is resized between EndCommandBuffer and QueueSubmit, the list of
anv_bo's and the execbuf2 object list can get out of sync. This can cause
problems if, for instance, you wanted to do relocations in userspace.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8b61c57049)
The original reason for putting it in the batch_bo was to allow primaries
to share it across secondaries or something like that. However, the
relocation lists in secondary command buffers are are always left alone and
copied into the primary command buffer's relocation list. This means that
the offset really applies at the command buffer level and putting it in the
batch_bo doesn't make sense. This fixes a couple of potential bugs around
re-submission of command buffers that are not likely to be hit but are bugs
none the less.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 595400d577)
This commit adds a little helper struct for storing everything we use to
build an execbuf2 call. Since the add_bo function really has nothing to do
with a command buffer, it makes sense to break it out a bit. This also
reduces some of the churn in the next commit.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0fe6829427)
The old version wasn't properly handling large addresses where we have to
sign-extend to get it into the "canonical form" expected by the hardware.
Also, the new version is capable of doing a clflush of the newly written
reloc if requested.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 095c48a496)
Since -1 is an invalid GPU address, this lets us know whether or not we
have a valid address for a buffer. We don't get a valid address until the
first time that buffer is used in an execbuf2 ioctl.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d46bfb6297)
The previous implementation was being overly clever and using the
anv_bo::size field as its mutex. Scratch pool allocations don't happen
often, will happen at most a fixed number of times, and never happen in the
critical path (they only happen in shader compilation). We can make this
much simpler by just using the device mutex. This also means that we can
start using anv_bo_init_new directly on the bo and avoid setting fields
one-at-a-time.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bd0f8d5070)
Because our relocation processing happens at EndCommandBuffer time and
because RENDER_SURFACE_STATE objects may be shared by batches, we really
have no clue whatsoever what address is actually written to the relocation
offset in the BO. We need to stop making such claims to the kernel and
just let it relocate for us.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ba1eea4f95)
This patch should have been the part of commit e592f7df.
In a situation when there are multiple render targets with alpha testing
enabled, if fragment shader doesn't write to draw buffer zero, it causes
the GPU hang on SKL. No GPU hang is seen on HSW. Simulator gives a
warning for all gen6+ h/w:
"Illegal render target write message length 0xa expected 0xc"
This patch fixes the GPU hang as well as the simulator warning with
new piglit test fbo-mrt-alphatest-no-buffer-zero-write:
https://patchwork.freedesktop.org/patch/118212
No regressions in Jenkins CI system.
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit b9df2251c1)
I was getting a random GPU hang in the renderpass simple tests,
it turns out sometimes radv emitted the wrong thing "last".
This fixes the logic to emit Z/stencil last if they occur,
and not mark a color output as last. Also this relies on the
Z/STENCIL being the first two fragment outputs, which they are
so yay.
Fixes: dEQP-VK.renderpass.simple.color_depth (random hangs)
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit bafc75b437)
At least on Sky Lake, after emitting 3DSTATE_CONSTANT_*, you are required
to re-emit the 3DSTATE_BINDING_TABLE_POINTERS packet for the corresponding
stage. If you don't, double-buffering may fail and you may get the wrong
constants. It turns out that you need to do this even if you have no push
constants to speak of or else the next 3DSTATE_CONSTANT packet you emit for
that stage may not work correctly.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 406cd9d126)
The 1/W was apparently not accurate enough, and we were getting sparklies
in the distance. The closed driver also did a N-R step here.
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 283d4d18e5)
A (latent) bug in VDPAU interop was exposed by commit
e5cc84dd43.
Before that commit, the st_vdpau code created samplers with
first_layer == last_layer == 1 that the general texture handling code
would immediately delete and re-create, because the layer does not match
the information in the GL texture object.
This was correct behavior at least in the DMABUF case, because the imported
resource is supposed to have the correct offset already applied. In the
non-DMABUF case, this was just plain wrong but apparently nobody noticed.
After that commit, the state tracker assumes that an existing sampler is
correct at all times. Existing samplers are supposed to be deleted when
they may become invalid, and they will be created on-demand. This meant
that the sampler with first_layer == last_layer == 1 stuck around, leading
to rendering artefacts (on radeonsi), command stream failures (on r600), and
assertions (in debug builds everywhere).
This patch fixes the problem by simply not creating a sampler at all in
st_vdpau_map_surface. We rely on the generic texture code to do the right
thing, adding the layer_override to make the non-DMABUF case work.
v2: add the layer_override
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98512
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Cc: Christian König <deathsimple@vodafone.de>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 322483f71b)
This reverts commit d180de3532.
This is a radeon specific hack that causes problems on nouveau
when combined with the SHARED flag later. If radeonsi needs a fix
for this, please fix it in the driver.
[chk]
Using linear surfaces for this makes sense because tilling isn't
beneficial and the surfaces can potentially be shared with other GPUs
using the VDPAU OpenGL interop.
[airlied]
I think we need a flag that isn't SHARED/LINEAR that is more
SHARED_OTHER_GPU.
[mareko]
Does radeonsi need PIPE_BIND_VIDEO_DECODE_OUTPUT that it would translate
into linear ?
[mareko]
My only concern is decoding performance. If the decoder works in 64x1
blocks, tiling will hurt. That's the theory. I don't know how the
decoder works.
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> (I+A)
(cherry picked from commit d0d5f7600c)
When splitting up loads, we have to add 16 bytes to the offset for
the high components, just like already happens for stores.
Fixes arb_gpu_shader_fp64@shader_storage@layout-std140-fp64-shader.
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit e4b378800e)
Assuming the hardware is set up to use a screen coordinate system
flipped vertically with respect to the GL's window coordinate system,
the SYSTEM_VALUE_SAMPLE_POS vector will also be flipped vertically
with respect to the value expected by the GL, so we need to give it
the same treatment as gl_FragCoord. Fixes the following CTS tests on
i965:
ES31-CTS.functional.shaders.multisample_interpolation.interpolate_at_offset.at_sample_position.default_framebuffer
ES31-CTS.functional.shaders.sample_variables.sample_pos.correctness.default_framebuffer
when run with any multisample configuration, e.g. rgba8888d24s8ms4.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit f3d387867f)
When a UBO reference has the form block_name.foo where block_name refers
to a block where the first member has a non-zero offset, the base offset
was incorrectly added to the reference.
Fixes an assertion triggered in debug builds by
GL45-CTS.enhanced_layouts.uniform_block_layout_qualifier_conflict. That test
doesn't properly check for correct execution in this case, so I am also
going to send out a piglit test.
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 37d646c1b3)
At link time, we resolve the size of implicitly sized arrays.
When doing so, we update the type of the ir_variables. However,
we neglected to update the type of ir_dereference nodes which
reference those variables.
It turns out array_resize_visitor (for GS/TCS/TES interface array
handling) already did 2/3 of the cases for this, so we can simply
refactor the code and reuse it.
This fixes:
GL45-CTS.shader_storage_buffer_object.basic-syntax
GL45-CTS.shader_storage_buffer_object.basic-syntaxSSO
which have an SSBO containing an implicitly sized array, followed
by some other members. setup_buffer_access uses the dereference
types to compute offsets to fields, and it had a stale type where
the implicitly sized array's length was still 0 instead of the
actual length.
While we're here, we can also fix update_array_sizes to properly
update deref types as well, fixing a FINISHME from 2010.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 8df4aebc94)
This moves the delete linked shaders call to
_mesa_clear_shader_program_data() which makes sure we delete them
before returning due to any validation problems.
It also reduces some code duplication.
From the OpenGL 4.5 Core spec:
"If LinkProgram failed, any information about a previous link of
that program object is lost. Thus, a failed link does not restore
the old state of program.
...
If one of these commands is called with a program for which
LinkProgram failed, no error is generated unless otherwise noted.
Implementations may return information on variables and interface
blocks that would have been active had the program been linked
successfully. In cases where the link failed because the program
required too many resources, these commands may help applications
determine why limits were exceeded."
Therefore it's expected that we shouldn't be able to query the
program that failed to link and retrieve information about a
previously successful link.
Before this change the linker was doing validation before freeing
the previously linked shaders and therefore could exit on failure
before they were freed.
This change also fixes an issue in compat profile where a program
with no shaders attached is expect to fall back to fixed function
but was instead trying to relink IR from a previous link.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97715
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d2861d682a)
Dolphin tried to use this, but we hadn't had any tests for it properly.
All that is required is the shader output format needs to be set
for 0 and 1 exports.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 73592b9284)
The case where we just want the loop to continue is INCOMPATIBLE_DRIVER
because that simply means that whatever FD we opened isn't a supported
Intel chip. Other error codes such as OUT_OF_HOST_MEMORY are actual errors
and we should be returning early in that case.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a5f8ff6ca1)
Note LOCAL_CFLAGS and LOCAL_SHARED_LIBRARIES in Android.common.mk
are used by both host and target modules. However, commit 112e988
moved libdrm related flags to common. It causes the errors like:
error: 'out/host/linux-x86/obj32/SHARED_LIBRARIES/libdrm_intermediates/export_includes',
needed by 'out/host/linux-x86/obj32/EXECUTABLES/mesa_gen_matypes_intermediates/import_includes',
missing and no known rule to make it
No reason to use libdrm with host modules.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Fixes: 112e988329 ("Android: move libdrm settings to top-level
Android.common.mk")
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit e3e5b1a488)
This makes more sense than OUT_OF_HOST_MEMORY. Technically, you can
recover from a failed execbuf2 but the batch you just submitted didn't
fully execute so things are in an ill-defined state. The app doesn't want
to continue from that point anyway.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit c41ec1679f)
Fix build error with clang.
Compiling src/compiler/glsl/link_varyings.cpp ...
In file included from src/compiler/glsl/link_varyings.cpp:33:
In file included from src/compiler/glsl/glsl_symbol_table.h:34:
In file included from src/compiler/glsl/ir.h:33:
In file included from src/compiler/glsl_types.h:29:
/usr/include/string.h:518:12: error: exception specification in declaration does not match previous declaration
extern int ffs (int __i) __THROW __attribute__ ((__const__));
^
src/util/bitscan.h:51:13: note: expanded from macro 'ffs'
^
src/util/bitscan.h:96:18: note: previous declaration is here
const int i = ffs(*mask) - 1;
^
src/util/bitscan.h:51:13: note: expanded from macro 'ffs'
^
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97952
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 889ee4da05)
When the video coded size is different from frame size, we need the result
buffers are same as coded size, which are not size compatible with encode
required size, so that simply use no tunnel for this case instead of frame
by frame converting.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 06e3cd6a45)
The address immediate field is only 9 bits and, since the value is in
bytes, the highest GRF we can point to with it is g15. This makes it
pretty close to useless for MOV_INDIRECT. There were already piles of
restrictions preventing us from using it prior to Broadwell, so let's get
rid of the gen8+ code path entirely.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97779
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 2a4a86862c)
When blending with GL_COLORBURN_KHR and these colors:
dst = <0.372549027, 0.372549027, 0.372549027, 0.372549027>
src = <0.09375, 0.046875, 0.0, 0.375>
the normalized dst value became 0.99999994 (due to precision problems
in the floating point divide of rgb by alpha). This caused the color
burn equation to fail the dst >= 1.0 comparison. The blue channel would
then fall through to the dst < 1.0 && src >= 0 comparison, which was
true, since src.b == 0. This produced a factor of 0.0 instead of 1.0.
This is an inherent numerical instability in the color burn and dodge
equations - depending on the precision of alpha scaling, the value can
be either 0.0 or 1.0. Technically, GLSL floating point division doesn't
even guarantee that 0.372549027 / 0.372549027 = 1.0. So arguably, the
CTS should allow either value. I've filed a bug at Khronos for further
discussion (linked below).
In the meantime, this patch improves the precision of alpha scaling by
replacing the division with (rgb == alpha ? 1.0 : rgb / alpha). We may
not need this long term, but for now, it fixes the following CTS tests:
ES31-CTS.blend_equation_advanced.blend_specific.GL_COLORBURN_KHR
ES31-CTS.blend_equation_advanced.blend_all.GL_COLORBURN_KHR_all_qualifier
Cc: currojerez@riseup.net
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16042
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit e6aeeace69)
Previously, we were creating the shader with a NULL ralloc context and then
trusting in blorp_compile_fs to clean it up. The only problem was that
blorp_compile_fs didn't clean up its context properly so we were leaking.
When I went to fix that, I realized that it couldn't because it has to
return the shader binary which is allocated off of that context and used by
the caller. The solution is to make blorp_compile_fs take a ralloc
context, allocate the nir_shaders directly off that context, and clean it
all up in whatever function creates the shader and calls blorp_compile_fs.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0, 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 43dadb6edd)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Conflicts:
src/intel/blorp/blorp_clear.c
With dealing with rectangles in compressed images, you can have a width or
height that isn't a multiple of the corresponding compression block
dimension but only if that edge of your rectangle is on the edge of the
image. When we call convert_to_single_slice, it creates an 2-D image and a
set of tile offsets into that image. When detecting the right-edge and
bottom-edge cases, we weren't including the tile offsets so the assert
would misfire. This caused crashes in a few UE4 demos
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reported-by: "Eero Tamminen" <eero.t.tamminen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98431
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Tested-by: "Eero Tamminen" <eero.t.tamminen@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 4964a5149b)
Default precision qualifier for a data type could be set several times
inside a shader. This patch allows to update the default precision
qualifier for the given type that is saved in the symbol table.
If it is not in the symbol table, just add it.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97804
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 0e742926c6)
This function allows to modify an existing symbol.
v2:
- Remove namespace usage now that it was deleted.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit dfbdb2c0b3)
Nominated-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Namespace support seems to have been unused for a very long time.
Previously the hash table entry was never removed and the symbol name
wasn't freed until the symbol table was destroyed.
In theory this could reduced the number of times we need to copy a string
as duplicate names are reused. However in practice there is likely only a
limited number of symbols that are the same and this is likely to cause
other less than optimal behaviour such as the hash_table continuously
growing.
Along with dropping namespace support this change removes entries from
the hash table as they become unused.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
(cherry picked from commit 6dbe8a1b9f)
Nominated-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
SSO validation and other program interface queries want to see that
unsized (non-patch) TCS output/TES input arrays are implicitly sized
to gl_MaxPatchVertices.
By the time we create the program resource lists, we've sized the arrays
to their actual size. (We try to create TCS output arrays to match the
output patch size right away, and at this point, we should have shrunk
TES input arrays.) One option would be to keep them sized to
gl_MaxPatchVertices, and defer shrinking them. But that's a big change,
and I don't think it's a good idea.
Instead, this patch introduces a new ir_variable flag which indicates
the variable is implicitly to gl_MaxPatchVertices. Then, the linker
munges the types when creating the resource list, ignoring the size
in the IR's types. Basically, lie about it for resource queries.
It's ugly, but I think it ought to work.
We probably could use var->data.implicit_sized_array for this, but
I opted for a separate bit to try and avoid convoluting the existing
SSBO handling. They're similar in concept, but share none of the
same code...
Fixes:
ES31-CTS.core.tessellation_shader.single.xfb_captures_data_from_correct_stage
and the ES32-CTS and ESEXT-CTS variants.
v2: Add a comment (requested by Timothy, written by me).
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 173558445d)
This affects GF100:GK110 chipsets, but not GM107+ where the
logic is a bit different. The emitters tried to emit sub
instead of subr when src0 has a NEG modifier.
This fixes the following piglit tests glsl-fs-loop-nested
and glsl-vs-loop-nested.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1ec7227d44)
Since commit 0a606a400f ("egl: add eglSwapBuffersWithDamageKHR"),
Android has been broken because the function eglSwapBuffersWithDamageKHR
is provided regardless of the extension being present. Also, the Android
meta-EGL always advertises the extension regardless of the underlying
EGL implementation. As there doesn't seem to be a simple way
conditionally make the EGL function ptr NULL, just implement a brain
dead version of eglSwapBuffersWithDamage{KHR,EXT}.
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
CC: Rob Clark <robdclark@gmail.com>
Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Rob Herring <robh@kernel.org>
[Emil Velikov: copy the original commit message from Rob's patch]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 4fa799ae04)
Maybe this is why SDMA has been broken for many amdgpu users?
SDMA is the only block which is used with imported textures and relies
on this variable. DB also uses it, but it doesn't get imported textures,
so it's unaffected.
I do get SDMA failures on Tonga before this patch if R600_DEBUG=testdma
is changed to use imported textures.
Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 6ec3b2a4b1)
Use an ARGB format for the DRM buffer when the compositeAlpha field
in VkSwapchainCreateInfoKHR is set to
VK_COMPOSITE_ALPHA_PRE_MULTIPLIED_BIT_KHR.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 68db0fe034)
Pass the correct depth to xcb_dri3_pixmap_from_buffer_checked().
Otherwise xcb_present_pixmap() fails with a BadMatch error.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 972670c200)
Patch rearranges error checking so that enum checking provided via
destmask happens before other checks. It needs to be done in this
order because other error checks do not work properly if there were
invalid enums passed.
Patch also refines one existing check and it's documentation to match
GLES 3.0 spec (also in later specs). This was somewhat mysteriously
referring to desktop GL but had a check for gles3.
Fixes following dEQP tests:
dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers
no CI regressions observed.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98134
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit a1652a059e)
While these max values were previously fixed for pbuffer creation, this
change makes also eglGetConfigAttrib() return correct values.
Fixes following dEQP tests:
dEQP-EGL.functional.create_surface.pbuffer.rgb888_no_depth_no_stencil
dEQP-EGL.functional.create_surface.pbuffer.rgb888_depth_stencil
dEQP-EGL.functional.create_surface.pbuffer.rgba8888_no_depth_no_stencil
dEQP-EGL.functional.create_surface.pbuffer.rgba8888_depth_stencil
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98326
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b91e1e38e8)
Post-splitting, VGRFs have a maximum size (MAX_VGRF_SIZE). This is
required by the register allocator, as we have to create classes for
each size of VGRF.
We can (and do) allocate virtual registers larger than MAX_VGRF_SIZE,
but we must ensure that they are splittable. split_virtual_grfs()
asserts that the post-splitting register size is in range.
Unfortunately, these trip for completely dead registers which are too
large - we only set split points for live registers. So dead ones are
never split, and if they happened to be too large, they'd trip asserts.
To fix this, call compact_virtual_grfs() to eliminate dead registers
before splitting.
v2: Add a comment written by Iago.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 27715c73ff)
Leak introduced by:
a83dce0128
The patch also moves the part to
release changed.vs_const_i and changed.vs_const_b
before the if (!cb.buffer_size) check,
to avoid reuploading every draw call if
integer or boolean constants are dirty, but the shaders
use no constants.
Signed-off-by: Axel Davy <axel.davy@ens.fr>
CC: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 25beccb379)
There are three intended functional changes here:
1. OpenGL 4.5 clarifies that primitive restart should only apply with index
buffers, so make that change explicit in the indirect draw path.
2. Make PrimitiveRestartFixedIndex work with indirect draws.
3. The change where primitive_restart is only set when the restart index can
actually have an effect (based on the size of indices) is also applied for
indirect draws.
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 3d6b5dee3a)
shared glapi was previously built without setting CFLAGS for
AM_CFLAGS and VISIBILITY_CFLAGS.
This resulted in symbols being exported that shouldn't be.
The x86 and sparc assembly versions of the dispatch table partially
mitigated this by using .hidden. Otherwise shared_dispatch_stub_*
were being exported.
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: "11.2 12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
The original patch by Joanna added the function pointer and callback yet
things got only partially applied - the infra was added, but the
implementation was missing.
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Fixes: 690ead4a13 ("egl/wayland-egl: Fix for segfault in
dri2_wl_destroy_surface.")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 2e0ab61e29)
Earlier commit reworked the header install rules, to ensure that the
correct ones are installed only as needed.
By doing so it dropped a wildcard which was effectively including the
wglext.h header in the tarball.
Add the header to the top-level noinst_HEADERS, since the it is not
meant to be installed (autoconf is not used on Windows plaforms).
Fixes: a89faa2022 ("autoconf: Make header install distinct for various
APIs (v2)")
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Cc: Chuck Atkins <chuck.atkins@kitware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 3511a86111)
This fixes
dEQP-VK.pipeline.multisample.sampled_image*
These all render to multisampled image, and then
sample from it, so we must transition it correctly,
since we have a cmask and fmask this will cause
the correct transition.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a969548f59)
Vulkan has a multi-arch problem... The idea behind the Vulkan loader is
that you have a little json file on your disk that tells the loader where
to find drivers. The loader looks for these json files in standard
locations, and then goes and loads the my_driver.so's that they specify.
This allows you as a driver implementer to put their driver wherever on the
disk they want so long as the ICD points in the right place.
For a multi-arch system, however, you may have multiple libvulkan_intel.so
files installed that the loader needs to pick depending on architecture.
Since the ICD file format does not specify any architecture information,
you can't tell the loader where to find the 32-bit version vs. the 64-bit
version. The way that packagers have been dealing with this is to place
libvulkan_intel.so in the top level lib directory and provide just a name
(and no path) to the loader. It will then use the regular system search
paths and find the correct driver. While this solution works fine for
distro-installed Vulkan drivers, it doesn't work so well for user-installed
drivers because they may put it in /opt or $HOME/.local or some other more
exotic location. In this case, you can't use an ICD json file with just a
library name because it doesn't know where to find it; you also have to add
that to your library lookup path via LD_LIBRARY_PATH or similar.
This patch handles both use-cases by taking advantage of the fact that the
loader dlopen()s each of the drivers and, if one dlopen() calls fails, it
silently continues on to open other drivers. By suffixing the icd file, we
can provide two different json files: intel_icd.x86_64.json and
intel_icd.i686.json with different paths. Since dlopen() will only succeed
on the libvulkan_intel.so of the right arch, the loader will happily ignore
the others and load that one. This allows us to properly handle multi-arch
while still providing a full path so user installs will work fine.
I tested this on my Fedora 25 machine with 32 and 64-bit builds of our
Vulkan driver installed and 32 and 64-bit builds of crucible. It seems to
work just fine.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d96345de98)
Squashed with commit:
anv: Always use the full driver path in the intel_icd.*.json
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7ea4ef8849)
Squashed with commit:
configure: Get rid of the --disable-vulkan-icd-full-driver-path flag
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3f05fc62f9)
These two GLES 3.2 entry points were being defined in the category of
the ARB_ES3_2_compatibility and KHR_blend_equation_advanced extensions
respectively instead of in the ES3.2 category. Defining them in the
ES3.2 category makes sure that the gl_procs.py generator emits
declarations in the glprocs.h header file for the unsuffixed GLES-only
entry points that PrimitiveBoundingBoxARB and BlendBarrierKHR
respectively alias. This should avoid a compilation failure during
scons builds in combination with "mapi: export all GLES 3.2 functions
in libGLESv2.so".
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 15a084a039)
Long story short, 3D and CP are aliased on Fermi and initializing
compute after pushing the MS sample coordinate offsets seems to
corrupt 3D state for weird reasons.
I still don't have the faintest clue what is going on, but
this seems to only affect Fermi generation. A possible fix
could be to use two different channels, one for 3D and one
for CP.
This fixes a bunch of regressions pinpointed by piglit.
Fixes: "nvc0: fix up image support for allowing multiple samples"
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 42273edf79)
Even when enabled, primitive restart has no effect when the restart index
is larger than the representable values in the index buffer.
Fixes GL45-CTS.gtf31.GL3Tests.primitive_restart.primitive_restart_upconvert
for radeonsi VI.
v2: add an explanatory comment
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
(cherry picked from commit bfa50f88ce)
Use a full writemask in this case. This is relevant e.g. when a function
has an inout argument which is an array of structs.
v2: use C-style comment (Timothy Arceri)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a1895685f8)
Set the type of the left-hand side to the same as the right-hand side,
so that when the base type is double, the writemask of the MOV instruction
is properly fixed up.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ca592af880)
With ARB_gpu_shader5, texture offsets can be any source, including TEMPs
and IN's. Make sure to process them as regular sources so that we pick
up masks, etc.
This should fix some CTS tests that feed offsets directly to
textureGatherOffset, and we were not picking up the input use, thus not
advertising it in the shader header.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dave Airlie <airlied@redhat.com>
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cd45d758ff)
On a debug llvm build we'd assert on the next compare
when the return from samples_identical was i1 instead
of i32.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d842546ad1)
This commit effectively reverts the following commits:
This reverts commit 0b6837a643.
This reverts commit 05f36435ef.
This reverts commit a2ae67aa47.
While the feature introduced is convinient for development it is not as
useful for distributions. Furthermore it even breaks things as one
wishes to have both 32 and 64 bit package installed on the same system.
Keep the functionality in development branch(es) and drop it from
distribution packages to avoid confusion and misuse.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-10-19 19:10:30 +01:00
334 changed files with 8724 additions and 3061 deletions
@@ -74,11 +75,236 @@ Note: some of the new features are only available with certain drivers.
<h2>Bug fixes</h2>
TBD.
<ul>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=61907">Bug 61907</a> - Indirect rendering of multi-texture vertex arrays broken</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=69622">Bug 69622</a> - eglTerminate then eglMakeCurrent crahes</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=71759">Bug 71759</a> - Intel driver fails with "intel_do_flush_locked failed: No such file or directory" if buffer imported with EGL_NATIVE_PIXMAP_KHR</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=89599">Bug 89599</a> - symbol 'x86_64_entry_start' is already defined when building with LLVM/clang</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=90513">Bug 90513</a> - Odd gray and red flicker in The Talos Principle on GK104</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=91342">Bug 91342</a> - Very dark textures on some objects in indoors environments in Postal 2</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=92306">Bug 92306</a> - GL Excess demo renders incorrectly on nv43</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=94148">Bug 94148</a> - Framebuffer considered invalid when a draw call is done before glCheckFramebufferStatus</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=94354">Bug 94354</a> - R9285 Unigine Valley perf regression since radeonsi: use re-Z</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=94561">Bug 94561</a> - [llvmpipe] PIPE_CAP_VIDEO_MEMORY reports negative value on 32 bits (with 16GB ram)</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=94627">Bug 94627</a> - Game Risen on wine black grass</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=94681">Bug 94681</a> - dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 takes 25 minutes to compile</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=95000">Bug 95000</a> - deqp: assert in dEQP-GLES3.functional.vertex_arrays.single_attribute.strides.fixed.user_ptr_stride17_components2_quads1</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=95130">Bug 95130</a> - Derivatives of gl_Color wrong when helper pixels used</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=95246">Bug 95246</a> - Segfault in glBindFramebuffer()</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=95419">Bug 95419</a> - [HSW][regression][bisect] RPG Maker game gives "invalid floating point operation" at startup</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=96358">Bug 96358</a> - SSO: wrong interface validation between GS and VS (regresion due to latest gles 3.1)</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=96425">Bug 96425</a> - [bisected] occasional dark render in The Talos Principle</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=96484">Bug 96484</a> - [vulkan] deqp-vk.glsl.builtin.precision.sin / cos regression</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=96516">Bug 96516</a> - [bisected: 482526] "clover: Update OpenCL version string to match OpenGL": clover's build fails because of missing git_sha1.h</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=96528">Bug 96528</a> - Location qualifier segfaults during shader compilation</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=96541">Bug 96541</a> - Tonga Unreal elemental bad rendering since radeonsi: Decompress DCC textures in a render feedback loop</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=96607">Bug 96607</a> - [bisected] texture misrender / flicker in The Talos Principle on SKL</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=96617">Bug 96617</a> - gl_SecondaryFragDataEXT doesn't work for extended blend func</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=96639">Bug 96639</a> - st/mesa: transfer_map with too-high level with dEQP-GLES2.functional.texture.completeness.cube.extra_level</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=96791">Bug 96791</a> - Cannot use image from swapchains for sampling</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=96825">Bug 96825</a> - anv_device.c:31:27: fatal error: anv_timestamp.h: No such file or directory</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=96835">Bug 96835</a> - "gallium: Force blend color to 16-byte alignment" crash with "-march=native -O3" causes some 32bit games to crash</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=96850">Bug 96850</a> - Crucible tests fail for 32bit mesa</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=96878">Bug 96878</a> - [Bisected: cc2d0e6][HSW] "GPU HANG" msg after autologin to gnome-session</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=96949">Bug 96949</a> - [regression] Piglit numSamples assertion failures with 9a23a177b90</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=96950">Bug 96950</a> - Another regression from bc4e0c486: vbo: Use a bitmask to track the active arrays in vbo_exec*.</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=96971">Bug 96971</a> - invariant qualifier is not valid for shader inputs</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97019">Bug 97019</a> - [clover] build failure in llvm/codegen/native.cpp:129:52</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97039">Bug 97039</a> - The Talos Principle and Serious Sam 3 GPU faults</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97083">Bug 97083</a> - [IVB,BYT] GPU hang on deqp-gles31.functional.separate.shader.random</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97140">Bug 97140</a> - dd_draw.c:949:11: error: implicit declaration of function 'fmemopen' is invalid in C99 [-Werror,-Wimplicit-function-declaration]</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97207">Bug 97207</a> - [IVY BRIDGE] Fragment shader discard writing to depth</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97214">Bug 97214</a> - X not running with error "Failed to make EGL context current"</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97225">Bug 97225</a> - [i965 on HD4600 Haswell] xcom switch to ingame cinematics cause segmentation fault</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97231">Bug 97231</a> - GL_DEPTH_CLAMP doesn't clamp to the far plane</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97233">Bug 97233</a> - vkQuake VkSpecializationMapEntry related bug</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97260">Bug 97260</a> - R9 290 low performance in Linux 4.7</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97322">Bug 97322</a> - GenerateMipmap creates wrong mipmap for sRGB texture</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97331">Bug 97331</a> - glDrawElementsBaseVertex doesn't work in display list on i915</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97351">Bug 97351</a> - DrawElementsBaseVertex with VBO ignores base vertex on Intel GMA 9xx in some cases</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97413">Bug 97413</a> - BioShock Infinite crashes on startup with Mesa Git version, R7 370</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97549">Bug 97549</a> - [SNB, BXT] up to 40% perf drop from "loader/dri3: Overhaul dri3_update_num_back" commit</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97587">Bug 97587</a> - make check nir/tests/control_flow_tests regression</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97773">Bug 97773</a> - New Mesa master now results in warnings in glrender (and subsurfaces and simple-egl), black screen</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97790">Bug 97790</a> - Vulkan cts regressions due to 24be63066</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97804">Bug 97804</a> - Later precision statement isn't overriding earlier one</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97808">Bug 97808</a> - "tgsi/scan: don't set interp flags for inputs only used by INTERP instructions" causes glitches in wine with gallium nine</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97887">Bug 97887</a> - llvm segfault in janusvr -render vive</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97894">Bug 97894</a> - Crash in u_transfer_unmap_vtbl when unmapping a buffer mapped in different context</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97952">Bug 97952</a> - /usr/include/string.h:518:12: error: exception specification in declaration does not match previous declaration</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97969">Bug 97969</a> - [radeonsi, bisected: fb827c0] Video decoding shows green artifacts</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=97976">Bug 97976</a> - VCE regression BO to small for addr since winsys/amdgpu: enable buffer allocation from slabs</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=98005">Bug 98005</a> - VCE dual instance encoding inconsistent since st/va: enable dual instances encode by sync surface</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=98025">Bug 98025</a> - [radeonsi] incorrect primitive restart index used</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=98128">Bug 98128</a> - nir/tests/control_flow_tests.cpp:79:73: error: ‘nir_loop_first_cf_node’ was not declared in this scope</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=98131">Bug 98131</a> - Compiler should reject lowp/mediump qualifiers on atomic_uints</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=98133">Bug 98133</a> - GetSynciv should raise an error if bufSize < 0</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=98134">Bug 98134</a> - dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers wants a different GL error code</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=98135">Bug 98135</a> - dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.transform_feedback_varyings wants a different GL error code</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=98167">Bug 98167</a> - [vulkan, radv] missing libgcrypt and openssl devel results in linker error in libvulkan_common</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=98172">Bug 98172</a> - Concurrent call to glClientWaitSync results in segfault in one of the waiters.</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=98244">Bug 98244</a> - dEQP: textureOffset(sampler2DArrayShadow, ...) should not exist.</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=98264">Bug 98264</a> - Build broken for i965 due to multiple deifnitions of intelFenceExtension</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=98307">Bug 98307</a> - "st/glsl_to_tgsi: explicitly track all input and output declaration" broke flightgear colors on rs780</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=92634">Bug 92634</a> - gallium's vl_mpeg12_decoder does not work with st/va</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=94512">Bug 94512</a> - X segfaults with glx-tls enabled in a x32 environment</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=94900">Bug 94900</a> - HD6950 GPU lockup loop with various steam games (octodad[always], saints row 4[always], dead island[always], grid autosport[sometimes])</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=98263">Bug 98263</a> - [radv] The Talos Principle fails to launch with "Fatal error: Cannot set display mode."</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=98914">Bug 98914</a> - mesa-vdpau-drivers: breaks vdpau for mpeg2video</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=99144">Bug 99144</a> - Incorrect rendering using glDrawArraysInstancedBaseInstance and first != 0 on Skylake</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=99154">Bug 99154</a> - Link time error when using multiple builtin functions</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=99158">Bug 99158</a> - vdpau segfaults and gpu locks with kodi on R9285</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=98421">Bug 98421</a> - src/loader/loader.c:111:40: error: unknown type name ‘drmDevicePtr’</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=99532">Bug 99532</a> - Compute shader doesn't give right result under some circumstances</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=99631">Bug 99631</a> - segfault with OSVRTrackerView and openscenegraph git master</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=99633">Bug 99633</a> - rasterizer/core/clip.h:279:49: error: ‘const struct API_STATE’ has no member named ‘linkageCount’</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=99692">Bug 99692</a> - [radv] Mostly broken on Hawaii PRO/CIK ASICs</li>
</ul>
<h2>Changes</h2>
<p>Bartosz Tomczyk (2):</p>
<ul>
<li>r600: Fix stack overflow</li>
<li>r600/sb: Fix memory leak</li>
</ul>
<p>Bruce Cherniak (1):</p>
<ul>
<li>swr: [rasterizer core] Remove dead code Clipper::ClipScalar()</li>
</ul>
<p>Chad Versace (1):</p>
<ul>
<li>i965/mt: Disable HiZ when sharing depth buffer externally (v2)</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>radv: change base aligmment for allocated memory.</li>
<li>radv: fix cik macroModeIndex.</li>
<li>radv: adopt some init config workarounds from radeonsi.</li>
</ul>
<p>Derek Foreman (1):</p>
<ul>
<li>egl/dri2: add image_loader_extension back into loader extensions for wayland</li>
</ul>
<p>Emil Velikov (26):</p>
<ul>
<li>docs: add sha256 checksums for 13.0.4</li>
<li>configure.ac: list radeon in --with-vulkan-drivers help string</li>
<li>i965: automake: correctly set MKDIR_GEN</li>
<li>freedreno: automake: correctly set MKDIR_GEN</li>
<li>i965: automake: include builddir prior to srcdir</li>
<li>i915: automake: include builddir prior to srcdir</li>
<li>egl: automake: include builddir prior to srcdir</li>
<li>clover: automake: include builddir prior to srcdir</li>
<li>st/dri: automake: include builddir prior to srcdir</li>
<li>d3dadapter9: automake: include builddir prior to srcdir</li>
<li>glx: automake: include builddir prior to srcdir</li>
<li>glx/apple: automake: include builddir prior to srcdir</li>
<li>glx/windows: automake: include builddir prior to srcdir</li>
<li>loader: automake: include builddir prior to srcdir</li>
<li>mapi: automake: include builddir prior to srcdir</li>
<li>radeon, r200: automake: include builddir prior to srcdir</li>
<li>dri/swrast: automake: include builddir prior to srcdir</li>
<li>dri/osmesa: automake: include builddir prior to srcdir</li>
<li>mesa/tests: automake: include builddir prior to srcdir</li>
<li>bin/get-extra-pick-list: use git merge-base to get the branchpoint</li>
<li>bin/get-extra-pick-list: rework to use already_picked list</li>
<li>bin/get-typod-pick-list.sh: limit `git grep ...' to only as needed</li>
<li>bin/get-pick-list.sh: limit `git grep ...' only as needed</li>
<li>bin/get-pick-list.sh: remove ancient way of nominating patches</li>
<li>bin/get-fixes-pick-list.sh: add new script</li>
<li>Update version to 13.0.5</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>vc4: Avoid emitting small immediates for UBO indirect load address guards.</li>
</ul>
<p>Hans de Goede (1):</p>
<ul>
<li>glx/glvnd: Fix GLXdispatchIndex sorting</li>
</ul>
<p>Ian Romanick (11):</p>
<ul>
<li>linker: Slight code rearrange to prevent duplication in the next commit</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=68504">Bug 68504</a> - 9.2-rc1 workaround for clover build failure on ppc/altivec: cannot convert 'bool' to '__vector(4) __bool int' in return</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=99456">Bug 99456</a> - Firefox crashing when opening about:support with WebGL2 enabled</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=99677">Bug 99677</a> - heap-use-after-free in glsl</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=99715">Bug 99715</a> - Don't print: "Note: Buggy applications may crash, if they do please report to vendor"</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=99850">Bug 99850</a> - Tessellation bug on Carrizo</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=100049">Bug 100049</a> - "ralloc: Make sure ralloc() allocations match malloc()'s alignment." causes seg fault in 32bit build</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (2):</p>
<ul>
<li>radv: Emit pending flushes before executing a secondary command buffer</li>
<li>radv: Flush before copying with PKT3_WRITE_DATA in CmdUpdateBuffer</li>
</ul>
<p>Bartosz Tomczyk (1):</p>
<ul>
<li>glsl: fix heap-buffer-overflow</li>
</ul>
<p>Bas Nieuwenhuizen (8):</p>
<ul>
<li>radv: Pass CMASK alignment to application.</li>
<li>radv: Pass DCC alignment to application.</li>
<li>radv: Never try to create more than max_sets descriptor sets.</li>
<li>radv: Reset emitted compute pipeline when calling secondary cmd buffer.</li>
<li>radv: Only use PKT3_OCCLUSION_QUERY when it doesn't hang.</li>
<li>radv: Use correct size for availability flag.</li>
<li>radv: Disable HTILE for textures with multiple layers/levels.</li>
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.