This manifested as rendering failures or sometimes GPU hangs in
compositors when they accidentally got MSAA visuals due to a bug in the X
Server. Today we decided that the problem in compositors was equivalent
to a corruption bug we'd noticed recently in resizing MSAA-visual
glxgears, and debugging got a lot easier.
When we allocate our MCS MT, libdrm takes the size we request, aligns it
to Y tile size (blowing it up from 300x300=90000 bytes to 384*320=122880
bytes, 30 pages), then puts it into a power-of-two-sized BO (131072 bytes,
32 pages). Because it's Y tiled, we attach a 384-byte-stride fence to it.
When we memset by the BO size in Mesa, between bytes 122880 and 131072 the
data gets stored to the first 20 or so scanlines of each of the 3 tiled
pages in that row, even though only 2 of those pages were allocated by
libdrm. In the glxgears case, the missing 3rd page happened to
consistently be the static VBO that got mapped right after the first MCS
allocation, so corruption only appeared once window resize made us throw
out the old MCS and then allocate the same BO to back the new MCS.
Instead, just memset the amount of data we actually asked libdrm to
allocate for, which will be smaller (more efficient) and not overrun.
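The size arithmetic described above can be checked with a small sketch. The helper names (align_up, next_pow2, mcs_compute_sizes) are hypothetical and are not libdrm or Mesa functions; the Y-tile geometry of 128 bytes by 32 rows and the one-byte-per-pixel MCS layout are assumptions chosen to match the numbers quoted in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed Y-tile geometry: 128 bytes wide, 32 rows tall (one 4096-byte page). */
enum { TILE_W = 128, TILE_H = 32 };

/* Round v up to a multiple of a (hypothetical helper). */
static size_t align_up(size_t v, size_t a)
{
    assert(a != 0);
    return (v + a - 1) / a * a;
}

/* Round v up to the next power of two (hypothetical helper). */
static size_t next_pow2(size_t v)
{
    size_t p = 1;
    while (p < v)
        p <<= 1;
    return p;
}

struct mcs_sizes {
    size_t stride;     /* bytes per row after tile alignment */
    size_t requested;  /* tile-aligned bytes we actually asked for */
    size_t bo_size;    /* bytes in the power-of-two-sized BO */
};

/* One byte per pixel is assumed for the MCS surface. */
static struct mcs_sizes mcs_compute_sizes(size_t width, size_t height)
{
    struct mcs_sizes s;
    s.stride = align_up(width, TILE_W);
    s.requested = s.stride * align_up(height, TILE_H);
    s.bo_size = next_pow2(s.requested);
    return s;
}
```

For a 300x300 surface this yields a 384-byte stride, 122880 requested bytes and a 131072-byte BO. A memset over the BO size therefore writes 8192 bytes past the requested size, which is a little over 21 rows of 384 bytes.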
Thanks go to Kenneth for doing most of the hard debugging to eliminate a
lot of the search space for the bug.
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77207
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 7ae870211d)
Gen6+ allows for color buffers to use a vertical alignment of either 4
or 2. Previously we defaulted to 2. This may have caused problems on
Gen7 because Y-tiled render targets are not allowed to use a vertical
alignment of 2.
This patch changes the vertical alignment to 4 on Gen7, except for the
few formats where a vertical alignment of 2 is required.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 6b40dd17cf)
With commit 1f1928db001 (glx: Drop _Xglobal_lock while we create and
initialize glx display) we've split the big _Xglobal_lock handling in
a more fine-grained manner.
Unfortunately we forgot to drop the unlock_mutex on the error paths,
leading to undefined behaviour as the mutex is already unlocked.
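A minimal sketch of the pattern, assuming a POSIX mutex in place of _Xglobal_lock. The names global_lock, cached_display and get_display are hypothetical and only illustrate why the error path must return without unlocking once the lock has already been dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static int cached_display;   /* stand-in for the shared display list */

/*
 * Hypothetical lookup-or-create routine. The lock is held only around
 * the cache accesses; the slow initialization runs unlocked.
 * Returns 0 on success, -1 on failure.
 */
static int get_display(int init_ok, int *out)
{
    assert(out != NULL);

    pthread_mutex_lock(&global_lock);
    if (cached_display != 0) {
        *out = cached_display;
        pthread_mutex_unlock(&global_lock);
        return 0;
    }
    pthread_mutex_unlock(&global_lock);

    /* Slow path: the mutex is NOT held here. */
    if (!init_ok) {
        /* Error path: return directly. Unlocking again here would be
         * undefined behaviour, because we no longer own the mutex. */
        return -1;
    }

    pthread_mutex_lock(&global_lock);
    cached_display = 42;
    *out = cached_display;
    pthread_mutex_unlock(&global_lock);
    return 0;
}
```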
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit f9832f960f)
Fixes a crash in svga_context_flush_buffers() if we use the 'draw' module
for AA lines (when the device doesn't support that feature). We need to
initialize this list before we setup the swtnl pieces.
Found/fixed by Charmaine Lee.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
(cherry picked from commit e853ade544)
Conflicts:
src/gallium/drivers/svga/svga_context.c
Decompressing ETC2 textures was causing intermittent segfaults
by copying resulting 4x4 texel block to the destination texture
regardless of the size of the destination texture. Issue found
via application crash in GLBenchmark 3.0's Manhattan test.
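The clamping idea can be sketched as follows. This is not the Mesa ETC2 code: store_block_clamped and its parameters are hypothetical, and RGBA8 texels are assumed. The copy limits are computed once, outside the loop, and a block that straddles the right or bottom edge is only partially written.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#define BPP 4u   /* bytes per texel, RGBA8 assumed */

/*
 * Copy one decoded 4x4 block into the destination image at block origin
 * (x, y), clipped against the image size. "block" holds 4 rows of 4
 * texels, row-major; dst_stride is in bytes. All names are hypothetical.
 */
static void store_block_clamped(uint8_t *dst, size_t dst_stride,
                                unsigned width, unsigned height,
                                unsigned x, unsigned y,
                                const uint8_t *block)
{
    assert(x < width && y < height);

    /* Compute the limits once, outside the copy loop. */
    const unsigned bw = MIN2(4u, width - x);
    const unsigned bh = MIN2(4u, height - y);

    for (unsigned j = 0; j < bh; j++) {
        memcpy(dst + (y + j) * dst_stride + (size_t)x * BPP,
               block + (size_t)j * 4 * BPP,
               (size_t)bw * BPP);
    }
}
```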
v2: add more detailed comment. Compute limit outside inner loops.
v3: add bugzilla reference
v4: Correct cc syntax in commit log
v5: really grab the right patch
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74988
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1, suggested v2-3]
(cherry picked from commit cb4ad13685)
For TEX instructions, the set of samplers and sampler views should
be consistent. The XA state tracker sometimes passes an inconsistent
set of samplers and sampler views. Rather than assert and die, issue
a warning.
v2: add debugging code to detect inconsistent state.
v3: also check for null sampler in svga_state_tss.c
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
(cherry picked from commit 9bb2ec6fd1)
Conflicts:
src/gallium/drivers/svga/svga_state_fs.c
The pkg-config module was called "EXPAT" instead of "expat" in
PKG_CHECK_EXISTS. This seems to have been wrong because the wrong
argument was copied from PKG_CHECK_MODULES.
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 476db98e03)
We want to call pipe->set_sampler_views() with count being the
maximum of the old number of sampler views and the new number.
This makes sure we null-out any old sampler views.
We already do the same thing for sampler states in single_sampler_done().
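A rough sketch of the count computation, using stand-in types rather than the real gallium interfaces. fake_pipe, fake_set_sampler_views and update_sampler_views are hypothetical names; only the max(old, new) idea comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VIEWS 16
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical stand-in for the driver's bound sampler-view slots. */
struct fake_pipe {
    const void *bound[MAX_VIEWS];
};

/* Hypothetical stand-in for a set_sampler_views-style driver hook. */
static void fake_set_sampler_views(struct fake_pipe *pipe, unsigned count,
                                   const void *const *views)
{
    for (unsigned i = 0; i < count; i++)
        pipe->bound[i] = views[i];
}

/*
 * Bind new_count views. Passing max(old, new) as the count, with NULL in
 * the trailing slots, makes the driver drop any stale bindings.
 * Returns the new count so the caller can remember it.
 */
static unsigned update_sampler_views(struct fake_pipe *pipe,
                                     unsigned old_count,
                                     const void *const *new_views,
                                     unsigned new_count)
{
    const void *views[MAX_VIEWS] = { NULL };

    assert(old_count <= MAX_VIEWS && new_count <= MAX_VIEWS);
    for (unsigned i = 0; i < new_count; i++)
        views[i] = new_views[i];

    fake_set_sampler_views(pipe, MAX2(old_count, new_count), views);
    return new_count;
}
```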
Fixes some assertions seen in the VMware driver with XA tracker.
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Tested-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 2355a64414)
The underlying glDrawArrays() calls weren't getting compiled into
the display list. We simply need to use the current dispatch table
so the CALL_DrawArrays() is routed to the display list save function.
This patch also fixes glMultiModeDrawArraysIBM and
glMultiModeDrawElementsIBM.
Fixes the new piglit gl-1.4-dlist-multidrawarrays test.
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit e341856294)
Don't pass null query object pointers into gallium functions.
This avoids segfaulting in the VMware driver (and others?) if the
pipe_context::create_query() call fails and returns NULL.
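The guard amounts to something like the sketch below. The fake_* types and functions are hypothetical stand-ins for the driver hooks; passing NULL as the storage argument simulates a failed create call.

```c
#include <assert.h>
#include <stddef.h>

struct fake_query {
    int active;
};

/* Hypothetical create hook: returns NULL when no storage is available. */
static struct fake_query *fake_create_query(struct fake_query *storage)
{
    return storage;
}

/* Hypothetical begin hook: dereferences the query, so NULL would crash. */
static void fake_begin_query(struct fake_query *q)
{
    q->active = 1;
}

/* Returns 1 if the query was started, 0 if creation failed. */
static int begin_query_checked(struct fake_query *storage)
{
    struct fake_query *q = fake_create_query(storage);

    if (q == NULL)
        return 0;   /* report the failure instead of passing NULL down */

    fake_begin_query(q);
    return 1;
}
```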
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 488d4c4826)
Release the references to the sampler views before
destroying the pipe context.
v2: remove TODO and unrelated change
v3: move to st_texture.[ch], rename callback, add comment
v4: fix rebase mess up and add further cleanups
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d117ddbe31)
With shared glx contexts it is possible that a texture is created and used
in one context and then used in another one resulting in incorrect
sampler view usage.
v2: avoid template copy
v3: add XXX comment
Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 92e543c45d)
EXT_packed_depth_stencil is supported by all drivers, but
ARB_depth_texture isn't (notably nouveau_vieux). This should avoid
passing unexpected values down to ChooseTextureFormat.
The EXT_packed_depth_stencil spec does not make any explicit references
to requiring ARB_depth_texture in order to allow textures with that
format, however if there is no dependency, ARB_depth_texture would be
practically implied by the extension.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Note for 10.0 backport: This will produce a conflict, the solution is to
move the surrounding if as well.
(cherry picked from commit 18690995a6)
Conflicts:
src/mesa/main/teximage.c
nouveau_fence_wait has the expectation that an external entity is
holding onto the fence being waited on, not that it is merely held onto
by the current pointer. Fixes a use-after-free in nouveau_fence_wait
when used on the screen's current fence.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75279
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 507f0230d4)
Conflicts:
src/gallium/drivers/nouveau/nv30/nv30_screen.c
Fixes glGetTexImage() when converting from MESA_FORMAT_Z32_FLOAT_S8X24_UINT
to GL_UNSIGNED_INT_24_8. Hit by the piglit
ext_packed_depth_stencil-getteximage test.
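For illustration, the conversion to the GL_UNSIGNED_INT_24_8 layout (depth in the upper 24 bits, stencil in the lower 8) can be sketched like this. pack_uint_24_8 is a hypothetical name, not the Mesa helper, and the truncating conversion is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack a float depth in [0, 1] and an 8-bit stencil value into the
 * GL_UNSIGNED_INT_24_8 layout. The function name is hypothetical.
 */
static uint32_t pack_uint_24_8(float depth, uint8_t stencil)
{
    assert(depth >= 0.0f && depth <= 1.0f);

    /* Scale in double so the 24-bit maximum is represented exactly. */
    const uint32_t z24 = (uint32_t)((double)depth * 0xffffff);

    return (z24 << 8) | stencil;
}
```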
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit a12d4d0398)
BRW_MAX_TEX_UNIT is the static limit on the number of textures we
support per-stage, not in total.
Core's `Unit` array is sized by MAX_COMBINED_TEXTURE_IMAGE_UNITS, which
is significantly larger, and across the various shader stages, up to
ctx->Const.MaxCombinedTextureImageUnits elements of it may be actually
used.
Fixes invisible bad behavior in piglit's max-samplers test (although
this escalated to an assertion failure on HSW with texture_view, since
non-immutable textures only have _Format set by validation.)
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit befbda56a2)
glGetTexImage(GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8) was just
using memcpy() instead of _mesa_unpack_uint_24_8_depth_stencil_row()
to convert texels from the hardware format to the GL format.
Fixes issue reported by David Meng at Intel. The new piglit
ext_packed_depth_stencil-getteximage test checks for this bug.
Also, add some format/type assertions. We don't yet handle the
GL_FLOAT_32_UNSIGNED_INT_24_8_REV type. That should be fixed in
a follow-on patch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 43dee0295e)
The sort priorities for GLX_SAMPLES and GLX_SAMPLE_BUFFERS are
not defined in GL_ARB_multisample, but they are defined in
the GLX 1.4 specification.
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 3616e862f2)
The default values for GLX_DRAWABLE_TYPE and GLX_RENDER_TYPE are
GLX_WINDOW_BIT and GLX_RGBA_BIT respectively, as specified in
the GLX 1.4 specification.
This fixes the glx-choosefbconfig-defaults piglit test.
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit f41c2f6c33)
brw_init_state() calls brw_upload_initial_gpu_state(). If hardware
contexts are enabled (brw->hw_ctx != NULL), this will upload some
initial invariant state for the GPU. Without hardware contexts, we
rely on this state being uploaded via atoms that subscribe to the
BRW_NEW_CONTEXT bit.
Commit 46d3c2bf4d accidentally moved
the call to brw_init_state() before creating a hardware context.
This meant brw_upload_initial_gpu_state would always early return.
Except on Gen6+, we stopped uploading the initial GPU state via
state atoms, so it never happened.
Fixes a regression since 46d3c2bf4d.
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 3663bbe773)
From page 14 (page 20 of the PDF) of the GLSL 1.10 spec:
"In addition, all identifiers containing two consecutive underscores
(__) are reserved as possible future keywords."
The intention is that names containing __ are reserved for internal use
by the implementation, and names prefixed with GL_ are reserved for use
by Khronos. Names simply containing __ are dangerous to use, but should
be allowed.
Per the Khronos bug mentioned below, a future version of the GLSL
specification will clarify this.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Cc: Tapani Pälli <lemody@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Bugzilla: Khronos #11702
(cherry picked from commit 2c85fd5a96)
Section 3.3 (Preprocessor) of the GLSL 1.30 spec (and later) and the
GLSL ES spec (all versions) say:
"All macro names containing two consecutive underscores ( __ ) are
reserved for future use as predefined macro names. All macro names
prefixed with "GL_" ("GL" followed by a single underscore) are also
reserved."
The intention is that names containing __ are reserved for internal use
by the implementation, and names prefixed with GL_ are reserved for use
by Khronos. Since every extension adds a name prefixed with GL_ (i.e.,
the name of the extension), that should be an error. Names simply
containing __ are dangerous to use, but should be allowed. In similar
cases, the C++ preprocessor specification says, "no diagnostic is
required."
Per the Khronos bug mentioned below, a future version of the GLSL
specification will clarify this.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Cc: Tapani Pälli <lemody@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Bugzilla: Khronos #11702
(cherry picked from commit 0bd7892630)
GL_ARB_ES2_compatibility doesn't say anything about shader linking
when one of the shaders (vertex or fragment shader) is absent. So,
the extension shouldn't change the behavior specified in the GLSL
specification.
Tested the behavior on the proprietary Linux drivers from NVIDIA and AMD.
Both of them allow linking a version 100 shader program in an OpenGL
context when one of the shaders is absent.
Makes the following Khronos CTS tests pass:
successfulcompilevert_linkprogram.test
successfulcompilefrag_linkprogram.test
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 03597cf802)
If built without llvm, the following error occurs with mplayer:
Failed to open VDPAU backend .../libvdpau_r600.so: undefined symbol: _ZTVN10__cxxabiv117__class_type_infoE
[vo/vdpau] Error when calling vdp_device_create_x11: 1
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 61f6cddef7)
As documented, the _mesa_free_shader_program_data function:
"Frees all the data that hangs off a shader program object, but not
the object itself."
This means that this function may be called multiple times on the same object,
(and has been observed to). Meanwhile, the shProg->Label field was not being
set to NULL after its free(). This led to a second call to free() of the same
address on the second call to this function.
Fix this by setting this field to NULL after free(), (just as with all other
calls to free() in this function).
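The fix pattern, sketched with a hypothetical stand-in struct (fake_program and free_program_data are not the Mesa names). Because free(NULL) is a no-op, resetting the pointer makes repeated calls safe.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a program object with heap-allocated data. */
struct fake_program {
    char *label;
};

/* Frees the data hanging off the object, but not the object itself. */
static void free_program_data(struct fake_program *prog)
{
    assert(prog != NULL);

    free(prog->label);
    prog->label = NULL;   /* a second call now passes NULL to free() */
}
```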
Reviewed-by: Brian Paul <brianp@vmware.com>
CC: mesa-stable@lists.freedesktop.org
(cherry picked from commit a92581acf2)
Commit f4ebcd133b ("dri/nouveau: NV17_3D class is not available for
NV1a chipset") fixed this partially by using the correct 3d class.
However there were a lot of checks left over comparing against the
chipset.
Reported-and-tested-by: John F. Godfrey <jfgodfrey@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 9.2 10.0 10.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 0c8b165366)
Currently we create a OPENGL_COMPAT context regardless of
what was requested by the program. Correct that by retaining
the program's request and passing it into _mesa_initialize_context.
Based on a similar commit for radeon/r200 by Ian Romanick.
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 76d9f6d972)
Consider a multithreaded program with two contexts A and B, and the
following scenario:
1. Context A calls initialize(), which allocates mem_ctx and starts
building built-ins.
2. Context B calls initialize(), which sees mem_ctx != NULL and assumes
everything is already set up. It returns.
3. Context B calls find(), which fails to find the built-in since it
hasn't been created yet.
4. Context A finally finishes initializing the built-ins.
This will break at step 3. Adding a lock ensures that subsequent
callers of initialize() will wait until initialization is actually
complete.
Similarly, if any thread calls release while another thread is still
initializing, or calling find(), the mem_ctx/shader would get freed
out from under it, leading to corruption or use-after-free crashes.
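A simplified sketch of the locking scheme, assuming a POSIX mutex. The builtins_* names and the integer stand-ins for the real built-in data are hypothetical; the point is that initialize, find and release all take the same lock, so no caller can observe a half-built or already-freed state.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t builtins_lock = PTHREAD_MUTEX_INITIALIZER;
static int builtins_ready;   /* stand-in for mem_ctx != NULL */
static int builtin_count;    /* stand-in for the built-in function list */

/* Later callers block here until the first caller has finished. */
static void builtins_initialize(void)
{
    pthread_mutex_lock(&builtins_lock);
    if (!builtins_ready) {
        builtin_count = 3;   /* stand-in for building the built-ins */
        builtins_ready = 1;
    }
    pthread_mutex_unlock(&builtins_lock);
}

/* Returns 1 if the built-in with this index exists, 0 otherwise. */
static int builtins_find(int index)
{
    int found;

    pthread_mutex_lock(&builtins_lock);
    found = builtins_ready && index >= 0 && index < builtin_count;
    pthread_mutex_unlock(&builtins_lock);
    return found;
}

/* Tears everything down; cannot run while a find() is in progress. */
static void builtins_release(void)
{
    pthread_mutex_lock(&builtins_lock);
    builtin_count = 0;
    builtins_ready = 0;
    pthread_mutex_unlock(&builtins_lock);
}
```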
Fixes sporadic failures in Piglit's glx-multithread-shader-compile.
Bugzilla: https://bugs.freedesktop.org/69200
Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.1 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b47d231526)
Apparently some players are ill-prepared for us claiming that a decoder
exists only to have creating it fail, and express this poor preparation
with crashes (e.g. flash). Check that firmware is there to increase the
chances of there being a high correlation between reported capabilities
and ability to create a decoder.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 10.0 10.1 <mesa-stable@lists.freedesktop.org>
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 40dd777b33)
nvfx_fragprog_assign_generic only allows for up to 10/8 texcoords for
nv40/nv30. This fixes compilation of the varying-packing tests.
Furthermore it appears that the last 2 inputs on nv4x don't seem to
work in those tests, so just report 8 everywhere for now.
Tested on NV42, NV44. NV4B appears to have additional problems.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 9.1 9.2 10.0 10.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 356aff3a5c)
Mesa fails to retain the precision qualifier when parsing:
#version 300 es
centroid in mediump vec2 v;
Consider how the parser's type_qualifier production is applied.
First, the precision_qualifier rule creates a new ast_type_qualifier:
<precision: mediump>
Then the storage_qualifier rule creates a second one:
<flags: in>
and calls merge_qualifier() to fold in any previous qualifications,
returning:
<flags: in, precision: mediump>
Finally, the auxiliary_storage_qualifier creates one for "centroid":
<flags: centroid>
it then does $$ = $1 and $$.flags |= $2.flags, resulting in:
<flags: centroid, in>
Since precision isn't stored in the flags bitfield, it is lost. We need
to instead call merge_qualifier to combine all the fields.
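The difference between the two combination strategies can be sketched as below. The struct and both functions are hypothetical simplifications of the parser's qualifier handling, with precision kept outside the flags bitfield as described above.

```c
#include <assert.h>

enum { Q_IN = 1 << 0, Q_CENTROID = 1 << 1 };
enum precision { PREC_NONE = 0, PREC_LOW, PREC_MEDIUM, PREC_HIGH };

/* Hypothetical, simplified qualifier: precision is not part of flags. */
struct qualifier {
    unsigned flags;
    enum precision precision;
};

/* The buggy combination: only the flags of the second operand survive. */
static struct qualifier combine_flags_only(struct qualifier aux,
                                           struct qualifier rest)
{
    struct qualifier out = aux;   /* $$ = $1 */
    out.flags |= rest.flags;      /* $$.flags |= $2.flags */
    return out;
}

/* A merge that folds in every field, in the spirit of merge_qualifier(). */
static struct qualifier merge_all_fields(struct qualifier aux,
                                         struct qualifier rest)
{
    struct qualifier out = aux;
    out.flags |= rest.flags;
    if (rest.precision != PREC_NONE)
        out.precision = rest.precision;
    return out;
}
```

With aux = <flags: centroid> and rest = <flags: in, precision: mediump>, the first function drops mediump while the second keeps it.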
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reported-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 2062f40d81)
If st_GetTexImage() is to decompress the texture, avoid the fallback
path even if prefer_blit_based_texture_transfer = false. For drivers
that returned PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER = 0, we
were always taking the fallback path for texture decompression rather
than rendering a quad. The latter is a lot faster.
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit f47e596288)
From the GLSL 4.40 spec, section 6.4 (Jumps):
The continue jump is used only in loops. It skips the remainder of
the body of the inner most loop of which it is inside. For while
and do-while loops, this jump is to the next evaluation of the
loop condition-expression from which the loop continues as
previously defined.
Previously, we incorrectly treated a "continue" statement as jumping
to the top of a do-while loop.
This patch fixes the problem by replicating the loop condition when
converting the "continue" statement to IR. (We already do a similar
thing in "for" loops, to ensure that "continue" causes the loop
expression to be executed).
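C gives "continue" inside a do-while loop the same meaning as the quoted GLSL text, so the intended behaviour can be illustrated in C. The second function is a hand-written sketch of the lowering idea (condition replicated in front of the continue) and is not the IR that the compiler actually generates; both function names are hypothetical.

```c
#include <assert.h>

/* Count iterations of a do-while whose body always hits "continue". */
static int count_do_while_iterations(int limit)
{
    int i = 0;

    do {
        i++;
        continue;   /* jumps to the condition check, not to the top */
    } while (i < limit);

    return i;
}

/* The same loop lowered to an unconditional loop with explicit breaks. */
static int count_lowered_iterations(int limit)
{
    int i = 0;

    for (;;) {
        i++;
        /* Replicated loop condition in front of the lowered "continue".
         * Without it, the loop would never evaluate its condition. */
        if (!(i < limit))
            break;
        continue;
    }

    return i;
}
```

If "continue" jumped straight to the top of the body, the first loop would never terminate; with the condition evaluated on every continue, both versions run the body the same number of times.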
Fixes piglit tests:
- glsl-fs-continue-inside-do-while.shader_test
- glsl-vs-continue-inside-do-while.shader_test
- glsl-fs-continue-in-switch-in-do-while.shader_test
- glsl-vs-continue-in-switch-in-do-while.shader_test
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 7f5740899f)
In addition to making it public, we also need to change its first
argument from an ir_loop * to an exec_list *, so that it can be used
to insert the condition anywhere in the IR (rather than just in the
body of the loop).
This will be necessary in order to make continue statements work
properly in do-while loops.
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 56790856b3)
This is really not needed as blorp blit programs already sample
XRGB normally and get alpha channel set to 1.0 automatically by
the sampler engine. This is simply copied directly to the payload
of the render target write message and hence there is no need for
any additional blending support from the pixel processing pipeline.
The blending formula is broken for color components anyway: it
multiplies the color component with itself (the blend factor is the
component itself).
Alpha blending, in turn, would not fix the alpha to one independently
of the source but would simply use the source alpha as is
(1.0 * src_alpha + 0.0 * dst_alpha).
Quoting Eric:
"If we want to actually make the no-alpha-bits-present thing work,
we need to override the bits in the surface state or in the
generated code. In the normal draw path, it's done for sampling
by the swizzling code in brw_wm_surface_state.c, and the blending
overrides is just to fix up the alpha blending stage which
doesn't pay attention to that for the destination surface."
If one modifies piglit test gl-3.2-layered-rendering-blit to use
color component values other than zero or one, this change will
kick in on IVB. No regressions on IVB.
This is effectively a revert of c0554141a9:
i965/blorp: Support overriding destination alpha to 1.0.
Currently, Blorp requires the source and destination formats to be
equal. However, we'd really like to be able to blit between XRGB and
ARGB formats; our BLT engine paths have supported this for a long time.
For ARGB -> XRGB, nothing needs to occur: the missing alpha is already
interpreted as 1.0. For XRGB -> ARGB, we need to smash the alpha
channel to 1.0 when writing the destination colors. This is fairly
straightforward with blending.
For now, this code is never used, as the source and destination formats
still must be equal. The next patch will relax that restriction.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 933be19cdf)
When we clipped a line, we weren't copying the provoking vertex
color to the second vertex. We also weren't checking for
first vs. last provoking vertex.
Fixes failures found with the new piglit line-flat-clip-color test.
Cc: "10.0, 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit fc3fcd1e01)
For these objects, meta was already using the non-Apple function to
delete the objects. Everywhere else in the file uses
_mesa_GenVertexArrays and _mesa_BindVertexArrays.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit abfa65ca81)
The hardware decompression path isn't even close to being able to handle
this. This converts the crash (assertion failure) in
"EXT_texture_compression_s3tc/getteximage-targets S3TC CUBE_ARRAY" to a
plain old failure.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 070f55d893)
_mesa_meta_DrawPixels creates a VAO and (potentially) two fragment
programs, but none of them are ever released. Leaking piles of memory
is generally frowned upon.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fcb498302b)
decompress_texture_image creates an FBO, an RBO, a VBO, a VAO, and a
sampler object, but none of them are ever released. Later patches will
add program objects, exacerbating the problem. Leaking piles of memory
is generally frowned upon.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2d3f92e881)
OpenGL 3.3 spec expects GL_INVALID_OPERATION:
"For both the default framebuffer and framebuffer objects, the
constants FRONT, BACK, LEFT, RIGHT, and FRONT_AND_BACK are not
valid in the bufs array passed to DrawBuffers, and will result
in the error INVALID_OPERATION."
But OpenGL 4.0 spec changed the error code to GL_INVALID_ENUM:
"For both the default framebuffer and framebuffer objects, the
constants FRONT, BACK, LEFT, RIGHT, and FRONT_AND_BACK are not
valid in the bufs array passed to DrawBuffers, and will result
in the error INVALID_ENUM."
This patch changes the behaviour to match the OpenGL 4.0 spec.
Fixes Khronos OpenGL CTS draw_buffers_api.test.
V2: Update the comment in code.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 3303475558)
This patch handles the use of the 'centroid' qualifier with 'in' variables
in a fragment shader when per-sample shading is enabled. Per-sample
shading for the whole fragment shader can be enabled by calling
glEnable(GL_SAMPLE_SHADING) or by using the {gl_SamplePosition, gl_SampleID}
builtin variables in the fragment shader. This is explained in more
detail below.
/* Enable sample shading using OpenGL API */
glEnable(GL_SAMPLE_SHADING);
glMinSampleShading(1.0);
Example fragment shader:
in vec4 a;
centroid in vec4 b;
void main()
{
...
}
Variable 'a' will be interpolated at the sample location. But what
interpolation should we use for variable 'b'?
ARB_sample_shading recommends interpolation at sample position for
all the variables. GLSL 400 (and earlier) spec says that:
"When an interpolation qualifier is used, it overrides settings
established through the OpenGL API."
But this text was deleted in later versions of GLSL.
NVIDIA's and AMD's proprietary Linux drivers (at OpenGL 4.3)
interpolate at sample position. This convinces me to use
a similar approach on Intel hardware.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit f5cfb4ae21)
and
i965: Ignore 'centroid' interpolation qualifier in case of persample shading
I missed this change in commit f5cfb4a. It fixes the incorrect
rendering caused in Dolphin Emulator.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73915
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Markus Wick <wickmarkus@web.de>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit dc2f94bc78)
Current implementation of arb_sample_shading doesn't set 'Barycentric
Interpolation Mode' correctly. We use pixel barycentric coordinates
for per sample shading. Instead we should select perspective sample
or non-perspective sample barycentric coordinates.
It also enables using sample barycentric coordinates in case of a
fragment shader variable declared with 'sample' qualifier.
e.g. sample in vec4 pos;
A piglit test to verify the implementation has been posted on piglit
mailing list for review.
V2: Do not interpolate all the 'in' variables at sample position
if fragment shader uses 'sample' qualifier with one of them.
For example we have a fragment shader:
#version 330
#extension GL_ARB_gpu_shader5 : require
sample in vec4 a;
in vec4 b;
void main()
{
...
}
Only 'a' should be sampled at sample location, not 'b'.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit a92e5f7cf6)
For 3 of the 4, I was already ignoring them since they were not picking
cleanly. Now, Anuj has explicitly requested they be ignored since they all
depend on a series that is not yet on the 10.0 branch.
This is a squash of three related cherry-picks from master.
[PATCH 1/3]
i965/gen6/blorp: Set need_workaround_flush immediately after primitive
This patch makes the workaround code in gen6 blorp follow the pattern
established in the regular draw path. It shouldn't result in any
behavioral change.
On gen6, there are two places where we emit 3D_CMD_PRIM: brw_emit_prim()
and gen6_blorp_emit_primitive(). brw_emit_prim() sets
need_workaround_flush immediately after emitting the primitive, but
blorp does not. Blorp sets need_workaround_flush at the bottom of
brw_blorp_exec().
This patch moves the need_workaround_flush from brw_blorp_exec() to
gen6_blorp_emit_primitive(). There is no need to set
need_workaround_flush in gen7_blorp_emit_primitive() because the
workaround applies only to gen6.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 5e0cd58de4)
[PATCH 2/3]
i965/gen6/blorp: Set need_workaround_flush at top of blorp
Unconditionally set brw->need_workaround_flush at the top of gen6 blorp
state emission.
The art of emitting workaround flushes on Sandybridge is mysterious and
not fully understood. Ken and I believe that
intel_emit_post_sync_nonzero_flush() may be required when switching from
regular drawing to blorp. This is an extra safety measure to prevent
undiscovered difficult-to-diagnose gpu hangs.
I verified that on ChromeOS, pre-patch, need_workaround_flush was not
set at the top of blorp, as Paul expected. To verify, I inserted the
following debug code at the top of gen6_blorp_exec(), restarted the ui,
and inspected the logs in /var/log/ui. The abort gets triggered so early
that the browser never appears on the display.
static void
gen6_blorp_exec(...)
{
if (!brw->need_workaround_flush) {
fprintf(stderr, "chadv: %s:%d\n", __FILE__, __LINE__);
abort();
}
...
}
CC: Kenneth Graunke <kenneth@whitecape.org>
CC: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 6a5c86f486)
[PATCH 3/3]
i965/gen6/blorp: Remove redundant HiZ workaround
Commit 1a92881 added extra flushes to fix a HiZ hang in
WebGL Google Maps. With the extra flushes emitted by the previous two
patches, the flushes added by 1a92881 are redundant.
Tested with the same criteria as in 1a92881: by zooming in and out
continuously for 2 hours on Sandybridge Chrome OS (codename
Stumpy) without a hang.
CC: Kenneth Graunke <kenneth@whitecape.org>
CC: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 90368875e7)
Conflicts:
src/mesa/drivers/dri/i965/gen6_blorp.cpp
We were calling draw_total_vs_outputs() too early. The call to
draw_pt_emit_prepare() could result in the vertex size changing.
So call draw_total_vs_outputs() after draw_pt_emit_prepare().
This fix would seem to be needed for the non-LLVM code as well,
but it's not obvious. Instead, I added an assertion there to
try to catch this problem if it were to occur there.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72926
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit ad814d04ca)
Conflicts:
src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline.c
Simple shaders such as:
    void splat(vec2 v, float f) {
        v[0] = v[1] = f;
    }
failed to compile with the following error:
error: value of type vec2 cannot be assigned to variable of type float
First, we would process v[1] = f, and transform:
LHS: (expression float vector_extract (var_ref v) (constant int (1)))
RHS: (var_ref f)
into:
LHS: (var_ref v)
RHS: (expression vec2 vector_insert (var_ref v) (constant int (1))
(var_ref f))
Note that the LHS type is now vec2, not a float. This is surprising,
but not the real problem.
After emitting assignments, this ultimately becomes:
(declare (temporary) vec2 assignment_tmp)
(assign (xy)
(var_ref assignment_tmp)
(expression vec2 vector_insert (var_ref v) (constant int (1))
(var_ref f)))
(assign (xy) (var_ref v) (var_ref assignment_tmp))
We would then return (var_ref assignment_tmp) as the rvalue, which has
the wrong type---it should be float, but is instead a vec2.
To fix this, we simply return (vector_extract (var_ref assignment_tmp)
<the appropriate channel>) to pull out the desired float value.
Fixes Piglit's chained-assignment-with-vector-constant-index.vert and
chained-assignment-with-vector-dynamic-index.vert tests.
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74026
Reported-by: Dan Ginsburg <dang@valvesoftware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 44a86e2b4f)
NV3x cards don't support NPOT textures. Technically this restriction
could be worked around, but since it also doesn't expose any video
decoding hw, just turn it off entirely.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 00e4314f6d)
The textures array is defined as a number of NV50_MAX_PIPE_CONSTBUFS
per shader stage. Currently the nv50 driver handles only 3 shader
stages, thus we wreak havoc when accessing the array out of bounds.
Cc: 9.1 9.2 10.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 12e744abbb)
The textures array is defined as a number of PIPE_MAX_SAMPLERS per shader stage.
Currently the nv50 driver handles only 3 shader stages, thus we wreak havoc
when accessing the array out of bounds.
Fixes a segfault in piglit/bin/arb_texture_buffer_object-data-sync -fbo -auto
Cc: 9.1 9.2 10.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit d606ca37eb)
Commit c13970808 (mesa: GL_EXT_secondary_color is not optional) changed
CHECK_EXTENSION2(EXT_secondary_color, ARB_vertex_program, cap)
to
CHECK_EXTENSION(ARB_vertex_program, cap)
However, CHECK_EXTENSION2 checks that either extension is available, not
both. Remove the extension check entirely since the intent was for it to
always be enabled.
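
The difference between the two checks can be sketched as follows. The
helper names are hypothetical stand-ins, not the actual Mesa macros; they
only model "either extension is enabled" versus "this one extension is
enabled".

```c
#include <stdbool.h>

/* Hypothetical model of a two-extension check: passes if EITHER is enabled. */
static bool check_either(bool ext_a, bool ext_b)
{
   return ext_a || ext_b;
}

/* Hypothetical model of a one-extension check. */
static bool check_single(bool ext)
{
   return ext;
}

/* Original behavior: the cap is accepted when either extension is there.
 * With EXT_secondary_color always enabled, this always passes. */
static bool cap_accepted_either(bool has_secondary_color,
                                bool has_vertex_program)
{
   return check_either(has_secondary_color, has_vertex_program);
}

/* Regressed behavior: the cap now depends on ARB_vertex_program alone. */
static bool cap_accepted_regressed(bool has_vertex_program)
{
   return check_single(has_vertex_program);
}
```

On a driver without ARB_vertex_program, cap_accepted_either(true, false)
passes while cap_accepted_regressed(false) does not, which is why the
check is dropped entirely.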
v2: Fix glGet*(GL_COLOR_SUM) too. Suggested by Ian.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: 9.2 10.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 739dc95e67)
The radeonsi code was not cleaning up either of these items leading to
leaked memory.
v2: Move cleanup to r600_common_context_cleanup instead of duplicating
the logic for SI
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 5ac3229f76)
Conflicts:
src/gallium/drivers/radeon/r600_pipe_common.c
The ES and desktop GL specs diverge here. Yay!
In desktop OpenGL, the driver can perform online compression of
uncompressed texture data. GL_NUM_COMPRESSED_TEXTURE_FORMATS and
GL_COMPRESSED_TEXTURE_FORMATS give the application a list of formats
that it could ask the driver to compress with some expectation of
quality. The GL_ARB_texture_compression spec calls this "suitable for
general-purpose usage." As noted above, this means
GL_COMPRESSED_RGBA_S3TC_DXT1_EXT is not included in the list.
In OpenGL ES, the driver never performs compression.
GL_NUM_COMPRESSED_TEXTURE_FORMATS and GL_COMPRESSED_TEXTURE_FORMATS give
the application a list of formats that the driver can receive from the
application. It is the *complete* list of formats. The
GL_EXT_texture_compression_s3tc spec says:
"New State for OpenGL ES 2.0.25 and 3.0.2 Specifications
The queries for NUM_COMPRESSED_TEXTURE_FORMATS and
COMPRESSED_TEXTURE_FORMATS include COMPRESSED_RGB_S3TC_DXT1_EXT,
COMPRESSED_RGBA_S3TC_DXT1_EXT, COMPRESSED_RGBA_S3TC_DXT3_EXT,
and COMPRESSED_RGBA_S3TC_DXT5_EXT."
Note that the addition is only to the OpenGL ES specification!
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
See-also: http://lists.freedesktop.org/archives/mesa-dev/2013-October/047439.html
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0a75909b3f)
The temporary variable used to store _ColorDrawBufferIndexes must be
signed (GLint); otherwise the following conditional will be incorrectly
evaluated, leading to crashes in the driver/Mesa or to reads and writes
at arbitrary memory locations. The bug dates back to 2009.
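
A minimal sketch of the failure mode, assuming a "no buffer" index of -1.
The function names are hypothetical and only illustrate why an unsigned
temporary defeats a ">= 0" test.

```c
#include <stdbool.h>

/* Buggy form: the index is copied into an unsigned temporary, so -1 wraps
 * to a huge positive value and the ">= 0" test is always true. */
static bool index_is_valid_unsigned(int stored_index)
{
   unsigned buf = (unsigned) stored_index;
   return buf >= 0;
}

/* Fixed form: the temporary is signed, so -1 is rejected. */
static bool index_is_valid_signed(int stored_index)
{
   int buf = stored_index;
   return buf >= 0;
}
```

With the unsigned temporary, an index of -1 passes the check and is then
used to index an array, which matches the crashes described above.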
Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit bfcf78c110)
_ColorDrawBufferIndexes is defined as GLint*; using a GLuint* causes
the first part of the conditional to always evaluate to true.
Unintentionally introduced by the following commit, this will result
in a driver segfault if one is using an old version of the piglit test
bin/clearbuffer-mixed-format -auto -fbo
commit 03d848ea10
Author: Marek Olšák <marek.olsak@amd.com>
Date: Wed Dec 4 00:27:20 2013 +0100
mesa: fix interpretation of glClearBuffer(drawbuffer)
The corresponding piglit test supported this incorrect behavior instead of
pointing it out.
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 10368e1446)
Prior to this patch, if we ran out of aperture space during
brw_try_draw_prims(), we would rewind the batch buffer pointer
(potentially throwing some state that may have been emitted by
brw_upload_state()), flush the batch, and then try again. However, we
wouldn't reset the dirty bits to the state they had before the call to
brw_upload_state(). As a result, when we tried again, there was a
danger that we wouldn't re-emit all the necessary state. (Note: prior
to the introduction of hardware contexts, this wasn't a problem
because flushing the batch forced all state to be re-emitted).
This patch fixes the problem by leaving the dirty bits set at the end
of brw_upload_state(); we only clear them after we have determined
that we don't need to rewind the batch buffer.
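
The ordering can be sketched as below. This is a simplified model under
stated assumptions, not the i965 code: the struct, the function names and
the bitmask representation are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver context. */
struct sketch_ctx {
   uint32_t dirty;     /* state groups that still need to be emitted */
   uint32_t in_batch;  /* state groups currently present in the batch */
};

/* Emit all dirty state, but leave the dirty bits set. */
static void sketch_upload_state(struct sketch_ctx *c)
{
   c->in_batch |= c->dirty;
}

/* Clear the dirty bits once the draw is known to be kept. */
static void sketch_clear_dirty_bits(struct sketch_ctx *c)
{
   c->dirty = 0;
}

/* Returns the state present in the batch after the draw.
 * out_of_aperture simulates the first attempt having to be rewound. */
static uint32_t sketch_draw(uint32_t dirty, bool out_of_aperture)
{
   struct sketch_ctx c = { dirty, 0 };

   sketch_upload_state(&c);
   if (out_of_aperture) {
      c.in_batch = 0;           /* rewind and flush: emitted state is lost */
      sketch_upload_state(&c);  /* retry: dirty bits are still set */
   }
   sketch_clear_dirty_bits(&c);
   return c.in_batch;
}
```

Because the dirty bits survive the first upload, the retry re-emits the
same state; clearing them inside the upload step would leave the retried
batch empty.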
Cc: 10.0 9.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit fb6d9798a0)
We definitely want to fall through to the unsynchronized map case, instead
of wasting bandwidth on a copy. Prevents a -43.2407% +/- 1.06113% (n=49)
performance regression on aa10perf when teaching glamor to provide the
GL_INVALIDATE_RANGE_BIT information.
This is a performance fix, which I usually wouldn't cherry-pick to stable.
But this really was just a bug in the code, its presence would
discourage developers from giving us the best information they can, and I
think we've got fairly high confidence in the unsynchronized map path
already.
Cc: 10.0 9.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f46563fe1c)
This fixes the following compile error:
src\glsl\ir_constant_expression.cpp(1405) : error C2666: 'copysign' : 3
overloads have similar conversions
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 067ad6e53e)
The hardware is broken with nonzero texel offsets and unnormalized
coordinates; instead of doing correct offsetting, we get garbage.
This just extends the existing workaround for ir_txf and
ir_tg4+gsampler2DRect to also consider ir_tex+gsampler2DRect.
Fixes broken rendering in 'tesseract' when 'mesa_texrectoffset_bug' is
not enabled; also fixes the new piglit test
'tests/spec/glsl-1.30/execution/fs-textureOffset-Rect'.
Has been broken ~forever; suggesting including this in only 10.0 because
the lowering pass doesn't exist in 9.2 or earlier so would require quite
a different patch.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Lee Salzman <lsalzman@gmail.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9e99735f30)
Commit 9119269ca1 moved the texel
buffer allocation to _swrast_texture_span(), however, when compiled
with OpenMP support this code already runs multi-threaded so a
critical section is required to prevent multiple allocations and
rendering errors.
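
A minimal sketch of a lazily allocated shared buffer guarded by an OpenMP
critical section. The variable and function names are hypothetical and
this is not the swrast code; without OpenMP enabled the pragma is ignored
and the function behaves as plain serial code.

```c
#include <stdlib.h>

static float *texel_buffer;  /* hypothetical shared buffer */
static int alloc_count;      /* counts allocations, for illustration only */

/* Return the shared buffer, allocating it on first use.  The critical
 * section ensures only one thread performs the check-and-allocate. */
static float *get_texel_buffer(size_t n)
{
   #pragma omp critical
   {
      if (!texel_buffer) {
         texel_buffer = malloc(n * sizeof(float));
         alloc_count++;
      }
   }
   return texel_buffer;
}
```

Without the critical section, two threads could both observe a null
pointer and both allocate, leaking one buffer and rendering into
different memory.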
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 2a0fb946e1)
This is part of the GL_EXT_packed_float extension.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit 3486f6f31b)
Also squashed in a subsequent bug fix:
mesa: check for MESA_FORMAT_RGB9_E5_FLOAT in _mesa_is_format_signed()
This packed floating point format only stores positive values.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 0fc8d7c66e)
Also squashed in a second, subsequent bug fix:
mesa: check bits per channel for GL_RGBA_SIGNED_COMPONENTS_EXT query
If a channel has zero bits it's not signed.
v2: also check for luminance and intensity format bits. Bruce
Merry's proposed piglit test hits the luminance case.
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit d046fd731a)
Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=73096
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
Conflicts:
src/mesa/main/get.c
* These make up the base of what C++ GL Haiku applications
use for 3D rendering.
* Not placed in includes/GL to prevent Haiku headers from
getting installed on non-Haiku systems.
Acked-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 56d920a5c1)
The compiler back-ends (i965's fs_visitor and brw_visitor,
ir_to_mesa_visitor, and glsl_to_tgsi_visitor) assume that when
ir_loop::counter is non-null, it points to a fresh ir_variable that
should be used as the loop counter (as opposed to an ir_variable that
exists elsewhere in the instruction stream).
However, previous to this patch:
(1) loop_control_visitor did not create a new variable for
ir_loop::counter; instead it re-used the existing ir_variable.
This caused the loop counter to be double-incremented (once
explicitly by the body of the loop, and once implicitly by
ir_loop::increment).
(2) ir_clone did not clone ir_loop::counter properly, resulting in the
cloned ir_loop pointing to the source ir_loop's counter.
(3) ir_hierarchical_visitor did not visit ir_loop::counter, resulting
in the ir_variable being missed by reparenting.
Additionally, most optimization passes (e.g. loop unrolling) assume
that the variable mentioned by ir_loop::counter is not accessed in the
body of the loop (an assumption which (1) violates).
The combination of these factors caused a perfect storm in which the
code worked properly nearly all of the time: for loops that got
unrolled, (1) would introduce a double-increment, but loop unrolling
would fail to notice it (since it assumes that ir_loop::counter is not
accessed in the body of the loop), so it would unroll the loop the
correct number of times. For loops that didn't get unrolled, (1)
would introduce a double-increment, but then later when the IR was
cloned for linking, (2) would prevent the loop counter from being
cloned properly, so it would look to further analysis stages like an
independent variable (and hence the double-increment would stop
occurring). At the end of linking, (3) would prevent the loop counter
from being reparented, so it would still belong to the shader object
rather than the linked program object. Provided that the client
program didn't delete the shader object, the memory would never get
reclaimed, and so the shader would function properly.
However, for loops that didn't get unrolled, if the client program did
delete the shader object, and the memory belonging to the loop counter
got re-used, this could cause a use-after-free bug, leading to a
crash.
This patch fixes loop_control_visitor, ir_clone, and
ir_hierarchical_visitor to treat ir_loop::counter the same way the
back-ends treat it: as a freshly allocated ir_variable that needs to
be visited and cloned independently of other ir_variables.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72026
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d6eb4321d0)
If an ir_loop has a non-null "counter" field, the variable referred to
by this field is implicitly read and written by the loop. We need to
account for this in ir_variable_refcount, otherwise there is a danger
we will try to dead-code-eliminate the loop counter variable.
Note: at the moment the dead code elimination bug doesn't occur due to
a bug in ir_hierarchical_visitor: it doesn't visit the "counter"
field, so dead code elimination doesn't treat it as a candidate for
elimination. But the patch to follow will fix that bug, so we need to
fix ir_variable_refcount first in order to avoid breaking dead code
elimination.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 9d2951ea0a)
Emitting flushes before depth and hiz resolves at the top of blorp's
state emission fixes the hang. Marchesin and I found the fix
experimentally, as opposed to adhering to a documented hardware
workaround. A more minimal fix likely exists, but this gets the job
done.
Fixes HiZ hangs in the new WebGL Google maps on Sandybridge Chrome OS.
Tested by zooming in and out continuously for 2 hours.
This patch is based on
8bc07bb701
CC: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70740
Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 1a928816a1)
The preprocessor currently accepts multiple else/elif-groups
per if-section. The GLSL-preprocessor is defined by the C++
specification, which defines the following parse-rule:
if-section:
if-group elif-groups(opt) else-group(opt) endif-line
This clearly only allows a single else-group, that has to come
after any elif-groups.
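
The rule can be sketched as a small check over a flat, non-nested
directive sequence. The enum and function below are hypothetical and are
not the glcpp implementation; nesting is deliberately out of scope.

```c
#include <stdbool.h>
#include <stddef.h>

enum directive { DIR_IF, DIR_ELIF, DIR_ELSE, DIR_ENDIF };

/* Returns true if the sequence is one well-formed, non-nested if-section:
 * if-group, optional elif-groups, optional single else-group, endif-line. */
static bool if_section_is_valid(const enum directive *d, size_t n)
{
   bool seen_else = false;

   if (n < 2 || d[0] != DIR_IF || d[n - 1] != DIR_ENDIF)
      return false;

   for (size_t i = 1; i + 1 < n; i++) {
      switch (d[i]) {
      case DIR_ELIF:
         if (seen_else)
            return false;   /* #elif after #else */
         break;
      case DIR_ELSE:
         if (seen_else)
            return false;   /* second #else */
         seen_else = true;
         break;
      default:
         return false;      /* nested #if or stray #endif: not modeled */
      }
   }
   return true;
}
```

Under this check, "#if, #elif, #else, #endif" is accepted, while
"#if, #else, #else, #endif" and "#if, #else, #elif, #endif" are rejected.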
So let's modify the code to follow the specification. Add test
to prevent regressions.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Carl Worth <cworth@cworth.org>
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit eb212c5a30)
This reverts commit 136a12ac98.
According to belak51 on IRC, this commit broke Allegro, which would no
longer compile. Applications apparently expect the GLXContextID typedef
to exist in glx.h; removing it breaks them. A bit of searching around
the internet revealed other complaints since upgrading to Mesa 10.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f425d56ba4)
Previously we were creating a new LLVMContext every time that we called
radeon_llvm_parse_bitcode, which caused us to leak the context every time
that we compiled a CL program.
Sadly, we can't dispose of the LLVMContext at the point that it was being
created because evergreen_launch_grid (and possibly the SI equivalent) was
assuming that the context used to compile the kernels was still available.
Now, we'll create a new LLVMContext when creating EG/SI compute state, store
it there, and pass it to all of the places that need it.
The LLVM Context gets destroyed when we delete the EG/SI compute state.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8c9a9205d9)
This fixes a crash where old_view->context was already freed in the
pipe_sampler_view_reference function contained in
src/gallium/auxiliary/utils/u_inlines.h. As a result, the
sampler_view_destroy function pointer contained 0xfeeefeee indicating
freed heap memory.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jonathan Liu <net147@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 670be71bd8)
This patch changes the error reporting behavior for incorrect function
invocation (triggered by match_function_by_name() unable to find a
matching function call) from using the line number information
associated to the function name term to using the line number
information of the entire function expression. Fixes bug #72264.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72264
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 23d294bb60)
This patch changes the error condition to satisfy the following statement
from the OpenGL 4.3 core specification:
"An INVALID_OPERATION error is generated if id is the name of a query
object with a target other than SAMPLES_PASSED, ANY_SAMPLES_PASSED, or
ANY_SAMPLES_PASSED_CONSERVATIVE, or if id is the name of a query
currently in progress."
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 7a73c6acb0)
The driverPrivate pointer is opaque to the driver and we can't assume
it's a struct gl_context in dri_util.c. Instead provide a helper function
to set the struct gl_context flags from the incoming DRI context flags.
v2 (idr): Modify the other classic drivers to also use
driContextSetFlags. I ran all the piglit GLX_ARB_create_context tests
with i965 and classic swrast without regressions.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> [v1 on Gallium nouveau]
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 38366c0c6e)
On evergreen we have to reserve 1 stack element in some additional cases
besides the ones mentioned in the docs, but stack size computation was
recently reimplemented exactly as described in the docs by the patch that
added workarounds for stack issues on EG/CM, resulting in regressions
with some apps (Serious Sam 3).
This patch fixes it by restoring previous behavior.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=72369
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Andre Heider <a.heider@gmail.com>
(cherry picked from commit 00faf82832)
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=58660">Bug 58660</a> - CAYMAN broken with HyperZ on</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64471">Bug 64471</a> - Radeon HD6570 lockup in Brütal Legend with HyperZ</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66352">Bug 66352</a> - GPU lockup in L4D2 on TURKS with HyperZ</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68799">Bug 68799</a> - [APITRACE] Hyper-Z lockup with Falcon BMS 4.32u6 on CAYMAN</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71547">Bug 71547</a> - compilation failure :#error "SSE4.1 instruction set not enabled"</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72685">Bug 72685</a> - [radeonsi hyperz] Artifacts in Unigine Sanctuary</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73088">Bug 73088</a> - [HyperZ] Juniper (6770): Gone Home / Unigine Heaven 4.0 lock up system after several minutes of use</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74428">Bug 74428</a> - hyperz causes gpu hang in Counter-strike: Source</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74803">Bug 74803</a> - [r600g] HyperZ broken on RV630 (Cogs shadows are broken)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74863">Bug 74863</a> - [r600g] HyperZ broken on RV770 and CYPRESS (Left 4 Dead 2 trees corruption) bisected!</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74892">Bug 74892</a> - HyperZ GPU lockup with radeonsi 7970M PITCAIRN and Distance Alpha game</li>
/* Initialize the software rasterizer and helper modules.
*
* As of GL 3.1 core, the gen4+ driver doesn't need the swrast context for
@@ -697,12 +699,6 @@ brwCreateContext(gl_api api,
intel_batchbuffer_init(brw);
brw_init_state(brw);
intelInitExtensions(ctx);
intel_fbo_init(brw);
if (brw->gen >= 6) {
/* Create a new hardware context. Using a hardware context means that
* our GPU state will be saved/restored on context switch, allowing us
@@ -720,6 +716,12 @@ brwCreateContext(gl_api api,
}
}
brw_init_state(brw);
intelInitExtensions(ctx);
intel_fbo_init(brw);
brw_init_surface_formats(brw);
if (brw->is_g4x || brw->gen >= 5) {