This manifested as rendering failures or sometimes GPU hangs in
compositors when they accidentally got MSAA visuals due to a bug in the X
Server. Today we decided that the problem in compositors was equivalent
to a corruption bug we'd noticed recently in resizing MSAA-visual
glxgears, and debugging got a lot easier.
When we allocate our MCS MT, libdrm takes the size we request, aligns it
to Y tile size (blowing it up from 300x300=900000 bytes to 384*320=122880
bytes, 30 pages), then puts it into a power-of-two-sized BO (131072 bytes,
32 pages). Because it's Y tiled, we attach a 384-byte-stride fence to it.
When we memset by the BO size in Mesa, between bytes 122880 and 131072 the
data gets stored to the first 20 or so scanlines of each of the 3 tiled
pages in that row, even though only 2 of those pages were allocated by
libdrm. In the glxgears case, the missing 3rd page happened to
consistently be the static VBO that got mapped right after the first MCS
allocation, so corruption only appeared once window resize made us throw
out the old MCS and then allocate the same BO to back the new MCS.
Instead, just memset the amount of data we actually asked libdrm to
allocate for, which will be smaller (more efficient) and not overrun.
Thanks go to Kenneth for doing most of the hard debugging to eliminate a
lot of the search space for the bug.
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77207
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 7ae870211d)
Gen6+ allows for color buffers to use a vertical alignment of either 4
or 2. Previously we defaulted to 2. This may have caused problems on
Gen7 because Y-tiled render targets are not allowed to use a vertical
alignment of 2.
This patch changes the vertical alignment to 4 on Gen7, except for the
few formats where a vertical alignment of 2 is required.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 6b40dd17cf)
With commit 1f1928db001(glx: Drop _Xglobal_lock while we create and
initialize glx display) we've split the big _Xglobal_lock handling in
a more fine grained manner.
Unfortunatelly we forgot to drop the unlock_mutex on the error paths,
leading to undefined behaviour as the mutex is already unlocked.
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit f9832f960f)
Fixes a crash in svga_context_flush_buffers() if we use the 'draw' module
for AA lines (when the device doesn't support that feature). We need to
initialize this list before we setup the swtnl pieces.
Found/fixed by Charmaine Lee.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
(cherry picked from commit e853ade544)
Conflicts:
src/gallium/drivers/svga/svga_context.c
Decompressing ETC2 textures was causing intermitent segfault
by copying resulting 4x4 texel block to the destination texture
regardless of the size of the destination texture. Issue found
via application crash in GLBenchmark 3.0's Manhattan test.
v2: add more detail comment. Compute limit outside inner loops.
v3: add bugzilla reference
v4: Correct cc syntax in commit log
v5: really grab the right patch
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74988
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1, suggested v2-3]
(cherry picked from commit cb4ad13685)
For TEX instructions, the set of samplers and sampler views should
be consistent. The XA state tracker sometimes passes an inconsistent
set of samplers and sampler views. Rather than assert and die, issue
a warning.
v2: add debugging code to detect inconsistent state.
v3: also check for null sampler in svga_state_tss.c
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
(cherry picked from commit 9bb2ec6fd1)
Conflicts:
src/gallium/drivers/svga/svga_state_fs.c
The pkg-config module was called "EXPAT" instead of "expat" in
PKG_CHECK_EXISTS. This seems to have been wrong because the wrong
argument was copied from PKG_CHECK_MODULES.
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 476db98e03)
We want to call pipe->set_sampler_views() with count being the
maximum of the old number of sampler views and the new number.
This makes sure we null-out any old sampler views.
We already do the same thing for sampler states in single_sampler_done().
Fixes some assertions seen in the VMware driver with XA tracker.
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Tested-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 2355a64414)
The underlying glDrawArrays() calls weren't getting compiled into
the display list. We simply need to use the current dispatch table
so the CALL_DrawArrays() is routed to the display list save function.
This patch also fixes glMultiModeDrawArraysIBM and
glMultiModeDrawElementsIBM.
Fixes the new piglit gl-1.4-dlist-multidrawarrays test.
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit e341856294)
Don't pass null query object pointers into gallium functions.
This avoids segfaulting in the VMware driver (and others?) if the
pipe_context::create_query() call fails and returns NULL.
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 488d4c4826)
Release the references to the sampler views before
destroying the pipe context.
v2: remove TODO and unrelated change
v3: move to st_texture.[ch], rename callback, add comment
v4: fix rebase mess up and add further cleanups
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d117ddbe31)
With shared glx contexts it is possible that a texture is create and used
in one context and then used in another one resulting in incorrect
sampler view usage.
v2: avoid template copy
v3: add XXX comment
Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 92e543c45d)
EXT_packed_depth_stencil is supported by all drivers, but
ARB_depth_texture isn't (notably nouveau_vieux). This should avoid
passing unexpected values down to ChooseTextureFormat.
The EXT_packed_depth_stencil spec does not make any explicit references
to requiring ARB_depth_texture in order to allow textures with that
format, however if there is no dependency, ARB_depth_texture would be
practically implied by the extension.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Note for 10.0 backport: This will produce a conflict, the solution is to
move the surrounding if as well.
(cherry picked from commit 18690995a6)
Conflicts:
src/mesa/main/teximage.c
nouveau_fence_wait has the expectation that an external entity is
holding onto the fence being waited on, not that it is merely held onto
by the current pointer. Fixes a use-after-free in nouveau_fence_wait
when used on the screen's current fence.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75279
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 507f0230d4)
Conflicts:
src/gallium/drivers/nouveau/nv30/nv30_screen.c
Fixes glGetTexImage() when converting from MESA_FORMAT_Z32_FLOAT_S8X24_UINT
to GL_UNSIGNED_INT_24_8. Hit by the piglit
ext_packed_depth_stencil-getteximage test.
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit a12d4d0398)
BRW_MAX_TEX_UNIT is the static limit on the number of textures we
support per-stage, not in total.
Core's `Unit` array is sized by MAX_COMBINED_TEXTURE_IMAGE_UNITS, which
is significantly larger, and across the various shader stages, up to
ctx->Const.MaxCombinedTextureImageUnits elements of it may be actually
used.
Fixes invisible bad behavior in piglit's max-samplers test (although
this escalated to an assertion failure on HSW with texture_view, since
non-immutable textures only have _Format set by validation.)
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit befbda56a2)
glGetTexImage(GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8) was just
using memcpy() instead of _mesa_unpack_uint_24_8_depth_stencil_row()
to convert texels from the hardware format to the GL format.
Fixes issue reported by David Meng at Intel. The new piglit
ext_packed_depth_stencil-getteximage test checks for this bug.
Also, add some format/type assertions. We don't yet handle the
GL_FLOAT_32_UNSIGNED_INT_24_8_REV type. That should be fixed in
a follow-on patch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 43dee0295e)
The sort priorites for GLX_SAMPLES and GLX_SAMPLE_BUFFERS are
not defined in GL_ARB_multisample, but they are defined in
the GLX 1.4 specification.
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 3616e862f2)
The default values for GLX_DRAWABLE_TYPE and GLX_RENDER_TYPE are
GLX_WINDOW_BIT and GLX_RGBA_BIT respectively, as specified in
the GLX 1.4 specification.
This fixes the glx-choosefbconfig-defaults piglit test.
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit f41c2f6c33)
brw_init_state() calls brw_upload_initial_gpu_state(). If hardware
contexts are enabled (brw->hw_ctx != NULL), this will upload some
initial invariant state for the GPU. Without hardware contexts, we
rely on this state being uploaded via atoms that subscribe to the
BRW_NEW_CONTEXT bit.
Commit 46d3c2bf4d accidentally moved
the call to brw_init_state() before creating a hardware context.
This meant brw_upload_initial_gpu_state would always early return.
Except on Gen6+, we stopped uploading the initial GPU state via
state atoms, so it never happened.
Fixes a regression since 46d3c2bf4d.
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 3663bbe773)
From page 14 (page 20 of the PDF) of the GLSL 1.10 spec:
"In addition, all identifiers containing two consecutive underscores
(__) are reserved as possible future keywords."
The intention is that names containing __ are reserved for internal use
by the implementation, and names prefixed with GL_ are reserved for use
by Khronos. Names simply containing __ are dangerous to use, but should
be allowed.
Per the Khronos bug mentioned below, a future version of the GLSL
specification will clarify this.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Cc: Tapani Pälli <lemody@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Bugzilla: Khronos #11702
(cherry picked from commit 2c85fd5a96)
Section 3.3 (Preprocessor) of the GLSL 1.30 spec (and later) and the
GLSL ES spec (all versions) say:
"All macro names containing two consecutive underscores ( __ ) are
reserved for future use as predefined macro names. All macro names
prefixed with "GL_" ("GL" followed by a single underscore) are also
reserved."
The intention is that names containing __ are reserved for internal use
by the implementation, and names prefixed with GL_ are reserved for use
by Khronos. Since every extension adds a name prefixed with GL_ (i.e.,
the name of the extension), that should be an error. Names simply
containing __ are dangerous to use, but should be allowed. In similar
cases, the C++ preprocessor specification says, "no diagnostic is
required."
Per the Khronos bug mentioned below, a future version of the GLSL
specification will clarify this.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Cc: Tapani Pälli <lemody@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Bugzilla: Khronos #11702
(cherry picked from commit 0bd7892630)
GL_ARB_ES2_compatibility doesn't say anything about shader linking
when one of the shaders (vertex or fragment shader) is absent. So,
the extension shouldn't change the behavior specified in GLSL
specification.
Tested the behavior on proprietary linux drivers of NVIDIA and AMD.
Both of them allow linking a version 100 shader program in OpenGL
context, when one of the shaders is absent.
Makes following Khronos CTS tests to pass:
successfulcompilevert_linkprogram.test
successfulcompilefrag_linkprogram.test
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 03597cf802)
If built without llvm, the following error occurs with mplayer:
Failed to open VDPAU backend .../libvdpau_r600.so: undefined symbol: _ZTVN10__cxxabiv117__class_type_infoE
[vo/vdpau] Error when calling vdp_device_create_x11: 1
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 61f6cddef7)
As documented, the _mesa_free_shader_program_data function:
"Frees all the data that hangs off a shader program object, but not
the object itself."
This means that this function may be called multiple times on the same object,
(and has been observed to). Meanwhile, the shProg->Label field was not being
set to NULL after its free(). This led to a second call to free() of the same
address on the second call to this function.
Fix this by setting this field to NULL after free(), (just as with all other
calls to free() in this function).
Reviewed-by: Brian Paul <brianp@vmware.com>
CC: mesa-stable@lists.freedesktop.org
(cherry picked from commit a92581acf2)
Commit f4ebcd133b ("dri/nouveau: NV17_3D class is not available for
NV1a chipset") fixed this partially by using the correct 3d class.
However there were a lot of checks left over comparing against the
chipset.
Reported-and-tested-by: John F. Godfrey <jfgodfrey@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 9.2 10.0 10.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 0c8b165366)
Currently we create a OPENGL_COMPAT context regardless of
what was requested by the program. Correct that by retaining
the program's request and passing it into _mesa_initialize_context.
Based on a similar commit for radeon/r200 by Ian Romanick.
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 76d9f6d972)
Consider a multithreaded program with two contexts A and B, and the
following scenario:
1. Context A calls initialize(), which allocates mem_ctx and starts
building built-ins.
2. Context B calls initialize(), which sees mem_ctx != NULL and assumes
everything is already set up. It returns.
3. Context B calls find(), which fails to find the built-in since it
hasn't been created yet.
4. Context A finally finishes initializing the built-ins.
This will break at step 3. Adding a lock ensures that subsequent
callers of initialize() will wait until initialization is actually
complete.
Similarly, if any thread calls release while another thread is still
initializing, or calling find(), the mem_ctx/shader would get free'd while
from under it, leading to corruption or use-after-free crashes.
Fixes sporadic failures in Piglit's glx-multithread-shader-compile.
Bugzilla: https://bugs.freedesktop.org/69200
Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.1 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b47d231526)
Apparently some players are ill-prepared for us claiming that a decoder
exists only to have creating it fail, and express this poor preparation
with crashes (e.g. flash). Check that firmware is there to increase the
chances of there being a high correlation between reported capabilities
and ability to create a decoder.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 10.0 10.1 <mesa-stable@lists.freedesktop.org>
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 40dd777b33)
nvfx_fragprog_assign_generic only allows for up to 10/8 texcoords for
nv40/nv30. This fixes compilation of the varying-packing tests.
Furthermore it appears that the last 2 inputs on nv4x don't seem to
work in those tests, so just report 8 everywhere for now.
Tested on NV42, NV44. NV4B appears to have additional problems.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 9.1 9.2 10.0 10.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 356aff3a5c)
Mesa fails to retain the precision qualifier when parsing:
#version 300 es
centroid in mediump vec2 v;
Consider how the parser's type_qualifier production is applied.
First, the precision_qualifier rule creates a new ast_type_qualifier:
<precision: mediump>
Then the storage_qualifier rule creates a second one:
<flags: in>
and calls merge_qualifier() to fold in any previous qualifications,
returning:
<flags: in, precision: mediump>
Finally, the auxiliary_storage_qualifier creates one for "centroid":
<flags: centroid>
it then does $$ = $1 and $$.flags |= $2.flags, resulting in:
<flags: centroid, in>
Since precision isn't stored in the flags bitfield, it is lost. We need
to instead call merge_qualifier to combine all the fields.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reported-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 2062f40d81)
If st_GetTexImage() is to decompress the texture, avoid the fallback
path even if prefer_blit_based_texture_transfer = false. For drivers
that returned PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER = 0, we
were always taking the fallback path for texture decompression rather
than rendering a quad. The later is a lot faster.
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit f47e596288)
From the GLSL 4.40 spec, section 6.4 (Jumps):
The continue jump is used only in loops. It skips the remainder of
the body of the inner most loop of which it is inside. For while
and do-while loops, this jump is to the next evaluation of the
loop condition-expression from which the loop continues as
previously defined.
Previously, we incorrectly treated a "continue" statement as jumping
to the top of a do-while loop.
This patch fixes the problem by replicating the loop condition when
converting the "continue" statement to IR. (We already do a similar
thing in "for" loops, to ensure that "continue" causes the loop
expression to be executed).
Fixes piglit tests:
- glsl-fs-continue-inside-do-while.shader_test
- glsl-vs-continue-inside-do-while.shader_test
- glsl-fs-continue-in-switch-in-do-while.shader_test
- glsl-vs-continue-in-switch-in-do-while.shader_test
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 7f5740899f)
In addition to making it public, we also need to change its first
argument from an ir_loop * to an exec_list *, so that it can be used
to insert the condition anywhere in the IR (rather than just in the
body of the loop).
This will be necessary in order to make continue statements work
properly in do-while loops.
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 56790856b3)
This is really not needed as blorp blit programs already sample
XRGB normally and get alpha channel set to 1.0 automatically by
the sampler engine. This is simply copied directly to the payload
of the render target write message and hence there is no need for
any additional blending support from the pixel processing pipeline.
The blending formula is anyway broken for color components, it
multiplies the color component with itself (blend factor is the
component itself).
Alpha blending in turn would not fix the alpha to one independent
of the source but simply used the source alpha as is instead
(1.0 * src_alpha + 0.0 * dst_alpha).
Quoting Eric:
"If we want to actually make the no-alpha-bits-present thing work,
we need to override the bits in the surface state or in the
generated code. In the normal draw path, it's done for sampling
by the swizzling code in brw_wm_surface_state.c, and the blending
overrides is just to fix up the alpha blending stage which
doesn't pay attention to that for the destination surface."
If one modifies piglit test gl-3.2-layered-rendering-blit to use
color component values other than zero or one, this change will
kick in on IVB. No regressions on IVB.
This is effectively revert of c0554141a9:
i965/blorp: Support overriding destination alpha to 1.0.
Currently, Blorp requires the source and destination formats to be
equal. However, we'd really like to be able to blit between XRGB and
ARGB formats; our BLT engine paths have supported this for a long time.
For ARGB -> XRGB, nothing needs to occur: the missing alpha is already
interpreted as 1.0. For XRGB -> ARGB, we need to smash the alpha
channel to 1.0 when writing the destination colors. This is fairly
straightforward with blending.
For now, this code is never used, as the source and destination formats
still must be equal. The next patch will relax that restriction.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 933be19cdf)
When we clipped a line weren't copying the provoking vertex
color to the second vertex. We also weren't checking for
first vs. last provoking vertex.
Fixes failures found with the new piglit line-flat-clip-color test.
Cc: "10.0, 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit fc3fcd1e01)
For these objects, meta was already using the non-Apple function to
delete the objects. Everywhere else in the file uses
_mesa_GenVertexArrays and _mesa_BindVertexArrays.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit abfa65ca81)
The hardware decompression path isn't even close to being able to handle
this. This converts the crash (assertion failure) in
"EXT_texture_compression_s3tc/getteximage-targets S3TC CUBE_ARRAY" to a
plain old failure.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 070f55d893)
_mesa_meta_DrawPixels creates a VAO and (potentially) two fragment
programs, but none of them are ever released. Leaking piles of memory
is generally frowned upon.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fcb498302b)
decompress_texture_image creates an FBO, an RBO, a VBO, a VAO, and a
sampler object, but none of them are ever released. Later patches will
add program objects, exacerbating the problem. Leaking piles of memory
is generally frowned upon.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2d3f92e881)
OpenGL 3.3 spec expects GL_INVALID_OPERATION:
"For both the default framebuffer and framebuffer objects, the
constants FRONT, BACK, LEFT, RIGHT, and FRONT AND BACK are not
valid in the bufs array passed to DrawBuffers, and will result
in the error INVALID OPERATION."
But OpenGL 4.0 spec changed the error code to GL_INVALID_ENUM:
"For both the default framebuffer and framebuffer objects, the
constants FRONT, BACK, LEFT, RIGHT, and FRONT_AND_BACK are not
valid in the bufs array passed to DrawBuffers, and will result
in the error INVALID_ENUM."
This patch changes the behaviour to match OpenGL 4.0 spec
Fixes Khronos OpenGL CTS draw_buffers_api.test.
V2: Update the comment in code.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 3303475558)
This patch handles the use of 'centroid' qualifier with 'in' variables
in a fragment shader when persample shading is enabled. Per sample
shading for the whole fragment shader can be enabled by:
glEnable(GL_SAMPLE_SHADING) or using {gl_SamplePosition, gl_SampleID}
builtin variables in fragment shader. Explaining it below in more
detail.
/* Enable sample shading using OpenGL API */
glEnable(GL_SAMPLE_SHADING);
glMinSampleShading(1.0);
Example fragment shader:
in vec4 a;
centroid in vec4 b;
main()
{
...
}
Variable 'a' will be interpolated at sample location. But, what
interpolation should we use for variable 'b' ?
ARB_sample_shading recommends interpolation at sample position for
all the variables. GLSL 400 (and earlier) spec says that:
"When an interpolation qualifier is used, it overrides settings
established through the OpenGL API."
But, this text got deleted in later versions of GLSL.
NVIDIA's and AMD's proprietary linux drivers (at OpenGL 4.3)
interpolates at sample position. This convinces me to use
the similar approach on intel hardware.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit f5cfb4ae21)
and
i965: Ignore 'centroid' interpolation qualifier in case of persample shading
I missed this change in commit f5cfb4a. It fixes the incorrect
rendering caused in Dolphin Emulator.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73915
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Markus Wick <wickmarkus@web.de>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit dc2f94bc78)
Current implementation of arb_sample_shading doesn't set 'Barycentric
Interpolation Mode' correctly. We use pixel barycentric coordinates
for per sample shading. Instead we should select perspective sample
or non-perspective sample barycentric coordinates.
It also enables using sample barycentric coordinates in case of a
fragment shader variable declared with 'sample' qualifier.
e.g. sample in vec4 pos;
A piglit test to verify the implementation has been posted on piglit
mailing list for review.
V2: Do not interpolate all the 'in' variables at sample position
if fragment shader uses 'sample' qualifier with one of them.
For example we have a fragment shader:
#version 330
#extension ARB_gpu_shader5: require
sample in vec4 a;
in vec4 b;
main()
{
...
}
Only 'a' should be sampled at sample location, not 'b'.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit a92e5f7cf6)
For 3 of the 4, I was already ignoring them since they were not picking
cleanly. Now, Anuj has explicitly requested they be ignored since they all
depend on a series that is not yet on the 10.0 branch.
This is a squash of three related cherry-picks from master.
[PATCH 1/3]
i965/gen6/blorp: Set need_workaround_flush immediately after primitive
This patch makes the workaround code in gen6 blorp follow the pattern
established in the regular draw path. It shouldn't result in any
behavioral change.
On gen6, there are two places where we emit 3D_CMD_PRIM: brw_emit_prim()
and gen6_blorp_emit_primitive(). brw_emit_prim() sets
need_workaround_flush immediately after emitting the primitive, but
blorp does not. Blorp sets need_workaround_flush at the bottom of
brw_blorp_exec().
This patch moves the need_workaround_flush from brw_blorp_exec() to
gen6_blorp_emit_primitive(). There is no need to set
need_workaround_flush in gen7_blorp_emit_primitive() because the
workaround applies only to gen6.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 5e0cd58de4)
[PATCH 2/3]
i965/gen6/blorp: Set need_workaround_flush at top of blorp
Unconditionally set brw->need_workaround_flush at the top of gen6 blorp
state emission.
The art of emitting workaround flushes on Sandybridge is mysterious and
not fully understood. Ken and I believe that
intel_emit_post_sync_nonzero_flush() may be required when switching from
regular drawing to blorp. This is an extra safety measure to prevent
undiscovered difficult-to-diagnose gpu hangs.
I verified that on ChromeOS, pre-patch, need_workaround_flush was not
set at the top of blorp, as Paul expected. To verify, I inserted the
following debug code at the top of gen6_blorp_exec(), restarted the ui,
and inspected the logs in /var/log/ui. The abort gets triggered so early
that the browser never appears on the display.
static void
gen6_blorp_exec(...)
{
if (!brw->need_workaround_flush) {
fprintf(stderr, "chadv: %s:%d\n", __FILE__, __LINE__);
abort();
}
...
}
CC: Kenneth Graunke <kenneth@whitecape.org>
CC: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 6a5c86f486)
[PATCH 3/3]
i965/gen6/blorp: Remove redundant HiZ workaround
Commit 1a92881 added extra flushes to fix a HiZ hang in
WebGL Google Maps. With the extra flushes emitted by the previous two
patches, the flushes added by 1a92881 are redundant.
Tested with the same criteria as in 1a92881: by zooming in and out
continuously for 2 hours on Sandybridge Chrome OS (codename
Stumpy) without a hang.
CC: Kenneth Graunke <kenneth@whitecape.org>
CC: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 90368875e7)
Conflicts:
src/mesa/drivers/dri/i965/gen6_blorp.cpp
We were calling draw_total_vs_outputs() too early. The call to
draw_pt_emit_prepare() could result in the vertex size changing.
So call draw_total_vs_outputs() after draw_pt_emit_prepare().
This fix would seem to be needed for the non-LLVM code as well,
but it's not obvious. Instead, I added an assertion there to
try to catch this problem if it were to occur there.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72926
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit ad814d04ca)
Conflicts:
src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline.c
Simple shaders such as:
void splat(vec2 v, float f) {
v[0] = v[1] = f;
}
failed to compile with the following error:
error: value of type vec2 cannot be assigned to variable of type float
First, we would process v[1] = f, and transform:
LHS: (expression float vector_extract (var_ref v) (constant int (1)))
RHS: (var_ref f)
into:
LHS: (var_ref v)
RHS: (expression vec2 vector_insert (var_ref v) (constant int (1))
(var_ref f))
Note that the LHS type is now vec2, not a float. This is surprising,
but not the real problem.
After emitting assignments, this ultimately becomes:
(declare (temporary) vec2 assignment_tmp)
(assign (xy)
(var_ref assignment_tmp)
(expression vec2 vector_insert (var_ref v) (constant int (1))
(var_ref f)))
(assign (xy) (var_ref v) (var_ref assignment_tmp))
We would then return (var_ref assignment_tmp) as the rvalue, which has
the wrong type---it should be float, but is instead a vec2.
To fix this, we simply return (vector_extract (var_ref assignment_temp)
<the appropriate channel>) to pull out the desired float value.
Fixes Piglit's chained-assignment-with-vector-constant-index.vert and
chained-assignment-with-vector-dynamic-index.vert tests.
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74026
Reported-by: Dan Ginsburg <dang@valvesoftware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 44a86e2b4f)
NV3x cards don't support NPOT textures. Technically this restriction
could be worked around, but since it also doesn't expose any video
decoding hw, just turn it off entirely.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 00e4314f6d)
The textures array is defined as a number of NV50_MAX_PIPE_CONSTBUFS
per shader stage. Currently the nv50 driver handles only 3 shader
stages, thus we wreck chaos when accessing array-out-of-bounds.
Cc: 9.1 9.2 10.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 12e744abbb)
The textures array is defined as a number of PIPE_MAX_SAMPLERS per shader stage.
Currently nv50 driver handles only 3 shader stages, thus we wreck chaos when
accessing array-out-of-bounds.
Fixes a segfault in piglit/bin/arb_texture_buffer_object-data-sync -fbo -auto
Cc: 9.1 9.2 10.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit d606ca37eb)
Commit c13970808 (mesa: GL_EXT_secondary_color is not optional) changed
CHECK_EXTENSION2(EXT_secondary_color, ARB_vetex_program, cap)
to
CHECK_EXTENSION(ARB_vertex_program, cap)
However CHECK_EXTENSION2 checks that either extension is available, not
both. Remove the extension check entirely since the intent was for it to
always be enabled.
v2: Fix glGet*(GL_COLOR_SUM) too. Suggested by Ian.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: 9.2 10.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 739dc95e67)
The radeonsi code was not cleaning up either of these items leading to
leaked memory.
v2: Move cleanup to r600_common_context_cleanup instead of duplicating
the logic for SI
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 5ac3229f76)
Conflicts:
src/gallium/drivers/radeon/r600_pipe_common.c
The ES and desktop GL specs diverge here. Yay!
In desktop OpenGL, the driver can perform online compression of
uncompressed texture data. GL_NUM_COMPRESSED_TEXTURE_FORMATS and
GL_COMPRESSED_TEXTURE_FORMATS give the application a list of formats
that it could ask the driver to compress with some expectation of
quality. The GL_ARB_texture_compression spec calls this "suitable for
general-purpose usage." As noted above, this means
GL_COMPRESSED_RGBA_S3TC_DXT1_EXT is not included in the list.
In OpenGL ES, the driver never performs compression.
GL_NUM_COMPRESSED_TEXTURE_FORMATS and GL_COMPRESSED_TEXTURE_FORMATS give
the application a list of formats that the driver can receive from the
application. It is the *complete* list of formats. The
GL_EXT_texture_compression_s3tc spec says:
"New State for OpenGL ES 2.0.25 and 3.0.2 Specifications
The queries for NUM_COMPRESSED_TEXTURE_FORMATS and
COMPRESSED_TEXTURE_FORMATS include COMPRESSED_RGB_S3TC_DXT1_EXT,
COMPRESSED_RGBA_S3TC_DXT1_EXT, COMPRESSED_RGBA_S3TC_DXT3_EXT,
and COMPRESSED_RGBA_S3TC_DXT5_EXT."
Note that the addition is only to the OpenGL ES specification!
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
See-also: http://lists.freedesktop.org/archives/mesa-dev/2013-October/047439.html
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0a75909b3f)
The temporary variable used to store _ColorDrawBufferIndexes must be
signed (GLint), otherwise the following conditional will be incorrectly
evaluated. Leading to crashes in the driver/mesa or accessing/writing
to arbitrary memory location. The bug dates back to 2009.
Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit bfcf78c110)
_ColorDrawBufferIndexes is defined as GLint* and using a GLuint*
will result in the first part of the conditional to be evaluated to
true always.
Unintentionally introduced by the following commit, this will result
in a driver segfault if one is using an old version of the piglit test
bin/clearbuffer-mixed-format -auto -fbo
commit 03d848ea10
Author: Marek Olšák <marek.olsak@amd.com>
Date: Wed Dec 4 00:27:20 2013 +0100
mesa: fix interpretation of glClearBuffer(drawbuffer)
This corresponding piglit tests supported this incorrect behavior instead of
pointing at it.
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 10368e1446)
Prior to this patch, if we ran out of aperture space during
brw_try_draw_prims(), we would rewind the batch buffer pointer
(potentially throwing some state that may have been emitted by
brw_upload_state()), flush the batch, and then try again. However, we
wouldn't reset the dirty bits to the state they had before the call to
brw_upload_state(). As a result, when we tried again, there was a
danger that we wouldn't re-emit all the necessary state. (Note: prior
to the introduction of hardware contexts, this wasn't a problem
because flushing the batch forced all state to be re-emitted).
This patch fixes the problem by leaving the dirty bits set at the end
of brw_upload_state(); we only clear them after we have determined
that we don't need to rewind the batch buffer.
Cc: 10.0 9.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit fb6d9798a0)
We definitely want to fall through to the unsynchronized map case, instead
of wasting bandwidth on a copy. Prevents a -43.2407% +/- 1.06113% (n=49)
performance regression on aa10perf when teaching glamor to provide the
GL_INVALIDATE_RANGE_BIT information.
This is a performance fix, which I usually wouldn't cherry-pick to stable.
But this was really was just a bug in the code, its presence would
discourage developers from giving us the best information they can, and I
think we've got fairly high confidence in the unsynchronized map path
already.
Cc: 10.0 9.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f46563fe1c)
This fixes the following compile error:
src\glsl\ir_constant_expression.cpp(1405) : error C2666: 'copysign' : 3
overloads have similar conversions
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 067ad6e53e)
The hardware is broken with nonzero texel offsets and unnormalized
coordinates; instead of doing correct offsetting, we get garbage.
This just extends the existing workaround for ir_txf and
ir_tg4+gsampler2DRect to also consider ir_tex+gsampler2DRect.
Fixes broken rendering in 'tesseract' when 'mesa_texrectoffset_bug' is
not enabled; also fixes the new piglit test
'tests/spec/glsl-1.30/execution/fs-textureOffset-Rect'.
Has been broken ~forever; suggesting including this in only 10.0 because
the lowering pass doesn't exist in 9.2 or earlier so would require quite
a different patch.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Lee Salzman <lsalzman@gmail.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9e99735f30)
Commit 9119269ca1 moved the texel
buffer allocation to _swrast_texture_span(), however, when compiled
with OpenMP support this code already runs multi-threaded so a
critical section is required to prevent multiple allocations and
rendering errors.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 2a0fb946e1)
This is part of the GL_EXT_packed_float extension.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit 3486f6f31b
Also squashed in a subsequent bug fix:
mesa: check for MESA_FORMAT_RGB9_E5_FLOAT in _mesa_is_format_signed()
This packed floating point format only stores positive values.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 0fc8d7c66e)
Also squashed in a second, subsequent bug fix:
mesa: check bits per channel for GL_RGBA_SIGNED_COMPONENTS_EXT query
If a channel has zero bits it's not signed.
v2: also check for luminance and intensity format bits. Bruce
Merry's proposed piglit test hits the luminance case.
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit d046fd731a)
Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=73096
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
Conflicts:
src/mesa/main/get.c
* These make up the base of what C++ GL Haiku applications
use for 3D rendering.
* Not placed in includes/GL to prevent Haiku headers from
getting installed on non-Haiku systems.
Acked-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 56d920a5c1)
The compiler back-ends (i965's fs_visitor and brw_visitor,
ir_to_mesa_visitor, and glsl_to_tgsi_visitor) assume that when
ir_loop::counter is non-null, it points to a fresh ir_variable that
should be used as the loop counter (as opposed to an ir_variable that
exists elsewhere in the instruction stream).
However, previous to this patch:
(1) loop_control_visitor did not create a new variable for
ir_loop::counter; instead it re-used the existing ir_variable.
This caused the loop counter to be double-incremented (once
explicitly by the body of the loop, and once implicitly by
ir_loop::increment).
(2) ir_clone did not clone ir_loop::counter properly, resulting in the
cloned ir_loop pointing to the source ir_loop's counter.
(3) ir_hierarchical_visitor did not visit ir_loop::counter, resulting
in the ir_variable being missed by reparenting.
Additionally, most optimization passes (e.g. loop unrolling) assume
that the variable mentioned by ir_loop::counter is not accessed in the
body of the loop (an assumption which (1) violates).
The combination of these factors caused a perfect storm in which the
code worked properly nearly all of the time: for loops that got
unrolled, (1) would introduce a double-increment, but loop unrolling
would fail to notice it (since it assumes that ir_loop::counter is not
accessed in the body of the loop), so it would unroll the loop the
correct number of times. For loops that didn't get unrolled, (1)
would introduce a double-increment, but then later when the IR was
cloned for linking, (2) would prevent the loop counter from being
cloned properly, so it would look to further analysis stages like an
independent variable (and hence the double-increment would stop
occurring). At the end of linking, (3) would prevent the loop counter
from being reparented, so it would still belong to the shader object
rather than the linked program object. Provided that the client
program didn't delete the shader object, the memory would never get
reclaimed, and so the shader would function properly.
However, for loops that didn't get unrolled, if the client program did
delete the shader object, and the memory belonging to the loop counter
got re-used, this could cause a use-after-free bug, leading to a
crash.
This patch fixes loop_control_visitor, ir_clone, and
ir_hierarchical_visitor to treat ir_loop::counter the same way the
back-ends treat it: as a freshly allocated ir_variable that needs to
be visited and cloned independently of other ir_variables.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72026
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d6eb4321d0)
If an ir_loop has a non-null "counter" field, the variable referred to
by this field is implicitly read and written by the loop. We need to
account for this in ir_variable_refcount, otherwise there is a danger
we will try to dead-code-eliminate the loop counter variable.
Note: at the moment the dead code elimination bug doesn't occur due to
a bug in ir_hierarchical_visitor: it doesn't visit the "counter"
field, so dead code elimination doesn't treat it as a candidate for
elimination. But the patch to follow will fix that bug, so we need to
fix ir_variable_refcount first in order to avoid breaking dead code
elimination.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 9d2951ea0a)
Emitting flushes before depth and hiz resolves at the top of blorp's
state emission fixes the hang. Marchesin and I found the fix
experimentally, as opposed to adhering to a documented hardware
workaround. A more minimal fix likely exists, but this gets the job
done.
Fixes HiZ hangs in the new WebGL Google maps on Sandybridge Chrome OS.
Tested by zooming in and out continuously for 2 hours.
This patch is based on
8bc07bb701
CC: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70740
Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 1a928816a1)
The preprocessor currently accepts multiple else/elif-groups
per if-section. The GLSL-preprocessor is defined by the C++
specification, which defines the following parse-rule:
if-section:
if-group elif-groups(opt) else-group(opt) endif-line
This clearly only allows a single else-group, that has to come
after any elif-groups.
So let's modify the code to follow the specification. Add test
to prevent regressions.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Carl Worth <cworth@cworth.org>
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit eb212c5a30)
This reverts commit 136a12ac98.
According to belak51 on IRC, this commit broke Allegro, which would no
longer compile. Applications apparently expect the GLXContextID typedef
to exist in glx.h; removing it breaks them. A bit of searching around
the internet revealed other complaints since upgrading to Mesa 10.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f425d56ba4)
Previously we were creating a new LLVMContext every time that we called
radeon_llvm_parse_bitcode, which caused us to leak the context every time
that we compiled a CL program.
Sadly, we can't dispose of the LLVMContext at the point that it was being
created because evergreen_launch_grid (and possibly the SI equivalent) was
assuming that the context used to compile the kernels was still available.
Now, we'll create a new LLVMContext when creating EG/SI compute state, store
it there, and pass it to all of the places that need it.
The LLVM Context gets destroyed when we delete the EG/SI compute state.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8c9a9205d9)
This fixes a crash where old_view->context was already freed in the
pipe_sampler_view_reference function contained in
src/gallium/auxiliary/utils/u_inlines.h. As a result, the
sampler_view_destroy function pointer contained 0xfeeefeee indicating
freed heap memory.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jonathan Liu <net147@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 670be71bd8)
This patch changes the error reporting behavior for incorrect function
invocation (triggered by match_function_by_name() unable to find a
matching function call) from using the line number information
associated to the function name term to using the line number
information of the entire function expression. Fixes bug #72264.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72264
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 23d294bb60)
This patch changes the error condition to satisfy below statement
from OpenGL 4.3 core specification:
"An INVALID_OPERATION error is generated if id is the name of a query
object with a target other SAMPLES_PASSED, ANY_SAMPLES_PASSED, or
ANY_SAMPLES_PASSED_CONSERVATIVE, or if id is the name of a query
currently in progress."
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 7a73c6acb0)
The driverPrivate pointer is opaque to the driver and we can't assume
it's a struct gl_context in dri_util.c. Instead provide a helper function
to set the struct gl_context flags from the incoming DRI context flags.
v2 (idr): Modify the other classic drivers to also use
driContextSetFlags. I ran all the piglit GLX_ARB_create_context tests
with i965 and classic swrast without regressions.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> [v1 on Gallium nouveau]
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 38366c0c6e)
On evergreen we have to reserve 1 stack element in some additional cases
besides the ones mentioned in the docs, but stack size computation was
recently reimplemented exactly as described in the docs by the patch that
added workarounds for stack issues on EG/CM, resulting in regressions
with some apps (Serious Sam 3).
This patch fixes it by restoring previous behavior.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=72369
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Andre Heider <a.heider@gmail.com>
(cherry picked from commit 00faf82832)
I'm not sure why this change is necessary. When I've built previous tar files
(such as 9.2.4) with the "make tarballs" target, they include the
bin/test-driver file. But at my first attempt to build the tar files for the
10.0.1 release this file was not being included and the build failed.
First off, nv50_program only has 16 in/out varyings. However reporting
16 makes 'm' become 68 in nv50_fp_linkage_validate with the
varying-packing-simple piglit test. (Subverting the assert makes it
compile but fail.) With this patch, varying-packing-simple passes.
See: https://bugs.freedesktop.org/show_bug.cgi?id=69155
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bad8871e52)
flush_with_flags, when available, allows the driver to throttle.
Using this suppress input lag issues that can be observed in heavy
rendering situations on non-intel cards.
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.0" mesa-stable@lists.freedesktop.org
(cherry picked from commit afcce46fd5)
This typically won't make a difference, since we only send the requests at
wl_display_flush() time. There might be a small race
with another thread calling wl_display_flush() after our commit request,
but before we flush the DRI driver. Moving the commit below the DRI
driver flush call looks more natural and eliminates the small race.
Cc: "10.0" mesa-stable@lists.freedesktop.org
(cherry picked from commit 33eb5eabee)
We would like the compositor to receive the commited buffer
as soon as possible, so it has the time to treat it, and
release old ones. We shouldn't rely on the client
to flush the queue for us.
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.0" mesa-stable@lists.freedesktop.org
(cherry picked from commit 402bf6e8d0)
If we're not using EGL_EXT_swap_buffers_with_damage, we have to
damage the full extent. EGL operates on buffer coordinates, but
wl_surface.damage takes surface coordinates. EGL doesn't know the
buffer transformation (rotated or scaled) and can't post accurate
damage in surface coordinates. The damage event however is clipped to
the surface extents so we can just damage the maximum rectangle.
In case of EGL_EXT_swap_buffers_with_damage, the application knows
the buffer transform and is expected to pass in rectangles in
surface space.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70250
Cc: "10.0" mesa-stable@lists.freedesktop.org
Tested-by: U. Artie Eoff <ullysses.a.eoff@intel.com>
(cherry picked from commit bce64c6c83)
To help the transition period when DRI loaders are being updated
to support the newer __driDriverExtensions_foo mechanism,
we populate __driDriverExtensions with the extensions returned
by __driDriverExtensions_foo during a library contructor
function.
We find the driver foo's name by using the dladdr function
which gives the path of the dynamic library's name that
was being loaded.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Keith Packard <keithp@keithp.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4859d492b2)
Create the ref_bo without any storage type flags set for now. The issue
probably arises from our use of the additional buffer space at the end
of the ref_bo. It should probably be split up in the future.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Martin Peres <martin.peres@labri.fr>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e01ba9d6b0)
This fixes a memory leak in some situations. Also avoids emitting an
extra fence if the kick handler does the call to nouveau_fence_next
itself.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ce6dd69697)
The BSpec states that the aligment for the non-msrt clear rectangle must
be doubled; the BSpec does not restricit the workaround to specific
hardware.
Commit 9a1a67b applied the workaround to Haswell GT3. Commit 8b659ce
expanded the workaround to all Haswell variants. This commit expands it
to all hardware.
No Piglit regressions on Ivybridge 0x0166. No fixes either.
I know no Ivybridge nor Baytrail bug related to this workaround.
However, the BSpec says the extra alignment is required, so let's do it.
v2: Apply to all hardware, not just gen7.
CC: "9.2, 10.0" <mesa-stable@lists.freedesktop.org>
CC: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 998018d7be)
Pre-patch, the workaround was applied to only HSW GT3. However, the
workaround also fixes render corruption on the HSW GT1 Chromebook,
codenamed Falco.
Also, update the BSpec quote that discusses the workaround to reflect
the latest BSpec.
The BSpec states that the workaround is required for Ivybridge and
Baytrail as well as Haswell. But, we apply the workaround to only
Haswell because (a) we suspect that is the only hardware where it is
actually required and (b) we haven't yet validated the workaround for
the other hardware.
CC: "9.2, 10.0" <mesa-stable@lists.freedesktop.org>
CC: Anuj Phogat <anuj.phogat@gmail.com>
OTC-Tracker: CHRMOS-812
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 8b659cef3a)
On gen6, multisamble resolve blits use the SAMPLE message to blend
together the 4 samples for each texel. For some reason, SAMPLE
doesn't blend together the proper samples when the source format is
L32_FLOAT or I32_FLOAT, resulting in blocky artifacts.
To work around this problem, sample from the source surface using
R32_FLOAT. This shouldn't affect rendering correctness, because when
doing these resolve blits, the destination format is R32_FLOAT, so the
channel replication done by L32_FLOAT and I32_FLOAT is unnecessary.
Fixes piglit tests on Sandy Bridge:
- spec/ARB_texture_float/multisample-formats 2 GL_ARB_texture_float
- spec/ARB_texture_float/multisample-formats 4 GL_ARB_texture_float
No piglit regressions on Sandy Bridge.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70601
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit c4cf487315)
For some reason this was left out when the version was changed...
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
In brw_update_renderbuffer_surfaces(), if there are no color draw
buffers, we always set up a null render target at surface index 0 so we
have something to use with the FB write marking the end of thread.
However, when we recently began computing surface indexes dynamically,
we failed to reserve space for it. This meant that the first texture
would be assigned surface index 0, and our closing FB write would
clobber the texture.
Fixes Piglit's EXT_packed_depth_stencil/fbo-blit-d24s8 test on Gen4-5,
which regressed as of commit 4e5306453d
("i965/fs: Dynamically set up the WM binding table offsets.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70605
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Tested-by: lu hua <huax.lu@intel.com>
Cc: "10.0" mesa-stable@lists.freedesktop.org
(cherry picked from commit c4815f6cd6)
This is a squash of the following two cherry-picked patches:
i965: Only enable __DRI2_ROBUSTNESS if kernel support is available
Rather than always advertising the extension but failing to create a
context with reset notifiction, just don't advertise it. I don't know
why it didn't occur to me to do it this way in the first place.
NOTE: Kristian requested that I provide a follow-up for master that
dynamically generates the list of DRI extensions instead of selected
between two hardcoded lists.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9b1c68638d)
and
i965: Properly reject __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS when __DRI2_ROBUSTNESS is not enabled
Only allow __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS in brwCreateContext if
intelInitScreen2 also enabled __DRI2_ROBUSTNESS (thereby enabling
GLX_ARB_create_context).
This fixes a regression in the piglit test
"glx/GLX_ARB_create_context/invalid flag"
v2: Remove commented debug code. Noticed by Jordan.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reported-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 53a65e547c)
In commit 065da16 (glsl: Convert lower_clip_distance_visitor to be an
ir_rvalue_visitor), we failed to notice that since
lower_clip_distance_visitor overrides visit_leave(ir_assignment *),
ir_rvalue_visitor::visit_leave(ir_assignment *) wasn't getting called.
As a result, clip distance dereferences appearing directly on the
right hand side of an assignment (not in a subexpression) weren't
getting properly lowered. This caused an ir_dereference_variable node
to be left in the IR that referred to the old gl_ClipDistance
variable. However, since the lowering pass replaces gl_ClipDistance
with gl_ClipDistanceMESA, this turned into a dangling pointer when the
IR got reparented.
Prior to the introduction of geometry shaders, this bug was unlikely
to arise, because (a) reading from gl_ClipDistance[i] in the fragment
shader was rare, and (b) when it happened, it was likely that it would
either appear in a subexpression, or be hoisted into a subexpression
by tree grafting.
However, in a geometry shader, we're likely to see a statement like
this, which would trigger the bug:
gl_ClipDistance[i] = gl_in[j].gl_ClipDistance[i];
This patch causes
lower_clip_distance_visitor::visit_leave(ir_assignment *) to call the
base class visitor, so that the right hand side of the assignment is
properly lowered.
Fixes piglit test:
- spec/glsl-1.50/execution/geometry/clip-distance-itemized-copy
Cc: Ian Romanick <idr@freedesktop.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 9dfcb05fa6)
The previous commit fixes a bug wherein we would incorrectly refer to
stale geometry shader prog_data when no geometry shader was active.
This patch reduces the likelihood of that sort of bug occurring in the
future by setting prog_data to NULL whenever there is no GS program.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 37bdde1087)
Previously, in brw_gs_upload_binding_table(), we checked whether
brw->gs.prog_data was NULL in order to determine whether a geometry
shader was active. This didn't work: brw->gs.prog_data starts off as
NULL, but it is set to non-NULL when a geometry shader program is
built, and then never set to NULL again. As a result, if we called
brw_gs_upload_binding_table() while there was no geometry shader
active, but a geometry shader had previously been active, it would
refer to a stale (and possibly freed) prog_data structure.
This patch fixes the problem by modifying
brw_gs_upload_binding_table() to use the proper technique to determine
whether a geometry shader is active: by checking whether
brw->geometry_program is NULL.
This fixes the crash reported in comment 2 of bug 71870 (the incorrect
rendering remains, however).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 2714ca81b9)
The fast tiled texture upload code does not compile with GCC 4.8's -Og
optimization flag.
memcpy() has the always_inline attribute set. This poses a problem,
since {x,y}tile_copy_faster calls it indirectly via {x,y}tile_copy,
and {x,y}tile_copy normally aren't inlined at -Og.
Using __attribute__((flatten)) tells GCC to inline every function call
inside the function, which I believe was the author's intent.
Fix suggested by Alexander Monakov.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ad542a10c5)
I think I was thinking of the batch command packet cache when I pasted
this in, but this counter is only used for dumping out streamed state for
INTEL_DEBUG=batch and for putting annotations in our aub files.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5891f98145)
From section 6.1.18 (Renderbuffer Object Queries) of the GL 3.2 spec,
under the heading "If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE
is TEXTURE, then":
If pname is FRAMEBUFFER_ATTACHMENT_LAYERED, then params will
contain TRUE if an entire level of a three-dimesional texture,
cube map texture, or one-or two-dimensional array texture is
attached. Otherwise, params will contain FALSE.
Fixes piglit tests:
- spec/!OpenGL 3.2/layered-rendering/framebuffer-layered-attachments
- spec/!OpenGL 3.2/layered-rendering/framebuffertexture-defaults
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
v2: Don't include "EXT" in the error message, since this query only
makes sensen in context versions that have adopted
glGetFramebufferAttachmentParameteriv().
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit ec79c05cbf)
Previously we were using the code path for validating
glFramebufferTextureLayer(). But glFramebufferTexture() allows
additional texture types.
Fixes piglit tests:
- spec/!OpenGL 3.2/layered-rendering/gl-layer-cube-map
- spec/!OpenGL 3.2/layered-rendering/framebuffertexture
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
v2: Clarify comment above framebuffer_texture().
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit af1471dc04)
From section 4.4.7 (Layered Framebuffers) of the GLSL 3.2 spec:
When the Clear or ClearBuffer* commands are used to clear a
layered framebuffer attachment, all layers of the attachment are
cleared.
This patch fixes the fast depth clear path.
Fixes piglit test "spec/!OpenGL 3.2/layered-rendering/clear-depth".
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 0831523350)
From section 4.4.7 (Layered Framebuffers) of the GLSL 3.2 spec:
When the Clear or ClearBuffer* commands are used to clear a
layered framebuffer attachment, all layers of the attachment are
cleared.
This patch fixes the blorp clear path for color buffers.
Fixes piglit test "spec/!OpenGL 3.2/layered-rendering/clear-color".
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit c1019670ea)
In order to properly clear layered framebuffers, we need to know how
many layers they have. The easiest way to do this is to record it in
the gl_framebuffer struct when we check framebuffer completeness.
This patch replaces the gl_framebuffer::Layered boolean with a
gl_framebuffer::NumLayers integer, which is 0 if the framebuffer is
not layered, and equal to the number of layers otherwise.
v2: Remove gl_framebuffer::Layered and make gl_framebuffer::NumLayers
always have a defined value. Fix factor of 6 error in the number of
layers in a cube map array.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 95140740ad)
Previously, we checked for interstage uniform interface block link
errors in validate_interstage_interface_blocks(), which is only called
on pairs of adjacent shader stages. Therefore, we failed to detect
uniform interface block mismatches between non-adjacent shader stages.
Before the introduction of geometry shaders, this wasn't a problem,
because the only supported shader stages were vertex and fragment
shaders, therefore they were always adjacent. However, now that we
allow a program to contain vertex, geometry, and fragment shaders,
that is no longer the case.
Fixes piglit test "skip-stage-uniform-block-array-size-mismatch".
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
v2: Rename validate_interstage_interface_blocks() to
validate_interstage_inout_blocks() to reflect the fact that it no
longer validates uniform blocks.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
v3: Make validate_interstage_inout_blocks() skip uniform blocks.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 544e3129c5)
Previously, when attempting to link a vertex shader and a geometry
shader that use different GLSL versions, we would sometimes generate a
link error due to the implicit declaration of gl_PerVertex being
different between the two GLSL versions.
This patch fixes that problem by only requiring interface block
definitions to match when they are explicitly declared.
Fixes piglit test "shaders/version-mixing vs-gs".
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
v2: In the interface_block_definition constructor, move the assignment
to explicitly_declared after the existing if block.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 0f4cacbb53)
From section 7.1 (Built-In Language Variables) of the GLSL 4.10
spec:
Also, if a built-in interface block is redeclared, no member of
the built-in declaration can be redeclared outside the block
redeclaration.
We have been regarding this text as a clarification to the behaviour
established for gl_PerVertex by GLSL 1.50, so we apply it regardless
of GLSL version.
This patch enforces the rule by adding an enum to ir_variable to track
how the variable was declared: implicitly, normally, or in an
interface block.
Fixes piglit tests:
- gs-redeclares-pervertex-out-after-global-redeclaration.geom
- vs-redeclares-pervertex-out-after-global-redeclaration.vert
- gs-redeclares-pervertex-out-after-other-global-redeclaration.geom
- vs-redeclares-pervertex-out-after-other-global-redeclaration.vert
- gs-redeclares-pervertex-out-before-global-redeclaration
- vs-redeclares-pervertex-out-before-global-redeclaration
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
v2: Don't set "how_declared" redundantly in builtin_variables.cpp.
Properly clone "how_declared".
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 2bbcf19aca)
Earlier comments suggest this was removed from GL core spec but it is
still there. Enabling makes 'texture_lod_bias_getter' Khronos
conformance tests pass, also removes some errors from Metro Last Light
game which is using this API.
v2: leave NOTE comment (Ian)
Cc: "9.0 9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 7e61b44dcd)
We need to check the drawbuffer's orientation before inverting Y
coordinates. Fixes piglit feedback tests when running with the
-fbo option.
Cc: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 15d8e05e1e)
Commit 70953b5 (i965: Initialize all member variables of
vec4_instruction on construction) inadvertently added a line to the
vec4_instruction constructor setting this->ir to NULL, wiping out the
previously set value. As a result, ever since then, the output of
INTEL_DEBUG=vs and INTEL_DEBUG=gs has been missing IR annotations.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 60b1a118e1)
v2: Don't go to extra work to avoid extraneous flushes. (Previous
experiments in the kernel have suggested that flushing the pipeline
when it is already empty is extremely cheap).
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 7dfb4b2d00)
radeon_llvm_compile allocates memory for binary.code, binary.config,
or neither depending on what's being done.
We need to make sure to free that memory after it's no longer needed.
v2: Don't bother checking for null before FREE()
CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 01f3622c74)
use memset to initialize to 0's... otherwise code_size and config_size
could be uninitialized when read later in this method.
It's also hard to do NULL checks on uninitialized pointers.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
v2: Fix indentation
CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dd73b99420)
After we blit/copy to a dest texture image we need to mark it as
being defined. This fixes broken mipmap generation for quite a
few texture formats. Mipgen involves making texture views and
svga_texture_view_surface() skips texture images that are undefined.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 3969330b47)
The index translation code expects the number of indexes to be
consistent with the primitive type (ex: a multiple of 3 for
PIPE_PRIM_TRIANGLES). If it's not, we can write out of bounds
in the destination buffer.
Fixes failed assertions in the pipebuffer debug code found with
Piglit primitive-restart-draw-mode test.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 79984b9928)
Previously, when doing intrastage and interstage interface block
linking, we only checked the interface type; this prevented us from
catching some link errors.
We now check the following additional constraints:
- For intrastage linking, the presence/absence of interface names must
match.
- For shader ins/outs, the interface names themselves must match when
doing intrastage linking (note: it's not clear from the spec whether
this is necessary, but Mesa's implementation currently relies on
it).
- Array vs. nonarray must be consistent, taking into account the
special rules for vertex-geometry linkage.
- Array sizes must be consistent (exception: during intrastage
linking, an unsized array matches a sized array).
Note: validate_interstage_interface_blocks currently handles both
uniforms and in/out variables. As a result, if all three shader types
are present (VS, GS, and FS), and a uniform interface block is
mentioned in the VS and FS but not the GS, it won't be validated. I
plan to address this in later patches.
Fixes the following piglit tests in spec/glsl-1.50/linker:
- interface-blocks-vs-fs-array-size-mismatch
- interface-vs-array-to-fs-unnamed
- interface-vs-unnamed-to-fs-array
- intrastage-interface-unnamed-array
v2: Simplify logic in intrastage_match() for handling array sizes.
Make extra_array_level const. Use an unnamed temporary
interface_block_definition in validate_interstage_interface_blocks()'s
first call to definitions->store().
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit f38ac41ed4)
From the Sandy Bridge PRM, Vol 1 Part 1 7.18.3.4 (Alignment Unit
Size):
j [vertical alignment] = 4 for any render target surface is
multisampled (4x)
From the Ivy Bridge PRM, Vol 4 Part 1 2.12.2.1 (SURFACE_STATE for most
messages), under the "Surface Vertical Alignment" heading:
This field is intended to be set to VALIGN_4 if the surface was
rendered as a depth buffer, for a multisampled (4x) render target,
or for a multisampled (8x) render target, since these surfaces
support only alignment of 4.
Back in 2012 when we added multisampling support to the i965 driver,
we forgot to update the logic for computing the vertical alignment, so
we were often using a vertical alignment of 2 for multisampled
buffers, leading to subtle rendering errors.
Note that the specs also require a vertical alignment of 4 for all
Y-tiled render target surfaces; I plan to address that in a separate
patch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53077
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit b4c3b833ec)
For both vertex and fragment shaders we default MaxUniformComponents
to 4 * MAX_UNIFORMS. It makes sense to do this for geometry shaders
too; if back-ends have different limits they can override them as
necessary.
Fixes piglit test:
spec/glsl-1.50/built-in constants/gl_MaxGeometryUniformComponents
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 46e9f78efc)
Systems with little physical memory installed will report less than
2GiB, and some systems may (hypothetically?) have a larger address space
for the GPU. My IVB still reports 1534.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cb6182bdfa)
createContextAttribs is a superset of what createNewContext provides.
Also remove the function typedef, since createNewContext is deprecated
and no longer used in multiple interfaces.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e048953145)
This lets us allocate color buffers as __DRIimages and pass them into
the driver instead of having to create a __DRIbuffer with the flink
that requires.
With this patch, we can now run gbm on render-nodes. A render-node is a
drm device that doesn't support modesetting and all the legacy DRI ioctls.
flink is also not supported, but now that gbm doesn't need flink, we can
run piglit on head-less gbm or head-less GPGPU.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 04e3ef00db)
Since LIFO fails on some shaders in one particular way, and non-LIFO
systematically fails in another way on different kinds of shaders, try
them both, and pick whichever one successfully register allocates first.
Slightly prefer non-LIFO in case we produce extra dependencies in register
allocation, since it should start out with fewer stalls than LIFO.
This is madness, but I haven't come up with another way to get unigine
tropics to not spill while keeping other programs from not spilling and
retaining the non-unigine performance wins from texture-grf.
total instructions in shared programs: 1626728 -> 1626288 (-0.03%)
instructions in affected programs: 1015 -> 575 (-43.35%)
GAINED: 50
LOST: 0
Improves Unigine Tropics performance by 14.5257% +/- 0.241838% (n=38)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70445
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit e9daead784)
Long ago, the HW_REG usage in assign_curb/urb_setup() were scheduling
barriers, so we had to run scheduler before them in order for it to be
able to do basically anything. Now that that's fixed, we can delay the
scheduling until we go to allocate (which will make the next change less
scary).
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit fbd8303a94)
We care about depth-until-program-end, as a proxy for "make sure I
schedule those early instructions that open up the other things that can
make progress while keeping register pressure low", not actual latency
(since we're relying on the post-register-alloc scheduling to actually
schedule for the hardware).
total instructions in shared programs: 1609931 -> 1609931 (0.00%)
instructions in affected programs: 0 -> 0
GAINED: 55
LOST: 43
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit f72a0d99fe)
In the SIMD16 spilling changes, I replaced a "1" in the spill path with
"mlen", but obviously it wasn't mlen before because spills have the g0
header along with the payload. The interface I was trying to use was
asking for how many physical regs we're writing, so we're looking for "1"
or "2".
I'm guessing this actually passed piglit because the high 8 bits of the
execution mask in SIMD8 mode are all 0s.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
(cherry picked from commit 7c90947a0b)
Previously, the best thing we had was to schedule the things unblocked by
the last chosen instruction, on the hope that it would be consuming two
values at the end of their live intervals while only producing one new
value. But that's just a guess, and we can do counting of usage of
registers to know when an instruction would (almost surely) reduce
register pressure.
The only failure mode I know of in this new dominant heuristic is that
inside of a loop when scheduling the iterator (for example), choosing the
last use of the iterator doesn't actually reduce the live interval of the
iterator. But it doesn't seem to matter in shader-db:
total instructions in shared programs: 1618700 -> 1618700 (0.00%)
instructions in affected programs: 0 -> 0
GAINED: 13
LOST: 0
Note: The new functions are made virtual because I expect we'll soon lift
the pre-regalloc scheduling heuristic over to the vec4 backend.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit bc0e3bb4d0)
Fixes infinite loop in find_grid_optimal_factor() in cases where the
user specifies a grid size with less dimensions than the device
supports.
Reported-by: Tom Stellard <thomas.stellard@amd.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 99d447cc5d)
Otherwise, the function would enable generic vertex attributes 0
and 1 of the array object it does not own. This was causing crashes
in Euro Truck Simulator 2, since the incorrectly enabled generic
attribute 0 in the foreign context got precedence before vertex
position attribute at later time, leading to NULL pointer dereference.
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Petr Sebor <petr@scssoft.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit f2b844f59d)
Adding a vl_mpeg-based helper didn't seem to work, as it produced data
that the card couldn't handle. (And I didn't investigate further.) This
makes the decoding functionality only accessible via XvMC and avoids
crashes when attempting to use VDPAU.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 08122e151a)
This fixes a crash in glamor when mesa links against static LLVM.
v2:
- Inline LINKER_SCRIPT variable
v3: Kai Wasserbäch
- Fix out out-of-tree-builds
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.or>
(cherry picked from commit 594fa4a208)
This works now that pipe_*.so is no longer exporting LLVM symbols.
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.or>
(cherry picked from commit cb080a10b6)
This makes it possible to use clover with statically linked LLVM.
v2:
- Inline LINKER_SCRIPT variable
v3: Kai Wasserbäch
- Fix out out-of-tree-builds
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.or>
(cherry picked from commit 6d6c749215)
Previously, we would bogusly replace the entire statement containing the
ir_texture node with an ir_dereference_variable.
Correct this to just replace the ir_texture node itself as intended.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5442c0eae3)
Commit b16b3c87 began performing CSE on CMP instructions with null
destinations. I relaxed the restrictions a bit too much, thereby
allowing CSE to be performed on instructions with, for instance, an
explicit accumulator destination.
This broke the arb_gpu_shader5/fs-imulExtended shader tests because
they emit MUL instructions with the accumulator as the destination. CSE
would instead cause the MUL to write to a GRF, which is lower precision
than the accumulator.
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 68349e5219)
After more testing (everyone else trying to build the stack is having as
much trouble as I had, even after the problems I had were fixed), it
really feels like dri3 is not something we're ready to support in this
stable branch. The .c/.h code will remain here to enable easier
cherry-picking from master, and everything stays on master so we can ship
a solid DRI3 in 3 months.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
When this function was added, the returned value was signed in some
places, unsigned in others.
v2: also add unsigned in the unit test, per Ian.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 75982a5df4)
* Inherit gl_context so we always have access to it
* Thanks curro for the idea.
* Last Haiku cannidate for 10.0.0
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=64323">Bug 64323</a> - Severe misrendering in Left 4 Dead 2</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=68838">Bug 68838</a> - GLSL: struct declarations produce a "empty declaration warning" in 9.2</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=58660">Bug 58660</a> - CAYMAN broken with HyperZ on</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=64471">Bug 64471</a> - Radeon HD6570 lockup in Brütal Legend with HyperZ</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=66352">Bug 66352</a> - GPU lockup in L4D2 on TURKS with HyperZ</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=68799">Bug 68799</a> - [APITRACE] Hyper-Z lockup with Falcon BMS 4.32u6 on CAYMAN</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=71547">Bug 71547</a> - compilation failure :#error "SSE4.1 instruction set not enabled"</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=72685">Bug 72685</a> - [radeonsi hyperz] Artifacts in Unigine Sanctuary</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=73088">Bug 73088</a> - [HyperZ] Juniper (6770): Gone Home / Unigine Heaven 4.0 lock up system after several minutes of use</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=74428">Bug 74428</a> - hyperz causes gpu hang in Counter-strike: Source</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=74803">Bug 74803</a> - [r600g] HyperZ broken on RV630 (Cogs shadows are broken)</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=74863">Bug 74863</a> - [r600g] HyperZ broken on RV770 and CYPRESS (Left 4 Dead 2 trees corruption) bisected!</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=74892">Bug 74892</a> - HyperZ GPU lockup with radeonsi 7970M PITCAIRN and Distance Alpha game</li>
@@ -55,16 +57,89 @@ Note: some of the new features are only available with certain drivers.
<li>GL_ARB_vertex_attrib_binding</li>
<li>GL_ARB_vertex_type_10f_11f_11f_rev on i965 and r600g</li>
<li>GL_KHR_debug</li>
<li>GLX_MESA_query_renderer</li>
</ul>
<h2>Bug fixes</h2>
TBD.
<p>Attempts have been made to <b>not</b> include bugs fixed in previous 9.2
releases or bugs that were regressions during 10.0 development. This list is
likely incomplete.</p>
<ul>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=47755">Bug 47755</a> - [glsl-compiler] no error checking when Interpolation qualifier for built-in variable is different in vertex and fragment shader</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=52171">Bug 52171</a> - [gallium/r600/clover] Simple benchmarks failed to run</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=53077">Bug 53077</a> - [IVB] Output error with msaa when both of framebuffer and source color's alpha are not 1</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=54867">Bug 54867</a> - bug in r300 compiler</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=60929">Bug 60929</a> - [r600-llvm] mono games with opengl are blocking on start</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=62142">Bug 62142</a> - Mesa/demo mipmap_limits upside down with running by SOFTWARE</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=66213">Bug 66213</a> - Certain Mesa Demos Rendering Inverted (vertically)</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=66806">Bug 66806</a> - [softpipe] glxgears floating point exception</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=67921">Bug 67921</a> - [bisected commit 883987] crosscompiling fails with util/u_cpu_detect.c:247:4: error: 'asm' undeclared (first use in this function)</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=68162">Bug 68162</a> - [radeonsi] texture rendering is broken in Source-Engine games</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=68451">Bug 68451</a> - Texture flicker in native Dota2 in mesa 9.2.0rc1</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=68503">Bug 68503</a> - Graphical glitches in Serious Sam 3 when SB is enabled</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=68792">Bug 68792</a> - Problems during playback of h264 files using UVD and VLC on AMD E-350 CPU</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=70042">Bug 70042</a> - Major texture flickering in Dota 2 (r600g on HD 6950)</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=70088">Bug 70088</a> - Glamor on r600g crashes Xserver</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=70123">Bug 70123</a> - Freeze caused by 'winsys/radeon: remove cs_queue_empty' commit</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=70327">Bug 70327</a> - Casting floating point variable to integer not working properly while constant gets converted properly</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=70891">Bug 70891</a> - CL_INVALID_BUILD_OPTIONS results in CL_INVALID_DEVICE when asking for build log</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=71022">Bug 71022</a> - configure: error: Expat required for DRI.</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=71110">Bug 71110</a> - xorg_driver.c:1030:2: error: too many arguments to function ‘DamageUnregister’</li>
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.