Regardless of what we put in the screen structure, all of the extensions
that compute_version_es2 checks are present and 3.0 will be exposed
anyway.
NOTE: This is a candidate for the 9.1 branch.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7ae6864f0d)
So we don't emit it twice if we ever use the flag on
cayman.
Note: this is a candidate for the 9.1 branch.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8b5acad0e9)
PS_PARTIAL flushes seems to be required in certain
cases to prevent hangs, especially on r6xx.
Note: this is a candidate for the 9.1 branch.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7ebf83f109)
Commit 86d30dea3c broke building with older
automake versions with this error:
Makefile:769: *** Recursive variable am__v_YACC_ references itself (eventually). Stop.
This patch fixes it. Fix stolen from xorg-macros.
Signed-off-by: Lauri Kasanen <cand@gmx.com>
(cherry picked from commit 0a82828ad5)
Based on r600g commit 2b9659c9e6 .
Fixes crashes with 4 piglit tests which are now hitting these formats.
NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit f9adf79876)
It's the reciprocal of the register value.
Fixes piglit fragcoord_w and glsl-fs-fragcoord-zw-perspective.
NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 954bc4ac34)
11 more little piglits.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 95bced5929)
That means we can map and read multiple slices with one transfer_map call.
[ Cherry-picked from r600g commit 1aebb6911e ]
11 more little piglits on master, 1 more on the 9.1 branch (Marek's
glTex(Sub)Image improvements on master broke the other 10).
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 72f4490b55)
[ Cherry-picked from r600g commit ef11ed61a0 ]
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a84c4edeed)
[ Cherry-picked from r600g commit b278aba423 ]
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c4faab63c4)
Assuming I understand EXT_texture_sRGB correctly.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 6520a86c67)
The OMOD value was only being folded to one instruction in cases where
the MUL instruction was reading a value written by more than one
instruction.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 10bcc843f8)
Now you can convert assembly strings into a full struct radeon_compiler
object and use it to test individual compiler pases.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 5e1321ddf4)
This way make check can report whether or not the tests pass.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit bcf2e157ca)
This will be used by the test suite in later commits.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 5355fc1e87)
These are all files that I authored, but forgot to add the license
headers.
NOTE: This is a candidate for the stable branches.
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 27d140b960)
GLX_INTEL_swap_event is broken on the server side, where it's
currently unconditionally enabled. This completely breaks
systems running on drivers which don't support that extension.
There's no way to test for its presence on this side, so instead
of disabling it uncondtionally, just disable it for drivers
which are known to not support it. It makes sense because
most drivers do support it right now.
We'll be able to remove this once Xserver properly advertises
GLX_INTEL_swap_event.
Note: This is a candidate for stable branch branches.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60052
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 076403c30d)
A single element in a GLX reply is contained in the header itself.
The number of elements is denoted in the "n" field of the reply.
If "n" is 1, the length of additional data is 0.
The XXX_data_length() function of xcb does not return the length of
the (optional, n>1) data but the number of elements.
Fixes http://bugs.freedesktop.org/show_bug.cgi?id=59876
Note: This is a candidate for the stable branches.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Signed-off-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 5876a5dbc0)
In 2 of our checks, we were missing BREAK and CONTINUE.
NOTE: Candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit bf91f0b039)
Detect a duplicate Shader type as and error instead of silently allowing
it, restrict to ES2 API.
v2: Tapani Pälli <tapani.palli@intel.com>
- make the check run time instead of compile time
v3: chadv
- Quote spec on which error to generate.
Signed-off-by: bma <Bo.Ma@windriver.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit ce3dfa19ab)
The hardware just doesn't support it. I suspect this was a regression from
the move to fixed MESA_FORMATs for compressed textures and that previously we
were storing uncompressed for this or something.
Fixes GPU hangs in piglit "texwrap GL_EXT_texture_sRGB-s3tc bordercolor
swizzled" on my GM965.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ddc2b453d0)
_mesa_delete_renderbuffer does not call the driver-specific
renderbuffer delete function, so the blorp code was leaking the
Intel-specific bits, including some GEM objects.
Call the renderbuffer's ->Delete() method instead, which does the
right thing.
Fixes Unity rapidly sending the machine into the arms of the OOM-killer
Note: This is a candidate for the 9.1 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit dd599188d2)
The GL_ARB_texture_rg spec says that we need to support both texturing
and rendering for the GL_RED and GL_RG formats. So move the format
check up into the rendertarget_mapping[] list. Also, add
PIPE_FORMAT_R8_UNORM to the list of formats required.
Note: This is a candidate for the stable branches.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 4be5a06752)
In GLSL, sampler indices are allocated contiguously from 0. But in the
case of ARB_fragment_program (and possibly fixed function), an app that
uses texture 0 and 2 will use sampler indices 0 and 2, so we were only
allocating space for samplers 0 and 1 and setting up sampler 0. We
would read garbage for sampler 2, resulting in flickering textures and
an angry simulator.
Fixes bad rendering in 0 A.D. and ETQW. This was fixed for pre-gen7 by
28f4be9eb9
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=25201
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58680
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for stable branches.
(cherry picked from commit 5bb05c6e6d)
Otherwise, we fail to correctly handle GL_PRIMITIVE_RESTART_FIXED_INDEX.
Fixes gles3conform's primitive_restart_mode test.
NOTE: This is a candidate for the 9.1 branch.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 8cabe26f5d)
Tested with softpipe only exposing RGBA formats.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit cb6470775c)
The Mesa format can be RGBA8888_REV, the format/type can be
GL_RGBA/GL_UNSIGNED_BYTE, but the actual texture internal format can be
LUMINANCE_ALPHA, INTENSITY, etc. Therefore we should look at the base
internal format as well.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit c8379204ab)
_mesa_base_tex_format doesn't accept GL_BGR and GL_ABGR_EXT, etc.
v2: add a (now hopefully complete) helper function to deal with this
NOTE: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 09a99867ab)
Older hardware cannot do ARB_texture_rgb10_a2ui, and the translation
code for OES_compressed_ETC1_RGB8_texture was never implemented in the
i915 driver.
NOTE: This is a candidate for all stable branches.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 0e2f26d5ea)
This optimized filter (when using repeat wrap modes,
linear min/mag/mip filters, pot textures) only applies to 2d textures,
but nothing prevented it from being used for other textures (likely
leading to very bogus sample results).
Note: This is a candidate for the 9.0 branch.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 66b6d51214)
At eglSwapBuffer time, we blindly assume we have a back buffer, but the
back buffer only gets allocated when somebody tries to render something.
NOTE: This is a candidate for the 9.0 and 9.1 branches.
https://bugs.freedesktop.org/show_bug.cgi?id=60086
(cherry picked from commit 1fe007399c)
We weren't emitting the SVGA_RS_OUTPUTGAMMA state so sRGB rendering
didn't work properly.
Fixes piglit's framebuffer-srgb test.
Note: This is a candidate for the stable branches.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit ff60509157)
If we call gl[Copy]TexImage2D() with a generic compression format
(e.g. intFormat=GL_COMPRESSED_RGBA) we can't choose a DXT format if
we don't have the external DXT compression library.
We weren't actually enforcing this before since the
pipe_screen::is_format_supported(DXT) query has no dependency on
the DXT compression library.
Now if we're given a generic compressed format and we can't do DXT
compression we'll fall back to a non-compressed format.
v2: use util_format_is_s3tc() function and add more comments about
the allow_dxt parameter.
Note: This is a candidate for the stable branches.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 4df42890c5)
When glCompressedTexImage is called the internalFormat is a specific
format for the incoming image and the the hardware format should be
the same (since we never do format transcoding). So use the simpler
_mesa_glenum_to_compressed_format() function. This change is also
needed for the next patch.
Note: This is a candidate for the stable branches.
(cherry picked from commit 478056b81a)
GLX uses mapi/glapi/libglapi.la, which is only built for OpenGL.
If the user specified --enable-xlib-glx --disable-opengl, error out, as these
cannot be both observed at the same time. If the user just specified
--disable-opengl but not --disable-glx, print a warning and disable GLX as
well.
NOTE: This is a candidate for the stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59364
Tested-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit 3b888f534c)
Check that the return value from xcb_dri2_swap_buffers_reply is
non-NULL before accessing the struct members.
Note: This is a candidate for the 9.0 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 67e7263e45)
See previous commit for more info.
Note: This is a candidate for the 9.0 branch.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 2180f32972)
The swrast fragment program interpreter has trouble computing the
right texture LOD because it doesn't have easy access to input
derivatives. This causes the GLSL-based meta generate mipmap code
to fetch texels from the wrong mipmap level.
One possible fix would be to set the GL_TEXTURE_MIN/MAX_LOD parameters
to limit sampling from the right level. But let's just use the
_mesa_generate_mipmap() fallback since it's a lot faster than using
the fragment shader interpreter.
Fixes http://bugs.freedesktop.org/show_bug.cgi?id=54240
Note: This is a candidate for the 9.0 branch.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 89551ae04f)
In the GLSL 1.30 spec, section 4.3.6 ("Outputs") says:
"If a vertex output is a signed or unsigned integer or integer
vector, then it must be qualified with the interpolation qualifier
flat."
The GLSL ES 3.00 spec further clarifies, in section 4.3.6 ("Output
Variables"):
"Vertex shader outputs that are, *or contain*, signed or unsigned
integers or integer vectors must be qualified with the
interpolation qualifier flat."
(Emphasis mine.)
The language in the GLSL ES 3.00 spec is clearly correct and should be
applied to all shading language versions, since varyings that contain
ints can't be interpolated, regardless of which shading language
version is in use.
(Note that in GLSL 1.50 the restriction is changed to apply to
fragment shader inputs rather than vertex shader outputs, to
accommodate the fact that in the presence of geometry shaders, vertex
shader outputs are not necessarily interpolated. That will be
addressed by a future patch).
NOTE: This is a candidate for stable branches.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 93c913485e)
From GLSL ES 3.00 section 4.5.4 ("Default Precision Qualifiers"):
"The precision statement
precision precision-qualifier type;
can be used to establish a default precision qualifier. The type
field can be either int or float or any of the sampler types, and
the precision-qualifier can be lowp, mediump, or highp."
GLSL ES 1.00 has similar language. GLSL 1.30 doesn't allow precision
qualifiers on sampler types, but this seems like an oversight (since
the intention of including these in GLSL 1.30 is to allow
compatibility with ES shaders).
Previously, Mesa followed GLSL 1.30 and only allowed default precision
qualifiers to be set for float and int. This patch makes it follow
GLSL ES rules in all cases.
Fixes Piglit tests default-precision-sampler.{vert,frag}.
Partially addresses https://bugs.freedesktop.org/show_bug.cgi?id=60737.
NOTE: This is a candidate for stable branches.
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit d5948f2f5e)
With the llvm patches, fixing 14 piglit tests in total.
v2: increase the const limit
v3: document the const limit
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 8c80894fb3)
This reverts commit 2906e2034c.
Fixes a regression in the glean depthStencil test.
Reverting this does not affect any tests in es3conform, so a more recent
patch must have also fixed the failure this one was intended to fix.
Reported-by: lu hua <huax.lu@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59494
(cherry picked from commit a527b2192e)
Seems that alpha test being enabled confuse the GPU on the order in
which it should perform the Z testing. So force the order programmed
throught db shader control.
v2: Only force z order when alpha test is enabled
v3: Update db shader when binding new dsa + spelling fix
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 974b482aca)
In OpenGL 4.3, new language was added that would require
this check. But, if this check results in broken applications
then perhaps it will be reversed.
For now, remove this check and re-evaluate when
desktop GL 4.3 is closer.
NOTE: This is a candidate for the 9.1 branch.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
The third argument of AC_ARG_WITH is evaluated for any provided value,
not only on --with-, so it must not force-enable the feature
Also, setting $with_llvm_shared_libs in the opencl check was overriding
the user switch
https://bugs.freedesktop.org/show_bug.cgi?id=59851
Signed-off-by: Quentin Glidic <sardemff7+git@sardemff7.net>
(cherry picked from commit 1e857130f0)
The blit must be aligned on 8 horizontal block.
v2: no need to align the reminder
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit 323a448825)
If the same context try to flink and open the object, use the
same bo struct instead of opening a new gem handle for the object.
This way we avoid avoid having 2 different handle pointing to the
same kernel object which can latter lead to trouble with virtual
address.
Fix:
https://bugs.freedesktop.org/show_bug.cgi?id=60200
Signed-off-by: Martin Andersson <g02maran@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit a37835c8ed)
Previously we were relying on CFLAGS_FOR_BUILD to be the same as CFLAGS
when not cross compiling, but this assumption didn't take into
consideration 32-bit builds on 64-bit systems. More generally, not
honoring CFLAGS is bad.
Automake is evidently too stupid to accept
if CROSS_COMPILING
CC = @CC_FOR_BUILD@
...
else
CC = @CC@
endif
without warning that CC has been already defined. The warnings are
harmless, but I'd prefer to avoid future reports about them, so define
proxy variables, which are assigned inside the conditional and then
unconditionally assigned to CC et al.
NOTE: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59737
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60038
(cherry picked from commit 2db1f73849)
Make sure one can identify virtual address failure from allocation
failure.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit 9a47684564)
The exa core will already set the pointer to NULL prior calling
the callback function. So don't bail out in the callback if it's
already NULL.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit 3310acdf47)
Since transform feedback needs to be able to access individual fields
of varying structs, we can no longer match up the arguments to
glTransformFeedbackVaryings() with variables in the vertex shader.
Instead, we build up a hashtable which records information about each
possible name that is a candidate for transform feedback, and then
match up the arguments to glTransformFeedbackVaryings() with the
contents of that hashtable.
Populating the hashtable uses the program_resource_visitor
infrastructure, so the logic is shared with how we handle uniforms.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 99b78337e3)
Previously, transform feedback varyings were parsed in an ad-hoc
fashion that wasn't compatible with structs (or array of structs).
This patch makes it use parse_program_resource_name(), which correctly
handles both.
Note that parse_program_resource_name()'s technique for handling
mal-formed input strings is to simply let them through and rely on the
fact that a future name lookup will fail. Because of this,
tfeedback_decl::init() no longer needs to return a boolean error
code--it always succeeds, and if the input was mal-formed the error
will be detected later.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 53febac02c)
There's actually nothing uniform-specific in uniform_field_visitor.
It is potentially useful for all kinds of program resources (in
particular, future patches will use it for transform feedback
varyings).
This patch renames it to program_resource_visitor, and clarifies
several comments, to reflect the fact that it is useful for more than
just uniforms.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b4db34cc4c)
The parsing logic is moved to a new function in the GLSL module,
parse_program_resource_name(). This name was chosen because it should
eventually be useful for handling everything that OpenGL 4.3 calls
"program resources" (e.g. uniforms, vertex inputs, fragment outputs,
and transform feedback varyings).
Future patches will make use of this function for linking transform
feedback varyings.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b92900d26a)
Now that we have support for overriding alpha to 1.0, we can handle
blitting between these formats in either direction.
For now, we only support two XRGB formats: MESA_FORMAT_XRGB8888 and
MESA_FORMAT_RGBX8888_REV. Most places only appear to worry about the
former, so ignore the latter for now. We can always add it later.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
(cherry picked from commit 7d467f3c15)
Currently, Blorp requires the source and destination formats to be
equal. However, we'd really like to be able to blit between XRGB and
ARGB formats; our BLT engine paths have supported this for a long time.
For ARGB -> XRGB, nothing needs to occur: the missing alpha is already
interpreted as 1.0. For XRGB -> ARGB, we need to smash the alpha
channel to 1.0 when writing the destination colors. This is fairly
straightforward with blending.
For now, this code is never used, as the source and destination formats
still must be equal. The next patch will relax that restriction.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
(cherry picked from commit c0554141a9)
The BLT engine has many limitations. Currently, it can only blit
X-tiled buffers (since we don't have a kernel API to whack the BLT
tiling mode register), which means all depth/stencil operations get
punted to meta code, which can be very CPU-intensive.
Even if we used the BLT engine, it can't blit between buffers with
different tiling modes, such as an X-tiled non-MSAA ARGB8888 texture
and a Y-tiled CMS ARGB8888 renderbuffer. This is a fundamental
limitation, and the only way around that is to use BLORP.
Previously, BLORP only handled BlitFramebuffer. This patch adds an
additional frontend for doing CopyTexSubImage. It also makes it the
default. This is partly to increase testing and avoid hiding bugs,
and partly because the BLORP path can already handle more cases. With
trivial extensions, it should be able to handle everything the BLT can.
This helps PlaneShift massively, which tries to CopyTexSubImage2D
between depth buffers whenever a player casts a spell. Since these
are Y-tiled, we hit meta and software ReadPixels paths, eating 99% CPU
while delivering ~1 FPS. This is particularly bad in an MMO setting
because people cast spells all the time.
It also helps Xonotic in 4X MSAA mode. At default power management
settings, I measured a 6.35138% +/- 0.672548% performance boost (n=5).
(This data is from v1 of the patch.)
No Piglit regressions on Ivybridge (v3) or Sandybridge (v2).
v2: Create a fake intel_renderbuffer to wrap the destination texture
image and then reuse do_blorp_blit rather than reimplementing most
of it. Remove unnecessary clipping code and conditional rendering
check.
v3: Reuse formats_match() to centralize checks; delete temporary
renderbuffers. Reorganize the code.
v4: Actually copy stencil when dealing with separate stencil buffers but
packed depth/stencil formats. Tested by a new Piglit test.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com> [v4]
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v3]
Reviewed-and-tested-by: Carl Worth <cworth@cworth.org> [v2]
Tested-by: Martin Steigerwald <martin@lichtvoll.de> [v3]
(cherry picked from commit 0b3bebbaac)
Ivybridge doesn't appear to have the same errata as Sandybridge; no
corruption was observed by setting it to more than the minimal correct
value. It's possible that we were simply lucky, since the URB entries
are 1024-bit on Ivybridge vs. 512-bit Sandybridge. Or perhaps the
underlying hardware issue is fixed.
Either way, we may as well program the minimum value since it's now
readily available, likely to be more efficient, and possibly more
correct.
v2: Use GEN7_SBE_* defines rather than GEN6_SF_*. (A copy and paste
mistake.) They're the same, but using the right names is better.
NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 44aa2e15f6)
(This commit message was primarily written by Paul Berry, who explained
what's going on far better than I would have.)
Previous to this patch, we thought that the only restrictions on
3DSTATE_SF's URB read length were (a) it needs to be large enough to
read all the VUE data that the SF needs, and (b) it can't be so large
that it tries to read VUE data that doesn't exist. Since the VUE map
already tells us how much VUE data exists, we didn't bother worrying
about restriction (a); we just did the easy thing and programmed the
read length to satisfy restriction (b).
However, we didn't notice this erratum in the hardware docs: "[errata]
Corruption/Hang possible if length programmed larger than recommended".
Judging by the context surrounding this erratum, it's pretty clear that
it means "URB read length must be exactly the size necessary to read all
the VUE data that the SF needs, and no larger". Which means that we
can't program the read length based on restriction (b)--we have to
program it based on restriction (a).
The URB read size needs to precisely match the amount of data that the
SF consumes; it doesn't work to simply base it on the size of the VUE.
Thankfully, the PRM contains the precise formula the hardware expects.
Fixes random UI corruption in Steam's "Big Picture Mode", random terrain
corruption in PlaneShift, and Piglit's fbo-5-varyings test.
NOTE: This is a candidate for all stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56920
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60172
Tested-by: Jordan Justen <jordan.l.justen@intel.com> (v1/Piglit)
Tested-by: Martin Steigerwald <martin@lichtvoll.de> (PlaneShift)
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 09fbc29828)
The maximum SF source attribute is necessary to compute the Vertex URB
read length properly, which will be done in the next commit.
NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5e9bc7bd12)
The next patch will benefit from easy access to the source attribute
number and whether or not we're swizzling. It doesn't want the final
attr_override DWord form, however.
NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b3efc5bea8)
The maximum number of URB entries come from the 3DSTATE_URB_VS and
3DSTATE_URB_GS state packet documentation; the thread count information
comes from the 3DSTATE_VS and 3DSTATE_PS state packet documentation.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
(cherry picked from commit 9add4e8038)
In the documentation for BindBufferRange, OpenGL specs from 3.0
through 4.1 contain this language:
"The error INVALID_VALUE is generated if size is less than or
equal to zero or if offset + size is greater than the value of
BUFFER_SIZE."
This text was dropped from OpenGL 4.2, and it does not appear in the
GLES 3.0 spec.
Presumably the reason for the change is because come clients change
the size of the buffer after calling BindBufferRange. We don't want
to generate an error at the time of the BindBufferRange call just
because the old size of the buffer was too small, when the buffer is
about to be resized.
Since this is a deliberate relaxation of error conditions in order to
allow clients to work, it seems sensible to apply it to all versions
of GL, not just GL 4.2 and above.
(Note that there is no danger of this change allowing a client to
access data beyond the end of a buffer. We already have code to
ensure that that doesn't happen in the case where the client shrinks
the buffer after calling BindBufferRange).
Eliminates a spurious error message in the gles3 conformance test
"transform_feedback_offset_size".
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 04f0d6cc22)
This matches the behavior of the Windows driver, but a bspec reference
should would be nice.
NOTE: This is a candidate for the 9.0 and 9.1 branches.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f29ab4ece5)
Should have been done in d9948e49 but I missed it because
MAX_VARYING_FLOATS doesn't appear in the ES 3 spec, but is the same
value as MAX_VARYING_COMPONENTS.
NOTE: Candidate for the 9.1 branch
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Also, add assertions to stress that render targets don't support scaled
formats.
20 more little piglits.
(cherry picked from commit 46dd16bca8b4526e46badc9cb1d7c058a8e6173e)
Append the overloaded vector type used for passing in the addressing
parameters.
Without this, LLVM uses the same function signature for all those types,
which cannot work.
Fixes problems e.g. with FlightGear and Red Eclipse.
(cherry picked from commit 1b3afea30de757815555d9eb1d6e72e2586d6a0c)
Was using the pixel size instead of the number of block for the slice
tile max computation which resulted in dma writing at wrong address.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
That should work in all cases.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Note: this is a candidate for the 9.1 branch.
(cherry picked from commit af0af75881)
Was broken since commit bf469f4edc
('gallium: add void *user_buffer in pipe_index_buffer').
Fixes 11 piglit tests and lots of missing geometry e.g. in TORCS.
NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit a8a5055f2d)
The hardware can't do it, and these were causing warnings in some piglit tests.
NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 6455d40b7e)
In particular, the LOD bias and depth comparison values are packed before the
'normal' texture coordinates, and the array slice and LOD values are appended.
NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit 120efeef8b)
Fix up intrinsic names, and bitcast texture address parameters to integers.
NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit e5fb7347a7)
glRasterPos doesn't exist in the core profile.
NOTE: This is a candidate for the stable branches (9.0 and 9.1).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit cc5fdaf2dc)
This along with the latest drm-fixes branch should help with bad performance
of MSAA. Remember: Nx MSAA can't be more than N times slower (where N=2,4,6).
Anyway, I recommend at least 512 MB of VRAM for Full HD 6x MSAA.
NOTE: This is a candidate for the 9.1 branch.
(cherry picked from commit a06f03d795)
We are now seing cs that can go over the vram+gtt size to avoid
failing flush early cs that goes over 70% (gtt+vram) usage. 70%
is use to allow some fragmentation.
The idea is to compute a gross estimate of memory requirement of
each draw call. After each draw call, memory will be precisely
accounted. So the uncertainty is only on the current draw call.
In practice this gave very good estimate (+/- 10% of the target
memory limit).
v2: Remove left over from testing version, remove useless NULL
checking. Improve commit message.
v3: Add comment to code on memory accounting precision
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
I did not list the *_get_program_binary extensions since they're not
useful to anyone with their current implementation (that supports 0
binary formats).
Old kernel do not have dma support, patch pushed were missing some
of the check needed to not use dma.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Only drivers supporting DRI2 version >=4 support GLX_INTEL_swap_event.
So lets mark it as such otherwise applications which use this extension
(i.e. everything based on Clutter, e.g. gnome-shell) break horribly on
drivers supporting DRI2 versions only up to 3.
Note: This is a candidate for the 9.0 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
Get rid of special handling for reserved regs.
Use one intrinsic for all kinds of interpolation.
v2[Vincent Lejeune]: Rebased against current master
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
r600_bytecode::ar_chan stores the register channel for the value that
will be loaded into the AR register.
At the moment, this field is only used by the LLVM backend. The default
backend always sets ar_chan = 0.
We keep track of ring emission order in a stack, whenever we need to
flush we empty the stack in a fifo order. There is few helpers function
for bo mapping and other ring activities that will make sure that
the ring stack is properly flush and submitted.
v2: fix st flush path, and other flush path to properly flush all
rings if necessary
v3: - improve name of ring helpers
- make sure that each time a cs is gona be written it endup at
top of the stack to avoid any issue such as :
STACK[0] = dma (withbo A,B)
STACK[1] = gfx (withbo C,D)
Now if code try to emit a dma command relative to bo C or D
it will start writting cmd stream into the cs and once it
reach the point where it adds relocation it will flush.
At that point the cs will have cmd that don't have proper
relocation into the relocation buffer and kernel will just
refuse to run.
v4: - Drop the stack idea as it turn out there is no way to use it
or benefit from it. Any time the driver start command on other
ring, it always need to flush the previous ring. So make code
simpler by not using a stack.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Add ring support, you can create a cs for each ring. DMA ring is
bit special regarding relocation as you must emit as much relocation
as there is use of the buffer.
v2: - Improved comment on relocation changes
- Use a single thread to queue cs submittion this simplify driver
code while not impacting performances. Rational for this is that
you have to wait for all previous submission to have completed
so there was never a case while we could have 2 different thread
submitting a command stream at the same time. This code just
consolidate submission into one single thread per winsys.
v3: - Do not use semaphore for empty queue signaling, instead use
cond var. This is because it's tricky to maintain an even number
of call to semaphore wait and semaphore signal (the number of
cs in the stack would for instance make that number vary).
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Make it obvious what "unit" this is (no change in functionality).
draw still uses "unit" in places where it changes the shader by adding
texture sampling itself - it seems like this can't work with shaders
using dx10-style sample opcodes (can't mix gl-style and dx10-style
sample instructions in a shader).
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Split the sampler interface to use separate sampler and texture (sampler_view)
state. This is needed to support dx10-style sampling instructions.
This is not quite complete since both draw/llvmpipe don't really track
textures/samplers independently yet, as well as the gallivm code not quite
using the right sampler or texture index respectively (but it should work
for the sampling codes used by opengl).
We are however losing some optimizations in the process, apply_max_lod will
no longer work, and we potentially could end up with more (unnecessary)
recompiles (if switching textures with/without mipmaps only so it shouldn't
be too bad).
v2: don't use different callback structs for sampler/sampler view functions
(which just complicates things), fix up sampling code to actually use the
right texture or sampler index, and similar for llvmpipe/draw actually
distinguish between samplers and sampler views.
v3: fix more of PIPE_MAX_SAMPLER / PIPE_MAX_SHADER_SAMPLER_VIEWS mismatches
(both in draw and llvmpipe), based on feedback from José get rid of unneeded
static sampler derived state.(which also fixes the only 2 piglit regressions
due to a forgotten assignment), fix comments based on Brian's feedback.
v4: remove some accidental unrelated whitespace changes
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
npix_x/y/z is wrong with NPOT textures, since it's always aligned to POT
if the level is non-zero, so we can't use that.
This fixes piglit/spec/EXT_texture_shared_exponent/fbo-generatemipmap-formats.
Fixes a crash when the Redway3D Turbine demo exits. We've made this
change in other places in the past. The root issue is texture objects
are being shared by multiple contexts and sampler views get shared too.
Sampler views have a context pointer and if that context gets deleted
we may try to reference that context when finally deleting the sampler
view.
pipe_sampler_view_release() avoids this problem because it takes
an explicit context.
Reviewed-by: Zack Rusin <zackr@vmware.com>
Check the return value of calls to u_upload_alloc() and
u_upload_data() and return early if needed.
Since we don't have a way to propagate errors all the way up to
Mesa through pipe_context::draw_vbo(), call debug_warn_once() so
the user might have some clue about OOM errors.
Note: This is a candidate for the 9.0 branch.
We weren't properly checking the return value of these calls (and
calls to u_upload_data()) to detect OOM errors.
Note: This is a candidate for the 9.0 branch.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Some callers of this function were checking the 'ptr' result to see if
the function failed. But the correct way is to check the regular
return value for PIPE_ERROR_x. Now we initialize all the returned
values at the top of the function in case we do hit an error (like OOM).
Callers are more likely to detect OOM conditions now. But there
are some callers which don't do any error checking...
Note: This is a candidate for the 9.0 branch.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
That is, evaluate constant expressions for the following functions:
packSnorm4x8, unpackSnorm4x8
packUnorm4x8, unpackUnorm4x8
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
For each function {pack,unpack}{Snorm,Unorm}4x8, add a corresponding
opcode to enum ir_expression_operation. Validate the new opcodes in
ir_validate.cpp.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
We were using the NEED_RADEON_GALLIUM conditional to decide whether or not
to build llvm_wrapper.cpp, which is required for using the LLVM backend.
llvm_wrapper.cpp needs to be linked against the LLVM IPO libary
and this library is only added to LLVM_LIBS if either opencl or the
r600-llvm-compiler is enabled.
The NEED_RADEON_GALLIUM conditional is set to true when enabling the
radeonsi driver, so if the radeonsi and r600 drivers are enabled without
also enabling opencl or r600-llvm-compiler, llvm_wrapper.cpp will be
built, but the IPO library won't be added to LLVM_LIBS. This was
causing unresolved symbol errors when buiding with this configuration.
https://bugs.freedesktop.org/show_bug.cgi?id=59831
Tested-by: Alex Deucher <alexander.deucher@amd.com>
Dereffing all the values in the two callers was just pointless, and
the function isn't inlined so there was actual code impact.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The core Mesa code has just one more case than this (GL_BITMAP), so I
don't see any cause to special-case it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This was added in b93684f5f3, but there's
no need for it -- get_size has to succeed, and it has an assert for us
in debug builds.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
For our current types, the required alignment is actually just 1 byte.
When we get doubles, we have to worry (those have to be aligned to the
natural size), but we don't have doubles yet and they'll just be a
special case.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: A previous patch contained a spurious hunk that removed an
assignment to ir_variable::uniform_block. That hunk was moved to this
patch. Suggested by Carl Worth.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
glGetActiveUniform is not supposed to report block members that are not
active even if they are included in the layout of the block. The block
layout is determined from the GLSL_TYPE_INTERFACE that defines the
block, so eliminating the ir_variables that correspond to the individual
fields is safe.
Fixes gles3conform test
uniform_buffer_object_getuniformindices_for_for_nonexistent_or_not_active_uniform_names.
This also fixes the assertion failures (added in the previous commit) in
gles3conform uniform_buffer_object_index_of_not_active_block,
uniform_buffer_object_inherit_and_override_layouts, and
uniform_buffer_object_repeat_global_scope_layouts.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Use the function added in the previous commit.
This temporarily causes gles3conform
uniform_buffer_object_index_of_not_active_block,
uniform_buffer_object_inherit_and_override_layouts, and
uniform_buffer_object_repeat_global_scope_layouts to assertion fail.
This is fixed in the next commit.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Calculate all of the block member offsets, the IndexNames, and
everything else to do with every UBO.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Pretty much all of the compile-time, per-compilation unit block data is
about to get the axe.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
glGetUniformIndices requires that the block instance index not be
present in the name of queried uniforms. However,
gl_uniform_buffer_variable::Name will include the instance index. The
IndexName field is added to handle this difference.
Note that currently IndexName will always point to the same string as
Name. This will change soon.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Pretty much all of the compile-time, per-compilation unit block data is
about to get the axe.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
This flavor takes a type and a base name. It will be used to handle
cases where the block name (instead of the instance name) is used for an
interface block.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: Inspite of the spell checker, spell recurse correctly. Suggested by
Carl Worth.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eventually the gl_uniform_block information won't be calculated until
linking. Block names need to be checked for name clashes during
compiling, so we have to track it differently.
v2: Update the commit message. Suggested by Carl Worth.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Not used yet, but the UBO layout visitor will use this.
v2: Remove a spruious hunk. This is moved to the patch "glsl: Remove
ir_variable::uniform_block". Suggested by Carl Worth.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Not used yet, but the UBO layout visitor will use this.
v2: Add some commentary as to why row_major is always set to false in
process. Suggesed by Paul Berry.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
For the first declaration below, there will be an ir_variable named
"instance" whose type and whose instance_type will be the same
glsl_type. For the second declaration, there will be an ir_variable
named "f" whose type is float and whose instance_type is B2.
"instance" is an interface instance variable, but "f" is not.
uniform B1 {
float f;
} instance;
uniform B2 {
float f;
};
v2: Copy the comment message documentation into the code. Suggested by
Paul Berry.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
For variables that are in an interface block or are an instance of an
interface block, this is the GLSL_TYPE_INTERFACE type for that block.
Convert the ir_variable::is_in_uniform_block method added in the
previous commit to use this field instead of ir_variable::uniform_block.
v2: Fix the place-holder comment on ir_variable::interface_type.
Suggested by Paul Berry.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The way a variable is tested for this property is about to change, and
this makes the code easier to modify.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
If the block has an instance name, add the instance name to the symbol
table instead of the individual fields.
Fixes the piglit test interface-name-access-without-interface-name.vert
for real.
v2: Update the comment before the assertion that interface block
definitions won't generate instructions. Suggested by Paul Berry.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Interfaces are structurally identical to structures from the compiler's
point of view. They have some additional restrictions, and generally
GPUs use different instructions to access them. Using a different base
type should make this a bit easier.
This commit also adds the glsl_type::interface_packing fields. For
GLSL_TYPE_INTERFACE types, this will track the specified packing mode.
It is analogous to gl_uniform_buffer::_Packing.
v2: Add serveral missing GLSL_TYPE_INTERFACE cases in switch-statements.
v3: Add information about glsl_type::interface_packing. Move row_major
checking in glsl_type::record_key_compare from this patch to the
previous patch. Both suggested by Paul Berry.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
For now, this will always be false. In the near future, an "interface"
type will be added that shares a lot of infrastructure with structures.
v2: Move row_major checking in glsl_type::record_key_compare from the
next patch to this patch. Suggested by Paul Berry.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The size is parsed and stored in the AST, but it is not used yet.
Processing of the array size is added in the patch "glsl: Handle
instance array declarations"
v2: Update the commit message (suggested by Carl Worth). Add a comment
to ast_uniform_block::array_size (suggested by Paul Berry).
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
In GLSL ES 3.00 (and GLSL 1.50), uniform blocks can have an associated
"instance name", which essentially namespaces the variables inside.
This patch adds basic parsing for this new feature, but doesn't yet hook
it up to actually do anything yet.
It does not support for arrays of interface blocks; a later commit will
take care of that.
This change temporarily regresses the piglit test
interface-name-access-without-interface-name.vert. This shader failed
to compile before (the expected result), but it failed to compile for
the wrong reason. This is not a real regression.
v2: Add some comments to ast_uniform_block::instance_name. Suggested by
Paul Berry.
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The existing code has a lot of duplication; the only difference between
the two cases is whether we merge in an additional layout qualifier.
Apparently creating a layout_qualifieropt rule that can be empty causes
a lot of conflicts and confusion. However, refactoring out the guts of
the ast_uniform_block creation works fine.
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Also slightly change the compatibility test. Instead of comparing the
offsets of the block variables, compare the packing mode of the blocks.
Ideally we don't want to assign the offsets until a later stage of
linking.
This is put in a new file called link_uniform_blocks.cpp. Some new
functions related to uniform blocks are going to live in that file as
well.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This makes it easier to find switch-statements that need to be updated
after a new GLSL_TYPE_* is added because the compiler will generate a
warning.
Switch-statements that only had a small number of cases (e.g.,
everything in ir_constant_expression.cpp) were not modified. I may
regret that decision when we eventually add support for doubles.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Too much attention was paid to the first paragraphs, and not enough to
the last little note that "oh, by the way, the rendered things
themselves still have to be clipped to just 8192 wide/high".
Fixes GTF's clip.c test with 4096 or higher width on ivb, where one of
the triangles got the upper half of its pixels dropped.
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Khronos has apparently decided that depth textures with sized formats
(allowed with ARB_internalformat_query or ES 3.0) should be treated as
GL_RED, while unsized formats (an existing feature) should be treated
as GL_INTENSITY for compatibility with ES 2.0.
Ian is proposing changes to ARB_internalformat_query which will make
this actually legal and consistent.
A similar problem exists with GL 4.2, but we're going to ignore that
for the time being.
Tested on Ivybridge: no Piglit regressions; fixes 4 es3conform tests:
- depth_texture_fbo
- depth_texture_fbo_clear
- depth_texture_teximage
- depth_texture_texsubimage
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Since patch "i965: Validate requested GLES context version in
brwCreateContext", we have been able to create ES 3.0 contexts due to the
max version check. So...bump the max version.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
v2: Add ARB_internalformat_query to the list of required extensions.
v3: Add OES_depth_texture_cube_map to the list of required extensions.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
s/src/src_w/
That little typo, which sneaked into v4 of the previous patch, generates
incorrect fs code.
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
FIXME: This patch emits VS code that violates documented hardware
restrictions and then relies on undocumented behavior that results from
that violation. This patch passes all tests, but should be fixed ASAP to
conform to the hardware documentation.
v2: Explain undocumented hardware behavior. Improve comments.
v3: Use ALU1 helper methods F32TO16() and F16TO32(). [for anholt]
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
The GLSL ES 3.00 operations packHalf2x16 and unpackHalf2x16 will emit
these opcodes.
- Define the opcodes BRW_OPCODE_{F32TO16,F16TO32}.
- Add the opcodes to the brw_disasm table.
- Define convenience functions brw_{F32TO16,F16TO32}.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
On gen < 7, we fully lower all operations to arithmetic and bitwise
operations.
On gen >= 7, we fully lower the Snorm2x16 and Unorm2x16 operations, and
partially lower the Half2x16 operations.
v2:
- Comment that scalarization is needed only for SOA code [for idr].
- Replace switch-statement with if-statement [for idr].
- Remove misplaced hunk from previous patch [found by idr].
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Tuner <mattst88@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Lower them to arithmetic and bit manipulation expressions.
v2: Rewrite using ir_builder [for idr].
v3: Comment typos. [for mattst88]
v4: Fix arithmetic error in comments.
Factor out a shift instruction.
Don't heap allocate factory.instructions.
[for paul]
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v2)
Reviewed-by: Matt Tuner <mattst88@gmail.com> (v3)
Reviewed-by: Paul Berry <stereotype441@gmail.com> (v4)
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
In ir_expression's constructor, the cases for {bit,logic}_{and,or,xor}
failed to handle the case when both operands were vectors.
Note: This is a candidate for the stable branches.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Replace tabs with spaces. According to docs/devinfo.html, Mesa's
indetation style is:
indent -br -i3 -npcs --no-tabs infile.c -o outfile.c
This patch prevents whitespace weirdness in the next patch.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Add two overloaded variants of
ir_if *if_tree()
The new functions allow one to chain together if-trees within a single C++
expression that resembles a real if-statement.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Using this enum improves the readibility of calls to assign(), whose third
argument is a writemask.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Add method ir_factory::constant. This little method constructs an
ir_constant using the factory's mem_ctx.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Add the following functions, each of which construct the similarly named
ir expression:
div, round_even, clamp
equal, less, greater, lequal, gequal
logic_not, logic_and, logic_or
bit_not, bit_or, bit_and, lshift, rshift
f2i, i2f, f2u, u2f, i2u, u2i
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
That is, evaluate constant expressions of the following functions:
packSnorm2x16 unpackSnorm2x16
packUnorm2x16 unpackUnorm2x16
packHalf2x16 unpackHalf2x16
v2: Reuse _mesa_pack_float_to_half and its inverse to evaluate
pack/unpackHalf2x16. [for idr]
v3: Whitespace fixes. [for mattst88]
Don't cast neg floats directly to uint16; use an intermediate cast to
int16. [for paul]
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v2)
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Matt Tuner <mattst88@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Not all float32 values can be exactly represented as a float16.
_mesa_float_to_half() rounded such intermediate float32 values to zero by
truncating unrepresentable bits in the mantissa.
This patch improves _mesa_float_to_half() by rounding intermediate float32
values to the nearest float16; when the float32 is exactly between two
float16 values we round to the one with an even mantissa. This behavior is
preferred over the old behavior because:
- It has reduced bias relative to the old behavior.
- It reproduces the behavior of real hardware: opcode F32TO16 in
Intel's GPU ISA.
- By reproducing the behavior of the GPU (at least on Intel hardware),
compile-time evaluation of constant packHalf2x16 GLSL expressions will
result in the same value as if the expression were executed on the GPU.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Move round_to_even's definition to mesa/main so that _mesa_float_to_half()
can use it in order to eliminate rounding bias.
In additon to moving the fuction definition, prefix its name with "_mesa",
just as all other functions in mesa/main are prefixed.
v2: Fix Android build.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
A subsequent patch will add mesa/main/imports.c as a dependency to the
compiler, which in turn requires that _mesa_warning() be defined.
The real definition of _mesa_warning() is in mesa/main/errors.c, but to
pull that file into the standalone scaffolding would require transitively
pulling in the dispatch tables.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
For each function {pack,unpack}{Snorm,Unorm,Half}2x16, add a corresponding
opcode to enum ir_expression_operation. Validate the new opcodes in
ir_validate.cpp.
Also, add opcodes for scalarized variants of the Half2x16 functions. (The
code generator for the i965 fragment shader requires that all vector
operations be scalarized. A lowering pass, to be added later, will
scalarize the Half2x16 functions).
v2: Fix assertion message in ir_to_mesa [for idr].
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Tuner <mattst88@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
For each of the following functions, add a declaration to
builtins/profiles/300es.glsl and create new file
builtins/ir/${funcname}.ir:
packSnorm2x16 unpackSnorm2x16
packUnorm2x16 unpackUnorm2x16
packHalf2x16 unpackHalf2x16
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Tuner <mattst88@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
s/num_operands()/get_num_operands()/
Discovered because Eclipse failed to resolve the false reference.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
The bug: The printed horizontal stride was the numerical value of the
BRW_HORIZONTAL_$N enum.
The fix: Translate the enum before printing.
Note: This is a candidate for the stable releases.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
When possible, glCopyTexSubImage calls are performed using the
hardware blitter. However, according to the Ivy Bridge PRM, Vol1
Part4, section 1.2.1.2 (Graphics Data Size Limitations):
The BLT engine is capable of transferring very large quantities of
graphics data. Any graphics data read from and written to the
destination is permitted to represent a number of pixels that
occupies up to 65,536 scan lines and up to 32,768 bytes per scan
line at the destination. The maximum number of pixels that may be
represented per scan line’s worth of graphics data depends on the
color depth.
With an RGBA32F color buffer (which has 16 bytes per pixel) this
imposes a maximum width of 2048 pixels. Other pixel formats have
accordingly larger limits.
To make matters worse, if the pitch of the buffer is 32k or greater,
intel_copy_texsubimage's call to intelEmitCopyBlit will overflow
intelEmitCopyBlit's src_pitch and dst_pitch parameters (which are
16-bit signed integers).
We can conveniently avoid both problems by avoiding use of the blitter
when the miptree's pitch is >= 32k.
Fixes gles3conform "framebuffer_blit_functionality_magnifying_blit"
tests when the buffer width is equal to 8192.
Note: this is very similar to the recent patch "intel: Fix ReadPixels
on buffers whose width >= 32kbytes" except that it applies to
glCopyTexSubImage instead of glReadPixels. In a future patch it would
be nice to refactor the code so that (a) overflow is avoided, and (b)
intelEmitCopyBlit is responsible for checking whether the blitter can
handle the width, so that all callers of intelEmitCopyBlit work
properly, rather than just these two.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Previously I thought that varying structs had been added to GLSL ES
3.00 by mistake, because chapter 11 of the GLSL ES 3.00 spec
("Counting of Inputs and Outputs") failed to mention how structs
should be handled. Khronos has clarified
(https://cvs.khronos.org/bugzilla/show_bug.cgi?id=9828) that varying
structs are indeed required, and that chapter 11 will be modified to
indicate that the minimal reference packing algorithm flattens varying
structs to their individual components.
Mesa doesn't flatten varying structs to their individual components,
but this is ok, since it packs varyings of all kinds with no wasted
space at all (except where this is impossible due to differing
interpolation modes), so it will outperform the minimal reference
packing algorithm in all but the most pathological cases.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
It is not clear from the GLSL ES 3.00 spec how transform feedback is
supposed to apply to varying structs:
- There is no specification for how the structure is to be packed when
it is recorded into the transform feedback buffer.
- There is no reasonable value for GetTransformFeedbackVarying to
return as the "type" of the variable.
We currently have a Khronos bug requesting clarification on how this
feature is supposed to work
(https://cvs.khronos.org/bugzilla/show_bug.cgi?id=9856).
This patch just disables transform feedback of varying structs for
now; we can implement the proper behaviour once we find out from
Khronos what it is.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This patch adds code to lower_packed_varyings to handle varyings of
type struct. Varying structs are currently packed in the most naive
possible way (in declaration order, with no gaps), so there is a
potential loss of runtime efficiency. In a later patch it would be
nice to replace this with a "flattening" approach (wherein a varying
struct is flattened to individual varyings corresponding to each of
its structure elements), so that the linker can align each structure
element independently. However, that would require a significantly
more complex implementation.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This patch paves the way for allowing varying structs by generalizing
varying_matches::compute_packing_order to handle any type of varying.
Previously, we packed in the order (vec4, vec2, float, vec3), with
matrices being packed according to the size of their columns. Now, we
pack everything according to its number of components mod 4, in the
order (0, 2, 1, 3).
There is no behavioural change for vectors. Matrices are now packed
slightly differently:
- mat2x2 gets assigned PACKING_ORDER_VEC4 instead of
PACKING_ORDER_VEC2. This is slightly better, because it guarantees
that the matrix occupies a single varying slot.
- mat2x3 gets assigned PACKING_ORDER_VEC2 instead of
PACKING_ORDER_VEC3. This is kind of a wash. Previously, mat2x3 had
a 25% chance of having neither of its columns double parked, a 50%
chance of having exactly one of its columns double parked, and a 25%
chance of having both of its columns double parked. Now it always
has exactly one of its columns double parked.
- mat3x3 gets assigned PACKING_ORDER_SCALAR instead of
PACKING_ORDER_VEC3. This doesn't affect much, since in both cases
there is no guarantee of how the matrix will be aligned.
- mat4x2 gets assigned PACKING_ORDER_VEC4 instead of
PACKING_ORDER_VEC2. This is slightly better for the same reason as
in mat2x2.
- mat4x3 gets assigned PACKING_ORDER_VEC4 instead of
PACKING_ORDER_VEC3. This is slightly better for the same reason as
in mat2x2.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Previously, it didn't matter whether structure splitting tried to
split shader ins/outs, because structs were prohibited from being used
for shader ins/outs. However, GLSL 3.00 ES supports varying structs.
In order for varying structs to work, we need to make sure that
structure splitting doesn't get applied to them, because if it does,
then the linker won't be able to match up varyings properly.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This patch replaces the three ir_variable_mode enums:
- ir_var_in
- ir_var_out
- ir_var_inout
with the following five:
- ir_var_shader_in
- ir_var_shader_out
- ir_var_function_in
- ir_var_function_out
- ir_var_function_inout
This eliminates a frustrating ambiguity: it used to be impossible to
tell whether an ir_var_{in,out} variable was a shader in/out or a
function in/out without seeing where the variable was declared in the
IR. This complicated some optimization and lowering passes, and would
have become a problem for implementing varying structs.
In the lisp-style serialization of GLSL IR to strings performed by
ir_print_visitor.cpp and ir_reader.cpp, I've retained the names "in",
"out", and "inout" for function parameters, to avoid introducing code
churn to the src/glsl/builtins/ir/ directory.
Note: a couple of comments in the code seemed to indicate that we were
planning for a possible future in which geometry shaders could have
shader-scope inout variables. Our GLSL grammar rejects shader-scope
inout variables, and I've been unable to find any evidence in the GLSL
standards documents (or extensions) that this will ever be allowed, so
I've eliminated these comments.
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
The case statement purported to handle the addition of ir_var_const_in
and ir_var_inout builtin variables. But no such variables exist.
This patch removes the unnecessary cases, and adds a comment
explaining why they're not needed.
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
For texelFetchOffset(), we just add the texel offsets to the coordinate
rather than using the message header's offset fields. So we don't
actually need a header on Gen5+.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
When possible, glReadPixels calls are performed using the hardware
blitter. However, according to the Ivy Bridge PRM, Vol1 Part4,
section 1.2.1.2 (Graphics Data Size Limitations):
The BLT engine is capable of transferring very large quantities of
graphics data. Any graphics data read from and written to the
destination is permitted to represent a number of pixels that
occupies up to 65,536 scan lines and up to 32,768 bytes per scan
line at the destination. The maximum number of pixels that may be
represented per scan line’s worth of graphics data depends on the
color depth.
With an RGBA32F color buffer (which has 16 bytes per pixel) this
imposes a maximum width of 2048 pixels.
To make matters worse, if the pitch of the buffer is 32k or greater,
intel_miptree_map_blit's call to intelEmitCopyBlit will overflow
intelEmitCopyBlit's src_pitch and dst_pitch parameters (which are
16-bit signed integers).
We can conveniently avoid both problems by avoiding the readpixels
blit path when the miptree's pitch is >= 32k.
Fixes gles3conform "half_float" tests when the buffer width is greater
than 2048.
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
This reverts commit 7824ab8070.
Now that we force linking with LLVM shared libs when building clover,
we can link against libgallium.la with no problems.
If we build clover with LLVM static libraries, then clover and also each
pipe_*.so driver that is built will contain their own static copy of
LLVM. The recent automake changes have uncovered a problem where
the pipe_*.so drivers try to use clover's LLVM symbols. This causes
LLVM's static registry objects to be initialized each time
a pipe_*.so driver is loaded by clover. Initializing these objects
multiple times is not allowed and leads to assertion failures in the
LLVM code.
We can avoid all these problems by having clover and all the pipe_*.so
drivers link against the same LLVM shared library.
https://bugs.freedesktop.org/show_bug.cgi?id=59334https://bugs.freedesktop.org/show_bug.cgi?id=59534
v2:
- Fix shared library detection when LLVM is built with CMake
In order to determine which static LLVM libraries are needed we pass
a list of components to llvm-config and it generates the list of
library dependencies for us. The advantage of only calling llvm-config
one time is that it can determine if two components depend on the same
library and then add it to the output list only once. The old practice
of having each driver call llvm-config to add its own dependencies to
$(LLVM_LIBS) caused many libraries to be added to this variable multiple
times.
Always enable the use of pre-compressed texture data. The ability to
perform on-line compression still requires the presence of libtxc_dxtn
or an explicit driconf over-ride. Applications that just want to submit
precompessed data when an on-line compressor is not available can look
for the GL_EXT_texture_compression_dxt1 and
GL_ANGLE_texture_compression_dxt[35] extensions.
v2: Only enable the extensions that do not require on-line compression
by default. The previous statement "This should not impact many (if
any) real applications." proved to be false for at least Sauerbraten.
This application mostly submits pre-compressed data, but it also can
submit uncompressed data that it asks the driver to compress.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Acked-by: Eric Anholt <eric@anholt.net> [v1]
Acked-by: Lee Salzman <lsalzman@gmail.com>
This is technically outside the ANGLE spec, but it seems unlikely to
cause any harm.
v2: Simplify the extension checks by assuming the ANGLE extension will
always be enabled by any driver that enables the EXT. Suggested by
Eric Anholt.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Lee Salzman <lsalzman@gmail.com>
For non-generic compressed format we assert two things:
1. The format has already been validated against the set of available
extensions.
2. The driver only enables the extension if it supports all of the
formats that are part of that extension.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Similar to the previous commit, we may be using a texture with actual RGBA
storage for the GL_ALPHA format, so force the color values to 0.0.
This commit fixes the following piglit (sub) tests:
EXT_texture_snorm/fbo-blending-formats
GL_ALPHA16_SNORM
GL_ALPHA8_SNORM
GL_ALPHA_SNORM
Note: Haswell bypasses this swizzle code, so may require an independent fix
for this bug.
Reviewed-by: Eric Anholt <eric@anholt.net>
We may be using a texture with actual RGBA storage for these formats, so force
the alpha value read to 1.0.
This commit fixes the following piglit (sub) tests:
ARB_texture_float/fb-blending-formats
GL_RGB16F_ARB
EXT_framebuffer_object/fbo-blending-formats
GL_RGB10
GL_RGB12
GL_RGB16
EXT_texture_snorm/fbo-blending-formats
GL_RGB16_SNORM
GL_RGB8_SNORM
GL_RGB_SNORM
These test improvements depend on the previous commit as well. That commit
smashes alpha to 1.0 for the case of ReadPixels (so fixes "FBO testing" as
reported by this test), while this commit smashes alpha to 1.0 for the case of
texturing (fixed the "window testing" as reported by this test).
Note: Haswell bypasses this swizzle code, so may require an independent fix
for this bug.
Reviewed-by: Eric Anholt <eric@anholt.net>
When performing a ReadPixels operation, we may be reading from a buffer that
stores alpha values, but that is actually representing a buffer with no alpha
channel. In this case, while rebasing the values, touch up all alpha values
read to 1.0.
This commit fixes the following piglit (sub) tests:
ARB_texture_float/fbo-colormask-formats
GL_RBG16F_ARB
EXT_texture_snorm/fbo-colormask-formats
GL_RGB16_SNORM
GL_RGB8_SNORM
GL_RGB_SNORM
It likely improves the results of other tests as well, but a PASS remains
elusive due to additional bugs.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
The renderbuffer's Format field may have an alpha channel even when the
underlying _BaseFormat does not. This can happen when mesa chooses to use
RGBA16 for an RGB16 format, for example.
So look at _BaseFormat when deciding whether to fixup the blend factors.
This test improves the results of at least the following piglit tests:
EXT_frambebuffer_object/fbo-blending-formats
{GL_RGB10, GL_RGB12, GL_RGB16}
EXT_texture_snorm/fbo-blending-formats
{GL_RGB16_SNORM, GLRGB8_SNORM, GL_RGB_SNORM}
But none of these actually change from FAIL to PASS yet. The R, G, and B probe
values are fixed with this commit, but the tests still fail because the alpha
values are still wrong.
Reviewed-by: Eric Anholt <eric@anholt.net>
/ vs \ mismatch was causing .objs to be put in the source tree, causing
breakeage when doing different build types in the same tree (eg., debug
vs release).
Fix this by normalizing everything to / slashes.
It's probably a good idea to purge all .objs from source tree to prevent
issues completely.
Thanks to Fredrik Höglund, all the hard work was already done.
Tested using a modified oglconform (that actually runs these tests on
our driver); it looks like there may be some bugs when using client
arrays. All applicable non-compatibility tests passed.
For now, only enable it in core profiles.
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Ian Romanick <idr@freedesktop.org>
Squashed with two reverts:
Revert "android: Update for builtin_stubs.cpp move"
This reverts commit c0def90ede.
Revert "scons: Update for builtin_stubs.cpp"
This reverts commit 8ac4b82699.
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Tested-on-Android-by: Chad Versace <chad.versace@linux.intel.com>
I don't see how this could have ever worked right.
The screen-space interpolation code uses the vertex->data[pos_attr]
position which contain window coords. But window coords are only
computed for the unclipped vertices; the clipped vertices have
undefined window coords (see draw_cliptest_tmp.h).
Use the vertex clip coords instead which are always defined.
Fixes http://bugs.freedesktop.org/show_bug.cgi?id=55476
(piglit fbo-blit-stretch failure on softpipe)
Note: This is a candidate for the 9.0 branch.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
In debug builds, set clipped vertex window coordinates to NaN values
to help debugging. Otherwise, we're just leaving the coordinate in clip
space and it's invalid to use it later expecting it to be a window coord.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Fixes piglit spec/ARB_sampler_objects/sampler-incomplete and
spec/EXT_texture_swizzle/depth_texture_mode_and_swizzle.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
This error was added in the 3.0.1 update to the OpenGL ES 3.0 spec.
Fixes the updated gles3conform packed_depth_stencil_parameters test.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Fixes gles3conform test CoverageES30. It temporarily regresses some
framebuffer_blit tests, but the failing subcases have been determined to
be invalid for OpenGL ES 3.0.
v2: Fix typo in depth (and stencil) RB checking. Noticed by Ken.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
These were introduced in 2000 during a rework of the TNL module (commit
cab974cf6c), though I'm having a hard time
finding an instance there of one of these Exec functions being changed
at runtime.
Regardless, as far as I can tell now, these functions don't get changed,
by grepping for calls to SET_* to change the dispatch table (we do change
functions in GLvertexformat at runtime, but those don't overlap with
this set of functions). Remove them and just let them be initialized to
the same functions as are in the Exec table.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This cuts out a ton of code to make functions not set to a save_ variant
match.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This will let us copy from the Exec dispatch to deal with our commands that
don't get compiled into display lists.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
I want to drive the Save dispatch table setup from this same function.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
All callers are in Mesa core and all use _gloffset_COUNT, so just rely on
the already baked-in use of _gloffset_COUNT in the function.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
We now have a separate dispatch table for begin/end that prevent these
functions from being entered during that time. The
ASSERT_OUTSIDE_BEGIN_END_WITH_RETVALs are left because I don't want to
change any return values or introduce new error-only stubs at this
point.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This is a step toward getting rid of ASSERT_OUTSIDE_BEGIN_END() in Mesa.
v2: Finish create_beginend_table() comment, move loopback API init into it,
and add a const flag. (suggestions by Brian)
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
My change 7ca4f07b5b caused errors to not
be thrown when they should, because the new if statement for ExecuteFlag
made the CurrentSavePrimitive not get set. And on further review, we
shouldn't be validating our primitive in GL_COMPILE mode, since the
command shouldn't be executed yet.
Partially fixes piglit gl-1.0-beginend-coverage.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This reduces jitter slightly in a cleaner way, without desynchronizing mplayer2 as badly
when falling behind.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
It appears that scons implicit dependency scanners fail to chain
dependencies of generated headers when these are outside the build tree.
This patch ensures generated source files are _always_ put in the build
tree. I'm not 100% this will fix all depency issues, but from my
experiments it does seem to fix this.
NOTE: For this to be effective it is necessary to clean the source tree
from generated header/source files.
Reviewed-by: Brian Paul <brianp@vmware.com>
There really isn't any point. There is no resource savings, and we have
to do gymnastics in the driver to make it work.
There are also bad interactions with multisampling and OpenGL ES 3.0.
In ES3, a multisample-to-singlesample blit must have identical source
and destination format. This means a multisample RGBA8 to singlesample
RGB8 (window) blit will generate an error. Also in ES3, RGB8 is not a
renderable format. This means that the application CANNOT make an RGB8
multisample renderbuffer.
As a result, if an application gets an RGB8 window and wants to do
multisample FBO rendering, it will probably break.
"Fixes" gles3conform
framebuffer_blit_functionality_multisampled_to_singlesampled_blit test
on RGB8 visuals.
v2: Fix 'formats' array size. Suggested by Ken.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Eric Anholt <eric@anholt.net>
[ Squashed port of the following r600g commits: - Michel Dänzer ]
commit 428e37c2da
Author: Marek Olšák <maraeo@gmail.com>
Date: Tue Oct 2 22:02:54 2012 +0200
r600g: add in-place DB decompression and texturing with DB tiling
The decompression is done in-place and only the compressed tiles are
decompressed. Note: R6xx-R7xx can do that only with Z16 and Z32F.
The texture unit is programmed to use non-displayable tiling and depth
ordering of samples, so that it can fetch the texture in the native DB format.
The latest version of the libdrm surface allocator is required for stencil
texturing to work. The old one didn't create the mipmap tree correctly.
We need a separate mipmap tree for stencil, because the stencil mipmap
offsets are not really depth offsets/4.
There are still some known bugs, but this should save some memory and it also
improves performance a little bit in Lightsmark (especially with low
resolutions; tested with Radeon HD 5000).
The DB->CB copy is still used for transfers.
commit e2f623f1d6
Author: Marek Olšák <maraeo@gmail.com>
Date: Sat Jul 28 13:55:59 2012 +0200
r600g: don't decompress depth or stencil if there isn't any
commit 43e226b6ef
Author: Marek Olšák <maraeo@gmail.com>
Date: Wed Jul 18 00:32:50 2012 +0200
r600g: optimize uploading depth textures
Make it only copy the portion of a depth texture being uploaded and
not the whole 2D layer.
There is also a little code cleanup.
commit b242adbe5c
Author: Marek Olšák <maraeo@gmail.com>
Date: Wed Jul 18 00:17:46 2012 +0200
r600g: remove needless wrapper r600_texture_depth_flush
commit 611dd52942
Author: Marek Olšák <maraeo@gmail.com>
Date: Wed Jul 18 00:05:14 2012 +0200
r600g: init_flushed_depth_texture should be able to report errors
commit 80755ff563
Author: Marek Olšák <maraeo@gmail.com>
Date: Sat Jul 14 17:06:27 2012 +0200
r600g: properly track which textures are depth
This fixes the issue with have_depth_texture never being set to false.
commit fe1fd67556
Author: Marek Olšák <maraeo@gmail.com>
Date: Sun Jul 8 03:10:37 2012 +0200
r600g: don't flush depth textures set as colorbuffers
The only case a depth buffer can be set as a color buffer is when flushing.
That wasn't always the case, but now this code isn't required anymore.
commit 5a17d8318e
Author: Marek Olšák <maraeo@gmail.com>
Date: Sun Jul 8 02:14:18 2012 +0200
r600g: flush depth textures bound to vertex shaders
This was missing/broken. There are also minor code cleanups.
commit dee58f94af
Author: Marek Olšák <maraeo@gmail.com>
Date: Sun Jul 8 01:54:24 2012 +0200
r600g: do fine-grained depth texture flushing
- maintain a mask of which mipmap levels are dirty (instead of one big flag)
- only flush what was requested at a given point and not the whole resource
(most often only one level and one layer has to be flushed)
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Use r600_resource_texture::flished_depth_texture for GPU access, and
allocate it in the VRAM. For transfers we'll allocate texture in the GTT
and store it in the r600_transfer::staging.
Improves performance when flushed depth texture is frequently used by the
GPU, e.g. in Lightsmark
[ Ported from r600g commit 3770847960 ]
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
[ Squashed port of the following r600g commits: - Michel Dänzer ]
commit c1e8c845ea
Author: Marek Olšák <maraeo@gmail.com>
Date: Sat Jul 7 19:10:00 2012 +0200
r600g: inline r600_hw_copy_region
commit 4891c5dc64
Author: Marek Olšák <maraeo@gmail.com>
Date: Mon Jun 25 22:53:21 2012 +0200
r600g: inline r600_blit_push_depth and use resource_copy_region
We are going to have a separate resource for depth texturing and transfers
and this is just a transfer thing.
commit da98bb6fc1
Author: Marek Olšák <maraeo@gmail.com>
Date: Mon Jun 25 12:45:32 2012 +0200
r600g: split flushed depth texture creation and flushing
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Allow additional format/type combinations based on the
color render buffer to fix failures with gles3-gtf.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
For GLES2/3 allow reading of pixels with format/type based on:
* GL_IMPLEMENTATION_COLOR_READ_FORMAT
* GL_IMPLEMENTATION_COLOR_READ_TYPE
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
[mattst88] v2: Enable only for ES3 per spec.
[mattst88] v3: Use _mesa_is_gles3 since EXT_color_buffer_float is
ES3-only.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
For now I'm just enabling this on the same subset of hardware that has
OpenGL 3.0 enabled. This same functionality is part of OpenGL 3.0, and
there is no matching desktop extension.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
They're part of GL_OES_depth_texture_cube_map, and we'll always enable
that extension in ES3 contexts.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The vs part hasn't been wired up since tgsi_sse2 was disabled in:
commit 4eb3225b38
Author: José Fonseca <jose.r.fonseca@gmail.com>
Date: Tue Nov 8 00:10:47 2011 +0000
Remove tgsi_sse2.
And it would certainly not work correctly in its current state:
draw/draw_vs_ppc.c: In function ‘draw_create_vs_ppc’:
draw/draw_vs_ppc.c:190:24: warning: assignment from incompatible pointer
type [enabled by default]
As with the sse2 backend, this should be done in llvm anyway.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
configure should warn if libxml2 is not found.
libxml2 is needed by glapi/gen.
Fixes error during build in src/mapi/glapi/gen:
ImportError: No module named libxml2
NOTE: This is a candidate for the 9.0 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31598
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This is required by OpenGL ES 3.0 and desktop OpenGL 4.2. Previous
version were ambiguous. This also matches the behavior of NVIDIA's
closed-source driver (version 304.64).
Fixed gles3conformance test uniform_buffer_object_getactiveuniformsiv
and uniform_buffer_object_structure_and_array_element_names (on my
in-progress branch that fixes a bunch of other stuff...YMMV).
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This is required by OpenGL ES 3.0 and desktop OpenGL 4.2. Previous
version were ambiguous. This also matches the behavior of NVIDIA's
closed-source driver (version 304.64).
Fixed gles3conformance test uniform_buffer_object_getactiveuniform.
Several piglit tests expect glGetActiveUniform to *not* include the [0]
on the end. These tests were already failing on NVIDIA, and this change
regresses them on Mesa. Patches have been sent to the piglit mailing
list to fix the tests.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We currently have a bug in this code, and I don't want to fix it in two
places.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The GLSL 1.40 spec says:
"Uniform block names and variable names declared within uniform
blocks are scoped at the program level."
Track the block name in the symbol table and emit errors when conflicts
exist.
Fixes es3conform's uniform_buffer_object_block_name_conflict test, and
fixes the piglit block-name-clashes-with-{variable,function,struct}.vert
tests.
NOTE: This is a candidate for the 9.0 branch.
v2: Fix bad constructor initialization. Noticed by Topi Pohjolainen.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
About both row_major and column_major layout qualifiers, the GLSL spec
says:
"It only affects the layout of matrices."
However, the OpenGL ES 3.0 conformance tests have taken this to mean it
is an error use it elsewhere. This seems logical given that
'layout(row_major) vec4 foo' is probably not what the programmer meant.
The only catch is dealing with structures that contain matrices. Layout
qualifiers cannot be applied directly to fields of structures, so the
only way to affect the layout of the fields is to apply a qualifier to
the structure declaration itself. There is ongoing debate about this
within Khronos, and it seems to be settling in favor of allowing the
qualifiers on structures. I light of this, I have chosen to allow the
qualifiers on structures but emit a warning since the usage may not be
portable.
Fixes gles3conform test
uniform_buffer_object_layouts_not_for_matrix_type and causes no
regressions.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Between the previous commit and this one, improves GLBenchmark 2.1
offscreen performance by 0.48% +/- 0.24% (n=22, throttling outliers
removed).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This is for GL_ARB_texture_buffer_object_rgb32 support, but it also
causes the format to get used for float32 rgb textures as well on
Ironlake and later. Since that came with some surprises, separate
the change from the enable commit.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We almost never want a stride in pixels -- if you're doing anything with
a stride, you're specifying an offset or incrementing a pointer, and in
both cases you had to multiply by cpp to get the bytes value you wanted.
But worse, on the way to creating a region from a new tiled BO, we
divided by cpp to get pitch in pixels, and for an RGB32 buffer (an
upcoming change) the pitch wouldn't divide exactly, and we'd end up with
a wrong stride in our region.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Vincent Lejeune:
- tgsi to llvm now emits pointers for constants
Tom Stellard:
- Only use texture cache for vtx fetch with compute shaders
- Change address space used for constant loads to match LLVM
backend.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
According to the OpenGL 3.2 Core Profile specification, section 3.8.12:
"For one-, two-, and three-dimensional and one-and two-dimensional array
textures, a texture is mipmap complete if all of the following
conditions hold true:
- [...]
- levelbase <= levelmax [...]
Using the preceding definitions, a texture is complete unless any of
the following conditions hold true:
- [...]
- The minification filter requires a mipmap (is neither NEAREST nor
LINEAR), and the texture is not mipmap complete."
(This text also appears in all GL >= 3.2 specs and the ES 3.0 spec.)
From this, we see that levelbase <= levelmax should only affect mipmap
completeness, not base-level completeness.
Prior versions of GL did not have the notion of mipmap completeness,
simply calling the texture incomplete in this case. But I don't think
we really care.
Fixes es3conform's sgis_texture_lod_basic_completeness test.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
The sampler appears to ignore writemasks (even when correcting the
WRITEMASK_XYZW in brw_vec4_emit.cpp to the proper writemask) and just
always writes all four values.
To cope with this, just texture into a temporary, then MOV out into a
register that has the proper number of components.
NOTE: This is a candidate for stable branches.
Fixes es3conform's shadow_execution_vert.test.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Previously it was left undefined, causing us to select a random LOD.
NOTE: This is a candidate for stable branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
This is purely a refactor. However, in a moment, we'll want to set
lod_type to float for ir_tex, where ir->lod_info.lod is NULL.
NOTE: This is a candidate for stable branches (for the next patch).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Fixes regressions since commit 899017fc54
Author: Kenneth Graunke <kenneth@whitecape.org>
Date: Fri Jan 4 07:53:09 2013 -0800
i965: Use Haswell's sample_d_c for textureGrad with shadow samplers.
That patch assumed that all instances were lowered. However, we weren't
lowering textureGrad() with samplerCubeShadow because I couldn't figure
out the LOD calculations. It turns out they're easy: you just have to
use 1 for the depth. This causes it to pass oglconform's four tests.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Ian Romanick <idr@freedesktop.org>
Now that things mostly seem to work enable those formats.
Some formats cause crashes (notably RGB8 variants) so switch these off
(these crashes are not specific to INT/UINT variants but the state tracker
doesn't use them for UNORM etc. formats so it went unnoticed so far).
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Cast back the fake floats to ints, and make sure we don't try to do scaling
in format conversion (which only makes sense with normalized values).
Also need to disable blending and alpha test (as per spec) for such buffers.
This makes fbo-blending from the piglit ext_texture_integer tests work for most
formats (some crash, and the luminance and intensity variants have the GB or
GBA channels respectively wrong).
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
We were passing in the rt index however this was always 0 for non-independent
blend case. (The format was only actually used to decide if the color mask
covered all channels so this went unnoticed and was discovered by accident.)
Additionally, there was a second problem because we do fixups in the key based
on color buffer format we cannot use non-independent blend anyway as the fixed
up values would never get used.
So always turn non-independent blending into independent.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Filtering of DEPTH_COMPONENT and DEPTH_STENCIL for TEXTURE_3D is already
done in texture_error_check because these combinations aren't allowed on
desktop GL either.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
1. The loop over dest buffers in blit_linear() needed a null pointer
check. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=59499
2. The code to grab the drawRb's format needs to be inside the drawing loop.
3. An equality test was using = instead of == thus messing up a
renderbuffer attachment texture pointer. This lead to memory
corruption and a crash at exit.
Finally, fix a capitalization error NumDrawBuffers -> numDrawBuffers
and change type to unsigned to fix signed/unsigned comparison warnings.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Instead of deriving it from the colour buffer formats only.
Fixes a number of piglit tests which export depth from the pixel shader.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Fixes piglit 'spec/ARB_depth_buffer_float/fbo-clear-formats stencil' crash.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Enabling it for all resources still seems to cause problems, but depth/stencil
buffers are always accessed with tiling by the DB block.
Also, stick to 1D tiling for now. Getting 2D tiling to work properly will
require substantial changes in libdrm_radeon and possibly the kernel as well.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Apart from the obvious cleanup, this makes sure all blocks use the same tiling
mode for accessing the resource.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Currently the use of external firmware is required, with kernel and
userspace firmware needed for all Fermi cards except nvd9. Kepler and nvd9
should only require kernel firmware.
1. Loop over multiple destination color buffers. If we set
glDrawBuffers(GL_FRONT_AND_BACK) we need to loop over multiple color
buffers, blitting to each.
2. Add checks for null src/dst surface pointers. This fixes a crash
in the piglit fbo-missing-attachment-blit test.
See bug http://bugs.freedesktop.org/show_bug.cgi?id=59450
Reviewed-by: Reviewed-by: Marek Olšák <maraeo@gmail.com>
Use os.path.join() rather than hand-rolling it, so path is correct if
sys.argv[0] returns an absolute path.
(According to the python documentation, it's platform dependent whether
sys.argv[0] is a full pathname or not. It probably also depends on how
the process was started...)
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
It seems the other code expects surface[0..1] to be the luma field in interlaced case.
See for example vdpau/surface.c vlVdpVideoSurfaceClear and vlVdpVideoSurfacePutBitsYCbCr.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Writemask was XY instead of YZ (thanks to calim for spotting it).
The pixel calculation resulted in the pixel always being off by one.
If y was .5:
y' = round(y) + 0.5 = 1.5
Fixing this also means the LRP function has to swap the pixels it, since
it's now the other way around for top/bottom.
WIth these fixes only chroma for top and bottom pixel rows are wrongly interpolated
in my test program:
--- nvidia
+++ nouveau
@@ -1,4 +1,4 @@
-YCbCr[0] = 00c080
+YCbCr[0] = 00b070
YCbCr[1] = 00b070
YCbCr[2] = 029050
YCbCr[3] = 207050
@@ -61,4 +61,4 @@
YCbCr[60] = 0c5070
YCbCr[61] = c05090
YCbCr[62] = 0e70b0
-YCbCr[63] = e080c0
+YCbCr[63] = e070b0
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Use this method in _mesa_GetInternalformativ for both GL_SAMPLES and
GL_NUM_SAMPLE_COUNTS.
v2: internalFormat may not be color renderable by the driver, so zero
can be returned as a sample count. Require that drivers supporting the
extension provide a QuerySamplesForFormat function. The later was
suggested by Eric Anholt.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Though, I'm tempted to always expose this extension when
GL_ARB_framebuffer_object is exposed. In that case, it would share the same
enable bit.
v2: Correctly sort extension names. Suggested by Eric Anholt.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
This is for the GL_ARB_internalformat_query extension and GLES 3.0.
v2: Generate GL_INVALID_OPERATION if the extension is not supported.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
OpenGL 4.2 specification suggests rounding the float data to nearest
integer when the type of internal state is integer. Out of range floats
should be clamped to {INT_MIN, INT_MAX}. This is not specified anywhere
in gl/gles spec but below test expects this behavior. This patch makes
gles3 conformance sgis_texture_lod_basic_getter.test pass.
A GL spec bug will be raised to include clamping of out of range floats.
V2: Round float to nearest integer for all cases where
_mesa_Texparameterf() converts float param to int. Use the same block of
float to int conversion code for GL_TEXTURE_SWIZZLE_{R,G,B,A}_EXT cases
as well.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This patch fixes a blitting case when drawAttachment->Texture ==
readAttachment->Texture. It was causing an assertion failure in
intel_miptree_attach_map() with gles3 conformance test case:
framebuffer_blit_functionality_minifying_blit
Number of changes in this file look scary. But most of them are caused
by introducing a big for loop to support rendering to multiple color
draw buffers.
V2: Fixed a case when number of draw buffer attachments are zero.
V3: Put a for loop in blit_nearest() and blit_linear() functions in to
support blitting to multiple color draw buffers.
V4: Remove variable declaration in for loop to avoid MSVC compilation
issues.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This patch adds required error checking in _mesa_BlitFramebuffer() when
blitting to multiple color render targets. It also fixes a case when
blitting to a framebuffer with renderbuffer/texture attached to
GL_COLOR_ATTACHMENT{i} (where i!=0). Earlier it skips color blitting if
nothing is found attached to GL_COLOR_ATTACHMENT0.
V2: Fixed a case when number of draw buffer attachments are zero.
V3: Do compatible_color_datatypes() and compatible_resolve_formats()
check for all the draw renderbuffers in fbobject.c. Fix debug code
at bottom of _mesa_BlitFramebuffer() to handle MRTs. Combine error
checking code for linear blits with other color blit error checking.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This allows query on default framebuffer in
glGetFramebufferAttachmentParameteriv() for gles3. Fixes unexpected GL
errors in gles3 conformance test case:
framebuffer_blit_functionality_multisampled_to_singlesampled_blit
V2: Use _mesa_is_gles3() check to restrict allowed attachment types to
specific APIs.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This patch enables blitting to multiple color attachments of a
framebuffer. It also fixes a case when blitting to a framebuffer with
renderbuffer/texture attached to non-zero attachment point
i.e. GL_COLOR_ATTACHMENT{1, 2, ...}. Earlier we were incorrectly
blitting to GL_COLOR_ATTACHMENT0 by default.
V2: Use intel_copy_texsubimage() for blitting only if all the color
attachments can blit using it.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This patch rewrites _mesa_meta_BlitFrameBuffer() function to add support
for blitting with GLSL/GLSL ES shaders. These changes were required to
support glBlitFrameBuffer() in gles3. This patch, along with other
patches in this series, make 16 failing framebuffer_blit test cases in
gles3 conformance pass.
V2: Properly handle flipped blits for source and destination
renderbuffer / textures. Add support for GL_TEXTURE_RECTANGLE in
_mesa_meta_BlitFrameBuffer. Create a temp depth texture to support
depth buffer blitting.
V3: Remove unsupported / redundant shader code. Add an assertion to make
sure that we don't use rectangle texture in ES. Put API guard on
glTexEnvi().
V4: For gles3: Don't use ReadPixels or CopyTexImage2D to blit depth
buffer. gles3 spec says for CopyTexImage2D that "color buffer
components can be dropped during the conversion to internalformat,
but new components cannot be added." So, use the internal format of
read renderbuffer to create texture for color buffer blitting.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
V2:
If mask has GL_STENCIL_BUFFER_BIT set, the depth formats for
readRenderBuffer and drawRenderBuffer must match unless one of the two
buffers doesn't have depth, in which case it's not blitted, so the
format check should be ignored. Same comment goes for stencil formats
in depth renderbuffers if mask has GL_DEPTH_BUFFER_BIT set.
v3 (Kayden): Refactor code to be a bit more readable.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
In ES 3.0, when calling glDrawBuffers() on the window system
framebuffer, the only valid targets are GL_NONE or GL_BACK. Since there
is no stereo rendering in ES 3.0, this is a single buffer, unlike
desktop where it may be two (and thus isn't allowed).
For single-buffered configs, GL_BACK ironically means the front (and
only) buffer. I'm not sure that it matters, however, as ES shouldn't
have front buffer rendering in the first place.
Fixes es3conform framebuffer_blit_coverage_default_draw_buffer_binding.
v2: Update GLES3 spec reference.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> [v2]
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
I didn't notice this due to a noobed piglit run. It wasn't previously
noticed because the patch was only run on a driver that supported GLES3.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
At process exit DLL_PROCESS_DETACH is signaled to DllMain(), where then
a final cleanup is triggered. In stw_cleanup() code is triggered that
tries to communicate a shutdown to the spawned threads -- however at
that time those threads have already been terminated by the OS and so
the process hangs.
v2: skip stw_cleanup_thread() too
Signed-off-by: José Fonseca <jfonseca@vmware.com>
Fixes error EGL_BAD_ATTRIBUTE in the tests below on Intel Sandybridge:
* piglit egl-create-context-verify-gl-flavor, testcase OpenGL ES 3.0
* gles3conform, revision 19700, when runnning GL3Tests with -fbo
This plumbing is added in order to comply with the EGL_KHR_create_context
spec. According to the EGL_KHR_create_context spec, it is illegal to call
eglCreateContext(EGL_CONTEXT_MAJOR_VERSION_KHR=3) with a config whose
EGL_RENDERABLE_TYPE does not contain the EGL_OPENGL_ES3_BIT_KHR. The
pertinent
portion of the spec is quoted below; the key word is "respectively".
* If <config> is not a valid EGLConfig, or does not support the
requested client API, then an EGL_BAD_CONFIG error is generated
(this includes requesting creation of an OpenGL ES 1.x, 2.0, or
3.0 context when the EGL_RENDERABLE_TYPE attribute of <config>
does not contain EGL_OPENGL_ES_BIT, EGL_OPENGL_ES2_BIT, or
EGL_OPENGL_ES3_BIT_KHR respectively).
To create this patch, I searched for all the ES2 bit plumbing by calling
`git grep "ES2_BIT\|DRI_API_GLES2" src/egl`, and then at each location
added a case for ES3.
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
If the hardware/driver combo supports GLES3, then set the GLES3 bit in
intel_screen's bitmask of supported DRI API's. Neither the EGL nor GLX
layer uses the bit yet.
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This enum corresponds to EGL_OPENGL_ES3_BIT_KHR.
Neither the GLX nor EGL layer use the enum yet.
I don't like the GLES bits. I'd prefer that all GLES APIs be exposed
through a single API bit, as is done in GLX_EXT_create_context_es_profile.
But, we need this GLES3 enum in order to do the plumbing necessary to
correctly support EGL_OPENGL_ES3_BIT_KHR as required by the
EGL_KHR_create_context spec.
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Each driver (i830, i915, i965) used independent but similar code to
validate the requested context version. With the rececnt arrival of GLES3,
that logic has needed an update. Rather than apply identical updates to
each drivers validation code, let's just move the validation into the
shared routine intelInitContext.
This refactor required some incidental changes to functions
i830CreateContext and intelInitContext. For each function, this patch:
- Adds context version parameters to the signature.
- Adds a DRI_CTX_ERROR out param to the signature.
- Sets the DRI_CTX_ERROR at each early return.
Tested against gen6 with piglit egl-create-context-verify-gl-flavor.
Verified that this patch does not change the set of exposed EGL context
flavors.
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Before this patch, intelInitScreen2 set DRIScreen::api_mask with the hacky
heuristic below:
if (gen >= 3)
api_mask = GL | GLES1 | GLES2;
else
api_mask = 0;
This hack was likely broken on gen2 (i830), but I don't care enough to
properly investigate. It appears that every EGLConfig on i830 has
EGL_RENDERABLE_TYPE=0, and thus eglCreateContext will never succeed.
Anyway, moving on to living drivers...
With the arrival of EGL_OPENGL_ES3_BIT_KHR, this heuristic is now
insufficient. We must enable the GLES3 bit if and only if the driver is
capable of creating a GLES3 context. This requires us to determine the
maximum supported context version supported by the hardware/driver for
each api *during initialization of intel_screen*.
Therefore, this patch adds four new fields to intel_screen which indicate
the maximum supported context version for each api:
max_gl_core_version
max_gl_compat_version
max_gl_es1_version
max_gl_es2_version
The api mask is now correctly set as:
api_mask = GL;
if (max_gl_es1_version > 0)
api_mask |= GLES1;
if (max_gl_es2_version > 0)
api_mask |= GLES2;
Tested against gen6 with piglit egl-create-context-verify-gl-flavor.
Verified that this patch does not change the set of exposed EGL context
flavors.
v2:
- Replace the if-tree on gen with a switch, for Ian.
- Unconditionally enable the DRI_API_OPENGL bit, for Ian.
v3:
- Drop max gl version to 1.4 on gen3 if !has_occlusion_query,
because occlusion queries entered core in 1.5. For Ian.
v4:
- Drop ES2 version back to 2.0 due to rebase (Ian).
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick.intel.com>
I'm not sure if this is the correct fix. The
_mesa_es_error_check_format_and_type function (used above in the ES 1
and 2 cases) was originally added for glTexImage checking and allows
GL_DEPTH_STENCIL/GL_UNSIGNED_INT_24_8 combinations. Using it in ES 3
causes other tests to regress.
Fixes es3conform's packed_depth_stencil_error test.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
INVALID_ENUM is for when the type is simply not known.
Fixes part of es3conform's packed_depth_stencil_error test.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
ES 3 specifies some formats as texture-only (i.e., not available for
renderbuffers).
See the "Required Texture Formats" section (pg 126) of the ES 3 spec.
v2: Allow RED and RG float rendering in core profiles The check used to
be (version > 30) || (compat profile w/extensions). Just deleting
<version > 30) broke 3.0+ core profiles.
Fixes es3conform's color_buffer_unsupported_format test.
Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
According to both the GL 3.0 and ES 3.0 specifications (table 2.7 for GL
and table 2.8 for ES), the default value of BUFFER_ACCESS_FLAGS is
supposed to be zero.
Note that there are two related quantities: the obsolete BUFFER_ACCESS
enum and the new BUFFER_ACCESS_FLAGS bitfield.
BUFFER_ACCESS can only be GL_READ_ONLY, GL_WRITE_ONLY, or GL_READ_WRITE;
BUFFER_ACCESS_FLAGS can easily represent all three via GL_MAP_WRITE_BIT,
GL_MAP_READ_BIT, and their logical or. It also supports more flags.
Thus, Mesa only stores the bitfield, and simply computes the old enum
when queried, via simplified_access_mode(bufObj->AccessFlags).
The tricky part is that, while BUFFER_ACCESS_FLAGS defaults to 0,
BUFFER_ACCESS defaults to GL_READ_WRITE for desktop [GL 3.0, table 2.8]
and GL_WRITE_ONLY_OES for ES [the GL_EXT_map_buffer_range extension].
Mesa tried to implement this by setting the default AccessFlags to
GL_MAP_READ_BIT | GL_MAP_WRITE_BIT on desktop, and GL_MAP_WRITE_BIT on
ES. But in all specifications, it needs to be 0.
This patch moves that logic into simplified_access_mode(): when
AccessFlags == 0, it now returns GL_READ_WRITE for desktop and
GL_WRITE_ONLY for ES 1/2. (BUFFER_ACCESS doesn't exist on ES 3.0,
so it's irrelevant there.)
With that in place, it changes the AccessFlags default to 0.
Fixes three es3conform tsets:
- copy_buffer_defaults
- map_buffer_range_modify_indices
- pixel_buffer_object_default_parameters
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Perhaps most importantly, this patch adds comments quoting the relevant
spec paragraphs above each error condition.
It also makes three changes:
- For FBOs, GL_COLOR_ATTACHMENTm where m >= MaxDrawBuffers is supposed
to generate INVALID_OPERATION (not INVALID_ENUM).
- Constants that refer to multiple buffers (such as FRONT, BACK, LEFT,
RIGHT, and FRONT_AND_BACK) are supposed to generate INVALID_OPERATION,
not INVALID_ENUM.
- In ES 3.0, for FBOs, buffers[i] must be NONE or GL_COLOR_ATTACHMENTi
or else INVALID_OPERATION occurs. (This is a new restriction.)
Fixes es3conform's draw-buffers-api test.
v2: The error path was missing a "return" like all the other error
paths. Also, we may as well call it glDrawBuffers in the error message
since the ARB suffix doesn't exist in ES 3.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The specification requires that query results are processed in order, (when
one query result is returned, all previous query of the same type must also be
available). The implementation was failing this requirement in the case of
BeginQuery and EndQuery with no intervening drawing, (the result would be made
available immediately without flushing previous queries).
This fixes the following es3conform test:
occlusion_query_query_order
as well as the following piglit test:
occlusion_query_order
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This allows for avoiding the occlusion query erroneously accumulating results
during the meta operation. This functionality is made conditional on a new
MESA_META_OCCLUSION_QUERY bit so that meta-operations which should generate
fragments can continue to get the current behavior.
The implementation of glClear is specifically augmented to request the flag
since glClear is specified to not generate fragments.
This fixes the following es3conform tests:
occlusion_query_draw_occluded.test
occlusion_query_clear
occlusion_query_custom_framebuffer
occlusion_query_stencil_test
occlusion_query_discarded_fragments
As well as the following piglit test:
occlusion_query_meta_no_fragments
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This flag allows for the specified behavior that GenQueries reserves a name,
but does not associate an object with it until BeginQuery. We allocate the
object immediately with the new EverBound flag set to false, and then set the
flag to true at the time of BeginQuery.
This allows us to implement a conformant IsQuery function by checking the
state of the new EverBound flag.
This fixes the following es3conform tests:
occlusion_query_genqueries
occlusion_query_is_query_nonzero
and the following piglit test:
occlusion_query_lifetime
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The reference to "correct, see spec" was a bit too vague to be useful,
(particularly since the language being referenced here changes between OpenGL
3.1 and OpenGL 4.3).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
These are optimizations which make MSAA a lot faster.
The MSAA work is complete with this commit. (except for enablement of AA
optimizations for RGBA16F, for which a patch is ready and waiting until
the kernel CS checker fix lands)
MSAA can't be made any faster as far as hw programming is concerned.
The catch is only one process and one colorbuffer can use the optimizations
at a time. There usually is only one MSAA colorbuffer, so it shouldn't be
an issue.
Also, there is a limit on the size of MSAA colorbuffer resolution in terms
of megapixels. If the limit is surpassed, the AA optimizations are disabled.
The limit is:
- 1 Mpix on low-end and some mid-level chipsets (1024x768 and 1280x720)
- 2 Mpix on some mid-level chipsets (1600x1200 and 1920x1080)
- 3 or 4 Mpix on high-end chipsets (2048x1536 or 2560x1600, respectively)
It corresponds to the number of raster pipes (= GB pipes) available, each pipe
can hold 1 Mpix of AA compression data.
If it's enabled, the driver prints to stdout:
radeon: Acquired access to AA optimizations.
This reverts commit 4148a29ed8.
This is a work-around for bug:
https://bugs.freedesktop.org/show_bug.cgi?id=59334
We really should be linking against libgallium.la instead of
libgallium.a, but until we can figure why linking against libgallium.la
causes runtime failures in clover we will continue to link against
libgallium.a
Acked-by: Andreas Boll <andreas.boll.dev@gmail.com>
Tested-by: Aaron Watry <awatry@gmail.com>
I think the conditional always evaluates to false.
If I understand the code in core Mesa correctly, depthBits or stencilBits
is 0 if the depth or stencil renderbuffer is NULL, respectively.
Reviewed-by: Brian Paul <brianp@vmware.com>
Since automake changes, softpipe and llvmpipe are mutually exclusive at link
time. This doesn't make much sense to me as we can choose between them at
run-time using GALLIUM_DRIVER.
Creating library file: .libs/libGL.dll.a
.libs/xlib.o: In function `sw_screen_create_named':
/jhbuild/checkout/mesa/mesa/src/gallium/targets/libgl-xlib/../../../../src/gallium/auxiliary/target-helpers/inline_sw_helper.h:35:
undefined reference to `_softpipe_create_screen'
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Brian Paul <brianp@vmware.com>
Call Driver.AllocTextureImageBuffer rather than calling
Driver.TexImage with NULL data, format=GL_NONE and type=GL_NONE.
This avoids setting ctx->Unpack, which can lead to incorrectly
trying to upload data.
The GLES3 GTF program's packed_pixels_pbo test was triggering
an error for i965 with the previous code.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This function checks for ES3 compatible
format/type/internalFormat/dimension combinations.
[jordan.l.justen@intel.com: additional tweaks for gles3-gtf]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fixes a runtime error:
glxgears: symbol lookup error: /home/brian/mesa/lib/gallium/libGL.so.1: undefined symbol: clock_gettime
v2: use $(CLOCK_LIB) and $(PTHREAD_LIBS) per Andreas Boll.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
The hardware does not support a render target without an alpha channel.
So when the user creates a render buffer with no alpha channel, there actually
is storage available for alpha internally. It requires special care to
avoid these unwanted alpha bits from causing any problems.
Specifically, when blending, and when the blend factors would read the
destination alpha values, this commit coerces the blend factors to instead be
either 0 or 1 as appropriate.
A similar fix was made for pre-gen6 hardware in commit eadd9b8e and this
commit shares the fixup function written by Ian then.
This commit the following es3conform test:
rgb8_rgba8_rgb
As well as the following piglit (sub) tests:
EXT_framebuffer_object/fbo-blending-formats/3
EXT_framebuffer_object/fbo-blending-formats/GL_RGB
EXT_framebuffer_object/fbo-blending-formats/GL_RGB8
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
We used to keep the color buffers in the dri_buffers array and
swap __DRI_BUFFER_BACK_LEFT and __DRI_BUFFER_FRONT_LEFT around there
and swap third_buffer in in case we needed to triple buffer. That
gets a little fidgety with all the swaps, so lets use the
color_buffers pool like the gbm platform does. We track the color buffers,
their corresponding wl_buffer and locked status here and just plug
a free one into dri2_surf->buffers when we need to.
This is a nice clean-up in itself, but it also sets us up to track
buffer age in the color_buffers structs.
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
RB3D_DEBUG_CTL doesn't help, so I resolve to a tiled temporary texture and
then blitting it to the destination one, which we also do in other situations.
The handling of the CAP is broken in st/mesa anyway. Let's just kill it.
This commit pretty much enables fast Z clear for FBOs with Z24S8.
The driver falls back to clearing with a quad if the fast clear cannot be
used. It can still do fast color clear, for example.
Effectively this path would always assert. Move the break statement to
the (probable) intended place.
Note: This is a candidate for the stable branches.
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Include LLVM_LDFLAGS when building with LLVM. Fixes the following build
errors:
CXXLD swrast_dri.la
/usr/bin/ld: cannot find -lLLVMR600CodeGen
/usr/bin/ld: cannot find -lLLVMR600Desc
/usr/bin/ld: cannot find -lLLVMR600Info
/usr/bin/ld: cannot find -lLLVMR600AsmPrinter
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
According to bug #54524, I regressed oglconform's multicontext test
when I reenabled the fragment shader precompile.
However, these test cases only passed by miraculous coincedence. We
assign each fragment program a unique ID (brw_fragment_program::id which
becomes brw_wm_prog_key::program_string_id) which we obtain by storing a
per-context counter.
The test case uses GLX context sharing to access the same fragment
program from two different contexts. This means that we share a program
cache. Before the precompile, if both contexts happened to use the same
shaders in the same order, we'd obtain the same program_string_ids (by
virtue of doing the same computation twice). However, the more likely
scenario is that they completely disagree on program_string_id.
This meant that we'd have two completely different fragment shaders in
the cache with the same ID, tricking us to think they were the same
(aside from NOS), so we'd render using the wrong program.
This patch implements a simple fix suggested by Eric: it moves the
global counter out of brw_context and into intel_screen, which is shared
across all contexts. A mutex protects it from concurrent access.
This is also the first direct usage of pthreads in the i965 driver.
Fixes 10 subcases of oglconform's multicontext test.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54524
Reviewed-by: Eric Anholt <eric@anholt.net>
Technically, variable sized arrays are a required feature of C99,
redacted to be optional in C11, and not actually part of C++ whatsoever.
Gcc allows using them in C++ unless you specify -pedantic, and Clang
appears to allow them for simple/POD types.
exec_list is arguably POD, since it doesn't have virtual methods, but I
can see why Clang would be like "meh, it's a C++ struct, say no", seeing as
it's meant to support C99.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58970
Reviewed-by: Matt Turner <mattst88@gmail.com>
The simulator gets very angry about our i2b code:
cmp.ne(16) g3<1>D g2<0,1,0>D 0F
We can't mix integer DWord and float types. The only reason to use 0F
here was to share code with f2b. Split it and use 0D instead.
While we don't believe anything bad will actually happen because of
this, it's nice to fix the warnings and easy enough to do.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Often when debugging, I don't want to see SIMD16 shaders. It makes
INTEL_DEBUG=vs/fs output much easier to read, especially when a program
dumps many shaders. Plus, I also want to verify that SIMD8 works before
even considering SIMD16.
v2: Fix the likeliness check (caught by Chris and Eric).
Reviewed-by: Eric Anholt <eric@anholt.net>
Choose MESA_FORMAT_ARGB2101010 when storing
GL_RGBA + GL_UNSIGNED_INT_2_10_10_10_REV or
GL_RGB + GL_UNSIGNED_INT_2_10_10_10_REV.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The ARB_get_program_binary spec says "OpenGL 3.0 is required." The
nearly identical OES_get_program_binary extension is available for
OpenGL ES 2.0, so I don't see how / why OpenGL 3.0 is a requirement for
the ARB version. Let's just enable whenever GL_ARB_shader_objects is
available.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
After recent changes in the XML, the dispatch generators will expect
this function to be named _mesa_ProgramParameteri.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
There were two bugs here. First, this and several other queries were
not available in a desktop GL context with GL_ARB_ES2_compatibility.
Second, GL_NUM_SHADER_BINARY_FORMATS returns zero, but
GL_SHADER_BINARY_FORMATS writes one element of data to the buffer. If
NUM is zero, no data should be written.
Fixes piglit test 'arb_get_program_binary-overrun shader'.
NOTE: This is a candidate for stable release branches.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This adds TBO support to r600g, and with GLSL 1.40 enabled,
we now get 3.1 core profiles advertised for r600g.
The r600/700 implementation is a bit different from the evergreen one,
as r6/7 hw lacks vertex fetch swizzles. So we implement it by passing 5
constants per sampler to the shader, the shader uses the first 4 as masks
for each component and the 5th as the alpha value to OR in.
Now TXQ is also broken so we have to pass a constant for the buffer size,
on evergreen we just pass this, on r6/7 we pass it as the 6th element
in the const info buffer.
v1.1: drop return as DDX doesn't use a texture type
v2: add r600/700 support.
Signed-off-by: Dave Airlie <airlied@redhat.com>
This adds 12 more constant buffers for use as UBOs,
along with adding relative constant fetching for 2D indices.
This with GLSL 1.40 enabled passes all the same tests as softpipe
on my evergreen system.
Signed-off-by: Dave Airlie <airlied@redhat.com>
First we test that line continuations are honored within a comment, (as
recently changed in glcpp), then we test that line continuations can be
disabled via an option within the context. This is tested via the new support
for a test-specific command-line option passed to glcpp.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Previously, we were only supporting line-continuation backslash characters
within lines of pre-processor directives, (as per the specification). With
OpenGL 4.2 and GLES3, line continuations are now supported anywhere within a
shader.
While changing this, also fix a bug where the preprocessor was ignoring
line continuation characters when a line ended in multiple backslash
characters.
The new code is also more efficient than the old. Previously, we would
perform a ralloc copy at each newline. We now perform copies only at each
occurrence of a line-continuation.
This commit fixes the line-continuation.vert test in piglit.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This will allow testing of disabled line-continuation on a case-by-case basis,
(with the option communicated to the preprocessor via the GL context).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This will allow the test exercising disabled line continuations to arrange
for the --disable-line-continuations argument to be passed to the standalone
glcpp.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
As the preprocessor becomes more sophisticated and gains more optional
behavior, it's easiest to just pass the GL context pointer to it so that
it can examine any fields there that it needs to (such as API version,
or the state of any driconf options, etc.).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This application is known to contain shaders that:
1. Have a stray backslash as the last line of comment lines
2. Have a declaration immediately following that line
Hence, interpreting that backslash as a line continuation causes the
declaration to be hidden and the shader fails to compile. Fortunately, the
shaders also:
3. Do not have any other intentional line-continuation characters
So disabling line continuations entirely for the application fixes this
problem without causing any other breakage.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This is to enable a quirk for Savage2 which includes a shader with a stray '\'
at the end of a comment line. Interpreting that backslash as a line
continuation will break the compilation of the shader, so we need a way to
disable this.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Previously this was happening unconditionally, leading to some excessive
rebuilding/relinking during builds.
Note that the .po files are not automatically updated due to changes to the
t_options.h file. Instead, translators should continue to use "make po"
manually. This is because after new strings are merged into the existing .po
file, manual work is still required by translators to ensure that the
translations are correct.
Previously, the xmlpool directory had a lone Makefile to assist poeple in
manually invoking a deep make in order to update the translations in
options.h. We can observe that this wasn't happening in fact, (new
translations had been added to de.po without being generated into options.h,
and new options had been manually added directly to options.h rather than to
t_options.h).
Prevent both of these problems from occurring in the future by automatically
generating options.h as part of the standard build of mesa.
For this, the generated options.h is now removed from version control, (along
with Makefile in favor of Makefile.am).
[chadv: Port the Autotools changes to Android.]
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
As can be seen, many other translation strings already include a single
apostrophe just fine without any escaping. This strangely-escaped apostrophe
was causing a build failure ("invalid escape sequence") resulting in no "de"
translations in the final options.h file.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The gen_xmlpool.py script would work correctly only when executed from the
directory that contained the script. This shortcoming was due to some
hard-coded paths in the script.
In order to easily invoke the script from the Android build system, we
must be able to execute the script from an arbitrary directory. To enable
that, this patch replaces the two hard-coded paths with new command line
arguments.
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
These translations have existed in the de.po file, but were not in the
generated options.h file. This was fixed by simply running "make options.h".
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
For the last two most-recently-added driconf options, their definition was
manually added to options.h, a file which is intended to be automatically
generated, (as part of support for translated driconf option
descriptions). This means that these options would be eliminated if the
generation step were performed again.
Fix this by correctly adding the definitions of these options to t_options.h,
(the file used as input to the generator), and not the options.h file, which
is generated.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This fixes several duplicate symbol errors.
libllvmradeon is a simple helper library. If it requires symbols in
other libraries, this should be taken care of by the gallium target that
uses it (e.g. libr600.la)
This requires some derived state. The cut vertex used is either the
value specified by glPrimitiveRestartIndex or it's hard-coded to ~0.
The derived state gl_array_attrib::_RestartIndex captures this value.
In addition, the derived state gl_array_attrib::_PrimitiveRestart is set
whenever either gl_array_attrib::PrimitiveRestart or
gl_array_attrib::PrimitiveRestartFixedIndex is set.
v2: Use _mesa_is_gles3.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
The GLSL ES 3.0 spec (Section 12.17) says:
"GLSL ES 1.00 removed token pasting and other functionality."
NOTE: This is a candidate for the stable branches.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Simply emitting a nicely-formatted error message if any undefined macro is
encountered in a parser context expecting an expression.
With this commit, the following piglit test now passes:
spec/glsl-es-3.00/compiler/undefined-macro.vert
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This can be triggered either by creation of a GLES context (with
api == API_OPENGLES2) or else by a #version directive with version
value 100 or with a string of "es" following the version value.
There's no behavioral change with this commit—just preparation for ES-specific
behavior in the preprocessor in the future.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fixes the following build error:
CXXLD egl_gallium.la
g++: error: ../../../../src/egl/wayland/wayland-drm/.libs/.libs/libwayland-drm.a: No
such file or directory
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
We get int/uint clear color value in this case, and util_pack_color can't
handle these formats at all (even if it could, float input color isn't what
we want).
Pass through the color union appropriately and handle the packing ourselves
(as I couldn't think of a good generic util solution).
This gets piglit fbo_integer_precision_clear and
fbo_integer_readpixels_sint_uint from the ext_texture_integer test group from
segfault to pass (which only leaves fbo-blending from that group not working).
v2: fix up comments
Need to bitcast the float border color (luckily we already get
the color as int just disguised as float).
Fixes piglit texwrap GL_EXT_texture_integer bordercolor.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Change the texel type to int/uint instead of float throughout the sampling
code which makes it easier to catch errors (as llvm will complain about wrong
types if we mistakenly treat these values as real floats somewhere).
This should also get things like e.g. sampler swizzles (for unused channels)
right.
This fixes piglit texture_integer_glsl130 test.
Border color not working (crashing) yet.
(These formats are not exposed yet in llvmpipe.)
v2: couple cleanups according to José's comments
Reviewed-by: José Fonseca <jfonseca@vmware.com>
C++ linking (controlled by the nodist_EXTRA idiom) is needed
unconditionally for:
nouveau (uses C++ in the driver)
r300 (since LLVM is always required)
radeonsi (since LLVM is always required)
swrast (if builting LLVM pipe)
and conditionally (depends whether LLVM is enabled) for
i915
r600
vmwgfx
and never needed for swrast (softpipe).
Unfortunately, automake seems to *always* link with C++ if nodist_EXTRA
is specified, even inside a false conditional. Not sure if this is a
bug, but it does seem to be weird behavior.
v2: Johannes Obermayr <johannesobermayr@gmx.de>
- Fix some undefined symbols.
v3: Johannes Obermayr <johannesobermayr@gmx.de>
- Install pipe_* to $(libdir)/gallium-pipe.
v4: Johannes Obermayr <johannesobermayr@gmx.de>
- Build it only once on --enable-gallium-gbm / --enable-opencl.
v2: Andreas Boll <andreas.boll.dev@gmail.com>
- add missing xvmc state tracker to _LIBADD variable
v3: Andreas Boll <andreas.boll.dev@gmail.com>
- Provide compatibility with scripts for the old Mesa build system
v2: Andreas Boll <andreas.boll.dev@gmail.com>
- Add missing xvmc state tracker to _LIBADD variable
v3: Andreas Boll <andreas.boll.dev@gmail.com>
- Provide compatibility with scripts for the old Mesa build system
v2: Andreas Boll <andreas.boll.dev@gmail.com>
- Add missing xvmc state tracker to _LIBADD variable
v3: Andreas Boll <andreas.boll.dev@gmail.com>
- Provide compatibility with scripts for the old Mesa build system
v2: Andreas Boll <andreas.boll.dev@gmail.com>
- Add missing xvmc state tracker to _LIBADD variable
v3: Andreas Boll <andreas.boll.dev@gmail.com>
- Provide compatibility with scripts for the old Mesa build system
v2: Andreas Boll <andreas.boll.dev@gmail.com>
- Add missing vdpau state tracker to _LIBADD variable
v3: Andreas Boll <andreas.boll.dev@gmail.com>
- Provide compatibility with scripts for the old Mesa build system
v2: Andreas Boll <andreas.boll.dev@gmail.com>
- Add missing vdpau state tracker to _LIBADD variable
v3: Andreas Boll <andreas.boll.dev@gmail.com>
- Provide compatibility with scripts for the old Mesa build system
v2: Andreas Boll <andreas.boll.dev@gmail.com>
- Add missing vdpau state tracker to _LIBADD variable
v3: Andreas Boll <andreas.boll.dev@gmail.com>
- Provide compatibility with scripts for the old Mesa build system
v2: Andreas Boll <andreas.boll.dev@gmail.com>
- Add missing vdpau state tracker to _LIBADD variable
v3: Andreas Boll <andreas.boll.dev@gmail.com>
- Provide compatibility with scripts for the old Mesa build system
v2: Andreas Boll <andreas.boll.dev@gmail.com>
- Add missing vdpau state tracker to _LIBADD variable
v3: Andreas Boll <andreas.boll.dev@gmail.com>
- Provide compatibility with scripts for the old Mesa build system
v2: Johannes Obermayr <johannesobermayr@gmx.de>
Fix some undefined symbols.
v3: Johannes Obermayr <johannesobermayr@gmx.de>
Build it -shared to fix egl_gallium.so on r600/radeonsi builds.
The function was named badly and wasn't in the dispatch table,
making it hard to find.
Fixes transform_feedback2_states and gets a few other transform
feedback tests closer to working in es3conform.
Reviewed-by Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
For glGetIntegerv, add support for the following in an OpenGL ES 3.0
context:
GL_MAJOR_VERSION
GL_MINOR_VERSION
GL_NUM_EXTENSIONS
See Table 6.29 of the OpenGL ES 3.0 spec.
Fixes error GL_INVALID_ENUM in piglit egl-create-context-verify-gl-flavor,
testcase for OpenGL ES 3.0.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
The ES 3 spec says that the minumum allowable value is 2^24-1, but the
GL 4.3 and ARB_ES3_compatibility specs require 2^32-1, so return 2^32-1.
Fixes es3conform's element_index_uint_constants test.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
From GL/GLES/GL_CORE and GLES2 -> GL/GL_CORE/GLES2.
Yes, we really were exposing ES2_compatibility queries on ES 1.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fixes the transform_feedback2_init_defaults test from es3conform.
The ES 3 spec lists these as TRANSFORM_FEEDBACK_PAUSED and
TRANSFORM_FEEDBACK_ACTIVE.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
but an MSAA resource is bound. This effectively makes the MSAA disable switch
not affect rasterization, but it still affects the alpha-to-one and
alpha-to-coverage states. This hardware just lacks a proper MSAA disable
switch.
This fixes graphics corruption in sauerbraten.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59194
In most cases, the width, height, and depth of the physical surface
used by the driver to implement a texture or renderbuffer is equal to
the logical width, height, and depth exposed to the client through
functions such as glTexImage3D(). However, there are two exceptions:
cube maps (which have a physical depth of 6 but a logical depth of 1)
and multisampled renderbuffers (which have larger physical dimensions
than logical dimensions to allow multiple samples per pixel).
Previous to this patch, we accounted for the difference between
physical and logical surface dimensions at inconsistent places in the
call graph (multisampling was accounted for in
intel_miptree_create_for_renderbuffer(), and cubemaps were accounted
for in intel_miptree_create_internal()). As a result, it wasn't
always clear, when calling a miptree creation function, whether
physical or logical dimensions were needed. Also, we weren't
consistent about storing logical dimensions in the intel_mipmap_tree
structure (we only did so in the
intel_miptree_create_for_renderbuffer() code path, and we did not
store depth).
This patch refactors things so that intel_miptree_create_internal() is
responsible for converting logical to physical dimensions and for
storing both the physical and logical dimensions in the
intel_mipmap_tree structure. As a result, all miptree creation
functions interpret their arguments as logical dimensions, and both
physical and logical dimensions are always available to functions that
work with intel_mipmap_trees.
In addition, it renames the fields in intel_mipmap_tree used to store
the dimensions, so that it is clear from the name whether physical or
logical dimensions are being referred to.
This should fix the following bugs:
- When creating a separate stencil surface for a depthstencil cubemap,
we would erroneously try to convert the depth from 1 to 6 twice,
resulting in an assertion failure.
- When creating an MCS buffer for compressed multisampling, we used
physical dimensions instead of logical dimensions, resulting in
wasted memory.
In addition, this should considerably simplify the implementation of
ARB_texture_multisample, because it moves the code to compute the
physical size of multisampled surfaces out of renderbuffer-only code.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
This allows intel_miptree_alloc_mcs() to force Y tiling for the MCS
buffer. Previously we accomplished this by the hack of passing
INTEL_MSAA_LAYOUT_CMS as the msaa_layout parameter, but that parameter
is going to be going away soon.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
No functional change. This patch moves the compute_msaa_layout()
function earlier in intel_mipmap_tree.c so that it can be used by
other functions in that file.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
I erroneously added this back in January 2011 in commit 88421589.
Looking at the commit message, I have no idea why I added it. It only
added non-array structure fields to the symbol table, so array structure
fields are treated correctly.
Fixes piglit tests structure-and-field-have-same-name.vert and
structure-and-field-have-same-name-nested.vert. It should also fix
WebGL conformance tests shader-with-non-reserved-words.
NOTE: This is a candidate for the stable release branches.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57622
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
R6xx doesn't work - the issue seems to be with flushing (sometimes
the destination buffer contains garbage). There are no hangs, so we're good.
R7xx doesn't seem to have any alignment restriction despite our initial
thinking. Everything just works.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
This patch enhances the varying packing code so that flat varyings of
uint, int, and float types can be packed together.
We accomplish this in lower_packed_varyings.cpp by making the type of
all flat varyings ivec4, and then using information-preserving type
conversions (e.g. ir_unop_bitcast_f2i) to convert all other types to
ints.
The varying_matches::compute_packing_class() function is updated to
reflect the fact that varying packing no longer needs to segregate
varyings of different base types.
Fixes piglit test varying-packing-mixed-types.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: Split lower_packed_varyings_visitor::bitwise_assign into
pack/unpack variants.
The GLSL 1.30 spec only allows vertex shader outputs and fragment
shader inputs ("varyings" in pre-GLSL-1.30 parlance) to be of type
int, uint, float, or vectors, matrices, or arrays thereof. Bools,
bvec's, and structs are prohibited. (Integral varyings were
prohibited prior to GLSL 1.30).
Previously, Mesa only performed this check on variables declared with
the "varying" keyword, and it always performed the check according to
the pre-GLSL-1.30 rules. As a result, bools and structs were allowed
to slip through, provided they were declared using the new in/out
syntax.
This patch modifies the error check so that it occurs after "varying"
is converted to "in/out", and corrects it to properly account for GLSL
version.
Fixes piglit tests:
in-bool-prohibited.frag
in-bvec2-prohibited.frag
in-bvec3-prohibited.frag
in-bvec4-prohibited.frag
in-struct-prohibited.frag
out-bool-prohibited.vert
out-bvec2-prohibited.vert
out-bvec3-prohibited.vert
out-bvec4-prohibited.vert
out-struct-prohibited.vert
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This patch adds logic to allow the ast_to_hir function
apply_type_qualifier_to_variable() to tell whether it is acting on a
variable declaration or a function parameter. This will allow it to
correctly interpret the meaning of "out" and "in" keywords (which have
different meanings in those two contexts).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
linker.cpp is getting pretty big, and we're about to add even more
varying packing code, so split out the linker code that concerns
varyings to its own file.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Previously this macro existed in 3 separate places, some inside the
intel driver and some outside of it. It makes more sense to have it
in main/macros.h
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
When analyzing a loop where the loop condition is expressed in the
non-standard order (e.g. "4 > i" instead of "i < 4"), we were
reversing the condition incorrectly, leading to a loop bound that was
off by 1.
Fixes piglit tests {vs,fs}-loop-bounds-unrolled.shader_test.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Haswell and later support the GL_FIXED and 2_10_10_10_rev vertex formats
natively, and don't need shader workarounds.
Reviewed-by: Eric Anholt <eric@anholt.net>
* We have a symbol conflict as rtasm in
Mesa collides with rtasm in gallium.
* As us linking gallium and mesa together
is an edge case, lets just omit the rtasm
code from Mesa as we should be going
llvmpipe soon :)
This is not as optimized as r600g - the MSAA compression is missing,
so r300g needs a lot of bandwidth (more than r600g to do the same thing).
However, if the bandwidth is not an issue for you, you can enjoy this
unoptimized MSAA support.
The only other missing optimization for MSAA is the fast color clear.
MSAA is enabled on r500 only, because that's the only GPU family I tested.
That said, MSAA should work on r300 and r400 as well (but you must set
RADEON_MSAA=1 to allow it, then turn MSAA on in your app or set GALLIUM_MSAA=n,
n >= 2, n <= 6)
I will enable the support by default on r300-r400 once someone (other than me)
tests those chipsets with piglit.
The supported modes are 2x, 4x, 6x.
The supported MSAA formats are RGBA8, BGRA8, and RGBA16F (r500 only).
Those 3 formats are used for all GL internal formats.
Tested with piglit. (I have ported all MSAA tests to GL2.1)
Fixes this build error on platforms not using GNU indent.
indent: Command line: ``-T'' requires a parameter
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
v2: Move {RED,RG,RGB,RGBA}_SNORM changes from the previous commit to
this commit. Based on suggestions from Ken.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
The OpenGL 3.2 core profile spec says:
"The following base internal formats from table 3.11 are
color-renderable: RED, RG, RGB, and RGBA. The sized internal formats
from table 3.12 that have a color-renderable base internal format
are also color-renderable. No other formats, including compressed
internal formats, are color-renderable."
The OpenGL 3.2 compatibility profile spec says (only ALPHA is added):
"The following base internal formats from table 3.16 are
color-renderable: ALPHA, RED, RG, RGB, and RGBA. The sized internal formats
from table 3.17 that have a color-renderable base internal format
are also color-renderable. No other formats, including compressed
internal formats, are color-renderable."
Table 3.12 in the core profile spec and table 3.17 in the compatibility
profile spec list SNORM formats as having a base internal format of RED,
RG, RGB, or RGBA. From this we infer that they should also be color
renderable.
The OpenGL ES 3.0 spec says:
"An internal format is color-renderable if it is one of the formats
from table 3.12 noted as color-renderable or if it is unsized format
RGBA or RGB. No other formats, including compressed internal
formats, are color-renderable."
In the OpenGL ES 3.0 spec, none of the SNORM formats have "color-
renderable" marked in table 3.12. The RGB I and UI formats also are not
color-renderable in ES3, but we'll save that change for another patch.
Both NVIDIA's closed-source driver (version 304.64) and AMD's
closed-source driver (Catalyst 12.6 on HD 3650) reject *all* SNORM
formats for renderbuffers in OpenGL 3.3 compatibility profiles.
v2: Move {RED,RG,RGB,RGBA}_SNORM changes from the this commit to the
next commit. Based on suggestions from Ken.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-04 17:39:05 -08:00
760 changed files with 24017 additions and 17426 deletions
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=55175">Bug 55175</a> - Memoryleak with glPopAttrib only on Intel GM45</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=56442">Bug 56442</a> - glcpp accepts junk after #else/#elif/#endif tokens</li>
<li><ahref="https://bugs.freedesktop.org/show_bug.cgi?id=56706">Bug 56706</a> - EGL sets error to EGL_SUCCESS when DRI driver fails to create context</li>
<li>libdrm, A user-space library that interfaces with drm. Most distros ship with this driver. Safest bet is really to replace the system one. Optionally you can point LIBDRM_CFLAGS and LIBDRM_LIBS to the libdrm-2.4.22 package in toolchain. But here, we replace:
<li>libdrm, a user-space library that interfaces with drm.
Most distros ship with this but it's safest to install a newer version.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.