2738 Commits

Author SHA1 Message Date
Tomasz Lis
9f07ca11c1 mesa: Dispatch ARB_framebuffer_object and EXT_framebuffer_object differently
Almost all of the functions between the ARB and the EXT share the same
GLX protocol because the functionality is, essentially, identical.
However, there are some differences between the extensions:

- In the ARB extension, names must come from glGenFramebuffers or
  glGenRenderbuffers.

- In the ARB extension, framebuffer objects are not shared (but they are
  in the EXT).

For these reasons, glBindFramebuffer and glBindRenderbuffer have
different GLX protocol opcodes than their EXT counterparts.  Currently
these functions alias each other in the dispatch table.  This makes it
impossible to be truly spec conformant.

This patch enables fixing the conformance issue by splitting
glBindFramebuffer / glBindFramebufferEXT and glBindRenderbuffer /
glBindRenderbufferEXT into separate dispatch table entries.
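
As a rough illustration of what "separate dispatch table entries" means, the sketch below gives the ARB and EXT entry points their own slots, so each can be routed to a different handler.  The table layout, slot names, and handlers are hypothetical; Mesa's real dispatch table is generated and far larger.

```c
#include <assert.h>

/* Hypothetical handler type; not Mesa's real function signature. */
typedef int (*bind_fn)(unsigned target, unsigned name);

/* One slot per entry point: the EXT function is no longer an alias. */
enum slot {
   SLOT_BIND_FRAMEBUFFER,
   SLOT_BIND_FRAMEBUFFER_EXT,
   SLOT_COUNT
};

struct dispatch_table {
   bind_fn entries[SLOT_COUNT];
};

/* Toy handlers: the return value only identifies which path ran. */
static int bind_framebuffer_arb(unsigned target, unsigned name)
{
   (void)target; (void)name;
   return 1;
}

static int bind_framebuffer_ext(unsigned target, unsigned name)
{
   (void)target; (void)name;
   return 2;
}

static void init_dispatch(struct dispatch_table *t)
{
   t->entries[SLOT_BIND_FRAMEBUFFER] = bind_framebuffer_arb;
   t->entries[SLOT_BIND_FRAMEBUFFER_EXT] = bind_framebuffer_ext;
}

static int call_slot(const struct dispatch_table *t, enum slot s,
                     unsigned target, unsigned name)
{
   assert(s < SLOT_COUNT);
   return t->entries[s](target, name);
}
```

With aliased entries both calls would reach the same handler; with split slots the ARB path can enforce the stricter naming rules without affecting EXT callers.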

Patches will be available shortly to:

- Fix the conformance issue.

- Stop advertising the EXT in OpenGL 3.1 (or core profiles).

HOWEVER, this does represent a compatibility break between the loader
(libGL or the Xserver GLX module) and the driver.  Mesa drivers compiled
without this change will request a single dispatch table entry for
glBindFramebuffer and glBindFramebufferEXT.  Since the updated loader
has different entries for each, the request will fail, and the driver
will die in a fire.

Drivers built with the change should continue to load fine on loaders
without the change.  In this case, the driver will separately ask for
entries for glBindFramebuffer and glBindFramebufferEXT, and the loader
will tell it the same location.  Since the loader in the server's GLX
module is not (yet) updated, this should not be a problem.  We also do
not advertise the ARB extension from the server, so, again, this should
not be a problem for the server.

HOWEVER, this means that DRI1 drivers (remember mga_dri.so?) will no
longer load with libGL built hereafter.  That means this patch will need
to be backported to the 8.0 branch.

v2 (idr): Added missing GLX protocol opcodes for the EXT functions and
corrected the opcodes for the ARB functions.  Updated GLX indirect_api
unit test and dispatch sanity unit test.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Bartosz Zawistowski <bartosz.l.zawistowski@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
2013-07-18 17:42:46 -07:00
Kenneth Graunke
adfd0123c8 st/mesa: Enable the ARB_shading_language_420pack extension for 1.30+.
Any driver that supports GLSL 1.30 should be able to handle this
extension, as it's entirely implemented in the GLSL compiler.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
46d9baf3e3 i965: Enable the GL_ARB_shading_language_420pack extension on Gen6+.
While all the work is in the shared GLSL compiler, this extension
requires GLSL 1.30, which is currently only supported on Gen6+.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
bfcec4618a glsl: Handle the binding qualifier for UBO variables.
layout(binding = N) is equivalent to calling glUniformBlockBinding(_,N).

This currently only handles the GLSL 1.40 case - no interface names, no
arrays of uniform blocks.  This is okay since we don't yet support GLSL
1.50, and don't expose ARB_shading_language_420pack in ES 3.0.

v2: Move into the other function; use binding, not constant_value.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
f25d94084c glsl: Propagate UBO binding qualifier into UBO member variables.
Without an instance name, there is no ir_variable representing the
actual uniform block declaration.  When the linker goes to set uniform
initializers, it only sees the members as ir_variables; never the block.

So, unfortunately, the members need to know about the binding.

There has to be a better way to do this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
34e2ccc9f0 glsl: Handle the binding qualifier for arrays of samplers.
Normally, uniform array variables are initialized by array literals.
That is, val->type->array_elements >= storage->array_elements.

However, samplers are different.  Consider a declaration such as:

   layout(binding = 5) uniform sampler2D[3];

The initializer value is a single integer (5), while the storage has 3
array elements.  The proper behavior here is to increment one for each
element; they should be initialized to 5, 6, and 7.

This patch introduces new code for sampler types which handles both
arrays of samplers and single samplers correctly.
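
The incrementing behavior described above can be sketched as follows.  The helper name and its parameters are hypothetical and are not the linker's actual code; a single sampler is treated as an array of one element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: give each element of a sampler array its own
 * texture unit, starting at the single integer from layout(binding = N). */
static void set_sampler_bindings(int *units, unsigned array_elements,
                                 int binding)
{
   assert(units != NULL);
   for (unsigned i = 0; i < array_elements; i++)
      units[i] = binding + (int)i;
}
```

For a three-element array with binding = 5, this fills the units with 5, 6, and 7, matching the example in the message.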

v2: Move into the other function; use binding, not constant_value.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
67038c6ba2 glsl: Add plumbing for handling uniform binding qualifiers.
Sampler uniforms and uniform blocks do not have a var->constant_value.
Instead, they have an integer var->binding value.

This makes extending set_uniform_initializer() somewhat problematic: it
assumes that there is an ir_constant * which represents the initializer,
and that it's safe to dereference that without any NULL checks.

Instead, this patch creates an analogous function for binding
qualifiers, and calls one or the other as appropriate.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
0a23ec2b6e glsl: Delete unused code for handling samplers in array-initializers.
There is existing code to handle sampler uniform initializers.  Prior to
GLSL 4.20's "binding" keyword, sampler uniforms don't have initializers
at all, so this is somewhat surprising.

The existing code is broken into two cases: one where both the variable and
initializer are arrays, and a second where the variable and initializer are
scalars.

The first case should never occur, since array-typed initializers do not
exist for sampler uniforms.  Even with the binding keyword, the
initializer is a single integer which represents the texture unit to use
for the first array element.

The second is apparently used for some fixed-function code.

v2: Rewrite the commit message - suggested by Paul.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
9a9a830b44 glsl: Cross-validate explicit binding points.
All compilation units need to agree on the binding point, if they
specify one at all.
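
A minimal sketch of that rule, assuming a hypothetical per-variable record with an "explicit" flag and an integer binding (the real linker works on ir_variable and reports a link error on mismatch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one declaration of a uniform in one unit. */
struct var_binding {
   bool explicit_binding;   /* was layout(binding = N) given? */
   int binding;             /* only meaningful if explicit_binding */
};

/* Two declarations agree unless both specify a binding and the
 * values differ. */
static bool bindings_agree(const struct var_binding *a,
                           const struct var_binding *b)
{
   assert(a != NULL && b != NULL);
   if (!a->explicit_binding || !b->explicit_binding)
      return true;
   return a->binding == b->binding;
}
```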

v2: Use binding, not constant_value.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
d4375fc016 glsl: Propagate explicit binding information from AST to IR.
Rather than creating a new "binding" field in ir_variable, we reuse
constant_value since the linker code for handling uniform initializers
uses that.

Since UBOs and samplers can't otherwise have initializers/constant
values, there shouldn't be a conflict.

v2: Propagate the new binding variable around too.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
4da1504c0f glsl: Add ir_variable fields for explicit bindings.
These are not used yet, but they exist and are copied appropriately.

v2: Add an explicit "int binding" variable rather than reusing
    constant_value, as suggested by Paul Berry.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
5e5e12040b glsl: Add validation for the "binding" qualifier.
The "binding" qualifier only applies to UBO blocks and samplers, along
with arrays of those types.  (It would also apply to images and atomic
counters, but we don't support those yet.)

This also validates sampler bindings against the maximum number of
texture units, and UBO bindings against the number of uniform buffer
binding points.
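
The range check might look roughly like the sketch below.  The function name is hypothetical, and the assumption that an array must fit entirely below the limit (binding + count <= max) is this sketch's reading, not something the message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical check: 'count' is the number of consecutive binding
 * points needed (1 for a non-array), 'max_bindings' is the limit for
 * the resource type (texture units or UBO binding points). */
static bool binding_in_range(int binding, unsigned count,
                             unsigned max_bindings)
{
   assert(count >= 1);
   if (binding < 0 || count > max_bindings)
      return false;
   /* Written as a subtraction to avoid unsigned overflow. */
   return (unsigned)binding <= max_bindings - count;
}
```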

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
0418846a07 glsl: Parse the "binding" keyword and store it in ast_type_qualifier.
Nothing actually uses this yet.

v2: Remove >= 0 checks.  They'll be handled in later validation.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
7f6a2d6937 glsl: Have the lexer return LAYOUT_TOK if 420pack is enabled.
GL_ARB_shading_language_420pack also provides layout qualifiers.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
56bcde34b2 glsl: Use has_layout() rather than a partial open coded version.
The idea of this code is to disallow layout(...) sections with the
deprecated "varying" or "attribute" keywords, unless a few select
extensions are enabled which allow a more relaxed check.

In order to detect a layout(...) section, the code checks for a number
of layout qualifiers.  However, it failed to check for all of them,
which could lead to layout(...) not being detected when it should.

By replacing this with has_layout(), we properly check for all layout
qualifiers, and also guarantee that new qualifiers added in the future
will not be forgotten.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
c397ec94e9 glsl: Relax auxiliary storage ordering requirements with 420pack.
These were already semi-relaxed, since the storage qualifier rule
already skipped when 420pack was enabled.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
b5d6c51e2b glsl: Handle centroid qualifier ordering in C code, not the parser.
The GL_ARB_shading_language_420pack extension/GLSL 4.20 split centroid
off into a new category, "auxiliary storage qualifiers," and allow these
to be placed anywhere in the series.  So we have to stop recognizing
"centroid in"/"centroid out"/"centroid varying" in the grammar and get
more creative.

The same approach used before works here, too.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
844307a584 glsl: Allow precision qualifiers to be flexibly ordered with 420pack.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
6eec502e84 glsl: Move precision handling to be part of qualifier handling.
This is necessary for the parser to be able to accept precision
qualifiers not immediately adjacent to the type, such as "const highp
inout float foo".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
308d4c7146 glsl: Change is_precision_statement to default_precision != none.
Currently, we store precision in ast_type_specifier, rather than
ast_type_qualifier.  This works because precision is the last qualifier,
and immediately adjacent to the type.

Default precision statements (such as "precision highp float") are
represented as ast_type_specifier objects, with a boolean to indicate
that it's a default precision statement rather than an ordinary type.

ast_type_specifier::precision will be moving to ast_type_qualifier soon,
in order to support arbitrary qualifier ordering.  However, we still
need to store a "this is a precision statement" flag /and/ the default
precision in ast_type_specifier.

This patch changes the boolean into a new field, default_precision.
If default_precision != ast_precision_none, it's a precision statement
with the specified precision.  Otherwise, it's an ordinary type.
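
A small sketch of the idea, using a simplified stand-in for ast_type_specifier.  Only ast_precision_none and default_precision come from the message; the struct layout and the other enumerator names are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed enumerators; only ast_precision_none is named above. */
enum ast_precision {
   ast_precision_none = 0,
   ast_precision_high,
   ast_precision_medium,
   ast_precision_low
};

/* Simplified stand-in for ast_type_specifier. */
struct type_specifier {
   const char *type_name;
   enum ast_precision default_precision;
};

/* The single field answers both questions: "is this a default
 * precision statement?" and "which precision does it set?". */
static bool is_precision_statement(const struct type_specifier *spec)
{
   assert(spec != NULL);
   return spec->default_precision != ast_precision_none;
}
```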

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
7855482138 glsl: Disable ordering checks for const parameters with 420pack.
This makes the compiler accept both "const in" and "in const".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
293dfe5738 glsl: Handle "const" as a parameter qualifier.
This will make it easy to support both "const in" and "in const", as
required by GLSL 4.20/ARB_shading_language_420pack.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
a4d15a3cd9 glsl: Refactor parameter qualifier handling.
"Parameter direction qualifier" is a new term I invented just now; it's
not part of any GLSL specification.

This paves the way for handling multiple parameter qualifiers, in any
order, as required by GLSL 4.20/ARB_shading_language_420pack.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
83fe4f7019 glsl: Use merge_qualifier() when processing qualifier lists.
Most of ast_type_qualifier is simply a bitfield (represented as a
structure of unsigned:1 bits in a union with an unsigned).  However, it
also contains ARB_explicit_attrib_location's location/index fields.

In the past, this has worked by simply returning the layout qualifier's
ast_type_qualifier and merging the other bits into it.  However, that's
not obvious until you break it by switching $1 and $2.

Using merge_qualifier() copies them appropriately, and also properly
overrides layout qualifiers.  It also checks for duplicate qualifiers,
which renders some of the checks in the previous patch unnecessary.
However, those checks provide better error messages, such as "Duplicate
interpolation qualifier", rather than just "duplicate qualifier".
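
To make the merge semantics concrete, here is a simplified sketch.  It uses a plain flag mask instead of the bitfield/union representation, and every name in it (the Q_* flags, struct qualifier, this merge_qualifier signature) is hypothetical rather than the real ast_type_qualifier API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical qualifier flags. */
#define Q_IN       (1u << 0)
#define Q_SMOOTH   (1u << 1)
#define Q_CENTROID (1u << 2)

struct qualifier {
   unsigned flags;      /* one bit per simple qualifier */
   bool has_location;   /* layout(location = N) present? */
   int location;
};

/* Merge src into dst.  Returns false on a duplicate qualifier; a
 * location in src overrides any location already in dst, so the
 * result does not depend on which operand carried the layout data. */
static bool merge_qualifier(struct qualifier *dst,
                            const struct qualifier *src)
{
   assert(dst != NULL && src != NULL);
   if (dst->flags & src->flags)
      return false;
   dst->flags |= src->flags;
   if (src->has_location) {
      dst->has_location = true;
      dst->location = src->location;
   }
   return true;
}
```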

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
0cb90fcfbd glsl: Allow duplicate layout qualifiers with 420pack.
The new 4.20 rules explicitly allow multiple layout(...) sections.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
89f75e7e7b glsl: Disable ordering checks on most qualifiers for 420pack.
This makes the compiler accept invariant, storage, layout, and
interpolation qualifiers in any order when ARB_shading_language_420pack
is enabled.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
48e3bd33dc glsl: Handle most qualifier ordering in C code rather than the grammar.
The GL_ARB_shading_language_420pack extension/GLSL 4.20 allow qualifiers
to be specified in (basically) any order.  In order to support this, we
can't hardcode the ordering restrictions in the grammar.

This patch alters the grammar to accept invariant, storage, layout, and
interpolation qualifiers in any order, but adds C code to enforce the
ordering requirements.  In the 420pack case, we should be able to simply
skip the error checks.

As a bonus, this also lets us generate decent error messages, rather
than Bison's awful "unexpected TOKEN" errors.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
1b719df14d glsl: Add a new ast_type_qualifier::has_auxiliary_storage() method.
"Auxiliary storage qualifiers" is the new term given to "centroid",
"patch", and "sample" by GLSL 4.20/GL_ARB_shading_language_420pack.

Even though we only support "centroid", it's useful to add this now
so that all auxiliary storage qualifiers get handled in the right places
once they're eventually supported.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
eb30af51d6 glsl: Add a new ast_type_qualifier::has_storage() method.
This makes it easy to check if any storage qualifiers are set.

"centroid" is not considered a storage qualifier.  In the old language
rules, you can't specify "centroid" by itself; it's always "centroid
in", "centroid out", or "centroid varying."  So one of the other storage
qualifiers will always be set; there's no need to specifically check for
centroid.

In the new 4.20 rules, centroid is an auxiliary storage qualifier, not a
storage qualifier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
7cef2b22b8 glsl: Add a new ast_type_qualifier::has_layout() method.
This makes it easy to check if any layout qualifiers are set.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:21 -07:00
Kenneth Graunke
7ce5c6b214 i965: Combine URB code emission into a single group.
All four URB packets need to be programmed together in order for the GPU
state to be valid.  Putting them in separate BEGIN..ADVANCE blocks is
risky: if we're nearing the end of a batch, the batch could be flushed
between two of the commands, causing the URB programming to be split
into two batchbuffers.

This -might- be okay with hardware contexts, but it offers no advantages
over keeping them together, and has a potential for hangs.

Putting them into a single BEGIN..ADVANCE block ensures they'll be kept
in the same batch, which seems wise.
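
The reasoning can be sketched with a toy batch buffer.  This is not the i965 BEGIN_BATCH/ADVANCE_BATCH implementation; the struct, the tiny buffer size, and the function names are assumptions chosen to show why one reservation cannot straddle a flush.

```c
#include <assert.h>
#include <stdint.h>

#define BATCH_DWORDS 16   /* deliberately tiny for illustration */

struct batch {
   uint32_t map[BATCH_DWORDS];
   unsigned used;      /* dwords already emitted */
   unsigned flushes;   /* how many times the batch was submitted */
};

static void batch_flush(struct batch *b)
{
   b->used = 0;
   b->flushes++;
}

/* Reserve n dwords as one group.  If they do not fit, flush first, so
 * the whole group always lands in a single batch. */
static uint32_t *batch_begin(struct batch *b, unsigned n)
{
   assert(n <= BATCH_DWORDS);
   if (b->used + n > BATCH_DWORDS)
      batch_flush(b);
   uint32_t *out = &b->map[b->used];
   b->used += n;
   return out;
}
```

With 14 dwords used, two separate 2-dword reservations would put the first packet at the end of one batch and the second at the start of the next; a single 4-dword reservation flushes up front and keeps both together.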

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-18 16:57:21 -07:00
Chad Versace
30f33deccb i965/hsw: Change L3 MOCS for depth, hiz, and stencil
Change from "not cacheable" to "cacheable" in L3.
Do so for the draw upload path and blorp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 16:18:22 -07:00
Chad Versace
2273b652bb i965/hsw: Change L3 MOCS of 3DSTATE_CONSTANT_VS/PS
Change from "not cacheable" to "cacheable" in L3.
Do so for the draw upload path and blorp.

In blorp, change only the PS packet, because the VS packet is disabled.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 16:18:22 -07:00
Chad Versace
2f346395f5 i965/hsw: Change L3 MOCS of SURFACE_STATE
Change from "not cacheable" to "cacheable" in L3.
Do so for the draw upload path and blorp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 16:18:21 -07:00
Chad Versace
a16d47465e i965/hsw: Change L3 MOCS of 3DSTATE_VERTEX_BUFFERS
Change from "not cacheable" to "cacheable" in L3.
Do so for the draw upload path and blorp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 16:18:21 -07:00
Tomasz Lis
eb83079b35 glx: Enable floating-point fbconfig extensions
Signed-off-by: Tomasz Lis <listom@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Ian Romanick
74cbe6e497 egl: Drop configs with unknown or invalid __DRI_ATTRIB_RENDER_TYPE
Some render types, such as floating-point, aren't valid with EGL.
Return NULL in those cases to drop them.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
c37c367d38 dri: Introduce new flags in __DRI_ATTRIB_RENDER_TYPE
Mark __DRI_ATTRIB_FLOAT_MODE as deprecated, and introduce new flags to
__DRI_ATTRIB_RENDER_TYPE for float modes.  Both signed float
(fbconfig_float) and unsigned (packed_float) are introduced. The old
attribute should be set for both float modes.

v2 (idr): Require that the render mode from the DRI attributes matches the
render mode of the config exactly.  This is the behavior of the old code.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
4473af7aca glx: Require proper drawableType in init_fbconfig_for_chooser
Make sure that init_fbconfig_for_chooser sets correct value of
drawableType for visual configs and fbconfigs.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
2eed9ff2fb glx: Validate the GLX_RENDER_TYPE value
Correctly handle the value of renderType in GLX context.  In case of the
value being incorrect, context creation fails.

v2 (idr): indirect_create_context is just a memory allocator, so don't
validate the GLX_RENDER_TYPE there.  Fixes regressions in several
GLX_ARB_create_context piglit tests.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
27c8aa5cfb glx: Store the RENDER_TYPE in indirect rendering
v2 (idr): Open-code the check for GLX_RENDER_TYPE.
dri2_convert_glx_attribs can't be called from here because that function
only exists in direct-rendering builds.  Also add a stub version of
indirect_create_context_attribs to tests/fake_glx_screen.cpp to prevent
'make check' regressions.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
1c748dff6b glx: Handling RENDER_TYPE in glXCreateContext and init_fbconfig_for_chooser
Set the correct values of renderType in glXCreateContext and
init_fbconfig_for_chooser.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
b8126c7c8a glx: Changes to visual configs initialization.
Correctly handle the value of renderType and drawableType in
fbconfig. Modify glXInitializeVisualConfigFromTags to read the parameter
value, or detect it if it's not there.

v2 (idr): If there was no GLX_RENDER_TYPE property, set the type based
purely on the rgbMode as the previous code did.  It is impossible for
floatMode to be set at this point, so we can't have a float config.  The
previous code regressed a large number of piglit GLX tests because those
tests don't set GLX_RENDER_TYPE in the glXChooseConfig call.  Restoring
the old behavior for that case fixes those regressions.

Also fix handling of GLX_DONT_CARE for GLX_RENDER_TYPE.  Fixes a
regression in glx-dont-care-mask.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
a92cd5b245 glx: Retrieve the value of RENDER_TYPE from GLX attribs array
Make sure that context creation routines are provided with the value of
RENDER_TYPE retrieved from GLX attribs.

v2 (idr): Minor formatting changes.  Change type of
dri2_convert_glx_attribs render_type parameter to uint32_t to silence
some GCC warnings.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
36259a16fe glx: Store the value of renderType while creating context
Make sure that renderType property value is stored in GLX context while
it's being created.  Further patches will be provided to make the value
correspond to fbconfig's renderType.

v2 (idr): Move a hunk from the next patch to this patch to prevent a
build break.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Kenneth Graunke
7791c9869b i965: Add #defines for Memory Object Control State fields on Gen7-7.5.
The L3 controls are identical on all platforms, but LLC differs:
- Ivybridge has a "cache in LLC" flag
- Baytrail has no LLC, but instead has a snoop bit:
  "data accesses in this page must be snooped in the CPU caches."
- Haswell has writeback/uncached flags for LLC and eLLC (eDRAM).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 16:03:19 -07:00
Fabian Bieler
6368478712 glsl/linker: Use correct array length when linking inter-stage uniforms and varyings.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Fabian Bieler <fabianbieler@fastmail.fm>
2013-07-18 14:12:44 -07:00
Mike Frysinger
73c9b4b0e0 gen_matypes: fix cross-compiling with gcc
The current gen_matypes logic assumes that the host compiler will produce
information that is useful for the target compiler.  Unfortunately, this
is not the case whenever cross-compiling.

When we detect that we're cross-compiling and using GCC, use the target
compiler to produce assembly from the gen_matypes.c source, then process
it with a shell script to create a usable header.  This is similar to how
the Linux kernel creates its asm-offsets.c file.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2013-07-18 13:55:48 -07:00
Andreas Oberritter
a48be954ce ax_prog_flex.m4: change grep syntax to accept e.g. flex.real
This is required in case a wrapper or symlink is used. This patch
has also been sent upstream, awaiting moderation.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andreas Oberritter <obi@saftware.de>
2013-07-18 13:54:59 -07:00
Jonathan Liu
2da0bd0526 builtin_compiler/build: Avoid using libtool if cross compiling
Adds the dependencies of builtin_compiler as sources when cross
compiling instead of using libtool to share compilation with src/glsl.
The builtin_compiler executable is built for the host when cross
compiling so it doesn't make sense to share compilation with src/glsl
built for the target in this case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44618
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jonathan Liu <net147@gmail.com>
2013-07-18 13:54:20 -07:00
Kenneth Graunke
2b5b436615 i965: Add MOCS shift and mask for SURFACE_STATE entries.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 10:45:49 -07:00
Roland Scheidegger
4ef19f7fec llvmpipe: clamp inputs for srgb render buffers
Usually with fixed point renderbuffers clamping is done as part of conversion.
However, since we blend in float format, we essentially skip all conversion
steps pre-blend but since this is still a fixed point renderbuffer we must
still clamp the inputs in this case. Makes no difference for piglit though.
Obviously we could skip this if fragment color clamping is enabled, but a)
this is deprecated in OpenGL (d3d never had it) and b) we don't support it
natively so it gets baked into the shader.
Also add some comment about logic ops being broken for srgb, luckily no test
tries to do that as there's no easy fix...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-07-18 19:04:20 +02:00
Roland Scheidegger
e57b98bad3 llvmpipe: fix blending with SRC_ALPHA_SATURATE with some formats without alpha
We were fixing up the blend factor to ZERO, however this only works correctly
with fixed point render buffers where the input values are clamped to 0/1
(because src_alpha_saturate is min(As, 1-Ad) so can be negative with unclamped
inputs). Haven't seen any failure anywhere due to that with fixed point SNORM
buffers (which clamp inputs to -1/1) but it should apply there as well (snorm
blending is rare, even OpenGL 4.3 doesn't require snorm rendertargets at all,
d3d10 requires them but they are not blendable).
Doesn't look like piglit hits this though (some internal testing hits the
float case at least). (With legacy OpenGL we could theoretically still use the
fixup to zero if the fragment color clamp is enabled, but we can't detect that
easily since we don't support native clamping hence it gets baked into the
shader.)
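
A short numeric sketch of why the fixup to ZERO is only safe with clamped inputs.  It assumes a destination format without alpha reads back Ad = 1; the function name is hypothetical and this is plain scalar C, not llvmpipe's generated code.

```c
#include <assert.h>

/* SRC_ALPHA_SATURATE factor as given above: min(As, 1 - Ad). */
static float src_alpha_saturate(float src_a, float dst_a)
{
   float inv_dst = 1.0f - dst_a;
   return src_a < inv_dst ? src_a : inv_dst;
}

/* With Ad = 1 the factor is min(As, 0): zero for a clamped As, but
 * negative when an unclamped As is below zero. */
static void saturate_example(void)
{
   assert(src_alpha_saturate(0.25f, 1.0f) == 0.0f);
   assert(src_alpha_saturate(-0.5f, 1.0f) == -0.5f);
}
```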

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-07-18 19:03:35 +02:00
Marek Olšák
0d7f087483 r600g: use WAIT_3D_IDLE before using CP DMA
I broke this with 7948ed1250 for r700 at least.
2013-07-18 14:27:34 +02:00
Jonathan Gray
0b405f364f r300g: make use of gallium's os_get_process_name()
Lets the code compile on non-Linux systems.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2013-07-18 14:04:48 +02:00
Jean-Sébastien Pédron
148f0deb06 configure.ac: On some systems, "x86-64" is called "amd64"
For instance, this is the case on FreeBSD.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 23:10:23 -07:00
Ilia Mirkin
fbdae1ca41 nv50: H.264/MPEG2 decoding support via VP2, available on NV84-NV96, NVA0
Adds H.264 and MPEG2 codec support via VP2, using firmware from the
blob. Acceleration is supported at the bitstream level for H.264 and
IDCT level for MPEG2.

Known issues:
 - H.264 interlaced doesn't render properly
 - H.264 shows very occasional artifacts on a small fraction of videos
 - MPEG2 + VDPAU shows frequent but small artifacts, which aren't there
   when using XvMC on the same videos

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-07-18 07:52:32 +02:00
Jonathan Gray
f96c07abf6 configure.ac: make grep tests more portable
Use grep -w instead of the empty string escape sequences
which are less portable.  Makes the grep tests
function as intended on OpenBSD.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 22:50:19 -07:00
Jonathan Gray
78fbb41fe3 configure.ac: add OpenBSD
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 21:06:46 -07:00
Vinson Lee
21f97446f4 glsl: Remove comma at end of enumerator list.
Fixes this build error on OpenBSD 5.3.

In file included from ../../src/mesa/main/ff_fragment_shader.cpp:53:
./../glsl/ir_optimization.h:64: error: comma at end of enumerator list

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 20:57:54 -07:00
Vinson Lee
77311dab3a mesa: Remove commas at end of enumerator lists.
Fixes these build errors on OpenBSD 5.3.

In file included from ../../src/mesa/main/errors.h:47,
                 from ../../src/mesa/main/imports.h:41,
                 from ../../src/mesa/main/ff_fragment_shader.cpp:32:
../../src/mesa/main/mtypes.h:3286: error: comma at end of enumerator list
../../src/mesa/main/mtypes.h:3296: error: comma at end of enumerator list
../../src/mesa/main/mtypes.h:3303: error: comma at end of enumerator list
../../src/mesa/main/mtypes.h:3356: error: comma at end of enumerator list

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 20:57:53 -07:00
Carl Worth
ceaf1a74cb docs: Import 9.1.5 release notes
And add news item for the release.
2013-07-17 20:11:02 -07:00
Roland Scheidegger
7fd30a8621 gallivm: (trivial) simplify lp_build_cos/lp_build_sin a tiny bit
Use "or" instead of "add" (this is a classic select sequence, which at
least newer llvm versions can actually recognize (3.2+?), and the "add"
might prevent that - and we really don't want an add instead of an or with
avx if it isn't recognized (even without avx logic ops might be cheaper)).
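
For reference, the "classic select sequence" looks like this in scalar C.  This is a sketch of the bit pattern only, not the gallivm IR-building code, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise select: take bits of a where mask is set, bits of b where
 * it is clear. */
static uint32_t select_or(uint32_t mask, uint32_t a, uint32_t b)
{
   return (a & mask) | (b & ~mask);
}

/* Same result via add: the two masked operands never share a set bit,
 * so no carries occur.  An optimizer is less likely to see a select. */
static uint32_t select_add(uint32_t mask, uint32_t a, uint32_t b)
{
   return (a & mask) + (b & ~mask);
}

static void select_example(void)
{
   uint32_t a = 0x12345678u, b = 0x9ABCDEF0u;
   assert(select_or(0xFFFFFFFFu, a, b) == a);
   assert(select_or(0x00000000u, a, b) == b);
   assert(select_or(0xFFFF0000u, a, b) == select_add(0xFFFF0000u, a, b));
}
```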

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-17 18:16:34 +02:00
Roland Scheidegger
f0f9fb59c3 util/u_format_s3tc: handle srgb formats correctly.
Instead of just ignoring the srgb/linear conversions, simply call the
corresponding conversion functions, for all of pack/unpack/fetch,
both for float and unorm8 versions (though some don't make a whole
lot of sense, i.e. unorm8/unorm8 srgb/linear combinations).
Refactored some functions a bit so we don't have to duplicate all the code.
There's a slight change for packing dxt1_rgb: there will now always be
4 components initialized and sent to the external compression function, so
the same code can be used for all formats. The quite horrid and (by now)
ad-hoc interface should always have worked with that.

Fixes llvmpipe/softpipe piglit texwrap GL_EXT_texture_sRGB-s3tc.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-17 18:16:27 +02:00
Vadim Girlin
07baf9cfd1 r600g/sb: improve alu packing on cayman
Scheduler/register allocator in r600-sb was developed and optimized
on evergreen (VLIW-5) hardware, so currently it's not optimal for
VLIW-4 chips.
This patch should improve performance on cayman gpus due to better alu
packing, but also it tends to increase register usage, so overall positive
effect on performance has to be proven by real benchmarks yet.

Some results with bfgminer kernel on cayman:
source bytecode:       60 gprs, 3905 alu groups,
sbcl before the patch: 45 gprs, 4088 alu groups,
sbcl with this patch:  55 gprs, 3474 alu groups.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:29:56 +04:00
Vadim Girlin
ba7fa4c4c9 r600g/sb: fix handling of new multislot instructions on cayman
Ex-scalar instructions that became multislot on cayman do replicate result
to all channels - handle them similar to DOT4.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:27:31 +04:00
Vadim Girlin
033eec4145 r600g/sb: fix debug dump code in scheduler
Update the stale debug code for other changes related to debug output.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:27:31 +04:00
Vadim Girlin
44ebe7291c r600g/sb: fix initial register allocation
Mark values that are members of the 'same register' constraint as
preallocated in ra_init pass, this will prevent incorrect
reallocation in scheduler in some cases.

Should fix https://bugs.freedesktop.org/show_bug.cgi?id=66713

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:27:30 +04:00
Vadim Girlin
f0d881106a r600g/sb: move chip & class name functions to sb_context
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:27:30 +04:00
Vadim Girlin
96efa4cdf4 r600g/sb: fix handling of PS in source bytecode on cayman
Actually PS doesn't make sense for cayman and isn't even mentioned in
cayman docs, but llvm backend currently uses it in bytecode and, assuming
that hw seems to be mostly ok with it, this will allow sb to parse such
source bytecode correctly.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:27:30 +04:00
Vinson Lee
81d3881367 r600g/sb: Initialize ra_checker member variables.
Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 18:27:30 +04:00
Emil Velikov
b20e0fb520 gallium/util: use explicitly sized types for {un, }pack_rgba_{s, u}int
Every function but the above four uses explicitly sized types for their
src and dst arguments. Even fetch_rgba_{s,u}int follows the convention.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2013-07-17 13:01:46 +02:00
Kyle McMartin
87c3440567 llvmpipe: use MCJIT on ARM and AArch64
MCJIT is the only supported LLVM JIT on AArch64 and ARM (the regular
JIT has bit-rotted badly on ARM and doesn't exist on AArch64.)

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
2013-07-17 17:29:01 +10:00
Kenneth Graunke
00d32cd5b4 glsl: Fix absurd whitespace conventions in the parser.
Historically, we indented grammar production rules with a single 8-space
tab, but code inside of blocks used Mesa's 3-space indents.

This meant when editing code, you had to use an 8-space tab for the
first level of indentation, and 3-spaces after that.  Unless you
specifically configure your editor to understand this, it will get the
indentation wrong on every single line you touch, which quickly devolves
into a colossal waste of time.

It's also inconsistent with every other file in the entire project.

This patch removes all tabs and moves to a consistent 3-space indent.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-07-16 11:31:58 -07:00
Kenneth Graunke
4ab7fc9ec3 glsl: Fail the build if the grammar contains shift/reduce errors.
When working on a parser, it's very easy to accidentally introduce
new shift/reduce conflicts.  Failing the build guarantees they'll
be noticed and fixed.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-07-16 11:31:58 -07:00
Kenneth Graunke
73620709c9 glsl: Silence the last shift/reduce conflict warning in the grammar.
The single remaining shift/reduce conflict was the classic ELSE problem:

  292 selection_rest_statement: statement . ELSE statement
  293                         | statement .

    ELSE  shift, and go to state 479

    ELSE      [reduce using rule 293 (selection_rest_statement)]
    $default  reduce using rule 293 (selection_rest_statement)

The correct behavior here is to shift, which is what happens by default.
However, resolving it explicitly will make it possible to fail the build
on new errors, making them much easier to detect.

The classic way to solve this is to use right associativity:
http://www.gnu.org/software/bison/manual/html_node/Non-Operators.html

Since there is no THEN token in GLSL, we need to fake one.  %right THEN
creates a new terminal symbol; the %prec directive says to use the
precedence of that terminal.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-07-16 11:31:58 -07:00
Vinson Lee
fa7829c36b glsl: Initialize ast_jump_statement::opt_return_value.
opt_return_value was not initialized if mode != ast_return.

Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-16 09:03:02 -07:00
Vinson Lee
f74acb9835 glapi: Do not use backtrace on OpenBSD.
execinfo.h is not available on OpenBSD.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-16 09:00:38 -07:00
Maarten Lankhorst
b20b2b6dc8 osmesa: link against static libglapi library too to get the gl exports
This should fix missing symbols in an osmesa build against shared glapi.
All OpenGL exports that are defined in the static glapi were missing, so
link against both to fix this.

This is a candidate for the stable series.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47824
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-07-16 10:18:40 +02:00
Chris Forbes
121ea0b38b i965/Gen4: Zero extra coordinates for ir_tex
We always emit U,V,R coordinates for this message, but the sampler gets
very angry if we pass garbage in the R coordinate for at least some
texture formats.

Fill the remaining coordinates with zero instead.

Fixes broken rendering on GM45 in Source games, and in VDrift.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65236

NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-16 19:08:41 +12:00
Kenneth Graunke
e4fdf1b008 i965: Cite the Ivybridge PRM for 3DSTATE_CLEAR_PARAMS notes.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:53 -07:00
Kenneth Graunke
b72a298751 i965: Refer people to brw_tex_layout.c rather than the BSpec.
brw_tex_layout.c sets up the align_w/h fields, and has all the
appropriate spec references already.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:53 -07:00
Kenneth Graunke
4b704424e0 i965: Remove old BSpec reference from BLORP's 3DSTATE_WM/PS packets.
The Sandybridge code had a citation for the range of the "Maximum Number
of Threads" field, and the Ivybridge code just mentioned the "BSpec" in
general.  That's documented in the obvious place, so people can find it
without a spec reference.

The real value of the comment is to say "we tried zero, and it exploded,
so program it to a valid number even if pixel shading is off."

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
ada110716a i965: Cite the Ivybridge PRM for 3DSTATE_URB_* programming.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
90b5a03581 i965: Update workaround flush comments for Gen6 3DSTATE_VS.
Unfortunately, the workaround text never made it into the Sandybridge
PRM, so we still have to refer to the BSpec.

It also wasn't obvious why we needed this workaround at all, since we
don't currently do VS passthrough - but BLORP can turn off the VS.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
3b3a440d2b i965: Cite the Ivybridge PRM for VS PIPE_CONTROL workarounds.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
9a86875c6b i965: Cite the Sandybridge PRM for Gen7 stencil pitch requirements.
Sadly, the Ivybridge PRM can't be cited, as it is missing the relevant
text for some reason.  However, the Sandybridge PRM has the text Chad
originally quoted, and the modern BSpec has the same text.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
2e928e2a3f i965: Cite the Ivybridge PRM for multisample surface format notes.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
43ea434225 i965: Delete "the data cache is the sampler cache" comments on Gen7+.
I cut and pasted these comments from the Gen4 code during Ivybridge
enabling, and didn't understand what they meant at the time.

The data cache is NOT the same as the sampler cache on Ivybridge.
The sampler cache has L1 and L2 caches in addition to the L3 cache,
while data port messages to the "data cache" hit L3 directly.

This means that the sampler domain is technically wrong, but we stopped
caring about read/write domains quite a while ago.  The kernel just
flushes all the caches at the end of each batchbuffer, and our render to
texture code flushes the sampler caches when necessary.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
3f64cfabfc i965: Cite the 965 PRM for "the data cache is the sampler cache".
Presumably, this comment exists to justify the usage of
I915_GEM_DOMAIN_SAMPLER for this relocation.  At one point, this was
necessary to ensure that the right flushing was done to keep caches
coherent.  These days, the kernel just flushes everything, so I don't
think it matters.

Still, the comment is interesting, so leave it in place.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:51 -07:00
Kenneth Graunke
f254c94204 i965: Cite the Ivybridge PRM for DP message descriptor fields.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:51 -07:00
Kenneth Graunke
a0c8e76202 i965: Cite the Ivybridge PRM for why the fake MRF range is what it is.
The exact text is in the public docs, so we should cite those.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:51 -07:00
Kenneth Graunke
3090d39dde i965: Cite the Ivybridge PRM for SFID enum values.
The Ivybridge PRM adds new SFIDs and lists them in a different volume
than Sandybridge, so it's worth adding a reference.

I also removed the BSpec reference, as the section it referred to
was moved somewhere, and I couldn't find it.  This leaves one Haswell
SFID without a citation, but we can add one once the PRMs are out.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:51 -07:00
Roland Scheidegger
dc1cc928ed llvmpipe: support sRGB framebuffers
Just use the new conversion functions to do the work. The way it's plugged
into the blend code is quite hacktastic but follows all the same hacks
as used by packed float format already.
Only support 4x8bit srgb formats (rgba/rgbx plus swizzle), 24bit formats never
worked anyway in the blend code and are thus disabled, and I don't think anyone
is interested in L8/L8A8. Would need even more hacks otherwise.
Unless I'm missing something, this is the last feature except MSAA needed for
OpenGL 3.0, and for OpenGL 3.1 as well I believe.

v2: prettify a bit, use separate function for packing.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-16 01:54:51 +02:00
Marek Olšák
a882067d74 Revert "r300g: allow HiZ with a 16-bit zbuffer"
This reverts commit 631c631cbf.

https://bugs.freedesktop.org/show_bug.cgi?id=66921

Cc: mesa-stable@lists.freedesktop.org
2013-07-15 23:46:01 +02:00
Marek Olšák
7969b567bd r300g/swtcl: fix a lockup in MSAA resolve
Cc: mesa-stable@lists.freedesktop.org
2013-07-15 23:45:22 +02:00
Marek Olšák
22427640b2 r300g/swtcl: fix geometry corruption by uploading indices to a buffer
The splitting of a draw call into several draw commands was broken, because
the split sometimes took place in the middle of a primitive. The splitting
was supposed to be dealing with the case when there are more indices than
the maximum size of a CS.

This commit throws that code away and uses a real index buffer instead.

https://bugs.freedesktop.org/show_bug.cgi?id=66558

Cc: mesa-stable@lists.freedesktop.org
2013-07-15 23:45:16 +02:00
Matt Turner
c889df3fbe glsl: Reject C-style initializers with unknown types.
_mesa_ast_set_aggregate_type walks through declarations initialized with
C-style aggregate initializers and stops when it runs out of LHS
declarations or RHS expressions.

In the example

   vec4 v = {{{1, 2, 3, 4}}};

_mesa_ast_set_aggregate_type would not recurse into the subexpressions
(since vec4s do not contain types that can be initialized with an
aggregate initializer) to set their <constructor_type>s. Later in ::hir
we would dereference the NULL pointer and segfault.

If <constructor_type> is NULL in ::hir we know that the LHS and RHS
were unbalanced and the code is illegal.

Arrays, structs, and matrices were unaffected.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-07-15 13:02:36 -07:00
Paul Berry
7706e52b25 glsl: Rework builtin_variables.cpp to reduce code duplication.
Previously, we had a separate function for setting up the built-in
variables for each combination of shader stage and GLSL version
(e.g. generate_110_vs_variables to generate the built-in variables for
GLSL 1.10 vertex shaders).  The functions called each other in ad-hoc
ways, leading to unexpected inconsistencies (for example,
generate_120_fs_variables was called for GLSL versions 1.20 and above,
but generate_130_fs_variables was called only for GLSL version 1.30).
In addition, it led to a lot of code duplication, since many varyings
had to be duplicated in both the FS and VS code paths.  With the
advent of geometry shaders (and later, tessellation control and
tessellation evaluation shaders), this code duplication was going to
get a lot worse.

So this patch reworks things so that instead of having a separate
function for each shader type and GLSL version, we have a function for
constants, one for uniforms, one for varyings, and one for the special
variables that are specific to each shader type.

In addition, we use a class, builtin_variable_generator, to keep track
of the instruction exec_list, the GLSL parse state, commonly-used
types, and a few other variables, so that we don't have to pass them
around as function arguments.  This makes the code a lot more compact.

Where it was feasible to do so without introducing compilation errors,
I've also gone ahead and introduced the variables needed for
{ARB,EXT}_geometry_shader4 style geometry shaders.  This patch takes
care of everything except the GS variable gl_VerticesIn, the FS
variable gl_PrimitiveID, and GLSL 1.50 style geometry shader inputs
(using the gl_in interface block).  Those remaining features will be
added later.

I've also made a slight nomenclature change: previously we used the
word "deprecated" to refer to variables which are marked in GLSL 1.40
as requiring the ARB_compatibility extension, and are marked in GLSL
1.50 onward as requiring the compatibility profile.  This was
misleading, since not all deprecated variables require the
compatibility profile (for example gl_FragData and gl_FragColor, which
have been deprecated since GLSL 1.30, but do not require the
compatibility profile until GLSL 4.20).  We now consistently use the
word "compatibility" to refer to these variables.

This patch doesn't introduce any functional changes (since geometry
shaders haven't been enabled yet).

Reviewed-by: Matt Turner <mattst88@gmail.com>

v2: Rename "typ" -> "type".  Add blank line between inline functions
and declarations in builtin_variable_generator class.  Use the
standard comment "/* FALLTHROUGH */" for compatibility with static
code analysis tools.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 09:35:28 -07:00
Paul Berry
428e030210 glsl: Fix lower_named_interface_blocks to account for dereferences of consts.
In certain rare cases (such as those involving dereference of a
literal constant array of structs),
flatten_named_interface_blocks_declarations's rvalue visitor may be
invoked on an ir_dereference_record whose variable_referenced() method
returns NULL.

Check for this case to avoid a segfault.

Prevents crashes in piglit tests
{vs,fs}-deref-literal-array-of-structs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-07-15 07:59:52 -07:00
Paul Berry
b2265db8e7 glsl: Don't allow vertex shader input arrays until GLSL 1.50.
Vertex shader inputs are not allowed to be arrays until GLSL 1.50.  We
were accidentally enabling them for GLSL 1.40 (although we haven't
written any tests for them, so it's not clear whether they actually
work).

NOTE: although this is a simple bug fix, it probably isn't sensible to
cherry-pick it to stable release branches, since its only effect is to
cause incorrectly-written shaders to fail to compile.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 07:50:47 -07:00
Chris Forbes
b616d01661 i965: Gen4/5: use IEEE floating point mode for GLSL shaders.
Fixes isinf(), isnan() from GLSL 1.30

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-14 19:58:25 +12:00
Chris Forbes
1ec66f2fb2 i965/vs: Gen4/5: enable front colors if back colors are written
Fixes undefined results if a back color is written, but the
corresponding front color is not, and only backfacing primitives are
drawn. Results are still undefined if a frontfacing primitive is drawn,
but that's OK.

The other reasonable way to fix this would have been to just pick
the one color slot that was populated, but that dilutes the value of
the tests.

On Gen6+, the fixed function clipper and triangle setup already take
care of this.

Fixes 11 piglits:
spec/glsl-1.10/execution/interpolation/interpolation-none-gl_Back*Color-*

NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-14 19:58:11 +12:00
Roland Scheidegger
796b73d1fe gallivm: (trivial) use constant instead of exp2f() function
Some lame compilers can't do exp2f(), and as far as I can tell they can't do
exp2() (with doubles) either. Providing a workaround wouldn't actually be too
bad (just replace with pow), but since it is only used with a constant, just
use the precalculated constant instead.
2013-07-14 02:39:33 +02:00
Chia-I Wu
62c546bbf8 ilo: skip 3DSTATE_INDEX_BUFFER when possible
When only the offset to the index buffer is changed, we can skip the
3DSTATE_INDEX_BUFFER if we always use 0 for the offset, and add
(offset / index_size) to Start Vertex Location in 3DPRIMITIVE.
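
A minimal sketch of the arithmetic, assuming the byte offset is a multiple
of the index size. The helper name and parameters are hypothetical, not
taken from the ilo source:

```c
#include <stdint.h>

/* Hypothetical helper: fold a byte offset into the index buffer into the
 * start location, so the buffer itself can always be bound at offset 0.
 * Assumes offset_bytes is a multiple of index_size. */
static uint32_t start_location_with_offset(uint32_t start,
                                           uint32_t offset_bytes,
                                           uint32_t index_size)
{
    return start + offset_bytes / index_size;
}
```

For example, a draw starting at index 10 with a 64-byte offset into a
buffer of 2-byte indices would start at location 42.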
2013-07-14 05:59:52 +08:00
Roland Scheidegger
6bcbb0dc82 gallivm: handle srgb-to-linear and linear-to-srgb conversions
srgb-to-linear is using 3rd degree polynomial for now which should be _just_
good enough. Reverse is using some rational polynomials and is quite accurate,
though not hooked into llvmpipe's blend code yet and hence unused (untested).
Using a table might also be an option (for srgb-to-linear especially).
This does not enable any new features yet because EXT_texture_srgb was already
supported via util_format fallbacks, but performance was lacking probably due
to the external function call (the table used by the util_format_srgb code may
not be all that much slower on its own).
Some performance figures (taken from modified gloss, replaced both base and
sphere texture to use GL_SRGB instead of GL_RGB, measured on 1Ghz Sandy Bridge,
the numbers aren't terribly accurate):

normal gloss, aos, 8-wide: 47 fps
normal gloss, aos, 4-wide: 48 fps

normal gloss, forced to soa, 8-wide: 48 fps
normal gloss, forced to soa, 4-wide: 47 fps

patched gloss, old code, soa, 8-wide: 21 fps
patched gloss, old code, soa, 4-wide: 24 fps

patched gloss, new code, soa, 8-wide: 41 fps
patched gloss, new code, soa, 4-wide: 38 fps

So there's a performance hit but it seems acceptable, certainly better
than using the fallback.
Note the new code only works for 4x8bit srgb formats, others (L8/L8A8) will
continue to use the old util_format fallback, because I can't be bothered
to write code for formats no one uses anyway (as decoding is done as part of
lp_build_unpack_rgba_soa which can only handle block type width of 32).
Compressed srgb formats should get their own path though eventually (it is
going to be expensive in any case, first decompress, then convert).
No piglit regressions.

v2: use lp_build_polynomial instead of ad-hoc polynomial construction. Also,
since both linear-to-srgb functions are kept for now, make sure both are
compiled (since they share quite a bit of code, just integrate them into the
same function).

v3: formatting fixes and bugfix in the complicated (disabled) linear-to-srgb
path.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-13 18:42:17 +02:00
Roland Scheidegger
9b8d97e5bf gallivm: better support for fast rsqrt
We had to disable fast rsqrt before because it wasn't precise enough etc.
However in situations when we know we're not going to need more precision
we can still use a fast rsqrt (which can be several times faster than
the quite expensive sqrt). Hence introduce a new helper which does exactly
that. Calling it is probably not useful in some situations if no fast rsqrt
is available, so also make its availability queryable.

v2: use fast_rsqrt consistently instead of rsqrt_fast, fix indentation,
let rsqrt use fast_rsqrt.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-13 18:42:17 +02:00
Klemens Baum
45574ab2e9 configure.ac: better detection of LLVM version
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-12 21:20:59 -07:00
Vinson Lee
b0c3c955ae r600g/sb: Initialize ra_constraint::cost.
Fixes "Uninitialized scalar field" reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-13 06:57:26 +04:00
Vinson Lee
be8d787873 glsl: Initialize ast_aggregate_initializer::constructor_type.
Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-12 18:42:46 -07:00
Paul Berry
c6bfe62e21 glsl: Make gl_TexCoord compatibility-only
gl_TexCoord was deprecated in GLSL 1.30.  In GLSL 1.40 it was marked
as ARB_compatibility-only, and in GLSL 1.50 and above it was marked as
only appearing in the compatibility profile.  It has never appeared in
GLSL ES.

However, Mesa erroneously included it in all desktop versions of GLSL,
even versions 1.40 and 1.50 (which do not currently support the
compatibility profile).  This patch makes gl_TexCoord available in the
compatibility profile (and GLSL versions 1.30 and prior) only.

NOTE: although this is a simple bug fix, it probably isn't sensible to
cherry-pick it to stable release branches, since its only effect is to
cause incorrectly-written shaders to fail to compile.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-12 18:18:49 -07:00
Paul Berry
8f51d68f8c glsl ES: Fix magnitude of gl_MaxVertexUniformVectors.
Previously, we set it equal to MaxVertexUniformComponents.  It should
be MaxVertexUniformComponents / 4.
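
A minimal sketch of the conversion. The helper name is hypothetical; the
only fact relied on is that one vec4 uniform holds four scalar components:

```c
/* Hypothetical helper: convert a limit expressed in scalar uniform
 * components into a limit expressed in vec4 uniform vectors. */
static int max_vertex_uniform_vectors(int max_vertex_uniform_components)
{
    return max_vertex_uniform_components / 4;
}
```

With a limit of 1024 components, the reported number of vectors would be
256 rather than 1024.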

NOTE: This is a candidate for the stable branches.

Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-12 18:18:48 -07:00
Marek Olšák
06b38dbab2 winsys/radeon: allow a NULL cs pointer in radeon_bo_map to fix a segfault
The original idea was that cs=NULL should be allowed here, but we never used
NULL until 862f69fbe1. This fixes a segfault in CoreBreach.
2013-07-13 02:38:23 +02:00
Chia-I Wu
8d4ac98549 ilo: move a sanity check into its assert()
The compiler does not know that ilo_3d_pipeline_estimate_size() is pure and
can be eliminated in a release build in gen6_pipeline_end().  Move the call
into the assert().
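
A minimal sketch of the pattern, using hypothetical stand-in functions
rather than the real ilo code. With NDEBUG defined, assert() expands to
nothing, so a call placed inside it is not evaluated in a release build:

```c
#include <assert.h>

/* Stand-in for a size estimate the compiler cannot prove to be pure. */
static int estimate_size(int commands)
{
    return commands * 4;
}

static void end_batch(int commands, int used)
{
    /* The estimate is only needed for the check, so the call lives
     * inside assert() and disappears together with it under NDEBUG. */
    assert(used <= estimate_size(commands));

    /* Avoid unused-parameter warnings when assert() is compiled out. */
    (void) commands;
    (void) used;
}
```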
2013-07-13 07:27:28 +08:00
Chia-I Wu
bf9670270f ilo: mark some states dirty when they are really changed
The checks may seem redundant because cso_context handles them, but
util_blitter does not have access to cso_context.
2013-07-13 06:43:53 +08:00
Chia-I Wu
9047598a8d ilo: clean up ilo_blitter_pipe_begin()
Document why certain states need to be saved, and fix a bug when blitting with
scissor enabled.
2013-07-13 06:43:53 +08:00
Alex Deucher
e0a7565832 r600g: don't use the CB/DB CP COHER logic on r6xx
There are hw bugs.  Flush and inv event is sufficient.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=66837

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-12 18:07:56 -04:00
Jonathan Liu
af16f73051 configure: Avoid use of AC_CHECK_FILE for cross compiling
The AC_CHECK_FILE macro can't be used for cross compiling as it will
result in "error: cannot check for file existence when cross compiling".
Replace it with the AS_IF macro.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jonathan Liu <net147@gmail.com>
2013-07-12 13:21:28 -07:00
Brian Paul
bf86e0e050 nv30: fix KILL_IF breakage
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66858
2013-07-12 10:00:18 -06:00
Zack Rusin
00cd455bd5 gallium: fixup definitions of the rsq and sqrt
GLSL spec says that rsq is undefined for src<=0, but the D3D10
spec says it needs to be a NaN, so let's stop taking an absolute
value of the source which completely breaks that behavior. For
the gl program we can simply insert an extra abs instruction
which produces the desired behavior there.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-11 20:19:04 -04:00
José Fonseca
a171812d27 util/u_format: Comment out half float denormal test case.
So that lp_test_format doesn't fail until we decide what should be done.
2013-07-12 15:48:38 +01:00
José Fonseca
1b0d29b5da gallivm: Eliminate redundant lp_build_select calls.
lp_build_cmp already returns 0 / ~0, so the lp_build_select call is
unnecessary.
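
A minimal scalar sketch of the redundancy, with hypothetical helper names.
The real code builds LLVM IR on vectors, but the same identity holds per
element:

```c
#include <stdint.h>

/* Comparison that already yields an all-ones or all-zeros mask. */
static uint32_t cmp_less_mask(int32_t a, int32_t b)
{
    return (a < b) ? ~0u : 0u;
}

/* Bitwise select: take bits of a where the mask is set, else bits of b. */
static uint32_t select_bits(uint32_t mask, uint32_t a, uint32_t b)
{
    return (a & mask) | (b & ~mask);
}
```

Selecting between ~0 and 0 with such a mask just returns the mask again,
which is why the extra select can be dropped.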

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-12 15:40:16 +01:00
Brian Paul
46205ab8cc tgsi: rename the TGSI fragment kill opcodes
TGSI_OPCODE_KIL and KILP had confusing names.  The former was conditional
kill (if any src component < 0).  The latter was unconditional kill.
At one time KILP was supposed to work with NV-style condition
codes/predicates but we never had that in TGSI.

This patch renames both opcodes:
  TGSI_OPCODE_KIL -> KILL_IF   (kill if src.xyzw < 0)
  TGSI_OPCODE_KILP -> KILL     (unconditional kill)
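
A minimal sketch of the two semantics after the rename. The function names
are hypothetical and only model whether a fragment is discarded:

```c
#include <stdbool.h>

/* KILL_IF: discard the fragment if any of the four source components
 * is negative. */
static bool exec_kill_if(const float src[4])
{
    for (int i = 0; i < 4; i++) {
        if (src[i] < 0.0f)
            return true;
    }
    return false;
}

/* KILL: discard the fragment unconditionally. */
static bool exec_kill(void)
{
    return true;
}
```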

Note: I didn't just transpose the opcode names to help ensure that I
didn't miss updating any code anywhere.

I believe I've updated all the relevant code and comments but I'm
not 100% sure that some drivers had this right in the first place.
For example, the radeon driver might have llvm.AMDGPU.kill and
llvm.AMDGPU.kilp mixed up.  Driver authors should review their code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-12 08:32:51 -06:00
Brian Paul
f501baabdb tgsi: fix-up KILP comments
KILP is really unconditional fragment kill.

We've had KIL and KILP transposed forever.  I'll fix that next.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-12 08:32:51 -06:00
Brian Paul
e7c3898725 tgsi: exec TGSI_OPCODE_SQRT as a scalar instruction, not vector
To align with the docs and the state tracker.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-12 08:32:51 -06:00
Brian Paul
f3fad24b62 tgsi: use X component of the second operand in exec_scalar_binary()
The code happened to work in the past since the (scalar) src args
effectively always have a swizzle of .xxxx, .yyyy, .zzzz, or .wwww so
whether you grab the X or Y component doesn't really matter.  Just
fixing the code to make it look right.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-12 08:32:51 -06:00
Brian Paul
cb2de08f27 mesa: update glext.h to version 20130708
This update fixes the problem with duplicated typedefs for
GLclampf and GLclampd in the previous version.

It also changes some parameter types for glDebugMessageCallbackARB()
and glTransformFeedbackVaryingsEXT().

Note we should someday update the glapi-gen code so that it
understands void pointer parameters.  Currently, the Python code
only understands "GLvoid *" but not "void *".  Luckily, the
compilers don't seem to complain about mixing GLvoid and void.
2013-07-12 08:32:51 -06:00
Brian Paul
5749aea255 mesa: fix Address Sanitizer (ASan) issue in _mesa_add_parameter()
If the size argument isn't a multiple of four, we would have read/
copied uninitialized memory.
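
A minimal sketch of one way to avoid the over-read, assuming storage laid
out in 4-float slots. The helper is hypothetical and is not necessarily
what this patch does:

```c
#include <string.h>

/* Hypothetical helper: copy `size` floats into storage made of 4-float
 * slots. Only `size` source values are read; the unused tail of the
 * last slot is zero-filled instead of being copied from the source.
 * dst must have room for size rounded up to a multiple of four. */
static void copy_padded(float *dst, const float *src, unsigned size)
{
    unsigned padded = (size + 3u) & ~3u;   /* round up to a multiple of 4 */

    memset(dst, 0, padded * sizeof(float));
    memcpy(dst, src, size * sizeof(float));
}
```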

Fixes an issue reported by Myles C. Maxfield <myles.maxfield@gmail.com>
2013-07-12 08:32:51 -06:00
Brian Paul
9ca026e220 mesa: simplify some _mesa_IsEnabled() queries
No need to test array->Enabled != 0 since the Enabled field can
only be 0 or 1.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-12 08:32:50 -06:00
Brian Paul
9fc532a263 os: add os_get_process_name() function
v2: explicitly test for BSD/APPLE, #warning for unexpected
environments.
2013-07-12 08:32:50 -06:00
Brian Paul
3fb3e1e38c mesa: whitespace, formatting, 80-column wrapping
2013-07-12 08:32:22 -06:00
Brian Paul
919236f3a2 softpipe: silence some MSVC warnings
2013-07-12 08:19:52 -06:00
Brian Paul
76666b9394 hud: silence some MSVC warnings
2013-07-12 08:19:52 -06:00
Brian Paul
d7a852b3a1 util: add casts to silence MSVC warnings in u_blit.c
2013-07-12 08:19:51 -06:00
Brian Paul
c45d8f2e98 tgsi: s/unsigned/int/ to silence MSVC warning
2013-07-12 08:19:50 -06:00
Brian Paul
2cfd768473 mesa: s/unsigned/int/ to fix MSVC warning in uniforms.c
2013-07-12 08:19:50 -06:00
Brian Paul
5b0fbf1b0b mesa: s/GLuint/GLint/ to silence MSVC warning in texstore.c
2013-07-12 08:19:50 -06:00
Brian Paul
721f47227e mesa: add casts to fix MSVC warnings in multisample.c
2013-07-12 08:19:49 -06:00
Brian Paul
528e5b9476 mesa: s/GLint/GLuint/ to fix MSVC warnings in mipmap.c
2013-07-12 08:19:49 -06:00
Brian Paul
738337356b mesa: fix inconsistent function declaration, definitions
To silence MSVC warnings that the declaration and definitions
were different.
2013-07-12 08:19:49 -06:00
Brian Paul
8ba5c79d2c mesa: add cast to silence MSVC warning 2013-07-12 08:19:49 -06:00
Christian König
1681bd7f2b radeon/uvd: fall back to shader based decoding for MPEG2 on UVD 2.x v2
UVD 2.x doesn't support hardware decoding of MPEG2, just use shader
based decoding for those chipsets.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=66450

v2: fix interlacing as well

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-07-12 10:52:27 +02:00
José Fonseca
649ef4da30 glsl: Avoid variable length arrays.
They are a non-standard GCC extension that's not widely supported by
other C/C++ compilers.

Use a dynamic array instead.

Trivial. Should fix the MSVC build.
2013-07-12 09:28:22 +01:00
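The kind of change described in 649ef4da30 might look like the sketch below:
a scratch buffer that could have been a variable length array is allocated
on the heap instead. The function name and its purpose are made up for
illustration (the real change is in the C++ GLSL compiler and is not shown
here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical example: sums the squares of `n` values via a scratch
 * buffer.  A VLA version would declare `long tmp[n];` here; this one
 * uses malloc()/free() so it builds with compilers that lack VLAs.
 * Returns 1 on success, 0 if the allocation fails. */
int
sum_of_squares(const int *values, size_t n, long *out)
{
   long *tmp;
   long total = 0;
   size_t i;

   assert(out != NULL);
   if (n == 0) {
      *out = 0;
      return 1;
   }

   tmp = malloc(n * sizeof *tmp);
   if (tmp == NULL)
      return 0;

   for (i = 0; i < n; i++)
      tmp[i] = (long)values[i] * values[i];
   for (i = 0; i < n; i++)
      total += tmp[i];

   free(tmp);
   *out = total;
   return 1;
}
```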
Matt Turner
1b0d6aef03 glsl: Add support for C-style initializers.
Required by GL_ARB_shading_language_420pack.

Parts based on work done by Todd Previte and Ken Graunke, implementing
basic support for C-style initializers of arrays.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
ae79e86d4c glsl: Add infrastructure for aggregate initializers.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
8d45caaeba glsl: Add an is_declaration field to ast_struct_specifier.
Will be used in a later commit to differentiate between a structure type
declaration and a variable declaration of a struct type. I.e., the
difference between

   struct S { float x; }; (is_declaration = true)

and

   S s;                   (is_declaration = false)

Also note that is_declaration = true for

   struct S { float x; } s;

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
5df807b06f glsl: Track structs' ast_type_specifiers in symbol table.
Will be used in a future commit. An ast_type_specifier is stored (rather
than an ast_struct_specifier) with the idea that we may have more
general uses for this in the future. struct names are prefixed with
'#ast.' to avoid collisions with the glsl_types in the symbol table.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
e641b5fbee glsl: Add process_vec_mat_constructor() function.
Based largely on process_array_constructor().

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
af2987d5b6 glsl: Separate code into process_record_constructor().
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
a760c73853 glsl: Add copy-constructor for ast_struct_specifier.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
43757135b2 glsl: Add a constructor for ast_type_specifier.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Matt Turner
b85f0c5121 glsl: Clean up and clarify comment explaining initializer rules.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Matt Turner
ce2464a8a7 glsl: Change type of is_array to bool.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Matt Turner
361206771c glsl: Add a comment to note what an exec_list is a list of.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Matt Turner
46b74ca7bc glsl: Fix inverted conditional in error message.
The code float a[2] = float[2]( 3.4, 4.2, 5.0 ); previously generated
this:

   error: array constructor must have at least 2 parameters

when in fact it requires exactly two.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
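A minimal sketch of the message selection fixed by 46b74ca7bc, assuming
that a sized array constructor needs exactly as many parameters as the
array has elements and that an unsized one (length 0 here) needs at least
one. The function names are hypothetical; the real check lives in the
compiler's array constructor handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: picks the qualifier for the error message.
 * `array_length` is 0 for an unsized array. */
const char *
count_qualifier(unsigned array_length)
{
   return array_length == 0 ? "at least" : "exactly";
}

/* Writes the diagnostic into `buf` and returns the required count. */
unsigned
format_constructor_error(char *buf, size_t len, unsigned array_length)
{
   unsigned required = array_length == 0 ? 1 : array_length;

   assert(buf != NULL && len > 0);
   snprintf(buf, len, "array constructor must have %s %u parameter%s",
            count_qualifier(array_length), required,
            required == 1 ? "" : "s");
   return required;
}
```

With the two string literals swapped in count_qualifier(), float[2] with
three arguments would be reported as needing "at least 2" parameters,
which is the symptom quoted in the commit message.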
Matt Turner
9749d96817 glsl: Add missing return error_value(ctx) in error path.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Matt Turner
e117eda251 glsl: Remove unnecessary #include from ast_type.cpp.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Chia-I Wu
93742d9757 glsl/build: build builtin_compiler with VISIBILITY_CFLAGS
libglslcore.la and libglcpp.la that are built with builtin_compiler are also
linked to by drivers not using libdricore.  Since there is no public symbol in
them, it is better to mark all symbols hidden.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-12 09:42:25 +08:00
Matt Turner
08c90f651b glsl: Add comment explaining "row_major" parsing.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-11 16:22:07 -07:00
Matt Turner
14ed9018de glsl: Mark "row_major" as not a reserved word in GLSL ES 3.0.
We mark ARB_uniform_buffer_object as enabled under ES 3 since it
contains that functionality, which tricked the compiler into tokenizing
"row_major".

Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 16:22:07 -07:00
Matt Turner
c30948517e glsl: Remove outdated FINISHME comment.
Explicit index support was added by commit 1256a5dc.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 16:22:07 -07:00
Alex Deucher
77300bacaf radeon: bump libdrm_radeon requirement for CIK support
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-11 19:11:44 -04:00
Christoph Bumiller
9974593dfb r600g: x/y coordinates must be divided by block dim in dma blit
Note: this is a candidate for the 9.1 branch.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-11 19:11:44 -04:00
Chih-Wei Huang
1d9271a95c r600g/sb: Fix Android build v2
Add the sb CXX files to the Android Makefile and also stop using some
c++11 features.

v2 (Vadim Girlin): use &bc[0] instead of bc.begin()
2013-07-12 01:11:04 +04:00
Vadim Girlin
758ac6f918 r600g/sb: improve math optimizations v2
This patch adds support for some math optimizations that are generally
considered unsafe, which is why they are currently disabled for compute
shaders.

GL requirements are less strict, so they are enabled for GL shaders by
default. In case of any issues with applications that rely on higher
precision than GL guarantees, the 'sbsafemath' option in R600_DEBUG
disables them.

v2 - always set proper src vector size for transformed instructions
   - check for clamp modifier in the expr_handler::fold_assoc

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-11 23:01:01 +04:00
Jonathan Gray
c451619dde st/xvmc/tests: avoid non portable error.h functions
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-07-11 09:52:00 +02:00
Anuj Phogat
9a1a67b081 i965/blorp: Fix clear rectangle alignment in fast color clear
From BSpec: 3D-Media-GPGPU Engine > 3D Pipeline > Pixel >
Pixel Backend > MCS Buffer for Render Target(s) [DevIVB+]:
[DevHSW:GT3]: Clear rectangle must be aligned to two times
the number of pixels in the table shown below...
Observed no piglit, gles3conform regressions with this patch.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65744
2013-07-10 18:41:16 -07:00
Chia-I Wu
ad244884fc winsys/intel: build with VISIBILITY_CFLAGS
There is no public symbol in this winsys.
2013-07-11 09:03:59 +08:00
Chia-I Wu
79bc245c01 ilo: reduce PIPE_CAP_MAX_TEXTURE_CUBE_LEVELS to 12
So that there are at most (2^22 * 6) texels, lower than the 2^26 limit.
2013-07-11 08:03:27 +08:00
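The arithmetic behind 79bc245c01 can be checked with a few lines of C.
With 12 mip levels the base level of each face is 2^11 = 2048 texels on a
side, so 2048 * 2048 = 2^22 texels per face and six faces in total. The
helper below is illustrative only and counts the base level, as the commit
message does; the previous cap value is not stated in the log.

```c
#include <assert.h>
#include <stdint.h>

/* Texels in the base level of a cube map that has `levels` mip levels:
 * the base face is 2^(levels - 1) texels on a side, times 6 faces. */
uint64_t
cube_base_texels(unsigned levels)
{
   uint64_t side;

   assert(levels >= 1 && levels <= 16);
   side = UINT64_C(1) << (levels - 1);
   return side * side * 6;
}
```

For 12 levels this gives 25,165,824 texels, below 2^26 = 67,108,864. One
more level (13) would give 2^24 * 6 = 100,663,296, which is over the limit.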
Chia-I Wu
29af29b8dc ilo: correctly initialize undefined registers in fs
Initialize all 4 channels of undefined registers (that is, TEMPs that are used
before being assigned) in FS.
2013-07-11 07:01:51 +08:00
Michel Dänzer
a06ee5a09e radeonsi: Handle TGSI_OPCODE_DDX/Y using local memory
16 more little piglits.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-10 18:40:32 +02:00
Michel Dänzer
a6b83c0f23 radeonsi: Handle TGSI_OPCODE_TXD
One more little piglit.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-10 12:16:38 +02:00
José Fonseca
b042aae70d util/u_math: Use xmmintrin.h whenever possible.
It seems __builtin_ia32_ldmxcsr is only available on gcc and only when
-msse is used. xmmintrin.h/pmmintrin.h provide portable intrinsics, but
these too are only available with gcc when -msse/-msse3 are set.

scons build always sets -msse on x86 builds, but autotools doesn't seem
to.

We could try to get this working on gcc x86 without -msse by emitting
assembly, but I believe that in this day and age we really should be
building Mesa with -msse and -msse2.
2013-07-10 07:56:17 +01:00
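As a rough illustration of the portable intrinsics mentioned in b042aae70d,
the sketch below toggles the flush-to-zero bit in MXCSR through
_mm_getcsr()/_mm_setcsr() from xmmintrin.h instead of a gcc builtin. The
wrapper names and the HAVE_SSE_INTRINSICS guard are assumptions made for
this example, not the code in u_math.

```c
#include <assert.h>

/* Only pull in xmmintrin.h when the compiler says SSE is enabled
 * (e.g. gcc with -msse, or any x86-64 target). */
#if defined(__SSE__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE_INTRINSICS 1
#else
#define HAVE_SSE_INTRINSICS 0
#endif

/* Sets the flush-to-zero bit.  Returns 1 if it was applied, 0 if SSE
 * intrinsics are not available in this build. */
int
enable_flush_to_zero(void)
{
#if HAVE_SSE_INTRINSICS
   _mm_setcsr(_mm_getcsr() | _MM_FLUSH_ZERO_ON);
   return 1;
#else
   return 0;
#endif
}

/* Reports whether the flush-to-zero bit is currently set. */
int
flush_to_zero_enabled(void)
{
#if HAVE_SSE_INTRINSICS
   return (_mm_getcsr() & _MM_FLUSH_ZERO_MASK) == _MM_FLUSH_ZERO_ON;
#else
   return 0;
#endif
}
```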
Chia-I Wu
045bf0db52 ilo: honor surface padding requirements
The PRM specifies several padding requirements that we failed to honor.
2013-07-10 12:40:22 +08:00
Zack Rusin
63386b2f66 util: treat denorm'ed floats like zero
The D3D10 spec is very explicit about treatment of denorm floats and
the behavior is exactly the same for them as it would be for -0 or
+0. This makes our shading code match that behavior, since OpenGL
doesn't care and on a few CPUs it's faster (worst case the same).
Float16 conversions will likely break, but we'll fix them in a
follow-up commit.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-09 23:30:55 -04:00
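The semantics described in 63386b2f66 (a denormal behaves like a zero of
the same sign) can be illustrated in software as below. This is a sketch of
the resulting behavior only, assuming IEEE 754 binary32 floats; the commit
itself presumably changes the CPU floating-point state rather than testing
each value, and flush_denorm_to_zero() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns `x` unchanged unless it is a denormal, in which case a zero
 * with the same sign is returned.  Assumes IEEE 754 binary32. */
float
flush_denorm_to_zero(float x)
{
   uint32_t bits;

   assert(sizeof bits == sizeof x);
   memcpy(&bits, &x, sizeof bits);
   if ((bits & 0x7f800000u) == 0)   /* exponent field is all zeros */
      bits &= 0x80000000u;          /* keep only the sign bit */
   memcpy(&x, &bits, sizeof bits);
   return x;
}
```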
Matt Turner
80bc14370a mesa: Set ProfileMask properly for core profile.
Fixes MESA_GL_VERSION_OVERRIDE=3.2 egl-create-context-verify-gl-flavor.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-09 14:19:22 -07:00
Kenneth Graunke
8c9a54e7bc i965: Delete intel_context entirely.
This makes brw_context inherit directly from gl_context; that was the
only thing left in intel_context.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:35 -07:00
Kenneth Graunke
53631be4eb i965: Move intel_context::gen and gt fields to brw_context.
Most functions no longer use intel_context, so this patch additionally
removes the local "intel" variables to avoid compiler warnings.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:34 -07:00
Kenneth Graunke
2e26afb37b i965: Move intel_context::has_llc to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:33 -07:00
Kenneth Graunke
794de2f387 i965: Move intel_context::is_<platform> flags to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:31 -07:00
Kenneth Graunke
44fd490067 i965: Move must_use/has_separate_stencil fields to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:30 -07:00
Kenneth Graunke
3b80b147f6 i965: Move intel_context::has_hiz to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:29 -07:00
Kenneth Graunke
351d2add62 i965: Free brw, not intel.
Things worked out in the past because both brw and intel share the same
memory address (by virtue of intel being the first member of brw).

However, brw is what actually gets rzalloc'd (brw_context.c:285), so
freeing that seems safer and more obvious.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:28 -07:00
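The reasoning in 351d2add62 relies on a C guarantee: a pointer to a struct
and a pointer to its first member compare equal. A small sketch with
stand-in types (these are not the driver's real structures):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the "base" and "derived" contexts. */
struct base_context {
   int draw_buffer;
};

struct derived_context {
   struct base_context base;   /* first member, so it sits at offset 0 */
   int gen;
};

struct derived_context *
derived_context_create(int gen)
{
   struct derived_context *d = calloc(1, sizeof *d);

   if (d != NULL)
      d->gen = gen;
   return d;
}

/* True when the embedded base shares the derived struct's address. */
int
base_aliases_derived(const struct derived_context *d)
{
   assert(d != NULL);
   return (const void *)&d->base == (const void *)d;
}

/* Frees the pointer that was actually allocated. */
void
derived_context_destroy(struct derived_context *d)
{
   free(d);
}
```

Freeing &d->base would happen to work for the same reason, but it stops
working as soon as the base is no longer the first member, so freeing the
derived pointer is the more robust choice.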
Kenneth Graunke
e3c2bb1eb4 i965: Shorten context base class dereference chains.
ctx->DrawBuffer is much more sensible than brw->intel.ctx.DrawBuffer.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:26 -07:00
Kenneth Graunke
d5b4a3f5a3 i965: Move intel_context::has_swizzling to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:25 -07:00
Kenneth Graunke
02128c448d i965: Move intel_context::intelScreen to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:24 -07:00
Kenneth Graunke
44a11eab9c i965: Delete unused intel_context::driFd field.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:23 -07:00
Kenneth Graunke
e0858763bc i965: Store brw_context as the DRI driver private, not intel_context.
Right now, they're interchangeable.  In the future, intel_context will
either go away or change purpose.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:21 -07:00
Kenneth Graunke
a1d94cdb00 i965: Move intel_context::driContext to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:20 -07:00
Kenneth Graunke
a9d33dbbdd i965: Move intel_context::NewGLState to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:19 -07:00
Kenneth Graunke
dd54558d31 i965: Move intel_context::upload to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:17 -07:00
Kenneth Graunke
0273e6e23e i965: Move intel_context::max_gtt_map_object_size to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:16 -07:00
Kenneth Graunke
b15f1fc3c6 i965: Move intel_context::perf_debug to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:14 -07:00
Kenneth Graunke
7c3180a4ad i965: Move intel_context::no_batch_wrap to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:13 -07:00
Kenneth Graunke
5314afa27a i965: Move intel_context's framerate throttling fields to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:12 -07:00
Kenneth Graunke
ec995de6fb i965: Move intel_context::stats_wm to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:10 -07:00
Kenneth Graunke
329779a0b4 i965: Move intel_context::batch to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:08 -07:00
Kenneth Graunke
5d8186ac1a i965: Move intel_context::hw_ctx to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:07 -07:00
Kenneth Graunke
eeb75b41f1 i965: Move intel_context::bufmgr to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:05 -07:00
Kenneth Graunke
e33439045d i965: Move intel_context's driconf flags to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:04 -07:00
Kenneth Graunke
fe0a8cb30d i965: Move intel_context::reduced_primitive to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:03 -07:00
Kenneth Graunke
9147b40496 i965: Move front buffer rendering fields from intel_context to brw.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:01 -07:00
Kenneth Graunke
e43043c316 i965: Move intel_context::vtbl to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:58 -07:00
Kenneth Graunke
fbdd3891e1 i965: Move intel_context::optionCache to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:55 -07:00
Kenneth Graunke
ca437579b3 i965: Pass brw_context to functions rather than intel_context.
This makes brw_context available in every function that used
intel_context.  This makes it possible to start migrating fields from
intel_context to brw_context.

Surprisingly, this actually removes some code, as functions that use
OUT_BATCH don't need to declare "intel"; they just use "brw."

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:53 -07:00
Kenneth Graunke
86f2711722 i965: Remove pointless intel_context parameter from try_copy_propagate.
It's already part of the visitor class.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:51 -07:00
Kenneth Graunke
18a223d323 i965: Add forward declarations of brw_context to a few places.
These files have forward declarations for intel_context.  This makes
brw_context available in the same places without further #include
monkeying.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:50 -07:00
Kenneth Graunke
a69274454b i965: Replace #include "intel_context.h" with brw_context.h.
brw_context.h includes intel_context.h, but additionally makes the
brw_context structure available.  Switching this allows us to start
using brw_context in more places.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:48 -07:00
Kenneth Graunke
99ebf9d07a i965: Move ctx->Const setup from intelInitContext to the new helper.
This also requires moving _mesa_init_point() to after the ctx->Const
initialization.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:47 -07:00
Kenneth Graunke
963d9f78a4 i965: Split code to set ctx->Const values into a helper function.
brwCreateContext() has a lot of random things to do.  Factoring out the
part that initializes ctx->Const values and shader compiler options
makes the main function a bit easier to read.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:45 -07:00
Kenneth Graunke
d13c120573 i915: Remove i965+ chip names.
i965+ chipsets shouldn't ever hit this driver.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:44 -07:00
Kenneth Graunke
e4f3d5cdcf i965: Remove i915 chip names.
i915 chipsets shouldn't ever hit this driver.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:42 -07:00
Kenneth Graunke
2921390666 i965: Replace intel_context::needs_ff_sync with intel->gen == 5.
Technically, needs_ff_sync was set on Gen5+, but it was only consulted
in the clipper threads and quad/lineloop decomposition code, which are
both Gen4-5 only.  So in reality it only identified Ironlake.

The named flag doesn't really clarify things, and seems like overkill.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:07:13 -07:00
Kenneth Graunke
968c57782d i965: Add missing newline to blorp color clear perf_debug message.
perf_debug() doesn't add a newline for you; without this, all the
INTEL_DEBUG=perf output was jumbled together.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-09 10:10:46 -07:00
Emil Velikov
f0260f4e3d glsl: Silence unused variable warning in the release build
Resolves the following gcc warning

 opt_flip_matrices.cpp:84:32: warning: unused variable 'deref'

v2: keep the variable, but wrap it in an #ifndef NDEBUG block
    (suggested by Ian)

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-08 19:08:42 -07:00
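The pattern from f0260f4e3d, sketched in C with a made-up function: a
variable that is only read by assert() is declared inside #ifndef NDEBUG,
so release builds (where assert() expands to nothing) do not see an unused
variable.

```c
#include <assert.h>

/* Hypothetical example, not the code from opt_flip_matrices.cpp. */
int
table_lookup(const int *table, int count, int index)
{
#ifndef NDEBUG
   /* Only the assert below reads this, so it exists only when
    * assertions are compiled in.  In a release build `count` is then
    * an unused parameter, which compilers flag only with extra
    * warnings enabled. */
   const int last = count - 1;
#endif

   assert(index >= 0 && index <= last);
   return table[index];
}
```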
Emil Velikov
4df6823f21 glsl/ast: Silence uninitialized variable warnings in the release build
Resolves the following gcc warnings

 warning: 'iface_type_name' may be used uninitialized in this function
 warning: 'var_mode' may be used uninitialized in this function

Note: The variables are initialised to UNKNOWN and ir_var_auto

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-08 19:08:30 -07:00
Paul Berry
292368570a i965: Add an assertion to brwProgramStringNotify.
driver->ProgramStringNotify is only called for ARB programs, fixed
function vertex programs, and ir_to_mesa (which isn't used by the i965
back-end).  Therefore, even after geometry shaders are added,
brwProgramStringNotify should only ever be called with a target of
GL_VERTEX_PROGRAM_ARB or GL_FRAGMENT_PROGRAM_ARB.

This patch adds an assertion to clarify that.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-08 14:18:02 -07:00
Matt Turner
ba7b60d3e4 glsl: Allow non-constant expression initializers of const-qualified vars.
Required by ARB_shading_language_420pack.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-08 12:46:56 -07:00
Marek Olšák
1faa375573 r600g: improve the mechanism for recognizing an empty CS
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
287b2fa115 r600g: explicitly flush caches for streamout-based buffer copying & clearing
It's done automatically for vertex buffers, but not for constant buffers,
textures, and colorbuffers.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
7948ed1250 r600g: only flush the caches that need to be flushed during CP DMA operations
This should increase performance if constant uploads are done with the CP DMA,
because only the cache that needs to be flushed is flushed.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
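The idea behind 7948ed1250 and the flag split it depends on can be pictured
as one invalidation bit per read cache, with each operation requesting only
the bit it needs. All identifiers below are hypothetical; the driver's real
flag names and call sites may differ.

```c
#include <assert.h>

/* Hypothetical per-cache invalidation flags. */
enum cache_flag {
   INVAL_VERTEX_CACHE = 1u << 0,
   INVAL_TEX_CACHE    = 1u << 1,
   INVAL_CONST_CACHE  = 1u << 2
};

/* Hypothetical description of how the written buffer is read later. */
enum buffer_use {
   USE_VERTEX,
   USE_TEXTURE,
   USE_CONSTANT
};

/* Returns only the flag for the cache that can hold stale data. */
unsigned
flush_flags_for(enum buffer_use use)
{
   switch (use) {
   case USE_VERTEX:
      return INVAL_VERTEX_CACHE;
   case USE_TEXTURE:
      return INVAL_TEX_CACHE;
   case USE_CONSTANT:
      return INVAL_CONST_CACHE;
   }

   /* Unknown use: fall back to invalidating every read cache. */
   assert(!"unexpected buffer use");
   return INVAL_VERTEX_CACHE | INVAL_TEX_CACHE | INVAL_CONST_CACHE;
}
```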
Marek Olšák
1b40398d02 r600g: split INVAL_READ_CACHES into vertex, tex, and const cache flags
also flushing any cache in evergreen_emit_cs_shader seems to be superfluous
(we don't flush caches when changing the other shaders either)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Alex Deucher
098316211c r600g: adjust flush flags (v3)
1. flush SH with read caches
2. add flag for DB flushes
3. add flag for CB flushes

v2: flush all CBs, remove redundant emit_state variable.
v3: Marek: also set the new flags in r600_context_flush, the CP dma functions,
    and texture_barrier, and rename them

Signed-off-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
862f69fbe1 r600g: don't call buffer_wait in buffer_mmap_sync_with_rings
The winsys should do this, because it measures how much time we spend
in buffer_map doing synchronization, which can be viewed with the gallium
HUD.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
94d294137e r600g: don't read back the MSAA depth buffer if the read flag is not set
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
141b892620 r600g: don't flush the context in texture_transfer_map
the winsys does this automatically

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
ae87aae0c4 r600g: fix texture offset computation for mapped MSAA depth buffers
It was wrong, because the offset shouldn't be applied to MSAA depth buffers.
This small cleanup should prevent such issues in the future.

This fixes a lockup in "piglit/fbo-depthstencil default_fb -samples=n".

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
a3263cca59 r600g: fix color resolve for RGBX8 and RGBX16 integer formats
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
b1a061b81e r600g: enable fast MSAA color clear for array/3D/cube textures
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
87669c3654 r600g: implement fast MSAA color clear for integer textures
this also fixes the fast clear with multiple colorbuffers and each having
a different format

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Christian König
085c695488 r600/uvd: fix check for UVD 2.x
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-07-08 19:51:20 +02:00
Chris Forbes
1415a1884c i965: fix alpha test for MRT
Include src0 alpha in the RT write message when using MRT, so it is used
for the alpha test instead of the normal per-RT alpha value.

Fixes broken rendering in Dota2 under Wine [FDO #62647].

No Piglit regressions on Ivybridge.

V2: reuse (and simplify) existing sample_alpha_to_coverage flag in
the FS key, rather than adding another redundant one.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62647
NOTE: This is a candidate for the stable branches.
2013-07-06 12:41:54 +12:00
Roland Scheidegger
9ef49cfd84 gallivm: (trivial) fix using one lod instead of per-quad lod for texel fetch
The logic for choosing number of lods was bogus.
(The code should ultimately handle the case of only one lod even with multiple
quads but currently can't.)
2013-07-05 18:07:51 +02:00
José Fonseca
45f174ce40 gallivm: Remove bogus assert.
It is perfectly valid for the swizzle to be bigger than 2. For example the
texel offsets could be

  SAMPLE ..., IMM[0].zzz

What is not correct is for chan_index to be bigger than 2.

Trivial.
2013-07-05 14:35:54 +01:00
Ben Skeggs
c29c6b2b2e nvc0: enable very initial support for nvf0 (GK110)
Shaders need a lot of work still.  Basic stuff generally works, so this
is basically just fine for gnome-shell, OA etc at this point.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2013-07-05 14:15:04 +10:00
Roland Scheidegger
4dbca8672b gallivm: (trivial) fix bogus assertion for per-element lod with 1d resources
The assertion was always broken but the code unused until enabling the
per-element lod code. Fixes piglit texelFetch vs isampler1D and similar
tests (only run with GL 3.0 version override).
2013-07-05 01:19:23 +02:00
Roland Scheidegger
f3bbf65929 gallivm: do per-pixel lod calculations for explicit lod
d3d10 requires per-pixel lod calculations for explicit lod, lod bias and
explicit derivatives, and we should probably do it for OpenGL too - at least
if they are used from vertex or geometry shaders (so doesn't apply to lod
bias) this doesn't just affect neighboring pixels.
Some code was already there to handle this so fix it up and enable it.
There will no doubt be a performance hit, unfortunately. We could do better
if we knew we had a real vector shift instruction (with variable shift
count), but this requires AVX2 on x86 (or an AMD Bulldozer family CPU).
Don't do anything for lod bias and explicit derivatives yet, though
no special magic should be needed for them either.
Likewise, the size query is still broken just the same.

v2: Use information if lod is a (broadcast) scalar or not. The idea would be
to base this on the actual value, for now just pretend it's a scalar in fs
and not a scalar otherwise (so, per-pixel lod is only used in gs/vs but same
code is generated for fs as before).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-04 19:42:04 +02:00
Zack Rusin
bbd1e60198 draw: fix overflows in the indexed rendering paths
The semantics for overflow detection are a bit tricky with
indexed rendering. If the base index in the elements array
overflows, then the index of the first element should be used,
if the index with bias overflows then it should be treated
like a normal overflow. Also overflows need to be checked for
in all paths that use either the bias or the starting index location.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-03 09:06:30 -04:00
Zack Rusin
09820902d7 draw/llvm: index overflows if it's greater than elt max
The comparison, incorrectly, was greater-than-or-equal to
elt max.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-03 09:06:24 -04:00
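The off-by-one described in 09820902d7 can be illustrated with a minimal sketch. It assumes that "elt max" is the largest valid index value; the helper name index_overflows is hypothetical and is not taken from the draw module.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not the draw module's code).
 * Assumption: elt_max is the largest index that is still valid, so an
 * index overflows only when it is strictly greater than elt_max. */
static bool index_overflows(uint32_t index, uint32_t elt_max)
{
    /* Using '>=' here would wrongly reject index == elt_max. */
    return index > elt_max;
}
```

With elt_max = 7, index 7 is accepted and index 8 is reported as an overflow.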
Kenneth Graunke
764afc48cf i965: Move the rest of intel_tex_layout.c into brw_tex_layout.c.
The texture alignment unit functions are called from brw_tex_layout.c,
so it makes sense to put them there.  Since the only caller of
intel_get_texture_alignment_unit() is in brw_tex_layout.c, it could be
made into a static function.  However, this patch instead simply folds
it into the caller, as it's only two lines anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
466aa712b6 i965: Push intel_get_texture_alignment_unit call into brw_miptree_layout
intel_miptree_create_layout() calls intel_get_texture_alignment_unit()
and then immediately calls brw_miptree_layout().  There are no other
callers.

intel_get_texture_alignment_unit() populates the miptree's alignment
unit fields, which are used by brw_miptree_layout() to determine where
to place each miplevel.  Since brw_miptree_layout() needs those to be
present, it makes sense to have it initialize them as the first step.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
c4c3c0dc94 i965: Declare for-loop counters in the loop in brw_tex_layout.c.
The driver is compiled in C99 mode, so this is not a problem.  It's
slightly tidier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
ccf312fd12 i965: Remove use of GLuint/GLint in brw_tex_layout.c.
Using GL types is silly; this isn't even remotely API-facing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
ed95e396f3 i965: Tidy the brw_tex_layout.c copyright and file header comments.
This uses Doxygen style for the file comments, and generally makes it
more consistent with the rest of the driver.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
2ea87fde31 i965: Move i945_texture_layout_2d to brw_tex_layout.c
This consolidates the miptree layout logic in a single file.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
1920209970 i965: Remove fallthrough for Gen4 cube map layout.
Now that both 2DArray and Cube layouts are taken care of by helper
functions, it's easy to just call the right function for each
generation.  This is a little cleaner than falling through.

This also reworks the comments.  Referencing "Volume 1" of the BSpec
isn't very helpful, since that's only available inside Intel, and it
doesn't even use volume numbers.  Also, "Ironlake...finally" sounds a
bit strange considering that almost all hardware uses the 2D array
approach.  At this point, Gen4 is the only special case.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
7e4007a1b3 i965: Combine GL_TEXTURE_CUBE_MAP_ARRAY case with the other array cases.
These do the exact same thing; combining them is tidier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
bc51f15b32 i965: Pull 3D texture layout code out into a helper function.
A bit cleaner than having it in one giant function.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
abc2bdffd6 i965: Replace maxBatchSize variable with BATCH_SZ define.
maxBatchSize was only ever initialized to BATCH_SZ, and a few places
used BATCH_SZ directly anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
2c602d2adf i965: Move annotate_aub out of the vtable.
brw_annotate_aub() is the only implementation of this function, so it
makes sense to just call it directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
f05f8793c8 i965: Move debug_batch hook out of the vtable.
brw_debug_batch() is the only implementation of this function, so it
makes sense to just call it directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
749160aab3 i965: Remove render_target_supported from the vtable.
brw_render_target_supported() is the only implementation of this
function, so it makes sense to just call it directly.

Rather than adding an #include of brw_wm.h, this patch moves the
prototype to brw_context.h.  Prototypes seem to be in rather arbitrary
places at the moment, and either place seems as good as the other.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
7c5279e554 i965: Move is_hiz_depth_format out of the vtable.
brw_is_hiz_depth_format() is the only implementation of this function,
so it makes sense to just call it directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
607338f1cb i965: Remove the invalidate_state() vtable hook.
The hook was a noop.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
251cdcf059 i965: Replace fprintfs with assertions in GLenum comparison translators.
These functions translate GLenum comparison operations into the hardware
enumerations.  They should never be passed something other than a GL
comparison operator, or something is very broken.

Assertions seem more appropriate than fprintf.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
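A rough sketch of the pattern described in 251cdcf059: a translator that asserts on an impossible input instead of printing a message. The SKETCH_GL_* values mirror the GL comparison enums, while the HW_CMP_* names and values are hypothetical placeholders rather than the definitions in brw_defines.h.

```c
#include <assert.h>

/* Stand-ins for the GL comparison enums, so the sketch needs no GL headers. */
enum {
    SKETCH_GL_NEVER    = 0x0200,
    SKETCH_GL_LESS     = 0x0201,
    SKETCH_GL_EQUAL    = 0x0202,
    SKETCH_GL_LEQUAL   = 0x0203,
    SKETCH_GL_GREATER  = 0x0204,
    SKETCH_GL_NOTEQUAL = 0x0205,
    SKETCH_GL_GEQUAL   = 0x0206,
    SKETCH_GL_ALWAYS   = 0x0207
};

/* Hypothetical hardware encodings, for illustration only. */
enum {
    HW_CMP_ALWAYS, HW_CMP_NEVER, HW_CMP_LESS, HW_CMP_EQUAL,
    HW_CMP_LEQUAL, HW_CMP_GREATER, HW_CMP_NOTEQUAL, HW_CMP_GEQUAL
};

static int translate_compare_func(unsigned func)
{
    switch (func) {
    case SKETCH_GL_NEVER:    return HW_CMP_NEVER;
    case SKETCH_GL_LESS:     return HW_CMP_LESS;
    case SKETCH_GL_EQUAL:    return HW_CMP_EQUAL;
    case SKETCH_GL_LEQUAL:   return HW_CMP_LEQUAL;
    case SKETCH_GL_GREATER:  return HW_CMP_GREATER;
    case SKETCH_GL_NOTEQUAL: return HW_CMP_NOTEQUAL;
    case SKETCH_GL_GEQUAL:   return HW_CMP_GEQUAL;
    case SKETCH_GL_ALWAYS:   return HW_CMP_ALWAYS;
    }

    /* Any other value means the caller is badly broken: fail loudly in
     * debug builds rather than logging and carrying on. */
    assert(!"Invalid comparison function.");
    return HW_CMP_ALWAYS;
}
```

The trailing return only keeps release builds (where assert compiles away) well defined.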
Kenneth Graunke
7ee616f1bf i965: Replace intel_state.c enums with those from brw_defines.h.
Both intel_context.h and brw_defines.h have #defines for comparison
functions, stencil ops, blending logic ops, and blending factors.
They're exactly the same values, so it makes sense to pick one.

brw_defines.h is the logical place for this kind of stuff, so this patch
converts intel_state.c to use the set defined there.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
c9db037dc9 i965: Delete pre-DRI2.3 viewport hacks.
The __DRI_USE_INVALIDATE extension was added on May 11th, 2010 by commit
4258e3a2e1.  At this point, it's unlikely that anyone's using the
right mix of new and old components to hit this path.  Deleting it
removes an untested code path and cleans up the driver a bit.

Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Keith Packard <keithp@keithp.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
cbb37b7586 i965: Remove "There are probably better ways" comment.
There are always better ways to do things.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
7115bee993 i965: Delete brw_print_reg() function.
This wasn't called from anywhere; presumably it was used to examine
brw_regs when debugging shader assembly.  However, it prints registers
in a different notation than brw_disasm.c which everyone is used
to...which means I doubt anyone will want to use it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
bc8b62e3a0 i965: Move contents of intel_clear.h to intel_context.h.
Having a header file for a single prototype seems rather excessive.
Plus, the actual function is in brw_clear.c, not intel_clear.c, so
there isn't even the .c/.h filename symmetry one might expect.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
7d8e70f301 i965: Move contents of intel_extensions.h to intel_context.h.
Having an entire header file for a single prototype seems a bit
excessive.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
7d119880e8 i965: Remove some dead code.
A random smattering of things that just aren't used anymore.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
d245e795cf i965: Delete dead intel_buffer_object::range_map_size field.
Nothing uses this, apparently.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
1f6ebdd43f i965: Remove intel_buffer_object::source.
This was only used for BOs backed by system memory on i915.  With that
gone, there's nothing that even sets source to non-zero, so this is
purely dead code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
6e5b80ee5a i965: Fix buffer object segfault since removal of system memory BOs.
Commit cf31a19300 removed support for BOs
backed by system memory, as it was only useful for i915.  However, it
removed a little too much code: intel_bufferobj_buffer() used to call
intel_bufferobj_alloc_buffer(), and after that commit, it didn't.

This led to NULL pointer dereferences in several test cases, such as
es3conform's transform_feedback_state_variables test.

This commit restores the allocation, preserving the original behavior.
It may not be the cleanest approach, but tidying should come later.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66432
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:12 -07:00
Matthew McClure
012ba47076 postprocess: move second temporary assertion into isolated configuration
With this patch we will only assert that the second temporary is allocated,
when there are more than two active filters.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66423

Signed-off-by: Brian Paul <brianp@vmware.com>
2013-07-03 09:19:04 -06:00
José Fonseca
9b6788eb15 glsl: Ensure snprintf is defined on MSVC builds.
Should fix:

  src\glsl\opt_dead_builtin_varyings.cpp(244) : error C3861: 'snprintf': identifier not found
  ...
2013-07-03 08:26:08 +01:00
Ilia Mirkin
4bc8e3c3e4 targets/xvmc-nouveau: add in missing nv30 lib
Currently libXvMCnouveau.so is missing nv30_screen_create. Add it in so
that it may be dlopen'd.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-07-03 09:02:40 +02:00
Marek Olšák
30c3e8718d mesa,glsl,gallium: remove GLSLSkipStrictMaxVaryingLimitCheck and dependencies
Not needed with do_dead_builtin_varyings.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
74edd56927 st/mesa: disable EXT_separate_shader_objects
The extension disallows elimination of set-but-unused varyings.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
b3d8b4c0b4 glsl/linker: eliminate unused and set-but-unused built-in varyings
This eliminates built-in varyings such as gl_Color, gl_SecondaryColor,
gl_TexCoord, and gl_FogFragCoord if they are unused by the next stage or
not written at all (e.g. gl_TexCoord elements). The gl_TexCoord array is
broken down into separate vec4s if needed.

v2: - use a switch statement in varying_info_visitor::visit(ir_variable*)
    - use snprintf
    - disable the optimization for GLES2

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
3c555827c3 glsl/linker: check against varying limit after unused varyings are eliminated
We counted even the varyings which were later eliminated, which was
suboptimal.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
284d954912 glsl/linker: link shaders in the opposite order (from fragment to vertex)
This ensures that inter-shader outputs and inputs are properly eliminated
across 3 or more shader stages. The behavior is unchanged with 2 or fewer
shader stages.

For example, elimination of unused FS inputs causes elimination of matching
GS outputs, which causes elimination of the GS inputs that were needed for
evaluation of the eliminated GS outputs, which causes elimination of
matching VS outputs. An unused FS input is all that's needed to trigger
this chain reaction.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
030ca230e2 mesa: renumber shader indices according to their placement in pipeline
See my explanation in mtypes.h.

v2: don't do this in gallium
v3: also updated the comment at the gl_shader_type definition

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
José Fonseca
84f367e69a gallivm: Simplify intrinsic name construction.
Just noticed this could be slightly shortened when fixing MSVC build.

Trivial.
2013-07-02 13:12:31 +01:00
Kenneth Graunke
15ca0ca1b6 glsl/builtins: Fix ARB_texture_cube_map_array built-in availability.
This patch adds texture() for isamplerCubeArray and usamplerCubeArray,
which were entirely missing.

It also makes texture() with a LOD bias fragment shader specific.  The
main GLSL specification explicitly says that texturing with LOD bias
should not be allowed for vertex shaders.

Affects Piglit's ARB_texture_cube_map_array/compiler/tex_bias-01.vert,
which tries to use bias in a vertex shader.  Currently, it expects this
to pass (so this patch regresses the test), but I've sent a patch to
reverse the expected behavior (so this patch would fix the updated test):
http://lists.freedesktop.org/archives/piglit/2013-June/006123.html

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2013-07-02 01:01:30 -07:00
José Fonseca
4c859901ce gallivm: Fix MSVC build.
2013-07-02 06:41:32 +01:00
José Fonseca
e621ec816d gallivm: Fix indirect immediate registers.
If reg->Register.Indirect is true then the immediate is not truly a
constant LLVM expression.

There is no performance regression in using LLVMBuildBitCast, as it will
fall back to LLVMConstBitCast internally when the argument is a constant.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-07-02 06:30:06 +01:00
Zack Rusin
70bc43acdb gallium/tests: fix the translate test
2013-06-28 09:43:17 -04:00
Anuj Phogat
722721d718 i965: Enable ext_framebuffer_multisample_blit_scaled on intel h/w
This patch enables ext_framebuffer_multisample_blit_scaled extension
on intel h/w >= gen6.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-01 15:21:25 -07:00
Anuj Phogat
6fc3da2da0 i965/blorp: Add bilinear filtering of samples for multisample scaled blits
Current implementation of ext_framebuffer_multisample_blit_scaled in
i965/blorp uses nearest filtering for multisample scaled blits. Using
nearest filtering produces blocky artifacts and negates the benefits
of MSAA. That is the reason why extension was not enabled on i965.

This patch implements the bilinear filtering of samples in blorp engine.
Images generated with this patch are free from blocky artifacts and show
big improvement in visual quality.

Observed no piglit and gles3 regressions.

V3:
- Algorithm used for filtering assumes a rectangular grid of samples
  roughly corresponding to sample locations.
- Test the boundary conditions on the edges of texture.

V4:
- Clip texcoords and use conditional MOVs.
- Send texture dimensions as push constants.
- Remove the optimization in case of scaled multisample blits.

V5:
- Move mcs_fetch() inside the 'for' loop after computing pixel coordinates.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-01 15:21:25 -07:00
Ian Romanick
27f2df2507 docs: Import 9.1.4 release notes, add news item.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-01 14:48:58 -07:00
Zack Rusin
1c2e5c223d draw/translate: fix instancing
We were incorrectly computing the buffer offset when using the
instances. The buffer offset is always equal to:
start_instance * stride + (instance_num / instance_divisor) *
stride
We were completely ignoring the start instance, quite
often producing instances that were completely wrong, e.g. if
start instance = 5, instance divisor = 2, then on the first
iteration it should be:
5 * stride, not (5/2) * stride as we'd have currently, and if
start instance = 1, instance divisor = 3, then on the first
iteration it should be:
1 * stride, not 0 as we'd have.
This fixes it and adjusts all the code to the changes.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 05:21:20 -04:00
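The offset formula from 1c2e5c223d can be checked with a small worked sketch. The function name instance_buffer_offset and the 64-bit result type are assumptions made for illustration; they are not the draw/translate code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper implementing the formula quoted in the commit:
 *   start_instance * stride + (instance_num / instance_divisor) * stride
 * The result is widened to 64 bits so the sketch itself cannot wrap. */
static uint64_t instance_buffer_offset(uint32_t start_instance,
                                       uint32_t instance_num,
                                       uint32_t instance_divisor,
                                       uint32_t stride)
{
    assert(instance_divisor != 0);
    return (uint64_t)start_instance * stride +
           (uint64_t)(instance_num / instance_divisor) * stride;
}
```

On the first iteration (instance_num = 0) with start instance 5 and divisor 2, the result is 5 * stride; with start instance 1 and divisor 3 it is 1 * stride, matching the two cases listed in the commit message.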
Zack Rusin
df4ab7974a draw: fix incorrect clipper invocation statistics
Clipper invocations are computed earlier (of course,
before the emission), so this code was adding bogus
numbers to already computed clipper invocations.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:29 -04:00
Zack Rusin
34546d61c1 draw/gallivm: export overflow arithmetic to its own file
We'll be reusing this code, so let's put it in a common file
and use it in the draw module.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:24 -04:00
Zack Rusin
88de009cc1 draw: check for integer overflows in instance computation
Integers could easily overflow if the starting instance
was large enough. Instead of letting bogus counts through,
set the instance to max if it overflowed and let our
regular buffer overflow computation handle it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:20 -04:00
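A minimal sketch of the saturating idea in 88de009cc1. It assumes the overflowing quantity is the sum of the start instance and the current instance index; the name saturated_instance is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: add the start instance to the instance index and
 * clamp to UINT32_MAX on wraparound, so that a later buffer bounds check
 * sees an obviously out-of-range value instead of a small bogus one. */
static uint32_t saturated_instance(uint32_t start_instance,
                                   uint32_t instance_index)
{
    if (instance_index > UINT32_MAX - start_instance)
        return UINT32_MAX;   /* the sum would not fit in 32 bits */
    return start_instance + instance_index;
}
```

Clamping rather than failing keeps the overflow decision in one place, the regular buffer overflow computation mentioned in the commit message.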
Zack Rusin
2f13f28120 draw: check for an integer overflow when computing stride
Our buffer overflow arithmetic was susceptible to integer
overflows, which caused the buffer overflow logic to break.
Let's use the llvm overflow intrinsics to check for integer
overflows while computing the stride/needed buffer size.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:16 -04:00
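The commit above relies on LLVM's overflow intrinsics in generated code. As a stand-in, the same check can be sketched in plain C with the GCC/Clang builtins __builtin_mul_overflow and __builtin_add_overflow; the function name needed_buffer_size and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: compute index * stride + elem_size in 32 bits.
 * Returns false if either step wrapped, in which case *out must not be
 * used for a bounds check. Requires GCC 5+ or a recent Clang. */
static bool needed_buffer_size(uint32_t index, uint32_t stride,
                               uint32_t elem_size, uint32_t *out)
{
    uint32_t offset;

    if (__builtin_mul_overflow(index, stride, &offset))
        return false;
    if (__builtin_add_overflow(offset, elem_size, out))
        return false;
    return true;
}
```

Without such a check, a wrapped product can look like a small, in-bounds size and defeat the buffer overflow test that follows it.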
Zack Rusin
e742f7788e draw: account for elem size when computing overflow
We weren't taking into account the size of element
that is to be fetched, which meant that it was possible
to overflow the buffer reads if the stride was very
close to the end of the buffer, e.g. stride = 3, buffer
size = 4, and the element to be read = 4. This should
be properly detected as an overflow.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:12 -04:00
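The example in e742f7788e (stride = 3, buffer size = 4, 4-byte element) can be worked through with a small sketch. The helper name fetch_overflows is hypothetical and the check is written for illustration, not copied from the draw module.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: a fetch of elem_size bytes starting at offset is
 * out of bounds when offset + elem_size > buffer_size. The comparison is
 * rearranged so the arithmetic itself cannot wrap. */
static bool fetch_overflows(uint64_t offset, uint64_t elem_size,
                            uint64_t buffer_size)
{
    return elem_size > buffer_size || offset > buffer_size - elem_size;
}
```

With stride 3, element 1 starts at offset 3, which is inside a 4-byte buffer, yet reading 4 bytes from there ends at byte 7, so the fetch must be reported as an overflow. Element 0 (offset 0) still fits exactly.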
Vinson Lee
7214fe3cc4 i965: Initialize brw_blorp_const_color_program member variables.
Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-01 10:16:16 -07:00
Ross Burton
2c6186390c eglplatform: use unsigned long instead of 32-bit ints in generic platform
In the generic Unix case use the "unsigned long" type instead of 32-bit
integers so that the type sizes are consistent on 64-bit machines between X11
and not-X11.

Signed-off-by: Ross Burton <ross.burton@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-01 10:06:24 -07:00
Ross Burton
1a7275de9a build: fix EGL build when no X11 headers are present
eglplatform.h defaults to X11 on Unix unless told otherwise, so if we're doing a
build without any X11 support tell it so that we don't try including headers
that don't exist.

Also set GL_PC_FLAGS so that the definition is in egl.pc, so that applications
using EGL don't try to pull in X11 headers on systems where EGL was configured
without X11 support.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64959
Signed-off-by: Ross Burton <ross.burton@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-01 10:06:11 -07:00
José Fonseca
acc6a141b8 tools/trace: Return dummy fence object to silence warnings.
2013-07-01 12:06:58 +01:00
José Fonseca
0fd71ac9eb tools/trace: Don't crash if a trace has no timing information.
2013-07-01 12:05:57 +01:00
José Fonseca
fa3040c117 scons: Fix dependencies of enums.c and api_exec.c.
2013-07-01 12:04:59 +01:00
Maarten Lankhorst
bf95ca7de0 nvc0: allow frame dropping in h264
The only reason the checks existed was paranoia: when I first
wrote the code I wasn't sure it was correct. Now I am, and
the asserts triggered when XBMC was dropping frames, so remove them.

NOTE: This is a candidate for the 9.1 branch.
2013-07-01 08:47:49 +02:00
Tom Stellard
24fa43675f r300g/compiler: Prevent regalloc from swizzling texture operands v2
https://bugs.freedesktop.org/show_bug.cgi?id=63520

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-30 21:38:57 -07:00
Tom Stellard
e2c3640540 r300g/compiler/tests: Add an assembly parser
The assembly parser can be used to load r300 assembly dumps
and run them through any of the r300 compiler passes.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-30 21:38:57 -07:00
Tom Stellard
ab40d8d56f r300g: Fix make check
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-30 21:24:55 -07:00
Grigori Goronzy
30004b20c2 r600g: implement fast color clears for MSAA on evergreen+
Allows MSAA colorbuffers, which have a CMASK automatically and don't
need any further special handling, to be fast cleared. Instead
of clearing the buffer, set the clear color and the CMASK to the
cleared state.

Fast clear is used only when all bound colorbuffers fulfill certain
conditions: a CMASK is required, we have to be able to create a clear
color value for the format and the texture mustn't contain multiple
images. Technically, it should be possible to support array textures
and cubemaps if all images are attached to the framebuffer,
but this does not appear to be common.

v2: fix fast clear check
v3: Marek: - disable fast clear with 128-bit formats, which are unsupported
           - set tex->dirty_level_mask in r600_clear, so that the driver knows
             the resource must be decompressed/expanded
           - return early from r600_clear if there's nothing else to do

Signed-off-by: Marek Olšák <maraeo@gmail.com>
2013-07-01 03:02:43 +02:00
Marek Olšák
b1693194ee r600g/compute: disable unused colorbuffer slots
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-01 03:02:43 +02:00
Marek Olšák
f83e220d36 st/mesa: handle SNORM formats in generic CopyPixels path
v2: check desc->is_mixed in util_format_is_snorm
2013-06-30 22:14:37 +02:00
Matt Turner
adf8afa168 i965: NULL check depth_mt to quiet static analysis.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-29 15:19:08 -07:00
Roland Scheidegger
7d430bfab9 llvmpipe: fix timer query if there's no bins
b04a295a4a removed seemingly unnecessary
code in get_query. Turns out this code could in fact be reached - while
timestamps are always binned, if there are no bins (which happens if fb
size is 0) then the rasterization query code filling this in is still
never executed.
So fix this up by filling in some timestamp, but do it at EndQuery time
not GetQuery time which should be more appropriate.
Makes piglit arb_timer_query-timestamp-get happy again.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-29 16:58:02 +02:00
Tom Stellard
5a925cc550 clover: Don't segfault when compiling a program with no kernel
2013-06-28 15:19:06 -07:00
Eric Anholt
d7361f2943 mesa: Remove unused allow_large_textures driconf from classic drivers.
This option hasn't been used since the introduction of DRI2.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:27 -07:00
Kenneth Graunke
03600660a1 i915: Remove GLES 3.0 sRGB workaround.
Gen3 doesn't support GLES 3.0, so there's no need for it.

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
dc8796506e i965: Remove is_945.
Only relevant on Gen3.

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
a4e31956ac i965: Delete hw_stencil flag.
This was only used by i915.

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
4299e35888 i965: Remove hw_stipple flag.
This was only used by i915.

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
1a5dca38e9 i965: Remove use_early_z option.
This was only used by i965+.

v2: Also remove the option from the driconf list. (change by anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
2cc5724db2 i965: Remove unused SUBPIXEL_* macros.
Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
2e9fe0ca12 i965: Remove redundant Gen3 PCI IDs.
Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
1811f5c43d intel: Remove unused INTEL_MAX_FIXUP macro.
v2: Remove it from i915, too (change by anholt)

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Eric Anholt
0ac0a1b02e i965: Drop i915 register/instruction definitions.
v2: Remove unused DV_PF_* macros, too. (change by Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:26 -07:00
Eric Anholt
1b67cd29a1 i965: Drop code for calling the empty brw_update_draw_buffers() hook.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
7c232189c5 i965: Drop dead i915 blend state code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
d58d0a3754 i965: Drop i915-specific blit clear code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
cf31a19300 i965: Drop the system-memory VBO support for i915.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
814440aadd i965: Drop i915 swtnl code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
bb2e312d4d i965: Drop i915-specific vtbl entries.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
a61d8f6110 i965: Drop swtnl fallback code for i915.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
28e80d7136 i965: Drop i915 code from intel_screen.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
4a08a86f22 i965: Drop #ifdef I915 code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
6fddd375d7 i965: Drop code checking for gen <= 3.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
3c231b8631 i915: Remove a duplicated set of PCI IDs.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
8ac1ed92aa i915: Remove various remaining dead code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
934974fba6 i915: Remove dead debug flags.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
39c5fd7f13 i915: Remove state batch emit support.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
a40f9871a0 i915: Drop unused register #defines from the shared reg file.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
173666e2ed i915: Drop 965+ GL version setup.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
f6426509dc i915: Remove gen6+ batchbuffer support.
While i915 hardware does have hardware contexts, we don't expect there
to ever be SW support for it (given that support hasn't even made it back
to gen5 or gen4).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
c25e3c34d6 i915: Drop chipset detection code for 965+ chipsets.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
014251ef42 i915: Drop context fields specific to 965+ chipsets.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
d71b7301ec i915: Drop all has_llc code.
i915 never has llc.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
be63c1c993 i915: Remove the remainder of the batchbuffer caching.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
7f210bf535 i915: Remove miscellaneous uncalled gen4 code from formerly shared files.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
6bdc5ecbba i915: Remove most of the code under gen >= 4 checks.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
18100d415e i915: Remove fake ETC support that only existed on gen4+
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
27eedca3e0 i915: Remove separate stencil code.
This was formerly-shared code for supporting gen5+.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
279f0bce47 i915: Remove the I915 macro from the formerly shared code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
f26104eb5b i915: Remove all the MSAA support code.
This hardware doesn't have MSAA support, so this code is all a waste for it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
0f31e06a2e i915: Remove all the HiZ code from i915.
v2: Remove extra struct forward declaration (change by Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Ian Romanick
927f572c27 mesa: GL_EXT_shadow_funcs is not optional with GL_ARB_shadow
Every driver left in Mesa that enables one also enables the other.
There's no reason to let it be optional.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
41853b598c mesa: GL_ARB_texture_storage_multisample is not optional with GL_ARB_texture_multisample
In Mesa, this extension is implemented purely in software.  Drivers may
*optionally* provide optimized paths.  If a driver enables,
GL_ARB_texture_multisample, it gets GL_ARB_texture_storage_multisample
for free.

NOTE: This has the side effect of enabling the extension in Gallium
drivers that enable GL_ARB_texture_multisample.

v2 (Ken): Still prevent multisample texture targets in TexParameter for
implementations that don't support multisampling.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
d5b6b7a39b mesa: GL_ARB_texture_storage is not optional
In Mesa, this extension is implemented purely in software.  Drivers may
*optionally* provide optimized paths.

NOTE: This has the side effect of enabling the extension in the radeon,
r200, and nouveau drivers.

v2: Minor whitespace tidying (suggested by Brian).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
70966570f3 mesa: GL_ARB_shading_language_100 is not optional
This extension just provides some of the most basic software framework
for GLSL.  Without GL_ARB_vertex_shader or GL_ARB_fragment_shader,
applications still cannot use GLSL.  There's no value in
conditionalizing support for this extension.

NOTE: This has the side effect of enabling the extension in the radeon,
r200, and nouveau drivers.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
e6ec425d6e mesa: GL_ARB_shader_objects is not optional
This extension just provides some of the most basic software framework
for GLSL.  Without GL_ARB_vertex_shader or GL_ARB_fragment_shader,
applications still cannot use GLSL.  There's no value in
conditionalizing support for this extension.

NOTE: This has the side effect of enabling the extension in the radeon,
r200, and nouveau drivers.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
9bc24b4fc4 mesa: GL_NV_blend_square is not optional
Every driver left in Mesa enables this extension all the time.  There's
no reason to let it be optional.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
338ea2e4d1 mesa: GL_EXT_fog_coord is not optional
Every driver left in Mesa enables this extension all the time.  There's
no reason to let it be optional.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
c139708087 mesa: GL_EXT_secondary_color is not optional
Every driver left in Mesa enables this extension all the time.  There's
no reason to let it be optional.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
b5305a303b mesa: GL_EXT_framebuffer_object is not optional
Every driver left in Mesa enables this extension all the time.  There's
no reason to let it be optional.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
f4571640b8 mesa: Remove GL_MESA_resize_buffers
Commit bab755a made the implementation a no-op, and it was only ever
enabled by software rasterizers.

v2: Move the spec into docs/specs/OLD since it's now obsolete
    (squashed patch from Andreas Boll)

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
34e8905077 mesa: Remove _mesa_{enable, disable}_extension and _mesa_extension_is_enabled
They're not used anywhere.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
e14b486113 mesa: Just set extension flags instead of calling _mesa_enable_extension
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
b0d755f00b mesa: Remove _mesa_enable_._._extensions functions
After the preceding commits, they are not used.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
45099ec175 swrast: Don't call _mesa_enable_._._extensions and _mesa_enable_sw_extensions
_mesa_enable_sw_extensions enables all the extensions (and more) that
the others enable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
a964397fd9 osmesa: Don't call _mesa_enable_._._extensions and _mesa_enable_sw_extensions
_mesa_enable_sw_extensions enables all the extensions (and more) that
the others enable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
c9edd661c4 wmesa: Don't call _mesa_enable_._._extensions and _mesa_enable_sw_extensions
_mesa_enable_sw_extensions enables all the extensions (and more) that
the others enable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
89cf6e6273 x11: Don't call _mesa_enable_._._extensions and _mesa_enable_sw_extensions
_mesa_enable_sw_extensions enables all the extensions (and more) that
the others enable.  Also, don't duplicate the DXTn checks.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
0b9398c74f i965: Merge the two GEN >= 6 extension enable blocks
There's no reason for these blocks to be separate.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:21 -07:00
Ian Romanick
ae66a656fd i965: Move GEN >= 4 extensions into the "always on" list
This copy of the source file is only used for GEN >= 4, so extensions
that are enabled for GEN >= 4 are always enabled.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:21 -07:00
Ian Romanick
4ed976f6b5 i965: Move GEN >= 3 extensions into the "always on" list
This copy of the source file is only used for GEN >= 4, so extensions
that are enabled for GEN >= 3 are always enabled.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:20 -07:00
Ian Romanick
e621208e29 i915: Remove GEN >= 4 extension support
This copy of the source file is only used for GEN <= 3, so remove the
dead code.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:20 -07:00
Kenneth Graunke
745f6c692c i965: Split surface format code into a new file (brw_surface_formats.c).
brw_wm_surface_state.c has gotten rather large and unwieldy.  At this
point, it consists of two separate portions:

1. Surface format code

   This includes the giant table of surface formats and what features
   they support on each generation, as well as the code to translate
   between Mesa formats and hardware formats.

   This is used across all generations.

2. Binding table (SURFACE_STATE) related code.

   This is the code to generate SURFACE_STATE entries for renderbuffers,
   textures, transform feedback buffers, constant buffers, and so on, as
   well as the code to assemble them into binding tables.

   This is only used on Gen4-6; gen7_surface_state.c has Gen7+ code.

Since the two are logically separate, and one is reused on every
generation while the other is not, it makes a lot of sense to split
them out.  It should also make finding code easier.

No code is changed by this patch.  I simply copied the file then deleted
portions of both.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:11 -07:00
Alex Deucher
c309e64db8 radeonsi: add kabini pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:17:27 -04:00
Alex Deucher
b6b1346691 radeonsi: add bonaire pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:17:18 -04:00
Alex Deucher
d669992e35 radeonsi: disable 2D tiling on CIK for now
Causes GPU hangs.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:17:10 -04:00
Alex Deucher
1357624abc radeonsi: add llvm processor names for CIK
Requires updated llvm.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:17:00 -04:00
Alex Deucher
234d81e6b2 radeonsi: emit PA_SC_RASTER_CONFIG[_1] on cik
Use the golden values for each asic.

Todo: update Kabini and Kaveri.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:53 -04:00
Alex Deucher
9d8ad222c6 radeonsi: PA_CL_ENHANCE is privileged on CIK
Needs to be and is set by the kernel.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:46 -04:00
Alex Deucher
72c10be3a7 radeonsi: update surface sync packet emit for CIK
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:35 -04:00
Alex Deucher
f2a9bd8084 radeonsi: store chip class in the pm4 struct
Will be used for asic specific pm4 behavior.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:27 -04:00
Alex Deucher
3a47f1945f radeonsi: properly handle DB tiling setup on CIK
On CIK, DB switches back to using per-surface tiling
parameters rather than the tile index used on SI.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:17 -04:00
Alex Deucher
8c903f5df9 radeonsi: emit additional shader pgm rsrc registers for CIK
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:10 -04:00
Alex Deucher
59e4fe0b75 radeonsi: emit TA_BC_BASE_ADDR_HI for border color on CIK
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:03 -04:00
Alex Deucher
b363a45c54 radeonsi: fix VGT_PRIMITIVE_TYPE emit for CIK
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:15:54 -04:00
Alex Deucher
ecb679a8d3 radeonsi: register updates for CIK
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:15:46 -04:00
Alex Deucher
deb2358243 radeonsi: initial PM4 changes for CIK
Note which packets are removed and add new ones.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:15:36 -04:00
Alex Deucher
f29f206c93 radeonsi: initial support for CIK chips
Add the infrastructure to differentiate them.
Just treat them like SI for now.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:15:28 -04:00
Alex Deucher
5b3f1ea933 radeonsi: rename SI chip class from TAHITI to SI
Covers the entire family.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:15:20 -04:00
Tom Stellard
47e35eff9d r600g: Fix build
Broken since 2840bec56f when opencl is
disabled.
2013-06-28 11:11:43 -07:00
Anuj Phogat
ee723ffabb mesa: Return ZeroVec/dummyReg instead of NULL pointer
Assertions are not sufficient to check for null pointers as they don't
show up in release builds. So, return ZeroVec/dummyReg instead of NULL
pointer in get_{src,dst}_register_pointer(). This should calm down the
warnings from the static analysis tool.
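
As a rough illustration of the idea (not the actual Mesa code), the sketch
below returns a shared zero vector instead of NULL for an out-of-range
register index.  The names regs, zero_vec, NUM_REGS and get_src_register
are hypothetical and chosen only for this example.

```c
#include <stddef.h>

/* Hypothetical register file; sizes and names are illustrative only. */
#define NUM_REGS 4

static float regs[NUM_REGS][4];
static const float zero_vec[4] = { 0.0f, 0.0f, 0.0f, 0.0f };

/* Return a pointer to register `index`.  For a bad index, hand back a
 * shared zero vector rather than NULL, so callers in release builds
 * (where assert() compiles away under NDEBUG) never dereference NULL. */
const float *
get_src_register(int index)
{
   if (index < 0 || index >= NUM_REGS)
      return zero_vec;
   return regs[index];
}
```

A debug-only assert can still sit in front of the fallback to flag the bad
index during development; the fallback is what protects release builds.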

Note: This is a candidate for the 9.1 branch.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 10:53:43 -07:00
Tom Stellard
bee49cb0ec mesa: Fix build with older gcc since update of glext.h
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 08:49:06 -07:00
Tom Stellard
2840bec56f r600g/compute: Accept LDS size from the LLVM backend
And allocate the correct amount before dispatching the kernel.

Tested-by: Aaron Watry <awatry@gmail.com>
2013-06-28 08:33:11 -07:00
Tom Stellard
2639fca1f0 r600g/compute: Move compute_shader_create() function into evergreen_compute.c
Tested-by: Aaron Watry <awatry@gmail.com>
2013-06-28 08:33:11 -07:00
Brian Paul
ba4979810f svga: pass svga_compile_key by reference instead of value
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-28 08:38:00 -06:00
Brian Paul
74e8a7d1dd svga: use switch statement in svga_shader_type()
Safer in case the PIPE_SHADER_x tokens get renumbered (as Marek
wanted to do).
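
A minimal sketch of why the switch is safer, assuming made-up enum names
(shader_stage, hw_shader_type) rather than the real PIPE_SHADER_x or SVGA
tokens: each token is mapped by name, so renumbering the input enum cannot
silently change the result.

```c
/* Hypothetical stand-ins for the API-side and hardware-side tokens. */
enum shader_stage { STAGE_VERTEX, STAGE_FRAGMENT, STAGE_GEOMETRY };
enum hw_shader_type { HW_TYPE_INVALID = 0, HW_TYPE_VS = 1, HW_TYPE_PS = 2 };

/* Map by name instead of by arithmetic on the enum value.  Something like
 * "stage + 1" would break as soon as the input enum is reordered. */
enum hw_shader_type
to_hw_shader_type(enum shader_stage stage)
{
   switch (stage) {
   case STAGE_VERTEX:
      return HW_TYPE_VS;
   case STAGE_FRAGMENT:
      return HW_TYPE_PS;
   default:
      return HW_TYPE_INVALID;
   }
}
```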

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-28 08:37:59 -06:00
Chia-I Wu
24b05ff158 ilo: clean up states that use ilo_view_surface
Use variable names that are easier to remember.
2013-06-28 15:01:00 +08:00
Chia-I Wu
2c9b6a2164 ilo: remove ilo_cbuf_state::count
We can derive it from enabled_mask.
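
The message does not say how the count is derived.  One plausible reading,
sketched here as an assumption, is "index of the highest enabled slot plus
one".  The function name count_from_enabled_mask is hypothetical.

```c
#include <stdint.h>

/* Assumed derivation: the count is the position of the highest set bit
 * plus one, so every slot up to the last enabled one is covered.
 * An empty mask gives 0. */
unsigned
count_from_enabled_mask(uint32_t enabled_mask)
{
   unsigned count = 0;

   while (enabled_mask) {
      count++;
      enabled_mask >>= 1;
   }
   return count;
}
```

If the count were meant to be the number of enabled slots instead, a
population count of the mask would be the equivalent derivation.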
2013-06-28 15:01:00 +08:00
Chia-I Wu
7ea3ed81c8 ilo: clean up ilo_set_constant_buffer()
Add loops that will be optimized away.
2013-06-28 15:01:00 +08:00
Chia-I Wu
11d283cde9 ilo: clean up states that take a start_slot
They are similar, so clean them up to make them look similar.
2013-06-28 15:00:42 +08:00
Vinson Lee
def634979d glsl: Initialize member variable is_ubo_var in constructor.
Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-06-27 21:51:32 -07:00
Chia-I Wu
20c691b936 ilo: use shorter names for dirty flags
The new names match those of ilo_context's members respectively, and are
shorter.
2013-06-28 10:44:51 +08:00
Chia-I Wu
cabc7b44c0 ilo: track if primitive restart has changed
Re-emit 3DSTATE_INDEX_BUFFER to enable/disable primitive restart.
2013-06-28 10:44:38 +08:00
Chia-I Wu
e071812e46 ilo: avoid potential dangling pointer dereference
Set pipe_draw_info to NULL after draw_vbo().
2013-06-28 10:11:49 +08:00
Ian Romanick
c74a7eb9c5 mesa: Remove GL_EXT_clip_volume_hint
As far as I can tell, no driver has enabled this extension since c6499a7
back in 2007.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-27 18:14:33 -07:00
Chad Versace
6b676e6634 i965,i915: Return early if miptree allocation fails
If allocation fails in intel_miptree_create_layout(), don't proceed to
dereference the miptree. Return an early NULL.

Fixes static analysis error reported by Klocwork.

Note: This is a candidate for the 9.1 branch.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-27 13:16:47 -07:00
Roland Scheidegger
670f829102 llvmpipe: handle offset_clamp
This was previously ignored (unless the draw module happened to handle it
for some reason, such as unfilled polys).
I'm not convinced of that code, putting the float for the clamp in the key
isn't really a good idea. Then again the other floats for depth bias are
already in there too anyway (should probably have a jit_context for the
setup function), so this is just a quick fix.
Also, the "minimum resolvable depth difference" used isn't really right as it
should be calculated according to the z values of the current primitive
and not be a constant (of course, this only makes a difference for float
depth buffers), at least for d3d10, so depth biasing is still not quite right.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-27 19:06:40 +02:00
Roland Scheidegger
b04a295a4a llvmpipe: remove never reached code for timestamp queries.
timestamp queries are always binned in an active scene, therefore
always have a result.
2013-06-27 19:06:40 +02:00
Roland Scheidegger
59b8689d37 llvmpipe: fix a bug in opaque optimization
If there are queries active the opaque optimization resetting the bin needs to
be disabled.
(Not really tested since the bug was discovered by code inspection, not
an actual test failure.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-27 19:06:40 +02:00
Vinson Lee
f12e551810 radeonsi/compute: Fix memory leak in radeonsi_launch_grid.
Fixes "Resource leak" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-06-27 10:03:33 -07:00
Tom Stellard
0e990736f3 clover: Fix build with LLVM 3.4
Reported on IRC by lordheavy
2013-06-27 10:03:33 -07:00
Bill York
191795eaf1 docs: updated instructions for Mesa on Windows
Signed-off-by: Brian Paul <brianp@vmware.com>
2013-06-27 09:49:41 -06:00
Matthew McClure
e87fc11cac postprocess: handle partial initialization failures.
This patch fixes segfaults observed when enabling the post processing
features. When the format is not supported, or a texture cannot be
created, the code must gracefully handle failure and report the error to
the calling code for proper failure handling.

To accomplish this the following changes were made to the filters.h
prototypes:

- bool return for pp_init_func
- Added pp_free_func for filter specific resource destruction

Fixes segfaults from backtraces:

* util_destroy_blit
  pp_free

* u_transfer_inline_write_vtbl
  pp_jimenezmlaa_init_run
  pp_init

This patch also uses tgsi_alloc_tokens to allocate temporary tokens in
pp_tgsi_to_state, instead of allocating the array on the stack. This
fixes the following stack corruption segfault in pp_run.c:

* _int_free
  aaline_delete_fs_state
  pp_free

Bug Number: 1021843
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-27 09:44:29 -06:00
Brian Paul
482c43a946 glx: return True/False instead of GL_TRUE/GL_FALSE
Just to be consistent with the functions' Bool return type.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-27 07:48:19 -06:00
Brian Paul
d171bc9d19 glx: move declarations before code
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-27 07:48:18 -06:00
Brian Paul
d43548ca37 mesa: move declarations before code
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-27 07:48:18 -06:00
José Fonseca
15085b477b glsl: Use the C99 variadic macro syntax.
MSVC does not support the old GCC syntax.

See also
http://gcc.gnu.org/onlinedocs/gcc/Variadic-Macros.html
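
For reference, the two spellings look roughly like this.  FORMAT_MSG is a
made-up macro for illustration, not one from the glsl sources.

```c
#include <stdio.h>

/* Old GCC-only form, with a named variadic parameter (not accepted by
 * MSVC):
 *
 *    #define FORMAT_MSG(buf, size, args...) snprintf(buf, size, args)
 *
 * C99 form, using "..." and __VA_ARGS__: */
#define FORMAT_MSG(buf, size, ...) snprintf(buf, size, __VA_ARGS__)
```

With the C99 form, FORMAT_MSG(buf, sizeof(buf), "%d-%s", 7, "x") expands to
a plain snprintf() call with the same arguments.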
2013-06-27 07:44:11 +01:00
José Fonseca
bcd6f3b23c scons: Add dependencies to all .xml files.
Should prevent stuck builds when only some of the included .xml files
change.
2013-06-27 07:25:10 +01:00
Chia-I Wu
9f3cfe6aaf ilo: plug a potential index buffer leak
This is harmless since st_context and u_vbuf both set index buffer to NULL
before destroying themselves.  But we do not want to rely on that behavior.
2013-06-27 11:46:58 +08:00
Roland Scheidegger
eabe068747 softpipe: honor predication for clear_render_target and clear_depth_stencil
trivial, copied from llvmpipe

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-26 23:17:53 +02:00
Roland Scheidegger
2e4da1f594 llvmpipe: add support for nested / overlapping queries
OpenGL doesn't support this but d3d10 does.
It is a bit of a pain as it is necessary to keep track of queries
still active at the end of a scene, which is also why I cheat a bit
and limit the number of simultaneously active queries to (arbitrary)
16 (simplifies things because we don't have to deal with a real list
that way). I can't think of a reason why you'd really want large
numbers of overlapping/nested queries so it is hopefully fine.
(This only affects queries which need to be binned.)

v2: don't copy the remainder of the array when deleting an entry; simply
replace the deleted entry with the last one (order doesn't matter).
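
The v2 deletion scheme can be sketched as follows.  This is an illustration
of the technique only; active_list, active_add and active_remove are
hypothetical names, and plain ints stand in for query objects.

```c
#define MAX_ACTIVE 16   /* fixed limit, matching the arbitrary 16 above */

struct active_list {
   int items[MAX_ACTIVE];   /* stand-ins for active query handles */
   unsigned count;
};

/* Append an item; returns 0 if the fixed array is already full. */
int
active_add(struct active_list *l, int item)
{
   if (l->count >= MAX_ACTIVE)
      return 0;
   l->items[l->count++] = item;
   return 1;
}

/* Remove an item by overwriting it with the last entry.  No tail copy is
 * needed because the order of the entries does not matter. */
int
active_remove(struct active_list *l, int item)
{
   for (unsigned i = 0; i < l->count; i++) {
      if (l->items[i] == item) {
         l->items[i] = l->items[--l->count];
         return 1;
      }
   }
   return 0;
}
```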

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-26 23:17:53 +02:00
Roland Scheidegger
0820342880 llvmpipe: rework query logic
Previously lp_rast_begin_query commands were always inserted into each bin,
and re-issued if the scene was restarted, while lp_rast_end_query commands
were executed for each still active query at the end of tile rasterization.
Also, the ps_invocations and vis_counter were set to zero when the respective
command was encountered.
This however cannot work for multiple queries of the same type (note that
occlusion counter and occlusion predicate, while different types, were also
affected).
So, change the logic to always set the ps_invocations and vis_counter to zero
at the start of tile rasterization, and then use "start" and "end" per-thread
query values when encountering the begin/end query commands instead, which
should work for multiple queries of the same type. This also means queries do
not have to be reissued in a new scene, however they still need to be finished
at end of tile rasterization, so a list of queries still active at the end of
a scene needs to be maintained.
Also while here don't bin the queries which don't do anything in rasterization.
(This change does not actually handle multiple queries of the same type yet,
as the list of active queries is just a simple fixed array and setup can still
only have one query active per type.)
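
A simplified sketch of the start/end scheme, under the assumption that a
query's result is the counter difference between its end and begin
commands, accumulated per tile.  The struct and function names here are
hypothetical and do not match the llvmpipe sources.

```c
#include <stdint.h>

/* Per-thread counter, zeroed once at the start of tile rasterization
 * instead of when a begin-query command is encountered. */
struct thread_counters {
   uint64_t vis_counter;
};

struct query {
   uint64_t start;    /* counter value seen at begin */
   uint64_t result;   /* accumulated (end - start) */
};

void
tile_begin(struct thread_counters *t)
{
   t->vis_counter = 0;
}

void
query_begin(const struct thread_counters *t, struct query *q)
{
   q->start = t->vis_counter;
}

void
query_end(const struct thread_counters *t, struct query *q)
{
   q->result += t->vis_counter - q->start;
}
```

Because each query only snapshots the shared counter, two queries of the
same type can overlap without resetting each other's counts.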

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-26 23:17:53 +02:00
Eric Anholt
3dbba95b72 i965: Move the remaining intel code to the i965 directory.
Now that i915's forked off, they don't need to live in a shared directory.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Adam Jackson <ajax@redhat.com>
(and I hear second hand that idr is OK with it, too)
2013-06-26 12:28:26 -07:00
Eric Anholt
733d32f376 i915: Fork the shared code from i965.
Of the 15000 lines of code in intel/, we've identified 4000 lines that
are trivially unnecessary for i915, and another 1000 that are pointless for
i965, and expect to find more as time goes on.  Split the i915 driver off,
so that we can continue active development on i965 without worrying about
breaking i915.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Adam Jackson <ajax@redhat.com>
(and I hear second hand that idr is OK with it, too)
2013-06-26 12:28:25 -07:00
Eric Anholt
43a6795a1f i915: Remove dead symlink. 2013-06-26 12:28:25 -07:00
Eric Anholt
fc32d40534 glx: Fix another missed glMultiDrawElementsEXT const change.
The build was broken for me since
b7d9478f36.
2013-06-26 12:28:25 -07:00
Ian Romanick
c170c901d0 glsl: Move all var decls to the front of the IR list in reverse order
This has the (intended!) side effect that vertex shader inputs and
fragment shader outputs will appear in the IR in the same order that
they appeared in the shader code.  This results in the locations being
assigned in the declared order.  Many (arguably buggy) applications
depend on this behavior, and it matches what nearly all other drivers
do.

Fixes the (new) piglit test attrib-assignments.

NOTE: This is a candidate for stable release branches (and requires the
previous commit to prevent a regression in OpenGL ES 2.0 conformance
test stencil_plane_operation).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-26 12:27:23 -07:00
Ian Romanick
329cd6a9b1 i965: Be more careful with the interleaved user array upload optimization
The checks to determine when the data can be uploaded in an interleaved
fashion can be tricked by certain data layouts.  For example,

    float data[...];

    glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 16, &data[0]);
    glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, 16, &data[4]);
    glDrawArrays(GL_POINTS, 0, 1);

will hit the interleaved path with an incorrect size (16 bytes instead
of 32 bytes).  As a result, the data for attribute 1 never gets
uploaded.  The single element draw case is the only sensible case I can
think of for non-interleaved-that-looks-like-interleaved data, but there
may be others as well.

To fix this, make sure that the end of the element in the array being
checked is within the stride "window."  Previously the code would check
that the beginning of the element was within the window.
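
The difference between the two checks can be sketched as below.  The exact
comparisons in the driver are not quoted in this message, so the helper
names and the use of "<=" are assumptions picked so that the example above
(offset 16, element size 16, stride 16) behaves as described.

```c
#include <stdbool.h>
#include <stddef.h>

/* Old-style check (assumed form): only the start of the element has to
 * fall inside the stride window. */
bool
starts_within_window(size_t offset, size_t stride)
{
   return offset <= stride;
}

/* New-style check: the end of the element has to fall inside the window. */
bool
ends_within_window(size_t offset, size_t elem_size, size_t stride)
{
   return offset + elem_size <= stride;
}
```

For attribute 1 in the example, the start-based check accepts the layout as
interleaved, while the end-based check rejects it because 16 + 16 exceeds
the 16-byte stride.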

NOTE: This is a candidate for stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 12:27:23 -07:00
Brian Paul
b7d9478f36 mesa: add const qualifier to glMultiDrawElementsEXT() indices param
The 20130624 version of glext.h changed this to match the
glMultiDrawElements() function which already had the extra const
qualifier.

Fixes warnings/errors that seem to vary from one compiler to the next.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-26 13:12:01 -06:00
Brian Paul
15436adab0 mesa: remove const from glDebugMessageCallbackARB() function parameter
The new 20130624 version of glext.h removed the const qualifier on
the 'userParam' parameter.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-26 13:12:01 -06:00
Kenneth Graunke
dd0b99b0be i965/vs: Combine code generation's inst->opcode switch statements.
vec4_visitor::generate_code() switches on vec4_instruction::opcode and
calls into the brw_eu_emit.c layer to generate code for some of them.
It then has a default case which calls generate_vec4_instruction() to
handle the rest...which switches on opcode and handles the rest of the
cases.

The split apparently is that generate_code() handles the actual hardware
opcodes (BRW_OPCODE_*) while generate_vec4_instruction() handles the
virtual opcodes (SHADER_OPCODE_* and VS_OPCODE_*).  But this looks
fairly arbitrary, and it makes more sense to combine the two switches.

This patch moves the cases from generate_code() into the helper function
so that generate_code() isn't as large.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-26 11:25:13 -07:00
Kenneth Graunke
55272883ac i965: Remove broken source type assertions from brw_alu3().
Commit 526ffdfc03 attempted to generalize
the source register type assertions to allow D and UD.  However, the
src1 and src2 assertions actually checked src0.type against D and UD due
to a copy and paste bug.

It also began setting the source and destination register types based on
dest.type, ignoring src0/src1/src2.type completely.  BFE and BFI2 may
actually pass mixed D/UD types and expect them to be ignored, which is
arguably a bit sloppy, but not too crazy either.

This patch simply removes the source register assertions as those values
aren't used anyway.  It also clarifies the comment above the block that
sets the register types.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-06-26 11:25:13 -07:00
Kenneth Graunke
9321f3257f i965: Add back strict type assertions for MAD and LRP.
Commit 526ffdfc03 relaxed the type
assertions in brw_alu3 to allow D/UD types (required by BFE and BFI2).
This lost us the strict type checking for MAD and LRP, which require
all four types to be float.

This patch adds a new ALU3F wrapper which checks these once again.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-06-26 11:25:12 -07:00
Kenneth Graunke
4563dfe23a glsl: Streamline the built-in type handling code.
Over the last few years, the compiler has grown to support 7 different
language versions and 6 extensions that add new built-in types.  With
more and more features being added, some of our core code has devolved
into an unmaintainable spaghetti of sorts.

A few problems with the old code:
1. Built-in types are declared...where exactly?

   The types in builtin_types.h were organized in arrays by the language
   version or extension they were introduced in.  It's factored out to
   avoid duplicates---every type only exists in one array.  But that
   means that sampler1D is declared in 110, sampler2D is in core types,
   sampler3D is a unique global not in a list...and so on.

2. Spaghetti call-chains with weird parameters:

   generate_300ES_types calls generate_130_types which calls
   generate_120_types and generate_EXT_texture_array_types, which calls
   generate_110_types, which calls generate_100ES_types...and more

   Except that ES doesn't want 1D types, so we have a skip_1d parameter.
   add_deprecated also falls into this category.

3. Missing type accessors.

   Common types have convenience pointers (like glsl_type::vec4_type),
   but others may not be accessible at all without a symbol table (for
   example, sampler types).

4. Global variable declarations in a header file?

   #include "builtin_types.h" in two C++ files would break the build.

The new code addresses these problems.  All built-in types are declared
together in a single table, independent of when they were introduced.
The macro that declares a new built-in type also creates a convenience
pointer, so every type is available and it won't get out of sync.

The code to populate a symbol table with the appropriate types for a
particular language version and set of extensions is now a single
table-driven function.  The table lists the type name and GL/ES versions
when it was introduced (similar to how the lexer handles reserved
words).  A single loop adds types based on the language version.
Explicit extension checks then add additional types.  If they were
already added based on the language version, glsl_symbol_table simply
ignores the request to add them a second time, meaning we don't need
to worry about duplicates and can simply list types where they belong.
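
The table-driven approach could look something like the following sketch.
The table contents, field names and collect_types() are illustrative
assumptions, not the real builtin type table; a version of 0 means the type
is never available in that API.

```c
#include <stdbool.h>
#include <stddef.h>

struct type_entry {
   const char *name;
   unsigned min_gl;   /* first desktop GLSL version, 0 = never */
   unsigned min_es;   /* first GLSL ES version, 0 = never */
};

/* A tiny example table; the real one would list every built-in type. */
static const struct type_entry type_table[] = {
   { "float",     110, 100 },
   { "sampler1D", 110,   0 },   /* ES has no 1D types */
   { "uvec4",     130, 300 },   /* ES3 only */
};

/* Single loop: copy the names available for the given API and version
 * into out[], returning how many were written. */
size_t
collect_types(bool is_es, unsigned version, const char **out, size_t max)
{
   size_t n = 0;

   for (size_t i = 0; i < sizeof(type_table) / sizeof(type_table[0]); i++) {
      unsigned min = is_es ? type_table[i].min_es : type_table[i].min_gl;

      if (min != 0 && version >= min && n < max)
         out[n++] = type_table[i].name;
   }
   return n;
}
```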

v2: Mark uvecs and shadow samplers as ES3 only, and 1DArrayShadow as
    unsupported in ES entirely.  Add a touch more doxygen.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-26 11:25:12 -07:00
Kenneth Graunke
818da74af5 glsl: Don't use random pointers as an array of glsl_type objects.
Using a random glsl_type convenience pointer as an array is a really bad
idea, for all the reasons mentioned in the previous commit.

The new glsl_type::bvec() function is simpler anyway.

Prevents breakage in the next commit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-26 11:25:12 -07:00
Kenneth Graunke
4530ed4f26 glsl: Stop being clever with pointer arithmetic when fetching types.
Currently, vector types are linked together closely: the glsl_type
objects for float, vec2, vec3, and vec4 are all elements of the same
array, in that exact order.  This makes it possible to obtain vector
types via pointer arithmetic on the scalar type's convenience pointer.
For example, float_type + (3 - 1) = vec3.

However, relying on this is extremely fragile.  There's no particular
reason the underlying type objects need to be stored in an array.  They
could be individual class members, possibly with padding between them.
Then the pointer arithmetic would break, and we'd get bad pointers to
non-heap allocated data, causing subtle breakage that can't be detected
by valgrind.  Cue insanity.

Or someone could simply reorder the type variables, causing us to get
the wrong type entirely.  Also cue insanity.

Writing this explicitly is much safer.  With the new helper functions,
it's a bit less code even.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-26 11:25:12 -07:00
Kenneth Graunke
d367a1cbdb glsl: Add simple vector type accessor helpers.
This patch introduces new functions to quickly grab a pointer to a
vector type.  For example:

   glsl_type::bvec(4)   returns   glsl_type::bvec4_type
   glsl_type::ivec(3)   returns   glsl_type::ivec3_type
   glsl_type::uvec(2)   returns   glsl_type::uvec2_type
   glsl_type::vec(1)    returns   glsl_type::float_type

This is less wordy than glsl_type::get_instance(GLSL_TYPE_BOOL, 4, 1),
which can help avoid extra word wrapping.
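
In plain C terms, such an accessor might be sketched like this.  The
type_info struct, the *_type objects and the NULL return for an invalid
size are assumptions made for the example; the real helpers are static
member functions of glsl_type.

```c
#include <stddef.h>

struct type_info {
   const char *name;
   unsigned components;
};

static const struct type_info float_type = { "float", 1 };
static const struct type_info vec2_type  = { "vec2",  2 };
static const struct type_info vec3_type  = { "vec3",  3 };
static const struct type_info vec4_type  = { "vec4",  4 };

/* Explicit lookup by component count.  Nothing here depends on how the
 * type objects are laid out in memory, unlike "float_type + (n - 1)". */
const struct type_info *
vec(unsigned components)
{
   switch (components) {
   case 1: return &float_type;
   case 2: return &vec2_type;
   case 3: return &vec3_type;
   case 4: return &vec4_type;
   default: return NULL;
   }
}
```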

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-26 11:25:12 -07:00
Brian Paul
9a14e412d6 mesa: update glext.h to version 20130624
In glapi_priv.h we always need the typedef for the GLclampx type
since GL_OES_fixed_point is now defined in glext.h but the
GLclampx type is not.  GLclampx is not used by anything in glext.h
but we need it for GL ES dispatch.

This is a huge patch because the structure of the file has been
changed.

The following extensions are new, however:

GL_AMD_interleaved_elements
GL_AMD_shader_trinary_minmax
GL_IBM_static_data
GL_INTEL_map_texture
GL_NV_compute_program5
GL_NV_deep_texture3D
GL_NV_draw_texture
GL_NV_shader_atomic_counters
GL_NV_shader_storage_buffer_object
GL_NVX_conditional_render
GL_OES_byte_coordinates
GL_OES_compressed_paletted_texture
GL_OES_fixed_point
GL_OES_query_matrix
GL_OES_single_precision

And these extensions were removed:

GL_FfdMaskSGIX
GL_INGR_palette_buffer
GL_INTEL_texture_scissor
GL_SGI_depth_pass_instrument
GL_SGIX_fog_scale
GL_SGIX_impact_pixel_texture
GL_SGIX_texture_select

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-06-26 10:43:27 -06:00
Brian Paul
bc6eb8068f st/mesa: add casts to silence MSVC warnings 2013-06-26 10:42:59 -06:00
Brian Paul
202299d16e st/mesa: make rtt_level, face, slice unsigned to silence MSVC warnings 2013-06-26 10:42:59 -06:00
Brian Paul
2285645aa2 hud: add float casts to silence MSVC warnings 2013-06-26 10:42:59 -06:00
Brian Paul
87d5a16927 hud: include stdio.h since we use fprintf(), fscanf(), etc 2013-06-26 10:42:59 -06:00
Brian Paul
61964a9ceb hud: add cast to silence MSVC warning 2013-06-26 10:42:59 -06:00
Brian Paul
f06e60fde4 os: add cast in os_time_sleep() to silence MSVC warning 2013-06-26 10:42:59 -06:00
Brian Paul
21f8729c3d vega: add some casts to silence MSVC warnings 2013-06-26 10:42:59 -06:00
Brian Paul
4d452f1988 util: int/unsigned changes to silence some MSVC warnings 2013-06-26 10:42:59 -06:00
Brian Paul
bbdd7cfb8b util: add some casts to silence some MSVC warnings 2013-06-26 10:42:59 -06:00
Brian Paul
aab8ca8fd1 util: s/int/unsigned/ to silence some MSVC warnings 2013-06-26 10:42:58 -06:00
Maarten Lankhorst
e72cc26518 nvc0: set rsvd_kick correctly
This prevents trampling beyond the end of the command stream during flushes.

NOTE: This is a candidate for the stable branches.

Reported-by: Christoph Bumiller <christoph.bumiller@speed.at>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-06-26 16:50:08 +02:00
Maarten Lankhorst
30c2c34464 nvc0: fix push_space checks for video decoding 2013-06-26 16:18:42 +02:00
Vinson Lee
e6479b4330 ilo: Remove max_threads dead code path.
max_threads cannot be greater than 28. It is either 21 or 28.

Fixes "Logically dead code" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
2013-06-26 21:51:07 +08:00
Jean-Sébastien Pédron
c6d52f2290 winsys/intel: fix typo in "ETIMEOUT"
Should be "ETIMEDOUT".

[olv: commit message slightly re-formatted]

Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
2013-06-26 21:51:07 +08:00
Chia-I Wu
c610b67972 ilo: use a bitmask for enabled constant buffers
Looping over 4 * 13 constant buffers while in most cases only two are enabled
is stupid.
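
The idea can be sketched as follows. This is an illustrative example only, not the ilo code: `ConstantBufferState`, `set_enabled()`, `for_each_enabled()`, and the constant `kMaxBuffers` are hypothetical names, and the slot count of 13 is simply taken from the message above.

```cpp
#include <cassert>
#include <cstdint>

// Assumed number of constant buffer slots per shader stage.
constexpr unsigned kMaxBuffers = 13;

// Hypothetical state: bit i of enabled_mask is set when slot i is bound.
struct ConstantBufferState {
    std::uint32_t enabled_mask = 0;
};

void set_enabled(ConstantBufferState &state, unsigned slot, bool enabled)
{
    assert(slot < kMaxBuffers);
    if (enabled)
        state.enabled_mask |= 1u << slot;
    else
        state.enabled_mask &= ~(1u << slot);
}

// Calls visit(slot) for enabled slots only and returns how many were visited.
template <typename Fn>
unsigned for_each_enabled(const ConstantBufferState &state, Fn visit)
{
    unsigned visited = 0;
    // mask &= mask - 1 clears the lowest set bit on each iteration.
    for (std::uint32_t mask = state.enabled_mask; mask != 0; mask &= mask - 1) {
        unsigned slot = 0;
        while (((mask >> slot) & 1u) == 0)
            ++slot; // index of the lowest set bit
        visit(slot);
        ++visited;
    }
    return visited;
}
```

With two buffers bound, the loop body runs twice instead of once per slot.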
2013-06-26 21:50:26 +08:00
Maarten Lankhorst
9aebad618c vl/mpeg12: handle mpeg-1 bitstreams more correctly
Add support for D-frames.
Add support for slices ending on a different horizontal row of macroblocks.
2013-06-26 11:40:47 +02:00
Chia-I Wu
95c21f12f3 ilo: support PIPE_CAP_USER_INDEX_BUFFERS
We want to access the user buffer, if available, when primitive restart is
enabled and the restart index/primitive type is not natively supported.

And since we are handling index buffer uploads in the driver with this change,
we can also work around misalignment of index buffer offsets.
2013-06-26 16:42:46 +08:00
Chia-I Wu
5fb5d4f0a6 ilo: make pipe_draw_info a context state
Rename ilo_finalize_states() to ilo_finalize_3d_states(), and bind
pipe_draw_info to the context when it is called.  This saves us from having to
pass pipe_draw_info around in several places.
2013-06-26 16:42:46 +08:00
Chia-I Wu
3eb6754e94 ilo: support PIPE_CAP_USER_CONSTANT_BUFFERS
We need it for HUD support, and will need it for push constants in the future.
2013-06-26 16:42:45 +08:00
Eric Anholt
79385950f3 i915: Drop dead batch dumping code.
Batch dumping is now handled by shared code in libdrm.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:12 -07:00
Eric Anholt
57407bcaf8 intel: Drop little bits of dead code.
I noticed these while building the fork-i915 branch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:12 -07:00
Eric Anholt
88514d922e i965: Stop recomputing the miptree's size from the texture image.
We've already computed what the dimensions of the miptree are, and stored
it in the miptree.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:12 -07:00
Eric Anholt
820325b258 i965: Drop unused argument to translate_tex_format().
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:11 -07:00
Eric Anholt
c20f973c4f i965/gen4-5: Stop using bogus polygon_offset_scale field.
The polygon offset math used for triangles by the WM is "OffsetUnits * 2 *
MRD + OffsetFactor * m" where 'MRD' is the minimum resolvable difference
for the depth buffer (~1/(1<<16) or ~1/(1<<24)), 'm' is the approximated
slope from the GL spec, and '2' is this magic number from the original
i965 code dump that we deviate from the GL spec by because "it makes glean
work" (except that it doesn't, because of some hilarity with 0.5 *
approximately 2.0 != 1.0.  go glean!).

This clipper code for unfilled polygons, on the other hand, was doing
"OffsetUnits * garbage + OffsetFactor * m", where garbage was MRD in the
case of 16-bit depth visual (regardless of the FBO's depth resolution), or
128 * MRD for 24-bit depth visual.

This change just makes the unfilled polygons behavior match the WM's
filled polygons behavior.
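
For reference, the WM formula quoted above can be written out as a small sketch. The function names are hypothetical and this is not the driver code; it only evaluates "OffsetUnits * 2 * MRD + OffsetFactor * m" with MRD approximated as 1/(1<<bits), as described in this message.

```cpp
#include <cassert>

// Approximate minimum resolvable difference for a 16- or 24-bit depth buffer.
double mrd_for_depth_bits(unsigned bits)
{
    assert(bits == 16 || bits == 24);
    return 1.0 / static_cast<double>(1u << bits);
}

// offset = OffsetUnits * 2 * MRD + OffsetFactor * m
// `slope` is the approximated slope 'm' from the GL spec.
double polygon_offset(double units, double factor, double slope,
                      unsigned depth_bits)
{
    return units * 2.0 * mrd_for_depth_bits(depth_bits) + factor * slope;
}
```

The key point is that `depth_bits` describes the depth buffer actually being drawn to, so filled and unfilled polygons get the same offset.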

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:11 -07:00
Eric Anholt
dba46831b0 i915: Use the current drawbuffer's depth for polygon offset scale.
There's no reason to care about the window system visual's depth for
handling polygon offset in an FBO, and it could only lead to pain.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:11 -07:00
Eric Anholt
c31aee99f3 intel: Add perf debug for glCopyPixels() fallback checks.
The separate function for the fallback checks wasn't particularly
clarifying things, so I put the improved checks in the caller.  (Note that
the dropped _mesa_update_state() had already happened once at the start of
the caller)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:11 -07:00
Eric Anholt
a2ca98b211 i965: Add debug to INTEL_DEBUG=blorp describing hiz/blit/clear ops.
I think we've all added instrumentation at one point or another to see
what's being called in blorp.  Now you can quickly get output like:

Testing glCopyPixels(depth).
intel_hiz_exec depth clear to mt 0x16d9160 level 0 layer 0
intel_hiz_exec depth resolve to mt 0x16d9160 level 0 layer 0
intel_hiz_exec hiz ambiguate to mt 0x16d9160 level 0 layer 0
intel_hiz_exec depth resolve to mt 0x16d9160 level 0 layer 0

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:11 -07:00
Eric Anholt
da00782ed8 ra: Fix register spilling.
Commit 551c991606 tried to avoid spilling
registers that were trivially colorable.  But since we do optimistic
coloring, the top of the stack also contains nodes that are not trivially
colorable, so we need to consider them for spilling (since they are some
of our best candidates).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58384
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63674
NOTE: This is a candidate for the 9.1 branch.
2013-06-26 01:07:11 -07:00
Eric Anholt
c6d74a4992 i965/fs: Dump IR when fatally not compiling due to bad register spilling.
It should never happen, but it does, and at this point, you're going to
_mesa_problem() and abort() (unless it's just in precompile).  Give the
developer something to look at.
2013-06-26 01:07:11 -07:00
Naohiro Aota
95e145aaee xmlpool/build: Make sure to set mo properly
Some shells do not set variables sequentially within a single statement,
i.e. "a=X b=${a}" sets "b" to an empty value rather than to "X".

This patch introduces ";" to make sure "mo" is set properly before the
"lang" assignment.

Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=471302
2013-06-25 21:22:56 -07:00
Eric Anholt
04e03d9645 i965: Remove the rest of brw_update_draw_buffer().
The last piece of code with an effect was flagging _NEW_BUFFERS.  Only,
that is already flagged from everything that calls this function: Mesa GL
state updates flag it before even calling down into the driver, and the
calls from the DRI2 window system framebuffer update path end up flagging
it as part of the ResizeBuffers() hook.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:22 -07:00
Eric Anholt
c39111509d i965: Stop updating FBO state on drawbuffers change.
The computed fields are updated appropriately as part of the normal draw
call path due to _NEW_BUFFERS being set.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:22 -07:00
Eric Anholt
9d523e3372 i965: Stop recomputing drawbuffer bounds on drawbuffer change.
For winsys FBOs, the bounds are appropriately updated immediately upon
_mesa_resize_framebuffer().  For user FBOs, they're updated as part of the
normal draw path state update due to _NEW_BUFFERS having been flagged.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
15c47481ba i965: Remove _NEW_DEPTH state flagging on drawbuffers change.
Of the places noting a _NEW_DEPTH dependency, all were already checking
for _NEW_BUFFERS if appropriate.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
94ecf913b4 intel: Stop doing special _NEW_STENCIL state flagging on drawbuffers.
2/3 packets depending on Stencil._Enabled already checked for
_NEW_BUFFERS, so just add _NEW_BUFFERS to the remaining one.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
3faccc42ad i965: Stop flagging viewport/scissor change on drawbuffers change.
The viewport (ctx->Viewport._WindowMap) doesn't change with drawable size
changes, and we update scissor (ctx->DrawBuffer->_Xmin and friends) on
_NEW_BUFFERS in things like brw_sf_state.c.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
438f85717d i965: Stop flagging _NEW_POLYGON on drawbuffers change.
Things like brw_sf.c that need to know about orientation are already
recomputing on _NEW_BUFFERS.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
b04c718ebd radeon: Remove gratuitous custom framebuffer resize code.
_mesa_resize_framebuffer(), the default value of the ResizeBuffers hook,
already checks for a window system framebuffer and walks the renderbuffers
calling AllocStorage().

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
17bc8fdb1d intel: Remove gratuitous custom framebuffer resize code.
_mesa_resize_framebuffer(), the default value of the ResizeBuffers hook,
already checks for a window system framebuffer and walks the renderbuffers
calling AllocStorage().

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
d7165b383d mesa: Remove the Initialized field from framebuffers.
This existed to tell the core not to call GetBufferSize, except that even
if you didn't set it nothing happened because nobody had a GetBufferSize.

v2: Remove two more instances of setting the field (from Brian)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:20 -07:00
Eric Anholt
bab755ad1b mesa: Remove Driver.GetBufferSize and its callers.
Only the GDI driver set it to non-NULL any more, and that driver has a
Viewport hook that should keep it limping along as well as it ever has.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:20 -07:00
Vinson Lee
61bfed2d09 glsl: Fix gl_shader_program::UniformLocationBaseScale assert.
commit 26d86d26f9 added
gl_shader_program::UniformLocationBaseScale. According to the code
comments in that commit, UniformLocationBaseScale "must be >=1".

UniformLocationBaseScale is of type unsigned. Coverity reported a "Macro
compares unsigned to 0" defect as well.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-06-25 18:45:01 -07:00
Brian Paul
0b994961ff svga: allow 3D transfers in svga_texture_transfer_map()
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-25 17:54:24 -06:00
Brian Paul
808da7d8ca svga: use new svga_define_texture_level() helper
To get array bounds checking.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-25 17:54:24 -06:00
Brian Paul
2cc27c3faa svga: fix layer/level mix-up in svga_mark_surface_dirty()
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-25 17:54:24 -06:00
Brian Paul
04e3969597 svga: use new svga_age_texture_view() helper
The function does array bounds checking.  Note, this exposes a
bug in the svga_mark_surface_dirty() function: we're calling
svga_age_texture_view() with a texture slice instead of mipmap
level.  This can lead to a failed assertion.  That'll be fixed next.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-25 17:54:24 -06:00
Brian Paul
a4e4a413e5 svga: add array index assertion in svga_validate_sampler_view() 2013-06-25 17:54:24 -06:00
Brian Paul
82d6a52530 svga: use svga_texture() helper instead of casting 2013-06-25 17:54:23 -06:00
José Fonseca
464c6949cb util/debug: Cleanup/improve debug_symbol_name_dbghelp.
- use mgwhelp -- the successor for bfdhelp which does not have a hard
  dependency on BFD, and works on 64bits.
- use a macro instead of hand-typing to dispatch DbgHelp functions
- dump line numbers
- dump module names when symbols are not available
- support 64bits.
- add comments

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-25 18:41:59 +01:00
José Fonseca
a26f834a39 util/debug: Make debug_backtrace_capture work for 64bit windows.
Rely on Windows' CaptureStackBackTrace to do the grunt work.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-25 18:41:59 +01:00
Zack Rusin
29dacd9803 draw: allow overflows in the llvm paths
Because our code couldn't handle it we were skipping rendering
if we detected overflows. According to the spec we should
still render but with all 0 vertices, which is what the llvm
code already does. So for the llvm paths let's enable processing
even if an overflow condition has been detected.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-25 11:57:01 -04:00
Zack Rusin
f96326b2f6 draw: avoid overflows in the llvm draw loop
Previously we could easily overflow if start+count > max integer. To
avoid that, we just iterate over the count. This makes sure
that we never crash, since most of the overflow conditions
are already handled.
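
A minimal sketch of the loop shape described here, assuming 32-bit unsigned indices. `fetch_range()` is a hypothetical helper, not the draw module's code. It iterates over the count and leaves zeros for any vertex whose index would wrap or fall outside the buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Fragile form: for (i = start; i < start + count; ++i), where
// start + count may wrap around. Iterating over the count avoids that.
std::vector<float> fetch_range(const std::vector<float> &buffer,
                               std::uint32_t start, std::uint32_t count)
{
    const std::uint32_t max_index = std::numeric_limits<std::uint32_t>::max();
    std::vector<float> out(count, 0.0f); // unfetchable entries stay zero
    for (std::uint32_t i = 0; i < count; ++i) {
        if (i > max_index - start)
            break; // start + i would wrap; the rest stays zero
        const std::uint32_t index = start + i;
        assert(index >= start); // no wrap-around happened
        if (index < buffer.size())
            out[i] = buffer[index];
    }
    return out;
}
```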

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-25 11:56:41 -04:00
Maarten Lankhorst
e2b02080d8 nvc0: do not set tiled mode on gart bo when fence debugging is used
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-06-25 13:34:15 +02:00
Chia-I Wu
c8240c9dea ilo: honor render condition in blitter
Make pass_render_condition() available for blitter, and check for render
condition in (and only in) clear(), clear_render_target(), and
clear_depth_stencil().
2013-06-25 15:38:07 +08:00
Chia-I Wu
5f4b769127 ilo: remove ilo_shader_internal.h from GEN6 pipeline
Replace direct shader accesses with ilo_shader_get_kernel_param(), etc.
2013-06-25 13:51:59 +08:00
Chia-I Wu
63165df90f ilo: remove ilo_shader_internal.h from GEN7 pipeline
Replace direct shader accesses with ilo_shader_get_kernel_param(), etc.
2013-06-25 13:51:59 +08:00
Chia-I Wu
855b684141 ilo: speed up ilo_shader_select_kernel_routing() a bit
Remember the order of the source attributes and avoid recomputation when it
does not change.
2013-06-25 13:51:59 +08:00
Chia-I Wu
9b18df6e08 ilo: move SBE setup code to ilo_shader.c
Add ilo_shader_select_kernel_routing() to construct 3DSTATE_SBE.  It is called
in ilo_finalize_states(), rather than in create_fs_state(), as it depends on
VS/GS and rasterizer states.

With this change, ilo_shader_internal.h is no longer needed for
ilo_gpe_gen6.c.
2013-06-25 13:51:58 +08:00
Chia-I Wu
c4fa24ff08 ilo: use ilo_shader_state exclusively in GPE
This allows us to remove ilo_shader_internal.h from ilo_gpe_gen7.c.  The
unfinished code in 3DSTATE_DS, 3DSTATE_HS, and INTERFACE_DESCRIPTOR_DATA is
partly or entirely removed.
2013-06-25 13:18:08 +08:00
Chia-I Wu
91cf6c1e92 ilo: map SO registers at shader compile time
The unmodified pipe_stream_output_info describes its outputs as if they are in
TGSI_FILE_OUTPUT.  Remap the register indices to where they appear in the VUE.

TGSI_SEMANTIC_PSIZE needs a little care because it is at the W channel.
2013-06-25 13:18:08 +08:00
Chia-I Wu
68522bf36c ilo: use ilo_shader_cso for FS
Add ilo_gpe_init_fs_cso() to construct 3DSTATE_PS and shader part of
3DSTATE_WM once and early for fragment shaders.
2013-06-25 13:18:08 +08:00
Chia-I Wu
639a2cddc6 ilo: use ilo_rasterizer_state exclusively in GPE
Replace pipe_rasterizer_state by ilo_rasterizer_state for the remaining GPE
functions for consistency.
2013-06-25 13:18:07 +08:00
Chia-I Wu
54ab03523b ilo: convert pipe_rasterizer_state to ilo_rasterizer_wm
Add ilo_gpe_init_rasterizer_wm() to construct fixed-function part of
3DSTATE_WM once in create_rasterizer_state().
2013-06-25 13:17:56 +08:00
Chia-I Wu
851202c319 ilo: use ilo_shader_cso for GS
Add ilo_gpe_init_gs_cso() to construct 3DSTATE_GS once and early for geometry
shaders.
2013-06-25 13:17:21 +08:00
Chia-I Wu
d209da5e33 ilo: introduce ilo_shader_cso for VS
When a new VS kernel is generated, a newly added function,
ilo_gpe_init_vs_cso(), is called to construct 3DSTATE_VS command in
ilo_shader_cso.  When the command needs to be emitted later, we copy the
command from the CSO instead of constructing it dynamically.
2013-06-25 12:42:04 +08:00
Chia-I Wu
5c8db569ab ilo: add functions to query shaders
Add ilo_shader_get_type() to query the type (PIPE_SHADER_x) of the shader.
Add ilo_shader_get_kernel_offset() and ilo_shader_get_kernel_param() to query
the cache offset and various kernel parameters of the selected kernel.
2013-06-25 12:28:54 +08:00
Chia-I Wu
96e2133e72 ilo: clean up finalize_shader_states()
Add ilo_shader_select_kernel() to replace the dependency table,
ilo_shader_variant_init(), and ilo_shader_state_use_variant().

With the changes, we no longer need to include ilo_shader_internal.h in
ilo_state.c.
2013-06-25 12:10:34 +08:00
Chia-I Wu
f0afedeb75 ilo: use multiple entry points for shader creation
Replace ilo_shader_state_create() by

 ilo_shader_create_vs()
 ilo_shader_create_gs()
 ilo_shader_create_fs()
 ilo_shader_create_cs()

Rename ilo_shader_state_destroy() to ilo_shader_destroy().  The old
ilo_shader_destroy() is renamed to ilo_shader_destroy_kernel().
2013-06-25 11:54:14 +08:00
Chia-I Wu
4d789c76dc ilo: move internal shader interface to a new header
Move it to ilo_shader_internal.h.  The goal is to make files not part of the
compiler include only ilo_shader.h eventually.
2013-06-25 11:51:26 +08:00
Brian Paul
e3cbb18321 gallium/hud: do not use free() for the free_query_data hook
That confuses Gallium's memory debugging code where CALLOC/MALLOC
must be matched with FREE, not free().
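
To illustrate why the pairing matters, here is a hedged sketch of a debugging allocator. It assumes the debug allocator prepends a bookkeeping header to each allocation, which is a common design. `debug_alloc()`, `debug_free()`, `DebugHeader`, and `g_live_bytes` are hypothetical names, not Gallium's.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical bookkeeping header placed in front of every allocation.
struct alignas(std::max_align_t) DebugHeader {
    std::size_t size;
    unsigned magic;
};

constexpr unsigned kMagic = 0xC0FFEEu;
static std::size_t g_live_bytes = 0; // bytes currently allocated

void *debug_alloc(std::size_t size)
{
    void *raw = std::malloc(sizeof(DebugHeader) + size);
    if (raw == nullptr)
        return nullptr;
    DebugHeader *hdr = static_cast<DebugHeader *>(raw);
    hdr->size = size;
    hdr->magic = kMagic;
    g_live_bytes += size;
    return hdr + 1; // the caller sees the memory after the header
}

void debug_free(void *ptr)
{
    if (ptr == nullptr)
        return;
    DebugHeader *hdr = static_cast<DebugHeader *>(ptr) - 1;
    assert(hdr->magic == kMagic); // catches pointers not from debug_alloc()
    g_live_bytes -= hdr->size;
    std::free(hdr); // free the real start of the block
}
```

Under this assumption, passing the pointer from `debug_alloc()` straight to `free()` would hand the C library an address that is not the start of the block, and the live-byte count would never go back down.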

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-06-24 14:23:54 -06:00
Matthew McClure
e5bf19ac1c draw: check for out-of-memory conditions in the AA line module.
To prevent segfaults in the AA line module, the code will check for a
valid pointer to the aaline_stage in the draw context.

Fixes segfault from backtrace:

* aaline_stage_from_pipe
  aaline_delete_fs_state

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-24 08:36:47 -06:00
José Fonseca
06badea0da tests/graw: Fix typo in shader-leak.c 2013-06-24 15:29:25 +01:00
José Fonseca
a3d75db022 tools/trace: Fix syntax.
Cleaned/commented up the code, but forgot to actually test before
committing...
2013-06-24 15:28:48 +01:00
Richard Sandiford
5a0556f061 st/dri/sw: Fix pitch calculation in drisw_update_tex_buffer
swrastGetImage rounds the pitch up to 4 bytes for compatibility reasons
that are explained in drisw_glx.c:bytes_per_line, so drisw_update_tex_buffer
must do the same.
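
The rounding itself is small. A sketch, with the hypothetical helper name `padded_pitch()` and no claim to match the driver's code:

```cpp
#include <cassert>
#include <cstdint>

// Bytes per row, rounded up to the next multiple of 4.
std::uint32_t padded_pitch(std::uint32_t width, std::uint32_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    const std::uint32_t raw = width * bytes_per_pixel;
    return (raw + 3u) & ~3u;
}
```

On a 16-bit screen an odd width gives a raw pitch that is not a multiple of 4 (for example 101 pixels is 202 bytes, padded to 204). If only one side pads, every row is off by two more bytes than the last, which would explain the skew.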

Fixes window skew seen while running firefox over vnc on a 16-bit screen.

NOTE: This is a candidate for the stable branches.

[ajax: fixed typo in comment]

Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-24 09:52:24 -04:00
Adam Jackson
2151d893fb gallium: Fix llvmpipe on big-endian machines
Squashed commit of the following:

commit 0857a7e105bfcbc4d1431b2cc56612094c747ca3
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:07 2013 -0400

    gallivm: Fix lp_build_rgba8_to_fi32_soa for big endian

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 0d65131649a8aa140e2db228ba779d685c4333e3
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:07 2013 -0400

    gallivm: Fix big-endian machines

    This adds a bit-shift count to the format table, and adds the concept of
    vector or bitwise alignment on gathers.

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 9740bda9b7dc894b629ed38be9b51059ce90818f
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:07 2013 -0400

    llvmpipe: Fix convert_to_blend_type on big-endian

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit ae037c2de0f029e4e99371c0de25560484f0d8df
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    util: Convert color pack to packed formats

    This fixes them on big-endian.

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 5b05ac0c89ae092ea8ba5bba9f739708d7396b5c
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    graw-xlib: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 51396e7d098cb6ff794391cf11afe4dbf86dbea0
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    format: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 417b60bc66eb450e68a92ab0e47f76e292b385e6
Author: Adam Jackson <ajax@redhat.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    st/dri: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 0934b2e022a5e0847d312c40734e2b44cac52fd8
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    st/xlib: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit a307ea3c3716a706963acce7966b5e405ba11db9
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    gbm: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 53eebdd253e1960a645ea278f31d7ef6a6cf4aeb
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    tests: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 2f77fe3ee524945eacd546efcac34f7799fb3124
Author: Adam Jackson <ajax@redhat.com>
Date:   Tue Jun 18 13:07:37 2013 -0400

    gallium: Document packed formats

    Signed-off-by: Adam Jackson <ajax@redhat.com>

commit 1f1017159ce951f922210a430de9229f91f62714
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    gallium: Introduce 32-bit packed format names

    These are for interacting with buffers natively described in terms of
    bit shifts, like X11 visuals:

        uint32_t xyzw8888 = (x << 0) | (y << 8) | (z << 16) | (w << 24);

    Define these in terms of (endian-dependent) aliases to the array-style
    format names.
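
A small sketch of why the aliases have to depend on endianness: the packed value is the same on every host, but the byte order in memory is not. The helper names below are hypothetical and are not part of the Gallium format code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Packed (bit-shift) description: x occupies the least significant byte.
constexpr std::uint32_t pack_xyzw8888(std::uint8_t x, std::uint8_t y,
                                      std::uint8_t z, std::uint8_t w)
{
    return (std::uint32_t{x} << 0) | (std::uint32_t{y} << 8) |
           (std::uint32_t{z} << 16) | (std::uint32_t{w} << 24);
}

// Channel 0..3 (x..w) extracted by shifting, independent of endianness.
std::uint8_t channel(std::uint32_t packed, unsigned index)
{
    assert(index < 4);
    return static_cast<std::uint8_t>((packed >> (8 * index)) & 0xFFu);
}

// Array-style view: the bytes as they actually sit in memory on this host.
std::array<std::uint8_t, 4> bytes_in_memory(std::uint32_t packed)
{
    std::array<std::uint8_t, 4> bytes;
    std::memcpy(bytes.data(), &packed, sizeof(packed));
    return bytes;
}

bool host_is_little_endian()
{
    return bytes_in_memory(1u)[0] == 1;
}
```

On a little-endian host the first byte in memory is x; on a big-endian host it is w. So the packed name has to map to a different array-style name on each.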

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 6cc7ab1ee66ed668da78c1d951dfd7782b4e786a
Author: Adam Jackson <ajax@redhat.com>
Date:   Mon Jun 3 12:10:32 2013 -0400

    gallium: Document format name conventions

    v2:
    - Fix a channel name thinko (Michel Dänzer)
    - Elaborate on SCALED versus INT
    - Add links to DirectX and FOURCC docs

    Signed-off-by: Adam Jackson <ajax@redhat.com>

commit df4d269e7fb62051a3c029b84147465001e5776e
Author: Adam Jackson <ajax@redhat.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    gallivm: Remove all notion of byte-swapping

    Signed-off-by: Adam Jackson <ajax@redhat.com>

Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-06-24 09:48:56 -04:00
Roland Scheidegger
d282f4ea9b llvmpipe: fix wrong results for queries not in a scene
The result isn't always 0 in this case (depends on query type),
so instead of special casing this just use the ordinary path (should result
in correct values thanks to initialization in query_begin/end), just
skipping the fence wait.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-22 17:09:37 +02:00
Brian Paul
a415aa9489 gallium/docs: more documentation for pipe_resource::array_size
It should never be zero and for cube/cube_arrays it should be a
multiple of six.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-22 08:50:15 -06:00
Brian Paul
cba7939790 svga: minor cleanups, comments in svga_tgsi_insn.c 2013-06-22 08:49:09 -06:00
Brian Paul
b03f394508 svga: add null ptr check in svga_get_tex_sampler_view()
Trivial.
2013-06-22 08:49:09 -06:00
José Fonseca
67bfdea933 tools/trace: Several tweaks/fixes to dump_state 2013-06-22 12:30:39 +01:00
José Fonseca
545d3d32d8 trace: Dump result of create_stream_output_target 2013-06-22 12:30:39 +01:00
Maarten Lankhorst
6aabd9490c vl/mpeg12: fix mpeg-1 bytestream parsing
This fixes the bytestream parsing of mpeg-1 stream, but still leaves
open a number of issues with the interpretation:
- IDCT mismatch control is not correct for MPEG-1.
- Slices do not have to start and end on the same horizontal row of macroblocks.
- picture_coding_type = 4 (D-pictures) is not handled.
- full_pel_*_vector is not handled.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-06-22 09:40:15 +02:00
Rob Clark
efdc6caaf5 freedreno/a3xx/compiler: ensure min # of cycles after bary instr
The results of a bary.f do not appear to be immediately available, but
there is no explicit sync bit.  Instead the compiler must just ensure
that there are a minimum number of instructions following the bary
before use of the result of the bary.  We aren't clever enough for that
so just throw in some nop's.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-21 15:37:05 -04:00
Rob Clark
d4aaa4439a freedreno/a3xx/compiler: add TGSI_OPCODE_ABS
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-21 15:37:05 -04:00
Rob Clark
fe4ae1163d freedreno/a3xx/compiler: add TGSI_OPCODE_DPH
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-21 15:37:05 -04:00
Rob Clark
3f965556b4 freedreno/a3xx/compiler: fix for replicating instructions
If we are accumulating result into tmp.x, and need a mov to final
destination, we want to move the .x component into all of the components
enabled from the read dest's writemask, ie. we want:

  MOV dst.xyzw tmp.xxxx

rather than:

  MOV dst.xyzw tmp.xyzw
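
The difference between the two moves can be shown with a toy model of a swizzled MOV. This is only a sketch with hypothetical names (`Vec4`, `Channel`, `mov()`), not the compiler's instruction emitter.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;
enum Channel : unsigned { X = 0, Y = 1, Z = 2, W = 3 };

// dst.<writemask> = src.<swizzle>; bit c of writemask enables component c.
void mov(Vec4 &dst, unsigned writemask, const Vec4 &src,
         const std::array<Channel, 4> &swizzle)
{
    assert(writemask <= 0xFu);
    for (unsigned c = 0; c < 4; ++c) {
        if (writemask & (1u << c))
            dst[c] = src[swizzle[c]];
    }
}
```

With the result accumulated in tmp.x, the swizzle xxxx copies it into every enabled component. The identity swizzle xyzw would copy whatever happens to be in tmp.y, tmp.z, and tmp.w.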

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-21 15:37:05 -04:00
Eric Anholt
0343f20e2f mesa: Move the common _mesa_glsl_compile_shader() code to glsl/.
This code had no relation to ir_to_mesa.cpp, since it was also used by
intel and state_tracker, and most of it was duplicated with the standalone
compiler (which has periodically drifted from the Mesa copy).

v2: Split from the ir_to_mesa to shaderapi.c changes.

Acked-by: Paul Berry <stereotype441@gmail.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-21 10:04:30 -07:00
Eric Anholt
10c14d16d2 mesa: Move shader compiler API code to shaderapi.c
There was nothing ir_to_mesa-specific about this code, but it's not
exactly part of the compiler's core turning-source-into-IR job either.

v2: Split from the ir_to_mesa to glsl/ commit, avoid renaming the sh
    variable.

Acked-by: Paul Berry <stereotype441@gmail.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-21 10:04:29 -07:00
Eric Anholt
88398a817c mesa: Fix missing setting of shader->IsES.
I noticed this while trying to merge code with the builtin compiler, which
does set it.

Note that this causes two regressions in piglit in
default-precision-sampler.* which try to link without a vertex or fragment
shader, due to being run under the desktop glslparsertest binary (using
ARB_ES3_compatibility) that doesn't know about this requirement.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-21 10:04:29 -07:00
Eric Anholt
faf3dbad0d mesa: Use shared code for converting shader targets to short strings.
We were duplicating this code all over the place, and they all would need
updating for the next set of shader targets.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-21 10:04:29 -07:00
Eric Anholt
426ca34b7a glsl: Remove ir_print_visitor.h includes and usage
We have ir->print() to do the old declaration of a visitor and having the
IR accept the visitor (yuck!).  And now you can call _mesa_print_ir()
safely anywhere that you know what an ir_instruction is.

A couple of missing printf("\n")s are added in error paths -- when an
expression is handed to the visitor, it doesn't print '\n' (since it might
be a step in printing a whole expression tree).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-21 10:04:29 -07:00
Eric Anholt
2b049aa53e glsl: Make _mesa_print_ir() available from anything including ir.h.
No more forgetting to #include "ir_print_visitor.h" when doing temporary
debug code, or forgetting and leaving it in after removing your temporary
debug code.  Also, available from C code so you don't need to move the
caller to C++ just to call it (see also: ir_to_mesa.cpp).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-21 10:04:29 -07:00
Paul Berry
d0abac22c3 glsl: Make some files safe to include from C
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-21 10:04:28 -07:00
José Fonseca
2d7e837716 tools/trace: Quick instructions/notes.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:20 +01:00
José Fonseca
c14f516e58 tools/trace: Do a better job at comparing multi line strings.
For TGSI diffing.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:20 +01:00
José Fonseca
9b7d21f8f5 tools/trace: Tool to compare json state dumps.
Copied verbatim from apitrace's scripts/jsondiff.py
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:20 +01:00
José Fonseca
cc4ad695ca tools/trace: Tool to dump gallium state at any draw call.
Based on the code from the good old python state tracker.

Extremely handy to diagnose regressions in state trackers.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:20 +01:00
José Fonseca
a7bccb33b9 tools/trace: Defer blob hex-decoding.
To speed up parsing.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:19 +01:00
José Fonseca
a8f7e12d92 trace: Don't dump texture transfers.
Huge trace files with little value.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:19 +01:00
Chia-I Wu
bbd2d575e6 ilo: replace a boolean by bool
bool is used internally.  This is just cosmetic.
2013-06-20 11:40:20 +08:00
Chia-I Wu
8b2cba8f97 ilo: rename cache_seqno to uploaded
It has been used as a bool since shader cache rework.
2013-06-20 11:36:54 +08:00
Roland Scheidegger
ffebefa114 util: (trivial) add has_popcnt field
Not used yet but there's a couple of places in llvmpipe which should use this
(occlusion count is currently very inefficient if there's no cpu popcnt
instruction).
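
For illustration only, a minimal Python sketch of the kind of software bit count a caller might fall back to when has_popcnt is false. The helper name `popcount` and the idea of counting bits in a per-pixel coverage mask are assumptions, not code from this commit.

```python
def popcount(value):
    """Count set bits by repeatedly clearing the lowest set bit."""
    assert value >= 0, "sketch only handles non-negative masks"
    count = 0
    while value:
        value &= value - 1  # clears the lowest set bit
        count += 1
    return count


# Hypothetical coverage mask with three pixels set.
print(popcount(0b1011))
```

The loop runs once per set bit, which is why a hardware popcnt instruction is attractive for dense masks.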
2013-06-19 23:47:36 +02:00
Roland Scheidegger
5c9aee111e llvmpipe: use 64bit counter for occlusion queries
Some APIs require 64bit and at least for 64bit archs the overhead
should be minimal.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 23:47:36 +02:00
Roland Scheidegger
dc5dc4fd94 llvmpipe: handle more queries
Handle PIPE_QUERY_GPU_FINISHED and PIPE_QUERY_TIMESTAMP_DISJOINT, and
also fill out the ps_invocations and c_primitives from the
PIPE_QUERY_PIPELINE_STATISTICS (the others in there should already
be handled). Note that ps_invocations isn't pixel exact, just 16 pixel
exact but I guess it's better than nothing.
Doesn't really seem to work correctly but there are probably bugs elsewhere.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 23:47:36 +02:00
Roland Scheidegger
bf5096303f softpipe: handle all queries, and change for the new disjoint semantics
The driver can do render_condition but wasn't handling the occlusion
and so_overflow predicates (though the latter might not work yet due
to gs support).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 23:47:36 +02:00
Roland Scheidegger
cdf89d0b5c gallium: fix PIPE_QUERY_TIMESTAMP_DISJOINT
The semantics didn't really make sense, matching neither d3d9
(though the docs are all broken there) nor d3d10. So make it match d3d10
semantics, which actually gives meaning to the "disjoint" part.
Drivers are fixed up in a very primitive way, I have no idea what could
actually cause the counter to become unreliable so just always return
FALSE for the disjoint part.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 23:47:35 +02:00
José Fonseca
a0a40805dd trace: Dump pipe_rasterizer_state::clip_halfz.
Trivial.
2013-06-19 18:16:16 +01:00
Brian Paul
1e16e48f88 svga: add some comments about primitive conversion
And clean up the svga_translate_prim() function with better
variable names.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
8b3d4efed8 indices: add some comments
This is pretty complicated code with few, if any, comments.  Here's a first stab.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
2e8c51c98f svga: reindent svga_tgsi.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
0de01a47dd svga: whitespace, comment, formatting fixes in svga_tgsi_emit.h
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
1f57349e20 svga: move some svga/tgsi functions
Move some functions from the svga_tgsi_insn.h header into the
svga_tgsi_insn.c file since they're only used there.  Plus, add
comments and fix formatting.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
3abd9285be svga: formatting fixes in svga_tgsi_insn.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:13 -06:00
Brian Paul
9e6c29bf12 mesa: wrap comments, code to 78 columns in multisample.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:13 -06:00
Brian Paul
bdd5a0c12b mesa: remove unused BITSET64 macros
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:13 -06:00
Maarten Lankhorst
f1cccd6ca0 nvc0: kill assert in ppp code
It's no longer always true, and the video tiling alignment should
ensure the alignment is handled correctly regardless.
2013-06-19 13:08:51 +02:00
Chia-I Wu
cf41fae96b ilo: rework shader cache
The new code makes the shader cache manage all shaders and be able to upload
all of them to a caller-provided bo as a whole.

Previously, we uploaded only the bound shaders.  When a different set of
shaders is bound, we had to allocate a new kernel bo to upload if the current
one is busy.
2013-06-19 16:46:42 +08:00
Emil Velikov
7f7b05d6b3 nv50: avoid crash on updating RASTERIZE_ENABLE state
When doing blit using the 3D engine, the rasterizer cso may be NULL.

Ported from nvc0 commit 8aa8b0539.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-06-19 00:02:24 +02:00
Kristian Høgsberg
712269d674 wayland: Handle global_remove event as well
We need to set up a handler for the global_remove event that gets sent
out when a global gets removed.  Without the handler we end up calling
a NULL pointer.

https://bugs.freedesktop.org/show_bug.cgi?id=65910

NOTE: This is a candidate for the stable branches.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2013-06-18 17:45:19 -04:00
Jordan Justen
adeda5afd4 gen7: fix GPU hang on WebGL texture-size test
When rendering to a texture with BaseLevel set, the miptree may be laid
out such that BaseLevel is in level 0 of the miptree (to avoid wasting
memory on unused levels between 0 and BaseLevel-1).  In that case, we
have to shift our render target's level down to the appropriate level of
the smaller miptree.
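
A minimal Python sketch of the level shift described above. The function name, the `miptree_starts_at_base` flag, and the assumption that the shift amount is exactly BaseLevel are all hypothetical; this is not the driver code.

```python
def render_target_level(level, base_level, miptree_starts_at_base):
    """Map a texture level to the level inside the (possibly smaller) miptree.

    Assumption: when the miptree stores BaseLevel at its level 0, every
    texture level is shifted down by base_level.
    """
    if miptree_starts_at_base:
        assert level >= base_level, "levels below BaseLevel are not stored"
        return level - base_level
    return level


# Texture level 3 with BaseLevel 2 lands in miptree level 1.
print(render_target_level(3, 2, True))
```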

The WebGL test in combination with meta code related to
glGenerateMipmap also triggered a similar failure scenario.

This GPU hang regression was introduced by c754f7a8.

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=65324
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-18 14:06:46 -07:00
Eric Anholt
248fddecd8 intel: Remove unused IS_POWER_OF_TWO() macro.
The is_power_of_two() inline function has been used instead.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-06-18 12:08:08 -07:00
Zack Rusin
9542131b27 Revert "draw: clear the draw buffers in draw"
This reverts commit 41966fdb3b.
While it's a lot cleaner it causes regressions because
the draw interface is always called from the draw functions
of the drivers (because the buffers need to be mapped) which
means that the stream output buffers end up being cleared on
every draw rather than on setting.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-17 21:43:10 -04:00
Roland Scheidegger
8975dc798d llvmpipe: fixes for conditional rendering
Honor render_condition for clear_render_target and clear_depth_stencil.
Also add minimal support for occlusion predicate, though it can't be active
at the same time as an occlusion query yet.
While here also switchify some large if-else (actually just mutually
exclusive if-if-if...) constructs.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-18 18:01:24 +02:00
Roland Scheidegger
793e8e3d7e gallium: add condition parameter to render_condition
For conditional rendering this makes it possible to skip rendering
if either the predicate is true or false, as supported by d3d10
(in fact previously it was sort of implied skip rendering if predicate
is false for occlusion predicate, and true for so_overflow predicate).
There's no cap bit for this as presumably all drivers could do it trivially
(but this patch does not implement it for the drivers using true
hw predicates, nvxx, r600, radeonsi, no change is expected for OpenGL
functionality).
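
A minimal Python sketch of the predicate test, assuming `condition` names the predicate value at which rendering is skipped. The function `should_render` is a hypothetical helper, not part of the gallium interface.

```python
def should_render(predicate_result, condition):
    """Return False when the predicate's boolean value equals `condition`."""
    return bool(predicate_result) != condition


# Occlusion predicate: the previously implied behaviour is "skip if false".
print(should_render(predicate_result=0, condition=False))

# so_overflow predicate: the previously implied behaviour is "skip if true".
print(should_render(predicate_result=True, condition=True))
```

Under this reading both previously implied behaviours fall out of one comparison, and the caller picks the polarity.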

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-18 18:01:24 +02:00
Chia-I Wu
443dc15cf7 ilo: construct depth/stencil command in create_surface()
Add ilo_gpe_init_zs_surface() to construct

 3DSTATE_DEPTH_BUFFER
 3DSTATE_STENCIL_BUFFER
 3DSTATE_HIER_DEPTH_BUFFER

at surface creation time.  This allows fast state emission in draw_vbo().
2013-06-18 16:23:13 +08:00
Eric Anholt
eb20215075 intel: Allow blorp CopyTexSubImage to nonzero destination slices.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-17 15:43:23 -07:00
Eric Anholt
746b57ef0e intel: Allow blit CopyTexSubImage to nonzero destination slices.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-17 15:43:23 -07:00
Eric Anholt
b0e3c3b852 intel: Directly implement blit glBlitFramebuffer instead of awkward reuse.
This gets us support for blitting to attachment types other than
textures.

v2: fix up comments from review by Kenneth.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-06-17 15:43:23 -07:00
Eric Anholt
815dce9282 intel: Move XRGB->ARGB blit logic into intel_miptree_blit().
Now any caller (such as glCopyPixels()) can benefit from it, and it only
changes the correct subset of the destination instead of a whole teximage.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-17 15:43:23 -07:00
Eric Anholt
04a5e940c9 intel: Fix Y tiling support for glCopyTexSubImage's alpha override.
Apparently we don't have any piglit tests for this, because it would have
assertion failed in a debug build, or just rendered wrong in a non-debug
build if the destination wasn't covering whole tiles.

v2: Use the new macros.

Reviewed-by: Paul Berry <stereotype441@gmail.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-06-17 15:43:23 -07:00
Eric Anholt
78c2fc5925 intel: Make batch macros for doing BCS_SWCTRL setup.
We're going to add more BCS_SWCTRL setup instances soon, and you have to
be careful to have the set and restore atomic with the rendering that's
done, so that our state doesn't leak out to other rendering processes.

v2: Rewrite the patch to have batch begin/advance macros so that magic
    numbers don't get sprinkled around (and so you don't mix up your
    do-I-need-to-reset vs what-do-I-reset-to logic, which I nearly did in
    the next patch when first writing it)

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-17 15:43:13 -07:00
Eric Anholt
b65b1c3148 mesa: Hide weirdness of 1D_ARRAY textures from Driver.CopyTexSubImage().
Intel had brokenness here, and I'd like to continue moving Mesa toward
hiding 1D_ARRAY's ridiculousness inside of the core, like we did with
MapTextureImage.  Fixes copyteximage 1D_ARRAY on intel.

There's still an impedance mismatch in meta when falling back to read and
texsubimage, since texsubimage expects coordinates into 1D_ARRAY as
(width, slice, 0) instead of (width, 0, slice).

v2: Fix offset of scanline reads from the source. (Thanks Brian!), replace
    dd.h comment with Paul's text and replace early exit with an assert.

Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Paul Berry <stereotype441@gmail.com> (v1)
2013-06-17 15:26:20 -07:00
Dave Airlie
9e8400f4c9 tgsi: text parser: fix parsing of array in declaration
I noticed this code didn't work as advertised while doing some passing around
of TGSI shaders and trying to reparse them, and things failing.

This seems to fix it here for at least the small test case I hacked into a
graw test.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-06-18 08:25:12 +10:00
Sven Joachim
0829b893a9 mesa: Fix ieee fp on Alpha
Commit 1f82bf12ed inadvertently broke it, checking for __IEEE_FLOAT on all
Alpha machines instead of only on VMS as before.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Signed-off-by: Sven Joachim <svenjoac@gmx.de>
2013-06-17 10:02:56 -07:00
Richard Sandiford
c132c2978b st/xlib: Fix XImage stride calculation
Fixes window skew seen while running gnome on a 16-bit screen over vnc.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-17 12:15:13 -04:00
Richard Sandiford
876fefe2ff st/xlib: Fix XImage bytes-per-pixel calculation
Fixes a crash seen while running gnome on a 16-bit screen over vnc.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-17 12:14:32 -04:00
Jonathan Gray
ebd68dd029 gallium: replace bswap_32 calls with util_bswap32
byteswap.h and bswap_32 aren't portable, replace them with calls to
gallium's util_bswap32 as suggested by Mark Kettenis.  Lets these files
build on OpenBSD.
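
For reference, a portable 32-bit byte swap needs only shifts and masks. The Python sketch below shows the idea; it is not the body of util_bswap32, and the name `bswap32` is a stand-in.

```python
def bswap32(value):
    """Reverse the byte order of a 32-bit unsigned integer."""
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    return (((value & 0x000000FF) << 24) |
            ((value & 0x0000FF00) << 8) |
            ((value & 0x00FF0000) >> 8) |
            ((value & 0xFF000000) >> 24))


print(hex(bswap32(0x11223344)))
```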

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-06-17 17:22:28 +02:00
Zack Rusin
7807763dd8 draw: fix a regression in computing max elt
gl can use elts without setting indices, in which case
our eltMax was set to 0 and always invoking the overflow
condition. So by default set eltMax to maximum, it will
be curbed by draw_set_indexes (if it ever comes) and if
not then it will let gl's glVertexPointer/glDrawArrays
work correctly. Fixes piglit's
triangle-rasterization-overdraw test.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-17 11:06:39 -04:00
Zack Rusin
41966fdb3b draw: clear the draw buffers in draw
Moves clearing of the draw so target buffers to the draw
module. They had to be cleared in the drivers before
which was quite messy.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-17 11:06:39 -04:00
Chia-I Wu
98bc4c62a6 ilo: add pipe-based copy method to ilo_blitter
It enables accelerated resource_copy_region() when blt-based method fails.
2013-06-17 18:28:58 +08:00
Chia-I Wu
ebfd7a61c0 ilo: add BLT-based blitting methods to ilo_blitter
Port BLT code in ilo_blit.c to BLT-based blitting methods of ilo_blitter.  Add
BLT-based clears.  The latter is verified with util_clear(), but it is not in
use yet.
2013-06-17 16:36:53 +08:00
Chia-I Wu
b4b3a5c6dc ilo: replace util_blitter by ilo_blitter
ilo_blitter is just a wrapper for util_blitter for now.  We will port BLT code
to ilo_blitter shortly.
2013-06-17 14:37:10 +08:00
Kenneth Graunke
6d7abafdc8 i965: Assume flexible hardware primitive restart exists in the future.
Primitive restart with an arbitrary cut index was first supported as of
Haswell.  It's very doubtful that they'd take that away in future
hardware, so we may as well alter the check now.
2013-06-14 22:58:18 -07:00
Chris Forbes
def84d8014 i965: Shrink Gen5 VUE map layout to be the same as Gen4.
The PRM suggests a larger layout, mostly to support having
gl_ClipDistance[] somewhere predictable for the fixed-function clipper
-- but it didn't actually arrive in Gen5.

Just use the same layout for both Gen4 and Gen5.

No Piglit regressions.

Improves performance in CS:S Video Stress Test by ~3%.

V2: - Remove now-useless function for determining the SF URB read offset
    - Remove now-unused BRW_VARYING_SLOT_POS_DUPLICATE

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-16 01:05:41 +12:00
Kenneth Graunke
1b77d2133c i965: Implement 16-wide math on G45 and Ironlake.
[chrisf:]
Improves performance in CS:S video stress test by about 2%.
No piglit regressions on Ironlake.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-06-16 00:47:50 +12:00
Matt Turner
fcaa48d9cc glsl: Disallow return with a void argument from void functions.
NOTE: This is a candidate for the stable branches.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:49 -07:00
Matt Turner
1a1b03e6bc glsl: Allow implicit conversion of return values.
Required by ARB_shading_language_420pack.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:49 -07:00
Matt Turner
876e16562b glsl: Add gl_{Max,Min}ProgramTexelOffset built-in constants.
Required by ARB_shading_language_420pack. Note that the 420pack spec
incorrectly specifies their values as (Min, Max) = (-7, 8) when they
should be (-8, 7) as listed in the GLSL 4.30 and ESSL 3.0 specs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:49 -07:00
Matt Turner
ed455cdb0b glsl: Allow swizzles on scalars.
Required by ARB_shading_language_420pack.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:49 -07:00
Matt Turner
a8492e8fe7 glsl: Allow .length() method on vectors and matrices.
Required by ARB_shading_language_420pack.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:49 -07:00
Todd Previte
cf7f424e18 mesa: Add infrastructure for ARB_shading_language_420pack.
v2 [mattst88]
  - Split infrastructure into separate patch.
  - Add preprocessor #define.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:48 -07:00
Chia-I Wu
bfa8d21759 ilo: fix for half-float vertex arrays
Commit 6fe0453c33 broke half-float vertex
arrays.  This reverts a part of that commit, and explains why.
2013-06-15 01:00:03 +08:00
Chia-I Wu
36ffd08706 ilo: add some assertions to help debugging
Assert that we do not support user vertex/index/constant buffers.  Issue a
warning when a sampler view is created for a resource without
PIPE_BIND_SAMPLER_VIEW.
2013-06-14 16:02:31 +08:00
Chia-I Wu
0d9afaad35 ilo: silence a compiler warning
The path should never be hit.
2013-06-14 15:36:30 +08:00
Vinson Lee
93534873b0 glsl: Fix null check in read_dereference.
Fixes "Logically dead code" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 22:13:34 -07:00
Chia-I Wu
399548b17f st/mesa: fix temp texture bindings in st_CopyPixels()
The temporary texture should have either PIPE_BIND_RENDER_TARGET or
PIPE_BIND_DEPTH_STENCIL set in addition to PIPE_BIND_SAMPLER_VIEW.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-06-14 08:46:04 +08:00
Zack Rusin
5507c11f85 gallium/draw: add limits to the clip and cull distances
There are strict limits on those registers. Define the maximums
and use them instead of magic numbers. Also allows us to add
some extra sanity checks.
Suggested by Brian.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-13 12:13:11 -04:00
Zack Rusin
b63eeaf7b7 draw: cleanup the distance culling code a bit
We don't need the clamped variable, because we can just
return early. We should also do the regular culling after
the distance culling passes.
All spotted by Brian.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-13 12:13:01 -04:00
Chia-I Wu
c7e9b15010 ilo: mapping a resource may make some states dirty
When a resource is busy and is mapped with
PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE, the underlying bo is replaced.  We need
to mark states affected by the resource dirty.

With this change, we no longer have to emit vertex buffers and index buffer
unconditionally.
2013-06-13 23:47:18 +08:00
Chia-I Wu
5f15050dc9 ilo: bump up PIPE_CAP_GLSL_FEATURE_LEVEL to 140
With UBO and TBO support, we are supposedly good to claim GLSL 1.40.
2013-06-13 23:47:18 +08:00
Chia-I Wu
4df85dbc06 ilo: initialize dirty flags in ilo_init_states()
Now that we have a function to initialize states, initialize dirty flags there
too.
2013-06-13 23:47:18 +08:00
Chia-I Wu
6057d7b7b5 ilo: re-emit states that involve resources
Even with hardware contexts, since we do not pin resources, we have to re-emit
the states so that the resources are referenced (by cp->bo) and their offsets
are updated in case they are moved.  This also allows us to eliminate cp flush
in is_bo_busy().
2013-06-13 12:58:47 +08:00
Chia-I Wu
b65bdc61bd ilo: fix for util_blitter_clear() changes
It has been broken since 17350ea979.
2013-06-13 12:58:47 +08:00
Manfred Ernst
bf2c074a2f mesa: Fix bug in unclamped float to ubyte conversion.
Problem: The IEEE float optimized version of UNCLAMPED_FLOAT_TO_UBYTE
in macros.h computed incorrect results for inputs in the range
0x3f7f0000 (=0.99609375) to 0x3f7f7f80 (=0.99803924560546875)
inclusive.  0x3f7f7f80 is the IEEE float value that results in 254.5
when multiplied by 255.  With rounding mode "round to closest even
integer", this is the largest float in the range 0.0-1.0 that is
converted to 254 by the generic implementation of
UNCLAMPED_FLOAT_TO_UBYTE.  The IEEE float optimized version
incorrectly defined the cut-off for mapping to 255 as 0x3f7f0000
(=255.0/256.0). The same bug was present in the function
float_to_ubyte in u_math.h.

Fix: The proposed fix replaces the incorrect cut-off value by
0x3f800000, which is the IEEE float representation of 1.0f. 0x3f7f7f81
(or any value in between) would also work, but 1.0f is probably
cleaner.
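
To make the off-by-one concrete, here is a minimal Python sketch. It is not the macro from macros.h: the helper names are hypothetical, the generic path is assumed to be clamp, multiply by 255, round to nearest even, and the fast path is modelled only by its early-out comparison.

```python
import struct


def float_from_bits(bits):
    """Reinterpret a 32-bit pattern as an IEEE single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def generic_float_to_ubyte(f):
    """Assumed generic path: clamp to [0, 1], scale by 255, round to even."""
    clamped = min(max(f, 0.0), 1.0)
    return round(clamped * 255.0)


def cutoff_says_255(bits, cutoff_bits):
    """Model of the early-out for non-negative inputs: bits >= cutoff -> 255."""
    return bits >= cutoff_bits


OLD_CUTOFF = 0x3F7F0000  # 255.0/256.0, the cut-off described as incorrect
NEW_CUTOFF = 0x3F800000  # 1.0f, the proposed cut-off

x = 0x3F7F0000
print(float_from_bits(x), generic_float_to_ubyte(float_from_bits(x)))
print(cutoff_says_255(x, OLD_CUTOFF), cutoff_says_255(x, NEW_CUTOFF))
```

At the low end of the affected range the generic path yields 254, while the old cut-off short-circuits to 255; with the cut-off at 1.0f the early-out no longer fires for that input.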

The patch does not regress piglit on llvmpipe and on i965 on sandy
bridge.

Tested-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-12 20:24:48 -07:00
Marek Olšák
3475b22133 st/dri: if flushing a drawable, don't set reason=SWAPBUFFERS
0 means SWAPBUFFERS.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
a713d7b1b9 st/dri: resolve the back buffer only in SwapBuffers
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
3b525036b9 st/dri: manually swap MSAA front and back buffers in SwapBuffers
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
b77316ad75 st/dri: always copy new DRI front and back buffers to corresponding MSAA buffers
This commit fixes these piglit tests with an MSAA visual forced on:
- read-front
- glx-copy-sub-buffer

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
fdf9d234e2 st/dri: refactor dri_msaa_resolve
The generic blit will be used by the following commit.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
6c6cfc02c9 st/dri: reuse depth-stencil and MSAA resources after DRI2 invalidate event
Page flipping generates an invalidate event every frame, causing reallocations
of all private resources (MSAA and depth-stencil).

Reusing the resources may improve performance (especially under memory
pressure).

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
683b065320 st/dri: fix MSAA resolving of buffers with height > width
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
526ebfa278 st/mesa: make generic CopyPixels path work with MSAA visuals
We have to use pipe->blit, not resource_copy_region, so that the read buffer
is resolved if it's multisampled. I also removed the CPU-based copying,
which just did format conversion (obsoleted by the blit).

Also, the layer/slice/face of the read buffer is taken into account (this was
ignored).

Last but not least, the format choosing is improved to take float and integer
read buffers into account.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
9ef44e6eb7 st/mesa: don't use blit_copy_pixels if an occlusion query is active
CopyPixels, just as DrawPixels, should count the samples that passed
depth test.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
79e421260a st/mesa: rework blit_copy_pixels to use pipe->blit
There were 2 issues with it:
- resource_copy_region doesn't allow different sample counts of both src
  and dst, which can occur if we blit between a window and a FBO, and
  the window has an MSAA colorbuffer and the FBO doesn't.
  (this was the main motivation for using pipe->blit)
- blitting from or to a non-zero layer/slice/face was broken, because
  rtt_face and rtt_slice were ignored.

blit_copy_pixels is now used even if the formats and orientation of
framebuffers don't match.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
4d59258856 r600g: upsample and downsample MSAA resources for transfers
We did downsample (=resolve) MSAA resources to make ReadPixels work with MSAA
GLX visuals, which was enough for read-only color-only transfers.

This commit makes write color transfers and depth-stencil transfers work
in a similar manner. It does downsampling in transfer_map and upsampling
in transfer_unmap.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
72a086b8b2 gallium/u_format: add a new helper for initializing pipe_blit_info::mask
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
d6d4a9a2e8 gallium/u_blitter: make clearing independent of the colorbuffer format
There isn't any difference between 32_FLOAT and 32_*INT in vertex fetching.
Both of them don't do any format conversion.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
17350ea979 gallium/u_blitter: make clearing independent of the number of bound colorbuffers
We can use the fragment shader TGSI property WRITES_ALL_CBUFS.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
de1c38299c gallium/util: make WRITES_ALL_CBUFS optional in the passthrough fragment shader
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
45595d5066 mesa: fix OES_EGL_image_external being partially allowed in the core profile
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-13 03:54:13 +02:00
Ian Romanick
cfa3c5ad82 glsl: Generate smaller values for uniform locations
Previously we would generate uniform locations as (slot << 16) +
array_index.  We do this to handle applications that assume the location
of a[2] will be +1 from the location of a[1].  This resulted in every
uniform location being at least 0x10000.  The OpenGL 4.3 spec was
amended to require this behavior, but previous versions did not require
locations of array (or structure) members be sequential.

We've now encountered two applications that assume uniform values will
be "small."  As far as we can tell, these applications store the GLint
returned by glGetUniformLocation in an int16_t or possibly an int8_t.
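
A small Python sketch of why the old locations break such applications. The int16_t storage is the suspected application behaviour described above, not something confirmed; `old_location` and `as_int16` are hypothetical helpers.

```python
import ctypes


def old_location(slot, array_index):
    """Old scheme described above: (slot << 16) + array_index."""
    return (slot << 16) + array_index


def as_int16(value):
    """What survives if the GLint is stored in an int16_t (truncation)."""
    return ctypes.c_int16(value).value


a = old_location(1, 0)
b = old_location(2, 0)
print(hex(a), hex(b), as_int16(a), as_int16(b))
```

After truncation only the array index survives, so two different uniforms become indistinguishable.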

THIS BEHAVIOR IS NOT GUARANTEED OR IMPLIED BY ANY VERSION OF OpenGL.

Other implementations happen to have both these behaviors (sequential
array elements and small values) since OpenGL 2.0, so let's just match
their behavior.

Fixes "3D Bowling" on Android.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-12 16:30:29 -07:00
Ian Romanick
26d86d26f9 glsl: Add gl_shader_program::UniformLocationBaseScale
This is used by _mesa_uniform_merge_location_offset and
_mesa_uniform_split_location_offset to determine how the base and offset
are packed.  Previously, this value was hard coded as (1U<<16) in those
functions via the shift and mask contained therein.  The value is still
(1U<<16), but it can be changed in the future.
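
A minimal Python sketch of packing with a configurable scale. The function names echo the Mesa ones, but these are stand-ins: the real functions take a gl_shader_program, and the multiply/divmod form is an assumption about how a general scale would be applied.

```python
def merge_location_offset(base, offset, scale):
    """Pack a base location and an array offset into one integer."""
    assert 0 <= offset < scale, "offset must fit below the scale"
    return base * scale + offset


def split_location_offset(location, scale):
    """Recover (base, offset) from a packed location."""
    return divmod(location, scale)


SCALE = 1 << 16  # the value that used to be hard coded
loc = merge_location_offset(3, 2, SCALE)
print(hex(loc), split_location_offset(loc, SCALE))
```

With a scale of 1 << 16 this matches the shift-and-mask form; a smaller scale yields smaller packed values.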

The next patch dynamically generates this value.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-12 16:30:18 -07:00
Ian Romanick
5097f35841 glsl: Add a gl_shader_program parameter to _mesa_uniform_{merge,split}_location_offset
This will be used in the next commit.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-12 16:30:06 -07:00
Roland Scheidegger
4cce4efaa3 util: new util_fill_box helper
Use new util_fill_box helper for util_clear_render_target.
(Also fix off-by-one map error.)

v2: handle non-zero z correctly in new helper

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-13 00:41:43 +02:00
Roland Scheidegger
957c040eb8 gallivm: (trivial) remove duplicated code block (including comment)
2013-06-13 00:41:43 +02:00
Paul Berry
b09a754078 i965/gen7: Enable support for fast color clears.
This patch adds code to place mcs_state into INTEL_MCS_STATE_RESOLVED
for miptrees that are capable of supporting fast color clears.  This
will have no effect on buffers that don't undergo a fast color clear;
however, for buffers that do undergo a fast color clear, an MCS
miptree will be allocated (at the time of the first fast clear), and
will be used thereafter.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:07 -07:00
Paul Berry
ef9142d4a3 i965/gen7+: Disable fast color clears on shared regions.
In certain circumstances the memory region underlying a miptree is
shared with other miptrees, or with other code outside Mesa's control.
This happens, for instance, when an extension like GL_OES_EGL_image or
GLX_EXT_texture_from_pixmap is used to associate a miptree
with an image existing outside of Mesa.

When this happens, we need to disable fast color clears on the miptree
in question, since there's no good synchronization mechanism to ensure
that deferred clear writes get performed by the time the buffer is
examined from the other miptree, or from outside of Mesa.

Fortunately, this should not be a performance hit for most
applications, since most applications that use these extensions use
them for importing textures into Mesa, rather than for exporting
rendered images out of Mesa.  So most of the time the miptrees
involved will never experience a clear.

v2: Rework based on the fact that we have decided not to use an
accessor function to protect access to the region.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:07 -07:00
Paul Berry
67cd0f9703 i965/gen7+: Resolve color buffers when necessary.
Resolve color buffers that have been fast-color cleared:
    1. before texturing from the buffer (brw_predraw_resolve_buffers())
    2. before using the buffer as the source in a blorp blit
       (brw_blorp_blit_miptrees())
    3. before mapping the buffer's miptree (intel_miptree_map_raw(),
       intel_texsubimage_tiled_memcpy())
    4. before accessing the buffer using the hardware blitter
       (intel_miptree_blit(), do_blit_bitmap())

v2: Rework based on the fact that we have decided not to use an
accessor function to protect access to the region.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:07 -07:00
Paul Berry
e9dfcb38e9 i965/gen7+: Ensure that front/back buffers are fast-clear resolved.
We already had code in intel_downsample_for_dri2_flush() for
downsampling front and back buffers when multisampling was in use.
This patch extends that function to perform fast color clear resolves
when necessary.

To account for the additional functionality, the function is renamed
to simply intel_resolve_for_dri2_flush().

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:07 -07:00
Paul Berry
418aecea7d i965/blorp: Write blorp code to do render target resolves.
This patch implements the "render target resolve" blorp operation.
This will be needed when a buffer that has experienced a fast color
clear is later used for a purpose other than as a render target
(texturing, glReadPixels, or swapped to the screen).  It resolves any
remaining deferred clear operation that was not taken care of during
normal rendering.

Fortunately not much work is necessary; all we need to do is scale
down the size of the rectangle primitive being emitted, run the
fragment shader with the "Render Target Resolve Enable" bit set, and
ensure that the fragment shader writes to the render target using the
"replicated color" message.  We already have a fragment shader that
does that (the shader that we use for fast color clears), so for
simplicity we re-use it.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:07 -07:00
Paul Berry
fac32c0bd3 i965/blorp: Expand clear class hierarchy to prepare for RT resolves.
The fragment shaders that do color clears will be re-used to
perform so-called "render target resolves" (the resolves associated
with fast color clears).  To prepare for that, this patch expands the
class hierarchy for blorp params by adding
brw_blorp_const_color_params (which will be used for all blorp
operations where the fragment shader outputs a constant color).

Some other data structures and functions were also renamed to use
"const_color" nomenclature where appropriate.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:06 -07:00
Paul Berry
5e5d4e021f i965/gen7+: Implement fast color clear operation in BLORP.
Since we defer allocation of the MCS miptree until the time of the
fast clear operation, this patch also implements creation of the MCS
miptree.

In addition, this patch adds the field
intel_mipmap_tree::fast_clear_color_value, which holds the most recent
fast color clear value, if any. We use it to set the SURFACE_STATE's
clear color for render targets.

v2: Flag BRW_NEW_SURFACES when allocating the MCS miptree.  Generate a
perf_debug message if clearing to a color that isn't compatible with
fast color clear.  Fix "control reaches end of non-void function"
build warning.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:06 -07:00
Paul Berry
dd3f950115 i965/gen7+: Create helper functions for single-sample MCS buffers.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 10:45:42 -07:00
Paul Berry
460b7bc7a1 i965/gen7+: Set up MCS in SURFACE_STATE whenever MCS is present.
On Gen7+, MCS buffers are used both for compressed multisampled color
buffers and for "fast clear" of single-sampled color buffers.

Previous to this patch series, we didn't support fast clear, so we
only used MCS with multisampled color buffers.

As a first step to implementing fast clears, this patch modifies the
code that sets up SURFACE_STATE so that it configures the MCS buffer
whenever it is present, regardless of whether we are multisampling or
not.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 10:45:42 -07:00
Paul Berry
7e5cb4bc4c i965/gen7+: Create an enum for keeping track of fast color clear state.
This patch includes code to update the fast color clear state
appropriately when rendering occurs.  The state will also need to be
updated when a fast clear or a resolve operation is performed; those
state updates will be added when the fast clear and resolve operations
are added.

v2: Create a new function, intel_miptree_used_for_rendering() to
handle updating the fast color clear state when rendering occurs.
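
As a rough illustration of the kind of state tracking described above,
here is a minimal sketch.  The state names and transition helpers are
hypothetical; the commit message does not list the enum's values, and
the assumption that rendering only changes a freshly cleared buffer is
mine.

```c
/* Hypothetical states for a color buffer that may have been fast cleared. */
enum fast_clear_state {
   FAST_CLEAR_STATE_RESOLVED,   /* memory holds the real contents */
   FAST_CLEAR_STATE_CLEAR,      /* fast cleared, nothing rendered since */
   FAST_CLEAR_STATE_UNRESOLVED  /* fast cleared, then rendered to */
};

/* A fast clear always leaves the buffer in the "clear" state. */
static enum fast_clear_state
state_after_fast_clear(enum fast_clear_state state)
{
   (void) state;
   return FAST_CLEAR_STATE_CLEAR;
}

/* Assumption: rendering only matters if the buffer was freshly cleared. */
static enum fast_clear_state
state_after_rendering(enum fast_clear_state state)
{
   if (state == FAST_CLEAR_STATE_CLEAR)
      return FAST_CLEAR_STATE_UNRESOLVED;
   return state;
}

/* A resolve writes out any deferred clear, whatever the previous state. */
static enum fast_clear_state
state_after_resolve(enum fast_clear_state state)
{
   (void) state;
   return FAST_CLEAR_STATE_RESOLVED;
}

/* Non-render-target uses need a resolve unless already resolved. */
static int
needs_resolve(enum fast_clear_state state)
{
   return state != FAST_CLEAR_STATE_RESOLVED;
}
```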

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 10:45:42 -07:00
Paul Berry
8f5147c199 intel: Conditionally compile mcs-related code for i965 only.
This patch ifdefs out intel_mipmap_tree::mcs_mt when building the i915
(pre-Gen4) driver (MCS buffers aren't supported until Gen7, so there
is no need for this field in the i915 driver).  This should make it a
bit easier to implement fast color clears without undue risk to i915.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 10:45:42 -07:00
Paul Berry
a5efdca7b7 intel: Keep region name in intel_miptree_create_for_dri2_buffer().
When processing a buffer received from the X server,
intel_process_dri2_buffer() examines intel_region::name to determine
whether it's received a brand new buffer, or the same buffer it
received from the X server the last time it made a request.

However, this didn't work properly, because in the call to
intel_miptree_create_for_dri2_buffer(), we create a fresh intel_region
object to represent the buffer, and this was causing us to forget the
buffer's previous name.

This patch fixes things by copying over the region name when creating
the fresh intel_region object.

At the moment, this is just a minor performance optimization.
However, when fast color clears are added, it will be necessary to
ensure that the fast color clear state for a buffer doesn't get
discarded the next time we receive that buffer from the X server.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 10:45:42 -07:00
Chia-I Wu
adf324ad28 winsys/intel: make struct intel_bo alias drm_intel_bo
There is really nothing in struct intel_bo, and having it alias drm_intel_bo
makes the winsys impose almost zero overhead.

We can eliminate the overhead completely by making the functions static
inline, if needed.
2013-06-12 17:46:52 +08:00
Chia-I Wu
e7a14eea16 winsys/intel: reorganize functions
Move functions around to match the order of the declarations in the header.
2013-06-12 17:46:52 +08:00
Chia-I Wu
39226705b7 ilo: update winsys interface
The motivation is to kill tiling and pitch in struct intel_bo.  That requires
us to make tiling and pitch not queryable, and be passed around as function
parameters.
2013-06-12 17:46:52 +08:00
Chia-I Wu
cdfb2163c4 ilo: get rid of function tables in winsys
We are moving toward making struct intel_bo alias drm_intel_bo.  As a first
step, we cannot have function tables.
2013-06-12 17:46:52 +08:00
Chia-I Wu
6fe0453c33 ilo: access bo size directly
buf->bo_size is readily available, no need to go via buf->bo->get_size().
2013-06-12 17:46:52 +08:00
Chia-I Wu
3f79188854 ilo: remove unnecessary tex_set_bo/buf_set_bo
Merge the bodies to tex_create_bo/buf_create_bo respectively.
2013-06-12 17:46:52 +08:00
Kenneth Graunke
b00d61151d i965: Emit the depth/stencil state pointer directly, not via atoms.
See two commits ago for the rationale.  This allows us to delete the
whole gen7_cc_state.c file.

This does move these commands before the depth stall flushes from
brw_emit_depthbuffer, which may be a problem.  The documentation for
3DSTATE_DEPTH_BUFFER mentions that depth stall flushes are required
before changing any depth/stencil buffer state, but explicitly lists
3DSTATE_DEPTH_BUFFER, 3DSTATE_HIER_DEPTH_BUFFER, 3DSTATE_STENCIL_BUFFER,
and 3DSTATE_CLEAR_PARAMS.  It does not mention this particular packet
(_3DSTATE_DEPTH_STENCIL_STATE_POINTERS).

No observed Piglit regressions on Sandybridge or Ivybridge.

Together with the last two commits, this makes a cairo-gl benchmark
faster by 0.324552% +/- 0.258355% on Ivybridge.  No statistically
significant change on Sandybridge.  (Thanks to Eric for the numbers.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-11 15:42:17 -07:00
Kenneth Graunke
8ab15bacf4 i965: Emit the CC state pointer directly rather than via atoms.
See the previous commit for the rationale.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-11 15:42:17 -07:00
Kenneth Graunke
da1a896b0f i965: Emit the BLEND_STATE pointer directly rather than via atoms.
Previously, we would:
1. Emit the new indirect state.
2. Flag CACHE_NEW_BLEND_STATE.
3. Rely on later state atoms to notice CACHE_NEW_BLEND_STATE and emit a
   pointer to the new indirect state.

This is rather cumbersome: it requires two state atoms instead of one,
and there's a strict ordering dependency in the list.  Plus, the code
gets spread across two functions (or even files in the case of Gen7+).

Gen7+ has a packet to update just the blend state pointer, so it makes a
lot of sense to simply emit that right away.  Gen6 has a combined packet
which updates blending, the color calculator, and depth/stencil state;
however, each can still be modified independently.

This drops the Gen6 micro-optimization where we tried to only emit one
packet that changed all three states.  State updates are pretty cheap.

CACHE_NEW_BLEND_STATE is no longer necessary, so drop it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-11 15:42:16 -07:00
Zack Rusin
babe35a067 draw: implement distance culling
Works similarly to clip distance. If the cull distance is negative
for all vertices against a specific plane then the primitive
is culled.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-10 22:04:28 -04:00
Zack Rusin
3d08eada34 gallium: add a cull distance semantic
cull distance is analogous to clip distance. If a register is
given this semantic, then the values in it are assumed to be a
float32 distance to a plane. Primitives will be completely
discarded if the plane distance for every vertex in
the primitive is < 0.
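
The rule above can be sketched for a single plane as follows.  This is
only an illustration, not the draw module's code; the function name and
the flat array of per-vertex distances are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: cull_dist[i] is the cull distance of vertex i
 * against one plane.  The primitive is discarded only when every
 * vertex has a negative distance.
 */
static bool
primitive_is_culled(const float *cull_dist, size_t num_verts)
{
   for (size_t i = 0; i < num_verts; i++) {
      /* Any vertex that is not strictly negative keeps the primitive. */
      if (!(cull_dist[i] < 0.0f))
         return false;
   }
   /* An empty primitive is never reported as culled. */
   return num_verts > 0;
}
```

Unlike clipping, nothing is cut here: the primitive is either kept
whole or dropped whole.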

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-10 22:04:28 -04:00
Zack Rusin
0a3779d955 draw: fix clipper invocation statistics
We need to figure out the number of invocations of the clipper
before the emit, because in the emit we are after clipping
where the number of primitives will be equal to number of clipper
invocations minus the clipped primitives. So our computations
were always off by the number of clipped primitives.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-10 22:04:28 -04:00
Zack Rusin
2b2e7bb133 draw: enable user plane clipping when clipdistance is used
Draw depended on clip_plane_enable being set in the rasterizer
to use clipdistance registers for clipping. That's really
unfriendly because it requires that rasterizer state to have
variants for every shader out there. Instead of depending on
the rasterizer, let's extract the info from the available state:
if a shader writes clipdistance then we need to use it and we
need to clip using a number of planes equal to the number
of written clipdistance components. This way clipdistances
just work.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-10 22:04:27 -04:00
Zack Rusin
c1a50f5ed7 draw: make sure clipdistances work with geometry shaders
We were always fetching the info from the vertex shader, but if
geometry shader is present it should be used as the source of
that info.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-10 22:04:27 -04:00
Kenneth Graunke
3dacb7d40b Revert "i965: Disable unused pipeline stages once at startup on Gen7+."
This reverts commit 6c966ccf07.

Apparently causes GPU hangs.

Conflicts:
	src/mesa/drivers/dri/i965/brw_state.h
	src/mesa/drivers/dri/i965/brw_state_upload.c
2013-06-11 10:53:44 -07:00
Brian Paul
42adf5f0dd swrast: add texfetch code for some XBGR formats
Fixes piglit texture-packed-formats regression.  We need to implement
more XBGR formats here eventually, but many are UINT/SINT formats
which swrast doesn't handle yet anyway (integer textures).

Bugzilla https://bugs.freedesktop.org/show_bug.cgi?id=64935

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-11 08:26:38 -06:00
Brian Paul
91405e3502 mesa: add missing texture strings in tex_target_name()
And add a static assert for the future.
2013-06-10 16:35:35 -06:00
Alex Deucher
761320b197 winsys/radeon: add env var to disable VM on Cayman/Trinity
Set env var RADEON_VA=0 to disable VM on Cayman/Trinity.
Useful for debugging.

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-06-10 18:02:57 -04:00
Eric Anholt
fceff14450 mesa: Add a _mesa_problem to document a piglit failure on i965.
Having figured out what was going on with piglit fbo-depth copypixels
GL_DEPTH_COMPONENT32F (falling all the way back to swrast on CopyPixels to
a float depth buffer), I'm not inclined to fix the problem currently but
it seems worth saving someone else the debug time.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-06-10 14:04:25 -07:00
Eric Anholt
9a0bd682f9 i965/vs: Avoid the MUL/MACH/MOV sequence for small integer multiplies.
We do a lot of multiplies by 3 or 4 for skinning shaders, and we can avoid
the sequence if we just move them into the right argument of the MUL.

On pre-IVB, this means reliably putting a constant in a position where it
can't be constant folded, but that's still better than MUL/MACH/MOV.

Improves GLB 2.7 trex performance by 0.788648% +/- 0.23865% (n=29/30)

v2: Fix test for pre-sandybridge.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
2013-06-10 14:04:24 -07:00
Eric Anholt
d28e285d41 i965/vs: Allow copy propagation into MUL/MACH.
This is a trivial port of 1d6ead3804 from
the FS.

No significant performance difference on trex (misplaced the data, but it
was about n=20).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-06-10 14:04:24 -07:00
Eric Anholt
263a7e4cd9 i965/vs: Use the MAD instruction when possible.
This is different from how we do it in the FS - we are using MAD even when
some of the args are constants, because with the relatively unrestrained
ability to schedule a MOV to prepare a temporary with that data, we can
get lower latency for the sequence of instructions.

No significant performance difference on GLB2.7 trex (n=33/34), though it
doesn't have that many MADs.  I noticed MAD opportunities while reading
the code for the DOTA2 bug.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-06-10 14:04:24 -07:00
Richard Sandiford
1ff10f92e7 draw: Add A8R8G8B8 to draw_print_arrays
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>
2013-06-10 16:28:31 -04:00
Richard Sandiford
5876a4c71d draw: Fix type mismatch between draw_private.h and LLVM
draw_vertex_buffer declared the size field to be a size_t, but the LLVM
code used an int32 instead.  This caused problems on big-endian 64-bit
targets, because the first 32-bit chunk of the 64-bit size_t was always 0.

In one sense size_t seems like a good choice for a size, so one fix
would have been to try to get the LLVM code to use the equivalent of
size_t too.  However, in practice, the size is taken from things like ~0
or width0, both of which are int-sized, so it seemed simpler to make the
size field int-sized as well.
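
To illustrate the mismatch, here is a minimal sketch (not the Mesa or
LLVM code).  It lays out a 64-bit size in big-endian byte order by hand,
so it behaves the same on any host, and then reads only the first four
bytes the way a 32-bit load from the start of the field would.  All
helper names are hypothetical.

```c
#include <stdint.h>

/* Store a 64-bit value most-significant byte first (big-endian). */
static void
store_be64(uint8_t out[8], uint64_t value)
{
   for (int i = 0; i < 8; i++)
      out[i] = (uint8_t) (value >> (56 - 8 * i));
}

/* Read the first four bytes as a big-endian 32-bit integer. */
static uint32_t
load_first_be32(const uint8_t in[8])
{
   return ((uint32_t) in[0] << 24) | ((uint32_t) in[1] << 16) |
          ((uint32_t) in[2] << 8) | (uint32_t) in[3];
}

/* What a 32-bit reader sees when the field was written as 64 bits. */
static uint32_t
size_seen_by_32bit_reader(uint64_t size)
{
   uint8_t bytes[8];

   store_be64(bytes, size);
   return load_first_be32(bytes);
}
```

Any size below 2^32 has an all-zero upper half, which is exactly the
half that the 32-bit read picks up in this layout.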

Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-10 16:26:14 -04:00
Richard Sandiford
337f21bc35 util: Use sizeof(void *) rather than 0 as the fallback cache line size
Without this, llvmpipe ends up giving a zero size to all uncompressed textures
on non-x86 systems, since align() cannot handle a 0 alignment.
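
A minimal sketch of why a zero alignment yields a zero size, assuming
align() uses the common power-of-two rounding formula.  align_pot() is a
hypothetical stand-in, not a copy of the Mesa helper.

```c
/*
 * Round value up to a multiple of alignment.  Only meaningful when
 * alignment is a nonzero power of two.
 */
static unsigned
align_pot(unsigned value, unsigned alignment)
{
   /* With alignment == 0 the mask ~(0 - 1) is 0, so the result is 0. */
   return (value + alignment - 1) & ~(alignment - 1);
}
```

With a fallback of sizeof(void *) the mask is never zero, so sizes
survive the rounding.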

Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-10 16:26:09 -04:00
Richard Sandiford
ba6cd796dd llvmpipe: Use saturating add/sub for UNORM formats
lp_build_add and lp_build_sub have fallback code for cases
that cannot be handled by known intrinsics.  For UNORM formats,
this code was using modulo rather than saturating arithmetic.
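
The difference between the two kinds of arithmetic, sketched in scalar C
for an 8-bit UNORM channel.  This is only an illustration; the real code
builds LLVM IR for vectors, and the function names are hypothetical.

```c
#include <stdint.h>

/* Modulo arithmetic: the sum wraps around past 255. */
static uint8_t
add_modulo_u8(uint8_t a, uint8_t b)
{
   return (uint8_t) (a + b);
}

/* Saturating add: the sum is clamped to the largest value, 255. */
static uint8_t
add_saturate_u8(uint8_t a, uint8_t b)
{
   unsigned sum = (unsigned) a + b;

   return sum > 255 ? 255 : (uint8_t) sum;
}

/* Saturating subtract: the difference is clamped to 0. */
static uint8_t
sub_saturate_u8(uint8_t a, uint8_t b)
{
   return a > b ? (uint8_t) (a - b) : 0;
}
```

For a UNORM channel, 200 + 100 should stay at full intensity (255)
rather than wrap to a dark 44.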

This fixes some rendering issues for a gnome session on System z.
It also fixes various piglit tests on z, such as
spec/ARB_color_buffer_float/GL_RGBA8-render.

The patch deliberately doesn't tackle the more complicated
SNORM case.

Tested against piglit on x86_64 and System z with no regressions.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-10 16:20:45 -04:00
Kenneth Graunke
a0037cecd1 intel: Reserve less batchbuffer space.
Now that Gen6+ relies on hardware contexts, we don't need to record an
occlusion query value at the end of each batch.  That means we no longer
need to reserve space for the absurd number of PIPE_CONTROLs required to
do that on Sandybridge.

See commit 4e087de51a, which bumped this
up to 60 bytes.  This is not quite a revert, as it uses 24 bytes instead
of 16, and saves the comments.  As far as I can tell, the old value of
16 bytes was just wrong, so we shouldn't go back to that.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:58:51 -07:00
Kenneth Graunke
fc800f0c60 i965: Allocate push constant L3 space once at startup on Gen7+.
We always allocate the maximum amount of space and never change it, so
it makes sense to do it once.  Programming it on startup also lets us
skip re-programming it from BLORP.

This removes a tiny amount of overhead from our drawing loop.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:58:47 -07:00
Kenneth Graunke
6c966ccf07 i965: Disable unused pipeline stages once at startup on Gen7+.
This removes a tiny bit of code from our drawing loop.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:58:46 -07:00
Kenneth Graunke
b607d57630 i965: Don't emit PIPELINE_SELECT from BLORP.
Now that we emit invariant state at startup (and never select the media
pipeline), the 3D pipeline will always already be selected, even if BLORP
is the first operation.  So this is unnecessary.

v2: Fix unused variable warning (intel_context is no longer used).

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:58:44 -07:00
Kenneth Graunke
d671eb140f i965: Emit invariant state once at startup on Gen6+.
Now that we have hardware contexts, we can safely initialize our GPU
state once at startup, rather than needing a state atom with the
BRW_NEW_CONTEXT flag set.

This removes a tiny bit of code from our drawing loop.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:58:42 -07:00
Kenneth Graunke
33b90804ee i965: Delete some dead state atom prototypes.
These atoms don't actually exist.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:58:40 -07:00
Kenneth Graunke
233de8e8d3 i965: Change return type of check_state() to bool.
The existing code already returned a boolean; this just clarifies that.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:58:38 -07:00
Kenneth Graunke
650d5de6ea i965: Remove unused second parameter of brw_print_dirty_count().
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:58:29 -07:00
Kenneth Graunke
ca6b520f3a glsl: Allow the use of determinant() in GLSL 1.50.
We already implemented this for ES3, so we just need to turn it on.

Fixes 6 Piglit tests:
spec/glsl-1.50/compiler/built-in-functions/determinant-mat[234].{vert,frag}

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:54:57 -07:00
Kenneth Graunke
603940d5bb glcpp: Automatically #define GL_core_profile 1 on GLSL 1.50+.
Page 17 of the GLSL 1.50.11 specification states:
"There is a built-in macro definition for each profile the
 implementation supports.  All implementations provide the following
 macro:

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:54:56 -07:00
Kenneth Graunke
e203919a4e glsl: Parse "#version 150 core" directives.
Previously we only supported "#version 150".  This patch recognizes
"compatibility" to give the user a more descriptive error message.

Fixes Piglit's version-150-core-profile test.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:54:42 -07:00
Kenneth Graunke
f730b1f72a glsl: Bail on parsing if the #version directive is bogus.
If we didn't successfully parse the #version line, there's no point in
continuing with parsing and compiling: it's already failed.

Furthermore, it can actually be harmful: right after handling #version,
we call _mesa_glsl_initialize_types(), which checks state->es_shader and
language_version.  If it isn't valid, it hits an assertion failure.

Fixes Piglit's "invalid-version-es."  When processing "#version 110 es",
our code set state->es_shader and state->language_version = 110.  It
then properly determined that this was invalid and flagged an error.
Since we continued anyway, we hit the assertion mentioned above.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:50:12 -07:00
Chris Forbes
a2e3b1c4e2 dlist: fix save_SamplerParameteri
This was building the temporary array to pass to
save_SamplerParameteriv, and then not passing it.
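
The intended pattern, as a minimal sketch with hypothetical names: the
scalar entry point pads its argument into a temporary array and hands
that array, not the address of the scalar, to the vector entry point.
The four-element size is an assumption made for the example.

```c
#define NUM_PARAMS 4

/* Stand-in for the saved display-list node. */
static int recorded[NUM_PARAMS];

/* Vector entry point: always reads NUM_PARAMS values. */
static void
record_params_iv(const int *params)
{
   for (int i = 0; i < NUM_PARAMS; i++)
      recorded[i] = params[i];
}

/* Scalar entry point: pad to a full array, then pass the array. */
static void
record_param_i(int param)
{
   int parray[NUM_PARAMS] = { param, 0, 0, 0 };

   /* Passing &param here instead would read past the single int. */
   record_params_iv(parray);
}
```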

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-06-09 14:00:40 -07:00
Vinson Lee
ce1f85133d mesa: Prevent possible out-of-bounds read by save_SamplerParameteriv.
Fixes "Out-of-bounds access" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-08 13:32:53 -07:00
Maarten Lankhorst
26e047dec8 nvc0: fix up video buffer alignment requirements
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-06-08 20:11:33 +02:00
Rob Clark
e9edbf0a68 freedreno: better scissor fix
Actually respect rasterizer state.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-08 13:15:51 -04:00
Rob Clark
4af1dcbb7d freedreno: gmem bypass
The GPU (at least a3xx, but I think also a2xx) can render directly to
memory, bypassing tiling.  Although it can't do this if blend, depth,
and a few other features of the pipeline are enabled.  This direct
memory mode can be faster for some sorts of operations, such as simple
blits.  In particular, this significantly speeds up XA by not having to
pull the entire dest pixmap into GMEM, render tiles, and write it all
back out again.  This should also speed up resource copy-region and
blit.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-08 13:15:51 -04:00
Rob Clark
2855f3f7bc freedreno: add a3xx support
The adreno a3xx GPU is found in newer snapdragon devices, such as the
nexus4.  The a3xx is GLESv3 and OpenCL capable, although that is not
enabled yet in gallium.

Compared to a2xx, it introduces an entirely new unified shader ISA, and
re-shuffles all or nearly all of the registers.  The good news is that
(for the most part) the registers are more orthogonal, not combining
unrelated state in a single register.  And that there is a lot more
flexibility, so we don't need to patch and re-emit the shader like we
did on a2xx.

The shader compiler is currently quite dumb, there would be a lot of
room for improvement with an optimizing pass.  Despite that, with the
a320 in my nexus4 it seems to be ~2-3x faster compared to the a220 in my
HP touchpad.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-08 13:15:51 -04:00
Rob Clark
18c317b21d freedreno: prepare for a3xx
Split the parts that are specific to adreno a2xx series GPUs from the
parts that will be in common with a3xx, so that a3xx support can be
added more cleanly.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-08 13:15:51 -04:00
Roland Scheidegger
213c207b3a gallivm: work around slow code generated for interleaving 128bit vectors
We use 128bit vector interleave for untwiddling in the blend code (with
256bit vectors). llvm generates terrible code for this for some reason,
so instead of generating a shuffle for 2 128bit vectors use an
extract/insert shuffle instead (it only seems to matter we're not using
128bit wide vectors for the shuffle). This decreases instruction count of
the blend code generated for a rgba8 render target without blending from
169 to 113 with llvm 3.1 and from 136 to 114 in llvm 3.2/3.3, and I got
a ~8% (llvm 3.1) and ~5% (3.2/3.3) performance improvement in gears.
(The generated code is still not terribly good as we could actually avoid
the interleaving completely but llvm can't know this.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-08 17:33:51 +02:00
José Fonseca
0aca2c6b60 scons: Fix implicit python dependency discovery on Windows.
Probably due to CRLF endings, the discovery of python import statements
was not working on Windows builds, causing incremental builds to often
fail unless one wiped out the build directory.

NOTE: This is a candidate for stable branches.
2013-06-08 08:55:06 +01:00
Stéphane Marchesin
4f905d4900 st/xlib: Flush the front buffer before doing CopySubBuffer
We flush pending rendering before running CopySubBuffer, which
ensures that the right bits get to the screen.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-07 18:53:54 -07:00
Stéphane Marchesin
4e5416b0e2 st/xlib: Fix upside down coordinates for CopySubBuffer
The coordinates need to be inverted between glX and gallium.
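
A minimal sketch of such an inversion, assuming one side measures y from
the bottom edge and the other from the top edge.  The function and
parameter names are hypothetical.

```c
/*
 * Convert the y origin of a sub-rectangle between a bottom-left and a
 * top-left coordinate system.  surface_height is the full height of
 * the drawable; y and height describe the rectangle being copied.
 */
static int
flip_y(int surface_height, int y, int height)
{
   return surface_height - y - height;
}
```

The conversion is its own inverse, so the same helper works in both
directions.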

NOTE: This is a candidate for stable release branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-07 18:53:54 -07:00
Eric Anholt
3c21a7d3c9 mesa: Report core FBO incompleteness cases through GL_ARB_debug_output.
Just like we produce from inside the Intel driver, this can help provide
information quickly about FBO incompatibility problems (particularly when
using apitrace replay).

Currently, in driver-marked incompleteness cases, you'll get both the
driver message and the core message on Intel.  Until the other drivers are
fixed to produce output, I think this is better than not putting in a
message for driver-marked incomplete.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-07 16:05:42 -07:00
Paul Berry
9e3475b39a intel: flush fake front buffer if server is about to destroy it.
Fixes piglit test "spec/!OpenGL 1.0/gl-1.0-front-invalidate-back"

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-06-07 13:42:34 -07:00
Paul Berry
447df5eaba intel: flush fake front buffer more robustly.
When a fake front buffer is in use, if we request the front buffer
(using screen->dri2.loader->getBuffersWithFormat()), the X server
copies the real front buffer to the fake front buffer and returns the
fake front buffer.  We sometimes make redundant requests for the front
buffer (due to using a single counter to track invalidates for both
the front and back buffers), so there's a danger of pending front
buffer rendering getting overwritten when the redundant front buffer
request occurs.

Previous to this patch, intel_update_renderbuffers() worked around
that problem by sometimes doing intel_flush() and intel_flush_front()
before calling intel_query_dri2_buffers().  But it only did the
workaround when the front buffer was bound for drawing; it didn't do
it when the front buffer was bound for reading.

This patch moves the workaround code to intel_query_dri2_buffers(), so
that it happens in exactly the circumstances where it is needed.

This should fix some of the sporadic failures in Piglit tests
fbo-sys-blit and fbo-sys-sub-blit.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-06-07 13:26:43 -07:00
Paul Berry
03cc310313 intel: make intel_flush_front safe to call during initial MakeCurrent
The patch that follows will fix a bug that prevents
intel_flush_front() from being called often enough.  In doing so, it
will create a situation where intel_flush_front() is called during the
initial call to glXMakeCurrent().  In this circumstance,
ctx->DrawBuffer hasn't been initialized yet and is NULL.  Fortunately,
intel->front_buffer_dirty is false, so intel_flush_front() doesn't
actually need to do anything.  To avoid a segfault, swap the order of
terms in intel_flush_front()'s if statement.
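
Why the order of the terms matters, as a minimal sketch.  The struct and
function names are hypothetical and the condition is simplified; it is
not the driver's actual test.

```c
#include <stdbool.h>
#include <stddef.h>

struct fake_framebuffer {
   bool is_window_system;
};

struct fake_context {
   struct fake_framebuffer *draw_buffer;   /* NULL during early setup */
   bool front_buffer_dirty;
};

static bool
needs_front_flush(const struct fake_context *ctx)
{
   /*
    * && evaluates left to right and stops at the first false term, so
    * draw_buffer is never dereferenced while the dirty flag is false.
    */
   return ctx->front_buffer_dirty && ctx->draw_buffer->is_window_system;
}
```

With the terms in the opposite order, the NULL draw_buffer would be
dereferenced before the dirty flag was ever consulted.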

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-06-07 13:26:36 -07:00
Eric Anholt
bc8bfdc42c mesa: Expose MAX_FRAGMENT_INPUT_COMPONENTS on ES3 and desktop 3.2.
piglit OpenGL ES 3.0/minmax now passes.  This was also one of the subcase
failures in OpenGL 3.2/minmax (and still is, because our value is too low
for 3.2, but at least we report what it is).

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-07 12:55:07 -07:00
Eric Anholt
7500ad23eb mesa: Expose texture array getters on GLES3.
Part of fixing piglit OpenGL ES 3.0/minmax.

v2: s/_gles3/_es3/ in extra name, for consistency (review by Matt).

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-06-07 12:55:06 -07:00
Eric Anholt
fd27e82ded mesa: Fix the return value of TEXTURE_BINDING_2D_ARRAY.
Noticed by inspection when reviewing the next commit.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-07 12:55:06 -07:00
Eric Anholt
11ace8a827 mesa: Expose texel offset limits in GLES3.
Part of fixing piglit OpenGL ES 3.0/minmax.

v2: s/_gles3/_es3/ in extra name, for consistency (review by Matt).

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-06-07 12:55:06 -07:00
Roland Scheidegger
fa8cefa892 util: add comment about bogus transfer flags
2013-06-07 21:15:01 +02:00
Roland Scheidegger
b47d13f425 util: fix util_clear_render_target and util_clear_depth_stencil layer handling
These functions must clear all bound layers, not just the first.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-07 21:15:01 +02:00
Roland Scheidegger
201d7a352b llvmpipe: move create_surface/destroy_surface functions to lp_surface.c
Believe it or not but these two are actually the first two functions which
really belong in this file nowadays.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-07 21:15:01 +02:00
Roland Scheidegger
d8146f240e llvmpipe: add support for layered rendering
Mostly just make sure the layer parameter gets passed through to the right
places (and get clamped, can do this at setup time), fix up clears to
clear all layers and disable opaque optimization. Luckily don't need to
touch the jitted code.
(Clears invoked via pipe's clear_render_target method will not work however
since the pipe_util_clear function used for it doesn't handle clearing
multiple layers yet.)

v2: per Brian's suggestion, prettify var initialization and add some comments,
add assertion for impossible layer specification for surface.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-07 21:15:01 +02:00
Roland Scheidegger
0f4c08aea2 gallium/docs: fix up transfer description for 1d arrays, add cube map arrays
Transfers always use z/depth for layers no matter if it's a 1d or 2d array
texture, we don't follow OpenGL's craziness there. Luckily this appears to
only be a doc bug, everyone doing the right thing already.
While here also document z/depth parameter for cube map arrays.

v2: fix typo spotted by Eric Anholt

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-07 21:15:01 +02:00
Chia-I Wu
7916d5ed88 ilo: fix textureSize() for single-layered array textures
We returned 0 instead of 1 for the number of layers when the array texture is
single-layered.  This fixed it on GEN7+.
2013-06-08 01:39:47 +08:00
Chia-I Wu
d6c2708e1e util: add util_resource_is_array_texture()
Checking if array_size is greater than 1 is not enough for single-layered
array textures.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-08 01:37:40 +08:00
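A minimal sketch of why a size check alone falls short, assuming a simplified resource description. The enum, struct, and function names below are hypothetical stand-ins, not the gallium definitions; the point is only that an array texture with a single layer still has to be classified by its target.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the texture targets and the resource template. */
enum tex_target {
   TEX_1D, TEX_2D, TEX_3D, TEX_CUBE,
   TEX_1D_ARRAY, TEX_2D_ARRAY, TEX_CUBE_ARRAY
};

struct resource {
   enum tex_target target;
   unsigned array_size;   /* number of layers; 1 for a single-layered array */
};

/* Size-based test: misclassifies an array texture that has only one layer. */
bool is_array_by_size(const struct resource *res)
{
   return res->array_size > 1;
}

/* Target-based test: the layer count does not enter into the decision. */
bool is_array_by_target(const struct resource *res)
{
   switch (res->target) {
   case TEX_1D_ARRAY:
   case TEX_2D_ARRAY:
   case TEX_CUBE_ARRAY:
      return true;
   default:
      return false;
   }
}
```

For a resource with target TEX_2D_ARRAY and array_size 1, the size-based test returns false while the target-based test returns true.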
Brian Paul
90fa71b277 docs: update some environment variable info
Drop the GALLIUM_NOSSE/PPC env vars, add ST_DEBUG and some of the
VMware SVGA driver env vars.
2013-06-07 10:12:32 -06:00
Arnas Milasevicius
3069357ef0 gallium: Remove draw_arrays() and draw_arrays_instanced() functions
Moved draw_arrays() to st_draw_feedback.c and removed draw_arrays_instanced().
draw_arrays() was used by nobody else.  Now there's just one "draw" entrypoint
into the draw module.

Signed-off-by: Brian Paul <brianp@vmware.com>
2013-06-07 09:29:29 -06:00
Brian Paul
14541dacab tgsi: replace tgsi_file_names[] with tgsi_file_name() function
This change came from the discovery that the STATIC_ASSERT meant to check
the number of register file strings didn't actually work.

Similar changes could be made for the other string arrays in tgsi_string.c

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-07 09:23:24 -06:00
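The commit does not say why the original check failed. One common cause, assumed here, is an explicitly sized array: its length stays the same even when an initializer is missing, so a size comparison always passes. The sketch below uses an unsized array behind an accessor function instead. The enum values, strings, and names are hypothetical, not the TGSI ones.

```c
/* Hypothetical register files; not the actual TGSI list. */
enum reg_file {
   REG_FILE_NULL,
   REG_FILE_CONSTANT,
   REG_FILE_INPUT,
   REG_FILE_OUTPUT,
   REG_FILE_TEMPORARY,
   REG_FILE_COUNT
};

/* No explicit size: the length is exactly the number of initializers. */
static const char *const reg_file_names[] = {
   "NULL", "CONST", "IN", "OUT", "TEMP"
};

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Compilation fails if a register file is added without a matching string. */
_Static_assert(ARRAY_SIZE(reg_file_names) == REG_FILE_COUNT,
               "reg_file_names[] is out of sync with enum reg_file");

/* Accessor: callers never index the table directly, so bounds are checked. */
const char *reg_file_name(unsigned file)
{
   return file < REG_FILE_COUNT ? reg_file_names[file] : "INVALID";
}
```

Keeping the table private to one file and exposing only the function also means the table's declaration cannot drift from its definition.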
Chia-I Wu
97d641eb22 u_vbuf: fix index buffer leak
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-06-07 19:33:30 +08:00
Chris Forbes
06a503ca71 i965/vs: add support for emitting gl_ClipVertex
Removes the special-case suppression of gl_ClipVertex in the VUE map.

Also calculate vertex outcodes for user clip planes based on
gl_ClipVertex if written; otherwise gl_Position.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-07 20:50:33 +12:00
Chris Forbes
3615949990 i965/clip: Add support for gl_ClipVertex
When clipping triangles against a user clip plane, and gl_ClipVertex
is provided in the vertex, use it instead of hpos.

TODO: A similar change should be made at some point for line clipping.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-07 20:50:33 +12:00
Chia-I Wu
9b34a7f29a ilo: advertise PIPE_CAP_CUBE_MAP_ARRAY
It was supported but not advertised.  Also remove TODO tag for
PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT, as it is not a TODO.
2013-06-07 15:37:40 +08:00
Chia-I Wu
cde49c71a3 ilo: add support for TEX2/TXB2/TXL2 in fs
They were already supported, just being rejected in the TGSI translator.
2013-06-07 15:37:35 +08:00
Vinson Lee
f8df73f41c glsl linker: Initialize member variable interface_namespace.
Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-06 22:55:24 -07:00
Chia-I Wu
7142da6dd1 ilo: use slab allocator for transfers
Slab allocator is perfect for transfer.  Improved OpenArena performance by 1%
with several casual runs.
2013-06-07 13:23:43 +08:00
Chia-I Wu
09f62a13fc ilo: clean up states upon context destroy
We need to unreference resources that we referenced.
2013-06-07 11:28:21 +08:00
Chia-I Wu
7cbf0a410e ilo: unmap cp bo before destroying it
The BOs are mapped for their entire lifetimes on the chipsets we support, so
do not forget to unmap them.
2013-06-07 11:28:20 +08:00
Chia-I Wu
27804b2fc7 ilo: enable bo reuse
This magical line of code must have got lost at some point in the history...
2013-06-07 11:28:20 +08:00
Chia-I Wu
20d23b2275 ilo: construct 3DSTATE_SF in create_rasterizer_state()
Add ilo_rasterizer_sf and initialize it in create_rasterizer_state().
2013-06-07 11:13:16 +08:00
Chia-I Wu
3c2fea206f ilo: construct 3DSTATE_CLIP in create_rasterizer_state()
Add ilo_rasterizer_clip and initialize it in create_rasterizer_state().
2013-06-07 11:13:16 +08:00
Chia-I Wu
4006f4ce26 ilo: use emit_SURFACE_STATE() for render targets
Introduce ilo_surface_cso and initialize it in create_surface().  With the
change, we can emit SURFACE_STATE directly from the CSO and remove
emit_surf_SURFACE_STATE().  We do not deal with depth/stencil surfaces yet.
2013-06-07 11:13:16 +08:00
Chia-I Wu
5354dc7428 ilo: use emit_SURFACE_STATE() for constant buffers
Introduce ilo_cbuf_cso and initialize it in set_constant_buffer().  As
ilo_view_surface is embedded in ilo_cbuf_cso, switch to emit_SURFACE_STATE()
for constant buffers and remove emit_cbuf_SURFACE_STATE().
2013-06-07 11:13:16 +08:00
Chia-I Wu
2d82885d3c ilo: add emit_SURFACE_STATE() for sampler views
Introduce ilo_view_cso and initialize it in create_sampler_view().  Add
emit_SURFACE_STATE() to GPE, which can emit SURFACE_STATE from
ilo_view_surface.
2013-06-07 11:13:16 +08:00
Chia-I Wu
39e947569e ilo: add ilo_view_surface for SURFACE_STATE
Define struct ilo_view_surface for SURFACE_STATE construction and emission.
2013-06-07 11:13:15 +08:00
Courtney Goeltzenleuchter
c6983ea035 ilo: convert generic depth-stencil-alpha pipe state to ilo pipe state
Moving the work to create time reduces the work at emit time.
Saves time overall as create work is only done once.
Fix compiler warning in gen7_pipeline_sol.

[olv: remember pipe_alpha_state instead of pipe_depth_stencil_alpha_state in
      ilo_dsa_state]
2013-06-07 11:13:15 +08:00
Chia-I Wu
70e78211d6 ilo: introduce vertex element CSO
Introduce ilo_ve_cso and initialize it in create_vertex_elements_state().
This commit goes a step further by setting up mappings from HW VB to PIPE VB,
which we failed to do previously.  That allows us to support instanced
rendering.
2013-06-07 11:13:15 +08:00
Chia-I Wu
d4fa98db0c ilo: simplify emit_3DSTATE_DEPTH_BUFFER()
Remove hiz and dsa from the parameters.  We would know whether HiZ buffer
exists from ilo_texture once it is supported.  DSA state should not affect
3DSTATE_DEPTH_BUFFER.
2013-06-07 11:13:15 +08:00
Chia-I Wu
eea1be2072 ilo: introduce blend CSO
Introduce ilo_blend_cso and initialize it in create_blend_state().  This saves
us from having to construct hardware blend states in draw_vbo().
2013-06-07 11:13:15 +08:00
Chia-I Wu
b3c9e2161f ilo: introduce sampler CSO
Introduce ilo_sampler_cso and initialize it in create_sampler_state().  This
saves us from having to perform CPU-intensive calculations to construct
hardware sampler states in draw_vbo().
2013-06-07 11:13:15 +08:00
Chia-I Wu
99725d2f8a ilo: construct SCISSOR_RECT in set_scissor_states()
This allows us to memcpy() the state in draw_vbo().  Add ilo_init_states() and
ilo_cleanup_states() that are called when contexts are created and destroyed
respectively, and properly set the initial scissor state in ilo_init_states().
2013-06-07 11:13:15 +08:00
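The idea of building hardware state once and copying it per draw can be sketched as follows. The struct, the function names, and the two-word layout (Y in the high 16 bits, X in the low 16 bits, inclusive maximum) are assumptions made for illustration; they are not taken from the ilo source.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical CSO holding the pre-packed hardware words. */
struct scissor_cso {
   uint32_t payload[2];
};

/* Done once, when the state is set. Assumes a non-empty rectangle with an
 * exclusive maximum on input and an inclusive maximum in the packed words. */
void scissor_cso_init(struct scissor_cso *cso,
                      unsigned minx, unsigned miny,
                      unsigned maxx, unsigned maxy)
{
   cso->payload[0] = ((uint32_t)(miny & 0xffff) << 16) | (minx & 0xffff);
   cso->payload[1] = ((uint32_t)((maxy - 1) & 0xffff) << 16) |
                     ((maxx - 1) & 0xffff);
}

/* Done per draw: no arithmetic, just a copy into the command stream.
 * Returns the position after the copied words. */
uint32_t *scissor_cso_emit(const struct scissor_cso *cso, uint32_t *cs)
{
   memcpy(cs, cso->payload, sizeof(cso->payload));
   return cs + 2;
}
```

The trade-off is a little extra memory per state object in exchange for keeping the draw path free of packing work.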
Chia-I Wu
e51806ee7a ilo: introduce viewport CSO
Introduce ilo_viewport_cso and initialize it in set_viewport_states().  This
saves us from having to perform CPU-intensive calculations to construct
hardware viewport states in draw_vbo().
2013-06-07 11:13:15 +08:00
Chia-I Wu
4228cf3746 ilo: switch to ilo states for shaders and resources
Define and use

 struct ilo_sampler_state;
 struct ilo_view_state;
 struct ilo_cbuf_state;
 struct ilo_resource_state;
 struct ilo_global_binding;

in ilo_context.
2013-06-07 11:13:15 +08:00
Chia-I Wu
94212915ee ilo: switch to ilo states for CC stage
Define and use

 struct ilo_dsa_state;
 struct ilo_blend_state;
 struct ilo_fb_state;

in ilo_context.
2013-06-07 11:13:15 +08:00
Chia-I Wu
29b938d9f4 ilo: switch to ilo states for WM stage
Define and use

 struct ilo_rasterizer_state;

in ilo_context.
2013-06-07 11:13:15 +08:00
Chia-I Wu
130364ad1d ilo: switch to ilo states for CLIP and SF stages
Define and use

 struct ilo_viewport_state;
 struct ilo_scissor_state;

in ilo_context.
2013-06-07 11:13:14 +08:00
Chia-I Wu
3bc8289f49 ilo: switch to ilo states for SOL stage
Define and use

 struct ilo_so_state;

in ilo_context.
2013-06-07 11:13:14 +08:00
Chia-I Wu
6b14b392d0 ilo: switch to ilo states for VF stage
Define and use

 struct ilo_vb_state;
 struct ilo_ve_state;
 struct ilo_ib_state;

in ilo_context.
2013-06-07 11:13:14 +08:00
Chia-I Wu
f0af292239 ilo: move hardware limits to ilo_gpe.h 2013-06-07 11:13:14 +08:00
Roland Scheidegger
644b8346fd draw: trivial fix comment typo 2013-06-06 23:51:39 +02:00
Roland Scheidegger
769449b3e8 gallium/tgsi: add missing string for layer semantic
Also report if a shader writes the layer semantic

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-06 23:51:38 +02:00
Roland Scheidegger
d0518c4c69 llvmpipe: bump 3d and cube map limits to 2048 and 8192 respectively
These should just work, required by d3d10. Too large resources will
get thrown out separately anyway.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-06 23:51:38 +02:00
Eric Anholt
38e77e545d glsl: Fix uniform buffer object counting.
We were counting uniforms located in UBOs against the default uniform
block limit, while not doing any counting against the specific combined
limit.

Note that I couldn't quite find justification for the way I did this, but
I think it's the only sensible thing: The spec talks about components, so
each "float" in a std140 block would count as 1 component and a "vec4"
would count as 4, though they occupy the same amount of space.  Since GPU
limits on uniform buffer loads are surely going to be about the size of
the blocks, I just counted them that way.

Fixes link failures in piglit
arb_uniform_buffer_object/maxuniformblocksize when ported to geometry
shaders on Paul's GS branch, since in that case the max block size is
bigger than the default uniform block component limit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-06-06 14:37:41 -07:00
Eric Anholt
93c8692ce9 glsl: Make a local variable to avoid restating this array lookup.
v2: Convert another instance of the array lookup. (caught by Tapani)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-06-06 14:37:40 -07:00
Kenneth Graunke
757ad82867 intel: Use the CHIPSET macro in the PCI ID tables for the device name.
Putting the human readable device names directly in the PCI ID list
consolidates things in one place.  It also makes it easy to customize
the name on a per-PCI ID basis without a huge code explosion.

Based on a patch by Kristian Høgsberg.

v2: Fix 830M/845G names and #undef CHIPSET (caught by Emil Velikov).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-06 14:28:35 -07:00
Kenneth Graunke
ea92b700df intel: Remove 'misc' parameter from CHIPSET macro in PCI ID tables.
This has never actually been used for anything.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-06 14:28:35 -07:00
Andreas Boll
8bc788ea9e build: Use PACKAGE_VERSION from autoconf
Both variables had the same value.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-06 19:07:22 +02:00
Andreas Boll
c0f7ccc136 build: Unify PACKAGE_VERSION on autotools, scons and Android
This patch unifies mesa's PACKAGE_VERSION on autotools, scons and
Android build systems.

Current behaviour is:
 - Autotools uses 9.2.0 as PACKAGE_VERSION
 - Scons and Android use 9.2-devel as PACKAGE_VERSION

With this patch all three build systems use 9.2.0-devel as
PACKAGE_VERSION.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-06 19:07:14 +02:00
Jonathan Gray
5bd808a2c7 radeon/winsys: correct RADEON_GEM_WAIT_IDLE use
RADEON_GEM_WAIT_IDLE is declared DRM_IOW but mesa
uses it with drmCommandWriteRead instead of drmCommandWrite
which leads to the ioctl being unmatched and returning an
error on at least OpenBSD.

Problem originally noticed in libdrm by Mark Kettenis.
Dave Airlie pointed out that mesa has the same issue.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
2013-06-06 11:01:18 +02:00
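The mismatch comes from the transfer direction being encoded in the ioctl request number itself. A small sketch using the standard _IOW and _IOWR macros from <sys/ioctl.h>; the argument struct, base character, and command number are hypothetical and chosen only for illustration.

```c
#include <stdint.h>
#include <sys/ioctl.h>

/* Hypothetical argument struct and command number, for illustration only. */
struct wait_idle_args {
   uint32_t handle;
   uint32_t pad;
};

#define EXAMPLE_IOCTL_BASE 'd'
#define EXAMPLE_CMD_NR     0x64

/* Declared write-only: userspace passes data in, nothing is copied back. */
#define EXAMPLE_WAIT_IDLE_IOW \
   _IOW(EXAMPLE_IOCTL_BASE, EXAMPLE_CMD_NR, struct wait_idle_args)

/* What a write-read wrapper builds for the same command number. */
#define EXAMPLE_WAIT_IDLE_IOWR \
   _IOWR(EXAMPLE_IOCTL_BASE, EXAMPLE_CMD_NR, struct wait_idle_args)

/* Returns nonzero if the two encodings produce the same request number. */
int request_numbers_match(void)
{
   return EXAMPLE_WAIT_IDLE_IOW == EXAMPLE_WAIT_IDLE_IOWR;
}
```

Because the two request numbers differ, a kernel that matches on the full number rejects the write-read form even though the command number and argument size are identical.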
Mike Stroyan
962204961d configure.ac: Build dricommon for gallium swrast
When building dri-swrast, use gallium_check_st to set HAVE_COMMON_DRI.
Commit 07f2dee7 added setting of HAVE_COMMON_DRI in gallium_check_st.
But the dri-swrast case did not use gallium_check_st.
So dri/common was still not built.

v2: set HAVE_COMMON_DRI=yes instead of using gallium_check_st

NOTE: This is a candidate for the 9.1 branch.
      (Depends on 7de78ce5 and 07f2dee)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61821
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-06-06 08:54:07 +02:00
Rodrigo Vivi
ce67fb4715 i965: Adding more reserved PCI IDs for Haswell.
In the DDX commit, Chris mentioned our tendency to find out about more
PCI IDs only when users report them. So let's add all new reserved Haswell IDs.

NOTE: This is a candidate for stable branches.

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=63701
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-05 10:44:15 -07:00
Rico Schüller
3998cfa933 mesa: remove outdated version lines in comments
Signed-off-by: Brian Paul <brianp@vmware.com>
2013-06-05 08:54:27 -06:00
Richard Sandiford
7bdf1f2f1a gallium: System z support
The main change is to use MCJIT rather than the old JIT, which will never
be supported for System z.  The endianness part is by example since the
patch was tested on a glibc system.

Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
2013-06-05 08:36:24 -06:00
Roland Scheidegger
008fd03600 llvmpipe: improve alignment calculation for fetching/storing pixels
This was always using per-pixel alignment, which isn't necessary except
for the buffer case (due to the per-element offset). The disabled code
for calculating it was incorrect because it assumed that the full block
would always be fetched, which may not be the case, so fix this up.
For instance, for r10g10b10a2 the original code would have calculated the
alignment as 4 (block_width) * 4 (bytes) = 16, but the actual fetch may
have only fetched 2 values at a time, hence only alignment 8. It is
unclear what exactly would happen in this case (alignment larger than
size to fetch).
So just use the (already calculated) fetch size instead and get alignment
from that, which should always work, no matter if fetching 1, 2 or 4 pixels.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-05 00:29:47 +02:00
Roland Scheidegger
ffe2a1ca3c llvmpipe: reduce alignment requirement for 1d resources from 4x4 to 4x1
For rendering to buffers, we cannot have any y alignment.
So make sure that tile clear commands only clear up to the fb width/height,
not more (do this for all resources actually as clearing more seems
pointless for other resources too). For the jit fs function, skip execution
of the lower half of the fragment shader for the 4x4 stamp completely,
for depth/stencil only load/store the values from the first row
(replace other row with undef).
For the blend function, also only load half the values from fs output,
replace the rest with undefs so that everything still operates on the
full 4x4 block to keep code the same between 4x1 and 4x4 (except for
load/store of course which also needs to skip (store) or replace these
values with undefs (load)), at the cost of slightly less optimal code
being produced in some cases.
Also reduce 1d and 1d array alignment too, because they can be handled the
same as buffers so don't need to waste memory.

v2: don't try to run special blend code for 4x1, (very) slightly less
complexity if we just use the same code as for 4x4 which may or may not
make it easier to optimize in the future (as we care a lot more about 4x4
performance than 1d).

v3: don't use undef values for unused fs src outputs with llvm 3.1 as it
apparently can trigger a bug in llvm.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-05 00:29:47 +02:00
Roland Scheidegger
ef3e887084 llvmpipe: cleanup of generate_unswizzled_blend
Some parameters were used inconsistently, for instance not using
block_width/block_height/block_size for deriving number of pixels
but rather relying on guesses from the number of fragment shaders etc,
so fix this up (no actual change in behavior since the block size stays
fixed). (Though most of the code would work with different block_height,
with three exceptions, one being the hacked r11g11b10 conversions and
twiddle code which only work with block_height 2 not 1, and the last
one being blend vector type not being 128bit wide.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-05 00:29:47 +02:00
Roland Scheidegger
44993c1808 gallivm: enhance special sse2 4x4f and 2x8f -> 1x16ub conversion
There's no good reason why it can't handle 2x4f->1x8ub, 1x4f->1x4ub and
1x8f->1x8ub cases, there might be legitimate reasons why we don't have
enough input vectors for a full destination vector, and using pack
intrinsics should still be much better than using generic conversion
(it looks like convert_alpha from the blend code might hit this though
I suspect it could be avoided).

v2: add another test vector format to lp_test_conv so this gets tested.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-05 00:29:46 +02:00
Roland Scheidegger
ce82523db9 gallivm: (trivial) fix lp_build_concat_n
The code was designed to handle no-op concat but failed (unless the
caller was using same pointer for src and dst).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-05 00:29:46 +02:00
Brian Paul
f270baf074 mesa: change MAX_PROGRAM_ADDRESS_REGS to 1, clamp to it in state tracker
We've never properly supported more than one address register.  There
isn't even a field in prog_src_register or prog_dst_register to indicate
which address register to use if RelAddr!=0.

In the state tracker, clamp MaxAddressRegs against MAX_PROGRAM_ADDRESS_REGS
since many gallium drivers do support more.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65226

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-04 13:29:38 -06:00
Paul Berry
2fd785d126 intel: Don't try to blorp or blit CopyTexSubImage(1D_ARRAY).
Blorp and the hardware blitter can't be used to implement
CopyTexSubImage when the image type is 1D_ARRAY, because of a
coordinate system mismatch (the Y coordinate in the source image is
supposed to be matched up to the Z coordinate in the destination
texture).

The hardware blitter path (intel_copy_texsubimage) contained a perf
debug warning for this case, but it failed to actually fall back.  The
blorp path didn't even check.

Fixes piglit test "copyteximage 1D_ARRAY".

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-04 09:14:44 -07:00
Paul Berry
32d1f423bc i965/gen6+: Fix multisample assertions in CopyTexSubImage hw blitter path.
Commit 045612c (intel: Add an assert for glCopyTexSubImage() being
called on MSAA buffers) added an assertion to intel_copy_texsubimage()
to make sure that multisampling was not in use, based on the
assumption that glCopyTexSubImage() can't legally be used with
multisampling.

However, there is one case where glCopyTexSubImage() can legally be
used with multisampling: when the source buffer is a multisampled
window system buffer.  If the source and destination color formats
don't match, the blorp path will fail, so intel_copy_texsubimage()
will be called.  In this case, we need intel_copy_texsubimage() to
return false so that we fall back to meta to do the copy.  (The
multisampled source buffer won't cause a problem for the meta path,
because it uses glReadPixels, which forces a multisample resolve).

It's still safe to assert that the destination image is
single-sampled, because it's not legal to call glCopyTexSubImage() on
multisampled textures.

Fixes some failures with piglit tests "copyteximage
{1D,2D,CUBE,RECT,2D_ARRAY}" (with "samples=..." argument).

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-04 09:14:40 -07:00
Vinson Lee
7bafd88c15 mesa: Prevent possible out-of-bounds read by save_SamplerParameterfv.
Fixes "Out-of-bounds access" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-03 23:01:46 -07:00
Dave Airlie
0677ea063c i965: fix problem with constant out of bounds access (v3)
Okay I now understand why Frank would want to run away, this is
my attempt at fixing the CVE out of bounds access to constants
outside the range. This attempt converts any illegal constants
to constant 0 as per the GL spec, and is undefined behaviour.

A future patch should add some debug for users to find this out,
but this needs to be backported to stable branches.

CVE-2013-1872

v2: drop the last hunk which was a separate fix (now in master).
hopefully fix the indentations.

v3: don't fail piglit, the whole 8/16 dispatch stuff was over
my head, and I spent a while figuring it out, but this one is
definitely safe, one piglit pass extra on my Ironlake.

NOTE: This is a candidate for stable branches.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-06-04 13:50:20 +10:00
Eric Anholt
bb525f1f11 intel: Fix copying of separate stencil data in glCopyTexSubImage().
We were copying the source stencil data onto the destination depth data.

Fixes piglit copyteximage other than 1D_ARRAY.

v2: Fix unintentional dropping of the "don't double-copy for packed
    depth/stencil" check.  While blorp is only supported on separate
    stencil hardware at the moment, hopefully that will change soon.
    Review by Jordan.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-06-03 14:22:54 -07:00
Eric Anholt
c937aea3d1 meta: Fix temporary image type for float depth/stencil.
Fixes assertion failure in piglit copyteximage.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-03 13:47:19 -07:00
Eric Anholt
f96de8ad96 intel: Fix performance regression from miptree blit changes.
When making v2 of da2880bea0, I carefully checked all of the calls in that
commit to see that I'd updated them, but forgot to update the new calls in
the later commits such as e845c5cf7abce55759501a473459aff3bf25c9ca.  As a
result, we were getting Y tiled temporaries even though the whole point of
the temporary was to untile!

The steady state of the intro scene of lightsmark goes from 13 to 17 fps.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65154
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-03 13:47:18 -07:00
Carl Worth
610fe6da79 glcpp: Add test case for recently fixed loop-control underflow bug.
To trigger the bug, it suffices to have a line-continuation followed by
a newline and then a non-line-continuation backslash.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-03 13:33:32 -07:00
Carl Worth
d8eeb1d330 glcpp: Fix post-decrement underflow in loop-control variable
This loop-control condition with a post-decrement operator would lead to
an underflow of collapsed_newlines. This in turn would cause a subsequent
execution of the loop to labor inordinately trying to return the loop-control
variable to a value of 0 again.

Fix this by dis-intertwining the test and the decrement.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65112

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-03 13:33:31 -07:00
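A minimal sketch of the two loop shapes, not the glcpp code itself. The function names and the emitted counter are hypothetical; only collapsed_newlines is a name from the commit message. With the post-decrement in the condition, the decrement also runs on the final, failing test, so the counter ends below zero.

```c
/* Buggy shape: the decrement runs even when the test fails, so the
 * counter is left at -1 rather than 0 once the loop exits. */
int drain_post_decrement(int collapsed_newlines, int *emitted)
{
   while (collapsed_newlines--)
      (*emitted)++;
   return collapsed_newlines;
}

/* Fixed shape: test first, decrement only inside the loop body. */
int drain_separate_decrement(int collapsed_newlines, int *emitted)
{
   while (collapsed_newlines) {
      collapsed_newlines--;
      (*emitted)++;
   }
   return collapsed_newlines;
}
```

If the counter persists between calls, a later run of the buggy loop starts from a negative value and has to count all the way around before the test sees zero again.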
Chad Versace
7a9f4d3e71 i965: Fix glColorPointer(GL_FIXED)
When a gl_client_array is created with glColorPointer,
gl_client_array::Normalized is true. This caused the translation from the
gl_client_array's type to a BRW_SURFACEFORMAT to fail an assertion.

Fixes the spinning cube's color in Android 4.2's ApiDemos.apk,
"Graphics > OpenGL ES".

Fixes assertion failure in mesa-demos/src/egl/opengles1/tri_x11 on Haswell
and Ivybridge:
  brw_draw_upload.c:287: get_surface_type: Assertion `0' failed.

No Piglit regressions on Haswell.

Note: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42182
Issue: AXIA-2954
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-03 13:03:28 -07:00
Zack Rusin
e54c924a0e softpipe: draw_find_shader_output returns -1 on invalid outputs
It was changed from 0 to allow shader outputs at 0 that are
different from position.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-30 19:54:25 -04:00
Tom Stellard
124e1f91a7 radeonsi/compute: Upload work group, work item size in input buffer 2013-06-03 14:03:13 -04:00
Tom Stellard
3d831206a4 radeonsi/compute: Pass kernel arguments in a buffer v2
v2:
  - Fix memory leak in si_set_constant_buffer()
2013-06-03 14:03:08 -04:00
Tom Stellard
67e5c9ae0e radeonsi/compute: Implement un-binding of global buffers 2013-06-03 10:24:54 -04:00
Tom Stellard
d2472ceb92 radeonsi/compute: Support multiple kernels in a compute program 2013-06-03 10:24:54 -04:00
Tom Stellard
3f24190325 radeonsi/compute: Add missing PIPE_COMPUTE caps 2013-06-03 10:24:54 -04:00
Jordan Justen
c754f7a8fd i965 gen7: use SURFACE_STATE fields to select render level/layer
Rather than pointing the surface_state directly at a single
sub-image of the texture for rendering, we now point the
surface_state at the top level of the texture, and configure
the surface_state as needed based on this.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-02 20:39:38 -07:00
Jordan Justen
6bfd897fc4 mesa/texformat: add _mesa_tex_target_is_array function
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-02 20:39:38 -07:00
Jordan Justen
6a5469cff9 intel: add layered parameter to update_renderbuffer_surface
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-02 20:38:37 -07:00
Jordan Justen
8312caf673 intel_fbo: set gl_renderbuffer Depth field
Set the renderbuffer's Depth field to match the texture's
Depth when rendering to a texture.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-02 20:38:37 -07:00
Jordan Justen
a2d31371e9 intel: print image depth in debug message
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-02 20:38:37 -07:00
Brian Paul
e20a2df401 mesa: handle missing read buffer in _mesa_get_color_read_format/type()
We were crashing when GL_READ_BUFFER == GL_NONE.  Check for NULL
pointers and reorganize the code.  The spec doesn't say which error
to generate in this situation, but NVIDIA raises GL_INVALID_OPERATION.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65173
NOTE: This is a candidate for the stable branches.

Tested-by: Vedran Rodic <vrodic@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-06-02 18:12:07 -06:00
Brian Paul
dcc5b6bfb7 meta: move vertex array enables for mipmap generation
Before, on the second call to GenerateMipmap we were enabling two
vertex arrays for the current vertex array object, rather than
the private generate-mipmap vertex array object.  This caused
things to blow up elsewhere.

This patch moves the array enables into the block where the
generate-mipmap vertex array object is created, as we do in
the setup_ff_generate_mipmap() function.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60518
NOTE: This is a candidate for the stable branches.

Tested-by: core13@gmx.net
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-06-02 18:06:17 -06:00
Brian Paul
8588350dc0 mesa: fix hodge podge indentation, update comments in texformat.c 2013-06-02 18:06:17 -06:00
Roland Scheidegger
6b53e2b038 gallium: add support for layered rendering
Since pipe_surface already has all the necessary fields no interface
changes are necessary except adding a new shader semantic value
(TGSI_SEMANTIC_LAYER).
(Note that what GL knows as "gl_Layer" variable d3d10 is naming
"RENDER_TARGET_ARRAY_INDEX".)

v2: drop cap bit (just tied to geometry shader), add docs.
2013-06-01 20:03:59 +02:00
Roland Scheidegger
458a9a0f85 gallivm: fix out-of-bounds access with mirror_clamp_to_edge address mode
Surprising this bug survived so long, we were missing a clamp (in the
linear filtering version).
(Valgrind complained a lot about invalid reads with piglit texwrap,
I've also seen spurious failures in this test which might have
happened due to this. Valgrind probably didn't complain before the
alignment reduction in llvmpipe to 4x4 since the test is using tiny
textures so the reads were still always well within allocated area.)
While here, also do an effective clamp (after half subtraction)
of [0,length-0.5] instead of [0, length-1] which saves an instruction
(the filtering weight could be different due to this, but only if
both texels point to the same max texel so it doesn't matter).
(Both changes are borrowed from PIPE_TEX_CLAMP_TO_EDGE case.)

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-01 20:03:59 +02:00
Roland Scheidegger
f51fc7a71c llvmpipe: fix bogus assertions for buffer surfaces
One of the assertions made no sense for buffer rendertargets
(due to the union), so drop it. (The same assertion is present already in
the path for texture surfaces later.)

v2: make assertion completely accurate (suggested by Jose).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-01 20:03:59 +02:00
Kenneth Graunke
4405ff4055 i965: Fix haswell_upload_cut_index when there's no index buffer.
brw->ib.type is reset to -1 at the start of each batch.  If there's no
index buffer, it won't get updated to a sensible value, resulting in
_mesa_primitive_restart_index's "Invalid index buffer type" assertion
tripping.

Fixes a regression since 7c87a3b5da.

NOTE: This is a candidate for the 9.1 branch (and should be squashed).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65195
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-31 21:54:49 -07:00
Roland Scheidegger
869c5d438f llvmpipe: reduce alignment requirement for resources from 64x64 to 4x4
The overallocation was very bad especially for things like 1d array
textures which got blown up by a factor of 64. (Even ordinary smallish
2d textures benefit a lot from this, a mipmapped 64x64 rgba8 texture
previously used 7*16kB = 112kB instead of now ~22kB.)
4x4 is chosen because this is the size the jit functions run on, so
making it smaller is going to be a bit more complicated.
It is actually not strictly 4x4 pixels, since we'd want to avoid situations
where different threads are rendering to the same cacheline so we keep
cacheline size alignment in x direction (often 64bytes).
To make this work introduce new task width/height parameters and make
sure clears don't clear the whole tile if it's a partial tile. Likewise,
the rasterizer may produce fragments outside the 4x4 blocks present in a
tile, so don't call the jit function for them.
This does not yet fix rendering to buffers (which cannot have any y
alignment at all), and 1d/1d array textures are still overallocated by a
factor of 4.
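
The 112kB vs ~22kB figures above can be reproduced with a small sketch.
This is not llvmpipe code: the helper names are hypothetical, and the
64-byte cacheline and "pad x to max(4, cacheline / bytes-per-pixel)" rule
are assumptions taken from the description.

```c
#include <stddef.h>

/* Round v up to the next multiple of a (a must be non-zero). */
static size_t align_up(size_t v, size_t a)
{
    return (v + a - 1) / a * a;
}

/* Larger of two sizes. */
static size_t max_size(size_t a, size_t b)
{
    return a > b ? a : b;
}

/*
 * Total bytes for the full mip chain of a square 2D texture, with each
 * level's width/height rounded up to align_x/align_y pixels.
 * Hypothetical helper for illustration only.
 */
static size_t mip_chain_bytes(size_t base, size_t bytes_per_pixel,
                              size_t align_x, size_t align_y)
{
    size_t total = 0;
    for (size_t dim = base; dim >= 1; dim /= 2) {
        size_t w = align_up(dim, align_x);
        size_t h = align_up(dim, align_y);
        total += w * h * bytes_per_pixel;
    }
    return total;
}

/* Old scheme: every mip level padded to 64x64 pixels. */
static size_t old_layout_bytes(size_t base, size_t bytes_per_pixel)
{
    return mip_chain_bytes(base, bytes_per_pixel, 64, 64);
}

/* New scheme: 4x4 blocks, rows padded to an assumed 64-byte cacheline. */
static size_t new_layout_bytes(size_t base, size_t bytes_per_pixel)
{
    size_t align_x = max_size(4, 64 / bytes_per_pixel);
    return mip_chain_bytes(base, bytes_per_pixel, align_x, 4);
}
```

For a 64x64 rgba8 texture (4 bytes per pixel, 7 levels) the old layout
gives 7 * 16384 = 114688 bytes, and the new one gives 22784 bytes, which
is the ~22kB quoted.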

v2: replace magic number 4 with LP_RASTER_BLOCK_SIZE, fix size of buffers
allocated (needed in case we render to them).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-31 20:21:05 +02:00
Adam Jackson
e881c9a5dc llvmpipe: Remove x/y from cmd_bin
These were mostly just a waste of memory and cache pressure, and were
really only used for debugging.

This change reduces instruction count (as measured by callgrind's Ir
event) of gnome-shell-perf-tool on Ivybridge by 3.5% ± 0.015% (n=20).

Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-05-31 20:21:05 +02:00
Vadim Girlin
eb4c992ea5 r600g/sb: fix broken assert
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-31 22:11:42 +04:00
Andreas Boll
5ea43e6549 glapi: Add some missing static_dispatch="false" annotations to es_EXT.xml
This fixes the following build errors on powerpc:

  CC     glapi_dispatch.lo
  In file included from glapi_dispatch.c:90:0:
  ../../../../../src/mapi/glapi/glapitemp.h:1640:1: error: no previous
  prototype for 'glReadBufferNV' [-Werror=missing-prototypes]
  ../../../../../src/mapi/glapi/glapitemp.h:4198:1: error: no previous
  prototype for 'glDrawBuffersNV' [-Werror=missing-prototypes]
  ../../../../../src/mapi/glapi/glapitemp.h:6377:1: error: no previous
  prototype for 'glFlushMappedBufferRangeEXT'
  [-Werror=missing-prototypes]
  ../../../../../src/mapi/glapi/glapitemp.h:6389:1: error: no previous
  prototype for 'glMapBufferRangeEXT' [-Werror=missing-prototypes]
  ../../../../../src/mapi/glapi/glapitemp.h:6401:1: error: no previous
  prototype for 'glBindVertexArrayOES' [-Werror=missing-prototypes]
  ../../../../../src/mapi/glapi/glapitemp.h:6413:1: error: no previous
  prototype for 'glDeleteVertexArraysOES' [-Werror=missing-prototypes]
  ../../../../../src/mapi/glapi/glapitemp.h:6433:1: error: no previous
  prototype for 'glGenVertexArraysOES' [-Werror=missing-prototypes]
  ../../../../../src/mapi/glapi/glapitemp.h:6445:1: error: no previous
  prototype for 'glIsVertexArrayOES' [-Werror=missing-prototypes]

NOTE: This is a candidate for the 9.0 and 9.1 branches.

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-31 17:18:57 +02:00
Vinson Lee
171199b2b7 mesa: Add missing break statement in _mesa_choose_tex_format.
Fixes "Missing break in switch" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-30 23:12:32 -07:00
Alan Coopersmith
306f630e67 integer overflow in XF86DRIGetClientDriverName() [CVE-2013-1993 2/2]
clientDriverNameLength is a CARD32 and needs to be bounds checked before
adding one to it to come up with the total size to allocate, to avoid
integer overflow leading to underallocation and writing data from the
network past the end of the allocated buffer.
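
A minimal sketch of the kind of check described, assuming an INT_MAX
limit; alloc_wire_string() is a hypothetical helper and not the actual
XF86DRI code.

```c
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate room for a wire-supplied string of `length` bytes plus a NUL.
 * Returns NULL if the length is too large to add one to safely, or if
 * the allocation fails. Hypothetical helper for illustration only.
 */
static char *alloc_wire_string(uint32_t length)
{
    /* Reject before computing length + 1, so the sum cannot wrap. */
    if (length >= INT_MAX)
        return NULL;

    char *buf = malloc((size_t)length + 1);
    if (buf != NULL)
        buf[length] = '\0';
    return buf;
}
```

Without the check, a length of 0xFFFFFFFF plus one wraps to 0 in 32-bit
arithmetic, so the allocation would be far smaller than the data
subsequently read into it.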

NOTE: This is a candidate for stable release branches.

Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-30 18:03:45 -07:00
Alan Coopersmith
2e5a268f18 integer overflow in XF86DRIOpenConnection() [CVE-2013-1993 1/2]
busIdStringLength is a CARD32 and needs to be bounds checked before adding
one to it to come up with the total size to allocate, to avoid integer
overflow leading to underallocation and writing data from the network past
the end of the allocated buffer.

NOTE: This is a candidate for stable release branches.

Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-30 18:03:39 -07:00
Brian Paul
51498a3e71 mesa: fix error checking of DXT sRGB formats in _mesa_base_tex_format()
For formats such as GL_COMPRESSED_SRGB_S3TC_DXT1_EXT we need to
have both the GL_EXT_texture_sRGB and GL_EXT_texture_compression_s3tc
extensions.  This patch adds the missing check for the latter.

Found when checking out https://bugs.freedesktop.org/show_bug.cgi?id=65173

NOTE: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-30 14:01:31 -06:00
Brian Paul
fb1785197f mesa: asst. whitespace, formatting fixes in teximage.c 2013-05-30 14:01:31 -06:00
Zack Rusin
978d5ed06b draw: fix vs/fs input/output mismatches
When we changed draw_find_shader_output to return -1 instead
of 0 for attribs that are not found, we broke the default behavior of
draw, which was to always redirect those to the first (0th) slot.
To preserve that behavior, if draw_emit_vertex_attr notices a
mismatched vertex attrib, it just redirects it to the first slot
(instead of trying to use a negative index in an array).
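
The redirect amounts to something like the following sketch.
resolve_output_slot() is a hypothetical name, not the function touched by
this commit; only the -1 "not found" convention comes from the text.

```c
/*
 * Map the result of an output lookup to a usable slot.
 * `found_slot` is -1 when the attribute was not found; any other value
 * is taken as a valid slot index. Hypothetical helper.
 */
static unsigned resolve_output_slot(int found_slot)
{
    /* Never index an array with -1: fall back to the first slot. */
    return found_slot < 0 ? 0u : (unsigned)found_slot;
}
```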

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-30 15:34:19 -04:00
Anuj Phogat
0a70fdfb3f intel: Add multisample scaled blitting in blorp engine
In traditional multisampled framebuffer rendering, color samples must be
explicitly resolved via BlitFramebuffer before doing the scaled blitting
of the framebuffer. So, scaled blitting of a multisample framebuffer
takes two separate calls to BlitFramebuffer.

This patch implements the functionality of doing multisampled scaled
resolve using just one BlitFramebuffer call. Important changes involved
in this patch are listed below:
    - Use float registers to scale and offset texture coordinates.
    - Change offset computation to consider float coordinates.
    - Round the scaled coordinates down to nearest integer.
    - Modify src texture coordinates clipping to account for scaling.
    - Linear filter is not yet implemented in blorp. So, don't use
      blorp engine to do single sampled scaled blitting.

V3: Fix nearest filtering issue in scaled blits. Makes the failing piglit
fbo-blit-stretch test and framebuffer_blit_functionality_magnifying_blit.test
in gles3 CTS pass.

Observed no piglit, gles3 CTS regressions on sandybridge & ivybridge with
this patch.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-30 10:50:30 -07:00
Anuj Phogat
6e28713a8d intel: Change the register type from UW to UD in blorp engine
These changes are required to implement scaled blitting in blorp
in my next patch.

No regressions observed in piglit quick-driver.tests with this patch.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-30 10:50:29 -07:00
Anuj Phogat
40e3298125 mesa: Implement ext_framebuffer_multisample_blit_scaled extension
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-30 10:50:29 -07:00
Kenneth Graunke
60f9b722ef Revert "i965: fix problem with constant out of bounds access (v2)"
This reverts commit 98dfd59a04.

The patch was clearly not Piglit tested, as it caused at least 225
tests to start crashing with assertion failures.  That was before my
desktop tanked and the test run died completely.
2013-05-29 23:31:09 -07:00
Courtney Goeltzenleuchter
8b1c9de166 ilo: simplify shader variant handling
Remove hash function on shader variants. The nature of variants limits them to a
small number, and thus it's more efficient to just do a memory compare of the
actual shader structures rather than compute and compare hashes.
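
As a rough illustration of the memory-compare approach (the struct
layout and names below are hypothetical, not ilo's), a lookup over a
short list might look like:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical variant key. Keys should be zero-initialised before being
 * filled in so that any padding bytes compare equal under memcmp().
 */
struct variant_key {
    unsigned num_samplers;
    unsigned flat_shade;
};

struct shader_variant {
    struct variant_key key;
    struct shader_variant *next;
};

/* Linear search by raw memory compare; returns NULL if no match. */
static struct shader_variant *
find_variant(struct shader_variant *head, const struct variant_key *key)
{
    for (struct shader_variant *v = head; v != NULL; v = v->next) {
        if (memcmp(&v->key, key, sizeof(*key)) == 0)
            return v;
    }
    return NULL;
}
```

With only a handful of variants per shader, the linear walk touches very
little memory, which is the trade-off the commit argues for.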
2013-05-30 13:58:40 +08:00
Dave Airlie
98dfd59a04 i965: fix problem with constant out of bounds access (v2)
This is my attempt at fixing this as the CVE is making RH security team
care enough to make me look at this. (please upstream, security fixes are
more important than whatever else you are doing, if for no other reason than
it saves me having to fix stuff I've no real clue about).

Since Frank's original fix was denied, here is my attempt to just
alias all constants that are out of bounds < 0 or > nr_params to constant 0,
hopefully this provides the undefined behaviour idr requires.

CVE-2013-1872

v2: drop the last hunk which was a separate fix (now in master).
hopefully fix the indentations.

NOTE: This is a candidate for stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-05-30 12:59:34 +10:00
Frank Henigman
02fe736cc0 intel: initialize fs_visitor::params_remap in constructor
Set fs_visitor::params_remap to NULL in the constructor.
This variable was potentially tested in fs_visitor::remove_dead_constants()
before being set.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Frank Henigman <fjhenigman@google.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-05-30 10:37:35 +10:00
Brian Paul
83aaf61e24 draw: add cast in debug_printf() to silence warning 2013-05-29 18:07:35 -06:00
Brian Paul
71682c1599 svga: add PIPE_CAP_MAX_VIEWPORTS to switch to silence warning 2013-05-29 18:07:11 -06:00
Zack Rusin
c08baef508 draw: make sure viewport index is fetched from leading vertex
Viewport index should only be used on a per primitive basis, so
instead of fetching it from each vertex, potentially making each
vertex in a primitive use a different viewport index, which is
obviously broken, make sure that we only fetch from the first
vertex in the primitive making the viewport index the same
for the entire primitive.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-25 09:49:20 -04:00
Zack Rusin
c88ce3480c llvmpipe: clamp scissors to be between 0 and max
We need to clamp to make sure an invalid shader doesn't crash our
driver. The spec says to return the 0-th index for everything that's
out of bounds.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-25 09:49:20 -04:00
Zack Rusin
d7d676252d draw: clamp the viewports to always be between 0 and max
If the viewport index is larger than the PIPE_MAX_VIEWPORTS,
then the first (0-th) viewport should be used.
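
A sketch of the clamp, assuming valid indices run from 0 to max - 1 (so
an index equal to the maximum is also treated as out of range).
MAX_VIEWPORTS is a stand-in for PIPE_MAX_VIEWPORTS with an assumed value,
and the function name is hypothetical.

```c
#define MAX_VIEWPORTS 16 /* stand-in for PIPE_MAX_VIEWPORTS; value assumed */

/* Return idx if it names a valid viewport, otherwise the 0-th viewport. */
static unsigned clamp_viewport_index(unsigned idx)
{
    return idx < MAX_VIEWPORTS ? idx : 0;
}
```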

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-25 09:49:20 -04:00
Zack Rusin
26fe24c479 gallium/docs: adds documentation for multi viewport cap
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-25 09:49:20 -04:00
Zack Rusin
4b5595b38b draw: fixup draw_find_shader_output
draw_find_shader_output, like most of the code in draw, used to
depend on position always being at output slot 0, which meant
that any other attribute being at 0 could signify an error.
Unfortunately, position can be at any of the output slots, thus
other attributes can occupy slot 0 and we need to mark the ones
which were not found by something else. This commit changes
draw_find_shader_output so that it returns -1 if it can't
find the given attribute, and adjusts the code that depended
on it returning >0 whenever it correctly found an attrib.
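
The new convention can be sketched as below. The struct and function are
simplified, hypothetical stand-ins rather than the real draw module
types; the point is only that slot 0 is a valid result and -1 is the
sentinel.

```c
#include <stddef.h>

/* Hypothetical, simplified description of one shader output. */
struct output_info {
    unsigned semantic_name;
    unsigned semantic_index;
};

/*
 * Return the slot holding the given semantic, or -1 if it is absent.
 * Slot 0 is a perfectly valid result, so callers must test for < 0
 * rather than treating 0 as "not found".
 */
static int find_output(const struct output_info *outs, size_t count,
                       unsigned name, unsigned index)
{
    for (size_t i = 0; i < count; i++) {
        if (outs[i].semantic_name == name &&
            outs[i].semantic_index == index)
            return (int)i;
    }
    return -1;
}
```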

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-25 09:49:20 -04:00
Zack Rusin
97b8ae429e llvmpipe: implement support for multiple viewports
Largely related to making sure the rasterizer can correctly
pick out the correct scissor box for the current viewport.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-25 09:49:20 -04:00
Zack Rusin
7756aae815 draw: implement support for multiple viewports
This adds support for multiple viewports to the draw module.
Multiple viewports depend on the presence of geometry shaders
which can write the viewport index.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-25 09:49:20 -04:00
Zack Rusin
eaabb4ead0 gallium: Add support for multiple viewports
Gallium supported only a single viewport/scissor combination. This
commit changes the interface to allow us to add support for multiple
viewports/scissors.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-25 09:49:20 -04:00
Kenneth Graunke
e6efb900e7 mesa: Delete the ctx->Array._RestartIndex derived state.
It's incorrect and isn't used any longer.

v2: Actually flush vertices/flag _NEW_TRANSFORM on RestartIndex change.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-29 14:22:17 -07:00
Kenneth Graunke
51c0ffacb2 mesa: Ignore fixed-index primitive restart in ArrayElement().
GL_PRIMITIVE_RESTART_FIXED_INDEX is only supposed to apply to
glDrawElements*.  This code is for legacy drawing paths and display
lists, so it shouldn't apply.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-29 14:22:14 -07:00
Kenneth Graunke
a41478e3f6 st/mesa: Go back to using ctx->Array.RestartIndex, not _RestartIndex.
The derived _RestartIndex field is an attempt to support both
GL_PRIMITIVE_RESTART and GL_PRIMITIVE_RESTART_FIXED_INDEX (part of ES
3.0).  Gallium drivers don't appear to support ES 3.0 yet, so they don't
need to use it.  Plus, it's broken and going to go away soon.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-29 14:22:11 -07:00
Kenneth Graunke
49aba27973 i965: Fix can_cut_index_handle_restart_index() for byte/short types.
Pre-Haswell hardware doesn't support an arbitrary restart index, and
instead compares the index buffer value against 0xFF for byte-size
buffers, 0xFFFF for short-size buffers, or 0xFFFFFFFF for unsigned
integer buffers.

OpenGL allows the restart index to be an arbitrary unsigned integer.
When comparing against byte/short types, the index buffer value should
be promoted to a full 32-bit integer before doing the comparison.  The
restart index is /not/ supposed to be masked to byte/short size.

This means that with certain restart indexes, the comparison should
always fail.  For example, a restart index of 0xF000FFFF should never
match any byte/short index buffer values due to the extra high bits.

We must not enable hardware primitive restart in such a case.  For now,
fall back to software primitive restart as it's the simplest fix.  In
the future, we could detect restart indexes that will never match and
skip both hardware and software primitive restart.
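
A sketch of the check described, using hypothetical names rather than the
actual i965 ones: hardware restart is only usable when the full 32-bit
restart index equals the fixed cut index for the buffer's type.

```c
#include <stdbool.h>
#include <stdint.h>

enum index_type { INDEX_U8, INDEX_U16, INDEX_U32 }; /* hypothetical names */

/* Fixed value that pre-Haswell hardware compares index data against. */
static uint32_t hw_cut_index(enum index_type type)
{
    switch (type) {
    case INDEX_U8:
        return 0xFFu;
    case INDEX_U16:
        return 0xFFFFu;
    default:
        return 0xFFFFFFFFu;
    }
}

/*
 * True if hardware primitive restart matches what GL asks for.
 * The restart index is compared as a full 32-bit value and is
 * deliberately not masked down to the index size.
 */
static bool hw_can_handle_restart(enum index_type type,
                                  uint32_t restart_index)
{
    return restart_index == hw_cut_index(type);
}
```

With this, 0xF000FFFF on a short index buffer is rejected and falls back
to the software path, as in the example above.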

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-29 14:22:08 -07:00
Kenneth Graunke
7c87a3b5da i965: Use the correct restart index for fixed index mode on Haswell.
The code that updates the ctx->Array._RestartIndex derived state mashed
it to 0xFFFFFFFF when GL_PRIMITIVE_RESTART_FIXED_INDEX was enabled
regardless of the index buffer type.  It's supposed to be 0xFF for byte,
0xFFFF for short, or 0xFFFFFFFF for integer types.

The new _mesa_primitive_restart_index() helper gets this right.

The hardware appears to compare against the full 32-bit value some of
the time, causing primitive restart not to occur when it should.  The
fact that it works some of the time is rather frightening.

Fixes sporadic failures in the ES 3 instanced_arrays_primitive_restart
conformance test when run in combination with other tests.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-29 14:22:06 -07:00
Kenneth Graunke
1569709663 vbo: Use the new primitive restart index helper function.
This gets the correct restart index for unsigned byte/short types when
using GL_PRIMITIVE_RESTART_FIXED_INDEX.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-29 14:22:04 -07:00
Kenneth Graunke
959d076b30 mesa: Add a helper function for determining the restart index.
The derived state approach currently used (_RestartIndex) doesn't work:
in the GL_PRIMITIVE_RESTART_FIXED_INDEX case, the restart index depends
on the index buffer's data type, and that isn't known until draw time.

The existing code also fails to obey the GL 4.3 rules which say that
FIXED_INDEX takes precedence over normal primitive restart.

This helper function correctly determines the restart index, and will
replace the derived state.
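
Roughly, such a helper has to look like the sketch below. The struct and
function names are hypothetical and this is not the signature of
_mesa_primitive_restart_index(); it only shows the two rules from the
text: fixed-index mode wins, and its value depends on the index size
known at draw time.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the relevant GL state. */
struct restart_state {
    bool fixed_index_enabled;    /* GL_PRIMITIVE_RESTART_FIXED_INDEX */
    uint32_t user_restart_index; /* value set by the application */
};

/*
 * Pick the restart index for a draw whose indices are
 * index_size_bytes wide (1, 2 or 4).
 */
static uint32_t effective_restart_index(const struct restart_state *s,
                                        unsigned index_size_bytes)
{
    /* FIXED_INDEX takes precedence over the user-supplied index. */
    if (s->fixed_index_enabled) {
        switch (index_size_bytes) {
        case 1:
            return 0xFFu;
        case 2:
            return 0xFFFFu;
        default:
            return 0xFFFFFFFFu;
        }
    }
    return s->user_restart_index;
}
```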

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-29 14:22:02 -07:00
Kenneth Graunke
37f278000c vbo: Ignore PRIMITIVE_RESTART_FIXED_INDEX for glDrawArrays().
The derived _PrimitiveRestart enable flag combines the PrimitiveRestart
and PrimitiveRestartFixedIndex enable flags.  However, DrawArrays is not
supposed to do FixedIndex restart:

From the OpenGL 4.3 Core specification, section 10.3.5 (page 302):
"If PRIMITIVE_RESTART_FIXED_INDEX is enabled, primitive restart is not
 performed for array elements transferred by any drawing command not
 taking a type parameter, including all of the *Draw* commands other
 than *DrawElements*."

The OpenGL ES 3.0 specification agrees by omission:
"When DrawElements, DrawElementsInstanced, or DrawRangeElements
 transfers a set of generic attribute array elements to the GL..."

Notably, DrawArrays is not included in the list of draw calls that
take PRIMITIVE_RESTART_FIXED_INDEX into consideration.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-29 14:21:51 -07:00
Eric Anholt
6220cc931f i965/vs: Fix implied_mrf_writes() for integer division pre-gen6.
Previously it would assertion fail in debug builds (though the correct
value was returned in a non-debug build).  Marking it as a candidate for
stable even though it has no current consumers in the stable branches, in
case one shows up in a later backport.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64727
NOTE: This is a candidate for stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-29 11:02:01 -07:00
Eric Anholt
0a0b323193 i965/fs: Fix test for smearing enabled on an instruction.
We were expanding the live range too far, breaking register_coalesce_2()
and compute_to_mrf() on 16-wide shaders.  Turning it back on improves
GLB2.7 performance by 0.239355% +/- 0.0850649% (n=398). shader-db stats
are:

total instructions in shared programs: 1627211 -> 1609262 (-1.10%)
instructions in affected programs:     450351 -> 432402 (-3.99%)

While 33 new 16-wide shaders are gained, 70 are lost.  Despite that,
tropics (the app that lost the most 16-wide) shows a .41% +/- .16%
(n=7/8, first-run outlier removed) performance improvement on my HSW.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-29 10:20:26 -07:00
Eric Anholt
9a31c4f9ac i965/fs: Fix segfault in instruction scheduling with LINTERP using last GRF.
The scheduler didn't know about uniform-type accesses, and if a uniform
access was last in a 16-wide, we'd walk off the end of the array.  This
never happened, because we'd never coalesce out all the GRFs, due to a bug
to be fixed in the next commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-29 10:16:44 -07:00
Eric Anholt
7e7600d10b mesa: Fix test for optimistic coloring being necessary.
i965 and radeon use ra_set_node_reg() to force payload registers to
specific registers while exposing those registers to the allocator still.
We were treating those register nodes as unsuccessfully allocated in the
ra_simplify() step, leading to walking the registers again to do
optimistic coloring even if there was nothing left to do.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-29 10:16:44 -07:00
Anthony G. Basile
22f1add968 gallium: fix build on uclibc system
execinfo.h and debug_symbol_name_glibc() are pure GNU-isms and do not
build on uclibc systems.  A previous patch addressed this issue, but
there was an error.  This patch corrects that error.  See

  https://bugs.freedesktop.org/show_bug.cgi?id=51782
  https://bugs.gentoo.org/show_bug.cgi?id=469768

Signed-off-by: Anthony G. Basile <blueness@gentoo.org>
Signed-off-by: Brian Paul <brianp@vmware.com>
2013-05-29 08:32:35 -06:00
Eric Anholt
4dea6cf215 intel: Enable blit glCopyTexSubImage/glBlitFramebuffer with sRGB.
Since the introduction of default-to-SARGB8 window system framebuffers,
non-blorp hardware lost blit acceleration for these two paths between the
window system and ARGB8888 textures.  Since we shouldn't be doing any
conversion anyway, just compatibility-check the linear variants of the
formats.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61954
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
2013-05-28 17:53:44 -07:00
Andreas Hartmetz
f43f07d588 radeonsi: Add ipo to LLVM_COMPONENTS
r600g needs it too, so add ipo in the common radeon_llvm_check().

radeonsi compiled and linked, but it failed at dynamic link time
with a missing symbol.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-05-28 17:08:00 -07:00
Roland Scheidegger
33fcce3682 llvmpipe: get rid of tiled/linear layout remains
Eliminate the rest of the no longer needed layout logic.
(It is possible some code could be simplified a bit further still.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-29 00:41:06 +02:00
Eric Anholt
b3abc93f47 intel: Remove dead intel_drawbuf_region().
Since the glBitmap() MRT change, it's unused.  There was basically no way
to responsibly use this function since MRT was introduced.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 13:06:58 -07:00
Eric Anholt
0a39cb88de intel: Fix format handling of blit glBitmap()
Any 32-bit format got ARGB8888 handling (including, say, GL_RG1616), and
anything else got 16-bit (including, say, GL_R8), which could potentially
hang the GPU by writing out of bounds.

NOTE: This is a candidate for the stable branches.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 13:06:58 -07:00
Eric Anholt
1cb8de6fff intel: Fix MRT handling of glBitmap().
We'd only hit color buffer 0 even if multiple draw buffers were bound.

NOTE: This is a candidate for the stable branches.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 13:06:57 -07:00
Eric Anholt
5f29dca070 intel: Rebuild PBO blit glTexImage() on top of miptrees.
This will ensure that we have resolves if we ever extend this to
glTexSubImage(), and fixes missing image start offset handling.

The texture buffer alloc ended up getting moved up, because we want to
look at the format of the image's actual mt to see if we'll end up
blitting the right thing, in the case of packed depth/stencil uploads.

This is the last caller of intelEmitCopyBlit() on a miptree-wrapped BO.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 13:06:57 -07:00
Eric Anholt
3c3e83014b intel: Rebuild PBO blit glReadPixels() on top of miptrees.
The previous code was missing depth resolves, that had only been prevented
due to no blitting of Y tiling.  The pair of flip args in the new blit
function means that we can just drop the pack->Invert fallback.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 13:06:57 -07:00
Eric Anholt
8c3392e274 intel: Rework intel_miptree_create_for_region() to wrap a BO.
I needed to do this for the PBO blit cases to use intel_miptree_blit().
But this also actually partially fixes a bug in EGLImage handling: We
can't share regions across contexts, because regions have a refcount that
isn't protected by a mutex, and different contexts can be simultaneously
accessed from multiple threads.  Now we just need to get regions out of
__DRIImage.  There was also a missing use of image->offset in the EGLImage
renderbuffer storage code.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 13:06:57 -07:00
Eric Anholt
e845c5cf7a intel: Make a temporary miptree for the blit path of miptree mapping.
In a bit of debug code, we no longer have the inter-slice x/y to print.
But I think the level/slice is more useful in this case for looking at
what's getting mapped, especially given that INTEL_DEBUG=blit will tell
you the other value.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 13:06:56 -07:00
Eric Anholt
4a13beef88 intel: Make a temporary miptree when doing blit uploads for glTexSubImage().
While this is a bit more CPU work, it also is less code to handle this
path, and fixes problems with 32k-pitch textures and missing resolves.

v2: Add error checking in new code.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 13:06:56 -07:00
Eric Anholt
da2880bea0 intel: Extend the force_y_tiling flag to allow forcing no tiling.
For a blit-uploaded temporary, it's faster on current hardware to memcpy
the data into a linear CPU mapping than to go through the GTT.

v2: Turn the not-fully-supported mask into 3 supported enum values.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Paul Berry <stereotype441@gmail.com> (v2)
Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v2)
2013-05-28 13:06:43 -07:00
Eric Anholt
045612c90e intel: Add an assert for glCopyTexSubImage() being called on MSAA buffers.
This is just in case someone else trips over this due to our weird reuse
of this code in glBlitFramebuffer().

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:44 -07:00
Eric Anholt
7638f5578e i965: Allow glCopyTexSubImage() on depth textures.
If the hw is pre-gen5 and can't blit depth, it'll cleanly error out.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:39 -07:00
Eric Anholt
48a22340cf i965: Prefer blorp glBlitFramebuffer() to the glCopyTexSubImage-based blit.
I think we've measured no performance difference from this in the past,
except that the blorp code can do things like multisample resolves.
Prevents piglit regression in the next commit when a testcase started
trying to do a multisampled resolve through the old glCopyTexSubImage()
path.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:35 -07:00
Eric Anholt
9720d436d1 i965: Consistently do depth resolves before blitting.
We were protected for a long time by the fact that depth was Y tiled and
you couldn't blit Y.  Now that we can blit Y, we were failing to resolve
depth in glCopyPixels().

Note in the comment about swrast, that the swrast map path does resolves
appropriately already.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:30 -07:00
Eric Anholt
6a7c27786c intel: Make a wrapper for intelEmitCopyBlit using miptrees.
I had previously asserted that it was hard to write a useful, simpler
blit function, but I think this might be it.

This has the side effect of extending the 32k pitch check to a few more
places that were missing it.

v2: Update comment for being moved inside intel_miptree_blit().

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:25 -07:00
Eric Anholt
0ae294bf7c intel: Rename intel_renderbuffer_tile_offsets.
This makes it more consistent with intel_miptree_get_tile_offsets().

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:21 -07:00
Eric Anholt
4e8eafd8f4 intel: Reduce intel_renderbuffer_tile_offsets to a thin wrapper.
Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:15 -07:00
Eric Anholt
5c85e1cf55 intel: Make intel_miptree_get_tile_offsets return a page offset.
Right now, the callers in i965 don't expect a nonzero page offset to
actually occur (since that's being handled elsewhere), but it seems
like a trap to leave it this way.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:00 -07:00
José Fonseca
4eaa0999b5 glsl: Fix MSVC build.
It appears that `sizeof(Class::member)` is either non-standard or
merely unsupported in MSVC.

So use `sizeof(instance->member)` instead, which is guaranteed to work
everywhere.

Also promote the assert to a static assert.

Trivial.
2013-05-28 13:56:18 +01:00
Marek Olšák
d4a06d77f5 mesa: fix GLSL program objects with more than 16 samplers combined
The problem is the sampler units are allocated from the same pool for all
shader stages, so if a vertex shader uses 12 samplers (0..11), the fragment
shader samplers start at index 12, leaving only 4 sampler units
for the fragment shader. The main cause is probably the fact that samplers
(texture unit -> sampler unit mapping, etc.) are tracked globally
for an entire program object.

This commit adapts the GLSL linker and core Mesa such that the sampler units
are assigned to sampler uniforms for each shader stage separately
(if a sampler uniform is used in all shader stages, it may occupy a different
sampler unit in each, and vice versa, an i-th sampler unit may refer to
a different sampler uniform in each shader stage), and the sampler-specific
variables are moved from gl_shader_program to gl_shader.
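
The effect of per-stage assignment can be illustrated with a toy
allocator. The names and the limit of 16 are assumptions for the sketch,
not the linker's actual data structures.

```c
#define MAX_SAMPLERS_PER_STAGE 16 /* assumed per-stage limit */

/* Hypothetical per-stage allocation state: one counter per shader stage. */
struct stage_samplers {
    unsigned next_unit;
};

/*
 * Hand out the next free sampler unit for this stage only, or -1 when
 * the stage has none left. Other stages do not affect the result.
 */
static int assign_sampler_unit(struct stage_samplers *stage)
{
    if (stage->next_unit >= MAX_SAMPLERS_PER_STAGE)
        return -1;
    return (int)stage->next_unit++;
}
```

With one counter per stage, a vertex shader using 12 samplers no longer
pushes the fragment shader's first sampler to index 12; the fragment
stage starts again at 0 and can use its full range.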

This doesn't require any driver changes, and it fixes piglit/max-samplers
for gallium and classic swrast. It also works with any number of shader
stages.

v2: - converted tabs to spaces
    - added an assertion to _mesa_get_sampler_uniform_value

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-28 13:05:30 +02:00
Marek Olšák
b4cb857dbf swrast: increase array size of TextureSample
to match the size of ctx->Texture.Unit, and it will also fix
piglit/max-samplers with the following commit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-28 13:05:30 +02:00
Marek Olšák
15a4b6db21 mesa: declare UniformBufferBindings as an array with a static size
Some Gallium drivers were crashing, because the array was not large enough.

v2: clamp the per-shader maximum in st/mesa, then sum them all up

NOTE: This is a candidate for the stable branches.
2013-05-28 13:05:30 +02:00
Michel Dänzer
cdad129f9c radeonsi: Enable GLSL 1.30 2013-05-28 11:20:53 +02:00
Michel Dänzer
0495adbac5 radeonsi: Handle TGSI TXQ opcode 2013-05-28 11:20:53 +02:00
Michel Dänzer
3623111960 radeonsi: Add support for TGSI TXF opcode 2013-05-28 11:20:53 +02:00
Michel Dänzer
beaa5eb03a radeonsi: Use tgsi_util_get_texture_coord_dim() 2013-05-28 11:20:53 +02:00
Michel Dänzer
0afeea5ad2 radeonsi: Handle TGSI_SEMANTIC_CLIPDIST 2013-05-28 11:20:16 +02:00
Michel Dänzer
784df2e115 radeonsi: Make border colour state handling safe for integer textures 2013-05-28 09:55:46 +02:00
Michel Dänzer
e369f40a9b radeonsi: Fix hardware state for dual source blending
Set up CB_SHADER_MASK register according to pixel shader exports, and enable
some minimal state for colour buffer 1 in case dual source blending is used.
2013-05-28 09:55:46 +02:00
Vadim Girlin
08810ca9ef r600g/sb: handle more cases for folding in gvn pass
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-28 05:24:53 +04:00
Christian König
5328c8001b st/vdpau: destroy handle table only when it's empty
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-27 18:18:32 +02:00
Christian König
f796b67431 st/vdpau: remove vlCreateHTAB from surface functions
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-27 18:18:32 +02:00
Christian König
8ea34fa0e8 st/vdpau: invalidate the handles on destruction
Fixes a problem with xbmc when switching channels.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-27 18:18:32 +02:00
Vadim Girlin
5de41575a1 r600g/sb: improve folding for SETcc
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 15:30:01 +04:00
Vadim Girlin
88e700329b r600g/sb: optimize CNDcc instructions
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 15:29:56 +04:00
Vadim Girlin
725671a83a r600g/sb: improve optimization of conditional instructions
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 15:19:20 +04:00
Chia-I Wu
5285c4c88e ilo: enable multiple constant buffers
This effectively enables uniform buffer object support.
2013-05-27 12:31:42 +08:00
Chia-I Wu
3a5dd39b1d ilo: add support for indirect access of CONST in FS
Unlike other register files, CONST is read with a message and indirect access
is easier to implement.
2013-05-27 12:30:51 +08:00
Chia-I Wu
8e7987cc49 ilo: add support for TBOs on GEN6
This hunk was missing in the last commit.
2013-05-27 12:30:42 +08:00
Chia-I Wu
11c9aaf30a ilo: advertise support for pure integer formats
For pure integer formats, neither filtering nor blending is needed.
2013-05-27 11:02:57 +08:00
Chia-I Wu
fb40aca879 ilo: add support for texture buffer objects
Take care of sampler views that have buffers as the underlying resources.
Update caps related to TBOs.
2013-05-27 11:02:57 +08:00
Chia-I Wu
441aa9326a tgsi: add buffer texture to tgsi_util_get_texture_coord_dim()
TGSI_TEXTURE_BUFFER is one-dimensional.  Assert that exec_tex() is never
called with TGSI_TEXTURE_BUFFER.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-27 11:02:06 +08:00
Vadim Girlin
63d09a0cb7 r600g/sb: improve handling of KILL instructions
This patch improves handling of unconditional KILL instructions inside
the conditional blocks, uncovering more opportunities for if-conversion.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 01:45:07 +04:00
Vadim Girlin
880f435a7e r600g/sb: fix peephole optimization for PRED_SETE
Fixes incorrect condition that prevented optimization for
PRED_SETE/PRED_SETE_INT.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 01:45:07 +04:00
Vadim Girlin
ff2a611699 r600g/sb: fix scheduling of PRED_SET instructions
PRED_SET instructions that update exec mask should be scheduled immediately
prior to the "if-then-else" block, because any instruction that is
inserted after alu clause with PRED_SET and before conditional block is
also conditionally executed by hw (exec mask is already updated at that
moment).

Probably it's better to make PRED_SET a part of conditional
"if-then-else" block in the IR to handle this more cleanly,
but for now this temporary solution should prevent the problem.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 01:45:07 +04:00
Vadim Girlin
44a117ab9a r600g/sb: fix handling of preloaded inputs for compute shaders
For compute shaders we need to let the backend know that
GPRs 0 and 1 are preloaded with some compute-specific input
values, otherwise any use of these regs without previous
definition is considered an undefined value and is usually
simply replaced with 0.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-25 22:56:53 +04:00
Brian Paul
fd9fe4470b xlib: add null ctx check in glXDestroyContext()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64934
NOTE: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-24 16:35:25 -06:00
Brian Paul
fd29e4acda st/glx: add null ctx check in glXDestroyContext()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64934
NOTE: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-24 16:35:25 -06:00
Brian Paul
db4580cbdf st/mesa: add switch cases for new IR enums to silence warnings 2013-05-24 16:35:25 -06:00
Brian Paul
820de34ceb st/glx/xlib: assorted whitespace, comment fixes 2013-05-24 16:35:24 -06:00
Vadim Girlin
8e41ced4b3 r600g/sb: fix incorrect assert
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-24 21:00:54 +04:00
Vadim Girlin
e9aa46e665 r600g/sb: relax some restrictions for FETCH instructions
This allows GVN rewrite pass to propagate non-const (register)
values to FETCH source operands, helping to eliminate unnecessary
copies in some cases.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-24 21:00:54 +04:00
Vadim Girlin
5a68a29706 r600g/sb: relax register allocation for compute shaders
We have to assume that all GPRs in compute shader can be indirectly
addressed because LLVM backend doesn't provide any indirect array info.
That's why for compute shaders GPR array is created that covers all used
GPRs (0..r600_bytecode::ngpr-1), but this seriously restricts register
allocation in sb.

This patch checks for actual use of indirect access in the shader and
if it's not used then GPR array is not created, so that regalloc is not
unnecessarily restricted.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-24 21:00:54 +04:00
Vadim Girlin
0b5b3f8816 r600g/sb: fix gpr array handling for compute shaders
Fixes segfault with bfgminer and R600_DEBUG=sbcl.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-24 16:45:58 +04:00
Vadim Girlin
d1e0dc6275 r600g/sb: fix buffer overflow in sb_ostream
Fixes segfault during bytecode dump with bfgminer kernel

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-24 16:40:58 +04:00
Tom Stellard
b1797c3a38 r600g/compute: Use common transfer_{map,unmap} functions for global resources
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-05-23 14:52:34 -07:00
Tom Stellard
65d67bcc4b r600g/compute: Use common transfer_{map,unmap} functions for kernel inputs
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-05-23 14:52:34 -07:00
Kenneth Graunke
062317d667 i965: Go back to using the kernel SOL reset feature.
It turns out the MI_LOAD_REGISTER_IMM approach doesn't work on Haswell,
and regressed essentially all the transform feedback Piglit tests.

This morally reverts eaa6fbe6d5.  However,
the code is still simpler than it was.  On BeginTransformFeedback, we
simply flush the batch and set the SOL reset flag so that the next batch
will start with zeroed offsets.  There's still no software counting.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64887
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-23 13:32:02 -07:00
Rob Clark
95670bdee2 freedreno: scissor fix
Don't assume the state-tracker will set the scissor after the
framebuffer state is changed.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-05-23 14:35:21 -04:00
Rob Clark
97fa811d14 freedreno: implement pipe->resource_copy_region()
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-05-23 14:35:21 -04:00
Kenneth Graunke
3ddfccb303 glsl linker: compare interface blocks during interstage linking
Verify that interface blocks match when linking separate shader
stages into a program.

Fixes piglit glsl-1.50 tests:
* linker/interface-blocks-vs-fs-member-count-mismatch.shader_test
* linker/interface-blocks-vs-fs-member-order-mismatch.shader_test

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2013-05-23 09:37:12 -07:00
Jordan Justen
4a0bcd90cf glsl linker: compare interface blocks during intrastage linking
Verify that interface blocks match when combining compilation
units at the same stage. (For example, when merging all vertex
shaders.)

Fixes piglit glsl-1.50 test:
* linker/interface-blocks-multiple-vs-member-count-mismatch.shader_test

v5 (Ken): Rename to link_interface_blocks.cpp and drop the separate .h
file for consistency with other linker code.  Remove "ok" variable.
Fold cross_validate_interface_blocks into its caller.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:12 -07:00
Jordan Justen
d6863acb9f glsl linker: support arrays of interface block instances
With this change we now support interface block arrays.
For example, cases like this:

out block_name {
    float f;
} block_instance[2];

This allows Mesa to pass the piglit glsl-1.50 test:
* execution/interface-blocks-complex-vs-fs.shader_test

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:12 -07:00
Jordan Justen
c30ca431ba glsl link_varyings: link interface blocks using the block name
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:12 -07:00
Jordan Justen
5ebf547312 glsl linker: remove interface block instance names
Convert interface blocks with instance names into flat
interface blocks without an instance name.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:12 -07:00
Jordan Justen
b24eeb078f glsl ast_to_hir: support in/out for interface blocks
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:12 -07:00
Jordan Justen
cb29a7095f glsl ast_to_hir: reject row/column_major for in/out interface blocks
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:12 -07:00
Jordan Justen
c00387497d glsl ast_to_hir: move uniform block symbols to interface blocks namespace
Uniform/interface blocks are a separate namespace from types.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:12 -07:00
Jordan Justen
3919c19468 glsl_symbol_table: add interface block namespaces
For interface blocks, there are three separate namespaces for
uniform, input and output blocks.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:12 -07:00
Jordan Justen
9368604d99 glsl parser: allow in & out for interface block members
Previously uniform blocks allowed for the 'uniform' keyword
to be used with members of a uniform block. With interface
blocks 'in' can be used on 'in' interface block members and
'out' can be used on 'out' interface block members.

The basic_interface_block rule will verify that the same
qualifier type is used with the block and each member.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:11 -07:00
Jordan Justen
067cc08d6a glsl ast_to_hir: reject interpolation qualifiers for uniform blocks
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:11 -07:00
Jordan Justen
4410eba598 glsl parser: handle interface block member qualifier
An interface block member may specify the type:
in {
    in vec4 in_var_with_qualifier;
};

When specified with the member, it must match the
type of the interface block.

It can also omit the qualifier:
uniform {
    vec4 uniform_var_without_qualifier;
};

When the type is not specified with the member,
it will adopt the same type as the interface block.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:11 -07:00
Jordan Justen
4369acff5e glsl parser: on desktop GL require GLSL 150 for instance names
Interface blocks in GLSL 150 allow an instance name to be used.

v2:
 * use state->check_version

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:11 -07:00
Jordan Justen
d36cb3617c glsl parser: reject VS+in & FS+out interface blocks
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:11 -07:00
Jordan Justen
6d3d974e37 glsl: parse in/out types for interface blocks
Previously only 'uniform' was allowed for uniform blocks.

Now, in/out can be parsed, but it will only be allowed for
GLSL >= 150.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:11 -07:00
Jordan Justen
744c270406 glsl parser: rename uniform block to interface block
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:11 -07:00
Jordan Justen
c9f58544be glsl: rename ast_uniform_block to ast_interface_block
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:11 -07:00
Chris Forbes
7bfb4bea65 i965: Enable guardband clipping on Gen4/5.
Enables guardband clipping when the viewport covers the entire render
target.

No piglit regressions on Ironlake.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-24 08:00:47 +12:00
Chris Forbes
a3d8e7c57c ARB_fp: accept duplicate precision options
Relaxes the validation of

   OPTION ARB_precision_hint_{nicest,fastest};

to allow duplicate options. The spec says that both /nicest/ and
/fastest/ cannot be specified together, but could be interpreted
either way for respecification of the same option.

Other drivers (NVIDIA etc) accept this, and at least one Unity3D game
expects it to succeed (Kerbal Space Program).
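
A rough sketch of the relaxed validation, assuming a single stored hint
per program. The enum precision_hint and the function
set_precision_hint are hypothetical names; the actual program parser
code is structured differently:

```cpp
#include <cassert>

// Hypothetical representation of the ARB_precision_hint_* options.
enum class precision_hint { none, nicest, fastest };

// Applies `requested` on top of `current`.  Respecifying the same hint is
// accepted; asking for the other hint after one is already set is rejected.
bool set_precision_hint(precision_hint &current, precision_hint requested)
{
   assert(requested != precision_hint::none);

   if (current == precision_hint::none || current == requested) {
      current = requested;
      return true;
   }
   return false;   // nicest and fastest cannot be specified together
}
```

A program that repeats OPTION ARB_precision_hint_nicest; is accepted
under this rule, while one that mixes nicest and fastest still fails.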

V2: Add spec quote.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-24 07:50:51 +12:00
Vinson Lee
e3eeb72f24 ilo: Initialize need_flush in draw_vbo.
need_flush was uninitialized if hw3d->new_batch was true.

Fixes "Uninitialized scalar variable" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
2013-05-23 15:31:42 +08:00
Vinson Lee
36e2c7cc1a radeon: Initialize variables in radeon_llvm_context_init.
'type' was not fully initialized when calling lp_build_context_init.

Fixes "Uninitialized scalar variable" defect reported by Coverity.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-05-22 23:06:23 -07:00
Eric Anholt
cf37e12024 intel: Count fragments in our blitter-based glBitmap() path.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59440
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-22 14:35:44 -07:00
Eric Anholt
0af614727a i965: Shut up more compiler warnings from vector insert/extract changes.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-22 14:35:25 -07:00
Roland Scheidegger
2b291eaa90 softpipe: change TEX_TILE_SIZE and NUM_TEX_TILE_ENTRIES
Initially we had NUM_TEX_TILE_ENTRIES of 50, however this was using too much
memory (mostly because the tile cache is operating on fixed max current
sampler views which could be fixed but that's another topic). So it was
decreased to 4. However this is a ridiculously low number which can't
actually really work (the number of tiles needed for as little as
a single quad with linear_mipmap_linear is 2 to 8 for a 2d texture, and
4 to 16 for a 3d texture), as it just about guarantees there will be
cache thrashing sometimes (just about always for 3d textures in fact, since
while there are 4 entries the cache is direct mapped).
So increase that number to 16 (which is still on the low side for direct
mapped cache though I guess using something like 4-way associativity would
be more effective than increasing this further) which has at least some good
chance to avoid thrashing. Since we don't want to increase memory requirements
however in turn decrease the tile size accordingly from 64 to 32 (as a bonus
point this also decreases the cost of texture thrashing which might still
happen sometimes).
I've seen performance improvement on the order of a factor of ~200 (specifically,
drawing the first frame from the replay from bug 41787 needs "only" ~10s
instead of ~30min, meaning I can actually compare the output with other
drivers...) with this.
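
The memory argument above can be checked with a little arithmetic. This
is only a sketch: tile_cache_config, texels_cached and slot_for_tile are
hypothetical names, the count is in texels rather than bytes, and the
hash is a simple stand-in rather than softpipe's:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of a texture tile cache configuration.
struct tile_cache_config {
   std::size_t tile_size;     // tiles are tile_size x tile_size texels
   std::size_t num_entries;   // number of tiles the cache can hold
};

// Total texels held by one cache, ignoring per-tile bookkeeping.
constexpr std::size_t texels_cached(tile_cache_config c)
{
   return c.num_entries * c.tile_size * c.tile_size;
}

constexpr tile_cache_config old_cfg{64, 4};    // before this change
constexpr tile_cache_config new_cfg{32, 16};   // after this change

// 4 * 64 * 64 == 16 * 32 * 32, so the footprint stays the same.
static_assert(texels_cached(old_cfg) == texels_cached(new_cfg),
              "tile storage should not grow");

// Direct-mapped lookup: every tile coordinate maps to exactly one slot,
// so two tiles that hash to the same slot evict each other.
std::size_t slot_for_tile(tile_cache_config c,
                          std::size_t tile_x, std::size_t tile_y)
{
   assert(c.num_entries > 0);
   return (tile_x + tile_y * 7) % c.num_entries;
}
```

With this stand-in hash, tiles (0,0) and (4,0) collide in the 4-entry
cache but map to different slots in the 16-entry one.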

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-22 22:57:27 +02:00
Roland Scheidegger
2f567fb7b5 softpipe: disambiguate TILE_SIZE / TEX_TILE_SIZE
These can be different (just like NUM_TEX_TILE_ENTRIES / NUM_ENTRIES),
though currently they aren't.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-22 22:57:27 +02:00
Roland Scheidegger
80e2cc0f97 llvmpipe: disable simple_shader optimization
This optimization disabled mask checks if the shader is simple enough.
While this should work correctly, the problem is that it can hide real issues
because shaders in practice are usually complex enough (8 instructions or 1
texture is already enough) so this doesn't get used, whereas dumbed-down
tests which should hit all the same code paths suddenly do something quite
different. This was the reason that bug 41787 could not be easily tracked as
stencil test not working correctly (piglit would in fact have failed some
tests without that optimization).
So disable it for now, it's unclear if it's much of a win in any case.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-22 22:57:27 +02:00
Roland Scheidegger
e108716429 llvmpipe: fix early depth test / late depth write stencil issues
We actually did early depth/stencil test and late depth/stencil write even
when the shader could kill the fragment (alpha test or discard). Since it
matters for the new stencil value if the fragment is killed by depth/stencil
test or by the shader (in which case it will not reach the depth/stencil
test) this simply cannot work (we also would possibly skip writing the new
stencil value due to mask checks but this is a secondary issue).
So use late depth test / late depth write instead in this case.
(No piglit changes as it doesn't seem to hit such bogus early depth test
/ late depth write path.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-22 22:57:27 +02:00
Roland Scheidegger
82d7733b52 llvmpipe: fix issue with not writing new stencil values
We did mask checks between depth/stencil testing and depth/stencil write.
This meant that if the depth/stencil test killed off all fragments we never
actually wrote the new stencil value. This issue affected all early/late
test/write combinations.
So move the mask check after depth/stencil write (for early depth test,
could do the same for late depth test but might not be worth it at that
point so just skip it there).
This addresses https://bugs.freedesktop.org/show_bug.cgi?id=41787.
Piglit does not hit this issue because of the simple_shader optimization
in generate_fs_loop() which means we're skipping the mask checks.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-22 22:57:27 +02:00
Roland Scheidegger
3c91ef0f29 llvmpipe: (trivial) remove confusing code in stencil test
This was meant to disable some code which isn't needed when depth/stencil
isn't written. However, there's more code which wouldn't be needed in that
case so having the condition there was just odd (llvm will drop all the code
anyway).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-22 22:57:27 +02:00
Roland Scheidegger
5314f5d829 llvmpipe: fix bug in early depth test / late depth write handling
The wrong type was used if the format was less than 32 bits.
No piglit changes as it doesn't hit that path.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-22 22:57:27 +02:00
Alexander von Gluck IV
6d20e251f2 Haiku: Add Gallium winsys and target code
* We generate a static library for Haiku
  Gallium targets, as our port system combines
  the compiled rendering code into a modular
  ar for each module (for example, our port
  system combines llvm libsoftpipe.a libllvmpipe.a
  into a single ar for the Haiku build system).
  I'd like the Gallium hgl target scons build
  system to do this some day; however, how to do
  that is beyond me at the moment. This is a first step.
2013-05-22 14:31:44 -05:00
Chia-I Wu
ff68f61bed ilo: set more fields of 3DSTATE_DEPTH_BUFFER
Set lod/layer related fields of 3DSTATE_DEPTH_BUFFER.  Since we always point
to a single level/layer, those fields are always zero and this commit
effectively makes no change.

While at it, make it easier to disable manual slice offset calculation.
2013-05-22 20:25:57 +08:00
Chia-I Wu
f3da711bea ilo: correctly set view extent in SURFACE_STATE
The view extent was set to be the same as the depth while it should be set to
the number of layers.  It makes a difference for 3D textures.

Also use this as a chance to clean up the code.
2013-05-22 18:12:01 +08:00
Chia-I Wu
bbb30398e5 ilo: avoid unnecessary emission of SO states
No need to emit 3DSTATE_SO_BUFFER and 3DSTATE_SO_DECL_LIST when SO is
disabled.  As the implicit flush done by the commands is also gone, emit an
explicit flush.
2013-05-22 18:09:17 +08:00
Eric Anholt
08f87ac333 i965: Skip etc-to-rgb transcode on BayTrail.
The hardware does it, so no need for this workaround.

Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-20 23:04:32 -07:00
Eric Anholt
c245efe7e8 mesa: Remove extension checking from ChooseTexFormat.
This should already be handled by _mesa_base_tex_format() calls in
TexImage*.
2013-05-21 15:20:28 -07:00
Eric Anholt
36e7c01101 mesa: Add ChooseTexFormat support for the new XBGR formats. 2013-05-21 15:20:28 -07:00
Kenneth Graunke
b29381567a i965: Split BeginTransformFeedback hook into Gen6 and Gen7+ variants.
Most of the work in BeginTransformFeedback is only necessary on Gen6.
We may as well just skip it on Gen7+.

v2: Add an intel->gen == 6 assert.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-21 13:29:40 -07:00
Kenneth Graunke
64a87f29ce i965: Kill software primitive counting entirely.
Now that we have hardware contexts, we don't need to continually
reprogram the GS_SVBI_INDEX registers.  They're automatically saved and
restored with the context, so they can just increment over time.  We
only need to reset them when starting transform feedback.

There's also no reason to delay until the next drawing operation; we can
just emit the packet immediately.  However, this means we must drop the
initialization in brw_invariant_state, as BeginTransformFeedback may
occur before the first drawing in a context.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-21 13:29:27 -07:00
Kenneth Graunke
647fc0c50b i965: Remove software geometry query code.
EXT_transform_feedback isn't yet supported on Gen4-5, so none of this
query code is actually used.  This also means we can remove some of the
surrounding support code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-21 13:29:25 -07:00
Kenneth Graunke
b863d44451 i965: Delete unused brw->sol.offset_0_batch_start field.
This was only used for the non-hardware context code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-21 13:29:24 -07:00
Kenneth Graunke
eaa6fbe6d5 i965: Stop using the kernel SOL reset feature.
We can just do it ourselves with MI_LOAD_REGISTER_IMM.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-21 13:29:22 -07:00
Kenneth Graunke
6837ebd00f i965: Remove dead code for Gen7 SOL without hardware contexts.
Failing to get a hardware context now means failing to load the driver,
so this code will never get hit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-21 13:29:19 -07:00
Kenneth Graunke
58765bb481 i965: Add a macro for accessing the SO_WRITE_OFFSET[0-3] registers.
Using a function-like macro makes it easy to loop over all four streams.
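
A small sketch of the idea. The macro name EXAMPLE_SO_WRITE_OFFSET and
its base address and stride are placeholders invented for this example,
not the values from the i965 headers:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Placeholder base address and stride, for illustration only.
#define EXAMPLE_SO_WRITE_OFFSET(n) (0x1000u + (n) * 4u)

// Register address for one of the four streams.
std::uint32_t so_write_offset_register(unsigned stream)
{
   assert(stream < 4);
   return EXAMPLE_SO_WRITE_OFFSET(stream);
}

// Because the macro takes the stream index, all four registers can be
// visited with a plain loop instead of four separately named defines.
std::array<std::uint32_t, 4> all_so_write_offset_registers()
{
   std::array<std::uint32_t, 4> regs{};
   for (unsigned i = 0; i < 4; ++i)
      regs[i] = so_write_offset_register(i);
   return regs;
}
```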

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-21 13:29:06 -07:00
Ian Romanick
0ba1e65fb6 docs: Import 9.1.3 release notes, add news item.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-21 13:16:56 -07:00
Michel Dänzer
d42a2df19c radeonsi: Fix user clip planes
4 more little piglits.

NOTE: This is a candidate for the 9.1 branch.
2013-05-21 17:50:13 +02:00
Michel Dänzer
e3befbca5e radeonsi: Handle TGSI_SEMANTIC_CLIPVERTEX
17 more little piglits.

NOTE: This is a candidate for the 9.1 branch.
2013-05-21 17:50:13 +02:00
Michel Dänzer
eb19163a4d radeonsi: Initial support for multiple constant buffers
Just enough to support an additional internal constant buffer for the user
clip planes.

NOTE: This is a candidate for the 9.1 branch.
2013-05-21 17:50:12 +02:00
Michel Dänzer
4730dea5f5 radeonsi: Fix handling of TGSI_SEMANTIC_PSIZE
Two more little piglits.

NOTE: This is a candidate for the 9.1 branch.
2013-05-21 17:50:12 +02:00
Marek Olšák
2eac0aa1d8 radeonsi: increase array size for shader inputs and outputs
and add assertions to prevent buffer overflow. This fixes corruption
of the si_shader struct.

NOTE: This is a candidate for the 9.1 branch.

[ Cherry-pick of r600g commit da33f9b919 ]

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-05-21 17:47:44 +02:00
Brian Paul
9772284df2 xlib: check for null ctx pointer in glXIsDirect()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64745
Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-21 07:35:12 -06:00
Brian Paul
1e9875acbe st/glx/xlib: check for null ctx pointer in glXIsDirect()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64745
Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-21 07:35:12 -06:00
José Fonseca
8cabc7be1d scons: Don't force stabs debug format for Mingw.
- recent gdb handles DWARF fine (tested both with version
  7.1.90.20100730 from mingw-w64 project, and 7.5-1 from mingw project)

- http://people.freedesktop.org/~jrfonseca/bfdhelp/ was updated to
  handle DWARF

- stabs requires ugly hacks to prevent compilation failures

- mixing stabs/dwarf prevents proper backtraces (which is inevitable,
  given that the MinGW C runtime is pre-built with DWARF)

For example, without this change I get:

  (gdb) bt
  #0  _wassert (_Message=0xf925060 L"Num < NumOperands && \"Invalid child # of SDNode!\"",
      _File=0xf60b488 L"llvm/include/llvm/CodeGen/SelectionDAGNodes.h", _Line=534)
      at ../../../../mingw-w64-crt/misc/wassert.c:51
  #1  0x0368996b in _assert (_Message=0x39d7ee4 "Num < NumOperands && \"Invalid child # of SDNode!\"",
      _File=0x39d7e94 "llvm/include/llvm/CodeGen/SelectionDAGNodes.h", _Line=534)
      at ../../../../mingw-w64-crt/misc/wassert.c:44
  #2  0x00000004 in ?? ()
  #3  0x00000004 in ?? ()
  #4  0x0f60b488 in ?? ()
  #5  0x00000000 in ?? ()

While with this change I get:

  (gdb) bt
  #0  _wassert (_Message=0xfb982e8 L"Num < NumOperands && \"Invalid child # of SDNode!\"",
      _File=0xefbcb40 L"llvm/include/llvm/CodeGen/SelectionDAGNodes.h", _Line=534)
      at ../../../../mingw-w64-crt/misc/wassert.c:51
  #1  0x039c996b in _assert (_Message=0x3d17f24 "Num < NumOperands && \"Invalid child # of SDNode!\"",
      _File=0x3d17ed4 "llvm/include/llvm/CodeGen/SelectionDAGNodes.h", _Line=534)
      at ../../../../mingw-w64-crt/misc/wassert.c:44
  #2  0x033111cc in getOperand (Num=4, this=<optimized out>)
      at llvm/include/llvm/CodeGen/SelectionDAGNodes.h:534
  #3  getOperand (i=4, this=<optimized out>)
      at llvm/include/llvm/CodeGen/SelectionDAGNodes.h:779
  #4  llvm::SelectionDAG::getNode (this=0xf00cb08, Opcode=79, DL=..., VT=..., N1=..., N2=...)
      at llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:2859
  #5  0x03377b20 in llvm::SelectionDAGBuilder::visitExtractElement (this=0xfb45028, I=...)
      at llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:2803
  [...]

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-21 12:34:19 +01:00
Chia-I Wu
2b7463cf3a ilo: use BLT engine to copy between textures
Emit XY_SRC_COPY_BLT to do the job.  Since ETC1 textures cannot be mapped for
reading, as is required by util_copy_resource_region, this fixes copying of
ETC1 textures.
2013-05-21 12:02:55 +08:00
Chia-I Wu
c44ebb4ef4 ilo: use BLT engine to copy between buffers
Emit (possibly multiple) SRC_COPY_BLT to copy between buffers of arbitrary
sizes.
2013-05-21 11:47:20 +08:00
Chia-I Wu
731cafe7b2 ilo: refactor blitter_xy_color_blt()
Add gen6_XY_COLOR_BLT() and let blitter_xy_color_blt() call the function.  Not
sure if this path is still being hit by any application.
2013-05-21 11:47:20 +08:00
Chia-I Wu
0d42a9e941 ilo: replace cp hooks by cp owner and flush callback
The problem with cp hooks is that when we switch from 3D ring to 2D ring, and
when there are active queries, we will emit 3D commands to 2D ring because
the new-batch hook is called.

This commit introduces the idea of a cp owner.  When the cp is flushed, or when
another owner takes over, the current owner is notified, giving it a chance
to emit whatever commands are needed.  With this mechanism, we can
resume queries when the 3D pipeline owns the cp, and pause queries when it
loses the cp.  Ring switch will just work.

As we still need to know when the cp bo is reallocated, a flush callback is
added.
2013-05-21 11:47:20 +08:00
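The cp owner idea described in the commit above can be sketched roughly as follows. This is only an illustration under assumptions: `struct cp`, `struct cp_owner`, `cp_set_owner`, `cp_flush`, and `pause_queries` are hypothetical names, not the actual ilo interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical owner record: a callback invoked when ownership is lost. */
struct cp_owner {
   void (*release)(void *data);
   void *data;
};

/* Hypothetical command parser state; owner is NULL when nobody owns it. */
struct cp {
   const struct cp_owner *owner;
};

/* Hand the cp to a new owner, notifying the previous one first. */
static void
cp_set_owner(struct cp *cp, const struct cp_owner *owner)
{
   assert(cp != NULL);
   if (cp->owner == owner)
      return;
   if (cp->owner && cp->owner->release)
      cp->owner->release(cp->owner->data);
   cp->owner = owner;
}

/* Flushing also takes ownership away, so the owner can emit final commands. */
static void
cp_flush(struct cp *cp)
{
   cp_set_owner(cp, NULL);
   /* a real implementation would submit the batch here */
}

/* Example release callback: counts how many times queries were paused. */
static void
pause_queries(void *data)
{
   *(int *)data += 1;
}
```

With this shape, a ring switch only has to call `cp_set_owner()` for the new owner; the 3D pipeline's callback pauses its queries before any 2D commands are emitted.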
Chia-I Wu
a04d8574c6 ilo: hardware contexts are only for the render ring
The hardware context should not be passed for bo execution when the ring is
not the render ring.  Rename hw_ctx to render_ctx for clarity.
2013-05-21 11:47:19 +08:00
Chia-I Wu
1ed7b825cf ilo: update format mappings
Add more PIPE_FORMAT -> BRW_SURFACEFORMAT mappings, and update
surface_format_info from i965.
2013-05-21 11:47:19 +08:00
Chia-I Wu
bd8090a5af ilo: update headers from i965
Mainly for MI_LOAD_REGISTER_IMM and BCS_SWCTRL.
2013-05-21 11:47:19 +08:00
Anuj Phogat
06cd89a88c i965: Fix build failure
meta.h should be included in brw_state_upload.c to get access to
function _mesa_meta_in_progress().
2013-05-20 16:15:57 -07:00
Kenneth Graunke
f09b91f782 i965: Implement transform feedback query support in hardware on Gen6+.
Now that we have hardware contexts and can use MI_STORE_REGISTER_MEM,
we can use the GPU's pipeline statistics counters rather than going out
of our way to count primitives in software.

Aside from being simpler, this also paves the way for Geometry Shaders,
which can output an arbitrary number of primitives on the GPU.  It will
also allow us to use hardware primitive restart when these queries are
in use.

The GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query is easy: it
corresponds to the SO_NUM_PRIMS_WRITTEN/SO_NUM_PRIMS_WRITTEN0_IVB
counters.

The GL_PRIMITIVES_GENERATED query is trickier.  Gen provides several
statistics registers which /almost/ match the semantics required:
- IA_PRIMITIVES_COUNT
  The number of primitives fetched by the VF or IA (input assembler).
  This undercounts when GS is enabled, as it can output many primitives.
- GS_PRIMITIVES_COUNT
  The number of primitives output by the GS.  Unfortunately, this
  doesn't increment unless the GS unit is actually enabled, and it
  usually isn't.
- SO_PRIM_STORAGE_NEEDED*_IVB
  The amount of space needed to write primitives output by transform
  feedback.  These naturally only work when transform feedback is on.
  We'd also have to add the counters for all four streams.
- CL_INVOCATION_COUNT
  The number of primitives processed by the clipper.  This doesn't work
  if the GS or SOL throw away primitives for rasterizer discard.
  However, it does increment even if the clipper is in REJECT_ALL mode.

Dynamically switching between counters would be painfully complicated,
especially since GS, rasterizer discard, and transform feedback can all
be switched on and off repeatedly during a single query.

The most usable counter is CL_INVOCATION_COUNT.  The previous two
patches reworked rasterizer discard support so that all primitives hit
the clipper, making this work.

v2: Occlusion query bug fixes removed and squashed in earlier patches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:18 -07:00
Kenneth Graunke
037a901a5b i965: Handle rasterizer discard in the clipper rather than GS on Gen6.
This has more of a negative impact than the previous patch, as on Gen6
passing primitives through to the clipper means we actually have to make
the GS thread write them to the URB.

I don't see another good solution though, and rasterizer discard is not
the most common of cases, so hopefully it won't be too terrible.

v2: Add a perf_debug; resolve rebase conflicts on the brw dirty flags;
    remove the rasterizer_discard field from brw_gs_prog_key.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:18 -07:00
Kenneth Graunke
d1e4e9960c i965: Handle rasterizer discard in the clipper rather than SOL on Gen7.
In order to implement the GL_PRIMITIVES_GENERATED query in a sane
fashion on our hardware, we can't discard primitives until the clipper.
The patch after next explains the rationale.

By setting the clipper to REJECT_ALL mode, all primitives get thrown away,
so rendering is still appropriately disabled.

This may negatively impact performance in the rasterizer discard case,
but it's unclear how much and this hasn't been observed to be a
bottleneck in any application we've looked at.  The clipper is the very
next stage in the pipeline, so I don't think it will be terrible.

v2: Add a perf_debug; resolve rebase conflicts on the brw dirty flags.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:18 -07:00
Kenneth Graunke
5ebe9523f9 i965: Disable clipper statistics when meta operations are in progress.
We don't currently use the clipper statistics, but we'll soon use
CL_INVOCATION_COUNT to implement the GL_PRIMITIVES_GENERATED query.
The number of primitives generated is not supposed to be altered during
operations such as glGenerateMipmap.

Prevents spec/EXT_transform_feedback/generatemipmap prims_generated
from breaking when we start using pipeline statistics registers to
implement the GL_PRIMITIVES_GENERATED query in a few commits.

v2: Use the BRW_NEW_META_IN_PROGRESS flag for correct state handling.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:18 -07:00
Kenneth Graunke
b96f93c453 i965: Create a BRW_NEW_META_IN_PROGRESS state flag.
This will allow us to disable statistics during meta operations.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-20 13:03:18 -07:00
Kenneth Graunke
bbf86712f8 i965: Add #defines for the pipeline statistics counter registers.
These come from the Ivybridge PRM, Volume 1, Part 3.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:18 -07:00
Kenneth Graunke
e32cd5ffbb i965: Rely on hardware contexts for query objects on Gen6+.
Hardware contexts greatly simplify the query object code.  The pipeline
statistics counters get saved and restored with the context, which means
that we don't need to worry about other workloads polluting them.

This means that we can simply write a single pair of values (one at
BeginQuery and one at EndQuery) rather than a series of pairs.  This
also means we don't need to worry about the BO getting full.  We also
don't need to delay BO allocation and starting snapshot until the first
draw.

The generation split here is a little off: technically, Ironlake can also
support hardware contexts.  However, the kernel currently doesn't, and
even if it were to do so someday, we'd need to wait a while before
bumping the kernel requirement to take advantage of it.

v2: Incorporate Paul's feedback.
- Clarify which functions are Gen4/5-only via assertions and comments.
- Change how driver hook initialization happens.
- Update comments.
- Squash a bug fix from a later commit here where it belongs.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:18 -07:00
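The "single pair of values" scheme from the commit above amounts to one subtraction once both snapshots are available. A minimal sketch, assuming 64-bit counters; `struct query_snapshots` and `query_result` are hypothetical names, not the i965 code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: one counter snapshot at BeginQuery, one at EndQuery. */
struct query_snapshots {
   uint64_t begin;
   uint64_t end;
};

/* Unsigned subtraction stays correct even if the counter wrapped once. */
static uint64_t
query_result(const struct query_snapshots *q)
{
   assert(q != 0);
   return q->end - q->begin;
}
```

Because the counters are saved and restored with the hardware context, no other workload contributes to the difference, which is why a series of pairs is no longer needed.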
Kenneth Graunke
72b1e440dd i965: Disable pixel statistics in BLORP.
BLORP is used for operations like glClear, glCopyTexImage, and
glBlitFramebuffer which aren't supposed to contribute fragments toward
occlusion queries.

This prevents Piglit tests from breaking in the next commit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:17 -07:00
Kenneth Graunke
92d2f5acfa i965: Require hardware contexts (and thus Kernel 3.6) on Gen6+.
Hardware contexts are necessary to reasonably support OpenGL 3.2.
In particular, we currently maintain software counters for transform
feedback buffer offsets and counters, which relies on knowing the number
of primitives generated.  Geometry shaders violate that assumption.

At the time of writing, Debian has moved to Kernel 3.8, which means most
people probably have a newer kernel by now.  It's also worth noting that
this patch won't land until Mesa 10 which is currently targeted for
September.  By that point, even more people will have a newer kernel.

Also, don't bother trying to allocate contexts on pre-Gen6, as it
currently will always fail, and if this changes in the future, we'll
need to reevaluate our hw_ctx/gen checks.

This patch leaves the code for flagging BRW_NEW_CONTEXT on new
batchbuffers if hw_ctx == NULL since that still occurs pre-Gen6.

Also remove the Gen7+ check for kernel 3.3, since it's now redundant.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:17 -07:00
Kenneth Graunke
50e60bf8da i965: Bump kernel requirement to 3.3 on Ivybridge.
Kernel 3.3 introduced the SOL reset execbuf parameter, needed for GL 3.0
on Ivybridge.  Bumping the requirement will give an obvious error
message rather than simply reporting GL 2.1.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:17 -07:00
Vincent Lejeune
9fd7ea786c r600g/llvm: fix cubemap lod/bias 2013-05-20 20:23:19 +02:00
Vincent Lejeune
9a95fb1605 r600g/llvm: Fix texelFetchOffset-2D 2013-05-20 20:23:14 +02:00
Vincent Lejeune
32c9cbb38f r600g/llvm: Fix cubearray textureSize 2013-05-20 20:23:09 +02:00
Vincent Lejeune
9c2943601e r600g/llvm: Factorize code loading from const buffer. 2013-05-20 20:23:04 +02:00
Kenneth Graunke
01b79b2e3b i965: Add cases for ir_triop_vector_insert that assert.
brw_link_shader() unconditionally calls lower_vector_insert() with true
as the second parameter.  This means that both constant and variable
indexed expressions will get lowered, so we should never see this in the
backend.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-20 10:19:48 -07:00
Kenneth Graunke
e1e8876797 i965: Add cases for ir_binop_vector_extract that assert.
do_vec_index_to_swizzle() should remove any vector extract operations
with a constant index.  It's unconditionally called from
do_common_optimization().

do_vec_index_to_cond_assign() should remove the rest, and it is
unconditionally called from brw_link_shader().  This means that we
should never see ir_binop_vector_extract in the backend.

Silences compiler warnings.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-20 10:19:30 -07:00
Roland Scheidegger
f6beb4c6b6 llvmpipe: enable z32s8x24 format
Now that we can handle it both for sampling and as depth/stencil enable it.
Passes nearly all additional piglit tests which are now performed, with two
exceptions (one being a framebuffer blit which fails for all other formats
including stencil too as we don't support stencil blits, the other reporting
an unexpected GL error so doesn't look to be llvmpipe's fault).
2013-05-18 00:32:45 +02:00
Roland Scheidegger
070a9afb54 llvmpipe: handle z32s8x24 depth/stencil format
We need to split up the depth and stencil values in this case, and there's
some new logic required to handle float depth and stencil simultaneously.
Also make sure we get the 64bit zs clear values and masks propagated
correctly.
2013-05-18 00:32:33 +02:00
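Splitting a combined depth/stencil value, as the commit above describes, might look like the sketch below. It assumes a 64-bit texel whose first 32-bit word holds the float depth and whose second word carries the stencil in its low 8 bits; that layout and the name `unpack_z32f_s8x24` are assumptions for illustration, not taken from llvmpipe.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Split one 64-bit texel (two 32-bit words) into float depth and stencil. */
static void
unpack_z32f_s8x24(const uint32_t texel[2], float *depth, uint8_t *stencil)
{
   assert(sizeof(float) == sizeof(uint32_t));
   /* memcpy reinterprets the bits without violating aliasing rules */
   memcpy(depth, &texel[0], sizeof(float));
   /* assumed layout: stencil in the low byte, upper 24 bits unused */
   *stencil = (uint8_t)(texel[1] & 0xff);
}
```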
Roland Scheidegger
f3ad716e8f llvmpipe: get rid of unused tiled/linear logic
We have been rendering to linear color buffers for quite some time, and since
switching to linear depth buffers all the tiled/linear logic was unused.
So get rid of (most) of it - there's still some LAYOUT_NONE things and
late allocation of resources which probably could be simplified.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-18 00:32:27 +02:00
Roland Scheidegger
87978518e9 llvmpipe: fix bogus handling of first_layer when setting up texture sampling
The code avoided first_layer parameter in the sampler interface (and needing
to do another calculation at runtime) by fixing up the base texture pointer
instead. Unfortunately, this didn't actually work as we have mip-first
texture layout so fixing up the base ptr by a fixed amount is very wrong if
there are mipmaps present. The wrong offsets caused misrendering and crashes.
Fix this by just adjusting the individual mip level offsets instead.
Spotted by Jose.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-18 00:32:18 +02:00
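The fix in the commit above can be pictured as follows. With a mip-first layout every level stores all of its layers together, so the byte distance between layers differs per level and a single base-pointer adjustment cannot be right. This is a sketch under assumptions; `apply_first_layer` and its parameters are hypothetical, not the llvmpipe code.

```c
#include <assert.h>
#include <stdint.h>

/* Offset each mip level by first_layer layers of that level's own size. */
static void
apply_first_layer(uint32_t mip_offsets[], const uint32_t layer_stride[],
                  unsigned num_levels, unsigned first_layer)
{
   assert(mip_offsets != 0 && layer_stride != 0);
   for (unsigned level = 0; level < num_levels; level++)
      mip_offsets[level] += first_layer * layer_stride[level];
}
```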
Roland Scheidegger
d7e811c0b0 gallivm: handle z32s8x24 format for sampling
Since we can only sample either depth or stencil but not both, only load
the required bits which makes things a bit easier (it requires special
handling since the format doesn't fit into 32bit).
The logic for deciding if depth or stencil should be sampled is a bit odd,
but seems to be what other drivers and statetrackers do: if it's a format with
both depth and stencil (or just with depth) then sample depth, for sampling
stencil a sampler view format with only stencil is required.
Also while here fix up stencil sampling for other formats as well, though
this isn't supported by mesa (ARB_stencil_texturing), and while blits would
use it they don't work either since they'd also need stencil export.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-18 00:31:49 +02:00
Roland Scheidegger
0346e9b3bb st/mesa: fix weird UCMP opcode use for bool ubo load
I don't know what this code was trying to do but whatever it was it couldn't
have worked since negation of integer boolean inputs while not specified as
outright illegal (not yet at least) won't do anything since it doesn't affect
the result of comparison with zero at all. In fact it looks like the whole
instruction can just be omitted.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-05-18 00:31:49 +02:00
Eric Anholt
a5b0452400 mesa: Make FinishRenderTexture just take the renderbuffer being finished.
Now that the rb has a reference to the teximage, we didn't need anything
else out of the attachment.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-17 13:04:05 -07:00
Eric Anholt
e98c39c109 mesa: Track the TexImage being rendered to in the gl_renderbuffer.
We keep having to pass the attachments around with our gl_renderbuffers
because that's the only way to find what the gl_renderbuffer actually
refers to.  This is a step toward removing that (though drivers still need
the Zoffset as well).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-17 13:04:05 -07:00
Eric Anholt
7b085d1bfa radeon: Remove dead radeon_wrap_texture().
I should have killed this in my previous cleanup.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-17 13:04:04 -07:00
Eric Anholt
c810e67c55 mesa: Make gl_renderbuffers backed by EGL images use FinishRenderTexture.
This is the opportunity that radeon and intel drivers rely on for flushing
render targets that may get reused as textures.  Before EGL, that only
happened for GL_TEXTURE attachments.

Fixes piglits:
KHR_gl_renderbuffer_image/renderbuffer-texture
OES_EGL_image/renderbuffer-texture

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-17 13:04:04 -07:00
José Fonseca
6166ffeaf7 gallivm: Eliminate 8.8 fixed point intermediates from AoS sampling path.
This change was meant as a stepping stone to using the PMADDUBSW SSSE3
instruction, but this refactoring by itself yields a 10% speedup on
texture-intensive shaders (e.g., Google Earth's ocean water w/o S3TC on
an Ivy Bridge machine) while yielding exactly the same results, whereas
PMADDUBSW only gave an extra 5%, at the expense of 2 bits of precision
in the interpolation.

I believe that the speedup of this change comes from the reduced register
pressure (as 8.8 fixed point intermediates take twice the space of 8bit
unorm).

Also, not dealing with 8.8 simplifies lp_bld_sample_aos.c code
substantially -- it's no longer necessary to have code duplicated for
low and high register halves.

Note about lp_build_sample_mipmap(): the path for num_quads > 1 is never
executed (as it is faster on AVX to split the 256bit wide texture
computation into two 128bit chunks, in order to leverage integer
opcodes).  This path might be useful in the future, so in order to
verify this change did not break that path I had to apply this change:

  @@ -1662,11 +1662,11 @@ lp_build_sample_soa(struct gallivm_state *gallivm,
         /*
          * we only try 8-wide sampling with soa as it appears to
          * be a loss with aos with AVX (but it should work).
          * (It should be faster if we'd support avx2)
          */
  -      if (num_quads == 1 || !use_aos) {
  +      if (/* num_quads == 1 || ! */ use_aos) {

            if (num_quads > 1) {
               if (mip_filter == PIPE_TEX_MIPFILTER_NONE) {
                  LLVMValueRef index0 = lp_build_const_int32(gallivm, 0);
                  /*

and then run texfilt mesademo:

  LP_NATIVE_VECTOR_WIDTH=256 ./texfilt

Ran whole piglit without regressions.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-17 20:23:00 +01:00
José Fonseca
5aaa4bafe0 gallivm: Add and use lp_build_lerp_3d.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-17 20:22:50 +01:00
Tom Stellard
e230d9debb radeon/llvm: Run standard optimization passes on compute shader modules
The SROA and function inliner passes are especially important, because
they optimize away unsupported features: functions and indirect
private memory access.
2013-05-17 07:38:01 -07:00
Kenneth Graunke
ccb041fe8e intel: Don't spam "intelReadPixels: fallback to swrast" in non-PBO case.
When an application is using PBOs, we attempt to use the BLT engine to
perform ReadPixels.  If that fails due to some restrictions, it's useful
to raise a performance warning.

In the non-PBO case, we always use a CPU mapping since getting the data
into client memory requires a CPU-side copy.  This is a very common case,
so raising a performance warning is annoying.  In particular, apitrace's
image dumping code hits this path, causing it to print hundreds of
thousands of performance warnings via ARB_debug_output.  This tends to
obscure actual errors or other important messages.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-16 22:35:01 -07:00
Paul Berry
46ea804107 intel: Do a depth resolve before copying images between miptrees.
When intel_finalize_mipmap_tree() calls intel_miptree_copy_teximage()
to reassemble a depth miptree that has been broken apart into pieces
(to deal with misalignment of levels/layers within the miptree), it
just copies the depth data, not the HiZ data.  This is reasonable,
since the alignment restrictions of HiZ are a large part of the reason
why the miptree had to be broken apart in the first place.  However,
in order for the depth copy to be sufficient, we need to do a depth
resolve first, to make sure any deferred depth writes that are in the
HiZ buffer get performed.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=64662 and
https://bugs.freedesktop.org/show_bug.cgi?id=64659.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-05-16 14:42:54 -07:00
Niels Ole Salscheider
7e17e72cb7 r600g: fixup for MSAA texture support checking
Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
2013-05-16 12:03:47 -07:00
José Fonseca
4f518e1738 llvmpipe: Temporary workaround to prevent segfault on array textures. 2013-05-16 15:14:10 +01:00
José Fonseca
cb9913cdab gallivm: Support pointers in lp_build_print_value().
Trivial.
2013-05-16 15:14:10 +01:00
Chia-I Wu
435aea6f32 ilo: emit 3DSTATE_STENCIL_BUFFER on GEN7+
Whether HiZ is enabled or not, separate stencil is supported and enforced on
GEN7+.  Now that we support separate stencil resources, we know how to emit
3DSTATE_STENCIL_BUFFER.
2013-05-16 18:33:59 +08:00
Chia-I Wu
6b894e6900 ilo: add support for stencil resources on GEN7+
For allocations, we need to support stencil-only and separate stencil
resources.  For mapping, we need to support software tiling and
packing/unpacking for separate stencil resources.
2013-05-16 18:20:17 +08:00
Chia-I Wu
5c9b69d259 winsys/intel: test for and expose address swizzling
Without knowing whether addresses are swizzled or not, we cannot manipulate a
tiled surface in CPU.
2013-05-16 11:24:59 +08:00
Marek Olšák
639d0f73c1 st/mesa: handle texture_from_pixmap and other surface-based textures correctly
There were 2 issues with it:
1) The texture format which should be used for texturing was only set
   in gl_texture_image::TexFormat, which wasn't used for sampler views.
2) Textures are sometimes reallocated under some circumstances
   in st_finalize_texture, which is unacceptable if the texture comes
   from a window system.

The issues are resolved as follows:
1) If surface_based is true (texture_from_pixmap, etc.), store the format
   in a new variable st_texture_object::surface_format.
2) Don't reallocate a surface-based texture in st_finalize_texture.

Also don't use st_ChooseTextureFormat in st_context_teximage, because
the format is dictated by the caller.

This fixes the glx-tfp piglit test.

Reviewed-by: Adam Jackson <ajax@redhat.com>
2013-05-15 20:22:48 +02:00
Marek Olšák
5a3fac4d26 r600g: cleanup MSAA texture support checking
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-05-15 20:20:32 +02:00
Marek Olšák
61c995bc47 r600g: rewrite FMASK allocation, fix FMASK texturing with 2 and 4 samples
This fixes and enables texturing with compressed MSAA colorbuffers
on Evergreen and Cayman. For the first time, multisample textures work
on Cayman.

This requires the libdrm flag RADEON_SURF_FMASK.

v2: require libdrm_radeon 2.4.45

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-05-15 20:19:45 +02:00
Eric Anholt
61506257f6 i965: Fill in brw_format_for_mesa_format for some non-rendering formats.
This should have no change on driver operation, but it means that when you
wonder why some format isn't supported natively, you can just look at the
table above, instead of wondering if maybe there's an appropriate entry in
the surface formats table that is already supported.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-15 09:43:46 -07:00
Eric Anholt
9db9bc3aa1 i965: Use native RGB_FLOAT16 support when available.
Previously we would expand it to RGBA_FLOAT16.  This format now comes out
as framebuffer incomplete, but it seems worth the memory savings if that's
what people are asking for (and GL3 does list it under "texture-only"
color formats)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-15 09:43:46 -07:00
Eric Anholt
645b610b62 intel: Add support for blitting 6 byte-per-pixel formats.
The next commit introduces what is apparently our first one, which tripped
over this in glReadPixels.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-15 09:43:45 -07:00
Eric Anholt
028c11e8e3 i965: Use the Mesa surface formats for float RGB surfaces.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-15 09:43:45 -07:00
Eric Anholt
2e057076a8 i965: Use the new XRGB UNORM formats.
This is a step on the way to removing some of our code for forcing alpha
to 1, but I want easy bisecting so I'll add groups of formats separately.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-15 09:43:45 -07:00
José Fonseca
2a43dfda95 draw: More defensive coding in DRAW_GET_IDX.
Doesn't make a difference ATM, but just in case.
2013-05-15 16:59:28 +01:00
José Fonseca
1883e1d3e9 draw: Fix vsplit regression when the ib can be used directly.
`ib` is no longer offset by `istart`.

Trivial.
2013-05-15 16:57:44 +01:00
Chris Forbes
53a5f11f0d mesa: Stop clamping stencil reference value at specification time
All drivers now clamp this to the appropriate range for the bound
stencil buffer when emitting stencil state.

NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-15 22:04:53 +12:00
Chris Forbes
978f91b829 swrast: Use accessor for stencil reference values
NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-15 22:04:53 +12:00
Chris Forbes
db8a84de87 st: Use accessor for stencil reference values
NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-15 22:04:53 +12:00
Chris Forbes
c411f40cba radeon: Use accessor for stencil reference values
V2: Drop spurious mask with 0xff.

NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-15 22:04:34 +12:00
Chris Forbes
7bbe9b78ae nouveau: Use accessor for stencil reference values
NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-15 22:01:08 +12:00
Chris Forbes
f819ec46d5 intel: Use accessor for stencil reference values
NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-15 22:01:06 +12:00
Chris Forbes
96a1bf1ba3 mesa: Use accessor for stencil reference values in glGet
NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-15 22:01:03 +12:00
Chris Forbes
38f65162af mesa: add accessor for effective stencil ref
Clamps the stencil reference value to the range representable in the
currently-bound draw framebuffer's stencil attachment.

V2: Add spec quote.

NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-15 22:00:55 +12:00
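The clamping performed by the accessor in the commit above can be illustrated with a small helper. This is a sketch, not Mesa's function: `clamp_stencil_ref` is a hypothetical name, and the stencil bit count is passed in directly instead of being read from the bound draw framebuffer.

```c
#include <assert.h>

/* Clamp a stencil reference value to [0, 2^stencil_bits - 1]. */
static int
clamp_stencil_ref(int ref, unsigned stencil_bits)
{
   assert(stencil_bits <= 16);
   const int max = (1 << stencil_bits) - 1;
   if (ref < 0)
      return 0;
   if (ref > max)
      return max;
   return ref;
}
```

Clamping at use time rather than at specification time means the stored value stays as the application set it, and the effective value follows whichever stencil buffer is bound when state is emitted.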
Chia-I Wu
c68424bac4 ilo: clean up transfer format conversion
Map the bo directly, instead of calling transfer_map().
2013-05-15 15:21:50 +08:00
Chia-I Wu
cb57da421a ilo: rework transfer mapping method choosing
Always check if a bo is busy in choose_transfer_method() since we always need
to map it in either map() or unmap().  Also determine how a bo is mapped in
choose_transfer_method().
2013-05-15 15:21:50 +08:00
Chia-I Wu
b6c307744f ilo: refactor transfer mapping
Add tex_get_box_offset() to compute transfer offset from the pipe_box.  Add
tex_get_slice_stride() to compute slice stride for a transfer.
2013-05-15 15:21:50 +08:00
Chia-I Wu
5af8641ce0 ilo: no writeback without PIPE_TRANSFER_WRITE
We should not write staging data back when PIPE_TRANSFER_WRITE is not set.
2013-05-15 15:08:54 +08:00
Chia-I Wu
46bb33bc21 ilo: minor cleanups for transfers
Rename some functions and reorder some code.
2013-05-15 15:08:54 +08:00
Chia-I Wu
ca349e0217 ilo: simplify ilo_texture_get_slice_offset()
Always return a tile-aligned offset.  Also fix for W tiling.
2013-05-15 15:08:54 +08:00
Zack Rusin
013424678e draw/gs: fix extracting of the clip
The indices are not consecutive when using the geometry shader,
which means we were extracting non existing values. Create
an array of linear indices and always use it instead of the passed
indices. Found by Jose.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-14 04:04:08 -04:00
Kenneth Graunke
a6961f391a docs: Mark a few things as in progress. 2013-05-14 12:22:40 -07:00
Zack Rusin
5104ed3dbf draw: try to prevent overflows on index buffers
Pass in the size of the index buffer, when available, and use it
to handle out of bounds conditions. The behavior in the case of
an overflow needs to be the same as with other overflows in the
vertex processing pipeline meaning that a vertex should still
be generated but all attributes in it set to zero.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-14 03:10:56 -04:00
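The behaviour described in the commit above (an out-of-bounds index still produces a vertex, but with all attributes zero) could be sketched like this. `struct vertex` and `fetch_vertex` are hypothetical, and the extra check against the vertex count is an assumption added to keep the sketch memory-safe; it is not taken from the draw module.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical vertex with a single four-component attribute. */
struct vertex {
   float attr[4];
};

/* Fetch the vertex referenced by index slot i, or a zeroed vertex on overflow. */
static void
fetch_vertex(const unsigned *indices, size_t index_count, size_t i,
             const struct vertex *vertices, size_t vertex_count,
             struct vertex *out)
{
   assert(out != NULL);
   memset(out, 0, sizeof(*out));
   if (i >= index_count)
      return;                 /* read past the end of the index buffer */
   const unsigned idx = indices[i];
   if (idx >= vertex_count)
      return;                 /* index points outside the vertex data */
   *out = vertices[idx];
}
```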
Zack Rusin
d5250da818 draw: use the total number of vertices for statistics
The number of vertices to fetch doesn't necessarily equal the
total number of input vertices, e.g. we might want to fetch
a single vertex but then draw it twice. Let's use the correct
number of input vertices in the statistics.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-14 03:10:33 -04:00
Zack Rusin
29853ab7b8 draw: don't crash on vertex buffer overflow
We would crash when stride was bigger than the size of the buffer.
The correct behavior is to just fetch zeros in this case.
Unfortunately with user_buffers there's no way to validate the size
because currently we're just not getting it. Adjust the draw interface
to pass the size along with the mapped buffer, which works perfectly
for buffer backed vertex_buffers and, in future, it will allow
us to plumb user_buffer sizes through the same interface.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-14 03:09:32 -04:00
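Once the buffer size travels with the mapped pointer, as in the commit above, a fetch can be validated before it touches memory. The following is a rough sketch with hypothetical names (`fetch_attrib` and its parameters), assuming a four-float attribute at the start of each element:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fetch a 4-float attribute for element `index`, or zeros if it would
 * read past buf_size bytes. */
static void
fetch_attrib(const uint8_t *buf, size_t buf_size, size_t stride,
             size_t index, float out[4])
{
   const size_t attrib_size = 4 * sizeof(float);

   assert(out != NULL);
   memset(out, 0, attrib_size);
   if (buf_size < attrib_size)
      return;
   /* Equivalent to index * stride + attrib_size <= buf_size, but written
    * as a division so the multiplication cannot overflow. */
   if (stride != 0 && index > (buf_size - attrib_size) / stride)
      return;
   memcpy(out, buf + index * stride, attrib_size);
}
```

A stride larger than the whole buffer simply fails the bound for every index above zero, so those fetches return zeros instead of crashing.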
Zack Rusin
386327c48f gallivm/soa: implement indirect addressing in immediates
The support is analogous to the way we handle indirect addressing
in temporaries, except that we don't have to worry about storing
(after declarations) and thus we'll be able to keep using the old
code when indirect addressing isn't used. In other words we're
still using constants directly, unless the instruction has
immediate register with indirect addressing.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-14 03:09:15 -04:00
Zack Rusin
2866525b86 draw/gs: don't bind the tgsi state if we're using llvm paths
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-14 03:08:56 -04:00
Vinson Lee
ff256ec068 gallivm: Fix build with LLVM >= 3.4 r181680.
Tested-by: Laurent Carlier <lordheavym@gmail.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-05-14 09:06:14 -07:00
José Fonseca
36385c0bdf mesa/st: Temporary workaround for fdo bug 64568.
Effectively reverting the problematic hunk of
commit 614ee25077
2013-05-14 17:02:53 +01:00
Alex Deucher
29b8d6a1da radeonsi: add Hainan pci ids
Note: this is a candidate for the 9.1 branch

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-05-14 10:51:10 -04:00
Alex Deucher
d188f14941 radeonsi: update r600_get_llvm_processor_name for hainan
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-05-14 10:51:10 -04:00
Alex Deucher
4045c3d060 radeonsi: add support for hainan chips
Note: this is a candidate for the 9.1 branch

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-05-14 10:51:10 -04:00
José Fonseca
c475ae5d3d draw: Fix io_ptr/num_prims name in IR.
Trivial.
2013-05-14 15:36:37 +01:00
José Fonseca
2f3d939e36 graw/tgsi_dump: Fix gdb macro.
The macro was relying on "tokens" local variable to exist.
2013-05-14 15:36:37 +01:00
Vadim Girlin
560ddad261 r600g/sb: add missing cases for ARUBA chips
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-14 17:36:25 +04:00
Vadim Girlin
ecde4b07e2 r600g/sb: get rid of standard c++ streams
Static initialization of internal libstdc++ data related to iostream
causes segfaults with some apps.

This patch replaces all uses of std::ostream and std::ostringstream in sb
with custom lightweight classes.

Prevents segfaults with ut2004demo and probably some other old apps.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-14 17:36:25 +04:00
Vadim Girlin
57d1be0d2d r600g/sb: separate bytecode decoding and parsing
Parsing and IR construction are required for optimization only;
they are unnecessary if we only need to print a shader dump.
This should make the new disassembler more tolerant of any new
features in the bytecode.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-14 17:36:25 +04:00
Christian König
e195d301ae vl/vdpau: fix PresentationQueueQuerySurfaceStatus
The last queued surface always keeps displaying.

Fixing a problem with XBMC.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-14 15:16:15 +02:00
Chia-I Wu
176ad54c04 ilo: rework ilo_texture
Use ilo_buffer for buffer resources and ilo_texture for texture resources.  A
major cleanup is necessitated by the separation.
2013-05-14 16:07:22 +08:00
Chia-I Wu
768296dd05 ilo: rename ilo_resource to ilo_texture
In preparation for the introduction of ilo_buffer.
2013-05-14 16:01:25 +08:00
Chia-I Wu
528ac68f7a ilo: move transfer-related functions to a new file
Resource mapping is distinct from resource allocation, and is going to get
more and more complex.  Move the related functions to a new file to make the
separation clear.
2013-05-14 16:01:20 +08:00
Rodrigo Vivi
888fc7a891 i965: Add missing Haswell GT3 Desktop to IS_HSW_GT3 check.
NOTE: This is a candidate for stable branches.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 17:00:46 -07:00
Jordan Justen
a16a2d7147 i965: write layer if gl_Layer is used in VS
This is enabled by the AMD_vertex_shader_layer extension.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-13 13:57:57 -07:00
Jordan Justen
220f70667d glsl: add AMD_vertex_shader_layer support
This GLSL extension requires that AMD_vertex_shader_layer be
enabled by the driver.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-13 13:57:52 -07:00
Jordan Justen
c9e981b8fb extensions: add AMD_vertex_shader_layer
This extension will require driver support, so it must
be enabled by the driver.

http://www.opengl.org/registry/specs/AMD/vertex_shader_layer.txt

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-13 13:57:03 -07:00
Chad Versace
1776eeedd3 mesa: Expose GL_OES_texture_npot on GLES1
Mesa's extension table incorrectly lists GL_OES_texture_npot as
ES2-only. It's also an ES1 extension. This patch adds ES1 to the
extensions API mask.

From the GL_OES_texture_npot spec:
    OpenGL ES 1.0 or OpenGL ES 2.0 is required. This extension is
    written against OpenGL ES 1.1.12 and OpenGL ES 2.0.25.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:08:37 -07:00
Ian Romanick
a61a0dbed2 glsl: Death to array dereferences of vectors!
Now that all the places that used to generate array dereferences of
vectors have been changed to generate either ir_binop_vector_extract or
ir_triop_vector_insert (or both), remove all support for dealing with
this deprecated construct.

As an added safeguard, modify ir_validate to reject ir_dereference_array
of a vector.

v2: Convert tabs to spaces.  Suggested by Eric.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:19 -07:00
Ian Romanick
1e773626ee glsl: Generate correct ir_binop_vector_extract code for out and inout parameters
Like with type conversions on out parameters, some extra copies need to
occur to handle these cases.  The fundamental problem is that
ir_binop_vector_extract is not an lvalue, but out and inout parameters
must be lvalues.  A previous patch dealt with a similar problem in the
LHS of ir_assignment.

v2: Convert tabs to spaces.  Suggested by Eric.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:19 -07:00
Ian Romanick
c3bb07f875 glsl: Use vector-insert and vector-extract on elements of gl_ClipDistanceMESA
Variable indexing into vectors using ir_dereference_array is being
removed, so this lowering pass has to generate something different.

v2: Convert tabs to spaces.  Suggested by Eric.

v3: Simplify code slightly by assuming that elements of
gl_ClipDistanceMESA will always be vec4.  Suggested by Paul.

v4: Fairly substantial rewrite based on the rewrite of "glsl: Convert
lower_clip_distance_visitor to be an ir_rvalue_visitor"

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-13 12:05:19 -07:00
Ian Romanick
d13fbeea96 glsl: Remove some stale comments about ir_call
ir_call was changed long ago to be a statement rather than an
expression.  That makes this comment no longer valid.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-13 12:05:19 -07:00
Ian Romanick
065da16508 glsl: Convert lower_clip_distance_visitor to be an ir_rvalue_visitor
Right now the lower_clip_distance_visitor lowers variable indexing into
gl_ClipDistance into variable indexing into both the array
gl_ClipDistanceMESA and the vectors of that array.  For example,

    gl_ClipDistance[i] = f;

becomes

    gl_ClipDistanceMESA[i >> 2][i & 3] = f;

However, variable indexing into vectors using ir_dereference_array is
being removed.  Instead, ir_expression with ir_triop_vector_insert will
be used.  The above code will become

    gl_ClipDistanceMESA[i >> 2] =
        vector_insert(gl_ClipDistanceMESA[i >> 2], i & 3, f);

In order to do this, an ir_rvalue_visitor will need to be used.  This
commit is really just a refactor to get ready for that.
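
The index arithmetic in the rewrite above can be modeled outside the
compiler.  The following Python sketch is an illustration only, not Mesa
code: vector_insert and lowered_store are hypothetical names, and
gl_ClipDistanceMESA is modeled as a list of two 4-element lists.

```python
# Illustrative model only; not Mesa code. All names here are hypothetical.

def vector_insert(vec, index, value):
    """Return a copy of vec with the element at index replaced by value."""
    result = list(vec)
    result[index] = value
    return result


def lowered_store(clip_distance_mesa, i, f):
    """Model of the lowered assignment:
    gl_ClipDistanceMESA[i >> 2] =
        vector_insert(gl_ClipDistanceMESA[i >> 2], i & 3, f);
    """
    row = i >> 2      # which vec4 holds element i
    column = i & 3    # which component of that vec4
    clip_distance_mesa[row] = vector_insert(clip_distance_mesa[row], column, f)


# Eight clip distances packed as two vec4s.
packed = [[0.0] * 4, [0.0] * 4]
lowered_store(packed, 5, 1.5)   # element 5 lives in vec4 1, component 1
print(packed)                   # [[0.0, 0.0, 0.0, 0.0], [0.0, 1.5, 0.0, 0.0]]
```

The whole vec4 is rebuilt and assigned, so no write ever targets a
single dynamically indexed component.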

v4: Split the least amount of refactor from the rest of the code
changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-13 12:05:19 -07:00
Ian Romanick
3acb21517b glsl: Generate ir_binop_vector_extract for indexing of vectors
Now ir_dereference_array of a vector will never occur in the RHS of an
expression.

v2: Add back the { } around the if-statement body to make it more
readable.  Suggested by Eric.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:19 -07:00
Ian Romanick
89704eb1b0 glsl: Convert ir_binop_vector_extract in the LHS to ir_triop_vector_insert
The ast_array_index code can't know whether to generate an
ir_binop_vector_extract or an ir_triop_vector_insert.  Instead it will
always generate ir_binop_vector_extract, and the LHS and RHS have to be
re-written.

v2: Convert tabs to spaces.  Suggested by Eric.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:19 -07:00
Ian Romanick
ee7a6dad30 glsl: Add lowering pass for ir_triop_vector_insert
This will eventually replace do_vec_index_to_cond_assign.  This lowering
pass is called in all the places where do_vec_index_to_cond_assign or
do_vec_index_to_swizzle is called.

v2: Use WRITEMASK_* instead of integer literals.  Use a more concise
method of generating broadcast_index.  Both suggested by Eric.

v3: Use a series of scalar compares instead of a single vector compare.
Suggested by Eric and Ken.  It still uses 'if (cond) v.x = y;' instead
of conditional assignments because ir_builder doesn't do conditional
assignments, and I'd rather keep the code simple.
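
A rough model of the lowered form described in v3, written as a Python
sketch rather than IR.  The function name is hypothetical and the loop
stands in for the unrolled sequence of 'if (cond) v.x = y;' statements;
it is not the actual output of the pass.

```python
# Illustrative model only; not Mesa code. The function name is hypothetical.

def lower_vector_insert(vec, index, value):
    """Replace one component using one scalar compare per component."""
    result = list(vec)
    for component in range(len(result)):
        # Stands in for: if (index == component) v.<component> = value;
        if index == component:
            result[component] = value
    return result


print(lower_vector_insert([1.0, 2.0, 3.0, 4.0], 2, 9.0))  # [1.0, 2.0, 9.0, 4.0]
```

In this model an index that matches no component leaves the vector
unchanged, since none of the compares succeeds.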

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:19 -07:00
Ian Romanick
b881ddba7d glsl: Lower ir_binop_vector_extract to conditional moves
Lower ir_binop_vector_extract with a non-constant index to a series of
conditional moves.  This is exactly like ir_dereference_array of a
vector with a non-constant index.
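
As a sketch of what "a series of conditional moves" means here, the
Python model below selects one element by comparing the index against
each component in turn.  The function name is hypothetical, and
returning None when nothing matches is a choice of this model only; it
says nothing about what the generated code yields for an out-of-range
index.

```python
# Illustrative model only; not Mesa code. The function name is hypothetical.

def lower_vector_extract(vec, index):
    """Select vec[index] without ever indexing by a variable."""
    result = None  # model-only placeholder for "no component matched"
    for component, element in enumerate(vec):
        # Stands in for one conditional move: result = element if index == component
        if index == component:
            result = element
    return result


print(lower_vector_extract([10, 20, 30, 40], 3))  # 40
```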

v2: Convert tabs to spaces.  Suggested by Eric.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:19 -07:00
Ian Romanick
943de9cdea glsl: Lower ir_binop_vector_extract to swizzle
Lower ir_binop_vector_extract with a constant index to a swizzle.  This
is exactly like ir_dereference_array of a vector with a constant index.

v2: Convert tabs to spaces.  Suggested by Eric.

v3: Correctly call convert_vector_extract_to_swizzle in
ir_vec_index_to_swizzle_visitor::visit_enter(ir_call *ir).  Suggested by
Ken.

v4: Use CLAMP instead of MIN2(MAX2()).  Suggested by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:18 -07:00
Ian Romanick
63e1147ea1 glsl: Refactor part of convert_vec_index_to_cond_assign
Use a first function that extracts the vector being indexed and the index
from the deref.  Call the second function that does the real work.

Coming patches will add a new ir_expression for variable indexing into a
vector.  Having the lowering pass split into two functions will make it
much easier to lower the new ir_expression.

v2: Convert tabs to spaces.  Suggested by Eric.

v3: Move some bits from a later patch back to this patch so that it
actually compiles.  Suggested by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:18 -07:00
Ian Romanick
dafd6918f3 glsl: Add ir_triop_vector_insert
The new opcode is used to generate a new vector with a single field from
the source vector replaced.  This will eventually replace
ir_dereference_array of vectors in the LHS of assignments.

v2: Convert tabs to spaces.  Suggested by Eric.

v3: Add constant expression handling for ir_triop_vector_insert.  This
prevents the constant matrix inversion tests from regressing.  Duh.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:18 -07:00
Ian Romanick
f274a2ca87 glsl: Add ir_binop_vector_extract
The new opcode is used to get a single field from a vector.  The field
index may not be constant.  This will eventually replace
ir_dereference_array of vectors.  This is similar to the extractelement
instruction in LLVM IR.

http://llvm.org/docs/LangRef.html#extractelement-instruction

v2: Convert tabs to spaces.  Suggested by Eric.

v3: Add array index range checking to ir_binop_vector_extract constant
expression handling.  Suggested by Ken.

v4: Use CLAMP instead of MIN2(MAX2()).  Suggested by Ken.
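
The range checking mentioned in v3 and v4 can be pictured as follows.
This is a Python sketch under the assumption that constant folding
clamps the index into the valid component range; clamp and
fold_vector_extract are hypothetical names, not the functions in the
patch.

```python
# Illustrative model only; not Mesa code. All names here are hypothetical.

def clamp(x, lo, hi):
    """Equivalent of a CLAMP macro: force x into [lo, hi]."""
    return max(lo, min(x, hi))


def fold_vector_extract(vec, index):
    """Evaluate a vector extract on constants with a clamped index."""
    return vec[clamp(index, 0, len(vec) - 1)]


print(fold_vector_extract([1.0, 2.0, 3.0, 4.0], 2))   # 3.0
print(fold_vector_extract([1.0, 2.0, 3.0, 4.0], 9))   # 4.0
print(fold_vector_extract([1.0, 2.0, 3.0, 4.0], -1))  # 1.0
```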

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:18 -07:00
Paul Berry
b0bb6103d2 glsl: Fix "make check" breakage after adding options to do_common_optimization.
Commit b765740 (glsl: Pass struct shader_compiler_options into
do_common_optimization.) added a new parameter to
do_common_optimization() but didn't update test_optpass.cpp, causing
"make check" to break.

This patch makes the proper updates to test_optpass.cpp so that the
build succeeds again.
2013-05-13 07:55:37 -07:00
Kenneth Graunke
e413d3f15c glsl: Add a pass to flip matrix/vector multiplies to use dot products.
This pass flips (matrix * vector) operations to (vector *
matrixTranspose) for certain built-in matrices (currently
gl_ModelViewProjectionMatrix and gl_TextureMatrix).

This is equivalent, but results in dot products rather than multiplies
and adds.  On some hardware, this is more efficient.
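
The equivalence can be checked with a small numeric model.  The Python
sketch below is not the optimization pass itself: all function names are
hypothetical, and a matrix is stored as a list of columns to mirror
GLSL's column-major layout.

```python
# Illustrative model only; not Mesa code. All names here are hypothetical.
# A matrix is a list of columns, matching GLSL's column-major layout.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def transpose(columns):
    return [list(row) for row in zip(*columns)]


def matrix_times_vector(columns, v):
    """M * v as multiplies and adds: sum over j of column j scaled by v[j]."""
    result = [0] * len(columns[0])
    for column, scalar in zip(columns, v):
        result = [acc + element * scalar for acc, element in zip(result, column)]
    return result


def vector_times_matrix(v, columns):
    """v * N as dot products: component i is dot(v, column i of N)."""
    return [dot(v, column) for column in columns]


m = [[1, 2], [3, 4]]   # columns of M
v = [5, 6]
print(matrix_times_vector(m, v))              # [23, 34]
print(vector_times_matrix(v, transpose(m)))   # [23, 34]
```

Each component of the second form is a single dot product, which is the
shape a DP4-preferring backend wants.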

This pass is conditionalized on ctx->mvp_with_dp4, the flag drivers set
to indicate they prefer dot products.

Improves performance in Lightsmark by 1.01131% +/- 0.162069% (n = 10)
on a Haswell GT2 system.  Passes Piglit on Ivybridge.

v2: Use struct gl_shader_compiler_options instead of plumbing through
    another boolean flag for this purpose.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-12 09:36:46 -07:00
Kenneth Graunke
72a0b7a435 i965/vs: Set the PreferDP4 shader compiler option.
Doing matrix multiplies with DP4s is fewer instructions than MUL/ADD,
especially since we don't support MAD in the vertex shader.

Not observed to improve performance in any fixed function applications,
but is useful for the next patch.

I've left this unset for the fragment shader because the scalar backend
can't use DP4 and does have MAD support.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-12 09:36:44 -07:00
Kenneth Graunke
bbf029f7cf mesa: Move the mvp_with_dp4 flag to ShaderCompilerOptions.
This flag essentially tells the compiler whether it prefers
dot products or multiply/adds for matrix operations.  As such,
ShaderCompilerOptions seems like the right place for it.

This also lets us specify it on a per-stage basis.  This patch makes all
existing users set the flag for the Vertex Shader stage only, as it's
currently only used for fixed-function vertex programs.  That will
change soon, and I wanted to preserve the existing behavior.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-12 09:36:43 -07:00
Kenneth Graunke
b765740a66 glsl: Pass struct shader_compiler_options into do_common_optimization.
do_common_optimization may need to make choices about whether to emit
certain kinds of instructions.  gl_context::ShaderCompilerOptions
contains exactly that information, so it makes sense to pass it in.

Rather than passing the whole array, pass the structure for the stage
that's currently being worked on.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-12 09:36:41 -07:00
Kenneth Graunke
6bb9acfb4e glsl: Initialize ctx->ShaderCompilerOptions in standalone scaffolding.
This code is copied from _mesa_init_shader_state().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-12 09:36:39 -07:00
Kenneth Graunke
1c95cea40b glsl: Copy _mesa_shader_type_to_index() to standalone scaffolding.
We can't include shaderobj.h from the standalone utilities, so we
unfortunately have to copy this function.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-12 09:36:18 -07:00
Kenneth Graunke
a67b18e5a7 mesa: Add comments about bit-ordering of new XRGB/XBGR formats.
Marek added these new formats in commit f9fa725690, but
without comments relating to the packing.  Sometimes the naming is
confusing, so these comments are helpful in determining whether two
formats are compatible.

The new comments are based on my reading of format_unpack.c.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-05-12 09:32:42 -07:00
Marek Olšák
f486c52f9e st/mesa: remove dependency on _NEW_BUFFER_OBJECT for vertex arrays
_NEW_BUFFER_OBJECT means glBufferData was called. We can just set our own
flag in BufferData.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:59:20 +02:00
Marek Olšák
b88cebb634 st/mesa: don't check for _NEW_PROGRAM when binding UBOs
Probably copied from i965. However, st/mesa has its own flags, ST_NEW_xxx_PROGRAM.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:45:02 +02:00
Marek Olšák
a17e87d4eb st/mesa: fix a couple of issues in st_bind_ubos
- don't reference a buffer for a local variable
  (that's never useful unless it can be the only reference to the buffer)
- check if the buffer is not NULL
- set buffer_size as specified with BindBufferRange

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:45:02 +02:00
Marek Olšák
1ba1d617bf st/mesa: restore the transfer_inline_write path for BufferData
Version 2 that shouldn't crash.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:45:02 +02:00
Marek Olšák
6a2ad679e6 st/mesa: initialize Const.MaxColorAttachments
NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:45:02 +02:00
Marek Olšák
52cb395bb1 gallium: add PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE for GL
v2: fix typo 65535 -> 65536

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:45:01 +02:00
Marek Olšák
b6d3373442 st/mesa: consolidate setting MaxTextureImageUnits
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:45:01 +02:00
Marek Olšák
614ee25077 st/mesa: initialize all program constants and UBO limits
Also simplify UBO support checking.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:45:01 +02:00
Marek Olšák
d90f04a65b glsl: fix the value of gl_MaxFragmentUniformVectors
NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:45:01 +02:00
Marek Olšák
77d8fbcfd4 mesa: add & use a new driver flag for UBO updates instead of _NEW_BUFFER_OBJECT
v2: move the flagging from intel_bufferobj_data to intel_bufferobj_alloc_buffer

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-11 23:45:01 +02:00
Marek Olšák
081c789c3e mesa: skip _MaxElement computation unless driver needs strict bounds checking
If Const.CheckArrayBounds is false, the only code using _MaxElement is
glDrawRangeElements, so I changed it and explained in the code why
_MaxElement is not very useful there.

BTW, the big magic number was copied to the letter
from _mesa_update_array_max_element.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-11 23:45:01 +02:00
Marek Olšák
db38e9a0e1 mesa: remove unused gl_array_object::NewArray
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-11 23:45:01 +02:00
Marek Olšák
74ca7f0974 mesa: remove unused gl_constants::MaxColorTableSize
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-11 23:45:01 +02:00
Marek Olšák
286d06ddc4 mesa: unify MaxVertexVaryingComponents and MaxGeometryVaryingComponents
The limits should not be different and OpenGL requires both to be at least 32,
which is also the maximum limit on radeon.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-11 23:45:01 +02:00
Marek Olšák
5e78433eec mesa: move max texture image unit constants to gl_program_constants
Const.MaxTextureImageUnits -> Const.FragmentProgram.MaxTextureImageUnits
Const.MaxVertexTextureImageUnits -> Const.VertexProgram.MaxTextureImageUnits
etc.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-11 23:45:01 +02:00
Marek Olšák
d27d29f1a6 mesa: consolidate definitions of max texture image units
Shaders are unified on most hardware (= same limits in all stages).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:44:55 +02:00
Vinson Lee
5471e3949c ilo: Initialize read_back in transfer_map_sys.
Fixes "Uninitialized scalar variable" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
2013-05-10 15:29:40 +08:00
Marek Olšák
da33f9b919 r600g: increase array size for shader inputs and outputs
and add assertions to prevent buffer overflow. This fixes corruption
of the r600_shader struct.

NOTE: This is a candidate for the stable branches.
2013-05-10 03:23:31 +02:00
Chí-Thanh Christopher Nguyễn
121c2c8983 targets/dri-i915: Force c++ linker in all cases
NOTE: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=461696
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-05-09 17:04:27 -07:00
Ben Widawsky
fc98c47115 i965: Actually use the user timeout in glClientWaitSync.
Use the new libdrm functionality to actually do timed waits on the sync
object.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-09 16:41:44 -07:00
Paulo Zanoni
f1d2b37317 i965: make GT3 machines work as GT3 instead of GT2
We were not allowed to say the "GT3" name, but we really needed to
have the PCI IDs because too many people had such machines, so we had
to make the GT3 machines work as GT2.

Let's just say that GT2_PLUS was short for GT2_PLUS_1 :)

NOTE: This is a candidate for stable branches.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-09 15:11:53 -07:00
Kenneth Graunke
d0b82b1add i965: Add chipset limits for the Haswell GT3 variant.
NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
2013-05-09 15:11:53 -07:00
Kenneth Graunke
eca2251f42 i965: Update URB partitioning code for Haswell's GT3 variant.
Haswell's GT3 variant offers 32kB of URB space for push constants, while
GT1 and GT2 match Ivybridge, providing 16kB.  Update the code to reserve
the full 32kB on GT3.

v2: Specify push constant size correctly.  I thought GT3 reinterpreted
    the value as multiples of 2kB, but it doesn't.  You simply have to
    program an even number.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-09 15:11:52 -07:00
Kenneth Graunke
c56eba5adb i965: Delete dead intel_span.c symlink.
2013-05-09 15:11:52 -07:00
Eric Anholt
0f3068a58b i965/vs: Make virtual grf live intervals actually cover their used range.
This is the same change as the previous commit to the FS.  A very few VSes
are regressed by 1 or 2 instructions, which look recoverable with a bit
more dead code elimination.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-09 14:38:05 -07:00
Eric Anholt
e290372542 i965/fs: Make virtual grf live intervals actually cover their used range.
Previously, we would sometimes not consider a write to a register to
extend the end of the interval, nor would we consider a read before a
write to extend the start.  This made for a bunch of complicated logic
related to how to treat the results when dead code might be present.
Instead, just extend the interval and fix dead code elimination to know
how to remove it.
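
The rule "every read or write extends the interval" can be sketched in a
few lines.  The Python model below is a simplification with hypothetical
names: instructions are (destination, sources) pairs in straight-line
order, and control flow and the simd16 workaround are ignored.

```python
# Illustrative model only; not i965 code. All names here are hypothetical.

def live_intervals(instructions):
    """Map each register to (first ip, last ip) over all reads and writes."""
    start = {}
    end = {}
    for ip, (dst, sources) in enumerate(instructions):
        for reg in list(sources) + [dst]:
            start.setdefault(reg, ip)  # first read or write opens the interval
            end[reg] = ip              # every later read or write extends it
    return {reg: (start[reg], end[reg]) for reg in start}


program = [
    ("a", []),      # ip 0: write a
    ("b", ["a"]),   # ip 1: read a, write b
    ("c", ["b"]),   # ip 2: read b, write c
    ("b", ["c"]),   # ip 3: read c, write b (never read again)
]
print(live_intervals(program))  # {'a': (0, 1), 'b': (1, 3), 'c': (2, 3)}
```

The final write to b is dead, yet it still extends b's interval to ip 3;
removing such writes is then left to dead code elimination.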

Interestingly, this actually results in a tiny bit more optimization:
total instructions in shared programs: 1391220 -> 1390799 (-0.03%)
instructions in affected programs:     14037 -> 13616 (-3.00%)

v2: Fix a theoretical problem with the simd16 workaround if dst == src,
    where we would revert the bump of the live range.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
2013-05-09 14:38:05 -07:00
Marek Olšák
dd6152b6ca docs: document GALLIUM_HUD and LIBGL_SHOW_FPS
2013-05-09 23:28:05 +02:00
Courtney Goeltzenleuchter
daa90f91ff ilo: Add support for HW primitive restart.
Now tells Gallium that ilo supports primitive restart.
Updated ilo_draw_vbo to be able to check that the indexed
primitive being rendered can actually be supported in HW. If not,
will break up into individual prims similar to what Mesa does.

[olv: a minor fix after rebasing and formatting]
2013-05-10 00:06:14 +08:00
Brian Paul
009d79734f svga: misc whitespace and comment fixes in svga_cmd.c
2013-05-09 07:43:46 -06:00
Brian Paul
60c71cce3f docs: remove ^M chars from GL3.txt
2013-05-09 07:43:46 -06:00
Brian Paul
e0144019c0 st/mesa: generate GL_OUT_OF_MEMORY if we can't create the index buffer
Before, if we failed to allocate the index buffer we'd silently
return from st_draw_vbo() without drawing anything.  We should
raise GL_OUT_OF_MEMORY to give some indication that something went
wrong.

Note: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-05-09 07:43:46 -06:00
Chia-I Wu
a8e4614071 ilo: add support for PIPE_FORMAT_ETC1_RGB8
It is decompressed to and stored as PIPE_FORMAT_R8G8B8X8_UNORM on-the-fly.
2013-05-09 16:05:48 +08:00
Chia-I Wu
183ea823fd ilo: support mapping with a staging system buffer
It can be used for unpacking compressed texture on-the-fly or to support
explicit transfer flushing.
2013-05-09 16:05:47 +08:00
Chia-I Wu
baa44db065 ilo: allow for different mapping methods
We want or need to use a different mapping method when the resource is
busy, when the bo format differs from the requested format, and so on.
2013-05-09 16:05:47 +08:00
Chia-I Wu
7cca1aac9d ilo: allow bo format to differ from that requested
For separate stencil buffer or formats not supported natively, the real format
of the bo may differ from that requested.
2013-05-09 16:05:47 +08:00
Stéphane Marchesin
1c56fc1025 draw/llvm: Add additional llvm optimization passes
It helps a bit with vertex shader performance on i915g
(a couple percent faster with openarena).

I have tried most other passes, and they weren't showing
any measurable improvement. Note that my vertex shaders
didn't have loops, so maybe the loop optimizations could
still be useful in the future.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-08 22:05:54 -07:00
Eric Anholt
0b0d6f97cf i965: Sync brw_format_for_mesa_format() table with new Mesa formats.
I'm not filling them all in, to prevent any breakage in this commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-08 15:31:07 -07:00
Eric Anholt
2755946427 i965: Update the surface formats table from the current specs.
Unfortunately the surface formats table is now splattered across multiple
chapters.  All surface format enums from brw_defines.h are present, but
only support for them that is mentioned in the public specs is included
here.

v2 (from Ken): Mark R32G32B32A32_SFIXED as unsupported on Ivybridge.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-08 15:31:06 -07:00
Eric Anholt
5d89487eb2 i965: Add surface format defines from the public specs.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-08 14:27:30 -07:00
Fabian Bieler
4e9c7f9c5a mesa/program: Don't copy propagate from swizzles.
Do not propagate a copy if source and destination are identical.

Otherwise code like

MOV TEMP[0].xyzw, TEMP[0].wzyx
MOV TEMP[1].xyzw, TEMP[0].xyzw

is changed to

MOV TEMP[0].xyzw, TEMP[0].wzyx
MOV TEMP[1].xyzw, TEMP[0].wzyx
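
To see why the second listing is wrong, both can be executed in a tiny
interpreter.  The Python sketch below is a model with hypothetical
names, not Mesa IR: each instruction is a (dst, src, swizzle) MOV and
each register is a list of four values.

```python
# Illustrative model only; not Mesa code. All names here are hypothetical.

def swizzle(vec, pattern):
    """Apply a swizzle such as 'wzyx' to a 4-component register."""
    return [vec["xyzw".index(channel)] for channel in pattern]


def run(program, temps):
    """Execute a list of (dst, src, swizzle) MOVs on a copy of temps."""
    temps = {index: list(value) for index, value in temps.items()}
    for dst, src, pattern in program:
        temps[dst] = swizzle(temps[src], pattern)
    return temps


initial = {0: [1, 2, 3, 4], 1: [0, 0, 0, 0]}
original = [(0, 0, "wzyx"), (1, 0, "xyzw")]
propagated = [(0, 0, "wzyx"), (1, 0, "wzyx")]

print(run(original, initial)[1])    # [4, 3, 2, 1]
print(run(propagated, initial)[1])  # [1, 2, 3, 4]
```

The first MOV overwrites its own source, so applying its swizzle again
in the second MOV reverses TEMP[0] twice and yields the wrong TEMP[1].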

This fixes Piglit test shaders/glsl-copy-propagation-self-2 for drivers that
use Mesa IR.

NOTE: This is a candidate for the stable branches.
Signed-off-by: Fabian Bieler <fabianbieler@fastmail.fm>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-08 13:59:19 -07:00
Fabian Bieler
e1ff753d67 mesa/st: Don't copy propagate from swizzles.
Do not propagate a copy if source and destination are identical.

Otherwise code like

MOV TEMP[0].xyzw, TEMP[0].wzyx
MOV TEMP[1].xyzw, TEMP[0].xyzw

is changed to

MOV TEMP[0].xyzw, TEMP[0].wzyx
MOV TEMP[1].xyzw, TEMP[0].wzyx

This fixes Piglit test shaders/glsl-copy-propagation-self-2 for gallium drivers.

NOTE: This is a candidate for the stable branches.
Signed-off-by: Fabian Bieler <fabianbieler@fastmail.fm>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-08 13:59:14 -07:00
Eric Anholt
5d06c9ea0f i965: Fix hangs on HSW since the gen6 blorp fix.
The constant packets for gen6 are too small for gen7, and while IVB seems
happy with them HSW blows up.  Fix it by emitting the correct packets on
gen7, for all stages.

v2: Include the packets instead of just skipping them.
NOTE: This is a candidate for the stable branches.
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-08 10:23:41 -07:00
Chad Versace
2878f4685c egl/android: Fix error condition for EGL_ANDROID_image_native_buffer
Emit EGL_BAD_CONTEXT if the user passes a context to
eglCreateImageKHR(type=EGL_ANDROID_image_native_buffer).

From the EGL_ANDROID_image_native_buffer spec:
  * If <target> is EGL_NATIVE_BUFFER_ANDROID and <ctx> is not
    EGL_NO_CONTEXT, the error EGL_BAD_CONTEXT is generated.

Note: This is a candidate for the stable branches.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-05-08 08:44:05 -07:00
Stéphane Marchesin
38d2a16c01 i915: Use Y tiling for textures
This basically reverts commit
2acc719374.

With the previous change, we're not batchbuffer limited any
longer. So we actually start seeing a performance difference
between X and Y tiling. X tiling is funny because it is
faster for screen-aligned quads but slower in games. So let's
use Y tiling which is 10% faster overall.
2013-05-08 02:07:00 -07:00
Stéphane Marchesin
fc24c7aede i915g: Optimize batchbuffer sizes
Now that we don't throttle at every batchbuffer, we can shrink
the size of batchbuffers to achieve early flushing. This gives
a significant speed boost in a lot of games (on the order of
20%).
2013-05-08 02:06:56 -07:00
Stéphane Marchesin
7f7c7fda83 i915g: Add more PIPE_CAP_* support
2013-05-08 01:37:55 -07:00
Chia-I Wu
00035670de ilo: remove our own type inference
tgsi_opcode_infer_{src,dst}_type() works just fine.
2013-05-08 11:33:34 +08:00
Chia-I Wu
b74af51a46 ilo: use tgsi_util_get_texture_coord_dim()
And remove toy_tgsi_get_texture_coord_dim().
2013-05-08 11:07:46 +08:00
Chia-I Wu
75a48a53d8 tgsi: fix operand type of TGSI_OPCODE_NOT
It should be TGSI_TYPE_UNSIGNED, not TGSI_TYPE_FLOAT.

Also fix gallivm's not_emit_cpu() to use the uint build context.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2013-05-08 11:03:49 +08:00
Chia-I Wu
1f970816b1 tgsi: refactor tgsi_opcode_infer_src_type()
Call tgsi_opcode_infer_type() from tgsi_opcode_infer_src_type().

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2013-05-08 11:03:47 +08:00
Chia-I Wu
364feb327d tgsi: refactor tgsi_opcode_infer_dst_type()
Move the body of tgsi_opcode_infer_dst_type() to a new helper function,
tgsi_opcode_infer_type(), and call the helper function from
tgsi_opcode_infer_dst_type().  The diff looks complicated simply because the
code is moved around.

A following commit will make tgsi_opcode_infer_src_type() call
tgsi_opcode_infer_type().

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2013-05-08 11:03:43 +08:00
Chia-I Wu
8a52453f5d tgsi: reorder opcodes in opcode type inference
Reorder opcodes by their assigned numbers.  This makes it easier to see the
differences between tgsi_opcode_infer_src_type() and
tgsi_opcode_infer_dst_type().

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2013-05-08 11:03:24 +08:00
Chia-I Wu
61d57ec276 tgsi: clean up exec_tex()
Make use of tgsi_util_get_texture_coord_dim() to replace the big switch table.

There is a subtle difference with this change.  When TXP is used with an array
texture, the layer is now also projected.  This behavior matches the TGSI doc.
Since GLSL does not allow TXP on an array texture, I am not sure which
behavior is correct or preferred.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2013-05-08 11:00:07 +08:00
Chia-I Wu
80857d2c8b tgsi: add tgsi_util_get_texture_coord_dim()
This util function returns the dimension of the texture coordinates for a
texture target, and the location of the shadow reference value.

For example, when the texture target is TGSI_TEXTURE_SHADOW2D, the dimension
of the texture coordinates is 2, and the location of the ref value is 2
(that is, the Z channel).
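
A rough sketch of the idea, assuming a reduced set of targets. The enum, the
function name and the use of -1 for "no shadow reference" are all
illustrative assumptions, not the actual signature of
tgsi_util_get_texture_coord_dim().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a few texture targets. */
enum sketch_texture {
   SKETCH_TEXTURE_2D,
   SKETCH_TEXTURE_3D,
   SKETCH_TEXTURE_SHADOW2D
};

/* Returns the number of coordinate channels and stores the channel that
 * holds the shadow reference value, or -1 (an assumption of this sketch)
 * when the target has none. */
static int
sketch_get_texture_coord_dim(enum sketch_texture target, int *shadow_ref)
{
   assert(shadow_ref != NULL);

   switch (target) {
   case SKETCH_TEXTURE_2D:
      *shadow_ref = -1;
      return 2;
   case SKETCH_TEXTURE_3D:
      *shadow_ref = -1;
      return 3;
   case SKETCH_TEXTURE_SHADOW2D:
      *shadow_ref = 2;   /* the Z channel, as in the example above */
      return 2;
   }

   *shadow_ref = -1;
   return 0;
}
```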

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2013-05-08 10:58:53 +08:00
Bryan Cain
14a0bb81fe nv50: initialize kick_notify callback in nv50_create
Fixes infinite loop on startup in Portal and Left 4 Dead 2.

NOTE: This is a candidate for the 9.0 and 9.1 branches.
2013-05-07 17:01:59 -05:00
Eric Anholt
3f09e528d5 i965: Use Y-tiled blits to untile for cached mappings of miptrees.
Fixes a regression in firefox's unaccelerated compositing path for WebGL
with the introduction of Y tiling.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64213
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-07 11:45:45 -07:00
Eric Anholt
d641a01d98 i965: Add support for Y-tiled blits on gen6+.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-07 11:45:45 -07:00
Eric Anholt
7a74808d78 i965: Count occlusion query samples for CopyPixels using the 2D engine.
We accidentally "fixed" the piglit test for this when introducing Y
tiling, since this path stopped being executed.  In reenabling this path
for Y tiling, we ended up regressing it again, so just fix it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59439
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-07 11:45:45 -07:00
Robert Bragg
f8c3242682 egl/wayland: Implement EGL_EXT_swap_buffers_with_damage
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2013-05-07 17:07:50 +01:00
Robert Bragg
6425b14515 egl: Add extension infrastructure for EGL_EXT_swap_buffers_with_damage
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2013-05-07 17:07:45 +01:00
Robert Bragg
95dda0d649 egl: Update to revision 21254 of eglext.h
This pulls in EGL_EXT_swap_buffers_with_damage.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2013-05-07 17:07:44 +01:00
Roland Scheidegger
65102b708b gallium: more tgsi documentation updates
Adds the remaining integer opcodes, and some opcodes are moved to more
appropriate places, along with getting rid of the (already nearly empty)
ps_2_x section. Though the CAP bits for some of these are still a bit in
the air so the documentation isn't quite as watertight as is desirable.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-07 16:13:23 +02:00
Vinson Lee
4ba9c9c5be ilo: Add missing break statement in aos_tex TGSI_OPCODE_TEX2 case.
Fixes "Missing break in switch" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
2013-05-07 12:15:48 +08:00
Vadim Girlin
c9cf83b587 r600g/sb: optimize some cases for CNDxx instructions
We can replace CNDxx with MOV (and possibly eliminate it after
propagation) in the following cases:

If src1 is equal to src2 in CNDxx instruction then the result doesn't
depend on condition and we can replace the instruction with
"MOV dst, src1".

If src0 is const then we can evaluate the condition at compile time and
also replace it with MOV.
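
The two cases can be sketched as follows. This is a simplified model, not the
r600g/sb code: the operand struct is hypothetical, and the instruction is
assumed to behave like CNDE, i.e. dst = (src0 == 0) ? src1 : src2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operand: either a constant value or a register number. */
struct sketch_operand {
   bool is_const;
   int reg;
   float value;
};

static bool
sketch_same_operand(const struct sketch_operand *a,
                    const struct sketch_operand *b)
{
   if (a->is_const != b->is_const)
      return false;
   return a->is_const ? a->value == b->value : a->reg == b->reg;
}

/* Returns the operand that "MOV dst, <operand>" should read, or NULL if
 * the conditional instruction has to stay. */
static const struct sketch_operand *
sketch_fold_cnde(const struct sketch_operand *src0,
                 const struct sketch_operand *src1,
                 const struct sketch_operand *src2)
{
   assert(src0 != NULL && src1 != NULL && src2 != NULL);

   /* Case 1: both choices are identical, the condition is irrelevant. */
   if (sketch_same_operand(src1, src2))
      return src1;

   /* Case 2: the condition is known at compile time. */
   if (src0->is_const)
      return src0->value == 0.0f ? src1 : src2;

   return NULL;
}
```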

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-07 04:40:26 +04:00
Vadim Girlin
46dfad8b36 r600g/sb: fix memory leaks
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-07 04:40:26 +04:00
Vadim Girlin
1c28e7c5a1 r600g/sb: fix kcache handling on r6xx
Use the same limit for kcache constants in alu group on r6xx as on other
chips (two const pairs). Relaxing this will require additional checks to
make sure that all 4 consts in the group come from 2 kcache sets (clause
limit), probably without noticeable improvements of shader performance.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-07 04:40:26 +04:00
Eric Anholt
03ef60681e intel: Remove renderbuffer delete setup from texture wrapping.
This is already set by intel_new_renderbuffer().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-06 14:34:27 -07:00
Eric Anholt
77a405dba7 mesa: Make Mesa core set up wrapped texture renderbuffer state.
Everyone was doing effectively the same thing, except for some funky code
reuse in Intel, and swrast mistakenly recomputing _BaseFormat instead of
using the texture's _BaseFormat.  swrast's sRGB handling is left in place,
though it should be done by using _mesa_get_render_format() at render time
instead (as-is, it will miss updates to GL_FRAMEBUFFER_SRGB).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-06 14:34:14 -07:00
Eric Anholt
5b190d19d3 intel: Simplify renderbuffer-for-texture width setup.
We're looking for the logical width of our level, which is what
image->Width2/Height2 is.  The previous code relied on MSAA textures being
only level 0.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-06 14:33:43 -07:00
Eric Anholt
749a92786d mesa: Make core Mesa allocate the texture renderbuffer wrapper.
Every driver did the same thing.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-06 14:33:38 -07:00
Eric Anholt
5b9609f59a i965: Use brw_blorp_blit_miptrees() for CopyTexSubImage().
Now that depth resolves are handled there, we don't need to make the
temporary renderbuffer.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-06 14:33:33 -07:00
Eric Anholt
40956c5519 i965: Move blorp resolve setup into brw_blorp_blit_miptrees().
There was some comment about trying to avoid marking resolves in
updownsample, but if the downsample is never actually rendered to, then
the required resolve tracked in the downsample will never be executed, so
who cares?

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-06 14:33:27 -07:00
Tom Stellard
730c90a70e gallivm: Fix build for LLVM < 3.3
The C API versions of the LLVM multithreaded functions were added in
LLVM 3.3.
2013-05-06 11:17:03 -07:00
Tom Stellard
bb94d4d8fe r600g/llvm: Parse config values in register / value pairs
Rather than relying on a predetermined order for the config values.
2013-05-06 10:54:52 -07:00
Tom Stellard
df27320560 r600g/llvm: Don't feed LLVM output through r600_bytecode_build()
The LLVM backend emits raw ISA now, so we can just use its output
unmodified.
2013-05-06 10:54:52 -07:00
Tom Stellard
e917ed96ae r600g/llvm: Don't emit CALL_FS for vertex shaders
The LLVM backend takes care of this now.
2013-05-06 10:54:52 -07:00
Matt Turner
1d09a8c3cd i965: Lower bitfieldInsert.
v2: Only lower bitfieldInsert to BFM+BFI (and don't lower
    bitfieldExtract at all) since three-source instructions are now
    usable in the vertex shader.
v3: Lower bitfield_insert in the same pass with everything else, since
    it doesn't produce any instructions to be lowered (the other two
    lowering passes that were in a previous iteration of this series
    emitted subtractions which needed to be lowered).

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2]
2013-05-06 10:17:14 -07:00
Matt Turner
acd2bccd85 i965/vs: Add support for bit instructions.
v2: Rebase on LRP addition.
    Use fix_3src_operand() when emitting BFE and BFI2.
    Add BFE and BFI2 to is_3src_inst check in
      brw_vec4_copy_propagation.cpp.
    Subtract result of FBH from 31 (unless an error) to convert
      MSB counts to LSB counts
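
A small model of the conversion in the last item, for unsigned inputs only.
It assumes FBH counts from the MSB side and reports 0xFFFFFFFF when no bit is
set; sketch_fbh() is a software stand-in written for this example, not the
hardware instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for FBH on an unsigned value: number of leading zero bits,
 * or 0xFFFFFFFF if no bit is set (assumed error encoding). */
static uint32_t
sketch_fbh(uint32_t v)
{
   uint32_t n = 0;

   if (v == 0)
      return UINT32_MAX;

   while (!(v & UINT32_C(0x80000000))) {
      v <<= 1;
      n++;
   }
   return n;
}

/* findMSB() wants the bit index counted from the LSB, or -1 on error. */
static int32_t
sketch_find_msb(uint32_t v)
{
   uint32_t from_msb = sketch_fbh(v);

   if (from_msb == UINT32_MAX)
      return -1;            /* keep the error value as is */

   assert(from_msb <= 31);
   return (int32_t)(31 - from_msb);
}
```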

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:14 -07:00
Matt Turner
1f0f26d60c i965/fs: Add support for bit instructions.
Don't bother scalarizing ir_binop_bfm, since its results are
identical for all channels.

v2: Subtract result of FBH from 31 (unless an error) to convert
    MSB counts to LSB counts.
v3: Use op0->clone() in ir_triop_bfi to prevent (var_ref
    channel_expressions) from appearing multiple times in the IR.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2]
2013-05-06 10:17:14 -07:00
Matt Turner
fa958182b7 i965: Add support for emitting and disassembling bit instructions.
Specifically
   bfe - for bitfieldExtract()
   bfi1 and bfi2 - for bitfieldInsert()
   bfrev - for bitfieldReverse()
   cbit - for bitCount()
   fbh - for findMSB()
   fbl - for findLSB()

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:14 -07:00
Matt Turner
c71bee757b i965: Print the correct dst and shared-src types for 3-src instructions.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:14 -07:00
Matt Turner
526ffdfc03 i965/gen7: Set src/dst types for 3-src instructions.
Also update asserts to allow BFE and BFI2, which take (unsigned)
doubleword arguments.

v2: Allow BRW_REGISTER_TYPE_UD for src1 and src2 as well.
    Assert that src2.type (instead of src0.type) matches dest.type since
    it's the primary argument and src0 and src1 might correctly have
    different types.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v1]
2013-05-06 10:17:13 -07:00
Matt Turner
2305047823 i965: Add 3-src destination and shared-source type macros.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:13 -07:00
Matt Turner
4049d48e02 i965: Add Gen7+ fields to brw_instruction and add comments.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:13 -07:00
Matt Turner
dafd050883 glsl: Add a pass to lower bitfield-insert into bfm+bfi.
i965/Gen7+ and Radeon/Evergreen+ have bfm/bfi instructions to implement
bitfieldInsert() from ARB_gpu_shader5.
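
As an illustration of the lowering, here is a software model. The helper
semantics are assumptions made for this sketch: bfm builds a mask of 'bits'
ones starting at 'offset', and bfi shifts the inserted value to the lowest
set bit of the mask before merging. The hardware instructions may differ in
detail.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bfm semantics: a mask of 'bits' ones starting at 'offset'. */
static uint32_t
sketch_bfm(uint32_t bits, uint32_t offset)
{
   uint32_t field;

   assert(bits <= 32 && offset <= 32 - bits);
   if (bits == 0)
      return 0;

   field = (bits == 32) ? UINT32_MAX : ((UINT32_C(1) << bits) - 1);
   return field << offset;
}

/* Assumed bfi semantics: place 'insert' at the lowest set bit of 'mask'
 * and keep the bits of 'base' outside the mask. */
static uint32_t
sketch_bfi(uint32_t mask, uint32_t insert, uint32_t base)
{
   uint32_t shift = 0;

   if (mask == 0)
      return base;

   while (!((mask >> shift) & 1))
      shift++;

   return ((insert << shift) & mask) | (base & ~mask);
}

/* bitfieldInsert(base, insert, offset, bits) expressed as bfm + bfi. */
static uint32_t
sketch_bitfield_insert(uint32_t base, uint32_t insert,
                       uint32_t offset, uint32_t bits)
{
   return sketch_bfi(sketch_bfm(bits, offset), insert, base);
}
```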

v2: Add ir_binop_bfm and ir_triop_bfi to st_glsl_to_tgsi.cpp.
    Remove spurious temporary assignment and dereference.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:13 -07:00
Matt Turner
9c04b8c28c glsl: Add constant evaluation of bit built-ins.
v2: Order bits from LSB end (31 - count) for ir_unop_find_msb.
v3: Add ir_triop_bitfield_extract as an exception to the op[0]->type ==
    op[1]->type assertion in ir_constant_expression.cpp.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2]
2013-05-06 10:17:13 -07:00
Matt Turner
499d8c6545 glsl: Add support for new bit built-ins in ARB_gpu_shader5.
v2: Move use of ir_binop_bfm and ir_triop_bfi to a later patch.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:13 -07:00
Matt Turner
44d3287ecd glsl: Add new bit built-ins IR and prototypes from ARB_gpu_shader5.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:13 -07:00
Matt Turner
f9e37879eb glsl: Rework ir_reader to handle expressions with four operands.
Needed to support the bitfieldInsert() built-in added by
ARB_gpu_shader5.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:12 -07:00
Matt Turner
f99f78e49a mesa: Add infrastructure for ARB_gpu_shader5.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:12 -07:00
Tom Stellard
914d797797 radeon/llvm: Always build libradeonllvm as static
This library is very small, so there is not much to gain from building
it as a shared library.  Also, when linking statically with LLVM, a
shared libradeonllvm exports LLVM symbols and creates problems when
used with other shared objects that also link statically to LLVM.

Reviewed-by: Mathias.Froehlich@web.de
2013-05-06 09:06:10 -07:00
Tom Stellard
024fe6852a radeon/llvm: Use LLVM C API for compiling LLVM IR to ISA v2
The LLVM C API is considered stable and should never change, so it
is much more desirable to use than the LLVM C++ API, which is constantly in
flux.

v2:
  - Split target initialization and lookup into separate functions

Reviewed-by: Mathias.Froehlich@web.de
2013-05-06 09:06:06 -07:00
Tom Stellard
55eb8eaaa8 gallivm: Move LLVMStartMultithreaded() static initializer into gallivm
This does not solve all of the problems with using LLVM in a
multithreaded environment, but it should help in some cases.

Reviewed-by: Mathias.Froehlich@web.de
2013-05-06 09:06:03 -07:00
Tom Stellard
7cc98ea88f radeon/llvm: Don't use the global context when parsing LLVM IR
This leads to crashes when multiple threads try to compile compute
shaders at the same time.

Fixes a crash in bfgminer when using more than one thread.
2013-05-06 09:06:00 -07:00
Eric Anholt
bd850cb4f2 i965: Remove GL_ARB_color_buffer_float from GL core contexts.
Of the 3 controls in the extension, one was kept in GL core and the other
two were explicitly deprecated and the reasonable default behavior was
encoded in the spec.  By not exposing the extension, we avoid shader
recompiles when switching between float and unorm color buffers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-06 09:01:51 -07:00
Tom Stellard
ec143dc0b1 r600g/llvm: Update radeon family mappings for LLVM backend
New processors were added to the backend to distinguish between
GPUs with and without vertex caches.
2013-05-06 08:22:24 -07:00
Chia-I Wu
5cca6b6280 android: libsync is needed on Android 4.2+ for any driver
Add libsync not only for MESA_BUILD_CLASSIC, but also for MESA_BUILD_GALLIUM.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-05-06 07:20:08 -07:00
Chia-I Wu
da109d56d5 android: add ilo to the build system
It can be selected with

  BOARD_GPU_DRIVERS := ilo

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2013-05-06 07:20:07 -07:00
Eric Anholt
739b88330c glsl: Flip around "if" statements with empty "then" blocks.
This cleans up some funny-looking code in some unigine shaders I was
looking at.  Also slightly helps on planeshift and a few shaders in an
upcoming Valve release.
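
The transformation turns "if (c) {} else { X }" into "if (!c) { X }". A toy
model, using a hypothetical statement struct rather than the GLSL IR classes:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "if" node: the condition is tracked only as a negate flag,
 * and each block only by its instruction count. */
struct sketch_if {
   bool negate_condition;
   int then_count;
   int else_count;
};

/* Returns true if the statement was flipped. */
static bool
sketch_flip_empty_then(struct sketch_if *stmt)
{
   assert(stmt != NULL);

   /* Only flip when "then" is empty and "else" has something to move. */
   if (stmt->then_count != 0 || stmt->else_count == 0)
      return false;

   stmt->negate_condition = !stmt->negate_condition;
   stmt->then_count = stmt->else_count;
   stmt->else_count = 0;
   return true;
}
```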

total instructions in shared programs: 1653715 -> 1653587 (-0.01%)
instructions in affected programs:     16550 -> 16422 (-0.77%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-05 13:20:42 -07:00
Chia-I Wu
008346273c ilo: correctly set return types of sampler messages
Correctly set the types of the temporaries.  We do not want type conversions
when moving the results to the final destinations.
2013-05-05 14:36:39 +08:00
Vincent Lejeune
b42fe195a2 r600g/llvm: Undefines unrequired texture coord values
This is a port of "r600g:mask unused source components for SAMPLE"
patch from Vadim Girlin.
2013-05-04 23:38:50 +02:00
Maarten Lankhorst
c4150123aa nvc0: fixup video decoding with 2D_ARRAY
Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
2013-05-04 20:56:23 +02:00
Chia-I Wu
8c347d4e57 gallium: fix type of flags in pipe_context::flush()
It should be unsigned, not enum pipe_flush_flags.

Fixed a build error:

  src/gallium/state_trackers/egl/android/native_android.cpp:426:29: error:
  invalid conversion from 'int' to 'pipe_flush_flags' [-fpermissive]

v2: replace all occurrences of enum pipe_flush_flags by unsigned

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>

[olv: document the parameter now that the type is unsigned]
2013-05-04 17:32:10 +08:00
Eric Anholt
cbf3462c35 i965: Enable fast clears on non-8x4-aligned sizes.
Improves glb2.7 performance at a misaligned size by 2.3% +/- 0.7% (n=11).
The workaround was to avoid bad primitive/surface sizes, but that's worked
around as of a14dc4f92c.  (One might note
that pre-gen7 we don't know that the right half of an 8x4 at the right
edge is actually our pixels, but we're already clobbering those pixels for
depth resolves anyway and more work would be required to avoid that).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-05-03 20:59:51 -07:00
Brian Paul
76084907fb vbo: add comments, const qualifiers
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul
0baf32508a mesa: whitespace, formatting fixes, etc in api_arrayelt.c
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul
7c9e5afe81 vbo: use new no-op ArrayElement in _mesa_noop_vtxfmt_init()
As we do for the other commands which can appear between glBegin/End.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul
7b762305d5 mesa: change ctx->Driver.NeedFlush to GLbitfield and update comment
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul
36c83ccca0 mesa: change ctx->Driver.SaveNeedFlush to boolean, and document it.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul
af30987a69 vbo: update comments for vbo_save_NotifyBegin()
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul
4ea05bcba6 vbo: implement primitive merging for glBegin/End sequences
A surprising number of apps and benchmarks have poor code like this:

glBegin(GL_LINE_STRIP);
glVertex(v1);
glVertex(v2);
glEnd();
// Possibly some no-op state changes here
glBegin(GL_LINE_STRIP);
glVertex(v3);
glVertex(v4);
glEnd();
// repeat many, many times.

The above sequence can be converted into:

glBegin(GL_LINES);
glVertex(v1);
glVertex(v2);
glVertex(v3);
glVertex(v4);
glEnd();

Similarly for GL_POINTS, GL_TRIANGLES, etc.

Merging was already implemented for GL_QUADS in the display list code.
Now other prim types are handled and it's also done for immediate mode.

In one case:
                                 before   after
-----------------------------------------------
number of st_draw_vbo() calls:     141      45
number of _mesa_prims issued:     7520     632

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul
3702d25082 vbo: create a few utility functions for merging primitives
To be used by following commit.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Zack Rusin
a232afdbfb draw/pt: adjust overflow calculations
gallium lies. buffer_size is not actually buffer_size but available
size, which is 'buffer_size - buffer_offset' so by adding buffer
offset we'd incorrectly compute overflow.
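
A minimal sketch of the corrected check, assuming the size handed in is
already the available size (buffer_size - buffer_offset). The function and
parameter names are made up for the example and are not the draw module's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 'available' already excludes the buffer offset, so the offset must not
 * be added again. 'first_byte' is where the fetch starts relative to the
 * start of the available range. */
static bool
sketch_fetch_overflows(uint64_t available, uint64_t first_byte,
                       uint64_t fetch_size)
{
   assert(fetch_size > 0);

   /* Checking first_byte first keeps the subtraction from wrapping. */
   if (first_byte > available)
      return true;

   return fetch_size > available - first_byte;
}
```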

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 07:07:33 -04:00
Zack Rusin
8490d21cbe tgsi/ureg: make the dst register match the src indirection
In ureg src registers could have an indirect register that was
either a temp or an addr register, while dst registers allowed
only addr. That made moving between them a little difficult so
make them behave the same way and allow temp and addr registers
as indirect files for both (tgsi supports it, just ureg didn't).

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 07:07:33 -04:00
Roland Scheidegger
23025ed15d gallium: tgsi documentation updates and clarification for integer opcodes.
A lot of them were missing. Others were moved from the Compute ISA
to a new Integer ISA section as that seemed more appropriate.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-03 21:36:28 +02:00
Roland Scheidegger
ae507b6260 llvmpipe: get rid of depth swizzling.
Eliminating this we no longer need to copy between linear and swizzled layout.
This is probably not quite ideal since it's a bit more work for now; we could do
some optimizations by moving depth testing outside the fragment shader loop
(but that is tricky for early depth test as we have neither the mask nor the
interpolated z in the right order handy).
The large amount of tile/untile code is no longer needed and will be deleted
in the next commit.
No piglit regressions.
v2: change a forgotten LAYOUT_NONE to LAYOUT_LINEAR.
v3: fix (bogus) uninitialized variable warnings, add comments, fix a bad type

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-03 21:36:20 +02:00
Lauri Kasanen
e495d88453 r600g: Correctly initialize the shader key, v2
Assigning a struct only copies the members - any padding is left as is.

Thus this code:

struct foo_t foo;
foo = bar;

leaves the padding of foo intact, ie uninitialized random garbage.

This patch fixes constant shader recompiles by initializing the struct
to zero. For completeness, memcpy is used to copy the key to the shader
struct.
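
The pattern can be sketched like this, with a hypothetical key struct whose
layout is likely to contain padding. Note the caveat: the C standard does not
strictly promise that padding stays zero after member stores, so this relies
on how common compilers behave in practice.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical key; the char/int/char layout typically has padding. */
struct sketch_key {
   char color_two_side;
   int nr_cbufs;
   char alpha_to_one;
};

/* Zero the whole object first so padding bytes are deterministic. */
static void
sketch_key_init(struct sketch_key *key, char two_side, int nr_cbufs,
                char alpha_to_one)
{
   assert(key != NULL);
   memset(key, 0, sizeof(*key));
   key->color_two_side = two_side;
   key->nr_cbufs = nr_cbufs;
   key->alpha_to_one = alpha_to_one;
}

/* Byte-wise copy, so the (zeroed) padding travels with the members. */
static void
sketch_key_copy(struct sketch_key *dst, const struct sketch_key *src)
{
   memcpy(dst, src, sizeof(*dst));
}

/* Byte-wise compare; only meaningful if both keys were built as above. */
static bool
sketch_key_equal(const struct sketch_key *a, const struct sketch_key *b)
{
   return memcmp(a, b, sizeof(*a)) == 0;
}
```

With keys built this way, a memcmp() mismatch means a member really differs,
which is what avoids the spurious recompiles described above.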

NOTE: This is a candidate for the stable branches.

Signed-off-by: Lauri Kasanen <cand@gmx.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-05-03 19:28:57 +02:00
Lauri Kasanen
5ff81cfd86 st/xvmc/tests: Fix build failure, v2
v2: Removed extra libs as requested by Matt Turner.

Signed-off-by: Lauri Kasanen <cand@gmx.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-05-03 19:14:54 +02:00
Andreas Boll
e62be5de53 scons: remove nouveau build
One build system for linux/unix only drivers should be enough.
Additionally the nouveau target was disabled anyway.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-03 18:44:57 +02:00
Andreas Boll
4ca44f2c5e scons: remove radeon build
One build system for linux/unix only drivers should be enough.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48694

Acked-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-03 18:44:43 +02:00
Alex Deucher
4539f8e20a r600g: don't emit surface_sync after FLUSH_AND_INV_EVENT
It shouldn't be needed since the FLUSH_AND_INV_EVENT has already
made sure the destination caches are flushed.  Additionally,
we didn't previously emit the surface_sync until this commit:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=e5e4c07e7964a3258ed02b530bcdc24c0650204b
Emitting them together causes hangs in compute on cayman/TN
and hangs in Heaven on evergreen.

Note: this patch is a candidate for the 9.1 branch, but requires:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=156bcca62c9f4e79e78929f72bc085757f36a65a
as well.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-05-03 10:55:05 -04:00
Vadim Girlin
41005d7bd2 r600g/sb: zero-initialize bytecode structs
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:42 +04:00
Vadim Girlin
f92bd0958e r600g/sb: fix constant propagation in gvn pass
Fixes the bug that prevented propagation of literals in some cases.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:42 +04:00
Vadim Girlin
3c201a22ca r600g/sb: don't run unnecessary passes
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:42 +04:00
Vadim Girlin
48ba5712f5 r600g/sb: silence warnings with gcc 4.8
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:42 +04:00
Vadim Girlin
c49b6d7f27 r600g/sb: fix handling of interference sets in post_scheduler
post_scheduler clears the interference set for reallocatable values when
the value first becomes live, and then updates it to take into
account the modified order of operations, but this was not handled properly
if the value first appears as a source in a copy operation.

Fixes issues with webgl demo: http://madebyevan.com/webgl-water/

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:42 +04:00
Vadim Girlin
e16ef1f454 r600g/sb: fix allocation of indirectly addressed input arrays
Some inputs may be preloaded into predefined GPRs,
so we can't reallocate arrays with such inputs.

Fixes issues with webgl demo: http://oos.moxiecode.com/js_webgl/snake/

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:41 +04:00
Vadim Girlin
a6fe055fa7 r600g/sb: use hex instead of binary constants
This should fix build issues with GCC < 4.3

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:41 +04:00
Vadim Girlin
4ca67dbf0c r600g: use old shader disassembler by default
The new disassembler is not yet completely isolated from further processing
in r600g/sb that is not required for printing the dump, so it is more
likely to fail on any unexpected features in the bytecode.

This patch adds an "sbdisasm" flag for R600_DEBUG that allows using the new
disassembler in r600g/sb for shader dumps when shader optimization
is not enabled.

If shader optimization is enabled, the new disassembler is used by default.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:41 +04:00
Christian König
b4b3041132 radeon/uvd: enable interlaced buffers by default
Kills tiling on UVD buffers, but we currently don't really need that.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-03 11:00:21 +02:00
Christian König
85b0880a17 vl/idct: fix for commit 7d2f2a0c89
We still need the option for handling 3D textures as well.

Should fix: https://bugs.freedesktop.org/show_bug.cgi?id=64143

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-03 11:00:21 +02:00
Christian König
379753869d vl/buffers: fix typo in function name
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-03 11:00:20 +02:00
Christian König
9c353ea293 radeon/uvd: fix some MPEG4 artifacts
Still not perfect, but a step in the right direction.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-03 11:00:20 +02:00
José Fonseca
abbbc9b667 draw: Update for u_assembled_primitive -> u_assembled_prim rename.
Mesa build is too complex to rely on successful builds. On refactorings
it is always a good idea to use git grep to prevent missing cases:

  $ git grep u_assembled_primitive
  src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline_llvm.c:      u_assembled_primitive(in_prim);
2013-05-03 08:35:17 +01:00
Chia-I Wu
8b2a967e32 st/egl: fix build errors on Android 4.2
The differences from the previous releases that affect st/egl are

 - logging macros are prefixed with an 'A'
 - dequeueBuffer() and enqueueBuffer() require an additional argument for
   fence fd, acquired from libsync

Additionally, include gralloc_drm.h with extern "C".
2013-05-03 13:04:00 +08:00
Chia-I Wu
7346ab3b43 ilo: use u_reduced_prims_for_vertices()
We do not need our own prim_count() anymore.
2013-05-03 11:59:10 +08:00
Chia-I Wu
f87dccdc19 util/prim: add u_reduced_prims_for_vertices()
The function returns the number of reduced/tessellated primitives for the
given vertex count.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:59:10 +08:00
Chia-I Wu
90d5190594 util/prim: assorted fixes for u_decomposed_prims_for_vertices()
Switch to '>=' for comparisons, and it becomes obvious that the comparison for
PIPE_PRIM_QUAD_STRIP was wrong.

Add minimum vertex count check for PIPE_PRIM_LINE_LOOP.  Return 1 for
PIPE_PRIM_POLYGON with 3 vertices.
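
A hedged sketch of what such a count function might look like after these
fixes. The enum and function are stand-ins written for this example; only the
'>=' style, the line-loop minimum and the polygon case are taken from the
description above.

```c
#include <assert.h>

/* Hypothetical subset of primitive types. */
enum sketch_prim {
   SKETCH_LINES,
   SKETCH_LINE_STRIP,
   SKETCH_LINE_LOOP,
   SKETCH_TRIANGLES,
   SKETCH_TRIANGLE_STRIP,
   SKETCH_QUAD_STRIP,
   SKETCH_POLYGON
};

/* Number of primitives produced by 'n' vertices, 0 if too few. */
static unsigned
sketch_decomposed_prims(enum sketch_prim prim, unsigned n)
{
   switch (prim) {
   case SKETCH_LINES:
      return n / 2;
   case SKETCH_LINE_STRIP:
      return n >= 2 ? n - 1 : 0;
   case SKETCH_LINE_LOOP:
      return n >= 2 ? n : 0;             /* minimum vertex count check */
   case SKETCH_TRIANGLES:
      return n / 3;
   case SKETCH_TRIANGLE_STRIP:
      return n >= 3 ? n - 2 : 0;
   case SKETCH_QUAD_STRIP:
      return n >= 4 ? (n - 2) / 2 : 0;   /* needs 4 vertices, not 2 */
   case SKETCH_POLYGON:
      return n >= 3 ? 1 : 0;             /* 3 vertices already make one */
   }

   assert(!"unknown primitive type");
   return 0;
}
```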

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:59:10 +08:00
Chia-I Wu
30671cecc0 util/prim: use vertex count info in u_validate_pipe_prim()
As a side effect, primitives with adjacency are now correctly validated.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:59:10 +08:00
Chia-I Wu
ddf0e3930f util/prim: fix the name of the include guard
It should be U_PRIM_H, not U_BLIT_H.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:59:10 +08:00
Chia-I Wu
5dd3bd70a1 draw: use u_assembled_prim() instead of u_assembled_primitive()
The latter function is also removed as a result of the change.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:59:10 +08:00
Chia-I Wu
185692e72c util/prim: clean up and add comments
Move together (or add) functions to decompose/reduce/assemble a primitive,
give them consistent names, and document them.  Add u_prim_vertex_count() so
that the vertex count information can be used elsewhere.

u_assembled_primitive() will be removed in a follow-on commit.

[olv: fix a warning when -Wold-style-declaration is enabled]

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:58:57 +08:00
Chia-I Wu
64913002e4 util/prim: fix primitive trimming for triangles with adjacency
Fix for PIPE_PRIM_TRIANGLES_ADJACENCY and PIPE_PRIM_TRIANGLE_STRIP_ADJACENCY.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:39:12 +08:00
Eric Anholt
573d8813fd i965/vs: Add instruction scheduling.
While this is ignorant of dependency control, it's still good for a 0.39%
+/- 0.08% performance improvement on GLBenchmark 2.7 (n=548)

v2: Rewrite as a subclass of the base class for the FS instruction
    scheduler, inheriting the same latency information.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:47 -07:00
Eric Anholt
3b00a6acac i965: Move most of the FS instruction scheduler code to a general class.
About half of this is shareable with the VS code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:43 -07:00
Eric Anholt
ce22dd75b7 i965: Pull a couple of FS scheduling functions out to methods.
These will get virtualized as we add VS scheduling support.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:39 -07:00
Eric Anholt
ee0223ba2a i965: Move FS instruction scheduling to a non-FS-specific file.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:35 -07:00
Eric Anholt
ab04f3b2d7 i965: Share the register file enum between the two backends.
I need this so I can look at vec4 and fs registers' files from the same
.cpp file without namespaces.  As far as I can tell we never rely on the
particular numerical values of the files, though I thought it sounded like
a good idea when doing the VS (it turns out having 0 be BAD_FILE is nicer).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:31 -07:00
Eric Anholt
63c8155b09 i965: Make dump_instructions be a virtual method of the visitor.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:26 -07:00
Eric Anholt
74e670d0a3 i965/vs: Do round-robin register allocation on gen6+ like we do in the FS.
This will free instruction scheduling to make better choices.  No
statistically significant performance difference on GLB2.7 (n=93).
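
A minimal sketch of the round-robin idea, not the i965 allocator itself. It assumes a tiny four-register file; `sketch_reg_file`, `alloc_round_robin` and `free_reg` are hypothetical names. The search starts after the last register handed out, so a just-freed register is not reused immediately, which is presumably what leaves the scheduler fewer false dependencies to respect.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NUM_REGS 4   /* hypothetical register file size */

struct sketch_reg_file {
    bool in_use[SKETCH_NUM_REGS];
    unsigned next_start;    /* where the next search begins */
};

/* Returns the allocated register number, or -1 if none is free. */
static int alloc_round_robin(struct sketch_reg_file *rf)
{
    for (unsigned i = 0; i < SKETCH_NUM_REGS; i++) {
        unsigned reg = (rf->next_start + i) % SKETCH_NUM_REGS;
        if (!rf->in_use[reg]) {
            rf->in_use[reg] = true;
            rf->next_start = (reg + 1) % SKETCH_NUM_REGS;
            return (int)reg;
        }
    }
    return -1;
}

static void free_reg(struct sketch_reg_file *rf, int reg)
{
    assert(reg >= 0 && reg < SKETCH_NUM_REGS);
    rf->in_use[reg] = false;
}
```

With a lowest-free-first policy, freeing register 0 and allocating again would return 0; here the next allocation returns 1.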

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:09 -07:00
Rob Bradford
15e64de9e6 wayland: Make eglQueryBufferWL succeed for width and height requests too
Following the addition of the EGL_WIDTH and EGL_HEIGHT this function should
return EGL_TRUE for those requested attributes too.
2013-05-02 16:46:04 -04:00
Zack Rusin
396b861ceb draw/gs: don't crash when vs/gs signatures don't match
Instead of crashing, just fill zeros at the input slots that don't
match; that's the mandated behavior and it avoids debug asserts.
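
A rough sketch of the zero-fill behavior described above, under simplifying assumptions: signatures are reduced to (semantic name, semantic index) pairs and every slot holds four floats. `sketch_slot`, `find_vs_output` and `fetch_gs_inputs` are hypothetical names, not the draw module's API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified signature entry. */
struct sketch_slot {
    int semantic_name;
    int semantic_index;
};

/* Returns the index of the VS output matching gs_in, or -1 if none. */
static int find_vs_output(const struct sketch_slot *vs_out, size_t num_vs_out,
                          struct sketch_slot gs_in)
{
    for (size_t i = 0; i < num_vs_out; i++) {
        if (vs_out[i].semantic_name == gs_in.semantic_name &&
            vs_out[i].semantic_index == gs_in.semantic_index)
            return (int)i;
    }
    return -1;
}

/* Copies one vertex's VS outputs (4 floats per slot) into the GS input
 * slots. Inputs with no matching VS output are filled with zeros. */
static void fetch_gs_inputs(const struct sketch_slot *vs_out, size_t num_vs_out,
                            const float *vs_data,
                            const struct sketch_slot *gs_in, size_t num_gs_in,
                            float *gs_data)
{
    assert(vs_data != NULL && gs_data != NULL);
    for (size_t i = 0; i < num_gs_in; i++) {
        int src = find_vs_output(vs_out, num_vs_out, gs_in[i]);
        if (src < 0)
            memset(&gs_data[i * 4], 0, 4 * sizeof(float));
        else
            memcpy(&gs_data[i * 4], &vs_data[(size_t)src * 4],
                   4 * sizeof(float));
    }
}
```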

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-02 02:43:42 -04:00
Zack Rusin
999cd79c9e tgsi: allow negation of all integer types
It's valid because we reuse certain arithmetic operations
for both signed and unsigned types (e.g. uadd, umad, which
have a bit unfortunate naming)

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-02 02:43:42 -04:00
Eric Anholt
1dfea559c3 i965: Fix SNB GPU hangs when a blorp batch is the first thing to execute.
The GPU apparently goes looking for constants even though there are no
shader stages enabled, and gets stuck because we haven't told it there are
no constants to collect.  If any other user of the 3D pipeline had run
(even the Render accel of the X server!) since power on, then the in-GPU
constant buffers would have been set up with some contents we didn't use,
and we would succeed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56416
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Dave Airlie <airlied@redhat.com>
NOTE: This is a candidate for the stable branches.
2013-05-02 11:27:37 -07:00
Tom Stellard
156bcca62c r600g: Don't set the dest cache bits on surface sync for R600_CONTEXT_FLUSH_AND_INV
We are already emitting a EVENT_TYPE_CACHE_FLUSH_AND_INV_EVENT packet
when this flush flag is set, so flushing the dest caches with a
SURFACE_SYNC should not be necessary.

The motivation for this change is that emitting a SURFACE_SYNC packet with
the CB bits set was causing compute shaders to hang on Cayman.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-05-02 09:00:37 -07:00
Tom Stellard
5752be0cb7 r600g/compute: Fix build error in debug code
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-05-02 09:00:37 -07:00
Armin K
cd84353d57 radeon: Fix build with LLVM 3.3
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-05-02 09:00:37 -07:00
Armin K
4742f9b00b gallivm: Fix build with LLVM 3.3
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-05-02 09:00:37 -07:00
Brian Paul
fcfbf4a19f mesa: update comments, simplify code in vtxfmt.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:16 -06:00
Brian Paul
5dc0081ade mesa: update GLvertexformat comments
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:16 -06:00
Brian Paul
200e09e393 mesa: remove GLvertexformat::EvalMesh1(), EvalMesh2()
See previous commit comments.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:16 -06:00
Brian Paul
0f365b2d77 mesa: remove GLvertexformat::Rectf()
As with the glDraw* functions, this doesn't have to be in GLvertexformat.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:16 -06:00
Brian Paul
49993a1a9d mesa: simplify dispatch for glDraw* functions
Remove all the glDraw* functions from the GLvertexformat structure.
The point of that dispatch struct is to handle all the functions which
dispatch differently depending on whether we're inside glBegin/End.
glDraw* are never allowed inside glBegin/End so we can remove those
entries.

This simplifies the code paths and gets rid of quite a bit of code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:16 -06:00
Brian Paul
79679e258b vbo: add new vbo_initialize_exec_dispatch(), vbo_initialize_save_dispatch()
First step in simplifying the vertex array / glDraw dispatch code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
d0102500bd mesa: remove _MESA_INIT_EVAL_VTXFMT() macro
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
43b3d3bc25 mesa: remove _MESA_INIT_ARRAYELT_VTXFMT() macro
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
95188fd10f mesa: remove _MESA_INIT_DLIST_VTXFMT() macro
Just expand the code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
84e62b7358 mesa: change _mesa_inside_dlist_begin_end() to handle PRIM_UNKNOWN
If the currently compiled primitive state is PRIM_UNKNOWN we should
not return true from _mesa_inside_dlist_begin_end().  This lets us
simplify the calls to that function.
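
A sketch of the intended check, with assumed constant values: the `SKETCH_` names are hypothetical stand-ins, and the only value taken from OpenGL itself is 0x000D for GL_TRIANGLE_STRIP_ADJACENCY. Anything up to the largest real primitive counts as "inside"; the two sentinel states do not.

```c
#include <assert.h>
#include <stdbool.h>

enum {
    SKETCH_PRIM_MAX               = 0x000D, /* GL_TRIANGLE_STRIP_ADJACENCY */
    SKETCH_PRIM_OUTSIDE_BEGIN_END = SKETCH_PRIM_MAX + 1, /* assumed layout */
    SKETCH_PRIM_UNKNOWN           = SKETCH_PRIM_MAX + 2  /* assumed layout */
};

/* True only when the primitive being compiled is a real GL primitive,
 * i.e. neither "outside Begin/End" nor "unknown". */
static bool sketch_inside_dlist_begin_end(unsigned current_save_primitive)
{
    assert(current_save_primitive <= SKETCH_PRIM_UNKNOWN);
    return current_save_primitive <= SKETCH_PRIM_MAX;
}
```

Comparing against the maximum primitive rather than GL_POLYGON (0x0009) also keeps the adjacency primitives on the "inside" side of the test.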

Note, the call to _mesa_inside_dlist_begin_end() in vbo_save_EndList()
should have probably been checking for PRIM_UNKNOWN too, but it wasn't.
So there's no code change.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
daf19f28c6 mesa: add names of geometry shader prims in gl_enums.py
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
5472ae1fa9 vbo: fix initial value of ctx->Driver.CurrentSavePrimitive
This is set during context creation/initialization.  We know we're
not inside glBegin/glEnd at this point so use PRIM_OUTSIDE_BEGIN_END.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
ecea61e414 vbo: fix error detection in vbo_save_playback_vertex_list()
The old code didn't make sense.  The clause in question did the
same thing as the next else-if clause.  If we're already executing
a glBegin/End pair and we're starting a new primitive, that's an
error.

Fixes more failures in piglit gl-1.0-beginend-coverage test.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
a07437dc28 mesa: comments, formatting fixes in dlist code
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
e880b7cbf8 vbo: remove redundant vfmt->Begin = _save_Begin assignment
The same assignment appears later in the function.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
3e7c16997a mesa: don't install glDraw* functions into the BeginEnd dispatch table
Functions like glDrawArrays, glDrawElements, etc. are illegal between
glBegin/glEnd and should generate GL_INVALID_OPERATION.
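
A simplified sketch of the two-table arrangement, not Mesa's dispatch code: every `sketch_` name is hypothetical, and only the 0x0502 value of GL_INVALID_OPERATION comes from OpenGL. The table used between Begin and End gets an error stub in the draw slot instead of the real entry point.

```c
#include <assert.h>

#define SKETCH_GL_NO_ERROR          0
#define SKETCH_GL_INVALID_OPERATION 0x0502 /* value of GL_INVALID_OPERATION */

struct sketch_context;
typedef void (*sketch_draw_arrays_fn)(struct sketch_context *ctx,
                                      unsigned mode, int first, int count);

struct sketch_dispatch {
    sketch_draw_arrays_fn DrawArrays;
};

struct sketch_context {
    unsigned error;
    int draws_executed;
    struct sketch_dispatch outside_begin_end;
    struct sketch_dispatch inside_begin_end;
    const struct sketch_dispatch *current;
};

static void draw_arrays_exec(struct sketch_context *ctx,
                             unsigned mode, int first, int count)
{
    (void)mode; (void)first; (void)count;
    ctx->draws_executed++;
}

/* Stub installed in the Begin/End table: records the error, draws nothing. */
static void draw_arrays_error(struct sketch_context *ctx,
                              unsigned mode, int first, int count)
{
    (void)mode; (void)first; (void)count;
    if (ctx->error == SKETCH_GL_NO_ERROR)
        ctx->error = SKETCH_GL_INVALID_OPERATION;
}

static void sketch_init(struct sketch_context *ctx)
{
    ctx->error = SKETCH_GL_NO_ERROR;
    ctx->draws_executed = 0;
    ctx->outside_begin_end.DrawArrays = draw_arrays_exec;
    ctx->inside_begin_end.DrawArrays = draw_arrays_error;
    ctx->current = &ctx->outside_begin_end;
}

static void sketch_begin(struct sketch_context *ctx)
{
    ctx->current = &ctx->inside_begin_end;
}

static void sketch_end(struct sketch_context *ctx)
{
    ctx->current = &ctx->outside_begin_end;
}

static void sketch_draw_arrays(struct sketch_context *ctx,
                               unsigned mode, int first, int count)
{
    assert(ctx->current != 0);
    ctx->current->DrawArrays(ctx, mode, first, count);
}
```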

Fixes several piglit gl-1.0-beginend-coverage failures.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
d6f3ef92d7 vbo: fix parameter validation for saving dlist glDraw* functions
The _save_OBE_DrawArrays/Elements/RangeElements() functions are
called when building a display list and we know we're outside
glBegin/End.

We shouldn't call the normal _mesa_validate_DrawArrays/Elements()
functions here because those functions only work properly in immediate
mode or during dlist execution.  At dlist compile time, we can't call
_mesa_update_state(), etc. and examine the current state since it won't
apply when the list is executed later.

Fixes several failures in piglit's gl-1.0-beginend-coverage test.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
94c7caf406 mesa: add missing error check in _mesa_EndList()
If we're in GL_COMPILE_AND_EXECUTE mode and inside glBegin, calling
glEndList() should generate an error.

Fixes a failure in piglit's gl-1.0-beginend-coverage test.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
c1a5c5c13d mesa: remove unused PRIM_INSIDE_UNKNOWN_PRIM constant
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
d5bdce1142 mesa: simplify save_Begin() error checking
The old code was hard to understand and not entirely correct.
Note that PRIM_INSIDE_UNKNOWN_PRIM is no longer set anywhere so
we'll be able to remove that next.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
bb459f6295 mesa: refactor _mesa_valid_prim_mode()
...in terms of new _mesa_is_valid_prim_mode().  We need a mode validator
function that doesn't depend on current state for the display list code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:14 -06:00
Brian Paul
8be093e2f6 mesa: fix CurrentSavePrimitive <= GL_POLYGON tests
Use the new PRIM_MAX value instead so that new geometry shader primitive
types are accounted for.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:14 -06:00
Brian Paul
cce6e30613 mesa: adjust PRIM_x constants for geometry shaders
These values pertain to display lists, and the new types of geometry
shader primitives can be used in display lists.

And add new PRIM_MAX constant for follow-on changes.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:14 -06:00
Brian Paul
aa782f260d mesa: fix save_ShadeModel() logic and add new comments
This removes the test for _mesa_inside_dlist_begin_end().
If ctx->Driver.CurrentSavePrimitive==PRIM_UNKNOWN (the initial value),
_mesa_inside_dlist_begin_end() will, confusingly, return TRUE.
So we didn't set the ctx->ListState.Current.ShadeModel value and it
remained in its indeterminate state.

This didn't affect correctness, but it defeated the intended optimization
of dropping redundant glShadeModel() state changes in order to
coalesce sequences of drawing commands.

Verified with new piglit gl-1.0-dlist-shademodel test.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:14 -06:00
Adam Jackson
16296cc843 gallivm: Fix altivec intrinsics for 8xi16 add/sub
Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-05-02 10:34:08 -04:00
Lauri Kasanen
35c5b95b94 r600/sb: Fix build failure with non-standard libdrm installation prefix
Just like radeon/uvd, r600/sb fails to find the libdrm includes.

Signed-off-by: Lauri Kasanen <cand@gmx.com>
2013-05-02 14:57:00 +02:00
Lauri Kasanen
e2b985dc0f radeon/uvd: Fix build failure with non-standard libdrm installation prefix
Without this patch, radeon_uvd failed to find the libdrm includes:

In file included from radeon_uvd.c:48:
../../winsys/radeon/drm/radeon_winsys.h:44:35: error:
libdrm/radeon_surface.h: No such file or directory

Signed-off-by: Lauri Kasanen <cand@gmx.com>
2013-05-02 14:54:03 +02:00
Jordan Justen
02f2bce08d mesa: implement glFramebufferTexture
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-01 16:18:25 -07:00
Jordan Justen
5da8288911 mesa: add Layered field to framebuffers
When checking framebuffer completeness, we test each attachment.
We verify that all attachments are consistent in terms of layers.

1. They must all be layered, or all non-layered
2. If they are layered, they must match in depth
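
The two rules above can be sketched as follows. This is an illustration under assumed data structures (`sketch_attachment` and `layers_consistent` are hypothetical names), not the completeness code added by this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of a framebuffer attachment. */
struct sketch_attachment {
    bool present;    /* false for an empty attachment point */
    bool layered;
    unsigned depth;  /* number of layers; only compared when layered */
};

/* Returns true if all present attachments agree on layeredness and,
 * when layered, on depth. Empty attachment points are skipped. */
static bool layers_consistent(const struct sketch_attachment *att, size_t n)
{
    bool have_first = false;
    bool layered = false;
    unsigned depth = 0;

    assert(att != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!att[i].present)
            continue;
        if (!have_first) {
            have_first = true;
            layered = att[i].layered;
            depth = att[i].depth;
            continue;
        }
        if (att[i].layered != layered)
            return false;               /* rule 1 */
        if (layered && att[i].depth != depth)
            return false;               /* rule 2 */
    }
    return true;
}
```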

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-01 15:31:48 -07:00
Jordan Justen
a62808085a mesa: add renderbuffer attachment Layered field
If glFramebufferTexture is used, then the framebuffer attachment is
layered.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-01 15:31:44 -07:00
Jordan Justen
a05e201d4a mesa: add renderbuffer Depth field
With glFramebufferTexture, a renderbuffer may support
all layers of the texture, so we need the depth of the
renderbuffer to check for consistency which is required
for framebuffer completeness.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-01 15:30:48 -07:00
Andreas Boll
b8e41db053 mesa: add usage examples to get-pick-list and shortlog scripts
NOTE: This is a candidate for the stable branches.
2013-05-01 21:42:02 +02:00
Andreas Boll
df01201132 docs: add info about bugzilla_mesa.sh script 2013-05-01 21:42:02 +02:00
Andreas Boll
ca79b72c00 mesa: Add a script to generate the list of fixed bugs
This list appears in the fixed bugs section of the release notes.

v2: Add usage examples

NOTE: This is a candidate for the stable branches.
2013-05-01 21:42:02 +02:00
Andreas Boll
f6aab27d43 scons: remove IN_DRI_DRIVER
Not used anymore.
2013-05-01 21:34:48 +02:00
Andreas Boll
be0fec4f5b build: remove unused API_DEFINES
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-01 21:34:48 +02:00
Brian Paul
7f8434b866 configure: remove IN_DRI_DRIVER
Not used anymore.

v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - split patch into two patches
    - remove more unused code

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-01 21:34:48 +02:00
Brian Paul
4ede5fb0c6 configure: remove FEATURE_GL/ES1/ES2
Not used anymore.

v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - split patch into two patches

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-01 21:34:48 +02:00
Andreas Boll
6b8f55c4da intel: use automake conditionals for defining FEATURE_{ES1,ES2}
Removes the need of API_DEFINES.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-01 21:34:48 +02:00
Andreas Boll
afa33a001a egl-static: use automake conditionals for defining FEATURE_{GL,ES1,ES2}
Removes the need of API_DEFINES.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-01 21:34:48 +02:00
Andreas Boll
3537d853d0 intel: remove executable bit from C file
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-01 21:34:48 +02:00
Brian Paul
aaab450d22 docs: s/Aprile/April/ 2013-05-01 13:17:21 -06:00
Andreas Boll
85e5bc106c docs: fix 9.1.2 release notes 2013-05-01 21:01:48 +02:00
Marek Olšák
8eef6ad2e2 vbo: fix possible use-after-free segfault after a VAO is deleted
This is like the fifth attempt to fix the issue.

Also with the new "validating" flag, we can set recalculate_inputs to FALSE
earlier in vbo_bind_arrays, because _mesa_update_state won't change it.

NOTE: This is a candidate for the stable branches.

v2: fixed a typo

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-01 20:08:53 +02:00
Kenneth Graunke
b5b6460c40 i965/vs: Fix textureGrad() with shadow samplers on Haswell.
The shadow comparator needs to be loaded into the Z component of the
last DWord.

Fixes es3conform's shadow_execution_vert and oglconform's
shadow-grad advanced.textureGrad.1D tests on Haswell.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-01 10:42:51 -07:00
Kenneth Graunke
e2f887b243 i965: Lower textureGrad() for samplerCubeShadow.
According to the Ivybridge PRM, Volume 4 Part 1, page 130, in the
section for the sample_d message: "The r coordinate contains the faceid,
and the r gradients are ignored by hardware."

This doesn't match GLSL, which provides gradients for all of the
coordinates.  So we would need to do some math to compute the face ID
before using sample_d.  We currently don't have any code to do that.

However, we do have a lowering pass that converts textureGrad to
textureLod, which solves this problem.  Since textureGrad on three
components is sufficiently obscure, it's not a performance path.

For now, only handle samplerCubeShadow; we need tests for samplerCube
and samplerCubeArray.

Fixes es3conform's shadow_comparison_frag test on Haswell.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-01 10:42:51 -07:00
Christian König
163b4da874 radeon/uvd: fix quant scan order for mpeg2
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-01 13:33:46 +02:00
Christian König
3aafe2437d st/vdpau: fix background handling in the mixer
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-01 13:33:46 +02:00
Christian König
7d2f2a0c89 vl/buffer: use 2D_ARRAY instead of 3D textures
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-01 13:33:46 +02:00
Christian König
e27f87b549 vl/compositor: cleanup background clearing
Add an extra parameter to specify if we should clear the render target.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-01 13:33:46 +02:00
Brian Paul
236ea7900f swrast: add casts for ImageSlices pointer arithmetic
MSVC doesn't like pointer arithmetic with void * so use GLubyte *.
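
For illustration only: arithmetic on `void *` is a GNU extension rather than standard C, so the portable form casts to a byte pointer first. `slice_address` is a hypothetical helper, and `unsigned char` stands in for GLubyte here.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the address of the given slice inside a mapped image.
 * The cast makes the byte-offset arithmetic valid standard C. */
static void *slice_address(void *base, size_t slice, size_t bytes_per_slice)
{
    assert(base != NULL);
    return (unsigned char *)base + slice * bytes_per_slice;
}
```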

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-01 11:53:02 +01:00
Chia-I Wu
22c5e048bd ilo: fix PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS
On GEN7+, is->dev.has_gen7_sol_reset is required.
2013-05-01 17:41:39 +08:00
Chia-I Wu
16f81fcf1e ilo: enable SO support on GEN7 2013-05-01 17:36:44 +08:00
Chia-I Wu
d26f70e208 ilo: reset SO write offsets for new SO targets
When the SO targets are changed and no appending is requested, we need to send
SOL_RESET on GEN7+.
2013-05-01 17:36:44 +08:00
Chia-I Wu
68e1f76e46 ilo: correctly program SO states for GEN7
With the commands supported by GPE, we can finally program the states.
2013-05-01 17:36:44 +08:00
Chia-I Wu
9557cd39e2 ilo: implement GEN7 SO GPE functions
They were just stubs before.
2013-05-01 17:36:09 +08:00
Chia-I Wu
9069a3b065 ilo: add gen6_pipeline_update_max_svbi()
Move max_svbi calculation to a helper function and make it available for other
GENs.
2013-05-01 17:35:43 +08:00
Chia-I Wu
252a21c2cc ilo: expose register indices of OUTs in ilo_shader
pipe_stream_output_info tells us which of OUT[i] needs to be written out.
We need the info to map OUT[i] to VUE offset.
2013-05-01 17:34:49 +08:00
Chia-I Wu
440557db4e ilo: allow one-off flags to be specified for CP
It will be used for SOL_RESET on GEN7.
2013-05-01 16:03:44 +08:00
Chia-I Wu
dd62e7bc02 ilo: fix tiling/size for special-purpose resources
We do not allocate such resources yet though.
2013-05-01 12:00:32 +08:00
Chia-I Wu
7726e9500c ilo: use UMS layout for render targets
As we do not advertise MSAA support, this change should not make any
difference yet.
2013-05-01 11:56:43 +08:00
Chia-I Wu
334abed828 ilo: support and prefer compact array spacing
There is no reason to waste the memory when the HW can support compact array
spacing (ARYSPC_LOD0).
2013-05-01 11:31:15 +08:00
Chia-I Wu
ce188bb252 ilo: move device limits to ilo_dev_info or to GPEs
It seems a bit weird to have device limits in a context.
2013-05-01 11:23:11 +08:00
Chia-I Wu
bef98f9c3a ilo: use ilo_dev_info in toy compiler
We need only dev->gen, but it makes sense to expose other information to the
compiler.
2013-05-01 11:22:57 +08:00
Chia-I Wu
51d749e7e2 ilo: use ilo_dev_info in GPE and 3D pipeline
We need only dev->gen and dev->gt, but it makes sense to expose other
information to the pipeline.
2013-05-01 11:22:20 +08:00
Chia-I Wu
bb1f635dcc ilo: add ilo_dev_info shared by the screen and contexts
The struct is used to describe the device information, such as PCI ID, GEN,
GT, etc.
2013-05-01 11:20:41 +08:00
Chia-I Wu
355f3f7ab5 ilo: fix indentation of ilo_gpe_gen*.h 2013-05-01 11:20:32 +08:00
Kenneth Graunke
6c5cf8baa1 glsl: Ignore redundant prototypes after a function's been defined.
Consider the following shader:

    vec4 f(vec4 v) { return v; }
    vec4 f(vec4 v);

The prototype exactly matches the signature of the earlier definition,
so there's absolutely no point in it.  However, it doesn't appear to
be illegal.  The GLSL 4.30 specification offers two relevant quotes:

"If a function name is declared twice with the same parameter types,
 then the return types and all qualifiers must also match, and it is the
 same function being declared."

"User-defined functions can have multiple declarations, but only one
 definition."

In this case the same function was declared twice, and there's only one
definition, which fits both pieces of text.  There doesn't appear to be
any text saying late prototypes are illegal, so presumably it's valid.

Unfortunately, it currently triggers an assertion failure:
ir_dereference_variable @ <p1> specifies undeclared variable `v' @ <p2>

When we process the second line, we look for an existing exact match so
we can enforce the one-definition rule.  We then leave sig set to that
existing function, and hit sig->replace_parameters(&hir_parameters),
unfortunately nuking our existing definition's parameters (which have
actual dereferences) with the prototype's bogus unused parameters.

Simply bailing out and ignoring such late prototypes is the safest
thing to do.
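
A toy model of the bail-out, assuming a signature is just a "defined" flag plus a list of parameter names. `sketch_signature` and `declare_prototype` are hypothetical and far simpler than the real AST-to-HIR code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_PARAMS 4

struct sketch_signature {
    bool is_defined;
    size_t num_params;
    const char *param_names[SKETCH_MAX_PARAMS];
};

/* Processes a prototype that exactly matches an existing signature.
 * Returns false, leaving the signature untouched, when the function
 * has already been defined; otherwise adopts the prototype's parameters. */
static bool declare_prototype(struct sketch_signature *sig,
                              const char *const *proto_params,
                              size_t num_params)
{
    assert(num_params <= SKETCH_MAX_PARAMS);
    if (sig->is_defined)
        return false;   /* late prototype: keep the definition's parameters */

    sig->num_params = num_params;
    for (size_t i = 0; i < num_params; i++)
        sig->param_names[i] = proto_params[i];
    return true;
}
```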

Fixes Piglit's late-proto.vert as well as 3DMark/Ice Storm for Android.

NOTE: This is a candidate for stable branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2013-04-30 16:43:42 -07:00
Ian Romanick
abfe486b9e docs: Import 9.1.2 release notes, add news item.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-04-30 15:33:29 -07:00
Matt Turner
1b6281443d build: Remove libws_xlib.la from GALLIUM_PIPE_LOADER_LIBS.
The three users of GALLIUM_PIPE_LOADER_LIBS (OpenCL, gallium-gbm,
gallium tests) don't appear to need libws_xlib.la.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 14:03:32 -07:00
Matt Turner
460996b937 build: Remove libpipe_loader.la from GALLIUM_PIPE_LOADER_LIBS.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 14:03:32 -07:00
Matt Turner
538e10f3ea build: Remove HAVE_PIPE_LOADER_SW.
It guarded the function prototype of pipe_loader_sw_probe, whose use (in
pipe_loader.c) and definition (in pipe_loader_sw.c) were not guarded.
Both are built into libpipe_loader.la if HAVE_LOADER_GALLIUM, which is
enable_gallium_loader in configure.ac.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 14:03:32 -07:00
Matt Turner
ea6caf4cdf build: Remove libws_null.la from GALLIUM_PIPE_LOADER_LIBS.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 14:03:32 -07:00
Matt Turner
242809942f build: Rename PIPE_LOADER_HAVE_XCB to HAVE_PIPE_LOADER_XCB.
For consistency, since we already have HAVE_PIPE_LOADER_{SW,DRM}.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 14:03:32 -07:00
Matt Turner
657cfe6252 configure.ac: Remove unused HAVE_PIPE_LOADER_XLIB macro.
Added in e1364530 but never used.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 14:03:31 -07:00
Paul Berry
bdf13dc832 i965: Stop passing num_samples to intel_miptree_alloc_hiz().
The number of samples is already available in the miptree data
structure, so there's no need to pass it in.

I suspect this may fix a subtle bug because in one case
(intel_renderbuffer_update_wrapper) we were always passing zero for
num_samples, even though the buffer in question was not guaranteed to
be single-sampled.  But I wasn't able to find a failing test case.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-30 13:46:57 -07:00
Zack Rusin
d48054ff22 draw: don't crash if GS doesn't emit anything
Technically it's legal for a geometry shader to not emit any
vertices. It's silly, but perfectly legal, so let's make draw
stop crashing if it happens.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-27 17:28:04 -04:00
Eric Anholt
e56095dc2e i965: Implement color clears using a simple shader in blorp.
The upside is less CPU overhead in fiddling with GL error handling, the
ability to use the constant color write message in most cases, and no GLSL
clear shaders appearing in MESA_GLSL=dump output.  The downside is more
batch flushing and a total recompute of GL state at the end of blorp.
However, if we're ever going to use the fast color clear feature of CMS
surfaces, we'll need this anyway since it requires very special state
setup.

This increases the fail rate of some of the GLES3conform ARB_sync tests,
because of the initial flush at the start of blorp.  The tests already
intermittently failed (because it's just a bad testing procedure), and we
can return it to its previous fail rate by fixing the initial flush.

Improves GLB2.7 performance 0.37% +/- 0.11% (n=71/70, outlier removed).

v2: Rename the key member, use the core helper for sRGB, and use
    BRW_MASK_* enums, fix comment and indentation (review by Paul).
v3: Rewrite a comment, drop a silly temporary variable (review by Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-30 11:59:23 -07:00
Eric Anholt
e34c857639 mesa: Make a Mesa core function for sRGB render encoding handling.
v2: const-qualify ctx, and add a comment about the function (recommended
    by Brian and Kenneth).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-04-30 11:59:23 -07:00
Eric Anholt
db31bc5cfb i965: Don't flush the batch at the end of blorp.
Improves GLB2.7 performance 0.13% +/- 0.09% (n=104/105, outliers removed).
More importantly, once color glClear()s are done through blorp in the next
commit, this reduces regression in GLES3 conformance tests that rely on
queueing up many glClear()s and having the GPU report being still busy in
an ARB_sync query after that.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-30 11:59:23 -07:00
Vadim Girlin
fb1eed9ec5 r600g/sb: remove unused code
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:48 +04:00
Vadim Girlin
3f18dd818f r600g/sb: collect shader statistics
Collects various statistical information for each shader
and total stats for contexts.

Printed with R600_DEBUG=sb,sbstat

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:48 +04:00
Vadim Girlin
6ba7a162b6 r600g/sb: don't propagate dead values in GVN pass
In some cases we use value::gvn_source field to link values that
are known to be equal before gvn pass (e.g. results of DOT4 in different
slots of the same alu group), but then source value may become dead later
and this confuses further passes.

This patch resets value::gvn_source to NULL in the dce_cleanup pass
if it points to dead value.
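
A minimal C sketch of that cleanup step. The real backend is C++ and its value type carries much more state; `sketch_value` and `cleanup_gvn_sources` are hypothetical names used only to show the reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the backend's value type. */
struct sketch_value {
    bool dead;
    struct sketch_value *gvn_source;
};

/* Drops gvn_source links that point at dead values so that later
 * passes never follow them. Links to live values are kept. */
static void cleanup_gvn_sources(struct sketch_value *values, size_t n)
{
    assert(values != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (values[i].gvn_source != NULL && values[i].gvn_source->dead)
            values[i].gvn_source = NULL;
    }
}
```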

Fixes segfault during shader optimization with ETQW.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:48 +04:00
Vadim Girlin
3e476c311f r600g/sb: use simple heuristic to limit register pressure
It's not a complete register pressure tracking, yet it helps to prevent
register allocation problems in some cases where they were observed.

The problems are uncovered by false dependencies between fetch instructions
introduced by some recent changes in TGSI and/or default backend.
Sometimes we have code like this:

...
SAMPLE R5.xyzw, R5.xyzw
... store R5.xyzw somewhere
MOV R5.x, <next x coord>
MOV R5.y, <next y coord>
SAMPLE R5.xyzw, R5.xyzw
... <may be repeated a lot of times>

With 2D resources, z and w in SAMPLE src reg aren't used and can be simply
masked, but shader backend doesn't have this information, so it's
considered as data dependency by optimization algorithms.
2013-04-30 21:50:48 +04:00
Vadim Girlin
6d6c8c88a3 r600g/sb: improve error checking in ra_coalesce pass 2013-04-30 21:50:47 +04:00
Vadim Girlin
188c893e65 r600g/sb: use source bytecode in case of optimization errors 2013-04-30 21:50:47 +04:00
Vadim Girlin
ad1df471d0 r600g: plug in optimizing backend
Optimization is enabled with "R600_DEBUG=sb".

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:47 +04:00
Vadim Girlin
2cd7691793 r600g/sb: initial commit of the optimizing shader backend 2013-04-30 21:50:47 +04:00
Vadim Girlin
fbb065d629 r600g: use enum type for domains field in struct r600_resource
This prevents the problems when the header is included in C++ code.
2013-04-30 21:50:47 +04:00
Vadim Girlin
d5b30fd036 r600g: add new flags to isa instruction tables 2013-04-30 21:50:47 +04:00
Vadim Girlin
a919424215 r600g: always create reverse lookup isa tables 2013-04-30 21:50:47 +04:00
Vadim Girlin
7d555f2f4c r600g: mask unused source components for SAMPLE
This results in cleaner shader code and may improve the quality of
optimized code produced by r600-sb due to eliminated false dependencies
in some cases.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:47 +04:00
Eric Anholt
df410863d7 intel: Remove the last spans code!
The remaining bits happen to do nothing that
_swrast_span_render_start()/finish() don't do.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:45 -07:00
Eric Anholt
526cf46666 intel: Move the S8 offset calc function near its remaining usage.
It's not really span code ever since we stopped using spans for S8.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:45 -07:00
Eric Anholt
e7c5e9949b intel: Ensure renderbuffers are current when mapping them.
In the case of rendering to windows in X, we would render to stale buffers
(or not render at all!) if you hit a MapRenderbuffer as the first thing
done to your window after new buffers are ready to be collected in DRI2.

I think this also covers the weird comment about irb->mt being missing
sometimes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:45 -07:00
Eric Anholt
0e8ef74c5f mesa: Add a clarifying comment about rowStride of compressed textures.
I always forget how we do this for compressed textures.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:45 -07:00
Eric Anholt
3750ff9e5f mesa: Remove the Map field from texture images.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
Eric Anholt
adf958d9c2 swrast: Always use MapTextureImage for mapping textures for swrast.
Now that everything goes through ImageSlices[], we can rely on the
driver's existing texture mapping function.

A big block of code goes away on Radeon that looks like it was to deal with
the validate that happened at SpanRenderStart, which no longer occurs since we
don't need validation for the MapTextureImage hook.

v2: Rewrite comment about ImageSlices, fix duplicated swImages, touch up
    unmap loop.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
Eric Anholt
ea05e259c9 nouveau: Replace swrast_texture_image->Map usage with ->Buffer.
This code is trying to deal with providing a map in the case that
AllocTexImageBuffer was called, which is hooked up to the swrast variant.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
Eric Anholt
b78e48289f nouveau: Just use MapTextureImage instead of duplicating the logic.
MapTextureImage has the exact same logic, except it can also handle
swrast-allocated buffers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
Eric Anholt
f91823f026 swrast: Make a teximage's stored RowStride be in terms of bytes per row.
For hardware drivers with pitch alignment requirements, a
non-power-of-two-sized texture format won't end up being an integer number
of pixels per row.  Also, avoids having to change our units between
MapTextureImage's rowStride and swrast's RowStride.
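
A small worked example of why bytes are the safer unit, using made-up numbers: a 3-byte-per-texel format, a 5-texel-wide image and a 64-byte pitch alignment. `align_up` and `texel_address` are hypothetical helpers, not swrast functions.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds value up to the next multiple of alignment. */
static size_t align_up(size_t value, size_t alignment)
{
    assert(alignment != 0);
    return (value + alignment - 1) / alignment * alignment;
}

/* Addresses a texel using a row stride expressed in bytes, so the
 * stride does not have to be a whole number of texels. */
static const unsigned char *texel_address(const unsigned char *map,
                                          size_t row_stride_bytes,
                                          size_t bytes_per_texel,
                                          size_t col, size_t row)
{
    return map + row * row_stride_bytes + col * bytes_per_texel;
}
```

Here the aligned pitch is 64 bytes, which is not a multiple of 3, so a stride counted in texels could not describe it exactly.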

This doesn't fully convert the compressed texel fetch path, but does make
sure we don't drop any bits (not that we'd expect to).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
Eric Anholt
35e179b18c swrast: Replace use of teximage Map in 1D/2D paths with ImageSlices[0].
This gets us ready for the Map field to die.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
Eric Anholt
0c883e46d8 swrast: Replace ImageOffsets with an ImageSlices pointer.
This is a step toward allowing drivers to use their normal mapping paths,
instead of requiring that all slice mappings come from an aligned offset
from the first slice's map.

This incidentally fixes missing slice handling in FXT1 swrast.

v2: Use slice height helper function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:43 -07:00
Eric Anholt
e7ecc11311 swrast: Reuse _swrast_free_texture_image_buffer from drivers.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:43 -07:00
Eric Anholt
0a484f1006 swrast: Move ImageOffsets allocation to shared code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:43 -07:00
Eric Anholt
f709c31c67 swrast: Clean up and explain the mapping process.
v2: Move slice height calculation to a helper function (recommended by Brian).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:43 -07:00
Eric Anholt
741e540055 swrast: Factor out texture slice counting.
This function is going to get used a lot more in upcoming patches.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:42 -07:00
Eric Anholt
dca4178130 radeon: Remove some dead teximage mapping code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:42 -07:00
Eric Anholt
0de08fb594 radeon: Add missing swrast field initialization.
This is the equivalent of intel's
80513ec8b4.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:42 -07:00
Vincent Lejeune
a6a4b70e2d r600g/llvm: Fix opencl build 2013-04-30 16:38:47 +02:00
Alexander von Gluck IV
f1361ed084 Gallium: Use mmap on Haiku for executable memory vs malloc
* Haiku now has DEP enabled by default.
2013-04-29 23:22:35 -05:00
Alexander von Gluck IV
60cc73c333 Mapi: Use mmap on Haiku for executable memory vs malloc
* Haiku now has DEP enabled by default.
2013-04-29 23:22:35 -05:00
Alexander von Gluck IV
39bdf08628 Mesa: Use mmap on Haiku for executable memory vs malloc
* Haiku now has DEP enabled by default.
2013-04-29 23:22:35 -05:00
Vincent Lejeune
51e9bfdc48 r600g/llvm: get use_kill from compiler shader 2013-04-30 02:17:18 +02:00
Eric Anholt
a79786af64 i965/fs: Print out the estimated cycle count in INTEL_DEBUG=wm
This could be used by shader-db for hopefully more accurate regression
testing.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-29 11:44:35 -07:00
Eric Anholt
61ca2c4f73 i965/fs: Allow LRPs with uniform registers.
Improves GLB2.7 performance on my HSW by 0.671455% +/- 0.225037% (n=62).

v2: Make is_valid_3src() a method of the fs_reg. (recommended by Ken)

Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-04-29 11:41:35 -07:00
Eric Anholt
de7e8b1d01 intel: Be more conservative in disabling tiling to save memory.
Improves GLB2.7 trex performance 1.01985% +/- 0.721366% on my IVB (n=10)
and by 3.38771% +/- 0.584241% (n=15) on my HSW, due to a 32x32 ARGB8888
cubemap going from untiled to tiled.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-29 11:41:34 -07:00
Eric Anholt
73bc6061f5 i965: Disable Z16 on contexts that don't require it.
It appears that Z16 on Intel hardware is in fact slower than Z24, so
people are getting surprisingly hurt when trying to use Z16 as a
performance-versus-precision tradeoff, or when they're targeting GLES2 and
that's all you get.

GL 3.0+ have Z16 on the list of required exact format sizes, but GLES
doesn't, so choose the better-performing layout in that case.  Improves
GLB 2.7 trex performance at 1920x1080 by 10.7% +/- 1.1% (n=3) on my IVB
system.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-29 11:41:34 -07:00
Eric Anholt
e409889213 intel: Report FBO incompleteness causes through GL_ARB_debug_output.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-29 11:41:34 -07:00
Eric Anholt
6ae473221a intel: Fold the one last function intel_tex_format.c into the caller.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-29 11:41:34 -07:00
Eric Anholt
40b207b62f mesa: Fix error checking for GS UBO getters.
These are supposed to be present if both things are available, but we were
enabling them if either one was.
2013-04-29 11:41:34 -07:00
Eric Anholt
072709da91 mesa: Add a clarifying comment about EXTRA_ error checking. 2013-04-29 11:41:34 -07:00
Eric Anholt
eac1199604 mesa: Add an extra clarifying set of braces to getter checking.
For this multi-page single statement, my thought at the end was that the
next block was mis-indented, rather than that the dropped indentation
actually indicated the end of the loop.
2013-04-29 11:41:33 -07:00
Eric Anholt
2534f0a57d mesa: Fix error checking for getters consisting of only API versions.
In almost all of our cases, getters that are turned on for only some API
variants will have an extension listed as one of the things that can
enable it, and thus api_check gets set.  For extra_gl30_es3 (used for
NUM_EXTENSIONS, MAJOR_VERSION, MINOR_VERSION) on a GL 2.1 context, though,
we would check twice, not find either one, but never actually throw the
error.
2013-04-29 11:41:33 -07:00
Eric Anholt
d63a10afcc mesa: Clarify the names of error checking variables for glGet.
There's no reason to actually count these things, so the integer ++
behavior was just confusing.
2013-04-29 11:41:33 -07:00
Eric Anholt
4df1b986d3 i915: Add support for GL_EXT_texture_sRGB and GL_EXT_texture_sRGB_decode.
This brings the driver up to GL 2.1.
2013-04-29 11:41:33 -07:00
Eric Anholt
97217a40f9 i915: Always enable GL 2.0 support.
There's no point in shipping a non-GL2 driver today.
2013-04-29 11:41:33 -07:00
Eric Anholt
eb062ab07f i915: Correctly set the OQ counter bits.
While we may provide the extension, we need to tell applications that they
can't actually use it:

            An implementation can either set QUERY_COUNTER_BITS_ARB to the
            value 0, or to some number greater than or equal to n.  If an
            implementation returns 0 for QUERY_COUNTER_BITS_ARB, then the
            occlusion queries will always return that zero samples passed the
            occlusion test, and so an application should not use occlusion
            queries on that implementation.
2013-04-29 11:41:33 -07:00
Kenneth Graunke
5e46482993 i965: Move is_math/is_tex/is_control_flow() to backend_instruction.
These are entirely based on the opcode, which is available in
backend_instruction.  It makes sense to only implement them in one
place.

This changes the VS implementation of is_tex() slightly, which now
accepts FS_OPCODE_TXB and SHADER_OPCODE_LOD.  However, since those
aren't generated in the VS anyway, it should be fine.

This also makes is_control_flow() available in the VS.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-29 11:10:50 -07:00
Zack Rusin
a6e7c22664 draw/so: fix overflow calculation
only report overflow for missing targets if they're actually being
used. if the targets are missing but are not being used by any
slot in the stream output declaration we should correctly just
ignore them.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 03:48:36 -04:00
José Fonseca
220ef8295c llvmpipe: Fix queries when screen->num_threads == 0.
That is, when llvmpipe is run in single-threaded mode.

Trivial.

Tested with

  LP_NUM_THREADS=0 glean --run results --overwrite --quick --tests occluQry
2013-04-29 15:40:06 +01:00
José Fonseca
c4bea00fb3 Revert "st/mesa: add a simple path to BufferData if it only discards buffer contents"
This reverts commit 5649f886f7.

It causes segfaults when size is zero.
2013-04-29 15:13:57 +01:00
Jerome Glisse
c7a13dc5f5 r600g: force full cache for hyperz
It seems that in some cases allowing half cache usage confuses the GPU
and triggers a lockup. Force full cache use.

Should fix :
https://bugs.freedesktop.org/show_bug.cgi?id=59592
https://bugs.freedesktop.org/show_bug.cgi?id=60848
https://bugs.freedesktop.org/show_bug.cgi?id=60969
https://bugs.freedesktop.org/show_bug.cgi?id=61747
https://bugs.freedesktop.org/show_bug.cgi?id=62466
https://bugs.freedesktop.org/show_bug.cgi?id=62669
https://bugs.freedesktop.org/show_bug.cgi?id=62721
https://bugs.freedesktop.org/show_bug.cgi?id=63124

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-04-29 10:06:29 -04:00
Rob Clark
3900a0e4df freedreno: fix rebase screw-up
Add back 2nd arg to emit_vertexbufs() which got lost in rebase.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-29 07:36:27 -04:00
Chris Forbes
79f786f936 i965/fs: Don't try to use bogus interpolation modes pre-Gen6.
Interpolation modes other than perspective-barycentric-pixel-center (and
their associated coefficients in the WM payload) only exist in Gen6 and
later.

Unfortunately, if a varying was declared as `centroid`, we would blindly
read the nonexistent values, and so produce all manner of bad behavior
-- texture swimming, snow, etc.

Fixes rendering in Counter-Strike Source and Team Fortress 2 on
Ironlake.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-30 06:50:16 +12:00
Matt Turner
a8eed0299d i965/vs: Fix order of source arguments to LRP.
The order of arguments matches DirectX, and is backwards from GLSL's
mix() built-in.
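
To illustrate the ordering difference, here is a small scalar sketch.
The function names are made up for this example; the formulas are the
usual definitions of GLSL mix() and a DirectX-style lrp:

```c
/* GLSL mix(x, y, a): blend from x (a == 0) to y (a == 1). */
float glsl_mix(float x, float y, float a)
{
   return x * (1.0f - a) + y * a;
}

/* DirectX-style lrp: weight first, then the a == 1 value, then the
 * a == 0 value, i.e. the reverse of mix()'s argument order. */
float dx_lrp(float a, float y, float x)
{
   return a * (y - x) + x;
}
```

So mix(x, y, a) corresponds to lrp(a, y, x); passing the sources in
mix() order would blend the wrong operands.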

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63983
2013-04-28 14:38:14 -07:00
Zack Rusin
3bba787879 llvmpipe: stop crashing when one of the so targets is null
Fixes a crash when one of the so targets is null.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 01:19:12 -04:00
Zack Rusin
0031cde1e1 draw/so: indicate overflow when buffer is missing
We were crashing if one of the buffers wasn't set, we should
just treat it as an overflow. It's useful when using so
statistics because it allows one to figure out how much data
would be generated by so without actually writing any of it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 01:19:07 -04:00
Zack Rusin
f9f57312de gallivm: fix indirect addressing of temps in soa mode
we weren't adding the soa offsets when constructing the indices
for the gather functions. That meant that we were always returning
the data in the first vertex/primitive/pixel in the SoA structure
and not correctly fetching from all structures.
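
A rough sketch of the index computation, assuming temps are laid out as
a flat array with four channels per register and one value per SIMD lane
(the function name and layout are assumptions for illustration, not the
gallivm code):

```c
#include <assert.h>

/* Flat index of one channel of one register for one lane.  Leaving out
 * the "+ lane" term makes every lane read lane 0's value, which is the
 * symptom described above. */
unsigned soa_temp_index(unsigned reg, unsigned chan, unsigned lane,
                        unsigned num_lanes)
{
   assert(chan < 4 && lane < num_lanes);
   return (reg * 4 + chan) * num_lanes + lane;
}
```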

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 01:18:51 -04:00
Zack Rusin
3093ac6f4f tgsi/ureg: Add a function to return the number of outputs
We already hold the variable, just weren't providing access
to it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-26 23:05:45 -04:00
Zack Rusin
53d36d5fb0 draw/so: Fix overflow calculations
We weren't taking the buffer offset, destination offset or the
stride into consideration so we were frequently writing into
an overflown buffer.
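
A simplified sketch of a check that accounts for all three terms.  All
quantities are in bytes here and the names are hypothetical; the real
draw module uses its own structures and units:

```c
#include <stdbool.h>
#include <stddef.h>

/* True if writing num_components floats for the given vertex would run
 * past the end of the stream output buffer. */
bool so_write_overflows(size_t buffer_size, size_t buffer_offset,
                        size_t stride, size_t vertex_index,
                        size_t dst_offset, size_t num_components)
{
   size_t start = buffer_offset + vertex_index * stride + dst_offset;
   size_t end = start + num_components * sizeof(float);
   return end > buffer_size;
}
```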

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 23:04:26 -04:00
Zack Rusin
d996622cfa draw/llvm: fix viewport transformations
This was a very serious bug. We were always doing the viewport
transformations on the first output of the vertex shader. That means
that every application that was storing position in anything but
OUT[0] was outputting untransformed vertices and had broken output
for whatever it was storing at OUT[0]. Correctly take into
consideration where the vertex position is actually stored.
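
In scalar form the idea looks roughly like this.  It is a sketch under
the assumption of a plain array of vec4 outputs and a perspective divide
followed by scale/translate; pos_index and the function name are
illustrative only:

```c
/* Apply the viewport transform to the output slot that actually holds
 * the position, rather than unconditionally to outputs[0]. */
void viewport_transform_at(float (*outputs)[4], unsigned pos_index,
                           const float scale[3], const float translate[3])
{
   float *pos = outputs[pos_index];
   float inv_w = 1.0f / pos[3];

   for (int i = 0; i < 3; i++)
      pos[i] = pos[i] * inv_w * scale[i] + translate[i];
   pos[3] = inv_w;
}
```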

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 23:01:46 -04:00
Zack Rusin
5d9ef5b365 gallium: increase the number of available stream output decls
There can be more stream output decls than shader outputs because
individual components from them can be split and distributed
among different so buffers.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 23:01:23 -04:00
Zack Rusin
562835bcdf llvmpipe: implement so_overflow query
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 22:58:54 -04:00
Brian Paul
49dda2d92f mesa: fix the compressed TexSubImage size checking code
Before, we'd incorrectly generate an error if we tried to
replace a non-4x4 block near the edge of a NPOT compressed texture.
For example, if the dest image was 15 texels wide and xoffset=12
and width=3 we'd incorrectly generate GL_INVALID_OPERATION.
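
The rule can be sketched for one axis as follows.  This is a
simplification with a hypothetical helper name; it is not the Mesa
function itself:

```c
#include <stdbool.h>

/* True if [offset, offset + size) is a legal update extent along one
 * axis of a block-compressed image. */
bool compressed_subimage_extent_ok(int offset, int size, int image_size,
                                   int block_size)
{
   if (offset < 0 || size < 0 || offset + size > image_size)
      return false;
   /* The update must start on a block boundary. */
   if (offset % block_size != 0)
      return false;
   /* A partial block is only allowed where it touches the image edge. */
   if (size % block_size != 0 && offset + size != image_size)
      return false;
   return true;
}
```

With image_size=15, offset=12, size=3 and 4-texel blocks the update ends
exactly at the edge, so it is accepted.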

Verified with new tests added to piglit s3tc-errors test.

Note: This is a candidate for the stable branches.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 16:22:30 -06:00
Brian Paul
ff74cf62b1 llvmpipe: replace LP_MAX_THREADS with screen->num_threads in query code
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 16:22:24 -06:00
Brian Paul
38a751cbe8 llvmpipe: bump LP_MAX_THREADS to 16
On the mesa-users list, Burlen Loring reported a speed-up with 16 cores
and his test/app.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 16:22:12 -06:00
Brian Paul
8fbc36ff48 mesa: updated read_buffer_enum_to_index() comment
Remove the part about the value of gl_framebuffer::Name.
2013-04-26 08:30:25 -06:00
Christian König
e3ac293daa r600/uvd: stop advertising MPEG4 on UVD 2.x chips v2
That is just not supported by the hardware.

v2: fix compare

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-26 15:35:36 +02:00
Christian König
2c2c54b819 radeon/uvd: stop using anonymous unions
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-04-26 15:35:36 +02:00
Tapani Pälli
12b0bfa6e9 mesa: fix type comparison errors in sub-texture error checking code
patch fixes a crash that happens if glTexSubImage2D is called with a
negative xoffset.
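
The commit message does not show the code, so the following is only a
guess at the general shape of such a bug: a signed offset compared
against an expression of unsigned type.  Names are hypothetical:

```c
#include <stdbool.h>

/* Buggy shape: with an unsigned 'border', the comparison is done in
 * unsigned arithmetic.  The explicit cast below spells out what the
 * implicit conversion does; a negative xoffset becomes a huge value
 * and is never rejected. */
bool offset_below_border_buggy(int xoffset, unsigned border)
{
   return (unsigned)xoffset < -border;
}

/* Fixed shape: do the comparison in signed arithmetic. */
bool offset_below_border_fixed(int xoffset, unsigned border)
{
   return xoffset < -(int)border;
}
```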

NOTE: This is a candidate for stable branches.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-26 06:47:44 -06:00
José Fonseca
c5e8573762 Revert "draw: Yield zeros for LLVM fetches of non-existing vertex elements."
After more thought/discussion, it seems it is better to handle this sort
of stuff in the state tracker.

So this reverts commit 12096f334b, except the
variant->key -> key shorthands.
2013-04-26 12:15:39 +01:00
Chia-I Wu
5816a471af ilo: add the driver to the build system
Add ilo to targets/egl-static and add a new target dri-ilo.  Update autoconf
and automake rules.
2013-04-26 16:20:52 +08:00
Chia-I Wu
825aa60707 ilo: compile VS/GS/FS with the toy compiler 2013-04-26 16:20:52 +08:00
Chia-I Wu
7118ff8bb0 ilo: add a toy shader compiler
This is a simple shader compiler that performs almost zero optimizations.  The
generated code is usually much larger compared to that generated by i965.
The generated code also requires many more registers.

Function-wise, it lacks register spilling and does not support most TGSI
indirections.  Other than those, it works alright.
2013-04-26 16:20:52 +08:00
Chia-I Wu
0fa2d0e98a ilo: hook up pipe context GPGPU functions
This just adds a stub.
2013-04-26 16:16:43 +08:00
Chia-I Wu
cf8f3dd373 ilo: hook up pipe context video functions
This just hooks them up with auxiliary/vl layer.
2013-04-26 16:16:43 +08:00
Chia-I Wu
12dd397d0c ilo: add support for time/occlusion/primitive queries 2013-04-26 16:16:43 +08:00
Chia-I Wu
e6186b0769 ilo: hook up pipe context 3D functions 2013-04-26 16:16:43 +08:00
Chia-I Wu
5b310f6230 ilo: add GEN7 support for 3D pipeline 2013-04-26 16:16:43 +08:00
Chia-I Wu
91ce766c35 ilo: add 3D pipeline for GEN6
The 3D pipeline is a high-level interface to emit 3D commands and states.  It
uses GEN6 GPE to do the real work.
2013-04-26 16:16:43 +08:00
Chia-I Wu
67233b56d6 ilo: add GEN7 GPE 2013-04-26 16:16:43 +08:00
Chia-I Wu
d3602dfac6 ilo: add GEN6 GPE
GEN6 GPE (Graphics Processing Engine) is a low-level interface to emit 3D
commands and states.
2013-04-26 16:16:43 +08:00
Chia-I Wu
72357cf3bb ilo: hook up pipe context query functions
None of the query types are supported yet.
2013-04-26 16:16:43 +08:00
Chia-I Wu
8f949bc1da ilo: hook up pipe context transfer functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
0754ff33e3 ilo: hook up pipe context blit functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
89d1702b9b ilo: hook up pipe context state functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
520af66797 ilo: add functions to manage shaders
This commit adds the shader cache, shader state, shader variants, etc.  It
does not add the shader compiler though.
2013-04-26 16:16:42 +08:00
Chia-I Wu
86940bf41c ilo: hook up pipe context flush function 2013-04-26 16:16:42 +08:00
Chia-I Wu
eed1e5a407 ilo: add command parser
The command parser manages batch buffers and command submissions.
2013-04-26 16:16:42 +08:00
Chia-I Wu
3a4a570c34 ilo: hook up pipe screen resource functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
b50e68cb67 ilo: hook up pipe screen format functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
babb2b5c50 ilo: hook up pipe_screen param and fence functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
e74d67738d ilo: add debug flags settable through ILO_DEBUG 2013-04-26 16:16:42 +08:00
Chia-I Wu
63b5720105 ilo: new pipe driver for Intel GEN6+
This commit adds some boilerplate code.  The header files found under include/
are copied from i965.
2013-04-26 16:16:41 +08:00
Chia-I Wu
380e6875b8 winsys/intel: new winsys for intel
This is a wrapper for libdrm_intel to allow the pipe driver to stay OS
agnostic.
2013-04-26 15:49:00 +08:00
José Fonseca
542c5b3703 gallivm: Fix trivial out-of-bounds indirection in lp_build_cube_lookup().
Courtesy of clang:

  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1483:10: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
           tmp[2] = lp_build_swizzle_aos(coord_bld, ddx_ddy[1], swizzle02);
           ^   ~
  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
           LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
           ^
  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1487:56: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
              rho_vec = lp_build_add(coord_bld, rho_vec, tmp[2]);
                                                       ^   ~
  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
           LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
           ^
  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1491:56: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
              rho_vec = lp_build_max(coord_bld, rho_vec, tmp[2]);
                                                       ^   ~
  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
           LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
           ^
2013-04-26 08:44:37 +01:00
Matt Turner
0c1d87b0d7 i965/vs: Add support for LRP instruction.
Only 13 affected programs in shader-db, but they were all helped.

total instructions in shared programs: 368877 -> 368851 (-0.01%)
instructions in affected programs:     1576 -> 1550 (-1.65%)

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-25 18:27:39 -07:00
Matt Turner
c0f67a127b i965/vs: Add a function to fix-up uniform arguments for 3-src insts.
Three-source instructions have a vertical stride overloaded to 4, which
prevents directly using vec4 uniforms as arguments. Instead we need to
insert a MOV instruction to do the replication for the three-source
instruction.

With this in place, we can use three-source instructions in the vertex
shader. While some thought needs to go into deciding whether it's better
to use a three-source instruction rather than a sequence of equivalent
instructions (when one or more sources are uniforms or immediates), this
will allow us to skip a lot of ugly lowering code and use the BFE and
BFI2 instructions directly.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-25 18:27:39 -07:00
Jerome Glisse
abb96fdea7 winsys/radeon: consolidate tracing into winsys v2
This moves the tracing timeout and printing into winsys and adds
a debug environment variable for it (R600_DEBUG=trace_cs).

Lots of files touched because of winsys API changes.

v2: Do not write lockup file if ib uniq id does not match last one

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-25 18:36:31 -04:00
Tom Stellard
53fbae7eac r600g/compute: Removed unused and untested code
There was a lot of code in evergreen_compute_internal.c that was not
being used at all and most of it was duplicating code from other parts
of the driver.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-25 13:32:22 -07:00
Tom Stellard
f986087d5c r600g/compute: Use a constant buffer to store kernel parameters v2
v2:
  - Fix usage of set_constant_buffer()
  - Fix typo in comment

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-25 13:32:17 -07:00
Tom Stellard
ffadc71afb r600g: Add evergreen_emit_cs_constant_buffers() v2
v2:
  - Bump R600_NUM_ATOMS

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-25 13:25:00 -07:00
Tom Stellard
83a00a1de8 r600g/compute: Don't use radeon_winsys::buffer_wait() after dispatching a kernel
The state tracker should be responsible for waiting for the kernel to
finish.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-25 13:24:51 -07:00
Tom Stellard
09e47f7a25 r600g/compute: Fix input buffer size calculation
Buffer size should be in bytes not dwords.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-25 13:24:24 -07:00
Adam Jackson
904b03824b linux: Don't emit a .note.ABI-tag section anymore (#26663)
We don't support pre-2.6 kernels anyway - the install docs say 2.6.28
for DRI - and apparently this confuses ld.so's sorting when multiple
libGLs are installed.  Just remove it.

Note: this is a candidate for the stable branches.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-04-25 15:51:35 -04:00
Rob Clark
73de07cbbc freedreno: use writecombine buffers
Better than uncached for writes, which are common for vertex buffer
upload, etc.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-25 15:10:56 -04:00
Rob Clark
f706d4d340 freedreno: don't patch and re-emit same shader as much
New textures or vertex buffers don't always require patching and
re-emitting the shaders.  So do a better job of figuring out when we
actually have to patch the shader.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-25 15:10:56 -04:00
Eric Anholt
578987ce1c i965: Avoid recompiles for fragment clamping on non-clamping APIs.
Removes 75/78 state-dependent recompiles in GLB2.7 (the remaining 3 are
due to FBO-rendering size predictions).  We currently expose
GL_ARB_color_buffer_float on GL core, so we may mis-predict there, but I'm
about to send a patch for removing that silly extension in that case.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-25 12:03:00 -07:00
Alex Deucher
b5145ca2a8 radeonsi: add new SI pci ids
Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-25 14:22:46 -04:00
Alex Deucher
b3a856dfa9 r600g: add new richland pci ids
Note: this is a candidate for the stable branches.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-25 14:21:15 -04:00
José Fonseca
12096f334b draw: Yield zeros for LLVM fetches of non-existing vertex elements.
If a bug in an app/state-tracker causes vertex buffer to fetch vertex
elements that are not bound, simply return zeros instead of crashing.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-25 16:16:21 +01:00
José Fonseca
28e6a272fc trace: Only close trace files on exit.
Many applications don't exit cleanly, others may create and destroy a
screen multiple times, so we only write </trace> tag and close at exit
time.
2013-04-25 14:18:33 +01:00
José Fonseca
74d1153c9c graw: Set the vertex shader constant buffer.
We were setting the fragment shader, which wasn't needed.
2013-04-25 14:06:50 +01:00
José Fonseca
e88a1dba09 graw: Simple utilities to dump and disassemble TGSI tokens.
Useful for core dumps, where calling tgsi_dump() from gdb is not an
alternative.
2013-04-25 13:03:06 +01:00
José Fonseca
1687932d2b scons: Support clang.
clang supports most gcc options / extensions, with some exceptions.

The biggest advantage of using clang is that compilation times are much
shorter.

One can tell scons to use clang when building by invoking it as

   CC=clang CXX=clang++ scons libgl-xlib
2013-04-25 11:59:01 +01:00
José Fonseca
f0c296773d util/u_sse: Fix _mm_shuffle_epi8 prototype for clang.
Clang does not support __artificial__. Instead match precisely what's
in the clang headers.
2013-04-25 11:59:01 +01:00
José Fonseca
45a60e2e7a scons: Remove redundant code.
-fvisibility=hidden is already elsewhere for the whole tree.
2013-04-25 11:59:01 +01:00
Chris Forbes
8fd0190278 mesa: fix bogus comment about PrimitiveRestart fields
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-25 20:49:25 +12:00
Chris Forbes
447bf1fb52 i965: report correct sample positions
From low to high bits, the sample positions are packed y0,x0,y1,x1...
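
A sketch of unpacking under that layout.  The 4-bit width per
coordinate and the division by 16 are assumptions made for this
example, as is the function name:

```c
#include <assert.h>
#include <stdint.h>

/* Each sample takes one byte: Y in the low nibble, X in the high
 * nibble.  Positions are returned as fractions of a pixel. */
void unpack_sample_position(uint32_t packed, unsigned sample,
                            float *x, float *y)
{
   assert(sample < 4);
   uint32_t bits = (packed >> (8 * sample)) & 0xff;

   *y = (float)(bits & 0xf) / 16.0f;
   *x = (float)((bits >> 4) & 0xf) / 16.0f;
}
```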

Fixes arb_texture_multisample-sample-position piglit.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-04-25 20:47:54 +12:00
Rob Clark
49a7624973 freedreno: fix bogus IMM const reg index
We were assigning incorrect const register for immediates, and
potentially writing immediate const to the wrong location.  This fixes
an incorrect-rendering bug with xonotic.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
9495ee12c6 freedreno: clear fixes and debugging
Set a few extra registers to make sure we are in proper state for
clearing.  And also add some debug options to mark all state dirty in
clear and gmem operations to aid in debugging.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
d5d6ec8843 freedreno: fix texture fetch type
There is a bit we need to set for 2D vs 3D fetch, to tell the hw whether
there are two or three valid input components.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
d086bb22bc freedreno: fix temp register usage
The previous approach of using the dst register as an intermediate
temporary doesn't work in a lot of cases.  For example, if the dst
register is the same as one of the src registers.
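
A toy model of the hazard, using a float array as the register file.
The two-step multiply-add is just an example operation and is not taken
from the freedreno compiler:

```c
/* Computes r[dst] = r[a] * r[b] + r[a] in two steps, using dst as the
 * intermediate.  If dst == a, the second step reads a clobbered r[a]. */
void mad_using_dst_as_temp(float *r, int dst, int a, int b)
{
   r[dst] = r[a] * r[b];
   r[dst] = r[dst] + r[a];
}

/* Same computation with a separately allocated intermediate register. */
void mad_using_fresh_temp(float *r, int dst, int a, int b, int tmp)
{
   r[tmp] = r[a] * r[b];
   r[dst] = r[tmp] + r[a];
}
```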

For now, just simplify it and always allocate a new register to use as
an intermediate.  In some cases this will result in more registers used
than required.  I think the best solution would be to implement an
optimization pass to reduce the number of registers used, which would
also solve the problem we have now of not being able to use GPRs that
are assigned for TGSI_FILE_INPUT.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
7a837da556 freedreno: add noop driver
It is useful for debugging.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
eec37f1cdc freedreno: use u_math macros/helpers more
Get rid of a few self-defined macros:
  ALIGN() -> align()
  min() -> MIN2()
  max() -> MAX2()
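
For reference, local stand-ins with the same behaviour look roughly like
this.  They are written out here for illustration and assume a
power-of-two alignment; the real definitions live in u_math.h:

```c
#include <assert.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Round value up to the next multiple of a power-of-two alignment. */
int align(int value, int alignment)
{
   assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}
```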

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
38d8b02eba freedreno: implement fd_screen_destroy()
Oops, didn't notice that I had left it stubbed out.

Also, make things fail a bit more gracefully when things go wrong.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
a64e2d9d9f freedreno: set SWAP bit based on format
Really this should be set based on buffer format, not on color vs
depth/stencil.  Probably there should be more formats that set the bit
as we add support for more render target formats.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Tom Stellard
d9a32b84e3 radeon/llvm: Fix segfault with a specific libelf implementation
The libelf implementation that is distributed here:
http://www.mr511.de/software/english.html
requires calling elf_version() prior to calling elf_memory()

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-24 16:51:25 -07:00
Alex Deucher
5bbeae7a3d r600g: use CP DMA for buffer clears on evergreen+
Lighter weight than using streamout.  Only evergreen
and newer asics support embedded data as src with
CP DMA.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-24 18:54:31 -04:00
Chia-I Wu
9d0ad4c2f2 i965/gen7: fix encoding of (huge) surface size for BRW_SURFACE_BUFFER
Unlike GEN6, the bits of entry count are distributed like this

  width  = (entry_count & 0x0000007f);       /* bits [6:0] */
  height = (entry_count & 0x001fff80) >> 7;  /* bits [20:7] */
  depth  = (entry_count & 0x7fe00000) >> 21; /* bits [30:21] */

The maximum entry count is still limited to 2^27.
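
Wrapped into helpers, the split above and its inverse can be checked as
a round trip.  The struct and function names are invented for this
sketch:

```c
#include <stdint.h>

struct buffer_dims {
   uint32_t width;   /* bits [6:0]   */
   uint32_t height;  /* bits [20:7]  */
   uint32_t depth;   /* bits [30:21] */
};

struct buffer_dims gen7_split_entry_count(uint32_t entry_count)
{
   struct buffer_dims d;

   d.width  = (entry_count & 0x0000007f);
   d.height = (entry_count & 0x001fff80) >> 7;
   d.depth  = (entry_count & 0x7fe00000) >> 21;
   return d;
}

uint32_t gen7_join_entry_count(struct buffer_dims d)
{
   return d.width | (d.height << 7) | (d.depth << 21);
}
```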

This was noted while going over the PRM.  No test is impacted, because
1<<20 (the bit that moved) is much larger than GL_UNIFORM_BLOCK_MAX_SIZE,
GL_MAX_TEXTURE_BUFFER_SIZE, or MAX_*_UNIFORM_COMPONENTS.

v2: Explain more in the commit message (by anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 12:56:17 -07:00
Chia-I Wu
75d402b211 i965/gen7: fix 3DSTATE_LINE_STIPPLE_PATTERN
The inverse repeat count should take up bits 31:15 and is in U1.16.  Fixes
the "Restarting lines within a single Begin/End block" subtest of piglit
linestipple, and gets the other failing subtests much closer to passing.
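
A sketch of the field encoding, assuming a repeat factor in [1, 256] and
truncating division.  The function name is hypothetical and the other
fields of the dword are left out:

```c
#include <assert.h>
#include <stdint.h>

/* Encode 1/factor as U1.16 fixed point and place it in bits 31:15. */
uint32_t gen7_inverse_repeat_count(unsigned factor)
{
   assert(factor >= 1 && factor <= 256);
   uint32_t inv_u1_16 = (1u << 16) / factor;
   return inv_u1_16 << 15;
}
```

A factor of 1 encodes as 1.0, which needs the integer bit of U1.16 and
therefore all 17 bits of the field.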

v2: Rewrite commit message with more detailed piglit info (by anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 12:56:17 -07:00
Chia-I Wu
bc98950a2a i965: fix SURFACE_STATE dumping
Wrong fields were used when dumping width and height.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 12:56:17 -07:00
Matt Turner
d611f12d82 i965: Remove strange comments about math functions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-24 12:51:36 -07:00
Matt Turner
0c16c12e46 i965: Remove traces of nonexistent TAN math function.
Never existed? At least never supported. Doesn't appear in 965, G45,
or ILK documentation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-24 12:51:36 -07:00
Paul Berry
5bb90cfceb glsl: Teach basic block analysis about break/continue/discard.
Previously, the only kind of ir_jump that would terminate a basic
block was "return".  However, the other possible types of ir_jump
("break", "continue", and "discard") should terminate a basic block
too.  This patch modifies basic block analysis so that it terminates a
basic block on any type of ir_jump, not just ir_return.
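
The rule described above can be sketched in C on a flat instruction list.
This is a simplified model with hypothetical names, not the GLSL IR classes:
real basic block analysis also has to deal with if statements, loops, and
calls, which are ignored here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction kinds; everything except IR_OTHER is a jump. */
enum ir_kind { IR_OTHER, IR_RETURN, IR_BREAK, IR_CONTINUE, IR_DISCARD };

/* Any jump terminates a basic block, not just a return. */
static int
terminates_block(enum ir_kind kind)
{
    return kind != IR_OTHER;
}

/* Count the basic blocks in a straight-line instruction list. */
static size_t
count_basic_blocks(const enum ir_kind *ir, size_t n)
{
    size_t blocks = 0;
    int block_open = 0;

    assert(ir != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (!block_open) {
            blocks++;          /* first instruction of a new block */
            block_open = 1;
        }
        if (terminates_block(ir[i]))
            block_open = 0;    /* the jump is the last instruction */
    }
    return blocks;
}
```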

Fixes piglit test dead-code-break-interaction.shader_test.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-24 09:57:37 -07:00
Paul Berry
70ca263623 glsl: Add virtual function ir_instruction::as_jump()
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-24 09:57:37 -07:00
Tom Stellard
f64058803a r600g/llvm: Pass struct r600_bytecode to r600_llvm_compile
This way we don't need to update the function signature every time we
emit a new config value.  This also fixes the build with
--enable-opencl.
2013-04-24 12:42:41 -04:00
José Fonseca
e29525f79f winsys/sw/xlib: Prevent shared memory segment leakage.
Running piglit with this was causing all sorts of weird things to happen
to my desktop (Chromium webpages became blank, Qt Creator flickered,
etc).  I tracked this down to shared memory segment leakage when GL is
not shut down properly. The segments can be seen by running `ipcs` and
looking for nattch==0.

This change fixes this by calling shmctl(IPC_RMID) soon after creation
(which does not remove the segment immediately, but simply marks it for
removal when no more processes are attached).

This matches src/mesa/drivers/x11/xm_buffer.c behaviour.

v2:
- move shmctl(IPC_RMID) after XShmAttach() for *BSD, per Chris Wilson
- remove stray debug printfs, spotted by Ian Romanick
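
The mark-for-removal idea can be sketched with the System V calls alone.
This is not the winsys code: the function name is hypothetical, and the X
server side (XShmAttach) is omitted, so here IPC_RMID simply follows the
local shmat().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: create a private segment, attach it, and mark it
 * for removal right away so it cannot outlive the attached processes. */
static void *
alloc_self_cleaning_shm(size_t size)
{
    assert(size > 0);

    int shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (shmid < 0)
        return NULL;

    void *addr = shmat(shmid, NULL, 0);

    /* Does not destroy the segment immediately; it only goes away once
     * the last attachment is dropped.  Done even if shmat() failed so
     * the segment is not leaked. */
    shmctl(shmid, IPC_RMID, NULL);

    if (addr == (void *) -1)
        return NULL;
    return addr;
}
```

The caller detaches with shmdt() when done; if the process dies first, the
kernel drops the attachment and the segment is freed.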

NOTE: This is a candidate for stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-24 16:54:58 +01:00
Zack Rusin
1a87473998 draw/gs: preserve leading vertex info for gs
We need to handle the leading vertex information when
assembling primitives for the geometry shader otherwise
the resulting triangles will have vertices at incorrect
input locations.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-23 06:17:59 -04:00
Laurent Carlier
addf00e2ad r200: fix build regression introduced with 9a32203e16
Signed-off-by: Laurent Carlier <lordheavym@gmail.com>
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2013-04-24 16:48:29 +02:00
Christian König
c5c754d184 radeonsi: cleanup disabling tiling for UVD v3
Should fix: https://bugs.freedesktop.org/show_bug.cgi?id=63702

v2: add a comment that this is just a workaround
v3: fix typo in comment

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-24 11:07:26 +02:00
Chad Versace
d3dfce3276 egl/dri2: Fix min/max swap interval of configs
The commit below exposed a bug in dri2_add_config.

    commit 3998f8c6b5
    Author: Ralf Jung <post@ralfj.de>
    Date:   Tue Apr 9 14:09:50 2013 +0200

	egl/x11: Fix initialisation of swap_interval

This little code snippet near the bottom of dri2_add_config,

    if (double_buffer) {
       ...
       conf->base.MinSwapInterval = dri2_dpy->min_swap_interval;
       conf->base.MaxSwapInterval = dri2_dpy->max_swap_interval;
    }

it never did what it claimed to do. The assignment never changed the value
of conf->base.MaxSwapInterval, because dri2_dpy->max_swap_interval was,
until the above exposing commit, uninitialized here. That is,
conf->base.MaxSwapInterval was 0 before and after assignment. Ditto for
the min swap interval.

Above the troublesome code snippet, the call to _eglFilterArray rejects
the config as unmatching if its swap interval bounds differ from the base
config's.  Before the exposing commit, at the call to _eglFilterArray, the
swap interval bounds were always [0,0], and hence no config was rejected
due to swap interval.

After the exposing commit, _eglFilterArray incorrectly rejected some
configs, which prevented dri2_egl_config::dri_double_config from getting
set for the rejected config, which resulted in a NULL pointer getting
passed into dri2CreateNewDrawable, and then segfault.

The solution: set the swap interval bounds before _eglFilterArray.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63447
Tested-by: Lu Hua <huax.lu@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-24 08:05:13 +02:00
Kenneth Graunke
cef31bb290 mesa: Add unpack functions for A/I/L/LA [U]INT8/16/32 formats.
NOTE: This is a candidate for stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:13:02 -07:00
Kenneth Graunke
995051ee34 mesa: Add unpack functions for R/RG/RGB [U]INT8/16/32 formats.
NOTE: This is a candidate for stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:13:00 -07:00
Kenneth Graunke
531be501de mesa: Add an unpack function for ARGB2101010_UINT.
v2: Remove extra parenthesis (suggested by Brian).

NOTE: This is a candidate for stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:12:58 -07:00
Kenneth Graunke
b1fded54c9 mesa: Fix unpack function for ETC2_SRGB8_PUNCHTHROUGH_ALPHA1.
We accidentally set MESA_FORMAT_ETC2_RGB8_PUNCHTHROUGH_ALPHA1 twice,
rather than setting the RGB8 and SRGB8 formats.

NOTE: This is a candidate for stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:12:50 -07:00
Kenneth Graunke
097b39276c mesa: Fix up some final license word wrapping issues by hand.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:07:14 -07:00
Kenneth Graunke
f0cb66b699 mesa: Restore 78-column wrapping of license text in C++-style comments.
The previous commit introduced extra words, breaking the formatting.

This text transformation was done automatically via the following shell
command:
$ git grep 'THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY' | sed 's/:.*$//' | xargs -I {} sh -c 'vim -e -s {} < vimscript2'

where 'vimscript2' is a file containing:
/THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY/;/^ *$/ !fmt -w 78 -p '// '
:wq

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:07:12 -07:00
Kenneth Graunke
3d8d5b298a mesa: Restore 78-column wrapping of license text in C-style comments.
The previous commit introduced extra words, breaking the formatting.

This text transformation was done automatically via the following shell
command:
$ git grep 'THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY' | sed 's/:.*$//' | xargs -I {} sh -c 'vim -e -s {} < vimscript'

where 'vimscript' is a file containing:
/THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY/;/\*\// !fmt -w 78 -p ' * '
:wq

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:07:09 -07:00
Kenneth Graunke
96ff2edc73 mesa: Add "OR COPYRIGHT HOLDERS" to license text disclaiming liability.
This brings the license text in line with the MIT License as published
on the Open Source Initiative website:

http://opensource.org/licenses/mit-license.php

Generated automatically by the following shell command:
$ git grep 'THE AUTHORS BE LIABLE' | sed 's/:.*$//g' | xargs -I '{}' \
  sed -i 's/THE AUTHORS/THE AUTHORS OR COPYRIGHT HOLDERS/' {}

This introduces some wrapping issues, to be fixed in the next commit.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:07:06 -07:00
Kenneth Graunke
ca29382dc3 mesa: Change "BRIAN PAUL OR IBM" to "THE AUTHORS" in license text.
See previous commit for the rationale.  These weren't caught by the
automatic conversion due to the "OR IBM" addition.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:07:04 -07:00
Kenneth Graunke
dd404bc94f mesa: Change "BRIAN PAUL" to "THE AUTHORS" in license text.
Generated automatically by the following shell command:
$ git grep 'BRIAN PAUL BE LIABLE' | sed 's/:.*$//g' | xargs -I '{}' \
  sed -i 's/BRIAN PAUL/THE AUTHORS/' {}

The intention here is to protect all authors, not just Brian Paul.  I
believe that was already the sensible interpretation, but spelling it
out is probably better.

More practically, it also prevents people from accidentally copy &
pasting the license into a new file which says Brian is not liable when
he isn't even one of the authors.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:06:38 -07:00
Brian Paul
cab19eced5 mesa: make _mesa_save_vtxfmt_init() static
It's called from nowhere else.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-23 21:12:25 -06:00
Brian Paul
71ee003041 docs: document issue with Viewperf proe-05 test 6
2013-04-23 21:09:17 -06:00
Brian Paul
f74da3e988 mesa: use new _mesa_inside_dlist_begin_end() function
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-23 21:09:17 -06:00
Brian Paul
976b529b7c mesa: use new _mesa_inside_begin_end() function
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-23 21:09:17 -06:00
Marek Olšák
9a32203e16 mesa: remove unused opcodes AND, DP2A, NOT, NRM3, NRM4, OR, PRINT, XOR
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:24 +02:00
Marek Olšák
3140d132ef mesa: don't flush vertices and don't flag _NEW_COLOR in ClearColor, ClearIndex
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:24 +02:00
Marek Olšák
9f3985238f mesa: don't flush vertices and don't flag _NEW_COLOR for GL_CLAMP_READ_COLOR
There used to be a derived state _ClampReadColor, so setting _NEW_COLOR
made sense. The state is gone now.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:24 +02:00
Marek Olšák
43dac2700c mesa: don't flag _NEW_DEPTH in Begin/EndQuery if driver implements the functions
We don't want to set the flag for Gallium.

I think only swrast needs the flag to be set for occlusion queries.

v2: fix stats_wm updates in i965

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:23 +02:00
Marek Olšák
629813d9de mesa: don't flush vertices and don't flag _NEW_DEPTH in ClearDepth
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:23 +02:00
Marek Olšák
3975f52eb4 mesa: don't flush and don't flag _NEW_STENCIL in ClearStencil, ActiveStencilFace
The functions don't affect driver state. There is no code that would rely
on vertices being flushed prior to changing the states, and no code that
would check for _NEW_STENCIL before using the states.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:23 +02:00
Marek Olšák
1e3b422685 mesa: don't set _NEW_BUFFERS in GenerateMipmap and BlitFramebuffer
Neither function changes the framebuffer in any way
(if mesa_meta is not used).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:23 +02:00
Marek Olšák
d883d00878 mesa: remove _NEW_PACKUNPACK
No driver checks the flag. Nobody uses it.

I also removed the FLUSH_VERTICES calls, because PixelStorei has no effect
on rendering.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:23 +02:00
Marek Olšák
99bd76d834 mesa: convert _NEW_RASTERIZER_DISCARD to a driver flag
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:23 +02:00
Marek Olšák
b95cbe5e80 mesa,i965: use NewDriverState to communicate TFB state changes with the driver
_NEW_TRANSFORM_FEEDBACK is not used by core Mesa, so it can be removed.
Instead, a new private flag is added to i965 to serve the same purpose.

If you're new to this:

* When creating a context, you can set private dirty flags
  in gl_context::DriverFlags, e.g.:
    ctx->DriverFlags.NewStateX = BRW_NEW_STATE_X;

* When StateX is changed, core Mesa does:
    ctx->NewDriverState |= ctx->DriverFlags.NewStateX;

* When you have to draw, read and clear ctx->NewDriverState.

* Pros: not touching NewState, the driver decides the mapping between
  GL states and hw state groups, unlimited number of flags in core Mesa
  (still limited number of flags in the driver though)
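
The steps above can be sketched with a stripped-down stand-in for
gl_context. The struct layout, the flag value, and the function names are
all hypothetical; only the DriverFlags / NewDriverState pattern comes from
the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver-private dirty bit. */
#define MY_NEW_STATE_X (1ull << 0)

/* Minimal stand-ins for the relevant parts of gl_context. */
struct driver_flags_sketch { uint64_t NewStateX; };
struct context_sketch {
    struct driver_flags_sketch DriverFlags;
    uint64_t NewDriverState;
};

/* Context creation: the driver picks which bit StateX maps to. */
static void
driver_create_context(struct context_sketch *ctx)
{
    assert(ctx != NULL);
    ctx->DriverFlags.NewStateX = MY_NEW_STATE_X;
    ctx->NewDriverState = 0;
}

/* Core side: StateX changed, so flag whatever the driver asked for. */
static void
core_change_state_x(struct context_sketch *ctx)
{
    ctx->NewDriverState |= ctx->DriverFlags.NewStateX;
}

/* Draw time: read and clear the accumulated driver dirty bits. */
static uint64_t
driver_draw(struct context_sketch *ctx)
{
    uint64_t dirty = ctx->NewDriverState;
    ctx->NewDriverState = 0;
    return dirty;
}
```

The core never interprets the bit it sets, which is what lets each driver
choose its own mapping between GL states and hardware state groups.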

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:23 +02:00
Marek Olšák
ef39bc4f2e mesa: remove redundant _NEW_BUFFERS setting in ReadBuffer
already set by _mesa_readbuffer

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:23 +02:00
Marek Olšák
5649f886f7 st/mesa: add a simple path to BufferData if it only discards buffer contents
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-24 03:23:23 +02:00
Marek Olšák
d23c7455ae st/mesa: depth-stencil-alpha state also depends on _NEW_BUFFERS
because the code looks at the visual if there is a depth or stencil buffer
before enabling depth or stencil, respectively.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-24 03:23:23 +02:00
José Fonseca
2737abb44e gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center.
Squashed commit of the following:

commit 04c5fa2cbb8e89d6f2fa5a75af1cca03b1f6b852
Author: José Fonseca <jfonseca@vmware.com>
Date:   Tue Apr 23 17:37:18 2013 +0100

    gallium: s/lower_left_origin/bottom_edge_rule/

commit 4dff4f64fa83b9737def136fffd161d55e4f1722
Author: José Fonseca <jfonseca@vmware.com>
Date:   Tue Apr 23 17:35:04 2013 +0100

    gallium: Move diagram to docs.

commit 442a63012c8c3c3797f45e03f2ca20ad5f399832
Author: James Benton <jbenton@vmware.com>
Date:   Fri May 11 17:50:55 2012 +0100

    gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center.

    This change is necessary to achieve correct results when using OpenGL
    FBOs.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-23 19:42:47 +01:00
Marek Olšák
b692076420 r600g: initialize CMASK and HTILE with the GPU using streamout
This fixes a crash when a resource cannot be mapped to the CPU's address space
because it's too big.

This puts a global pipe_context in r600_screen, which is guarded by a mutex,
so that we can use pipe_context when there isn't one around.
Hopefully our multi-context support is solid.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

NOTE: This is a candidate for the 9.1 branch.
2013-04-23 20:26:20 +02:00
Marek Olšák
1ba46bbb4c gallium/u_blitter: implement buffer clearing
Although this might be useful for ARB_clear_buffer_object,
I need it for initializing resources in r600g.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

v2: comment cleanups

NOTE: This is a candidate for the 9.1 branch.
2013-04-23 20:26:20 +02:00
Vincent Lejeune
edd90a19ca r600/llvm: Read stacksize from config header
2013-04-23 19:52:29 +02:00
Vincent Lejeune
a7f73f5155 /bin/bash: q: command not found
2013-04-23 19:52:02 +02:00
Tom Stellard
a0c8942bb4 radeon/llvm: Fix build with LLVM >= r180063
2013-04-23 11:53:05 -04:00
Tom Stellard
ead4db420e gallivm: Fix build with LLVM >= r180063
2013-04-23 11:53:05 -04:00
Zack Rusin
1fb8c3ce55 draw: use the prim count for ia primitives
Number of vertices to fetch doesn't always equal the number of input
vertices. To correctly compute the number of IA primitives we need
to use the total number of input vertices, not only those that
need to be fetched.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-22 20:36:07 -04:00
Zack Rusin
76587d2e5e tgsi/scan: set correct input limits for geometry shader
TGSI geometry shader input declarations are of the IN[][2] format
and the dimensions of the array have to be deduced from the input
primitive property.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-22 20:36:07 -04:00
Zack Rusin
913ed25f18 draw: add code to reset instance dependent data
We want to be able to reset certain parts of the pipeline,
in particular the input primitive index, but only either with
separate invocations of the draw_vbo or new instances. In all
other cases (e.g. new invocations due to primitive restart)
that data needs to be preserved. Add a function through which
we can reset instance dependent data.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-22 20:36:07 -04:00
Zack Rusin
2aad06844f softpipe: fix streamout with an empty geometry shader
Same approach as in the llvmpipe, if the geometry shader is
null and we have stream output then attach it to the vertex
shader right before executing the draw pipeline.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-22 20:36:07 -04:00
Andreas Boll
723b78397f configure.ac: Allow OpenGL ES1 and ES2 only with enabled OpenGL
Building OpenGL ES1 and/or ES2 without OpenGL is not supported on mesa
9.0.x

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-23 03:16:10 +02:00
Matt Turner
7be536bb19 i965/fs: Don't save value returned by emit() if it's not used.
Probably a copy-n-paste mistake.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-22 15:34:32 -07:00
Brian Paul
4d5827ea83 mesa: Remove extra MapBufferRange in create_beginend_table()
Looks like a copy&paste typo.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-22 12:38:04 -06:00
José Fonseca
7c1bf8e381 gallium: Add a new clip_halfz rasterizer state.
gl_rasterization_rules lumps too many different flags.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-22 18:39:06 +01:00
Kenneth Graunke
95c83824e6 i965: Fix a mistake in the comments for software counters.
The code doesn't set brw->query.obj to NULL, it sets query->bo to NULL.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-22 10:34:49 -07:00
José Fonseca
c0538860bf gallivm: Fix assignment of unsigned values to OUT register.
TEMP is not the only register file that accepts unsigned. OUT too.

Actually, what determines the appropriate type of the destination value is
not the opcode, but rather the register.

Also cleanup/simplify code.  Add a few more asserts, but also make
code more robust by handling it gracefully if an assert fails.

This fixes segfault / assertion in the included vert-uadd.sh graw shader.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-22 18:23:42 +01:00
Matt Turner
ec646e4654 i965: Apply CMP NULL {Switch} work-around to other Gen7s.
Listed in the restrictions section of CMP, but not on the work-arounds
page.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-22 09:45:10 -07:00
Brian Paul
6654b9d1eb st/mesa: minor indentation fixes
2013-04-22 10:08:06 -06:00
Eric Anholt
47c0b5ecdd mesa: Introduce a globally-available minify() macro.
This matches u_minify()'s behavior, for consistency.
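
For reference, the behavior in question is, as far as I understand
u_minify(), a right shift by the mip level clamped to a minimum of 1. The
sketch below writes it as a function with a hypothetical name rather than
as the macro the commit adds.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical function form of the minify operation: the size of a
 * dimension at a given mip level, never smaller than one texel. */
static unsigned
minify_sketch(unsigned value, unsigned levels)
{
    /* Shifting by the full width of the type would be undefined. */
    assert(levels < sizeof(unsigned) * CHAR_BIT);

    unsigned shifted = value >> levels;
    return shifted > 1 ? shifted : 1;
}
```

The clamp matters for non-square textures, where one dimension reaches 1
several levels before the other does.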

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-21 12:28:04 -07:00
Eric Anholt
1842dd08b8 mesa: Generalize TexStorage allocator between swrast and intel.
This should be reusable for other non-gallium drivers, so we can make the
extension always be available.

v2: Add a more detailed comment than the old function had (recommended
    by Brian).

Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2013-04-21 12:28:04 -07:00
Eric Anholt
e86170c2b8 mesa: Add performance debug for meta code.
I noticed a fallback in regnum through sysprof, and wanted a nicer way to
get information about it.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-21 12:28:03 -07:00
Eric Anholt
cbe8b75b58 intel: Mention how much data we're trying to subdata in perf debug.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-21 12:28:03 -07:00
José Fonseca
9fb5b2f45c Revert "gallivm: Emit vector selects."
It caused numerous regressions (LLVM 3.1) in blending. In particular:

 - lp_test_blend

    type=u8nx16 rgb_func=sub rgb_src_factor=zero rgb_dst_factor=inv_src_color alpha_func=rev_sub alpha_src_factor=one alpha_dst_factor=const_color ...  MISMATCH
     Src:  0  0  0 b5 49 29  0 a2  0 21 de  0 c3 1b ec  0
     Src1: 2d 85 14  0 f8  0 79 a1 99  0 d8  0 59 16  0  0
     Dst:  0 a9 97  0 c0  0 78  0  0 8b aa f0 bd  0 78 f6
     Con: 7d  0 c0  0  0 bb 77  0  0  0 50  0 40 51  0  0
     Res:  0  0  0  0  0 29  0  0  0  0 c8  0 97 1b e3  0
     Ref:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
    type=u8nx16 rgb_func=max rgb_src_factor=one rgb_dst_factor=inv_const_color alpha_func=min alpha_src_factor=zero alpha_dst_factor=inv_src1_alpha ...  MISMATCH
     Src:  d  0  0 e9  0 37 35 f0 62  0  0 b2 e9 f7  0 5c
     Src1: 8f  0 bf  0 a8  5  0  0 c4  0 d7  7 92  a  0 17
     Dst: cb  0 1e  0  0  0 19 8e  0 4d  0  0  0  0  3 46
     Con: aa 5a 5f 8f  0  0 bc 92  0 88  0  0 b7 8a c0 88
     Res: 44  0 13  0  0  0  7 8e  0 24  0  0  0  0  1 40
     Ref: 44  0 13  0  0 37 35  0 62 24  0  0 e9 f7  1  0

This reverts commit 1e266c7ef0.
2013-04-21 09:07:19 +01:00
José Fonseca
d8a4c4c524 llvmpipe: verify function on blend test.
2013-04-21 08:53:31 +01:00
José Fonseca
a79990bec0 llvmpipe: Don't support Z32_FLOAT_S8X24_UINT texture sampling either.
Because we don't support it, and the u_format fallback doesn't work for
zs formats.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-20 23:25:36 +01:00
José Fonseca
c08b04992a llvmpipe: Ignore depth-stencil state if format has no depth/stencil.
Prevents assertion failures inside the driver for such state combinations.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-20 23:25:36 +01:00
José Fonseca
f701a5a0fe gallivm: Disable LLVM 2.7 workaround on other versions.
2.7 was a particularly trouble ridden release.

Furthermore, the bug can no longer be reproduced since the
first_level state was taken into account.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-20 23:25:36 +01:00
José Fonseca
1e266c7ef0 gallivm: Emit vector selects.
They are supported on LLVM 3.1, at least on x86. (I haven't tested on PPC
though.)

Actually lp_build_linear_mip_levels() already has been emitting them for
some time.

This avoids intrinsics, which tend to be an obstacle for certain
optimization passes.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-20 23:25:36 +01:00
Rob Clark
26b39df08f freedreno: move ir -> ir2
There will be a new IR for a3xx, which has a very different shader ISA
(more scalar oriented).  So rename to avoid conflicts later when I start
adding a3xx support to the gallium driver.

Signed-off-by: Rob Clark <robdclark@freedesktop.org>
2013-04-20 17:59:41 -04:00
Rob Clark
d8134792ae freedreno: cleanup some cruft left over from fdre
The standalone shader assembler needed some meta-data to know about
attributes/varyings/etc, to do the shader linkage.  We don't need these
parts with gallium/tgsi, so just get rid of it.

Signed-off-by: Rob Clark <robdclark@freedesktop.org>
2013-04-20 17:31:47 -04:00
Roland Scheidegger
85974e5fee gallivm: implement switch opcode
Should be able to handle all things which make this tricky to implement.
Fallthroughs, including most notably into/out of default, should be handled
correctly but are quite a mess.
If we see largely unoptimized switches in the wild we should probably think
about some "real" switch optimization pass, e.g. things like this:

switch
case1
someinst
brk
case2
default
case3
someinst
brk
case4
someinst
endswitch

are legal, but the pointless case2/case3 statements not only cause condition
evaluation but will turn this into a "fake" fallthrough case (because
mask and defaultmask are already updated for case2 when default is
encountered) requiring executing code twice.
If default is at the end though, there's never any code re-execution, and
even when it is not, if there's no fallthrough into (not even a fake one)
or out of default, there's no code re-execution either.

v2: add comments, and use enum for break type instead of magic boolean.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-20 02:27:53 +02:00
Roland Scheidegger
8f5d4283c0 gallivm: use uint build context for mask instead of float
Unsurprisingly no one was using it except for grabbing the builder.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-20 02:27:53 +02:00
Roland Scheidegger
107550e71a gallivm/tgsi: fix up breakc
It seems there was a typo in gallivm breakc handling (I am actually still
not sure it is really needed but otherwise that statement really should go
away). Also fix the wrong src argument type, even though they weren't really
used.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-20 02:27:53 +02:00
Roland Scheidegger
e8d1b26a82 svga: remove TGSI_OPCODE_BREAKC instruction translation
While initially that opcode probably was meant for something along the
lines of sm3 break_comp it has never worked that way (not even the
argument count was right) and now the opcode has quite different
semantics so just remove it. (Discovered by Jose Fonseca)
2013-04-20 02:27:53 +02:00
Roland Scheidegger
794579105a gallium: document breakc and switch/case/default/endswitch
Docs were missing; the opcode-from-hell switch in particular is anything
but obvious.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-20 02:27:53 +02:00
Roland Scheidegger
443950c6aa gallivm: increase nesting limit to 66
This is still not really correct, since at least for sm 4.0
the nesting limit is 64 per subroutine, and subroutine nesting itself
has a limit of 32, so since we have a flat stack we'd need 32*64.
But this should probably be better fixed with per-subroutine stacks,
since otherwise these structures get really big (like 100kB for the
lp_exec_mask).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-20 02:27:53 +02:00
Zack Rusin
12eab7cc56 draw: implement primitive assembler
Input assembler needs to be able to decompose adjacency primitives
into something that can be understood by the rest of the pipeline.
The specs say that the adjacency primitives are *only* visible
in the geometry shader, for everything else they need to be
decomposed. Which in most of the cases is not an issue, because
the geometry shader always decomposes them for us, but without
geometry shader we were passing unchanged adjacency primitives
to the rest of the pipeline and causing crashes everywhere. This
commit introduces a primitive assembler which, if geometry
shader is missing and the input primitive is one of the
adjacency primitives, decomposes them into something
that the rest of the pipeline can understand.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-18 11:51:22 -07:00
Zack Rusin
e4752d0f56 util/prim: fix decomposed counts for adjacency primitives
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-18 11:37:37 -07:00
Zack Rusin
c1299204ad draw/so: uses the correct index with the pre clipped coordinates
pre_clip_pos is a float[4] we just used (*float)[4] to be able to
jump within the array of vertex_headers with it. So if the idx
happened to be anything but 0, we'd actually read from some garbage
in memory. Change it to just be a simple pointer instead of casting
it to something that it's not. As suggested by Jose.
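
The indexing problem can be illustrated with a simplified vertex layout.
The struct below is a hypothetical stand-in for the draw module's vertex
header, and the stride is assumed to be supplied by the caller; the point
is only that stepping between vertices must use the vertex stride, not the
size of one float[4].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified vertex layout: a header followed by a
 * variable number of float[4] attributes. */
struct vertex_header_sketch {
    unsigned flags;
    float pre_clip_pos[4];
    float data[][4];
};

/* Return the pre-clip position of vertex idx.  Vertices are stride
 * bytes apart, so the step is taken on a byte pointer first. */
static float *
pre_clip_pos_at(void *verts, size_t stride, size_t idx)
{
    assert(stride >= sizeof(struct vertex_header_sketch));

    struct vertex_header_sketch *v =
        (struct vertex_header_sketch *)((char *)verts + idx * stride);
    return v->pre_clip_pos;
}
```

Indexing a float (*)[4] by idx instead would advance only 16 bytes per
step, which lands inside the same vertex for any idx other than 0.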

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-18 11:36:38 -07:00
Eric Anholt
8b2662e900 glapi: Add counter information for glBufferData(), like glBufferSubData().
This causes this function to become asynchronous with glthread.
2013-04-19 10:13:00 -07:00
Eric Anholt
1a3ea852ea glapi: Add parameter count information for uniforms.
This is the kind of information that would have been present for GLX, if
GLX supported modern GL.  This allows these entrypoints to get automatic
asynchronous marshalling code generated for glthread.
2013-04-19 10:13:00 -07:00
Paul Berry
57b7c20ca5 glapi: skip padding in get_called_parameter_string
This bug is currently benign, since get_called_parameter_string() is
currently only used for functions that return true for
glx_function.has_different_protocol(), and none of those functions
include padding.  However, in order to implement marshalling of GL API
functions, we'll need to use get_called_parameter_string() far more
often.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-19 10:12:36 -07:00
Paul Berry
fe955dc6b6 mesa: Fix up program_parse.y to avoid uninitialized $$
Without this patch, $$.negate, $$.rgba_valid, and $$.xyzw_valid take
on garbage values.  At the moment this problem is benign (the garbage
values happen to be zero), but in my experiments executing GL
operations on a background thread, the garbage values change, leading
to piglit failures.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-19 10:12:27 -07:00
Eric Anholt
ea6cf2b686 mesa: Use quotes on bool driconf options to prevent stdbool.h breakage.
Since stdbool.h's "true" and "false" are #defines, they got expanded when
used as macro arguments, and that expanded value was stored in the
XML string, producing XML that driconf would then fail to parse.
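
The expansion problem can be reproduced with two small macros. These are
hypothetical and much simpler than the real driconf macros; they only show
why a bare true can change before it is stringified while a quoted "true"
cannot.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Stringifies its argument exactly as written. */
#define OPT_RAW(def)      "<option default=\"" #def "\"/>"

/* One extra macro level: def is macro-expanded before OPT_RAW sees it,
 * so a #define'd true is replaced by its expansion first. */
#define OPT_UNQUOTED(def) OPT_RAW(def)

/* Quoted variant: the default is already a string literal, so the
 * preprocessor has nothing to expand. */
#define OPT_QUOTED(def)   "<option default=\"" def "\"/>"

/* Result depends on whether the compiler's stdbool.h defines true as a
 * macro (typical before C23) or treats it as a keyword. */
static const char *
unquoted_xml(void)
{
    return OPT_UNQUOTED(true);
}

/* Always yields default="true". */
static const char *
quoted_xml(void)
{
    const char *xml = OPT_QUOTED("true");
    assert(strstr(xml, "default=\"true\"") != NULL);
    return xml;
}
```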

Currently no drivers include stdbool along with driconf, but I keep
accidentally doing so on intel as we move towards using normal C.

v2: rebase on master.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-04-19 10:10:22 -07:00
Brian Paul
cecbfce5eb svga: whitespace, comment fixes in svga_pipe_query.c
2013-04-19 10:04:11 -06:00
Brian Paul
ef1b2b8da7 svga: whitespace, comment fixes in svga_pipe_fs/vs.c
2013-04-19 10:03:56 -06:00
José Fonseca
dbb690872e gallivm: Fix half floats with MCJIT.
Prevents:

  LLVM ERROR: Cannot select: intrinsic %llvm.x86.vcvtph2ps.128
2013-04-19 10:13:19 +01:00
Matt Turner
e87015f508 Revert "i965: Check reg.nr for BRW_ARF_NULL instead of reg.file."
This reverts commit ecdda414d3.

Commit was supposed to be a simple typo fix. Clearly needs more
investigating.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63688
2013-04-18 21:52:27 -07:00
Matt Turner
34efd9295e configure.ac: Remove gallium-g3dvl flag.
It's next to useless, since it just allows you to turn off VDPAU and
XvMC with a single switch. Just check whether Gallium drivers are
enabled instead.

Reviewed-by: Christian König <christian.koenig@amd.com>
2013-04-18 21:52:26 -07:00
Jerome Glisse
d0e9aaa31c radeonsi: add support for compressed texture v2
Most tests pass; the remaining issues are with border color and swizzle.

Based on ircnick<maelcum> patch.

v2: Restaged commit hunk

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-04-18 17:25:38 -04:00
Jerome Glisse
dc21e30a62 radeonsi: add 2d tiling support for texture v3
v2: Remove left over code
v3: Restage the commit properly so hunks of the first one are not in
    the second one.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-04-18 17:25:38 -04:00
Vadim Girlin
f732036f12 gallium: handle drirc disable_glsl_line_continuations option
NOTE: This is a candidate for the 9.1 branch

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-19 01:05:03 +04:00
José Fonseca
b72ff373fb llvmpipe: Take in consideration all current constant buffers when mapping.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-04-18 20:48:12 +01:00
Christoph Bumiller
78eaaff696 nv50: add remaining RGBX formats
Not all are supported as render targets.

The state tracker fallback of using RGBA instead of RGBX currently
fails for blending, we could work around this by clearing their alpha
to 1 and modifying the color mask to disable writing alpha.
2013-04-18 21:04:22 +02:00
Christoph Bumiller
729abfd0f5 st/mesa: optionally apply texture swizzle to border color v2
This is the only sane solution for nv50 and nvc0 (really, trust me),
but since on other hardware the border colour is tightly coupled with
texture state they'd have to undo the swizzle, so I've added a cap.

The dependency of update_sampler on the texture updates was
introduced to avoid doing the apply_depthmode to the swizzle twice.

v2: Moved swizzling helper to u_format.c, extended the CAP to
provide more accurate information.
2013-04-18 20:35:40 +02:00
Christoph Bumiller
246ff8f887 nv50: set BORDER_COLOR_SRGB in sampler objects 2013-04-18 20:35:40 +02:00
Christoph Bumiller
2d5d054752 nv50: fix 4th component of Lx_SINT/UINT formats 2013-04-18 20:35:40 +02:00
Tom Stellard
3b20170b2f r600g: Fix build with --enable-opencl 2013-04-18 11:24:48 -07:00
Brian Paul
877e3c1d42 mesa: enable GL_ARB_texture_float if TEXTURE_FLOAT_ENABLED is defined
Per message on mesa-users list, this wasn't working before.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-18 10:41:08 -06:00
Roland Scheidegger
50cbcf0c46 gallivm: change cubemaps / derivatives handling, take 55
Turns out the previous "fix" for handling per-pixel face selection and
derivatives didn't work out that well - the derivatives were wrong by
quite a bit, in theory transformation of the derivatives into cube space
should work, but would be _a lot_ more work than the "simplified" transform
used.
So, for explicit derivatives, I'm just giving up and going back to not honoring
them.
For implicit derivatives (and the fake explicit ones) however we try
something a little different, we just calculate rho as we would for a 3d
texture, that is after scaling the coords by the inverse major axis.
This gives the same results as calculating the derivs after projection of
the coords to the same face as long as all pixels hit the same face (and
only without rho_no_opt, otherwise it should be a bit worse). And when
not all pixels are hitting the same face, the results aren't so hot but
not catastrophically bad (I believe not off by more than a factor of 2 without
no_rho_approx and not more than sqrt(2) with no_rho_approx). I think this is
better than just picking the wrong face but who knows...

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-18 17:06:43 +02:00
Roland Scheidegger
0d07f05ee8 gallivm: Add no_rho_approx debug option
This will calculate rho correctly as
sqrt(max((ds/dx)^2 + (dt/dx)^2 + (dr/dx)^2, (ds/dy)^2 + (dt/dy)^2 + (dr/dy)^2))
instead of max(|ds/dx|,|dt/dx|,|dr/dx|,|ds/dy|,|dt/dy|,|dr/dy|)
(for 3 coords; 2 coords work analogously, and for 1 coord there's no point doing
the exact version), for both implicit and explicit derivatives.
While such approximation seems to be allowed in OpenGL some APIs may be less
forgiving, and the error can be quite large (sqrt(2) for 2 coords, sqrt(3) for
3 coords so wrong by nearly one mip level in the latter case).
This also helps to single out "real" bugs from "expected" ones, so it is debug
only (though at least combined with no_brilinear I didn't really see much of a
performance difference, but I only tested with a debug build; at least with
implicit mipmaps the instruction count is almost exactly the same, though the
instructions are more complex: 1 sqrt and mul/adds instead of and/max mostly).
The code when the option isn't set stays exactly the same.

v2: rename no_rho_opt to no_rho_approx.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-18 17:04:01 +02:00
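The difference between the two rho formulas in the no_rho_approx commit above can be sketched for the 2-coordinate case. This is an illustration only, not gallivm code: the function names are invented, and both functions return rho squared so that no sqrt (and no libm) is needed. Taking the square root of each result gives the rho values discussed in the message.

```c
#include <assert.h>

static float abs_f(float x)          { return x < 0.0f ? -x : x; }
static float max_f(float a, float b) { return a > b ? a : b; }

/* Approximation: rho = max(|ds/dx|, |dt/dx|, |ds/dy|, |dt/dy|).
 * Returns rho^2. */
float rho_approx_sq_2d(float dsdx, float dtdx, float dsdy, float dtdy)
{
   float rho = max_f(max_f(abs_f(dsdx), abs_f(dtdx)),
                     max_f(abs_f(dsdy), abs_f(dtdy)));
   return rho * rho;
}

/* Exact form: rho^2 = max((ds/dx)^2 + (dt/dx)^2, (ds/dy)^2 + (dt/dy)^2). */
float rho_exact_sq_2d(float dsdx, float dtdx, float dsdy, float dtdy)
{
   float len_x_sq = dsdx * dsdx + dtdx * dtdx;
   float len_y_sq = dsdy * dsdy + dtdy * dtdy;
   return max_f(len_x_sq, len_y_sq);
}
```

For ds/dx = dt/dx = 1 and zero y derivatives, the approximation yields rho^2 = 1 while the exact form yields rho^2 = 2, which is the sqrt(2) error for 2 coords mentioned in the commit message.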
José Fonseca
a930136977 llvmpipe: Support half integer pixel center fs coord.
Tested with graw/fs-fragcoord 2/3, and piglit
glsl-arb-fragment-coord-conventions.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-18 14:18:25 +01:00
José Fonseca
b191be52f2 llvmpipe: Remove the static interpolation.
No longer used.

If we ever want the old behavior we can run a loop unroller pass.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-18 14:18:22 +01:00
José Fonseca
6e833d4d09 gallivm: Drop pos arg from lp_build_tgsi_soa.
Never used.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-18 14:18:13 +01:00
Andreas Boll
34bec4a251 docs: update release notes for 9.2
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-18 09:36:57 +02:00
José Fonseca
392f6cfced ralloc: Move declarations before statements.
Trivial.  Should fix MSVC build.
2013-04-18 06:21:04 +01:00
Emil Velikov
c7b88ed16e configure: enable vdpau and xvmc detection, with gallium
Currently the vdpau and xvmc detection code is enabled for all builds. The
state trackers exist only within gallium. Enable detection whenever at least
one gallium driver is selected.

v2: removed stray '-a'
[mattst88 v3]: Removed stray $.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63645
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-04-17 18:19:34 -07:00
Matt Turner
ecdda414d3 i965: Check reg.nr for BRW_ARF_NULL instead of reg.file.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-17 18:19:34 -07:00
Matt Turner
60e4c99488 i965: Implement work-around for CMP with null dest on Haswell.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-17 18:19:34 -07:00
Stuart Abercrombie
1a59cc777f i915g: Release old fragment shader sampler views with current pipe
We were trying to use a destroy method from a deleted context.
This fix is based on what's in the svga driver.

Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
2013-04-17 18:15:12 -07:00
Paul Berry
417d8917d4 i965/vec4: Fix hypothetical use of uninitialized data in attribute_map[].
Fixes issue identified by Klocwork analysis:

    'attribute_map' array elements might be used uninitialized in this
    function (vec4_visitor::lower_attributes_to_hw_regs).

The attribute_map array contains the mapping from shader input
attributes to the hardware registers they are stored in.
vec4_vs_visitor::setup_attributes() only populates elements of this
array which, according to core Mesa, are actually used by the shader.
Therefore, when vec4_visitor::lower_attributes_to_hw_regs() accesses
the array to lower a register access in the shader, it should in
principle only access elements of attribute_map that contain valid
data.  However, if a bug ever caused the driver back-end to access an
input that was not flagged as used by core Mesa, then
lower_attributes_to_hw_regs() would access uninitialized memory, which
could cause illegal instructions to get generated, resulting in a
possible GPU hang.

This patch makes the situation more robust by using memset() to
pre-initialize the attribute_map array to zero, so that if such a bug
ever occurred, lower_attributes_to_hw_regs() would generate a (mostly)
harmless access to r0.  In addition, it adds assertions to
lower_attributes_to_hw_regs() so that if we do have such a bug, we're
likely to discover it quickly.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-17 17:41:55 -07:00
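The hardening described in the attribute_map commit above follows a common pattern: zero the table before populating only the used entries, and assert on lookups. The sketch below assumes a hypothetical table size and invented function names; it does not reproduce the i965 vec4 code.

```c
#include <assert.h>
#include <string.h>

#define NUM_ATTRIBS 8   /* hypothetical table size for the sketch */

/* Fill the map only for attributes flagged in used_mask; every other
 * entry is pre-initialized to 0 (standing in for "r0"). */
void setup_attribute_map(int map[NUM_ATTRIBS], unsigned used_mask,
                         int first_reg)
{
   int reg = first_reg;

   memset(map, 0, NUM_ATTRIBS * sizeof(map[0]));
   for (int i = 0; i < NUM_ATTRIBS; i++) {
      if (used_mask & (1u << i))
         map[i] = reg++;
   }
}

/* Look up the register for an attribute. A stricter variant could also
 * assert that the entry was actually populated. */
int lower_attribute(const int map[NUM_ATTRIBS], int attr)
{
   assert(attr >= 0 && attr < NUM_ATTRIBS);
   return map[attr];
}
```

An unused attribute then resolves to register 0 instead of whatever happened to be on the stack, so a back-end bug produces a predictable result rather than an arbitrary one.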
Dave Airlie
47bd6e46fe ralloc: don't write to memory in case of alloc fail.
For some reason I made this happen under indirect rendering,
I think we might have a leak, valgrind gave out, so I said I'd
fix the basic problem.

NOTE: This is a candidate for stable branches.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-18 09:50:42 +10:00
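The ralloc fix above amounts to checking the allocation result before writing any bookkeeping into it. A minimal sketch of that pattern follows; the header layout and function names are assumptions made for illustration and are not ralloc's real internals. The allocator is passed in as a function pointer only so the failure path can be exercised.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical bookkeeping header placed in front of each allocation. */
struct block_header {
   struct block_header *parent;
   size_t size;
};

/* Allocate size bytes plus a header. */
void *alloc_with_header(size_t size, void *(*alloc_fn)(size_t))
{
   struct block_header *hdr;

   assert(alloc_fn != NULL);
   hdr = alloc_fn(sizeof(*hdr) + size);
   if (hdr == NULL)
      return NULL;          /* bail out before touching the header */

   hdr->parent = NULL;
   hdr->size = size;
   return hdr + 1;          /* user data starts right after the header */
}

void free_with_header(void *ptr)
{
   if (ptr != NULL)
      free((struct block_header *)ptr - 1);
}

/* Allocator that always fails, for testing the error path. */
void *failing_alloc(size_t n) { (void)n; return NULL; }
```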
Brian Paul
815ca0bf38 mesa: generate glGetInteger/Boolean/Float/Doublev() code for all APIs
No longer pass -a flag to the get_hash_generate.py script to specify
OpenGL, ES1, ES2, etc.  This updates the autoconf, scons and android
build files too (so we can bisect).

This is the last of the API-dependent conditional compilation in
core Mesa.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-17 17:33:40 -06:00
Brian Paul
9835d90596 mesa: remove mfeatures.h
No longer needed.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-17 17:33:40 -06:00
Brian Paul
b76f6d9557 mesa: remove #include "mfeatures.h" from numerous source files
None of the remaining FEATURE_x symbols in mfeatures.h are used anymore.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-17 17:33:40 -06:00
Brian Paul
c6e00b6f6c glapi: no longer emit #include "mfeatures.h" in generated files
None of the symbols in mfeatures.h are used anymore.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-17 17:33:40 -06:00
Brian Paul
7fd12a8ae1 mesa: remove FEATURE_remap_table from remap.[ch]
It was always defined.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-17 17:33:39 -06:00
Brian Paul
0bcced7716 glapi: remove FEATURE_remap_table test (it's always defined)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-17 17:33:39 -06:00
Zack Rusin
8e7f7e9693 draw/so: respect leading/provoking vertex info
we were ignoring leading/provoking vertex settings which was
breaking decomposition of some strips.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-17 15:43:50 -07:00
Zack Rusin
6bb217a489 softpipe/so: use the correct variable for reporting stream out
we were using the wrong vars, reporting incorrect stream output
statistics.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-17 15:28:54 -07:00
Zack Rusin
cb58c79efb gallivm/gs: fix indirect addressing in geometry shaders
We were always treating the vertex index as a scalar but when the
shader is using indirect addressing it will be a vector of indices
for each channel. This was causing some nasty crashes inside
LLVM.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-17 15:28:54 -07:00
Brian Paul
02039066a8 st/wgl: fix issue with SwapBuffers of minimized windows
If a window's minimized we get a zero-size window.  Skip the SwapBuffers
in that case to avoid some warning messages with the VMware svga driver.
Internal bug #996695

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-17 16:23:19 -06:00
Ian Romanick
505ac6ddc6 intel: Don't dereference a NULL pointer if calloc fails
The caller of NewTextureObject does the right thing if NULL is returned,
so this function should do the right thing too.

NOTE: This is a candidate for stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-17 14:12:46 -07:00
Eric Anholt
50064164a4 i965: Trim trailing whitespace in brw_defines.h.
It was all over the formats section I wanted to edit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-17 14:12:01 -07:00
Laurent Carlier
867f71db6b r200: fix build failure introduced with cbbcb0247e
Signed-off-by: Brian Paul <brianp@vmware.com>
2013-04-17 13:48:40 -06:00
Brian Paul
1079475481 st/mesa: clean up formatting in st_cb_msaa.c
Insert blank lines, wrap lines, remove trailing whitespace, etc.
2013-04-17 12:28:13 -06:00
Brian Paul
3350ca223e mesa: remove gl_context::_TriangleCaps
No longer used anywhere.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:42 -06:00
Brian Paul
cbbcb0247e mesa: remove DD_TRI_LIGHT_TWOSIDE flag
v2: use conditional operator instead of bit shifting

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:42 -06:00
Brian Paul
c9bb052e31 mesa: remove DD_TRI_UNFILLED flag
Use alternate code in intel, r200, radeon drivers.
v2: use conditional operator instead of bit shifting

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:41 -06:00
Brian Paul
56dc53ed5b mesa: remove DD_TRI_SMOOTH flag
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:41 -06:00
Brian Paul
b32fb8ac9e mesa: remove DD_TRI_STIPPLE flag
Make it a local macro for the i915 driver.
v2: use conditional operator instead of bit shifting

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:41 -06:00
Brian Paul
dfb1474aac mesa: remove DD_TRI_OFFSET flag
Make it a local macro for the i915 driver.
v2: use conditional operator instead of bit shifting

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:40 -06:00
Brian Paul
c6a81448f8 mesa: remove DD_POINT_ATTEN flag
For the i915 driver, make it a local macro.
v2: use conditional operator instead of bit shifting

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:40 -06:00
Brian Paul
4f57fbb507 mesa: remove DD_POINT_SMOOTH flag
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:40 -06:00
Brian Paul
8ac8ae8360 mesa: remove DD_LINE_STIPPLE flag
For the i915 driver, make it a local macro.
v2: use conditional operator instead of bit shifting

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:40 -06:00
Brian Paul
55b2033f0a mesa: remove DD_SEPARATE_SPECULAR flag
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:39 -06:00
Brian Paul
c1c5d689c5 mesa: remove unused DD_LINE_SMOOTH flag
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:39 -06:00
Zack Rusin
f01f754ca1 draw/gs: make sure geometry shaders don't overflow
The specification says that the geometry shader should exit if the
number of emitted vertices is bigger or equal to max_output_vertices and
we can't do that because we're running in the SoA mode, which means that
our storing routines will keep getting called on channels that have
overflown (even though they will be masked out, but we just can't skip
them).
So we need some scratch area where we can keep writing the overflown
vertices without overwriting anything important or crashing.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-16 23:38:47 -07:00
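The scratch-area idea from the geometry shader overflow commit above can be sketched with a plain array: writes past the limit are redirected to one extra slot, so the store can always run without clobbering valid output. The limit, struct and function name here are hypothetical and much simpler than the draw module's SoA storage.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OUTPUT_VERTICES 4   /* hypothetical limit for the sketch */

struct gs_output {
   /* One extra slot at the end acts as the scratch area. */
   float verts[MAX_OUTPUT_VERTICES + 1];
   unsigned emitted;
};

/* Store a vertex. Once the limit is reached, further writes land in the
 * scratch slot and the emitted count stops growing. */
void gs_emit_vertex(struct gs_output *out, float value)
{
   unsigned slot;

   assert(out != NULL);
   slot = out->emitted < MAX_OUTPUT_VERTICES ? out->emitted
                                             : MAX_OUTPUT_VERTICES;
   out->verts[slot] = value;
   if (out->emitted < MAX_OUTPUT_VERTICES)
      out->emitted++;
}
```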
Zack Rusin
be497ac9d3 draw/gs: Return early if the passed geometry shader is null
Can happen if we were using stream output without geometry
shader, by returning early we avoid a crash.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-16 23:38:47 -07:00
Zack Rusin
80ee4a407a draw: implement pipeline statistics in the draw module
This is a basic implementation of the pipeline statistics in the
draw module. The interface is similar to the stream output statistics
and also requires that the callers explicitly enable it.
Included is the implementation of the interface in llvmpipe and
softpipe. Only softpipe enables the pipeline statistics capability
though because llvmpipe is lacking gathering of the fragment shading
and rasterization statistics.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-16 23:38:47 -07:00
Zack Rusin
b739376cff gallivm/gs: fix the end primitive calls
The issue with SOA execution and end_primitive opcode is that it
can be executed both when we haven't emitted any vertices, in
which case we don't want to emit an empty primitive, and when
the execution mask is zero and the execution should be skipped. We
handled only the latter of those conditions. Now we're combining the
execution mask with a mask created from emitted vertices to handle
both cases. As a result we don't need the pending_end_primitive
flag which was broken because it was static and could be affected
by both above mentioned conditions at run-time.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-16 23:38:46 -07:00
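The mask combination described in the end-primitive commit above can be modelled with per-lane counters and bitmasks. This is a scalar sketch under assumed names (LANES, struct gs_lanes, end_primitive); the real code builds the equivalent masks as LLVM vector values.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4   /* hypothetical SoA width */

struct gs_lanes {
   unsigned verts_in_prim[LANES];   /* vertices emitted since last end */
   unsigned prims[LANES];           /* completed primitives per lane */
};

/* End the current primitive only on lanes that are both active in
 * exec_mask and have emitted at least one vertex. Returns the mask of
 * lanes that actually ended a primitive. */
unsigned end_primitive(struct gs_lanes *s, unsigned exec_mask)
{
   unsigned emitted_mask = 0;
   unsigned mask;

   assert(s != NULL);
   for (int i = 0; i < LANES; i++) {
      if (s->verts_in_prim[i] > 0)
         emitted_mask |= 1u << i;
   }

   mask = exec_mask & emitted_mask;
   for (int i = 0; i < LANES; i++) {
      if (mask & (1u << i)) {
         s->prims[i]++;
         s->verts_in_prim[i] = 0;
      }
   }
   return mask;
}
```

Because both conditions are folded into one mask at run time, no separate static "pending" flag is needed in this model.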
Zack Rusin
93627e33cc tgsi/exec: geometry shaders are executed on a single primitive
which means that our execution mask in GS is equal to 1 not 0xf.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-16 23:38:46 -07:00
Zack Rusin
88db6f0a73 tgsi/exec: fix the udiv and umod instructions
Same as with llvmpipe: we can't be dividing/moding by zero and we
need to make sure that dividing/moding by zero produces 0xffffffff.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-16 23:38:46 -07:00
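The behavior required by the udiv/umod commit above is easy to state in scalar C: never execute the division when the divisor is zero, and return 0xffffffff instead. The helper names below are invented for this sketch and are not the tgsi_exec functions.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned divide that yields 0xffffffff for a zero divisor instead of
 * performing the (undefined) division. */
uint32_t safe_udiv(uint32_t a, uint32_t b)
{
   return b != 0 ? a / b : 0xffffffffu;
}

/* Unsigned modulo with the same zero-divisor convention. */
uint32_t safe_umod(uint32_t a, uint32_t b)
{
   return b != 0 ? a % b : 0xffffffffu;
}
```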
José Fonseca
b8f6858fcb gallivm: JIT symbol resolution with linux perf.
Details on docs/llvmpipe.html

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-17 16:50:52 +01:00
José Fonseca
35ef27d485 draw: Silence uninitialized var warnings.
Trivial.
2013-04-17 16:50:52 +01:00
Vincent Lejeune
2b9ed257c0 r600g/llvm: Use gprcount from llvm 2013-04-17 17:24:29 +02:00
Anuj Phogat
484b89ace9 intel: Add a null pointer check before dereferencing the pointer
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-17 08:17:47 -07:00
Emil Velikov
b03f6de63b docs: Update 'Making new mesa release'
Add a note to update PACKAGE_VERSION for Android and scons builds

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-17 08:48:15 -06:00
Emil Velikov
91984a732e docs: Add some missing release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-17 08:48:15 -06:00
Emil Velikov
cf9bf1d4a6 docs: move specs to a separate folder
Handle legacy/obsolete specs as well
List all specs in extensions.html
Mark 'OLD' extensions as obsolete in extensions.html
Update the spec location in old relnotes

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-17 08:48:14 -06:00
Emil Velikov
5fd3b3b085 docs: restructure release notes into separate folder
relnotes-*html > relnotes/*html
RELNOTES-* > relnotes/*
fix links, css and frames

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-17 08:48:14 -06:00
José Fonseca
50b3fc6204 gallium: Disambiguate TGSI_OPCODE_IF.
TGSI_OPCODE_IF condition had two possible interpretations:

- src.x != 0.0f

  - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was false for either
    vertex or fragment shaders
  - gallivm/llvmpipe
  - postprocess
  - vl state tracker
  - vega state tracker
  - most old drivers
  - old internal state trackers
  - many graw examples

- src.x != 0U

  - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was true for both
    vertex and fragment shaders
  - tgsi_exec/softpipe
  - r600
  - radeonsi
  - nv50

And drivers that use draw module also were a mess (because Mesa would
emit float IFs, but draw module supports native integers so it would
interpret IF arg as integers...)

This sort of works if the source argument is limited to float +0.0f or
+1.0f, integer 0, but would fail if source is float -0.0f, or integer in
the float NaN range.  It could also fail if source is integer 1, and
hardware flushes denormalized numbers to zero.

But with this change there are now two opcodes, IF and UIF, with clear
meaning.

Drivers that do not support native integers do not need to worry about
UIF.  However, for backwards compatibility with old state trackers and
examples, it is advisable that native integer capable drivers also
support the float IF opcode.

I tried to implement this for r600 and radeonsi based on the surrounding
code.  I couldn't do this for nouveau, so I just shunted IF/UIF
together, which matches the current behavior.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>

v2:
- Incorporate Roland's feedback.
- Fix r600_shader.c merge conflict.
- Fix typo in radeon, spotted by Michel Dänzer.
- Incorporate Christoph Bumiller's patch to handle TGSI_OPCODE_IF(float)
  properly in nv50/ir.
2013-04-17 10:54:08 +01:00
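The two interpretations of the IF condition listed in the TGSI_OPCODE_IF commit above disagree on specific bit patterns. The sketch below evaluates the same 32-bit value both ways; cond_if and cond_uif are names invented here to mirror the IF and UIF semantics, and the code assumes a 32-bit IEEE 754 float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Float interpretation (IF): src.x != 0.0f */
int cond_if(uint32_t bits)
{
   float f;

   assert(sizeof(f) == sizeof(bits));
   memcpy(&f, &bits, sizeof(f));   /* reinterpret the bits as a float */
   return f != 0.0f;
}

/* Integer interpretation (UIF): src.x != 0U */
int cond_uif(uint32_t bits)
{
   return bits != 0u;
}
```

For 0x00000000 and 0x3f800000 (float 1.0f) both functions agree, but 0x80000000 is float -0.0f, so cond_if() reports false while cond_uif() reports true.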
José Fonseca
f61b7da80e gallium: Eliminate TGSI_OPCODE_IFC.
Never used or implemented.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-17 10:54:08 +01:00
Kenneth Graunke
e7965598b7 i965: Enable the Bay Trail platform.
This patch adds PCI IDs for Bay Trail (sometimes called Valley View).
As far as the 3D driver is concerned, it's very similar to Ivybridge,
so the existing code should work just fine.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-16 15:08:12 -07:00
Christian König
13ddf9baf2 r600/uvd: cleanup disabling tiling on pre EG asics
Set transfer flag instead of fiddling with the tiling params directly.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-04-16 22:36:51 +02:00
Christian König
7490eeb3d6 autoconf: enable detection of vdpau and xvmc by default
Since we now have UVD support we should enable them by default.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-16 22:36:20 +02:00
Ian Romanick
025f03f3b7 mesa/swrast: Move memory allocation outside the blit loop
Assume the maximum pixel size (16 bytes per pixel).  In addition to
moving redundant malloc and free calls outside the loop, this fixes a
potential resource leak when a surface is mapped and the malloc fails.
This also makes blit_nearest look a bit more like blit_linear.

v2: Use MAX_PIXEL_BYTES instead of 16.  Suggested by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-16 10:18:14 -07:00
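The restructuring in the blit commit above can be sketched as follows: size the temporary buffer once for the worst-case pixel size, fail before any other resource is acquired, and free it once after the loop. The function name, parameters and the loop body are placeholders; only MAX_PIXEL_BYTES and its value of 16 come from the commit message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PIXEL_BYTES 16   /* maximum pixel size named in the commit */

/* Process 'rows' rows of 'width' pixels using one temporary row buffer
 * allocated outside the loop. Returns 1 on success, 0 on allocation
 * failure (in which case nothing else has been acquired yet). */
int blit_rows(unsigned width, unsigned rows, unsigned bytes_per_pixel,
              unsigned *rows_done)
{
   unsigned char *tmp;

   assert(width > 0 && bytes_per_pixel <= MAX_PIXEL_BYTES);
   assert(rows_done != NULL);

   tmp = malloc((size_t)width * MAX_PIXEL_BYTES);
   if (tmp == NULL)
      return 0;

   for (unsigned r = 0; r < rows; r++) {
      /* Placeholder for the per-row work that uses the buffer. */
      memset(tmp, 0, (size_t)width * bytes_per_pixel);
      (*rows_done)++;
   }

   free(tmp);   /* single free, after the last iteration */
   return 1;
}
```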
Ian Romanick
a27c6e1aea mesa/swrast: Move free calls outside the attachment loop
This was originally discovered by Klocwork analysis:

    Possible memory leak. Dynamic memory stored in 'srcBuffer0'
    allocated through function 'malloc' at line 566 can be lost at line
    746

However, I think the problem is actually much worse.  Since the memory
is freed after the first pass through the loop, the released buffer may
be used on the next iteration!

NOTE: This is a candidate for stable release branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-16 10:13:48 -07:00
Ian Romanick
6758498eb7 mesa/swrast: Refactor no-memory error checking in blit_linear
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-16 10:13:10 -07:00
Martin Andersson
4c3ed79566 r600g: Workaround for a hardware bug with nested loops on Cayman
There is a hardware bug on Cayman where a BREAK/CONTINUE followed by
LOOP_STARTxxx for nested loops may put the branch stack into a state
such that ALU_PUSH_BEFORE doesn't work as expected. Work around this
by replacing the ALU_PUSH_BEFORE with a PUSH + ALU.

Fixes piglit tests EXT_transform_feedback/order*

v2: Use existing loop count and improve comment
v3: [Vadim Girlin] Set jump address for PUSH instructions

NOTE: This is a candidate for the 9.1 branch

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-16 18:02:11 +04:00
Marek Olšák
8616b224bf gallium/hud: fix FPS computation for framerate > 4.2k 2013-04-16 13:56:47 +02:00
Marek Olšák
332af88c39 gallium/hud: increase vertex buffer size for background black rectangles
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-16 13:56:47 +02:00
Marek Olšák
0108114619 gallium/hud: update the contents of GALLIUM_HUD=help
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-16 13:56:47 +02:00
Marek Olšák
30284f8892 gallium/hud: remove pipeline-statistics- prefix in query names
for the env var string not to be awfully long

v2: fix bug in indexing of "name"

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-16 13:56:47 +02:00
Marek Olšák
dfe5367f0f r600g: implement pipeline statistics query 2013-04-16 13:56:47 +02:00
Marek Olšák
817723baf8 winsys/radeon: use query_value for timestamp, remove query_timestamp 2013-04-16 13:56:47 +02:00
Marek Olšák
413ca78af3 r600g: add a debug flag for printing virtual addresses of resources 2013-04-16 13:56:47 +02:00
Marek Olšák
05fa3595e0 r600g: add a query returning the amount of time spent during bo_map sync. 2013-04-16 13:56:47 +02:00
Matt Turner
b3f1f665b0 build: Get rid of GALLIUM_WINSYS_DIRS
configure still uses it to print the enabled winsys.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:05:55 -07:00
Matt Turner
3a6e548a85 build: Get rid of GALLIUM_TARGET_DIRS
configure still uses it to print the enabled targets.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:05:55 -07:00
Matt Turner
2f7a37d858 build: Build pipe-loader before gallium tests
And don't build it from other Makefiles. That's awful, and breaks
distclean.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:05:55 -07:00
Matt Turner
0d3b1b0e2e build: Get rid of GALLIUM_MAKE_DIRS
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:05:55 -07:00
Matt Turner
69b69b1a0b build: Stop using GALLIUM_STATE_TRACKERS_DIRS for SUBDIRS
configure still uses it to print the enabled state trackers.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:04:26 -07:00
Matt Turner
13a7010c21 build: Get rid of DRIVER_DIRS
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:04:26 -07:00
Matt Turner
8341effd4a build: Stop AC_SUBST'ing DRI_DIRS and GALLIUM_DRIVERS_DIRS
Neither are used in Makefile.ams.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:04:26 -07:00
Matt Turner
70531b4a25 build: Remove GALLIUM_DIRS
It's always constant anyway.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:04:26 -07:00
Matt Turner
a9676ae44a build: Get rid of SRC_DIRS
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:04:26 -07:00
Matt Turner
691c30404d build: Get rid of CORE_DIRS
A step toward working make dist/distcheck.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:04:25 -07:00
Matt Turner
d5e9426b96 build: Move src/mapi/mapi/* to src/mapi/
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:04:25 -07:00
Matt Turner
3c690524e2 build: Rename sources.mak -> Makefile.sources
For the sake of consistency.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:04:25 -07:00
Tom Stellard
d50343dff1 radeonsi: Read config values from the .AMDGPU.config ELF section
Instead of emitting configuration values (e.g. number of gprs used) in a
predefined order, the LLVM backend now emits these values in
register/value pairs.  The first dword contains the register address and
the second dword contains the value to write.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-15 10:54:30 -07:00
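The register/value pair layout described in the .AMDGPU.config commit above lends itself to a simple lookup loop. The sketch below works on an array of already-decoded dwords; the function name and the register addresses used with it are hypothetical, and ELF section parsing and byte order are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scan (register address, value) dword pairs for 'reg'. On a match,
 * store the value and return 1; otherwise return 0. A trailing unpaired
 * dword is ignored. */
int find_config_value(const uint32_t *dwords, size_t num_dwords,
                      uint32_t reg, uint32_t *value)
{
   assert(dwords != NULL && value != NULL);

   for (size_t i = 0; i + 1 < num_dwords; i += 2) {
      if (dwords[i] == reg) {
         *value = dwords[i + 1];
         return 1;
      }
   }
   return 0;
}
```

Compared with a fixed ordering, this layout lets the consumer pick out the registers it knows about and skip the rest, regardless of the order in which they were emitted.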
Tom Stellard
9277b04c02 radeon/llvm: Handle ELF formatted binary output from the LLVM backend 2013-04-15 10:54:29 -07:00
Tom Stellard
7782d19cdc radeon/llvm: Use a struct for storing compiled code 2013-04-15 10:13:10 -07:00
Roland Scheidegger
1d6eb23f2d gallivm: fix small but severe bug in handling multiple lod level strides
Inserting the value for the second quad in the wrong place for the
following shuffle. This meant the row or image stride was undefined which is
quite catastrophic, can lead to bogus texels fetched or just segfault.
This code is only hit for SoA path currently, still surprising it
didn't crash more or cause more visible issues (I think llvm used a
broadcast shuffle for the undefined parts of the vector, hence the undefined
value for the second quad was just the same as that from the first quad,
so as long as both quads hit the same mip level everything was fine, and since
lower mips always have the same large stride it made it less likely to
hit out-of-bound memory in case of differing lods).

Note: this is a candidate for stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-15 15:23:40 +02:00
Francisco Jerez
02b808b08a clover: Fix usage of incorrect object as destination in clEnqueueCopyBufferToImage.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
2013-04-13 14:24:10 +02:00
Francisco Jerez
1a8ad6c2e3 clover: Define platform class and merge with device_registry.
Null platform IDs are OK according to the spec, but some applications have
been reported to get paranoid and assume that our NULL platform is unusable.

As it doesn't hurt to have device enumeration separate from the rest of the
device code (quite the opposite, it makes the code cleaner), make the API use
an actual platform object that keeps track of the available devices instead of
the former NULL pointer.

Reported-and-reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
2013-04-13 14:20:16 +02:00
Francisco Jerez
6ace452055 clover: Add missing fields to the module serializer.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
2013-04-13 14:12:49 +02:00
Eric Anholt
1658efc42c i965: Shut up the last release build warning.
I don't see a sensible value to use in this path, but we shouldn't ever
hit this outside of developer new-texture-target enabling.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:14 -07:00
Eric Anholt
dcb1b89c65 i965: Silence one more compile warning.
We don't want to store this thing in the class, and we do need the
definition to be at the top of the function and held onto until the end
here, so there's not much to do besides (void) reference it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:14 -07:00
Eric Anholt
dea70404eb i965: Fix a warning in the release build.
This was copy and pasted from can_reswizzle_dst(), and we can just fold it
in instead to avoid the warning.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:14 -07:00
Eric Anholt
28170c5b7f i965: Fix an unused variable warning in the release build.
I think this actually clarifies what's going on in the asserts a bit,
given how many regions we've got floating around.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:13 -07:00
Eric Anholt
248175ab3b i965: Fix an unused variable warning in the release build.
It's used in an assert, but we have this as a member of the class anyway.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:13 -07:00
Eric Anholt
6cec233c62 intel: Return failure properly in the texsubimage blit path.
We assert that failure doesn't happen, but it fixes a warning in the
release build and it would at least give working behavior for a user by
falling back to the normal texsubimage path.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:13 -07:00
Eric Anholt
b681a89588 intel: Fix a warning in the release build.
This was silly -- checking that we didn't overflow the array by dividing
the array size by 2 and then multiplying it back up by 2.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:13 -07:00
Eric Anholt
1433936fe5 intel: Fix an unused variable warning in the release build.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:13 -07:00
Eric Anholt
9167ba8584 intel: Improve diagnostics for emit_linear_blit failure path.
This fixes unused variable warnings in the release build, and should be
more useful if it ever triggers.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:13 -07:00
Eric Anholt
aceba66795 i965: Fix error path for MCS allocation.
Asserts don't stop execution in release builds, so we would continue on to
use an uninitialized format value.  Just take the failure path, which
appears to continue up the call stack for a while.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:12 -07:00
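To illustrate the MCS allocation fix in aceba66795 above: assert() compiles to nothing when NDEBUG is defined, so an error path that relies on it alone falls through in release builds. A minimal sketch of the pattern, assuming a hypothetical choose_mcs_format() helper, a hypothetical McsFormat enum, and illustrative sample counts (none of these are taken from the driver):

```cpp
#include <cassert>

// Hypothetical format enum; the real driver uses its own format type.
enum McsFormat { MCS_FORMAT_4X, MCS_FORMAT_8X };

// Returns false on an unsupported sample count instead of relying on the
// assert alone, so release builds (NDEBUG) never read *format while unset.
bool choose_mcs_format(unsigned num_samples, McsFormat *format)
{
    switch (num_samples) {
    case 4:
        *format = MCS_FORMAT_4X;
        return true;
    case 8:
        *format = MCS_FORMAT_8X;
        return true;
    default:
        assert(!"unsupported sample count"); // no-op when NDEBUG is defined
        return false;                         // real failure path for release builds
    }
}
```

A caller would check the return value and propagate the failure rather than use the format.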
Eric Anholt
331766b9a2 i830: Move assert-only code into the assert.
The call has no side effects, and moving it into the assert cleans up a
compile warning in the release build.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:12 -07:00
Eric Anholt
adf251406b i965/fs: Fix some untriggered optimization bugs with uncompressed/sechalf.
We have this support for firsthalf/sechalf instructions, which would be
called in the !has_compr4 (aka original gen4) 16-wide case.  We currently
only support 16-wide for gen5+, so we weren't tripping over this, but it
would have been a problem if we ever try to enable it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:12 -07:00
Eric Anholt
eaca8a94e2 i965/fs: Add basic-block-level dead code elimination.
This is a poor substitute for proper global dead code elimination that
could replace both our current paths, but it was very easy to write.  It
particularly helps with Valve's shaders that are translated out of DX
assembly, which has been register allocated and thus have a bunch of
unrelated uses of the same variable (some of which get copy-propagated
from and then left for dead).

shader-db results:
total instructions in shared programs: 1735753 -> 1731698 (-0.23%)
instructions in affected programs:     492620 -> 488565 (-0.82%)

v2: Fix comment typo

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:12 -07:00
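The basic-block-level dead code elimination described in eaca8a94e2 above can be approximated as follows. This is only a sketch of the general technique, not the pass from the commit: the Inst struct and eliminate_dead_writes() are hypothetical, every write is assumed to overwrite its whole destination register, and no instruction is assumed to have side effects (a real pass would have to skip partial register updates).

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified instruction: one destination register and up to
// two source registers. A negative number means "no register".
struct Inst {
    int dst;
    int src0;
    int src1;
};

// Removes writes that are overwritten later in the same basic block without
// being read in between. Writes still pending at the end of the block are
// kept, because another block may read them.
std::vector<Inst> eliminate_dead_writes(const std::vector<Inst> &block)
{
    std::vector<bool> dead(block.size(), false);
    std::unordered_map<int, std::size_t> unread; // reg -> index of last unread write

    for (std::size_t i = 0; i < block.size(); i++) {
        const Inst &inst = block[i];

        // A read makes the pending write to that register live.
        unread.erase(inst.src0);
        unread.erase(inst.src1);

        if (inst.dst < 0)
            continue;

        // Overwriting a register whose last write was never read kills
        // that earlier write.
        auto it = unread.find(inst.dst);
        if (it != unread.end())
            dead[it->second] = true;
        unread[inst.dst] = i;
    }

    std::vector<Inst> kept;
    for (std::size_t i = 0; i < block.size(); i++) {
        if (!dead[i])
            kept.push_back(block[i]);
    }
    return kept;
}
```

A single pass like this does not notice that removing one instruction may leave its sources unread; repeating it until nothing changes would catch those cases.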
Eric Anholt
36d0fde603 i965/fs: Remove incorrect note of writing attr in centroid workaround.
This instruction doesn't update its IR destination, it just moves from
payload to f0.  This caused the dead code elimination pass I'm adding to
dead-code-eliminate the first step of interpolation.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:12 -07:00
Eric Anholt
2cb7f1e766 i965/fs: Add a helper function for checking for partial register updates.
These checks were all over, and every time I wrote one I had to try to
decide again what the cases were for partial updates.

v2: Fix inadvertent reladdr check removal.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:12 -07:00
Eric Anholt
df25b4f3cf mesa: Add a macro to bitset for determining bitset size.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:12 -07:00
Eric Anholt
b5a0f59c0f i965: Fix compiler warnings since the introduction of texture multisample.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:11 -07:00
Ian Romanick
1faaa411c7 mesa: Don't leak gl_context::BeginEnd at context destruction
The other dispatch tables (Exec and Save) are freed, but BeginEnd is
never freed.  This was found by inspection while investigating the leak of
shared state in _mesa_initialize_context.

NOTE: This is a candidate for stable branches

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-12 16:24:48 -07:00
Ian Romanick
6e06550e4e mesa: Don't leak shared state when context initialization fails
Back up at line 1017 (not shown in patch), we add a reference to the
shared state.  Several places after that may divert to the error
handler, but, as far as I can tell, nothing ever unreferences the shared
state.

Fixes issue identified by Klocwork analysis:

    Resource acquired to 'shared->TexMutex' at line 1012 may be lost
    here. Also there is one similar error on line 1087.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-12 16:24:48 -07:00
Ian Romanick
f730c210b8 egl/dri2: NULL check value returned by dri2_create_surface
dri2_create_surface can fail for a variety of reasons, including bad
input data.  Dereferencing the NULL pointer and crashing is not okay.

Fixes issue identified by Klocwork analysis:

    Pointer 'surf' returned from call to function 'dri2_create_surface'
    at line 285 may be NULL and will be dereferenced at line 291.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-12 16:24:48 -07:00
Ian Romanick
2cc0b3294a mesa: NULL check the pointer before trying to dereference it
Duh.

Fixes issues identified by Klocwork analysis:

    Pointer 'table' returned from call to function 'calloc' at line 115
    may be NULL and will be dereferenced at line 117.

and

    Suspicious dereference of pointer 'table' before NULL check at line
    119.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-12 16:24:48 -07:00
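Both Klocwork reports quoted in 2cc0b3294a above come down to the same ordering rule: check the pointer returned by calloc before the first dereference. A minimal sketch, assuming a hypothetical Table type and table_create()/table_destroy() helpers (not the structure touched by the commit):

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical table type, used only for illustration.
struct Table {
    unsigned size;
    int *entries;
};

// Returns nullptr on allocation failure; no field is touched before the
// corresponding pointer has been checked.
Table *table_create(unsigned size)
{
    Table *table = static_cast<Table *>(std::calloc(1, sizeof(Table)));
    if (table == nullptr)
        return nullptr;

    table->entries = static_cast<int *>(std::calloc(size, sizeof(int)));
    if (table->entries == nullptr) {
        std::free(table);
        return nullptr;
    }
    table->size = size;
    return table;
}

// Safe to call with nullptr.
void table_destroy(Table *table)
{
    if (table != nullptr) {
        std::free(table->entries);
        std::free(table);
    }
}
```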
Ian Romanick
ee55b845d2 glsl: Fix hypothetical NULL dereference related to process_array_type
Ensure that process_array_type never returns NULL, and let
process_array_type handle the case where the supplied base type is NULL.

Fixes issues identified by Klocwork analysis:

    Pointer 'type' returned from call to function 'get_type' at line
    1907 may be NULL and may be dereferenced at line 1912.

and

    Pointer 'field_type' checked for NULL at line 4160 will be
    dereferenced at line 4165. Also there is one similar error on line
    4174.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-12 16:24:44 -07:00
Ian Romanick
278c9af85e glsl: Fix hypothetical NULL dereference in ast_process_structure_or_interface_block
Fixes issue identified by Klocwork analysis:

    Pointer 'field_type' returned from call to function 'glsl_type' at
    line 4126 may be NULL and may be dereferenced at line 4139.  Also
    there are 2 similar errors on line(s) 4165, 4174.

In practice, it should be impossible to actually get NULL in here
because a syntax error would have already caused compilation to halt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-12 16:24:39 -07:00
Tom Stellard
c6a86fb563 r300g: Fix bug in OMOD optimization
https://bugs.freedesktop.org/show_bug.cgi?id=60503

NOTE: This is a candidate for the stable branches.
2013-04-12 08:33:31 -07:00
Emil Velikov
ac1118d53c nvc0: set ret variable if launch desc allocation failed
Pointed out by gcc

nve4_compute.c: In function 'nve4_launch_grid':
nve4_compute.c:511:7: warning: 'ret' may be used uninitialized in
 this function [-Wmaybe-uninitialized]
    if (ret)
       ^

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Edit by Christoph Bumiller:
Set it to -1 to indicate failure and only when it's actually required.
2013-04-12 17:15:14 +02:00
Emil Velikov
48bcb94dc3 nvc0: bail out early during nve4_compute_setup()
Exit gracefully rather than trying to create a random object, whenever the
chipset is unknown

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-04-12 17:10:11 +02:00
Emil Velikov
e28c266682 nvc0: compile nve4_cache_split_name() only in debug build
As otherwise it is unused - pointed out by gcc

nve4_compute.c:586:20: warning: 'nve4_cache_split_name' defined but not used [-Wunused-function]
 static const char *nve4_cache_split_name(unsigned value)
                    ^

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-04-12 17:09:03 +02:00
Emil Velikov
249f3d73cf nv50/codegen: do not emitATOM() if the subOp is unknown
In a debug build we'll hit the assert; in a release build we would emit random data
because subOp is used uninitialised. Spotted by gcc

codegen/nv50_ir_emit_nv50.cpp: In member function 'void nv50_ir::CodeEmitterNV50::emitATOM(const nv50_ir::Instruction*)':
codegen/nv50_ir_emit_nv50.cpp:1554:12: warning: 'subOp' may be used uninitialized in this function [-Wmaybe-uninitialized]
    uint8_t subOp;
            ^

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-04-12 17:08:26 +02:00
Christoph Bumiller
4da54c91d2 nvc0: implement multisample textures 2013-04-12 13:02:18 +02:00
Christoph Bumiller
71c1c8a9b8 nvc0: patch up TEX cases with 5 or 6 sources on nve4
Hackishly fixes alignment requirement of 2nd tuple for now.
2013-04-12 11:41:35 +02:00
Christoph Bumiller
2b62ba7cb0 nvc0: fix 2D engine MS2 resolve 2013-04-12 11:41:35 +02:00
Christoph Bumiller
69804c2ab8 nv50,nvc0: add RGBX16/32_FLOAT formats 2013-04-12 11:41:35 +02:00
Matt Turner
195a6cca3c i965/vs: Print error if vertex shader fails to compile.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-11 17:22:07 -07:00
Matt Turner
32a8e87766 i965: NULL check prog on shader compilation failure.
Also change if (shader) to if (prog) for consistency.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-11 17:21:13 -07:00
José Fonseca
ed9687cf1b scons: Add st_cb_msaa.c to source list. 2013-04-11 22:37:34 +01:00
Dave Airlie
f024c72476 r600g: add get_sample_position support (v3)
v2: I rewrote this to use the sample positions properly.
v3: rewrite properly to use bitfield to cast back to signed ints

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-11 21:09:29 +01:00
Dave Airlie
f152da6bf9 st/mesa: add support for ARB_texture_multisample (v3)
This adds support to the mesa state tracker for ARB_texture_multisample.

Hardware doesn't seem to use different texture instructions, so
I don't think we need to create one for TGSI at this time.

Thanks to Marek for fixes to sample number picking.

v2: idr pointed out a bug in how we picked the max sample counts,
use new internal format chooser interface to pick proper answers.
v3: use st_choose_format directly, it was okay, fix anding of masks.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-11 21:09:29 +01:00
Dave Airlie
1d90ee5ef5 st/mesa: add support for get sample position
This just calls into the gallium interface.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-11 21:09:28 +01:00
Dave Airlie
cc906396c7 gallium: add get_sample_position interface
This is to be used to implement glGet GL_SAMPLE_POSITION.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-11 21:09:28 +01:00
Dave Airlie
184278a804 r600g: fix two issues in compressed msaa reading code
I've no idea when sample_chan would ever be 4 here, but 4 is most
definitely wrong, array textures have it as 3 as well.

Also the cayman code though unused is obviously wrong.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-11 21:09:27 +01:00
Paul Berry
e9fa3a9448 i965/vs: Don't hardcode DEBUG_VS in generic vec4 code.
Since the vec4_visitor and vec4_generator classes are going to be
re-used for geometry shaders, we can't enable their debug
functionality based on (INTEL_DEBUG & DEBUG_VS) anymore.  Instead, add
a debug_flag boolean to these two classes, so that when they're
instantiated the caller can specify whether debug dumps are needed.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:26 -07:00
Paul Berry
defdb310b7 i965/vs: Generalize computation of array strides in preparation for GS.
Geometry shader inputs are arrays, but they use an unusual array
layout: instead of all array elements for a given geometry shader
input being stored consecutively, all geometry shader inputs are
interleaved into one giant array.  As a result, the array stride we
use to access geometry shader inputs must be equal to the size of the
input VUE, rather than the size of the array element.

This patch introduces a new virtual function,
vec4_visitor::compute_array_stride(), which will allow geometry shader
compilation to specialize the computation of array stride to account
for the unusual layout of geometry shader input arrays.  It also
renames the local variable that the ir_dereference_array visitor uses
to store the stride, to avoid confusion.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:26 -07:00
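The virtual compute_array_stride() hook from defdb310b7 above can be pictured roughly like this. It is a sketch under assumptions: the class names, the array_element_offset() helper, and the plain-integer sizes are hypothetical and much simpler than the real visitor classes.

```cpp
#include <cassert>

// Hypothetical stand-ins for the visitor classes; sizes are in registers.
class Vec4VisitorSketch {
public:
    virtual ~Vec4VisitorSketch() {}

    // Default: elements of an array are stored consecutively, so the
    // stride is the size of one element.
    virtual int compute_array_stride(int element_size) const
    {
        return element_size;
    }

    // Offset of array[index], computed with whatever stride the stage needs.
    int array_element_offset(int element_size, int index) const
    {
        return compute_array_stride(element_size) * index;
    }
};

class GsVisitorSketch : public Vec4VisitorSketch {
public:
    explicit GsVisitorSketch(int input_vue_size) : vue_size_(input_vue_size) {}

    // GS inputs are interleaved per vertex, so consecutive elements of one
    // input are a whole input VUE apart.
    int compute_array_stride(int /* element_size */) const override
    {
        return vue_size_;
    }

private:
    int vue_size_;
};
```

The shared dereference code only ever calls the virtual function, so it needs no knowledge of which stage it is compiling.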
Paul Berry
444fce6398 i965/vs: Generalize attribute setup code in preparation for GS.
This patch introduces a new function,
vec4_visitor::lower_attributes_to_hw_regs(), which replaces registers
of type ATTR in the instruction stream with the hardware registers
that store those attributes.  This logic will need to be common
between the vertex and geometry shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:26 -07:00
Paul Berry
28fe02ce6e i965/vs: Generalize vertex emission code in preparation for GS.
This patch introduces a new function, vec4_visitor::emit_vertex(),
which contains the code for emitting vertices that will need to be
common between the vertex and geometry shaders.

Geometry shaders will need to use a different message header, and a
different opcode, for their URB writes, so we introduce virtual
functions emit_urb_write_header() and emit_urb_write_opcode() to take
care of the GS-specific behaviours.

Also, since vertex emission happens at the end of the VS, but in the
middle of the GS, we need to be sure to only call
emit_shader_time_end() during VS vertex emission.  We accomplish this
by moving the call to emit_shader_time_end() into the VS
implementation of emit_urb_write_opcode().

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
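The split described in 28fe02ce6e above (a shared emit_vertex() that defers the header and opcode to virtual functions) might look like the sketch below. The class names and the string log are hypothetical; "emitting" here only records step names so that the call order is visible.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: "emitting" just records step names in a log.
class Vec4EmitterSketch {
public:
    virtual ~Vec4EmitterSketch() {}

    // Shared sequence used by every stage.
    void emit_vertex()
    {
        emit_urb_write_header();
        log.push_back("payload");
        emit_urb_write_opcode();
    }

    std::vector<std::string> log;

protected:
    virtual void emit_urb_write_header() = 0;
    virtual void emit_urb_write_opcode() = 0;
};

class VsEmitterSketch : public Vec4EmitterSketch {
protected:
    void emit_urb_write_header() override { log.push_back("vs_header"); }

    // Vertex emission is the last thing a VS does, so shader-time
    // accounting ends here and never runs for other stages.
    void emit_urb_write_opcode() override
    {
        log.push_back("shader_time_end");
        log.push_back("vs_urb_write");
    }
};

class GsEmitterSketch : public Vec4EmitterSketch {
protected:
    void emit_urb_write_header() override { log.push_back("gs_header"); }
    void emit_urb_write_opcode() override { log.push_back("gs_urb_write"); }
};
```

Because the GS version can be called several times in the middle of a program, keeping the shader-time call out of the shared emit_vertex() is what makes the reuse safe.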
Paul Berry
7214451bdc i965/vs: rename vec4_generator::generate_vs_instruction.
Since this function is going to get used for geometry shaders too, it
deserves a more generic name: generate_vec4_instruction.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
Paul Berry
9bb6840b28 i965/vs: Generalize data structures pointed to by vec4_generator.
This patch removes the following field from vec4_generator, since it
is not used:

- struct brw_vs_compile *c

And changes the following field:

- struct gl_vertex_program *vp => struct gl_program *prog

With these changes, vec4_generator no longer refers to any VS-specific
data structures.  This will pave the way for re-using it for geometry
shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

v2: Use the name "prog" rather than "p".

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
Paul Berry
4d773603d3 i965/vs: Rename vec4_generator::prog to shader_prog.
The next patch is going to change the type of vec4_generator::vp from
struct gl_vertex_program * to struct gl_program *, and rename it.  The
sensible name to change it to is vec4_generator::prog.  However, prog
is already used.  Since the existing vec4_generator::prog is of type
struct gl_shader_program, it makes sense to rename it to shader_prog.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
Paul Berry
5743bea0ba i965/vs: move VS-specific data members to vec4_vs_visitor.
This patch moves the following data structures from vec4_visitor to
vec4_vs_visitor, since they contain VS-specific data:

- struct brw_vs_compile *c (renamed to vs_compile)
- struct brw_vs_prog_data *prog_data (renamed to vs_prog_data)
- src_reg *vp_temp_regs
- src_reg vp_addr_reg

Since brw_vs_compile and brw_vs_prog_data also contain vec4-generic
data, the following pointers are added to the base class, to allow it
to access the vec4-generic portions of these data structures:

- struct brw_vec4_compile *c
- struct brw_vec4_prog_key *key
- struct brw_vec4_prog_data *prog_data

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

v2: Use shorter names in the base class and longer names in the
derived class.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
Paul Berry
0ce95222af i965/vs: move ARB_vertex_program functions to vec4_vs_visitor.
This patch moves functions from vec4_visitor to vec4_vs_visitor that
deal with ARB (assembly) vertex programs.  There's no point in having
these functions in the base class since we don't intend to support
assembly programs for the GS stage.  The following functions are
moved:

- setup_vp_regs
- get_vp_dst_reg
- get_vp_src_reg

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
Paul Berry
42a3d63dd4 i965/vs: Add virtual function make_reg_for_system_value().
The system values handled by vec4_visitor::visit(ir_variable *) are
VS-specific (vertex ID and instance ID).  This patch moves the
handling of those values into a new virtual function,
make_reg_for_system_value(), so that this VS-specific code won't be
inherited by geometry shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
Paul Berry
8941f73c7c i965/vs: Make some vec4_visitor functions virtual.
This patch makes the following vec4_visitor functions virtual, since
they will need to be implemented differently for vertex and geometry
shaders.  Some of the functions are renamed to reflect their generic
purpose, rather than their VS-specific behaviour:

- setup_attributes
- emit_attribute_fixups (renamed to emit_prolog)
- emit_vertex_program_code (renamed to emit_program_code)
- emit_urb_writes (renamed to emit_thread_end)

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
Paul Berry
e9be5a05f7 i965/vs: Make vec4_vs_visitor class derived from vec4_visitor.
This patch just creates the derived class; later patches will migrate
VS-specific functions and data structures from the base class into the
derived class.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
Paul Berry
5fff3752c8 i965/vs: split brw_vs_prog_data into generic and VS-specific parts.
This will allow the generic parts to be re-used for geometry shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

v2: Put urb_read_length and urb_entry_size in the generic struct.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:24 -07:00
Paul Berry
0c994f181c i965/vs: split brw_vs_prog_key into generic and VS-specific parts.
This will allow the generic parts to be re-used for geometry shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:24 -07:00
Paul Berry
d7af636473 i965/vs: split brw_vs_compile into generic and VS-specific parts.
This will allow the generic parts to be re-used for geometry shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:24 -07:00
Paul Berry
09cd6e06d2 i965/vs: Remove brw_vs_prog_data pointer from brw_vs_compile.
In patches that follow, we'll be splitting structs brw_vs_prog_data
and brw_vs_compile into a vec4-generic base struct and a VS-specific
derived struct (this will allow the vec4-generic code to be re-used
for geometry shaders).  Having brw_vs_compile point to
brw_vs_prog_data makes it difficult to do this cleanly.

Fortunately most of the functions that use brw_vs_compile (those in
the vec4_visitor class) already have access to brw_vs_prog_data
through a separate pointer (vec4_visitor::prog_data).  So all we have
to do is use that pointer consistently, and plumb prog_data through
the few remaining functions that need access to it.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:24 -07:00
Paul Berry
deffbbed4e i965: Generalize computation of VUE map in preparation for GS.
This patch modifies the arguments to brw_compute_vue_map() so that
they no longer bake in the assumption that we are generating a VUE map
for vertex shader outputs.  It also makes the function non-static so
that we can re-use it for geometry shader outputs.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:24 -07:00
Paul Berry
b29613371c i965/vs: Make type of vec4_visitor::vp more generic.
The vec4_visitor functions don't use any VS specific data from
vec4_visitor::vp.  So rename it to "prog" and change its type from
struct gl_vertex_program * to struct gl_program *.  This will allow
the code to be re-used for geometry shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

v2: Use the name "prog" rather than "p".

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:24 -07:00
Paul Berry
fe97f26c86 i965: Rename backend_visitor::prog to shader_prog.
The next patch is going to change the type of vec4_visitor::vp from
struct gl_vertex_program * to struct gl_program *, and rename it.  The
sensible name to change it to is vec4_visitor::prog.  However, prog is
already used in backend_visitor (which vec4_visitor derives from).
Since backend_visitor::prog is of type struct gl_shader_program *, it
makes sense to rename it to shader_prog.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:24 -07:00
Paul Berry
5b0bd8ece8 glsl: Fix (and validate) comment above glsl_type::name.
The comment above glsl_type::name claimed that it could sometimes be
NULL.  This was wrong--it is never NULL.  Many error handling paths
would segfault if it were.  (Anonymous structs are assigned names like
"#anon_struct_0001"--see the ast_struct_specifier constructor in
glsl_parser_extras.cpp.)

Fix the comment and add assertions to validate that it really is never
NULL.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-04-11 09:25:24 -07:00
Christian König
5b2855bfe7 radeon/uvd: add UVD implementation v5
Just everything you need for UVD with r600g and radeonsi.

v2: move UVD code to radeon subdir, clean up build system additions,
    remove an unused SI function, disable tiling on SI for now.
v3: some minor indentation fix and rebased
v4: dpb size calculation fixed
v5: implement proper fall-back in case the kernel doesn't support UVD,
    based on patches from Andreas Boll but cleaned up a bit more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-11 17:10:28 +02:00
Christian König
f91e4d2c9d radeon/winsys: add uvd ring support to winsys v3
Separated from UVD patch for clarity.

v2: sync with next tree for 3.10
v3: as pointed out by Andreas Boll, check for drm minor >= 32

http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-3.10-wip

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-04-11 17:10:01 +02:00
Dave Airlie
cb12bf7606 st/mesa: fix UBO offsets.
Reported and tested by degasus on #radeon.

Note: This is a candidate for the 9.1 branch

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-11 15:20:19 +10:00
Ralf Jung
3998f8c6b5 egl/x11: Fix initialisation of swap_interval
The EGLConfig attributes EGL_MIN/MAX_SWAP_INTERVAL were incorrectly set to
0 and 0. This prevented clients from setting the swap interval to a
reasonable value, like 1 or 2.

Swap interval worked correctly in Mesa 9.0. The commit below introduced
the bug.

    commit 7e9bd2b2ed
    Author: Eric Anholt <eric@anholt.net>
    Date:   Tue Sep 25 14:05:30 2012 -0700
	egl: Add support for driconf control of swapinterval.

Note: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63078
[chadv: Wrote commit message]
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-10 19:16:45 -07:00
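Why a min/max of 0 and 0 in 3998f8c6b5 above made swap interval unusable: per the EGL specification, eglSwapInterval silently clamps the requested interval to the config's EGL_MIN_SWAP_INTERVAL and EGL_MAX_SWAP_INTERVAL. A minimal sketch of that clamping, using a hypothetical clamp_swap_interval() helper rather than Mesa's code:

```cpp
#include <cassert>

// Clamps a requested swap interval to the range the config advertises.
int clamp_swap_interval(int requested, int min_interval, int max_interval)
{
    if (requested < min_interval)
        return min_interval;
    if (requested > max_interval)
        return max_interval;
    return requested;
}
```

With both limits at 0, every request collapses to 0, so a client asking for 1 or 2 never gets a synchronized swap.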
Kenneth Graunke
cbe24ff7c8 intel: Fall back to X-tiling when larger than estimated aperture size.
If a region is larger than the estimated aperture size, we map/unmap it
by copying with the BLT engine, which means we can't use Y-tiling.

Fixes Piglit max-texture-size and tex3d-maxsize, which regressed in my
recent change to use Y-tiling by default on Gen6+.  This was due to a
botched merge conflict resolution.

v2: Return a mask of valid tilings from intel_miptree_select_tiling.
    This allows us to avoid the X-tiling fallback if Y-tiling is actually
    mandatory.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-10 16:54:31 -07:00
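The v2 note in cbe24ff7c8 above (return a mask of valid tilings so the X-tiling fallback can be skipped when Y-tiling is mandatory) could be modelled as below. This is a sketch under assumptions: the bit constants, pick_tiling(), and its size parameters are hypothetical and do not mirror the driver's actual function signatures.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit flags for the tilings a miptree could use.
static const uint32_t TILING_X_BIT = 1u << 0;
static const uint32_t TILING_Y_BIT = 1u << 1;

// Picks one tiling out of a mask of valid ones. Y is preferred; X is the
// fallback for regions too large to map directly, but only when the mask
// says X is acceptable at all.
uint32_t pick_tiling(uint32_t valid_mask, uint64_t region_size,
                     uint64_t max_map_size)
{
    bool too_large = region_size > max_map_size;

    if (valid_mask & TILING_Y_BIT) {
        bool can_fall_back = (valid_mask & TILING_X_BIT) != 0;
        if (too_large && can_fall_back)
            return TILING_X_BIT;
        return TILING_Y_BIT; // fits, or Y is mandatory
    }
    return valid_mask & TILING_X_BIT;
}
```

Returning a mask rather than a single choice lets the caller distinguish "Y preferred" (both bits set) from "Y required" (only the Y bit set).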
Kenneth Graunke
eef3dff3fd intel: Refactor code in intel_miptree_choose_tiling().
This reduces the nesting level slightly, and in my opinion, makes it a
bit easier to follow.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-10 16:54:31 -07:00
Kenneth Graunke
ba38ac062c intel: Move the max_gtt_map_object_size estimation to intel_context.
We need to know this in order to decide what tiling mode to use.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-10 16:54:31 -07:00
Fredrik Höglund
fb69dbb0d1 r600g: Add support for GL_ARB_texture_buffer_range
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-11 00:10:45 +02:00
Paul Berry
42767dc22f i965/blorp: Remove unnecessary test in gen7_blorp_emit_depth_stencil_config.
gen7_blorp_emit_depth_stencil_config() is only called when
params->depth.mt is non-null.  Therefore, it's not necessary to do an
"if (params->depth.mt)" test inside it.  The presence of this if test
was misleading static analysis tools (and briefly, me) into thinking
that gen7_blorp_emit_depth_stencil_config() might sometimes access
uninitialized data and dereference a null pointer.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-10 13:17:53 -07:00
Marek Olšák
34c3f98641 r600g: fix valgrind warning on Cayman
Warning: "Conditional jump or move depends on uninitialised value(s)".
2013-04-10 21:56:51 +02:00
Zack Rusin
fe29f99293 gallivm/tgsi: handle untyped moves
Both mov and ucmp can be used to move variables of any type.
Correctly note that about ucmp in the tgsi_info and make
sure gallivm can handle that by correctly casting the untyped
moves.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-10 12:37:17 -07:00
Zack Rusin
d56f2d5267 gallivm: fix loops and conditionals within GS
We were using simple temporaries, without using alloca or phi
nodes which meant that on every iteration of the loop our
temporaries, which were holding the number of vertices and
primitives which were emitted, were being reset to zero. Now
we're using alloca to allocate those variables to preserve
them across conditionals.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-10 12:33:59 -07:00
Zack Rusin
c1cd19c3b8 llvmpipe: implement PIPE_QUERY_SO_STATISTICS
We were missing the implementation of PIPE_QUERY_SO_STATISTICS
query, this change implements it on top of the existing
facilities.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-10 12:32:56 -07:00
Zack Rusin
7466e0b6c8 gallivm: fix unsigned divide and remainder opcodes
We want to make sure both that we never divide by zero, so as not to
generate SIGFPE, and that divide by zero is guaranteed to return 0xffffffff.
Based on José's idea.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-10 12:31:22 -07:00
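One branch-free way to get both properties named in 7466e0b6c8 above (no hardware divide by zero, and 0xffffffff as the result when the divisor is zero) is to OR a zero-test mask into the divisor and into the result. The scalar model below is a sketch of that idea with hypothetical helper names; the commit itself builds LLVM IR on vectors, which is not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned divide that never divides by zero and yields 0xffffffff when
// the divisor is zero.
uint32_t safe_udiv(uint32_t a, uint32_t b)
{
    uint32_t zero_mask = (b == 0) ? 0xffffffffu : 0u;
    uint32_t divisor = b | zero_mask; // a zero divisor becomes 0xffffffff
    return (a / divisor) | zero_mask;
}

// Same trick for the remainder.
uint32_t safe_umod(uint32_t a, uint32_t b)
{
    uint32_t zero_mask = (b == 0) ? 0xffffffffu : 0u;
    uint32_t divisor = b | zero_mask;
    return (a % divisor) | zero_mask;
}
```

When the divisor is non-zero the mask is 0 and both ORs are no-ops, so ordinary results are unchanged.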
Zack Rusin
1ad4a4eeb3 gallivm: fix breakc
We break when the mask values are 0, not 1; also, it's a bit comparison,
not a floating point comparison. This fixes both.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-10 12:25:34 -07:00
Chad Versace
e4484a0309 intel/hsw: Enable hiz (v2)
Enable hiz by setting intel_context::has_hiz.  However, to work around
a hardware bug, we selectively enable hiz for only nicely aligned miptree
slices.

No Piglit regressions on Haswell 0x0d26 rev07 when based atop
mesa-master-4ad3601.

Improves the performance of GLB27_TRex_C24Z16_FixedTimeStep by 18.52%
(hsw-0x0d26-rev07; kernel-3.9.0-rc1; GLBenchmark 2.7.0 Release a68901;
samples=3).

v2: Replace the check for IS_HASWELL(devid) in intel_miptree_slice_has_hiz()
    with a conditional set of has_hiz. [for anholt]

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-10 10:55:26 -07:00
Chad Versace
916d1ea7dc i965: Remove brw_context::depthstencil::hiz_mt
After recent refactorings, the field is written but no longer read.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-10 10:55:10 -07:00
Chad Versace
2d3bbc576c intel: Replace checks for hiz_mt with intel_has*hiz()
When appropriate, replace each check `hiz_mt != NULL` with either a call
to intel_miptree_slice_has_hiz() or intel_renderbuffer_has_hiz().  No
behavioral change.

This prepares for selectively enabling hiz on individual miptree slices
for Haswell.

This refactoring had several side effects.

  1. To prevent new warnings about discarding the const qualifier,
     I removed 'const' from some variable declarations in
     intel_validate_framebuffer().  The alternative was to add const
     qualifiers to multiple function signatures in the
     intel_renderbuffer_has_hiz call graph. Since the dominant convention
     in the Intel code is to not qualify function parameters as const,
     I chose to remove rather than add const qualifiers.

  2. I changed the signature of brw_emit_depth_stencil_hiz() by replacing
     `struct intel_mipmap_tree *hiz_mt` with `bool hiz`. The function used
     hiz_mt mostly as a boolean indicator of the presence of hiz, so the
     signature change is consistent with the patch's goal.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-10 10:55:10 -07:00
Chad Versace
5b79705526 i965: Change signature of brw_get_depthstencil_tile_masks()
Add new parameters `depth_level` and `depth_layer`, which specify depth
miptree's slice of interest.  A following patch will pass the new
parameters through to intel_miptree_slice_has_hiz().

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-10 10:55:10 -07:00
Chad Versace
87f4541bc1 i965/blorp: Add fields brw_blorp_mip_info::level,layer
The new fields define the 2D miptree slice to be used. A following patch
will pass the new fields through to intel_miptree_slice_has_hiz().

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-10 10:55:10 -07:00
Chad Versace
2a416a9b1b intel: Add field intel_mipmap_slice::has_hiz
On Haswell, HiZ will selectively be enabled on individual miptree slices
to work around a hardware bug. The new field 'has_hiz' indicates if HiZ is
enabled for a given slice.

Also add two new accessor functions for this field.
  intel_miptree_slice_has_hiz
  intel_renderbuffer_has_hiz

The new field and accessor functions are not yet used. Also, this patch
introduces no behavioral change because, in this patch,
intel_miptree_alloc_hiz() sets has_hiz for all slices.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-10 10:55:10 -07:00
Chad Versace
a14dc4f92c i965/blorp: Align rectangle primitive for hiz ops
The hardware docs and the simulator require that the rectangle primitive
emitted during fast depth clears and hiz resolves must be aligned to 8x4
pixels.
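
A minimal sketch of the 8x4 alignment described above, in C. The struct and
function names are hypothetical, and expanding the rectangle outward (origin
rounded down, far edge rounded up) is an assumption made for illustration;
the patch itself may align the primitive differently.

```c
#include <stdint.h>

/* Round v up to a multiple of a; a must be a power of two. */
#define ALIGN_POT(v, a) (((v) + (a) - 1) & ~((uint32_t)(a) - 1))

/* Hypothetical rectangle type; x1/y1 are exclusive. */
struct rect {
   uint32_t x0, y0, x1, y1;
};

/* Expand a rectangle so that every edge lies on an 8x4 pixel boundary. */
struct rect align_rect_8x4(struct rect r)
{
   r.x0 &= ~7u;                  /* round origin down to a multiple of 8 */
   r.y0 &= ~3u;                  /* round origin down to a multiple of 4 */
   r.x1 = ALIGN_POT(r.x1, 8);    /* round far edge up to a multiple of 8 */
   r.y1 = ALIGN_POT(r.y1, 4);    /* round far edge up to a multiple of 4 */
   return r;
}
```

A rectangle that is already aligned passes through unchanged.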

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-10 10:55:10 -07:00
Eric Anholt
d5f7aebac2 i965/vs: Use GRFs for pull constant offsets on gen7.
This allows the computation of the offset to get written directly into the
message source.

shader-db results:
total instructions in shared programs: 3308390 -> 3283025 (-0.77%)
instructions in affected programs:     442998 -> 417633 (-5.73%)

No difference in GLB2.7 low res (n=9).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-10 09:45:21 -07:00
Eric Anholt
3badbf7f7f i965/vs: When asked to make a dst_reg for a src.xxxx, just write to src.x.
We have several places in our pull constant handling where we make a
temporary src_reg for an int, and then turn it into a dst.  In doing so,
we were writing to the dst.xyzw, so we never register coalesced it with a
later mov from dst.x to real_dst.x.

These extra channels written would be removed if we had channel-wise DCE
in the backend, but we don't.  Fix it for now by just not writing these
extra channels that won't get used.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-10 09:45:21 -07:00
Eric Anholt
007a88ed24 i965/gen6: Reduce updates of transform feedback offsets with HW contexts.
The software-tracked transform feedback offsets (svbi_0_starting_index)
are incorrect in the presence of primitive restart, so we were actually
updating it with a bogus value if the batch wrapped and we emitted the
packet again during a single transform feedback.  By reducing state
emission, we avoid the bug.

Fixes piglit OpenGL 3.1/primitive-restart-xfb flush
Reviewed-by: Paul Berry <stereotype441@gmail.com>
NOTE: This is a candidate for the 9.1 branch.
2013-04-10 09:45:21 -07:00
Eric Anholt
62a18da341 i965/gen7: Skip resetting SOL offsets at batch start with HW contexts.
The software-tracked transform feedback offsets (svbi_0_starting_index)
are incorrect in the presence of primitive restart, so we can't reliably
compute offsets for our buffer pointers after a batch flush.  Thanks to HW
contexts, our transform feedback offsets are now saved, so we can just
keep using the ones from before the batch wrap.

Fixes piglit OpenGL 3.1/primitive-restart-xfb flush
Reviewed-by: Paul Berry <stereotype441@gmail.com>
NOTE: This is a candidate for the 9.1 branch.
2013-04-10 09:45:21 -07:00
Christian König
ccf3e8fc9b radeonsi: remove sampler writemask v3
v2: fix intrinsic name as well
v3: LLVM revision incremented as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-10 10:41:29 +02:00
Niels Ole Salscheider
31f14f3def pipe-loader: Fix out of source build
Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
2013-04-10 09:45:04 +02:00
Brian Paul
b74b510d64 st/mesa: remove #if FEATURE_GL/ES tests
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-09 18:43:40 -06:00
Brian Paul
c04e0b9f4b mesa: remove old comment about FEATURE_GL
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-09 18:43:40 -06:00
Brian Paul
f490c6839b mesa: remove #ifdef FEATURE_ES2, add some comments instead
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-09 18:43:40 -06:00
Brian Paul
9dc6f76e44 st/mesa: remove #include mfeatures.h
None of these were needed.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-09 18:43:40 -06:00
Brian Paul
04bd972fc3 docs: initial 9.2 release notes file
2013-04-09 18:30:23 -06:00
Brian Paul
acd4fb8b5a st/osmesa: re-use buffers in OSMesaMakeCurrent()
Rather than creating a new buffer each time.  Fixes problems found
with vtk.

Tested-by: Kevin H. Hobbs <hobbsk@ohio.edu>
2013-04-09 18:30:23 -06:00
Marek Olšák
4f1fd920c9 mesa: update derived framebuffer state in GetMultisamplefv
This makes sure that ctx->DrawBuffer->Visual.samples is up-to-date.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-10 02:01:16 +02:00
Marek Olšák
b6475f9437 mesa: fix glGet queries depending on derived framebuffer state (v2)
"ctx->DrawBuffer->Visual" might be invalid if (NewState &_NEW_BUFFERS) != 0.

v2: also fix:
    - RGBA_INTEGER_MODE_EXT
    - RGBA_FLOAT_MODE_ARB (also check API support)
    - FRAMEBUFFER_SRGB_CAPABLE_EXT
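
The stale-derived-state problem noted at the top of this message can be
sketched in C as follows. All names here are hypothetical stand-ins (the
NEW_BUFFERS bit plays the role of _NEW_BUFFERS, and fake_ctx is not the Mesa
context); the point is only that a query refreshes the derived copy before
reading it.

```c
#define NEW_BUFFERS 0x1u   /* stand-in for a dirty bit such as _NEW_BUFFERS */

/* Hypothetical, heavily simplified context. */
struct fake_ctx {
   unsigned new_state;        /* pending dirty bits */
   int attachment_samples;    /* source of truth, changed by the application */
   int visual_samples;        /* derived copy, may be stale while dirty */
};

/* Recompute derived state only when the relevant dirty bit is set. */
void update_derived_state(struct fake_ctx *ctx)
{
   if (ctx->new_state & NEW_BUFFERS) {
      ctx->visual_samples = ctx->attachment_samples;
      ctx->new_state &= ~NEW_BUFFERS;
   }
}

/* A query must validate the derived state before returning it. */
int query_samples(struct fake_ctx *ctx)
{
   update_derived_state(ctx);
   return ctx->visual_samples;
}
```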

NOTE: This is a candidate for stable branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-10 02:01:16 +02:00
Paul Berry
34efd9214d i965/gen7.5: Allow HW primitive restart for all primitive types.
Gen7.5 (Haswell) hardware supports primitive restart for all primitive
types.  It also handles all possible primitive restart indices.
Rather than specialize both can_cut_index_handle_restart_index() and
the switch statement in can_cut_index_handle_prims() for Haswell, just
return early if the hardware is Haswell because we know it can handle
everything.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-09 15:37:36 -07:00
Paul Berry
a7388f8e6f i965: Only use brw_draw.c's trim() function when necessary.
brw_draw.c contains a trim() function which modifies the vertex count
for quads and quad strips in order to discard dangling vertices.  In
principle this shouldn't be necessary, because hardware since Gen4 is
capable of discarding dangling vertices by itself.  However, it's
necessary because as a hack to speed up rendering on Gen 4-5, we
sometimes convert quads to trifans and quad strips to tristrips.  The
trim() function isn't necessary on Gen6 and up.
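
As an illustration of what discarding dangling vertices means, here is a
minimal C sketch. The enum and function name are hypothetical and this is
not a copy of the brw_draw.c code; it only encodes that quads consume four
vertices each, and that a quad strip needs four vertices for its first quad
and two more for each following one.

```c
/* Hypothetical primitive enum for the sketch. */
enum prim { PRIM_TRIANGLES, PRIM_QUADS, PRIM_QUAD_STRIP };

/* Return the vertex count with any dangling vertices dropped. */
unsigned trim_vertex_count(enum prim prim, unsigned count)
{
   switch (prim) {
   case PRIM_QUADS:
      return count - count % 4;                  /* whole quads only */
   case PRIM_QUAD_STRIP:
      return count > 3 ? count - count % 2 : 0;  /* 4 to start, then pairs */
   default:
      return count;                              /* other types untouched */
   }
}
```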

This patch documents why and when the trim() function is necessary,
and avoids calling it when it's not needed.

This will avoid creating problems when we enable hardware support for
primitive restart of quads and quad strips on Haswell.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-09 15:37:35 -07:00
Paul Berry
56ce7fa4b8 i965/vs: Fix DEBUG_SHADER_TIME when VS terminates with 2 URB writes.
The call to emit_shader_time_end() before the second URB write was
conditioned with "if (eot)", but eot is always false in this code
path, so emit_shader_time_end() was never being called for vertex
shaders that performed 2 URB writes.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-09 12:15:08 -07:00
Christian König
462647453c st/vdpau: fix subtitle related bug v2
Drawing subtitles didn't increase the dirty area of the surface.

Reported and tested by freeedrich on irc.

v2: don't clear the surface

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-04-09 21:11:32 +02:00
Paul Berry
5306af2113 glsl/linker: Reduce scope of non-flat integer varying fix.
In the mailing list discussion of "glsl/linker: fix varying packing
for non-flat integer varyings." (commit 7862bde), we concluded that
since the bug only applies to integral variables, it is safer to just
apply the bug fix to integer varyings.  I forgot to make the change
before pushing the patch upstream.  (Note: we aren't aware of any bugs
in commit 7862bde; it just seems wise to be on the safe side).

This patch makes the change.  Assuming commit 7862bde gets
cherry-picked back to 9.1, this commit should be cherry-picked too.

NOTE: This is a candidate for the 9.1 release branch.
2013-04-09 10:37:16 -07:00
Paul Berry
32d2b2aa2c glsl/linker: Adapt flat varying handling in preparation for geometry shaders.
When a varying is consumed by transform feedback, but is not used by
the fragment shader, assign_varying_locations() sets its interpolation
type to "flat" in order to ensure that lower_packed_varyings never has
to deal with non-flat integral varyings (the GLSL spec doesn't require
integral vertex outputs to be flat if they aren't consumed by the
fragment shader).

A similar situation will arise when geometry shader support is added,
since the GLSL spec only requires integral vertex shader outputs to be
flat when they are consumed by the fragment shader.  This patch
modifies the linker to handle this situation too.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-09 10:25:57 -07:00
Paul Berry
8687c40c2d glsl: Document lower_packed_varyings' "flat" requirement with an assert.
To minimize the variety of type conversions that lower_packed_varyings
needs to perform, it assumes that integral varyings are always
qualified as "flat".  link_varyings.cpp takes care of ensuring that
this is the case (even in the circumstances where GLSL doesn't require
it).

This patch documents the assumption with an assertion, for ease in
future debugging.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-09 10:25:19 -07:00
Paul Berry
7862bde8af glsl/linker: fix varying packing for non-flat integer varyings.
Commit dfb57e7 (glsl: Fix error checking on "flat" keyword to match
GLSL ES 3.00, GLSL 1.50) relaxed the rules for integral varyings: they
only need to be declared as "flat" if they are fragment shader
inputs.  This allowed for the possibility of a vertex shader output
being a non-flat integer, provided that it was not matched to a
fragment shader input.  A non-contrived situation where this might
arise is if a vertex shader generates some integral outputs which are
consumed by transform feedback, but not by the fragment shader.

Unfortunately, lower_packed_varyings assumes that *all* integral
varyings are flat, regardless of whether they are consumed by the
fragment shader.  As a result, attempting to create a non-flat
integral vertex output of a size that required packing (i.e. a size
other than ivec4 or uvec4) would cause an assertion failure in
lower_packed_varyings.

This patch prevents the assertion failure by forcing vertex shader
outputs to be "flat" whenever they are not consumed by the fragment
shader.  This should have no effect on rendering since the "flat"
keyword only affects the behaviour of fragment shader inputs.
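
The rule in the paragraph above can be sketched in C. The types and the
function name are hypothetical and far simpler than the linker's real data
structures; the sketch only shows the decision being made.

```c
#include <stdbool.h>

/* Hypothetical interpolation modes. */
enum interp_mode { INTERP_SMOOTH, INTERP_FLAT };

/* Hypothetical description of one vertex shader output. */
struct vs_output {
   bool consumed_by_fs;      /* matched to a fragment shader input? */
   enum interp_mode interp;
};

/* Force "flat" on outputs the fragment shader never reads, so that later
 * packing code only ever sees flat varyings in that situation. */
void force_flat_if_unconsumed(struct vs_output *out)
{
   if (!out->consumed_by_fs)
      out->interp = INTERP_FLAT;
}
```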

Fixes piglit test "spec/EXT_transform_feedback/nonflat-integral".

NOTE: This is a candidate for the 9.1 release branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-09 10:25:15 -07:00
Paul Berry
778ce82b71 glsl: Check the size of ir_print_visitor's mode[] array with STATIC_ASSERT.
ir_print_visitor::visit(ir_variable *)'s mode[] array needs to match
the declaration of the enum ir_variable_mode.  It's hard to verify
that at compile time, but at least we can use a STATIC_ASSERT to make
sure it's the right size.

This required adding ir_var_mode_count to the enum.
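
A minimal sketch of the size check, written in C11 with _Static_assert
rather than Mesa's STATIC_ASSERT macro. The enum, its members and the string
table are hypothetical; only the pattern (a trailing "count" enumerator
compared against the array length at compile time) follows the description
above.

```c
/* Hypothetical enum; the last member counts the real entries. */
enum var_mode { MODE_AUTO, MODE_UNIFORM, MODE_IN, MODE_OUT, MODE_COUNT };

static const char *const mode_names[] = { "", "uniform ", "in ", "out " };

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Fails to compile if an enumerator is added without a matching string. */
_Static_assert(ARRAY_LEN(mode_names) == MODE_COUNT,
               "mode_names[] must have one entry per var_mode");

const char *mode_name(enum var_mode m)
{
   return mode_names[m];
}
```

As the message says, this checks only the size; it cannot verify that the
strings are in the same order as the enumerators.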
2013-04-09 10:19:22 -07:00
Paul Berry
67f226e179 glsl: Fix ir_print_visitor's handling of interpolation qualifiers.
This patch updates the interp[] array to match the enum
glsl_interp_qualifier.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v2: Add a STATIC_ASSERT to make sure the array is the correct size.
This required adding INTERP_QUALIFIER_COUNT to the enum.
2013-04-09 10:19:11 -07:00
Johannes Obermayr
c295874129 autotools: Better describe which cases OProfileJIT is required.
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2013-04-09 17:38:42 +01:00
Brian Paul
4ad360133c softpipe: misc updates to image dumping in softpipe_flush()
2013-04-09 08:27:53 -06:00
Vinson Lee
04ffce3004 tgsi: Ensure struct tgsi_ind_register field Index is initialized.
Fixes uninitialized scalar variable defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-08 18:59:34 -07:00
Martin Andersson
a8246927e3 r600g: Fix UMAD on Cayman
The multiplication part of tgsi_umad did not work on Cayman, because it did
not populate the correct vector slots.

This fixed hardlocks in the EXT_transform_feedback/order tests.

NOTE: This is a candidate for the stable branches.
(might not be easy to cherry-pick though)

Signed-off-by: Marek Olšák <maraeo@gmail.com>
2013-04-09 03:09:37 +02:00
Kenneth Graunke
b76539aabe intel: Remove the texture_tiling driconf option.
This option can force textures to be untiled.  However, on Gen6+, depth
buffers must be Y-tiled.  MSAA buffers also must be Y-tiled.  So setting
this option on even a trivial application like glxgears causes assertion
failures in a debug build, and likely GPU hangs in a release build.

It's just giving users a license to shoot themselves in the foot.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 16:15:07 -07:00
Kenneth Graunke
55ecc448b9 i965: Prefer Y-tiling on Gen6+.
In the past, we preferred X-tiling for color buffers because our BLT
code couldn't handle Y-tiling.  However, the BLT paths have been largely
replaced by BLORP on Gen6+, which can handle any kind of tiling.

We hadn't measured any performance improvement in the past, but that's
probably because compressed textures were all untiled anyway.

Improves performance in GLB27_TRex_C24Z16_FixedTime by 7.69231%.

v2: Rebase on top of Eric's untiled-for-larger-than-aperture changes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 16:15:07 -07:00
Kenneth Graunke
40e30c1ca1 i965: Use tiling even for compressed textures.
The code has no rationale for why we would force compressed textures to
be untiled, and it appears to work fine.  Git archeology indicates that
it's been that way dating back to when we first started tiling.

Improves performance in GLB27_TRex_C24Z16_FixedTimeStep at 1280x720 by
10.0529% +/- 0.573075% (n=12).  Improves performance in Xonotic by
4.56409% +/- 0.27965% (n=3).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 16:15:07 -07:00
Chad Versace
f709198b10 intel: Refactor selection of miptree tiling
This patch (1) extracts from intel_miptree_create() the spaghetti logic
that selects the tiling format, (2) rewrites that spaghetti into a lucid
form, and (3) moves it to a new function, intel_miptree_choose_tiling().
No behavioral change.

As a bonus, it is now evident that the force_y_tiling parameter to
intel_miptree_create() does not really force Y tiling.

v2 (Ken): Rebase on top of Eric's untiled-for-larger-than-aperture
changes.  This required passing in the miptree.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 16:15:06 -07:00
Chad Versace
aa391976df intel: Allocate hiz in intel_renderbuffer_move_to_temp()
When moving the renderbuffer to a new miptree, we neglected to allocate
the hiz buffer for the new miptree. Oops.

Fixes all Piglit depthstencil-render-miplevels tests from crash to pass on
Sandybridge.

Note: This is a candidate for the 9.1 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-08 16:09:26 -07:00
Dave Airlie
d0bf48f8e9 st/mesa: fix levels in initial texture creation
calim pointed out we were getting mipmap levels for array multisamples,
this didn't make sense. So then I noticed this function takes last_level
so we are passing in a too high value here.

I think this should fix the case he was seeing.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-08 23:56:06 +01:00
Ian Romanick
58d93e3247 glsl: Don't early-out for error-type inputs
Check the type of the array operand and the index operand before doing
other checks.  This simplifies the code a bit now (eliminating the
error_emitted parameter), and enables some later functional changes.

The shader

uniform float x[6];
uniform sampler2D s;
void main() { gl_Position.x = xx[s + 1]; }

still generates (only) the two expected errors:

0:3(33): error: `xx' undeclared
0:3(39): error: Operands to arithmetic operators must be numeric

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 15:17:05 -07:00
Ian Romanick
a131b87706 glsl: Don't emit spurious errors for constant indexes of the wrong type
Previously the shader

uniform float x[6];
void main() { gl_Position.x = x[1.0]; }

would have generated the errors

0:2(33): error: array index must be integer type
0:2(36): error: array index must be < 6

Now only

0:2(33): error: array index must be integer type

will be generated.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 15:17:05 -07:00
Ian Romanick
a70d2f05dc glsl: Collect all of the non-constant index error checks together
This puts all of the checks together for easier reading.  It also means
that all the checks are blocked on array->type->is_array.  Shortly this
will allow elimination of some is_error check work-arounds in this
function.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 15:17:05 -07:00
Ian Romanick
f9d8ca2817 glsl: Minor code compaction in _mesa_ast_array_index_to_hir
Also, document the reason for not checking for type->is_array in some of
the bound-checking cases.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 15:17:05 -07:00
Ian Romanick
2c333a878c glsl: Don't return a value from check_builtin_array_max_size
The last consumer of the return value was changed to not use it by the
previous commit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 15:17:05 -07:00
Ian Romanick
666fafc144 glsl: Remove some unnecessary uses of error_emitted
The error_emitted flag is used in semantic checking to prevent spurious
cascading errors.  For example,

void foo(sampler2D s, float a)
{
    float x = a + (1.2 + s);

    ...
}

should only generate a single error.  Without the error_emitted flag for
the first error, "a + ..." would also generate an error.

However, a bunch of cases in _mesa_ast_array_index_to_hir that were
setting error_emitted would mask legitimate errors.  For example,

    vec4 a[7];
    float b = a[3.14];

should generate two errors (float index and type mismatch in assignment).
The uses of error_emitted would cause only the first to be emitted.

This patch removes most of the places in _mesa_ast_array_index_to_hir
that would set the error_emitted flag.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 15:17:05 -07:00
Ian Romanick
46934adb8d glsl: Refactor handling of ast_array_index to a separate function
I love 800+ line switch-statements as much as the next guy... Future
commits will make changes to this part of the AST-to-HIR conversion, and
extracting this code will make that a bit easier.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 15:17:05 -07:00
Ian Romanick
cd39ae7394 glsl: Make check_builtin_array_max_size externally visible
A future commit will try to use this function in a different file.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 15:17:05 -07:00
Eric Anholt
ca9a7d975a intel: Avoid making tiled miptrees we won't be able to blit.
Doing so was breaking miptree mapping, which we really need to be able to
handle.  With this change, intel_miptree_map_direct() falls through to
doing a CPU mapping on the buffer like we need.

With the previous 2 patches, all of these should be fixed:
piglit max-texture-size (all 3 patches required!)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37871
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44958
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53494

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 11:49:33 -07:00
Eric Anholt
dfed115090 intel: Do temporary CPU maps of textures that are too big to GTT map.
This still fails, since 8192*4bpp == 32768, which is too big to use the
blitter on.
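
The arithmetic above, written out as a small C sketch. Here "4bpp" is read
as 4 bytes per pixel, which is what makes the product 32768. The limit
constant is an assumption chosen only so that 32768 is rejected, as the
message states; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed limit: the message only says a 32768-byte pitch is too big. */
#define ASSUMED_BLT_MAX_PITCH 32767u

/* Bytes occupied by one row of a linear surface. */
uint32_t row_pitch_bytes(uint32_t width_px, uint32_t bytes_per_px)
{
   return width_px * bytes_per_px;
}

/* Whether a row pitch is small enough for the (assumed) blitter limit. */
bool pitch_fits_blitter(uint32_t pitch)
{
   return pitch <= ASSUMED_BLT_MAX_PITCH;
}
```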

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-08 11:49:25 -07:00
Eric Anholt
b3a3cb9611 intel: Add support for writing to our linear-temporary-CPU-map case.
This will be used for handling updates of large textures.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-08 11:49:20 -07:00
Kenneth Graunke
97e40a524e intel: Remove check for kernel 2.6.29.
Now that we require 2.6.39, there's no need to also check for 2.6.29.
Calling drm_intel_bufmgr_gem_enable_fenced_relocs() without checking
should be safe, as it simply sets a flag.

This does remove the check for zero fences available, but that doesn't
seem worth checking.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 11:03:08 -07:00
Kenneth Graunke
394edb5af5 intel: Require kernel 2.6.39 for relaxed relocation support.
Chris Wilson's relaxed relocation patch landed in March 2011.  Anyone
running pre-3.0 kernels probably isn't going to get the latest Mesa
anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 11:03:08 -07:00
Kenneth Graunke
d7fd5696e6 i965: Remove a few BRW_STATE_... enum values.
These were likely used for BRW_NEW_... dirty bit flags at one point, but
they're unused now.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 11:03:08 -07:00
Kenneth Graunke
79c27e7528 i965: Remove brw->vb.info and struct brw_vertex_info.
Nobody uses this value, so there's no need to set it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 11:03:08 -07:00
Kenneth Graunke
b29dc25572 i965: Remove the BRW_NEW_INPUT_DIMENSIONS flag.
When I removed the proj_attrib_mask optimization, I also removed the
last consumer of this bit without realizing it.

Since nobody uses it, there's no point in flagging it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 11:03:08 -07:00
Matt Turner
2e177bc8a5 register_allocate: Fix the type of best_benefit.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 10:30:40 -07:00
Tom Stellard
a5a76782d5 radeon/llvm: Bump minimum LLVM version to 3.3
2013-04-08 07:43:34 -07:00
Niels Ole Salscheider
b336f51cc7 clover: Fix linkage of libOpenCL
Clover needs the irreader component of llvm

v2: Check for irreader component
irreader is only available with LLVM 3.3 >= 177971

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
2013-04-08 07:08:10 -07:00
Vincent Lejeune
5019af2145 r600g/llvm: Add support for native isa for pre EG
This fixes bug 62756:
https://bugs.freedesktop.org/show_bug.cgi?id=62756#c12
2013-04-08 15:11:59 +02:00
Marek Olšák
eff66bc9f8 gallium/util: add const to a parameter of util_max_layer
2013-04-06 23:57:15 +02:00
Marek Olšák
08275b25cc st/mesa: don't expose ARB_color_buffer_float without driver support in GL core
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-06 23:57:12 +02:00
Marek Olšák
3264c3e997 mesa: allow drivers not to expose ARB_color_buffer_float in GL core profile
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-06 23:57:10 +02:00
Marek Olšák
9d4f67600b mesa: move updating clamp control derived state out of mesa_update_state_locked
It has 2 dependencies: glClampColor and the framebuffer, we might just as well
do the update where those two are changed.

v2: cosmetic changes from Brian's email

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-06 23:57:09 +02:00
Marek Olšák
755648c37f mesa: don't set _ClampFragmentColor to TRUE if it has no effect
This should reduce shader recompilations with drivers that emulate fragment
color clamping, because we want the clamping to be enabled only if there is
a signed normalized or floating-point colorbuffer.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-06 23:57:06 +02:00
Marek Olšák
21d407c1b8 mesa: refactor clamping controls, get rid of _ClampReadColor
v2: cosmetic changes from Brian's email

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-06 23:57:04 +02:00
Chris Forbes
c4629ad3f9 mesa: don't memcmp() off the end of a cache key.
Reported-by: `per` in #intel-gfx

The size of the cache key varies, so store the actual size as well as
the key blob itself, rather than just assuming it's the same as the size
passed in.
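
A minimal sketch of the idea in C: keep the key's length next to the key and
compare lengths before calling memcmp(), so the comparison never reads past
the end of the shorter key. The struct and function names are hypothetical
and do not mirror the real cache code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cache entry that remembers how long its key is. */
struct cache_item {
   const void *key;
   size_t key_size;
};

/* Keys match only if they have the same size and the same bytes. */
bool key_matches(const struct cache_item *item,
                 const void *key, size_t key_size)
{
   return item->key_size == key_size &&
          memcmp(item->key, key, key_size) == 0;
}
```

Because of the short-circuit, memcmp() runs only when both buffers are known
to hold at least key_size bytes.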

NOTE: This is a candidate for stable branches.

V2: Don't leave silly holes in structure; use unsigned instead of GLuint.
V3: Fix missing case for `last` match.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-04-06 18:30:08 +13:00
Tom Stellard
302f53dc20 radeonsi: Add compute support v3
v2:
  - Only dump shaders when env variable is set.

v3:
  - Don't emit VGT registers

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-05 18:43:34 -04:00
Tom Stellard
4f7fe2cf2c radeonsi: Set TCL1_ACTION_ENA when invalidating the texture cache
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-05 18:43:34 -04:00
Tom Stellard
0ccf82c557 radeonsi: Remove si_pm4_inval_vertex_cache()
This function is a holdover from r600g and is identical to
si_pm4_inval_texture_cache(), so it is not needed.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-05 18:43:34 -04:00
Tom Stellard
c5e5b3401c gallium: PIPE_COMPUTE_CAP_IR_TARGET - allow drivers to specify a processor v2
This target string now contains four values instead of three.  The old
processor field (which was really being interpreted as arch) has been split
into two fields: processor and arch.  This allows drivers to pass a
more detailed description of the hardware to compiler frontends.

v2:
  - Adapt to libclc changes

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2013-04-05 18:43:34 -04:00
Wladimir
1a868acbec util: add ETC as compressed format
Add UTIL_FORMAT_LAYOUT_ETC to util_format_is_compressed. It was missing.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-05 16:14:51 -06:00
Brian Paul
de99b6d117 gallium/u_blitter: fix is_blit_generic_supported() stencil checking
Don't check if there's sampler support for stencil if we're not
going to actually blit/copy stencil values.  Fixes the case where
we mistakenly said we can't support a blit of depth values from
S8Z24 to X8Z24.

Also, rename the is_stencil variable to dst_has_stencil to improve
readability.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-05 16:14:51 -06:00
Alexander Monakov
9cda356004 Honor GLX_DONT_CARE in MATCH_MASK
NOTE: This is a candidate for stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47478
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62999
Bugzilla: http://bugs.winehq.org/show_bug.cgi?id=26763
2013-04-05 14:32:45 -07:00
Rob Clark
aac7f06ad8 freedreno: use autogenerated register defs
Switch to use the envytools generated headers for register/bitfield
definitions.  This is the first step in preparing to add a3xx support,
since it avoids having conflicting names for a3xx and a2xx registers.
And since I'm using envytools for a3xx it is simpler to just use it for
everything.

This shouldn't cause any functional change, it is really just a lot of
renaming.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-04-05 14:33:16 -04:00
José Fonseca
1fefc65d20 st/wgl: Install our windows message hook to threads created before the ICD is loaded.
Otherwise we will not receive destroy windows events, causing framebuffers
to leak.

This happens particularly with java and jogl.

Tested with java + jogl, MATLAB.

VMware Internal Bug Number: 1013086.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-05 18:27:54 +01:00
Adam Jackson
ca70de9bd2 llvmpipe: Work without sse2 if llvm is new enough
At least on llvm 3.2 this appears to work fine.  Tested on an Athlon XP
2600+, which has sse and 3dnow but not sse2.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-04-05 11:32:53 -04:00
Jerome Glisse
b8998f976e winsys/radeon: add command stream replay dump for faulty lockup v3
Build time option, set RADEON_CS_DUMP_ON_LOCKUP to 1 in radeon_drm_cs.h to
enable it.

When enabled, after each cs submission the code will try to detect a lockup
by waiting on one of the buffers of the cs to become idle. After a timeout it
will consider that the cs triggered a lockup and will write a radeon_lockup.c
file in the current directory that has all the information for replaying the cs.

To build this file:
gcc -O0 -g radeon_lockup.c -ldrm -o radeon_lockup -I/usr/include/libdrm

v2: Add radeon_ctx.h file to mesa git tree
v3: Slightly improve dumped file for easier editing, only dump first faulty cs

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-04-05 10:22:05 -04:00
Brian Paul
5192262833 st/xlib: add HUD support for xlib/GLX
For the softpipe and llvmpipe drivers.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-04 17:00:42 -06:00
Brian Paul
f5071783c1 gallium/hud: add GALLIUM_HUD_PERIOD env var
To set the graph update rate, in seconds.  The default update rate
has also been changed to 1/2 second.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-04 17:00:42 -06:00
Brian Paul
6211c45186 gallium/hud: initialize sampler state
The default wrap mode (PIPE_TEX_WRAP_REPEAT) is incompatible with
unnormalized texcoords (at least for softpipe).

v2: use PIPE_TEX_WRAP_CLAMP_TO_EDGE

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-04 17:00:42 -06:00
Kenneth Graunke
edc52a8f28 glsl: Add an optimization pass to flatten simple nested if blocks.
GLBenchmark 2.7's shaders contain conditional blocks like:

if (x) {
    if (y) {
        ...
    }
}

where the outer conditional's then clause contains exactly one statement
(the nested if) and there are no else clauses.  This can easily be
optimized into:

if (x && y) {
    ...
}

This saves a few instructions in GLBenchmark 2.7:

    total instructions in shared programs: 11833 -> 11649 (-1.55%)
    instructions in affected programs:     8234 -> 8050 (-2.23%)

It also helps CS:GO slightly (-0.05%/-0.22%).  More importantly,
however, it simplifies the control flow graph, which could enable other
optimizations.
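
The equivalence this pass relies on can be sketched in plain C. This is only an illustration of the source-level transformation, not the GLSL IR pass itself; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Original shape: the outer then-clause holds only the nested if,
 * and neither if has an else clause. */
static int nested_form(bool x, bool y)
{
    int result = 0;
    if (x) {
        if (y) {
            result = 1;
        }
    }
    return result;
}

/* Flattened shape: one branch instead of two. */
static int flattened_form(bool x, bool y)
{
    int result = 0;
    if (x && y) {
        result = 1;
    }
    return result;
}

/* Exhaustively check that both forms agree on every input. */
static void check_forms_agree(void)
{
    for (int x = 0; x <= 1; x++)
        for (int y = 0; y <= 1; y++)
            assert(nested_form(x != 0, y != 0) == flattened_form(x != 0, y != 0));
}
```

Because && short-circuits, y is still evaluated only when x is true, so the flattened form preserves the evaluation order of the nested one.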

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-04 15:38:19 -07:00
Kenneth Graunke
967514ce68 i965: Use a variable for the push constant size in kB.
This clarifies that the offset of 2 is actually 16 kB / 8 kB units.
It also keys both computations off of a single variable, which should
make it easier to change in the future.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-04-04 15:38:19 -07:00
Kenneth Graunke
8cdb2d32ec i965: Turn brw->urb.vs_size and gs_size into local variables.
These variables are only used within a single function, so we may as
well make them local variables.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-04-04 15:38:19 -07:00
Kenneth Graunke
b99ad7f02c i965: Remove BRW_NEW_WM_INPUT_DIMENSIONS dirty bit.
This was only produced by the brw_wm_input_dimensions atom, which was
removed in the previous commit.  So there's no need for the dirty bit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-04 15:38:19 -07:00
Kenneth Graunke
d198546bac i965: Delete brw_vs_constval.c and the brw_wm_input_sizes atom.
This was only used to compute proj_attrib_mask, which was removed by the
previous commit.  That makes this dead code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-04 15:38:19 -07:00
Kenneth Graunke
705c8247fa i965: Remove now dead brw_wm_prog_key::proj_attrib_mask field.
The previous commit removed the last user of this field, so there's no
longer any point in setting it.  Removing this should eliminate
state-dependent recompiles, and make the precompile more reliable.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-04 15:38:19 -07:00
Kenneth Graunke
7183568869 i965: Remove fixed-function texture projection avoidance optimization.
This optimization attempts to avoid extra attribute interpolation
instructions for texture coordinates where the W-component is 1.0.

Unfortunately, it requires a lot of complexity: the brw_wm_input_sizes
state atom (all the brw_vs_constval.c code) needs to run on each draw.
It computes the input_size_masks array, then uses that to compute
proj_attrib_mask.  Differences in proj_attrib_mask can cause
state-dependent fragment shader recompiles.  We also often fail to guess
proj_attrib_mask for the fragment shader precompile, causing us to
needlessly compile it twice.

Furthermore, this optimization only applies to fixed-function programs;
it does not help modern GLSL-based programs at all.  Generally, older
fixed-function programs run fine on modern hardware anyway.

The optimization has existed in some form since the initial commit.  When
we rewrote the fragment shader backend, we dropped it for a while.  Eric
readded it in commit eb30820f26 as part of
an attempt to cure a ~1% performance regression caused by converting the
fixed-function fragment shader generation code from Mesa IR to GLSL IR.
However, no performance data was included in the commit message, so it's
unclear whether or not it was successful.

Time has passed, so I decided to re-measure this.  Surprisingly,
Eric's OpenArena timedemo actually runs /faster/ after removing this and
the brw_wm_input_sizes atom.  On Ivybridge at 1024x768, I measured a
1.39532% +/- 0.91833% increase in FPS (n = 55).  On Ironlake, there was
no statistically significant difference (n = 37).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-04 15:38:19 -07:00
Kenneth Graunke
32726b1af6 i965: Use ctx->Stencil._WriteEnabled in DEPTH_STENCIL_STATE.
This is the same computation as the _WriteEnabled flag, so we may as
well use it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-04-04 15:38:19 -07:00
Kenneth Graunke
01bd29d681 i965: Fix stencil write enable flag in 3DSTATE_DEPTH_BUFFER on Gen7+.
ctx->Stencil.WriteMask is a statically sized array of 3 elements.
Checking it against 0 actually is a NULL check, and can never fail,
which meant that we always said stencil writes were enabled.

Use the new core Mesa derived state flag to fix this.
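
A minimal sketch of the pitfall, assuming a simplified stand-in for the stencil state. The struct, its field names, and the helper are illustrative, not Mesa's actual definitions, and the real derived flag may take more state into account.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the stencil state (hypothetical names). */
struct stencil_state {
    bool enabled;
    unsigned write_mask[3];
};

/* Buggy pattern, shown only as a comment because compilers warn about it:
 * an array member decays to a non-NULL pointer, so
 *
 *     bool writes = (s->write_mask != 0);
 *
 * is always true, whatever the mask values are.
 */

/* Derived flag: writes happen only if the stencil test is enabled and the
 * mask for the face in use has at least one bit set. */
static bool stencil_write_enabled(const struct stencil_state *s, unsigned face)
{
    assert(face < 3);
    return s->enabled && s->write_mask[face] != 0;
}
```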

NOTE: This is a candidate for stable branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-04-04 15:38:18 -07:00
Kenneth Graunke
1e3235d36e mesa: Add new ctx->Stencil._WriteEnabled derived state flag.
i965 needs to know whether stencil writes are enabled in several places,
and gets the test wrong sometimes.  While we could create a function to
compute this, it seems generally useful enough to warrant a new piece of
derived state.  Also, all the plumbing is already in place.

NOTE: This is a candidate for stable branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-04-04 15:38:18 -07:00
Roland Scheidegger
9eef86bb55 gallivm: some minor cube map cleanup
The ar_ge_as_at variable was just very very confusing since the condition
was actually the other way around (as_at_ge_ar). So change the condition
(and the selects depending on it) to match the variable name.
And also change the chosen major axis in case the coord values are the
same. OpenGL doesn't care one bit which one is chosen in this case but
it looks like dx10 would require z chosen over y, and y chosen over x
(previously did x chosen over y, y chosen over z). Since it's all the
same effort just honor dx10's wishes. (Though actually, for some preferred
orderings, we could save one (or two with derivatives) selects since the
tnewx and tnewz (and the corresponding dmax values) are the same.)
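
The tie-breaking order described above (z over y, y over x) can be sketched as a scalar C function. This is only an illustration of the ordering; the gallivm code builds vectorized selects instead, and the names here are hypothetical.

```c
#include <assert.h>

enum major_axis { AXIS_X, AXIS_Y, AXIS_Z };

static float abs_f(float v)
{
    return v < 0.0f ? -v : v;
}

/* Pick the cube map major axis. Using >= makes z win ties against x and y,
 * and y win ties against x. */
static enum major_axis choose_major_axis(float x, float y, float z)
{
    float ax = abs_f(x), ay = abs_f(y), az = abs_f(z);

    if (az >= ax && az >= ay)
        return AXIS_Z;
    if (ay >= ax)
        return AXIS_Y;
    return AXIS_X;
}

/* Spot-check the tie cases. */
static void check_tie_breaks(void)
{
    assert(choose_major_axis(1.0f, 1.0f, 1.0f) == AXIS_Z);
    assert(choose_major_axis(1.0f, 1.0f, 0.0f) == AXIS_Y);
    assert(choose_major_axis(1.0f, 0.0f, 0.0f) == AXIS_X);
}
```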

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-04 23:22:10 +02:00
Eric Anholt
b6e9b54d06 i965: Ask the register allocator to round-robin through registers.
The way we were allocating registers before, packing into low register
numbers for Ironlake, resulted in an overly-constrained dependency graph
for instruction scheduling.  Improves GLBenchmark 2.1 performance by
4.5% +/- 0.7% (n=26).  No difference on my old GLSL demo (n=20).  No
difference on nexuiz (n=15).

v2: Fix off-by-one bug that made the change only work for 16-wide on i965.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-04 12:51:06 -07:00
Zack Rusin
be9a42e980 llvmpipe: implement ucmp
and add a test for it

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-04 12:09:55 -07:00
Paul Berry
5db2249493 Avoid spurious GCC warnings in STATIC_ASSERT() macro.
GCC 4.8 now warns about typedefs that are local to a scope and not
used anywhere within that scope.  This produced spurious warnings with
the STATIC_ASSERT() macro (which used a typedef to provoke a compile
error in the event of an assertion failure).

This patch switches to a simpler technique that avoids the warning.

v2: Avoid GCC-specific syntax.  Also update p_compiler.h.
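
One common typedef-free way to write such a macro is sketched below. It is an assumption that this matches the patch; the commit message only says a simpler technique was used. The helper function is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Compile-time assertion without a local typedef: a false condition yields
 * an array of size -1, which is a compile error. The sizeof operand is
 * never evaluated, so no code is generated. */
#define STATIC_ASSERT(COND) \
    do { (void) sizeof(char [1 - 2 * !(COND)]); } while (0)

/* Hypothetical user of the macro. */
static unsigned dword_size(void)
{
    STATIC_ASSERT(sizeof(uint32_t) == 4);
    /* STATIC_ASSERT(sizeof(uint32_t) == 8) would fail to compile. */
    return (unsigned) sizeof(uint32_t);
}

/* For contrast, a run-time check of the same property. */
static void check_dword_size(void)
{
    assert(dword_size() == 4);
}
```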

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-04 09:52:18 -07:00
Erik Faye-Lund
456f40e18d freedreno: document debug flag
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
2013-04-04 10:41:50 -06:00
Brian Paul
e95514c0ea st/wgl: add HUD support
v2: fix a few minor issues spotted by Jose.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-04 10:41:35 -06:00
Brian Paul
0c1dcf906d st/wgl: make stw_current_context() non-static
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-04 08:50:16 -06:00
Brian Paul
92e5e45ff1 util: add debug_memory_check_block(), debug_memory_tag()
The former just checks that the given block is valid by checking
the header and footer.

The latter sets the memory block's tag.  With extra debug code, we
can use that for monitoring/checking particular allocations.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-04 08:50:15 -06:00
Brian Paul
a408ea9692 gallium/hud: replace malloc w/ MALLOC
To match the FREE() call used later.  Fixes things on Windows.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-04 08:50:15 -06:00
Vincent Lejeune
9276961223 r600g/llvm: Workaround for wrong tex.offset_* 2013-04-04 16:03:04 +02:00
Roland Scheidegger
ce5096a0a9 gallivm: honor explicit derivatives values for cube maps.
This is trivial now, though need to make sure we pass all the necessary
derivative values (which is 3 each for ddx/ddy not 2).
Passes piglit arb_shader_texture_lod-texgradcube test.

v2: add the forgotten abs() for all incoming derivatives (discovered
by new piglit arb_shader_texture_lod-texgradcube test, though more by
luck as it was failing only for exactly one pixel...).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-04 01:03:42 +02:00
Roland Scheidegger
f621015cb5 gallivm: do per-pixel cube face selection (finally!!!)
This proved to be tricky, the problem is that after selection/mirroring
we cannot calculate reasonable derivatives (if not all pixels in a quad
end up on the same face the derivatives could get "randomly" exceedingly
large).
However, it is actually quite easy to simply calculate the derivatives
before selection/mirroring and then transform them similar to
the cube coordinates (they only need selection/projection, but not
mirroring as we're not interested in the sign bit, of course). While
there is a tiny bit more work to do (need to calculate derivs for 3
coords instead of 2, and additional selects) it also simplifies things
somewhat for the coord selection itself (as we save some broadcast aos
shuffles, and we don't need to calculate the average vector) - hence if
derivatives aren't needed this should actually be faster.
Also, this has the benefit that this will (trivially) work for explicit
derivatives too, which we completely ignored before that (will be in a
separate commit for better trackability).
Note that while the way for getting rho looks very different, it should
result in "nearly" the same values as before (the "nearly" is only because
before the code would choose the face based on an "average" vector and hence
the derivatives calculated according to this face, where now (for implicit
derivatives) the derivatives are projected on the face selected for the
first (top-left) pixel in a quad, so not necessarily the same face).
The transformation done might not quite be state-of-the-art, calculating
length(dx,dy) as max(dx,dy) certainly isn't either but this stays the
same as before (that is I think a better transform would _somehow_ take
the "derivative major axis" into account so that derivative changes in
the major axis wouldn't get ignored).
Should solve some accuracy problems with cubemaps (can easily be seen with
the cubemap demo when switching wrapping/filtering), though we still don't
do seamless filtering to fix it completely (so not per-sample but per-pixel
is certainly better than per-quad and already sufficient for accurate
results with nearest tex filter).

As for performance, it seems to be a tiny bit faster too (maybe 3% or so
with cubemap demo). Which I'd have expected with nearest/nearest filtering
where this will be less instructions, but the difference seems to actually
be larger with linear/linear_mipmap_linear where it is slightly more
instructions, probably the code appears less serialized allowing better
scheduling (on a sandy bridge cpu). It actually seems to be now at least
as fast as the old path using a conditional when using 128bit vectors too
(that is probably more a result of testing with a newer cpu though), for now
that old path is still there but unused.
No piglit regressions.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-04 01:03:42 +02:00
Roland Scheidegger
bdfbeb9633 gallivm: minor rho calculation optimization for 1 or 3 coords
Using a different packing for the single coord case should save a shuffle.
Plus some minor style fixes.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-04 01:03:42 +02:00
Roland Scheidegger
067a0ae420 gallivm: use f16c hw support for float->half and half->float conversion
Should be way faster of course on cpus supporting this (includes AMD
Bulldozer and Jaguar cores, Intel Ivy Bridge and up (except budget models)).
Passes piglit fbo-blending-formats GL_ARB_texture_float -auto on Ivy Bridge.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-04 01:03:42 +02:00
Zack Rusin
302df7cc85 draw/llvmpipe: allow independent so attachments to the vs
When geometry shaders are present, one needs to be able to create
an empty geometry shader with stream output that needs to be
resolved later and attached to the currently bound vertex shader.
Let's add support for it to llvmpipe and draw. draw allows attaching
independent stream output info to any vertex shader and llvmpipe
resolves at draw time which vertex shader the given empty geometry
shader should be linked to.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
246e68735f llvmpipe: reset so buffers when not appending
We need to reset the internal state of the so buffers or we'll
keep appending even though we're not supposed to.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
7ca65a68e1 draw: remove unused function
we use draw_set_mapped_so_targets nowadays

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
b16ae0f792 draw/llvm: use an enum instead of magic numbers
I think this was there before and got accidentally
removed during a merge. Same code as for the GS
context, which is also using an enum instead of
hardcoded numbers.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
49b7d933f8 draw/gs: cleanup some debugging code
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
822c21c776 draw/so: maintain an exact number of written vertices
It's quite helpful during the rendering when we know
exactly the count of the vertices available in the
buffer.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
d8543bd752 draw: Implement support for primitive id
We were largely ignoring primitive id.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
f6bfb62c50 draw/so: Fix bogus assert
We do support so with multiple primitives.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
e6fc635351 draw/gs: Fix memory corruption with multiple primitives
We were flushing with incorrect number of primitives. TGSI exec
can only work with a single primitive at a time. Plus the fetching
with multiple primitives on llvm paths wasn't copying the last
element.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
f313b0c850 gallivm: cleanup the gs interface
Instead of void pointers use a base interface.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Brian Paul
ac114c6824 svga: add new memory-used HUD query
To track the amount of memory used by all pipe_resources (textures
and buffers).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 11:02:47 -06:00
Brian Paul
a69efa9482 util: add new util_resource_size() function in u_resource.[ch]
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 11:02:47 -06:00
Brian Paul
a3cccdec90 util: move functions from u_resource.c to u_transfer.c
The functions are prototyped in u_transfer.h and are related to the
other functions in u_transfer.c.

The next patch will re-use the u_resource.c file for new code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 11:02:47 -06:00
Vincent Lejeune
159d934066 r600g/llvm: Do not override llvm provided stack_size 2013-04-03 18:39:49 +02:00
Vincent Lejeune
097a6ecdfe r600g/llvm: Do not change cf_alu inst when adding alus 2013-04-03 18:22:40 +02:00
Marek Olšák
ff01e0db0e radeonsi: add more cases for copying unsupported formats to resource_copy_region
Ported from r600g commit:

8891b2f9c9

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

NOTE: This is a candidate for the 9.1 branch.
2013-04-03 10:58:33 -04:00
Brian Paul
3838edaf5d svga: add HUD queries for number of draw calls, number of fallbacks
The fallbacks count is the number of drawing calls that use a "draw"
module fallback, such as polygon stipple.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 09:56:08 -06:00
Brian Paul
49ed1f3cb3 svga: refactor occlusion query code
This is in preparation for adding new query types for the HUD.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 09:56:07 -06:00
Brian Paul
a9ae7e9c28 gallium/hud: try L8 texture for font if I8 format isn't supported 2013-04-03 09:44:57 -06:00
Brian Paul
0289ebaa0f svga: add case for PIPE_CAP_QUERY_PIPELINE_STATISTICS 2013-04-03 08:19:44 -06:00
Brian Paul
7e28debb6f st/mesa: rewrite comment in st_manager.c 2013-04-03 08:16:36 -06:00
Christoph Bumiller
80eef069f0 nv50,nvc0: remove MS resolve formats hack
Mesa now allows BlitFramebuffer resolve between RGBA and BGRA.
2013-04-03 13:19:15 +02:00
Christoph Bumiller
4de70bf43c nvc0: fix 128 bit compressed storage type selection 2013-04-03 12:54:44 +02:00
Christoph Bumiller
8e1dd58a7e nvc0: place staging textures in GART and map them directly 2013-04-03 12:54:44 +02:00
Christoph Bumiller
ba9b0b682f nv50: account for pesky prefetch in size calculation of linear textures 2013-04-03 12:54:44 +02:00
Christoph Bumiller
f0a0d59f0f nvc0: honour scaled coordinates setting for linear textures 2013-04-03 12:54:44 +02:00
Christoph Bumiller
d801545964 nvc0: fix for 2d engine R source formats writing RRR1 and not R001 2013-04-03 12:54:43 +02:00
Christoph Bumiller
6417d56c19 nv50,nvc0: disable DEPTH_RANGE_NEAR/FAR clipping during blit
We send position.z == 0, DEPTH_RANGE may be some arbitrary range
not including 0 (for example in piglit's hiz tests).
2013-04-03 12:54:43 +02:00
Christoph Bumiller
e45c969fe5 st/mesa: fix bitmap,drawpix,drawtex for PIPE_CAP_TGSI_TEXCOORD
NOTE: Changed the semantic index for the drawtex coordinate to
be the texture unit index instead of always 0.
Not sure if this is correct but since the value seems to depend
on the unit it would make sense to use different varying slots.
2013-04-03 12:54:43 +02:00
Christoph Bumiller
2a8145d36b nouveau: accelerate buffer copies in resource_copy_region 2013-04-03 12:54:43 +02:00
Christoph Bumiller
3ed4bbd769 nvc0: demagic some of the NVE4_COMPUTE_UPLOAD methods
It's actually the same as P2MF.
2013-04-03 12:54:43 +02:00
Christoph Bumiller
fb0334adb3 nvc0: read PM counters for each warp scheduler separately 2013-04-03 12:54:43 +02:00
Christoph Bumiller
7bac075f25 nvc0: add some metrics to driver specific queries 2013-04-03 12:54:43 +02:00
Christoph Bumiller
198f514aa6 nvc0: add some driver statistics queries 2013-04-03 12:54:43 +02:00
Christoph Bumiller
7628cc247f nvc0: disable compressed storage type 0xdb for now
Single-sample color compression doesn't seem that useful anyway.
2013-04-03 12:54:43 +02:00
Christoph Bumiller
ea12fc3f6c nvc0: use correct hw query for PRIMITIVES_GENERATED
It was the same as SO_STATISTICS[1] before.
2013-04-03 12:54:43 +02:00
Christoph Bumiller
6bca4e7085 nvc0: use fence to check state of queries that don't write sequence
This still isn't optimal, since the fence will signal a bit late,
but better than checking on the bo, which may never be ready if it
is shared (which is likely).
2013-04-03 12:54:43 +02:00
Christoph Bumiller
3d2790cead gallium/hud: add support for PIPE_QUERY_PIPELINE_STATISTICS
Also, renamed "pixels-rendered" to "samples-passed" because the
occlusion counter increments even if colour and depth writes are
disabled, or (on some implementations) for killed fragments that
passed the depth test when PS early_fragment_tests is set.
2013-04-03 12:54:43 +02:00
Christoph Bumiller
c620aad71c gallium/docs: fix definition of PIPE_QUERY_SO_STATISTICS
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-03 12:54:43 +02:00
Christoph Bumiller
f35e96d973 gallium: add PIPE_CAP_QUERY_PIPELINE_STATISTICS
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-03 12:54:43 +02:00
Paul Berry
41e4bccc75 i965: Reduce code duplication in handling of depth, stencil, and HiZ.
This patch consolidates duplicate code in the brw_depthbuffer and
gen7_depthbuffer state atoms.  Previously, these state atoms contained
5 chunks of code for emitting the _3DSTATE_DEPTH_BUFFER packet (3 for
Gen4-6 and 2 for Gen7).  Also a lot of logic for determining the
appropriate buffer setup was duplicated between the Gen4-6 and Gen7
functions.

This refactor splits the code into three separate functions:
brw_emit_depthbuffer(), which determines the appropriate buffer setup
in a mostly generation-independent way, brw_emit_depth_stencil_hiz(),
which emits the appropriate state packets for Gen4-6, and
gen7_emit_depth_stencil_hiz(), which emits the appropriate state
packets for Gen7.

Tested using Piglit on Gen5-7 (no regressions).

v2: Re-word some comments.  Fix an assertion that incorrectly
prohibited packed depth/stencil formats on Gen6 (these are allowed
provided that HiZ is disabled).

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-02 15:19:13 -07:00
Paul Berry
2ad0ed6349 Revert "glsl: Replace constant-index vector array accesses with swizzles"
This reverts commit dbf94d105a, which
was working around a bug in the handling of array indexing when
constant folding built-in functions.  Now that the constant folding
bug has been fixed, the workaround is no longer needed.
2013-04-02 12:24:16 -07:00
Paul Berry
7d4f1e6467 glsl: Fix array indexing when constant folding built-in functions.
Mesa constant-folds built-in functions by using a miniature GLSL
interpreter (see
ir_function_signature::constant_expression_evaluate_expression_list()).
This interpreter had a bug in its handling of array indexing, which
caused expressions like "m[i][j]" (where m is a matrix) to be handled
incorrectly.  Specifically, it incorrectly treated j as indexing into
the whole matrix (rather than indexing just into the vector m[i]); as
a result the offset computed for m[i] was lost and m[i][j] was treated
as m[j][0].
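
The offset mix-up can be illustrated with a flattened, column-major 3x3 matrix in C. This is a sketch of the arithmetic only, assuming column-major storage; the function names are hypothetical and the interpreter's real data structures differ.

```c
#include <assert.h>

enum { ROWS = 3, COLS = 3 };

/* Correct: m[i] selects column i, then j indexes within that column. */
static unsigned correct_offset(unsigned i, unsigned j)
{
    assert(i < COLS && j < ROWS);
    return i * ROWS + j;
}

/* Buggy behaviour as described: the offset computed for m[i] is dropped and
 * j is applied to the whole matrix, giving the offset of m[j][0]. */
static unsigned buggy_offset(unsigned i, unsigned j)
{
    (void) i; /* the column offset is lost */
    assert(j < COLS);
    return j * ROWS;
}
```

For m[1][2] the correct offset is 5, while the buggy path yields 6, the first component of column 2.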

Fixes piglit tests inverse-mat[234].{vert,frag}.

NOTE: This is a candidate for the 9.1 and 9.0 branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57436
2013-04-02 12:24:08 -07:00
Roland Scheidegger
450950c57a gallivm: bring back optimized but incorrect float to smallfloat optimizations
Conceptually the same as previously done in float_to_half.
Should cut down number of instructions from 14 to 10 or so, but
will promote some NaNs to Infs, so it's disabled.
It gets a bit tricky though handling all the cases correctly...
Passes basic tests either way (though there are no tests testing special
cases, but some manual tests injecting them seemed promising).

v2: style and comment fixes suggested by Jose

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-02 18:24:31 +02:00
Roland Scheidegger
3febc4a1cd gallivm: consolidate code for float-to-half and float-to-packed conversion.
This replaces the existing float-to-half implementation.
There are definitely a couple of differences - the old implementation
had unspecified(?) rounding behavior, and could at least in theory
construct Inf values out of NaNs. NaNs and Infs should now always be
properly propagated, and rounding behavior is now towards zero
(note this means too large but non-Infinity values get propagated to max
representable value, not Infinity).
The implementation will definitely not match util code, however (which
does nearest rounding, which also means too large values will get
propagated to Infinity).
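
The rounding behaviour described above can be sketched with a scalar float-to-half conversion that rounds toward zero. This is an illustrative stand-alone function, not the gallivm implementation, which emits vectorized LLVM IR; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Convert an IEEE 754 binary32 value to binary16, rounding toward zero.
 * NaN stays NaN, Inf stays Inf, and finite values too large for a half
 * clamp to the largest finite half (0x7bff) rather than becoming Inf. */
static uint16_t float_to_half_rtz(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);

    uint16_t sign = (uint16_t) ((bits >> 16) & 0x8000u);
    uint32_t mag = bits & 0x7fffffffu;
    uint32_t exp = mag >> 23;

    if (mag > 0x7f800000u)                 /* NaN */
        return (uint16_t) (sign | 0x7e00u);
    if (mag == 0x7f800000u)                /* Infinity */
        return (uint16_t) (sign | 0x7c00u);
    if (mag >= 0x47800000u)                /* >= 65536: too large, finite */
        return (uint16_t) (sign | 0x7bffu);
    if (exp < 113) {                       /* below 2^-14: denormal or zero */
        if (exp < 103)                     /* below 2^-24: truncates to zero */
            return sign;
        uint32_t mant = (mag & 0x007fffffu) | 0x00800000u;
        return (uint16_t) (sign | (mant >> (126 - exp)));
    }
    /* Normal range: rebias the exponent and drop the low 13 mantissa bits. */
    return (uint16_t) (sign | ((mag - 0x38000000u) >> 13));
}

static void check_float_to_half(void)
{
    assert(float_to_half_rtz(1.0f) == 0x3c00);
    assert(float_to_half_rtz(65504.0f) == 0x7bff);
    assert(float_to_half_rtz(100000.0f) == 0x7bff);
}
```

A round-to-nearest conversion would instead map 100000.0f to Infinity (0x7c00), which is the difference from the util code noted above.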

Also fix a bogus round mask probably leading to rounding bugs...
v2: fix a logic bug in handling infs/nans.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-02 18:24:31 +02:00
Vadim Girlin
9be624b3ef r600g: don't reserve more stack space than required v5
A reduced stack size allows more threads to run in some cases,
improving performance for the shaders that use stack (that is, for the
shaders with control flow instructions). E.g. with unigine-based apps.

v4: implement exact computation taking into account wavefront size
v5: add cases for RV620, RS880

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-02 19:34:14 +04:00
Vadim Girlin
7e04227f39 r600g: fix range handling for tgsi input declarations v2
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-02 19:34:14 +04:00
Marek Olšák
f8502b7e71 gallium/hud: do .xxxx swizzling for the font texture in the fragment shader
This allows using L8 and R8 for the font if I8 isn't supported.

Tested-by: Brian Paul <brianp@vmware.com>
2013-04-02 16:57:57 +02:00
Brian Paul
98b64cc20f hud: flush/unmap the vertex buffer before drawing
The VMware svga driver is picky about making sure the VBO is unmapped
before drawing.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-02 08:17:28 -06:00
Brian Paul
bdd3770b78 draw: use pipe_transfer_unmap() to match pipe_transfer_map() 2013-04-02 08:17:28 -06:00
Roland Scheidegger
9b329f4c09 gallivm: fix signed small float to float conversion
Introduced by 5f41e08cf3,
just a silly typo.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=62921.
2013-04-02 13:21:07 +02:00
Christian König
a0dca4409a radeonsi: add instance divisor support v3
v2: reduce key size, don't copy key around too much.
v3: remove key size reduction

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-02 13:01:43 +02:00
Christian König
cf9b31f78a radeonsi: add start instance support
This works differently than on R600, we need to add the start instance manually.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-02 13:01:43 +02:00
Christian König
e4ed58763a radeonsi: add instanceid support
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-02 13:01:43 +02:00
Christian König
83df955ca9 radeon/llvm: move system value fetching to common code
This should be used by both SI and R600.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-02 13:01:42 +02:00
Michel Dänzer
c6efb4870b radeonsi: Handle arbitrary 2-byte formats in resource_copy_region
Fixes mplayer -vo vdpau OSD.

NOTE: This is a candidate for the 9.1 branch.

Reported-by: Igor Vagulin <igor.vagulin@gmail.com>

Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Christian König <christian.koenig@amd.com>
2013-04-02 11:42:35 +02:00
Maarten Lankhorst
6d20c646d6 nvc0: Fix fd leak in nvc0_create_decoder
NOTE: This is a candidate for the 9.0 and 9.1 branches.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-04-02 10:25:26 +02:00
Aras Pranckevicius
b2eee0869f GLSL: fix lower_jumps to report progress properly
A fix for lower_jumps progress reporting, very much like the similar fix in
c1e591eed.

NOTE: This is a candidate for stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-04-01 16:57:17 -07:00
Eric Anholt
62501c3af8 i965/fs: Allow CSE on pre-gen7 varying-index uniform loads
All the other expression types allowed here have inst->mlen == 0, and this
one has implied MRF writes for all of its payload, so nothing else in the
implementation should need to change.

Reduces SEND messages for loading from pull constants in kwin's Lanczos
shader from 16 to 6.  (Due to a deficiency in constant propagation, I
can't use the hack I did in the previous commit to test the performance
change)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61554
NOTE: This is a candidate for the 9.1 branch.
2013-04-01 16:17:26 -07:00
Eric Anholt
70b27e0e4b i965/fs: Use LD messages for pre-gen7 varying-index uniform loads
This comes at a minor performance cost at the moment (-3.2% +/- 0.2%, n=14 on
my GM45 forced to load all uniforms through the varying-index path), but we
get a whole vec4 at a time to reuse in the next commit.

v2: Fix comment about channels in the other message.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.1 branch.
2013-04-01 16:17:26 -07:00
Eric Anholt
ce316f62ef i965/fs: Don't double-emit SEND dependency workarounds at control flow.
We weren't setting needs_dep[i] in the loops, so we'd continue on to
potentially add the same workaround MOVs to the later basic block
boundaries, too.  We can either set needs_dep[i] to exit through the
normal path, or we can just return since we know we're done.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 16:17:26 -07:00
Eric Anholt
3cf69b2284 i965/fs: Bake regs_written into the IR instead of recomputing it later.
For sampler messages, it depends on the target gen, and on gen4
SIMD16-sampler-on-SIMD8-execution we were returning 4 instead of 8 like we
should.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.1 branch.
2013-04-01 16:17:26 -07:00
Eric Anholt
8edc7cbe64 i965/fs: Clean up the setup of gen4 simd16 message destinations.
I think this makes it much more obvious what's going on here.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 16:17:26 -07:00
Eric Anholt
9f43b84928 i965/fs: Do CSE on gen7's varying-index pull constant loads.
This is our first CSE on a regs_written() > 1 instruction, so it takes a
bit of extra fixup.  Reduces the number of loads on kwin's Lanczos shader
from 12 to 2.

v2: Fix compiler warning (false positive on possibly-uninitialized variable)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61554
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
NOTE: This is a candidate for the 9.1 branch.
2013-04-01 16:17:25 -07:00
Eric Anholt
dca5fc1435 i965/fs: Improve performance of varying-index uniform loads on IVB.
Like we have done for the VS and for constant-index uniform loads, we use
the sampler engine to get caching in front of the L3 to avoid tickling the
IVB L3 bug.  This is also a bit of a functional change, as we're now
loading a vec4 instead of a single dword, though we're not taking
advantage of the other 3 components of the vec4 (yet).

With the driver hacked to always take the varying-index path for all
uniforms, improves performance of my old GLSL demo by 315% +/- 2% (n=4).
This is a major fix for some blur shaders in compositors from the
varying-index uniforms support I introduced in 9.1.

v2: Move old offset computation into the pre-gen7 path.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61554
NOTE: This is a candidate for the 9.1 branch.
2013-04-01 16:17:25 -07:00
Eric Anholt
bc0e1591f6 i965/fs: Avoid inappropriate optimization with regs_written > 1.
Right now we don't have anything with regs_written() > 1 and !inst->mlen,
but that's about to change.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 16:17:25 -07:00
Eric Anholt
740350c982 i965: Make the fragment shader pull constants index by dwords, not vec4s.
We want to load vec4s, since loading a vec4 instead of a dword is
basically no increased latency.  But for variable indexed access, the
previous requirement of aligned vec4s for a sampler LD was hard to
implement.

Note that this change only affects those messages that use the surface
format, like sampler LDs, but not the untyped data cache loads we've
used in other cases.

No significant performance difference on my GLSL demo with uniforms forced
to take the varying pull constants path (n=4).

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 16:17:25 -07:00
Eric Anholt
2f41a60145 i965: Make the constant surface interface take a normal byte size.
This puts the rounding-up logic into the function itself instead of all
the callers having to manage it.  Also drop an "unused" comment in gen4,
as the stride *is* used for texbos (and will be for uniforms soon).

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 16:17:25 -07:00
Eric Anholt
8c694dfe64 i965/fs: Move varying uniform offset computation into the helper func.
I'm going to want to change the math for gen7 using sampler LD
instructions in a way that gets CSE to occur like we'd hope.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 16:17:25 -07:00
Eric Anholt
59e858861c i965/fs: Remove creation of a MOV instruction that's never used.
We weren't inserting it into the list, so it did nothing.  This line was
replaced by the MOV/MUL block above.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 16:17:24 -07:00
Eric Anholt
1d6ead3804 i965/fs: Allow constant propagation into MACH.
This happens quite a bit with varying-index uniform loads.  We could also
do better by avoiding the MACH entirely, but there's no reason not to at
least take this step.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 16:17:24 -07:00
Vincent Lejeune
50fd9c4544 r600g/llvm: Update LLVM_REVISION.txt 2013-04-01 23:50:20 +02:00
Vincent Lejeune
8c8c4e3977 r600g/llvm: Use stack_size provided from llvm. 2013-04-01 23:43:57 +02:00
Vincent Lejeune
4ac0d85ca6 r600g/llvm: uses function attribute to pass shader type 2013-04-01 23:43:42 +02:00
Vincent Lejeune
af38695f51 r600g/llvm: Add support for cf_alu native encode 2013-04-01 23:43:27 +02:00
Haixia Shi
bc0cc2944f ACTIVE_UNIFORM_MAX_LENGTH should include 3 extra characters for arrays.
If the active uniform is an array, then the length of the uniform name should
include the three extra characters for the "[0]" suffix, which is required by
the GL 4.2 spec to be appended to the uniform name in glGetActiveUniform().

This avoids the situation where the output buffer does not have enough space
to hold the "[0]" suffix, resulting in an incomplete array specification like
"foobar[0".

NOTE: This is a candidate for the 9.1 branch.
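
The length rule described above can be sketched roughly as follows. This is
only an illustration, not the code from this patch:
uniform_name_buffer_length() and its is_array parameter are hypothetical
names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of bytes a caller needs in order to receive
 * one active uniform's name, including the NUL terminator. */
static size_t
uniform_name_buffer_length(const char *name, int is_array)
{
   assert(name != NULL);

   size_t length = strlen(name) + 1;   /* name plus NUL terminator */

   if (is_array)
      length += 3;                     /* room for the "[0]" suffix */

   return length;
}
```

The reported maximum would then be the largest such value over all active
uniforms, so a buffer of that size never truncates the suffix.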

Change-Id: I41e87ba347a7169eec8c575596cc3416adbe0728
Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-04-01 13:39:13 -07:00
Matt Turner
e2b40e253b i965/fs: Fix bad interaction between tex swizzles and textureQueryLOD.
Reported-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 13:11:43 -07:00
Eric Anholt
4ee892ee8a i965: Remove the old brw_optimize() code.
This is now done in the VS backend before instruction emit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 11:36:06 -07:00
Eric Anholt
4fee05b020 i965/vs: Add a pass to set dependency control fields on instructions.
This is a more aggressive version of the old brw_optimize() path.  Reduces
cycles spent in the vertex shader on minecraft by 18.6% +/- 10.0% (n=15).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 11:36:05 -07:00
Eric Anholt
229a51cdbe i965: Dump shader source for linked shader programs.
We dump shader source in ir_to_mesa.cpp, and we dump linked programs here,
but we had no reference from the linked programs to their source.  This
was preventing improvement of shader-db to use linked shader programs
instead of individual shader files (which is bogus, because it means we
optimize out VS outputs, and don't interpolate FS inputs!)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 11:30:36 -07:00
Mike Lothian
777a7f2003 clover: Fix build with LLVM 3.3 2013-04-01 10:50:23 -07:00
Brian Paul
1165ff1af1 llvmpipe: use triangle subdivision to avoid fixed-point overflow issues
If we're drawing to a surface that's 2048 x 2048 pixels or larger there's
danger of fixed-point overflow in the triangle rasterization code.  That
leads to various rendering glitches.

Rather than implement some intricate changes to the rasterization code,
simply subdivide triangles into smaller subtriangles to avoid the issue.
Only do this when the drawing surface is larger than 2048 by 2048.
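
A minimal sketch of the subdivision idea, assuming a midpoint split into four
subtriangles and a bounding-box size test. The names (struct vec2,
subdivide_triangle, max_extent) are hypothetical and the actual llvmpipe
criterion may differ:

```c
#include <assert.h>

struct vec2 { float x, y; };

static float min3(float a, float b, float c)
{
   float m = a < b ? a : b;
   return m < c ? m : c;
}

static float max3(float a, float b, float c)
{
   float m = a > b ? a : b;
   return m > c ? m : c;
}

static struct vec2 midpoint(struct vec2 p, struct vec2 q)
{
   struct vec2 m = { (p.x + q.x) * 0.5f, (p.y + q.y) * 0.5f };
   return m;
}

/* Recursively split a triangle at its edge midpoints until its bounding box
 * fits within max_extent in both directions (or the depth budget runs out).
 * Returns the number of triangles that would be handed to the rasterizer. */
static unsigned
subdivide_triangle(struct vec2 a, struct vec2 b, struct vec2 c,
                   float max_extent, unsigned depth)
{
   assert(max_extent > 0.0f);

   float width = max3(a.x, b.x, c.x) - min3(a.x, b.x, c.x);
   float height = max3(a.y, b.y, c.y) - min3(a.y, b.y, c.y);

   if ((width <= max_extent && height <= max_extent) || depth == 0)
      return 1;   /* small enough: a real rasterizer would draw it here */

   struct vec2 ab = midpoint(a, b);
   struct vec2 bc = midpoint(b, c);
   struct vec2 ca = midpoint(c, a);

   /* Three corner triangles plus the central one. */
   return subdivide_triangle(a, ab, ca, max_extent, depth - 1) +
          subdivide_triangle(ab, b, bc, max_extent, depth - 1) +
          subdivide_triangle(ca, bc, c, max_extent, depth - 1) +
          subdivide_triangle(ab, bc, ca, max_extent, depth - 1);
}
```

Each subtriangle covers a smaller coordinate range, which is what keeps the
fixed-point edge computations away from overflow.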

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-01 08:40:35 -06:00
Brian Paul
95df2b2883 mesa: remove platform checks around __builtin_ffs, __builtin_ffsll
Use the __builtin_ffs, __builtin_ffsll functions whenever we have GCC,
not just for specific platforms.  Fixes Solaris build.
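
The resulting pattern looks roughly like the sketch below. find_first_set()
is a hypothetical wrapper name used for illustration, and the fallback loop
is an assumption about what a non-GCC build might do:

```c
#include <assert.h>

/* Hypothetical wrapper; the name is illustrative, not Mesa's. */
static int
find_first_set(int value)
{
#if defined(__GNUC__)
   /* Any GCC-compatible compiler provides this builtin, whatever the OS. */
   return __builtin_ffs(value);
#else
   /* Portable fallback: 1-based index of the lowest set bit, 0 if none. */
   unsigned int bits = (unsigned int) value;
   int index = 1;

   if (bits == 0)
      return 0;
   while ((bits & 1u) == 0) {
      bits >>= 1;
      index++;
   }
   return index;
#endif
}

/* Small self-check showing the expected 1-based results. */
static void
check_find_first_set(void)
{
   assert(find_first_set(0) == 0);
   assert(find_first_set(1) == 1);
   assert(find_first_set(0x28) == 4);
}
```

Keying the check on the compiler rather than on the platform is what lets
Solaris builds with GCC pick up the builtin.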

Note: This is a candidate for the stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62868
Signed-off-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-01 08:40:35 -06:00
Brian Paul
99811c344b docs: add a new page documenting known application issues
Let's try to update this when we find other broken applications...

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-01 08:40:35 -06:00
Brian Paul
fe30fa9ad6 drirc: set always_have_depth_buffer for Topogon
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-01 08:18:09 -06:00
Adam Jackson
e26d5940ff gallivm: Minor comment cleanup
Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-04-01 09:45:38 -04:00
Dave Airlie
135bb3c1a9 mesa: fix texture storage multisample prototypes harder.
I just noticed the warnings since I fixed the other bit.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-01 19:54:56 +10:00
Vincent Lejeune
c3fb34ee8d r600g/llvm: Update LLVM_REVISION 2013-03-31 21:37:20 +02:00
Vincent Lejeune
67a8ee7aaa r600g/llvm: use native encode for tex 2013-03-31 21:35:47 +02:00
Dave Airlie
5b36bc05be glapi: fix storage multisample build errors
Reported on #radeon by udovdh

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-03-31 20:41:28 +10:00
Chris Forbes
2a528889a3 docs: mark ARB_texture_storage_multisample done
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-31 22:19:42 +13:00
Chris Forbes
d25b4d5e90 i965: enable ARB_texture_storage_multisample on Gen6+
This can be enabled everywhere that ARB_texture_multisample is
supported -- ARB_texture_storage is supported on everything.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-31 22:19:40 +13:00
Chris Forbes
e0015c819c mesa: allow multisample texture targets in [Get]TexParameter*
ARB_texture_storage_multisample allows texture parameters to be
queried for TEXTURE_2D_MULTISAMPLE and TEXTURE_2D_MULTISAMPLE_ARRAY
targets.

Some parameters may also be set, with the following exceptions:

- TEXTURE_BASE_LEVEL may not be set to a nonzero value; generates
  INVALID_OPERATION

- any state which appears in the `per-sampler` state table may not
  be set; generates INVALID_OPERATION

V2: Don't introduce bogus handling of TEXTURE_MAX_LEVEL

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-31 22:19:36 +13:00
Chris Forbes
b15c558c85 mesa: improve reported function name in Tex*Multisample
Now that there are 4 variants, just pass the function name into
teximagemultisample rather than reconstructing it.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-31 22:19:34 +13:00
Chris Forbes
9cbfe98bfc mesa: add enable bit for ARB_texture_storage_multisample
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-31 22:19:32 +13:00
Chris Forbes
719974b54c glapi: add definition of ARB_texture_storage_multisample
Adds XML for the extension, dispatch_sanity enabling, and the two new
entrypoints. These are both implemented by calling the shared
teximagemultisample() with immutable=GL_TRUE.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-31 22:19:28 +13:00
Chris Forbes
788b0f8535 mesa: add support for immutable textures to teximagemultisample()
The new entrypoints will come later, but this adds the actual logic for
supporting immutable multisample textures:

- The immutability flag is set as desired.
- Attempting to modify an immutable multisample texture produces
  INVALID_OPERATION.

Note: The extension spec does not mention adding this behavior to
TexImage*Multisample, but it seems like the reasonable thing to do.

V2: - Cover missing error cases (unsized formats; texture object zero)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
[V1] Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-31 22:19:22 +13:00
Chris Forbes
7f32b9560b mesa: extract _mesa_is_legal_tex_storage_format helper
This is about to be used in teximagemultisample() when immutable=true.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-31 22:19:13 +13:00
Kenneth Graunke
fdc5941972 mesa: Delete VERT_ATTRIB_GENERIC_NV and VERT_BIT_GENERIC_NV macros.
These haven't been used since we deleted NV_vertex_program support.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-30 19:19:45 -07:00
Eric Anholt
0967c362bf i965: Fix an inconsistency in the VUE map with gl_ClipVertex on gen4/5.
We are intentionally not allocating a slot for gl_ClipVertex.  But by
leaving the bit set in the slots_valid, the fragment shader's computation
of where varyings are in the URB entry coming out of the SF would be off by
one.  Fixes rendering in Freespace 2 SCP, and improves rendering in TF2.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62830
Tested-by: Joaquín Ignacio Aramendía <samsagax@gmail.com>
NOTE: This is a candidate for the 9.1 branch.
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-03-30 17:24:18 -07:00
Eric Anholt
9dd19575d3 intel: Remove a never-taken debug print path.
Alessandro Pignotti noted when I added this code in commit
0e723b135b that it's in the else block for
"if (busy)", so this debug print couldn't happen.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-30 17:23:50 -07:00
Brian Paul
c34bbe110d st/mesa: add ir_lod case in GLSL->TGSI code to silence warning 2013-03-29 17:21:33 -06:00
Ian Romanick
e0131196ca glsl: Generate masked write instead of vector array index for UBO lowering
When reading a column from a row-major matrix, we would slot the single
value read into the vector using an ir_dereference_array of the vector
with a constant index.  This will (eventually) get optimized to a
masked-write, so just generate the masked write in the first place.

v2: Remove unused variable 'chan'.  Suggested by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Eric Anholt <eric@anholt.net>
2013-03-29 12:01:14 -07:00
Ian Romanick
65cc68f430 glsl: Replace open-coded dot-product with dot
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Paul Berry <stereotype441@gmail.com>
2013-03-29 12:01:11 -07:00
Ian Romanick
dbf94d105a glsl: Replace constant-index vector array accesses with swizzles
Search and replace:

    ][0] -> ].x
    ][1] -> ].y
    ][2] -> ].z
    ][3] -> ].w

Fixes piglit tests inverse-mat[234].{vert,frag}.  These tests call the
inverse function with constant parameters and expect proper constant
folding to happen.  My suspicion is that this patch papers over some bug
in constant propagation involving array accesses.

Either way, all of these accesses eventually get lowered to swizzles.
This cuts out the middle man (saving a trivial amount of CPU).

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Paul Berry <stereotype441@gmail.com>
2013-03-29 12:01:07 -07:00
Ian Romanick
c770faea0a glsl: Add missing bool case in glsl_type::get_scalar_type
Since the case was missing bvec4->get_scalar_type() would return bvec4,
but vec4->get_scalar_type() would return float.
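
The shape of the bug can be sketched in plain C as below. This is an analogy
with hypothetical names (enum base_type, struct vec_type, scalar_type_of),
not the glsl_type code itself:

```c
#include <assert.h>

enum base_type { TYPE_UINT, TYPE_INT, TYPE_FLOAT, TYPE_BOOL, TYPE_OTHER };

struct vec_type {
   enum base_type base;
   unsigned components;
};

/* Map a vector type to its one-component scalar type.  Types that have no
 * scalar form fall through and are returned unchanged. */
static struct vec_type
scalar_type_of(struct vec_type type)
{
   assert(type.components >= 1);

   switch (type.base) {
   case TYPE_UINT:
   case TYPE_INT:
   case TYPE_FLOAT:
   case TYPE_BOOL:   /* without this case a bool vector comes back unchanged */
      return (struct vec_type) { type.base, 1 };
   default:
      return type;
   }
}
```

With the bool case present, a four-component bool vector maps to a scalar
bool in the same way a four-component float vector maps to float.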

NOTE: This is a candidate for stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-29 12:01:01 -07:00
Kenneth Graunke
57a502518e i965: Fix INTEL_DEBUG=shader_time for fragment shaders with discards.
"discard" instructions generate HALT instructions which jump to a final
HALT near the end of the shader.  Previously, fs_generator created this
final jump target when it saw the first FS_OPCODE_FB_WRITE, causing it
to jump right before the FB write epilogue.  This is normally good.

However, INTEL_DEBUG=shader_time also has an epilogue section which
records the final timestamp.  The frontend emits IR for this just before
FS_OPCODE_FB_WRITE.  Unfortunately, this led to the following ordering:

1. Shader Time Epilogue
2. Final HALT (where discards jump)
3. Framebuffer Write Epilogue

This meant that discarded pixels completely skipped the shader time
epilogue, causing no ending timestamp to be written.  This obviously
led to inaccurate results.

This patch adds a new FS_OPCODE_PLACEHOLDER_HALT in the IR stream just
before any epilogue sections.  This is where the final HALT should be
generated, and makes it easy to ensure the correct ordering:

1. Final HALT
2. Shader Time Epilogue
3. Framebuffer Write Epilogue

For shaders that don't discard, this opcode compiles away to nothing.
The scheduler adds barrier dependencies to make sure that it doesn't
get moved above any FS_OPCODE_DISCARD_JUMP instructions.

One 8-wide shader in GLBenchmark 2.7 dropped from 2291.67 Gcycles to
a mere 5.13 Gcycles.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-29 11:39:32 -07:00
Eric Anholt
20d846ce8b i965: Add names for all instructions to dump_instruction() in FS and VS.
I'd previously added the minimum names to understand my dumps, but this
makes dumps in general much easier to read.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-29 11:39:21 -07:00
Matt Turner
ed6186f0e8 i965: Enable ARB_texture_query_lod.
v2: Support Ironlake as well.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-29 10:21:14 -07:00
Matt Turner
b8aa9f7d3a i965/fs: Generate LOD sampler message from ir_lod.
v2: Support Ironlake as well.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-29 10:21:14 -07:00
Dave Airlie
110ca8b1f3 glsl: Implement ARB_texture_query_lod
v2 [mattst88]:
   - Rebase.
   - #define GL_ARB_texture_query_lod to 1.
   - Remove comma after ir_lod in ir.h for MSVC.
   - Handled ir_lod in ir_hv_accept.cpp, ir_rvalue_visitor.cpp,
     opt_tree_grafting.cpp.
   - Rename textureQueryLOD to textureQueryLod, see
     https://www.khronos.org/bugzilla/show_bug.cgi?id=821
   - Fix ir_reader of (lod ...).
v3 [mattst88]:
   - Rename textureQueryLod to textureQueryLOD, pending resolution of
     Khronos 821.
   - Add ir_lod case to ir_to_mesa.cpp.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-29 10:20:26 -07:00
Matt Turner
0e0ab8a071 i965/fs: Use measured Gen7 instruction timings on Gen6.
x before
+ after
+------------------------------------------------------------------------------+
|   x                                   x   +                                  |
|   xx  ++                              x   +                                  |
|   xx  ++ +                           xx   ++                                 |
|x xxx x+++++          +           xxx x*x+*+++ +         x                   +|
|   |_____|____________A______A____M____M_|_______|                            |
+------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
    x  23       8083.78       8287.83       8205.55     8162.7461     68.307951
    +  23       8107.56       8358.74       8224.33     8186.1765     71.506301
    No difference proven at 95.0% confidence

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-29 10:13:27 -07:00
Matt Turner
f085b21b25 i965/fs: Increase and document MAD latency on Gen7.
58% of mad(8) generated in shader-db are reading registers from the same
bank.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-29 10:13:27 -07:00
Matt Turner
414ea2f560 i965/fs: Add LRP instruction latency.
Set its latency to what happens to be the default floating-point
instruction latency. One day we may want to handle latency based on
register bank information.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-29 10:13:27 -07:00
Matt Turner
ad4507b355 i965/fs: Add Haswell cycle timings
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-29 10:13:27 -07:00
Matt Turner
7997e59b65 i965: Note that write-after-write dependencies are blocking.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-29 10:13:26 -07:00
Matt Turner
f91e371fee i965: Reword comment about the shared mathbox.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-29 10:13:26 -07:00
Roland Scheidegger
5f41e08cf3 gallivm: consolidate some half-to-float and r11g11b10-to-float code
Similar enough that we can try to use shared code.
v2: fix a stupid bug using wrong variable causing mayhem with Inf and NaNs.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-29 16:39:40 +01:00
Chris Forbes
4412f3bc13 mesa: provide default implementation of QuerySamplesForFormat
Previously at least i915 failed to provide an implementation, but
exposed ARB_internalformat_query anyway, leading to crashes when
QueryInternalformativ was called.

Default implementation just returns 1 for everything, so is suitable for
any driver which does not support multisampling.
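
A rough sketch of the pattern, with hypothetical names throughout (struct
driver_functions, default_query_samples_for_format, init_driver_functions)
and an assumed signature: the core installs a fallback so the hook is never
NULL, and a driver with multisample support overrides it afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified driver hook table. */
struct driver_functions {
   size_t (*query_samples_for_format)(int target, int internal_format,
                                      int samples[16]);
};

/* Fallback for drivers without multisampling: report a single sample count
 * of 1 regardless of target or format. */
static size_t
default_query_samples_for_format(int target, int internal_format,
                                 int samples[16])
{
   (void) target;
   (void) internal_format;

   samples[0] = 1;
   return 1;   /* number of entries written to samples[] */
}

/* Core initialization: every hook gets a usable default. */
static void
init_driver_functions(struct driver_functions *functions)
{
   assert(functions != NULL);
   functions->query_samples_for_format = default_query_samples_for_format;
}
```

Because the default lives in core code, callers can invoke the hook without
a NULL check.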

V2: - Move from intel to core mesa.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-29 20:54:36 +13:00
Christoph Bumiller
ee624ced36 nvc0: implement MP performance counters
There's more, but this only adds (most of) the counters that are
handled directly by the shader processors.
The other counter domains are not handled on the multiprocessor and
there are no FIFO object methods for configuring them.
Instead, they have to be programmed by the kernel via PCOUNTER, and
the interface for this isn't in place yet.
2013-03-29 00:33:01 +01:00
Christoph Bumiller
480359bcf6 nvc0: enable compression when supported 2013-03-29 00:33:01 +01:00
Christoph Bumiller
25722e3454 nvc0: use NOUVEAU_GETPARAM_GRAPH_UNITS to get MP count 2013-03-29 00:33:00 +01:00
Christoph Bumiller
443b247878 nv50,nvc0: fix 3d blits, restore viewport after blit 2013-03-29 00:33:00 +01:00
Christoph Bumiller
090e73fc46 nv50: fix 3D render target setup 2013-03-29 00:33:00 +01:00
Brian Paul
b54ce3738a llvmpipe: put .bmp extension on dumped image files 2013-03-28 17:17:26 -06:00
Brian Paul
e90c56bc4e llvmpipe: add 'f' suffix to 1.0 in fixed_to_float() 2013-03-28 17:17:26 -06:00
Brian Paul
499aa3ddb4 draw: fix some build breakage when LLVM is not used
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62883
Tested-by: Vinson Lee <vlee@freedesktop.org>
2013-03-28 17:15:58 -06:00
Marek Olšák
9ad9141917 mesa: handle STATE_CURRENT_ATTRIB_MAYBE_VP_CLAMPED for parameter printing
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-28 20:02:50 +01:00
Kenneth Graunke
9fe47756b3 i965: Tidy shader time printing code by using printf's field widths.
We can use %-6s%-6s rather than manually counting characters, resulting
in much more readable code.

This necessitates a small secondary change: using "total fs16" and ""
now causes the "" string to be padded out to 6 characters, resulting in
too much whitespace.  Splitting it into "total" and "fs16" produces the
same output as before.
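
The effect of the field widths can be seen in a small standalone sketch.
format_shader_label() is a hypothetical helper, not the driver's function:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical formatter: left-justify both labels in 6-character columns.
 * Returns the number of characters snprintf would write. */
static int
format_shader_label(char *buffer, size_t size,
                    const char *kind, const char *stage)
{
   assert(buffer != NULL && size > 0);
   return snprintf(buffer, size, "%-6s%-6s", kind, stage);
}
```

Passing "total" and "fs16" gives a 12-character result with each label padded
to 6 columns, whereas passing "total fs16" and "" gives 16 characters because
the empty string is still padded out to 6 spaces.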

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:44 -07:00
Eric Anholt
6192e9b377 i965/vs: Include URB payload setup in shader_time.
This much more accurately reflects the cost of the vertex shader, since
the payload setup is often a significant fraction of the instructions in
the VS.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:41 -07:00
Eric Anholt
55feb19704 i965/vs: Use a send from a 2-register VGRF for shader time writes.
This will let us emit it later, after we're setting up MRFs for the
URB write.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:37 -07:00
Eric Anholt
130138030a i965/vs: Teach copy propagation about sends from GRFs.
This incidentally also teaches it a bit about gen6 math -- we now allow
unswizzled, unmodified GRF temps as the sources for math.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:34 -07:00
Eric Anholt
c3a22d42a8 i965/vs: Prepare split_virtual_grfs() for the presence of SENDs from GRFs.
v2: Fix silly bool handling, and don't add new tabs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:29 -07:00
Eric Anholt
47e795d861 i965/fs: Include everything but the final FB write in shader_time.
Previously, if you just wrote a constant color to the render target, no
time got noted at all.  This is convenient for doing single-instruction
timings, but not so much for actual program analysis.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:23 -07:00
Eric Anholt
5c5218ea61 i965/fs: Switch shader_time writes to using GRFs.
This avoids conflicts between shader_time and FB writes, so we can include
more of the program under our profiling.  This does mean hiding more of
the message setup from the optimizer, which doesn't have a way to handle
multi-reg sends from GRFs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:15 -07:00
Eric Anholt
5c039543db i965: Provide more detailed information to match shader_time to programs.
Ken asked me the other day what -1 vs 0 vs 3 vs other meant in our shader
names, and I realized that it was really unclear.  I'd like to do even
better, like noting which one is the clear shader, but that would require
exposing the metaops struct to the driver.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:11 -07:00
Eric Anholt
d2ba1c24b4 i965: Track ARB program state along with GLSL state for shader_time.
This will let us do much better printouts for non-GLSL programs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:01 -07:00
Marek Olšák
a19f6e880a st/dri: fix crash with HUD and single buffering 2013-03-28 18:17:21 +01:00
Marek Olšák
6b5dfa42c9 st/mesa: remove leftover printfs from ReadPixels
Oops, I thought I had removed all debugging code.
2013-03-28 18:17:21 +01:00
Eric Anholt
eda434921d i965/fs: Improve performance of copy propagation dataflow using bitsets.
Reduces compile time of l4d2's slowest shader by 17.8% +/- 1.3% (n=10).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 09:48:50 -07:00
Zack Rusin
d066133a76 llvmpipe/draw: Fix texture sampling in geometry shaders
We weren't correctly propagating the samplers and sampler views
when they were related to geometry shaders.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:02 -07:00
Zack Rusin
186a6bffdd draw/llvm: Cleanup the store debugging code
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:02 -07:00
Zack Rusin
10964fc73d draw: Allocate the output buffer for output primitives
We were allocating the output buffer but using the input
primitives. We need to allocate that buffer using the
maximum number of output, not input, primitives.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:02 -07:00
Zack Rusin
f20f981553 gallivm: Implement the breakc instruction
Required by more modern examples. Like BRK but with a condition.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:02 -07:00
Zack Rusin
b66ffcf2f8 gallivm: implement implicit primitive flushing
TGSI semantics currently require an implicit endprim at the end
of GS if an ending primitive hasn't been emitted.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:02 -07:00
Zack Rusin
e96f4e3b85 gallium/llvm: implement geometry shaders in the llvm paths
This commit implements code generation of the geometry shaders in
the SOA paths. All the code is there but bugs are likely present.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:02 -07:00
Zack Rusin
edcebe665d draw/gs: Fetch more than one primitive per invocation
Allows executing gs on up to 4 primitives at a time. Will also be
required by the llvm code because there we definitely don't want
to flush with just a single primitive.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:01 -07:00
Zack Rusin
014c4d1cd7 draw/gs: Abstract the portions of GS that are tgsi specific
To be able to add llvm paths later on we need to have some common
interface for them.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:01 -07:00
Zack Rusin
a85c83e427 draw/llvm: Remove unused gs_constants from jit_context
The member was never used and we'll need to handle it differently
because gs will also need samplers/textures setup.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:01 -07:00
Zack Rusin
90ee8de700 graw/gs: add missing max output vertices to all tests
A few tests were missing this crucial property.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:01 -07:00
Jerome Glisse
3f7d9710e8 radeonsi: add cs tracing v3
Same as on r600: trace CS execution by writing the CS offset after each
state. This allows pinpointing a lockup inside the command stream and
narrows down the scope of lockup investigation.

v2: Use WRITE_DATA packet instead of WRITE_MEM
v3: Remove useless nop packet

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-03-27 11:38:02 -04:00
Chris Forbes
21a2dfa55d mesa: only check sample count if we actually wanted multisampling
Fixes various test fallout from 90b5a2425a on Pineview, which claims to
support ARB_internalformat_query but doesn't actually provide the
driverfunc.

That driver is still broken [GetInternalformativ will still segfault!]
but it was silly to be going through the sample count logic in the
nonmultisampling case at all.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-27 07:49:12 +13:00
Christian König
c77159cc11 radeon/llvm: document LLVM commit
We need at least that revision to work correctly now.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-26 15:08:00 +01:00
Christian König
1c10018925 radeonsi: add preloading for all samplers
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-03-26 12:57:43 +01:00
Christian König
0f6cf2bc79 radeonsi: add preloading of all constants
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-03-26 12:57:40 +01:00
Christian König
44e3224554 radeonsi: mark most intrinsics as readnone/nounwind
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-03-26 12:57:36 +01:00
Christian König
206f059e1f radeonsi: mark all loads as constant
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-03-26 12:57:33 +01:00
Christian König
86f6fc2f1d radeonsi: remove wqm intrinsic
Now the backend handles that itself.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-03-26 12:57:30 +01:00
Christian König
6249db73ea radeon/llvm: remove unneeded inclusion
The include isn't needed and the file has moved with LLVM master.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-03-26 12:57:23 +01:00
Christian König
0f001fbff1 glsl_to_tgsi: avoid creating arrays if driver doesn't support them
Avoid creating arrays if we replace indirect addressing anyway.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-26 10:22:27 +01:00
Christian König
462de2e65f glsl_to_tgsi: make simplify_cmp work with arrays
Even when we have arrays it is possible for simplify_cmp
to work on temps, just not on arrays.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=62696

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-26 10:22:27 +01:00
Marek Olšák
98a8e5b87e gallium/docs: document get_driver_query_info 2013-03-26 01:37:40 +01:00
Marek Olšák
8ddae684af r600g: add a driver query returning the amount of requested VRAM and GTT memory 2013-03-26 01:28:19 +01:00
Marek Olšák
2504380aaf r600g: add a driver query returning the number of draw_vbo calls
between begin_query and end_query
2013-03-26 01:28:19 +01:00
Marek Olšák
e40c634bd2 st/dri: integrate the HUD
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-26 01:28:19 +01:00
Marek Olšák
c91cf7d7d2 gallium: implement a heads-up display module
Reviewed-by: Brian Paul <brianp@vmware.com>

v2: lots of cosmetic changes
2013-03-26 01:28:19 +01:00
Marek Olšák
8ddcd715b7 gallium: add interface for driver queries like performance counters, etc.
The pipe query interface is reused. The list of available queries can be
obtained using pipe_screen::get_driver_query_info.
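
A rough sketch of how a caller might enumerate driver queries, using a Python
mock. FakeScreen, the query names, and the convention that a call without an
index returns the query count are assumptions made for illustration; they are
not taken from this commit.

```python
class FakeScreen:
    """Hypothetical stand-in for a pipe_screen that exposes driver queries."""
    _queries = ["draw-calls", "requested-VRAM", "requested-GTT"]

    def get_driver_query_info(self, index=None):
        # Assumption: asking without an index returns the number of queries.
        if index is None:
            return len(self._queries)
        return {"name": self._queries[index], "query_type": index}


def list_driver_queries(screen):
    """Collect the names of every query the screen advertises."""
    count = screen.get_driver_query_info()
    return [screen.get_driver_query_info(i)["name"] for i in range(count)]
```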

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-26 01:28:19 +01:00
Marek Olšák
9cec5edea7 gallium/tgsi: fix valgrind warning
"Conditional jump or move depends on uninitialised value(s)"

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-26 01:28:19 +01:00
Marek Olšák
17003b44b7 st/mesa: fix crash with blit-based GetTexImage
https://bugs.freedesktop.org/show_bug.cgi?id=62573

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-03-26 01:28:19 +01:00
Marek Olšák
d1b91e309b cso: add constant buffer save/restore feature for postprocessing
Postprocessing is an internal meta op and should restore the states
it changes.
2013-03-26 01:28:18 +01:00
Marek Olšák
35c522dce4 radeonsi: fix crash while binding a NULL constant buffer
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-03-26 01:28:18 +01:00
Marek Olšák
a2378daf83 r600g: fix crash while binding a NULL constant buffer 2013-03-26 01:28:18 +01:00
Marek Olšák
53228fe2a8 r300g: fix crash while binding a NULL constant buffer 2013-03-26 01:28:18 +01:00
Martin Andersson
92855bcc95 r600g: Use virtual address for PIPE_QUERY_SO* in r600_emit_query_end
Virtual address is used for PIPE_QUERY_SO* queries in
r600_emit_query_begin, but not in r600_emit_query_end.

This will trigger a GPU fault when one of those queries is
made and virtual address is enabled.

Note: this is a candidate for the 9.1 branch

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-25 18:18:23 -04:00
Rob Clark
634fb837ef freedreno: use u_debug for debug env vars
Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-03-25 15:05:44 -04:00
Jordan Justen
e207c33020 glsl ir: add as_dereference_record
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-25 11:35:56 -07:00
Brian Paul
eb92f89587 gallium: undef PACKAGE_* macros to silence warnings
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-25 12:24:11 -06:00
Brian Paul
c0f16df938 gallivm: init vars to silence warnings
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-25 12:24:11 -06:00
Brian Paul
35aefe9226 swrast: init vars to silence warnings
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-25 12:24:11 -06:00
Rob Clark
980f1cf8a1 freedreno: prefer sw upload for textures
Since we are UMA, in most cases the GPU blit doesn't make much sense for
texture upload.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-03-25 13:05:44 -04:00
Rob Clark
732b0b5ebc freedreno: track maximal scissor bounds
Optimize out parts of the render target that are scissored out by taking
into account maximal scissor bounds in fd_gmem_render_tiles().
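
The idea can be sketched in Python as follows. This is not the freedreno
implementation: the rectangle layout (x0, y0, x1, y1), the square tiles, and
both function names are assumptions chosen to keep the example short.

```python
def union_rect(a, b):
    """Smallest rectangle (x0, y0, x1, y1) covering both a and b."""
    if a is None:
        return b
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def tiles_to_render(width, height, tile, scissors):
    """Return origins of the tiles that overlap the maximal scissor bounds."""
    bounds = None
    for s in scissors:
        bounds = union_rect(bounds, s)
    if bounds is None:
        return []
    out = []
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            overlaps = (tx < bounds[2] and tx + tile > bounds[0] and
                        ty < bounds[3] and ty + tile > bounds[1])
            if overlaps:
                out.append((tx, ty))
    return out
```

With a 64x64 target and 32x32 tiles, a single small scissor in the top-left
corner leaves only one tile to render instead of four.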

This is a big win on things like gnome-shell which frequently do partial
screen updates.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-03-25 13:05:44 -04:00
Adrian Marius Negreanu
8a4750fe5e android: fix Android.mk bug in mesa/drivers/dri/common
Target-specific variables are undefined when used as prerequisites.
Instead, use secondary expansion.

I noticed this when building the patch:
     i965: Add a driconf option to disable flush throttling

Signed-off-by: Adrian Marius Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-25 09:52:19 -07:00
Eric Anholt
712bac1f41 mesa: Disable validate_ir_tree() on release builds.
Since half of ir_validate uses assert() (the other half uses printf()
followed by abort()), there's not much use to calling it in a release build.  Cuts
6.3% of the startup time of TF2.

NOTE: This is a candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-25 08:50:38 -07:00
Roland Scheidegger
92b8a37fdf gallivm: move code for dealing with rgb9e5 and r11g11b10 formats to own file
This is really not generic conversion stuff, and the code is very particular to
these formats.
2013-03-24 22:54:45 +01:00
Vinson Lee
7d0c1f2437 llvmpipe: Fix assertions with assignment instead of comparison.
Fixes assign instead of compare defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-03-24 14:49:22 -07:00
Paul Berry
a593a1b276 i965: Shrink brw_vue_map struct.
This patch changes the arrays in brw_vue_map (which only ever contain
values from -1 to 58) from ints to signed chars.  This reduces the
size of the struct from 488 bytes to 136 bytes.
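
The arithmetic can be checked with a small ctypes sketch. The field layout
below is an assumption (one 64-bit bitfield, two arrays of 59 entries for the
values 0 to 58, and one int); it is not copied from the driver, and the sizes
depend on the platform aligning 64-bit integers to 8 bytes.

```python
import ctypes

SLOT_COUNT = 59  # assumption: one array entry per possible value 0..58


def make_vue_map(elem_type):
    """Build a struct type whose two arrays use the given element type."""
    class VueMap(ctypes.Structure):
        _fields_ = [
            ("slots_valid", ctypes.c_uint64),
            ("varying_to_slot", elem_type * SLOT_COUNT),
            ("slot_to_varying", elem_type * SLOT_COUNT),
            ("num_slots", ctypes.c_int),
        ]
    return VueMap


size_with_ints = ctypes.sizeof(make_vue_map(ctypes.c_int))
size_with_chars = ctypes.sizeof(make_vue_map(ctypes.c_byte))
```

Under those assumptions the int version is 8 + 2*236 + 4 = 484 bytes, padded
to 488, and the signed char version is 8 + 2*59 = 126, padded to 128, plus 4,
padded to 136.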

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v2: fix STATIC_ASSERT to use 127 instead of 128.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-24 10:55:28 -07:00
Paul Berry
0a0deb92d9 i965/fs: Rename vp_outputs_written to input_slots_valid.
With the introduction of geometry shaders, fragment inputs will no
longer come exclusively from the vertex shader; sometimes they come
from the geometry shader.  So the name "vp_outputs_written" will
become a misnomer.  This patch renames vp_outputs_written to
input_slots_valid, to reflect the true meaning of the bitfield from
the fragment shader's point of view: it indicates which of the
possible input slots contain valid data that was written by the
previous shader stage.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 10:55:28 -07:00
Paul Berry
bf9bfe838e i965: Use brw.vue_map_geom_out instead of VS output VUE map where appropriate.
This patch modifies post-GS pipeline stages (transform feedback, clip,
sf, fs) to refer to the VUE map through brw->vue_map_geom_out rather
than brw->vs.prog_data->vue_map.  This ensures that when geometry
shader support is added, these pipeline stages will consult the
geometry shader output VUE map when appropriate, rather than the
vertex shader output VUE map.

v2: Fixed some stale "CACHE_NEW_VS_PROG" comments.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 10:55:27 -07:00
Paul Berry
463ef47b16 i965: Store the geometry output VUE map in brw_context.
Currently, the GPU pipeline has one active VUE map in effect at any
given time--the one representing the layout of vertex data coming from
the vertex shader.  However, when geometry shaders are added, they
will have their own independent VUE map.  Later pipeline stages (clip,
sf, fs) will need to consult the geometry shader VUE map if a geometry
shader is in use, and the vertex shader VUE map otherwise.

This patch adds a new field to brw_context, vue_map_geom_out, which
contains the VUE map that should be used by later pipeline stages.  It
also adds a new state flag, BRW_NEW_VUE_MAP_GEOM_OUT, which is
signalled whenever the contents of the VUE map change.

Since we don't support geometry shaders yet, vue_map_geom_out is
currently set only by the brw_vs_prog state atom.

v2: Don't set vue_map_geom_out in do_vs_prog--that's redundant and
possibly problematic for precompiles.  Only set it in
brw_upload_vs_prog.  Also, make a copy instead of using a
pointer--this makes it possible to detect when the VUE map hasn't
changed, so we can avoid redundant state uploads.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 10:55:27 -07:00
Paul Berry
8fbc22e880 i965: Move brw_vs_prog_data::outputs_written into VUE map.
Future patches will allow for there to be separate VUE maps when both
a geometry shader and a vertex shader are in use.  When this happens,
we will want to have correspondingly separate outputs_written
bitfields.  Moving outputs_written into the VUE map will make this
easy.

For consistency with the terminology used in the VUE map, the bitfield
is renamed to "slots_valid" in the process.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 10:55:27 -07:00
Paul Berry
76ba30800d i965/gen7: Use WE_all mode when enabling channel masks for URB write.
Gen7 adds mask bits to the message header for a URB write which allow
the write to apply only to certain channels.  We don't use this
functionality, so to ensure that the entire write always occurs, we
emit an OR instruction to set the mask bits.

With the advent of geometry shaders, URB writes won't just happen at
the end of a thread; they will happen in mid-thread too.  Thus, we can
no longer rely on channel 0 being enabled, so we need to emit the OR
instruction in WE_all mode to ensure that it is executed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 10:55:27 -07:00
Paul Berry
8371c68a4b i965: Rename BRW_VARYING_SLOT_MAX -> BRW_VARYING_SLOT_COUNT.
The new name clarifies that it represents *one more* than the maximum
possible brw_varying_slot value.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 10:55:27 -07:00
Paul Berry
ec9c3882d9 i965: Clarify nomenclature: vert_result -> varying
This patch removes the terminology "vert_result" from the i965 driver,
replacing it with "varying".  The old terminology, "vert_result", was
confusing because (a) it referred to the enum gl_vert_result, which no
longer exists (it was replaced with gl_varying_slot), and (b) it
implied a vertex output, but with the advent of geometry shaders, it
could be either a vertex or a geometry output, depending on what shaders
are in use.  The generic term "varying" is less confusing.

No functional change.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v2: Whitespace fixes.
2013-03-23 22:47:54 -07:00
Chris Forbes
f56fb9d248 i965: bump MAX_DEPTH_TEXTURE_SAMPLES to 4/8
Bump MAX_DEPTH_TEXTURE_SAMPLES to match what GetInternalformativ is
claiming. Since that limit is what is actually enforced now, this
doesn't actually change anything except the queried value.

There's still no piglits verifying that multisample depth textures work,
but this works in the Unigine demos.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 16:38:18 +13:00
Chris Forbes
2405da174e mesa: use _mesa_check_sample_count() for multisample textures
Extends _mesa_check_sample_count() to properly support the
TEXTURE_2D_MULTISAMPLE and TEXTURE_2D_MULTISAMPLE_ARRAY targets, which
have subtly different limits than renderbuffers.

This resolves the remaining TODO in the implementation of
TexImage*DMultisample.

V2: - Don't introduce spurious block.
    - Do this in multisample.c instead.
    - Fix typo in error message.
    - Inline spec quotes

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 16:38:18 +13:00
Chris Forbes
90b5a2425a mesa: helper for checking renderbuffer sample count
Pulls the checking of the sample count into a helper function, and
extends the existing logic to include the interactions with both
ARB_texture_multisample and ARB_internalformat_query.

_mesa_check_sample_count() checks a desired sample count against a
combination of target/internalformat, and returns the error enum
to be produced, if any. Unfortunately the conditions are messy and the
errors vary.

V2: - Tidy up spurious block.
    - Move _mesa_check_sample_count() to multisample.c instead; It
      doesn't really belong in fbobject.c or teximage.c.
    - Inlined spec quotes

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 16:38:18 +13:00
Chris Forbes
86b8380600 mesa: allow internalformat_query with multisample texture targets
Now that we support ARB_texture_multisample, there are multiple targets
accepted for this query, and they may have target-dependent limits, so
pass the target to the driverfunc.

For example, the sampling hardware may not be able to do general
texelFetch() for some format/sample count combination, but the driver
may still be able to implement a reasonable resolve operation, so it can
be supported for renderbuffers.

V2: - Don't break Gallium compile.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 16:38:18 +13:00
Dmitry Cherkassov
3cc2629b3b clover: add dynamic_cast results checking down in clSetKernelArgument() code path.
Signed-off-by: Dmitry Cherkassov <dcherkassov@gmail.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
2013-03-24 02:43:34 +01:00
Roland Scheidegger
b50e362dbb gallivm: Add code for rgb9e5 shared exponent format to float conversion
And use this (and the code for r11g11b10 packed float to float conversion)
in the soa texturing code (the generated code looks quite good).
This is probably an order of magnitude faster than using the fallback
(not measured).
Tested with piglit texwrap GL_EXT_packed_float and
GL_EXT_texture_shared_exponent respectively (didn't find much else using
it).
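
For reference, a scalar Python sketch of the rgb9e5 decode. It follows the
shared-exponent layout as I understand it (three 9-bit mantissas in the low
bits, a 5-bit exponent in the top bits, bias 15); the function name is
hypothetical and this is not the vectorized code gallivm generates.

```python
RGB9E5_EXP_BIAS = 15
RGB9E5_MANTISSA_BITS = 9


def rgb9e5_to_float3(packed):
    """Decode a 32-bit shared-exponent value into (r, g, b) floats."""
    r_m = packed & 0x1FF
    g_m = (packed >> 9) & 0x1FF
    b_m = (packed >> 18) & 0x1FF
    exp = (packed >> 27) & 0x1F
    # All three channels share one scale factor derived from the exponent.
    scale = 2.0 ** (exp - RGB9E5_EXP_BIAS - RGB9E5_MANTISSA_BITS)
    return (r_m * scale, g_m * scale, b_m * scale)
```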

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-24 02:09:02 +01:00
Marek Olšák
3e10ab6b22 gallium,st/mesa: don't use blit-based transfers with software rasterizers
The blit-based paths for TexImage, GetTexImage, and ReadPixels aren't very
fast with software rasterizers. Now Gallium drivers have the ability to turn
them off.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-23 13:19:16 +01:00
Marek Olšák
25e3094058 st/mesa: implement blit-based ReadPixels
Initial version contributed by: Martin Andersson <g02maran@gmail.com>

This is only used if the memcpy path cannot be used and if no transfer ops
are needed. It's pretty similar to our TexImage and GetTexImage
implementations.

The motivation behind this is to be able to use ReadPixels every frame and
still have at least 20 fps (or 60 fps with a powerful GPU and CPU)
instead of 0.5 fps.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-23 13:17:05 +01:00
Marek Olšák
d702c67ba5 mesa: add common format-independent memcpy-based ReadPixels path
I'll need the _mesa_readpixels_needs_slow_path function for the blit-based
version, but it's also useful to have this memcpy-based path in one place
and not scattered across several functions.

v2: add "const" to function parameters

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-23 13:17:05 +01:00
Marek Olšák
f8855a4214 mesa: add helper func for checking combined depthstencil buffers from st/mesa
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-23 13:17:05 +01:00
Marek Olšák
2dc2066b90 mesa: add a common function returning transfer ops for ReadPixels
I'll need both new functions for later. For now, it consolidates the code
for determining what the transfer ops should be and makes it a little bit
smarter.

v2: added "const"

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-23 13:17:05 +01:00
Marek Olšák
b2a4573c14 mesa: handle HALF_FLOAT like FLOAT in get_tex_rgba
NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-23 13:17:05 +01:00
Roland Scheidegger
b101a094b5 llvmpipe: add EXT_packed_float render target format support
New conversion code to handle conversion from/to r11g11b10 AoS to/from
SoA floats, and also add code for conversion from rgb9e5 AoS to float SoA
(which works pretty much the same as r11g11b10 except for the packing).
(This code should also be used for texture sampling instead of
relying on u_format conversion but it's not yet, so rgb9e5 is unused.)
Unfortunately a crazy amount of hacks is necessary to get the conversion
code running in llvmpipe's generate_unswizzled_blend, which isn't well
suited for formats where the storage representation has nothing to do
with what's needed for blending (moreover, the conversion will convert
from packed AoS values, which is the storage format, to float SoA values,
because this is much more natural for the conversion, and likewise from
SoA values to packed AoS values - but the "blend" (which includes
trivial things like partial mask) works on AoS values, so incoming fs
values will go SoA->AoS, values from destination will go packed
AoS->SoA->AoS, then do blend, then AoS->SoA->packed AoS which probably
isn't the most efficient way though the shuffles are probably bearable).
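
As a reference point for the packed AoS to float direction, here is a scalar
Python sketch of an r11g11b10 decode. It assumes the usual packed-float
layout (red in bits 0-10, green in bits 11-21, blue in bits 22-31, unsigned,
5-bit exponent with bias 15); both function names are hypothetical and none
of llvmpipe's SoA/AoS shuffling is modelled.

```python
def small_float_to_float(bits, mantissa_bits):
    """Decode an unsigned float with a 5-bit exponent (bias 15)."""
    mant = bits & ((1 << mantissa_bits) - 1)
    exp = (bits >> mantissa_bits) & 0x1F
    if exp == 0:      # zero or denormal
        return mant * 2.0 ** (-14 - mantissa_bits)
    if exp == 31:     # Inf or NaN
        return float("inf") if mant == 0 else float("nan")
    return (1.0 + mant / (1 << mantissa_bits)) * 2.0 ** (exp - 15)


def r11g11b10_to_float3(packed):
    """Decode a 32-bit r11g11b10 value into (r, g, b) floats."""
    r = small_float_to_float(packed & 0x7FF, 6)
    g = small_float_to_float((packed >> 11) & 0x7FF, 6)
    b = small_float_to_float((packed >> 22) & 0x3FF, 5)
    return (r, g, b)
```

The exp == 0 and exp == 31 branches are where the Inf/NaN and denormal
handling lives, which is the part the commit message flags as still needing
verification.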

Passes piglit fbo-blending-formats (with GL_EXT_packed_float parameter),
still need to verify Inf/NaNs (where most of the complexity in the
conversion comes from actually).

v2: drop the (very bogus) rgb9e5 part, and do component extraction
in the helper code for r11g11b10 to float conversion, making the code
slightly more compact (suggested by Jose), now that there are no other
callers left this works quite well. (Could do the same for the
opposite way but it's less than ideal there, final part of packing
needs to be done in caller anyway and there'd be another conditional.)

v3: minor style and comment fixes. Also fix a potential issue with
negative zero being potentially returned by max(src, zero) as we
don't have well-defined min/max behavior (fortunately no additional cost).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-22 20:10:53 +01:00
Michel Dänzer
31009b4521 r600g: Honour legacy debugging environment variables
This helps minimize confusion / effort when moving between branches or
helping others.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-03-22 10:29:49 +01:00
Matt Turner
81e585fabe docs: Mark ARB_ES3_compatibility as done. 2013-03-21 15:59:21 -07:00
Rob Clark
eab8d6cbdb freedreno: add pipe->blit
Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-03-21 17:33:51 -04:00
Paul Berry
eea30dff43 i965: Add a driconf option to disable flush throttling.
Normally when submitting the first batch buffer after a flush, we
check whether the GPU has completed processing of the first batch
buffer of the previous frame.  If it hasn't, we wait for it to finish
before submitting any more batches.  This prevents GPU-heavy and
CPU-light applications from racing too far ahead of the current frame,
but at the expense of possibly lower frame rates.  Sometimes when
benchmarking we want to disable this mechanism.

This patch adds the driconf option "disable_throttling" to disable the
throttling mechanism.
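
The mechanism can be modelled roughly as below. This is a hypothetical Python
model, not i965 code: the class, the dict used as a fence, and the waits
counter are invented for illustration, and only the option name
disable_throttling comes from this patch.

```python
class Throttle:
    """Hypothetical model of per-frame flush throttling."""

    def __init__(self, disable_throttling=False):
        self.disable_throttling = disable_throttling
        self.prev_frame_first_batch = None   # fence-like dict
        self.this_frame_first_batch = None
        self.waits = 0

    def flush(self):
        # A flush ends the frame: remember its first batch for the next one.
        self.prev_frame_first_batch = self.this_frame_first_batch
        self.this_frame_first_batch = None

    def submit(self, batch):
        if self.this_frame_first_batch is None:
            prev = self.prev_frame_first_batch
            if (prev is not None and not prev["done"]
                    and not self.disable_throttling):
                self.waits += 1
                prev["done"] = True   # stands in for blocking on the GPU
            self.this_frame_first_batch = batch
```

With disable_throttling set, the same sequence of submits and flushes never
waits, which is the behaviour wanted when benchmarking.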

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-21 13:24:43 -07:00
Matt Turner
12dc4be8a6 mesa: Implement TEXTURE_IMMUTABLE_LEVELS for ES 3.0.
NOTE: This is a candidate for the 9.1 branch.
Fixes piglit's texture-immutable-levels test.
Reported-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-21 11:04:41 -07:00
Adam Jackson
38aa8ec937 glx: Build with VISIBILITY_CFLAGS in automake
Note: This is a candidate for the stable branches.

Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-03-21 13:21:18 -04:00
Brian Paul
3804d67723 scons: check for existence of 'MSVC_VERSION' in env
Evidently, MSVC_VERSION isn't always defined so check for it before
checking the MSVC version.
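
A minimal sketch of the guard, using a plain dict in place of the SCons
environment. The helper name is hypothetical, and it assumes the version is a
plain numeric string such as '10.0'.

```python
def msvc_version_at_least(env, minimum):
    """Return True only when MSVC_VERSION is present and >= minimum."""
    if 'MSVC_VERSION' not in env:
        return False   # key missing: skip the version comparison entirely
    return float(env['MSVC_VERSION']) >= minimum
```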

Suggested by Jose.
2013-03-21 09:24:40 -06:00
Brian Paul
10393038f8 softpipe: silence some asst. MSVC type warnings in sp_tex_sample.c 2013-03-21 09:24:35 -06:00
Brian Paul
b2d3f364db softpipe: silence some MSVC signed/unsigned warnings 2013-03-21 09:24:35 -06:00
Brian Paul
2e3200d463 softpipe: silence some MSVC float/double warnings 2013-03-21 09:24:35 -06:00
Brian Paul
f7b07fd25c rbug: silence some MSVC signed/unsigned warnings 2013-03-21 09:24:35 -06:00
Brian Paul
bfc8b8fac5 postprocess: silence some MSVC float/int warnings 2013-03-21 09:24:35 -06:00
Brian Paul
8bd5692a5d meta: fix incorrect slice, r coordinate computation
The arithmetic to convert a 3D texture slice to an R coordinate was
incorrect.  Found when MSVC warned of a divide by zero.

Note that we don't actually ever hit this path.  We don't decompress
slices of 3D textures and we don't support 3D mipmap generation yet.
2013-03-21 09:24:35 -06:00
Brian Paul
a940c93aac vega: fix MSVC warning about missing return statement 2013-03-21 09:24:35 -06:00
Brian Paul
52edca9df9 meta: minor indentation fix 2013-03-21 08:28:26 -06:00
Michel Dänzer
032e5548b3 radeonsi: Emit pixel shader state even when only the vertex shader changed
Fixes random failures with piglit glsl-max-varyings.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Christian König <christian.koenig@amd.com>
2013-03-21 15:12:31 +01:00
Chad Versace
e34fe8bd20 android: Define PACKAGE_VERSION/BUGREPORT in CFLAGS
This fixes the Android build. Commit 439c3d4 broke it.

CC: Adrian M Negreanu <adrian.m.negreanu@intel.com>
CC: Matt Turner <mattst@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-20 15:11:41 -07:00
Kenneth Graunke
d24819dce8 i965/vs: Add IR dumping for immediates.
This makes dump_instructions more useful.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-20 10:40:44 -07:00
Kenneth Graunke
095c3755ee glsl: Add built-in functions for GLSL 1.50.
This makes basic built-in functions work in GLSL 1.50.  It supports
everything except the new Geometry Shader functions.

The new 150.glsl file is 140.glsl plus ARB_texture_multisample.glsl;
150.frag is identical to 140.frag except for the #version bump.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-03-20 10:38:40 -07:00
Kenneth Graunke
bcdda04349 glsl: Add sampler2DMS/sampler2DMSArray types to GLSL 1.50.
GLSL 1.50 includes support for the new sampler types introduced by
the ARB_texture_multisample extension.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-03-20 10:38:38 -07:00
Kenneth Graunke
f1ca2ed538 glsl: Bump standalone compiler versions to 1.50.
The version bumps are necessary in order to compile built-ins for 1.50.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-03-20 10:38:20 -07:00
Kenneth Graunke
d86efc075e i965: Don't use texture swizzling to force alpha to 1.0 if unnecessary.
Commit 33599433c7 began setting the texture swizzle mode to XYZ1 for
RED, RG, and RGB textures in order to force alpha to 1.0 in case we
actually stored the texture as RGBA.

This had an unforeseen performance implication: the shader precompile
assumes that the texture swizzle mode will be XYZW for non-shadow
sampler types.  By setting it to XYZ1, this means every shader used with
a RED, RG, or RGB texture has to be recompiled.  This is a very common
case.

Unfortunately, there's no way to improve the precompile, since RGBA
textures still need XYZW, and there's no way to know by looking at
the shader source what texture formats might be used.

However, we only need to smash alpha to 1.0 if the texture's memory
format actually has alpha bits.  If not, the sampler already returns 1.0
for us without any special swizzling.  XRGB8888, for example, is a very
common case where this occurs.
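
The decision described above can be sketched as follows. The function name,
the string format names, and the "ONE"/"W" return values are hypothetical
stand-ins; the real driver works on its own format and swizzle enums.

```python
def alpha_swizzle(base_format, alpha_bits):
    """Pick the alpha channel source for a sampled texture."""
    wants_opaque = base_format in ("RED", "RG", "RGB")
    if wants_opaque and alpha_bits > 0:
        return "ONE"   # XYZ1: override the alpha actually stored in memory
    return "W"         # XYZW: matches what the precompile assumed
```

An RGB texture stored without alpha bits now keeps XYZW, so the precompiled
shader can be reused.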

This partially fixes a performance regression since commit 33599433c7.
More work is required to fully fix it in all cases.  This at least helps
Warsow.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Carl Worth <cworth@cworth.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-20 10:37:34 -07:00
Kenneth Graunke
2dd22130cd i965: Don't print a fatal-looking message if intelCreateContext fails.
With the old context creation mechanism, an application asked the GL to
give it a context.  Failing to produce a context was a fatal error.

Now, with GLX_ARB_create_context, the application can request a specific
version.  If it's higher than the maximum version we support, context
creation will fail.  But this is a normal error that applications
recover from.

In particular, the new glxinfo tries to create OpenGL 4.3, 4.2, 4.1,
4.0, 3.3, and 3.2 contexts before finally succeeding at creating a 3.1
context.  This led to it printing the following message 6 times:
"brwCreateContext: failed to init intel context"

There's no need to alarm users (and developers) with such a message.
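
A sketch of that probing pattern, in Python. create_best_context and
GLXINFO_ORDER are hypothetical names; the version list is the one quoted
above, and a driver limit of 3.1 is assumed.

```python
def create_best_context(max_supported, candidates):
    """Try versions in order; return (version, number of failed attempts)."""
    failures = 0
    for version in candidates:
        if version <= max_supported:
            return version, failures
        failures += 1   # a normal, recoverable failure; nothing to log loudly
    return None, failures


GLXINFO_ORDER = [(4, 3), (4, 2), (4, 1), (4, 0), (3, 3), (3, 2), (3, 1)]
```

With a 3.1 limit this fails six times before succeeding, once per message
that used to be printed.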

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-20 10:37:34 -07:00
Eric Anholt
1f112ccf02 i965/gen7: Align all depth miplevels to 8 in the X direction.
On an INTEL_DEBUG=perf piglit run on IVB, reduces the instances of "HW
workaround: blit" (the printouts from the misaligned-depth workaround
blits) from 725 to 675.

It doesn't totally eliminate the workaround blit, because we still have
problems with Y offsets that we can't fix (since texturing can only align
miplevels up to 2 or 4, not 8).
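
The rounding itself is the usual power-of-two align-up, sketched here in
Python with a hypothetical helper name.

```python
def align(value, alignment):
    """Round value up to the next multiple of a power-of-two alignment."""
    return (value + alignment - 1) & ~(alignment - 1)
```

For example, an X offset of 13 becomes 16 when aligned to 8, while a Y offset
can only be aligned to 2 or 4, so a Y offset of 5 aligned to 4 gives 8, which
is still not guaranteed to be a multiple of 8 in general.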

No regressions on piglit/es3conform on IVB.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-20 10:18:44 -07:00
Christoph Bumiller
529dbbfcf7 nvc0: fix max varying count, move CLIPVERTEX,FOG out of the way
The card spews an error if I use all 128 generic slots.
Apparently the real limit isn't just dictated by the address space
layout.
2013-03-20 12:25:21 +01:00
Christoph Bumiller
8acaf862df gallium: add TGSI_SEMANTIC_TEXCOORD,PCOORD v3
This makes it possible to identify gl_TexCoord and gl_PointCoord
for drivers where sprite coordinate replacement is restricted.

The new PIPE_CAP_TGSI_TEXCOORD decides whether these varyings
should be hidden behind the GENERIC semantic or not.

With this patch only nvc0 and nv30 will request that they be used.

v2: introduce a CAP so other drivers don't have to bother with
the new semantic

v3: adapt to introduction of gl_varying_slot enum
2013-03-20 12:25:21 +01:00
Ian Romanick
3eaf823b90 docs: import release notes for 9.1.1, add news item
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-19 17:46:30 -07:00
Kristian Høgsberg
939789e48d gallium-egl: Fix compile errors introduced in de315f76a
The commit changed API in a helper library shared by both egl_dri2 and
the gallium egl state tracker, but only egl_dri2 was updated to use the
new interface.

Tested-by: Giulio Camuffo <giuliocamuffo@gmail.com>
2013-03-19 20:17:47 -04:00
Paul Berry
995bbc2256 i965/fs: Avoid unnecessary recompiles due to POS bit of proj_attrib_mask.
Previous to this patch, when using fixed function fragment shading,
bit VARYING_BIT_POS of brw_wm_prog_key::proj_attrib_mask was being set
differently during precompiles and normal usage.  During precompiles
it was being set only if the fragment shader reads from window
position (which it never does), so it was always being set to 0.
During normal usage it was being set if the vertex shader writes to
all 4 components of gl_Position (which it usually does), so it was
usually being set to 1.  As a result, we were almost always doing an
extra recompile for the fixed function fragment shader.

The recompile was totally unnecessary, though, because
brw_wm_prog_key::proj_attrib_mask is only consulted for
fs_visitor::emit_general_interpolation(), which isn't used for
VARYING_SLOT_POS.

This patch avoids the unnecessary recompile by always setting bit
VARYING_BIT_POS of brw_wm_prog_key::proj_attrib_mask to 1.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-19 16:56:58 -07:00
Paul Berry
db81d3b8f7 ff_fragment_shader: Don't do unnecessary (and dangerous) uniform setup.
Previously, right after calling _mesa_glsl_link_shader(), the fixed
function fragment shader code made several calls with the ostensible
purpose of setting up uniforms for the fragment shader it just
created.

These calls are unnecessary, since _mesa_glsl_link_shader() calls
driver->LinkShader(), which takes care of calling these functions (or
their equivalent).  Also, they are dangerous to call after
_mesa_glsl_link_shader() has returned, because on back-ends such as
i965 which do precompilation, _mesa_glsl_link_shader() may have
already cached pointers to the existing uniform structures; attempting
to set up the uniforms again invalidates those cached pointers.

It was only by sheer coincidence that this wasn't manifesting itself
as a bug.  It turns out that i965's precompile mechanism was always
setting bit 0 of brw_wm_prog_key::proj_attrib_mask to 0 for fixed
function fragment shaders, but during normal usage this bit usually
gets set to 1.  As a result, the precompiled shader (with its invalid
uniform pointers) was not being used.

I'm about to introduce some changes that cause bit 0 of
proj_attrib_mask to be set consistently between precompilation and
normal usage, so to avoid regressions I need to get rid of the
dangerous duplicate uniform setup code first.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-19 16:56:56 -07:00
Paul Berry
0af56c9d53 i965: Avoid unnecessary copy when depthstencil workaround invoked by clear.
Since apps typically begin rendering with a call to glClear(), it is
likely that when brw_workaround_depthstencil_alignment() moves a
miplevel to a temporary buffer, it can avoid doing a blit, since the
contents of the miplevel are about to be erased.

This patch adds the necessary plumbing to determine when
brw_workaround_depthstencil_alignment() is being called as a
consequence of glClear(), and avoids the unnecessary blit when it is
safe to do so.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v2: Eliminate unnecessary call to _mesa_is_depthstencil_format().  Fix
handling of depth buffer in depth/stencil format.

v3: Use correct bitfields for clear_mask.  Fix handling of depth
buffer in depth/stencil format when hardware uses separate stencil.
When invalidating, make sure we still reassociate the image to the new
miptree.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-19 16:56:51 -07:00
Alex Deucher
49c1fc7044 r600g: don't emit SQ_DYN_GPR_RESOURCE_LIMIT_1 on cayman
Doesn't exist on the asic and will cause a CS rejection
if VM is disabled.

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-19 18:13:27 -04:00
Alex Deucher
a9914117ea r600g: emit DB_SRESULTS_COMPARE_STATE0 on r6xx/r7xx
Not using HiS yet, but matches what we do on evergreen+.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-19 18:13:26 -04:00
Brian Paul
c45d22e26a winsys/svga: improve error/debug message output
Use vmw_printf() just for extra debugging info (off by default).
Use vmw_error() for real errors/failures/etc that we definitely
want to report.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-19 15:18:38 -06:00
Brian Paul
460a4444e8 tgsi: fix uninitialized declaration array fields
Fixes a few regressions since the TGSI array changes.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-19 15:15:37 -06:00
Kristian Høgsberg
1670737436 egl_dri2: Lower __DRI_IMAGE version requirement back to 1
We check the extension version manually instead and verify that we have
the createImageFromFds function before enabling prime fd passing.
2013-03-19 16:13:38 -04:00
Maarten Lankhorst
7c3d8301af radeon/llvm: Do not link against libgallium when building statically.
NOTE: This is a candidate for the 9.1 branch.

Tested-by: Vincent Lejeune <vljn@ovi.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-03-19 20:20:33 +01:00
Matt Turner
322c840bea gles2: Add an ABI-check test
Checks that no functions are exported that are not part of the ABI.

Note that currently we are exporting functions that are aliased to
functions that are part of the ABI. They shouldn't be exported, but the
XML descriptions don't adequately describe this case.
2013-03-19 12:04:32 -07:00
Matt Turner
569bd281c1 gles1: Add an ABI-check test
Checks that no functions are exported that are not part of the ABI.

Note that currently we are exporting functions that are aliased to
functions that are part of the ABI. They shouldn't be exported, but the
XML descriptions don't adequately describe this case.
2013-03-19 12:04:31 -07:00
Andreas Boll
182895c4e6 gallium/egl: fix out-of-tree build
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/15-fix-oot-build.diff;h=7040999a22d3937d0578cfd85ee2c71d7dc614bb;hb=refs/heads/ubuntu%2B1

NOTE: This is a candidate for the 9.1 branch.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-03-19 18:12:38 +01:00
Andreas Boll
92e6260c19 osmesa: fix out-of-tree build
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/14-fix-osmesa-build.diff;h=00581d0e1833c5492d9050e1bf3d5e658cad782e;hb=refs/heads/ubuntu%2B1

v2: Move the added line immediately after -I$(top_srcdir)/src/mapi

NOTE: This is a candidate for the 9.1 and 9.0 branches.

Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-03-19 18:12:38 +01:00
Andreas Boll
06fff296e9 build: Enable x86 assembler on Hurd.
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/10-hurd-configure-tweaks.diff;h=984e17df1b8afdf8e4b36bee96aa5ab6a5691021;hb=refs/heads/ubuntu%2B1

Thanks to Pino Toscano.

v2: Don't bother with x86_64. AFAICT GNU/Hurd doesn't support it so far.

NOTE: This is a candidate for stable branches.

Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Acked-by: Matt Turner <mattst88@gmail.com>
2013-03-19 18:12:38 +01:00
Andreas Boll
7962f28c43 mesa: use ieee fp on s390 and m68k
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/02_use-ieee-fp-on-s390-and-m68k.patch;h=d3d6c1d7fec3c72ecf320706167deb61c52636c3;hb=refs/heads/ubuntu%2B1

Fixes Debian bug #349437.

Patch written by David Nusinow.

NOTE: This is a candidate for stable branches.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
2013-03-19 18:12:37 +01:00
Roland Scheidegger
5af7b45986 gallivm: fix return opcode handling in main function of a shader
If we're in some conditional or loop we must not return, or the code
after the condition is never executed.
(v2): And, we also can't just continue as nothing happened, since the
mask update code would later check if we actually have a mask, so we
need to remember that there was a return in main where we didn't exit
(to illustrate this, a ret in an if clause would cause a mask update
which is still ok as we're in a conditional, but after the endif the
mask update code would drop the mask hence bringing execution back to
pixels which should have their execution mask set to zero by the ret).
Thanks to Christoph Bumiller for figuring this out.

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=62357.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-19 18:04:05 +01:00
Rob Clark
afc1b7c21f freedreno: clear fixes
Some fixes for clearing only depth or only stencil.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-03-19 10:49:30 -04:00
Christian König
90862c8507 radeonsi: enable indirect addressing
Fixing 16 piglit tests.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-19 15:16:18 +01:00
Christian König
5e616cf2c5 radeonsi: implement indirect addressing of constants
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-19 15:16:18 +01:00
Christian König
f5298b0a65 radeonsi: switch to using resource descriptors for constants v2
v2: remove superfluous mask, use buffer_size instead of constant

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-19 15:16:18 +01:00
Christian König
c05483fc00 radeon/llvm: rework input fetch and output store
Cleanup the code and implement indirect addressing.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-19 15:16:18 +01:00
Brian Paul
b51f8593d8 tgsi: add initializer data to fix MSVC compile error 2013-03-19 07:55:48 -06:00
Christian König
897303f8ff tgsi: add ArrayID documentation v2
v2: further improve the text with comments from Christoph Bumiller.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:32 +01:00
Christian König
21190fbd56 tgsi: use separate structure for indirect address v2
To further improve the optimization of source and destination
indirect addressing we need the ability to store a reference
to the declaration of the addressed operands.

Since most of the fields in tgsi_src_register don't apply to
an indirect addressing operand, replace it with a separate
tgsi_ind_register structure and so make room for extra information.

v2: rename Declaration to ArrayID, put the ArrayID into () instead of []

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:32 +01:00
Christian König
16caeff2a5 tgsi: add ArrayID to declarations
Remember which declarations are declared as "arrays" and so
can be indirectly addressed. ArrayIDs start at 1 because, for
compatibility reasons, zero is treated as no array present.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:32 +01:00
Christian König
d3e07bed90 tgsi: remove TGSI_FILE_(IMMEDIATE|TEMP)_ARRAY
Nobody seems to be using it, and only nv50 had a partial implementation.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:32 +01:00
Christian König
affdff230b glsl_to_tgsi: remove indirect addressing limitations
They shouldn't be necessary any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:32 +01:00
Christian König
3f67251e3d glsl_to_tgsi: allocate arrays separately v2
Instead of allocating everything as temporaries, use the
new array allocation functions.

v2: fix bug in simplify_cmp, declare arrays on demand

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:32 +01:00
Christian König
433b2ca46b glsl_to_tgsi: use get_temp for all allocations
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:32 +01:00
Christian König
506d400275 tgsi/ureg: implement support for array temporaries
Don't bother with free temporaries, just allocate them at
the end and also emit them in their own declaration.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:32 +01:00
Christian König
52947b93b2 tgsi/ureg: cleanup local temporary emission v2
Instead of emitting each temporary separately, emit them in a chunk.

v2: keep separate function for emitting temps

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:31 +01:00
Andreas Boll
36320bfa54 radeon/llvm: Link against libgallium.la to fix an undefined symbol
Ported from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/119-libllvmradeon-link.patch;h=ee47f8a07dbf33c32f8b57faed923680ed6648fb;hb=refs/heads/ubuntu%2B1

Fixes a regression introduced with
f70c385351

NOTE: This is a candidate for the 9.1 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62434
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-03-19 12:07:51 +01:00
Kristian Høgsberg
de315f76a2 wayland: Add prime fd passing as a buffer sharing mechanism
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
2013-03-18 21:15:41 -04:00
Kristian Høgsberg
2356e28452 Add dri image entry point for creating image from fd
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
2013-03-18 21:03:54 -04:00
Kristian Høgsberg
664fe6dc84 wayland: allocate a __DRIimage for the color buffer
No functional change here, but this will let us query the image
for an fd handle later.

Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
2013-03-18 21:03:46 -04:00
Rob Clark
4e8f5c52bb DRI2: HACK: no GLX_INTEL_swap_event if no ScheduleSwap
If ddx does not support swap, don't advertise it.  This is a hack to
work around current xservers which advertise this extension even when it
is clearly not supported.  When:

http://lists.x.org/archives/xorg-devel/2013-February/035449.html

is merged into the upstream xserver and makes its way into most distros,
this hack can be removed.  In the meantime, it is required to allow
gnome-shell/clutter/etc to work properly with a DDX driver which does
not support ScheduleSwap.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-03-18 14:16:43 -04:00
Paul Berry
5a13e051d9 i965/blorp: Add INTEL_DEBUG=blorp flag.
This debug flag prints out the native GEN assembly for a blitting
shader produced using BLORP.  Hopefully this should be useful in
developing additional BLORP features.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-18 09:27:25 -07:00
Alex Deucher
2da8ee16a8 r600g: properly set non_disp tiling mode for DMA (v2)
Needs to be set for depth, stencil, and fmask just
like other blocks.

v2: drop additional cayman bits for now

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-17 13:32:48 -04:00
Alex Deucher
4409758a04 r600g: Use blitter rather than DMA for 128bpp on cayman (v3)
On cayman, 128bpp surfaces require non_disp ordering for hw
access to both linear and tiled surfaces.  When we use the 3D
engine we can set the non_disp ordering on both the tiled and
linear sides (via CB or texture), but when we use the DMA
engine, we can only set the non_disp ordering on the tiled
side, so after a L2T operation with the DMA engine, the data
ends up in the wrong order on the tiled side.

v2: cayman/TN only

v3: fix comments

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=60802

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-17 13:32:48 -04:00
Paul Berry
346a1b9bb9 i965: Simplify separate stencil check
The only format returned by _mesa_get_format_base_format() that
satisfies _mesa_is_depthstencil_format() is GL_DEPTH_STENCIL, so we
can simplify the check.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-16 10:15:51 -07:00
Maarten Lankhorst
f70c385351 gallium/build: Fix visibility CFLAGS in automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Fix formatting - use one CFLAG per line

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59238
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-03-16 12:45:22 +01:00
José Fonseca
49ae9b08d4 scons: Warn when using MSVS versions prior to 2012.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-15 19:55:54 +00:00
Paul Berry
c5d5827951 i965: Apply depthstencil alignment workaround when doing fast clears.
Fast depth clears have the same depth/stencil alignment requirements
as other drawing operations.  Therefore, we need to call
brw_workaround_depthstencil_alignment() from both the clear and
drawing paths.

Without this fix, we get image corruption if the following conditions
hold: (a) the first ever drawing operation to a depth miplevel (or the
first drawing operation after having used the texture for sampling) is
a clear, (b) the depth miplevel has a size that is eligible for fast
depth clears, and (c) the depth miplevel has an offset within the
miptree that isn't 8x8 aligned.

Fixes piglit "depthstencil-render-miplevels" tests with size 273.

NOTE: This is a candidate for stable branches

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-15 11:52:33 -07:00
Paul Berry
eed6baf762 Replace gl_frag_attrib enum with gl_varying_slot.
This patch makes the following search-and-replace changes:

gl_frag_attrib -> gl_varying_slot
FRAG_ATTRIB_* -> VARYING_SLOT_*
FRAG_BIT_* -> VARYING_BIT_*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:26:17 -07:00
Paul Berry
f117abe664 Get rid of _mesa_frag_attrib_to_vert_result().
Now that there is no difference between the enums that represent
vertex outputs and fragment inputs, there's no need for a conversion
function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:26:07 -07:00
Paul Berry
10a131211e Get rid of _mesa_vert_result_to_frag_attrib().
Now that there is no difference between the enums that represent
vertex outputs and fragment inputs, there's no need for a conversion
function.  But we still need to be able to detect when a given vertex
output has no corresponding fragment input.  So it is replaced by a
new function, _mesa_varying_slot_in_fs(), which tells whether the
given varying slot exists as an FS input or not.
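
As a rough illustration of the kind of predicate described above, here
is a minimal sketch.  The enum is a hypothetical, heavily reduced
stand-in for gl_varying_slot, and the choice of which slots are
excluded (point size and edge flag) is an assumption made for the
example, not a copy of the real function.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for gl_varying_slot (illustration only). */
enum varying_slot {
   SLOT_POS,
   SLOT_COL0,
   SLOT_PSIZ,   /* assumed: point size never reaches the fragment shader */
   SLOT_EDGE,   /* assumed: edge flag never reaches the fragment shader */
   SLOT_VAR0
};

/* Return true if the given slot can appear as a fragment shader input. */
static bool
varying_slot_in_fs(enum varying_slot slot)
{
   switch (slot) {
   case SLOT_PSIZ:
   case SLOT_EDGE:
      return false;
   default:
      return true;
   }
}
```

With a shared enum, no value translation is needed any more; callers
only ask whether a vertex output has a fragment-side counterpart.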

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:25:57 -07:00
Paul Berry
827c074fb1 mtypes.h: Modify gl_frag_attrib to refer to new gl_varying_slot enum.
This paves the way for eliminating the gl_frag_attrib enum entirely.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:25:46 -07:00
Paul Berry
a6d807c86f Replace gl_geom_result enum with gl_varying_slot.
This patch makes the following search-and-replace changes:

gl_geom_result -> gl_varying_slot
GEOM_RESULT_* -> VARYING_SLOT_*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:25:36 -07:00
Paul Berry
d453225efc mtypes.h: Modify gl_geom_result to refer to new gl_varying_slot enum.
This paves the way for eliminating the gl_geom_result enum entirely.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:25:26 -07:00
Paul Berry
d7c60a4a4f Replace gl_geom_attrib enum with gl_varying_slot.
This patch makes the following search-and-replace changes:

gl_geom_attrib -> gl_varying_slot
GEOM_ATTRIB_* -> VARYING_SLOT_*
GEOM_BIT_* -> VARYING_BIT_*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:25:15 -07:00
Paul Berry
094bcf399c mtypes.h: Modify gl_geom_attrib to refer to new gl_varying_slot enum.
This paves the way for eliminating the gl_geom_attrib enum entirely.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:25:05 -07:00
Paul Berry
36b252e947 Replace gl_vert_result enum with gl_varying_slot.
This patch makes the following search-and-replace changes:

gl_vert_result -> gl_varying_slot
VERT_RESULT_* -> VARYING_SLOT_*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:24:54 -07:00
Paul Berry
9e729a79b0 mtypes.h: Modify gl_vert_result to refer to new gl_varying_slot enum.
This paves the way for eliminating the gl_vert_result enum entirely.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:24:44 -07:00
Paul Berry
8a076c5f05 mtypes.h: Add new gl_varying_slot enum, and bitfield defines.
Future patches will make use of the enum.  It will eventually take the
place of the existing enums gl_vert_result, gl_geom_attrib,
gl_geom_result, and gl_frag_attrib, all of which represent essentially
the same information but using inconsistent values.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:24:34 -07:00
Paul Berry
6bec74bfd9 i965: Change fragment input related bitfields to 64-bit.
This patch updates the bitfields brw_context::wm.input_size_masks,
tracker::size_masks, and brw_wm_prog_key::proj_attrib_mask, all of
which are indexed by gl_frag_attrib, from 32-bit to 64-bit.

This paves the way for supporting geometry shaders, and for merging
the gl_frag_attrib and gl_vert_result enums.  The combination of these
two will require at least 55 bits in the bitfields.
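
To see why the width matters, consider a minimal sketch of building a
per-slot bit once slot numbers can exceed 31.  The helper names
slot_bit() and mask_has_slot() are hypothetical and only serve the
example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build the mask bit for a slot; the 64-bit constant keeps the shift
 * well defined for slot numbers up to 63. */
static inline uint64_t
slot_bit(unsigned slot)
{
   return UINT64_C(1) << slot;
}

/* Test whether a 64-bit mask contains the given slot. */
static inline bool
mask_has_slot(uint64_t mask, unsigned slot)
{
   return (mask & slot_bit(slot)) != 0;
}
```

Storing such a mask in a 32-bit field would silently drop every slot
numbered 32 or higher, which is the failure mode the wider bitfields
avoid.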

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:24:30 -07:00
Alex Deucher
03eef7f8ef r600g: add Richland APU pci ids
Note: this is a candidate for the stable branches.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-15 09:24:14 -04:00
Brian Paul
fec8733d4e st/dri: add support for the always_have_depth_buffer option
This involved adding another driOptionCache to dri_screen.  The
existing one just held the default values.  But now we also need
to have the values from the DRI config file so that we can get at
the always_have_depth_buffer config option, which is per-screen.
2013-03-15 07:05:01 -06:00
Brian Paul
5d1b3097e2 driconf: add a miscellaneous section and always_have_depth_buffer option
This option is needed for some applications that neglect to request
a depth buffer when choosing a visual/fbconfig.

The Linux app Topogun is an example of this problem.
2013-03-15 07:04:13 -06:00
Brian Paul
b3d184bac6 driconf: reorder options, reformat comments, etc
Move the options into the proper section (Debug, Quality, Performance,
etc).

Update comments and add some whitespace to improve readability.
2013-03-15 07:04:08 -06:00
Philipp Brüschweiler
c07c18081e wayland: fix segfault when using software rendering
wayland_roundtrip() was given an incorrect parameter.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=62362

Note: This is a candidate for the stable branches.

Signed-off-by: Brian Paul <brianp@vmware.com>
2013-03-15 06:50:23 -06:00
Brian Paul
f4a2c29d93 softpipe: fix up NUM_ENTRIES confusion
There were two different NUM_ENTRIES #defines for the framebuffer
tile cache and the texture tile cache.  Rename the latter to fix
the warnings:

In file included from sp_flush.c:40:0:
sp_tex_tile_cache.h:76:0: warning: "NUM_ENTRIES" redefined
sp_tile_cache.h:78:0: note: this is the location of the previous definition
In file included from sp_context.c:50:0:
sp_tex_tile_cache.h:76:0: warning: "NUM_ENTRIES" redefined
sp_tile_cache.h:78:0: note: this is the location of the previous definition

Also, replace occurrences of NUM_ENTRIES with the Elements() macro to
be safer.
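
The safety argument can be sketched as follows.  ARRAY_COUNT, the two
structs and their sizes are hypothetical names and values chosen for
the example; the point is only that the count is derived from the
array itself rather than from a shared macro name.

```c
#include <stddef.h>

/* Element count of a true array (not a pointer). */
#define ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Two caches with different capacities (sizes are made up). */
struct fb_tile_cache  { int entries[50]; };
struct tex_tile_cache { int entries[16]; };

static struct fb_tile_cache  fb_cache;
static struct tex_tile_cache tex_cache;

/* Each count follows its own array, so the two cannot be mixed up. */
static size_t fb_cache_count(void)  { return ARRAY_COUNT(fb_cache.entries); }
static size_t tex_cache_count(void) { return ARRAY_COUNT(tex_cache.entries); }
```

A loop bounded this way keeps working if one cache is resized, and it
cannot pick up the wrong constant when both headers are included.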

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-14 18:17:18 -06:00
Brian Paul
2f6970ae97 st/osmesa: silence some optimized build warnings 2013-03-14 18:09:42 -06:00
Brian Paul
6a9d7659d6 draw: init pre_clip_pos = NULL to fix optimized build warning 2013-03-14 18:09:42 -06:00
Brian Paul
622b1fcc18 glx: init screen = 0 to fix optimized build warning 2013-03-14 18:09:42 -06:00
Kenneth Graunke
91df4d746b i965: Make INTEL_DEBUG=shader_time use the RAW surface format.
Untyped Atomic Operation messages are illegal for non-RAW formats.  The
IVB hardware proceeds happily (after all, who cares what the format of the
surface is if you're doing untyped ops on it?), but later hardware
apparently doesn't.  The simulator for gen7 does complain, though.

v2: Rebase against updates to previous patches. (by anholt)

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-14 12:30:40 -07:00
Kenneth Graunke
125b34cffb i965: Specialize SURFACE_STATE creation for shader time.
This is basically a copy and paste of gen7_create_constant_surface, but
with the parameters filled in to offer a simpler interface.

It will diverge shortly.

I didn't bother adding it to the vtable for now since shader time is only
exposed on Gen7+.

v2: Replace tabs in the new code (by anholt)
    Add back dropped memset() and add a comment about HSW channel selects.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-14 12:30:40 -07:00
Kenneth Graunke
f27a220cad i965: Fix INTEL_DEBUG=shader_time for Haswell.
Haswell's "Data Cache" data port is a single unit, but split into two
SFIDs to allow for more message types without adding more bits in the
message descriptor.

Untyped Atomic Operations are now message 0010 in the second data cache
data port, rather than 6 in the first.

v2: Use the #defines from the previous commit. (by anholt)

NOTE: This is a candidate for the 9.1 branch.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
2013-03-14 12:30:40 -07:00
Eric Anholt
a2d08f170a i965: Add definitions for gen7+ data cache messages.
We were sparsely using some of these message types, but I'll just fill
them all in now.  It will be used for fixing shader_time on HSW.

v2: Add missing MEDIA_BLOCK_READ.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-14 12:30:39 -07:00
Eric Anholt
db3a0f13ef i965: Split shader_time entries into separate cachelines.
This avoids some snooping overhead between EUs processing separate shaders
(so VS versus FS).

Improves performance of a minecraft trace with shader_time by 28.9% +/-
18.3% (n=7), and performance of my old GLSL demo by 93.7% +/- 0.8% (n=4).

v2: Add a define for the stride with a comment explaining its units and
    why.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-14 12:30:39 -07:00
José Fonseca
a35a19a6ea scons: Define _ALLOW_KEYWORD_MACROS on MSVC builds.
scons/llvm.py defines inline globally to work around issues with LLVM C
binding headers, so the only way to avoid aggravating xkeycheck.h
errors is to set _ALLOW_KEYWORD_MACROS.

This fixes MSVC 2012 build with LLVM.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-14 19:01:10 +00:00
José Fonseca
6a3d77e13d softpipe: Shrink context size.
- each softpipe_tex_tile_cache is 50*64*64*4*4 = 3,276,800 bytes
- each softpipe_context has 3*32 softpipe_tex_tile_cache, i.e., each softpipe
  context is 314,572,800 bytes, i.e., 300MB

That is, in a 32-bit process (around 3GB virtual memory max), we can
only fit 10 contexts.
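
The arithmetic above can be checked with a small sketch.  The constant
names are hypothetical, and reading the factors as 50 entries of 64x64
texels with 4 channels of 4 bytes, and 3*32 as 3 shader stages times
32 samplers, is an assumption; only the numbers come from this message.

```c
#include <stdint.h>

enum {
   CACHE_ENTRIES      = 50,  /* tiles per texture tile cache */
   TILE_SIZE          = 64,  /* tile edge length in texels */
   CHANNELS           = 4,   /* assumed: RGBA */
   BYTES_PER_CHANNEL  = 4,   /* assumed: one 32-bit float per channel */
   SHADER_STAGES      = 3,   /* assumed meaning of the "3" in 3*32 */
   SAMPLERS_PER_STAGE = 32   /* assumed meaning of the "32" in 3*32 */
};

/* Size of one texture tile cache in bytes. */
static uint64_t
tex_tile_cache_bytes(void)
{
   return (uint64_t)CACHE_ENTRIES * TILE_SIZE * TILE_SIZE *
          CHANNELS * BYTES_PER_CHANNEL;
}

/* Size contributed to one context by all of its texture tile caches. */
static uint64_t
context_bytes(void)
{
   return (uint64_t)SHADER_STAGES * SAMPLERS_PER_STAGE *
          tex_tile_cache_bytes();
}

/* How many such contexts fit into the given address space. */
static unsigned
max_contexts(uint64_t address_space_bytes)
{
   return (unsigned)(address_space_bytes / context_bytes());
}
```

Calling max_contexts() with 3 GiB (3 * 2^30 bytes) reproduces the
figure of 10 contexts quoted above.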

This change is a short-term hack to shrink the context size.  Longer
term we'll need to change how the texture cache works.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-14 11:59:53 +00:00
Christian König
ce3aa0e775 radeon/llvm: fix LLVM dependencies
Since commit 1c4f283151 we obviously depend on this.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-14 12:38:54 +01:00
Anuj Phogat
d78dcdf103 mesa: Fix FB blitting in case of zero size src or dst rect
Framebuffer blitting operation should be skipped if any of the
dimensions (width/height) of src/dst rect is zero.
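
A minimal sketch of such a check is shown below.  The helper name
blit_is_empty() is hypothetical and this is not the code added to
_mesa_BlitFramebuffer; it only illustrates the condition, using the
corner-coordinate convention of glBlitFramebuffer.

```c
#include <stdbool.h>

/* A rectangle given by two corners has zero width or height when the
 * corresponding coordinates coincide.  Flipped rectangles (x1 < x0)
 * are not empty and must still be blitted. */
static bool
blit_is_empty(int srcX0, int srcY0, int srcX1, int srcY1,
              int dstX0, int dstY0, int dstX1, int dstY1)
{
   return srcX0 == srcX1 || srcY0 == srcY1 ||
          dstX0 == dstX1 || dstY0 == dstY1;
}
```

A caller would evaluate this only after argument validation, so that
invalid calls still raise their GL errors before the early return.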

V2: Move the dimension check after error checking in _mesa_BlitFramebuffer.

Fixes: fbblit(negative.nullblit.zeroSize) in Intel oglconform
https://bugs.freedesktop.org/show_bug.cgi?id=59495

Note: Candidate for all the stable branches.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-03-13 17:58:09 -07:00
Roland Scheidegger
1826659272 tgsi: fix sample_d emit for arrays
Those cases were apparently forgotten.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-14 00:22:55 +01:00
Roland Scheidegger
9e93d7c4fd llvmpipe: don't assert when trying to render to surfaces with multiple layers
Instead, just warn when creating the surface; rendering will simply go
to the first layer.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-14 00:22:30 +01:00
Roland Scheidegger
81e728982d softpipe: don't assert when creating surfaces with multiple layers
We can't handle them yet; however, we can safely just warn (we will
just render to the first layer, which is fine since we can't handle
the rendertarget system value either).
Also make behavior more predictable with buffer surfaces
(it would sometimes hit bogus asserts because of the union in the surface,
instead create the surface but assert when trying to set a buffer
in the framebuffer).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-14 00:21:56 +01:00
José Fonseca
4889315619 llvmpipe: Fix geometry shader token leak.
Trivial. Matches softpipe's code.
2013-03-13 21:46:50 +00:00
Tom Stellard
c95177ea88 radeon/llvm: Add missing license headers
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-13 16:01:31 +00:00
Tom Stellard
1c4f283151 radeon/llvm: Make radeon_llvm_util.cpp a C file
All the functions in this file are now implemented in C.
2013-03-13 16:01:31 +00:00
Tom Stellard
3958c104c6 radeon/llvm: Optimize radeon_llvm_strip_unused_kernels()
Just delete unused kernels rather than marking them as internal and
running the GlobalDCE pass.

Also implement this function in C and inline it into
radeon_llvm_get_kernel_module()
2013-03-13 16:01:31 +00:00
Tom Stellard
2ace79dce5 radeon/llvm: Implement radeon_llvm_get_kernel_module() using the C API 2013-03-13 16:01:31 +00:00
Tom Stellard
b34b8576ec radeon/llvm: Implement radeon_llvm_get_num_kernels() using the C API 2013-03-13 16:01:31 +00:00
Tom Stellard
7e9abbea15 radeon/llvm: Implement radeon_llvm_parse_bitcode() using C API
Also make the function static since it is not used anywhere else.
2013-03-13 16:01:30 +00:00
Tom Stellard
97bfcddde0 r600g/llvm: Move llvm wrapper functions into the radeon directory 2013-03-13 16:01:30 +00:00
Jon TURNEY
28e1693630 Properly check GLX_INDIRECT_RENDERING in glapi/tests/check_table
Actually use $DEFINES, so we can see if GLX_INDIRECT_RENDERING is defined

If GLX_INDIRECT_RENDERING is defined,  _GLAPI_SKIP_PROTO_ENTRY_POINTS will
be defined, and libglapi won't contain the 'protocol entry points', so we
should provide stubs in check_table.cpp
2013-03-13 14:55:52 +00:00
Jon TURNEY
ed8ddd57e9 Fix glapi/tests/check_table.cpp for standardized OpenGL function names
It looks like this has been broken since commit
1a1db1746d "Standardize names of OpenGL
functions."

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
2013-03-13 14:53:49 +00:00
Jon TURNEY
c7a319182f Fix out-of-tree build of 'make check' in src/mapi/glapi/tests/
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
2013-03-13 14:53:36 +00:00
José Fonseca
cff70dcfb2 scons: Define PACKAGE_VERSION/BUGREPORT globally.
Fixes the scons build.
2013-03-13 13:13:37 +00:00
Vinson Lee
a6bb7a9495 tests: Add $(top_srcdir)/include to AM_CPPFLAGS.
Fixes this build error with make check.

  CC     collision.o
In file included from ../../../../../src/mesa/main/hash_table.h:34:0,
                 from collision.c:31:
../../../../../src/mesa/main/compiler.h:51:53: fatal error: c99_compat.h: No such file or directory

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-03-12 23:14:39 -07:00
José Fonseca
f7ef83cdf4 scons: Define PACKAGE_xxx
Should get the builds going again.
2013-03-13 01:29:47 +00:00
Brian Paul
6f86b934e6 docs: rewrite the OSMesa info / instructions
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-12 19:04:43 -06:00
Brian Paul
79eac7da6b configure: wire-up new OSMesa gallium state tracker and target
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-12 19:04:43 -06:00
Brian Paul
be51f123c9 target/osmesa: add new Makefile.am
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-12 19:04:43 -06:00
Brian Paul
94263da46e targets/osmesa: new OSMesa gallium target
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-12 19:04:43 -06:00
Brian Paul
7114b6a92d st/osmesa: add new Makefile.am
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-12 19:04:43 -06:00
Brian Paul
73436a909e st/osmesa: new OSMesa gallium state tracker
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-12 19:04:43 -06:00
Brian Paul
3c3668c5a1 st/mesa: add PIPE_FORMAT_R16G16B16A16_UNORM renderbuffer support
To allow rendering in 16-bit/channel RGBA buffers.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-12 19:04:42 -06:00
José Fonseca
c526e1728f scons: Re-add ',' 2013-03-13 00:31:03 +00:00
José Fonseca
7bff1cc3f6 autotools: Add missing top-level include dir.
Fixes autotools build failure.  Not sure if there are more, as I have
difficulties in building the full tree.
2013-03-13 00:25:09 +00:00
Matt Turner
5c6e1e97b3 configure.ac: Alphabetize freedreno makefiles. 2013-03-12 17:09:55 -07:00
Matt Turner
d89ef39418 build: Get rid of dead MESA_ASM_FILES variable
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-12 17:02:54 -07:00
Matt Turner
bd0c9d07d0 mesa/build: Get rid of dead ALL_FILES variable
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-12 17:02:47 -07:00
Matt Turner
51e065a96c xmlpool/.gitignore: Remove 'Makefile'
Handled by top level .gitignore.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-12 17:02:40 -07:00
Matt Turner
e59fc3faa5 mesa: Use PACKAGE_BUGREPORT macro.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-12 17:02:33 -07:00
Matt Turner
9065bab37e mesa: Remove unused version #defines from version.h.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-12 17:02:28 -07:00
Matt Turner
439c3d4e31 mesa: Replace MESA_VERSION with PACKAGE_VERSION.
One fewer place to have to update.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-12 17:02:21 -07:00
Zack Rusin
42c1b33f6d draw/so: Fix stream output with geometry shaders
If a geometry shader is present, its stream output info should
be used instead of the vs's, and we shouldn't use the pre-clipped
coordinates.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-12 16:22:26 -07:00
José Fonseca
57cd1d1454 include: Fix build with VS 11 (i.e, 2012).
NOTE: Candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-12 22:07:10 +00:00
José Fonseca
70fe7c6d3e mesa,gallium,egl,mapi: One definition of C99 inline/__func__ to rule them all.
We were in four already...

NOTE: Candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-12 22:06:27 +00:00
José Fonseca
96b3ca89b1 scons: Allows choosing VS 10 or 11.
NOTE: Candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-12 22:04:04 +00:00
Michel Dänzer
4dca602521 radeonsi: Fix off-by-one for maximum vertex element index in some cases
In cases where the vertex element size is smaller than the vertex buffer
stride, the previous calculation could end up 1 too low. This would result
in the GPU using index 0 instead of the maximum index for those elements,
which would be visible as intermittent distorted triangles.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-12 18:25:54 +01:00
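The bound behind 4dca602521 can be illustrated with a small sketch. This is not the radeonsi code: the function name and parameters are hypothetical, and it only shows why the element size, not the stride, decides the last index that fits.

```c
#include <assert.h>

/* Hypothetical helper, not taken from radeonsi.
 * An element at index i occupies bytes
 *   [offset + i * stride, offset + i * stride + element_size),
 * so the largest valid i satisfies
 *   offset + i * stride + element_size <= buffer_size. */
unsigned max_vertex_index(unsigned buffer_size, unsigned offset,
                          unsigned stride, unsigned element_size)
{
   assert(stride > 0);
   assert(offset + element_size <= buffer_size);
   return (buffer_size - offset - element_size) / stride;
}
```

With a 40-byte buffer, a 16-byte stride and an 8-byte element, index 2 still fits (bytes 32..39), even though only two full strides fit in the buffer.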
Christoph Bumiller
8aa8b0539e nvc0: avoid crash on updating RASTERIZE_ENABLE state
When doing a blit with the 3D engine, the rasterizer or zsa cso may
be NULL.
2013-03-12 12:55:37 +01:00
Christoph Bumiller
4d28aff48f gallium/tests: check format in compute tests, make selectable 2013-03-12 12:55:37 +01:00
Christoph Bumiller
e2dded78ea nvc0: add MP trap handler for nve4 2013-03-12 12:55:37 +01:00
Christoph Bumiller
ae59a7d35d nvc0: they removed the NTID,NCTAID,GRIDID registers on nve4 2013-03-12 12:55:37 +01:00
Christoph Bumiller
e066f2f62f nvc0: implement compute support for nve4 2013-03-12 12:55:37 +01:00
Christoph Bumiller
75f1f852b0 nvc0/ir: try to fix CAS (CompareAndSwap) 2013-03-12 12:55:37 +01:00
Christoph Bumiller
18fdfbdc32 nv50/ir: add CCTL (cache control) op 2013-03-12 12:55:37 +01:00
Christoph Bumiller
9db7e09cb4 nvc0/ir/emit: fix emission of large address offsets 2013-03-12 12:55:36 +01:00
Christoph Bumiller
175c185941 nvc0: add SHADER/COMPUTE_RESOURCE bind flags to format table 2013-03-12 12:55:36 +01:00
Christoph Bumiller
19ea0bd521 nouveau: align PIPE_BIND_SHADER,COMPUTE_RESOURCEs to 256 bytes 2013-03-12 12:55:36 +01:00
Christoph Bumiller
47f2179844 nv50,nvc0: copy writable flag on surface creation 2013-03-12 12:55:36 +01:00
Christoph Bumiller
7a91d3a2a4 nv50/ir: add support for different sampler and resource index on nve4
And remove non-working code for indirect sampler/resource selection.
Will be added back later.

Includes code from "nv50/ir/tgsi: Resource indirect indexing" by
Francisco Jerez (when mixing the R and S handles we can only specify
them via a register, i.e. indirectly, unless we upload all the used
handle combinations to c[] space, which we don't for now).
2013-03-12 12:55:36 +01:00
Christoph Bumiller
99e4eba669 nv50/ir: implement splitting of 64 bit ops after RA 2013-03-12 12:55:36 +01:00
Christoph Bumiller
ac9f19e485 nvc0/ir: skip back edges when determining latest sched value 2013-03-12 12:55:36 +01:00
Christoph Bumiller
f07c46a4f4 nvc0/ir: use large issue delay after RET, too 2013-03-12 12:55:36 +01:00
Christoph Bumiller
b23ec3f8ba nv50/ir: fix size adjustment for sched info for multiple functions 2013-03-12 12:55:36 +01:00
Christoph Bumiller
d39169cb6d nv50/ir: print function inputs and outputs 2013-03-12 12:55:36 +01:00
Christoph Bumiller
1b4faa2b17 nv50/ir/ssa: add a few comments regarding RenamePass 2013-03-12 12:55:36 +01:00
Francisco Jerez
1535b754fb nv50/ir/tgsi: Exclude local declarations from function prototypes. 2013-03-12 12:55:36 +01:00
Christoph Bumiller
9b563ef3f7 nv50/ir/opt: try to make use of SUCLAMP addend 2013-03-12 12:55:36 +01:00
Christoph Bumiller
a788be19e5 nv50/ir: don't assert on type in Modifier.applyTo if it is 0 2013-03-12 12:55:35 +01:00
Christoph Bumiller
c3a5bc0bdf nv50/ir: add support for barriers
nv50 part by Francisco Jerez.
2013-03-12 12:55:35 +01:00
Christoph Bumiller
a0a25191f2 nv50/ir/tgsi: add support for atomics 2013-03-12 12:55:35 +01:00
Christoph Bumiller
c2dfcd7f0e nv50/ir/tgsi: handle TGSI_OPCODE_LOAD,STORE
Squashed and (heavily) modified original patches by Francisco Jerez:
nv50/ir/tgsi: Implement resource LOAD/STORE (wip).
nv50/ir/tgsi: Emit SUST/SULD for surface access, and add CB LOAD/STORE support
nv50/ir/tgsi: Fix/clean up the LOAD/STORE handling code.

Left out for now:
nv50/ir/tgsi: Resource indirect indexing

Treating raw, read-only surfaces as constant buffers (CBs) was removed
because CBs are limited to a size of 64 KiB which isn't desirable, and
because this decision should probably be made by the state tracker.
If we used a number of CB slots for surfaces, it might find that we
cannot accommodate the advertised limit.
2013-03-12 12:55:35 +01:00
Christoph Bumiller
d105b3df14 nvc0/ir: don't replace load from input in COMPUTE progs with VFETCH 2013-03-12 12:55:35 +01:00
Christoph Bumiller
4506ed28de nvc0/ir: implement lowering of surface ops for nve4 2013-03-12 12:55:35 +01:00
Christoph Bumiller
8ac68b071d nvc0/ir: add formatted surface load lib code, move to extra header
OpenGL is nice and makes the user specify a format with an image unit.
OpenCL is evil and doesn't, and what's better than adding a huge load
of functions that we call indirectly to handle the conversion ?
2013-03-12 12:55:35 +01:00
Christoph Bumiller
ce1951daed nv50/ir: extend moveSources for delta < 0 2013-03-12 12:55:35 +01:00
Christoph Bumiller
c0fc3463e9 nvc0/ir: lower atomics in s[] 2013-03-12 12:55:35 +01:00
Christoph Bumiller
9c196779bc nvc0/ir/emit: implement INSBF, EXTBF, PERMT and ATOM 2013-03-12 12:55:35 +01:00
Christoph Bumiller
c8f0c43f7a nv50/ir/emit: handle OP_ATOM 2013-03-12 12:55:35 +01:00
Christoph Bumiller
d6c95f6819 nvc0/ir/target: some ops can't be predicated, e.g. CALL 2013-03-12 12:55:35 +01:00
Christoph Bumiller
1ed507ca46 nv50/ir/opt: CALLs cannot load 2013-03-12 12:55:35 +01:00
Christoph Bumiller
c893b94060 nv50/ir: add support for indirect BRA,CALL 2013-03-12 12:55:34 +01:00
Christoph Bumiller
efe55075b5 nvc0/ir/emit: implement move to and logic ops on predicates 2013-03-12 12:55:34 +01:00
Christoph Bumiller
ce7610f7d5 nvc0/ir/emit: implement surface related ops 2013-03-12 12:55:34 +01:00
Christoph Bumiller
3741b7d844 nv50/ir: initialize CodeEmitters' specialized target fields 2013-03-12 12:55:34 +01:00
Christoph Bumiller
b0fc2f13ec nv50/ir/opt: make optimization aware of atomics, barriers, surface ops 2013-03-12 12:55:34 +01:00
Christoph Bumiller
22b762f9b4 nv50/ir: add various new OPs that will be needed for compute 2013-03-12 12:55:34 +01:00
Francisco Jerez
c82714c593 nv50/ir: Rename "mkLoad" to "mkLoadv" for consistency. 2013-03-12 12:55:34 +01:00
Christoph Bumiller
cc30ce8160 nv50/ir: fix comparison of system values 2013-03-12 12:55:34 +01:00
Francisco Jerez
4ddfdcea04 nv50/ir/tgsi: Translate grid-related system parameters. 2013-03-12 12:55:34 +01:00
Francisco Jerez
8446c31d0e nv50/ir/tgsi: Accept COMPUTE programs. 2013-03-12 12:55:34 +01:00
Christoph Bumiller
e9294e11b4 nv50/ir/ra: make sure all used function inputs get assigned a reg
A live range [0, 0) counts as empty. For function inputs this can
be a problem, so insert a nop at the beginning to make it [0, 1).
This is a bit of a hack but also the most simple solution.
2013-03-12 12:55:34 +01:00
Christoph Bumiller
ee431b12ec nv50/ir/ra: also add pre-existing MERGE,SPLIT to constraint list 2013-03-12 12:55:34 +01:00
Christoph Bumiller
f1dfa414f4 nv50/ir/ra: fix confusion with conditional RegisterSet::occupy 2013-03-12 12:55:34 +01:00
Christoph Bumiller
d995f44f0b nv50/ir/ra: swap copyCompound args if src is compound and dst isn't 2013-03-12 12:55:33 +01:00
Francisco Jerez
95ad9bca2f nv50/ir/ra: Fix maxGPR calculation for programs with multiple functions. 2013-03-12 12:55:33 +01:00
Francisco Jerez
ca04e71024 nv50/ir/ra: Fix traversal before the beginning of the active list in buildRIG. 2013-03-12 12:55:33 +01:00
Francisco Jerez
fe17d8a7c0 nv50/ir/ra: Fix RegisterSet::occupy(const Value *v). 2013-03-12 12:55:33 +01:00
Francisco Jerez
49ded0e132 nv50/ir/ra: Fix argument const-ness in RegisterSet::idToUnits and idToBytes 2013-03-12 12:55:33 +01:00
Francisco Jerez
5959d4247a nv50/ir/opt: Fix tryPropagateBranch for BBs with several exit branches.
Comments and "if (bf->cfg.incidentCount() == 1)" condition added
by Christoph Bumiller.
2013-03-12 12:55:33 +01:00
Francisco Jerez
572bf83ec0 nv50/ir: Clean up references to function values before destroying them. 2013-03-12 12:55:33 +01:00
Francisco Jerez
12f65e38c0 nouveau: Bail out from nouveau_fence_wait if flushing the pushbuf fails. 2013-03-12 12:55:33 +01:00
Vinson Lee
543d032885 mesa: Use correct functions for enum conversion.
Fixes mixing enum types defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-11 23:44:10 -07:00
Rob Clark
6173cc19c4 freedreno: gallium driver for adreno
Currently works on a220.  Others in the a2xx family look pretty similar
and should be pretty straightforward to support with the same driver.

The a3xx has a new shader ISA, and while many registers appear similar,
the register addresses have been completely shuffled around.  I am not
sure yet whether it is best to support with the same driver, but
different compiler, or whether it should be split into a different
driver.

v1: original
v2: build file updates from review comments, and remove GPL licensed
    header files from msm kernel
v3: smarter temp/pred register assignment, fix clear and depth/stencil
    format issues, resource_transfer fixes, scissor fixes

Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-03-11 21:53:24 -04:00
José Fonseca
44a8e51354 d3d1x: Remove.
Unused/unmaintained.

Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
2013-03-12 00:35:06 +00:00
José Fonseca
7db60f049f nv50: Remove nv50_ir_from_sm4.*
Unused, depends on d3d1x.

Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
2013-03-12 00:35:06 +00:00
Roland Scheidegger
5c41d1c222 gallivm: clean up passing derivatives around
Previously, the derivatives were calculated and passed in a packed form
to the sample code (for implicit derivatives, explicit derivatives were
packed to the same format).
There are several reasons why this wasn't such a good idea:
1) the derivatives may not even be needed (not as bad as it sounds since
llvm will just throw the calculations needed for them away but still)
2) the special packing format really shouldn't be part of the sampler
interface
3) depending on what the sample code actually does the derivatives will
be processed differently, hence there is no "ideal" packing. For cube
maps with explicit derivatives (which we don't do yet) for instance the
packing looked downright useless, and for non-isotropic filtering we'd
need different calculations too.

So, instead just pass the derivatives as is (for explicit derivatives),
or let the rho calculating sample code calculate them itself. This still
does exactly the same packing stuff for implicit derivatives for now,
though explicit ones are handled in a more straightforward manner (quick
estimates show performance should be quite similar, though it is much
easier to follow and also does the rho calculation per-pixel until the
end, which we eventually need for spec compliance anyway).

No piglit changes.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-12 00:24:22 +01:00
Chad Versace
b7262ac7ea i965: Fix typo in doxygen hyperlink
s/brw_state_upload/brw_upload_state/

Found because the link was broken.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-11 16:01:19 -07:00
Eric Anholt
11b8df0c01 mesa: Reduce memory usage for reg alloc with many graph nodes (part 2).
After the previous fix that almost removes an allocation of 4*n^2
bytes, we can use a bitset to reduce another allocation from n^2 bytes
to n^2/8 bytes.

Between the previous commit and this one, the peak heap size for an
oglconform ARB_fragment_program max instructions test on i965 goes from
4GB to 255MB.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55825
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:54 -07:00
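The n^2 to n^2/8 saving described in 11b8df0c01 comes from storing one bit per node pair instead of one byte. A minimal sketch, assuming a flat n x n matrix; the struct and function names are hypothetical and are not the register allocator's own:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical n x n interference matrix, one bit per (i, j) pair. */
struct adjacency_bits {
   unsigned n;
   unsigned char *bits;
};

/* Bytes needed for n * n bits, rounded up. */
size_t adjacency_bytes(unsigned n)
{
   return ((size_t)n * n + CHAR_BIT - 1) / CHAR_BIT;
}

bool adjacency_init(struct adjacency_bits *a, unsigned n)
{
   assert(n > 0);
   a->n = n;
   a->bits = calloc(adjacency_bytes(n), 1);
   return a->bits != NULL;
}

void adjacency_set(struct adjacency_bits *a, unsigned i, unsigned j)
{
   assert(i < a->n && j < a->n);
   size_t bit = (size_t)i * a->n + j;
   a->bits[bit / CHAR_BIT] |= (unsigned char)(1u << (bit % CHAR_BIT));
}

bool adjacency_test(const struct adjacency_bits *a, unsigned i, unsigned j)
{
   assert(i < a->n && j < a->n);
   size_t bit = (size_t)i * a->n + j;
   return (a->bits[bit / CHAR_BIT] >> (bit % CHAR_BIT)) & 1u;
}

void adjacency_free(struct adjacency_bits *a)
{
   free(a->bits);
   a->bits = NULL;
}
```

For 100 nodes this needs 1250 bytes rather than 10000.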
Eric Anholt
6aa3afbfd6 mesa: Reduce the memory usage for reg alloc with many graph nodes (part 1)
We were allocating an adjacency_list entry for every possible
interference that could get created, but that usually doesn't happen.
We can save a lot of memory by resizing the array on demand.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:54 -07:00
Eric Anholt
5daf867f6c i965/fs: Improve CSE performance by expiring some available expressions.
We're already walking the list, and we can easily know when something
has no reason to be in the list any longer, so take a brief extra step
to reduce our worst-case runtime (an oglconform test that emits the
maximum instructions in a fragment program).  I don't actually know what
the worst-case runtime was, because it was too long and I got bored.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:54 -07:00
Eric Anholt
f179f419d1 i965/fs: Improve live variables calculation performance.
We can execute way fewer instructions by doing our boolean manipulation
on an "int" of bits at a time, while also reducing our working set size.

Reduces compile time of L4D2's slowest shader from 4s to 1.1s
(-72.4% +/- 0.2%, n=10)

v2: Remove redundant masking (noted by Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:54 -07:00
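The "int of bits at a time" idea in f179f419d1 can be sketched with the textbook liveness equation livein = use | (liveout & ~def), applied to whole words. This is an illustration only; the function name and array layout are assumptions, not the i965 implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-block update: each bit is one variable, and one
 * iteration of the loop processes a whole word of variables.
 * Returns true if livein changed, so a caller can iterate to a
 * fixed point. */
bool update_livein(unsigned *livein, const unsigned *liveout,
                   const unsigned *use, const unsigned *def,
                   size_t words)
{
   bool progress = false;

   assert(livein && liveout && use && def);
   for (size_t w = 0; w < words; w++) {
      unsigned new_in = use[w] | (liveout[w] & ~def[w]);
      if (new_in != livein[w]) {
         livein[w] = new_in;
         progress = true;
      }
   }
   return progress;
}
```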
Eric Anholt
4dc7e6dcbf i965/fs: Also do the gen4 SEND dependency workaround against other SENDs.
We were handling the dependency workaround for the first written reg
of a send preceding the one we're fixing up, but didn't consider the other
regs.  Thus if you had two sampler calls that got allocated to the same
set of regs, one might, rarely, overwrite the other.  This was occurring in
XBMC's GLSL shaders.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44567
NOTE: This is a candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:53 -07:00
Eric Anholt
4c1fdae0a0 i965/fs: Switch to using sampler LD messages for uniform pull constants.
When forcing the compiler to always generate pull constants instead of
push constants (in order to have an easy to use testcase), improves
performance of my old GLSL demo 23.3553% +/- 1.42968% (n=7).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60866
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:53 -07:00
Eric Anholt
1323772543 i965/fs: Fix broken rendering in large shaders with UBO loads.
The lowering process creates a new vgrf on gen7 that should be represented
in live interval analysis.  As-is, it was getting a conflicting allocation
with gl_FragDepth in the dolphin emulator, producing broken rendering.

NOTE: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61317
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:53 -07:00
Eric Anholt
c588cd2031 i965/fs: Add a comment about an implementation detail.
I was going to fix the code above like the previous commit, but we already
had that covered (otherwise all our uniform access would have been broken,
unlike just pull constants).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:53 -07:00
Eric Anholt
f10f5e4980 i965/fs: Fix register allocation for uniform pull constants in 16-wide.
We were allowing a compressed instruction to write a register that
contained the last use of a uniform pull constant (either UBO load or push
constant spillover), so it would get half its values smashed.

Since we need to see the actual instruction to decide this, move the
pre-gen6 pixel_x/y logic here, which should improve the performance of
register allocation since virtual_grf_interferes() is called more than
once per instruction.

NOTE: This is a candidate for the stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61317
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:53 -07:00
Eric Anholt
f09a8e17e5 intel: Remove some unused debug flags.
I was looking at the list to see what might be interesting to document for
application developers, and it turns out some are completely dead.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:53 -07:00
Zack Rusin
7295fad204 draw/gs: Correctly iterate the emitted primitives
We were assuming that each emitted primitive had the same
number of vertices. That is incorrect. Emitted primitives
can have an arbitrary number of vertices. Simply increment
index on iteration to fix it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-07 20:16:07 -08:00
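The fix in 7295fad204 amounts to advancing a running vertex index by each primitive's own length instead of multiplying by a fixed count. A minimal sketch with hypothetical names, not the draw module's code:

```c
#include <assert.h>

/* Hypothetical helper: prim_lengths[i] is the vertex count of emitted
 * primitive i. Writes the first vertex index of each primitive into
 * starts[] and returns the total number of vertices. */
unsigned primitive_starts(const unsigned *prim_lengths, unsigned num_prims,
                          unsigned *starts)
{
   unsigned index = 0;

   assert(prim_lengths && starts);
   for (unsigned i = 0; i < num_prims; i++) {
      starts[i] = index;
      index += prim_lengths[i];   /* lengths may differ per primitive */
   }
   return index;
}
```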
Zack Rusin
e5406f7058 tgsi/exec: Correctly reset NumOutputs before parsing the shader
Whenever we're binding the shaders we're incrementing NumOutputs,
assuming the parser spots an output declaration, but we were never
resetting the variable. That means that each subsequent bind of
a geometry shader would add its number of outputs to the number
of outputs bound by all previously run shaders and our indexes
would get completely messed up.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-07 20:16:00 -08:00
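The bug pattern in e5406f7058 is a counter that accumulates across binds. A minimal sketch of the reset, using a made-up machine struct and declaration enum rather than the tgsi_exec types:

```c
#include <assert.h>

/* Hypothetical stand-ins for the interpreter state and declarations. */
enum decl_kind { DECL_INPUT, DECL_OUTPUT };

struct shader_machine {
   unsigned num_outputs;
};

void bind_shader(struct shader_machine *m,
                 const enum decl_kind *decls, unsigned num_decls)
{
   assert(m && decls);

   /* Reset first, so counts from earlier binds do not leak in. */
   m->num_outputs = 0;

   for (unsigned i = 0; i < num_decls; i++) {
      if (decls[i] == DECL_OUTPUT)
         m->num_outputs++;
   }
}
```

Without the reset, binding the same two-output shader twice would leave the counter at 4.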
Roland Scheidegger
9060c835fd draw/llvm: another quick hack for drawing with no position output
Also need to skip things if we have no cv value but pos value
(happens with geometry shaders enabled).
Needs a round of cleanup, though.
2013-03-11 17:07:51 +01:00
Roland Scheidegger
ef17cc9cb6 softpipe: don't use samplers with prebaked sampler and sampler_view state
This is needed for handling the dx10-style sample opcodes.
This also simplifies the logic by getting rid of sampler variants
completely (sampler_views though OTOH have sort of variants because
some of their state is different depending on the shader stage they
are bound to).
No significant performance difference (openarena run:
840 frames in 459.8 seconds vs. 840 frames in 460.5 seconds).

v2: fix reference counting bug spotted by Jose.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-11 17:07:51 +01:00
Roland Scheidegger
f33c744fb9 tgsi: emit code for SVIEWINFO and SAMPLE_I
Can handle them since the single sampler interface was introduced.

v2: simplify txf/sample_i handling a bit according to Brian's feedback.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-11 17:07:51 +01:00
Roland Scheidegger
7b3a0bb45d tgsi: fix wrong reg used for unit for TGSI_OPCODE_TXF
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-11 17:07:51 +01:00
Tom Stellard
a0676968b9 r600g/llvm: Fix build 2013-03-11 11:10:51 -04:00
Marek Olšák
e4e655fd11 r600g: add debug options disabling various copy-buffer-related features
This will be invaluable for debugging and bug reports.
2013-03-11 13:44:46 +01:00
Marek Olšák
4b69c1a92d mesa: don't allocate a texture if width or height is 0 in CopyTexImage
NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-11 13:44:14 +01:00
Marek Olšák
68ed4c9c89 gallium/util: attempt to fix blitting multisample texture arrays
We don't have a test for this yet, but obviously the swizzle was wrong.
2013-03-11 13:43:36 +01:00
Marek Olšák
52efa01de0 r600g: allocate FMASK right after the texture, so that it's aligned with it
This avoids the kernel CS checker errors with MSAA textures.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
2c339f8015 r600g: remove r600.h, move the stuff elsewhere (mostly to r600_pipe.h)
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
ec7d775790 r600g: remove r600_hw_context_priv.h, move the stuff to r600_pipe.h
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
1724ef8908 r600g: remove deprecated state management code
It's nice to see so much code that did pretty much nothing go away.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
65cbf89567 r600g: atomize pixel shader
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
63042af933 r600g: atomize vertex shader
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
167263ecb1 r600g: inline r600_pipe_shader function
also change names of other functions, so that they make sense

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
65b2a449bc r600g: dump vertex elements state along with the fetch shader 2013-03-11 13:43:36 +01:00
Marek Olšák
3f0a51d677 gallium/util: dump instance_divisor 2013-03-11 13:43:36 +01:00
Marek Olšák
3832059b10 r600g: remove bytecode dumping
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
4bf0ebdd4f r600g: use a single env var R600_DEBUG, disable bytecode dumping
Only the disassembler is used to dump shaders. Here are a few examples
of how to use R600_DEBUG.

Log compute info:
  R600_DEBUG=compute

Dump all shaders:
  R600_DEBUG=fs,vs,gs,ps,cs

Dump pixel shaders only:
  R600_DEBUG=ps

Disable Hyper-Z:
  R600_DEBUG=nohyperz

Disable the LLVM backend:
  R600_DEBUG=nollvm

Or use any combination of the above, or print all options:
  R600_DEBUG=help

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-11 13:43:36 +01:00
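A comma-separated variable such as R600_DEBUG can be matched token by token. The sketch below is a generic illustration with a hypothetical function name; it is not the option parser r600g actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if `flag` appears as a complete
 * comma-separated token in `value`. A NULL value (variable unset)
 * enables nothing. */
bool debug_flag_enabled(const char *value, const char *flag)
{
   assert(flag != NULL);
   if (value == NULL)
      return false;

   size_t flag_len = strlen(flag);
   while (*value != '\0') {
      size_t tok_len = strcspn(value, ",");
      if (tok_len == flag_len && memcmp(value, flag, flag_len) == 0)
         return true;
      value += tok_len;
      if (*value == ',')
         value++;
   }
   return false;
}
```

A caller would pass the result of getenv("R600_DEBUG") as `value`. Whole-token comparison keeps "nohyperz" from matching a shorter flag name.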
Marek Olšák
2ca73bc7f7 r600g: cleanup #include recursion between r600_pipe.h and evergreen_compute.h
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
43d3e0cd3d r600g: don't check for R600_ENABLE_S3TC env var 2013-03-11 13:43:36 +01:00
Stefan Brüns
b21a9d46e4 glapi/gen: Remove duplicate PYTHON_FLAGS
PYTHON_GEN calls python with PYTHON_FLAGS

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
2013-03-09 16:24:51 -08:00
Frank Henigman
89559c50e7 i965: Link i965_dri.so with C++ linker.
Force C++ linking of i965_dri.so by adding a dummy C++ source file.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-03-08 21:21:53 -08:00
Maxence Le Doré
ba588dd45d gallium/util: Correct shift value for TSC feature detection.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-03-08 21:21:53 -08:00
Matt Turner
07f2dee731 configure.ac: Build dricommon for DRI gallium drivers
Commit 67ef7559 added an || test "x$enable_dri" check in an attempt to
get the DRI common bits built in some necessary cases. That change was
inappropriate as it made these common DRI pieces be built
unconditionally, so some builds were broken.

Subsequently, commit 998d975e3 changed the "|| test" to a "-a"
conjunction within the existing test invocation. This made the '-a
"x$enable_dri" = xyes' clause have no effect, (as it was inside an
enclosing test for the same condition). So the new breakage from
commit 67ef7559 was addressed, but the original problems were
regressed.

The immediately preceding commit removed the redundant condition.

Now, finally this commit fixes the original problem as described in
the commit message of 67ef7559: this code should be compiled when
using the DRI state tracker. In order to do so, the HAVE_*_DRI
conditionals must be moved after the last assignment of HAVE_COMMON_DRI.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61821
Tested-by: Stéphane Marchesin <marcheu@chromium.org>
2013-03-08 21:21:46 -08:00
Matt Turner
7de78ce5e5 configure.ac: Remove redundant checks of enable_dri.
The whole block is enclosed inside if test "x$enable_dri" = xyes.
2013-03-08 21:20:43 -08:00
Matt Turner
79a0977241 mesa: Allow ETC2/EAC formats with ARB_ES3_compatibility.
Fixes piglit's oes_compressed_etc2_texture-miptree tests on Desktop GL.
Reported-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-03-08 21:20:39 -08:00
Stéphane Marchesin
1662178863 i915g: Use PIPE_FLUSH_END_OF_FRAME to trigger throttling
This helps with jittering, instead of throttling at every command
buffer we only throttle once a frame.
2013-03-08 19:34:50 -08:00
Stéphane Marchesin
d815e8af39 i915g: Update TODO 2013-03-08 19:34:43 -08:00
Brian Paul
728240b64d docs: document another Viewperf bug 2013-03-08 10:35:46 -07:00
Jan de Groot
17f1cb1d99 dri/nouveau: fix crash in nouveau_flush
https://bugs.freedesktop.org/show_bug.cgi?id=61947

Note: this is a candidate for the stable branches
2013-03-07 19:55:07 +01:00
Brian Paul
057c46d791 draw: add const qualifier to silence compiler warning 2013-03-07 08:11:12 -07:00
Brian Paul
9915636fb8 llvmpipe: remove the power of two sizeof(struct cmd_block) assertion
It fails on 32-bit systems (I only tested on 64-bit).  Power of two
size isn't required, so just remove the assertion.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-07 06:28:23 -07:00
Brian Paul
c2665aacdd vbo: fix crash found with shared display lists
This fixes a crash when a display list is created in one context
but executed from a second one.  The vbo_save_context::vertex_store
member will be NULL if we never created a display list with the
context.  Just check for that before dereferencing the pointer.

Fixes http://bugzilla.redhat.com/show_bug.cgi?id=918661

Note: This is a candidate for the stable branches.
2013-03-07 06:28:23 -07:00
Alan Hourihane
5984a911f9 mesa: fix glGetInteger*(GL_SAMPLER_BINDING).
If the sampler object has been deleted on another context, an
alternative context may reference the old sampler. So ensure the sampler
object still exists.

Note: this is a candidate for the stable branch.

Signed-off-by: Alan Hourihane <alanh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-07 10:13:40 +00:00
Christian König
eddf33f711 radeon/llvm: document LLVM commit
We need at least that revision to work correctly now.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-07 10:06:24 +01:00
Christian König
a7a899584c radeon/llvm: enable LICM and DCE pass v2
LICM stands for Loop Invariant Code Motion. Instructions that
do not depend on the loop index are moved outside of the loop body.

DCE is DeadCodeElimination.

v2: updated commit msg, thx to Vincent.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-07 10:03:22 +01:00
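What LICM does can be shown at source level. The two functions below are a hand-written before/after pair with hypothetical names; the pass itself works on LLVM IR, not on C.

```c
#include <assert.h>
#include <stddef.h>

/* Before: (a * b + 1.0f) does not depend on i but is written inside
 * the loop body. */
void scale_in_loop(float *out, const float *in, size_t n, float a, float b)
{
   assert(out && in);
   for (size_t i = 0; i < n; i++)
      out[i] = in[i] * (a * b + 1.0f);
}

/* After: the invariant expression is computed once, outside the loop. */
void scale_hoisted(float *out, const float *in, size_t n, float a, float b)
{
   assert(out && in);
   const float k = a * b + 1.0f;
   for (size_t i = 0; i < n; i++)
      out[i] = in[i] * k;
}
```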
Christian König
e4188ee13d radeonsi: add LLVMNoUnwindAttribute to intrinsic
So LLVM can better eliminate dead code.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-07 10:03:22 +01:00
Christian König
0666ffddd2 radeonsi: rework input interpolation
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-07 10:03:22 +01:00
Christian König
c497321d31 radeonsi: remove SI.vs.load.buffer.index
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-07 10:03:22 +01:00
Christian König
55fe5ccb39 radeon/llvm: make SGPRs proper function arguments v2
v2: remove unrelated changes

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-07 10:03:22 +01:00
Christian König
b8f4ca3d85 radeon/llvm: replace shader type intrinsic with function attribute
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-07 10:03:22 +01:00
Christian König
de80e560bc radeonsi: switch to v*i8 for resources and samplers v2
v2: remove unrelated changes

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-07 10:03:22 +01:00
Christian König
2cb54833d0 r600g/llvm: Update CONSTANT_BUFFER address space definition
To match recent LLVM changes.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-07 10:03:11 +01:00
Zack Rusin
2532147f8b draw/llvm: fix inputs to the geometry shader
We can't clip and viewport transform the vertices before we let
the geometry shader process them. Lets make sure the generated
vertex shader has both disabled if geometry shader is present.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-05 20:13:08 -08:00
Bryan Cain
8c74380b2d draw: use geometry shader info in clip_init_state if appropriate
Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-05 20:13:08 -08:00
Bryan Cain
30f246bf2c draw: account for separate shader objects in geometry shader code
The geometry shader code seems to have been originally written with the
assumptions that there are the same number of VS outputs as GS outputs and
that VS outputs are in the same order as their corresponding GS inputs. Since
TGSI uses separate shader objects, these are both wrong assumptions. This
was causing several valid vertex/geometry shader combinations to either render
incorrectly or trigger an assertion.

Conflicts:
	src/gallium/auxiliary/draw/draw_gs.c

Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-05 20:13:08 -08:00
Alan Hourihane
cf0b4a30fc Unreference sampler object when it's currently bound to texture unit.
This change specifically unbinds a sampler object from the texture unit
if it's bound to a unit. The spec calls for default object when deleting
sampler objects which are currently bound.

Note: this is a candidate for the stable branches

Signed-off-by: Alan Hourihane <alanh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-06 18:10:12 +00:00
Brian Paul
b21f8e364b llvmpipe: fix incorrect 'j' array index in dummy texture code
Use 0 instead.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-03-06 10:34:09 -07:00
Brian Paul
975d31f60d llvmpipe: remove unused cmd_block_list struct 2013-03-06 10:34:09 -07:00
Brian Paul
a51b81558f llvmpipe: add some scene limit sanity check assertions
Note: This is a candidate for the stable branches.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-06 10:34:09 -07:00
Brian Paul
a31ebdffa0 llvmpipe: tweak CMD_BLOCK_MAX and LP_SCENE_MAX_SIZE
We advertise a max texture/surfaces size of 8K x 8K but the old values
for these limits didn't actually allow us to handle that surface size.

For 8K x 8K we'll have 16384 bins.  Each bin needs at least one cmd_block
object which was 2192 bytes in size.  Since 16384 * 2192 exceeded
LP_SCENE_MAX_SIZE we'd silently fail in lp_scene_new_data_block() and not
draw the complete scene.

By reducing CMD_BLOCK_MAX to 29 we get nice 512-byte cmd_blocks.  And
by increasing LP_SCENE_MAX_SIZE to 9 MB we can allocate enough command
blocks for 8K x 8K, plus a few regular data blocks.

Fixes the (improved) piglit fbo-maxsize test.

Note: This is a candidate for the stable branches.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-06 10:34:09 -07:00
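The arithmetic in a31ebdffa0 can be checked with a short sketch. The 64 x 64 tile size is an assumption, chosen because it reproduces the 16384 bins quoted for 8K x 8K; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bin (tile) size in pixels; gives 128 x 128 = 16384 bins
 * for an 8192 x 8192 surface. */
#define TILE_SIZE 64

size_t num_bins(unsigned width, unsigned height)
{
   assert(width > 0 && height > 0);
   size_t tiles_x = (width + TILE_SIZE - 1) / TILE_SIZE;
   size_t tiles_y = (height + TILE_SIZE - 1) / TILE_SIZE;
   return tiles_x * tiles_y;
}

/* Minimum scene memory if every bin needs at least one cmd_block. */
size_t min_cmd_block_bytes(unsigned width, unsigned height,
                           size_t cmd_block_size)
{
   return num_bins(width, height) * cmd_block_size;
}
```

With 2192-byte blocks the minimum is 35913728 bytes (about 34 MiB); with 512-byte blocks it is 8388608 bytes (8 MiB), which leaves room under a 9 MB limit for a few regular data blocks.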
Kenneth Graunke
492693c0a5 i965: Don't fill buffer with zeroes.
This was only necessary because our bounds checking was off by one, and
thus we read an extra pair of values.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-06 08:27:54 -08:00
Kenneth Graunke
89e5c8e0fa i965: Fix off-by-one in query object result gathering.
If we've written N pairs of values to the buffer, then last_index = N,
but the values are 0 .. N-1.  Thus, we need to use <, not <=.

This worked anyway because we fill the buffer with zeroes, so we just
added an extra (0 - 0) to our results.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-06 08:27:47 -08:00
Christian König
886c5085e3 radeon/llvm: fix trivial warnings
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-06 12:08:54 +01:00
Christian König
a212483437 radeonsi: fix trivial warning
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-06 12:07:40 +01:00
Eric Anholt
88b20d5834 intel: Improve the matching (more formats!) for TexImage from PBOs.
Mesa core is the place for encoding what format/type matches a mesa
format, so rely on that.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-05 16:02:38 -08:00
Eric Anholt
731d474d98 intel: Improve the test for readpixels blit path format checking.
We were allowing things like copying RG1616 to a user's ARGB8888
format, while we were denying anything that wasn't ARGB8888 or
RGB565.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-05 16:02:38 -08:00
Eric Anholt
3c7e96ff01 intel: Fold intel_region_copy() into its one caller.
This is similar code to intel_miptree_copy_slice, but the knobs
are all set differently.

v2: fix whitespace

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-05 16:02:38 -08:00
Eric Anholt
7604debabb intel: Transition intel_region_map() to being a miptree operation.
I'm trying to move us away from the region structure, and all the
callers are currently dereferencing a miptree to get the region.

In this change, the map_refcount is dropped.  However, the bo->virtual is
itself map refcounted, so that's already dealt with.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-05 16:02:38 -08:00
Eric Anholt
f4f288f317 intel: Remove num_mapped_regions tracking.
The point of tracking the value was removed in February 2012
(65b096aedd), and this should have
been removed at the same time.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-05 16:02:38 -08:00
Eric Anholt
3c9532314c intel: Remove the struct intel_region reuse hash table.
I don't see any reason for it -- it was introduced with the DRI2
invalidate work by krh in 2010 with no explanation.  I suspect it was
something about wanting the same drm_intel_bo struct underneath multiple
openings of the BO within one process, but that's covered by libdrm at
this point.  As far as the struct region goes, it is not threadsafe, so
multiple contexts sharing a region could have mixed up the map_count and
hit an assertion failure or worse.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-05 16:02:37 -08:00
José Fonseca
e77234be39 scons: Provide shorthand aliases for software winsyses. 2013-03-05 23:06:13 +00:00
José Fonseca
3950953f93 scons: Fix llvm-config not found error message.
"% llvm_version" is bogus copy'n'paste cruft.
2013-03-05 23:06:13 +00:00
Ian Romanick
674f9239b9 mesa: Modify candidate search string
Several commits on master for the 9.1 branch had "NOTE" messages in a
slightly different format.

NOTE: This is a candidate for stable branches

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-05 14:54:11 -08:00
Eric Anholt
65afa11dc6 mesa: Remove the special enum for _mesa_error debug output.
Now all the per-message enums from mtypes are gone.  Now we can extend
unique message IDs into all generators of debug output without having to
update mtypes.h for each one.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:25:01 -08:00
Eric Anholt
d9249935db mesa: Remove the enum for the oom-within-debug-output case.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:25:01 -08:00
Eric Anholt
6816f67de6 mesa: Remove now-unused gl_winsys_error and gl_shader_error enums.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:25:00 -08:00
Eric Anholt
c72cf53817 mesa: Report ARB_debug_output for both shader errors and warnings.
This ends up reusing the dynamic ID support, so a silly enum gets to go
away.  We don't assign good IDs to different messages yet, but at least
that's tractable now.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:25:00 -08:00
Eric Anholt
f0a191ca0f intel: Add missing perf debug for a stall on mapping a BO.
I was testing the ARB_debug_output code and wrote an obvious sample that
should have hit this, and got confused that my ARB_debug_output was
broken.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:25:00 -08:00
Eric Anholt
14cec07177 i965: Make perf_debug() output to GL_ARB_debug_output in a debug context.
I tried to ensure that performance in the non-debug case doesn't change
(we still just check one condition up front), and I think the impact is
small enough in the debug context case to warrant including all of it.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:25:00 -08:00
Eric Anholt
0a1c6bcfb0 intel: Finish renaming fallback_debug() to perf_debug().
They're about to change to handle GL_ARB_debug_output, so just make one
function.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:25:00 -08:00
Eric Anholt
807eedf70f intel: Hook up the WARN_ONCE macro to GL_ARB_debug_output.
This doesn't provide detailed error type information, but it's important
to get these relatively severe but rare error messages out to the
developer through whatever mechanism they are using.

v2: Rebase on new WARN_ONCE additions.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
2013-03-05 14:25:00 -08:00
Eric Anholt
3025680578 mesa: Add support for GL_ARB_debug_output with dynamic ID allocation.
We can emit messages now without always having to use the same ID for
each, or having a giant table of all possible errors in mtypes.h.
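
One way to picture dynamic ID allocation is a per-call-site slot that starts
at zero and receives a fresh ID the first time a message is emitted from that
site. This is only an illustrative sketch: DebugIds and the one-element list
used as a slot are hypothetical stand-ins, not the names used in Mesa.

```python
import itertools


class DebugIds:
    """Hands out message IDs lazily, one per call site."""

    def __init__(self):
        self._next = itertools.count(1)   # 0 is reserved to mean "not yet allocated"

    def get(self, slot):
        """Return the ID stored in slot, allocating one on first use.

        slot is a one-element list standing in for a static variable
        that lives next to the call site.
        """
        if slot[0] == 0:
            slot[0] = next(self._next)
        return slot[0]


ids = DebugIds()
site_a = [0]
site_b = [0]

first = ids.get(site_a)    # allocates a new ID for call site A
again = ids.get(site_a)    # reuses the same ID
other = ids.get(site_b)    # call site B gets its own ID
```

No central table has to list the messages up front; each call site only
carries its own slot.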

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:24:59 -08:00
Eric Anholt
7beb93456d mesa: Merge handling of application-provided and built-in error sources.
I want to have dynamic IDs so that we don't need to add to mtypes.h for
every error we might want to add.  To do so, I need to get rid of the
static arrays and actually support all the crazy filtering of dynamic IDs
that we already support for application-provided error sources.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:24:59 -08:00
Eric Anholt
88831a8d99 mesa: Fix _mesa_problem() on context destroy after application debug output
This was apparently not noticed because we don't have any testing of
application-generated debug output.  However, as I'm changing the
GL-generated debug output to use the same path as
application/middleware-generated debug output, this obviously became an
issue.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:24:59 -08:00
Eric Anholt
e0d1e3b785 mesa: Move debug type/severity enums to mesa core.
These will get reused by new ARB_debug_output messages in drivers/core,
instead of having the caller pass GL enums and have us immediately
switch-statement those into enums.

Source enums will be handled in the next commit, because the way
different sources are handled at the moment is pretty strange.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:24:59 -08:00
Eric Anholt
c42148d16e mesa: Replace open-coded _mesa_lookup_enum_by_nr().
The new one doesn't have the same behavior for GL_NO_ERROR, but we don't
produce errors with GL_NO_ERROR as the error type.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:24:59 -08:00
Eric Anholt
e022461c64 mesa: Remove extra #define MAXSTRING duplicating MAX_DEBUG_MESSAGE_LENGTH.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:24:59 -08:00
Marcin Slusarz
f4ebcd133b dri/nouveau: NV17_3D class is not available for NV1a chipset
Should fix https://bugs.freedesktop.org/show_bug.cgi?id=60510

Note: this is a candidate for the stable branches

Acked-by: Francisco Jerez <currojerez@riseup.net>
2013-03-05 21:19:17 +01:00
Roland Scheidegger
b9eb573600 tgsi: handle projection modifier for array textures.
This partly reverts 6ace2e41da.
Apparently with GL_MESA_texture_array fixed-function texturing
with texture arrays is possible, and hence we have to handle TXP.
(Though no one seems to know the semantics, softpipe now does what
it did before, which is to NOT project the array coord, llvmpipe
for instance however indeed does project the array coord. Unlike
before it will project the comparison coord for shadow1d array, as
that clearly was an error.)
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=61828.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-05 20:10:37 +01:00
Roland Scheidegger
be6d18ba5e st/mesa: translate ir offset parameters for non-TXF opcodes.
Otherwise the state tracker will crash if the texture instructions
have offsets.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-05 20:10:37 +01:00
Matt Turner
523b07e320 configure.ac: Remove stale comment about --x-* arguments.
Should have been removed with e273ed37.

Note: This is a candidate for the 9.1 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-05 11:02:36 -08:00
Matt Turner
35189d768b configure.ac: Don't check for X11 unconditionally.
X11 is already checked conditionally below.

Fixes OSMesa-only configurations to not require X11.

Note: This is a candidate for the 9.1 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-05 11:02:22 -08:00
Alan Hourihane
196443f3f5 Add missing GL_TEXTURE_CUBE_MAP entry in _mesa_legal_texture_dimensions
This was hit on the glTexStorage2D() path.

Note: this is a candidate for the stable branches

Signed-off-by: Alan Hourihane <alanh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-05 17:22:44 +00:00
Jon TURNEY
87fdcd87b1 Fix out-of-tree build of 'make check' in src/mesa/main/tests
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-03-05 13:33:16 +00:00
Dave Airlie
e21460b4d5 u_blitter: don't create illegal shaders for 1D/3D/RECT/CUBE MSAA
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-03-04 22:23:08 +00:00
Daniel Martin
998d975e38 Fix build of swrast only without libdrm
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Daniel Martin <consume.noise@gmail.com>
2013-03-04 10:11:01 -08:00
Brian Paul
b1390c7992 mesa: flush current state when querying GL_EDGE_FLAG
Fixes http://bugs.freedesktop.org/show_bug.cgi?id=61395

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-04 08:41:45 -07:00
Jakub Bogusz
e29124717e vdpau-softpipe: Build correct source file - vl_winsys_xsp.c
Copy-and-paste problem introduced by commit 7f24483e.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-03-03 22:53:26 -08:00
Kenneth Graunke
b88f74d63d i965: Fix Crystal Well PCI IDs.
The second digit was off by one, which meant we accidentally treated
GTn as GT(n-1).  This also meant no support for GT1 at all.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-03 13:53:58 -08:00
Vincent Lejeune
83e7d111af r600g: Check comp_mask before merging export instructions
Fixes a (rare) bug uncovered by llvm where consecutive exports were
merged even if they had incompatible masks.
2013-03-03 21:39:51 +01:00
Vadim Girlin
138b5b9a12 r600g: fix check_and_set_bank_swizzle for cayman
Tested-by: Vincent Lejeune <vljn at ovi.com>
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
2013-03-03 21:38:49 +01:00
Brian Paul
0b6e72f8d7 st/mesa: add switch case for ir_txf_ms to silence warning
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-03-02 05:52:40 -07:00
Brian Paul
2ea0e30bed mesa: add switch case for ir_txf_ms to silence warning
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 05:52:28 -07:00
Kenneth Graunke
cf0c0a7782 i965: Pull query BO reallocation out into a helper function.
We'll want to reuse this for non-occlusion queries in the future.

Plus, it's a single logical task, so having it as a helper function
clarifies the code somewhat.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:04 -08:00
Kenneth Graunke
961c9b8cac i965: Replace the global brw->query.bo variable with query->bo.
Again, eliminating a global variable in favor of a per-query object
variable will help in a future where we have more queries in hardware.

Personally, I find this clearer: there's just the query object's BO,
rather than two variables that usually shadow each other.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:04 -08:00
Kenneth Graunke
614944b897 i965: Turn if (query->bo) into an assertion.
The code a few lines above calls brw_emit_query_begin() if !query->bo,
and that creates query->bo.  So it should always be non-NULL.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:04 -08:00
Kenneth Graunke
981a22b62b i965: Unify query object BO reallocation code.
If we haven't allocated a BO yet, we need to do that.  Or, if there
isn't enough room to write another pair of values, we need to gather up
the existing results and start a new one.  This is simple enough.

However, the old code was awkwardly split into two blocks, with a
write_depth_count() placed in the middle.  The new depth count isn't
relevant to gathering the old BO's data, so that can go after the
reallocation is done.  With the two blocks adjacent, we can merge them.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:04 -08:00
Kenneth Graunke
90feda81de i965: Use query->last_index instead of the global brw->query.index.
Since we already have an index in the brw_query_object, there's no need
to also keep a global variable that shadows it.

Plus, if we ever add support for more types of queries that still need
the per-batch before/after treatment we do for occlusion queries, we
won't be able to use a single global variable.  In contrast, per-query
object variables will work fine.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:04 -08:00
Kenneth Graunke
ec5d502ec3 i965: Remove brw_query_object::first_index field as it's always 0.
brw->query.index is initialized to 0 just a few lines before it's
copied to first_index.

Presumably the idea here was to reuse the query BO for subsequent
queries of the same type, but since that doesn't happen, there's no need
to have the extra code complexity.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:04 -08:00
Kenneth Graunke
d92c7d8eed i965: Add a pile of comments to brw_queryobj.c.
This code was really difficult to follow, for a number of reasons:

- Queries were handled in four different ways: TIMESTAMP writes a single
  value, TIME_ELAPSED writes a single pair of values, occlusion queries
  write pairs of values for the start and end of each batch, and other
  queries are done entirely in software.  It turns out that there are
  very good reasons each query is handled the way it is, but there were
  insufficient comments explaining the rationale.

- It wasn't immediately obvious which functions were driver hooks
  and which were helper functions.  For example, brw_query_begin() is
  a driver hook that implements glBeginQuery() for all query types, but
  the similarly named brw_emit_query_begin() is a helper function that's
  only relevant for occlusion queries.

Extra explanatory comments should save me and others from constantly
having to ask how this code works and why various query types are
handled differently.

v2: Incorporate Eric's feedback: change "as soon as possible" to "the
    results will be present when mapped."

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:04 -08:00
Kenneth Graunke
d1b34baf9b i965: Write TIMESTAMP query values into the first buffer element.
For timestamp queries, we just write a single value to a BO.  The
natural place to write that is element 0, so we should do that.

Previously, we wrote it into element 1 (the second slot) leaving
element 0 filled with garbage.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:03 -08:00
Kenneth Graunke
3d71f4fbac i965: Implement the new QueryCounter() hook.
This moves the GL_TIMESTAMP handling out of EndQuery.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:03 -08:00
Kenneth Graunke
dfb056b892 mesa: Add a new QueryCounter() hook for TIMESTAMP queries.
In OpenGL, most queries record statistics about operations performed
between a defined beginning and ending point.  However, TIMESTAMP
queries are different: they immediately return a single value, and there
is no start/stop mechanism.

Previously, Mesa implemented TIMESTAMP queries by calling EndQuery
without first calling BeginQuery.  Apparently this is DirectX
convention, and Gallium followed suit.  I personally find the asymmetry
jarring, however---having BeginQuery and EndQuery handle a different set
of enum values looks like a bug.  It's also a bit confusing to mix the
one-shot query with the start/stop model.

So, add a new QueryCounter driver hook for implementing TIMESTAMP.  For
now, fall back to EndQuery to support drivers that don't do the new
mechanism.
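
The fallback can be sketched as a dispatch that prefers the new hook and uses
EndQuery only when a driver has not provided it. The dictionary of hooks and
the function name below are hypothetical and only model the idea.

```python
def call_query_counter(hooks, query):
    """Dispatch a TIMESTAMP query to the driver.

    hooks maps hook names to callables. Drivers that implement the new
    QueryCounter hook get it called directly; older drivers fall back
    to EndQuery without a preceding BeginQuery.
    """
    hook = hooks.get("QueryCounter")
    if hook is not None:
        return hook(query)
    return hooks["EndQuery"](query)


# A driver that implements the new hook, and one that does not.
new_driver = {
    "QueryCounter": lambda q: ("QueryCounter", q),
    "EndQuery": lambda q: ("EndQuery", q),
}
old_driver = {
    "EndQuery": lambda q: ("EndQuery", q),
}

via_new = call_query_counter(new_driver, "timestamp")
via_old = call_query_counter(old_driver, "timestamp")
```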

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:03 -08:00
Roland Scheidegger
6ace2e41da tgsi: add texel offsets and derivatives to sampler interface
Something I never got around to implementing, but this is the tgsi execution
side for implementing texel offsets (for ordinary texturing) and explicit
derivatives for sampling (though I guess the ordering of the components
for the derivs parameters is debatable).
There is certainly a runtime cost associated with this.
Unless there are different interfaces used depending on the "complexity"
of the texture instructions, this is impossible to avoid.
Offsets are always active (I think checking if they are active or not is
probably not worth it since it should mostly be an add), whereas the
sampler_control is extended for explicit derivatives.
For now softpipe (the only user of this) just drops all those new values
on the floor (which is the part I never implemented...).

Additionally this also fixes (discovered by accident) inconsistent
projective divide for the comparison coord - the code did do the
projection for shadow2d targets, but not shadow1d ones. This also
drops checking for projection modifier on array targets, since they
aren't possible in any extension I know of (hence we don't actually
know if the array layer should also be divided or not).

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-02 02:54:31 +01:00
Roland Scheidegger
c7c7186045 draw: additional fix for the no-position case with llvm
Similar fix to what is done for the non-llvm case, we could otherwise still
hit the stages (near certainly with gs) which crash. It is probably a much
better idea to skip trying to draw at that point anyway.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-02 02:54:31 +01:00
Roland Scheidegger
ea8b2ae8a5 draw: fix no position output in non-llvm pipeline.
It seems easiest (and best) if we simply skip all the later stages
(after stream output).
(This is different to the llvm case at least for now where we will
simply try to render garbage, though both behaviors should be correct.)
Fixes piglit glsl-1.40-tf-no-position with softpipe.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-02 02:54:31 +01:00
Roland Scheidegger
de0593e333 draw/llvm: skip clipping and viewport transform if there's no position output
With glsl 1.40 writing position is not required (useful for transform
feedback, though in fact it's still possible to rasterize such geometry
even if the results aren't too well defined).
Prevents crashes in that case. Fixes piglit glsl-1.40-tf-no-position.
Not quite sure this is 100% correct as it also skips clipdistance
clipping which could still work (but not sure if the result would
really be needed?)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-02 02:54:31 +01:00
Roland Scheidegger
2ef13e7c55 llvmpipe: don't assert on illegal surface creation.
Since c8eb2d0e82 llvmpipe checks if it's
actually legal to create a surface. The opengl state tracker doesn't quite
obey this so for now just warn instead of assert.
Also warn instead of disabled assert when creating sampler views
(same reasoning).

Addresses https://bugs.freedesktop.org/show_bug.cgi?id=61647.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-02 02:54:31 +01:00
Roland Scheidegger
4c12276607 llvmpipe: bump glsl version to 140
texel offsets should have been the last missing feature for 130, and in
fact 140 as well (last there were texture buffers). In any case we still
don't do OpenGL 3.0 (missing MSAA which will be difficult,
plus EXT_packed_float, ARB_depth_buffer_float and EXT_framebuffer_sRGB).

v2: bump to 140 instead - we have everything except we crash when not writing
to gl_Position (but softpipe crashes as well) so let's just say this is a bug
instead. Also (by Dave Airlie's suggestion) update llvm-todo.txt.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-02 02:54:30 +01:00
Roland Scheidegger
b3b3b389fa gallivm: add support for texel offsets for ordinary texturing.
This was previously only handled for texelFetch (much easier).
Depending on the wrap mode this works slightly differently (for somewhat
efficient implementation), hence have to do that separately in all roughly
137 places - it is easy if we use fixed point coords for wrapping, however
some wrapping modes are near impossible with fixed point (the repeat stuff)
hence we have to normalize the offsets if we can't do the wrapping in
unnormalized space (which is a division which is slow but should still be
much better than the alternative, which would be integer modulo for wrapping
which is just unusable). This should still give accurate results in all
cases that really matter, though it might not be quite conformant behavior
for some apis (but we have much worse problems there anyway even without
using offsets).
(Untested, no piglit test.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-02 02:54:30 +01:00
Brian Paul
a99eb5c83f svga: always link with C++
Even when we don't have LLVM since there's other C++ code
in the resulting DRI driver object.

Note: This is a candidate for the stable branches.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-03-01 17:31:32 -07:00
Brian Paul
f6c0612618 st/mesa: convert ir_triop_lrp to TGSI_OPCODE_LRP
AFAICT, all gallium drivers implement TGSI_OPCODE_LRP.
Tested with softpipe, llvmpipe, svga drivers.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-03-01 17:31:32 -07:00
Chris Forbes
7616586cff docs: Mark some things done in GL3.txt 2013-03-02 12:02:25 +13:00
Martin Andersson
d96d8ed910 winsys/radeon: Only add bo to hash table when creating flink
The problem is that we mix bo handles and flinked names in the hash
table. Because kms type handles are not flinked they should not be
added to the hash table. If we do that we will sooner or later
get a situation where we will overwrite a correct entry because
the bo handle was the same as a flinked name.
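
A small sketch of the collision, assuming the table is keyed by a plain
integer. The event tuples and function names are made up for illustration and
do not mirror the winsys code.

```python
def track_all(events):
    """Old behaviour: every bo goes into the table, keyed by handle or name."""
    table = {}
    for kind, key, bo in events:
        table[key] = bo
    return table


def track_flink_only(events):
    """New behaviour: only bos that were flinked are added, keyed by name."""
    table = {}
    for kind, key, bo in events:
        if kind == "flink":
            table[key] = bo
    return table


# bo_A was flinked under name 7; bo_B is a kms handle that also happens to be 7.
events = [("flink", 7, "bo_A"), ("handle", 7, "bo_B")]

mixed = track_all(events)            # the handle overwrites the flinked entry
separate = track_flink_only(events)  # the flinked entry survives
```

Handles and flinked names come from different number spaces, so equal
integers do not mean the same buffer.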

Note: this is a candidate for the stable branches.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-01 17:52:40 -05:00
Chris Forbes
1d4dbeeaec i965: enable ARB_texture_multisample on Gen6+
V2: Works on Ivy Bridge now too, so this can be 6+.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-02 11:40:50 +13:00
Chris Forbes
26c8479474 i965/fs: add support for ir_txf_ms on Gen6+
On Gen6, lower this to `ld` with lod=0 and an extra sample_index
parameter.

On Gen7, use `ld2dms`. We don't support CMS yet for multisample
textures, so we just hardcode MCS=0. This is ignored for IMS and UMS
surfaces.

Note: If we do end up emitting specialized shaders based on the MSAA
layout, we can emit a slightly shorter message here in the UMS case.

Note: According to the PRM, `ld2dms` takes one more parameter, lod.
However, it's always zero, and including it would make the message too
long for SIMD16, so we just omit it.

V2: Reworked completely, added support for Gen7.
V3: - Introduce sample_index parameter rather than reusing lod
    - Removed spurious whitespace change
    - Clarify commit message
V4: - Fix comment style
    - Emit SHADER_OPCODE_TXF_MS on Gen6. This was benignly wrong since
      it lowers to `ld` anyway on this gen, but still wrong.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-02 11:40:50 +13:00
Chris Forbes
6883c8845d i965/vs: add support for ir_txf_ms on Gen6+
On Gen6, lower this to `ld` with lod=0 and an extra sample_index
parameter.

On Gen7, use `ld2dms`. This takes an additional MCS parameter to support
compressed multisample surfaces, but we're not enabling them for
multisample textures for now, so it's always ignored and can be safely
omitted.

V2: Reworked completely, added support for Gen7.
V3: - Use new sample_index, sample_index_type rather than reusing lod
    - Clarify commit message.
V4: - Fix comment style

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-02 11:40:49 +13:00
Chris Forbes
f52ce6a0ca i965: add a new virtual opcode: SHADER_OPCODE_TXF_MS
This is very similar to the TXF opcode, but lowers to `ld2dms` rather
than `ld` on Gen7.

V4: - add SHADER_OPCODE_TXF_MS to is_tex() functions, so regalloc thinks
      it actually writes the correct number of registers. Otherwise in
      nontrivial shaders some of the registers tend to get clobbered,
      producing bad results.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-03-02 11:40:49 +13:00
Chris Forbes
555dc6d74d i965: take the target into account for Gen7 MSAA modes
Gen7 has an erratum affecting the ld_mcs message, making it unsafe to
use when the surface doesn't have an associated MCS.

From the Ivy Bridge PRM, Vol4 Part1 p77 ("MCS Enable"):

   "If this field is disabled and the sampling engine <ld_mcs>
   message is issued on this surface, the MCS surface may be
   accessed. Software must ensure that the surface is defined
   to avoid GTT errors."

To allow the shader to treat all surfaces uniformly, force UMS if the
surface is to be used as a multisample texture, even if CMS would have
been possible.

V3: - Quoted erratum text

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-02 11:39:42 +13:00
Chris Forbes
8cc26ae993 i965: Support multisampling in surface_state for textures
The surface_state setup for renderbuffers already worked; only the
texturing side needed work. BLORP does something similar, but does its
own surface_state setup.

On Gen6, we just need to set the correct sample count.

On Gen7: - set the correct sample count
         - set the correct layout mode
         - set GEN7_SURFACE_ARYSPC_LOD0 if it's set in the miptree.

V2: - Clarify commit message
    - Rebased onto Paul's physical/logical dims cleanup
    - Added Gen7 support

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-03-02 11:35:24 +13:00
Chris Forbes
e62b6a10bc i965: add support for multisample textures
V2: - Fix for state moving from texobj to image
    - Rebased onto Paul's logical/physical cleanup
    - Fixed missing quantization of sample count
    - Fold in IMS renderbuffer wrapper fixes from later in the series
    - Use correct physical slice offset for UMS/CMS surfaces on Gen7

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-03-02 11:35:24 +13:00
Chris Forbes
575d3870bb mesa: implement TexImage*Multisample
V2: - fix formatting issues
    - generate GL_OUT_OF_MEMORY if teximage cannot be allocated
    - fix for state moving from texobj to image

V3: - remove ridiculous stencil hack
    - alter format check to not allow a base format of STENCIL_INDEX
    - allow width/height/depth to be zero, to deallocate the texture
    - don't forget to call _mesa_update_fbo_texture

V4: - fix indentation
    - don't throw errors on proxy texture targets

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2013-03-02 11:35:24 +13:00
Chris Forbes
61d42ffef4 mesa: support multisample textures in framebuffer completeness check
- sample count must be the same on all attachments
- fixedsamplepositions must be the same on all attachments
(renderbuffers have fixedsamplepositions=true implicitly; only
multisample textures can choose to have it false)
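
The two rules above can be sketched as a check over a list of attachments.
The Attachment type and its field names are hypothetical; only the rules
themselves come from this message.

```python
from dataclasses import dataclass


@dataclass
class Attachment:
    samples: int
    is_texture: bool
    fixed_sample_positions: bool = True   # only meaningful for textures


def effective_fixed(att):
    """Renderbuffers are implicitly fixed; textures carry their own setting."""
    return att.fixed_sample_positions if att.is_texture else True


def multisample_consistent(attachments):
    """True if all attachments agree on sample count and fixedsamplepositions."""
    sample_counts = {a.samples for a in attachments}
    fixed_values = {effective_fixed(a) for a in attachments}
    return len(sample_counts) <= 1 and len(fixed_values) <= 1


rb = Attachment(samples=4, is_texture=False)
tex_fixed = Attachment(samples=4, is_texture=True, fixed_sample_positions=True)
tex_free = Attachment(samples=4, is_texture=True, fixed_sample_positions=False)
tex_8x = Attachment(samples=8, is_texture=True)

ok = multisample_consistent([rb, tex_fixed])
mismatched_fixed = multisample_consistent([rb, tex_free])
mismatched_count = multisample_consistent([rb, tex_8x])
```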

V2: - fix wrapping to 80 columns, debug message, fix for state moving
      from texobj to image.
    - stencil texturing tweaks tidied up and folded in here.

V3: - Removed silly stencil hacks entirely; the extension doesn't
      actually make stencil-only textures legal at all.
    - Moved sample count / fixed sample locations checks into
      existing attachment-type-specific blocks, as suggested by Eric

V4: - Removed stencil hacks which were missed in V3 (thanks Eric)
    - Don't move the declaration of texImg; only required pre-V3.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
[V2] Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:35:22 +13:00
Chris Forbes
032896cbf9 i965: expose sample positions
Moves the definition of the sample positions out of
gen6_emit_3dstate_multisample, and unpacks them in
gen6_get_sample_position.

V2: Be consistent about `sample position` rather than `location`.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:35:20 +13:00
Chris Forbes
569c4a9f1c i965: add support for sample mask on Gen6+
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:35:17 +13:00
Chris Forbes
1822496f3a mesa: implement sample mask
V2: - fix multiline comment style
    - stop using ASSERT_OUTSIDE_BEGIN_END_AND_FLUSH since that
      doesn't exist anymore.

V3: - check for the extension being enabled
    - tidier flagging of _NEW_MULTISAMPLE
    - fix weird indentation in get.c

V4: - move flush later in SampleMaski()

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:35:16 +13:00
Chris Forbes
7c1017e292 mesa: implement GetMultisamplefv
Actual sample locations deferred to a driverfunc since only the driver
really knows where they will be.

V2: - pass the draw buffer to the driverfunc; don't fallback to pixel
      center if driverfunc is missing.
    - rename GetSampleLocation to GetSamplePosition
    - invert y sample position for winsys FBOs, at Paul's suggestion
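
The y inversion can be sketched as below, assuming sample positions are
(x, y) pairs in the range [0, 1] so that flipping is 1 - y. The function name
and the boolean flag are hypothetical.

```python
def sample_position(pos, is_winsys_fbo):
    """Return the position to report for a sample.

    pos is an (x, y) pair in [0, 1]. Window-system framebuffers are
    assumed to be flipped vertically relative to user FBOs, so y is
    mirrored for them.
    """
    x, y = pos
    if is_winsys_fbo:
        return (x, 1.0 - y)
    return (x, y)


user_fbo = sample_position((0.375, 0.25), is_winsys_fbo=False)
winsys = sample_position((0.375, 0.25), is_winsys_fbo=True)
```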

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:35:13 +13:00
Chris Forbes
abb5429537 i965: expose new max sample counts
V2: For now, only expose a depth sample count of 1, since there are
possible unresolved interactions with HiZ.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:35:08 +13:00
Chris Forbes
db5d5c30a6 mesa: add new max sample count state
- GL_MAX_COLOR_TEXTURE_SAMPLES
- GL_MAX_DEPTH_TEXTURE_SAMPLES
- GL_MAX_INTEGER_SAMPLES

V2: initialize limits to 1 in _mesa_init_constants as suggested by Brian
and Paul

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:34:58 +13:00
Chris Forbes
ffb53b4f03 glsl: add support for ARB_texture_multisample
V2: - emit `sample` parameter properly for multisample texelFetch()
    - fix spurious whitespace change
    - introduce a new opcode ir_txf_ms rather than overloading the
      existing ir_txf further. This makes doing the right thing in
      the driver somewhat simpler.

V3: - fix weird whitespace

V4: - don't forget to include the new opcode in tex_opcode_strs[]
      (thanks Kenneth for spotting this)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
[V2] Reviewed-by: Eric Anholt <eric@anholt.net>
[V2] Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:33:54 +13:00
Chris Forbes
16af0aca09 tests: add ARB_texture_multisample enums to table
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:33:42 +13:00
Chris Forbes
d04a4dd003 mesa: add texobj support for ARB_texture_multisample
Adds the new texture targets, and per-image state for GL_TEXTURE_SAMPLES
and GL_TEXTURE_FIXED_SAMPLE_LOCATIONS.

V2: - Allow multisample texture targets in glInvalidateTexSubImage too.
      This was already partly there, but I missed it the first time around
      since the interaction is defined in a newer extension. Fixed weird
      indentation.
    - Allow multisample array textures in glFramebufferTextureLayer.
      This was overlooked as the tests originally only used 2d
      multisample textures.

V3: - Set min/mag filters sensibly for multisample textures. This
      can't actually be changed by the user, so it's more sensible to
      initialize it correctly than to hack around it being bogus later.

V4: - Tidy up initial min/mag filter setup. Setup in
      _mesa_initialize_texture_object was bogus, but benign since
      finish_texture_init() clobbered everything with correct values. For V4,
      just do the setup in finish_texture_init().

V5: - Don't break glPopAttrib(GL_TEXTURE_BIT)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
[V2] Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:33:27 +13:00
Chris Forbes
0f83e415e4 glapi: add ARB_texture_multisample
Adds new enums, dispatch machinery, and stubs for the 4 new entrypoints.

V2: - Drop placeholder
    - Align enum values
    - Remove explicit exec=mesa; it *is* the dispatch flavor we want,
      but it's also the default. I misunderstood how this worked before;
      after actually reading the generator it makes good sense.

V3: - Squash in stubs for new entrypoints, and dispatch_sanity tweaks,
      so we don't get build breakage between those patches.

V4: - Fix various remaining whitespace issues

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
[1/3 V2] Reviewed-by: Matt Turner <mattst88@gmail.com>
[V3] Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:33:20 +13:00
Eric Anholt
c0674fa5cd intel: Use the new "ctx" local variable I just added some more.
Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-01 12:10:22 -08:00
Eric Anholt
e15c21a957 i965: Make sRGB-capable framebuffers by default.
The GLX extension lets you expose visuals that explicitly guarantee you
that the GL_FRAMEBUFFER_SRGB_CAPABLE flag will be set, but we can set
the flag even while the visual doesn't provide the guarantee.  This
appears to be consistent with other implementations, as we've seen
several apps now that don't require an srgb visual and assume sRGB will
work without checking the GL_FRAMEBUFFER_SRGB_CAPABLE flag.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55783
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60633
Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-01 12:10:16 -08:00
Eric Anholt
973ddc897d intel: Fix software copying of miptree faces for weird formats.
Now that we have W-tiled S8, we can't just region_map and poke at bits --
there has to be some swizzling.  Rely on intel_miptree_map to get that job
done.  This should also get the highest performance path we know of for the
mapping (interesting if I get around to finishing movntdqa some day).

v2: Fix stale name of the bit in a comment.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-01 11:50:03 -08:00
Eric Anholt
6d6bd2ac7c intel: Add a flag for miptree mapping to disable transcoding.
I want to reuse intel_miptree_map() to replace some region mapping that's
broken for separate stencil, but doing so would result in new demands on
ETC transcode that we actually don't want to happen.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-01 11:50:03 -08:00
Eric Anholt
e63c959451 i965: Add WARN_ONCE for depthstencil workarounds we shouldn't be hitting.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-01 11:50:03 -08:00
Alex Deucher
a40ba43d78 r600g: enable CP DMA on 6xx
Tested across several 6xx parts, no piglit regressions.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-01 12:11:31 -05:00
Marek Olšák
58bd926d9e r600g: don't require dword alignment with CP DMA for buffer transfers
which is a leftover from the days when we used streamout to copy buffers

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-03-01 13:46:32 +01:00
Marek Olšák
89e2898e9e r600g: always map uninitialized buffer range as unsynchronized
Any driver can implement this simple and efficient optimization.
Team Fortress 2 hits it always. The DISCARD_RANGE codepath is not even used
with TF2 anymore, so we avoid a ton of useless buffer copies.
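
The idea can be sketched as follows, assuming the driver tracks a single conservative range of bytes that have ever been written. The struct and function names are hypothetical and are not the helper added in the gallium/util patch.

```c
#include <stdbool.h>

/* Hypothetical half-open range [start, end) of bytes that have ever been
 * written. start >= end means nothing has been initialized yet. */
struct valid_range {
   unsigned start;
   unsigned end;
};

/* Grow the valid range so that it also covers [start, end). */
static void
valid_range_add(struct valid_range *r, unsigned start, unsigned end)
{
   if (r->start >= r->end) {
      r->start = start;
      r->end = end;
      return;
   }
   if (start < r->start)
      r->start = start;
   if (end > r->end)
      r->end = end;
}

/* True if [start, end) overlaps bytes that may hold valid data. */
static bool
valid_range_intersects(const struct valid_range *r,
                       unsigned start, unsigned end)
{
   return r->start < r->end && start < r->end && end > r->start;
}

/* A write mapping that only touches never-written bytes cannot conflict
 * with pending GPU work, so it can be treated as unsynchronized. */
static bool
map_needs_sync(const struct valid_range *r, unsigned offset, unsigned size)
{
   return valid_range_intersects(r, offset, offset + size);
}
```

An application that appends to a buffer would then always map a region past the valid range and skip synchronization.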

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>

NOTE: This is a candidate for the 9.1 branch.
2013-03-01 13:46:32 +01:00
Marek Olšák
44f37261fc gallium/util: add helper code for 1D integer range
Reviewed-by: Brian Paul <brianp@vmware.com>

v2: cosmetic changes based on Brian's review

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>

NOTE: This is a candidate for the 9.1 branch. (the next patch depends on it)
2013-03-01 13:46:32 +01:00
Marek Olšák
8f192a3c9e r600g: cleanup deprecated register tables
These registers are either already emitted elsewhere or moved to start_cs.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-03-01 13:46:32 +01:00
Marek Olšák
f0636bc982 r600g: unify vgt states
The states were split because we thought it caused a hardlock. Now we know
the hardlock was caused by something else and has since been fixed.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-03-01 13:46:32 +01:00
Marek Olšák
e5a250fdf9 r600g: flush and invalidate htile cache when appropriate
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>

NOTE: This is a candidate for the 9.1 branch.
2013-03-01 13:46:32 +01:00
Marek Olšák
6f25de6711 r600g: atomize streamout enabling
This doesn't fix any issue we know of, but there indeed is a weak spot
in draw_vbo where streamout can fail. After streamout is enabled,
the need_cs_space call can flush the context, which causes the streamout
to be disabled right after it was enabled and bad things happen.

One way to fix it is to atomize the beginning part, so that no context flush
can happen between streamout enabling and the first drawing.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-03-01 13:46:32 +01:00
Marek Olšák
9dd18f43a4 r600g: use async DMA with a non-zero src offset
probably a typo

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>

NOTE: This is a candidate for the 9.1 branch.
2013-03-01 13:46:32 +01:00
Marek Olšák
c77917d35f r600g: pad the DMA CS to a multiple of 8 dwords
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>

NOTE: This is a candidate for the 9.1 branch.
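
A minimal sketch of the padding, assuming the command stream is a plain dword array with spare room at the end. The NOP value is a placeholder, not the real DMA packet encoding.

```c
#include <stdint.h>

#define DMA_CS_ALIGN_DW 8u        /* alignment named in the commit subject */
#define HYPOTHETICAL_DMA_NOP 0u   /* placeholder, not the hardware encoding */

/* Append NOP dwords until the dword count is a multiple of 8 and return
 * the new count. buf must have room for up to 7 extra dwords. */
static unsigned
pad_dma_cs(uint32_t *buf, unsigned cdw)
{
   while (cdw % DMA_CS_ALIGN_DW)
      buf[cdw++] = HYPOTHETICAL_DMA_NOP;
   return cdw;
}
```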
2013-03-01 13:46:32 +01:00
Jordan Justen
782d4f0f3c intel: Enable __DRI_API_OPENGL_CORE api with dri2 contexts
Without this set, dri_util.c:dri2CreateContextAttribs
will reject requests to create a context with
__DRI_API_OPENGL_CORE.

This prevents a 3.2 core profile context from being created
even when MESA_GL_VERSION_OVERRIDE=3.2 is used.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 21:51:00 -08:00
Jordan Justen
fde59a27fb intel: update max versions based on MESA_GL_VERSION_OVERRIDE
If the override version is >= 3.1, then update the
max_gl_core_version. Otherwise, update max_gl_compat_version.
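
The rule above can be sketched as follows. The struct is a hypothetical stand-in for the screen state, and encoding versions as major * 10 + minor (so 31 means 3.1) is an assumption made for this example.

```c
/* Hypothetical holder for the two limits named in the commit message.
 * Versions are assumed to be encoded as major * 10 + minor. */
struct version_limits {
   int max_gl_core_version;
   int max_gl_compat_version;
};

/* Apply an override; a value <= 0 is taken to mean "no override set". */
static void
apply_version_override(struct version_limits *lim, int override_version)
{
   if (override_version <= 0)
      return;

   if (override_version >= 31)
      lim->max_gl_core_version = override_version;
   else
      lim->max_gl_compat_version = override_version;
}
```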

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 21:50:56 -08:00
Jordan Justen
c4e059a359 mesa version: add _mesa_get_gl_version_override
This will allow other code to get access to the override
version before a context is available.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 21:50:50 -08:00
Jordan Justen
500b69e797 glsl: allow GLSL compiler version to be overridden to 1.50
Although GLSL 1.50 compiler support is not available,
this change will allow MESA_GLSL_VERSION_OVERRIDE=150 to be
used while 1.50 support is being developed.

Since no drivers claim 1.50 GLSL support, this change should
only impact Mesa when MESA_GLSL_VERSION_OVERRIDE=150 is set.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 21:49:59 -08:00
Matt Turner
4154ac066f i965/fs: Put immediate operand as src2
Immediate operands can only be src2 in 2-source instructions. Fixes
piglit failures since 0a1d145e (oops!).

Spotted-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-28 16:29:30 -08:00
Chad Versace
809fdc211f intel: Remove intel_mipmap_tree::wraps_etc
The field was equivalent to (etc_format != MESA_FORMAT_NONE), and
therefore duplicate information.

This patch removes the field and replaces all references to it with
`etc_format != MESA_FORMAT_NONE`.

No Piglit ETC test regresses on Intel Sandybridge.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-02-28 15:22:41 -08:00
Matt Turner
c001985cbf ir_to_mesa: Translate ir_triop_lrp to OPCODE_LRP.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:19:00 -08:00
Matt Turner
428503fcdf i965/vs: Assert that ir_triop_lrp was lowered.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:19:00 -08:00
Matt Turner
f78a7ff6b2 i965/fp: Use the LRP instruction for OPCODE_LRP.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:19:00 -08:00
Kenneth Graunke
0a1d145e5f i965/fs: Use the LRP instruction for ir_triop_lrp when possible.
v2 [mattst88]:
   - Add BRW_OPCODE_LRP to list of CSE-able expressions.
   - Fix op_var[] array size.
   - Rename arguments to emit_lrp to (x, y, a) to clear confusion.
   - Add LRP function to brw_fs.cpp/.h.
   - Corrected comment about LRP instruction arguments in emit_lrp.
v3 [mattst88]:
   - Duplicate MAD code for LRP instead of using a function pointer.
   - Check for != GRF instead of == IMM in emit_lrp.
   - Lower LRP on gen < 6.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:19:00 -08:00
Kenneth Graunke
015a48743d i965: Add support for emitting the LRP instruction.
Like MAD, this is another three-source instruction.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:18:59 -08:00
Matt Turner
af2c64063e glsl: Optimize ir_triop_lrp(x, y, a) with a = 0.0f or 1.0f
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:18:59 -08:00
Kenneth Graunke
93066ce129 glsl: Convert mix() to use a new ir_triop_lrp opcode.
Many GPUs have an instruction to do linear interpolation which is more
efficient than simply performing the algebra necessary (two multiplies,
an add, and a subtract).
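
For reference, the algebra in question can be written out in plain C. The function names are hypothetical, and the second function only illustrates the a = 0 and a = 1 special cases named elsewhere in this series; it is not the optimizer's code.

```c
/* mix(x, y, a) spelled out: two multiplies, an add, and a subtract. */
static float
lrp_arith(float x, float y, float a)
{
   return x * (1.0f - a) + y * a;
}

/* When a is known to be 0 or 1 the result is just x or y, so no
 * arithmetic is needed at all. */
static float
lrp_simplified(float x, float y, float a)
{
   if (a == 0.0f)
      return x;
   if (a == 1.0f)
      return y;
   return lrp_arith(x, y, a);
}
```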

Pattern matching or peepholing this is more desirable, but can be
tricky.  By using an opcode, we can at least make shaders which use the
mix() built-in get the more efficient behavior.

Currently, all consumers lower ir_triop_lrp.  Subsequent patches will
actually generate different code.

v2 [mattst88]:
   - Add LRP_TO_ARITH flag to ir_to_mesa.cpp. Will be removed in a
     subsequent patch and ir_triop_lrp translated directly.
v3 [mattst88]:
   - Move changes from the next patch to opt_algebraic.cpp to accept
     3-src operations.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:18:59 -08:00
Kenneth Graunke
18281d6088 glsl: Rework ir_reader to handle expressions with three operands.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:18:59 -08:00
Kenneth Graunke
1afd33ec05 glsl: Consolidate ir_expression constructors that use explicit types.
Previously, we had separate constructors for one, two, and four operand
expressions.  This patch consolidates them into a single constructor
which uses NULL default parameters.

The unary and binary operator constructors had assertions to verify that
the caller supplied the correct number of operands for the expression,
but the four-operand version did not.  Since get_num_operands for
ir_quadop_vector returns the number of vector_elements, we can safely
add that without breaking the semantics of ir_quadop_vector.

This also paves the way for expressions with three operands.  Currently,
none can be constructed since get_num_operands() never returns 3.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:18:59 -08:00
Matt Turner
f0213b1242 i965/vs/gen7: Allow MATH instructions to have MRF as a destination
total instructions in shared programs: 346873 -> 346847 (-0.01%)
instructions in affected programs:     364 -> 338 (-7.14%)

(All affected shaders are from Lightsmark)

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-28 13:18:59 -08:00
Matt Turner
4eeb9ded9d i965/fs/gen7: Allow MATH instructions to have MRF as a destination
total instructions in shared programs: 1376297 -> 1375626 (-0.05%)
instructions in affected programs:     35977 -> 35306 (-1.87%)

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-28 13:18:59 -08:00
Matt Turner
d5c3aa89dc i965/gen7: Relax restrictions on fake MRFs
Gen6 has write-only MRF registers, and for ease of implementation we
partition off 16 general-purpose registers to act as MRFs on Gen7.

Knowing that our Gen7 MRFs are actually GRFs, we can do things we can't
do with real MRFs:
   - read from them;
   - return values directly to them from a send instruction; and
   - compute directly to them with math instructions.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-28 13:18:59 -08:00
Matt Turner
b9f6795e34 i965/fs: Remove duplicate scan_inst->mlen check
Is already checked 20 lines below.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-28 13:18:59 -08:00
Tom Stellard
aa1c734b3c clover: Fix build with LLVM 3.3 v2
v2:
  - Fix order that the clang libraries are passed to the linker to avoid
    missing symbol errors.

Acked-by: Francisco Jerez <currojerez@riseup.net>
2013-02-28 16:01:23 -05:00
Jordan Justen
6f1538f8b4 attrib: push/pop FRAGMENT_PROGRAM_ARB state
This requirement was added by ARB_fragment_program

When the Steam overlay is enabled, this fixes:
* Menu corruption with the Puddle game
* The screen going black on Rochard when
  the Steam overlay is accessed

NOTE: This is a candidate for the 9.0 and 9.1 branches.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-28 09:29:45 -08:00
Keith Kriewall
efd8311a54 scons: Fix Windows build with LLVM 3.2
Fixes fdo bug 61299

NOTE: This is a candidate for the stable branches.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2013-02-28 15:40:02 +00:00
Adam Sampson
2506b03503 autotools: oprofilejit should be included in the list of LLVM components required
NOTE: This is a candidate for the stable branch.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2013-02-28 15:37:09 +00:00
Jerome Glisse
6bc7605745 r600g: workaround hyperz lockup on evergreen
This workaround disables hyperz if writes to the zbuffer are disabled. Somehow,
using hyperz when not writing to the zbuffer triggers a GPU lockup. See:

https://bugs.freedesktop.org/show_bug.cgi?id=60848

Candidate for 9.1

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-02-28 09:48:05 -05:00
Jordan Justen
c6ae10887e texobj: add verbose api trace messages to several routines
Motivated by wanting to see if GenTextures was called by an
application while debugging another Steam overlay issue.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-27 23:02:12 -08:00
Roland Scheidegger
c8eb2d0e82 llvmpipe: check buffers in llvmpipe_is_resource_referenced.
Now that buffers can be used as textures or render targets
make sure they aren't skipped.

Fix suggested by Jose Fonseca.

v2: added a couple of assertions so we can actually guarantee
we check the resources and don't skip them. Also added some comments
that this is actually a lie due to the way the opengl buffer api works.
2013-02-28 03:39:54 +01:00
Roland Scheidegger
686f6c69bd llvmpipe: support rendering to buffer render targets.
Unfortunately not usable from OpenGL, and no cap bit.
Pretty similar to a 1d texture, though allows specifying a start element.

v2: also fix up renderbuffer width (which will get promoted to fb width)
to be the number of elements

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-28 03:39:54 +01:00
Roland Scheidegger
2fcd3638be util: fix issues with util_clear_render_target.
For PIPE_BUFFER we need coord adjustments for the transfer.
And for pure integer formats util_pack_color just crashes,
need to handle that differently due to clear colors being ints/uints.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-28 03:39:53 +01:00
Roland Scheidegger
6b35c2b110 softpipe/draw/tgsi: simplify driver/tgsi sampler interface
Use a single sampler adapter instead of per-sampler-unit samplers,
and just pass along texture unit and sampler unit in the calls.
The reason is that for dx10-style sample opcodes pre-wired
samplers including all the texture state aren't really feasible (and for
sample_i/sviewinfo we don't even have samplers).
Of course right now softpipe doesn't actually do anything more than
just look up all its pre-wired per-texunit/per-samplerunit sampler as
it did before so this doesn't really achieve much except one more
function call, however this is now all softpipe's fault (fixing that in
a way which doesn't suck is still an unsolved problem).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-28 03:39:53 +01:00
Maxence Le Doré
0845d16976 gallivm: fix mis-matching AOS instruction emission
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2013-02-27 20:23:01 +00:00
Jon TURNEY
f816a9f522 glx: Fix glXCreateWindow() when GLX_DIRECT_RENDERING is undefined
glXCreateWindow() and glXCreatePbuffer() always fail when built without
GLX_DIRECT_RENDERING defined since commit 48331047.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
2013-02-27 13:36:19 -05:00
Francisco Jerez
4deefd9ba6 configure.ac: Clarify the description of the --with-opencl-libdir parameter a little.
https://bugs.freedesktop.org/show_bug.cgi?id=61415

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
2013-02-27 12:27:13 +01:00
Vinson Lee
f987d23b28 radeonsi: Fix memory leak in si_set_constant_buffer.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-02-26 20:03:11 -08:00
Vinson Lee
f88ed1658c st/vega: Fix memory leak in combine_shaders.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-26 20:01:58 -08:00
Kristian Høgsberg
112ccfab44 egl/wayland: Don't block on EGL_DEFAULT_DISPLAY under wayland
Normally the application will own the main event queue and be responsible
for moving events.  In case of EGL_DEFAULT_DISPLAY, EGL opens the display
and has to own the main queue so it can move the events itself.
Call wl_display_dispatch_pending() to take ownership.
2013-02-26 12:49:49 -05:00
Ian Romanick
68a147e9a9 egl: Allow 24-bit visuals for 32-bit RGBA8888 configs
Previously only the 32-bit X visual would match the 32-bit RGBA8888
configs.  This resulted in every config with alpha getting the "magic"
visual whose alpha is used by the compositor.  This also resulted in no
multisample visuals being advertised.  How many ways could we lose?

This patch inverts the problem... now you can't get the visual with
alpha used by the compositor even if you want it.  I think we need to
invent a new value for EGL_TRANSPARENT_TYPE that apps can use to get
this.  I'm surprised that there isn't already a choice for
EGL_TRANSPARENT_ALPHA.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Tian Ye <yex.tian@intel.com>
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59783
2013-02-26 09:42:31 -08:00
Brian Paul
e2148ab043 st/mesa: remove some conditionals in update_raster_state()
Just use simple assignments.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-26 09:16:52 -07:00
Alex Deucher
e5e4c07e79 r600g: add missing emit_flush for R600_CONTEXT_FLUSH_AND_INV case
We set the cp_coher_cntl bits but never emit them.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-26 10:30:26 -05:00
Alex Deucher
d54bc5d227 r600g: synchronize streamout buffers on r6xx too (v3)
Streamout buffers need to be synchronized on r6xx as
well.

v2: Add DEST flush as well.
v3: drop DEST flush

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-26 10:30:10 -05:00
Brian Paul
62329d77b8 winsys/null: fix var typo templet->templat 2013-02-26 08:20:16 -07:00
Brian Paul
02bf645111 svga: fix comment typos 2013-02-26 08:20:16 -07:00
Marek Olšák
d8d58bdcb9 r300g: implement 3D transfers
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=61351
2013-02-26 01:14:20 +01:00
Marek Olšák
3857f450a6 gallium/util: add helper util_max_layer from r600g 2013-02-26 01:14:05 +01:00
Roland Scheidegger
52c44cee1e llvmpipe: (trivial) get rid of old function prototypes.
llvmpipe_init_screen/context_texture_funcs have long been replaced
with the respective "resource" funcs.
2013-02-25 20:38:23 +01:00
Roland Scheidegger
c0ba1080df draw: make sure pipeline is revalidated when sampler views or samplers change.
Since with llvm execution parts of the sampler view and sampler state are baked into
the shader, we need to revalidate otherwise the wrong shader might get used.
(Not completely sure but I think this would not be required for non-llvm case,
along with everything else in these functions.)
This caused bugs in piglit arb_texture_buffer_object-formats, because we never
noticed that the view format changed.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-25 20:38:23 +01:00
Roland Scheidegger
20183177a5 llvmpipe: support GL_ARB_texture_buffer_object/GL_ARB_texture_buffer_range
This also fixes not honoring first/last_layer view parameters for array
textures, plus not honoring last_level view parameter for all textures
(neither is really used by OpenGL).
This mostly passes piglit arb_texture_buffer_object tests (it needs, however,
glsl 140 version override, plus GL 3.1 override, the latter only because
mesa does not allow ARB_tbo in non-core contexts).
Most arb_texture_buffer_object tests pass, with the exception of
arb_texture_buffer_object-formats. With "arb" parameter it passes most weirdo
formats before it segfaults in the state tracker, this looks to be some issue
with using legacy formats in core context (fails the same in softpipe).
With "core" parameter it passes with "fs", however fails with "vs" (for most
formats). This will be fixed later (debugging shows we're completely missing
the shader recompile depending on format).

v2: based on Jose's feedback, fix comments, variable/function names.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-25 20:38:23 +01:00
Eric Anholt
50a5d5dea0 i965: Fix the W value of deprecated pointcoords on pre-gen6.
When you didn't have a texcoord array bound (or a non-1 current w
attrib), we were telling the fragment shader that it could just use "1"
instead of doing expensive pre-gen6 math to invert it.  If you drew the
point with a non-1 W value, then you'd get the right size (since all the
vertex computations worked), but we'd mis-interpolate the coordinate
across the face.

Fixes the mesa pointsprite demo on GM45.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30232
Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Note: This is a candidate for the stable branches.
2013-02-25 11:21:44 -08:00
Tapani Pälli
3cdb548bfb mesa/es: NULL check in EGLImageTargetTexture2DOES
check that pointer passed is valid and return error if not.

Note: This is a candidate for the stable branches.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-02-25 09:17:31 -08:00
Tapani Pälli
331967c773 mesa: add missing case in _mesa_GetTexParameterfv()
missing case GL_REQUIRED_TEXTURE_IMAGE_UNITS_OES is required
by OES_EGL_image_external extension.

Note: This is a candidate for the stable branches.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-02-25 09:17:20 -08:00
Andreas Boll
533dc3b690 docs: add news item for mesa-demos 8.1.0 release 2013-02-25 11:31:08 +01:00
Andreas Boll
d209926666 docs: import release notes for 9.1, add news item 2013-02-25 10:47:02 +01:00
Jordan Justen
0486d50320 glsl: Remove VS output varyings which are optimized out of the FS
Previously when an input varying was optimized out of the
FS we would still retain it as an output of the VS.

We now build a hash of live FS input varyings rather
than looking in the FS symbol table. (The FS symbol table
will still contain the optimized out varyings.)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-23 16:20:28 -08:00
Vinson Lee
f6487e8911 vl: Fix off-by-one error in device_name_length allocation.
Fixes out-of-bounds write reported by Coverity.
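
The general shape of this class of bug, sketched with a hypothetical helper rather than the actual vl code: strlen() does not count the terminating NUL, so the allocation needs one extra byte.

```c
#include <stdlib.h>
#include <string.h>

/* Copy a device name into a freshly allocated buffer.
 * Returns NULL on allocation failure; the caller frees the result. */
static char *
dup_device_name(const char *name)
{
   size_t len = strlen(name);

   /* The "+ 1" is the byte an off-by-one allocation forgets. */
   char *copy = malloc(len + 1);
   if (!copy)
      return NULL;

   memcpy(copy, name, len + 1);   /* copies the NUL terminator too */
   return copy;
}
```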

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel@daenzer.net>
2013-02-23 14:57:05 -08:00
John Kåre Alsaker
65aa1a194d llvmpipe: Fix creation of shared and scanout textures.
NOTE: This is a candidate for the stable branches.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2013-02-23 18:36:58 +00:00
José Fonseca
fdb88967e3 util/u_blitter: Set pipe_sampler_state::normalized_coords correctly.
We might want to revisit the normalized_coords semantics, but this is
the current expected behavior.

Fixes fdo bug 61091.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-23 18:36:57 +00:00
Brian Paul
2557d3f9c3 svga: remove some extraneous whitespace 2013-02-23 08:20:36 -07:00
Brian Paul
840d6faf68 st/mesa: fix debug_printf() format string warning
Use %td for ptrdiff_t (aka GLsizeiptrARB).
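
A small standalone illustration using standard snprintf() instead of debug_printf(); the wrapper name is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Format a ptrdiff_t with the C99 "t" length modifier. Using plain %d
 * or %ld here is what triggers the format string warning. */
static int
format_ptrdiff(char *out, size_t out_size, ptrdiff_t value)
{
   return snprintf(out, out_size, "%td", value);
}
```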
2013-02-23 08:20:36 -07:00
José Fonseca
0d760a8160 util/dump: Use static assertion to detect string table size mismatches.
Suggested by Brian Paul.

Could probably be extended to other enums.
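
A sketch of the technique with a made-up enum and C11 _Static_assert; the tree's own static assertion macro and the real dump tables are not shown here.

```c
#include <stddef.h>

/* Hypothetical enum with a trailing count member. */
enum blend_func {
   BLEND_ADD,
   BLEND_SUBTRACT,
   BLEND_MIN,
   BLEND_MAX,
   BLEND_COUNT
};

static const char *const blend_func_names[] = {
   "add",
   "subtract",
   "min",
   "max",
};

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Compilation fails if someone adds an enum value but no string. */
_Static_assert(ARRAY_LEN(blend_func_names) == BLEND_COUNT,
               "blend_func_names is out of sync with enum blend_func");

/* Look up a name, guarding against out-of-range values. */
static const char *
blend_func_name(unsigned func)
{
   return func < BLEND_COUNT ? blend_func_names[func] : "<invalid>";
}
```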

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-23 13:32:34 +00:00
Vinson Lee
2fa9e4c97c st/xvmc/tests: Ensure colorkey is initialized.
Fixes uninitialized scalar variable defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-02-22 19:32:00 -08:00
Vinson Lee
54afbce934 st/vdpau: Fix memory leak in vlVdpBitmapSurfaceCreate.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-02-22 19:30:03 -08:00
Vinson Lee
1bac4a1e6f st/vdpau: Fix memory leak in vlVdpOutputSurfaceCreate.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-02-22 19:29:56 -08:00
Tapani Pälli
b4dba5bba2 glapi: mark static_dispatch false for DiscardFramebufferEXT
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61199
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Brad King <brad.king@kitware.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-02-22 17:18:08 -08:00
Brian Paul
b804fb8714 llvmpipe: rename polygon offset fields to something more specific
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-22 16:49:05 -07:00
Brian Paul
f93c580063 llvmpipe: add missing checks for polygon offset point/line modes
The llvm pipeline handles regular filled triangle offsets, but it
doesn't handle offsets for triangles drawn in point or line mode.

Fixes failures found with new piglit polygon-mode-offset test.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-22 16:49:05 -07:00
Brian Paul
d6b8b116ee draw: fix broken polygon offset stage
There were several issues.  We weren't handling different front/back
polygon fill modes.  We weren't checking whether the offset applied to
fill mode vs. line mode vs. point mode.
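
The two checks can be sketched like this. The struct is a simplified, hypothetical view of the rasterizer state and not the draw module's actual code.

```c
#include <stdbool.h>

enum fill_mode { FILL_POINT, FILL_LINE, FILL_SOLID };

/* Simplified rasterizer state: a fill mode per face and one offset
 * enable per fill mode. */
struct raster_state {
   enum fill_mode fill_front;
   enum fill_mode fill_back;
   bool offset_point;
   bool offset_line;
   bool offset_tri;
};

/* Is polygon offset enabled for this particular fill mode? */
static bool
offset_enabled_for_mode(const struct raster_state *rast, enum fill_mode mode)
{
   switch (mode) {
   case FILL_POINT: return rast->offset_point;
   case FILL_LINE:  return rast->offset_line;
   case FILL_SOLID: return rast->offset_tri;
   }
   return false;
}

/* Pick the fill mode from the triangle's facing first, then check the
 * offset enable that belongs to that mode. */
static bool
triangle_needs_offset(const struct raster_state *rast, bool front_facing)
{
   enum fill_mode mode = front_facing ? rast->fill_front : rast->fill_back;
   return offset_enabled_for_mode(rast, mode);
}
```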

Fixes problems found with the Visualization Toolkit (VTK) test suite.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-22 16:49:05 -07:00
Brian Paul
a2c105e31e st/mesa: fix polygon offset state translation logic
The old logic was kind of twisted, but seemed to work in practice.

Note: This is a candidate for the stable branches.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-02-22 16:49:05 -07:00
Brian Paul
8bb291b0f5 st/mesa: check for dummy programs in destroy_program_variants()
When we destroy an ARB vp/fp whose ID was gen'd but not otherwise used we
get a pointer to the dummy/placeholder program.  We can't destroy that one
so just skip it.  This only failed during context tear-down because
glDeleteProgramsARB() was already aware of dummy programs.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=38086

Note: This is a candidate for the stable branches.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-02-22 16:49:05 -07:00
Brian Paul
8589cc41b3 st/mesa: fix trimming of GL_QUAD_STRIP
We sometimes convert GL_QUAD_STRIP prims into GL_TRIANGLE_STRIP, but
that changes the results of the u_trim_pipe_prim() call.  We need to
pass the original primitive type to the trim function.

Note that OpenGL's GL_x prim type values match Gallium's PIPE_PRIM_x values.

Fixes a failure in the new piglit degenerate-prims test.

Note: This is a candidate for the stable branches.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-02-22 16:49:05 -07:00
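To illustrate why 8589cc41b3 above passes the original primitive type to the trim function, here is a rough sketch of vertex-count trimming. The enum and helper are hypothetical and only mimic the idea; they are not u_trim_pipe_prim() itself. A quad strip needs at least 4 vertices and grows 2 at a time, while a triangle strip needs 3 and grows 1 at a time.

```c
#include <assert.h>

/* Hypothetical primitive types, for illustration only. */
enum prim_type { PRIM_TRIANGLE_STRIP, PRIM_QUAD_STRIP };

/* Drop trailing vertices that cannot form a complete primitive. */
static unsigned trim_vertex_count(enum prim_type prim, unsigned count)
{
    unsigned min, incr;

    assert(prim == PRIM_TRIANGLE_STRIP || prim == PRIM_QUAD_STRIP);
    if (prim == PRIM_QUAD_STRIP) {
        min = 4;   /* first quad */
        incr = 2;  /* each further quad adds two vertices */
    } else {
        min = 3;   /* first triangle */
        incr = 1;  /* each further triangle adds one vertex */
    }

    if (count < min)
        return 0;
    return count - (count - min) % incr;
}
```

With 5 vertices, trimming as a quad strip keeps 4, but trimming as a triangle strip keeps all 5, so the stray vertex would produce an extra triangle.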
Alex Deucher
8b5acad0e9 r600g: fixup PS_PARTIAL_FLUSH flag handling for cayman
So we don't emit it twice if we ever use the flag on
cayman.

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-02-22 18:43:27 -05:00
Alex Deucher
8442b67f5f r600g: r6xx deadlock workaround (v6)
Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=50655
https://bugs.freedesktop.org/show_bug.cgi?id=47116

v2: flush along with workaround.
v3: just need a flush
v4: try WAIT_UNTIL
v5: switch to PS partial flush
v6: rework patch

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-02-22 18:23:46 -05:00
Alex Deucher
7ebf83f109 r600g: add PS_PARTIAL_FLUSH flag
PS_PARTIAL flushes seem to be required in certain
cases to prevent hangs, especially on r6xx.

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-02-22 18:23:31 -05:00
Ian Romanick
7ae6864f0d i965: Enable OpenGL ES 3.0 on Sandy Bridge
Regardless of what we put in the screen structure, all of the extensions
that compute_version_es2 checks are present and 3.0 will be exposed
anyway.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-22 13:57:44 -08:00
Lauri Kasanen
0a82828ad5 configure: Fix build with automake < 1.11
Commit 86d30dea3c broke building with older
automake versions with this error:

Makefile:769: *** Recursive variable am__v_YACC_ references itself (eventually).  Stop.

This patch fixes it. Fix stolen from xorg-macros.

Signed-off-by: Lauri Kasanen <cand@gmx.com>
2013-02-22 13:15:14 -08:00
Anuj Phogat
cff862f90d meta: Allocate texture before initializing texture coordinates
tex->Sright and tex->Ttop are initialized during texture allocation.
This fixes depth buffer blitting failures in khronos conformance tests
when run on desktop GL 3.0.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=59495

Note: This is a candidate for stable branches.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-02-22 12:03:59 -08:00
Eric Anholt
92a204b493 mesa: Fix setup of ctx->Point.PointSprite for GLES2.
The recent change for GL core broke the older setup, which broke
gl_PointCoord on pre-gen6 (where gl_PointCoord is undefined if point
sprites are disabled).  Fixes the new piglit GLES-2.0/glsl-fs-pointcoord
test.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32429
Note: This is a candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-22 10:55:39 -08:00
Eric Anholt
7b0731d940 i965/fs: Fix broken math on values loaded from uniform buffers on gen6.
In a debug build this led to assertion failures, but on a non-debug
build the hardware would just reference the whole vec8 instead of the
same channel 8 times.

Fixes the new piglit glsl-1.40/uniform-buffer/fs-exp2.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57121
Note: This is a candidate for the stable branches
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-22 10:50:50 -08:00
José Fonseca
cd01cc3b48 tgsi: Improve execution debugging.
- zero temps/outputs instead of copying (otherwise we won't be able to see
  the temps/outputs assignments for small shaders where nothing changes
  across big areas)

- also show the inputs (as it's often impossible to infer from the rest)

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-22 16:19:58 +00:00
José Fonseca
f8436c17e4 util/u_dump: Update texture target strings. 2013-02-22 16:19:58 +00:00
Sergey Matyukevich
21e8af0b09 util/debug: Always use __builtin_frame_address on gcc.
Should work around fdo bug 57563.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2013-02-22 16:19:58 +00:00
Michel Dänzer
f6b40ddd2d radeon/llvm: Remove stale comment about radeon_llvm_emit_prepare_cube_coords 2013-02-22 13:06:07 +01:00
Marek Olšák
aac8138744 r600g: fix random corruption with CP DMA in TF2
NOTE: This is a candidate for the 9.1 branch.
2013-02-22 12:49:15 +01:00
Michel Dänzer
3447cc4856 radeonsi: Don't pretend there is any R8G8B8 support
The hardware can't do it.
2013-02-22 11:44:24 +01:00
Andreas Boll
c1f2c3a80f llvmpipe/build: add DLOPEN_LIBS and PTHREAD_LIBS to the lp_test_* targets
Fixes undefined symbols.

NOTE: This is a candidate for the 9.1 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61052
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-22 10:21:43 +01:00
Andreas Boll
c1eb585f3d targets/xa-vmwgfx: Force c++ linker to fix undefined symbols
NOTE: This is a candidate for the 9.1 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61200
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-22 10:21:43 +01:00
Roland Scheidegger
b6f15954b4 llvmpipe: Fix rendering into PIPE_FORMAT_X8*_UNORM.
Mesa state tracker recently started using PIPE_FORMAT_X8B8G8R8_UNORM,
causing segfaults in texture-packed-formats, because swizzle[chan] was
0xff for padding channel (X).

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2013-02-22 09:00:45 +00:00
José Fonseca
8ed1279b10 trace: Never close stdout/stderr.
This could happen, when a trace screen was destroyed and then recreated.
2013-02-22 08:45:07 +00:00
José Fonseca
59025d6e95 trace: Fix set_constant_buffer dumping.
We were dumping the trace driver pointer, instead of the pointer from the
underlying pipe driver.
2013-02-22 08:40:47 +00:00
Vinson Lee
b92984b2fa r600g: Fix memory leak in r600_shader_select.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reported-by: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-21 21:49:24 -08:00
Roland Scheidegger
66c3cd0be3 llvmpipe: simplify buffer allocation logic.
Now that buffer formats have been clarified, we don't need all that logic
any longer.
(Note that it never would have worked in any case: because blockwidth and
blockheight were swapped, any allocation with a multi-byte format would have
had zero size.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-22 04:34:07 +01:00
Roland Scheidegger
2cfee2295f gallium/docs: improve text about resources a bit.
This clarifies some things and gets rid of some old stuff.
The most significant one is probably that buffers cannot have formats
(nearly all drivers completely ignored format and used width0 as byte size
already in any case). There seems to be no use case for "structured" buffers.
(Note while d3d11 has new Structured Buffers, these still aren't associated
with a format, rather a byte stride, which we can't do yet either way.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-22 04:34:07 +01:00
Roland Scheidegger
f972567671 draw: make sure key size is calculated consistently.
Some parts calculated key size by using shader information, others by using
the pipe_vertex_element information. Since it is perfectly valid to have more
vertex_elements set than the vertex shader is using those may not be the same,
so we weren't copying over all vertex_element state - this caused the tgsi dump
to assert (iterates over all vertex elements). More importantly in this
situation it would also break vertex texturing completely (since the sampler
state derived from the key is at a different position than expected).
Fix this by deriving key->nr_vertex_elements from the shader information
instead of the pipe_vertex_element state (unlike dx10, we can't have "holes"
in pipe_vertex_element state, so this should be safe).
(Note that actual llvm shader generation does not use the pipe_vertex_element
state from the key itself in any case (although I guess it could) but uses
the one from draw.pt (which should be the same though contains all elements)
instead.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-22 04:34:07 +01:00
Tom Stellard
10bcc843f8 r300g/compiler: Fix bug in OMOD folding
The OMOD value was only being folded to one instruction in cases where
the MUL instruction was reading a value written by more than one
instruction.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-21 22:07:28 -05:00
Tom Stellard
5e1321ddf4 r300g/tests: Add helper functions for creating a full program
Now you can convert assembly strings into a full struct radeon_compiler
object and use it to test individual compiler passes.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-21 22:07:27 -05:00
Tom Stellard
bcf2e157ca r300g/tests: Exit test runner with a valid status code
This way make check can report whether or not the tests pass.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-21 22:07:27 -05:00
Tom Stellard
5355fc1e87 r300g/compiler: Make r300_vertprog_swizzle_caps visible in other files
This will be used by the test suite in later commits.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-21 22:07:27 -05:00
Tom Stellard
c3df498ff9 r300g/compiler: Fix typo in comment
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-21 22:07:27 -05:00
Tom Stellard
27d140b960 r300g/compiler: Add missing license headers
These are all files that I authored, but forgot to add the license
headers.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-21 22:07:27 -05:00
Carl Worth
f5a8084692 i965: Avoid segfault in gen6_upload_state
This fixes a bug introduced in commit 258453716f and
triggered whenever "rb" is NULL.

Fixes at least one cause of bug #59445:

	[SNB/IVB/HSW Bisected]Oglc draw-buffers2(advanced.blending.none) segfault
	https://bugs.freedesktop.org/show_bug.cgi?id=59445

(Segfaults are still possible in that test case, but they have been
present since before commit 258453716f, which is what's being fixed here.)

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-21 12:09:24 -08:00
Alex Deucher
2e4ef989a2 r600g: don't enable ReZ mode on evergreen
Can cause lockups in certain cases when
zfunc/zenable/zwrite change without a flush
in between.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=60969
and lockups on Civ4 with wine.

This is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-21 11:59:07 -05:00
Andreas Boll
f7d87332b0 docs: import release notes for 9.0.3, add news item 2013-02-21 17:31:42 +01:00
Michel Dänzer
b63b3012c9 radeonsi: Don't match TGSI_SEMANTIC_POSITION fs inputs to vs outputs 2013-02-21 10:07:18 +01:00
Michel Dänzer
954bc4ac34 radeonsi: Fix w component of TGSI_SEMANTIC_POSITION fragment shader inputs.
It's the reciprocal of the register value.

Fixes piglit fragcoord_w and glsl-fs-fragcoord-zw-perspective.

NOTE: This is a candidate for the 9.1 branch.
2013-02-21 10:06:52 +01:00
Michel Dänzer
18272c9b1b radeonsi: Fix up and enable flat shading.
Requires corresponding LLVM R600 backend fix to work correctly, but even
without that it doesn't hang anymore.

13 more little piglits.

Depends on LLVM: r175193, r175733

NOTE: This is a candidate for the 9.1 branch.
2013-02-21 09:14:36 +01:00
Vinson Lee
0d51906c07 radeonsi: Fix memory leak in si_shader_select.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-20 23:29:12 -08:00
Paul Berry
54d9c8a04a i965: Consign COORD_REPLACE VS hacks to Pre-Gen6.
Pre-Gen6, the SF thread requires exact matching between VS output
slots (aka VUE slots) and FS input slots, even when the corresponding
VS output slot is unused due to being overwritten by point coordinate
replacement (glTexEnvi(GL_POINT_SPRITE, GL_COORD_REPLACE, GL_TRUE)).
As a result, we have a special hack in the VS to ensure when any
texture coordinate is subject to point coordinate replacement, it is
always allocated space in the VUE, even if it isn't written to by the
VS.

This hack isn't needed from Gen6 onwards, since SF (Gen7: SBE)
swizzling has the ability to insert the point coordinate into
gl_TexCoord[] without needing a corresponding unused VUE slot.

Note that no modification of SF setup code is required for this
patch--get_attr_override() already does the right thing.  However, we
make a slight comment change to clarify why this works.

In addition to eliminating unnecessary VS recompiles and saving
precious URB space on Gen6+, this will save us the trouble of having
to adjust this hack when we implement geometry shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-20 13:48:45 -08:00
Ian Romanick
8b586322e7 mesa: Don't install glEvalMesh in the beginend dispatch table
NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59740
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-20 12:46:58 -08:00
Roland Scheidegger
83f7cde182 gallivm: fix indirect src register fetches requiring bitcast
For constant and temporary register fetches, the bitcasts weren't done
correctly for the indirect case, leading to crashes due to type mismatches.
Simply do the bitcasts after fetching (much simpler than fixing up the load
pointer for the various cases).

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=61036

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-20 19:37:30 +01:00
Roland Scheidegger
fbbcc1fcc4 llvmpipe: lp_resource_copy cleanup
We don't need to flush resources for each layer, and since we don't actually
care about layer at all in the flush function just drop the parameter.
Also we can use util_copy_box instead of repeated util_copy_rect.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-20 19:37:30 +01:00
Roland Scheidegger
95181ed2fd llvmpipe: fix lp_resource_copy using more than one 3d slice
These used to be illegal a very long time ago, then for some more time
nothing really emitted these so this code path wasn't hit.
Just trivially iterate over box->depth.
(Might be worth refactoring at some point since nowadays all the code
doesn't really do much except for depth textures.)

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=61093

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-20 19:37:30 +01:00
Tapani Pälli
413941e1a3 gles2: a stub implementation for GL_EXT_discard_framebuffer
This patch implements a stub for GL_EXT_discard_framebuffer with
required checks listed by the extension specification. This extension
is required by GLBenchmark 2.5 when compiled with OpenGL ES 2.0
as the rendering backend.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
2013-02-20 10:01:45 -08:00
Michel Dänzer
73bf626713 r600g/Cayman: Fix blending using destination alpha factor but non-alpha dest
Only compile tested, but should fix at least some piglit fbo-blending tests.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-20 14:43:17 +01:00
Michel Dänzer
95bced5929 radeonsi: Fix blending using destination alpha factor but non-alpha destination
11 more little piglits.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-20 12:58:52 +01:00
Marek Olšák
72f4490b55 radeonsi: implement 3D transfers
That means we can map and read multiple slices with one transfer_map call.

[ Cherry-picked from r600g commit 1aebb6911e ]

11 more little piglits on master, 1 more on the 9.1 branch (Marek's
glTex(Sub)Image improvements on master broke the other 10).

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-02-20 12:30:59 +01:00
Marek Olšák
a84c4edeed radeonsi: add assertions to prevent creation of invalid surfaces
[ Cherry-picked from r600g commit ef11ed61a0 ]

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-02-20 12:30:32 +01:00
Marek Olšák
c4faab63c4 radeonsi: use u_box_origin_2d helper function
[ Cherry-picked from r600g commit b278aba423 ]

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-02-20 12:15:22 +01:00
Vinson Lee
c403a52666 configure.ac: Do not check for clock_gettime on MinGW.
MinGW does not have clock_gettime.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-19 21:17:37 -08:00
Zack Rusin
076403c30d DRI2: Don't disable GLX_INTEL_swap_event unconditionally
GLX_INTEL_swap_event is broken on the server side, where it's
currently unconditionally enabled. This completely breaks
systems running on drivers which don't support that extension.
There's no way to test for its presence on this side, so instead
of disabling it unconditionally, just disable it for drivers
which are known to not support it. It makes sense because
most drivers do support it right now.
We'll be able to remove this once Xserver properly advertises
GLX_INTEL_swap_event.

Note: This is a candidate for the stable branches.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60052
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-19 12:50:16 -08:00
Eric Anholt
4c64f65f5d i965/fs: Enable CSE on uniform pull constant loads.
Improves on a major performance regression for the dolphin wii emulator
from its move to using UBOs.  Performance in the UBO codepath (as
replayed through apitrace) is up 21.1% +/- 2.3% (n=26/29).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-19 10:34:03 -08:00
Eric Anholt
c2a6e529c3 i965/fs: Only do CSE when the dst types match.
We could potentially do some CSE even when the dst types aren't the same
on gen6 where there is no implicit dst type conversion iirc, or in the
case of uniform pull constant loads where the dst type doesn't impact
what's stored.  But it's not worth worrying about.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.1 branch.
2013-02-19 10:33:41 -08:00
Eric Anholt
aebd3f46e3 i965/fs: Delay setup of uniform loads until after pre-regalloc scheduling.
This should fix the register allocation explosion on the GLES 3.0 test
on gen6.  It also gives us an instruction that will fit our CSE handling.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.1 branch.
2013-02-19 10:33:32 -08:00
Eric Anholt
49bdebad38 i965/fs: Fix copy propagation with smearing.
We were correctly relaying the smear from MOV's src, but if the MOV
didn't do a smear, we don't want to smash the smear value from the
instruction being propagated into.  Prevents a regression in the
upcoming UBO change.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.1 branch.
2013-02-19 10:33:15 -08:00
Eric Anholt
de7cb1cff3 i965/fs: Add a bit more instruction dumping useful for upcoming work.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-19 10:33:00 -08:00
Tom Stellard
7cd248aa79 radeon/llvm: Fix build with LLVM 3.3 2013-02-19 15:52:55 +00:00
Tom Stellard
1f006717db r600g: Add $(DEFINES) to AM_CXXFLAGS
This way llvm_wrapper.cpp is compiled with -DHAVE_LLVM=0x....
2013-02-19 15:52:55 +00:00
Paul Berry
444246c7e3 i965: Remove unused userclip flags.
brw_vs_prog_data::userclip hasn't been used since commit f0cecd4
(i965: Move VUE map computation to once at VS compile time).

brw_gs_prog_key::userclip_active hasn't been used since commit 9f3d321
(i965: Make the userclip flag for the VUE map come from VS prog data).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-19 07:35:52 -08:00
Brian Paul
dfbcb1849c llvmpipe: fix handling of 0 x 0 framebuffer size
Bump up the size to 1 x 1.  This fixes a number of potential failure
points in the code.

See also http://bugs.freedesktop.org/show_bug.cgi?id=61012
2013-02-19 07:19:19 -07:00
Brian Paul
e2091f64cb st/xlib: initialize the drawable size in create_xmesa_buffer()
Otherwise, the PBuffer's size was never set.  This also initializes
the buffer size for windows, pixmaps, etc.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=61012

Note: This is a candidate for the stable branches.
2013-02-19 07:19:19 -07:00
Stefan Brüns
5876a5dbc0 glx: fix glGetTexLevelParameteriv for indirect rendering
A single element in a GLX reply is contained in the header itself.
The number of elements is denoted in the "n" field of the reply.
If "n" is 1, the length of additional data is 0.
The XXX_data_length() function of xcb does not return the length of
the (optional, n>1) data but the number of elements.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=59876

Note: This is a candidate for the stable branches.

Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Signed-off-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-19 07:19:19 -07:00
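The reply layout described in 5876a5dbc0 above can be sketched roughly as follows. The struct is a hypothetical simplification for illustration and does not match the real xcb reply types; the point is only that a single element lives in the header, while additional data exists only when n > 1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified reply; not the actual xcb structure. */
struct example_reply {
    uint32_t n;           /* number of elements in the reply */
    int32_t datum;        /* the single element, valid when n == 1 */
    const int32_t *data;  /* additional data, present only when n > 1 */
};

/* Copy the reply's elements into out, which must hold at least n values. */
static void read_reply(const struct example_reply *reply, int32_t *out)
{
    assert(reply != NULL && out != NULL);
    if (reply->n == 1) {
        /* One element: it is stored in the header, the extra data is empty. */
        out[0] = reply->datum;
    } else if (reply->n > 1) {
        /* Several elements: n counts elements, so scale by the element size. */
        memcpy(out, reply->data, reply->n * sizeof(int32_t));
    }
}
```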
Brian Paul
63c30d7e4f st/mesa: implement glBitmap unpacking from a PBO, for the cache path
We weren't mapping the PBO when using the bitmap cache (but we had
the PBO code for the non-cache path.)

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=61026

Note: This is a candidate for the stable branches.
2013-02-19 07:19:19 -07:00
Brian Paul
5da967aff5 draw: fix non-perspective interpolation in interp()
This fixes a regression from ab74fee5e1.
When we use the clip coordinate to compute the screen-space interpolation
factor, we need to first apply the divide-by-W step to the clip
coordinate.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=60938

Note: This is a candidate for the 9.1 branch.
2013-02-19 07:19:18 -07:00
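A rough sketch of the idea in 5da967aff5 above, reduced to a single axis: given a clip-space interpolation factor t along an edge, the screen-space factor is computed only after each clip coordinate has been divided by its W. The function name and parameter layout are assumptions made for illustration; this is not the code in interp().

```c
#include <assert.h>

/*
 * Screen-space interpolation factor for the point at clip-space factor t
 * on the edge from "out" to "in", measured along the x axis.
 */
static float screen_space_factor(float out_x, float out_w,
                                 float in_x, float in_w, float t)
{
    /* Clip-space position of the new vertex on the edge. */
    float new_x = out_x + t * (in_x - out_x);
    float new_w = out_w + t * (in_w - out_w);

    assert(out_w != 0.0f && in_w != 0.0f && new_w != 0.0f);

    /* Apply the divide-by-W step before comparing positions. */
    float out_ndc = out_x / out_w;
    float in_ndc = in_x / in_w;
    float new_ndc = new_x / new_w;

    float span = in_ndc - out_ndc;
    /* A degenerate edge along this axis falls back to the clip-space factor. */
    return span != 0.0f ? (new_ndc - out_ndc) / span : t;
}
```

When both endpoints share the same W, the two factors coincide; they diverge as soon as W differs along the edge.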
Marek Olšák
07cdfdb708 st/mesa: remove what is left from u_blit
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-18 17:57:41 +01:00
Marek Olšák
40ee93c4e8 st/mesa: simplify and improve CopyTexSubImage
It has become a bit messy.

Changes:

- finally correct checking for transfer ops depending on the base format

- making sure the base internal format and the texture format match
  (we were ignoring it, but it's important for correctness)

- the way-too-strict rule that both src and dst base formats must be the same
  was dropped; ensuring the simpler and more permissive rule mentioned above
  is enough

- stop using util_blit_pixels; pipe->blit is flexible enough, and now that we
  have RGBX and red-alpha formats, pipe->blit can be used for more cases

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-18 17:57:41 +01:00
Marek Olšák
6520a86c67 st/mesa: don't do sRGB conversion in CopyTexSubImage
Assuming I understand EXT_texture_sRGB correctly.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-18 17:57:41 +01:00
Marek Olšák
0a1479c829 st/mesa: implement blit-based TexImage and TexSubImage
A temporary texture is created such that it matches the format and type
combination and pixels are copied to it using memcpy. Then the blit is used to
copy the temporary texture to the texture image being modified by TexImage or
TexSubImage. The blit takes care of the format and type conversion and
swizzling. The result is a very fast texture upload involving as little CPU
as possible.

This improves performance in apps which upload textures during rendering.
An example is the Wine OpenGL backend for DirectDraw, which I used to test
the game StarCraft. Profiling had shown that TexSubImage was taking 50% of
CPU time without this patch, which was the main motivation for this work, and
now TexSubImage only takes 14% of CPU time. I had to underclock my CPU to see
any difference in the game and this patch does make the game a lot faster
if the CPU is slow (or using the powersave cpufreq profile).

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-18 17:57:41 +01:00
Marek Olšák
a6e0ac9571 st/mesa: fix blit-based GetTexImage for 1D array textures
This is not easy to hit, because we have 3 code paths now
(tried in this order):
- memcpy-based (skips the blit) -> _mesa_tex_getimage
- blit-based
- slow pixel packing -> _mesa_tex_getimage

The main difference later in the code is the parameters of
_mesa_image_address3d.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-18 17:57:41 +01:00
Marek Olšák
91acf6225a st/mesa: fix blit-based GetTexImage for depth/stencil formats
BTW, we have 0 tests for glGetTexImage(format=GL_DEPTH*).

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-18 17:57:41 +01:00
Marek Olšák
0181e18d0f st/mesa: factor out code for determining blit.mask from CopyTexSubImage
I'll need this later.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-18 17:57:41 +01:00
Michel Dänzer
9c1107b3e1 radeonsi: Fix PIPE_FORMAT_X32_S8X24_UINT sampler hardware format
4 more little piglits.

NOTE: This is a candidate for the 9.1 branch.
2013-02-18 15:59:02 +01:00
Michel Dänzer
8356962853 radeonsi: Use stencil surface level information for stencil texturing
7 more little dwarves^W piglits.

NOTE: This is a candidate for the 9.1 branch.
2013-02-18 15:58:37 +01:00
Michel Dänzer
f9adf79876 radeonsi: properly implement S8Z24 depth-stencil format
Based on r600g commit 2b9659c9e6 .

Fixes crashes with 4 piglit tests which are now hitting these formats.

NOTE: This is a candidate for the 9.1 branch.
2013-02-18 15:58:05 +01:00
Vincent Lejeune
0527317e1f r600g/llvm: Support for TBO
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2013-02-18 15:08:59 +01:00
Vincent Lejeune
c116598f86 r600g/llvm: Set Inputs/Outputs count to 32 (api reported value)
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2013-02-18 15:08:54 +01:00
Vincent Lejeune
90e6f47ac8 r600g/llvm: Fix alpha_to_one piglit tests
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2013-02-18 15:08:50 +01:00
Vincent Lejeune
ef8fde6acb r600g/llvm: Add support for UBO
NOTE: This is a candidate for the Mesa stable branch.

Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2013-02-18 15:08:45 +01:00
Christopher James Halse Rogers
dd599188d2 i965: Fix leak in blorp CopyTexSubImage2D
_mesa_delete_renderbuffer does not call the driver-specific
renderbuffer delete function, so the blorp code was leaking the
Intel-specific bits, including some GEM objects.

Call the renderbuffer's ->Delete() method instead, which does the
right thing.

Fixes Unity rapidly sending the machine into the arms of the OOM-killer.

Note: This is a candidate for the 9.1 branch.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-16 08:11:14 -08:00
Roland Scheidegger
f1ab67c13a gallivm/tgsi: fix issues with sample opcodes
We need to encode them as Texture instructions since the NumOffsets field
is encoded there. However, we don't encode the actual target in there, this
is derived from the sampler view src later.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-16 02:40:59 +01:00
Roland Scheidegger
cb2e678294 gallivm/tgsi: fix src modifier fetching with non-float types.
Need to take the type into account. Also, if we want to allow
mov's with modifiers we need to pick a type (assume float).

v2: don't allow all modifiers on all types, in particular don't allow
absolute on non-float types and don't allow negate on unsigned.
Also treat UADD as signed (despite the name) since it is used
for handling both signed and unsigned integer arguments and otherwise
modifiers don't work.
Also add tgsi docs clarifying this.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-16 02:40:51 +01:00
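The v2 rules in cb2e678294 above (no absolute on non-float types, no negate on unsigned) can be written as a small predicate. The enum and function are hypothetical names for illustration, not gallivm's or TGSI's actual types.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand types, for illustration only. */
enum value_type { TYPE_FLOAT, TYPE_SIGNED, TYPE_UNSIGNED };

/* Return whether the requested source modifiers are valid for the type. */
static bool modifiers_allowed(enum value_type type, bool absolute, bool negate)
{
    assert(type == TYPE_FLOAT || type == TYPE_SIGNED || type == TYPE_UNSIGNED);
    if (absolute && type != TYPE_FLOAT)
        return false;  /* absolute is only allowed on float */
    if (negate && type == TYPE_UNSIGNED)
        return false;  /* negate is not allowed on unsigned */
    return true;
}
```

Under these rules an opcode treated as signed, as the message says UADD is, can still take a negate modifier.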
Roland Scheidegger
c25ae5d27b gallivm: fix issues with trunc/round/floor/ceil with no arch rounding
The emulation of these if there's no rounding instruction available
is a bit more complicated than what the code did.
In particular, doing fp-to-int/int-to-fp will not work if the exponent
is large enough (and with NaNs, Infs). Hence such values need to be filtered
out and the original value returned in this case (which fortunately should
always be exact). This comes at the expense of performance (if your cpu
doesn't support rounding instructions).
Furthermore, floor/ifloor/ceil/iceil were affected by precision issues for
values near negative (for floor) or positive (for ceil) zero, fix that as well
(fixing this issue might not actually be slower except for ceil/iceil if the
type is not signed which is probably rare - note iceil has no callers left
in any case).

Also add some new rounding test values in lp_test_arit to actually test
for that stuff (which previously would have failed without sse41).

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=59701.
2013-02-16 02:40:44 +01:00
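A scalar sketch of the floor emulation described in c25ae5d27b above, assuming 32-bit IEEE floats. gallivm builds vectorized LLVM IR rather than C like this, so the function below only illustrates the filtering step: values whose exponent is too large for the fp-to-int round trip, along with NaNs and Infs, are returned unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Emulate floorf() using only float<->int conversions and compares. */
static float emulated_floorf(float x)
{
    float ax = x < 0.0f ? -x : x;

    /*
     * At |x| >= 2^23 a float has no fractional bits, so x is already
     * integral and is returned exactly. The negated comparison also
     * routes NaN and Inf here, since "NaN < c" is false.
     */
    if (!(ax < 8388608.0f))
        return x;

    int32_t i = (int32_t)x;  /* safe now: the value fits in an int32 */
    float t = (float)i;      /* truncated toward zero */

    /* Truncation rounds negative non-integers up; step down by one. */
    float r = (t > x) ? t - 1.0f : t;
    assert(r <= x);
    return r;
}
```

Small negative values such as -0.25 truncate to 0 and are then corrected to -1, which is the near-zero case the message calls out for floor.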
Roland Scheidegger
70daad6a99 gallivm: DIV shouldn't be deprecated.
(Though it looks like glsl won't emit it.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-16 02:40:36 +01:00
Matt Turner
00f6fe6c66 mesa: Use PROGRAM_ERROR_STRING_ARB instead of the _NV name
Since NV_fragment_program is now gone. No functional change, since the
values are identical.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-15 10:28:12 -08:00
Brian Paul
2ef530cf68 trace: add context pointer sanity checking
To help catch mixed up context pointer bugs in the future, add a
trace_context_check() function and some new assertions.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-15 11:11:34 -07:00
Brian Paul
82d62cf04f trace: fix incorrect trace_surface::base.context pointer
When a trace_surface object is created in trace_surf_create() we
weren't correctly setting the surface's context pointer.  Instead of
it being the trace context, it was the wrapped driver's context.
This caused things to blow up sometimes during surface deallocation.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-15 11:11:34 -07:00
Brian Paul
3b0de75c4d mesa: remove old version comment from gl.h 2013-02-15 09:25:15 -07:00
Brian Paul
70135e915a trace: whitespace, comment clean-ups 2013-02-15 09:25:15 -07:00
Brian Paul
7b836a7d25 trace: move struct tr_list to tr_texture.h
That's the only place it's used.
2013-02-15 09:25:15 -07:00
Brian Paul
4be5a06752 st/mesa: fix format query for GL_ARB_texture_rg
The GL_ARB_texture_rg spec says that we need to support both texturing
and rendering for the GL_RED and GL_RG formats.  So move the format
check up into the rendertarget_mapping[] list.  Also, add
PIPE_FORMAT_R8_UNORM to the list of formats required.

Note: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-15 09:25:14 -07:00
Eric Anholt
c37992c54d i965/fs: Do a general SEND dependency workaround for the original 965.
We'd been ad-hoc inserting instructions in some SEND messages with no
knowledge of when it was required (so extra instructions), but not all SENDs
(so not often enough).  This should do much better than that, though it's
still flow-control-ignorant.

v2: Use BRW_MAX_MRF instead of magic numbers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58960
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: Candidate for the stable branches.
2013-02-15 06:17:46 -08:00
Kristian Høgsberg
6dbe94c12c egl-wayland: Fix left-over wl_display_roundtrip() usage
We have to use the EGL wayland event queue for roundtrip, so use the
wayland_roundtrip() helper, which does just that.
2013-02-14 20:48:05 -05:00
Eric Anholt
5bb05c6e6d i965/gen7: Set up all samplers even if samplers are sparsely used.
In GLSL, sampler indices are allocated contiguously from 0.  But in the
case of ARB_fragment_program (and possibly fixed function), an app that
uses texture 0 and 2 will use sampler indices 0 and 2, so we were only
allocating space for samplers 0 and 1 and setting up sampler 0.  We
would read garbage for sampler 2, resulting in flickering textures and
an angry simulator.

Fixes bad rendering in 0 A.D. and ETQW.  This was fixed for pre-gen7 by
28f4be9eb9

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=25201
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58680
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for stable branches.
2013-02-14 15:14:09 -08:00
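The sizing problem described in 5bb05c6e6d can be illustrated with a small sketch. This is not the i965 code: the function names are hypothetical, and a plain set of sampler indices stands in for the driver's state.

```python
def sampler_count_by_population(used_indices):
    """Buggy sizing: counts how many samplers are in use."""
    return len(used_indices)


def sampler_count_by_highest_index(used_indices):
    """Sizing that covers every slot up to the highest index in use."""
    return max(used_indices) + 1 if used_indices else 0
```

With textures 0 and 2 in use, the first function yields space for two samplers (indices 0 and 1), so index 2 falls outside the table; the second yields three.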
Marek Olšák
34dc4d6b67 r600g: add support for red-alpha render targets 2013-02-14 14:59:36 +01:00
Marek Olšák
ec5376f5d8 r300g: add support for red-alpha render targets 2013-02-14 14:59:36 +01:00
Marek Olšák
5d3b8ad24b st/mesa: try to find exact format matching user format and type for DrawPixels
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-14 14:51:46 +01:00
Marek Olšák
2b9659c9e6 r600g: properly implement S8Z24 depth-stencil format for Evergreen
I should say "fix", but it has never been used until now.
S8Z24 is the format equivalent to the GL_UNSIGNED_INT_24_8 packing,
so we'll start to see it more often with st/mesa now making smart decisions
about formats.

The DB<->CB copy can change the channel ordering for transfers, other than
that, the internal DB format doesn't really matter.

R600-R700 support is possible except shadow mapping.
FMT_24_8 is broken if the SAMPLE_C instruction is used (no idea why).

Also the sampler swizzling was broken in theory and the fact it worked was
a lucky coincidence.

radeonsi might need to port this.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-02-14 14:51:46 +01:00
Michel Dänzer
c840270ebe radeonsi: Handle TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS
8 more little piglits.

NOTE: This is a candidate for the 9.1 branch.
2013-02-14 10:51:44 +01:00
Michel Dänzer
f34ad85765 radeonsi: Fix array indices for detecting integer vertex formats 2013-02-14 10:31:21 +01:00
Vinson Lee
0d5ce524ab glsl: Initialize ir_texture member variable.
Fixes uninitialized pointer field defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-13 23:10:48 -08:00
Eric Anholt
b8906adb66 intel: Allow blit readpixels even when the pack alignment is set.
The default alignment is 4, so this fast path was rarely hit.  Rather
than introduce logic to handle alignment, just use the Mesa core
function.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46632
Cc: neil@linux.intel.com
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-13 18:10:20 -08:00
Eric Anholt
516d8be502 i965: Remove writemask support from brw_SAMPLE().
The code was rather broken for non-XYZW on 8-wide, but all of our
callers were using XYZW anyway.  For my experiments with using writemask
on texturing, I've been using manual header setup in the compiler
backends, since we want to actually know what registers are written for
optimization and register allocation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-13 18:10:20 -08:00
Eric Anholt
bf91f0b039 i965/fs: Use a helper function for checking for flow control instructions.
In 2 of our checks, we were missing BREAK and CONTINUE.

NOTE: Candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-13 17:47:06 -08:00
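A minimal sketch of the idea behind bf91f0b039: keep the list of flow-control opcodes in one helper so that individual checks cannot drift apart. The opcode names below are hypothetical placeholders, not the i965 backend's enum.

```python
# Hypothetical opcode names, for illustration only.
FLOW_CONTROL_OPCODES = frozenset(
    {"IF", "ELSE", "ENDIF", "DO", "WHILE", "BREAK", "CONTINUE"}
)


def is_control_flow(opcode):
    """Return True if the opcode changes control flow."""
    return opcode in FLOW_CONTROL_OPCODES
```

Every caller that needs to know whether an instruction is flow control asks the helper, so adding an opcode to the set fixes all checks at once.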
bma
ce3dfa19ab shaderapi: Fix AttachShader error
Detect a duplicate shader type as an error instead of silently allowing
it; restrict the check to the ES2 API.

v2: Tapani Pälli <tapani.palli@intel.com>
    - make the check run time instead of compile time

v3: chadv
    - Quote spec on which error to generate.

Signed-off-by: bma <Bo.Ma@windriver.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
2013-02-13 14:09:47 -08:00
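The run-time check from ce3dfa19ab could look roughly like the sketch below. It assumes the error in question is GL_INVALID_OPERATION, which is what the OpenGL ES 2.0 specification describes for attaching a second shader of the same type; the function and its arguments are hypothetical and do not mirror shaderapi.c.

```python
GL_NO_ERROR = 0
GL_INVALID_OPERATION = 0x0502


def attach_shader(attached_types, shader_type, is_es2):
    """Attach a shader type to a program; reject duplicates only on ES2.

    attached_types is a list of the shader types already attached.
    Returns the GL error code the call would generate.
    """
    if is_es2 and shader_type in attached_types:
        # Decided at run time from the API in use, not at compile time.
        return GL_INVALID_OPERATION
    attached_types.append(shader_type)
    return GL_NO_ERROR
```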
Tom Stellard
0898047e7b configure.ac: Add components to LLVM_COMPONENTS when using llvm shared libs
This is required when LLVM is built with CMake, which creates one
shared library for each component.
2013-02-13 17:01:08 -05:00
Eric Anholt
cb4616d32d i965: Re-enable the -RHW workaround for original gen4 chips.
Fixes broken clipping in supertuxkart and presumably many other applications.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51471
NOTE: Candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-13 10:19:21 -08:00
Eric Anholt
ddc2b453d0 i965/gen4: Work around missing sRGB RGB DXT1 support.
The hardware just doesn't support it.  I suspect this was a regression from
the move to fixed MESA_FORMATs for compressed textures and that previously we
were storing uncompressed for this or something.

Fixes GPU hangs in piglit "texwrap GL_EXT_texture_sRGB-s3tc bordercolor
swizzled" on my GM965.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-13 10:19:21 -08:00
Paul Berry
dfb57e7d1b glsl: Fix error checking on "flat" keyword to match GLSL ES 3.00, GLSL 1.50.
All of the GLSL specs from GLSL 1.30 (and GLSL ES 3.00) onward contain
language requiring certain integer variables to be declared with the
"flat" keyword, but they differ in exactly *when* the rule is
enforced:

(a) GLSL 1.30 and 1.40 say that vertex shader outputs having integral
type must be declared as "flat".  There is no restriction on fragment
shader inputs.

(b) GLSL 1.50 through 4.30 say that fragment shader inputs having
integral type must be declared as "flat".  There is no restriction on
vertex shader outputs.

(c) GLSL ES 3.00 says that both vertex shader outputs and fragment
shader inputs having integral type must be declared as "flat".

Previously, Mesa's behaviour was consistent with (a).  This patch
makes it consistent with (b) when compiling desktop shaders, and (c)
when compiling ES shaders.

Rationale for desktop shaders: once we add geometry shaders, (b) really
seems like the right choice, because it requires "flat" in just the
situations where it matters.  Since we may want to extend geometry
shader support back before GLSL 1.50 (via ARB_geometry_shader4), it
seems sensible to apply this rule to all GLSL versions.  Also, this
matches the behaviour of the nVidia proprietary driver for Linux, and
the expectations of Intel's oglconform test suite.

Rationale for ES shaders: since the behaviour specified in GLSL ES
3.00 matches neither pre-GLSL-1.50 nor post-GLSL-1.50 behaviour, it
seems likely that this was a deliberate choice on the part of the GLES
folks to be more restrictive.  Also, the argument in favor of (b)
doesn't apply to GLES, since it doesn't support geometry shaders at
all.

Some discussion about this has already happened on the Mesa-dev list.
See:

http://lists.freedesktop.org/archives/mesa-dev/2013-February/034199.html

Fixes piglit tests:
- glsl-1.30/compiler/interpolation-qualifiers/nonflat-*.frag
- glsl-1.30/compiler/interpolation-qualifiers/vs-flat-int-0{2,3,4,5}.vert
- glsl-es-3.00/compiler/interpolation-qualifiers/varying-struct-nonflat-{int,uint}.frag

Fixes oglconform tests:
- glsl-q-inperpol negative.fragin.{int,uint,ivec,uvec}

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-02-13 07:58:08 -08:00
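The behaviour dfb57e7d1b describes after the patch, rule (b) for desktop shaders and rule (c) for ES shaders, can be written as a single predicate. This is an illustrative sketch only; the function name and the string arguments are assumptions, not the GLSL compiler's types.

```python
def flat_required(is_es, stage, direction, is_integral):
    """Return True if an integral varying must be declared "flat".

    stage is "vertex" or "fragment"; direction is "in" or "out".
    """
    if not is_integral:
        return False
    fs_input = stage == "fragment" and direction == "in"
    vs_output = stage == "vertex" and direction == "out"
    if is_es:
        # Rule (c): GLSL ES 3.00 restricts both sides.
        return fs_input or vs_output
    # Rule (b): desktop GLSL restricts fragment shader inputs only.
    return fs_input
```

Under this sketch a desktop vertex shader may declare a non-flat integer output, while the matching fragment shader input must still be "flat".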
Paul Berry
93c913485e glsl: don't allow non-flat integral types in varying structs/arrays.
In the GLSL 1.30 spec, section 4.3.6 ("Outputs") says:

    "If a vertex output is a signed or unsigned integer or integer
    vector, then it must be qualified with the interpolation qualifier
    flat."

The GLSL ES 3.00 spec further clarifies, in section 4.3.6 ("Output
Variables"):

    "Vertex shader outputs that are, *or contain*, signed or unsigned
    integers or integer vectors must be qualified with the
    interpolation qualifier flat."

(Emphasis mine.)

The language in the GLSL ES 3.00 spec is clearly correct and should be
applied to all shading language versions, since varyings that contain
ints can't be interpolated, regardless of which shading language
version is in use.

(Note that in GLSL 1.50 the restriction is changed to apply to
fragment shader inputs rather than vertex shader outputs, to
accommodate the fact that in the presence of geometry shaders, vertex
shader outputs are not necessarily interpolated.  That will be
addressed by a future patch).

NOTE: This is a candidate for stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-02-13 07:58:01 -08:00
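The "are, or contain" wording quoted in 93c913485e amounts to a recursive walk over the varying's type. The sketch below uses a made-up type encoding (strings for basic types, tuples for arrays and structs); it does not reflect glsl_type.

```python
INTEGER_TYPES = {
    "int", "uint",
    "ivec2", "ivec3", "ivec4",
    "uvec2", "uvec3", "uvec4",
}


def contains_integer(t):
    """Return True if type t is, or contains, an integer or integer vector.

    Hypothetical encoding: "float", "ivec2", ... for basic types,
    ("array", element_type, size) for arrays,
    ("struct", [field_type, ...]) for structs.
    """
    if isinstance(t, str):
        return t in INTEGER_TYPES
    kind = t[0]
    if kind == "array":
        return contains_integer(t[1])
    if kind == "struct":
        return any(contains_integer(field) for field in t[1])
    raise ValueError("unknown type encoding: %r" % (t,))
```

A struct holding an array of ivec2 is reported as containing integers, so it would need "flat" even though the struct itself is not an integer type.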
Paul Berry
d5948f2f5e glsl: Allow default precision qualifiers to be set for sampler types.
From GLSL ES 3.00 section 4.5.4 ("Default Precision Qualifiers"):

    "The precision statement

        precision precision-qualifier type;

    can be used to establish a default precision qualifier. The type
    field can be either int or float or any of the sampler types, and
    the precision-qualifier can be lowp, mediump, or highp."

GLSL ES 1.00 has similar language.  GLSL 1.30 doesn't allow precision
qualifiers on sampler types, but this seems like an oversight (since
the intention of including these in GLSL 1.30 is to allow
compatibility with ES shaders).

Previously, Mesa followed GLSL 1.30 and only allowed default precision
qualifiers to be set for float and int.  This patch makes it follow
GLSL ES rules in all cases.

Fixes Piglit tests default-precision-sampler.{vert,frag}.

Partially addresses https://bugs.freedesktop.org/show_bug.cgi?id=60737.

NOTE: This is a candidate for stable branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-13 07:57:58 -08:00
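As a rough sketch of the rule change in d5948f2f5e: before the patch only int and float were accepted in a default precision statement, afterwards any sampler type is accepted too. The helper names and the prefix test used to recognise sampler types are assumptions made for illustration.

```python
def is_sampler_type(name):
    """Recognise names such as sampler2D, isampler3D or usamplerCube."""
    base = name[1:] if name[:1] in ("i", "u") else name
    return base.startswith("sampler")


def default_precision_allowed(type_name, allow_samplers=True):
    """Return True if 'precision <qualifier> <type_name>;' is accepted.

    allow_samplers=False models the old GLSL 1.30 reading.
    """
    if type_name in ("int", "float"):
        return True
    return allow_samplers and is_sampler_type(type_name)
```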
Marek Olšák
60aa5f360a st/mesa: fix texture buffer objects
Broken by 624528834f.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-13 16:38:19 +01:00
Kenneth Graunke
8cabe26f5d i965: Use derived state for Haswell's 3DSTATE_VF packet.
Otherwise, we fail to correctly handle GL_PRIMITIVE_RESTART_FIXED_INDEX.

Fixes gles3conform's primitive_restart_mode test.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-12 20:24:28 -08:00
Marek Olšák
ea63491629 st/mesa: accelerate glGetTexImage for all formats using a blit
This commit allows using glGetTexImage during rendering and still
maintain interactive framerates.

This improves performance of WarCraft 3 under Wine. The framerate is improved
from 25 fps to 39 fps in the main menu, and from 0.5 fps to 32 fps in the game.

v2: fix choosing the format for decompression
2013-02-13 02:13:10 +01:00
Marek Olšák
cd41833b44 gallium: add red-alpha texture formats and a couple of util functions
This is for glGetTexImage and it will be used for samplers only (which some
drivers already implement by reading util_format_description).

v2: incorporate Brian's suggestion

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-13 02:13:10 +01:00
Jerome Glisse
974b482aca r600g: fix lockup when hyperz & alpha test are enabled together. v3
Seems that alpha test being enabled confuses the GPU on the order in
which it should perform the Z testing. So force the order programmed
through db shader control.

v2: Only force z order when alpha test is enabled
v3: Update db shader when binding new dsa + spelling fix

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-12 17:03:56 -05:00
Jordan Justen
496928a442 CopyTexImage: Don't check sRGB vs LINEAR for desktop GL
In OpenGL 4.3, new language was added that would require
this check. But, if this check results in broken applications
then perhaps it will be reversed.

For now, remove this check and re-evaluate when
desktop GL 4.3 is closer.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-02-12 11:22:49 -08:00
Christian König
8c80894fb3 radeonsi: remove constant index limitation v3
With the llvm patches, fixing 14 piglit tests in total.

v2: increase the const limit
v3: document the const limit

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-12 18:57:12 +01:00
Christian König
8514f5ac01 radeonsi: support constants as TEX coordinates
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-12 18:57:12 +01:00
Paul Berry
f8426eea35 glsl: Fix unsupported version error for GLSL ES 3.00, future proof for 3.30.
When the user specifies an unsupported GLSL version,
_mesa_glsl_parse_state::process_version_directive() nicely gives them
an error message telling them which GLSL versions are supported.
Previous to this patch, the logic for determining whether a given
language version was supported was independent from the logic to
generate this error message string; as a result, we had a bug where
GLSL ES 3.00 would never be listed in the error message as an available
language version, even if it was really available.

To make matters worse, the code for generating the error message
string assumed that desktop GL versions were always separated by 0.10,
an assumption that will be wrong as soon as we support GLSL 3.30.

This patch fixes both problems by adding a table of supported GLSL
versions to _mesa_glsl_parse_state; this table is used both to
generate the error message and to check whether a given version is
supported.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-12 08:06:35 -08:00
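The single-table approach from f8426eea35 can be sketched as follows. The table contents and function names are hypothetical; the point is only that the support check and the error text read the same data, so they cannot disagree and no fixed 0.10 step between versions is assumed.

```python
# Hypothetical table of (version, is_es) pairs.
SUPPORTED_VERSIONS = [(110, False), (120, False), (130, False), (300, True)]


def version_string(version, is_es):
    """Format 130 as "1.30" and ES 300 as "3.00 ES"."""
    text = "%d.%02d" % (version // 100, version % 100)
    return text + " ES" if is_es else text


def is_supported(version, is_es, table=SUPPORTED_VERSIONS):
    """Check support by looking the version up in the table."""
    return (version, is_es) in table


def supported_versions_message(table=SUPPORTED_VERSIONS):
    """Build the list shown in the error message from the same table."""
    return ", ".join(version_string(v, es) for v, es in table)
```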
Roland Scheidegger
9870459522 gallium/docs: fix typos in sample opcode descriptions 2013-02-12 16:51:11 +01:00
Roland Scheidegger
2947f00bc4 nv50: fix bogus parameters when processing sample instructions
Discovered accidentally when changing SAMPLE_L definition.
Turns out the lod arguments were already correct for the new definition
but the compare and derivs were not.

Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
2013-02-12 16:51:11 +01:00
Roland Scheidegger
427d36a227 gallium: fix tgsi SAMPLE_L opcode to use separate source for explicit lod
It looks like using coord.w as explicit lod value is a mistake, most likely
because some dx10 docs had it specified that way. Seems this was changed though:
http://msdn.microsoft.com/en-us/library/windows/desktop/hh447229%28v=vs.85%29.aspx
- let's just hope it doesn't depend on runtime build version or something.
Not only would this need translation (so go against the stated goal these
opcodes should be close to dx10 semantics) but it would prevent usage of this
opcode with cube arrays, which is apparently possible:
http://msdn.microsoft.com/en-us/library/windows/desktop/bb509699%28v=vs.85%29.aspx
(Note not only does this show cube arrays using explicit lod, but also the
confusion with this opcode: it lists an explicit lod parameter value, but then
states last component of location is used as lod).
(For "true" hw drivers, only nv50 had code to handle it, and it appears the
code was already right for the new semantics, though fix up the seemingly
wrong c/d arguments while there.)

v2: fix comment, separate out other changes.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-12 16:51:11 +01:00
Brian Paul
4bfdef87e6 util: fix incorrect Z bit masking in util_clear_depth_stencil()
For PIPE_FORMAT_Z24_UNORM_S8_UINT, the Z bits are in the 24
least significant bits.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=60527
and http://bugs.freedesktop.org/show_bug.cgi?id=60524
and http://bugs.freedesktop.org/show_bug.cgi?id=60047

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-12 08:11:05 -07:00
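The layout stated in 4bfdef87e6, with Z in the 24 least significant bits of each 32-bit word and stencil in the remaining top 8 bits, implies the masks sketched below. The helper names are hypothetical and this is not the util_clear_depth_stencil() code.

```python
Z24_MASK = 0x00FFFFFF  # depth: 24 least significant bits
S8_MASK = 0xFF000000   # stencil: top 8 bits


def pack_z24_s8(depth24, stencil8):
    """Pack a 24-bit depth value and an 8-bit stencil value into one word."""
    return ((stencil8 & 0xFF) << 24) | (depth24 & Z24_MASK)


def clear_depth_only(old_word, depth24):
    """Replace the depth bits of a word while preserving its stencil bits."""
    return (old_word & S8_MASK) | (depth24 & Z24_MASK)
```

Swapping the two masks in a depth-only clear would overwrite stencil and part of the depth value instead.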
Matt Turner
a79ce0c925 radeon: Remove dead STANDALONE_MMIO defines
These were, at some point in the past, used to request that Xorg's
compiler.h export a static inline xf86ReadMmio32 instead of a function
pointer. compiler.h only has this option for DEC Alpha.

But Xorg's compiler.h isn't being included by either of these two files
and the radeon driver still works on Alpha, so the definitions are dead
and not needed.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-11 23:18:11 -08:00
Roland Scheidegger
8b8bca06df llvmpipe: implement dual source blending
link up the fs outputs and blend inputs, and make sure the second blend source
is correctly loaded and converted (which is quite complex).
There's a slight refactoring of the monster generate_unswizzled_blend()
function where it makes sense to factor out alpha conversion (which needs
to run twice for dual source blend).
This passes piglit arb_blend_func_extended tests.

v2: remove new but ultimately not used function...

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-12 03:41:48 +01:00
Kenneth Graunke
a73181be6d docs: Mark a few things done in GL3.txt. 2013-02-11 15:55:29 -08:00
Kenneth Graunke
3d7c09e8b0 i965: Add missing dirty bits to INTEL_DEBUG=state arrays.
These are more recent additions, and no one remembered to update the
INTEL_DEBUG=state code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-11 15:54:10 -08:00
Kenneth Graunke
b9c5997bb3 i965: Reorganize brw_bits to match the order in brw_context.h.
This reorders the "brw_bits" array in brw_state_upload.c to match the
order of the #defines in brw_context.h.

Otherwise, it's really hard to see if any are missing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-11 15:54:07 -08:00
Kenneth Graunke
0ac6d5a7fb i965: Use BRW_NEW_CONTEXT for gen7_disable rather than BRW_NEW_BATCH.
These don't need to be re-disabled on every batch if we're using
hardware contexts.  (If we're not, this is equivalent.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-11 15:54:01 -08:00
Jerome Glisse
323a448825 r600g: make sure async blit is done 8 * pitch at a time v2
The blit must be aligned on 8 horizontal blocks.

v2: no need to align the remainder

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-02-11 18:44:18 -05:00
Martin Andersson
a37835c8ed winsys/radeon: fix bo with virtual address referencing mismatch
If the same context tries to flink and open the object, use the
same bo struct instead of opening a new gem handle for the object.
This way we avoid having 2 different handles pointing to the
same kernel object, which can later lead to trouble with virtual
addresses.

Fix:
https://bugs.freedesktop.org/show_bug.cgi?id=60200

Signed-off-by: Martin Andersson <g02maran@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-02-11 18:38:00 -05:00
Eric Anholt
e776b632c0 vbo: Merge GL_QUADS drawing requests in display lists.
minecraft apparently has its piles of display lists each contain 6
instances of glBegin(GL_QUADS)/verts/glEnd(), which appear in the
compiled list as 6 prims of 4 verts each in one draw call.  We can
reduce driver overhead even more by making that one prim of 24 verts.

Improves minecraft performance by 1.6% +/- .25% (n=446)

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-02-11 13:14:52 -08:00
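The merge described in e776b632c0 can be sketched on a simplified primitive list. Each primitive is a hypothetical (mode, start, count) tuple; the vbo module's real structures carry more state, and the real merge conditions may be stricter than the ones assumed here.

```python
def merge_quads(prims):
    """Merge adjacent GL_QUADS primitives that cover contiguous vertices.

    prims is a list of (mode, start, count) tuples in draw order.
    """
    merged = []
    for mode, start, count in prims:
        if merged:
            prev_mode, prev_start, prev_count = merged[-1]
            if (mode == "GL_QUADS" and prev_mode == "GL_QUADS"
                    and prev_count % 4 == 0
                    and prev_start + prev_count == start):
                # Quads are independent, so appending whole quads to the
                # previous primitive draws the same geometry.
                merged[-1] = (prev_mode, prev_start, prev_count + count)
                continue
        merged.append((mode, start, count))
    return merged
```

Six 4-vertex GL_QUADS primitives at consecutive offsets collapse into one primitive of 24 vertices, matching the case in the commit message.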
Eric Anholt
50202f0961 vbo: Print display list debug using printf() like dlist.c does.
Otherwise, the stderr and stdout debug end up interleaved wrong
when I pipe them to a file.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-02-11 13:14:51 -08:00
Eric Anholt
b9a66da258 i965: Remove some stale comments about the brw_constant_buffer atom.
These have been wrong since f428255bde
back in 2009!

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-11 13:14:51 -08:00
Eric Anholt
e07457d0ae i965: Simplify VS push constant upload code since removal of old path.
We used to have clip planes optionally included in the push constants,
resulting in a variable amount of data uploaded, but no more.  This also
means less wasted space in the batch for our push constants.

v2: Update _NEW_TRANSFORM state bit information.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-02-11 13:14:51 -08:00
Eric Anholt
11766b1bbb i965: Add perf debug for a corner case.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-11 13:14:51 -08:00
Eric Anholt
936a3ca6fd i965: Fix access mode of index buffer rebase.
It doesn't matter with our current implementation of MapBufferRange,
but it was wrong -- the result pointer is read by intel_upload_data().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-11 13:14:51 -08:00
Eric Anholt
016928b163 i965: Fix indentation of index buffer rebase code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-11 13:14:51 -08:00
Marek Olšák
cb6470775c mesa: fix GetTexImage if mesa format and internal format don't match
Tested with softpipe only exposing RGBA formats.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
c8379204ab mesa: don't use memcpy fast path for GetTexImage if base format is different
The Mesa format can be RGBA8888_REV, the format/type can be
GL_RGBA/GL_UNSIGNED_BYTE, but the actual texture internal format can be
LUMINANCE_ALPHA, INTENSITY, etc. Therefore we should look at the base
internal format as well.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
09a99867ab mesa: don't use _mesa_base_tex_format for format parameter of GetTexImage
_mesa_base_tex_format doesn't accept GL_BGR and GL_ABGR_EXT, etc.

v2: add a (now hopefully complete) helper function to deal with this

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
5587c8619a mesa: adjust usage of swapBytes/littleEndian in format_matches_format_and_type
- swapBytes has no effect on 8-bit single-component formats
- GL_SHORT is in host byte order, so checking for littleEndian is unnecessary,
  I decided to make the change for single-component formats only

Based on suggestions from Michel Dänzer.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
dcdffaaf43 mesa: remove per-format memcpy codepaths from texstore functions
It's obsoleted by the common function _mesa_texstore_memcpy.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
4bf27ed7ed mesa: implement common texstore memcpy function for all formats
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
967b21df6a mesa: fill in Z32_FLOAT_X24S8 in _mesa_format_matches_format_and_type
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
a0510fa773 mesa: fill in signed cases and RGBA16 in _mesa_format_matches_format_and_type
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
a0fb71888f mesa: fill in INT/UINT format cases in _mesa_format_matches_format_and_type
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
43395da55a mesa: fill in YCBCR cases in _mesa_format_matches_format_and_type
based on the texstore code

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
87f94e6f80 mesa: fill in SRGB cases in _mesa_format_matches_format_and_type
Texstore takes the same codepath as the corresponding linear formats.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 19:43:01 +01:00
Adhemerval Zanella
1ab2c55bf4 llvmpipe: fix vertex_header mask store in big-endian
This patch fixes the vertex_header mask bitfield store in big-endian
architectures by bit-swapping the fields accordingly.

Reviewed-by: Adam Jackson <ajax@redhat.com>
2013-02-11 13:41:28 -05:00
Adhemerval Zanella
a8016b2f60 llvmpipe: remove lp_swizzled_cbuf
Unused since 75da95c5.

Reviewed-by: Adam Jackson <ajax@redhat.com>
2013-02-11 13:41:28 -05:00
Andreas Boll
44a5d7371c docs: document removal of makedepend build dependency
Build dependency removed with
424f200881

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-11 18:11:20 +01:00
Andreas Boll
d59bd61445 docs: update making a new mesa release info
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 10:58:33 +01:00
Andreas Boll
ab10d2d8a5 docs: use proper title for index.html
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 10:58:33 +01:00
Andreas Boll
bf9e19d308 docs: mention some other supported APIs
v2: add ES3

Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2013-02-11 10:58:33 +01:00
Andreas Boll
babc638c72 docs: update sourcetree
glsl directory is located in src and not in src/egl

v2: remove ppc, move glapi from src/mesa to src/mapi

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 10:58:33 +01:00
Andreas Boll
dbbe108951 docs: replace CVS with git
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 10:58:33 +01:00
Vinson Lee
990bd49fba configure.ac: Do not check for rt on Mac OS X.
There is no rt library on Mac OS X.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58872
Acked-by: Matt Turner <mattst88@gmail.com>
2013-02-09 15:21:08 -08:00
Ian Romanick
0e2f26d5ea intel: Do not expose OES_compressed_ETC1_RGB8_texture or ARB_texture_rgb10_a2ui pre-GEN4
Older hardware cannot do ARB_texture_rgb10_a2ui, and the translation
code for OES_compressed_ETC1_RGB8_texture was never implemented in the
i915 driver.

NOTE: This is a candidate for all stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-08 19:28:53 -08:00
Roland Scheidegger
75d99673a8 softpipe: clean up lod computation
This should handle the new lod_zero modifier more correctly.
The runtime-conditional is a bit more complex however we now also do
scalar lod computation when appropriate which should more than make up for it.
The refactoring should also fix an issue with explicit lods
(lod clamp wasn't applied to them).
Also, always pass lod as the 5th element from tgsi executor, which simplifies
things (get rid of annoying conditionals later).

v2: based on Brian's feedback, use switch in a couple of places, fix up
some function parameter names, fix up comments.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-08 18:54:40 -08:00
Roland Scheidegger
4f1d757b86 softpipe: try to beat new dx10-style sample opcodes into shape
There were several bugs in how this was handled; most opcodes wouldn't even
have fetched the right arguments.
Also, the tex "target" is coming from the sampler view, hence it cannot
have information about shadow comparisons - fortunately this is not only
sampler state but also needs to have matching instruction, so just use this
instead to identify shadow comparisons.
Still untested (compiles...).
Note that sample_i and sviewinfo are still busted (just assert).
(The problem is that the interface for doing the opengl-equivalent functions
txf and txq is tied to the specific sampler itself but these opcodes
have no sampler associated with them. Oops...)
Also, even the other sample instructions will not work correctly since
they always operate on samplers which include the texture state. Fixing
this wouldn't be that difficult but most likely make softpipe quite a bit
slower when using the OpenGL tex opcodes (as the samplers have pre-baked
function calls in the sampler state depending on texture state and that stuff
would need to be evaluated at runtime), so leave it for now.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-08 18:54:40 -08:00
Roland Scheidegger
614982d320 gallivm: fix up size queries for dx10 sviewinfo opcode
Need to calculate the number of mip levels (if it would be worthwhile, we could
store it in dynamic state).
While here, the query code also used chan 2 for the lod value.
This worked with mesa state tracker but it seems safer to use chan 0.
Still passes piglit textureSize (with some handwaving), though the non-GL
parts are (largely) untested.

v2: clarify and expect the sviewinfo opcode to return ints, not floats,
just like the OpenGL textureSize (dx10 supports dst modifiers with resinfo).
Also simplify some code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-08 18:54:40 -08:00
Roland Scheidegger
0a8043bb76 gallivm: hook up dx10 sampling opcodes
They are similar to old-style tex opcodes but with separate sampler and
texture units (and other arguments in different places).
Also adjust the debug tgsi dump code.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-08 18:54:40 -08:00
Vinson Lee
db7612d15d intel: Ensure variable intel is used in i915 builds.
Fixes unused pointer value defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-08 18:51:27 -08:00
Vinson Lee
85a9a7f09c glsl: Ensure glsl_type constructors initialize gl_type.
Fixes uninitialized scalar field defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-08 18:50:08 -08:00
Jerome Glisse
9a47684564 winsys/radeon: improve debugging printing
Make sure one can identify virtual address failure from allocation
failure.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-02-08 20:30:09 -05:00
Roland Scheidegger
1d71106f5c softpipe: get rid of tgsi_sampler_control param in img_filter
None of the filters used it (why would they). Maybe that param
was just there because some of the lines were considered to be
too short...

Reviewed-by: Dave Airlie <airlied@redhat.com>
2013-02-08 16:32:30 -08:00
Roland Scheidegger
66b6d51214 softpipe: fix using optimized filter function
This optimized filter (when using repeat wrap modes,
linear min/mag/mip filters, pot textures) only applies to 2d textures,
but nothing prevented it from being used for other textures (likely
leading to very bogus sample results).

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-08 16:32:30 -08:00
Roland Scheidegger
49f8825c49 gallivm: fix typo in lp_build_mul_norm
The signed case didn't do what the comment indicated. Should increase rounding
precision (at the expense of performance since the former code was effectively
a no-op).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-08 16:32:30 -08:00
Roland Scheidegger
67906f91c9 llvmpipe: first steps of adding dual source blend support
This adds support of the additional blending factors to the blend function
itself, and also enables testing of it in lp_test_blend (which passes).
Still need to add the glue code of linking fs shader outputs to blend inputs
in llvmpipe, and probably need to add special handling if destination doesn't
include alpha (which lp_test_blend doesn't test).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-08 16:32:30 -08:00
Roland Scheidegger
8e44f4117a llvmpipe: refactoring of visibility counter handling
There can be other per-thread data than just vis_counter, so pass a struct
around instead (some of our non-public code uses this already and this
difference is a major cause of merge pain).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-08 16:32:30 -08:00
Jerome Glisse
3310acdf47 xorg: fix exa finish access
The exa core will already set the pointer to NULL prior to calling
the callback function. So don't bail out in the callback if it's
already NULL.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-02-08 19:01:19 -05:00
Kristian Høgsberg
1fe007399c egl-wayland: Make sure we allocate a back buffer even if nothing was rendered
At eglSwapBuffers time, we blindly assume we have a back buffer, but the
back buffer only gets allocated when somebody tries to render something.

NOTE: This is a candidate for the 9.0 and 9.1 branches.

https://bugs.freedesktop.org/show_bug.cgi?id=60086
2013-02-08 11:23:18 -05:00
Paul Berry
a4b9678a54 Consolidate some redundant definitions of ARRAY_SIZE() macro.
Previous to this patch, there were 13 identical definitions of this
macro in Mesa source.  That's ridiculous.  This patch consolidates 6
of them to a single definition in src/mesa/main/macros.h.

Unfortunately, I wasn't able to eliminate the remaining definitions,
since they occur in places that don't include src/mesa/main/macros.h:

- include/pci_ids/pci_id_driver_map.h
- src/egl/drivers/dri2/egl_dri2.h
- src/egl/main/egldefines.h
- src/gbm/main/backend.c
- src/gbm/main/gbm.c
- src/glx/glxclient.h
- src/mapi/mapi/stub.c

I'm open to suggestions as to how to deal with the remaining redundancy.
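
For reference, a minimal sketch of the kind of definition being consolidated.
The commit does not quote the macro body, so the conventional form below is an
assumption, and example_table is a hypothetical array used only to show the
macro in use:

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in a true array. This must not be applied to a
 * pointer, where sizeof(x) is the pointer size rather than the array size. */
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#endif

/* Hypothetical table, present only to demonstrate the macro. */
static const int example_table[] = { 2, 3, 5, 7, 11 };

static size_t example_table_length(void)
{
    return ARRAY_SIZE(example_table);
}

static int example_table_sum(void)
{
    int sum = 0;
    /* The loop bound tracks the initializer automatically. */
    for (size_t i = 0; i < ARRAY_SIZE(example_table); i++)
        sum += example_table[i];
    return sum;
}
```

The #ifndef guard is one possible way for a file that cannot include the
shared header to coexist with it; whether that suits the remaining files is
left open here.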

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-08 06:51:22 -08:00
Paul Berry
dc92b2d11f intel/pre-gen6: Disable EXT_framebuffer_multisample.
Previously, the i965 driver enabled EXT_framebuffer_multisample even
on pre-gen6 chipsets.  However, since we don't support multisampling
on these chips, we set GL_MAX_SAMPLES=1 (the minimum allowed by
EXT_framebuffer_multisample), and if the client ever requested a
multisample buffer, we quietly supplied them with a single-sampled
buffer instead.

After some discussion on the mailing list (see thread
"ext_framebuffer_multisample: check for num_samples<=1"), it's clear
that this was the wrong approach.  The correct approach is to only
expose EXT_framebuffer_multisample when we truly support
multisampling; that frees us to set a sensible value of
GL_MAX_SAMPLES=0 on other chipsets, so that we never have to deal with
a client requesting a multisample buffer when multisampling isn't
supported.

This change causes the following piglit tests to be skipped on
chipsets prior to Gen6:

- "ARB_framebuffer_sRGB/blit {renderbuffer,texture}
  {linear,linear_to_srgb,srgb,srgb_to_linear}
  {downsample,msaa,upsample} {disabled,enabled}"
- EXT_framebuffer_multisample/blit-mismatched-formats
- EXT_framebuffer_multisample/blit-mismatched-sizes
- EXT_framebuffer_multisample/dlist
- EXT_framebuffer_multisample/interpolation 0 *
- EXT_framebuffer_multisample/minmax
- EXT_framebuffer_multisample/negative-copypixels
- EXT_framebuffer_multisample/negative-copyteximage
- EXT_framebuffer_multisample/negative-max-samples
- EXT_framebuffer_multisample/negative-mismatched-samples
- EXT_framebuffer_multisample/negative-readpixels
- EXT_framebuffer_multisample/renderbuffer-samples
- EXT_framebuffer_multisample/renderbufferstorage-samples
- EXT_framebuffer_multisample/samples

This is expected, since the above tests exercise MSAA functionality,
and shouldn't be run on systems prior to Gen6.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-08 06:51:22 -08:00
Vinson Lee
b681ed6ac9 glsl: Initialize all tfeedback_candidate_generator member variables.
Fixes uninitialized pointer field defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-02-07 21:51:20 -08:00
Vinson Lee
7c544e55da nv30: Fix memory leak.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-07 21:45:01 -08:00
Ian Romanick
82691f1293 glsl: Change loop_analysis to not look like a resource leak
Previously the loop_state was allocated in the loop_analysis
constructor, but not freed in the (nonexistent) destructor.  Moving
the allocation of the loop_state makes this code appear less sketchy.

Either way, there is no actual leak.  The loop_state is freed by the
single caller of analyze_loop_variables.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Dave Airlie <airlied@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57753
2013-02-07 21:18:42 -08:00
Paul Berry
04f0d6cc22 mesa: Don't check (offset + size <= bufObj->Size) in BindBufferRange.
In the documentation for BindBufferRange, OpenGL specs from 3.0
through 4.1 contain this language:

    "The error INVALID_VALUE is generated if size is less than or
    equal to zero or if offset + size is greater than the value of
    BUFFER_SIZE."

This text was dropped from OpenGL 4.2, and it does not appear in the
GLES 3.0 spec.

Presumably the reason for the change is that some clients change
the size of the buffer after calling BindBufferRange.  We don't want
to generate an error at the time of the BindBufferRange call just
because the old size of the buffer was too small, when the buffer is
about to be resized.

Since this is a deliberate relaxation of error conditions in order to
allow clients to work, it seems sensible to apply it to all versions
of GL, not just GL 4.2 and above.

(Note that there is no danger of this change allowing a client to
access data beyond the end of a buffer.  We already have code to
ensure that that doesn't happen in the case where the client shrinks
the buffer after calling BindBufferRange).
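
A rough sketch of the split described above: a relaxed check at bind time, and
a clamp when the range is actually used.  The struct and both function names
are hypothetical and are not Mesa's; only the size > 0 rule comes from the
quoted spec text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object; only the size matters here. */
struct buffer {
    size_t size;
};

/* Bind-time validation after the change: the size must be positive, but
 * offset + size is deliberately NOT compared against the buffer size,
 * because the client may resize the buffer before using the binding. */
static bool bind_range_is_valid(long size)
{
    return size > 0;
}

/* Use-time clamp: how many bytes of the bound range lie inside the
 * buffer's current storage.  This is what keeps accesses in bounds. */
static size_t usable_range_bytes(const struct buffer *buf,
                                 size_t offset, size_t size)
{
    assert(buf != NULL);
    if (offset >= buf->size)
        return 0;
    size_t available = buf->size - offset;
    return size < available ? size : available;
}
```

With a 100-byte buffer, a binding of offset 80 and size 64 is accepted at bind
time and clamped to 20 usable bytes when accessed.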

Eliminates a spurious error message in the gles3 conformance test
"transform_feedback_offset_size".

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-07 21:16:37 -08:00
Ian Romanick
f29ab4ece5 i965: Set UniformBufferOffsetAlignment to sizeof(vec4)
This matches the behavior of the Windows driver, but a bspec reference
would be nice.

NOTE: This is a candidate for the 9.0 and 9.1 branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-07 21:16:08 -08:00
Matt Turner
3ee602314f mesa: Allow glGet* queries of MAX_VARYING_COMPONENTS in ES 3
Should have been done in d9948e49 but I missed it because
MAX_VARYING_FLOATS doesn't appear in the ES 3 spec, but is the same
value as MAX_VARYING_COMPONENTS.

NOTE: Candidate for the 9.1 branch
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-07 17:53:13 -08:00
Daniel van Vugt
6e226ab5ac gbm: Remember to init format on gbm_dri_bo_create.
https://bugs.freedesktop.org/show_bug.cgi?id=60143
2013-02-07 20:00:52 -05:00
Eric Anholt
7242b03622 glx: Centralize the code for context flushing.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-07 13:13:02 -08:00
Eric Anholt
95080ca8d4 glx: Add a little comment about what dri2FlushFrontBuffer() does.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-07 13:13:02 -08:00
Michel Dänzer
c093f12406 radeonsi: Handle scaled and integer formats for samplers and vertex elements.
Also, add assertions to stress that render targets don't support scaled
formats.

20 more little piglits.
2013-02-07 19:07:43 +01:00
Michel Dänzer
23405ef467 radeonsi: Don't advertise PIPE_FORMAT_L8A8_SRGB support.
The hardware can't do it.
2013-02-07 19:07:43 +01:00
Michel Dänzer
a9816cc784 radeonsi: Remove incorrect (and dead) assignment in tex_fetch_args().
The proper return type is assigned at the end of the function.
2013-02-07 19:07:43 +01:00
Michel Dänzer
07eddc444c radeonsi: Use unique names for referring to texture sampling intrinsics.
Append the overloaded vector type used for passing in the addressing
parameters.

Without this, LLVM uses the same function signature for all those types,
which cannot work.

Fixes problems e.g. with FlightGear and Red Eclipse.
2013-02-07 19:07:43 +01:00
Marek Olšák
74a17a764d r300g: put textures with usage=staging in GTT and make them linear 2013-02-07 17:43:19 +01:00
Jerome Glisse
681707abf2 r600g: fix slice tile max for compressed texture and async dma
Was using the pixel size instead of the number of blocks for the slice
tile max computation which resulted in dma writing at wrong address.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-02-07 10:42:22 -05:00
Marek Olšák
9ba1e23647 radeonsi: use new RGBX formats 2013-02-07 00:20:24 +01:00
Marek Olšák
4dc142d521 r300g: fix blending and alpha-test with RGBX16F and enable MSAA for it 2013-02-07 00:20:24 +01:00
Marek Olšák
27e216a075 r300g: use new RGBX formats 2013-02-07 00:20:24 +01:00
Marek Olšák
3c351b7c33 r600g: use new RGBX formats 2013-02-07 00:20:24 +01:00
Marek Olšák
dd21ecdc42 st/mesa: use new RGBX formats
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-07 00:20:24 +01:00
Marek Olšák
f9fa725690 mesa: add RGBX formats for existing GL RGB texture formats
v2: fix compilation of swrast
2013-02-07 00:20:24 +01:00
Marek Olšák
70bf7bae1d gallium: add RGBX formats for existing GL RGB texture formats
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-07 00:20:23 +01:00
Kenneth Graunke
7d467f3c15 i965/blorp: Support blits between ARGB and XRGB formats.
Now that we have support for overriding alpha to 1.0, we can handle
blitting between these formats in either direction.

For now, we only support two XRGB formats: MESA_FORMAT_XRGB8888 and
MESA_FORMAT_RGBX8888_REV.  Most places only appear to worry about the
former, so ignore the latter for now.  We can always add it later.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
2013-02-06 10:01:03 -08:00
Kenneth Graunke
c0554141a9 i965/blorp: Support overriding destination alpha to 1.0.
Currently, Blorp requires the source and destination formats to be
equal.  However, we'd really like to be able to blit between XRGB and
ARGB formats; our BLT engine paths have supported this for a long time.

For ARGB -> XRGB, nothing needs to occur: the missing alpha is already
interpreted as 1.0.  For XRGB -> ARGB, we need to smash the alpha
channel to 1.0 when writing the destination colors.  This is fairly
straightforward with blending.
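
To illustrate the intended result only (not the blending mechanism BLORP
uses), here is a CPU-side sketch.  It assumes a packed 32-bit pixel with the
alpha/X byte in the top 8 bits; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* XRGB -> ARGB: the X byte holds undefined data, so force alpha to 0xff
 * (1.0 in normalized terms) while leaving the color channels untouched. */
static uint32_t xrgb8888_to_argb8888(uint32_t xrgb)
{
    return xrgb | 0xff000000u;
}

/* ARGB -> XRGB: nothing to do.  Readers of an XRGB surface ignore the
 * top byte and treat alpha as 1.0. */
static uint32_t argb8888_to_xrgb8888(uint32_t argb)
{
    return argb;
}
```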

For now, this code is never used, as the source and destination formats
still must be equal.  The next patch will relax that restriction.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
2013-02-06 10:00:53 -08:00
Kenneth Graunke
0b3bebbaac i965: Implement CopyTexSubImage2D via BLORP (and use it by default).
The BLT engine has many limitations.  Currently, it can only blit
X-tiled buffers (since we don't have a kernel API to whack the BLT
tiling mode register), which means all depth/stencil operations get
punted to meta code, which can be very CPU-intensive.

Even if we used the BLT engine, it can't blit between buffers with
different tiling modes, such as an X-tiled non-MSAA ARGB8888 texture
and a Y-tiled CMS ARGB8888 renderbuffer.  This is a fundamental
limitation, and the only way around that is to use BLORP.

Previously, BLORP only handled BlitFramebuffer.  This patch adds an
additional frontend for doing CopyTexSubImage.  It also makes it the
default.  This is partly to increase testing and avoid hiding bugs,
and partly because the BLORP path can already handle more cases.  With
trivial extensions, it should be able to handle everything the BLT can.

This helps PlaneShift massively, which tries to CopyTexSubImage2D
between depth buffers whenever a player casts a spell.  Since these
are Y-tiled, we hit meta and software ReadPixels paths, eating 99% CPU
while delivering ~1 FPS.  This is particularly bad in an MMO setting
because people cast spells all the time.

It also helps Xonotic in 4X MSAA mode.  At default power management
settings, I measured a 6.35138% +/- 0.672548% performance boost (n=5).
(This data is from v1 of the patch.)

No Piglit regressions on Ivybridge (v3) or Sandybridge (v2).

v2: Create a fake intel_renderbuffer to wrap the destination texture
    image and then reuse do_blorp_blit rather than reimplementing most
    of it.  Remove unnecessary clipping code and conditional rendering
    check.

v3: Reuse formats_match() to centralize checks; delete temporary
    renderbuffers.  Reorganize the code.

v4: Actually copy stencil when dealing with separate stencil buffers but
    packed depth/stencil formats.  Tested by a new Piglit test.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com> [v4]
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v3]
Reviewed-and-tested-by: Carl Worth <cworth@cworth.org> [v2]
Tested-by: Martin Steigerwald <martin@lichtvoll.de> [v3]
2013-02-06 10:00:22 -08:00
Kenneth Graunke
29aef6cce8 mesa: Put extern "C" guards in renderbuffer.h.
I need to use this from C++ code.
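
The usual shape of such guards, as a sketch.  example_add is a made-up
declaration standing in for the real prototypes in renderbuffer.h:

```c
#include <assert.h>

/* --- what the header would contain --- */
#ifdef __cplusplus
extern "C" {            /* give the declarations C linkage under C++ */
#endif

int example_add(int a, int b);   /* hypothetical prototype */

#ifdef __cplusplus
}                       /* closes extern "C" */
#endif

/* --- what the C source file would contain --- */
int example_add(int a, int b)
{
    return a + b;
}
```

A C compiler never defines __cplusplus, so it sees only the plain prototype,
while a C++ compiler sees it wrapped in extern "C" and links against the
unmangled symbol.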

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-06 09:59:53 -08:00
Brian Paul
48b01e6a10 llvmpipe: remove extraneous const qualifier 2013-02-06 09:16:58 -07:00
Marek Olšák
bc2ceb97f1 gallium/util: remove duplicated function util_format_is_rgb_no_alpha
It only checks if alpha is present, so it's the same as util_format_has_alpha.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:32 +01:00
Marek Olšák
b92057a983 st/mesa: get rid of GET_CURRENT_CONTEXT in st_choose_format
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:32 +01:00
Marek Olšák
2e6f10d0b7 st/mesa: adjust texture format selection to try the closest base format first
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:32 +01:00
Marek Olšák
b89b80a91d st/mesa: put RGBX8 and RGBA8 in the default format lists
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:32 +01:00
Marek Olšák
c1856da75d st/mesa: add the rest of RGB8 format/type combos to exact_format_mapping tables
These formats were added a few months after these tables were committed.
No idea why we have the table though. AFAIK, texstore always takes the slow path
for GL_RGBn.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:32 +01:00
Marek Olšák
ebe86b8082 mesa: fixup inconsistent naming of RG16 formats
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:31 +01:00
Marek Olšák
cf37aef414 r600g: report correct control flow depth 2013-02-06 14:51:31 +01:00
Marek Olšák
fc86394882 glsl: fix incorrect comment about do_common_optimization 2013-02-06 14:51:31 +01:00
Marek Olšák
4362bdadf3 st/mesa: emit saturates in the vertex shader if Shader Model 3.0 is supported
v2: change the requirement from GLSL 1.30 to SM 3.0 (R500 can do this)
2013-02-06 14:51:31 +01:00
Marek Olšák
48689ca14a st/mesa: advertise ARB_shading_language_packing for GLSL >= 1.30
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:31 +01:00
Marek Olšák
afd4178fec st/mesa: do most of GLSL lowering outside of the optimization do-while loop
based on the intel driver

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:31 +01:00
Marek Olšák
7325f1faaa st/mesa: remove dead code depending on EmitCondCodes
EmitCondCodes is always false.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:31 +01:00
Marek Olšák
85efb2fff0 r300g: try to use color varyings for texcoords if max texcoord limit is exceeded
+35 piglits
2013-02-06 14:45:22 +01:00
Marek Olšák
1d3561d877 r300/compiler: copy-propagate saturate mode when possible
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-02-06 14:45:20 +01:00
Marek Olšák
ae8696c7ee r300/compiler: add support for saturate output modifier in r500 vertex shaders
The GLSL compiler can simplify clamp(v,0,1) to saturate. The state tracker
doesn't use it yet, but it will.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-02-06 14:45:16 +01:00
Marek Olšák
499f7de12e r300g: fix blending with RGBX formats
Change DST_ALPHA to ONE.
2013-02-06 14:31:23 +01:00
Marek Olšák
f40a7fc34a r300g: fix blending with blend color and RGBA formats
NOTE: This is a candidate for the stable branches.
2013-02-06 14:31:23 +01:00
José Fonseca
5048e69392 egl/dri: Don't invoke dri2_dpy->flush if it's NULL.
I'd like to test Mesa OpenGL ES alongside the NVIDIA libGL drivers. But
without this change, I get a NULL pointer dereference.
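
A simplified sketch of the guard.  The struct layout and names below are
hypothetical stand-ins, not the real dri2_egl_display or flush extension
types:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the flush extension and display. */
struct flush_ext {
    void (*flush)(void *drawable);
};

struct display {
    const struct flush_ext *flush;   /* NULL when the extension is absent */
};

static int flush_calls;

/* Test hook that only counts how often it was invoked. */
static void count_flush(void *drawable)
{
    (void)drawable;
    flush_calls++;
}

/* Returns 1 if the flush hook was invoked, 0 if it was skipped because
 * the extension (or its entry point) is missing. */
static int display_flush(const struct display *dpy, void *drawable)
{
    assert(dpy != NULL);
    if (dpy->flush == NULL || dpy->flush->flush == NULL)
        return 0;
    dpy->flush->flush(drawable);
    return 1;
}
```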

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 09:22:26 +00:00
Vinson Lee
d08cee5d80 glsl: Initialize ast_parameter_declarator member variables.
Fixes uninitialized pointer field defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-05 22:11:32 -08:00
Brian Paul
ff60509157 svga: fix sRGB rendering
We weren't emitting the SVGA_RS_OUTPUTGAMMA state so sRGB rendering
didn't work properly.

Fixes piglit's framebuffer-srgb test.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-05 12:34:55 -07:00
Tom Stellard
8aaee4d64e r600g/compute: Fix segfault caused by new shader disassembler 2013-02-05 15:41:33 +00:00
Michel Dänzer
02a423b239 Require libdrm_radeon 2.4.42 for radeonsi.
It has new PCI IDs and an important tiled surface layout fix.
2013-02-05 15:12:14 +01:00
Eric Anholt
86536a321d i965: Disable write masking when setting up texturing m0.
v2/Kayden: Also disable write masking in the vec4 backend.

Fixes 78 oglconform glsl-bif-tex-* subcases.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com> [v1]
Reviewed-by: Eric Anholt <eric@anholt.net> [v2]
2013-02-04 17:29:41 -08:00
Tapani Pälli
e062a4187d intel: Fix regression in intel_create_image_from_name stride handling
Strangely, the DRIimage interface we have passes the pitch in pixels
instead of bytes, which anholt missed in the change to using bytes for
region pitch.
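
The unit conversion at issue, as a sketch.  The helper name is hypothetical,
and cpp is assumed to mean bytes per pixel of the image format:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a pitch given in pixels (as the image interface passes it)
 * into the byte pitch that the region code now expects. */
static uint32_t pitch_pixels_to_bytes(uint32_t pitch_pixels, uint32_t cpp)
{
    assert(cpp > 0);
    return pitch_pixels * cpp;
}
```

For a 1024-pixel-wide row of a 4-byte-per-pixel format this gives 4096 bytes;
passing the pixel count through unchanged would make every row a quarter of
its real length.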

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-04 13:59:02 -08:00
Eric Anholt
5751d0cb2d i965: Fix segfaults from 45a28a927a
If you look up a level that isn't in the miptree, you crash.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-02-04 13:58:55 -08:00
Alex Deucher
4161d70bba radeonsi: add Oland pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Note: this is a candidate for the 9.1 branch.
2013-02-04 15:44:38 -05:00
Alex Deucher
af0af75881 radeonsi: default PA_SC_RASTER_CONFIG to 0
That should work in all cases.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Note: this is a candidate for the 9.1 branch.
2013-02-04 15:44:07 -05:00
Alex Deucher
83e4407f44 radeonsi: add support for Oland chips
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Note: this is a candidate for the 9.1 branch
2013-02-04 15:43:21 -05:00
Paul Berry
99b78337e3 glsl: Support transform feedback of varying structs.
Since transform feedback needs to be able to access individual fields
of varying structs, we can no longer match up the arguments to
glTransformFeedbackVaryings() with variables in the vertex shader.

Instead, we build up a hashtable which records information about each
possible name that is a candidate for transform feedback, and then
match up the arguments to glTransformFeedbackVaryings() with the
contents of that hashtable.

Populating the hashtable uses the program_resource_visitor
infrastructure, so the logic is shared with how we handle uniforms.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-04 10:36:47 -08:00
Paul Berry
53febac02c glsl: Use parse_program_resource_name to parse transform feedback varyings.
Previously, transform feedback varyings were parsed in an ad-hoc
fashion that wasn't compatible with structs (or array of structs).
This patch makes it use parse_program_resource_name(), which correctly
handles both.

Note that parse_program_resource_name()'s technique for handling
malformed input strings is to simply let them through and rely on the
fact that a future name lookup will fail.  Because of this,
tfeedback_decl::init() no longer needs to return a boolean error
code--it always succeeds, and if the input was malformed the error
will be detected later.
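
To make the "let malformed names through" idea concrete, here is a simplified
sketch.  split_resource_name is a hypothetical helper written for
illustration; it is not Mesa's parse_program_resource_name() and does not
reproduce its exact rules.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Splits "base[N]" into the length of the base name and the index N.
 * When there is no well-formed trailing subscript, returns -1 and sets
 * *base_len to the full length, so a malformed name is passed through
 * whole and simply fails the later lookup.
 * Assumption: the subscript is small enough not to overflow a long. */
static long split_resource_name(const char *name, size_t *base_len)
{
    assert(name != NULL && base_len != NULL);
    size_t len = strlen(name);
    *base_len = len;

    if (len < 3 || name[len - 1] != ']')
        return -1;

    /* Walk backwards over the digits of the subscript. */
    size_t i = len - 1;
    while (i > 0 && isdigit((unsigned char)name[i - 1]))
        i--;

    /* Need at least one digit, an opening bracket, and a non-empty base. */
    if (i == len - 1 || i < 2 || name[i - 1] != '[')
        return -1;

    long index = 0;
    for (size_t k = i; k < len - 1; k++)
        index = index * 10 + (name[k] - '0');

    *base_len = i - 1;
    return index;
}
```

Here "foo[12]" yields index 12 with a base length of 3, while "foo[]" yields
-1 and is kept as a five-character name that no variable will match.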

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-04 10:36:44 -08:00
Paul Berry
b4db34cc4c glsl: Rename uniform_field_visitor to program_resource_visitor.
There's actually nothing uniform-specific in uniform_field_visitor.
It is potentially useful for all kinds of program resources (in
particular, future patches will use it for transform feedback
varyings).

This patch renames it to program_resource_visitor, and clarifies
several comments, to reflect the fact that it is useful for more than
just uniforms.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-04 10:36:40 -08:00
Paul Berry
b92900d26a mesa/glsl: Separate parsing logic from _mesa_get_uniform_location.
The parsing logic is moved to a new function in the GLSL module,
parse_program_resource_name().  This name was chosen because it should
eventually be useful for handling everything that OpenGL 4.3 calls
"program resources" (e.g. uniforms, vertex inputs, fragment outputs,
and transform feedback varyings).

Future patches will make use of this function for linking transform
feedback varyings.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-04 10:36:35 -08:00
Quentin Glidic
11bd1b0f58 gallium/egl: Fix include dirs for VPATH build
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Quentin Glidic <sardemff7+git@sardemff7.net>
2013-02-04 10:36:50 -08:00
Abdiel Janulgue
eaeb314372 intel: make sure to setup image dimension in image_from_planar setup
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=60212
Tested-by: Scott Moreau <oreaus@gmail.com>
Tested-by: Tiago Vignatti <tiago.vignatti@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-04 10:18:22 -08:00
Matt Turner
2db1f73849 builtin_compiler/build: Don't use *_FOR_BUILD when not cross compiling
Previously we were relying on CFLAGS_FOR_BUILD to be the same as CFLAGS
when not cross compiling, but this assumption didn't take into
consideration 32-bit builds on 64-bit systems. More generally, not
honoring CFLAGS is bad.

Automake is evidently too stupid to accept

if CROSS_COMPILING
CC = @CC_FOR_BUILD@
...
else
CC = @CC@
endif

without warning that CC has been already defined. The warnings are
harmless, but I'd prefer to avoid future reports about them, so define
proxy variables, which are assigned inside the conditional and then
unconditionally assigned to CC et al.

NOTE: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59737
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60038
2013-02-04 09:35:45 -08:00
Brian Paul
805cf07dc3 st/mesa: emit SQRT opcode when driver supports it 2013-02-04 09:33:44 -07:00
Brian Paul
13f3ae5b83 gallium/drivers: handle PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED query
Initially, only softpipe/llvmpipe support SQRT.
2013-02-04 09:33:44 -07:00
Brian Paul
2d367e40d9 gallivm: implement support for SQRT opcode 2013-02-04 09:33:44 -07:00
Brian Paul
ad30e4545b tgsi: add support for new SQRT opcode 2013-02-04 09:33:44 -07:00
Brian Paul
d276a40e15 gallium: add SQRT shader opcode
The glsl-to-tgsi translator will emit SQRT to implement GLSL's sqrt()
and distance() functions if the PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED
query says it's supported by the driver.

Otherwise, sqrt(x) is implemented with x*rsq(x).  The problem with
this is sqrt(0) must be handled specially because rsq(0) might be
Inf/NaN/undefined (and then 0*rsq(0) is Inf/NaN/undefined).  In the
glsl-to-tgsi code we use an extra CMP to check if x is zero and then
replace the result of x*rsq(x) with zero.
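
A scalar sketch of that fallback.  rsq_model is a hypothetical software
stand-in for a hardware RSQ instruction (bit-trick initial guess plus Newton
steps, assumed adequate for normal positive floats); it returns +Inf for zero
so that the special case actually matters.

```c
#include <assert.h>
#include <math.h>     /* INFINITY macro only; no libm functions are called */
#include <stdint.h>
#include <string.h>

/* Hypothetical model of RSQ: approximately 1/sqrt(x), +Inf for x == 0. */
static float rsq_model(float x)
{
    assert(x >= 0.0f);
    if (x == 0.0f)
        return INFINITY;

    /* Initial guess from the exponent bits, refined by Newton-Raphson. */
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);
    float y;
    memcpy(&y, &bits, sizeof y);
    for (int i = 0; i < 4; i++)
        y = y * (1.5f - 0.5f * x * y * y);
    return y;
}

/* sqrt(x) as x * rsq(x), with the extra compare-and-select on zero. */
static float sqrt_via_rsq(float x)
{
    float product = x * rsq_model(x);      /* 0 * Inf is NaN when x == 0 */
    return (x == 0.0f) ? 0.0f : product;   /* the CMP: replace with zero */
}
```

Without the final select, sqrt_via_rsq(0) would return NaN in this model,
which is the failure the extra CMP guards against.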

In the end, this makes sqrt() generate much more reasonable code for
drivers that can do square roots.

Note that many of piglit's generated shader tests use the GLSL
distance() function.
2013-02-04 09:33:44 -07:00
Michel Dänzer
6455d40b7e radeonsi: Remove spurious traces of R16G16B16 support.
The hardware can't do it, and these were causing warnings in some piglit tests.

NOTE: This is a candidate for the 9.1 branch.
2013-02-04 17:03:26 +01:00
Michel Dänzer
6bcb823844 radeonsi: Enable texture arrays.
28/30 piglit tests pass.

NOTE: This is a candidate for the 9.1 branch.
2013-02-04 17:03:25 +01:00
Michel Dänzer
120efeef8b radeonsi: Improve packing of texture address parameters.
In particular, the LOD bias and depth comparison values are packed before the
'normal' texture coordinates, and the array slice and LOD values are appended.

NOTE: This is a candidate for the 9.1 branch.
2013-02-04 17:03:25 +01:00
Michel Dänzer
e5fb7347a7 radeonsi: Adapt to sample intrinsics changes.
Fix up intrinsic names, and bitcast texture address parameters to integers.

NOTE: This is a candidate for the 9.1 branch.
2013-02-04 17:03:25 +01:00
Brian Paul
624528834f st/mesa: simplify the update_single_texture() function
In particular, rework the sRGB/linear format selection code.
There's no reason to mess with the Mesa format.
Just do everything in terms of the gallium pipe_format.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-04 08:28:17 -07:00
Brian Paul
5f81549f6c st/mesa: merge st_ChooseTextureFormat_renderable() into st_ChooseTextureFormat()
That was the only place it was being called from.
2013-02-04 08:28:17 -07:00
Brian Paul
f54a9f4ff2 st/mesa: improve the format choosing code for DrawPixels
The code before was getting a pipe format, then calling
st_pipe_format_to_mesa_format() and then converting back again with
st_mesa_format_to_pipe_format().  This removes one conversion step.
2013-02-04 08:28:17 -07:00
Andreas Boll
38d65a9769 gallium: handle unhandled PIPE_CAP_TEXTURE_BUFFER_OFFSET_ALIGNMENT
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60098

Signed-off-by: Brian Paul <brianp@vmware.com>
2013-02-04 08:28:17 -07:00
Brian Paul
4df42890c5 st/mesa: don't choose DXT formats if we can't do DXT compression
If we call gl[Copy]TexImage2D() with a generic compression format
(e.g. intFormat=GL_COMPRESSED_RGBA) we can't choose a DXT format if
we don't have the external DXT compression library.

We weren't actually enforcing this before since the
pipe_screen::is_format_supported(DXT) query has no dependency on
the DXT compression library.

Now if we're given a generic compressed format and we can't do DXT
compression we'll fall back to a non-compressed format.

v2: use util_format_is_s3tc() function and add more comments about
the allow_dxt parameter.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-04 07:58:21 -07:00
Brian Paul
478056b81a mesa: don't use format chooser code for glCompressedTexImage
When glCompressedTexImage is called the internalFormat is a specific
format for the incoming image and the hardware format should be
the same (since we never do format transcoding).  So use the simpler
_mesa_glenum_to_compressed_format() function.  This change is also
needed for the next patch.

Note: This is a candidate for the stable branches.
2013-02-04 07:58:21 -07:00
Kenneth Graunke
44aa2e15f6 i965: Fix the SF Vertex URB Read Length calculation for Gen7 platforms.
Ivybridge doesn't appear to have the same errata as Sandybridge; no
corruption was observed by setting it to more than the minimal correct
value.  It's possible that we were simply lucky, since the URB entries
are 1024-bit on Ivybridge vs. 512-bit on Sandybridge.  Or perhaps the
underlying hardware issue is fixed.

Either way, we may as well program the minimum value since it's now
readily available, likely to be more efficient, and possibly more
correct.

v2: Use GEN7_SBE_* defines rather than GEN6_SF_*.  (A copy and paste
    mistake.)  They're the same, but using the right names is better.

NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-03 13:41:09 -08:00
Kenneth Graunke
09fbc29828 i965: Fix the SF Vertex URB Read Length calculation for Sandybridge.
(This commit message was primarily written by Paul Berry, who explained
 what's going on far better than I would have.)

Previous to this patch, we thought that the only restrictions on
3DSTATE_SF's URB read length were (a) it needs to be large enough to
read all the VUE data that the SF needs, and (b) it can't be so large
that it tries to read VUE data that doesn't exist.  Since the VUE map
already tells us how much VUE data exists, we didn't bother worrying
about restriction (a); we just did the easy thing and programmed the
read length to satisfy restriction (b).

However, we didn't notice this erratum in the hardware docs: "[errata]
Corruption/Hang possible if length programmed larger than recommended".
Judging by the context surrounding this erratum, it's pretty clear that
it means "URB read length must be exactly the size necessary to read all
the VUE data that the SF needs, and no larger".  Which means that we
can't program the read length based on restriction (b)--we have to
program it based on restriction (a).

The URB read size needs to precisely match the amount of data that the
SF consumes; it doesn't work to simply base it on the size of the VUE.

Thankfully, the PRM contains the precise formula the hardware expects.
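
The commit message does not quote the formula.  The sketch below assumes it
is read_length = ceiling((max_source_attr + 1) / 2), i.e. that each unit of
read length covers two attributes; the function name is hypothetical.

```c
#include <assert.h>

/* Smallest read length that still covers attribute max_source_attr,
 * assuming two attributes per read-length unit:
 *     ceil((max_source_attr + 1) / 2)
 * Integer form: add 1 before halving to round up. */
static unsigned sf_urb_read_length(unsigned max_source_attr)
{
    unsigned attr_count = max_source_attr + 1;
    return (attr_count + 1) / 2;
}
```

Under that assumption, a maximum source attribute of 5 (six attributes) needs
a read length of 3, no matter how much more data the VUE happens to hold.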

Fixes random UI corruption in Steam's "Big Picture Mode", random terrain
corruption in PlaneShift, and Piglit's fbo-5-varyings test.

NOTE: This is a candidate for all stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56920
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60172
Tested-by: Jordan Justen <jordan.l.justen@intel.com> (v1/Piglit)
Tested-by: Martin Steigerwald <martin@lichtvoll.de> (PlaneShift)
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-03 13:40:45 -08:00
Kenneth Graunke
5e9bc7bd12 i965: Compute the maximum SF source attribute.
The maximum SF source attribute is necessary to compute the Vertex URB
read length properly, which will be done in the next commit.

NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-03 13:40:43 -08:00
Kenneth Graunke
b3efc5bea8 i965: Refactor Gen6+ SF attribute override code.
The next patch will benefit from easy access to the source attribute
number and whether or not we're swizzling.  It doesn't want the final
attr_override DWord form, however.

NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-03 13:40:31 -08:00
Kenneth Graunke
488ddb247c glsl: Remove hash table from ir_set_program_inouts pass.
Back when ir_var_in and ir_var_out signified both function parameters
and shader input/outputs, we had trouble distinguishing the two when
looking at a dereference.  Now that we have separate ir_var_shader_in
and ir_var_shader_out modes, we can determine this easily.

Removing the hash table saves memory and CPU overhead.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-03 13:38:16 -08:00
Kenneth Graunke
b56d6badad i965: Remove dead field brw_wm_prog_data::error. 2013-02-03 13:38:16 -08:00
Kenneth Graunke
7eda7a455b i965: Remove dead field brw_context::constant_map.
This was used by the old VS backend, but that's long gone.
2013-02-03 13:38:16 -08:00
Vinson Lee
8a4d952d10 r600g: Fix memory leak.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-01 22:52:22 -08:00
Vinson Lee
080e91aa07 egl/dri2: Fix memory leak.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-01 22:50:34 -08:00
Vinson Lee
cea341fce8 nv30: Fix memory leak.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-01 22:50:26 -08:00
Vinson Lee
4cd4deab48 nv50: Fix memory leak.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-01 22:50:16 -08:00
Vinson Lee
0580f165ed nvc0: Fix memory leak.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-01 22:50:01 -08:00
Vinson Lee
985e710c0d swrast: Fix memory leak.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-01 22:49:45 -08:00
Quentin Glidic
1e857130f0 configure.ac: Fix --with-llvm-shared-libs
The third argument of AC_ARG_WITH is evaluated for any provided value,
not only on --with-, so it must not force-enable the feature.
Also, setting $with_llvm_shared_libs in the opencl check was overriding
the user switch.

https://bugs.freedesktop.org/show_bug.cgi?id=59851

Signed-off-by: Quentin Glidic <sardemff7+git@sardemff7.net>
2013-02-01 22:53:46 +00:00
Tom Stellard
257006e2a4 r600g/llvm: Select the correct GPU type for RV670
RV670 belongs in the R600 chip class

https://bugs.freedesktop.org/show_bug.cgi?id=58666

NOTE: This is a candidate for the 9.1 branch
2013-02-01 22:53:30 +00:00
Abdiel Janulgue
6c7e95cb89 intel: implement create image from texture
Save miptree level info to DRIImage:
- Appropriately-aligned base offset pointing to the image
- Additional x/y adjustment offsets from above.

v8:  -Bump intelImageExtension version
v9:  -Don't use internal _eglError but implement error reporting in new DRI interface
      instead. This fixes Android build problems based on feedback from
      Adrian M Negreanu and Chad Versace.
     -Move the non-tile-aligned check and error-reporting to intel_set_texture_image_region
v10: -Don't #include "egl/main/eglcurrent.h". [chadv]

Reviewed-by: Eric Anholt <eric@anholt.net> (v6)
Acked-by: Chad Versace <chad.versace@linux.intel.com> (v10)
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-01 11:58:13 -08:00
Abdiel Janulgue
8e2454c562 intel: Account for mt->offset in intel_miptree_map
We need to take into account the offset from the original bo when using glTexSubImage()
and other functions that manipulate the subregion of an exported texture.
Offsets are appended to mapped region address and when blitting from a source
region.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-01 11:58:12 -08:00
Abdiel Janulgue
11f5c82e83 intel: Create a miptree using offsets in intel_set_texture_image_region
When binding a region to a texture image, re-create the miptree base-level
considering the offset and dimension information exported by DRIImage.

v8: - Move the alignment surface address checks from the image-from-texture
      code to the texture-from-image side. This allows the error reporting to conform to
      OES_EGL_Image and to prevent mixing up EGL and GL errors. Reported by Chad Versace.
    - Addressed an existing issue in renderbuffer case where there is a
      possibility of creating EGL images out of depthstencil textures which isn't
      really possible. This was spotted by Eric earlier.

Reviewed-by: Eric Anholt <eric@anholt.net> (v6)
Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v8)
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-01 11:58:12 -08:00
Abdiel Janulgue
45a28a927a i965: Account for offsets when updating SURFACE_STATE.
If the offsets are present, this lets us specify a particular level and slice
in a shared region using the base level of an exported mip-map tree.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-01 11:58:12 -08:00
Abdiel Janulgue
163b35e416 intel: add pixel offset calculator for miptree levels
Add helper to calculate fine-grained x and y adjustment pixels
to an image within a miptree level for tiled regions.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-01 11:58:12 -08:00
Abdiel Janulgue
7014df0d1d intel: Expose intel_miptree_create_internal as intel_miptree_create_layout.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-01 11:58:12 -08:00
Abdiel Janulgue
f9e4e5f9f9 intel: expose dimensions and offsets of a miptree level in DRIImage
v8: - Append has_depthstencil field in DRIImage structure.

Reviewed-by: Eric Anholt <eric@anholt.net> (v6)
Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v8)
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-01 11:58:12 -08:00
Abdiel Janulgue
7b7af48e01 dri2: Create image from texture
Add create image from texture extension and bump version.

v8: - Add appropriate image errors codes in DRI interface so we don't
      have to use internal EGL functions in driver. Suggested by Chad Versace.

Reviewed-by: Eric Anholt <eric@anholt.net> (v6)
Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v8)
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-01 11:58:12 -08:00
Michel Dänzer
a8a5055f2d radeonsi: Fix draws using user index buffer.
Was broken since commit bf469f4edc
('gallium: add void *user_buffer in pipe_index_buffer').

Fixes 11 piglit tests and lots of missing geometry e.g. in TORCS.

NOTE: This is a candidate for the 9.1 branch.
2013-02-01 18:53:03 +01:00
Brian Paul
1bb52bab9e st/mesa: whitespace/indentation fix 2013-02-01 08:00:28 -07:00
Brian Paul
3cb4915344 svga: check for NaN shader immediates
The svga device doesn't handle them.  Replace with zeros.
Fixes several piglit tests, such as "glsl-const-builtin-inversesqrt".

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-02-01 08:00:28 -07:00
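The NaN handling described in the svga commit above can be illustrated with a small sketch. This is not the driver's code (which is C); the helper name sanitize_immediates is hypothetical, and the sketch only shows the stated policy of replacing NaN immediates with zeros.

```python
import math


def sanitize_immediates(values):
    """Return a copy of `values` with every NaN replaced by 0.0.

    Other values, including infinities, are passed through unchanged.
    """
    return [0.0 if math.isnan(v) else v for v in values]


if __name__ == "__main__":
    print(sanitize_immediates([1.0, float("nan"), -2.5]))
```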
Brian Paul
9eff5e905f svga: add, use SVGA3D_SURFACE_HINT_VOLUME flag
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-02-01 08:00:28 -07:00
Brian Paul
9a91ce9448 trace: measure time for each gallium call
To get a rough idea of how much time is spent in each gallium driver
function.  The time is measured in microseconds.
2013-02-01 08:00:28 -07:00
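As a rough illustration of the per-call timing described in the trace commit above, the sketch below wraps a function and records how long each call took, in microseconds. The names timed, call_log and draw_vbo are hypothetical; the real trace driver is written in C and its output format is not shown here.

```python
import time
from functools import wraps

# One (function name, elapsed microseconds) entry per call; hypothetical log.
call_log = []


def timed(func):
    """Record the wall-clock duration of every call to `func`."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_us = (time.perf_counter() - start) * 1e6
            call_log.append((func.__name__, elapsed_us))
    return wrapper


@timed
def draw_vbo(n):
    # Stand-in for a driver entry point; just burns a little CPU time.
    return sum(range(n))
```

Each call appends one entry, so a caller can aggregate per function afterwards to get the rough per-function picture the commit mentions.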
Brian Paul
b516bf46ef trace: add void to function definition 2013-02-01 08:00:28 -07:00
Brian Paul
fe20e3ebb5 trace: allow GALLIUM_TRACE=stdout/stderr 2013-02-01 08:00:28 -07:00
Marek Olšák
225228a7f5 radeonsi: port some of get_shader_param changes from r600g
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-01 15:16:35 +01:00
Marek Olšák
cc5fdaf2dc mesa: don't expose IBM_rasterpos_clip in a core context
glRasterPos doesn't exist in the core profile.

NOTE: This is a candidate for the stable branches (9.0 and 9.1).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-01 15:16:35 +01:00
Marek Olšák
a06f03d795 r300g: always put MSAA resources in VRAM
This along with the latest drm-fixes branch should help with bad performance
of MSAA. Remember: Nx MSAA can't be more than N times slower (where N=2,4,6).

Anyway, I recommend at least 512 MB of VRAM for Full HD 6x MSAA.

NOTE: This is a candidate for the 9.1 branch.
2013-02-01 15:16:35 +01:00
Michel Dänzer
3b888f534c configure.ac: GLX cannot work without OpenGL
GLX uses mapi/glapi/libglapi.la, which is only built for OpenGL.

If the user specified --enable-xlib-glx --disable-opengl, error out, as these
cannot be both observed at the same time. If the user just specified
--disable-opengl but not --disable-glx, print a warning and disable GLX as
well.

NOTE: This is a candidate for the stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59364

Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-02-01 11:42:09 +01:00
Vadim Girlin
9824755dae r600g: remove broken assert from r600_isa.c
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-02-01 13:19:35 +04:00
Vadim Girlin
e42111ecba r600g: implement shader disassembler v3
R600_DUMP_SHADERS environment var now allows choosing the dump method:
 0 (default) - no dump
 1 - full dump (old dump)
 2 - disassemble
 3 - both

v2: fix output for burst_count > 1
v3: use more human-readable output for kcache data in CF_ALU_xxx clauses,
    improve output for ALU_EXTENDED, other minor fixes

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-02-01 12:08:42 +04:00
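The R600_DUMP_SHADERS values listed in the disassembler commit above read like a two-bit mask (1 = full dump, 2 = disassemble, 3 = both). A minimal sketch of that interpretation follows; the bit-mask reading, the constant names and the handling of unparsable values are assumptions, not taken from the driver source.

```python
import os

DUMP_FULL = 1    # "full dump (old dump)"
DUMP_DISASM = 2  # "disassemble"


def dump_methods(env=None):
    """Return (full_dump, disassemble) flags for R600_DUMP_SHADERS."""
    env = os.environ if env is None else env
    try:
        value = int(env.get("R600_DUMP_SHADERS", "0"))
    except ValueError:
        value = 0  # assumption: an unparsable value means "no dump"
    return bool(value & DUMP_FULL), bool(value & DUMP_DISASM)
```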
Vadim Girlin
022122ee63 r600g: use tables with ISA info v3
v3: added some flags including condition codes for ALU,
    fixed issue with CF reverse lookup (overlapping ranges of CF_ALU_xxx
    and other CF instructions)
    rebased on current master

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-02-01 12:08:42 +04:00
Vinson Lee
b68a3b865b glapi: Do not use backtrace on MinGW.
execinfo.h is not available on MinGW.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-01-31 23:23:12 -08:00
Jerome Glisse
5e0c956cb2 r600g: add cs memory usage accounting and limit it v3
We are now seeing cs that can go over the vram+gtt size. To avoid
failing, flush early any cs that goes over 70% (gtt+vram) usage. 70%
is used to allow for some fragmentation.

The idea is to compute a gross estimate of memory requirement of
each draw call. After each draw call, memory will be precisely
accounted. So the uncertainty is only on the current draw call.
In practice this gave a very good estimate (+/- 10% of the target
memory limit).

v2: Remove left over from testing version, remove useless NULL
    checking. Improve commit message.
v3: Add comment to code on memory accounting precision

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-01-31 14:23:52 -05:00
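The accounting scheme described in the r600g commit above can be sketched roughly as follows: a gross estimate is checked before each draw, and the precise usage is added afterwards. The class and method names are hypothetical and the sizes are arbitrary units; only the 70% threshold and the estimate-then-account order come from the commit message.

```python
class CommandStream:
    """Toy model of cs memory accounting with an early-flush threshold."""

    def __init__(self, vram_size, gtt_size, limit_percent=70):
        # Flush once usage would exceed 70% of (gtt + vram).
        self.limit = (vram_size + gtt_size) * limit_percent // 100
        self.used = 0      # precisely accounted memory of past draws
        self.flushes = 0

    def flush(self):
        self.used = 0
        self.flushes += 1

    def draw(self, estimate, actual):
        # Only the current draw is uncertain: check its gross estimate
        # first, then account the precise amount once it is known.
        if self.used + estimate > self.limit:
            self.flush()
        self.used += actual
```

With 600 units of VRAM and 400 of GTT the limit is 700, so a third draw of 300 units triggers a flush before it is recorded.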
Marek Olšák
5c86a728d4 r600g: fix htile buffer leak
NOTE: This is a candidate for the 9.1 branch.
2013-01-31 15:35:18 +01:00
Andreas Boll
6ea753b056 mesa: bump version to 9.2 (devel)
Now that branch 9.1 is created, bump the minor version in
master.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-31 09:01:15 +01:00
Matt Turner
a527b2192e Revert "mesa: Return INVALID_OPERATION when type is known but not allowed"
This reverts commit 2906e2034c.

Fixes a regression in the glean depthStencil test.

Reverting this does not affect any tests in es3conform, so a more recent
patch must have also fixed the failure this one was intended to fix.

Reported-by: lu hua <huax.lu@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59494
2013-01-30 10:56:01 -08:00
Kenneth Graunke
7cccf46ec4 mesa: Add TexBufferRange to dispatch_sanity.
Christoph implemented this, so we should expect it to be present now.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60082
2013-01-30 10:48:05 -08:00
Christoph Bumiller
4bdf5454a5 nv50,nvc0: fix/enable texture buffer objects 2013-01-30 13:10:11 +01:00
Christoph Bumiller
a901d54f67 st/mesa: add support for GL_ARB_texture_buffer_range
v2: Update to handle BufferSize being -1 and return a NULL sampler
view if the specified range would cause out of bounds access.

Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-30 13:10:11 +01:00
Christoph Bumiller
0fcd2c5e2f gallium: add PIPE_CAP_TEXTURE_BUFFER_OFFSET_ALIGNMENT
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-30 13:10:11 +01:00
Christoph Bumiller
785a8c3beb mesa: implement GL_ARB_texture_buffer_range
v2: Record texObj.BufferSize as -1 in TexBuffer(non-Range) instead
of the buffer's current size so we know we always have to use the
full size of the buffer object (i.e. even if it changes without the
user calling TexBuffer again) for the texture.

Clarify invalid offset alignment error message.

v3: Use extra GL_CORE-only section in get_hash_params.py for
TEXTURE_BUFFER_OFFSET_ALIGNMENT.

v4: Remove unnecessary check for profile in _mesa_TexBufferRange.
Add check for extension enable in get_tex_level_parameter_buffer.

v5: Fix position in gl_API.xml.
Add comment about meaning of BufferSize == -1.

v6: Add back checks for core profile and add a note about it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-30 13:10:10 +01:00
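The BufferSize == -1 convention from the commit above can be sketched as below. The class and function names are hypothetical; the sketch only shows that a binding made without an explicit range follows the buffer's current size. The out-of-bounds case returning None mirrors the NULL sampler view mentioned in the st/mesa commit in this log, and is an assumption about how the two fit together.

```python
WHOLE_BUFFER = -1  # sentinel: always use the buffer's current full size


class TextureBufferBinding:
    def __init__(self, offset=0, size=WHOLE_BUFFER):
        self.offset = offset
        self.size = size


def effective_range(binding, current_buffer_size):
    """Return (offset, size) to sample from, or None if out of bounds."""
    if binding.size == WHOLE_BUFFER:
        # TexBuffer (non-Range): track the buffer even if it was resized.
        return 0, current_buffer_size
    if binding.offset + binding.size > current_buffer_size:
        return None
    return binding.offset, binding.size
```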
Matt Turner
02b6da1e87 build: Add missing comma in AS_IF
Reported-by: Lauri Kasanen <curaga@operamail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47248#c15
2013-01-29 13:19:18 -08:00
Brian Paul
ce6bf2d4c5 mesa: remove ctx->Driver.Error() hook
Not used by any driver anymore.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-29 12:32:13 -07:00
Stéphane Marchesin
67e7263e45 glx: Check that swap_buffers_reply is non-NULL before using it
Check that the return value from xcb_dri2_swap_buffers_reply is
non-NULL before accessing the struct members.

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-29 11:15:22 -08:00
Brian Paul
70c5297439 mesa: fix comment typo: s/formaat/format/ 2013-01-29 11:53:24 -07:00
José Fonseca
42f762dcf6 llvmpipe: Don't advertise S8_UNORM (with feeble attempt at supporting it).
S8_UNORM was inadvertently supported together with Z16_UNORM.

I tried to update the code to accommodate stencil-only -- it seemed a simple
thing to do -- but "fbo-stencil clear GL_STENCIL_INDEX8" still fails,
and it's not worth debugging.

Therefore although this change tries to update for S8_UNORM, it also
disables it completely.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-01-29 16:41:56 +00:00
José Fonseca
3b683700ef llvmpipe: Fix deferred depth writes for Z16_UNORM.
This special path hadn't been exercised by my earlier testing, and mask
values weren't being properly truncated to match the values.

This change fixes that.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-01-29 16:41:56 +00:00
Roland Scheidegger
0eb588a37c draw: fix draw_llvm_variant_key struct padding to avoid recompiles
The struct padding got broken by c789b981b2.
This caused serious performance regression because part of the key was
uninitialized and hence the shader always recompiled (at least on release
builds...).
While here also fix key size calculation when the number of samplers
and the number of sampler views are different.

v2: add comment

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-29 08:40:52 -08:00
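The failure mode described in the draw commit above can be reproduced in miniature with ctypes. The sketch assumes the variant cache compares keys by their raw bytes; the struct layout and field names are made up and are not those of draw_llvm_variant_key. Stale memory is simulated by pre-filling the buffer with a fill byte.

```python
import ctypes


class Key(ctypes.Structure):
    # One byte followed by a 32-bit field: padding is inserted after
    # `nr_samplers` so that `state` is 4-byte aligned.
    _fields_ = [("nr_samplers", ctypes.c_uint8),
                ("state", ctypes.c_uint32)]


def make_key(fill, nr_samplers, state):
    """Build a key on top of memory pre-filled with `fill`."""
    raw = bytearray([fill]) * ctypes.sizeof(Key)
    key = Key.from_buffer(raw)
    key.nr_samplers = nr_samplers
    key.state = state
    return bytes(raw)


# Same field values, different leftover padding: a byte-wise compare
# sees two different keys, so a lookup would miss and recompile.
stale_a = make_key(0x00, 2, 7)
stale_b = make_key(0xAA, 2, 7)
```

Zeroing the whole key before filling in the fields (fill 0x00 in this sketch) makes equal field values produce equal bytes.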
Marek Olšák
845130951f docs/relnotes-9.1: document new features in radeon drivers 2013-01-29 17:35:17 +01:00
Brian Paul
d83336ce3e docs: more VMware guest driver info, tips 2013-01-29 08:59:53 -07:00
Brian Paul
c80bacba2e st/mesa: only enable GL_EXT_framebuffer_multisample if GL_MAX_SAMPLES >= 2
We never really have multisampling with one sample per pixel.
See also http://bugs.freedesktop.org/show_bug.cgi?id=59873

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-01-29 08:59:53 -07:00
Brian Paul
8f3c81d018 mesa: don't enable GL_EXT_framebuffer_multisample for software drivers
Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-01-29 08:59:53 -07:00
Brian Paul
2180f32972 osmesa: use _mesa_generate_mipmap() for mipmap generation, not meta
See previous commit for more info.

Note: This is a candidate for the 9.0 branch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-29 08:59:53 -07:00
Brian Paul
89551ae04f xlib: use _mesa_generate_mipmap() for mipmap generation, not meta
The swrast fragment program interpreter has trouble computing the
right texture LOD because it doesn't have easy access to input
derivatives.  This causes the GLSL-based meta generate mipmap code
to fetch texels from the wrong mipmap level.

One possible fix would be to set the GL_TEXTURE_MIN/MAX_LOD parameters
to limit sampling from the right level.  But let's just use the
_mesa_generate_mipmap() fallback since it's a lot faster than using
the fragment shader interpreter.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=54240

Note: This is a candidate for the 9.0 branch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-29 08:59:53 -07:00
Brian Paul
d60da27273 st/mesa: set ctx->Const.MaxSamples = 0, not 1
The gallium docs for pipe_screen::is_format_supported() say that
samples==0 or samples==1 both mean that multisampling is not supported.
Return GL_MAX_SAMPLES==0 instead of 1 for consistency with other drivers.

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-01-29 08:59:53 -07:00
Brian Paul
4e41ae5fc1 xlib: stop using _mesa_enable_extension(), just set the boolean flags
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-29 08:59:53 -07:00
Brian Paul
becec657d6 xlib: fix incorrect GL_ANGLE_texture_compression_dxt enable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-29 08:59:53 -07:00
José Fonseca
0ca384fb39 llvmpipe: Support Z16_UNORM as depth-stencil format.
Simply by adjusting the vector element width after/before
reading/writing the depth-stencil values.

Ran several GL_DEPTH_COMPONENT16 piglit tests without regressions.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-01-29 07:06:36 +00:00
Kenneth Graunke
9add4e8038 i965: Add chipset limits for Haswell GT1/GT2.
The maximum number of URB entries comes from the 3DSTATE_URB_VS and
3DSTATE_URB_GS state packet documentation; the thread count information
comes from the 3DSTATE_VS and 3DSTATE_PS state packet documentation.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
2013-01-28 17:08:28 -08:00
Kenneth Graunke
7b07808f74 intel: Un-hardcode lengths from blitter commands.
The packet length may change at some point in the future.  Specifying it
explicitly (rather than hardcoding it in the command #define) allows us
to change it much more easily in the future.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-28 16:47:52 -08:00
Matt Turner
1b3ec16cc2 Remove APIspec.dtd
Left behind by a8ab7e33.
2013-01-28 16:48:38 -08:00
Matt Turner
6324521789 docs: List new extensions added in Mesa 9.1
I did not list the *_get_program_binary extensions since they're not
useful to anyone with their current implementation (that supports 0
binary formats).
2013-01-28 16:48:38 -08:00
Eric Anholt
99fe2b36cf intel: Use a CPU map of the batch on LLC-sharing architectures.
Before, we were keeping a CPU-only buffer to accumulate the batchbuffer in,
which was an improvement over mapping the batch through the GTT directly
(since any readback or other failure to stream through write combining
correctly would hurt).  However, on LLC-sharing architectures we can do better
by mapping the batch directly, which reduces the cache footprint of the
application since we no longer have this extra copy of a batchbuffer around.

Improves performance of GLBenchmark 2.1 offscreen on IVB by 3.5% +/- 0.4%
(n=21).  Improves Lightsmark performance by 1.1 +/- 0.1% (n=76).  Improves
cairo-gl performance by 1.9% +/- 1.4% (n=57).

No statistically significant difference in GLB2.1 on SNB (n=37).  Improves
cairo-gl performance by 2.1% +/- 0.1% (n=278).
2013-01-29 11:25:14 +11:00
Jerome Glisse
e1598cb642 r600g: use uint64_t instead of unsigned long for proper 32bits cpu support
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-01-28 19:09:52 -05:00
Jerome Glisse
da638781f6 r600g: real fix for non 3.8 kernel
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-01-28 17:17:00 -05:00
Vinson Lee
1559994cba i965: Fix assignment instead of comparison in asserts.
Fixes side effect in assertion defects reported by Coverity.

Note: This is a candidate for the 9.1 branch.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-28 13:51:10 -08:00
Tapani Pälli
407029591c android: use gralloc_drm_get_gem_handle api
Currently a gralloc internal structure is exposed to Mesa.
Use a query function instead to maintain ABI compatibility.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-28 12:49:41 -08:00
Paul Berry
8e4bb4bc09 intel: Typo fix: "pitsh" -> "pitch"
Comment change only.
2013-01-28 12:31:25 -08:00
2320 changed files with 195175 additions and 93389 deletions

@@ -35,6 +35,8 @@ LOCAL_C_INCLUDES += \
# define ANDROID_VERSION (e.g., 4.0.x => 0x0400)
LOCAL_CFLAGS += \
-DPACKAGE_VERSION=\"9.2.0-devel\" \
-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\" \
-DANDROID_VERSION=0x0$(MESA_ANDROID_MAJOR_VERSION)0$(MESA_ANDROID_MINOR_VERSION)
LOCAL_CFLAGS += \
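The ANDROID_VERSION encoding in the hunk above (4.0.x => 0x0400) is a textual expansion of the major and minor digits. A minimal sketch of the same encoding, assuming single-digit major and minor versions; the function name is hypothetical.

```python
def android_version(major, minor):
    """Encode a version the way 0x0$(MAJOR)0$(MINOR) expands."""
    if not (0 <= major <= 9 and 0 <= minor <= 9):
        # The makefile expansion only yields a 4-digit hex literal
        # for single-digit components.
        raise ValueError("major and minor must be single digits")
    return int("0x0%d0%d" % (major, minor), 16)
```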

@@ -24,7 +24,7 @@
# BOARD_GPU_DRIVERS should be defined. The valid values are
#
# classic drivers: i915 i965
# gallium drivers: swrast i915g nouveau r300g r600g radeonsi vmwgfx
# gallium drivers: swrast i915g ilo nouveau r300g r600g radeonsi vmwgfx
#
# The main target is libGLES_mesa. For each classic driver enabled, a DRI
# module will also be built. DRI modules will be loaded by libGLES_mesa.
@@ -42,7 +42,7 @@ DRM_TOP := external/drm
DRM_GRALLOC_TOP := hardware/drm_gralloc
classic_drivers := i915 i965
gallium_drivers := swrast i915g nouveau r300g r600g radeonsi vmwgfx
gallium_drivers := swrast i915g ilo nouveau r300g r600g radeonsi vmwgfx
MESA_GPU_DRIVERS := $(strip $(BOARD_GPU_DRIVERS))

@@ -36,7 +36,6 @@ check-local:
# Rules for making release tarballs
PACKAGE_VERSION=9.1-devel
PACKAGE_DIR = Mesa-$(PACKAGE_VERSION)
PACKAGE_NAME = MesaLib-$(PACKAGE_VERSION)

@@ -69,6 +69,11 @@ if env['gles']:
#######################################################################
# Environment setup
env.Append(CPPDEFINES = [
('PACKAGE_VERSION', '\\"9.2.0-devel\\"'),
('PACKAGE_BUGREPORT', '\\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\\"'),
])
# Includes
env.Prepend(CPPPATH = [
'#/include',

bin/bugzilla_mesa.sh Executable file

@@ -0,0 +1,52 @@
#!/bin/bash
# This script is used to generate the list of fixed bugs that
# appears in the release notes files, with HTML formatting.
#
# Note: This script could take a while until all details have
# been fetched from bugzilla.
#
# Usage examples:
#
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 > bugfixes
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee bugfixes
# $ DRYRUN=yes bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3
# $ DRYRUN=yes bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 | wc -l
# regex pattern: trim before url
trim_before='s/.*\(http\)/\1/'
# regex pattern: trim after url
trim_after='s/\(show_bug.cgi?id=[0-9]*\).*/\1/'
# regex pattern: always use https
use_https='s/http:/https:/'
# extract fdo urls from commit log
urls=$(git log $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before -e $trim_after -e $use_https | sort | uniq)
# if DRYRUN is set to "yes", simply print the URLs and don't fetch the
# details from fdo bugzilla.
#DRYRUN=yes
if [ "x$DRYRUN" = xyes ]; then
    for i in $urls
    do
        echo $i
    done
else
    echo "<ul>"
    echo ""
    for i in $urls
    do
        id=$(echo $i | cut -d'=' -f2)
        summary=$(wget --quiet -O - $i | grep -e '<title>.*</title>' | sed -e 's/ *<title>Bug [0-9]\+ &ndash; \(.*\)<\/title>/\1/')
        echo "<li><a href=\"$i\">Bug $id</a> - $summary</li>"
        echo ""
    done
    echo "</ul>"
fi
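The URL extraction step of the script above (grep plus three sed expressions, then sort | uniq) can be sketched in Python as follows. This is an illustrative equivalent, not part of the commit; the function name is hypothetical, and sorting is by code point rather than by locale as sort(1) would do.

```python
import re


def extract_bug_urls(log_text):
    """Return the sorted, de-duplicated fdo bug URLs found in a log."""
    urls = set()
    for line in log_text.splitlines():
        if "bugs.freedesktop.org/show_bug" not in line:
            continue
        # trim_before: drop everything before the last "http"
        line = re.sub(r".*(http)", r"\1", line, count=1)
        # trim_after: drop everything after the bug id
        line = re.sub(r"(show_bug.cgi\?id=[0-9]*).*", r"\1", line, count=1)
        # use_https: "http:" does not occur inside "https:", so URLs
        # that are already https are left alone
        line = re.sub(r"http:", "https:", line, count=1)
        urls.add(line)
    return sorted(urls)
```

Trailing text such as a comment anchor ("#c15") or surrounding prose is dropped by the trim_after step, so the same bug referenced in different ways collapses to one URL.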

@@ -1,6 +1,12 @@
#!/bin/sh
# Script for generating a list of candidates for cherry-picking to a stable branch
#
# Usage examples:
#
# $ bin/get-pick-list.sh
# $ bin/get-pick-list.sh > picklist
# $ bin/get-pick-list.sh | tee picklist
# Grep for commits with "cherry picked from commit" in the commit message.
git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
@@ -8,7 +14,7 @@ git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^[[:space:]]*NOTE: This is a candidate' HEAD..origin/master |\
git log --reverse --pretty=%H -i --grep='^[[:space:]]*NOTE: .*[Cc]andidate' HEAD..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.
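The effect of the --grep pattern change in the hunk above can be checked with a small sketch. The two patterns are transcribed to Python syntax ([[:space:]] becomes \s, and git's -i becomes re.IGNORECASE); the helper name and the third sample message are made up for illustration.

```python
import re

OLD = re.compile(r"^\s*NOTE: This is a candidate", re.IGNORECASE)
NEW = re.compile(r"^\s*NOTE: .*[Cc]andidate", re.IGNORECASE)


def is_candidate(message, pattern):
    """True if any line of the commit message matches the pattern."""
    return any(pattern.match(line) for line in message.splitlines())


messages = [
    "NOTE: This is a candidate for the 9.1 branch.",
    "    Note: This is a candidate for the 9.0 branch.",
    "NOTE: Candidate for the stable branches.",   # hypothetical wording
]
for msg in messages:
    print(is_candidate(msg, OLD), is_candidate(msg, NEW), msg.strip())
```

The looser pattern still accepts the standard wording and additionally picks up notes that mention "candidate" in a different phrasing.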

bin/perf-annotate-jit Executable file

@@ -0,0 +1,251 @@
#!/usr/bin/env python
#
# Copyright 2012 VMware Inc
# Copyright 2008-2009 Jose Fonseca
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
#
"""Perf annotate for JIT code.
Linux `perf annotate` does not work with JIT code. This script takes the data
produced by `perf script` command, plus the diassemblies outputed by gallivm
into /tmp/perf-XXXXX.map.asm and produces output similar to `perf annotate`.
See docs/llvmpipe.html for usage instructions.
The `perf script` output parser was derived from the gprof2dot.py script.
"""
import sys
import os.path
import re
import optparse
import subprocess
class Parser:
    """Parser interface."""

    def __init__(self):
        pass

    def parse(self):
        raise NotImplementedError


class LineParser(Parser):
    """Base class for parsers that read line-based formats."""

    def __init__(self, file):
        Parser.__init__(self)
        self._file = file
        self.__line = None
        self.__eof = False
        self.line_no = 0

    def readline(self):
        line = self._file.readline()
        if not line:
            self.__line = ''
            self.__eof = True
        else:
            self.line_no += 1
        self.__line = line.rstrip('\r\n')

    def lookahead(self):
        assert self.__line is not None
        return self.__line

    def consume(self):
        assert self.__line is not None
        line = self.__line
        self.readline()
        return line

    def eof(self):
        assert self.__line is not None
        return self.__eof


mapFile = None

def lookupMap(filename, matchSymbol):
    global mapFile
    mapFile = filename
    stream = open(filename, 'rt')
    for line in stream:
        start, length, symbol = line.split()
        start = int(start, 16)
        length = int(length,16)
        if symbol == matchSymbol:
            return start
    return None

def lookupAsm(filename, desiredFunction):
    stream = open(filename + '.asm', 'rt')
    while stream.readline() != desiredFunction + ':\n':
        pass
    asm = []
    line = stream.readline().strip()
    while line:
        addr, instr = line.split(':', 1)
        addr = int(addr)
        asm.append((addr, instr))
        line = stream.readline().strip()
    return asm


samples = {}

class PerfParser(LineParser):
    """Parser for linux perf callgraph output.

    It expects output generated with

        perf record -g
        perf script
    """

    def __init__(self, infile, symbol):
        LineParser.__init__(self, infile)
        self.symbol = symbol

    def readline(self):
        # Override LineParser.readline to ignore comment lines
        while True:
            LineParser.readline(self)
            if self.eof() or not self.lookahead().startswith('#'):
                break

    def parse(self):
        # read lookahead
        self.readline()
        while not self.eof():
            self.parse_event()
        asm = lookupAsm(mapFile, self.symbol)
        addresses = samples.keys()
        addresses.sort()
        total_samples = 0
        sys.stdout.write('%s:\n' % self.symbol)
        for address, instr in asm:
            try:
                sample = samples.pop(address)
            except KeyError:
                sys.stdout.write(6*' ')
            else:
                sys.stdout.write('%6u' % (sample))
                total_samples += sample
            sys.stdout.write('%6u: %s\n' % (address, instr))
        print 'total:', total_samples
        assert len(samples) == 0
        sys.exit(0)

    def parse_event(self):
        if self.eof():
            return
        line = self.consume()
        assert line
        callchain = self.parse_callchain()
        if not callchain:
            return

    def parse_callchain(self):
        callchain = []
        while self.lookahead():
            function = self.parse_call(len(callchain) == 0)
            if function is None:
                break
            callchain.append(function)
        if self.lookahead() == '':
            self.consume()
        return callchain

    call_re = re.compile(r'^\s+(?P<address>[0-9a-fA-F]+)\s+(?P<symbol>.*)\s+\((?P<module>[^)]*)\)$')

    def parse_call(self, first):
        line = self.consume()
        mo = self.call_re.match(line)
        assert mo
        if not mo:
            return None
        if not first:
            return None
        function_name = mo.group('symbol')
        if not function_name:
            function_name = mo.group('address')
        module = mo.group('module')
        function_id = function_name + ':' + module
        address = mo.group('address')
        address = int(address, 16)
        if function_name != self.symbol:
            return None
        start_address = lookupMap(module, function_name)
        address -= start_address
        #print function_name, module, address
        samples[address] = samples.get(address, 0) + 1
        return True


def main():
    """Main program."""
    optparser = optparse.OptionParser(
        usage="\n\t%prog [options] symbol_name")
    (options, args) = optparser.parse_args(sys.argv[1:])
    if len(args) != 1:
        optparser.error('wrong number of arguments')
    symbol = args[0]
    p = subprocess.Popen(['perf', 'script'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    parser = PerfParser(p.stdout, symbol)
    parser.parse()


if __name__ == '__main__':
    main()


# vim: set sw=4 et:
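The map lookup in the script above reads lines of the form "start length symbol" with hexadecimal numbers and subtracts the symbol's start address from each sample address. A standalone Python 3 sketch of that step is shown below. The function names and sample data are hypothetical, and unlike lookupMap it parses the whole map once, tolerates spaces in symbol names, and rejects addresses outside the symbol's range.

```python
def parse_map(lines):
    """Parse perf map lines into {symbol: (start, length)}."""
    symbols = {}
    for line in lines:
        if not line.strip():
            continue
        # maxsplit=2 keeps symbol names that contain spaces intact
        start, length, symbol = line.split(None, 2)
        symbols[symbol.strip()] = (int(start, 16), int(length, 16))
    return symbols


def sample_offset(symbols, symbol, address):
    """Offset of `address` inside `symbol`, or None if outside it."""
    start, length = symbols[symbol]
    if not (start <= address < start + length):
        return None
    return address - start


# Made-up map contents for illustration only.
example_map = [
    "7f0000001000 40 fs_variant_0\n",
    "7f0000001040 80 vs_variant_1\n",
]
symbols = parse_map(example_map)
```

The resulting offset is what the script uses as the key into its samples table and then matches against the addresses in the .asm listing.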

@@ -2,6 +2,12 @@
# This script is used to generate the list of changes that
# appears in the release notes files, with HTML formatting.
#
# Usage examples:
#
# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3
# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3 > changes
# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee changes
typeset -i in_log=0

@@ -100,4 +100,4 @@ def AddOptions(opts):
opts.Add(BoolOption('quiet', 'DEPRECATED: profile build', 'yes'))
opts.Add(BoolOption('texture_float', 'enable floating-point textures and renderbuffers', 'no'))
if host_platform == 'windows':
opts.Add(EnumOption('MSVS_VERSION', 'MS Visual C++ version', None, allowed_values=('7.1', '8.0', '9.0')))
opts.Add(EnumOption('MSVC_VERSION', 'MS Visual C++ version', None, allowed_values=('7.1', '8.0', '9.0', '10.0', '11.0')))

File diff suppressed because it is too large.

@@ -23,12 +23,12 @@ GL_EXT_texture_shared_exponent DONE (i965, r600, swrast)
Float depth buffers (GL_ARB_depth_buffer_float) DONE (i965, r600)
Framebuffer objects (GL_ARB_framebuffer_object) DONE (i965, r300, r600, swrast)
Half-float DONE
Non-normalized Integer texture/framebuffer formats DONE (i965, r600)
Non-normalized Integer texture/framebuffer formats DONE (i965, r600)
1D/2D Texture arrays DONE
Per-buffer blend and masks (GL_EXT_draw_buffers2) DONE (i965, r600, swrast)
GL_EXT_texture_compression_rgtc DONE (i965, r300, r600, swrast)
Red and red/green texture formats DONE (i965, swrast, gallium)
Transform feedback (GL_EXT_transform_feedback) DONE (i965, r600)
Transform feedback (GL_EXT_transform_feedback) DONE (i965, r600)
Vertex array objects (GL_APPLE_vertex_array_object) DONE (i965, r300, r600, swrast)
sRGB framebuffer format (GL_EXT_framebuffer_sRGB) DONE (i965, r600)
glClearBuffer commands DONE
@@ -56,14 +56,14 @@ Signed normalized textures (GL_EXT_texture_snorm) DONE (i965, r300, r600)
GL 3.2:
Core/compatibility profiles DONE
GLSL 1.50 not started
Geometry shaders (GL_ARB_geometry_shader4) partially done (Zack)
GLSL 1.50 in progress
Geometry shaders (GL_ARB_geometry_shader4) partially done
BGRA vertex order (GL_ARB_vertex_array_bgra) DONE (i965, r300, r600, swrast)
Base vertex offset(GL_ARB_draw_elements_base_vertex) DONE (i965, r300, r600, swrast)
Frag shader coord (GL_ARB_fragment_coord_conventions) DONE (i965, r300, r600, swrast)
Provoking vertex (GL_ARB_provoking_vertex) DONE (i965, r300, r600, swrast)
Seamless cubemaps (GL_ARB_seamless_cube_map) DONE (i965, r600)
Multisample textures (GL_ARB_texture_multisample) not started
Multisample textures (GL_ARB_texture_multisample) DONE (i965)
Frag depth clamp (GL_ARB_depth_clamp) DONE (i965, r600, swrast)
Fence objects (GL_ARB_sync) DONE (i965, r300, r600, swrast)
GLX_ARB_create_context_profile DONE
@@ -79,7 +79,7 @@ GL_ARB_sampler_objects DONE (i965, r300, r600)
GL_ARB_shader_bit_encoding DONE
GL_ARB_texture_rgb10_a2ui DONE (i965, r600)
GL_ARB_texture_swizzle DONE (same as EXT version) (i965, r300, r600, swrast)
GL_ARB_timer_query DONE (i965, r600)
GL_ARB_timer_query DONE (i965, r600)
GL_ARB_instanced_arrays DONE (i965, r300, r600)
GL_ARB_vertex_type_2_10_10_10_rev DONE (i965, r600)
@@ -87,17 +87,17 @@ GL_ARB_vertex_type_2_10_10_10_rev DONE (i965, r600)
GL 4.0:
GLSL 4.0 not started
GL_ARB_texture_query_lod not started
GL_ARB_texture_query_lod DONE (i965)
GL_ARB_draw_buffers_blend DONE (i965, r600, softpipe)
GL_ARB_draw_indirect not started
GL_ARB_gpu_shader5 not started
GL_ARB_draw_indirect started (Christoph)
GL_ARB_gpu_shader5 started
GL_ARB_gpu_shader_fp64 not started
GL_ARB_sample_shading not started
GL_ARB_shader_subroutine not started
GL_ARB_tessellation_shader not started
GL_ARB_texture_buffer_object_rgb32 DONE (i965, softpipe)
GL_ARB_texture_cube_map_array DONE (i965, softpipe)
GL_ARB_texture_gather not started
GL_ARB_texture_gather started (Maxence, Chris)
GL_ARB_transform_feedback2 DONE
GL_ARB_transform_feedback3 DONE
@@ -106,7 +106,7 @@ GL 4.1:
GLSL 4.1 not started
GL_ARB_ES2_compatibility DONE (i965, r300, r600)
GL_ARB_get_program_binary not started
GL_ARB_get_program_binary DONE (0 binary formats)
GL_ARB_separate_shader_objects some infrastructure done
GL_ARB_shader_precision not started
GL_ARB_vertex_attrib_64bit not started
@@ -119,13 +119,13 @@ GLSL 4.2 not started
GL_ARB_texture_compression_bptc not started
GL_ARB_compressed_texture_pixel_storage not started
GL_ARB_shader_atomic_counters not started
GL_ARB_texture_storage DONE (r300, r600, swrast, gallium)
GL_ARB_texture_storage DONE (i965, r300, r600, swrast, gallium)
GL_ARB_transform_feedback_instanced DONE
GL_ARB_base_instance DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_shader_image_load_store not started
GL_ARB_conservative_depth DONE (softpipe)
GL_ARB_shading_language_420pack not started
GL_ARB_internalformat_query not started
GL_ARB_shading_language_420pack started (Todd)
GL_ARB_internalformat_query DONE (i965, gallium)
GL_ARB_map_buffer_alignment DONE (r300, r600, radeonsi)
@@ -133,7 +133,7 @@ GL 4.3:
GLSL 4.3 not started
ARB_arrays_of_arrays not started
ARB_ES3_compatibility not started
ARB_ES3_compatibility DONE (i965)
ARB_clear_buffer_object not started
ARB_compute_shader started (gallium)
ARB_copy_image not started
@@ -149,9 +149,9 @@ ARB_robust_buffer_access_behavior not started
ARB_shader_image_size not started
ARB_shader_storage_buffer_object not started
ARB_stencil_texturing not started
ARB_texture_buffer_range not started
ARB_texture_buffer_range DONE (nv50, nvc0)
ARB_texture_query_levels not started
ARB_texture_storage_multisample not started
ARB_texture_storage_multisample DONE (i965)
ARB_texture_view not started
ARB_vertex_attrib_binding not started

docs/README.UVD

@@ -0,0 +1,13 @@
The software may implement third party technologies (e.g. third party
libraries) that are not licensed to you by AMD and for which you may need
to obtain licenses from other parties. Unless explicitly stated otherwise,
these third party technologies are not licensed hereunder. Such third
party technologies include, but are not limited, to H.264, MPEG-2, MPEG-4,
AVC, and VC-1.
For MPEG-2 Encoding Products ANY USE OF THIS PRODUCT IN ANY MANNER OTHER
THAN PERSONAL USE THAT COMPLIES WITH THE MPEG-2 STANDARD FOR ENCODING VIDEO
INFORMATION FOR PACKAGED MEDIA IS EXPRESSLY PROHIBITED WITHOUT A LICENSE
UNDER APPLICABLE PATENTS IN THE MPEG-2 PATENT PORTFOLIO, WHICH LICENSES IS
AVAILABLE FROM MPEG LA, LLC, 6312 S. Fiddlers Green Circle, Suite 400E,
Greenwood Village, Colorado 80111 U.S.A.


@@ -1,6 +1,6 @@
File: docs/README.WIN32
Last updated: 23 April 2011
Last updated: 21 June 2013
Quick Start
@@ -30,6 +30,23 @@ At this time, only the gallium GDI driver is known to work.
Source code also exists in the tree for other drivers in
src/mesa/drivers/windows, but the status of this code is unknown.
Recipe
------
Building on Windows requires several open-source packages. These are
the steps that work as of this writing.
1) install python 2.7
2) install scons (latest)
3) install mingw, flex, and bison
4) install libxml2 from here: http://www.lfd.uci.edu/~gohlke/pythonlibs
get libxml2-python-2.9.1.win-amd64-py2.7.exe
5) install pywin32 from here: http://www.lfd.uci.edu/~gohlke/pythonlibs
get pywin32-218.4.win-amd64-py2.7.exe
6) install git
7) download mesa from git
see http://www.mesa3d.org/repository.html
8) run scons
General
-------


@@ -0,0 +1,83 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Application Issues</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Application Issues</h1>
<p>
This page documents known issues with some OpenGL applications.
</p>
<h2>Topogun</h2>
<p>
<a href="http://www.topogun.com/">Topogun</a> for Linux (version 2, at least)
creates a GLX visual without requesting a depth buffer.
This causes bad rendering if the OpenGL driver happens to choose a visual
without a depth buffer.
</p>
<p>
Mesa 9.1.2 and later (will) support a DRI configuration option to work around
this issue.
Using the <a href="http://dri.freedesktop.org/wiki/DriConf">driconf</a> tool,
set the "Create all visuals with a depth buffer" option before running Topogun.
Then, all GLX visuals will be created with a depth buffer.
</p>
<h2>Old OpenGL games</h2>
<p>
Some old OpenGL games (approx. ten years or older) may crash during
start-up because of an extension string buffer-overflow problem.
</p>
<p>
The problem is that a modern OpenGL driver returns a very long string
for the glGetString(GL_EXTENSIONS) query. If the application
naively copies the string into a fixed-size buffer, it can overflow the
buffer and crash.
</p>
<p>
The work-around is to set the MESA_EXTENSION_MAX_YEAR environment variable
to the approximate release year of the game.
This will cause the glGetString(GL_EXTENSIONS) query to only report extensions
older than the given year.
</p>
<p>
For example, if the game was released in 2001, do
<pre>
export MESA_EXTENSION_MAX_YEAR=2001
</pre>
before running the game.
</p>
<h2>Viewperf</h2>
<p>
See the <a href="viewperf.html">Viewperf issues</a> page for a detailed list
of Viewperf issues.
</p>
</div>
</body>
</html>
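
The MESA_EXTENSION_MAX_YEAR work-around described in the page above can also be applied per process instead of exporting the variable in the shell. The following is a minimal sketch, assuming a Python launcher script is acceptable; `env_with_max_year` and `run_with_max_year` are hypothetical helper names, and the child process is a stand-in for the game that only echoes the variable it sees.

```python
import os
import subprocess
import sys


def env_with_max_year(year, base_env=None):
    """Return a copy of the environment with MESA_EXTENSION_MAX_YEAR set."""
    env = dict(os.environ if base_env is None else base_env)
    env["MESA_EXTENSION_MAX_YEAR"] = str(year)
    return env


def run_with_max_year(command, year):
    """Run `command` with the variable set only for that child process."""
    return subprocess.run(command, env=env_with_max_year(year),
                          capture_output=True, text=True, check=True)


# Stand-in for the game: a child process that prints the variable it sees.
child = [sys.executable, "-c",
         "import os; print(os.environ['MESA_EXTENSION_MAX_YEAR'])"]
result = run_with_max_year(child, 2001)
seen = result.stdout.strip()
```

Setting the variable on a copy of the environment keeps the cap local to the one application, so other OpenGL programs started from the same session still see the full extension list.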


@@ -71,6 +71,7 @@
<li><a href="llvmpipe.html" target="_parent">Gallium llvmpipe driver</a>
<li><a href="vmware-guest.html" target="_parent">VMware SVGA3D guest driver</a>
<li><a href="postprocess.html" target="_parent">Gallium post-processing</a>
<li><a href="application-issues.html" target="_parent">Application Issues</a>
<li><a href="viewperf.html" target="_parent">Viewperf Issues</a>
</ul>


@@ -196,19 +196,18 @@ branch is relevant.
<h3>Verify and update version info</h3>
<dl>
<dt>Makefile.am</dt>
<dt>SConstruct</dt>
<dt>Android.common.mk</dt>
<dd>PACKAGE_VERSION</dd>
<dt>configure.ac</dt>
<dd>AC_INIT</dd>
<dt>src/mesa/main/version.h</dt>
<dd>MESA_MAJOR, MESA_MINOR, MESA_PATCH and MESA_VERSION_STRING</dd>
</dl>
<p>
Create a docs/relnotes-x.y.z.html file.
The bin/shortlog_mesa.sh script can be used to create a HTML-formatted list
of changes to include in the file.
Link the new docs/relnotes-x.y.z.html file into the main <a href="relnotes.html">relnotes.html</a> file.
Create a docs/relnotes/x.y.z.html file.
The bin/bugzilla_mesa.sh and bin/shortlog_mesa.sh scripts can be used to
create the HTML-formatted lists of bugfixes and changes to include in the file.
Link the new docs/relnotes/x.y.z.html file into the main <a href="relnotes.html">relnotes.html</a> file.
</p>
<p>
@@ -217,7 +216,7 @@ Update <a href="index.html">docs/index.html</a>.
<p>
Tag the files with the release name (in the form <b>mesa-x.y</b>)
with: <code>git tag -a mesa-x.y</code>
with: <code>git tag -s mesa-x.y -m "Mesa x.y Release"</code>
Then: <code>git push origin mesa-x.y</code>
</p>
@@ -226,13 +225,14 @@ Then: <code>git push origin mesa-x.y</code>
<p>
Make the distribution files. From inside the Mesa directory:
<pre>
./autogen.sh
make tarballs
</pre>
<p>
After the tarballs are created, the md5 checksums for the files will
be computed.
Add them to the docs/relnotes-x.y.html file.
Add them to the docs/relnotes/x.y.html file.
</p>
<p>
@@ -242,15 +242,18 @@ compile everything, and run some demos to be sure everything works.
<h3>Update the website and announce the release</h3>
<p>
Follow the directions on SourceForge for creating a new "release" and
uploading the tarballs.
Make a new directory for the release on annarchy.freedesktop.org with:
<br>
<code>
mkdir /srv/ftp.freedesktop.org/pub/mesa/x.y
</code>
</p>
<p>
Upload the tarball files with:
<br>
<code>
rsync -avP ssh Mesa*-X.Y.* USERNAME@frs.sourceforge.net:uploads/
rsync -avP -e ssh MesaLib-x.y.* USERNAME@annarchy.freedesktop.org:/srv/ftp.freedesktop.org/pub/mesa/x.y/
</code>
</p>
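
The updated release steps (signed tag, push, tarballs, upload to annarchy.freedesktop.org) can be collected into one list for a given version. This is a rough sketch rather than part of the Mesa tree: `release_commands` is a hypothetical helper, "9.2" is a placeholder version, and the commands are only assembled as strings, not executed.

```python
def release_commands(version, username):
    """Build the shell commands for tagging and uploading Mesa `version`."""
    tag = "mesa-" + version
    remote_dir = "/srv/ftp.freedesktop.org/pub/mesa/" + version
    return [
        # Create a signed tag and publish it.
        'git tag -s %s -m "Mesa %s Release"' % (tag, version),
        "git push origin " + tag,
        # Build the distribution tarballs from inside the Mesa directory.
        "./autogen.sh",
        "make tarballs",
        # This one is meant to be run on annarchy.freedesktop.org.
        "mkdir " + remote_dir,
        # Upload the tarballs into the new directory.
        "rsync -avP -e ssh MesaLib-%s.* %s@annarchy.freedesktop.org:%s/"
        % (version, username, remote_dir),
    ]


commands = release_commands("9.2", "USERNAME")
for command in commands:
    print(command)
```

Printing the list before running anything makes it easy to check that the tag name and the upload directory agree on the same version string.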


@@ -32,6 +32,8 @@ sometimes be useful for debugging end-user issues.
<li>LIBGL_ALWAYS_INDIRECT - forces an indirect rendering context/connection.
<li>LIBGL_ALWAYS_SOFTWARE - if set, always use software rendering
<li>LIBGL_NO_DRAWARRAYS - if set do not use DrawArrays GLX protocol (for debugging)
<li>LIBGL_SHOW_FPS - print framerate to stdout based on the number of glXSwapBuffers
calls per second.
</ul>
@@ -144,14 +146,13 @@ Mesa EGL supports different sets of environment variables. See the
<h2>Gallium environment variables</h2>
<ul>
<li>GALLIUM_HUD - draws various information on the screen, like framerate,
cpu load, driver statistics, performance counters, etc.
Set GALLIUM_HUD=help and run e.g. glxgears for more info.
<li>GALLIUM_LOG_FILE - specifies a file for logging all errors, warnings, etc.
rather than stderr.
<li>GALLIUM_PRINT_OPTIONS - if non-zero, print all the Gallium environment
variables which are used, and their current values.
<li>GALLIUM_NOSSE - if non-zero, do not use SSE runtime code generation for
shader execution
<li>GALLIUM_NOPPC - if non-zero, do not use PPC runtime code generation for
shader execution
<li>GALLIUM_DUMP_CPU - if non-zero, print information about the CPU on start-up
<li>TGSI_PRINT_SANITY - if set, do extra sanity checking on TGSI shaders and
print any errors to stderr.
@@ -159,6 +160,9 @@ Mesa EGL supports different sets of environment variables. See the
<LI>DRAW_NO_FSE - ???
<li>DRAW_USE_LLVM - if set to zero, the draw module will not use LLVM to execute
shaders, vertex fetch, etc.
<li>ST_DEBUG - controls debug output from the Mesa/Gallium state tracker.
Setting to "tgsi", for example, will print all the TGSI shaders.
See src/mesa/state_tracker/st_debug.c for other options.
</ul>
<h3>Softpipe driver environment variables</h3>
@@ -185,6 +189,16 @@ Mesa EGL supports different sets of environment variables. See the
cores present.
</ul>
<h3>VMware SVGA driver environment variables</h3>
<ul>
<li>SVGA_FORCE_SWTNL - force use of software vertex transformation
<li>SVGA_NO_SWTNL - don't allow software vertex transformation fallbacks
(will often result in incorrect rendering).
<li>SVGA_DEBUG - for dumping shaders, constant buffers, etc. See the code
for details.
<li>See the driver code for other, lesser-used variables.
</ul>
<p>
Other Gallium drivers have their own environment variables. These may change


@@ -23,19 +23,27 @@ The specifications follow.
<ul>
<li><a href="MESA_agp_offset.spec">MESA_agp_offset.spec</a>
<li><a href="MESA_copy_sub_buffer.spec">MESA_copy_sub_buffer.spec</a>
<li><a href="MESA_packed_depth_stencil.spec">MESA_packed_depth_stencil.spec</a>
<li><a href="MESA_pack_invert.spec">MESA_pack_invert.spec</a>
<li><a href="MESA_pixmap_colormap.spec">MESA_pixmap_colormap.spec</a>
<li><a href="MESA_release_buffers.spec">MESA_release_buffers.spec</a>
<li><a href="MESA_resize_buffers.spec">MESA_resize_buffers.spec</a>
<li><a href="MESA_set_3dfx_mode.spec">MESA_set_3dfx_mode.spec</a>
<li><a href="MESA_sprite_point.spec">MESA_sprite_point.spec</a> (obsolete)
<li><a href="MESA_texture_signed_rgba.spec">MESA_texture_signed_rgba.spec</a>
<li><a href="MESA_trace.spec">MESA_trace.spec</a> (obsolete)
<li><a href="MESA_window_pos.spec">MESA_window_pos.spec</a>
<li><a href="MESA_ycbcr_texture.spec">MESA_ycbcr_texture.spec</a>
<li><a href="specs/MESA_agp_offset.spec">MESA_agp_offset.spec</a>
<li><a href="specs/MESA_copy_sub_buffer.spec">MESA_copy_sub_buffer.spec</a>
<li><a href="specs/MESA_drm_image.spec">MESA_drm_image.spec</a>
<li><a href="specs/MESA_multithread_makecurrent.spec">MESA_multithread_makecurrent.spec</a>
<li><a href="specs/OLD/MESA_packed_depth_stencil.spec">MESA_packed_depth_stencil.spec</a> (obsolete)
<li><a href="specs/MESA_pack_invert.spec">MESA_pack_invert.spec</a>
<li><a href="specs/MESA_pixmap_colormap.spec">MESA_pixmap_colormap.spec</a>
<li><a href="specs/OLD/MESA_program_debug.spec">MESA_program_debug.spec</a> (obsolete)
<li><a href="specs/MESA_release_buffers.spec">MESA_release_buffers.spec</a>
<li><a href="specs/OLD/MESA_resize_buffers.spec">MESA_resize_buffers.spec</a> (obsolete)
<li><a href="specs/MESA_set_3dfx_mode.spec">MESA_set_3dfx_mode.spec</a>
<li><a href="specs/MESA_shader_debug.spec">MESA_shader_debug.spec</a>
<li><a href="specs/OLD/MESA_sprite_point.spec">MESA_sprite_point.spec</a> (obsolete)
<li><a href="specs/MESA_swap_control.spec">MESA_swap_control.spec</a>
<li><a href="specs/MESA_swap_frame_usage.spec">MESA_swap_frame_usage.spec</a>
<li><a href="specs/MESA_texture_array.spec">MESA_texture_array.spec</a>
<li><a href="specs/MESA_texture_signed_rgba.spec">MESA_texture_signed_rgba.spec</a>
<li><a href="specs/OLD/MESA_trace.spec">MESA_trace.spec</a> (obsolete)
<li><a href="specs/MESA_window_pos.spec">MESA_window_pos.spec</a>
<li><a href="specs/MESA_ycbcr_texture.spec">MESA_ycbcr_texture.spec</a>
<li><a href="specs/WL_bind_wayland_display.spec">WL_bind_wayland_display.spec</a>
</ul>
</div>


@@ -2,7 +2,7 @@
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa News</title>
<title>The Mesa 3D Graphics Library</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>
@@ -16,10 +16,66 @@
<h1>News</h1>
<h2>July 17, 2013</h2>
<p>
<a href="relnotes/9.1.5.html">Mesa 9.1.5</a> is released.
This is a bug fix release.
</p>
<h2>July 1, 2013</h2>
<p>
<a href="relnotes/9.1.4.html">Mesa 9.1.4</a> is released.
This is a bug fix release.
</p>
<h2>May 21, 2013</h2>
<p>
<a href="relnotes/9.1.3.html">Mesa 9.1.3</a> is released.
This is a bug fix release.
</p>
<h2>April 30, 2013</h2>
<p>
<a href="relnotes/9.1.2.html">Mesa 9.1.2</a> is released.
This is a bug fix release.
</p>
<h2>March 19, 2013</h2>
<p>
<a href="relnotes/9.1.1.html">Mesa 9.1.1</a> is released.
This is a bug fix release.
</p>
<h2>February 24, 2013</h2>
<p>
Mesa demos 8.1.0 is released.
See the <a href="http://lists.freedesktop.org/archives/mesa-dev/2013-February/035180.html">announcement</a> for more information about the release.
You can download it from <a href="ftp://ftp.freedesktop.org/pub/mesa/demos/8.1.0/">ftp.freedesktop.org/pub/mesa/demos/8.1.0/</a>.
</p>
<h2>February 22, 2013</h2>
<p>
<a href="relnotes/9.1.html">Mesa 9.1</a> is released.
This is a new development release.
See the release notes for more information about the release.
</p>
<h2>February 21, 2013</h2>
<p>
<a href="relnotes/9.0.3.html">Mesa 9.0.3</a> is released.
This is a bug fix release.
</p>
<h2>January 22, 2013</h2>
<p>
<a href="relnotes-9.0.2.html">Mesa 9.0.2</a> is released.
<a href="relnotes/9.0.2.html">Mesa 9.0.2</a> is released.
This is a bug fix release.
</p>
@@ -27,7 +83,7 @@ This is a bug fix release.
<h2>November 16, 2012</h2>
<p>
<a href="relnotes-9.0.1.html">Mesa 9.0.1</a> is released.
<a href="relnotes/9.0.1.html">Mesa 9.0.1</a> is released.
This is a bug fix release.
</p>
@@ -35,7 +91,7 @@ This is a bug fix release.
<h2>October 24, 2012</h2>
<p>
<a href="relnotes-8.0.5.html">Mesa 8.0.5</a> is released.
<a href="relnotes/8.0.5.html">Mesa 8.0.5</a> is released.
This is a bug fix release.
</p>
@@ -43,7 +99,7 @@ This is a bug fix release.
<h2>October 8, 2012</h2>
<p>
<a href="relnotes-9.0.html">Mesa 9.0</a> is released.
<a href="relnotes/9.0.html">Mesa 9.0</a> is released.
This is the first version of Mesa to support OpenGL 3.1 and GLSL 1.40
(with the i965 driver).
See the release notes for more information about the release.
@@ -53,7 +109,7 @@ See the release notes for more information about the release.
<h2>July 10, 2012</h2>
<p>
<a href="relnotes-8.0.4.html">Mesa 8.0.4</a> is released.
<a href="relnotes/8.0.4.html">Mesa 8.0.4</a> is released.
This is a bug fix release.
</p>
@@ -61,7 +117,7 @@ This is a bug fix release.
<h2>May 18, 2012</h2>
<p>
<a href="relnotes-8.0.3.html">Mesa 8.0.3</a> is released.
<a href="relnotes/8.0.3.html">Mesa 8.0.3</a> is released.
This is a bug fix release.
</p>
@@ -69,7 +125,7 @@ This is a bug fix release.
<h2>March 21, 2012</h2>
<p>
<a href="relnotes-8.0.2.html">Mesa 8.0.2</a> is released.
<a href="relnotes/8.0.2.html">Mesa 8.0.2</a> is released.
This is a bug fix release.
</p>
@@ -77,14 +133,14 @@ This is a bug fix release.
<h2>February 16, 2012</h2>
<p>
<a href="relnotes-8.0.1.html">Mesa 8.0.1</a> is released. This is a bug fix
<a href="relnotes/8.0.1.html">Mesa 8.0.1</a> is released. This is a bug fix
release. See the release notes for more information about the release.
</p>
<h2>February 9, 2012</h2>
<p>
<a href="relnotes-8.0.html">Mesa 8.0</a> is released.
<a href="relnotes/8.0.html">Mesa 8.0</a> is released.
This is the first version of Mesa to support OpenGL 3.0 and GLSL 1.30
(with the i965 driver).
See the release notes for more information about the release.
@@ -94,7 +150,7 @@ See the release notes for more information about the release.
<h2>November 27, 2011</h2>
<p>
<a href="relnotes-7.11.2.html">Mesa 7.11.2</a> is released. This is a bug fix
<a href="relnotes/7.11.2.html">Mesa 7.11.2</a> is released. This is a bug fix
release. This release was made primarily to fix build problems with 7.11.1 on
Mandriva and to fix problems related to glCopyTexImage to luminance-alpha
textures. The latter was believed to have been fixed in 7.11.1 but was not.
@@ -103,36 +159,36 @@ textures. The later was believed to have been fixed in 7.11.1 but was not.
<h2>November 17, 2011</h2>
<p>
<a href="relnotes-7.11.1.html">Mesa 7.11.1</a> is released. This is a bug
<a href="relnotes/7.11.1.html">Mesa 7.11.1</a> is released. This is a bug
fix release.
</p>
<h2>July 31, 2011</h2>
<p>
<a href="relnotes-7.11.html">Mesa 7.11</a> (final) is released. This is a new
<a href="relnotes/7.11.html">Mesa 7.11</a> (final) is released. This is a new
development release.
</p>
<h2>June 13, 2011</h2>
<p>
<a href="relnotes-7.10.3.html">Mesa 7.10.3</a> is released. This is a bug
<a href="relnotes/7.10.3.html">Mesa 7.10.3</a> is released. This is a bug
fix release.
</p>
<h2>April 6, 2011</h2>
<p>
<a href="relnotes-7.10.2.html">Mesa 7.10.2</a> is released. This is a bug
<a href="relnotes/7.10.2.html">Mesa 7.10.2</a> is released. This is a bug
fix release.
</p>
<h2>March 2, 2011</h2>
<p>
<a href="relnotes-7.9.2.html">Mesa 7.9.2</a> and
<a href="relnotes-7.10.1.html">Mesa 7.10.1</a> are released. These are
<a href="relnotes/7.9.2.html">Mesa 7.9.2</a> and
<a href="relnotes/7.10.1.html">Mesa 7.10.1</a> are released. These are
stable releases containing bug fixes since the 7.9.1 and 7.10 releases.
</p>
@@ -140,7 +196,7 @@ stable releases containing bug fixes since the 7.9.1 and 7.10 releases.
<h2>October 4, 2010</h2>
<p>
<a href="relnotes-7.9.html">Mesa 7.9</a> (final) is released. This is a new
<a href="relnotes/7.9.html">Mesa 7.9</a> (final) is released. This is a new
development release.
</p>
@@ -148,7 +204,7 @@ development release.
<h2>September 27, 2010</h2>
<p>
<a href="relnotes-7.9.html">Mesa 7.9.0-rc1</a> is released. This is a
<a href="relnotes/7.9.html">Mesa 7.9.0-rc1</a> is released. This is a
release candidate for the 7.9 development release.
</p>
@@ -156,7 +212,7 @@ release candidate for the 7.9 development release.
<h2>June 16, 2010</h2>
<p>
<a href="relnotes-7.8.2.html">Mesa 7.8.2</a> is released. This is a bug-fix
<a href="relnotes/7.8.2.html">Mesa 7.8.2</a> is released. This is a bug-fix
release collecting fixes since the 7.8.1 release.
</p>
@@ -164,18 +220,18 @@ release collecting fixes since the 7.8.1 release.
<h2>April 5, 2010</h2>
<p>
<a href="relnotes-7.8.1.html">Mesa 7.8.1</a> is released. This is a bug-fix
<a href="relnotes/7.8.1.html">Mesa 7.8.1</a> is released. This is a bug-fix
release for a few critical issues in the 7.8 release.
</p>
<h2>March 28, 2010</h2>
<p>
<a href="relnotes-7.7.1.html">Mesa 7.7.1</a> is released. This is a bug-fix
<a href="relnotes/7.7.1.html">Mesa 7.7.1</a> is released. This is a bug-fix
release fixing issues found in the 7.7 release.
</p>
<p>
Also, <a href="relnotes-7.8.html">Mesa 7.8</a> is released. This is a new
Also, <a href="relnotes/7.8.html">Mesa 7.8</a> is released. This is a new
development release.
</p>
@@ -183,37 +239,37 @@ development release.
<h2>December 21, 2009</h2>
<p>
<a href="relnotes-7.6.1.html">Mesa 7.6.1</a> is released. This is a bug-fix
<a href="relnotes/7.6.1.html">Mesa 7.6.1</a> is released. This is a bug-fix
release fixing issues found in the 7.6 release.
</p>
<p>
Also, <a href="relnotes-7.7.html">Mesa 7.7</a> is released. This is a new
Also, <a href="relnotes/7.7.html">Mesa 7.7</a> is released. This is a new
development release.
</p>
<h2>September 28, 2009</h2>
<p>
<a href="relnotes-7.6.html">Mesa 7.6</a> is released. This is a new feature
<a href="relnotes/7.6.html">Mesa 7.6</a> is released. This is a new feature
release. Those especially concerned about stability may want to wait for the
follow-on 7.6.1 bug-fix release.
</p>
<p>
<a href="relnotes-7.5.2.html">Mesa 7.5.2</a> is also released.
<a href="relnotes/7.5.2.html">Mesa 7.5.2</a> is also released.
This is a stable release fixing bugs since the 7.5.1 release.
</p>
<h2>September 3, 2009</h2>
<p>
<a href="relnotes-7.5.1.html">Mesa 7.5.1</a> is released.
<a href="relnotes/7.5.1.html">Mesa 7.5.1</a> is released.
This is a bug-fix release which fixes bugs found in version 7.5.
</p>
<h2>July 17, 2009</h2>
<p>
<a href="relnotes-7.5.html">Mesa 7.5</a> is released.
<a href="relnotes/7.5.html">Mesa 7.5</a> is released.
This is a new features release. People especially concerned about
stability may want to wait for the follow-on 7.5.1 bug-fix release.
</p>
@@ -221,7 +277,7 @@ stability may want to wait for the follow-on 7.5.1 bug-fix release.
<h2>June 23, 2009</h2>
<p>
<a href="relnotes-7.4.4.html">Mesa 7.4.4</a> is released.
<a href="relnotes/7.4.4.html">Mesa 7.4.4</a> is released.
This is a stable release that fixes a regression in the i915/i965 drivers
that slipped into the 7.4.3 release.
</p>
@@ -229,35 +285,35 @@ that slipped into the 7.4.3 release.
<h2>June 19, 2009</h2>
<p>
<a href="relnotes-7.4.3.html">Mesa 7.4.3</a> is released.
<a href="relnotes/7.4.3.html">Mesa 7.4.3</a> is released.
This is a stable release fixing bugs since the 7.4.2 release.
</p>
<h2>May 15, 2009</h2>
<p>
<a href="relnotes-7.4.2.html">Mesa 7.4.2</a> is released.
<a href="relnotes/7.4.2.html">Mesa 7.4.2</a> is released.
This is a stable release fixing bugs since the 7.4.1 release.
</p>
<h2>April 18, 2009</h2>
<p>
<a href="relnotes-7.4.1.html">Mesa 7.4.1</a> is released.
<a href="relnotes/7.4.1.html">Mesa 7.4.1</a> is released.
This is a stable release fixing bugs since the 7.4 release.
</p>
<h2>March 27, 2009</h2>
<p>
<a href="relnotes-7.4.html">Mesa 7.4</a> is released.
<a href="relnotes/7.4.html">Mesa 7.4</a> is released.
This is a stable release fixing bugs since the 7.3 release.
</p>
<h2>January 22, 2009</h2>
<p>
<a href="relnotes-7.3.html">Mesa 7.3</a> is released.
<a href="relnotes/7.3.html">Mesa 7.3</a> is released.
This is a new development release.
Mesa 7.4 will follow and will have bug fixes relative to 7.3.
</p>
@@ -265,14 +321,14 @@ Mesa 7.4 will follow and will have bug fixes relative to 7.3.
<h2>September 20, 2008</h2>
<p>
<a href="relnotes-7.2.html">Mesa 7.2</a> is released.
<a href="relnotes/7.2.html">Mesa 7.2</a> is released.
This is a stable, bug-fix release.
</p>
<h2>August 26, 2008</h2>
<p>
<a href="relnotes-7.1.html">Mesa 7.1</a> is released.
<a href="relnotes/7.1.html">Mesa 7.1</a> is released.
This is a new development release.
It should be relatively stable, but those especially concerned about
stability should wait for the 7.2 release or use Mesa 7.0.4 (the
@@ -282,14 +338,14 @@ previous stable release).
<h2>August 16, 2008</h2>
<p>
<a href="relnotes-7.0.4.html">Mesa 7.0.4</a> is released.
<a href="relnotes/7.0.4.html">Mesa 7.0.4</a> is released.
This is a bug-fix release.
</p>
<h2>April 4, 2008</h2>
<p>
<a href="relnotes-7.0.3.html">Mesa 7.0.3</a> is released.
<a href="relnotes/7.0.3.html">Mesa 7.0.3</a> is released.
This is a bug-fix release.
</p>
@@ -318,28 +374,28 @@ but other drivers will be coming...
<h2>November 10, 2007</h2>
<p>
<a href="relnotes-7.0.2.html">Mesa 7.0.2</a> is released.
<a href="relnotes/7.0.2.html">Mesa 7.0.2</a> is released.
This is a bug-fix release.
</p>
<h2>August 3, 2007</h2>
<p>
<a href="relnotes-7.0.1.html">Mesa 7.0.1</a> is released.
<a href="relnotes/7.0.1.html">Mesa 7.0.1</a> is released.
This is a bug-fix release.
</p>
<h2>June 22, 2007</h2>
<p>
<a href="relnotes-7.0.html">Mesa 7.0</a> is released.
<a href="relnotes/7.0.html">Mesa 7.0</a> is released.
This is a stable release featuring OpenGL 2.1 support.
</p>
<h2>April 27, 2007</h2>
<p>
<a href="relnotes-6.5.3.html">Mesa 6.5.3</a> is released.
<a href="relnotes/6.5.3.html">Mesa 6.5.3</a> is released.
This is a development release which will lead up to the Mesa 7.0 release
(which will advertise OpenGL 2.1 API support).
</p>
@@ -370,33 +426,33 @@ See the <a href="repository.html">repository page</a> for more information.
<h2>December 2, 2006</h2>
<p>
<a href="relnotes-6.5.2.html">Mesa 6.5.2</a> has been released.
<a href="relnotes/6.5.2.html">Mesa 6.5.2</a> has been released.
This is a new development release.
</p>
<h2>September 15, 2006</h2>
<p>
<a href="relnotes-6.5.1.html">Mesa 6.5.1</a> has been released.
<a href="relnotes/6.5.1.html">Mesa 6.5.1</a> has been released.
This is a new development release.
</p>
<h2>March 31, 2006</h2>
<p>
<a href="relnotes-6.5.html">Mesa 6.5</a> has been released.
<a href="relnotes/6.5.html">Mesa 6.5</a> has been released.
This is a new development release.
</p>
<h2>February 2, 2006</h2>
<p>
<a href="relnotes-6.4.2.html">Mesa 6.4.2</a> has been released.
<a href="relnotes/6.4.2.html">Mesa 6.4.2</a> has been released.
This is a stable, bug-fix release.
</p>
<h2>November 29, 2005</h2>
<p>
<a href="relnotes-6.4.1.html">Mesa 6.4.1</a> has been released.
<a href="relnotes/6.4.1.html">Mesa 6.4.1</a> has been released.
This is a stable, bug-fix release.
</p>
@@ -404,7 +460,7 @@ This is stable, bug-fix release.
<h2>October 24, 2005</h2>
<p>
<a href="relnotes-6.4.html">Mesa 6.4</a> has been released.
<a href="relnotes/6.4.html">Mesa 6.4</a> has been released.
This is a stable, bug-fix release.
</p>
@@ -723,8 +779,8 @@ OpenGL 1.5 features.
- demo of per-pixel lighting with a fragment program (demos/fplight.c)
- new version (18) of glext.h header
- new spriteblast.c demo of GL_ARB_point_sprite
- faster glDrawPixels in X11 driver in some cases (see RELNOTES-5.1)
- faster glCopyPixels in X11 driver in some cases (see RELNOTES-5.1)
- faster glDrawPixels in X11 driver in some cases (see relnotes/5.1)
- faster glCopyPixels in X11 driver in some cases (see relnotes/5.1)
Bug fixes:
- really enable OpenGL 1.4 features in DOS driver.
- fixed issues in glDrawPixels and glCopyPixels for very wide images


@@ -75,9 +75,10 @@ in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
BRIAN PAUL BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN
AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
</pre>


@@ -130,38 +130,38 @@ need to ask, don't even try it.
<h1>Profiling</h1>
To profile llvmpipe you should pass the options
<p>
To profile llvmpipe you should build as
</p>
<pre>
scons build=profile &lt;same-as-before&gt;
</pre>
<p>
This will ensure that frame pointers are used both in C and JIT functions, and
that no tail call optimizations are done by gcc.
</p>
To better profile JIT code you'll need to build LLVM with oprofile integration.
<h2>Linux perf integration</h2>
<p>
On Linux, it is possible to have symbol resolution of JIT code with <a href="http://perf.wiki.kernel.org/">Linux perf</a>:
</p>
<pre>
./configure \
--prefix=$install_dir \
--enable-optimized \
--disable-profiling \
--enable-targets=host-only \
--with-oprofile
make -C "$build_dir"
make -C "$build_dir" install
find "$install_dir/lib" -iname '*.a' -print0 | xargs -0 strip --strip-debug
perf record -g /my/application
perf report
</pre>
The you should define
<p>
When run inside Linux perf, llvmpipe will create a /tmp/perf-XXXXX.map file with
a symbol address table. It also dumps assembly code to /tmp/perf-XXXXX.map.asm,
which can be used by the bin/perf-annotate-jit script to produce disassembly of
the generated code annotated with the samples.
</p>
<pre>
export LLVM=/path/to/llvm-2.6-profile
</pre>
and rebuild.
<p>You can obtain a call graph via
<a href="http://code.google.com/p/jrfonseca/wiki/Gprof2Dot#linux_perf">Gprof2Dot</a>.</p>
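
The symbol address table written to /tmp/perf-XXXXX.map can be read back to map a sampled address to a JIT function. The sketch below assumes each line holds a hexadecimal start address, a hexadecimal size, and a symbol name separated by whitespace; the helper names and the two sample entries are hypothetical, and this is not the code used by bin/perf-annotate-jit.

```python
def parse_perf_map(text):
    """Parse perf map text into (start, size, name) tuples sorted by start."""
    entries = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start, size, name = int(parts[0], 16), int(parts[1], 16), parts[2]
        entries.append((start, size, name))
    entries.sort()
    return entries


def resolve(entries, address):
    """Return (name, offset) for the symbol containing `address`, or None."""
    for start, size, name in entries:
        if start <= address < start + size:
            return name, address - start
    return None


# Made-up map contents; the symbol names are placeholders.
example = "7f0000 40 fs_variant_0\n7f0040 80 setup_variant_0\n"
entries = parse_perf_map(example)
hit = resolve(entries, 0x7f0050)
miss = resolve(entries, 0x100)
```

A linear scan is enough for a sketch; with many JIT functions, a binary search over the sorted start addresses would be the natural next step.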
<h1>Unit testing</h1>


@@ -18,77 +18,62 @@
<p>
Mesa's off-screen rendering interface is used for rendering into
user-allocated blocks of memory.
Mesa's off-screen interface is used for rendering into user-allocated memory
without any sort of window system or operating system dependencies.
That is, the GL_FRONT colorbuffer is actually a buffer in main memory,
rather than a window on your display.
There are no window system or operating system dependencies.
One potential application is to use Mesa as an off-line, batch-style renderer.
</p>
<p>
The <b>OSMesa</b> API provides three basic functions for making off-screen
The OSMesa API provides three basic functions for making off-screen
renderings: OSMesaCreateContext(), OSMesaMakeCurrent(), and
OSMesaDestroyContext(). See the Mesa/include/GL/osmesa.h header for
more information about the API functions.
</p>
<p>
The OSMesa interface may be used with any of three software renderers:
</p>
<ol>
<li>llvmpipe - this is the high-performance Gallium LLVM driver
<li>softpipe - this is the reference Gallium software driver
<li>swrast - this is the legacy Mesa software rasterizer
</ol>
<p>
There are several examples of OSMesa in the mesa/demos repository.
</p>
<h2>Deep color channels</h2>
<h1>Building OSMesa</h1>
<p>
For some applications 8-bit color channels don't have sufficient
precision.
OSMesa supports 16-bit and 32-bit color channels through the OSMesa interface.
When using 16-bit channels, channels are GLushorts and RGBA pixels occupy
8 bytes.
When using 32-bit channels, channels are GLfloats and RGBA pixels occupy
16 bytes.
</p>
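The pixel sizes quoted above follow from channels per pixel times bytes per channel. A minimal sketch of that arithmetic, useful when sizing the user-allocated buffer, is shown here. The function name `osmesa_buffer_bytes` is made up for this illustration and is not part of the OSMesa API.

```python
# Bytes per color channel for each supported channel width (in bits).
BYTES_PER_CHANNEL = {8: 1, 16: 2, 32: 4}


def osmesa_buffer_bytes(width, height, channel_bits, channels=4):
    """Return the size in bytes of an RGBA color buffer.

    Hypothetical helper: channels defaults to 4 (R, G, B, A).
    """
    return width * height * channels * BYTES_PER_CHANNEL[channel_bits]
```

For a single RGBA pixel this gives 8 bytes with 16-bit channels and 16 bytes with 32-bit channels, matching the figures in the text.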
Configure and build Mesa with something like:
<p>
Before version 6.5.1, Mesa had to be recompiled to support exactly
one of 8, 16 or 32-bit channels.
With Mesa 6.5.1, Mesa can be compiled for either 8, 16 or 32-bit channels
and render into any of the smaller size channels.
For example, if Mesa's compiled for 32-bit channels, you can also render
16 and 8-bit channel images.
</p>
<p>
To build Mesa/OSMesa for 16 and 8-bit color channel support:
<pre>
make realclean
make linux-osmesa16
configure --enable-osmesa --disable-driglx-direct --disable-dri --with-gallium-drivers=swrast
make
</pre>
<p>
To build Mesa/OSMesa for 32, 16 and 8-bit color channel support:
Make sure you have LLVM installed first if you want to use the llvmpipe driver.
</p>
<p>
When the build is complete you should find:
</p>
<pre>
make realclean
make linux-osmesa32
lib/libOSMesa.so (swrast-based OSMesa)
lib/gallium/libOSMesa.so (gallium-based OSMesa)
</pre>
<p>
You'll wind up with a library named libOSMesa16.so or libOSMesa32.so.
Otherwise, most Mesa configurations build an 8-bit/channel libOSMesa.so library
by default.
Set your LD_LIBRARY_PATH to point to one directory or the other to select
the library you want to use.
</p>
<p>
If performance is important, compile Mesa for the channel size you're
most interested in.
</p>
<p>
If you need to compile on a non-Linux platform, copy Mesa/configs/linux-osmesa16
to a new config file and edit it as needed. Then, add the new config name to
the top-level Makefile. Send a patch to the Mesa developers too, if you're
inclined.
When you link your application, link with -lOSMesa
</p>
</div>

View File

@@ -21,57 +21,64 @@ The release notes summarize what's new or changed in each Mesa release.
</p>
<ul>
<li><a href="relnotes-9.1.html">9.1 release notes</a>
<li><a href="relnotes-9.0.2.html">9.0.2 release notes</a>
<li><a href="relnotes-9.0.1.html">9.0.1 release notes</a>
<li><a href="relnotes-9.0.html">9.0 release notes</a>
<li><a href="relnotes-8.0.5.html">8.0.5 release notes</a>
<li><a href="relnotes-8.0.4.html">8.0.4 release notes</a>
<li><a href="relnotes-8.0.3.html">8.0.3 release notes</a>
<li><a href="relnotes-8.0.2.html">8.0.2 release notes</a>
<li><a href="relnotes-8.0.1.html">8.0.1 release notes</a>
<li><a href="relnotes-8.0.html">8.0 release notes</a>
<li><a href="relnotes-7.11.2.html">7.11.2 release notes</a>
<li><a href="relnotes-7.11.1.html">7.11.1 release notes</a>
<li><a href="relnotes-7.11.html">7.11 release notes</a>
<li><a href="relnotes-7.10.3.html">7.10.3 release notes</a>
<li><a href="relnotes-7.10.2.html">7.10.2 release notes</a>
<li><a href="relnotes-7.10.1.html">7.10.1 release notes</a>
<li><a href="relnotes-7.10.html">7.10 release notes</a>
<li><a href="relnotes-7.9.2.html">7.9.2 release notes</a>
<li><a href="relnotes-7.9.1.html">7.9.1 release notes</a>
<li><a href="relnotes-7.9.html">7.9 release notes</a>
<li><a href="relnotes-7.8.3.html">7.8.3 release notes</a>
<li><a href="relnotes-7.8.2.html">7.8.2 release notes</a>
<li><a href="relnotes-7.8.1.html">7.8.1 release notes</a>
<li><a href="relnotes-7.8.html">7.8 release notes</a>
<li><a href="relnotes-7.7.1.html">7.7.1 release notes</a>
<li><a href="relnotes-7.7.html">7.7 release notes</a>
<li><a href="relnotes-7.6.1.html">7.6.1 release notes</a>
<li><a href="relnotes-7.6.html">7.6 release notes</a>
<li><a href="relnotes-7.5.2.html">7.5.2 release notes</a>
<li><a href="relnotes-7.5.1.html">7.5.1 release notes</a>
<li><a href="relnotes-7.5.html">7.5 release notes</a>
<li><a href="relnotes-7.4.4.html">7.4.4 release notes</a>
<li><a href="relnotes-7.4.3.html">7.4.3 release notes</a>
<li><a href="relnotes-7.4.2.html">7.4.2 release notes</a>
<li><a href="relnotes-7.4.1.html">7.4.1 release notes</a>
<li><a href="relnotes-7.4.html">7.4 release notes</a>
<li><a href="relnotes-7.3.html">7.3 release notes</a>
<li><a href="relnotes-7.2.html">7.2 release notes</a>
<li><a href="relnotes-7.1.html">7.1 release notes</a>
<li><a href="relnotes-7.0.4.html">7.0.4 release notes</a>
<li><a href="relnotes-7.0.3.html">7.0.3 release notes</a>
<li><a href="relnotes-7.0.2.html">7.0.2 release notes</a>
<li><a href="relnotes-7.0.1.html">7.0.1 release notes</a>
<li><a href="relnotes-7.0.html">7.0 release notes</a>
<li><a href="relnotes-6.5.3.html">6.5.3 release notes</a>
<li><a href="relnotes-6.5.2.html">6.5.2 release notes</a>
<li><a href="relnotes-6.5.1.html">6.5.1 release notes</a>
<li><a href="relnotes-6.5.html">6.5 release notes</a>
<li><a href="relnotes-6.4.2.html">6.4.2 release notes</a>
<li><a href="relnotes-6.4.1.html">6.4.1 release notes</a>
<li><a href="relnotes-6.4.html">6.4 release notes</a>
<li><a href="relnotes/9.2.html">9.2 release notes</a>
<li><a href="relnotes/9.1.5.html">9.1.5 release notes</a>
<li><a href="relnotes/9.1.4.html">9.1.4 release notes</a>
<li><a href="relnotes/9.1.3.html">9.1.3 release notes</a>
<li><a href="relnotes/9.1.2.html">9.1.2 release notes</a>
<li><a href="relnotes/9.1.1.html">9.1.1 release notes</a>
<li><a href="relnotes/9.1.html">9.1 release notes</a>
<li><a href="relnotes/9.0.3.html">9.0.3 release notes</a>
<li><a href="relnotes/9.0.2.html">9.0.2 release notes</a>
<li><a href="relnotes/9.0.1.html">9.0.1 release notes</a>
<li><a href="relnotes/9.0.html">9.0 release notes</a>
<li><a href="relnotes/8.0.5.html">8.0.5 release notes</a>
<li><a href="relnotes/8.0.4.html">8.0.4 release notes</a>
<li><a href="relnotes/8.0.3.html">8.0.3 release notes</a>
<li><a href="relnotes/8.0.2.html">8.0.2 release notes</a>
<li><a href="relnotes/8.0.1.html">8.0.1 release notes</a>
<li><a href="relnotes/8.0.html">8.0 release notes</a>
<li><a href="relnotes/7.11.2.html">7.11.2 release notes</a>
<li><a href="relnotes/7.11.1.html">7.11.1 release notes</a>
<li><a href="relnotes/7.11.html">7.11 release notes</a>
<li><a href="relnotes/7.10.3.html">7.10.3 release notes</a>
<li><a href="relnotes/7.10.2.html">7.10.2 release notes</a>
<li><a href="relnotes/7.10.1.html">7.10.1 release notes</a>
<li><a href="relnotes/7.10.html">7.10 release notes</a>
<li><a href="relnotes/7.9.2.html">7.9.2 release notes</a>
<li><a href="relnotes/7.9.1.html">7.9.1 release notes</a>
<li><a href="relnotes/7.9.html">7.9 release notes</a>
<li><a href="relnotes/7.8.3.html">7.8.3 release notes</a>
<li><a href="relnotes/7.8.2.html">7.8.2 release notes</a>
<li><a href="relnotes/7.8.1.html">7.8.1 release notes</a>
<li><a href="relnotes/7.8.html">7.8 release notes</a>
<li><a href="relnotes/7.7.1.html">7.7.1 release notes</a>
<li><a href="relnotes/7.7.html">7.7 release notes</a>
<li><a href="relnotes/7.6.1.html">7.6.1 release notes</a>
<li><a href="relnotes/7.6.html">7.6 release notes</a>
<li><a href="relnotes/7.5.2.html">7.5.2 release notes</a>
<li><a href="relnotes/7.5.1.html">7.5.1 release notes</a>
<li><a href="relnotes/7.5.html">7.5 release notes</a>
<li><a href="relnotes/7.4.4.html">7.4.4 release notes</a>
<li><a href="relnotes/7.4.3.html">7.4.3 release notes</a>
<li><a href="relnotes/7.4.2.html">7.4.2 release notes</a>
<li><a href="relnotes/7.4.1.html">7.4.1 release notes</a>
<li><a href="relnotes/7.4.html">7.4 release notes</a>
<li><a href="relnotes/7.3.html">7.3 release notes</a>
<li><a href="relnotes/7.2.html">7.2 release notes</a>
<li><a href="relnotes/7.1.html">7.1 release notes</a>
<li><a href="relnotes/7.0.4.html">7.0.4 release notes</a>
<li><a href="relnotes/7.0.3.html">7.0.3 release notes</a>
<li><a href="relnotes/7.0.2.html">7.0.2 release notes</a>
<li><a href="relnotes/7.0.1.html">7.0.1 release notes</a>
<li><a href="relnotes/7.0.html">7.0 release notes</a>
<li><a href="relnotes/6.5.3.html">6.5.3 release notes</a>
<li><a href="relnotes/6.5.2.html">6.5.2 release notes</a>
<li><a href="relnotes/6.5.1.html">6.5.1 release notes</a>
<li><a href="relnotes/6.5.html">6.5 release notes</a>
<li><a href="relnotes/6.4.2.html">6.4.2 release notes</a>
<li><a href="relnotes/6.4.1.html">6.4.1 release notes</a>
<li><a href="relnotes/6.4.html">6.4 release notes</a>
</ul>
<p>
@@ -80,29 +87,31 @@ Versions of Mesa prior to 6.4 are summarized in the
</p>
<ul>
<li><a href="RELNOTES-6.3.2">RELNOTES-6.3.2</a>
<li><a href="RELNOTES-6.3">RELNOTES-6.3</a>
<li><a href="RELNOTES-6.2.1">RELNOTES-6.2.1</a>
<li><a href="RELNOTES-6.2">RELNOTES-6.2</a>
<li><a href="RELNOTES-6.1">RELNOTES-6.1</a>
<li><a href="RELNOTES-6.0">RELNOTES-6.0</a>
<li><a href="RELNOTES-5.1">RELNOTES-5.1</a>
<li><a href="RELNOTES-5.0.2">RELNOTES-5.0.2</a>
<li><a href="RELNOTES-5.0.1">RELNOTES-5.0.1</a>
<li><a href="RELNOTES-5.0">RELNOTES-5.0</a>
<li><a href="RELNOTES-4.1">RELNOTES-4.1</a>
<li><a href="RELNOTES-4.0.3">RELNOTES-4.0.3</a>
<li><a href="RELNOTES-4.0.2">RELNOTES-4.0.2</a>
<li><a href="RELNOTES-4.0.1">RELNOTES-4.0.1</a>
<li><a href="RELNOTES-4.0">RELNOTES-4.0</a>
<li><a href="RELNOTES-3.5">RELNOTES-3.5</a>
<li><a href="RELNOTES-3.4.2">RELNOTES-3.4.2</a>
<li><a href="RELNOTES-3.4.1">RELNOTES-3.4.1</a>
<li><a href="RELNOTES-3.4">RELNOTES-3.4</a>
<li><a href="RELNOTES-3.3">RELNOTES-3.3</a>
<li><a href="RELNOTES-3.2.1">RELNOTES-3.2.1</a>
<li><a href="RELNOTES-3.2">RELNOTES-3.2</a>
<li><a href="RELNOTES-3.1">RELNOTES-3.1</a>
<li><a href="relnotes/6.3.2">6.3.2 release notes</a>
<li><a href="relnotes/6.3.1">6.3.1 release notes</a>
<li><a href="relnotes/6.3">6.3 release notes</a>
<li><a href="relnotes/6.2.1">6.2.1 release notes</a>
<li><a href="relnotes/6.2">6.2 release notes</a>
<li><a href="relnotes/6.1">6.1 release notes</a>
<li><a href="relnotes/6.0.1">6.0.1 release notes</a>
<li><a href="relnotes/6.0">6.0 release notes</a>
<li><a href="relnotes/5.1">5.1 release notes</a>
<li><a href="relnotes/5.0.2">5.0.2 release notes</a>
<li><a href="relnotes/5.0.1">5.0.1 release notes</a>
<li><a href="relnotes/5.0">5.0 release notes</a>
<li><a href="relnotes/4.1">4.1 release notes</a>
<li><a href="relnotes/4.0.3">4.0.3 release notes</a>
<li><a href="relnotes/4.0.2">4.0.2 release notes</a>
<li><a href="relnotes/4.0.1">4.0.1 release notes</a>
<li><a href="relnotes/4.0">4.0 release notes</a>
<li><a href="relnotes/3.5">3.5 release notes</a>
<li><a href="relnotes/3.4.2">3.4.2 release notes</a>
<li><a href="relnotes/3.4.1">3.4.1 release notes</a>
<li><a href="relnotes/3.4">3.4 release notes</a>
<li><a href="relnotes/3.3">3.3 release notes</a>
<li><a href="relnotes/3.2.1">3.2.1 release notes</a>
<li><a href="relnotes/3.2">3.2 release notes</a>
<li><a href="relnotes/3.1">3.1 release notes</a>
</ul>
</div>
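The link changes in this diff are mechanical: index entries move from relnotes-X.html to relnotes/X.html, and pages that now live in the relnotes/ directory reach their siblings through ../. A sketch of how such a rewrite could be scripted is given below. The function names and the default list of file names are assumptions for illustration only; nothing here claims to be how the commit was actually produced.

```python
import re


def move_relnotes_href(html):
    """Rewrite href="relnotes-X.html" to href="relnotes/X.html"."""
    return re.sub(r'href="relnotes-([0-9.]+)\.html"',
                  r'href="relnotes/\1.html"', html)


def fix_moved_page_links(html, names=("mesa.css", "contents.html",
                                      "install.html", "shading.html")):
    """Prefix links to top-level files with ../ for pages moved one level down.

    Only exact, unprefixed file names are matched, so running it twice
    does not add a second ../ prefix.
    """
    alternatives = "|".join(re.escape(n) for n in names)
    pattern = r'(href|src)="(' + alternatives + r')"'
    return re.sub(pattern, r'\1="../\2"', html)
```

Matching whole attribute values, rather than replacing bare file names anywhere in the text, avoids touching prose that merely mentions a file name.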

View File

@@ -106,7 +106,7 @@ Vertex/Fragment program debugger
GL_MESA_program_debug is an experimental extension to support
interactive debugging of vertex and fragment programs. See the
docs/MESA_program_debug.spec file for details.
docs/specs/OLD/MESA_program_debug.spec file for details.
The bulk of the vertex/fragment program debugger is implemented
outside of Mesa. The GL_MESA_program_debug extension just has minimal

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 6.4.1 / November 29, 2005</h1>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 6.4.2 / February 2, 2006</h1>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 6.4 / October 24, 2005</h1>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 6.5.1 Release Notes / September 15, 2006</h1>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 6.5.2 Release Notes / December 2, 2006</h1>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 6.5.3 Release Notes / April 27, 2007</h1>
@@ -56,7 +56,7 @@ for the same reason.
<ul>
<li>OpenGL 2.0 and 2.1 API support.
<li>Entirely new Shading Language code generator. See the
<a href="shading.html">Shading Language</a> page for more information.
<a href="../shading.html">Shading Language</a> page for more information.
<li>Much faster software execution of vertex, fragment shaders.
<li>New vertex buffer object (vbo) infrastructure
<li>Updated glext.h file (version 39)

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 6.5 Release Notes / March 31, 2006</h1>

View File

@@ -2,7 +2,7 @@
<html lang="en">
<head>
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.0.1 Release Notes / August 3, 2007</h1>

View File

@@ -2,7 +2,7 @@
<html lang="en">
<head>
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.0.2 Release Notes / November 10, 2007</h1>

View File

@@ -2,7 +2,7 @@
<html lang="en">
<head>
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.0.3 Release Notes / April 4, 2008</h1>

View File

@@ -2,7 +2,7 @@
<html lang="en">
<head>
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.0.4 Release Notes / August 16, 2008</h1>

View File

@@ -2,7 +2,7 @@
<html lang="en">
<head>
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.0 Release Notes / June 22, 2007</h1>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.1 Release Notes / August 26, 2008</h1>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.10.1 Release Notes / March 2, 2011</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.10.2 Release Notes / April 6, 2011</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.10.3 Release Notes / June 13, 2011</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.10 Release Notes / January 7, 2011</h1>
@@ -27,7 +27,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>
@@ -699,7 +699,7 @@ bc644be551ed585fc4f66c16b64a91c9 MesaGLUT-7.10.tar.gz
<li>st/egl: Plug pbuffer leaks.</li>
<li>st/egl: Fix eglCopyBuffers.</li>
<li>st/egl: Assorted fixes for dri2_display_get_configs.</li>
<li>docs/egl: Update egl.html.</li>
<li>docs/egl: Update ../egl.html.</li>
<li>st/egl: Fix eglChooseConfig when configs is NULL.</li>
<li>docs: Add an example for EGL_DRIVERS_PATH.</li>
<li>autoconf: Fix --with-driver=xlib --enable-openvg.</li>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.11.1 Release Notes / November 17, 2011</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.11.2 Release Notes / November 27, 2011</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.11 Release Notes / July 31, 2011</h1>
@@ -27,7 +27,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.2 Release Notes / 20 September 2008</h1>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.3 Release Notes / 22 January 2009</h1>
@@ -27,7 +27,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.4.1 Release Notes / 18 April 2009</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.4.2 Release Notes / May 15, 2009</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.4.3 Release Notes / 19 June 2009</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.4.4 Release Notes / 23 June 2009</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.4 Release Notes / 27 March 2009</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.5.1 Release Notes, 3 September 2009</h1>
@@ -29,7 +29,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

View File

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.5.2 Release Notes, 28 September 2009</h1>
@@ -29,7 +29,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.5 Release Notes / 17 July 2009</h1>
@@ -31,7 +31,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>
<p>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.6.1 Release Notes, 21 December 2009</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.6 Release Notes, 28 September 2009</h1>
@@ -27,7 +27,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>
@@ -48,7 +48,7 @@ c49c19c2bbef4f3b7f1389974dff25f4 MesaGLUT-7.6.zip
<h2>New features</h2>
<ul>
<li><a href="openvg.html">OpenVG</a> front-end (state tracker for Gallium).
<li><a href="../openvg.html">OpenVG</a> front-end (state tracker for Gallium).
This was written by Zack Rusin at Tungsten Graphics.
<li>GL_ARB_vertex_array_object and GL_APPLE_vertex_array_object extensions
(supported in Gallium drivers, Intel DRI drivers, and software drivers)</li>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.7.1 Release Notes / March 28, 2010</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.7 Release Notes / 21 December 2009</h1>
@@ -27,7 +27,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.8.1 Release Notes / April 5, 2010</h1>
@@ -29,7 +29,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.8.2 Release Notes / June 17, 2010</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.8.3 Release Notes / (date tbd)</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.8 Release Notes / March 28, 2010</h1>
@@ -27,7 +27,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>
@@ -53,8 +53,8 @@ b54581aeb79b585b158d6a32f94feff2 MesaGLUT-7.8.zip
<li>GL_ARB_fragment_coord_conventions extension (for swrast, i965, and Gallium drivers)
<li>GL_EXT_texture_array extension (swrast driver only)
<li>GL_APPLE_object_purgeable extension (swrast and i945/i965 DRI drivers)
<li>Much improved support for <a href="egl.html">EGL in Mesa</a>
<li>New state trackers for <a href="opengles.html">OpenGL ES 1.1 and 2.0</a>
<li>Much improved support for <a href="../egl.html">EGL in Mesa</a>
<li>New state trackers for <a href="../opengles.html">OpenGL ES 1.1 and 2.0</a>
<li>Dedicated documentation for Gallium
</ul>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.9.1 Release Notes / January 7, 2011</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.9.2 Release Notes / March 2, 2011</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 7.9 Release Notes / October 4, 2010</h1>
@@ -27,7 +27,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 2.1.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>
@@ -46,7 +46,7 @@ cd2b6ecec759b0457475e94bbb38fedb MesaLib-7.9.zip
<h2>New features</h2>
<ul>
<li>New, improved GLSL compiler written by Intel.
See the <a href="shading.html"> Shading Language</a> page for
See the <a href="../shading.html"> Shading Language</a> page for
more information.
<li>New, very experimental Gallium driver for R600-R700 Radeons.
<li>Support for AMD Evergreen-based Radeons (HD 5xxx)

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 8.0.1 Release Notes / February 16, 2012</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.0.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 8.0.2 Release Notes / March 21, 2012</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.0.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 8.0.3 Release Notes / May 18, 2012</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.0.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 8.0.4 Release Notes / July 10, 2012</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.0.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 8.0.5 Release Notes / October 24, 2012</h1>
@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.0.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 8.0 Release Notes / February 9, 2012</h1>
@@ -27,7 +27,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.0.
</p>
<p>
See the <a href="install.html">Compiling/Installing page</a> for prerequisites
See the <a href="../install.html">Compiling/Installing page</a> for prerequisites
for DRI hardware acceleration.
</p>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 9.0.1 Release Notes / November 16th, 2012</h1>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 9.0.2 Release Notes / January 22nd, 2013</h1>

docs/relnotes/9.0.3.html (new file, 247 lines)

@@ -0,0 +1,247 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 9.0.3 Release Notes / February 21st, 2013</h1>
<p>
Mesa 9.0.3 is a bug fix release which fixes bugs found since the 9.0.2 release.
</p>
<p>
Mesa 9.0 implements the OpenGL 3.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.1. OpenGL
3.1 is <strong>only</strong> available if requested at context creation
because GL_ARB_compatibility is not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
168384ac0101f4600a15edd3561acdc7 MesaLib-9.0.3.tar.gz
d7515cc5116c72ac63d735655bd63689 MesaLib-9.0.3.tar.bz2
a2e1c794572440fd0d839a7d7dfea00c MesaLib-9.0.3.zip
</pre>
<h2>New features</h2>
<p>None.</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=25201">Bug 25201</a> - Pink artifacts on objects in the distance in ETQW/Quake 4</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=31598">Bug 31598</a> - configure: Doesn't check for python libxml2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=40404">Bug 40404</a> - [softpipe] piglit glsl-max-varyings regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=47220">Bug 47220</a> - [bisected] Oglc pxconv-gettex(basic.allCases) regressed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=48629">Bug 48629</a> - [bisected i965]Oglc shad-compiler(advanced.TestLessThani) regressed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54240">Bug 54240</a> - [swrast] piglit fbo-generatemipmap-filtering regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=56920">Bug 56920</a> - [sandybridge][uxa] graphics very glitchy and always flickering</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=57166">Bug 57166</a> - [GM45] Chrome experiment &quot;Stars&quot; crash: brw_fs_emit.cpp:708: brw_reg brw_reg_from_fs_reg(fs_reg*): Assertion „!&quot;not reached&quot;“ failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=57746">Bug 57746</a> - build test failure: nouveau_fbo.c:198:3: error: too few arguments to function 'nouveau_renderbuffer_del'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=57754">Bug 57754</a> - [swrast] Mesa 9.1-devel implementation error: Unable to delete renderbuffer, no context</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=58680">Bug 58680</a> - [IVB] Graphical glitches in 0 A.D</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=58972">Bug 58972</a> - [softpipe] util/u_tile.c:795:pipe_put_tile_z: Assertion `0' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59364">Bug 59364</a> - [bisected] Mesa build fails: clientattrib.c:33:22: fatal error: indirect.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59700">Bug 59700</a> - [ILK/SNB/IVB Bisected]Oglc vertexshader(advanced.TestLightsTwoSided) causes GPU hung</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59873">Bug 59873</a> - [swrast] piglit ext_framebuffer_multisample-interpolation 0 centroid-edges regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60052">Bug 60052</a> - [Bisected]Piglit glx_extension_string_sanity fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60172">Bug 60172</a> - Planeshift: triangles where grass would be</li>
<!-- <li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=">Bug </a> - </li> -->
</ul>
<h2>Changes</h2>
<p>The full set of changes can be viewed by using the following GIT command:</p>
<pre>
git log mesa-9.0.2..mesa-9.0.3
</pre>
<p>Adam Jackson (1):</p>
<ul>
<li>r200: Fix probable thinko in r200EmitArrays</li>
</ul>
<p>Andreas Boll (7):</p>
<ul>
<li>docs: Add 9.0.2 release md5sums</li>
<li>docs: add news item for 9.0.2 release</li>
<li>configure.ac: Allow OpenGL ES1 and ES2 only with enabled OpenGL</li>
<li>build: require python module libxml2</li>
<li>cherry-ignore: Ignore candidates for the 9.1 branch.</li>
<li>mesa: Bump version to 9.0.3</li>
<li>docs: Add 9.0.3 release notes</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>mesa: Fix GL_LUMINANCE handling for textures in glGetTexImage</li>
</ul>
<p>Brian Paul (29):</p>
<ul>
<li>st/glx: accept GLX_SAMPLE_BUFFERS/SAMPLES_ARB == 0</li>
<li>draw: set precalc_flat flag for AA lines too</li>
<li>softpipe: fix up FS variant unbinding / deletion</li>
<li>softpipe: fix unreliable FS variant binding bug</li>
<li>xlib: handle _mesa_initialize_visual()'s return value</li>
<li>xlib: allow GLX_DONT_CARE for glXChooseFBConfig() attribute values</li>
<li>st/glx: allow GLX_DONT_CARE for glXChooseFBConfig() attribute values</li>
<li>util: fix addressing bug in pipe_put_tile_z() for PIPE_FORMAT_Z32_FLOAT</li>
<li>util: add get/put_tile_z() support for PIPE_FORMAT_Z32_FLOAT_S8X24_UINT</li>
<li>mesa: use GLbitfield64 when copying program inputs</li>
<li>svga: add NULL pointer check in svga_create_sampler_state()</li>
<li>vbo: add a null pointer check to handle OOM instead of crashing</li>
<li>osmesa: use _mesa_generate_mipmap() for mipmap generation, not meta</li>
<li>xlib: use _mesa_generate_mipmap() for mipmap generation, not meta</li>
<li>st/mesa: set ctx-&gt;Const.MaxSamples = 0, not 1</li>
<li>mesa: fix-up and use _mesa_delete_renderbuffer()</li>
<li>mesa: pass context parameter to gl_renderbuffer::Delete()</li>
<li>st/mesa: fix context use-after-free problem in st_renderbuffer_delete()</li>
<li>dri_glx: fix use after free report</li>
<li>mesa: remove warning message in _mesa_reference_renderbuffer_()</li>
<li>st/mesa: add null pointer check in st_renderbuffer_delete()</li>
<li>util: add some defensive coding in u_upload_alloc()</li>
<li>st/mesa: do proper error checking for u_upload_alloc() calls</li>
<li>util: add new error checking code in vbuf helper</li>
<li>mesa: don't enable GL_EXT_framebuffer_multisample for software drivers</li>
<li>st/mesa: only enable GL_EXT_framebuffer_multisample if GL_MAX_SAMPLES &gt;= 2</li>
<li>mesa: don't expose IBM_rasterpos_clip in a core context</li>
<li>svga: fix sRGB rendering</li>
<li>nouveau: Fix build.</li>
</ul>
<p>Chad Versace (1):</p>
<ul>
<li>i965/disasm: Fix horizontal stride of dest registers</li>
</ul>
<p>Eric Anholt (5):</p>
<ul>
<li>i965/fs: Fix the gen6-specific if handling for 80ecb8f15b9ad7d6edc</li>
<li>i965/fs: Don't generate saturates over existing variable values.</li>
<li>i965: Actually add support for GL_ANY_SAMPLES_PASSED from GL_ARB_oq2.</li>
<li>i965/vs: Try again when we've successfully spilled a reg.</li>
<li>i965/gen7: Set up all samplers even if samplers are sparsely used.</li>
</ul>
<p>Frank Henigman (1):</p>
<ul>
<li>mesa: add bounds checking for uniform array access</li>
</ul>
<p>Jerome Glisse (1):</p>
<ul>
<li>r600g: add cs memory usage accounting and limit it v3 (backport for mesa 9.0)</li>
</ul>
<p>Jordan Justen (1):</p>
<ul>
<li>unpack: support unpacking MESA_FORMAT_ARGB2101010</li>
</ul>
<p>José Fonseca (2):</p>
<ul>
<li>mesa/st: Don't use 4bits for GL_UNSIGNED_BYTE_3_3_2(_REV)</li>
<li>draw: Properly limit vertex buffer fetches on draw arrays.</li>
</ul>
<p>Kenneth Graunke (19):</p>
<ul>
<li>i965: Fix primitive restart on Haswell.</li>
<li>i965: Refactor texture swizzle generation into a helper.</li>
<li>i965: Do texture swizzling in hardware on Haswell.</li>
<li>i965: Lower textureGrad() with samplerCubeShadow.</li>
<li>i965: Use Haswell's sample_d_c for textureGrad with shadow samplers.</li>
<li>i965: Add chipset limits for Haswell GT1/GT2.</li>
<li>cherry-ignore: Ignore i965 guardband bug fixes.</li>
<li>i965: Add missing _NEW_BUFFERS dirty bit in Gen7 SBE state.</li>
<li>i965/vs: Create a 'lod_type' temporary for ir-&gt;lod_info.lod-&gt;type.</li>
<li>i965/vs: Set LOD to 0 for ordinary texture() calls.</li>
<li>i965/vs: Store texturing results into a vec4 temporary.</li>
<li>cherry-ignore: Ignore candidates for the 9.1 branch.</li>
<li>mesa: Disable GL_NV_primitive_restart extension in core contexts.</li>
<li>glsl: Track UBO block names in the symbol table.</li>
<li>build: Fix build on systems where /usr/bin/python isn't python 2.</li>
<li>i965: Refactor Gen6+ SF attribute override code.</li>
<li>i965: Compute the maximum SF source attribute.</li>
<li>i965: Fix the SF Vertex URB Read Length calculation for Sandybridge.</li>
<li>i965: Fix the SF Vertex URB Read Length calculation for Gen7 platforms.</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>r600g: fix int-&gt;bool conversion in fence_signalled</li>
<li>gallium/u_upload_mgr: fix a serious memory leak</li>
<li>r300g: fix blending with blend color and RGBA formats</li>
</ul>
<p>Matt Turner (3):</p>
<ul>
<li>mesa: Return 0 for XFB_VARYING_MAX_LENGTH if no varyings</li>
<li>mesa: Set transform feedback's default buffer mode to INTERLEAVED_ATTRIBS</li>
<li>mesa/uniform_query: Don't write to *params if there is an error</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>configure.ac: GLX cannot work without OpenGL</li>
</ul>
<p>Paul Berry (1):</p>
<ul>
<li>mesa: Allow glReadBuffer(GL_NONE) for winsys framebuffers.</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>softpipe: fix using optimized filter function</li>
</ul>
<p>Stefan Dösinger (3):</p>
<ul>
<li>meta: Disable GL_FRAGMENT_SHADER_ATI in MESA_META_SHADER</li>
<li>radeon: Initialize swrast before setting limits</li>
<li>r200: Initialize swrast before setting limits</li>
</ul>
<p>Zack Rusin (2):</p>
<ul>
<li>glx: only advertise GLX_INTEL_swap_event if it's supported</li>
<li>DRI2: Don't disable GLX_INTEL_swap_event unconditionally</li>
</ul>
</div>
</body>
</html>

@@ -3,7 +3,7 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
@@ -11,7 +11,7 @@
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 9.0 Release Notes / October 8, 2012</h1>

Some files were not shown because too many files have changed in this diff.