Fixes TGSI dump output when front/back-face register is declared.
Also, add some assertions to make sure the semantic/interpolate string
arrays have as many elements as there are tokens in the p_shader_tokens.h
file. That should catch problems like this in the future.
The values 2147483648.0 and 4294967294.0 are too larget to be stored in single
precision floats. Forcing these to be singles causes bits to be lost, which
results in errors in some pixel transfer tests.
This fixes bug #22344.
(cherry picked from commit 70e72070fc)
When a function parameter is const-qualified we can avoid making a copy
of the actual parameter (we basically do a search/replace when inlining).
This is now done for array element params too, resulting in better code
(fewer MOV instructions).
We should allow some other types of function arguments here but let's be
conservative for the moment.
The semantics are a little different for shaders vs. fixed-function when
trying to use an incomplete texture. The fallback texture returning
(0,0,0,1) should only be used with shaders. For fixed function, the texture
unit is truly disabled/ignored.
Fixes glean fbo test regression.
(cherry picked from commit 01e16befd4)
(cherry picked from commit 51325f44d1)
[anholt: squashed these two together from master, skipping the mess in between]
Of course glXGetVideoSyncSGI doesn't return the swap interval. The feature
only exists in the Mesa extension... which is the whole reason I created the
Mesa extension! Note that the Mesa extension allows drivers to default to a
swap interval of 0. If the Mesa extension exists, use its value. Only
consider the SGI extension when the Mesa extension is not available.
Fixes bug #22604.
Ensure no other thread is accessing a framebuffer when it is being destroyed by
acquiring both the global and per-framebuffer mutexes. Normal access only
needs the global lock to walk the linked list and acquire the per-framebuffer
mutex.
According to
http://blogs.msdn.com/oldnewthing/archive/2008/01/15/7113860.aspx
WM_SIZE is generated from WM_WINDOWPOSCHANGED by DefWindowProc so it
can be masked out by the application.
Also there were some weird bogus WM_SIZE 0x0 messages when starting
sharedtex_mt which we don't get like this.
the driver used to overwrite grf0 then use implicit move by send instruction
to move contents of grf0 to mrf1. However, we must not overwrite grf0 since
it's still used later for fb write.
Instead, do the move directly do mrf1 (we could use implicit move from another
grf reg to mrf1 but since we need a mov to encode the data anyway it doesn't
seem to make sense).
I think the dp_READ/WRITE_16 functions may suffer from the same issue.
While here also remove unnecessary msg_reg_nr parameter from the dataport
functions since always message register 1 is used.
Thanks to branching, the state of c->current_const[i].index at the point
of emitting constant loads for this instruction may not match the actual
constant currently loaded in the reg at runtime. Fixes a regression in my
GLSL program for idr's class since b58b3a786a.
gl_NormalMatrix is the inverse transpose of the modelview matrix, but
as every matrix here needs to be transposed, we end up with
{MODELVIEW_MATRIX, INVERSE}.
This is a tweak to a previous fix -- it's not necessary to actually
advertise this extension to prevent these games from crashing -- they
ignore the extension string anyway. It's sufficient to just have
GetProcAddress return some dummy function addresses for SwapInterval.
Given we don't really implement this funcitonality, this is a better
fix.
Conflicts:
progs/trivial/Makefile
Pull in a minimal version of statechange shortcircuiting in display
list compilation. This affects only glMaterial and glShadeModel state,
and includes quite a few tests to exercise various tricky cases.
If this goes well, will consider extending to all state in the future.
With recent changes to support frontfacing in glsl, it is necessary
to ensure that the UsesFogFragCoord value is accurate in all shaders.
We were previously not setting it for fixed-function and ARB_fs shaders.
Where vbo save nodes are terminated with a call to DO_FALLBACK(), as in
the case of a recursive CallList which is itself within a Begin/End pair,
there two problems:
1) The display list node's primitive information was incorrect, stating
the cut-off prim had zero vertices
2) On replay, we would get confused by a primitive that started in a
node, but was terminated by individual opcodes.
This change fixes the first problem by correctly terminating the last
primitive on fallback, and the second by forcing the display list to
use the Loopback path, converting all nodes into immediate-mode rendering.
The loopback fix is a performance hit, but avoiding this would require
a fairly large rework of this code.
Only short-circuit material call if *all* statechanges from this call
are cached. Some material calls (eg with FRONT_AND_BACK) change more
than one piece of state -- need to check all of them before returning.
Also, Material calls are legal inside begin/end pairs, so don't need
to be as careful about begin/end state as with regular statechanges
(like ShadeModel) when caching. Take advantage of this and do better
caching.
Statechanges which occur before the first End in a display list may
not be replayed when the list is called, in particular if it is called
from within a begin/end pair.
Recognize vulnerable statechanges and do not use them to fill in the
state cache.
State-change functions which precede the first call to glEnd() in
a compiled list are vulnerable to not being executed when that list
is called.
In particular this can happen if a list is invoked from within a
begin/end pair, as in this example.
When compiling a display list containing a CallList, it is necessary to
invalidate any assumption about the GL state after the recursive call
completes.
When one display list calls another display list, it is possible
that the calling display list makes state-changes or other actions which
invalidate any attempt at caching or state-change elimination in the
calling list.
This test exercises one such case, where the called list consists of just
a single glShadeModel() call.
Varying material inputs were not being picked up from the same slots
where the VBO code is currently placing them (GENERIC0 and above).
Most often they were just being ignored.
Swrast was missing a free for the culmination of driConcatConfigs.
Use free(), not _mesa_free() since we shouldn't be calling any Mesa
functions from the GLX code. driConcatConfigs() should probably use
regular malloc/free to be consistant but the Mesa functions just wrap
the libc functions anyway.
When a buffer was mapped for write and no explicit flush range was provided
the existing semantics were that the whole buffer would be flushed, mostly
for backwards compatability with non map-buffer-range aware code.
However if the buffer was mapped/unmapped with nothing really written --
something that often happens with the vbo -- we were unnecessarily assuming
that the whole buffer was written.
The new PIPE_BUFFER_USAGE_FLUSH_EXPLICIT flag (based from ARB_map_buffer_range
's GL_MAP_FLUSH_EXPLICIT_BIT flag) allows to clearly distinguish the
legacy usage from the nothing written usage.
Required as some applications
retrieve and call these functions regardless of the fact that we
don't advertise the extension and further more the results of
wglGetProcAddress are NULL.
mesa allocates both frontface and pointcoord registers within the fog
coordinate register, by using swizzling. to make it cleaner and easier
for drivers we want each of them in its own register. so when doing
compilation from the mesa IR to tgsi allocate new registers for both
and add new semantics to the respective declarations.
Fix xdemos which default to using display :0.0 to default to $DISPLAY,
this is kind of irritating when testing on a display other than :0.0
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Where vbo save nodes are terminated with a call to DO_FALLBACK(), as in
the case of a recursive CallList which is itself within a Begin/End pair,
there two problems:
1) The display list node's primitive information was incorrect, stating
the cut-off prim had zero vertices
2) On replay, we would get confused by a primitive that started in a
node, but was terminated by individual opcodes.
This change fixes the first problem by correctly terminating the last
primitive on fallback, and the second by forcing the display list to
use the Loopback path, converting all nodes into immediate-mode rendering.
The loopback fix is a performance hit, but avoiding this would require
a fairly large rework of this code.
Add a simple version of _mesa_lookup_enum_by_nr() which expects a primitive
enum (GL_POINTS..GL_POLYGON). This avoids some annoying duplicates
when looking up primitives, such as the GL_FALSE/GL_POINTS clash.
Currently, state-changes in mesa display lists are more or less
a verbatim recording of the GL calls made during compilation.
This change introduces a minor optimization to recognize and eliminate
cases where the application emits redundant state changes, eg:
glShadeModel( GL_FLAT );
glBegin( prim )
...
glEnd()
glShadeModel( GL_FLAT );
glBegin( prim )
...
glEnd()
The big win is when we can eliminate all the statechanges between two
primitive blocks and combine them into a single VBO node.
This commit implements state-change elimination for Material and ShadeModel
only. This is enough to make a start on debugging, etc.
gcc-4.2's optimizer has a strange bug where it looses code from inner
loops in certain situations. For example, if the appearently innocent
looking code below is compiled with gcc-4.2 -S -O1, the inner loop's
code is missing from the outputed assembly.
struct Size {
unsigned width;
};
struct Command {
unsigned length;
struct Size sizes[32];
};
extern void emit_command(void *command, unsigned length);
void
create_surface( struct Size size, unsigned faces, unsigned levels)
{
struct Command cmd;
unsigned face;
unsigned level;
cmd.length = faces*levels*sizeof(cmd.sizes[0]);
for(face = 0; face < faces; ++face) {
for(level = 0; level < levels; ++level) {
cmd.sizes[face*levels + level] = size;
// This should generate a shrl statement, but the whole for body
// disappears in gcc-4.2 -O1/-O2/-O3!
size.width >>= 1;
}
}
emit(&cmd, sizeof cmd.length + cmd.length);
}
Note that this is not specific to MinGW's gcc-4.2 crosscompiler (the
version typically found in debian/ubuntu's mingw32 packages). gcc-4.2 on
Linux also displays the same error. gcc-4.3 and above gets this
correctly though.
Updated MinGW debian packages with gcc-4.3 are available from
http://people.freedesktop.org/~jrfonseca/debian/pool/main/m/
This prevents the error
relocation R_X86_64_PC32 against symbol `_gl_DispatchTSD' can not be used when making a shared object; recompile with -fPIC
when building on x86_64 architecture.
To maintain correctness, the server will copy the real front-buffer to
a newly allocated fake front-buffer in DRI2GetBuffersWithFormat.
However, if the DRI2GetBuffersWithFormat is triggered by glViewport,
this will copy stale data into the new buffer. Fix this by flushing
the current fake front-buffer to the real front-buffer in
intel_viewport.
Fixes bug #22288.
A new node type (SLANG_OPER_RETURN_INLINED) is used to denote 'return'
statements inside inlined functions which need special handling.
All glean glsl1 tests pass for EmitContReturn=FALSE and TRUE.
Need to execute the for loop's increment code before we continue.
Add a slang_assemble_ctx::CurLoopOper field to keep track of the containing
loop and avoid the "cont if true" path in this situation.
Plus, assorted clean-ups.
We should just check if the loop contains a continue/break in the
_slang_can_unroll_for_loop() test function...
This reverts commit 989856bde4.
Conflicts:
src/mesa/shader/slang/slang_codegen.h
This is the start of a glsl-continue-return feature branch to support
a GLSL code generator option for 'continue' and 'return' statements.
Some targets don't support CONT or RET statements so we'll need to
try to generate code that does not use them...
We were failing to deal with:
- vsnprintf returns negative value on error.
- vsnprintf returns the number of chars that *would* have been
written on truncation.
The reshape() function was called when there was no GLX context so
the viewport/modelview/projection setup wasn't doing anything. Move
the call to reshape() into draw().
Also, remove -stereo, -fullscreen options and do some general clean-up.
glsl compiler will not generate OPCODE_SWZ, and as a first step it would
be translated away to a MOV anyway (why?), but later internally this opcode is
generated (for EXT_texture_swizzling).
(cherry picked from commit 4ef1f8e3b5)
We currently don't have support for SGI_swap_control for direct
contexts with DRI2, so disable reporting the extension. Reporting
the extension, and then having glXSwapIntervalSGI() "succeed"
but do nothing can confuse applications.
https://bugs.freedesktop.org/show_bug.cgi?id=22123
(cherry picked from commit 279143c6e8)
Be clearer that this is the number of generic vertex program/shader
attributes, not counting the legacy attributes (pos, normal, color, etc).
(cherry picked from commit 4a95185c9f)
Call the _mesa_set_enable() functions instead of driver functions, etc.
Also, add missing code for 1D/2D texture arrays.
(cherry picked from commit aac19609bf)
If we're running a vertex program to emulated fixed-function, we still need
to treat vertex arrays/attributes as if we're in fixed-function mode.
This should probably be back-ported to Mesa 7.5 after a bit more testing.
(cherry picked from commit dda82137d2)
When a GLSL sampler reads from an incomplete texture it should
return (0,0,0,1). Instead of jumping through hoops in all the drivers
to make this happen, just create/install a fallback texture with those
texel values.
Fixes piglit/fp-incomplete-tex on i965 and more importantly, fixes some
GPU lockups when trying to sample from missing surfaces. If a binding
table entry is NULL, it seems that sampling sometimes works, but not
always (lockup).
Todo: create a fallback texture for each type of texture target?
(cherry picked from commit 3f25219c7b)
The CALL_DrawArrays was leaking the clear's primitives into the display
list with GL_COMPILE_AND_EXECUTE. Use _mesa_DrawArrays instead, which
doesn't appear to leak. Fixes piglit dlist-clear test.
(cherry picked from commit 64edde1004)
Fixes#22181. R200 requires this since DP4 is used in hw tnl mode.
R300 prefers it (should be faster due to no instruction dependencies), but
both methods should be correct (when sw tcl is used though, MUL/MAD might
be faster). Probably doesn't make much difference for R100 since vertex progs
are executed in software anyway, but let's just keep it the same there too.
The alpha value wasn't set at all before so we got unpredictable results.
Note that we don't currently obey GL_DEPTH_TEXTURE_MODE in the state
tracker. For now, we return the result in the default mode (r,r,r,1).
Using uintptr_t as intermediate type for pointer -> integer conversions is
easier to understand and does not cause any size mismatch warnings.
uintptr_t is part of C99, and we already provide a suitable replacement
definition for all platforms we care about.
Avoids warnings on 64bit builds.
Use regular unsigned since that's what gallium expects, but use a
typedef to facilitate possible changes in the future.
Not only for cosmetic reasons, but also because we need to set the
SetWindowsHookEx hook for threads created before the DllMain is called
(threads for each we don't get the DLL_THREAD_ATTACH notification).
With 1D textures, GL_TEXTURE_WRAP_T should be ignored (only
GL_TEXTURE_WRAP_S should be respected). But the i965 hardware
seems to follow the value of GL_TEXTURE_WRAP_T even when sampling
1D textures.
This fix forces GL_TEXTURE_WRAP_T to be GL_REPEAT whenever 1D
textures are used; this allows the texture to be sampled
correctly, avoiding "imaginary" border elements in the T direction.
This bug was demonstrated in the Piglit tex1d-2dborder test.
With this fix, that test passes.
(cherry picked from commit ab6c4fa582)
One warning message:
drm_i915_getparam: -22
was still being sent to fprintf(). This causes all Piglit tests to fail,
even with MESA_DEBUG=0.
Using _mesa_warning() to emit the message allows the general Mesa controls
for messages like this to be applied.
(cherry picked from commit bc3270e99f)
When out of memory (in at least one case, triggered by a longrunning
memory leak), this code will segfault and crash. By checking for the
out-of-memory condition, the system can continue, and will report
the out-of-memory error later, a much preferable outcome.
(cherry picked from commit 44a4abfd4f)
If there is no shared context, there is no purpose in rebinding the same
texture. In some artificial tests this improves performance 10% - 30%.
(cherry picked from commit 7f8000db8b)
This saves doing swtnl from uncached memory, which is painful. Improves
clutter test-text performance by 10% since it started using VBOs.
(cherry picked from commit a945e203d4)
Noticed while debugging a weird 1D FBO testcase that left its existing
viewport and projection matrix in place when switching drawbuffers. Didn't
fix the testcase, though.
(cherry picked from commit 3a521d84ec)
Both EXT_fbo and ARB_fbo agree on this. Fixes a segfault in the metaops
mipmap generation in Intel for SGIS_generate_mipmap of S3TC textures in
Regnum Online.
Bug #21654.
(cherry picked from commit 0307e609aa)
The _Enabled field isn't updated at the point that DrawBuffers is called,
and the Driver.Enable() function does the testing for stencil buffer
presence anyway.
bug #21608 for Radeon
(cherry picked from commit 4c6f829899)
The docs actually explain this, but not in a terribly clear manner.
This nearly fixes the piglit cubemap testcase, except that something's
going wrong with the nearest filtering at 2x2 sizes in the testcase.
Looks good by visual inspection, though.
Bug #21692
(cherry picked from commit 5c5a468848)
Before, if the VP output something that is in the attributes coming into
the WM but which isn't used by the WM, then WM would end up reading subsequent
varyings from the wrong places. This was visible with a GLSL demo
using gl_PointSize in the VS and a varying in the WM, as point size is in
the VUE but not used by the WM. There is now a regression test in piglit,
glsl-unused-varying.
(cherry picked from commit 0f5113deed)
This comes from a radeon-rewrite fallback fix, but may also fix stencil
clear failure when the polygon winding mode is flipped.
(cherry picked from commit d866abeffc)
The first time a context is bound to a drawable, the viewport and scissor
bounds are initialized to the buffer's size. This is actually a bit tricky.
A new _mesa_check_init_viewport() function is called in several places
to check if the viewport has been initialized. We also use a new
ctx->ViewportInitialized flag instead of the overloaded
ctx->FirstTimeCurrent flag.
wglShareLists is a little picky -- it seems to check if it has exclusive
access to a lock, and fails if it doesn't.
This allows the texture to be shared with all windows.
Two parts to this:
One we don't keep pointers to possibly freed memory anymore once we unbind the
drawables from the context. Brian I need to figure out what the comment
you made there, can we get a glean/piglit test so we can fix it properly?
If the new gc is the same as the oldGC, we call the unbind even though
we just bound it in that function. doh.
(cherry picked from master, commit 77506dac8e)
For the TXP instruction we check if the texcoord is really a 4-component
atttibute which requires the divide by W step. This check involved the
projtex_mask field. However, the projtex_mask field was being miscalculated
because of some confusion between vertex program outputs and fragment
program inputs.
1. Rework the size_masks calculation so we correctly set bits corresponding
to fragment program input attributes.
2. Rename projtex_mask to proj_attrib_mask since we're interested in more
than just texcoords (generic varying vars too).
3. Simply the indexing of the size_masks and proj_attrib_mask fields.
4. The tracker::active[] array was mis-dimensioned. Use MAX_PROGRAM_TEMPS
instead of a magic number.
5. Update comments, add new assertions.
With these changes the Lightsmark demo/benchmark renders correctly, until
we eventually hit a GPU lockup...
For some triangles we can generate quads which lie just outside the
surface bounds. Just check the quad's mask before trying to emit/process
the quad.
Fixes failed assertion in Lightsmark.
Render results are only visible when the render cache is flushed.
softpipe_is_texture_referenced must reflect that or transfers to/from the
textures bound in the framebuffer won't be proceeded of the necessary
flush, causing transfer data to be outdated/clobbered.
This fixes conform drawpix test with softpipe.
When the -arb option is specified we use GL_ARB_framebuffer_object intead
of GL_EXT_framebuffer_object.
For some vendors' OpenGL it's important to call the ARB entrypoints
instead of the EXT entrypoints to get correct behaviour. Use some
function pointer tricks to do this (instead of GLEW).
Before, if a vertex shader's outputs didn't exactly match a fragment
shader's inputs we could wind up with invalid TGSI shader declarations.
For example:
Before patch:
DCL OUT[0], POSITION
DCL OUT[1], COLOR[1]
DCL OUT[2], GENERIC[0]
DCL OUT[3], GENERIC[0] <- note duplicate [0]
DCL OUT[4], GENERIC[2]
After patch:
DCL OUT[0], POSITION
DCL OUT[1], COLOR[1]
DCL OUT[2], GENERIC[0]
DCL OUT[3], GENERIC[1]
DCL OUT[4], GENERIC[2]
Actually, after spotting this problem, I realized this is unreachable
code. However don't bother to enable this fast path now, given the normal
path is working just fine.
Add _NEW_PROGRAM_CONSTANTS to _SWRAST_NEW_DERIVED.
This makes sure that we update the fragment shader's constants when state
vars (such as point size) changes.
Fixes the progs/glsl/points.c demo.
The existing implementation was already implemented on software, but relied
on the pipe driver to always support the R16G16B16A16_SNORM format. This
patch eliminates that, without prejudice against a future hardware-only
implementation.
It also avoids some of the short <-> float conversions, and only does a read
transfer of the color buffer on GL_RETURN if absolutely necessary.
Fix mklib to deal with NOPREFIX and use --enable-auto-image-base for cygwin
Teach configure.ac some basic facts about cygwin
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
It's legal for ARB_vertex_program programs to not write to result.position.
The results are undefined in that case. This assertion was causing us to
abort/exit though.
The recent increase ST_MAX_SHADER_TOKENS to 8K causes stack overflows on
windows.
Failure to allocate is not being propagated to the caller. This is not
a regression since the previous _mesa_malloc result wasn't being
checked as well. Unfortunately it is not easy to fix, as the callers of
these functions do not have failure propagation mechanism either, and
so on. So leaving a just fixme note for now.
We were adding references to the input arrays, but failing to drop
them on destruction. This could lead to a 64kb buffer being leaked
each context destruction.
1) Pass the correct format when calling update_array in
_mesa_VertexAttribPointerARB.
2) glVertexAttribPointerNV accepts GL_BGRA format too.
3) raise INVALID_VALUE error when format is BGRA and normalized is
false in glVertexAttribPointerARB
(cherry picked from commit 4adb190a16)
Now that the dlopen wrappers are built into libmesa.a, we need to link
standalone libOSMesa with libdl to resolve dlopen and friends on
platforms that need it.
(cherry picked from commit 4795dd5950)
autoconf had been designating the 8 bit libOSMesa as the default
standalone osmesa, but the Makefile expected it to be linked to libGL.
Fix up the osmesa Makefile so that it allows any of the combinations of
standalone and channel width to be built.
Fixes bug #21980.
(cherry picked from commit 7441dcd90b)
Because of flat shading, we can't use same code as PIPE_PRIM_TRIANGLE_FAN.
This is a follow-on to commit a59575d8fb.
(cherry picked from commit 086ecea179)
When approaching y = x * 0xffffffff / 0xffffff with bit arithmetic, the
8 least significant bits of y should come from the
8 most significant bits of x.
This prevents the width / height from being clipped to the window size before
the texture is allocated. This matches intelCopyTexImage1D.
This should fix bug #21227
Signed-off-by: Ian Romanick <ian.romanick@intel.com>
(cherry picked from commit 129f311673)
Need to take the draw buffer's up/down orientation into consideration
when setting the sprite_coord_mode field.
Fixes inverted sprites when drawing into an FBO.
Fixes a crash when clearing the window with a quad after drawing large
points. We were asking the draw module how many vertex shader outputs
there were and got 3 instead of 2. This led to creating vertices with
too many attributes and trying to read invalid memory.
We reset extra_vp_outputs.slot to zero in the aaline/aapoint stage's
flush functions already.
This omission was just an oversight in the wide_point stage.
There is no current pixel format. Each HDC has its pixelformat which is
kept by gdi and set/get via the SetPixelFormat/GetPixelFormat functions.
Now the HDC's pixelformat is kept in the stw_framebuffer, which is
created during the SetPixelFormat.
This fixes the incorrect colors seen when rendering flat-shaded polygons.
Note that clipped polygons were correct, but unclipped polygons were wrong.
See the glean/clipFlat test for regression testing.
When a vertex shader uses generic vertex attribute 0, but not gl_Vertex,
we need to set attribute[16] to point to attribute[0]. We were setting the
attribute size, but not the pointer.
Fixes crash in glsl/multitex.c when using the VertCoord attribute instead
of gl_Vertex.
If the multitex.vert shader uses the VertCoord generic vertex attribute
instead of the pre-defined gl_Vertex attribute, we need to make sure that
VertCoord gets bound to generic vertex attribute zero.
That's because we need to call glVertexAttrib2fv(0, xy) after all the other
vertex attributes have been set since setting generic attribute 0 triggers
vertex submission. Before, we wound up issuing the vertex attributes in
the order 0, 1, 2 which caused the first vertex to be submitted before all
the attributes were set. Now, the attributes are set in 1, 2, 0 order.
It's possible to hand a GL_COLOR_INDEX/GL_BITMAP image to glTexImage3D()
which gets converted to RGBA via the glPixelMap tables.
This fixes a failure with piglit/fdo10370 with Gallium.
The generic_array[] is 16 elements in size, but the loop was doing 32
iterations. The out of bounds array write was clobbering the following
inputs[] array but as luck would have it, that didn't matter.
The rationale here is to avoid updating a timestamp for a file that
hasn't changed. Needless updates of the timestamp can ripple into
other projects, (xserver, etc.), useless recompiling due to a
'make install' in mesa that didn't actually change anything.
gl_array_object encapsulates a set of vertex arrays (see the
GL_APPLE_vertex_array_object extension).
Create a private gl_array_object for drawing the quad for intel_clear_tris()
so we don't have to worry about the user's vertex array state.
This fixes the no-op glClear bug #21638 and removes the need to call
_mesa_PushClientAttrib() and _mesa_PopClientAttrib().
Don't really delete vertex array objects until the refcount hits zero.
At that time, unbind any pointers to VBOs.
(cherry picked from commit 32b851c807)
For non-stereo visuals, which is all we support, we treat
GL_FRONT_LEFT as GL_FRONT. However, they are technically different,
and they have different enum values. Test for either one to determine
if we're in front-buffer rendering mode.
This fix was suggested by Pierre Willenbrock.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 2085cf2462)
The current texture for any particular texture unit is given an additional
reference in update_texture_state(); but if the context is closed before
that texture can be released (which is quite frequent in normal use, unless
a program unbinds and deletes the texture and renders without it to force
a call to update_texture_state(), the memory is lost.
This affects general Mesa; but the i965 is particularly affected because
it allocates a considerable amount of additional memory for each allocated
texture.
(cherry picked from master, commit c230767d69)
The src/dest x,y, and w,h arguments of the pipe->surface_copy
function are unsigned and the drivers aren't expecting negative
(or extremly-large unsigned) values as inputs. Trim the requests
at the state-tracker level before passing down.
Add a new flag mvp_with_dp4 in the context, and use that to switch
both ffvertex.c and programopt.c vertex transformation code to
either DP4 or MUL/MAD implementations.
This is a quick fix for z fighting in quake4 caused by the mismatch
between vertex transformation here and in the position_invarient code.
Full fix would be to make this driver-tunable and adjust both
position_invarient and ffvertex_prog.c code to respect driver
preferences.
Add a dummy function which exists only so that tgsi_text_translate()
doesn't get magic-ed out of the libtgsi.a archive by the build system.
Don't remove unless you know this has been fixed - check on
mingw/scons builds as well.
Previously we created the pipe_surface during framebuffer validation.
But if we did a glCopyTex[Sub]Image() before anything else we wouldn't yet
have the surface. This fixes that.
When building mangled Mesa on Darwin, the exported symbols are
named `_mgluWhatever' instead of simply `_gluWhatever'. When
using a list of exported symbols via the system ld's
`-exported_symbols_list' command line option (as done by mklib),
this resulted in error messages about exporting symbols which do
not exist.
Fortunately the file format accepts simple wildcards. This throws
a wildcard so that the symbol list will match both the mangled and
non-mangled names, preventing the warning and actually exporting
the correct symbols in one shot.
I don't think anyone besides a developer would ever want to use the demo
egl driver. Furthermore, egl would only ever load demodriver if it was
set via EGL_DRIVER in the environment. In that case, I think you can
point it to your mesa source directory.
Previously the pkg-config output files would contain e.g. `-lGL'
and `-lGLU', even if the user modified their configuration to
build libraries with different names. This modifies the
pkg-config inputs, and corresponding makery, so that modifying the
output library name will cause the appropriate updated name to
appear in the pkg-config `-l' option.
Signed-off-by: Dan Nicholson <dbn.lists@gmail.com>
The TGSI interpeter operates in SOA style. We need to check for data
dependencies in instructions which read from and write to the same register.
For now just adding some debug code to detect that condition. Actual fixes
to follow.
This is only used for debuging the gem backend on i965
chipset using the softpipe pipe driver.
Usage: "export INTEL_SOFTPIPE=y" and point LIBGL_DRIVERS_PATH
to "$MESA/lib/gallium" where $MESA is the mesa root.
Make it possible to pass state-tracker-specific data to the
init_screen function, and even open the door for device-specific
state-tracker screen initialization.
Signed-off-by: Thomas Hellstrom <thellstrom-at-vmware-dot-com>
If a shader reaches an out-of-memory condition while adding
a new function (reallocating the function list), a segfault
will occur during cleanup (because the num_functions field
is non-zero, but the functions pointer is NULL).
This fixes that segfault by zeroing out the num_functions
field if reallocation fails.
Make the use_const_buffer field per-program and only call the code which
updates the constant buffer's data if the flag is set.
This should undo the perf regression from 20f3497e4b
We were incorrectly applying the destination texture face and level
when requesting a transfer to the temporary texture, which has only
one face and level. This would obviously cause problems uploading to
compressed cube and mipmap textures.
Here is a couple of fixes for GNU/Hurd:
- dri_interface.h: no libdrm support either.
- configure.ac:
- GNU/Hurd is a GNU OS with _GNU_SOURCE and PTHREADS.
- GNU needs a couple of flags like other OSes
Signed-off-by: Dan Nicholson <dbn.lists@gmail.com>
This interface gives the driver two important features. First, it can
allocate the (fake) front-buffer only when needed. Second, it can
tell the buffer allocator the format of buffers being allocated. This
enables support for back-buffer and depth-buffer with different bits
per pixel.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@redhat.com>
As of commit 23ad86cfb9 all messages go
through output_if_debug().
Add new parameter to output_if_debug() to indicate whether to emit a newline.
_mesa_warning() and _mesa_error() calls should not end their strings with \n.
_mesa_debug() calls should end their text with \n.
This can be almost impossible to avoid - hopefully we won't encounter
a situation where this is a true requirement. Would probably require
drivers to flush between hardware and software vertex processing.
The drm_intel_gem_bo_map_gtt() call that replaced dri_bo_map() is
producing errors like:
intel_bufmgr_gem.c:689: Error preparing buffer map 39 (vp_const_buffer): Invalid argument .
and returning NULL, causing a segfault in the memcpy().
Just reverting until we can get to the root issue...
i915 actually supports up to 4 (according to header file - not tested),
i965 up to 16 (code already handled this but slightly broken), so don't use 2
for all chips, even though angular dependency is very high.
This is a follow-on to commit c1a3b85280.
Note that (at this time) wherever _NEW_PROGRAM_CONSTANTS is set we're still
setting _NEW_PROGRAM so this won't really make any difference (for now).
Need to do this to ensure vbo code unmaps its buffers before calling
the driver, which may be sitting on top of a memory manager which
objects to firing commands from a mapped buffer.
This state flag will be used to indicate that vertex/fragment program
constants have changed. _NEW_PROGRAM will be used to indicate changes
to the vertex/fragment shader itself, or misc related state.
_NEW_PROGRAM_CONSTANTS is also set whenever a program parameter that's
tracking GL state has changed. For example, if the projection matrix is
in the parameter list, calling glFrustum() will cause _NEW_PROGRAM_CONSTANTS
to be set. This will let to remove the need for dynamic state atoms in
some drivers.
For now, we still set _NEW_PROGRAM in all the places we used to. We'll no
longer set _NEW_PROGRAM in glUniform() after drivers/etc have been updated.
Return the conservative PIPE_REFERENCED_FOR_READ | PIPE_REFERENCED_FOR_WRITE
value for now.
This fixes a bunch of regressions seen in piglit and glean.
This needs a proper fix to propogate the out-of-memory condition back
up to Mesa and the app as a GL error. Until then, at least catch the
problem at its source.
The READ message's msg_control value can be 0 or 1 to indicate that the
Oword should be read into the lower or upper half of the target register.
It seems that the other half of the register gets clobbered though. So
we read into two dest registers then use a MOV to combine the upper/lower
halves.
Now that we have real constant buffers, the demands on the CURBE are lessened.
When we use real VS/WM constant buffers we only use the CURBE for clip planes.
There are two usage types of buffer CPU accesses:
One where we try to use the buffer contents for multiple draw commands in
a batch. (batch := sequence of commands that are flushed together),
like incrementally adding bitmaps to a bitmap texture that is reallocated
on flush.
And one where we assume we can safely overwrite the old buffer contexts, like
glTexSubImage. In this case we need to make sure all old drawing commands
referencing the buffer are flushed before we map the buffer.
This is easily forgotten.
Add wrappers for the most common of these operations. The first type is
prefixed with "st_no_flush" and the second type is prefixed with
"st_cond_flush", where "cond" indicates that we attmpt to only flush
if there is indeed unflushed draw commands referencing the buffer.
Prefixed functions are
screen::get_tex_transfer
pipe_buffer_write
pipe_buffer_read
pipe_buffer_map
Please use the wrappers whenever possible.
Signed-off-by: Thomas Hellstrom <thellstrom-at-vmware-dot-com>
Also enable them all regardless of screen bpp, as 32 bpp what I've been
testing against, and haven't been able to detect any screen bpp-specific
troubles with them.
For some reason, MOV instructions using immediate src values don't seem
to work reliably on the GLSL path. Disable them for now (falling back to
const buffer reads). This fixes a bunch of glean glsl1 failures.
need to round up height for _mesa_copy_rect otherwise
textures with height smaller than 4 won't get copied to the miptree at all
Also fix up the confusing debug output (don't output unitialized values,
and output if data is present and the compressed flag)
Also implement context member functions to optimize away those
flushes whenever possible.
Signed-off-by: Thomas Hellstrom <thellstrom-at-vmware-dot-com>
This fixes a regression in fbotest1 on 915, where a transition from
color+vertex array enabled to texcoord0+vertex array enabled wouldn't trigger
program update on the second _mesa_update_state of DrawArrays, and we'd sample
a constant texcoord of 0,0,0,1 instead of the array.
The double state update in DrawArrays from
1680ef8696 still needs fixing.
There's really no need for two negation fields. This came from the
GL_NV_fragment_program extension. The new, unified Negate bitfield applies
after the absolute value step.
This mostly came down to finding the right MRF incantation in the
brw_dp_READ_4_vs() function.
Note: this feature is still disabled (but getting close to done).
Have to use LINK /LIB instead. The biggest problem is when the command
line is very long and all the options are included in a argument file --
link doesn't like if /LIB is included in the argument file.
It is possible for ctx->DrawBuffer to be NULL, so don't fault when
that happens. This change is not being committed to master because it
doesn't appear to be necessary there.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cherry picked from mesa_7_4_branch, commit 49e0c74ddd
Hook up a constant buffer, binding table, etc for the VS unit.
This will allow using large constant buffers with vertex shaders.
The new code is disabled at this time (use_const_buffer=FALSE).
We don't update drawables anymore unless they are completely
uninitialized, so we need to swap even if we don't have
cliprects yet, otherwise we never end up calling the driver's
SwapBuffers(). The driver should update the drawable in its
SwapBuffers() anyway.
See 8e753d0404,
"dri glx: Fix dri_util::driBindContext" for the change that
exposed it.
Unfortunately this doesn't catch all the cases, as the mesa state tracker
can still use the framebuffer without giving the wgl state tracker
the chance to lock it.
Handle the loader returning a fake front-buffer. Since the driver
never specifically requests a fake front-buffer, the driver assumes
that it will never receive both a fake and a real front-buffer.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@redhat.com>
Assume that the front-buffer exists even if the server didn't tell the
client that it exists.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@redhat.com>
Track two flags: whether or not front-buffer rendering is currently
enabled and whether or not front-buffer rendering has been enabled
since the last glFlush. If the second flag is set, the front-buffer
is flushed via a loader call back. If the first flag is cleared, the
second flag is cleared at this time.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@redhat.com>
Subclassing the window is invasive: we might call an old window proc even
after it was removed. Glut and another bug just in the wrong time was
provoking this. Hooks don't have this problem.
This fixes the random results that were seen when fetching a constant
inside an IF/ELSE clause. Disabling the execution mask ensures that all
the components of the register are written.
Before, the instruction's CondUpdate field was mistakenly effecting the
constant-fetch operation.
Fixes progs/glsl/bump.c demo. But there are some other issues related
to condition flags and IF/ELSE that need investigation...
For testing, it's very useful to be able to test on a debug build,
while suppressing the debug messages (messages that are by default
suppressed in a release build), in order to see the same behavior
that users of release builds will see.
For example, the "piglit" test suite will flag an error on
programs that produce unexpected output, which means that a
debug build will always fail due to the extra debug messages.
This change introduces a new value to the MESA_DEBUG
environment variable. In a debug build, explicitly setting MESA_DEBUG
to "0" will suppress all debug messages (both from _mesa_debug() and
from _mesa_warning()). (The former behavior was that debug
messages were never suppressed in debug builds.)
Behavior of non-debug builds has not changed. In such a build,
_mesa_debug() messages are always suppressed, and _mesa_warning()
messages will be suppressed unless MESA_DEBUG is set *to any value*.
The texture border color must be interpreted according to the texture's
base format. For example, for a GL_ALPHA texture, sampling the border
color should return (0,0,0,borderAlpha). This wasn't an issue here until
I removed the legacy texenv code (we always use the combiner path now).
This gets us the savings for driver-internal viewport calls that
dd1c68f151 was attempting, without relying
on Xlib internals or clients handling X events.
This speeds up OA on my GM45 by 21% (more than the original CPU cost of
the upload path). We might still be able to squeeze a few more percent out
by avoiding repeatedly mapping/unmapping buffers as we upload elements into
them.
If _math_matrix_analyse() got called before we allocated the inverse
matrix array we could lose the flag indicating that we needed to compute
the inverse. This could happen with certain vertex shader cases.
Currently, shader constants are stored in the GRF (loaded from the CURBE
prior to shader execution). This severly limits the number of constants
and temps that we can support.
This new code will support (practically) unlimited size constant buffers
and free up registers in the GRF. We allocate a new buffer object for the
constants and read them with "Read" messages/instructions. When only a
small number of constants are used, we can still use the old method.
The code works for fragment shaders only (and is actually disabled) for now.
Need to do the same thing for vertex shaders and need to add the necessary
code-gen to fetch the constants which are referenced by the shader
instructions.
premature return in TexParameterf caused mesa to never call Driver.TexParameter
breaking drivers relying on this (fix bug #20966).
While here, also fix using ctx->ErrorValue when deciding to call
Driver.TexParameter. Errors are sticky and uncleared errors thus would cause
this to no longer get called. Since we thus need return value of
set_tex_parameter[if] can also optimize this to only call when value changed.
1) If MakeContextCurrent is called with (NULL, None, None), Don't
send the request to the X server if the current context is direct.
2) Return BadMatch in some error cases according to the glx spec.
3) If MakeContextCurrent is called for a context which is current in
another thread, return BadAccess according to the glx spec.
Signed-off-by: Thomas Hellstrom <thellstrom-at-vmware-dot-com>
1) Don't error-check here. It's done in glx makeCurrent.
2) Allow ctx and the dri drawables to be NULL for future use. This is
currently blocked in glx makeCurrent.
3) Avoid updating dri drawables unless they are completely uninitialized.
Since the updating was done outside of the lock, the driver need to
verify and redo it anyway.
Signed-off-by: Thomas Hellstrom <thellstrom-at-vmware-dot-com>
A shader program may consist of multiple shaders (source code units).
If we find there are unresolved functions after compiling the unit that
defines main(), we'll concatenate all the respective vertex or fragment
shaders then recompile.
This isn't foolproof but should work in most cases.
This fixes an issue when compiling glCallList() into another display list
when the mode is GL_COMPILE_AND_EXECUTE.
Before, the call to glCallList() called _mesa_save_CallList() which called
neutral_CallList() which then called _mesa_save_CallList() again. In the
end, the parent display list contained two calls to the child display list
instead of one.
Let's be on the lookout for regressions caused by this change for a while
before we cherry-pick this elsewhere.
"const" is the right keyword, but I can't do that without adding a bunch
of really annoying and ugly const casts everywhere, and frankly,
that's really stupid, so instead, just don't make them const.
The 'dots' register wasn't getting properly un-negated and un-swizzled
after emitting the code for back-face lighting. So, if more than one
light source was enabled, the specular exponent for the next light source
was wrong.
During execution we were evaluating pow(x, y) where y was negative instead
of positive. This led to the outcome being zero or NaN.
This fixes the occasional black triangles seen in isosurf when hacked to
enable two-sided lighting.
We don't upload the pixels with the CPU in that case, so the map will
only serve as a way of triggering cache flushes over a bunch of data we
don't touch.
Surfaces are now by definition GPU views. So CPU access flags don't make
any sense when creating a surface.
For now we are forcing surfaces to be GPU read/write, but that will go away
soon.
i965 can either do SRGBA8_REV format or SARGB8 format, but not SRGBA8.
Could add SRGBA8_REV support to mesa, but simply use SARGB8 for now.
While here, also add true srgb luminance / luminance_alpha support -
unfortunately the published docs fail to mention which asics support
this, tested on g43 so assume this works on any g4x.
use color format constants instead of magic numbers
remove handling of cpp 0 or 3 (neither is possible) in various places
don't misconfigure 8 bit surface blits as rgb565
Move flags for linking standard C/C++ libraries from configure.ac to mklib
Use -norunpath flag when linking with Sun C++ compiler
Convert mklib -exports list into a linker mapfile
Set FINAL_LIBS correctly when -noprefix is used
Signed-off-by: Alan Coopersmith <alan.coopersmith@sun.com>
This scheme breaks when the display connection doesn't receive ConfigureNotify
events. This caused reporoducible problems (cropped / misplaced output) when
starting a 3D application in a guest operating system in VMware Workstation.
This reverts commit dd1c68f151.
Conflicts:
src/glx/x11/dri2_glx.c
Since core Mesa MAX_TEXTURE_LEVELS was bumped, we were incorrectly advertising
a maximum texture size of 4096 on older chips, causing corrupted menu text in
Extreme Tux Racer or Armagetron.
Also make sure our texture image array can actually hold all the mipmap levels
we support...
To distinguish from the -0.2 version still being maintained on the
gallium-mesa-7.4 branch. There are already greater interface changes
between these two branches than there were between -0.2 and -0.1.
Also stop injecting Tungsten into the vendor string - the Gallium in
the renderer string should be sufficient.
The FBO pixel coordinate system, with (0,0) as the
upper-left pixel, is inverted in Y compared to the
normal OpenGL pixel coordinate system, which has
(0,0) as its lower-left pixel.
Viewport and polygon stipple are sensitive to this
inversion; so is point rasterization. The basic
fix is simple: when rendering to an FBO, instead
of the normal RASTRULE_UPPER_RIGHT that's
appropriate for OpenGL windows, use the Y inversion
RASTRULE_LOWER_RIGHT.
Unfortunately, current Intel documentation has this
value listed as "Reserved, but not seen as useful".
It does work on at least some i965-class devices,
though; and the worst that could happen if an
older device didn't support it would be incorrect
point rasterization to FBOs, which is what happens
already, so this fix is at least no worse than what
happens presently, and is better for some (and possibly
all) i965-class devices.
This also cuts instructions by just using the existing bit in the payload
rather than computing it from the determinant in the SF unit and passing it
as a varying down to the WM. Something still goes wrong with getting the
backface color right, but a simpler shader appears to get the right result.
Previously, we would sample (f,glFrontFacing,undef,undef) instead of the
(f,0,0,1) that fragment.fogcoord is supposed to return. Due to
glFrontFacing's presence in FOGC.y, we'll still give bad results there when
glFrontFacing is used.
Bug #19122, piglit testcase fp-fog.
Turns out that XXX comment was important. We weren't flagging the WM to
re-update with the statistics enable, so we got zeroes out of our query.
Bug #20740, fixes piglit occlusion_query test.
Signed-off-by: Eric Anholt <eric@anholt.net>
This was used to indicate OpenGL's lower-left origin for fragment window
coordinates for polygon stipple and gl_FragCoord.
Now:
- fragment coordinate origin is always upper-left corner
- GL polygon stipple is inverted and shifted before given to gallium
- GL fragment programs that use INPUT[WPOS] are modified to use an
inverted window coord which is placed in a temp register.
Note: the origin_lower_left field still exists in pipe_rasterizer_state.
Remove it when all the drivers, etc. no longer reference it.
This is a check-point commit; not turned on yet.
Use the linear scan register allocation algorithm to re-allocate temporary
registers. This is done by computing the live intervals for registers and
reallocating temps with that information.
For some shaders this dramatically reduces the number of temp registers
needed.
For the time being we give up on a few cases such as relative-indexed temps
and subroutine calls (but we inline most GLSL functions anyway).
Add a module that will manage uploading and coalescing multiple
user-buffers, malloc-buffers and other random data that doesn't
happen to be in a GPU buffer already. The module stuffs multiple
little uploads into larger GPU buffers to reduce create/destroy
overheads, etc.
This requires upgrading the interface so that the argument to
glXBindTexImageEXT isn't just dropped on the floor. Note that this only
fixes the accelerated path on Intel, as Mesa's texture format support is
missing x8r8g8b8 support (right now, GL_RGB textures get uploaded as a8r8gb8,
but in this case we're not doing the upload so we can't really work around it
that way).
Fixes bugs with compositors trying to use shaders that use alpha channels, on
windows without a valid alpha channel. Bug #19910 and likely others as well.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Initialize the shader's pragma settings before calling the compiler.
Added pragma "Ignore" fields to allow overriding the #pragma directives found
in shader source code.
It turns out some tests are sensitive to rounding vs. truncating when
converting float color values to integers in glReadPixels(). In particular,
this matters when the destination format is 5/6/5 or 4/4/4/4, etc.
This function is called from many OS-dependent versions of MakeCurrent.
Move the check for multithreading to this central location to avoid
having to make this check from all the callers.
Flush if we change context.
Also reinstate the old optimization of doing nothing if
nothing changes.
Signed-off-by: Thomas Hellstrom <thellstrom-at-vmware-dot-com>
drm_api is a set of hooks used by the dri2 state tracker, this wraps our
dri1 code around the same set of hooks.
Currently the dri2 build will produce nouveau_dri2.so which you'll need
to install as nouveau_dri.so if you wish to try it. The dri2 state
tracker doesn't make it easy for a driver to support both paths in the
same binary.
The MAX-based function can produce values that are non-monotonic for a span
which causes glitches in texture filtering. The sqrt-based one avoids that.
This is perhaps slightly slower than before, but the difference
probably isn't noticable given we're doing software mipmap filtering.
Issue reported by Nir Radian <nirr@horizonsemi.com>
1) Don't allow multiple threads sharing current context,
even if they are mutex protected.
2) Remove all XLockDisplay(), XUnLockDisplay() calls, as they were
only workarounds for xcb.
Signed-off-by: Thomas Hellstrom <thellstrom-at-vmware-dot-com>
The draw module provides a similar interface to the driver which
is retained as various bits of hardware may be able to take on
incremental parts of the vertex pipeline. However, there's no
need to advertise all this complexity to the state tracker.
There are basically two modes now - normal and passthrough/screen-coords.
Any driver who needs a copy of the shader tokens must organize to
do so itself. This has been the case for a long time, but there
was still defensive code in the state tracker, which is now removed.
Any bugs resulting from this need to be fixed in the offending driver...
It is not the state tracker's responsibilty to inject sleeps and
pessimize performance in the hope of avoiding buffer synchronization
issues in buggy drivers.
Calling finish() here will just hide problems that need to be fixed
elsewhere.
2009-03-13 15:53:48 +00:00
985 changed files with 44275 additions and 16555 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.