Compare commits

...

19 Commits

Author SHA1 Message Date
Ian Romanick
7e1d9b3dfc mesa: Bump version to 7.11-rc3 2011-07-25 19:47:49 -07:00
Ian Romanick
929b3cc9b5 mesa: Use --dereference to avoid symlinks in tarballs 2011-07-25 19:47:38 -07:00
Eric Anholt
7e4e5b0b75 i965/fs: Fix MRT drawing since the m0->m2 move for shader debug.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 3daa2d97eb)
2011-07-25 19:43:20 -07:00
Eric Anholt
ba7db857c2 i965: Fix many of the trivial WebGL demos that broke due to IB optimization.
The index buffer state emit only occurred if there was an IB in place
and we were in either a new batch or a new IB state.  But because we
only flagged new IB state if IB state changed from the last IB state
we calculated, we could simply never emit IB state after batchbuffer
wraps if the first draw didn't use the IB and we didn't actually
change the IB.

Fixes piglit glx-multi-context-ib-1.
(cherry picked from commit 818db3848b)
2011-07-25 18:58:25 -07:00
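The state-tracking problem described in ba7db857c2 can be reproduced in a toy model. The sketch below is an assumption-laden illustration, not the driver's code: IbTracker, start_batch, draw and emits_after_wrap are hypothetical names, and the "reset the cached IB type when a new batch starts" behaviour is modelled by a flag so both variants can be compared.

```cpp
#include <cassert>

// Toy model of the index buffer (IB) state tracking described above.
// Every name here is hypothetical; this is not the i965 driver's code.
struct IbTracker {
    bool reset_on_new_batch;   // true models the fixed behaviour
    int cached_type = -1;      // last IB type we calculated, -1 = none
    bool new_batch = true;     // set when a fresh batchbuffer starts
    int emits = 0;             // number of times IB state was emitted

    explicit IbTracker(bool reset) : reset_on_new_batch(reset) {}

    void start_batch() {
        new_batch = true;
        if (reset_on_new_batch)
            cached_type = -1;  // forget the cached IB so the next one counts as new
    }

    void draw(bool has_ib, int type) {
        assert(type >= 0);
        bool new_ib_state = false;
        if (has_ib && type != cached_type) {
            cached_type = type;
            new_ib_state = true;
        }
        // Emit only if an IB is in place and the batch or the IB state is new.
        if (has_ib && (new_batch || new_ib_state))
            ++emits;
        new_batch = false;     // the new-batch flag is consumed by the first draw
    }
};

// Draw with an IB, wrap the batch, draw without an IB, then reuse the same IB.
inline int emits_after_wrap(bool reset_on_new_batch) {
    IbTracker t(reset_on_new_batch);
    t.draw(true, 1);
    t.start_batch();
    t.draw(false, 0);
    t.draw(true, 1);
    return t.emits;
}
```

In this model emits_after_wrap(false) emits IB state only once, so the last draw runs in the new batch without any IB state, while emits_after_wrap(true) emits it again after the wrap.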
Eric Anholt
35bc35a70c i965: Emit texture cache flushes on gen6 along with render cache flushes.
It turns out that internally the texture cache gets flushed in a
couple of cases, particularly around 2D operations mixed with 3D.  In
almost all cases one of those happens between rendering to an
FBO-attached texture and rendering from that texture.  However, as of
the next patch, glean tfbo (and the new fbo-flushing-2 test) would
manage to get stale texture values because one of those flushes didn't
occur.  The intention of this code was always to get the render cache
cleared and ready to be used from the sampler cache (and it does on <=
gen4), so this just catches gen5 up.

This patch was also tested to fix fbo-flushing on gen7.
(cherry picked from commit 185868c9c2)
2011-07-25 18:58:25 -07:00
Paul Berry
dd3bb73153 i965: vs optimization fix: Check val.{negate,abs} in accumulator_contains()
When emitting a MAC instruction in a vertex shader, brw_vs_emit()
calls accumulator_contains() to determine whether the accumulator
already contains the appropriate addend; if it does, then we can avoid
emitting an unnecessary MOV instruction.

However, accumulator_contains() wasn't checking the val.negate or
val.abs flags.  As a result, if the desired value was the negation, or
the absolute value, of what was already in the accumulator, we would
generate an incorrect shader.

Fixes piglit test vs-refract-vec4-vec4-float.

Tested on Gen5 and Gen6.

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit d92463d5dc)
2011-07-25 18:58:24 -07:00
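Why the missing negate/abs check in dd3bb73153 produces a wrong result can be shown with a small simulation. This is a sketch under stated assumptions: Operand, read_operand, accumulator_holds and emit_mac are hypothetical names, the accumulator is assumed to mirror one register, and source modifiers are assumed to apply absolute value before negation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical operand: a register number plus source modifiers.
struct Operand { int reg; bool negate; bool abs; };

// Apply the modifiers (assumed order: absolute value first, then negation).
inline float read_operand(const float* regs, Operand o) {
    float v = regs[o.reg];
    if (o.abs) v = std::fabs(v);
    if (o.negate) v = -v;
    return v;
}

// Does the accumulator (mirroring regs[acc_reg]) already hold `val`?
// With check_modifiers == false this models the old, buggy test.
inline bool accumulator_holds(int acc_reg, Operand val, bool check_modifiers) {
    if (check_modifiers && (val.negate || val.abs))
        return false;
    return val.reg == acc_reg;
}

// Computes addend + a * b, skipping the MOV into the accumulator when the
// accumulator is believed to hold the addend already.
inline float emit_mac(const float* regs, int acc_reg, Operand addend,
                      Operand a, Operand b, bool check_modifiers) {
    assert(acc_reg >= 0);
    float acc = regs[acc_reg];
    if (!accumulator_holds(acc_reg, addend, check_modifiers))
        acc = read_operand(regs, addend);  // the MOV that must not be skipped
    return acc + read_operand(regs, a) * read_operand(regs, b);
}
```

With registers {2, 3, 4} and an addend of -r0, the checked variant computes -2 + 3 * 4, while the unchecked variant reuses the accumulator and computes 2 + 3 * 4.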
Kenneth Graunke
0167c85562 i965/gen7: Fix shadow sampling in the old brw_wm_emit backend.
On Ivybridge, the shadow comparitor goes in the first slot, rather than
at the end.  It's not necessary to send u, v, and r.

Fixes tests texturing/texdepth and glean/fbo.

NOTE: This is a candidate for the 7.11 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 572f631895)
2011-07-25 18:58:24 -07:00
Kenneth Graunke
8ea7989f18 i965/fs: Clear result before visiting shadow comparitor and LOD info.
Commit 53c89c67f3 ("i965: Avoid generating
MOVs for assignments of expressions.") added the line "this->result =
reg_undef" all over the code.  Unfortunately, since Eric developed his
patch before I landed Ivybridge support, he missed adding it to
fs_visitor::emit_texture_gen7() after rebasing.

Furthermore, since I developed TXD support before Eric's patch, I
neglected to add it to the gradient handling when I rebased.

Neglecting to set this causes the visitor to use this->result as storage
rather than generating a new temporary.  These missing statements
resulted in the same register being used to store several different
values.

Fixes the following piglit tests on Ivybridge:
- glsl-fs-shadow2dproj.shader_test
- glsl-fs-shadow2dproj-bias.shader_test

NOTE: This is a candidate for the 7.11 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 156cef0fba)
2011-07-25 18:58:24 -07:00
Ian Romanick
317389f601 glsl: Treat ir_dereference_array of non-var as a constant for lowering
Previously the code would just look at deref->array->type to see if it
was a constant.  This isn't good enough because deref->array might be
another ir_dereference_array... of a constant.  As a result,
deref->array->type wouldn't be a constant, but
deref->variable_referenced() would return NULL.  The unchecked NULL
pointer would shortly lead to a segfault.

Instead just look at the return of deref->variable_referenced().  If
it's NULL, assume that either a constant or some other form of
anonymous temporary storage is being dereferenced.

This is a bit hinkey because most drivers treat constant arrays as
uniforms, but the lowering pass treats them as temporaries.  This
keeps the behavior of the old code, so this change isn't making things
worse.

Fixes i965 piglit:

    vs-temp-array-mat[234]-index-col-rd
    vs-temp-array-mat[234]-index-col-row-rd
    vs-uniform-array-mat[234]-index-col-rd
    vs-uniform-array-mat[234]-index-col-row-rd

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 156f85336f)
2011-07-25 18:58:24 -07:00
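The NULL check introduced by 317389f601 can be sketched with a stripped-down IR. Everything below is a hypothetical stand-in (Node, variable_referenced and storage_needs_lowering only borrow the names used in the commit message; the real classes are far richer), and the uniform/temporary split is reduced to one boolean.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-in for the GLSL IR.
struct Node {
    enum Kind { Variable, Constant, ArrayDeref } kind;
    const Node* array;   // base of an ArrayDeref, otherwise nullptr
    bool is_uniform;     // only meaningful for Variable
};

// Walk through nested array dereferences to the variable at the bottom.
// Returns nullptr when the chain bottoms out in something else (a constant).
inline const Node* variable_referenced(const Node* n) {
    while (n->kind == Node::ArrayDeref) {
        assert(n->array != nullptr);
        n = n->array;
    }
    return n->kind == Node::Variable ? n : nullptr;
}

inline bool storage_needs_lowering(const Node* deref, bool lower_temps,
                                   bool lower_uniforms) {
    assert(deref->kind == Node::ArrayDeref);
    const Node* var = variable_referenced(deref->array);
    if (var == nullptr)
        return lower_temps;  // constant or anonymous temporary storage
    return var->is_uniform ? lower_uniforms : lower_temps;
}
```

A dereference of a dereference of a constant has a base whose kind is ArrayDeref, not Constant, so a check on the immediate base alone misses it; asking for the referenced variable and testing for null covers every nesting depth.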
Ian Romanick
01a94f72e9 i965: When emitting a src/dst read of an output, keep the swizzle and neg
Fixes i965 piglit vs-varying-array-mat[234]-row-rd.

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 1d3f09f159)
2011-07-25 18:58:24 -07:00
Ian Romanick
c97a20f3ef i965: When emitting a src/dst write of an output, keep the write mask
Fixes i965 piglit:

    vs-varying-array-mat[234]-col-row-wr
    vs-varying-array-mat[234]-index-col-row-wr
    vs-varying-array-mat[234]-index-row-wr
    vs-varying-array-mat[234]-row-wr
    vs-varying-mat[234]-col-row-wr
    vs-varying-mat[234]-row-wr

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 337e2dfad0)
2011-07-25 18:58:24 -07:00
Ian Romanick
716bcfd24d prog_optimize: Set unused regs to PROGRAM_UNDEFINED after CMP->MOV conversion
Leaving the unused registers with other values caused assertion
failures and other problems in places that blindly iterate over all
sources.

brw_vs_emit.c:1381: get_src_reg: Assertion `c->regs[file][index].nr !=
0' failed.

Fixes i965 piglit:

    vs-uniform-array-mat[234]-col-row-rd
    vs-uniform-array-mat[234]-index-col-row-rd
    vs-uniform-array-mat[234]-index-row-rd
    vs-uniform-mat[234]-col-row-rd

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit fbeb68e880)
2011-07-25 18:58:23 -07:00
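The effect of 716bcfd24d on code that blindly iterates over all sources can be illustrated as follows. This is a minimal sketch with hypothetical types (RegFile, Opcode, Src, Inst) and a flag that switches between the old and the new behaviour; it does not use Mesa's real structures.

```cpp
#include <cassert>

// Hypothetical miniature of a three-source program instruction.
enum class RegFile { Undefined, Temporary, Uniform };
enum class Opcode { Cmp, Mov };

struct Src { RegFile file; int index; };
struct Inst { Opcode opcode; Src src[3]; };

// Convert CMP to MOV by keeping source 1. With clear_unused == false the
// stale sources 1 and 2 are left behind, which models the old behaviour.
inline void cmp_to_mov(Inst& inst, bool clear_unused) {
    assert(inst.opcode == Opcode::Cmp);
    inst.opcode = Opcode::Mov;
    inst.src[0] = inst.src[1];
    if (clear_unused) {
        inst.src[1].file = RegFile::Undefined;
        inst.src[2].file = RegFile::Undefined;
    }
}

// Counts the sources that a consumer iterating over all three slots
// would treat as live.
inline int live_sources(const Inst& inst) {
    int n = 0;
    for (const Src& s : inst.src)
        if (s.file != RegFile::Undefined)
            ++n;
    return n;
}
```

After the conversion a MOV should expose exactly one live source; without the clearing step the consumer still sees three and may try to look up registers that were never allocated.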
Ian Romanick
a75aaaaa09 ir_to_mesa: Copy reladdr in src_reg(dst_reg) constructor
Fixes i965 piglit:

    vs-temp-array-mat[234]-col-row-wr
    vs-temp-array-mat[234]-index-col-row-wr
    vs-temp-array-mat[234]-index-row-wr
    vs-temp-mat[234]-col-row-wr

Fixes swrast piglit:

    fs-temp-array-mat[234]-col-row-wr
    fs-temp-array-mat[234]-index-col-row-wr
    fs-temp-array-mat[234]-index-row-wr
    fs-temp-mat[234]-col-row-wr
    vs-temp-array-mat[234]-col-row-wr
    vs-temp-array-mat[234]-index-col-row-wr
    vs-temp-array-mat[234]-index-row-wr
    vs-temp-mat[234]-col-row-wr

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit f7cd9a858c)
2011-07-25 18:58:23 -07:00
Ian Romanick
3d0bb72795 ir_to_mesa: Add each relative address to the previous
This fixes many cases of accessing arrays of matrices using
non-constant indices at each level.

Fixes i965 piglit:

    vs-temp-array-mat[234]-index-col-rd
    vs-temp-array-mat[234]-index-col-row-rd
    vs-temp-array-mat[234]-index-col-wr
    vs-uniform-array-mat[234]-index-col-rd

Fixes swrast piglit:

    fs-temp-array-mat[234]-index-col-rd
    fs-temp-array-mat[234]-index-col-row-rd
    fs-temp-array-mat[234]-index-col-wr
    fs-uniform-array-mat[234]-index-col-rd
    fs-uniform-array-mat[234]-index-col-row-rd
    fs-varying-array-mat[234]-index-col-rd
    fs-varying-array-mat[234]-index-col-row-rd
    vs-temp-array-mat[234]-index-col-rd
    vs-temp-array-mat[234]-index-col-row-rd
    vs-temp-array-mat[234]-index-col-wr
    vs-uniform-array-mat[234]-index-col-rd
    vs-uniform-array-mat[234]-index-col-row-rd
    vs-varying-array-mat[234]-index-col-rd
    vs-varying-array-mat[234]-index-col-row-rd
    vs-varying-array-mat[234]-index-col-wr

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit d6e1a8f714)
2011-07-25 18:58:23 -07:00
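The accumulation of relative addresses in 3d0bb72795 amounts to summing the scaled index of every dereference level. The sketch below assumes a hypothetical Level record (index plus element size in registers) and a flag that switches between overwriting the previous relative address and adding to it; it is an illustration, not the ir_to_mesa code.

```cpp
#include <cassert>
#include <vector>

// One level of array dereference: the runtime index and the size, in
// registers, of one element at that level. Hypothetical type.
struct Level { int index; int element_size; };

// Walk the dereference chain from the outermost array inwards.
// accumulate == false models the old behaviour, where each level's
// relative address replaced the previous one.
inline int relative_offset(const std::vector<Level>& levels, bool accumulate) {
    int reladdr = 0;
    bool have_reladdr = false;
    for (const Level& l : levels) {
        assert(l.element_size > 0);
        int index_reg = l.index * l.element_size;
        if (accumulate && have_reladdr)
            index_reg += reladdr;  // add the new offset to the old one
        reladdr = index_reg;
        have_reladdr = true;
    }
    return reladdr;
}
```

For m[1][2] on an array of mat4, assuming one register per column, the levels are {1, 4} and {2, 1}: accumulating yields register offset 6, whereas overwriting yields 2 and addresses a column of the wrong matrix.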
Ian Romanick
e4b60bf38c glsl: When lowering non-constant vector indexing, respect existing conditions
If the non-constant index was in the LHS of an assignment, any
existing condition on that assignment would be lost.

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 601428d2bb)
2011-07-25 18:58:23 -07:00
Ian Romanick
47fc05cf46 glsl: When lowering non-constant array indexing, respect existing conditions
If the non-constant index was in the LHS of an assignment, any
existing condition on that assignment would be lost.

Fixes i965 piglit:

    fs-temp-array-mat[234]-col-row-wr
    fs-temp-array-mat[234]-index-col-row-wr
    fs-temp-array-mat[234]-index-col-wr
    fs-temp-array-mat[234]-index-row-wr
    vs-varying-array-mat[234]-index-col-wr

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 5f83dfe5b7)
2011-07-25 18:58:23 -07:00
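What "respecting the existing condition" means for a lowered write can be sketched in plain C++. The function below is a hypothetical model: the loop stands in for the unrolled sequence of conditional assignments the lowering pass generates, and respect_condition switches between the old behaviour (condition dropped) and the new one (assignments wrapped in an if on the original condition).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Models `if (original_condition) a[i] = value;` after lowering: one
// conditional assignment per element, comparing i against each constant
// index. The loop stands in for the unrolled sequence.
template <std::size_t N>
void lowered_store(std::array<float, N>& a, std::size_t i, float value,
                   bool original_condition, bool respect_condition) {
    assert(i < N);
    if (respect_condition && !original_condition)
        return;  // the wrapping if-statement skips every conditional assignment
    for (std::size_t k = 0; k < N; ++k)
        if (i == k)
            a[k] = value;
}
```

With the condition false, the respecting variant leaves the array untouched, while the variant that drops the condition performs the store anyway.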
Ian Romanick
45769e5d80 glsl: Rework lowering of non-constant array indexing
The previous implementation could easily get tricked if the LHS of an
assignment included a non-constant index that was "inside" another
dereference.  For example:

    mat4 m[2];
    m[0][i] = vec4(0.0);

Due to the way it tracked whether the array was being assigned, it
would think that the non-constant index was in an r-value.  The new
code fixes that by tracking l-values and r-values differently.  The
index is also replaced by cloning the IR and replacing the index
variable instead of the odd way it was done before.

v2: Apply some simplifications suggested by Eric Anholt.  Making
assignment_generator::rvalue be ir_dereference instead of ir_rvalue
simplified the code a bit.

Fixes i965 piglit fs-temp-array-mat[234]-index-wr and
vs-varying-array-mat[234]-index-wr.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34691
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 1731ac3086)

To make bisects work, this also squashes in:

glsl: Correctly return progress from lower_variable_index_to_cond_assign

lower_variable_index_to_cond_assign runs until it can't make any more
progress.  It then returns the result of the last pass which will
always be false.  This caused the lowering loop in
_mesa_ir_link_shader to end before doing one last round of
lower_if_to_cond_assign.  This caused several if-statements (resulting
from lower_variable_index_to_cond_assign) to be left in the IR.

In addition to this change, lower_variable_index_to_cond_assign should
take a flag indicating whether or not it should even generate
if-statements.  This is easily controlled by
switch_generator::linear_sequence_max_length.  This would generate
much better code on architectures without any flow control.

Fixes i915 piglit regressions glsl-texcoord-array and
glsl-fs-vec4-indexing-temp-src.

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit c1e591eed4)
2011-07-25 18:57:58 -07:00
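The progress-reporting bug squashed into 45769e5d80 is a general pitfall of run-until-fixed-point passes. A minimal sketch follows, assuming a hypothetical run_to_fixed_point helper that takes the pass as a callable; return_last_result selects the old behaviour.

```cpp
#include <cassert>
#include <functional>

// Runs `pass` until it reports no progress. The final iteration always
// reports false, so returning that value (return_last_result == true)
// hides any progress made by earlier iterations.
inline bool run_to_fixed_point(const std::function<bool()>& pass,
                               bool return_last_result) {
    assert(static_cast<bool>(pass));  // the callable must not be empty
    bool progress_ever = false;
    bool progress = false;
    do {
        progress = pass();
        progress_ever = progress || progress_ever;
    } while (progress);
    return return_last_result ? progress : progress_ever;
}
```

A caller that loops over several lowering passes while any of them reports progress will stop too early if it is handed the last result instead of the accumulated one.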
Ian Romanick
928137b099 glsl: Split out part of variable_index_to_cond_assign_visitor::needs_lowering
Other code will soon need to know if an array needs lowering based
exclusively on the storage mode.

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit d2296e784a)
2011-07-25 18:57:58 -07:00
Ian Romanick
26adbcaeb5 glsl: Move is_array_or_matrix outside visitor class
There's no reason for it to be there, and another class that may not
have access to the visitor will need it soon.

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 8d5f3cef79)
2011-07-25 18:57:58 -07:00
12 changed files with 260 additions and 54 deletions

View File

@@ -183,7 +183,7 @@ ultrix-gcc:
 # Rules for making release tarballs
-VERSION=7.11-rc2
+VERSION=7.11-rc3
 DIRECTORY = Mesa-$(VERSION)
 LIB_NAME = MesaLib-$(VERSION)
 GLUT_NAME = MesaGLUT-$(VERSION)
@@ -481,13 +481,13 @@ rm_config: parsers
 rm -f configs/autoconf
 $(LIB_NAME).tar: rm_config
-cd .. ; tar -cf $(DIRECTORY)/$(LIB_NAME).tar $(LIB_FILES)
+cd .. ; tar --dereference -cf $(DIRECTORY)/$(LIB_NAME).tar $(LIB_FILES)
 $(LIB_NAME).tar.gz: $(LIB_NAME).tar
 gzip --stdout --best $(LIB_NAME).tar > $(LIB_NAME).tar.gz
 $(GLUT_NAME).tar:
-cd .. ; tar -cf $(DIRECTORY)/$(GLUT_NAME).tar $(GLUT_FILES)
+cd .. ; tar --dereference -cf $(DIRECTORY)/$(GLUT_NAME).tar $(GLUT_FILES)
 $(GLUT_NAME).tar.gz: $(GLUT_NAME).tar
 gzip --stdout --best $(GLUT_NAME).tar > $(GLUT_NAME).tar.gz

View File

@@ -29,6 +29,21 @@
*
* Pre-DX10 GPUs often don't have a native way to do this operation,
* and this works around that.
*
* The lowering process proceeds as follows. Each non-constant index
* found in an r-value is converted to a canonical form \c array[i]. Each
* element of the array is conditionally assigned to a temporary by comparing
* \c i to a constant index. This is done by cloning the canonical form and
* replacing all occurrences of \c i with a constant. Each remaining occurrence
* of the canonical form in the IR is replaced with a dereference of the
* temporary variable.
*
* L-values with non-constant indices are handled similarly. In this case,
* the RHS of the assignment is assigned to a temporary. The non-constant
* index is replaced with the canonical form (just like for r-values). The
* temporary is conditionally assigned to each element of the canonical form
* by comparing \c i with each index. The same clone-and-replace scheme is
* used.
*/
#include "ir.h"
@@ -37,10 +52,76 @@
#include "glsl_types.h"
#include "main/macros.h"
static inline bool
is_array_or_matrix(const ir_instruction *ir)
{
return (ir->type->is_array() || ir->type->is_matrix());
}
/**
* Replace a dereference of a variable with a specified r-value
*
* Each time a dereference of the specified value is replaced, the r-value
* tree is cloned.
*/
class deref_replacer : public ir_rvalue_visitor {
public:
deref_replacer(const ir_variable *variable_to_replace, ir_rvalue *value)
: variable_to_replace(variable_to_replace), value(value),
progress(false)
{
assert(this->variable_to_replace != NULL);
assert(this->value != NULL);
}
virtual void handle_rvalue(ir_rvalue **rvalue)
{
ir_dereference_variable *const dv = (*rvalue)->as_dereference_variable();
if ((dv != NULL) && (dv->var == this->variable_to_replace)) {
this->progress = true;
*rvalue = this->value->clone(ralloc_parent(*rvalue), NULL);
}
}
const ir_variable *variable_to_replace;
ir_rvalue *value;
bool progress;
};
/**
* Find a variable index dereference of an array in an rvalue tree
*/
class find_variable_index : public ir_hierarchical_visitor {
public:
find_variable_index()
: deref(NULL)
{
/* empty */
}
virtual ir_visitor_status visit_enter(ir_dereference_array *ir)
{
if (is_array_or_matrix(ir->array)
&& (ir->array_index->as_constant() == NULL)) {
this->deref = ir;
return visit_stop;
}
return visit_continue;
}
/**
* First array dereference found in the tree that has a non-constant index.
*/
ir_dereference_array *deref;
};
struct assignment_generator
{
ir_instruction* base_ir;
ir_rvalue* array;
ir_dereference *rvalue;
ir_variable *old_index;
bool is_write;
unsigned int write_mask;
ir_variable* var;
@@ -55,18 +136,23 @@ struct assignment_generator
* underlying variable.
*/
void *mem_ctx = ralloc_parent(base_ir);
ir_dereference *element =
new(mem_ctx) ir_dereference_array(this->array->clone(mem_ctx, NULL),
new(mem_ctx) ir_constant(i));
ir_rvalue *variable = new(mem_ctx) ir_dereference_variable(this->var);
ir_assignment *assignment;
if (is_write) {
assignment = new(mem_ctx) ir_assignment(element, variable, condition,
write_mask);
} else {
assignment = new(mem_ctx) ir_assignment(variable, element, condition);
}
/* Clone the old r-value in its entirety. Then replace any occurrences of
* the old variable index with the new constant index.
*/
ir_dereference *element = this->rvalue->clone(mem_ctx, NULL);
ir_constant *const index = new(mem_ctx) ir_constant(i);
deref_replacer r(this->old_index, index);
element->accept(&r);
assert(r.progress);
/* Generate a conditional assignment to (or from) the constant indexed
* array dereference.
*/
ir_rvalue *variable = new(mem_ctx) ir_dereference_variable(this->var);
ir_assignment *const assignment = (is_write)
? new(mem_ctx) ir_assignment(element, variable, condition, write_mask)
: new(mem_ctx) ir_assignment(variable, element, condition);
list->push_tail(assignment);
}
@@ -233,21 +319,18 @@ public:
bool lower_temps;
bool lower_uniforms;
bool is_array_or_matrix(const ir_instruction *ir) const
bool storage_type_needs_lowering(ir_dereference_array *deref) const
{
return (ir->type->is_array() || ir->type->is_matrix());
}
bool needs_lowering(ir_dereference_array *deref) const
{
if (deref == NULL || deref->array_index->as_constant()
|| !is_array_or_matrix(deref->array))
return false;
if (deref->array->ir_type == ir_type_constant)
/* If a variable isn't eventually the target of this dereference, then
* it must be a constant or some sort of anonymous temporary storage.
*
* FINISHME: Is this correct? Most drivers treat arrays of constants as
* FINISHME: uniforms. It seems like this should do the same.
*/
const ir_variable *const var = deref->array->variable_referenced();
if (var == NULL)
return this->lower_temps;
const ir_variable *const var = deref->array->variable_referenced();
switch (var->mode) {
case ir_var_auto:
case ir_var_temporary:
@@ -267,8 +350,18 @@ public:
return false;
}
bool needs_lowering(ir_dereference_array *deref) const
{
if (deref == NULL || deref->array_index->as_constant()
|| !is_array_or_matrix(deref->array))
return false;
return this->storage_type_needs_lowering(deref);
}
ir_variable *convert_dereference_array(ir_dereference_array *orig_deref,
ir_assignment* orig_assign)
ir_assignment* orig_assign,
ir_dereference *orig_base)
{
assert(is_array_or_matrix(orig_deref->array));
@@ -314,9 +407,12 @@ public:
new(mem_ctx) ir_assignment(lhs, orig_deref->array_index, NULL);
base_ir->insert_before(assign);
orig_deref->array_index = lhs->clone(mem_ctx, NULL);
assignment_generator ag;
ag.array = orig_deref->array;
ag.rvalue = orig_base;
ag.base_ir = base_ir;
ag.old_index = index;
ag.var = var;
if (orig_assign) {
ag.is_write = true;
@@ -327,21 +423,40 @@ public:
switch_generator sg(ag, index, 4, 4);
exec_list list;
sg.generate(0, length, &list);
base_ir->insert_before(&list);
/* If the original assignment has a condition, respect that original
* condition! This is accomplished by wrapping the new conditional
* assignments in an if-statement that uses the original condition.
*/
if ((orig_assign != NULL) && (orig_assign->condition != NULL)) {
/* No need to clone the condition because the IR that it hangs on is
* going to be removed from the instruction sequence.
*/
ir_if *if_stmt = new(mem_ctx) ir_if(orig_assign->condition);
sg.generate(0, length, &if_stmt->then_instructions);
base_ir->insert_before(if_stmt);
} else {
exec_list list;
sg.generate(0, length, &list);
base_ir->insert_before(&list);
}
return var;
}
virtual void handle_rvalue(ir_rvalue **pir)
{
if (this->in_assignee)
return;
if (!*pir)
return;
ir_dereference_array* orig_deref = (*pir)->as_dereference_array();
if (needs_lowering(orig_deref)) {
ir_variable* var = convert_dereference_array(orig_deref, 0);
ir_variable *var =
convert_dereference_array(orig_deref, NULL, orig_deref);
assert(var);
*pir = new(ralloc_parent(base_ir)) ir_dereference_variable(var);
this->progress = true;
@@ -353,10 +468,11 @@ public:
{
ir_rvalue_visitor::visit_leave(ir);
ir_dereference_array *orig_deref = ir->lhs->as_dereference_array();
find_variable_index f;
ir->lhs->accept(&f);
if (needs_lowering(orig_deref)) {
convert_dereference_array(orig_deref, ir);
if ((f.deref != NULL) && storage_type_needs_lowering(f.deref)) {
convert_dereference_array(f.deref, ir, ir->lhs);
ir->remove();
this->progress = true;
}
@@ -377,7 +493,17 @@ lower_variable_index_to_cond_assign(exec_list *instructions,
lower_temp,
lower_uniform);
visit_list_elements(&v, instructions);
/* Continue lowering until no progress is made. If there are multiple
* levels of indirection (e.g., non-constant indexing of array elements and
* matrix columns of an array of matrix), each pass will only lower one
* level of indirection.
*/
bool progress_ever = false;
do {
v.progress = false;
visit_list_elements(&v, instructions);
progress_ever = v.progress || progress_ever;
} while (v.progress);
return v.progress;
return progress_ever;
}

View File

@@ -171,21 +171,23 @@ ir_vec_index_to_cond_assign_visitor::visit_leave(ir_assignment *ir)
assert(orig_deref->array_index->type->base_type == GLSL_TYPE_INT);
exec_list list;
/* Store the index to a temporary to avoid reusing its tree. */
index = new(ir) ir_variable(glsl_type::int_type, "vec_index_tmp_i",
ir_var_temporary);
ir->insert_before(index);
list.push_tail(index);
deref = new(ir) ir_dereference_variable(index);
assign = new(ir) ir_assignment(deref, orig_deref->array_index, NULL);
ir->insert_before(assign);
list.push_tail(assign);
/* Store the RHS to a temporary to avoid reusing its tree. */
var = new(ir) ir_variable(ir->rhs->type, "vec_index_tmp_v",
ir_var_temporary);
ir->insert_before(var);
list.push_tail(var);
deref = new(ir) ir_dereference_variable(var);
assign = new(ir) ir_assignment(deref, ir->rhs, NULL);
ir->insert_before(assign);
list.push_tail(assign);
/* Generate a conditional move of each vector element to the temp. */
for (i = 0; i < orig_deref->array->type->vector_elements; i++) {
@@ -205,8 +207,25 @@ ir_vec_index_to_cond_assign_visitor::visit_leave(ir_assignment *ir)
deref = new(ir) ir_dereference_variable(var);
assign = new(ir) ir_assignment(swizzle, deref, condition);
ir->insert_before(assign);
list.push_tail(assign);
}
/* If the original assignment has a condition, respect that original
* condition! This is accomplished by wrapping the new conditional
* assignments in an if-statement that uses the original condition.
*/
if (ir->condition != NULL) {
/* No need to clone the condition because the IR that it hangs on is
* going to be removed from the instruction sequence.
*/
ir_if *if_stmt = new(mem_ctx) ir_if(ir->condition);
list.move_nodes_to(&if_stmt->then_instructions);
ir->insert_before(if_stmt);
} else {
ir->insert_before(&list);
}
ir->remove();
this->progress = true;

View File

@@ -59,7 +59,8 @@ fs_visitor::generate_fb_write(fs_inst *inst)
 if (inst->target > 0) {
 /* Set the render target index for choosing BLEND_STATE. */
-brw_MOV(p, retype(brw_vec1_reg(BRW_MESSAGE_REGISTER_FILE, 0, 2),
+brw_MOV(p, retype(brw_vec1_reg(BRW_MESSAGE_REGISTER_FILE,
+inst->base_mrf, 2),
 BRW_REGISTER_TYPE_UD),
 brw_imm_ud(inst->target));
 }

View File

@@ -595,9 +595,11 @@ fs_visitor::emit_texture_gen4(ir_texture *ir, fs_reg dst, fs_reg coordinate,
/* gen4's SIMD8 sampler always has the slots for u,v,r present. */
mlen += 3;
} else if (ir->op == ir_txd) {
this->result = reg_undef;
ir->lod_info.grad.dPdx->accept(this);
fs_reg dPdx = this->result;
this->result = reg_undef;
ir->lod_info.grad.dPdy->accept(this);
fs_reg dPdy = this->result;
@@ -778,9 +780,11 @@ fs_visitor::emit_texture_gen5(ir_texture *ir, fs_reg dst, fs_reg coordinate,
inst = emit(FS_OPCODE_TXL, dst);
break;
case ir_txd: {
this->result = reg_undef;
ir->lod_info.grad.dPdx->accept(this);
fs_reg dPdx = this->result;
this->result = reg_undef;
ir->lod_info.grad.dPdy->accept(this);
fs_reg dPdy = this->result;
@@ -842,6 +846,7 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
}
if (ir->shadow_comparitor && ir->op != ir_txd) {
this->result = reg_undef;
ir->shadow_comparitor->accept(this);
emit(BRW_OPCODE_MOV, fs_reg(MRF, base_mrf + mlen), this->result);
mlen += reg_width;
@@ -852,11 +857,13 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
case ir_tex:
break;
case ir_txb:
this->result = reg_undef;
ir->lod_info.bias->accept(this);
emit(BRW_OPCODE_MOV, fs_reg(MRF, base_mrf + mlen), this->result);
mlen += reg_width;
break;
case ir_txl:
this->result = reg_undef;
ir->lod_info.lod->accept(this);
emit(BRW_OPCODE_MOV, fs_reg(MRF, base_mrf + mlen), this->result);
mlen += reg_width;
@@ -865,9 +872,11 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
if (c->dispatch_width == 16)
fail("Gen7 does not support sample_d/sample_d_c in SIMD16 mode.");
this->result = reg_undef;
ir->lod_info.grad.dPdx->accept(this);
fs_reg dPdx = this->result;
this->result = reg_undef;
ir->lod_info.grad.dPdy->accept(this);
fs_reg dPdy = this->result;
@@ -1062,6 +1071,7 @@ fs_visitor::visit(ir_texture *ir)
if (hw_compare_supported) {
inst->shadow_compare = true;
} else {
this->result = reg_undef;
ir->shadow_comparitor->accept(this);
fs_reg ref = this->result;

View File

@@ -1821,6 +1821,9 @@ accumulator_contains(struct brw_vs_compile *c, struct brw_reg val)
 if (val.address_mode != BRW_ADDRESS_DIRECT)
 return GL_FALSE;
+if (val.negate || val.abs)
+return GL_FALSE;
 switch (prev_insn->header.opcode) {
 case BRW_OPCODE_MOV:
 case BRW_OPCODE_MAC:
@@ -1980,9 +1983,22 @@ void brw_vs_emit(struct brw_vs_compile *c )
const struct prog_src_register *src = &inst->SrcReg[i];
index = src->Index;
file = src->File;
if (file == PROGRAM_OUTPUT && c->output_regs[index].used_in_src)
args[i] = c->output_regs[index].reg;
else
if (file == PROGRAM_OUTPUT && c->output_regs[index].used_in_src) {
/* Can't just make get_arg "do the right thing" here because
* other callers of get_arg and get_src_reg don't expect any
* special behavior for the c->output_regs[index].used_in_src
* case.
*/
args[i] = c->output_regs[index].reg;
args[i].dw1.bits.swizzle =
BRW_SWIZZLE4(GET_SWZ(src->Swizzle, 0),
GET_SWZ(src->Swizzle, 1),
GET_SWZ(src->Swizzle, 2),
GET_SWZ(src->Swizzle, 3));
/* Note this is ok for non-swizzle ARB_vp instructions */
args[i].negate = src->Negate ? 1 : 0;
} else
args[i] = get_arg(c, inst, i);
}
@@ -1993,7 +2009,11 @@ void brw_vs_emit(struct brw_vs_compile *c )
index = inst->DstReg.Index;
file = inst->DstReg.File;
if (file == PROGRAM_OUTPUT && c->output_regs[index].used_in_src)
dst = c->output_regs[index].reg;
/* Can't just make get_dst "do the right thing" here because other
* callers of get_dst don't expect any special behavior for the
* c->output_regs[index].used_in_src case.
*/
dst = brw_writemask(c->output_regs[index].reg, inst->DstReg.WriteMask);
else
dst = get_dst(c, inst->DstReg);

View File

@@ -211,6 +211,7 @@ static void brw_new_batch( struct intel_context *intel )
 intel->batch.need_workaround_flush = true;
 brw->vb.nr_current_buffers = 0;
+brw->ib.type = -1;
 /* Mark that the current program cache BO has been used by the GPU.
 * It will be reallocated if we need to put new programs in for the

View File

@@ -1094,9 +1094,16 @@ void emit_tex(struct brw_wm_compile *c,
if (intel->gen < 5 && c->dispatch_width == 8)
nr_texcoords = 3;
/* For shadow comparisons, we have to supply u,v,r. */
if (shadow)
nr_texcoords = 3;
if (shadow) {
if (intel->gen < 7) {
/* For shadow comparisons, we have to supply u,v,r. */
nr_texcoords = 3;
} else {
/* On Ivybridge, the shadow comparitor comes first. Just load it. */
brw_MOV(p, brw_message_reg(cur_mrf), arg[2]);
cur_mrf += mrf_per_channel;
}
}
/* Emit the texcoords. */
for (i = 0; i < nr_texcoords; i++) {
@@ -1113,7 +1120,7 @@ void emit_tex(struct brw_wm_compile *c,
}
/* Fill in the shadow comparison reference value. */
if (shadow) {
if (shadow && intel->gen < 7) {
if (intel->gen >= 5) {
/* Fill in the cube map array index value. */
brw_MOV(p, brw_message_reg(cur_mrf), brw_imm_f(0));

View File

@@ -388,6 +388,7 @@ intel_batchbuffer_emit_mi_flush(struct intel_context *intel)
 OUT_BATCH(PIPE_CONTROL_INSTRUCTION_FLUSH |
 PIPE_CONTROL_WRITE_FLUSH |
 PIPE_CONTROL_DEPTH_CACHE_FLUSH |
+PIPE_CONTROL_TC_FLUSH |
 PIPE_CONTROL_NO_WRITE);
 OUT_BATCH(0); /* write address */
 OUT_BATCH(0); /* write data */

View File

@@ -35,7 +35,7 @@ struct gl_context;
 #define MESA_MAJOR 7
 #define MESA_MINOR 11
 #define MESA_PATCH 0
-#define MESA_VERSION_STRING "7.11-rc2"
+#define MESA_VERSION_STRING "7.11-rc3"
 /* To make version comparison easy */
 #define MESA_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))

View File

@@ -134,7 +134,7 @@ src_reg::src_reg(dst_reg reg)
 this->index = reg.index;
 this->swizzle = SWIZZLE_XYZW;
 this->negate = 0;
-this->reladdr = NULL;
+this->reladdr = reg.reladdr;
 }
 dst_reg::dst_reg(src_reg reg)
@@ -1494,6 +1494,18 @@ ir_to_mesa_visitor::visit(ir_dereference_array *ir)
this->result, src_reg_for_float(element_size));
}
/* If there was already a relative address register involved, add the
* new and the old together to get the new offset.
*/
if (src.reladdr != NULL) {
src_reg accum_reg = get_temp(glsl_type::float_type);
emit(ir, OPCODE_ADD, dst_reg(accum_reg),
index_reg, *src.reladdr);
index_reg = accum_reg;
}
src.reladdr = ralloc(mem_ctx, src_reg);
memcpy(src.reladdr, &index_reg, sizeof(index_reg));
}

View File

@@ -1319,6 +1319,15 @@ _mesa_simplify_cmp(struct gl_program * program)
inst->Opcode = OPCODE_MOV;
inst->SrcReg[0] = inst->SrcReg[1];
/* Unused operands are expected to have the file set to
* PROGRAM_UNDEFINED. This is how _mesa_init_instructions initializes
* all of the sources.
*/
inst->SrcReg[1].File = PROGRAM_UNDEFINED;
inst->SrcReg[1].Swizzle = SWIZZLE_NOOP;
inst->SrcReg[2].File = PROGRAM_UNDEFINED;
inst->SrcReg[2].Swizzle = SWIZZLE_NOOP;
}
}
if (dbg) {