Compare commits

...

17 Commits

Author SHA1 Message Date
Emil Velikov
c29ddc923f Increment version to 10.4.0-rc3
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-28 17:58:26 +00:00
Ilia Mirkin
085de45812 freedreno/ir3: don't pass consts to madsh.m16 in MOD logic
madsh.m16 can't handle a const in src1, make sure to unconst it

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 37fe347542)
2014-11-28 17:29:29 +00:00
Dave Airlie
31c7e6c51d r600g: merge the TXQ and BUFFER constant buffers (v1.1)
We are using 1 more buffer than we have, although in the future the
driver should just end up using one buffer in total probably, this
is a good first step, it merges the txq cube array and buffer info
constants on r600 and evergreen.

This should in theory fix geom shader tests on r600.

v1.1: fix comments from Glenn.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 07ae69753c)

Squashed with commit

r600g: fix fallout from last patch

I accidentally rebased from the wrong machine and missed some
fixes that were on my r600 box.

doh.

this fixes a bunch of geom shader textureSize tests on rv635
from gpu reset to pass.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86760
Reported-by: wolput@onsneteindhoven.nl
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b10ddf962f)

Squashed with commit

r600g: make llvm code compile this time

Actually compiling the code helps make it compile.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 91a827624c)
2014-11-28 17:29:07 +00:00
José Fonseca
2a0290d5f5 st/wgl: Don't export wglGetExtensionsStringARB.
It's not exported by the official opengl32.dll neither.  Applications are
supposed to get it via wglGetProcAddress(), not GetProcAddress().

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit cb009bdd44)
2014-11-28 17:28:28 +00:00
José Fonseca
f77a97f057 mapi/glapi: Fix dll linkage of GLES1 symbols.
This fixes several MSVC warnings like:

  warning C4273: 'glClearColorx' : inconsistent dll linkage

In fact, we should avoid using `declspec(dllexport)` altogether, and use
exclusively the .DEF instead, which gives more precise control of which
symbols must be exported, but all the public GL/GLES headers practically
force us to pick between `declspec(dllexport)` or
`declspec(dllimport)`.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 5fdb6d6839)
2014-11-28 17:28:20 +00:00
José Fonseca
d45c35c3d7 util/u_snprintf: Don't redefine HAVE_STDINT_H as 0.
We now always guarantee availability of stdint.h on MSVC -- if MSVC
doesn't supply one we use our own.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 4b6e93650c)
2014-11-28 17:28:13 +00:00
Emil Velikov
16eaf01a6a nine: the .pc file should not follow mesa version
The version provided by it should be the same as the one
provided/handled by the module. Add the missing tiny version.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: David Heidelberg <david@ixit.cz>
(cherry picked from commit 9b7037a369)
2014-11-26 21:23:59 +00:00
Chris Forbes
6316d415c4 i965/Gen6-7: Do not replace texcoords with point coord if not drawing points
Fixes broken rendering in Windows-based QtQuick2 apps run through Wine.
This library sets all texture units' GL_COORD_REPLACE, leaves point
sprite mode enabled, and then draws a triangle fan.

Will need a slightly different fix for Gen4-5, but I don't have my old
machines in a usable state currently.

V2: - Simplify patch -- the real changes are no longer duplicated across
      the Gen6 and Gen7 atoms.
    - Also don't clobber attr overrides -- which matters on Haswell too,
      and fixes the other half of the problem
    - Fix newly-introduced warnings
V3: - Use BRW_NEW_GEOMETRY_PROGRAM and brw->geometry_program rather than
      core flag and state; keep the state flags in order.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84651
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 0008d0e59e)
2014-11-26 21:23:23 +00:00
Kenneth Graunke
dca88397ca glsl: Make lower_constant_arrays_to_uniforms require dereferences.
Ilia noticed that my lowering pass was converting the constant array
used by textureGatherOffsets' offsets parameter to a uniform.  This
broke textureGather for Nouveau, and is generally a horrible plan,
since it violates the GLSL constraint that offsets must be an
immediate constant.

When I wrote this pass, I neglected to consider whole array assignment.
I figured opt_array_splitting would handle constant indexing, so this
pass was really about fixing variable indexing.

textureGatherOffsets is an example of whole array access that we really
don't want to touch.  Whole array copies don't appear to benefit from
this either - they're most likely initializers for temporary arrays
which are going to be mutated anyway.  Since you're copying, you may
as well copy from immediates, not uniforms.

This patch makes the pass look for ir_dereference_arrays of
ir_constants, rather than looking for any ir_constant directly.
This way, it ignores whole array assignment.

No shader-db changes or Piglit regressions on Haswell.  Some Piglit
tests generate different code (fixing textureGatherOffsets on Nouveau).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 60f011af1a)
2014-11-26 21:23:14 +00:00
Chris Forbes
6c383aaadd mesa: Fix Get(GL_TRANSPOSE_CURRENT_MATRIX_ARB) to transpose
This was just returning the same value as GL_CURRENT_MATRIX_ARB.
Spotted while investigating something else in apitrace.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 2b4fe85f0e)
2014-11-26 21:23:08 +00:00
Chris Forbes
7e47ae3185 glsl: Generate unique names for each const array lowered to uniforms
Uniform names (even for hidden uniforms) are required to be unique; some
parts of the compiler assume they can be looked up by name.

Fixes the piglit test: tests/spec/glsl-1.20/linker/array-initializers-1

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 129178893b)
2014-11-26 21:20:33 +00:00
Chris Forbes
9e94c05936 i965: Handle nested uniform array indexing
When converting a uniform array reference to a pull constant load, the
`reladdr` expression itself may have its own `reladdr`, arbitrarily
deeply. This arises from expressions like:

   a[b[x]]     where a, b are uniform arrays (or lowered const arrays),
               and x is not a constant.

Just iterate the lowering to pull constants until we stop seeing these
nested. For most shaders, there will be only one pass through this loop.

Fixes the piglit test:
tests/spec/glsl-1.20/linker/double-indirect-1.shader_test

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit adefccd12a)
2014-11-26 21:20:11 +00:00
Dave Airlie
4952c49f21 r600g: do all CUBE ALU operations before gradient texture operations (v2.1)
This moves all the CUBE section above the gradients section,
so that the gradient emission happens on one block which
is what sb/hardware expect.

v2: avoid changes to bytecode by using spare temps
v2.1: shame gcc, oh the shame. (uninit var warnings)

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c88385603a)
2014-11-26 21:20:05 +00:00
Dave Airlie
013eba0ec1 r600: fix texture gradients instruction emission (v2)
The piglit tests were failing, and it appeared to be SB
optimising out things, but Glenn pointed out the gradients
are meant to be clause local, so we should emit the texture
instructions in the same clause. This moves things around
to always copy to a temp and then emit the texture clauses
for H/V.

v2: Glenn pointed out we could get another ALU fetch in
the wrong place, so load the src gpr earlier as well.

Fixes at least:
./bin/tex-miplevel-selection textureGrad 2D

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 38ec184419)
2014-11-26 21:19:56 +00:00
Ilia Mirkin
db9a6b96ab nv50,nvc0: buffer resources can be bound as other things down the line
res->bind is not an indicator of how the resource is currently bound.
buffers can be rebound across different binding points without changing
underlying storage.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fecae4625c)
2014-11-24 00:55:28 +00:00
Ilia Mirkin
4d9c0445dd nv50,nvc0: actually check constbufs for invalidation
The number of vertex buffers has nothing to do with the number of bound
constbufs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e80a0a7d9a)
2014-11-24 00:55:22 +00:00
Ilia Mirkin
1a8f90dc70 nv50/ir: set neg modifiers on min/max args
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=86618
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7d07083cfd)
2014-11-24 00:55:17 +00:00
20 changed files with 265 additions and 200 deletions

View File

@@ -1 +1 @@
10.4.0-rc2
10.4.0-rc3

View File

@@ -2085,6 +2085,8 @@ AM_CONDITIONAL(HAVE_SPARC_ASM, test "x$asm_arch" = xsparc)
AC_SUBST([NINE_MAJOR], 1)
AC_SUBST([NINE_MINOR], 0)
AC_SUBST([NINE_TINY], 0)
AC_SUBST([NINE_VERSION], "$NINE_MAJOR.$NINE_MINOR.$NINE_TINY")
AC_SUBST([VDPAU_MAJOR], 1)
AC_SUBST([VDPAU_MINOR], 0)

View File

@@ -176,7 +176,7 @@
#define HAVE_ASPRINTF 1 /* not needed */
#define HAVE_STDARG_H 1
#define HAVE_STDDEF_H 1
#define HAVE_STDINT_H 0
#define HAVE_STDINT_H 1
#define HAVE_STDLIB_H 1
#define HAVE_INTTYPES_H 0
#define HAVE_LOCALE_H 0

View File

@@ -2350,6 +2350,9 @@ trans_idiv(const struct instr_translater *t,
if (t->tgsi_opc == TGSI_OPCODE_MOD || t->tgsi_opc == TGSI_OPCODE_UMOD) {
/* The division result will have ended up in q. */
if (is_rel_or_const(b))
b = get_unconst(ctx, b);
/* mull.u r, q, b */
instr = instr_create(ctx, 2, OPC_MULL_U);
vectorize(ctx, instr, &r_dst, 2, q_src, 0, b, 0);

View File

@@ -924,7 +924,9 @@ CodeEmitterNV50::emitMINMAX(const Instruction *i)
break;
}
code[1] |= i->src(0).mod.abs() << 20;
code[1] |= i->src(0).mod.neg() << 26;
code[1] |= i->src(1).mod.abs() << 19;
code[1] |= i->src(1).mod.neg() << 27;
}
emitForm_MAD(i);
}

View File

@@ -180,7 +180,12 @@ nv50_invalidate_resource_storage(struct nouveau_context *ctx,
}
}
if (res->bind & PIPE_BIND_VERTEX_BUFFER) {
if (res->bind & (PIPE_BIND_VERTEX_BUFFER |
PIPE_BIND_INDEX_BUFFER |
PIPE_BIND_CONSTANT_BUFFER |
PIPE_BIND_STREAM_OUTPUT |
PIPE_BIND_SAMPLER_VIEW)) {
assert(nv50->num_vtxbufs <= PIPE_MAX_ATTRIBS);
for (i = 0; i < nv50->num_vtxbufs; ++i) {
if (nv50->vtxbuf[i].buffer == res) {
@@ -190,14 +195,11 @@ nv50_invalidate_resource_storage(struct nouveau_context *ctx,
return ref;
}
}
}
if (res->bind & PIPE_BIND_INDEX_BUFFER) {
if (nv50->idxbuf.buffer == res)
if (!--ref)
return ref;
}
if (res->bind & PIPE_BIND_SAMPLER_VIEW) {
for (s = 0; s < 3; ++s) {
assert(nv50->num_textures[s] <= PIPE_MAX_SAMPLERS);
for (i = 0; i < nv50->num_textures[s]; ++i) {
@@ -210,12 +212,11 @@ nv50_invalidate_resource_storage(struct nouveau_context *ctx,
}
}
}
}
if (res->bind & PIPE_BIND_CONSTANT_BUFFER) {
for (s = 0; s < 3; ++s) {
assert(nv50->num_vtxbufs <= NV50_MAX_PIPE_CONSTBUFS);
for (i = 0; i < nv50->num_vtxbufs; ++i) {
for (i = 0; i < NV50_MAX_PIPE_CONSTBUFS; ++i) {
if (!(nv50->constbuf_valid[s] & (1 << i)))
continue;
if (!nv50->constbuf[s][i].user &&
nv50->constbuf[s][i].u.buf == res) {
nv50->dirty |= NV50_NEW_CONSTBUF;

View File

@@ -196,7 +196,12 @@ nvc0_invalidate_resource_storage(struct nouveau_context *ctx,
}
}
if (res->bind & PIPE_BIND_VERTEX_BUFFER) {
if (res->bind & (PIPE_BIND_VERTEX_BUFFER |
PIPE_BIND_INDEX_BUFFER |
PIPE_BIND_CONSTANT_BUFFER |
PIPE_BIND_STREAM_OUTPUT |
PIPE_BIND_COMMAND_ARGS_BUFFER |
PIPE_BIND_SAMPLER_VIEW)) {
for (i = 0; i < nvc0->num_vtxbufs; ++i) {
if (nvc0->vtxbuf[i].buffer == res) {
nvc0->dirty |= NVC0_NEW_ARRAYS;
@@ -205,17 +210,14 @@ nvc0_invalidate_resource_storage(struct nouveau_context *ctx,
return ref;
}
}
}
if (res->bind & PIPE_BIND_INDEX_BUFFER) {
if (nvc0->idxbuf.buffer == res) {
nvc0->dirty |= NVC0_NEW_IDXBUF;
nouveau_bufctx_reset(nvc0->bufctx_3d, NVC0_BIND_IDX);
if (!--ref)
return ref;
}
}
if (res->bind & PIPE_BIND_SAMPLER_VIEW) {
for (s = 0; s < 5; ++s) {
for (i = 0; i < nvc0->num_textures[s]; ++i) {
if (nvc0->textures[s][i] &&
@@ -228,11 +230,11 @@ nvc0_invalidate_resource_storage(struct nouveau_context *ctx,
}
}
}
}
if (res->bind & PIPE_BIND_CONSTANT_BUFFER) {
for (s = 0; s < 5; ++s) {
for (i = 0; i < nvc0->num_vtxbufs; ++i) {
for (i = 0; i < NVC0_MAX_PIPE_CONSTBUFS; ++i) {
if (!(nvc0->constbuf_valid[s] & (1 << i)))
continue;
if (!nvc0->constbuf[s][i].user &&
nvc0->constbuf[s][i].u.buf == res) {
nvc0->dirty |= NVC0_NEW_CONSTBUF;

View File

@@ -23,7 +23,6 @@
#define CONSTANT_BUFFER_0_ADDR_SPACE 8
#define CONSTANT_BUFFER_1_ADDR_SPACE (CONSTANT_BUFFER_0_ADDR_SPACE + R600_UCP_CONST_BUFFER)
#define CONSTANT_TXQ_BUFFER (CONSTANT_BUFFER_0_ADDR_SPACE + R600_TXQ_CONST_BUFFER)
#define LLVM_R600_BUFFER_INFO_CONST_BUFFER \
(CONSTANT_BUFFER_0_ADDR_SPACE + R600_BUFFER_INFO_CONST_BUFFER)
@@ -690,7 +689,7 @@ static void llvm_emit_tex(
if (emit_data->inst->Dst[0].Register.WriteMask & 4) {
LLVMValueRef offset = lp_build_const_int32(bld_base->base.gallivm, 0);
LLVMValueRef ZLayer = LLVMBuildExtractElement(gallivm->builder,
llvm_load_const_buffer(bld_base, offset, CONSTANT_TXQ_BUFFER),
llvm_load_const_buffer(bld_base, offset, LLVM_R600_BUFFER_INFO_CONST_BUFFER),
lp_build_const_int32(gallivm, 0), "");
emit_data->output[0] = LLVMBuildInsertElement(gallivm->builder, emit_data->output[0], ZLayer, lp_build_const_int32(gallivm, 2), "");

View File

@@ -44,20 +44,19 @@
#define R600_TRACE_CS_DWORDS 7
#define R600_MAX_USER_CONST_BUFFERS 13
#define R600_MAX_DRIVER_CONST_BUFFERS 4
#define R600_MAX_DRIVER_CONST_BUFFERS 3
#define R600_MAX_CONST_BUFFERS (R600_MAX_USER_CONST_BUFFERS + R600_MAX_DRIVER_CONST_BUFFERS)
/* start driver buffers after user buffers */
#define R600_UCP_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS)
#define R600_TXQ_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS + 1)
#define R600_BUFFER_INFO_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS + 2)
#define R600_GS_RING_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS + 3)
/* Currently R600_MAX_CONST_BUFFERS is too large, the hardware only has 16 buffers, but the driver is
* trying to use 17. Avoid accidentally aliasing with user UBOs for SAMPLE_POSITIONS by using an id<16.
#define R600_BUFFER_INFO_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS + 1)
#define R600_GS_RING_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS + 2)
/* Currently R600_MAX_CONST_BUFFERS just fits on the hw, which has a limit
* of 16 const buffers.
* UCP/SAMPLE_POSITIONS are never accessed by same shader stage so they can use the same id.
*
* Fixing this properly would require the driver to combine its buffers into a single hardware buffer,
* which would also allow supporting the d3d 11 mandated minimum of 15 user const buffers.
* In order to support d3d 11 mandated minimum of 15 user const buffers
* we'd have to squash all use cases into one driver buffer.
*/
#define R600_SAMPLE_POSITIONS_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS)
@@ -316,7 +315,6 @@ struct r600_samplerview_state {
uint32_t dirty_mask;
uint32_t compressed_depthtex_mask; /* which textures are depth */
uint32_t compressed_colortex_mask;
boolean dirty_txq_constants;
boolean dirty_buffer_constants;
};

View File

@@ -5035,8 +5035,9 @@ static int r600_do_buffer_txq(struct r600_shader_ctx *ctx)
alu.op = ALU_OP1_MOV;
if (ctx->bc->chip_class >= EVERGREEN) {
alu.src[0].sel = 512 + (id / 4);
alu.src[0].chan = id % 4;
/* channel 0 or 2 of each word */
alu.src[0].sel = 512 + (id / 2);
alu.src[0].chan = (id % 2) * 2;
} else {
/* r600 we have them at channel 2 of the second dword */
alu.src[0].sel = 512 + (id * 2) + 1;
@@ -5095,6 +5096,14 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)
inst->Instruction.Opcode == TGSI_OPCODE_TG4)
sampler_src_reg = 2;
/* TGSI moves the sampler to src reg 3 for TXD */
if (inst->Instruction.Opcode == TGSI_OPCODE_TXD)
sampler_src_reg = 3;
sampler_index_mode = inst->Src[sampler_src_reg].Indirect.Index == 2 ? 2 : 0; // CF_INDEX_1 : CF_INDEX_NONE
if (sampler_index_mode)
ctx->shader->uses_index_registers = true;
src_gpr = tgsi_tex_get_src_gpr(ctx, 0);
if (inst->Texture.Texture == TGSI_TEXTURE_BUFFER) {
@@ -5109,64 +5118,7 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)
}
}
if (inst->Instruction.Opcode == TGSI_OPCODE_TXD) {
/* TGSI moves the sampler to src reg 3 for TXD */
sampler_src_reg = 3;
sampler_index_mode = inst->Src[sampler_src_reg].Indirect.Index == 2 ? 2 : 0; // CF_INDEX_1 : CF_INDEX_NONE
for (i = 1; i < 3; i++) {
/* set gradients h/v */
memset(&tex, 0, sizeof(struct r600_bytecode_tex));
tex.op = (i == 1) ? FETCH_OP_SET_GRADIENTS_H :
FETCH_OP_SET_GRADIENTS_V;
tex.sampler_id = tgsi_tex_get_src_gpr(ctx, sampler_src_reg);
tex.sampler_index_mode = sampler_index_mode;
tex.resource_id = tex.sampler_id + R600_MAX_CONST_BUFFERS;
tex.resource_index_mode = sampler_index_mode;
if (tgsi_tex_src_requires_loading(ctx, i)) {
tex.src_gpr = r600_get_temp(ctx);
tex.src_sel_x = 0;
tex.src_sel_y = 1;
tex.src_sel_z = 2;
tex.src_sel_w = 3;
for (j = 0; j < 4; j++) {
memset(&alu, 0, sizeof(struct r600_bytecode_alu));
alu.op = ALU_OP1_MOV;
r600_bytecode_src(&alu.src[0], &ctx->src[i], j);
alu.dst.sel = tex.src_gpr;
alu.dst.chan = j;
if (j == 3)
alu.last = 1;
alu.dst.write = 1;
r = r600_bytecode_add_alu(ctx->bc, &alu);
if (r)
return r;
}
} else {
tex.src_gpr = tgsi_tex_get_src_gpr(ctx, i);
tex.src_sel_x = ctx->src[i].swizzle[0];
tex.src_sel_y = ctx->src[i].swizzle[1];
tex.src_sel_z = ctx->src[i].swizzle[2];
tex.src_sel_w = ctx->src[i].swizzle[3];
tex.src_rel = ctx->src[i].rel;
}
tex.dst_gpr = ctx->temp_reg; /* just to avoid confusing the asm scheduler */
tex.dst_sel_x = tex.dst_sel_y = tex.dst_sel_z = tex.dst_sel_w = 7;
if (inst->Texture.Texture != TGSI_TEXTURE_RECT) {
tex.coord_type_x = 1;
tex.coord_type_y = 1;
tex.coord_type_z = 1;
tex.coord_type_w = 1;
}
r = r600_bytecode_add_tex(ctx->bc, &tex);
if (r)
return r;
}
} else if (inst->Instruction.Opcode == TGSI_OPCODE_TXP) {
if (inst->Instruction.Opcode == TGSI_OPCODE_TXP) {
int out_chan;
/* Add perspective divide */
if (ctx->bc->chip_class == CAYMAN) {
@@ -5230,9 +5182,6 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)
src_gpr = ctx->temp_reg;
}
sampler_index_mode = inst->Src[sampler_src_reg].Indirect.Index == 2 ? 2 : 0; // CF_INDEX_1 : CF_INDEX_NONE
if (sampler_index_mode)
ctx->shader->uses_index_registers = true;
if ((inst->Texture.Texture == TGSI_TEXTURE_CUBE ||
inst->Texture.Texture == TGSI_TEXTURE_CUBE_ARRAY ||
@@ -5451,6 +5400,69 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)
src_gpr = ctx->temp_reg;
}
if (inst->Instruction.Opcode == TGSI_OPCODE_TXD) {
int temp_h = 0, temp_v = 0;
int start_val = 0;
/* if we've already loaded the src (i.e. CUBE don't reload it). */
if (src_loaded == TRUE)
start_val = 1;
else
src_loaded = TRUE;
for (i = start_val; i < 3; i++) {
int treg = r600_get_temp(ctx);
if (i == 0)
src_gpr = treg;
else if (i == 1)
temp_h = treg;
else
temp_v = treg;
for (j = 0; j < 4; j++) {
memset(&alu, 0, sizeof(struct r600_bytecode_alu));
alu.op = ALU_OP1_MOV;
r600_bytecode_src(&alu.src[0], &ctx->src[i], j);
alu.dst.sel = treg;
alu.dst.chan = j;
if (j == 3)
alu.last = 1;
alu.dst.write = 1;
r = r600_bytecode_add_alu(ctx->bc, &alu);
if (r)
return r;
}
}
for (i = 1; i < 3; i++) {
/* set gradients h/v */
memset(&tex, 0, sizeof(struct r600_bytecode_tex));
tex.op = (i == 1) ? FETCH_OP_SET_GRADIENTS_H :
FETCH_OP_SET_GRADIENTS_V;
tex.sampler_id = tgsi_tex_get_src_gpr(ctx, sampler_src_reg);
tex.sampler_index_mode = sampler_index_mode;
tex.resource_id = tex.sampler_id + R600_MAX_CONST_BUFFERS;
tex.resource_index_mode = sampler_index_mode;
tex.src_gpr = (i == 1) ? temp_h : temp_v;
tex.src_sel_x = 0;
tex.src_sel_y = 1;
tex.src_sel_z = 2;
tex.src_sel_w = 3;
tex.dst_gpr = r600_get_temp(ctx); /* just to avoid confusing the asm scheduler */
tex.dst_sel_x = tex.dst_sel_y = tex.dst_sel_z = tex.dst_sel_w = 7;
if (inst->Texture.Texture != TGSI_TEXTURE_RECT) {
tex.coord_type_x = 1;
tex.coord_type_y = 1;
tex.coord_type_z = 1;
tex.coord_type_w = 1;
}
r = r600_bytecode_add_tex(ctx->bc, &tex);
if (r)
return r;
}
}
if (src_requires_loading && !src_loaded) {
for (i = 0; i < 4; i++) {
memset(&alu, 0, sizeof(struct r600_bytecode_alu));
@@ -5686,9 +5698,16 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)
memset(&alu, 0, sizeof(struct r600_bytecode_alu));
alu.op = ALU_OP1_MOV;
alu.src[0].sel = 512 + (id / 4);
alu.src[0].kc_bank = R600_TXQ_CONST_BUFFER;
alu.src[0].chan = id % 4;
if (ctx->bc->chip_class >= EVERGREEN) {
/* channel 1 or 3 of each word */
alu.src[0].sel = 512 + (id / 2);
alu.src[0].chan = ((id % 2) * 2) + 1;
} else {
/* r600 we have them at channel 2 of the second dword */
alu.src[0].sel = 512 + (id * 2) + 1;
alu.src[0].chan = 2;
}
alu.src[0].kc_bank = R600_BUFFER_INFO_CONST_BUFFER;
tgsi_dst(ctx, &inst->Dst[0], 2, &alu.dst);
alu.last = 1;
r = r600_bytecode_add_alu(ctx->bc, &alu);

View File

@@ -649,7 +649,6 @@ static void r600_set_sampler_views(struct pipe_context *pipe, unsigned shader,
dst->views.dirty_mask |= new_mask;
dst->views.compressed_depthtex_mask &= dst->views.enabled_mask;
dst->views.compressed_colortex_mask &= dst->views.enabled_mask;
dst->views.dirty_txq_constants = TRUE;
dst->views.dirty_buffer_constants = TRUE;
r600_sampler_views_dirty(rctx, &dst->views);
@@ -984,6 +983,7 @@ static void r600_set_sample_mask(struct pipe_context *pipe, unsigned sample_mask
* then in the shader, we AND the 4 components with 0xffffffff or 0,
* then OR the alpha with the value given here.
* We use a 6th constant to store the txq buffer size in
* we use 7th slot for number of cube layers in a cube map array.
*/
static void r600_setup_buffer_constants(struct r600_context *rctx, int shader_type)
{
@@ -1022,6 +1022,7 @@ static void r600_setup_buffer_constants(struct r600_context *rctx, int shader_ty
samplers->buffer_constants[offset + 4] = 0;
samplers->buffer_constants[offset + 5] = samplers->views.views[i]->base.texture->width0 / util_format_get_blocksize(samplers->views.views[i]->base.format);
samplers->buffer_constants[offset + 6] = samplers->views.views[i]->base.texture->array_size / 6;
}
}
@@ -1033,7 +1034,10 @@ static void r600_setup_buffer_constants(struct r600_context *rctx, int shader_ty
pipe_resource_reference(&cb.buffer, NULL);
}
/* On evergreen we only need to store the buffer size for TXQ */
/* On evergreen we store two values
* 1. buffer size for TXQ
* 2. number of cube layers in a cube map array.
*/
static void eg_setup_buffer_constants(struct r600_context *rctx, int shader_type)
{
struct r600_textures_info *samplers = &rctx->samplers[shader_type];
@@ -1048,12 +1052,16 @@ static void eg_setup_buffer_constants(struct r600_context *rctx, int shader_type
samplers->views.dirty_buffer_constants = FALSE;
bits = util_last_bit(samplers->views.enabled_mask);
array_size = bits * sizeof(uint32_t) * 4;
array_size = bits * 2 * sizeof(uint32_t) * 4;
samplers->buffer_constants = realloc(samplers->buffer_constants, array_size);
memset(samplers->buffer_constants, 0, array_size);
for (i = 0; i < bits; i++)
if (samplers->views.enabled_mask & (1 << i))
samplers->buffer_constants[i] = samplers->views.views[i]->base.texture->width0 / util_format_get_blocksize(samplers->views.views[i]->base.format);
for (i = 0; i < bits; i++) {
if (samplers->views.enabled_mask & (1 << i)) {
uint32_t offset = i * 2;
samplers->buffer_constants[offset] = samplers->views.views[i]->base.texture->width0 / util_format_get_blocksize(samplers->views.views[i]->base.format);
samplers->buffer_constants[offset + 1] = samplers->views.views[i]->base.texture->array_size / 6;
}
}
cb.buffer = NULL;
cb.user_buffer = samplers->buffer_constants;
@@ -1063,35 +1071,6 @@ static void eg_setup_buffer_constants(struct r600_context *rctx, int shader_type
pipe_resource_reference(&cb.buffer, NULL);
}
static void r600_setup_txq_cube_array_constants(struct r600_context *rctx, int shader_type)
{
struct r600_textures_info *samplers = &rctx->samplers[shader_type];
int bits;
uint32_t array_size;
struct pipe_constant_buffer cb;
int i;
if (!samplers->views.dirty_txq_constants)
return;
samplers->views.dirty_txq_constants = FALSE;
bits = util_last_bit(samplers->views.enabled_mask);
array_size = bits * sizeof(uint32_t) * 4;
samplers->txq_constants = realloc(samplers->txq_constants, array_size);
memset(samplers->txq_constants, 0, array_size);
for (i = 0; i < bits; i++)
if (samplers->views.enabled_mask & (1 << i))
samplers->txq_constants[i] = samplers->views.views[i]->base.texture->array_size / 6;
cb.buffer = NULL;
cb.user_buffer = samplers->txq_constants;
cb.buffer_offset = 0;
cb.buffer_size = array_size;
rctx->b.b.set_constant_buffer(&rctx->b.b, shader_type, R600_TXQ_CONST_BUFFER, &cb);
pipe_resource_reference(&cb.buffer, NULL);
}
/* set sample xy locations as array of fragment shader constants */
void r600_set_sample_locations_constant_buffer(struct r600_context *rctx)
{
@@ -1175,7 +1154,7 @@ static bool r600_update_derived_state(struct r600_context *rctx)
struct pipe_context * ctx = (struct pipe_context*)rctx;
bool ps_dirty = false, vs_dirty = false, gs_dirty = false;
bool blend_disable;
bool need_buf_const;
if (!rctx->blitter->running) {
unsigned i;
@@ -1296,29 +1275,35 @@ static bool r600_update_derived_state(struct r600_context *rctx)
/* on R600 we stuff masks + txq info into one constant buffer */
/* on evergreen we only need a txq info one */
if (rctx->b.chip_class < EVERGREEN) {
if (rctx->ps_shader && rctx->ps_shader->current->shader.uses_tex_buffers)
r600_setup_buffer_constants(rctx, PIPE_SHADER_FRAGMENT);
if (rctx->vs_shader && rctx->vs_shader->current->shader.uses_tex_buffers)
r600_setup_buffer_constants(rctx, PIPE_SHADER_VERTEX);
if (rctx->gs_shader && rctx->gs_shader->current->shader.uses_tex_buffers)
r600_setup_buffer_constants(rctx, PIPE_SHADER_GEOMETRY);
} else {
if (rctx->ps_shader && rctx->ps_shader->current->shader.uses_tex_buffers)
eg_setup_buffer_constants(rctx, PIPE_SHADER_FRAGMENT);
if (rctx->vs_shader && rctx->vs_shader->current->shader.uses_tex_buffers)
eg_setup_buffer_constants(rctx, PIPE_SHADER_VERTEX);
if (rctx->gs_shader && rctx->gs_shader->current->shader.uses_tex_buffers)
eg_setup_buffer_constants(rctx, PIPE_SHADER_GEOMETRY);
if (rctx->ps_shader) {
need_buf_const = rctx->ps_shader->current->shader.uses_tex_buffers || rctx->ps_shader->current->shader.has_txq_cube_array_z_comp;
if (need_buf_const) {
if (rctx->b.chip_class < EVERGREEN)
r600_setup_buffer_constants(rctx, PIPE_SHADER_FRAGMENT);
else
eg_setup_buffer_constants(rctx, PIPE_SHADER_FRAGMENT);
}
}
if (rctx->vs_shader) {
need_buf_const = rctx->vs_shader->current->shader.uses_tex_buffers || rctx->vs_shader->current->shader.has_txq_cube_array_z_comp;
if (need_buf_const) {
if (rctx->b.chip_class < EVERGREEN)
r600_setup_buffer_constants(rctx, PIPE_SHADER_VERTEX);
else
eg_setup_buffer_constants(rctx, PIPE_SHADER_VERTEX);
}
}
if (rctx->ps_shader && rctx->ps_shader->current->shader.has_txq_cube_array_z_comp)
r600_setup_txq_cube_array_constants(rctx, PIPE_SHADER_FRAGMENT);
if (rctx->vs_shader && rctx->vs_shader->current->shader.has_txq_cube_array_z_comp)
r600_setup_txq_cube_array_constants(rctx, PIPE_SHADER_VERTEX);
if (rctx->gs_shader && rctx->gs_shader->current->shader.has_txq_cube_array_z_comp)
r600_setup_txq_cube_array_constants(rctx, PIPE_SHADER_GEOMETRY);
if (rctx->gs_shader) {
need_buf_const = rctx->gs_shader->current->shader.uses_tex_buffers || rctx->gs_shader->current->shader.has_txq_cube_array_z_comp;
if (need_buf_const) {
if (rctx->b.chip_class < EVERGREEN)
r600_setup_buffer_constants(rctx, PIPE_SHADER_GEOMETRY);
else
eg_setup_buffer_constants(rctx, PIPE_SHADER_GEOMETRY);
}
}
if (rctx->b.chip_class < EVERGREEN && rctx->ps_shader && rctx->vs_shader) {
if (!r600_adjust_gprs(rctx)) {

View File

@@ -367,7 +367,6 @@ EXPORTS
wglUseFontBitmapsW
wglUseFontOutlinesA
wglUseFontOutlinesW
wglGetExtensionsStringARB
DrvCopyContext
DrvCreateContext
DrvCreateLayerContext

View File

@@ -62,7 +62,7 @@ d3dadapter9_la_LDFLAGS = \
-shrext .so \
-module \
-no-undefined \
-version-number $(NINE_MAJOR):$(NINE_MINOR) \
-version-number $(NINE_MAJOR):$(NINE_MINOR):$(NINE_TINY) \
$(GC_SECTIONS) \
$(LD_NO_UNDEFINED)

View File

@@ -6,6 +6,6 @@ moduledir=@D3D_DRIVER_INSTALL_DIR@
Name: d3d
Description: Native D3D driver modules
Version: @VERSION@
Version: @NINE_VERSION@
Requires.private: @DRI_PC_REQ_PRIV@
Cflags: -I${includedir}

View File

@@ -49,6 +49,7 @@ public:
{
instructions = insts;
progress = false;
index = 0;
}
bool run()
@@ -62,6 +63,7 @@ public:
private:
exec_list *instructions;
bool progress;
unsigned index;
};
void
@@ -70,14 +72,20 @@ lower_const_array_visitor::handle_rvalue(ir_rvalue **rvalue)
if (!*rvalue)
return;
ir_constant *con = (*rvalue)->as_constant();
ir_dereference_array *dra = (*rvalue)->as_dereference_array();
if (!dra)
return;
ir_constant *con = dra->array->as_constant();
if (!con || !con->type->is_array())
return;
void *mem_ctx = ralloc_parent(con);
char *uniform_name = ralloc_asprintf(mem_ctx, "constarray__%d", index++);
ir_variable *uni =
new(mem_ctx) ir_variable(con->type, "constarray", ir_var_uniform);
new(mem_ctx) ir_variable(con->type, uniform_name, ir_var_uniform);
uni->constant_initializer = con;
uni->constant_value = con;
uni->data.has_initializer = true;
@@ -87,7 +95,8 @@ lower_const_array_visitor::handle_rvalue(ir_rvalue **rvalue)
uni->data.max_array_access = uni->type->length - 1;
instructions->push_head(uni);
*rvalue = new(mem_ctx) ir_dereference_variable(uni);
ir_dereference_variable *varref = new(mem_ctx) ir_dereference_variable(uni);
*rvalue = new(mem_ctx) ir_dereference_array(varref, dra->array_index);
progress = true;
}

View File

@@ -16,6 +16,7 @@ if env['platform'] == 'windows':
env.Append(CPPDEFINES = [
'_GDI32_', # prevent gl* being declared __declspec(dllimport) in MS headers
'BUILD_GL32', # declare gl* as __declspec(dllexport) in Mesa headers
'KHRONOS_DLL_EXPORTS', # declare gl* as __declspec(dllexport) in Khronos headers
])
if env['gles']:
env.Append(CPPDEFINES = ['_GLAPI_DLL_EXPORTS'])

View File

@@ -3354,6 +3354,7 @@ vec4_visitor::move_uniform_array_access_to_pull_constants()
{
int pull_constant_loc[this->uniforms];
memset(pull_constant_loc, -1, sizeof(pull_constant_loc));
bool nested_reladdr;
/* Walk through and find array access of uniforms. Put a copy of that
* uniform in the pull constant buffer.
@@ -3361,44 +3362,51 @@ vec4_visitor::move_uniform_array_access_to_pull_constants()
* Note that we don't move constant-indexed accesses to arrays. No
* testing has been done of the performance impact of this choice.
*/
foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
for (int i = 0 ; i < 3; i++) {
if (inst->src[i].file != UNIFORM || !inst->src[i].reladdr)
continue;
do {
nested_reladdr = false;
int uniform = inst->src[i].reg;
foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
for (int i = 0 ; i < 3; i++) {
if (inst->src[i].file != UNIFORM || !inst->src[i].reladdr)
continue;
/* If this array isn't already present in the pull constant buffer,
* add it.
*/
if (pull_constant_loc[uniform] == -1) {
const gl_constant_value **values =
&stage_prog_data->param[uniform * 4];
int uniform = inst->src[i].reg;
pull_constant_loc[uniform] = stage_prog_data->nr_pull_params / 4;
if (inst->src[i].reladdr->reladdr)
nested_reladdr = true; /* will need another pass */
assert(uniform < uniform_array_size);
for (int j = 0; j < uniform_size[uniform] * 4; j++) {
stage_prog_data->pull_param[stage_prog_data->nr_pull_params++]
= values[j];
}
}
/* If this array isn't already present in the pull constant buffer,
* add it.
*/
if (pull_constant_loc[uniform] == -1) {
const gl_constant_value **values =
&stage_prog_data->param[uniform * 4];
/* Set up the annotation tracking for new generated instructions. */
base_ir = inst->ir;
current_annotation = inst->annotation;
pull_constant_loc[uniform] = stage_prog_data->nr_pull_params / 4;
dst_reg temp = dst_reg(this, glsl_type::vec4_type);
assert(uniform < uniform_array_size);
for (int j = 0; j < uniform_size[uniform] * 4; j++) {
stage_prog_data->pull_param[stage_prog_data->nr_pull_params++]
= values[j];
}
}
emit_pull_constant_load(block, inst, temp, inst->src[i],
pull_constant_loc[uniform]);
/* Set up the annotation tracking for new generated instructions. */
base_ir = inst->ir;
current_annotation = inst->annotation;
inst->src[i].file = temp.file;
inst->src[i].reg = temp.reg;
inst->src[i].reg_offset = temp.reg_offset;
inst->src[i].reladdr = NULL;
dst_reg temp = dst_reg(this, glsl_type::vec4_type);
emit_pull_constant_load(block, inst, temp, inst->src[i],
pull_constant_loc[uniform]);
inst->src[i].file = temp.file;
inst->src[i].reg = temp.reg;
inst->src[i].reg_offset = temp.reg_offset;
inst->src[i].reladdr = NULL;
}
}
}
} while (nested_reladdr);
/* Now there are no accesses of the UNIFORM file with a reladdr, so
* no need to track them as larger-than-vec4 objects. This will be

View File

@@ -129,6 +129,20 @@ get_attr_override(const struct brw_vue_map *vue_map, int urb_entry_read_offset,
}
static bool
is_drawing_points(const struct brw_context *brw)
{
/* Determine if the primitives *reaching the SF* are points */
if (brw->geometry_program) {
/* BRW_NEW_GEOMETRY_PROGRAM */
return brw->geometry_program->OutputType == GL_POINTS;
} else {
/* BRW_NEW_PRIMITIVE */
return brw->primitive == _3DPRIM_POINTLIST;
}
}
/**
* Create the mapping from the FS inputs we produce to the previous pipeline
* stage (GS or VS) outputs they source from.
@@ -149,6 +163,23 @@ calculate_attr_overrides(const struct brw_context *brw,
/* _NEW_LIGHT */
bool shade_model_flat = brw->ctx.Light.ShadeModel == GL_FLAT;
/* From the Ivybridge PRM, Vol 2 Part 1, 3DSTATE_SBE,
* description of dw10 Point Sprite Texture Coordinate Enable:
*
* "This field must be programmed to zero when non-point primitives
* are rendered."
*
* The SandyBridge PRM doesn't explicitly say that point sprite enables
* must be programmed to zero when rendering non-point primitives, but
* the IvyBridge PRM does, and if we don't, we get garbage.
*
* This is not required on Haswell, as the hardware ignores this state
* when drawing non-points -- although we do still need to be careful to
* correctly set the attr overrides.
*/
/* BRW_NEW_PRIMITIVE | BRW_NEW_GEOMETRY_PROGRAM */
bool drawing_points = is_drawing_points(brw);
/* Initialize all the attr_overrides to 0. In the loop below we'll modify
* just the ones that correspond to inputs used by the fs.
*/
@@ -167,18 +198,20 @@ calculate_attr_overrides(const struct brw_context *brw,
/* _NEW_POINT */
bool point_sprite = false;
if (brw->ctx.Point.PointSprite &&
(attr >= VARYING_SLOT_TEX0 && attr <= VARYING_SLOT_TEX7) &&
brw->ctx.Point.CoordReplace[attr - VARYING_SLOT_TEX0]) {
point_sprite = true;
if (drawing_points) {
if (brw->ctx.Point.PointSprite &&
(attr >= VARYING_SLOT_TEX0 && attr <= VARYING_SLOT_TEX7) &&
brw->ctx.Point.CoordReplace[attr - VARYING_SLOT_TEX0]) {
point_sprite = true;
}
if (attr == VARYING_SLOT_PNTC)
point_sprite = true;
if (point_sprite)
*point_sprite_enables |= (1 << input_index);
}
if (attr == VARYING_SLOT_PNTC)
point_sprite = true;
if (point_sprite)
*point_sprite_enables |= (1 << input_index);
/* flat shading */
if (interp_qualifier == INTERP_QUALIFIER_FLAT ||
(shade_model_flat && is_gl_Color &&
@@ -410,7 +443,9 @@ const struct brw_tracked_state gen6_sf_state = {
_NEW_POINT |
_NEW_MULTISAMPLE),
.brw = (BRW_NEW_CONTEXT |
BRW_NEW_FRAGMENT_PROGRAM |
BRW_NEW_FRAGMENT_PROGRAM |
BRW_NEW_GEOMETRY_PROGRAM |
BRW_NEW_PRIMITIVE |
BRW_NEW_VUE_MAP_GEOM_OUT),
.cache = CACHE_NEW_WM_PROG
},

View File

@@ -92,7 +92,9 @@ const struct brw_tracked_state gen7_sbe_state = {
_NEW_POINT |
_NEW_PROGRAM),
.brw = (BRW_NEW_CONTEXT |
BRW_NEW_FRAGMENT_PROGRAM |
BRW_NEW_FRAGMENT_PROGRAM |
BRW_NEW_GEOMETRY_PROGRAM |
BRW_NEW_PRIMITIVE |
BRW_NEW_VUE_MAP_GEOM_OUT),
.cache = CACHE_NEW_WM_PROG
},

View File

@@ -627,7 +627,7 @@ descriptor=[
# == GL_CURRENT_MATRIX_NV
[ "CURRENT_MATRIX_ARB", "LOC_CUSTOM, TYPE_MATRIX, 0, extra_ARB_vertex_program_ARB_fragment_program" ],
# == GL_CURRENT_MATRIX_NV
[ "TRANSPOSE_CURRENT_MATRIX_ARB", "LOC_CUSTOM, TYPE_MATRIX, 0, extra_ARB_vertex_program_ARB_fragment_program" ],
[ "TRANSPOSE_CURRENT_MATRIX_ARB", "LOC_CUSTOM, TYPE_MATRIX_T, 0, extra_ARB_vertex_program_ARB_fragment_program" ],
# == GL_PROGRAM_ERROR_POSITION_NV
[ "PROGRAM_ERROR_POSITION_ARB", "CONTEXT_INT(Program.ErrorPos), extra_ARB_vertex_program_ARB_fragment_program" ],