Compare commits

25 Commits

Author SHA1 Message Date
Emil Velikov
bcb9e1d26b docs: add release notes for 11.0.1
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-09-26 13:32:07 +01:00
Emil Velikov
de1637c7fe Update version to 11.0.1
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-09-26 13:22:34 +01:00
Ian Romanick
cf716563a8 t_dd_dmatmp: Use addition instead of subtraction in loop bounds
This is used everywhere else in this file because it avoids problems
when count is zero (due to trimming).

No piglit regressions on i915 (G33) or radeon (Radeon 7500).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38109
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: Marius Predut <marius.predut@intel.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 25543d8ec5)
2015-09-23 21:10:42 +01:00
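A minimal sketch of why addition in the loop bound is safer, assuming the counts are unsigned. This is not the actual t_dd_dmatmp code: the function names and the `limit` guard are hypothetical and exist only so the wrapped-around loop terminates.

```c
/* Bound written with subtraction: when count is 0, "count - 2" wraps
 * around to a huge unsigned value and the loop body runs anyway. */
unsigned groups_with_subtraction(unsigned count, unsigned limit)
{
   unsigned visited = 0;
   for (unsigned j = 0; j < count - 2 && visited < limit; j += 3)
      visited++;
   return visited;
}

/* Bound written with addition: "j + 2 < count" is simply false when
 * count is 0, so the loop body never runs. */
unsigned groups_with_addition(unsigned count, unsigned limit)
{
   unsigned visited = 0;
   for (unsigned j = 0; j + 2 < count && visited < limit; j += 3)
      visited++;
   return visited;
}
```

For non-trivial counts both forms visit the same groups; they differ only when the subtraction would go below zero.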
Ian Romanick
2c65e64881 t_dd_dmatmp: Pull out common 'count -= count & 3' code
This was missing in the HAVE_TRIANGLES path, and that could cause
incorrect rendering.

No piglit regressions on i915 (G33) or radeon (Radeon 7500).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38109
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: Marius Predut <marius.predut@intel.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c0b3b2f760)
2015-09-23 21:10:11 +01:00
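The trimming step named in the subject line can be sketched as follows, assuming it is meant to drop trailing vertices that do not form a complete group of four. The helper name is hypothetical and does not appear in the patch.

```c
/* Round an unsigned vertex count down to a multiple of 4.
 * "count & 3" is the remainder of dividing by 4, so subtracting it
 * discards any leftover vertices. */
unsigned trim_to_multiple_of_four(unsigned count)
{
   count -= count & 3;
   return count;
}
```

A count of 7 becomes 4, a count of 8 stays 8, and anything below 4 becomes 0, which is why later loop bounds must tolerate a zero count.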
Ian Romanick
8be6b32d65 t_dd_dmatmp: Use '& 3' instead of '% 4' everywhere
No piglit regressions on i915 (G33) or radeon (Radeon 7500).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0d475ee2b9)
2015-09-23 21:09:41 +01:00
Ian Romanick
0e0d008b2b t_dd_dmatmp: Clean up improper code formatting from previous patch
No piglit regressions on i915 (G33) or radeon (Radeon 7500).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fad8d54de7)
2015-09-23 21:09:10 +01:00
Ian Romanick
007aae740e t_dd_dmatmp: Make "count" actually be the count
The value passed in count previously was "vertex after the last vertex
to be processed."  Calling that "count" was misleading and kind of mean.
Looking at the code, many functions immediately do "count-start" to get
back the true count.  That's just silly.

If it is better for the loops to be 'for (j = start; j < (start +
count); j++)', GCC will do that transformation.

NOTE: There is some strange formatting left by this patch.  That was
done to make it more obvious that the before and after code is
equivalent.  These will be fixed in the next patch.

No piglit regressions on i915 (G33) or radeon (Radeon 7500).

v2: Fix a remaining (count-start) in render_quad_strip_verts.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com> [v1]
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d7bf7969b9)
2015-09-23 21:08:40 +01:00
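The two conventions described in this message can be compared with a small sketch. Both functions are hypothetical and only count loop iterations; they are not taken from t_dd_dmatmp.

```c
/* Old convention: the second argument is the index one past the last
 * vertex to process, despite having been called "count". */
unsigned visit_with_end(unsigned start, unsigned end)
{
   unsigned visited = 0;
   for (unsigned j = start; j < end; j++)
      visited++;
   return visited;
}

/* New convention: the second argument really is the number of vertices. */
unsigned visit_with_count(unsigned start, unsigned count)
{
   unsigned visited = 0;
   for (unsigned j = start; j < start + count; j++)
      visited++;
   return visited;
}
```

Calling `visit_with_end(4, 10)` and `visit_with_count(4, 6)` processes the same six vertices; only the meaning of the second argument changes.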
Iago Toral Quiroga
575f5a94c3 mesa: Fix GL_FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE for default framebuffer.
From section 9.2. Binding and Managing Framebuffer Objects:

"Upon successful return from Get*FramebufferAttachmentParameteriv, if
pname is FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE, then params will contain
one of NONE, FRAMEBUFFER_DEFAULT, TEXTURE, or RENDERBUFFER, identifying
the type of object which contains the attached image."

And then it clarifies further:

"If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is NONE, then
either no framebuffer is bound to target; or the default framebuffer is
bound, attachment is DEPTH or STENCIL, and the number of depth or stencil
bits, respectively, is zero"

Currently, if the default framebuffer is bound, we always return
GL_FRAMEBUFFER_DEFAULT for FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE, but
according to the spec, when GL_DEPTH or GL_STENCIL attachments are
the ones being queried, we should return GL_NONE if they don't exist.

Fixes the following dEQP test:
dEQP-GLES3.functional.state_query.fbo.framebuffer_attachment_x_size_initial

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cf439951b7)
2015-09-23 21:08:06 +01:00
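The rule quoted from the spec can be sketched as a small decision function. This is an illustration under stated assumptions, not Mesa's implementation: the function name and its parameters are hypothetical, and the GL enum values are redefined locally so the sketch stands alone.

```c
/* Enum values as found in the OpenGL headers, repeated here for a
 * standalone sketch. */
#define GL_NONE                 0
#define GL_DEPTH                0x1801
#define GL_STENCIL              0x1802
#define GL_FRAMEBUFFER_DEFAULT  0x8218

/* Hypothetical helper: object type to report for an attachment of the
 * default framebuffer, given how many depth and stencil bits it has. */
unsigned default_fb_attachment_object_type(unsigned attachment,
                                           int depth_bits,
                                           int stencil_bits)
{
   if (attachment == GL_DEPTH && depth_bits == 0)
      return GL_NONE;
   if (attachment == GL_STENCIL && stencil_bits == 0)
      return GL_NONE;
   return GL_FRAMEBUFFER_DEFAULT;
}
```

Before the fix, the behaviour described above corresponds to always taking the last `return`.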
Tapani Pälli
b1203ec9f3 i965: fix textureGrad for cubemaps
Fixes bugs exposed by commit
2b1cdb0edd in:
   ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_frag

No regressions observed in deqp, CTS or Piglit.

v2: address review feedback from Iago Toral:
   - move rho calculation to else branch
   - optimize dx and dy calculation
   - fix documentation inconsistencies

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91114
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7f8815bcb9)
2015-09-23 21:07:35 +01:00
Jeremy Huddleston
c29e3f1bca configure.ac: Add support to enable read-only text segment on x86.
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.gentoo.org/240956
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 6dfc5e28f7)
2015-09-23 21:07:03 +01:00
Ilia Mirkin
c98217178b radeonsi: load fmask ptr relative to the resources array
res_ptr already contains the resource values. fmask_ptr needs to be
looked up relative to the start of the resource params.

Note that this only affects indirect loads of MS sampler arrays.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7d5162bdc0)
2015-09-23 21:06:29 +01:00
Tapani Pälli
278739eb01 mesa: fix errors when reading depth with glReadPixels
OpenGL ES 3.0 spec 3.7.2 "Transfer of Pixel Rectangles" specifies
DEPTH_COMPONENT, UNSIGNED_INT as a valid couple, validation for
internal format is checked by is_float_depth().

Fix regression caused by 81d2fd91a9 in:
   ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels

Test uses GL_DEPTH_COMPONENT, UNSIGNED_INT only when GL_NV_read_depth
extension is present.

v2: change check in _mesa_error_check_format_and_type to be explicit
    for ES 2.0+, desktop OpenGL does not allow this behaviour + uses
    this function for both glReadPixels and glDrawPixels validation.
    (No Piglit regressions seen with v2.)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92009
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit afa1efdc85)
2015-09-23 21:05:54 +01:00
Ilia Mirkin
ae6dcfee56 nv50,nvc0: flush texture cache in presence of coherent bufs
This fixes the newly-added arb_texture_buffer_object-bufferstorage
piglit test.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e844e1007d)
2015-09-23 21:05:23 +01:00
Ilia Mirkin
9fcf28bb14 nv50,nvc0: detect underlying resource changes and update tic
When updating texture buffers, we might end up replacing the whole
buffer. Check that the tic address matches the resource address, and if
not, update the tic and reupload it.

This fixes:
  arb_direct_state_access-texture-buffer
  arb_texture_buffer_object-data-sync

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 323c912506)
2015-09-23 21:04:50 +01:00
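The address check and update described here can be sketched in isolation. The word layout (low 32 address bits in word 1, the next 8 bits in the low byte of word 2) follows the diff for this commit further down; the function names are hypothetical, and the explicit `& 0xff` mask assumes a 40-bit address and is not in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the address stored in the TIC words matches "address". */
bool tic_matches_address(const uint32_t tic[8], uint64_t address)
{
   return tic[1] == (uint32_t)address &&
          (tic[2] & 0xff) == (address >> 32);
}

/* Store "address" in the TIC words, leaving the upper 24 bits of
 * word 2 untouched. */
void tic_set_address(uint32_t tic[8], uint64_t address)
{
   tic[1] = (uint32_t)address;
   tic[2] &= 0xffffff00;
   tic[2] |= (uint32_t)(address >> 32) & 0xff; /* mask is an assumption */
}
```

In the driver, a mismatch also invalidates the entry's id so that the entry is reuploaded; that part is omitted here.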
Ulrich Weigand
5fe09ffe6a mesa: Fix texture compression on big-endian systems
Various pieces of code to create compressed textures will first
generate an uncompressed RGBA texture into a temporary buffer,
and then read from that buffer while creating the final compressed
texture in the requested format.

The code reading from the temporary buffer assumes the buffer is
formatted as an array of bytes in RGBA order.  However, the buffer
is filled using a _mesa_texstore call with MESA_FORMAT_R8G8B8A8_UNORM
format -- this is defined as an array of *integers* holding the
RGBA values in packed format (least-significant to most-significant).
This means incorrect bytes are accessed on big-endian systems.

This patch fixes this by using the MESA_FORMAT_A8B8G8R8_UNORM format
instead on big-endian systems when filling the buffer.  This fixes
about 100 piglit test case failures on s390x for me.

Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Tested-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "10.6" "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@gmail.com>
(cherry picked from commit bd016a2601)
2015-09-23 21:04:15 +01:00
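The mismatch between a packed integer and a byte array can be reproduced with a short sketch. The helper names are hypothetical; the packing order (R in the least-significant byte) follows the description of MESA_FORMAT_R8G8B8A8_UNORM in the message above.

```c
#include <stdint.h>
#include <string.h>

/* Pack RGBA into one 32-bit integer, R in the least-significant byte. */
uint32_t pack_rgba_lsb_first(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
   return (uint32_t)r | ((uint32_t)g << 8) |
          ((uint32_t)b << 16) | ((uint32_t)a << 24);
}

/* Return the byte stored at memory offset 0 of the packed integer. */
uint8_t first_byte_in_memory(uint32_t packed)
{
   uint8_t bytes[4];
   memcpy(bytes, &packed, sizeof bytes);
   return bytes[0];
}

/* Detect the byte order of the machine running this code. */
int host_is_little_endian(void)
{
   uint16_t probe = 1;
   uint8_t first;
   memcpy(&first, &probe, 1);
   return first == 1;
}
```

On a little-endian host the first byte in memory is R, which is what code reading "an array of bytes in RGBA order" expects. On a big-endian host it is A, so the same read picks up the wrong channel.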
Ilia Mirkin
395cd23690 freedreno/a3xx: fix blending of L8 format
Even though luminance formats don't have alpha, we still want the alpha
output to go to the blender. This fixes the luminance blending tests.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 545a3cbb01)
2015-09-23 21:03:44 +01:00
Ilia Mirkin
d04024cffa nv50, nvc0: fix max texture buffer size to 128M elements
This is what the hardware supports, there never was any sort of 64K
limit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7a275fcda8)
2015-09-23 21:03:12 +01:00
Ilia Mirkin
370c2b344b st/mesa: avoid integer overflows with buffers >= 512MB
This fixes failures with the newly-submitted max-size texture buffer
piglit test for GPUs exposing >= 128M max texels.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
(cherry picked from commit eb081681df)
2015-09-23 21:02:40 +01:00
Ray Strode
bcb3bfd510 gbm: convert gbm bo format to fourcc format on dma-buf import
At the moment if a gbm buffer is imported and the gbm buffer
has an old-style GBM_BO_FORMAT format, the import will crash,
since it's passed directly to DRI functions that expect
a fourcc format (as provided by the newer GBM_FORMAT
definitions)

This commit addresses the problem in two ways:

1) it prevents invalid formats from leading to a crash by
returning EINVAL if the image couldn't be created

2) it translates GBM_BO_FORMAT formats into the comparable
GBM_FORMAT formats.

Reference: https://bugzilla.gnome.org/show_bug.cgi?id=753531
CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
(cherry picked from commit 4bf151e662)
2015-09-23 21:02:07 +01:00
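The translation in point 2 can be sketched without the GBM headers. All names below are hypothetical stand-ins: the `SKETCH_BO_FORMAT_*` values are assumed to mirror the old-style enum (0 and 1), and `SKETCH_FOURCC` is assumed to pack four characters the way fourcc codes are usually built.

```c
#include <stdint.h>

/* Assumed fourcc packing: first character in the lowest byte. */
#define SKETCH_FOURCC(a, b, c, d) \
   ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
    ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Stand-ins for the old-style enum values (assumed to be 0 and 1). */
enum {
   SKETCH_BO_FORMAT_XRGB8888 = 0,
   SKETCH_BO_FORMAT_ARGB8888 = 1
};

/* Map an old-style format to its fourcc equivalent; anything else is
 * passed through unchanged on the assumption it is already a fourcc. */
uint32_t sketch_format_to_fourcc(uint32_t format)
{
   switch (format) {
   case SKETCH_BO_FORMAT_XRGB8888:
      return SKETCH_FOURCC('X', 'R', '2', '4');
   case SKETCH_BO_FORMAT_ARGB8888:
      return SKETCH_FOURCC('A', 'R', '2', '4');
   default:
      return format;
   }
}
```

Point 1 of the message covers the case where the pass-through value is not a valid format either: the import then fails with EINVAL instead of crashing.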
Anuj Phogat
ebfa2ea34f meta: Abort meta pbo path if TexSubImage need signed unsigned conversion
See similar fix for Readpixels in mesa commit 0d20790. Jason suggested
we need that for TexSubImage as well.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 64e25167ed)
2015-09-23 21:01:36 +01:00
Antia Puentes
3736ef3a17 i965/vec4_nir: Load constants as integers
Loads constants using integer as their register type, like it is
done in FS backend.

No shader-db changes in HSW.

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91716
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit b8d2263c83)
2015-09-23 21:01:05 +01:00
Antia Puentes
d9e4a3ae6a i965/vec4: Fix saturation errors when coalescing registers
If the register types do not match and the instruction
that contains the final destination is saturated, register
coalescing generated non-equivalent code.

This did not happen when using IR because types usually
matched, but it is visible in nir-vec4.

For example,
   mov      vgrf7:D vgrf2:D
   mov.sat  m4:F vgrf7:F

is coalesced to:
   mov.sat  m4:D vgrf2:D

The patch prevents coalescing in such scenario, unless the
instruction we want to coalesce into is a MOV (without type
conversion implied). In that case, the patch sets the register
types to the type of the final destination.

Shader-db results in HSW (only vec4 instructions shown):

total instructions in shared programs: 1754415 -> 1754416 (0.00%)
instructions in affected programs:     74 -> 75 (1.35%)
helped:                                0
HURT:                                  1
GAINED:                                0
LOST:                                  0

Only one extra instruction in one of the shaders, that comes from
eliminating a saturation error by preventing register coalesce.

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 79f1a7ae28)
2015-09-23 21:00:34 +01:00
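The non-equivalence in the example above can be modelled on the CPU. This is a rough sketch, not the i965 backend: the function names are hypothetical, float saturation is assumed to clamp to [0, 1], and a saturating integer move between same-sized registers is assumed to leave the bits unchanged. NaN handling is ignored.

```c
#include <stdint.h>
#include <string.h>

static float bits_to_float(uint32_t bits)
{
   float f;
   memcpy(&f, &bits, sizeof f);
   return f;
}

static uint32_t float_to_bits(float f)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof bits);
   return bits;
}

/* Model of "mov.sat dst:F src:F": interpret the bits as a float and
 * clamp to [0, 1]. */
uint32_t mov_sat_as_float(uint32_t src_bits)
{
   float f = bits_to_float(src_bits);
   if (f < 0.0f)
      f = 0.0f;
   if (f > 1.0f)
      f = 1.0f;
   return float_to_bits(f);
}

/* Model of "mov.sat dst:D src:D" (assumption: every 32-bit integer
 * already fits the destination, so nothing is clamped). */
uint32_t mov_sat_as_int(uint32_t src_bits)
{
   return src_bits;
}
```

With the bit pattern of 2.0f (0x40000000) in vgrf2, the original two-instruction sequence yields 1.0f (0x3f800000), while the coalesced integer move would pass 0x40000000 through unchanged.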
Jason Ekstrand
1afea31ad8 i965/vec4: Don't reswizzle hardware registers
Cc: "11.0 10.6" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91719
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 1037e0a84f)
2015-09-23 21:00:03 +01:00
Jason Ekstrand
d9b54a01be nir: Fix a bunch of ralloc parenting errors
As of a10d4937, we would really like things associated with an instruction
to be allocated out of that instruction and not out of the shader.  In
particular, you should be passing the instruction that will ultimately be
holding the source into nir_src_copy rather than an arbitrary memory
context.

We also change the prototypes of nir_dest_copy and nir_alu_src/dest_copy to
explicitly take an instruction so we catch this earlier in the future.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
(cherry picked from commit 8c8fc5f833)
2015-09-23 20:48:26 +01:00
Emil Velikov
c4bae5792b docs: add sha256 checksums for 11.0.0
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-09-12 13:32:56 +01:00
38 changed files with 628 additions and 171 deletions

@@ -1 +1 @@
-11.0.0
+11.0.1

@@ -1150,6 +1150,16 @@ AC_SUBST(GLX_TLS, ${GLX_USE_TLS})
AS_IF([test "x$GLX_USE_TLS" = xyes -a "x$ax_pthread_ok" = xyes],
[DEFINES="${DEFINES} -DGLX_USE_TLS"])
+dnl Read-only text section on x86 hardened platforms
+AC_ARG_ENABLE([glx-read-only-text],
+[AS_HELP_STRING([--enable-glx-read-only-text],
+[Disable writable .text section on x86 (decreases performance) @<:@default=disabled@:>@])],
+[enable_glx_read_only_text="$enableval"],
+[enable_glx_read_only_text=no])
+if test "x$enable_glx_read_only_text" = xyes; then
+DEFINES="$DEFINES -DGLX_X86_READONLY_TEXT"
+fi
dnl
dnl More DRI setup
dnl

@@ -33,7 +33,8 @@ because compatibility contexts are not supported.
<h2>SHA256 checksums</h2>
<pre>
-TBD.
+7d7e4ddffa3b162506efa01e2cc41e329caa4995336b92e5cc21f2e1fb36c1b3 mesa-11.0.0.tar.gz
+e095a3eb2eca9dfde7efca8946527c8ae20a0cc938a8c78debc7f158ad44af32 mesa-11.0.0.tar.xz
</pre>

docs/relnotes/11.0.1.html Normal file
@@ -0,0 +1,133 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 11.0.1 Release Notes / September 26, 2015</h1>
<p>
Mesa 11.0.1 is a bug fix release which fixes bugs found since the 11.0.0 release.
</p>
<p>
Mesa 11.0.1 implements the OpenGL 4.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.1. OpenGL
4.1 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=38109">Bug 38109</a> - i915 driver crashes if too few vertices are submitted (Mesa 7.10.2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91114">Bug 91114</a> - ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_vert fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91716">Bug 91716</a> - [bisected] piglit.shaders.glsl-vs-int-attrib regresses on 32 bit BYT, HSW, IVB, SNB</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91719">Bug 91719</a> - [SNB,HSW,BYT] dEQP regressions associated with using NIR for vertex shaders</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92009">Bug 92009</a> - ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels fails</li>
</ul>
<h2>Changes</h2>
<p>Antia Puentes (2):</p>
<ul>
<li>i965/vec4: Fix saturation errors when coalescing registers</li>
<li>i965/vec4_nir: Load constants as integers</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>meta: Abort meta pbo path if TexSubImage need signed unsigned conversion</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: add sha256 checksums for 11.0.0</li>
<li>Update version to 11.0.1</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>mesa: Fix GL_FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE for default framebuffer.</li>
</ul>
<p>Ian Romanick (5):</p>
<ul>
<li>t_dd_dmatmp: Make "count" actually be the count</li>
<li>t_dd_dmatmp: Clean up improper code formatting from previous patch</li>
<li>t_dd_dmatmp: Use '&amp; 3' instead of '% 4' everywhere</li>
<li>t_dd_dmatmp: Pull out common 'count -= count &amp; 3' code</li>
<li>t_dd_dmatmp: Use addition instead of subtraction in loop bounds</li>
</ul>
<p>Ilia Mirkin (6):</p>
<ul>
<li>st/mesa: avoid integer overflows with buffers &gt;= 512MB</li>
<li>nv50, nvc0: fix max texture buffer size to 128M elements</li>
<li>freedreno/a3xx: fix blending of L8 format</li>
<li>nv50,nvc0: detect underlying resource changes and update tic</li>
<li>nv50,nvc0: flush texture cache in presence of coherent bufs</li>
<li>radeonsi: load fmask ptr relative to the resources array</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>nir: Fix a bunch of ralloc parenting errors</li>
<li>i965/vec4: Don't reswizzle hardware registers</li>
</ul>
<p>Jeremy Huddleston (1):</p>
<ul>
<li>configure.ac: Add support to enable read-only text segment on x86.</li>
</ul>
<p>Ray Strode (1):</p>
<ul>
<li>gbm: convert gbm bo format to fourcc format on dma-buf import</li>
</ul>
<p>Tapani Pälli (2):</p>
<ul>
<li>mesa: fix errors when reading depth with glReadPixels</li>
<li>i965: fix textureGrad for cubemaps</li>
</ul>
<p>Ulrich Weigand (1):</p>
<ul>
<li>mesa: Fix texture compression on big-endian systems</li>
</ul>
</div>
</body>
</html>

@@ -355,6 +355,8 @@ fd3_fs_output_format(enum pipe_format format)
case PIPE_FORMAT_R16G16_FLOAT:
case PIPE_FORMAT_R11G11B10_FLOAT:
return RB_R16G16B16A16_FLOAT;
+case PIPE_FORMAT_L8_UNORM:
+return RB_R8G8B8A8_UNORM;
default:
return fd3_pipe2color(format);
}

@@ -100,7 +100,7 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_TEXEL_OFFSET:
return 7;
case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
-return 65536;
+return 128 * 1024 * 1024;
case PIPE_CAP_GLSL_FEATURE_LEVEL:
return 330;
case PIPE_CAP_MAX_RENDER_TARGETS:

@@ -221,6 +221,26 @@ nv50_create_texture_view(struct pipe_context *pipe,
return &view->pipe;
}
static void
nv50_update_tic(struct nv50_context *nv50, struct nv50_tic_entry *tic,
struct nv04_resource *res)
{
uint64_t address = res->address;
if (res->base.target != PIPE_BUFFER)
return;
address += tic->pipe.u.buf.first_element *
util_format_get_blocksize(tic->pipe.format);
if (tic->tic[1] == (uint32_t)address &&
(tic->tic[2] & 0xff) == address >> 32)
return;
nv50_screen_tic_unlock(nv50->screen, tic);
tic->id = -1;
tic->tic[1] = address;
tic->tic[2] &= 0xffffff00;
tic->tic[2] |= address >> 32;
}
static bool
nv50_validate_tic(struct nv50_context *nv50, int s)
{
@@ -240,6 +260,7 @@ nv50_validate_tic(struct nv50_context *nv50, int s)
continue;
}
res = &nv50_miptree(tic->pipe.texture)->base;
nv50_update_tic(nv50, tic, res);
if (tic->id < 0) {
tic->id = nv50_screen_tic_alloc(nv50->screen, tic);

@@ -768,6 +768,7 @@ nv50_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
{
struct nv50_context *nv50 = nv50_context(pipe);
struct nouveau_pushbuf *push = nv50->base.pushbuf;
bool tex_dirty = false;
int i, s;
/* NOTE: caller must ensure that (min_index + index_bias) is >= 0 */
@@ -797,6 +798,9 @@ nv50_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
push->kick_notify = nv50_draw_vbo_kick_notify;
/* TODO: Instead of iterating over all the buffer resources looking for
* coherent buffers, keep track of a context-wide count.
*/
for (s = 0; s < 3 && !nv50->cb_dirty; ++s) {
uint32_t valid = nv50->constbuf_valid[s];
@@ -824,6 +828,21 @@ nv50_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
nv50->cb_dirty = false;
}
for (s = 0; s < 3 && !tex_dirty; ++s) {
for (i = 0; i < nv50->num_textures[s] && !tex_dirty; ++i) {
if (!nv50->textures[s][i] ||
nv50->textures[s][i]->texture->target != PIPE_BUFFER)
continue;
if (nv50->textures[s][i]->texture->flags &
PIPE_RESOURCE_FLAG_MAP_COHERENT)
tex_dirty = true;
}
}
if (tex_dirty) {
BEGIN_NV04(push, NV50_3D(TEX_CACHE_CTL), 1);
PUSH_DATA (push, 0x20);
}
if (nv50->vbo_fifo) {
nv50_push_vbo(nv50, info);
push->kick_notify = nv50_default_kick_notify;

@@ -87,7 +87,7 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET:
return 31;
case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
-return 65536;
+return 128 * 1024 * 1024;
case PIPE_CAP_GLSL_FEATURE_LEVEL:
return 410;
case PIPE_CAP_MAX_RENDER_TARGETS:

@@ -226,6 +226,26 @@ nvc0_create_texture_view(struct pipe_context *pipe,
return &view->pipe;
}
static void
nvc0_update_tic(struct nvc0_context *nvc0, struct nv50_tic_entry *tic,
struct nv04_resource *res)
{
uint64_t address = res->address;
if (res->base.target != PIPE_BUFFER)
return;
address += tic->pipe.u.buf.first_element *
util_format_get_blocksize(tic->pipe.format);
if (tic->tic[1] == (uint32_t)address &&
(tic->tic[2] & 0xff) == address >> 32)
return;
nvc0_screen_tic_unlock(nvc0->screen, tic);
tic->id = -1;
tic->tic[1] = address;
tic->tic[2] &= 0xffffff00;
tic->tic[2] |= address >> 32;
}
static bool
nvc0_validate_tic(struct nvc0_context *nvc0, int s)
{
@@ -247,6 +267,7 @@ nvc0_validate_tic(struct nvc0_context *nvc0, int s)
continue;
}
res = nv04_resource(tic->pipe.texture);
nvc0_update_tic(nvc0, tic, res);
if (tic->id < 0) {
tic->id = nvc0_screen_tic_alloc(nvc0->screen, tic);
@@ -313,6 +334,7 @@ nve4_validate_tic(struct nvc0_context *nvc0, unsigned s)
continue;
}
res = nv04_resource(tic->pipe.texture);
nvc0_update_tic(nvc0, tic, res);
if (tic->id < 0) {
tic->id = nvc0_screen_tic_alloc(nvc0->screen, tic);

@@ -899,6 +899,9 @@ nvc0_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
push->kick_notify = nvc0_draw_vbo_kick_notify;
/* TODO: Instead of iterating over all the buffer resources looking for
* coherent buffers, keep track of a context-wide count.
*/
for (s = 0; s < 5 && !nvc0->cb_dirty; ++s) {
uint32_t valid = nvc0->constbuf_valid[s];
@@ -924,6 +927,23 @@ nvc0_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
nvc0->cb_dirty = false;
}
for (s = 0; s < 5; ++s) {
for (int i = 0; i < nvc0->num_textures[s]; ++i) {
struct nv50_tic_entry *tic = nv50_tic_entry(nvc0->textures[s][i]);
struct pipe_resource *res;
if (!tic)
continue;
res = nvc0->textures[s][i]->texture;
if (res->target != PIPE_BUFFER ||
!(res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT))
continue;
BEGIN_NVC0(push, NVC0_3D(TEX_CACHE_CTL), 1);
PUSH_DATA (push, (tic->id << 4) | 1);
NOUVEAU_DRV_STAT(&nvc0->screen->base, tex_cache_flush_count, 1);
}
}
if (nvc0->state.vbo_mode) {
nvc0_push_vbo(nvc0, info);
push->kick_notify = nvc0_default_kick_notify;

@@ -2300,7 +2300,7 @@ static void tex_fetch_args(
lp_build_const_int32(gallivm,
SI_FMASK_TEX_OFFSET), "");
fmask_ptr = LLVMGetParam(si_shader_ctx->radeon_bld.main_fn, SI_PARAM_RESOURCE);
-fmask_ptr = build_indexed_load_const(si_shader_ctx, res_ptr, ind_index);
+fmask_ptr = build_indexed_load_const(si_shader_ctx, fmask_ptr, ind_index);
}
} else {
res_ptr = si_shader_ctx->resources[sampler_index];

@@ -706,14 +706,30 @@ gbm_dri_bo_import(struct gbm_device *gbm,
{
struct gbm_import_fd_data *fd_data = buffer;
int stride = fd_data->stride, offset = 0;
+int dri_format;
+switch (fd_data->format) {
+case GBM_BO_FORMAT_XRGB8888:
+dri_format = GBM_FORMAT_XRGB8888;
+break;
+case GBM_BO_FORMAT_ARGB8888:
+dri_format = GBM_FORMAT_ARGB8888;
+break;
+default:
+dri_format = fd_data->format;
+}
image = dri->image->createImageFromFds(dri->screen,
fd_data->width,
fd_data->height,
-fd_data->format,
+dri_format,
&fd_data->fd, 1,
&stride, &offset,
NULL);
+if (image == NULL) {
+errno = EINVAL;
+return NULL;
+}
gbm_format = fd_data->format;
break;
}

@@ -145,7 +145,7 @@ void nir_src_copy(nir_src *dest, const nir_src *src, void *mem_ctx)
}
}
-void nir_dest_copy(nir_dest *dest, const nir_dest *src, void *mem_ctx)
+void nir_dest_copy(nir_dest *dest, const nir_dest *src, nir_instr *instr)
{
/* Copying an SSA definition makes no sense whatsoever. */
assert(!src->is_ssa);
@@ -155,17 +155,18 @@ void nir_dest_copy(nir_dest *dest, const nir_dest *src, void *mem_ctx)
dest->reg.base_offset = src->reg.base_offset;
dest->reg.reg = src->reg.reg;
if (src->reg.indirect) {
-dest->reg.indirect = ralloc(mem_ctx, nir_src);
-nir_src_copy(dest->reg.indirect, src->reg.indirect, mem_ctx);
+dest->reg.indirect = ralloc(instr, nir_src);
+nir_src_copy(dest->reg.indirect, src->reg.indirect, instr);
} else {
dest->reg.indirect = NULL;
}
}
void
-nir_alu_src_copy(nir_alu_src *dest, const nir_alu_src *src, void *mem_ctx)
+nir_alu_src_copy(nir_alu_src *dest, const nir_alu_src *src,
+nir_alu_instr *instr)
{
-nir_src_copy(&dest->src, &src->src, mem_ctx);
+nir_src_copy(&dest->src, &src->src, &instr->instr);
dest->abs = src->abs;
dest->negate = src->negate;
for (unsigned i = 0; i < 4; i++)
@@ -173,9 +174,10 @@ nir_alu_src_copy(nir_alu_src *dest, const nir_alu_src *src, void *mem_ctx)
}
void
-nir_alu_dest_copy(nir_alu_dest *dest, const nir_alu_dest *src, void *mem_ctx)
+nir_alu_dest_copy(nir_alu_dest *dest, const nir_alu_dest *src,
+nir_alu_instr *instr)
{
-nir_dest_copy(&dest->dest, &src->dest, mem_ctx);
+nir_dest_copy(&dest->dest, &src->dest, &instr->instr);
dest->write_mask = src->write_mask;
dest->saturate = src->saturate;
}
@@ -1921,14 +1923,14 @@ nir_ssa_def_rewrite_uses(nir_ssa_def *def, nir_src new_src, void *mem_ctx)
nir_foreach_use_safe(def, use_src) {
nir_instr *src_parent_instr = use_src->parent_instr;
list_del(&use_src->use_link);
-nir_src_copy(use_src, &new_src, mem_ctx);
+nir_src_copy(use_src, &new_src, src_parent_instr);
src_add_all_uses(use_src, src_parent_instr, NULL);
}
nir_foreach_if_use_safe(def, use_src) {
nir_if *src_parent_if = use_src->parent_if;
list_del(&use_src->use_link);
-nir_src_copy(use_src, &new_src, mem_ctx);
+nir_src_copy(use_src, &new_src, src_parent_if);
src_add_all_uses(use_src, NULL, src_parent_if);
}
}

@@ -580,8 +580,8 @@ nir_dest_for_reg(nir_register *reg)
return dest;
}
-void nir_src_copy(nir_src *dest, const nir_src *src, void *mem_ctx);
-void nir_dest_copy(nir_dest *dest, const nir_dest *src, void *mem_ctx);
+void nir_src_copy(nir_src *dest, const nir_src *src, void *instr_or_if);
+void nir_dest_copy(nir_dest *dest, const nir_dest *src, nir_instr *instr);
typedef struct {
nir_src src;
@@ -630,10 +630,6 @@ typedef struct {
unsigned write_mask : 4; /* ignored if dest.is_ssa is true */
} nir_alu_dest;
-void nir_alu_src_copy(nir_alu_src *dest, const nir_alu_src *src, void *mem_ctx);
-void nir_alu_dest_copy(nir_alu_dest *dest, const nir_alu_dest *src,
-void *mem_ctx);
typedef enum {
nir_type_invalid = 0, /* Not a valid type */
nir_type_float,
@@ -702,6 +698,11 @@ typedef struct nir_alu_instr {
nir_alu_src src[];
} nir_alu_instr;
+void nir_alu_src_copy(nir_alu_src *dest, const nir_alu_src *src,
+nir_alu_instr *instr);
+void nir_alu_dest_copy(nir_alu_dest *dest, const nir_alu_dest *src,
+nir_alu_instr *instr);
/* is this source channel used? */
static inline bool
nir_alu_instr_channel_used(nir_alu_instr *instr, unsigned src, unsigned channel)

@@ -561,7 +561,7 @@ emit_copy(nir_parallel_copy_instr *pcopy, nir_src src, nir_src dest_src,
assert(src.reg.reg->num_components >= dest_src.reg.reg->num_components);
nir_alu_instr *mov = nir_alu_instr_create(mem_ctx, nir_op_imov);
-nir_src_copy(&mov->src[0].src, &src, mem_ctx);
+nir_src_copy(&mov->src[0].src, &src, mov);
mov->dest.dest = nir_dest_for_reg(dest_src.reg.reg);
mov->dest.write_mask = (1 << dest_src.reg.reg->num_components) - 1;

@@ -46,11 +46,11 @@ lower_reduction(nir_alu_instr *instr, nir_op chan_op, nir_op merge_op,
for (unsigned i = 0; i < num_components; i++) {
nir_alu_instr *chan = nir_alu_instr_create(mem_ctx, chan_op);
nir_alu_ssa_dest_init(chan, 1);
-nir_alu_src_copy(&chan->src[0], &instr->src[0], mem_ctx);
+nir_alu_src_copy(&chan->src[0], &instr->src[0], chan);
chan->src[0].swizzle[0] = chan->src[0].swizzle[i];
if (nir_op_infos[chan_op].num_inputs > 1) {
assert(nir_op_infos[chan_op].num_inputs == 2);
-nir_alu_src_copy(&chan->src[1], &instr->src[1], mem_ctx);
+nir_alu_src_copy(&chan->src[1], &instr->src[1], chan);
chan->src[1].swizzle[0] = chan->src[1].swizzle[i];
}
@@ -153,7 +153,7 @@ lower_alu_instr_scalar(nir_alu_instr *instr, void *mem_ctx)
unsigned src_chan = (nir_op_infos[instr->op].input_sizes[i] == 1 ?
0 : chan);
-nir_alu_src_copy(&lower->src[i], &instr->src[i], mem_ctx);
+nir_alu_src_copy(&lower->src[i], &instr->src[i], lower);
for (int j = 0; j < 4; j++)
lower->src[i].swizzle[j] = instr->src[i].swizzle[src_chan];
}

@@ -91,7 +91,7 @@ lower_instr(nir_intrinsic_instr *instr, nir_function_impl *impl)
nir_alu_instr *mul = nir_alu_instr_create(mem_ctx, nir_op_imul);
nir_ssa_dest_init(&mul->instr, &mul->dest.dest, 1, NULL);
mul->dest.write_mask = 0x1;
-nir_src_copy(&mul->src[0].src, &deref_array->indirect, mem_ctx);
+nir_src_copy(&mul->src[0].src, &deref_array->indirect, mul);
mul->src[1].src.is_ssa = true;
mul->src[1].src.ssa = &atomic_counter_size->def;
nir_instr_insert_before(&instr->instr, &mul->instr);

@@ -376,7 +376,7 @@ nir_lower_io_block(nir_block *block, void *void_state)
store->const_index[0] = offset;
-nir_src_copy(&store->src[0], &intrin->src[0], state->mem_ctx);
+nir_src_copy(&store->src[0], &intrin->src[0], store);
if (has_indirect)
store->src[1] = indirect;

@@ -183,8 +183,7 @@ get_deref_reg_src(nir_deref_var *deref, nir_instr *instr,
nir_alu_instr *add = nir_alu_instr_create(state->shader,
nir_op_iadd);
add->src[0].src = *src.reg.indirect;
-nir_src_copy(&add->src[1].src, &deref_array->indirect,
-state->shader);
+nir_src_copy(&add->src[1].src, &deref_array->indirect, add);
add->dest.write_mask = 1;
nir_ssa_dest_init(&add->instr, &add->dest.dest, 1, NULL);
nir_instr_insert_before(instr, &add->instr);
@@ -225,7 +224,7 @@ lower_locals_to_regs_block(nir_block *block, void *void_state)
nir_src_for_ssa(&mov->dest.dest.ssa),
state->shader);
} else {
-nir_dest_copy(&mov->dest.dest, &intrin->dest, state->shader);
+nir_dest_copy(&mov->dest.dest, &intrin->dest, &mov->instr);
}
nir_instr_insert_before(&intrin->instr, &mov->instr);
@@ -241,7 +240,7 @@ lower_locals_to_regs_block(nir_block *block, void *void_state)
&intrin->instr, state);
nir_alu_instr *mov = nir_alu_instr_create(state->shader, nir_op_imov);
nir_src_copy(&mov->src[0].src, &intrin->src[0], state->shader);
nir_src_copy(&mov->src[0].src, &intrin->src[0], mov);
mov->dest.write_mask = (1 << intrin->num_components) - 1;
mov->dest.dest.is_ssa = false;
mov->dest.dest.reg.reg = reg_src.reg.reg;

@@ -60,8 +60,8 @@ insert_mov(nir_alu_instr *vec, unsigned start_channel,
assert(src_idx < nir_op_infos[vec->op].num_inputs);
nir_alu_instr *mov = nir_alu_instr_create(mem_ctx, nir_op_imov);
nir_alu_src_copy(&mov->src[0], &vec->src[src_idx], mem_ctx);
nir_alu_dest_copy(&mov->dest, &vec->dest, mem_ctx);
nir_alu_src_copy(&mov->src[0], &vec->src[src_idx], mov);
nir_alu_dest_copy(&mov->dest, &vec->dest, mov);
mov->dest.write_mask = (1u << start_channel);
mov->src[0].swizzle[start_channel] = vec->src[src_idx].swizzle[0];

@@ -216,8 +216,7 @@ nir_opt_peephole_ffma_block(nir_block *block, void *void_state)
for (unsigned j = 0; j < add->dest.dest.ssa.num_components; j++)
ffma->src[i].swizzle[j] = mul->src[i].swizzle[swizzle[j]];
}
nir_alu_src_copy(&ffma->src[2], &add->src[1 - add_mul_src],
state->mem_ctx);
nir_alu_src_copy(&ffma->src[2], &add->src[1 - add_mul_src], ffma);
assert(add->dest.dest.is_ssa);

@@ -195,7 +195,7 @@ nir_opt_peephole_select_block(nir_block *block, void *void_state)
nir_phi_instr *phi = nir_instr_as_phi(instr);
nir_alu_instr *sel = nir_alu_instr_create(state->mem_ctx, nir_op_bcsel);
nir_src_copy(&sel->src[0].src, &if_stmt->condition, state->mem_ctx);
nir_src_copy(&sel->src[0].src, &if_stmt->condition, sel);
/* Splat the condition to all channels */
memset(sel->src[0].swizzle, 0, sizeof sel->src[0].swizzle);
@@ -205,7 +205,7 @@ nir_opt_peephole_select_block(nir_block *block, void *void_state)
assert(src->src.is_ssa);
unsigned idx = src->pred == then_block ? 1 : 2;
nir_src_copy(&sel->src[idx].src, &src->src, state->mem_ctx);
nir_src_copy(&sel->src[idx].src, &src->src, sel);
}
nir_ssa_dest_init(&sel->instr, &sel->dest.dest,

@@ -45,6 +45,24 @@
#include "uniforms.h"
#include "varray.h"
static bool
need_signed_unsigned_int_conversion(mesa_format mesaFormat,
GLenum format, GLenum type)
{
const GLenum mesaFormatType = _mesa_get_format_datatype(mesaFormat);
const bool is_format_integer = _mesa_is_enum_format_integer(format);
return (mesaFormatType == GL_INT &&
is_format_integer &&
(type == GL_UNSIGNED_INT ||
type == GL_UNSIGNED_SHORT ||
type == GL_UNSIGNED_BYTE)) ||
(mesaFormatType == GL_UNSIGNED_INT &&
is_format_integer &&
(type == GL_INT ||
type == GL_SHORT ||
type == GL_BYTE));
}
static struct gl_texture_image *
create_texture_for_pbo(struct gl_context *ctx, bool create_pbo,
GLenum pbo_target, int width, int height,
@@ -166,6 +184,13 @@ _mesa_meta_pbo_TexSubImage(struct gl_context *ctx, GLuint dims,
if (ctx->_ImageTransferState)
return false;
/* This function relies on BlitFramebuffer to fill in the pixel data for
* glTex[Sub]Image*D. But, BlitFramebuffer doesn't support signed to
* unsigned or unsigned to signed integer conversions.
*/
if (need_signed_unsigned_int_conversion(tex_image->TexFormat, format, type))
return false;
/* For arrays, use a tall (height * depth) 2D texture but taking into
* account the inter-image padding specified with the image height packing
* property.
@@ -250,24 +275,6 @@ fail:
return success;
}
static bool
need_signed_unsigned_int_conversion(mesa_format rbFormat,
GLenum format, GLenum type)
{
const GLenum srcType = _mesa_get_format_datatype(rbFormat);
const bool is_dst_format_integer = _mesa_is_enum_format_integer(format);
return (srcType == GL_INT &&
is_dst_format_integer &&
(type == GL_UNSIGNED_INT ||
type == GL_UNSIGNED_SHORT ||
type == GL_UNSIGNED_BYTE)) ||
(srcType == GL_UNSIGNED_INT &&
is_dst_format_integer &&
(type == GL_INT ||
type == GL_SHORT ||
type == GL_BYTE));
}
bool
_mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, GLuint dims,
struct gl_texture_image *tex_image,

@@ -251,7 +251,7 @@ intel_run_render(struct gl_context * ctx, struct tnl_pipeline_stage *stage)
continue;
intel_render_tab_verts[prim & PRIM_MODE_MASK] (ctx, start,
start + length, prim);
length, prim);
}
tnl->Driver.Render.Finish(ctx);

@@ -48,6 +48,7 @@ public:
private:
void emit(ir_variable *, ir_rvalue *);
ir_variable *temp(void *ctx, const glsl_type *type, const char *name);
};
/**
@@ -60,6 +61,17 @@ lower_texture_grad_visitor::emit(ir_variable *var, ir_rvalue *value)
base_ir->insert_before(assign(var, value));
}
/**
* Emit a temporary variable declaration
*/
ir_variable *
lower_texture_grad_visitor::temp(void *ctx, const glsl_type *type, const char *name)
{
ir_variable *var = new(ctx) ir_variable(type, name, ir_var_temporary);
base_ir->insert_before(var);
return var;
}
static const glsl_type *
txs_type(const glsl_type *type)
{
@@ -144,28 +156,179 @@ lower_texture_grad_visitor::visit_leave(ir_texture *ir)
new(mem_ctx) ir_variable(grad_type, "dPdy", ir_var_temporary);
emit(dPdy, mul(size, ir->lod_info.grad.dPdy));
/* Calculate rho from equation 3.20 of the GL 3.0 specification. */
ir_rvalue *rho;
if (dPdx->type->is_scalar()) {
rho = expr(ir_binop_max, expr(ir_unop_abs, dPdx),
expr(ir_unop_abs, dPdy));
} else {
rho = expr(ir_binop_max, expr(ir_unop_sqrt, dot(dPdx, dPdx)),
expr(ir_unop_sqrt, dot(dPdy, dPdy)));
}
/* lambda_base = log2(rho). We're ignoring GL state biases for now.
*
* For cube maps the result of these formulas is giving us a value of rho
* that is twice the value we should use, so divide it by 2 or,
* alternatively, remove one unit from the result of the log2 computation.
*/
ir->op = ir_txl;
if (ir->sampler->type->sampler_dimensionality == GLSL_SAMPLER_DIM_CUBE) {
ir->lod_info.lod = expr(ir_binop_add,
expr(ir_unop_log2, rho),
new(mem_ctx) ir_constant(-1.0f));
/* Cubemap texture lookups first generate a texture coordinate normalized
* to [-1, 1] on the appropriate face. The appropriate face is determined
* by which component has largest magnitude and its sign. The texture
* coordinate is the quotient of the remaining texture coordinates against
* that absolute value of the component of largest magnitude. This
* division requires that the computing of the derivative of the texel
* coordinate must use the quotient rule. The high level GLSL code is as
* follows:
*
* Step 1: selection
*
* vec3 abs_p, Q, dQdx, dQdy;
* abs_p = abs(ir->coordinate);
* if (abs_p.x >= max(abs_p.y, abs_p.z)) {
* Q = ir->coordinate.yzx;
* dQdx = ir->lod_info.grad.dPdx.yzx;
* dQdy = ir->lod_info.grad.dPdy.yzx;
* }
* if (abs_p.y >= max(abs_p.x, abs_p.z)) {
* Q = ir->coordinate.xzy;
* dQdx = ir->lod_info.grad.dPdx.xzy;
* dQdy = ir->lod_info.grad.dPdy.xzy;
* }
* if (abs_p.z >= max(abs_p.x, abs_p.y)) {
* Q = ir->coordinate;
* dQdx = ir->lod_info.grad.dPdx;
* dQdy = ir->lod_info.grad.dPdy;
* }
*
* Step 2: use quotient rule to compute derivative. The normalized to
* [-1, 1] texel coordinate is given by Q.xy / (sign(Q.z) * Q.z). We are
* only concerned with the magnitudes of the derivatives whose values are
* not affected by the sign. We drop the sign from the computation.
*
* vec2 dx, dy;
* float recip;
*
* recip = 1.0 / Q.z;
* dx = recip * ( dQdx.xy - Q.xy * (dQdx.z * recip) );
* dy = recip * ( dQdy.xy - Q.xy * (dQdy.z * recip) );
*
* Step 3: compute LOD. At this point we have the derivatives of the
* texture coordinates normalized to [-1,1]. We take the LOD to be
* result = log2(max(sqrt(dot(dx, dx)), sqrt(dot(dy, dy))) * 0.5 * L)
* = -1.0 + log2(max(sqrt(dot(dx, dx)), sqrt(dot(dy, dy))) * L)
* = -1.0 + log2(sqrt(max(dot(dx, dx), dot(dy,dy))) * L)
* = -1.0 + log2(sqrt(L * L * max(dot(dx, dx), dot(dy,dy))))
* = -1.0 + 0.5 * log2(L * L * max(dot(dx, dx), dot(dy,dy)))
* where L is the dimension of the cubemap. The code is:
*
* float M, result;
* M = max(dot(dx, dx), dot(dy, dy));
* L = textureSize(sampler, 0).x;
* result = -1.0 + 0.5 * log2(L * L * M);
*/
/* Helpers to make code more human readable. */
#define EMIT(instr) base_ir->insert_before(instr)
#define THEN(irif, instr) irif->then_instructions.push_tail(instr)
#define CLONE(x) x->clone(mem_ctx, NULL)
ir_variable *abs_p = temp(mem_ctx, glsl_type::vec3_type, "abs_p");
EMIT(assign(abs_p, swizzle_for_size(abs(CLONE(ir->coordinate)), 3)));
ir_variable *Q = temp(mem_ctx, glsl_type::vec3_type, "Q");
ir_variable *dQdx = temp(mem_ctx, glsl_type::vec3_type, "dQdx");
ir_variable *dQdy = temp(mem_ctx, glsl_type::vec3_type, "dQdy");
/* unmodified dPdx, dPdy values */
ir_rvalue *dPdx = ir->lod_info.grad.dPdx;
ir_rvalue *dPdy = ir->lod_info.grad.dPdy;
/* 1. compute selector */
/* if (abs_p.x >= max(abs_p.y, abs_p.z)) ... */
ir_if *branch_x =
new(mem_ctx) ir_if(gequal(swizzle_x(abs_p),
max2(swizzle_y(abs_p), swizzle_z(abs_p))));
/* Q = p.yzx;
* dQdx = dPdx.yzx;
* dQdy = dPdy.yzx;
*/
int yzx = MAKE_SWIZZLE4(SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_X, 0);
THEN(branch_x, assign(Q, swizzle(CLONE(ir->coordinate), yzx, 3)));
THEN(branch_x, assign(dQdx, swizzle(CLONE(dPdx), yzx, 3)));
THEN(branch_x, assign(dQdy, swizzle(CLONE(dPdy), yzx, 3)));
EMIT(branch_x);
/* if (abs_p.y >= max(abs_p.x, abs_p.z)) */
ir_if *branch_y =
new(mem_ctx) ir_if(gequal(swizzle_y(abs_p),
max2(swizzle_x(abs_p), swizzle_z(abs_p))));
/* Q = p.xzy;
* dQdx = dPdx.xzy;
* dQdy = dPdy.xzy;
*/
int xzy = MAKE_SWIZZLE4(SWIZZLE_X, SWIZZLE_Z, SWIZZLE_Y, 0);
THEN(branch_y, assign(Q, swizzle(CLONE(ir->coordinate), xzy, 3)));
THEN(branch_y, assign(dQdx, swizzle(CLONE(dPdx), xzy, 3)));
THEN(branch_y, assign(dQdy, swizzle(CLONE(dPdy), xzy, 3)));
EMIT(branch_y);
/* if (abs_p.z >= max(abs_p.x, abs_p.y)) */
ir_if *branch_z =
new(mem_ctx) ir_if(gequal(swizzle_z(abs_p),
max2(swizzle_x(abs_p), swizzle_y(abs_p))));
/* Q = p;
* dQdx = dPdx;
* dQdy = dPdy;
*/
THEN(branch_z, assign(Q, swizzle_for_size(CLONE(ir->coordinate), 3)));
THEN(branch_z, assign(dQdx, CLONE(dPdx)));
THEN(branch_z, assign(dQdy, CLONE(dPdy)));
EMIT(branch_z);
/* 2. quotient rule */
ir_variable *recip = temp(mem_ctx, glsl_type::float_type, "recip");
EMIT(assign(recip, div(new(mem_ctx) ir_constant(1.0f), swizzle_z(Q))));
ir_variable *dx = temp(mem_ctx, glsl_type::vec2_type, "dx");
ir_variable *dy = temp(mem_ctx, glsl_type::vec2_type, "dy");
/* tmp = Q.xy * recip;
* dx = recip * ( dQdx.xy - (tmp * dQdx.z) );
* dy = recip * ( dQdy.xy - (tmp * dQdy.z) );
*/
ir_variable *tmp = temp(mem_ctx, glsl_type::vec2_type, "tmp");
EMIT(assign(tmp, mul(swizzle_xy(Q), recip)));
EMIT(assign(dx, mul(recip, sub(swizzle_xy(dQdx),
mul(tmp, swizzle_z(dQdx))))));
EMIT(assign(dy, mul(recip, sub(swizzle_xy(dQdy),
mul(tmp, swizzle_z(dQdy))))));
/* M = max(dot(dx, dx), dot(dy, dy)); */
ir_variable *M = temp(mem_ctx, glsl_type::float_type, "M");
EMIT(assign(M, max2(dot(dx, dx), dot(dy, dy))));
/* size has textureSize() of LOD 0 */
ir_variable *L = temp(mem_ctx, glsl_type::float_type, "L");
EMIT(assign(L, swizzle_x(size)));
ir_variable *result = temp(mem_ctx, glsl_type::float_type, "result");
/* result = -1.0 + 0.5 * log2(L * L * M); */
EMIT(assign(result,
add(new(mem_ctx)ir_constant(-1.0f),
mul(new(mem_ctx)ir_constant(0.5f),
expr(ir_unop_log2, mul(mul(L, L), M))))));
/* 3. final assignment of parameters to textureLod call */
ir->lod_info.lod = new (mem_ctx) ir_dereference_variable(result);
#undef THEN
#undef EMIT
} else {
/* Calculate rho from equation 3.20 of the GL 3.0 specification. */
ir_rvalue *rho;
if (dPdx->type->is_scalar()) {
rho = expr(ir_binop_max, expr(ir_unop_abs, dPdx),
expr(ir_unop_abs, dPdy));
} else {
rho = expr(ir_binop_max, expr(ir_unop_sqrt, dot(dPdx, dPdx)),
expr(ir_unop_sqrt, dot(dPdy, dPdy)));
}
/* lambda_base = log2(rho). We're ignoring GL state biases for now. */
ir->lod_info.lod = expr(ir_unop_log2, rho);
}

@@ -950,6 +950,14 @@ vec4_instruction::can_reswizzle(int dst_writemask,
if (mlen > 0)
return false;
/* We can't use swizzles on the accumulator and that's really the only
* HW_REG we would care to reswizzle so just disallow them all.
*/
for (int i = 0; i < 3; i++) {
if (src[i].file == HW_REG)
return false;
}
return true;
}
@@ -1053,6 +1061,17 @@ vec4_visitor::opt_register_coalesce()
}
}
/* This doesn't handle saturation on the instruction we
* want to coalesce away if the register types do not match.
* But if scan_inst is a non type-converting 'mov', we can fix
* the types later.
*/
if (inst->saturate &&
inst->dst.type != scan_inst->dst.type &&
!(scan_inst->opcode == BRW_OPCODE_MOV &&
scan_inst->dst.type == scan_inst->src[0].type))
break;
/* If we can't handle the swizzle, bail. */
if (!scan_inst->can_reswizzle(inst->dst.writemask,
inst->src[0].swizzle,
@@ -1128,6 +1147,16 @@ vec4_visitor::opt_register_coalesce()
scan_inst->dst.file = inst->dst.file;
scan_inst->dst.reg = inst->dst.reg;
scan_inst->dst.reg_offset = inst->dst.reg_offset;
if (inst->saturate &&
inst->dst.type != scan_inst->dst.type) {
/* If we have reached this point, scan_inst is a non
* type-converting 'mov' and we can modify its register types
* to match the ones in inst. Otherwise, we could have an
* incorrect saturation result.
*/
scan_inst->dst.type = inst->dst.type;
scan_inst->src[0].type = inst->src[0].type;
}
scan_inst->saturate |= inst->saturate;
}
scan_inst = (vec4_instruction *)scan_inst->next;

@@ -456,7 +456,7 @@ void
vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr)
{
dst_reg reg = dst_reg(GRF, alloc.allocate(1));
reg.type = BRW_REGISTER_TYPE_F;
reg.type = BRW_REGISTER_TYPE_D;
unsigned remaining = brw_writemask_for_size(instr->def.num_components);
@@ -477,7 +477,7 @@ vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr)
}
reg.writemask = writemask;
emit(MOV(reg, src_reg(instr->value.f[i])));
emit(MOV(reg, src_reg(instr->value.i[i])));
remaining &= ~writemask;
}

@@ -446,7 +446,7 @@ static GLboolean radeon_run_render( struct gl_context *ctx,
start, start+length);
if (length)
tab[prim & PRIM_MODE_MASK]( ctx, start, start + length, prim );
tab[prim & PRIM_MODE_MASK](ctx, start, length, prim);
}
tnl->Driver.Render.Finish( ctx );

@@ -3595,7 +3595,16 @@ _mesa_get_framebuffer_attachment_parameter(struct gl_context *ctx,
switch (pname) {
case GL_FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE_EXT:
*params = _mesa_is_winsys_fbo(buffer)
/* From the OpenGL spec, 9.2. Binding and Managing Framebuffer Objects:
*
* "If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is NONE, then
* either no framebuffer is bound to target; or the default framebuffer
* is bound, attachment is DEPTH or STENCIL, and the number of depth or
* stencil bits, respectively, is zero."
*/
*params = (_mesa_is_winsys_fbo(buffer) &&
((attachment != GL_DEPTH && attachment != GL_STENCIL) ||
(att->Type != GL_NONE)))
? GL_FRAMEBUFFER_DEFAULT : att->Type;
return;
case GL_FRAMEBUFFER_ATTACHMENT_OBJECT_NAME_EXT:

@@ -494,7 +494,8 @@ _mesa_bytes_per_pixel(GLenum format, GLenum type)
else
return -1;
case GL_UNSIGNED_INT_24_8_EXT:
if (format == GL_DEPTH_STENCIL_EXT)
if (format == GL_DEPTH_COMPONENT ||
format == GL_DEPTH_STENCIL_EXT)
return sizeof(GLuint);
else
return -1;
@@ -1691,6 +1692,10 @@ _mesa_error_check_format_and_type(const struct gl_context *ctx,
return GL_INVALID_OPERATION;
case GL_UNSIGNED_INT_24_8:
/* Depth buffer OK to read in OpenGL ES (NV_read_depth). */
if (ctx->API == API_OPENGLES2 && format == GL_DEPTH_COMPONENT)
return GL_NO_ERROR;
if (format != GL_DEPTH_STENCIL) {
return GL_INVALID_OPERATION;
}

@@ -963,6 +963,7 @@ read_pixels_es3_error_check(GLenum format, GLenum type,
return GL_NO_ERROR;
break;
case GL_UNSIGNED_SHORT:
case GL_UNSIGNED_INT:
case GL_UNSIGNED_INT_24_8:
if (!is_float_depth)
return GL_NO_ERROR;

@@ -1291,7 +1291,8 @@ _mesa_texstore_bptc_rgba_unorm(TEXSTORE_PARAMS)
tempImageSlices[0] = (GLubyte *) tempImage;
_mesa_texstore(ctx, dims,
baseInternalFormat,
MESA_FORMAT_R8G8B8A8_UNORM,
_mesa_little_endian() ? MESA_FORMAT_R8G8B8A8_UNORM
: MESA_FORMAT_A8B8G8R8_UNORM,
rgbaRowStride, tempImageSlices,
srcWidth, srcHeight, srcDepth,
srcFormat, srcType, srcAddr,

@@ -130,7 +130,8 @@ _mesa_texstore_rgba_fxt1(TEXSTORE_PARAMS)
tempImageSlices[0] = (GLubyte *) tempImage;
_mesa_texstore(ctx, dims,
baseInternalFormat,
MESA_FORMAT_R8G8B8A8_UNORM,
_mesa_little_endian() ? MESA_FORMAT_R8G8B8A8_UNORM
: MESA_FORMAT_A8B8G8R8_UNORM,
rgbaRowStride, tempImageSlices,
srcWidth, srcHeight, srcDepth,
srcFormat, srcType, srcAddr,

@@ -196,9 +196,11 @@ _mesa_texstore_rg_rgtc2(TEXSTORE_PARAMS)
dstFormat == MESA_FORMAT_LA_LATC2_UNORM);
if (baseInternalFormat == GL_RG)
tempFormat = MESA_FORMAT_R8G8_UNORM;
tempFormat = _mesa_little_endian() ? MESA_FORMAT_R8G8_UNORM
: MESA_FORMAT_G8R8_UNORM;
else
tempFormat = MESA_FORMAT_L8A8_UNORM;
tempFormat = _mesa_little_endian() ? MESA_FORMAT_L8A8_UNORM
: MESA_FORMAT_A8L8_UNORM;
rgRowStride = 2 * srcWidth * sizeof(GLubyte);
tempImage = malloc(srcWidth * srcHeight * 2 * sizeof(GLubyte));

@@ -198,7 +198,8 @@ _mesa_texstore_rgba_dxt1(TEXSTORE_PARAMS)
tempImageSlices[0] = (GLubyte *) tempImage;
_mesa_texstore(ctx, dims,
baseInternalFormat,
MESA_FORMAT_R8G8B8A8_UNORM,
_mesa_little_endian() ? MESA_FORMAT_R8G8B8A8_UNORM
: MESA_FORMAT_A8B8G8R8_UNORM,
rgbaRowStride, tempImageSlices,
srcWidth, srcHeight, srcDepth,
srcFormat, srcType, srcAddr,
@@ -255,7 +256,8 @@ _mesa_texstore_rgba_dxt3(TEXSTORE_PARAMS)
tempImageSlices[0] = (GLubyte *) tempImage;
_mesa_texstore(ctx, dims,
baseInternalFormat,
MESA_FORMAT_R8G8B8A8_UNORM,
_mesa_little_endian() ? MESA_FORMAT_R8G8B8A8_UNORM
: MESA_FORMAT_A8B8G8R8_UNORM,
rgbaRowStride, tempImageSlices,
srcWidth, srcHeight, srcDepth,
srcFormat, srcType, srcAddr,
@@ -311,7 +313,8 @@ _mesa_texstore_rgba_dxt5(TEXSTORE_PARAMS)
tempImageSlices[0] = (GLubyte *) tempImage;
_mesa_texstore(ctx, dims,
baseInternalFormat,
MESA_FORMAT_R8G8B8A8_UNORM,
_mesa_little_endian() ? MESA_FORMAT_R8G8B8A8_UNORM
: MESA_FORMAT_A8B8G8R8_UNORM,
rgbaRowStride, tempImageSlices,
srcWidth, srcHeight, srcDepth,
srcFormat, srcType, srcAddr,
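The texstore hunks above all pick the temporary format by host byte order. The sketch below shows why such a switch yields the same bytes in memory on either kind of host. It assumes Mesa's packed-format naming convention, in which the first-named channel occupies the least significant bits of the word, and it assumes the compressors read the temporary image as a byte array in R, G, B, A order; neither point is stated in the diff itself. All function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Packed "R8G8B8A8": R in the low 8 bits, A in the high 8 bits
 * (assumed naming convention, see the note above). */
static uint32_t pack_r8g8b8a8(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
   return (uint32_t)r | (uint32_t)g << 8 | (uint32_t)b << 16 | (uint32_t)a << 24;
}

/* Packed "A8B8G8R8": A in the low 8 bits, R in the high 8 bits. */
static uint32_t pack_a8b8g8r8(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
   return (uint32_t)a | (uint32_t)b << 8 | (uint32_t)g << 16 | (uint32_t)r << 24;
}

/* Runtime byte-order probe, standing in for _mesa_little_endian(). */
static int host_is_little_endian(void)
{
   const uint32_t one = 1;
   uint8_t first_byte;
   memcpy(&first_byte, &one, 1);
   return first_byte == 1;
}

/* Store one texel so that memory always holds the bytes R, G, B, A. */
static void store_rgba_bytes(uint8_t out[4],
                             uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
   uint32_t word = host_is_little_endian() ? pack_r8g8b8a8(r, g, b, a)
                                           : pack_a8b8g8r8(r, g, b, a);
   memcpy(out, &word, sizeof word);
}
```

Calling `store_rgba_bytes(out, 1, 2, 3, 4)` leaves `out` holding 1, 2, 3, 4 regardless of host byte order, which is the property the `_mesa_little_endian()` ternary appears to restore on big-endian machines.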

@@ -274,8 +274,8 @@ st_create_texture_sampler_view_from_stobj(struct pipe_context *pipe,
return NULL;
size = MIN2(stObj->pt->width0 - base, (unsigned)stObj->base.BufferSize);
f = ((base * 8) / desc->block.bits) * desc->block.width;
n = ((size * 8) / desc->block.bits) * desc->block.width;
f = (base / (desc->block.bits / 8)) * desc->block.width;
n = (size / (desc->block.bits / 8)) * desc->block.width;
if (!n)
return NULL;
templ.u.buf.first_element = f;
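The diff does not state why the element computation was reordered. One plausible reading is that `base * 8` can wrap in 32-bit unsigned arithmetic for large buffer offsets, whereas dividing by the block size in bytes cannot. The sketch below compares the two forms under that assumption; the function names are hypothetical, and the second form additionally assumes that the block size is a whole number of bytes.

```c
#include <stdint.h>

/* Old form: converts the byte offset to bits first, which wraps
 * once base reaches 2^29 in 32-bit arithmetic. */
static uint32_t elements_mul_first(uint32_t base, uint32_t block_bits,
                                   uint32_t block_width)
{
   return ((base * 8) / block_bits) * block_width;
}

/* New form: divides by the block size in bytes, so no intermediate
 * value exceeds base. Assumes block_bits is a multiple of 8. */
static uint32_t elements_div_first(uint32_t base, uint32_t block_bits,
                                   uint32_t block_width)
{
   return (base / (block_bits / 8)) * block_width;
}
```

For a small offset such as 64 bytes with 32-bit blocks both forms give 16. For an offset of 0x20000000 bytes the first form wraps to 0, while the second gives 0x08000000.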

@@ -121,9 +121,9 @@ static void TAG(render_points_verts)( struct gl_context *ctx,
if (currentsz < 8)
currentsz = dmasz;
for (j = start; j < count; j += nr ) {
for (j = 0; j < count; j += nr) {
nr = MIN2( currentsz, count - j );
TAG(emit_verts)( ctx, j, nr, ALLOC_VERTS(nr) );
TAG(emit_verts)(ctx, start + j, nr, ALLOC_VERTS(nr));
currentsz = dmasz;
}
@@ -148,7 +148,7 @@ static void TAG(render_lines_verts)( struct gl_context *ctx,
/* Emit whole number of lines in total and in each buffer:
*/
count -= (count-start) & 1;
count -= count & 1;
currentsz = GET_CURRENT_VB_MAX_VERTS();
currentsz -= currentsz & 1;
dmasz -= dmasz & 1;
@@ -156,9 +156,9 @@ static void TAG(render_lines_verts)( struct gl_context *ctx,
if (currentsz < 8)
currentsz = dmasz;
for (j = start; j < count; j += nr ) {
for (j = 0; j < count; j += nr) {
nr = MIN2( currentsz, count - j );
TAG(emit_verts)( ctx, j, nr, ALLOC_VERTS(nr) );
TAG(emit_verts)(ctx, start + j, nr, ALLOC_VERTS(nr));
currentsz = dmasz;
}
@@ -186,9 +186,9 @@ static void TAG(render_line_strip_verts)( struct gl_context *ctx,
if (currentsz < 8)
currentsz = dmasz;
for (j = start; j + 1 < count; j += nr - 1 ) {
for (j = 0; j + 1 < count; j += nr - 1 ) {
nr = MIN2( currentsz, count - j );
TAG(emit_verts)( ctx, j, nr, ALLOC_VERTS(nr) );
TAG(emit_verts)(ctx, start + j, nr, ALLOC_VERTS(nr));
currentsz = dmasz;
}
@@ -214,10 +214,7 @@ static void TAG(render_line_loop_verts)( struct gl_context *ctx,
INIT( GL_LINE_STRIP );
if (flags & PRIM_BEGIN)
j = start;
else
j = start + 1;
j = (flags & PRIM_BEGIN) ? 0 : 1;
/* Ensure last vertex won't wrap buffers:
*/
@@ -234,23 +231,23 @@ static void TAG(render_line_loop_verts)( struct gl_context *ctx,
nr = MIN2( currentsz, count - j );
if (j + nr >= count &&
start < count - 1 &&
count > 1 &&
(flags & PRIM_END))
{
void *tmp;
tmp = ALLOC_VERTS(nr+1);
tmp = TAG(emit_verts)( ctx, j, nr, tmp );
tmp = TAG(emit_verts)(ctx, start + j, nr, tmp);
tmp = TAG(emit_verts)( ctx, start, 1, tmp );
(void) tmp;
}
else {
TAG(emit_verts)( ctx, j, nr, ALLOC_VERTS(nr) );
TAG(emit_verts)(ctx, start + j, nr, ALLOC_VERTS(nr));
currentsz = dmasz;
}
}
}
else if (start + 1 < count && (flags & PRIM_END)) {
else if (count > 1 && (flags & PRIM_END)) {
void *tmp;
tmp = ALLOC_VERTS(2);
tmp = TAG(emit_verts)( ctx, start+1, 1, tmp );
@@ -284,14 +281,14 @@ static void TAG(render_triangles_verts)( struct gl_context *ctx,
/* Emit whole number of tris in total. dmasz is already a multiple
* of 3.
*/
count -= (count-start)%3;
count -= count % 3;
if (currentsz < 8)
currentsz = dmasz;
for (j = start; j < count; j += nr) {
for (j = 0; j < count; j += nr) {
nr = MIN2( currentsz, count - j );
TAG(emit_verts)( ctx, j, nr, ALLOC_VERTS(nr) );
TAG(emit_verts)(ctx, start + j, nr, ALLOC_VERTS(nr));
currentsz = dmasz;
}
}
@@ -322,9 +319,9 @@ static void TAG(render_tri_strip_verts)( struct gl_context *ctx,
dmasz -= (dmasz & 1);
currentsz -= (currentsz & 1);
for (j = start ; j + 2 < count; j += nr - 2 ) {
for (j = 0; j + 2 < count; j += nr - 2) {
nr = MIN2( currentsz, count - j );
TAG(emit_verts)( ctx, j, nr, ALLOC_VERTS(nr) );
TAG(emit_verts)(ctx, start + j, nr, ALLOC_VERTS(nr));
currentsz = dmasz;
}
@@ -354,12 +351,12 @@ static void TAG(render_tri_fan_verts)( struct gl_context *ctx,
currentsz = dmasz;
}
for (j = start + 1 ; j + 1 < count; j += nr - 2 ) {
for (j = 1; j + 1 < count; j += nr - 2) {
void *tmp;
nr = MIN2( currentsz, count - j + 1 );
tmp = ALLOC_VERTS( nr );
tmp = TAG(emit_verts)( ctx, start, 1, tmp );
tmp = TAG(emit_verts)( ctx, j, nr - 1, tmp );
tmp = TAG(emit_verts)( ctx, start + j, nr - 1, tmp );
(void) tmp;
currentsz = dmasz;
}
@@ -394,12 +391,12 @@ static void TAG(render_poly_verts)( struct gl_context *ctx,
currentsz = dmasz;
}
for (j = start + 1 ; j + 1 < count ; j += nr - 2 ) {
for (j = 1 ; j + 1 < count ; j += nr - 2 ) {
void *tmp;
nr = MIN2( currentsz, count - j + 1 );
tmp = ALLOC_VERTS( nr );
tmp = TAG(emit_verts)( ctx, start, 1, tmp );
tmp = TAG(emit_verts)( ctx, j, nr - 1, tmp );
tmp = TAG(emit_verts)(ctx, start + j, nr - 1, tmp);
(void) tmp;
currentsz = dmasz;
}
@@ -437,9 +434,9 @@ static void TAG(render_quad_strip_verts)( struct gl_context *ctx,
dmasz -= (dmasz & 2);
currentsz -= (currentsz & 2);
for (j = start ; j + 3 < count; j += nr - 2 ) {
for (j = 0; j + 3 < count; j += nr - 2 ) {
nr = MIN2( currentsz, count - j );
TAG(emit_verts)( ctx, j, nr, ALLOC_VERTS(nr) );
TAG(emit_verts)(ctx, start + j, nr, ALLOC_VERTS(nr));
currentsz = dmasz;
}
@@ -465,7 +462,7 @@ static void TAG(render_quad_strip_verts)( struct gl_context *ctx,
/* Emit whole number of quads in total, and in each buffer.
*/
dmasz -= dmasz & 1;
count -= (count-start) & 1;
count -= count & 1;
currentsz -= currentsz & 1;
if (currentsz < 12)
@@ -474,14 +471,14 @@ static void TAG(render_quad_strip_verts)( struct gl_context *ctx,
currentsz = currentsz/6*2;
dmasz = dmasz/6*2;
for (j = start; j + 3 < count; j += nr - 2 ) {
for (j = 0; j + 3 < count; j += nr - 2) {
nr = MIN2( currentsz, count - j );
if (nr >= 4) {
GLint quads = (nr/2)-1;
GLint i;
ELTS_VARS( ALLOC_ELTS( quads*6 ) );
for ( i = j-start ; i < j-start+quads*2 ; i+=2 ) {
for (i = j; i < j + quads * 2; i += 2) {
EMIT_TWO_ELTS( 0, (i+0), (i+1) );
EMIT_TWO_ELTS( 2, (i+2), (i+1) );
EMIT_TWO_ELTS( 4, (i+3), (i+2) );
@@ -519,15 +516,15 @@ static void TAG(render_quad_strip_verts)( struct gl_context *ctx,
dmasz -= dmasz & 1;
currentsz = GET_CURRENT_VB_MAX_VERTS();
currentsz -= currentsz & 1;
count -= (count-start) & 1;
count -= count & 1;
if (currentsz < 8) {
currentsz = dmasz;
}
for (j = start; j + 3 < count; j += nr - 2 ) {
for (j = 0; j + 3 < count; j += nr - 2) {
nr = MIN2( currentsz, count - j );
TAG(emit_verts)( ctx, j, nr, ALLOC_VERTS(nr) );
TAG(emit_verts)(ctx, start + j, nr, ALLOC_VERTS(nr));
currentsz = dmasz;
}
@@ -545,6 +542,9 @@ static void TAG(render_quads_verts)( struct gl_context *ctx,
GLuint count,
GLuint flags )
{
/* Emit whole number of quads in total. */
count -= count & 3;
if (HAVE_QUADS) {
LOCAL_VARS;
int dmasz = (GET_SUBSEQUENT_VB_MAX_VERTS()/4) * 4;
@@ -553,18 +553,13 @@ static void TAG(render_quads_verts)( struct gl_context *ctx,
INIT(GL_QUADS);
/* Emit whole number of quads in total. dmasz is already a multiple
* of 4.
*/
count -= (count-start)%4;
currentsz = (GET_CURRENT_VB_MAX_VERTS()/4) * 4;
if (currentsz < 8)
currentsz = dmasz;
for (j = start; j < count; j += nr) {
for (j = 0; j < count; j += nr) {
nr = MIN2( currentsz, count - j );
TAG(emit_verts)( ctx, j, nr, ALLOC_VERTS(nr) );
TAG(emit_verts)(ctx, start + j, nr, ALLOC_VERTS(nr));
currentsz = dmasz;
}
}
@@ -587,7 +582,6 @@ static void TAG(render_quads_verts)( struct gl_context *ctx,
/* Emit whole number of quads in total, and in each buffer.
*/
dmasz -= dmasz & 3;
count -= (count-start) & 3;
currentsz -= currentsz & 3;
/* Adjust for rendering as triangles:
@@ -598,14 +592,14 @@ static void TAG(render_quads_verts)( struct gl_context *ctx,
if (currentsz < 8)
currentsz = dmasz;
for (j = start; j < count; j += nr ) {
for (j = 0; j < count; j += nr ) {
nr = MIN2( currentsz, count - j );
if (nr >= 4) {
GLint quads = nr/4;
GLint i;
ELTS_VARS( ALLOC_ELTS( quads*6 ) );
for ( i = j-start ; i < j-start+quads*4 ; i+=4 ) {
for (i = j; i < j + quads * 4; i += 4) {
EMIT_TWO_ELTS( 0, (i+0), (i+1) );
EMIT_TWO_ELTS( 2, (i+3), (i+1) );
EMIT_TWO_ELTS( 4, (i+2), (i+3) );
@@ -629,15 +623,15 @@ static void TAG(render_quads_verts)( struct gl_context *ctx,
INIT(GL_TRIANGLES);
for (j = start; j < count-3; j += 4) {
for (j = 0; j + 3 < count; j += 4) {
void *tmp = ALLOC_VERTS( 6 );
/* Send v0, v1, v3
*/
tmp = EMIT_VERTS(ctx, j, 2, tmp);
tmp = EMIT_VERTS(ctx, j + 3, 1, tmp);
tmp = EMIT_VERTS(ctx, start + j, 2, tmp);
tmp = EMIT_VERTS(ctx, start + j + 3, 1, tmp);
/* Send v1, v2, v3
*/
tmp = EMIT_VERTS(ctx, j + 1, 3, tmp);
tmp = EMIT_VERTS(ctx, start + j + 1, 3, tmp);
(void) tmp;
}
}
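Two ideas run through the t_dd_dmatmp hunks above: `count` is now a vertex count rather than an end index (the callers pass `length` instead of `start + length`), and loop bounds are written as `j + 3 < count` instead of `j < count - 3`. Because `count` is a `GLuint`, the subtraction wraps when trimming leaves `count` at zero. The sketch below isolates that behaviour. The function names and the iteration cap are hypothetical; the cap exists only so the wrapped loop terminates in a demonstration.

```c
/* Cap so that the wrapped-around loop stops quickly in this sketch. */
enum { ITERATION_CAP = 16 };

/* Trim to a whole number of quads, as in 'count -= count & 3'. */
static unsigned trim_to_quads(unsigned count)
{
   return count - (count & 3);
}

/* Old bound: 'count - 3' wraps to a huge value when count < 3. */
static unsigned iterations_with_subtraction(unsigned count)
{
   unsigned n = 0;
   for (unsigned j = 0; j < count - 3 && n < ITERATION_CAP; j += 4)
      n++;
   return n;
}

/* New bound: 'j + 3 < count' is simply false when count is 0. */
static unsigned iterations_with_addition(unsigned count)
{
   unsigned n = 0;
   for (unsigned j = 0; j + 3 < count && n < ITERATION_CAP; j += 4)
      n++;
   return n;
}
```

With `count = 8` both loops run twice. With `count = trim_to_quads(3)`, which is 0, the addition form runs zero times, while the subtraction form keeps going until it hits the cap.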
@@ -698,9 +692,9 @@ static void TAG(render_points_elts)( struct gl_context *ctx,
if (currentsz < 8)
currentsz = dmasz;
for (j = start; j < count; j += nr ) {
for (j = 0; j < count; j += nr ) {
nr = MIN2( currentsz, count - j );
TAG(emit_elts)( ctx, elts+j, nr, ALLOC_ELTS(nr) );
TAG(emit_elts)(ctx, elts + start + j, nr, ALLOC_ELTS(nr));
FLUSH();
currentsz = dmasz;
}
@@ -728,7 +722,7 @@ static void TAG(render_lines_elts)( struct gl_context *ctx,
/* Emit whole number of lines in total and in each buffer:
*/
count -= (count-start) & 1;
count -= count & 1;
currentsz -= currentsz & 1;
dmasz -= dmasz & 1;
@@ -736,9 +730,9 @@ static void TAG(render_lines_elts)( struct gl_context *ctx,
if (currentsz < 8)
currentsz = dmasz;
for (j = start; j < count; j += nr ) {
for (j = 0; j < count; j += nr ) {
nr = MIN2( currentsz, count - j );
TAG(emit_elts)( ctx, elts+j, nr, ALLOC_ELTS(nr) );
TAG(emit_elts)(ctx, elts + start + j, nr, ALLOC_ELTS(nr));
FLUSH();
currentsz = dmasz;
}
@@ -768,9 +762,9 @@ static void TAG(render_line_strip_elts)( struct gl_context *ctx,
if (currentsz < 8)
currentsz = dmasz;
for (j = start; j + 1 < count; j += nr - 1 ) {
for (j = 0; j + 1 < count; j += nr - 1) {
nr = MIN2( currentsz, count - j );
TAG(emit_elts)( ctx, elts+j, nr, ALLOC_ELTS(nr) );
TAG(emit_elts)( ctx, elts + start + j, nr, ALLOC_ELTS(nr));
FLUSH();
currentsz = dmasz;
}
@@ -798,10 +792,7 @@ static void TAG(render_line_loop_elts)( struct gl_context *ctx,
FLUSH();
ELT_INIT( GL_LINE_STRIP );
if (flags & PRIM_BEGIN)
j = start;
else
j = start + 1;
j = (flags & PRIM_BEGIN) ? 0 : 1;
currentsz = GET_CURRENT_VB_MAX_ELTS();
if (currentsz < 8) {
@@ -818,23 +809,23 @@ static void TAG(render_line_loop_elts)( struct gl_context *ctx,
nr = MIN2( currentsz, count - j );
if (j + nr >= count &&
start < count - 1 &&
count > 1 &&
(flags & PRIM_END))
{
void *tmp;
tmp = ALLOC_ELTS(nr+1);
tmp = TAG(emit_elts)( ctx, elts+j, nr, tmp );
tmp = TAG(emit_elts)(ctx, elts + start + j, nr, tmp);
tmp = TAG(emit_elts)( ctx, elts+start, 1, tmp );
(void) tmp;
}
else {
TAG(emit_elts)( ctx, elts+j, nr, ALLOC_ELTS(nr) );
TAG(emit_elts)(ctx, elts + start + j, nr, ALLOC_ELTS(nr));
currentsz = dmasz;
}
}
}
else if (start + 1 < count && (flags & PRIM_END)) {
else if (count > 1 && (flags & PRIM_END)) {
void *tmp;
tmp = ALLOC_ELTS(2);
tmp = TAG(emit_elts)( ctx, elts+start+1, 1, tmp );
@@ -874,14 +865,14 @@ static void TAG(render_triangles_elts)( struct gl_context *ctx,
/* Emit whole number of tris in total. dmasz is already a multiple
* of 3.
*/
count -= (count-start)%3;
count -= count % 3;
currentsz -= currentsz%3;
if (currentsz < 8)
currentsz = dmasz;
for (j = start; j < count; j += nr) {
for (j = 0; j < count; j += nr) {
nr = MIN2( currentsz, count - j );
TAG(emit_elts)( ctx, elts+j, nr, ALLOC_ELTS(nr) );
TAG(emit_elts)(ctx, elts + start + j, nr, ALLOC_ELTS(nr));
FLUSH();
currentsz = dmasz;
}
@@ -914,9 +905,9 @@ static void TAG(render_tri_strip_elts)( struct gl_context *ctx,
dmasz -= (dmasz & 1);
currentsz -= (currentsz & 1);
for (j = start ; j + 2 < count; j += nr - 2 ) {
for (j = 0; j + 2 < count; j += nr - 2) {
nr = MIN2( currentsz, count - j );
TAG(emit_elts)( ctx, elts+j, nr, ALLOC_ELTS(nr) );
TAG(emit_elts)( ctx, elts + start + j, nr, ALLOC_ELTS(nr) );
FLUSH();
currentsz = dmasz;
}
@@ -947,12 +938,12 @@ static void TAG(render_tri_fan_elts)( struct gl_context *ctx,
currentsz = dmasz;
}
for (j = start + 1 ; j + 1 < count; j += nr - 2 ) {
for (j = 1; j + 1 < count; j += nr - 2) {
void *tmp;
nr = MIN2( currentsz, count - j + 1 );
tmp = ALLOC_ELTS( nr );
tmp = TAG(emit_elts)( ctx, elts+start, 1, tmp );
tmp = TAG(emit_elts)( ctx, elts+j, nr - 1, tmp );
tmp = TAG(emit_elts)(ctx, elts + start + j, nr - 1, tmp);
(void) tmp;
FLUSH();
currentsz = dmasz;
@@ -985,12 +976,12 @@ static void TAG(render_poly_elts)( struct gl_context *ctx,
currentsz = dmasz;
}
for (j = start + 1 ; j + 1 < count; j += nr - 2 ) {
for (j = 1 ; j + 1 < count; j += nr - 2) {
void *tmp;
nr = MIN2( currentsz, count - j + 1 );
tmp = ALLOC_ELTS( nr );
tmp = TAG(emit_elts)( ctx, elts+start, 1, tmp );
tmp = TAG(emit_elts)( ctx, elts+j, nr - 1, tmp );
tmp = TAG(emit_elts)(ctx, elts + start + j, nr - 1, tmp);
(void) tmp;
FLUSH();
currentsz = dmasz;
@@ -1023,7 +1014,7 @@ static void TAG(render_quad_strip_elts)( struct gl_context *ctx,
/* Emit whole number of quads in total, and in each buffer.
*/
dmasz -= dmasz & 1;
count -= (count-start) & 1;
count -= count & 1;
currentsz -= currentsz & 1;
if (currentsz < 12)
@@ -1035,7 +1026,7 @@ static void TAG(render_quad_strip_elts)( struct gl_context *ctx,
currentsz = currentsz/6*2;
dmasz = dmasz/6*2;
for (j = start; j + 3 < count; j += nr - 2 ) {
for (j = 0; j + 3 < count; j += nr - 2) {
nr = MIN2( currentsz, count - j );
if (nr >= 4)
@@ -1044,7 +1035,7 @@ static void TAG(render_quad_strip_elts)( struct gl_context *ctx,
GLint quads = (nr/2)-1;
ELTS_VARS( ALLOC_ELTS( quads*6 ) );
for ( i = j-start ; i < j-start+quads ; i++, elts += 2 ) {
for (i = j; i < j + quads; i++, elts += 2) {
EMIT_TWO_ELTS( 0, elts[0], elts[1] );
EMIT_TWO_ELTS( 2, elts[2], elts[1] );
EMIT_TWO_ELTS( 4, elts[3], elts[2] );
@@ -1060,9 +1051,9 @@ static void TAG(render_quad_strip_elts)( struct gl_context *ctx,
else {
ELT_INIT( GL_TRIANGLE_STRIP );
for (j = start; j + 3 < count; j += nr - 2 ) {
for (j = 0; j + 3 < count; j += nr - 2) {
nr = MIN2( currentsz, count - j );
TAG(emit_elts)( ctx, elts+j, nr, ALLOC_ELTS(nr) );
TAG(emit_elts)(ctx, elts + start + j, nr, ALLOC_ELTS(nr));
FLUSH();
currentsz = dmasz;
}
@@ -1076,6 +1067,9 @@ static void TAG(render_quads_elts)( struct gl_context *ctx,
GLuint count,
GLuint flags )
{
/* Emit whole number of quads in total. */
count -= count & 3;
if (HAVE_QUADS) {
LOCAL_VARS;
GLuint *elts = TNL_CONTEXT(ctx)->vb.Elts;
@@ -1088,14 +1082,12 @@ static void TAG(render_quads_elts)( struct gl_context *ctx,
currentsz = GET_CURRENT_VB_MAX_ELTS()/4*4;
count -= (count-start)%4;
if (currentsz < 8)
currentsz = dmasz;
for (j = start; j < count; j += nr) {
for (j = 0; j < count; j += nr) {
nr = MIN2( currentsz, count - j );
TAG(emit_elts)( ctx, elts+j, nr, ALLOC_ELTS(nr) );
TAG(emit_elts)(ctx, elts + start + j, nr, ALLOC_ELTS(nr));
FLUSH();
currentsz = dmasz;
}
@@ -1112,7 +1104,6 @@ static void TAG(render_quads_elts)( struct gl_context *ctx,
/* Emit whole number of quads in total, and in each buffer.
*/
dmasz -= dmasz & 3;
count -= (count-start) & 3;
currentsz -= currentsz & 3;
/* Adjust for rendering as triangles:
@@ -1123,7 +1114,7 @@ static void TAG(render_quads_elts)( struct gl_context *ctx,
if (currentsz < 8)
currentsz = dmasz;
for (j = start; j + 3 < count; j += nr - 2 ) {
for (j = 0; j + 3 < count; j += nr - 2) {
nr = MIN2( currentsz, count - j );
if (nr >= 4)
@@ -1132,7 +1123,7 @@ static void TAG(render_quads_elts)( struct gl_context *ctx,
GLint i;
ELTS_VARS( ALLOC_ELTS( quads * 6 ) );
for ( i = j-start ; i < j-start+quads ; i++, elts += 4 ) {
for (i = j; i < j + quads; i++, elts += 4) {
EMIT_TWO_ELTS( 0, elts[0], elts[1] );
EMIT_TWO_ELTS( 2, elts[3], elts[1] );
EMIT_TWO_ELTS( 4, elts[2], elts[3] );