docs: add release notes for 11.0.3

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Update version to 11.0.3
2015-10-10 16:21:59 +01:00 · 2015-10-10 16:17:51 +01:00 · 2015-10-10 16:15:08 +01:00 · 2015-10-10 16:14:12 +01:00 · 2015-10-07 16:42:01 +01:00 · 2015-10-07 15:21:10 +01:00
47 changed files with 755 additions and 199 deletions
--- a/2
+++ b/2
@@ -1 +1 @@
-11.0.2
+11.0.3
--- a/docs/relnotes/11.0.2.html
+++ b/docs/relnotes/11.0.2.html
@@ -31,7 +31,8 @@ because compatibility contexts are not supported.

 <h2>SHA256 checksums</h2>
 <pre>
-TBD
+45170773500d6ae2f9eb93fc85efee69f7c97084411ada4eddf92f78bca56d20  mesa-11.0.2.tar.gz
+fce11fb27eb87adf1e620a76455d635c6136dfa49ae58c53b34ef8d0c7b7eae4  mesa-11.0.2.tar.xz
 </pre>


--- a/docs/relnotes/11.0.3.html
+++ b/docs/relnotes/11.0.3.html
@@ -0,0 +1,184 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html lang="en">
+<head>
+  <meta http-equiv="content-type" content="text/html; charset=utf-8">
+  <title>Mesa Release Notes</title>
+  <link rel="stylesheet" type="text/css" href="../mesa.css">
+</head>
+<body>
+
+<div class="header">
+  <h1>The Mesa 3D Graphics Library</h1>
+</div>
+
+<iframe src="../contents.html"></iframe>
+<div class="content">
+
+<h1>Mesa 11.0.3 Release Notes / October 10, 2015</h1>
+
+<p>
+Mesa 11.0.3 is a bug fix release which fixes bugs found since the 11.0.2 release.
+</p>
+<p>
+Mesa 11.0.3 implements the OpenGL 4.1 API, but the version reported by
+glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
+glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
+Some drivers don't support all the features required in OpenGL 4.1.  OpenGL
+4.1 is <strong>only</strong> available if requested at context creation
+because compatibility contexts are not supported.
+</p>
+
+
+<h2>SHA256 checksums</h2>
+<pre>
+TBD
+</pre>
+
+
+<h2>New features</h2>
+<p>None</p>
+
+<h2>Bug fixes</h2>
+
+<p>This list is likely incomplete.</p>
+
+<ul>
+
+<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=55552">Bug 55552</a> - Compile errors with --enable-mangling</li>
+
+<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71789">Bug 71789</a> - [r300g] Visuals not found in (default) depth = 24</li>
+
+<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91044">Bug 91044</a> - piglit spec/egl_khr_create_context/valid debug flag gles* fail</li>
+
+<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91342">Bug 91342</a> - Very dark textures on some objects in indoors environments in Postal 2</li>
+
+<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91596">Bug 91596</a> - EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI</li>
+
+<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91718">Bug 91718</a> - piglit.spec.arb_shader_image_load_store.invalid causes intermittent GPU HANG</li>
+
+<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92072">Bug 92072</a> - Wine breakage since d082c5324 (st/mesa: don't call st_validate_state in BlitFramebuffer)</li>
+
+<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92265">Bug 92265</a> - Black windows in weston after update mesa to 11.0.2-1</li>
+
+</ul>
+
+
+<h2>Changes</h2>
+
+<p>Brian Paul (1):</p>
+<ul>
+  <li>st/mesa: try PIPE_BIND_RENDER_TARGET when choosing float texture formats</li>
+</ul>
+
+<p>Daniel Scharrer (1):</p>
+<ul>
+  <li>mesa: Add abs input modifier to base for POW in ffvertex_prog</li>
+</ul>
+
+<p>Emil Velikov (3):</p>
+<ul>
+  <li>docs: add sha256 checksums for 11.0.2</li>
+  <li>Revert "nouveau: make sure there's always room to emit a fence"</li>
+  <li>Update version to 11.0.3</li>
+</ul>
+
+<p>Francisco Jerez (1):</p>
+<ul>
+  <li>i965/fs: Fix hang on IVB and VLV with image format mismatch.</li>
+</ul>
+
+<p>Ian Romanick (1):</p>
+<ul>
+  <li>meta: Handle array textures in scaled MSAA blits</li>
+</ul>
+
+<p>Ilia Mirkin (6):</p>
+<ul>
+  <li>nouveau: be more careful about freeing temporary transfer buffers</li>
+  <li>nouveau: delay deleting buffer with unflushed fence</li>
+  <li>nouveau: wait to unref the transfer's bo until it's no longer used</li>
+  <li>nv30: pretend to have packed texture/surface formats</li>
+  <li>nv30: always go through translate module on big-endian</li>
+  <li>nouveau: make sure there's always room to emit a fence</li>
+</ul>
+
+<p>Jason Ekstrand (1):</p>
+<ul>
+  <li>mesa: Correctly handle GL_BGRA_EXT in ES3 format_and_type checks</li>
+</ul>
+
+<p>Kyle Brenneman (3):</p>
+<ul>
+  <li>glx: Fix build errors with --enable-mangling (v2)</li>
+  <li>mapi: Make _glapi_get_stub work with "gl" or "mgl" prefix.</li>
+  <li>glx: Don't hard-code the name "libGL.so.1" in driOpenDriver (v3)</li>
+</ul>
+
+<p>Leo Liu (1):</p>
+<ul>
+  <li>radeon/vce: fix vui time_scale zero error</li>
+</ul>
+
+<p>Marek Olšák (21):</p>
+<ul>
+  <li>st/mesa: fix front buffer regression after dropping st_validate_state in Blit</li>
+  <li>radeonsi: handle index buffer alloc failures</li>
+  <li>radeonsi: handle constant buffer alloc failures</li>
+  <li>gallium/radeon: handle buffer_map staging buffer failures better</li>
+  <li>gallium/radeon: handle buffer alloc failures in r600_draw_rectangle</li>
+  <li>gallium/radeon: add a fail path for depth MSAA texture readback</li>
+  <li>radeonsi: report alloc failure from si_shader_binary_read</li>
+  <li>radeonsi: add malloc fail paths to si_create_shader_state</li>
+  <li>radeonsi: skip drawing if the tess factor ring allocation fails</li>
+  <li>radeonsi: skip drawing if GS ring allocations fail</li>
+  <li>radeonsi: handle shader precompile failures</li>
+  <li>radeonsi: handle fixed-func TCS shader create failure</li>
+  <li>radeonsi: skip drawing if VS, TCS, TES, GS fail to compile or upload</li>
+  <li>radeonsi: skip drawing if PS fails to compile or upload</li>
+  <li>radeonsi: skip drawing if updating the scratch buffer fails</li>
+  <li>radeonsi: don't forget to update scratch relocations for LS, HS, ES shaders</li>
+  <li>radeonsi: handle dummy constant buffer allocation failure</li>
+  <li>gallium/u_blitter: handle allocation failures</li>
+  <li>radeonsi: add scratch buffer to the buffer list when it's re-allocated</li>
+  <li>st/dri: don't use _ctx in client_wait_sync</li>
+  <li>egl/dri2: don't require a context for ClientWaitSync (v2)</li>
+</ul>
+
+<p>Matthew Waters (1):</p>
+<ul>
+  <li>egl: rework handling EGL_CONTEXT_FLAGS</li>
+</ul>
+
+<p>Michel Dänzer (1):</p>
+<ul>
+  <li>st/dri: Use packed RGB formats</li>
+</ul>
+
+<p>Roland Scheidegger (1):</p>
+<ul>
+  <li>mesa: fix mipmap generation for immutable, compressed textures</li>
+</ul>
+
+<p>Tom Stellard (3):</p>
+<ul>
+  <li>gallium/radeon: Use call_once() when initailizing LLVM targets</li>
+  <li>gallivm: Allow drivers and state trackers to initialize gallivm LLVM targets v2</li>
+  <li>radeon/llvm: Initialize gallivm targets when initializing the AMDGPU target v2</li>
+</ul>
+
+<p>Varad Gautam (1):</p>
+<ul>
+  <li>egl: restore surface type before linking config to its display</li>
+</ul>
+
+<p>Ville Syrjälä (3):</p>
+<ul>
+  <li>i830: Fix collision between I830_UPLOAD_RASTER_RULES and I830_UPLOAD_TEX(0)</li>
+  <li>i915: Fix texcoord vs. varying collision in fragment programs</li>
+  <li>i915: Remember to call intel_prepare_render() before blitting</li>
+</ul>
+
+
+</div>
+</body>
+</html>
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -312,6 +312,8 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id,
         else
            conf->dri_single_config = dri_config;
      }
+
+      conf->base.SurfaceType = 0;
      conf->base.ConfigID = config_id;

      _eglLinkConfig(&conf->base);
@@ -2384,13 +2386,18 @@ dri2_client_wait_sync(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSync *sync,
   unsigned wait_flags = 0;
   EGLint ret = EGL_CONDITION_SATISFIED_KHR;

-   if (flags & EGL_SYNC_FLUSH_COMMANDS_BIT_KHR)
+   /* The EGL_KHR_fence_sync spec states:
+    *
+    *    "If no context is current for the bound API,
+    *     the EGL_SYNC_FLUSH_COMMANDS_BIT_KHR bit is ignored."
+    */
+   if (dri2_ctx && flags & EGL_SYNC_FLUSH_COMMANDS_BIT_KHR)
      wait_flags |= __DRI2_FENCE_FLAG_FLUSH_COMMANDS;

   /* the sync object should take a reference while waiting */
   dri2_egl_ref_sync(dri2_sync);

-   if (dri2_dpy->fence->client_wait_sync(dri2_ctx->dri_context,
+   if (dri2_dpy->fence->client_wait_sync(dri2_ctx ? dri2_ctx->dri_context : NULL,
                                         dri2_sync->fence, wait_flags,
                                         timeout))
      dri2_sync->base.SyncStatus = EGL_SIGNALED_KHR;
--- a/src/egl/main/eglcontext.c
+++ b/src/egl/main/eglcontext.c
@@ -152,12 +152,51 @@ _eglParseContextAttribList(_EGLContext *ctx, _EGLDisplay *dpy,

         /* The EGL_KHR_create_context spec says:
          *
-          *     "Flags are only defined for OpenGL context creation, and
-          *     specifying a flags value other than zero for other types of
-          *     contexts, including OpenGL ES contexts, will generate an
-          *     error."
+          *     "If the EGL_CONTEXT_OPENGL_DEBUG_BIT_KHR flag bit is set in
+          *     EGL_CONTEXT_FLAGS_KHR, then a <debug context> will be created.
+          *     [...]
+          *     In some cases a debug context may be identical to a non-debug
+          *     context. This bit is supported for OpenGL and OpenGL ES
+          *     contexts."
          */
-         if (api != EGL_OPENGL_API && val != 0) {
+         if ((val & EGL_CONTEXT_OPENGL_DEBUG_BIT_KHR) &&
+             (api != EGL_OPENGL_API && api != EGL_OPENGL_ES_API)) {
+            err = EGL_BAD_ATTRIBUTE;
+            break;
+         }
+
+         /* The EGL_KHR_create_context spec says:
+          *
+          *     "If the EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE_BIT_KHR flag bit
+          *     is set in EGL_CONTEXT_FLAGS_KHR, then a <forward-compatible>
+          *     context will be created. Forward-compatible contexts are
+          *     defined only for OpenGL versions 3.0 and later. They must not
+          *     support functionality marked as <deprecated> by that version of
+          *     the API, while a non-forward-compatible context must support
+          *     all functionality in that version, deprecated or not. This bit
+          *     is supported for OpenGL contexts, and requesting a
+          *     forward-compatible context for OpenGL versions less than 3.0
+          *     will generate an error."
+          */
+         if ((val & EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE_BIT_KHR) &&
+             (api != EGL_OPENGL_API || ctx->ClientMajorVersion < 3)) {
+            err = EGL_BAD_ATTRIBUTE;
+            break;
+         }
+
+         /* The EGL_KHR_create_context spec says:
+          *
+          *     "If the EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR bit is set in
+          *     EGL_CONTEXT_FLAGS_KHR, then a context supporting <robust buffer
+          *     access> will be created. Robust buffer access is defined in the
+          *     GL_ARB_robustness extension specification, and the resulting
+          *     context must also support either the GL_ARB_robustness
+          *     extension, or a version of OpenGL incorporating equivalent
+          *     functionality. This bit is supported for OpenGL contexts."
+          */
+         if ((val & EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR) &&
+             (api != EGL_OPENGL_API ||
+              !dpy->Extensions.EXT_create_context_robustness)) {
            err = EGL_BAD_ATTRIBUTE;
            break;
         }
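
Note on the hunk above: the three flag checks can be read as one predicate over the bound API, the requested major version, and the flag bits. The following is a minimal sketch of that reading, not the Mesa code. The `toy_` names, the enum, and the bit values are hypothetical stand-ins for the EGL definitions, and `has_robustness` stands in for the display extension check.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the EGL API enums and context flag bits. */
enum toy_api { TOY_API_OPENGL, TOY_API_OPENGL_ES, TOY_API_OPENVG };

#define TOY_DEBUG_BIT              0x1u
#define TOY_FORWARD_COMPATIBLE_BIT 0x2u
#define TOY_ROBUST_ACCESS_BIT      0x4u

/* Returns true when every requested flag is allowed for this API/version. */
bool
toy_context_flags_valid(enum toy_api api, int major_version,
                        unsigned flags, bool has_robustness)
{
   /* Debug contexts: OpenGL and OpenGL ES only. */
   if ((flags & TOY_DEBUG_BIT) &&
       api != TOY_API_OPENGL && api != TOY_API_OPENGL_ES)
      return false;

   /* Forward-compatible contexts: OpenGL 3.0 and later only. */
   if ((flags & TOY_FORWARD_COMPATIBLE_BIT) &&
       (api != TOY_API_OPENGL || major_version < 3))
      return false;

   /* Robust buffer access: OpenGL only, and only with robustness support. */
   if ((flags & TOY_ROBUST_ACCESS_BIT) &&
       (api != TOY_API_OPENGL || !has_robustness))
      return false;

   return true;
}
```

Under this sketch a debug flag on an OpenGL ES 2 context is accepted, which is the behaviour the piglit test in bug 91044 expects, while a forward-compatible flag on the same context is still rejected.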
--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
@@ -81,6 +81,8 @@
 #  pragma pop_macro("DEBUG")
 #endif

+#include "c11/threads.h"
+#include "os/os_thread.h"
 #include "pipe/p_config.h"
 #include "util/u_debug.h"
 #include "util/u_cpu_detect.h"
@@ -103,6 +105,33 @@ static LLVMEnsureMultithreaded lLVMEnsureMultithreaded;

 }

+static once_flag init_native_targets_once_flag = ONCE_FLAG_INIT;
+
+static void init_native_targets()
+{
+   // If we have a native target, initialize it to ensure it is linked in and
+   // usable by the JIT.
+   llvm::InitializeNativeTarget();
+
+   llvm::InitializeNativeTargetAsmPrinter();
+
+   llvm::InitializeNativeTargetDisassembler();
+}
+
+/**
+ * The llvm target registry is not thread-safe, so drivers and state-trackers
+ * that want to initialize targets should use the gallivm_init_llvm_targets()
+ * function to safely initialize targets.
+ *
+ * LLVM targets should be initialized before the driver or state-tracker tries
+ * to access the registry.
+ */
+extern "C" void
+gallivm_init_llvm_targets(void)
+{
+   call_once(&init_native_targets_once_flag, init_native_targets);
+}
+
 extern "C" void
 lp_set_target_options(void)
 {
@@ -115,13 +144,7 @@ lp_set_target_options(void)
   llvm::DisablePrettyStackTrace = true;
 #endif

-   // If we have a native target, initialize it to ensure it is linked in and
-   // usable by the JIT.
-   llvm::InitializeNativeTarget();
-
-   llvm::InitializeNativeTargetAsmPrinter();
-
-   llvm::InitializeNativeTargetDisassembler();
+   gallivm_init_llvm_targets();
 }


--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.h
@@ -41,6 +41,8 @@ extern "C" {

 struct lp_generated_code;

+extern void
+gallivm_init_llvm_targets(void);

 extern void
 lp_set_target_options(void);
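
Note on gallivm_init_llvm_targets(): the point of the new entry point is that any number of callers can request initialization while the body runs exactly once. A minimal sketch of that pattern follows. It uses POSIX pthread_once rather than the call_once from Mesa's c11/threads.h that the patch uses, and the `toy_` names and the counter are hypothetical, added only so the effect can be observed.

```c
#include <pthread.h>

static pthread_once_t toy_targets_once = PTHREAD_ONCE_INIT;

/* Hypothetical counter: how many times the init body actually ran. */
int toy_init_count;

/* Stands in for the llvm::InitializeNativeTarget*() calls. */
static void
toy_init_targets(void)
{
   toy_init_count++;
}

/* Safe to call from any thread, any number of times. */
void
toy_ensure_targets_initialized(void)
{
   pthread_once(&toy_targets_once, toy_init_targets);
}
```

Callers that need the targets simply call toy_ensure_targets_initialized() before touching the registry. Repeated calls are cheap and do not rerun the body.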
--- a/src/gallium/auxiliary/util/u_blitter.c
+++ b/src/gallium/auxiliary/util/u_blitter.c
@@ -1190,6 +1190,8 @@ static void blitter_draw(struct blitter_context_priv *ctx,

   u_upload_data(ctx->upload, 0, sizeof(ctx->vertices), ctx->vertices,
                 &vb.buffer_offset, &vb.buffer);
+   if (!vb.buffer)
+      return;
   u_upload_unmap(ctx->upload);

   pipe->set_vertex_buffers(pipe, ctx->base.vb_slot, 1, &vb);
@@ -2089,6 +2091,9 @@ void util_blitter_clear_buffer(struct blitter_context *blitter,

   u_upload_data(ctx->upload, 0, num_channels*4, clear_value,
                 &vb.buffer_offset, &vb.buffer);
+   if (!vb.buffer)
+      goto out;
+
   vb.stride = 0;

   blitter_set_running_flag(ctx);
@@ -2112,6 +2117,7 @@ void util_blitter_clear_buffer(struct blitter_context *blitter,

   util_draw_arrays(pipe, PIPE_PRIM_POINTS, 0, size / 4);

+out:
   blitter_restore_vertex_states(ctx);
   blitter_restore_render_cond(ctx);
   blitter_unset_running_flag(ctx);
--- a/src/gallium/drivers/nouveau/nouveau_buffer.c
+++ b/src/gallium/drivers/nouveau/nouveau_buffer.c
@@ -80,7 +80,12 @@ release_allocation(struct nouveau_mm_allocation **mm,
 inline void
 nouveau_buffer_release_gpu_storage(struct nv04_resource *buf)
 {
-   nouveau_bo_ref(NULL, &buf->bo);
+   if (buf->fence && buf->fence->state < NOUVEAU_FENCE_STATE_FLUSHED) {
+      nouveau_fence_work(buf->fence, nouveau_fence_unref_bo, buf->bo);
+      buf->bo = NULL;
+   } else {
+      nouveau_bo_ref(NULL, &buf->bo);
+   }

   if (buf->mm)
      release_allocation(&buf->mm, buf->fence);
@@ -281,7 +286,8 @@ nouveau_buffer_transfer_del(struct nouveau_context *nv,
 {
   if (tx->map) {
      if (likely(tx->bo)) {
-         nouveau_bo_ref(NULL, &tx->bo);
+         nouveau_fence_work(nv->screen->fence.current,
+                            nouveau_fence_unref_bo, tx->bo);
         if (tx->mm)
            release_allocation(&tx->mm, nv->screen->fence.current);
      } else {
@@ -782,7 +788,7 @@ nouveau_buffer_migrate(struct nouveau_context *nv,
      nv->copy_data(nv, buf->bo, buf->offset, new_domain,
                    bo, offset, old_domain, buf->base.width0);

-      nouveau_bo_ref(NULL, &bo);
+      nouveau_fence_work(screen->fence.current, nouveau_fence_unref_bo, bo);
      if (mm)
         release_allocation(&mm, screen->fence.current);
   } else
--- a/src/gallium/drivers/nouveau/nouveau_fence.c
+++ b/src/gallium/drivers/nouveau/nouveau_fence.c
@@ -231,3 +231,11 @@ nouveau_fence_next(struct nouveau_screen *screen)

   nouveau_fence_new(screen, &screen->fence.current, false);
 }
+
+void
+nouveau_fence_unref_bo(void *data)
+{
+   struct nouveau_bo *bo = data;
+
+   nouveau_bo_ref(NULL, &bo);
+}
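
Note on nouveau_fence_unref_bo(): the callers in this series hand the buffer reference to the fence instead of dropping it immediately, so the buffer outlives any copy that is still queued. The toy model below sketches that idea under stated assumptions. The `toy_` types and functions are hypothetical and much simpler than the nouveau fence code, and a plain integer stands in for the buffer's reference count.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the nouveau types. */
struct toy_work {
   void (*func)(void *);
   void *data;
   struct toy_work *next;
};

struct toy_fence {
   bool signalled;
   struct toy_work *work;   /* callbacks waiting for the fence to signal */
};

/* Run func(data) now if there is nothing to wait for, otherwise queue it. */
bool
toy_fence_work(struct toy_fence *fence, void (*func)(void *), void *data)
{
   struct toy_work *w;

   if (!fence || fence->signalled) {
      func(data);
      return true;
   }

   w = malloc(sizeof(*w));
   if (!w)
      return false;

   w->func = func;
   w->data = data;
   w->next = fence->work;
   fence->work = w;
   return true;
}

/* Mark the fence signalled and run every queued callback. */
void
toy_fence_signal(struct toy_fence *fence)
{
   struct toy_work *w = fence->work;

   fence->signalled = true;
   fence->work = NULL;
   while (w) {
      struct toy_work *next = w->next;
      w->func(w->data);
      free(w);
      w = next;
   }
}

/* Generic callback: drop one reference on the object behind data. */
void
toy_unref(void *data)
{
   int *refcount = data;
   (*refcount)--;
}
```

With a pending fence, toy_fence_work(&fence, toy_unref, &refs) leaves the count untouched until toy_fence_signal(&fence) runs. With a fence that has already signalled, the reference is dropped at once.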
--- a/src/gallium/drivers/nouveau/nouveau_fence.h
+++ b/src/gallium/drivers/nouveau/nouveau_fence.h
@@ -37,6 +37,9 @@ void nouveau_fence_next(struct nouveau_screen *);
 bool nouveau_fence_wait(struct nouveau_fence *);
 bool nouveau_fence_signalled(struct nouveau_fence *);

+void nouveau_fence_unref_bo(void *data); /* generic unref bo callback */
+
+
 static inline void
 nouveau_fence_ref(struct nouveau_fence *fence, struct nouveau_fence **ref)
 {
--- a/src/gallium/drivers/nouveau/nv30/nv30_format.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_format.c
@@ -78,12 +78,12 @@ nv30_format_info_table[PIPE_FORMAT_COUNT] = {
   _(B4G4R4X4_UNORM      , S___),
   _(B4G4R4A4_UNORM      , S___),
   _(B5G6R5_UNORM        , SB__),
-   _(B8G8R8X8_UNORM      , SB__),
-   _(B8G8R8X8_SRGB       , S___),
-   _(B8G8R8A8_UNORM      , SB__),
-   _(B8G8R8A8_SRGB       , S___),
+   _(BGRX8888_UNORM      , SB__),
+   _(BGRX8888_SRGB       , S___),
+   _(BGRA8888_UNORM      , SB__),
+   _(BGRA8888_SRGB       , S___),
   _(R8G8B8A8_UNORM      , __V_),
-   _(R8G8B8A8_SNORM      , S___),
+   _(RGBA8888_SNORM      , S___),
   _(DXT1_RGB            , S___),
   _(DXT1_SRGB           , S___),
   _(DXT1_RGBA           , S___),
@@ -138,8 +138,8 @@ const struct nv30_format
 nv30_format_table[PIPE_FORMAT_COUNT] = {
   R_(B5G5R5X1_UNORM    , X1R5G5B5          ),
   R_(B5G6R5_UNORM      , R5G6B5            ),
-   R_(B8G8R8X8_UNORM    , X8R8G8B8          ),
-   R_(B8G8R8A8_UNORM    , A8R8G8B8          ),
+   R_(BGRX8888_UNORM    , X8R8G8B8          ),
+   R_(BGRA8888_UNORM    , A8R8G8B8          ),
   Z_(Z16_UNORM         , Z16               ),
   Z_(X8Z24_UNORM       , Z24S8             ),
   Z_(S8_UINT_Z24_UNORM , Z24S8             ),
@@ -223,11 +223,11 @@ nv30_texfmt_table[PIPE_FORMAT_COUNT] = {
   _(B4G4R4X4_UNORM    , A4R4G4B4, 0, C, C, C, 1, 2, 1, 0, x, NONE, ____),
   _(B4G4R4A4_UNORM    , A4R4G4B4, 0, C, C, C, C, 2, 1, 0, 3, NONE, ____),
   _(B5G6R5_UNORM      , R5G6B5  , 0, C, C, C, 1, 2, 1, 0, x, NONE, ____),
-   _(B8G8R8X8_UNORM    , A8R8G8B8, 0, C, C, C, 1, 2, 1, 0, x, NONE, ____),
-   _(B8G8R8X8_SRGB     , A8R8G8B8, 0, C, C, C, 1, 2, 1, 0, x, SRGB, ____),
-   _(B8G8R8A8_UNORM    , A8R8G8B8, 0, C, C, C, C, 2, 1, 0, 3, NONE, ____),
-   _(B8G8R8A8_SRGB     , A8R8G8B8, 0, C, C, C, C, 2, 1, 0, 3, SRGB, ____),
-   _(R8G8B8A8_SNORM    , A8R8G8B8, 0, C, C, C, C, 0, 1, 2, 3, NONE, SSSS),
+   _(BGRX8888_UNORM    , A8R8G8B8, 0, C, C, C, 1, 2, 1, 0, x, NONE, ____),
+   _(BGRX8888_SRGB     , A8R8G8B8, 0, C, C, C, 1, 2, 1, 0, x, SRGB, ____),
+   _(BGRA8888_UNORM    , A8R8G8B8, 0, C, C, C, C, 2, 1, 0, 3, NONE, ____),
+   _(BGRA8888_SRGB     , A8R8G8B8, 0, C, C, C, C, 2, 1, 0, 3, SRGB, ____),
+   _(RGBA8888_SNORM    , A8R8G8B8, 0, C, C, C, C, 0, 1, 2, 3, NONE, SSSS),
   _(DXT1_RGB          , DXT1    , 0, C, C, C, 1, 2, 1, 0, x, NONE, ____),
   _(DXT1_SRGB         , DXT1    , 0, C, C, C, 1, 2, 1, 0, x, SRGB, ____),
   _(DXT1_RGBA         , DXT1    , 0, C, C, C, C, 2, 1, 0, 3, NONE, ____),
--- a/src/gallium/drivers/nouveau/nv30/nv30_miptree.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_miptree.c
@@ -339,10 +339,15 @@ nv30_miptree_transfer_unmap(struct pipe_context *pipe,
   struct nv30_context *nv30 = nv30_context(pipe);
   struct nv30_transfer *tx = nv30_transfer(ptx);

-   if (ptx->usage & PIPE_TRANSFER_WRITE)
+   if (ptx->usage & PIPE_TRANSFER_WRITE) {
      nv30_transfer_rect(nv30, NEAREST, &tx->tmp, &tx->img);

-   nouveau_bo_ref(NULL, &tx->tmp.bo);
+      /* Allow the copies above to finish executing before freeing the source */
+      nouveau_fence_work(nv30->screen->base.fence.current,
+                         nouveau_fence_unref_bo, tx->tmp.bo);
+   } else {
+      nouveau_bo_ref(NULL, &tx->tmp.bo);
+   }
   pipe_resource_reference(&ptx->resource, NULL);
   FREE(tx);
 }
--- a/src/gallium/drivers/nouveau/nv30/nv30_vbo.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_vbo.c
@@ -191,7 +191,11 @@ nv30_vbo_validate(struct nv30_context *nv30)
   if (!nv30->vertex || nv30->draw_flags)
      return;

+#ifdef PIPE_ARCH_BIG_ENDIAN
+   if (1) { /* Figure out where the buffers are getting messed up */
+#else
   if (unlikely(vertex->need_conversion)) {
+#endif
      nv30->vbo_fifo = ~0;
      nv30->vbo_user = 0;
   } else {
--- a/src/gallium/drivers/nouveau/nv50/nv50_miptree.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_miptree.c
@@ -163,7 +163,10 @@ nv50_miptree_destroy(struct pipe_screen *pscreen, struct pipe_resource *pt)
 {
   struct nv50_miptree *mt = nv50_miptree(pt);

-   nouveau_bo_ref(NULL, &mt->base.bo);
+   if (mt->base.fence && mt->base.fence->state < NOUVEAU_FENCE_STATE_FLUSHED)
+      nouveau_fence_work(mt->base.fence, nouveau_fence_unref_bo, mt->base.bo);
+   else
+      nouveau_bo_ref(NULL, &mt->base.bo);

   nouveau_fence_ref(NULL, &mt->base.fence);
   nouveau_fence_ref(NULL, &mt->base.fence_wr);
--- a/src/gallium/drivers/nouveau/nv50/nv50_transfer.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_transfer.c
@@ -365,9 +365,14 @@ nv50_miptree_transfer_unmap(struct pipe_context *pctx,
            tx->rect[0].base += mt->layer_stride;
         tx->rect[1].base += tx->nblocksy * tx->base.stride;
      }
+
+      /* Allow the copies above to finish executing before freeing the source */
+      nouveau_fence_work(nv50->screen->base.fence.current,
+                         nouveau_fence_unref_bo, tx->rect[1].bo);
+   } else {
+      nouveau_bo_ref(NULL, &tx->rect[1].bo);
   }

-   nouveau_bo_ref(NULL, &tx->rect[1].bo);
   pipe_resource_reference(&transfer->resource, NULL);

   FREE(tx);
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_transfer.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_transfer.c
@@ -495,11 +495,16 @@ nvc0_miptree_transfer_unmap(struct pipe_context *pctx,
         tx->rect[1].base += tx->nblocksy * tx->base.stride;
      }
      NOUVEAU_DRV_STAT(&nvc0->screen->base, tex_transfers_wr, 1);
+
+      /* Allow the copies above to finish executing before freeing the source */
+      nouveau_fence_work(nvc0->screen->base.fence.current,
+                         nouveau_fence_unref_bo, tx->rect[1].bo);
+   } else {
+      nouveau_bo_ref(NULL, &tx->rect[1].bo);
   }
   if (tx->base.usage & PIPE_TRANSFER_READ)
      NOUVEAU_DRV_STAT(&nvc0->screen->base, tex_transfers_rd, 1);

-   nouveau_bo_ref(NULL, &tx->rect[1].bo);
   pipe_resource_reference(&transfer->resource, NULL);

   FREE(tx);
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -305,12 +305,11 @@ static void *r600_buffer_transfer_map(struct pipe_context *ctx,
 				data += box->x % R600_MAP_BUFFER_ALIGNMENT;
 				return r600_buffer_get_transfer(ctx, resource, level, usage, box,
 								ptransfer, data, staging, offset);
-			} else {
-				return NULL; /* error, shouldn't occur though */
 			}
+		} else {
+			/* At this point, the buffer is always idle (we checked it above). */
+			usage |= PIPE_TRANSFER_UNSYNCHRONIZED;
 		}
-		/* At this point, the buffer is always idle (we checked it above). */
-		usage |= PIPE_TRANSFER_UNSYNCHRONIZED;
 	}
 	/* Using a staging buffer in GTT for larger reads is much faster. */
 	else if ((usage & PIPE_TRANSFER_READ) &&
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -78,6 +78,9 @@ void r600_draw_rectangle(struct blitter_context *blitter,
 	 * I guess the 4th one is derived from the first 3.
 	 * The vertex specification should match u_blitter's vertex element state. */
 	u_upload_alloc(rctx->uploader, 0, sizeof(float) * 24, &offset, &buf, (void**)&vb);
+	if (!buf)
+		return;
+
 	vb[0] = x1;
 	vb[1] = y1;
 	vb[2] = depth;
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -989,6 +989,11 @@ static void *r600_texture_transfer_map(struct pipe_context *ctx,

 			if (usage & PIPE_TRANSFER_READ) {
 				struct pipe_resource *temp = ctx->screen->resource_create(ctx->screen, &resource);
+				if (!temp) {
+					R600_ERR("failed to create a temporary depth texture\n");
+					FREE(trans);
+					return NULL;
+				}

 				r600_copy_region_with_blit(ctx, temp, 0, 0, 0, 0, texture, level, box);
 				rctx->blit_decompress_depth(ctx, (struct r600_texture*)temp, staging_depth,
--- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
+++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
@@ -25,6 +25,8 @@
 */
 #include "radeon_llvm_emit.h"
 #include "radeon_elf_util.h"
+#include "c11/threads.h"
+#include "gallivm/lp_bld_misc.h"
 #include "util/u_memory.h"
 #include "pipe/p_shader_tokens.h"

@@ -86,30 +88,29 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned type)

 static void init_r600_target()
 {
-	static unsigned initialized = 0;
-	if (!initialized) {
+	gallivm_init_llvm_targets();
 #if HAVE_LLVM < 0x0307
-		LLVMInitializeR600TargetInfo();
-		LLVMInitializeR600Target();
-		LLVMInitializeR600TargetMC();
-		LLVMInitializeR600AsmPrinter();
+	LLVMInitializeR600TargetInfo();
+	LLVMInitializeR600Target();
+	LLVMInitializeR600TargetMC();
+	LLVMInitializeR600AsmPrinter();
 #else
-		LLVMInitializeAMDGPUTargetInfo();
-		LLVMInitializeAMDGPUTarget();
-		LLVMInitializeAMDGPUTargetMC();
-		LLVMInitializeAMDGPUAsmPrinter();
+	LLVMInitializeAMDGPUTargetInfo();
+	LLVMInitializeAMDGPUTarget();
+	LLVMInitializeAMDGPUTargetMC();
+	LLVMInitializeAMDGPUAsmPrinter();

 #endif
-		initialized = 1;
-	}
 }

+static once_flag init_r600_target_once_flag = ONCE_FLAG_INIT;
+
 LLVMTargetRef radeon_llvm_get_r600_target(const char *triple)
 {
 	LLVMTargetRef target = NULL;
 	char *err_message = NULL;

-	init_r600_target();
+	call_once(&init_r600_target_once_flag, init_r600_target);

 	if (LLVMGetTargetFromTriple(triple, &target, &err_message)) {
 		fprintf(stderr, "Cannot find target for triple %s ", triple);
--- a/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
+++ b/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
@@ -233,6 +233,9 @@ static void vui(struct rvce_encoder *enc)
 {
 	int i;

+	if (!enc->pic.rate_ctrl.frame_rate_num)
+		return;
+
 	RVCE_BEGIN(0x04000009); // vui
 	RVCE_CS(0x00000000); //aspectRatioInfoPresentFlag
 	RVCE_CS(0x00000000); //aspectRatioInfo.aspectRatioIdc
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -468,7 +468,8 @@ void si_upload_const_buffer(struct si_context *sctx, struct r600_resource **rbuf

 	u_upload_alloc(sctx->b.uploader, 0, size, const_offset,
 		       (struct pipe_resource**)rbuffer, &tmp);
-	util_memcpy_cpu_to_le32(tmp, ptr, size);
+	if (*rbuffer)
+		util_memcpy_cpu_to_le32(tmp, ptr, size);
 }

 static void si_set_constant_buffer(struct pipe_context *ctx, uint shader, uint slot,
@@ -500,6 +501,11 @@ static void si_set_constant_buffer(struct pipe_context *ctx, uint shader, uint s
 			si_upload_const_buffer(sctx,
 					       (struct r600_resource**)&buffer, input->user_buffer,
 					       input->buffer_size, &buffer_offset);
+			if (!buffer) {
+				/* Just unbind on failure. */
+				si_set_constant_buffer(ctx, shader, slot, NULL);
+				return;
+			}
 			va = r600_resource(buffer)->gpu_address + buffer_offset;
 		} else {
 			pipe_resource_reference(&buffer, input->buffer);
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -170,6 +170,8 @@ static struct pipe_context *si_create_context(struct pipe_screen *screen, void *
 	if (sctx->b.chip_class == CIK) {
 		sctx->null_const_buf.buffer = pipe_buffer_create(screen, PIPE_BIND_CONSTANT_BUFFER,
 								 PIPE_USAGE_DEFAULT, 16);
+		if (!sctx->null_const_buf.buffer)
+			goto fail;
 		sctx->null_const_buf.buffer_size = sctx->null_const_buf.buffer->width0;

 		for (shader = 0; shader < SI_NUM_SHADERS; shader++) {
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3829,11 +3829,14 @@ int si_shader_binary_read(struct si_screen *sscreen, struct si_shader *shader)
 {
 	const struct radeon_shader_binary *binary = &shader->binary;
 	unsigned i;
+	int r;
 	bool dump  = r600_can_dump_shader(&sscreen->b,
 		shader->selector ? shader->selector->tokens : NULL);

 	si_shader_binary_read_config(sscreen, shader, 0);
-	si_shader_binary_upload(sscreen, shader);
+	r = si_shader_binary_upload(sscreen, shader);
+	if (r)
+		return r;

 	if (dump) {
 		if (!(sscreen->b.debug_flags & DBG_NO_ASM)) {
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -274,7 +274,7 @@ si_create_sampler_view_custom(struct pipe_context *ctx,
 			      unsigned force_level);

 /* si_state_shader.c */
-void si_update_shaders(struct si_context *sctx);
+bool si_update_shaders(struct si_context *sctx);
 void si_init_shader_functions(struct si_context *sctx);

 /* si_state_draw.c */
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -760,8 +760,8 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
 	else
 		sctx->current_rast_prim = info->mode;

-	si_update_shaders(sctx);
-	if (!si_upload_shader_descriptors(sctx))
+	if (!si_update_shaders(sctx) ||
+	    !si_upload_shader_descriptors(sctx))
 		return;

 	if (info->indexed) {
@@ -783,6 +783,10 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)

 			u_upload_alloc(sctx->b.uploader, start_offset, count * 2,
 				       &out_offset, &out_buffer, &ptr);
+			if (!out_buffer) {
+				pipe_resource_reference(&ib.buffer, NULL);
+				return;
+			}

 			util_shorten_ubyte_elts_to_userptr(&sctx->b.b, &ib, 0,
 							   ib.offset + start_offset,
@@ -803,6 +807,8 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
 			u_upload_data(sctx->b.uploader, start_offset, count * ib.index_size,
 				      (char*)ib.user_buffer + start_offset,
 				      &ib.offset, &ib.buffer);
+			if (!ib.buffer)
+				return;
 			/* info->start will be added by the drawing code */
 			ib.offset -= start_offset;
 		}
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -665,8 +665,16 @@ static void *si_create_shader_state(struct pipe_context *ctx,
 	struct si_shader_selector *sel = CALLOC_STRUCT(si_shader_selector);
 	int i;

+	if (!sel)
+		return NULL;
+
 	sel->type = pipe_shader_type;
 	sel->tokens = tgsi_dup_tokens(state->tokens);
+	if (!sel->tokens) {
+		FREE(sel);
+		return NULL;
+	}
+
 	sel->so = state->stream_output;
 	tgsi_scan_shader(state->tokens, &sel->info);
 	p_atomic_inc(&sscreen->b.num_shaders_created);
@@ -725,7 +733,12 @@ static void *si_create_shader_state(struct pipe_context *ctx,
 	}

 	if (sscreen->b.debug_flags & DBG_PRECOMPILE)
-		si_shader_select(ctx, sel);
+		if (si_shader_select(ctx, sel)) {
+			fprintf(stderr, "radeonsi: can't create a shader\n");
+			tgsi_free_tokens(sel->tokens);
+			FREE(sel);
+			return NULL;
+		}

 	return sel;
 }
@@ -1031,11 +1044,23 @@ static void si_init_gs_rings(struct si_context *sctx)
 	assert(!sctx->gs_rings);
 	sctx->gs_rings = CALLOC_STRUCT(si_pm4_state);

+	if (!sctx->gs_rings)
+		return;
+
 	sctx->esgs_ring = pipe_buffer_create(sctx->b.b.screen, PIPE_BIND_CUSTOM,
 				       PIPE_USAGE_DEFAULT, esgs_ring_size);
+	if (!sctx->esgs_ring) {
+		FREE(sctx->gs_rings);
+		return;
+	}

 	sctx->gsvs_ring = pipe_buffer_create(sctx->b.b.screen, PIPE_BIND_CUSTOM,
 					     PIPE_USAGE_DEFAULT, gsvs_ring_size);
+	if (!sctx->gsvs_ring) {
+		pipe_resource_reference(&sctx->esgs_ring, NULL);
+		FREE(sctx->gs_rings);
+		return;
+	}

 	if (sctx->b.chip_class >= CIK) {
 		if (sctx->b.chip_class >= VI) {
@@ -1094,14 +1119,16 @@ static void si_update_gs_rings(struct si_context *sctx)

 }
 /**
- * @returns 1 if \p sel has been updated to use a new scratch buffer and 0
- *          otherwise.
+ * @returns 1 if \p sel has been updated to use a new scratch buffer
+ *          0 if not
+ *          < 0 if there was a failure
 */
-static unsigned si_update_scratch_buffer(struct si_context *sctx,
+static int si_update_scratch_buffer(struct si_context *sctx,
 				    struct si_shader_selector *sel)
 {
 	struct si_shader *shader;
 	uint64_t scratch_va = sctx->scratch_buffer->gpu_address;
+	int r;

 	if (!sel)
 		return 0;
@@ -1122,7 +1149,9 @@ static unsigned si_update_scratch_buffer(struct si_context *sctx,
 	si_shader_apply_scratch_relocs(sctx, shader, scratch_va);

 	/* Replace the shader bo with a new bo that has the relocs applied. */
-	si_shader_binary_upload(sctx->screen, shader);
+	r = si_shader_binary_upload(sctx->screen, shader);
+	if (r)
+		return r;

 	/* Update the shader state to use the new shader bo. */
 	si_shader_init_pm4_state(shader);
@@ -1161,7 +1190,7 @@ static unsigned si_get_max_scratch_bytes_per_wave(struct si_context *sctx)
 	return bytes;
 }

-static void si_update_spi_tmpring_size(struct si_context *sctx)
+static bool si_update_spi_tmpring_size(struct si_context *sctx)
 {
 	unsigned current_scratch_buffer_size =
 		si_get_current_scratch_buffer_size(sctx);
@@ -1169,6 +1198,7 @@ static void si_update_spi_tmpring_size(struct si_context *sctx)
 		si_get_max_scratch_bytes_per_wave(sctx);
 	unsigned scratch_needed_size = scratch_bytes_per_wave *
 		sctx->scratch_waves;
+	int r;

 	if (scratch_needed_size > 0) {

@@ -1181,6 +1211,9 @@ static void si_update_spi_tmpring_size(struct si_context *sctx)
 			sctx->scratch_buffer =
 					si_resource_create_custom(&sctx->screen->b.b,
 	                                PIPE_USAGE_DEFAULT, scratch_needed_size);
+			if (!sctx->scratch_buffer)
+				return false;
+			sctx->emit_scratch_reloc = true;
 		}

 		/* Update the shaders, so they are using the latest scratch.  The
@@ -1188,31 +1221,57 @@ static void si_update_spi_tmpring_size(struct si_context *sctx)
 		 * last used, so we still need to try to update them, even if
 		 * they require scratch buffers smaller than the current size.
 		 */
-		if (si_update_scratch_buffer(sctx, sctx->ps_shader))
+		r = si_update_scratch_buffer(sctx, sctx->ps_shader);
+		if (r < 0)
+			return false;
+		if (r == 1)
 			si_pm4_bind_state(sctx, ps, sctx->ps_shader->current->pm4);
-		if (si_update_scratch_buffer(sctx, sctx->gs_shader))
+
+		r = si_update_scratch_buffer(sctx, sctx->gs_shader);
+		if (r < 0)
+			return false;
+		if (r == 1)
 			si_pm4_bind_state(sctx, gs, sctx->gs_shader->current->pm4);
-		if (si_update_scratch_buffer(sctx, sctx->tcs_shader))
+
+		r = si_update_scratch_buffer(sctx, sctx->tcs_shader);
+		if (r < 0)
+			return false;
+		if (r == 1)
 			si_pm4_bind_state(sctx, hs, sctx->tcs_shader->current->pm4);

 		/* VS can be bound as LS, ES, or VS. */
 		if (sctx->tes_shader) {
-			if (si_update_scratch_buffer(sctx, sctx->vs_shader))
+			r = si_update_scratch_buffer(sctx, sctx->vs_shader);
+			if (r < 0)
+				return false;
+			if (r == 1)
 				si_pm4_bind_state(sctx, ls, sctx->vs_shader->current->pm4);
 		} else if (sctx->gs_shader) {
-			if (si_update_scratch_buffer(sctx, sctx->vs_shader))
+			r = si_update_scratch_buffer(sctx, sctx->vs_shader);
+			if (r < 0)
+				return false;
+			if (r == 1)
 				si_pm4_bind_state(sctx, es, sctx->vs_shader->current->pm4);
 		} else {
-			if (si_update_scratch_buffer(sctx, sctx->vs_shader))
+			r = si_update_scratch_buffer(sctx, sctx->vs_shader);
+			if (r < 0)
+				return false;
+			if (r == 1)
 				si_pm4_bind_state(sctx, vs, sctx->vs_shader->current->pm4);
 		}

 		/* TES can be bound as ES or VS. */
 		if (sctx->gs_shader) {
-			if (si_update_scratch_buffer(sctx, sctx->tes_shader))
+			r = si_update_scratch_buffer(sctx, sctx->tes_shader);
+			if (r < 0)
+				return false;
+			if (r == 1)
 				si_pm4_bind_state(sctx, es, sctx->tes_shader->current->pm4);
 		} else {
-			if (si_update_scratch_buffer(sctx, sctx->tes_shader))
+			r = si_update_scratch_buffer(sctx, sctx->tes_shader);
+			if (r < 0)
+				return false;
+			if (r == 1)
 				si_pm4_bind_state(sctx, vs, sctx->tes_shader->current->pm4);
 		}
 	}
@@ -1223,6 +1282,7 @@ static void si_update_spi_tmpring_size(struct si_context *sctx)

 	sctx->spi_tmpring_size = S_0286E8_WAVES(sctx->scratch_waves) |
 				S_0286E8_WAVESIZE(scratch_bytes_per_wave >> 10);
+	return true;
 }

 static void si_init_tess_factor_ring(struct si_context *sctx)
@@ -1230,11 +1290,20 @@ static void si_init_tess_factor_ring(struct si_context *sctx)
 	assert(!sctx->tf_state);
 	sctx->tf_state = CALLOC_STRUCT(si_pm4_state);

+	if (!sctx->tf_state)
+		return;
+
 	sctx->tf_ring = pipe_buffer_create(sctx->b.b.screen, PIPE_BIND_CUSTOM,
 					   PIPE_USAGE_DEFAULT,
 					   32768 * sctx->screen->b.info.max_se);
+	if (!sctx->tf_ring) {
+		FREE(sctx->tf_state);
+		return;
+	}
+
 	sctx->b.clear_buffer(&sctx->b.b, sctx->tf_ring, 0,
 			     sctx->tf_ring->width0, fui(0), false);
+
 	assert(((sctx->tf_ring->width0 / 4) & C_030938_SIZE) == 0);

 	if (sctx->b.chip_class >= CIK) {
@@ -1290,7 +1359,6 @@ static void si_generate_fixed_func_tcs(struct si_context *sctx)

 	sctx->fixed_func_tcs_shader =
 		ureg_create_shader_and_destroy(ureg, &sctx->b.b);
-	assert(sctx->fixed_func_tcs_shader);
 }

 static void si_update_vgt_shader_config(struct si_context *sctx)
@@ -1338,32 +1406,49 @@ static void si_update_so(struct si_context *sctx, struct si_shader_selector *sha
 	sctx->b.streamout.stride_in_dw = shader->so.stride;
 }

-void si_update_shaders(struct si_context *sctx)
+bool si_update_shaders(struct si_context *sctx)
 {
 	struct pipe_context *ctx = (struct pipe_context*)sctx;
 	struct si_state_rasterizer *rs = sctx->queued.named.rasterizer;
+	int r;

 	/* Update stages before GS. */
 	if (sctx->tes_shader) {
-		if (!sctx->tf_state)
+		if (!sctx->tf_state) {
 			si_init_tess_factor_ring(sctx);
+			if (!sctx->tf_state)
+				return false;
+		}

 		/* VS as LS */
-		si_shader_select(ctx, sctx->vs_shader);
+		r = si_shader_select(ctx, sctx->vs_shader);
+		if (r)
+			return false;
 		si_pm4_bind_state(sctx, ls, sctx->vs_shader->current->pm4);

 		if (sctx->tcs_shader) {
-			si_shader_select(ctx, sctx->tcs_shader);
+			r = si_shader_select(ctx, sctx->tcs_shader);
+			if (r)
+				return false;
 			si_pm4_bind_state(sctx, hs, sctx->tcs_shader->current->pm4);
 		} else {
-			if (!sctx->fixed_func_tcs_shader)
+			if (!sctx->fixed_func_tcs_shader) {
 				si_generate_fixed_func_tcs(sctx);
-			si_shader_select(ctx, sctx->fixed_func_tcs_shader);
+				if (!sctx->fixed_func_tcs_shader)
+					return false;
+			}
+
+			r = si_shader_select(ctx, sctx->fixed_func_tcs_shader);
+			if (r)
+				return false;
 			si_pm4_bind_state(sctx, hs,
 					  sctx->fixed_func_tcs_shader->current->pm4);
 		}

-		si_shader_select(ctx, sctx->tes_shader);
+		r = si_shader_select(ctx, sctx->tes_shader);
+		if (r)
+			return false;
+
 		if (sctx->gs_shader) {
 			/* TES as ES */
 			si_pm4_bind_state(sctx, es, sctx->tes_shader->current->pm4);
@@ -1374,24 +1459,33 @@ void si_update_shaders(struct si_context *sctx)
 		}
 	} else if (sctx->gs_shader) {
 		/* VS as ES */
-		si_shader_select(ctx, sctx->vs_shader);
+		r = si_shader_select(ctx, sctx->vs_shader);
+		if (r)
+			return false;
 		si_pm4_bind_state(sctx, es, sctx->vs_shader->current->pm4);
 	} else {
 		/* VS as VS */
-		si_shader_select(ctx, sctx->vs_shader);
+		r = si_shader_select(ctx, sctx->vs_shader);
+		if (r)
+			return false;
 		si_pm4_bind_state(sctx, vs, sctx->vs_shader->current->pm4);
 		si_update_so(sctx, sctx->vs_shader);
 	}

 	/* Update GS. */
 	if (sctx->gs_shader) {
-		si_shader_select(ctx, sctx->gs_shader);
+		r = si_shader_select(ctx, sctx->gs_shader);
+		if (r)
+			return false;
 		si_pm4_bind_state(sctx, gs, sctx->gs_shader->current->pm4);
 		si_pm4_bind_state(sctx, vs, sctx->gs_shader->current->gs_copy_shader->pm4);
 		si_update_so(sctx, sctx->gs_shader);

-		if (!sctx->gs_rings)
+		if (!sctx->gs_rings) {
 			si_init_gs_rings(sctx);
+			if (!sctx->gs_rings)
+				return false;
+		}

 		if (sctx->emitted.named.gs_rings != sctx->gs_rings)
 			sctx->b.flags |= SI_CONTEXT_VGT_FLUSH;
@@ -1406,18 +1500,9 @@ void si_update_shaders(struct si_context *sctx)

 	si_update_vgt_shader_config(sctx);

-	si_shader_select(ctx, sctx->ps_shader);
-
-	if (!sctx->ps_shader->current) {
-		struct si_shader_selector *sel;
-
-		/* use a dummy shader if compiling the shader (variant) failed */
-		si_make_dummy_ps(sctx);
-		sel = sctx->dummy_pixel_shader;
-		si_shader_select(ctx, sel);
-		sctx->ps_shader->current = sel->current;
-	}
-
+	r = si_shader_select(ctx, sctx->ps_shader);
+	if (r)
+		return false;
 	si_pm4_bind_state(sctx, ps, sctx->ps_shader->current->pm4);

 	if (si_pm4_state_changed(sctx, ps) || si_pm4_state_changed(sctx, vs) ||
@@ -1428,9 +1513,14 @@ void si_update_shaders(struct si_context *sctx)
 		si_update_spi_map(sctx);
 	}

-	if (si_pm4_state_changed(sctx, ps) || si_pm4_state_changed(sctx, vs) ||
-	    si_pm4_state_changed(sctx, gs)) {
-		si_update_spi_tmpring_size(sctx);
+	if (si_pm4_state_changed(sctx, ls) ||
+	    si_pm4_state_changed(sctx, hs) ||
+	    si_pm4_state_changed(sctx, es) ||
+	    si_pm4_state_changed(sctx, gs) ||
+	    si_pm4_state_changed(sctx, vs) ||
+	    si_pm4_state_changed(sctx, ps)) {
+		if (!si_update_spi_tmpring_size(sctx))
+			return false;
 	}

 	if (sctx->ps_db_shader_control != sctx->ps_shader->current->db_shader_control) {
@@ -1445,6 +1535,7 @@ void si_update_shaders(struct si_context *sctx)
 		if (sctx->b.chip_class == SI)
 			si_mark_atom_dirty(sctx, &sctx->db_render_state);
 	}
+	return true;
 }

 void si_init_shader_functions(struct si_context *sctx)
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -188,10 +188,10 @@ dri2_drawable_get_buffers(struct dri_drawable *drawable,
       * may occur as the stvis->color_format.
       */
      switch(format) {
-      case PIPE_FORMAT_B8G8R8A8_UNORM:
+      case PIPE_FORMAT_BGRA8888_UNORM:
 	 depth = 32;
 	 break;
-      case PIPE_FORMAT_B8G8R8X8_UNORM:
+      case PIPE_FORMAT_BGRX8888_UNORM:
 	 depth = 24;
 	 break;
      case PIPE_FORMAT_B5G6R5_UNORM:
@@ -261,13 +261,13 @@ dri_image_drawable_get_buffers(struct dri_drawable *drawable,
      case PIPE_FORMAT_B5G6R5_UNORM:
         image_format = __DRI_IMAGE_FORMAT_RGB565;
         break;
-      case PIPE_FORMAT_B8G8R8X8_UNORM:
+      case PIPE_FORMAT_BGRX8888_UNORM:
         image_format = __DRI_IMAGE_FORMAT_XRGB8888;
         break;
-      case PIPE_FORMAT_B8G8R8A8_UNORM:
+      case PIPE_FORMAT_BGRA8888_UNORM:
         image_format = __DRI_IMAGE_FORMAT_ARGB8888;
         break;
-      case PIPE_FORMAT_R8G8B8A8_UNORM:
+      case PIPE_FORMAT_RGBA8888_UNORM:
         image_format = __DRI_IMAGE_FORMAT_ABGR8888;
         break;
      default:
@@ -314,10 +314,10 @@ dri2_allocate_buffer(__DRIscreen *sPriv,

   switch (format) {
      case 32:
-         pf = PIPE_FORMAT_B8G8R8A8_UNORM;
+         pf = PIPE_FORMAT_BGRA8888_UNORM;
         break;
      case 24:
-         pf = PIPE_FORMAT_B8G8R8X8_UNORM;
+         pf = PIPE_FORMAT_BGRX8888_UNORM;
         break;
      case 16:
         pf = PIPE_FORMAT_Z16_UNORM;
@@ -724,13 +724,13 @@ dri2_create_image_from_winsys(__DRIscreen *_screen,
      pf = PIPE_FORMAT_B5G6R5_UNORM;
      break;
   case __DRI_IMAGE_FORMAT_XRGB8888:
-      pf = PIPE_FORMAT_B8G8R8X8_UNORM;
+      pf = PIPE_FORMAT_BGRX8888_UNORM;
      break;
   case __DRI_IMAGE_FORMAT_ARGB8888:
-      pf = PIPE_FORMAT_B8G8R8A8_UNORM;
+      pf = PIPE_FORMAT_BGRA8888_UNORM;
      break;
   case __DRI_IMAGE_FORMAT_ABGR8888:
-      pf = PIPE_FORMAT_R8G8B8A8_UNORM;
+      pf = PIPE_FORMAT_RGBA8888_UNORM;
      break;
   default:
      pf = PIPE_FORMAT_NONE;
@@ -845,13 +845,13 @@ dri2_create_image(__DRIscreen *_screen,
      pf = PIPE_FORMAT_B5G6R5_UNORM;
      break;
   case __DRI_IMAGE_FORMAT_XRGB8888:
-      pf = PIPE_FORMAT_B8G8R8X8_UNORM;
+      pf = PIPE_FORMAT_BGRX8888_UNORM;
      break;
   case __DRI_IMAGE_FORMAT_ARGB8888:
-      pf = PIPE_FORMAT_B8G8R8A8_UNORM;
+      pf = PIPE_FORMAT_BGRA8888_UNORM;
      break;
   case __DRI_IMAGE_FORMAT_ABGR8888:
-      pf = PIPE_FORMAT_R8G8B8A8_UNORM;
+      pf = PIPE_FORMAT_RGBA8888_UNORM;
      break;
   default:
      pf = PIPE_FORMAT_NONE;
@@ -1293,6 +1293,7 @@ dri2_load_opencl_interop(struct dri_screen *screen)
 }

 struct dri2_fence {
+   struct dri_screen *driscreen;
   struct pipe_fence_handle *pipe_fence;
   void *cl_event;
 };
@@ -1313,6 +1314,7 @@ dri2_create_fence(__DRIcontext *_ctx)
      return NULL;
   }

+   fence->driscreen = dri_screen(_ctx->driScreenPriv);
   return fence;
 }

@@ -1336,6 +1338,7 @@ dri2_get_fence_from_cl_event(__DRIscreen *_screen, intptr_t cl_event)
      return NULL;
   }

+   fence->driscreen = driscreen;
   return fence;
 }

@@ -1360,9 +1363,9 @@ static GLboolean
 dri2_client_wait_sync(__DRIcontext *_ctx, void *_fence, unsigned flags,
                      uint64_t timeout)
 {
-   struct dri_screen *driscreen = dri_screen(_ctx->driScreenPriv);
-   struct pipe_screen *screen = driscreen->base.screen;
   struct dri2_fence *fence = (struct dri2_fence*)_fence;
+   struct dri_screen *driscreen = fence->driscreen;
+   struct pipe_screen *screen = driscreen->base.screen;

   /* No need to flush. The context was flushed when the fence was created. */

--- a/src/gallium/state_trackers/dri/dri_drawable.c
+++ b/src/gallium/state_trackers/dri/dri_drawable.c
@@ -231,11 +231,11 @@ dri_set_tex_buffer2(__DRIcontext *pDRICtx, GLint target,
      if (format == __DRI_TEXTURE_FORMAT_RGB)  {
         /* only need to cover the formats recognized by dri_fill_st_visual */
         switch (internal_format) {
-         case PIPE_FORMAT_B8G8R8A8_UNORM:
-            internal_format = PIPE_FORMAT_B8G8R8X8_UNORM;
+         case PIPE_FORMAT_BGRA8888_UNORM:
+            internal_format = PIPE_FORMAT_BGRX8888_UNORM;
            break;
-         case PIPE_FORMAT_A8R8G8B8_UNORM:
-            internal_format = PIPE_FORMAT_X8R8G8B8_UNORM;
+         case PIPE_FORMAT_ARGB8888_UNORM:
+            internal_format = PIPE_FORMAT_XRGB8888_UNORM;
            break;
         default:
            break;
--- a/src/gallium/targets/opencl/Makefile.am
+++ b/src/gallium/targets/opencl/Makefile.am
@@ -35,7 +35,8 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \
 	-lclangEdit \
 	-lclangLex \
 	-lclangBasic \
-	$(LLVM_LIBS)
+	$(LLVM_LIBS) \
+	$(PTHREAD_LIBS)

 nodist_EXTRA_lib@OPENCL_LIBNAME@_la_SOURCES = dummy.cpp
 lib@OPENCL_LIBNAME@_la_SOURCES =
--- a/src/glx/Makefile.am
+++ b/src/glx/Makefile.am
@@ -46,6 +46,7 @@ AM_CFLAGS = \
 	$(EXTRA_DEFINES_XF86VIDMODE) \
 	-D_REENTRANT \
 	-DDEFAULT_DRIVER_DIR=\"$(DRI_DRIVER_SEARCH_DIR)\" \
+	-DGL_LIB_NAME=\"lib@GL_LIB@.so.1\" \
 	$(DEFINES) \
 	$(LIBDRM_CFLAGS) \
 	$(DRI2PROTO_CFLAGS) \
--- a/src/glx/dri_common.c
+++ b/src/glx/dri_common.c
@@ -73,6 +73,10 @@ dri_message(int level, const char *f, ...)
   }
 }

+#ifndef GL_LIB_NAME
+#define GL_LIB_NAME "libGL.so.1"
+#endif
+
 #ifndef DEFAULT_DRIVER_DIR
 /* this is normally defined in Mesa/configs/default with DRI_DRIVER_SEARCH_PATH */
 #define DEFAULT_DRIVER_DIR "/usr/local/lib/dri"
@@ -99,7 +103,7 @@ driOpenDriver(const char *driverName)
   int len;

   /* Attempt to make sure libGL symbols will be visible to the driver */
-   glhandle = dlopen("libGL.so.1", RTLD_NOW | RTLD_GLOBAL);
+   glhandle = dlopen(GL_LIB_NAME, RTLD_NOW | RTLD_GLOBAL);

   libPaths = NULL;
   if (geteuid() == getuid()) {
--- a/src/glx/glxcmds.c
+++ b/src/glx/glxcmds.c
@@ -2646,7 +2646,11 @@ _X_EXPORT void (*glXGetProcAddressARB(const GLubyte * procName)) (void)
 */
 _X_EXPORT void (*glXGetProcAddress(const GLubyte * procName)) (void)
 #if defined(__GNUC__) && !defined(GLX_ALIAS_UNSUPPORTED)
+# if defined(USE_MGL_NAMESPACE)
+   __attribute__ ((alias("mglXGetProcAddressARB")));
+# else
   __attribute__ ((alias("glXGetProcAddressARB")));
+# endif
 #else
 {
   return glXGetProcAddressARB(procName);
--- a/src/glx/glxextensions.h
+++ b/src/glx/glxextensions.h
@@ -281,11 +281,17 @@ typedef void (*PFNGLXDISABLEEXTENSIONPROC) (const char *name);
 # define GLX_ALIAS_VOID(real_func, proto_args, args, aliased_func)
 #else
 # if defined(__GNUC__) && !defined(GLX_ALIAS_UNSUPPORTED)
-#  define GLX_ALIAS(return_type, real_func, proto_args, args, aliased_func) \
+/* GLX_ALIAS and GLX_ALIAS_VOID both expand to the macro GLX_ALIAS2. Using the
+ * extra expansion means that the name mangling macros in glx_mangle.h will
+ * apply before stringification, so the alias attribute will have a string like
+ * "mglXFoo" instead of "glXFoo". */
+#  define GLX_ALIAS2(return_type, real_func, proto_args, args, aliased_func) \
   return_type  real_func  proto_args                                   \
   __attribute__ ((alias( # aliased_func ) ));
+#  define GLX_ALIAS(return_type, real_func, proto_args, args, aliased_func) \
+   GLX_ALIAS2(return_type, real_func, proto_args, args, aliased_func)
 #  define GLX_ALIAS_VOID(real_func, proto_args, args, aliased_func) \
-   GLX_ALIAS(void, real_func, proto_args, args, aliased_func)
+   GLX_ALIAS2(void, real_func, proto_args, args, aliased_func)
 # else
 #  define GLX_ALIAS(return_type, real_func, proto_args, args, aliased_func) \
   return_type  real_func  proto_args                                   \
--- a/src/mapi/mapi_glapi.c
+++ b/src/mapi/mapi_glapi.c
@@ -175,7 +175,7 @@ _glapi_get_stub(const char *name, int generate)
   const struct mapi_stub *stub;

 #ifdef USE_MGL_NAMESPACE
-   if (name)
+   if (name && name[0] == 'm')
      name++;
 #endif

--- a/src/mesa/drivers/common/meta_blit.c
+++ b/src/mesa/drivers/common/meta_blit.c
@@ -71,9 +71,7 @@ setup_glsl_msaa_blit_scaled_shader(struct gl_context *ctx,
   char *sample_map_str = rzalloc_size(mem_ctx, 1);
   char *sample_map_expr = rzalloc_size(mem_ctx, 1);
   char *texel_fetch_macro = rzalloc_size(mem_ctx, 1);
-   const char *vs_source;
   const char *sampler_array_suffix = "";
-   const char *texcoord_type = "vec2";
   float y_scale;
   enum blit_msaa_shader shader_index;

@@ -99,7 +97,6 @@ setup_glsl_msaa_blit_scaled_shader(struct gl_context *ctx,
      shader_index += BLIT_2X_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_SCALED_RESOLVE -
                      BLIT_2X_MSAA_SHADER_2D_MULTISAMPLE_SCALED_RESOLVE;
      sampler_array_suffix = "Array";
-      texcoord_type = "vec3";
   }

   if (blit->msaa_shaders[shader_index]) {
@@ -150,28 +147,37 @@ setup_glsl_msaa_blit_scaled_shader(struct gl_context *ctx,
                          "   const int sample_map[%d] = int[%d](%s);\n",
                          samples, samples, sample_map_str);

-   ralloc_asprintf_append(&texel_fetch_macro,
-                          "#define TEXEL_FETCH(coord) texelFetch(texSampler, i%s(coord), %s);\n",
-                          texcoord_type, sample_number);
+   if (target == GL_TEXTURE_2D_MULTISAMPLE) {
+      ralloc_asprintf_append(&texel_fetch_macro,
+                             "#define TEXEL_FETCH(coord) texelFetch(texSampler, ivec2(coord), %s);\n",
+                             sample_number);
+   } else {
+      ralloc_asprintf_append(&texel_fetch_macro,
+                             "#define TEXEL_FETCH(coord) texelFetch(texSampler, ivec3(coord, layer), %s);\n",
+                             sample_number);
+   }

-   vs_source = ralloc_asprintf(mem_ctx,
+   static const char vs_source[] =
                               "#version 130\n"
                               "in vec2 position;\n"
-                               "in %s textureCoords;\n"
-                               "out %s texCoords;\n"
+                               "in vec3 textureCoords;\n"
+                               "out vec2 texCoords;\n"
+                               "flat out int layer;\n"
                               "void main()\n"
                               "{\n"
-                               "   texCoords = textureCoords;\n"
+                               "   texCoords = textureCoords.xy;\n"
+                               "   layer = int(textureCoords.z);\n"
                               "   gl_Position = vec4(position, 0.0, 1.0);\n"
-                               "}\n",
-                               texcoord_type,
-                               texcoord_type);
+                               "}\n"
+      ;
+
   fs_source = ralloc_asprintf(mem_ctx,
                               "#version 130\n"
                               "#extension GL_ARB_texture_multisample : enable\n"
                               "uniform sampler2DMS%s texSampler;\n"
                               "uniform float src_width, src_height;\n"
-                               "in %s texCoords;\n"
+                               "in vec2 texCoords;\n"
+                               "flat in int layer;\n"
                               "out vec4 out_color;\n"
                               "\n"
                               "void main()\n"
@@ -212,7 +218,6 @@ setup_glsl_msaa_blit_scaled_shader(struct gl_context *ctx,
                               "   out_color = mix(x_0_color, x_1_color, interp.y);\n"
                               "}\n",
                               sampler_array_suffix,
-                               texcoord_type,
                               sample_map_expr,
                               y_scale,
                               1.0f / y_scale,
--- a/src/mesa/drivers/dri/i915/i830_context.h
+++ b/src/mesa/drivers/dri/i915/i830_context.h
@@ -42,10 +42,10 @@
 #define I830_UPLOAD_STIPPLE          0x4
 #define I830_UPLOAD_INVARIENT        0x8
 #define I830_UPLOAD_RASTER_RULES     0x10
-#define I830_UPLOAD_TEX(i)           (0x10<<(i))
-#define I830_UPLOAD_TEXBLEND(i)      (0x100<<(i))
-#define I830_UPLOAD_TEX_ALL          (0x0f0)
-#define I830_UPLOAD_TEXBLEND_ALL     (0xf00)
+#define I830_UPLOAD_TEX(i)           (0x0100<<(i))
+#define I830_UPLOAD_TEXBLEND(i)      (0x1000<<(i))
+#define I830_UPLOAD_TEX_ALL          (0x0f00)
+#define I830_UPLOAD_TEXBLEND_ALL     (0xf000)

 /* State structure offsets - these will probably disappear.
 */
--- a/src/mesa/drivers/dri/i915/i915_context.h
+++ b/src/mesa/drivers/dri/i915/i915_context.h
@@ -115,6 +115,8 @@ enum {
   I915_RASTER_RULES_SETUP_SIZE,
 };

+#define I915_TEX_UNITS 8
+
 #define I915_MAX_CONSTANT      32
 #define I915_CONSTANT_SIZE     (2+(4*I915_MAX_CONSTANT))

@@ -194,7 +196,8 @@ struct i915_fragment_program

   /* Helpers for i915_fragprog.c:
    */
-   GLuint wpos_tex;
+   uint8_t texcoord_mapping[I915_TEX_UNITS];
+   uint8_t wpos_tex;
   bool depth_written;

   struct
@@ -205,15 +208,6 @@ struct i915_fragment_program
   GLuint nr_params;
 };

-
-
-
-
-
-
-#define I915_TEX_UNITS 8
-
-
 struct i915_hw_state
 {
   GLuint Ctx[I915_CTX_SETUP_SIZE];
--- a/src/mesa/drivers/dri/i915/i915_fragprog.c
+++ b/src/mesa/drivers/dri/i915/i915_fragprog.c
@@ -72,6 +72,22 @@ static const GLfloat cos_constants[4] = { 1.0,
   -1.0 / (6 * 5 * 4 * 3 * 2 * 1)
 };

+/* texcoord_mapping[unit] = index | TEXCOORD_{TEX,VAR} */
+#define TEXCOORD_TEX (0<<7)
+#define TEXCOORD_VAR (1<<7)
+
+static unsigned
+get_texcoord_mapping(struct i915_fragment_program *p, uint8_t texcoord)
+{
+   for (unsigned i = 0; i < p->ctx->Const.MaxTextureCoordUnits; i++) {
+      if (p->texcoord_mapping[i] == texcoord)
+         return i;
+   }
+
+   /* No mapping found: fall back to the last texcoord unit. */
+   return p->ctx->Const.MaxTextureCoordUnits - 1;
+}
+
 /**
 * Retrieve a ureg for the given source register.  Will emit
 * constants, apply swizzling and negation as needed.
@@ -82,6 +98,7 @@ src_vector(struct i915_fragment_program *p,
           const struct gl_fragment_program *program)
 {
   GLuint src;
+   unsigned unit;

   switch (source->File) {

@@ -119,8 +136,10 @@ src_vector(struct i915_fragment_program *p,
      case VARYING_SLOT_TEX5:
      case VARYING_SLOT_TEX6:
      case VARYING_SLOT_TEX7:
+         unit = get_texcoord_mapping(p, (source->Index -
+                                         VARYING_SLOT_TEX0) | TEXCOORD_TEX);
         src = i915_emit_decl(p, REG_TYPE_T,
-                              T_TEX0 + (source->Index - VARYING_SLOT_TEX0),
+                              T_TEX0 + unit,
                              D0_CHANNEL_ALL);
 	 break;

@@ -132,8 +151,10 @@ src_vector(struct i915_fragment_program *p,
      case VARYING_SLOT_VAR0 + 5:
      case VARYING_SLOT_VAR0 + 6:
      case VARYING_SLOT_VAR0 + 7:
+         unit = get_texcoord_mapping(p, (source->Index -
+                                         VARYING_SLOT_VAR0) | TEXCOORD_VAR);
         src = i915_emit_decl(p, REG_TYPE_T,
-                              T_TEX0 + (source->Index - VARYING_SLOT_VAR0),
+                              T_TEX0 + unit,
                              D0_CHANNEL_ALL);
         break;

@@ -1176,27 +1197,54 @@ fixup_depth_write(struct i915_fragment_program *p)
   }
 }

+static void
+check_texcoord_mapping(struct i915_fragment_program *p)
+{
+   GLbitfield64 inputs = p->FragProg.Base.InputsRead;
+   unsigned unit = 0;
+
+   for (unsigned i = 0; i < p->ctx->Const.MaxTextureCoordUnits; i++) {
+      if (inputs & VARYING_BIT_TEX(i)) {
+         if (unit >= p->ctx->Const.MaxTextureCoordUnits) {
+            unit++;
+            break;
+         }
+         p->texcoord_mapping[unit++] = i | TEXCOORD_TEX;
+      }
+      if (inputs & VARYING_BIT_VAR(i)) {
+         if (unit >= p->ctx->Const.MaxTextureCoordUnits) {
+            unit++;
+            break;
+         }
+         p->texcoord_mapping[unit++] = i | TEXCOORD_VAR;
+      }
+   }
+
+   if (unit > p->ctx->Const.MaxTextureCoordUnits)
+      i915_program_error(p, "Too many texcoord units");
+}

 static void
 check_wpos(struct i915_fragment_program *p)
 {
   GLbitfield64 inputs = p->FragProg.Base.InputsRead;
   GLint i;
+   unsigned unit = 0;

   p->wpos_tex = -1;

+   if ((inputs & VARYING_BIT_POS) == 0)
+      return;
+
   for (i = 0; i < p->ctx->Const.MaxTextureCoordUnits; i++) {
-      if (inputs & (VARYING_BIT_TEX(i) | VARYING_BIT_VAR(i)))
-         continue;
-      else if (inputs & VARYING_BIT_POS) {
-         p->wpos_tex = i;
-         inputs &= ~VARYING_BIT_POS;
-      }
+      unit += !!(inputs & VARYING_BIT_TEX(i));
+      unit += !!(inputs & VARYING_BIT_VAR(i));
   }

-   if (inputs & VARYING_BIT_POS) {
+   if (unit < p->ctx->Const.MaxTextureCoordUnits)
+      p->wpos_tex = unit;
+   else
      i915_program_error(p, "No free texcoord for wpos value");
-   }
 }


@@ -1212,6 +1260,7 @@ translate_program(struct i915_fragment_program *p)
   }

   i915_init_program(i915, p);
+   check_texcoord_mapping(p);
   check_wpos(p);
   upload_program(p);
   fixup_depth_write(p);
@@ -1420,22 +1469,24 @@ i915ValidateFragmentProgram(struct i915_context *i915)

   for (i = 0; i < p->ctx->Const.MaxTextureCoordUnits; i++) {
      if (inputsRead & VARYING_BIT_TEX(i)) {
+         int unit = get_texcoord_mapping(p, i | TEXCOORD_TEX);
         int sz = VB->AttribPtr[_TNL_ATTRIB_TEX0 + i]->size;

-         s2 &= ~S2_TEXCOORD_FMT(i, S2_TEXCOORD_FMT0_MASK);
-         s2 |= S2_TEXCOORD_FMT(i, SZ_TO_HW(sz));
+         s2 &= ~S2_TEXCOORD_FMT(unit, S2_TEXCOORD_FMT0_MASK);
+         s2 |= S2_TEXCOORD_FMT(unit, SZ_TO_HW(sz));

         EMIT_ATTR(_TNL_ATTRIB_TEX0 + i, EMIT_SZ(sz), 0, sz * 4);
      }
-      else if (inputsRead & VARYING_BIT_VAR(i)) {
+      if (inputsRead & VARYING_BIT_VAR(i)) {
+         int unit = get_texcoord_mapping(p, i | TEXCOORD_VAR);
         int sz = VB->AttribPtr[_TNL_ATTRIB_GENERIC0 + i]->size;

-         s2 &= ~S2_TEXCOORD_FMT(i, S2_TEXCOORD_FMT0_MASK);
-         s2 |= S2_TEXCOORD_FMT(i, SZ_TO_HW(sz));
+         s2 &= ~S2_TEXCOORD_FMT(unit, S2_TEXCOORD_FMT0_MASK);
+         s2 |= S2_TEXCOORD_FMT(unit, SZ_TO_HW(sz));

         EMIT_ATTR(_TNL_ATTRIB_GENERIC0 + i, EMIT_SZ(sz), 0, sz * 4);
      }
-      else if (i == p->wpos_tex) {
+      if (i == p->wpos_tex) {
 	 int wpos_size = 4 * sizeof(float);
         /* If WPOS is required, duplicate the XYZ position data in an
          * unused texture coordinate:
--- a/src/mesa/drivers/dri/i915/intel_fbo.c
+++ b/src/mesa/drivers/dri/i915/intel_fbo.c
@@ -658,6 +658,11 @@ intel_blit_framebuffer_with_blitter(struct gl_context *ctx,
 {
   struct intel_context *intel = intel_context(ctx);

+   /* Sync up the state of window system buffers.  We need to do this before
+    * we go looking for the buffers.
+    */
+   intel_prepare_render(intel);
+
   if (mask & GL_COLOR_BUFFER_BIT) {
      GLint i;
      struct gl_renderbuffer *src_rb = readFb->_ColorReadBuffer;
--- a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
@@ -312,13 +312,43 @@ namespace {
   }

   namespace image_validity {
+      /**
+       * Check whether the bound image is suitable for untyped access.
+       */
+      brw_predicate
+      emit_untyped_image_check(const fs_builder &bld, const fs_reg &image,
+                               brw_predicate pred)
+      {
+         const brw_device_info *devinfo = bld.shader->devinfo;
+         const fs_reg stride = offset(image, bld, BRW_IMAGE_PARAM_STRIDE_OFFSET);
+
+         if (devinfo->gen == 7 && !devinfo->is_haswell) {
+            /* Check whether the first stride component (i.e. the Bpp value)
+             * is greater than four, which on Gen7 indicates that a surface of
+             * type RAW has been bound for untyped access.  Reading or writing
+             * to a surface of type other than RAW using untyped surface
+             * messages causes a hang on IVB and VLV.
+             */
+            set_predicate(pred,
+                          bld.CMP(bld.null_reg_ud(), stride, fs_reg(4),
+                                  BRW_CONDITIONAL_G));
+
+            return BRW_PREDICATE_NORMAL;
+         } else {
+            /* More recent generations handle the format mismatch
+             * gracefully.
+             */
+            return pred;
+         }
+      }
+
      /**
       * Check whether there is an image bound at the given index and write
       * the comparison result to f0.0.  Returns an appropriate predication
       * mode to use on subsequent image operations.
       */
      brw_predicate
-      emit_surface_check(const fs_builder &bld, const fs_reg &image)
+      emit_typed_atomic_check(const fs_builder &bld, const fs_reg &image)
      {
         const brw_device_info *devinfo = bld.shader->devinfo;
         const fs_reg size = offset(image, bld, BRW_IMAGE_PARAM_SIZE_OFFSET);
@@ -895,7 +925,9 @@ namespace brw {
             * surface read on the result,
             */
            const brw_predicate pred =
-               emit_bounds_check(bld, image, saddr, dims);
+               emit_untyped_image_check(bld, image,
+                                        emit_bounds_check(bld, image,
+                                                          saddr, dims));

            /* and they don't know about surface coordinates, we need to
             * convert them to a raw memory offset.
@@ -1041,7 +1073,9 @@ namespace brw {
                * the surface write on the result,
                */
               const brw_predicate pred =
-                  emit_bounds_check(bld, image, saddr, dims);
+                  emit_untyped_image_check(bld, image,
+                                           emit_bounds_check(bld, image,
+                                                             saddr, dims));

               /* and, phew, they don't know about surface coordinates, we
                * need to convert them to a raw memory offset.
@@ -1072,7 +1106,7 @@ namespace brw {
         using namespace image_coordinates;
         using namespace surface_access;
         /* Avoid performing an atomic operation on an unbound surface. */
-         const brw_predicate pred = emit_surface_check(bld, image);
+         const brw_predicate pred = emit_typed_atomic_check(bld, image);

         /* Transform the image coordinates into actual surface coordinates. */
         const fs_reg saddr =
--- a/src/mesa/main/ffvertex_prog.c
+++ b/src/mesa/main/ffvertex_prog.c
@@ -293,9 +293,10 @@ struct ureg {
   GLuint file:4;
   GLint idx:9;      /* relative addressing may be negative */
                     /* sizeof(idx) should == sizeof(prog_src_reg::Index) */
+   GLuint abs:1;
   GLuint negate:1;
   GLuint swz:12;
-   GLuint pad:6;
+   GLuint pad:5;
 };


@@ -324,6 +325,7 @@ static const struct ureg undef = {
   0,
   0,
   0,
+   0,
   0
 };

@@ -342,6 +344,7 @@ static struct ureg make_ureg(GLuint file, GLint idx)
   struct ureg reg;
   reg.file = file;
   reg.idx = idx;
+   reg.abs = 0;
   reg.negate = 0;
   reg.swz = SWIZZLE_NOOP;
   reg.pad = 0;
@@ -350,6 +353,14 @@ static struct ureg make_ureg(GLuint file, GLint idx)



+static struct ureg absolute( struct ureg reg )
+{
+   reg.abs = 1;
+   reg.negate = 0;
+   return reg;
+}
+
+
 static struct ureg negate( struct ureg reg )
 {
   reg.negate ^= 1;
@@ -526,8 +537,8 @@ static void emit_arg( struct prog_src_register *src,
   src->File = reg.file;
   src->Index = reg.idx;
   src->Swizzle = reg.swz;
+   src->Abs = reg.abs;
   src->Negate = reg.negate ? NEGATE_XYZW : NEGATE_NONE;
-   src->Abs = 0;
   src->RelAddr = 0;
   /* Check that bitfield sizes aren't exceeded */
   assert(src->Index == reg.idx);
@@ -953,7 +964,7 @@ static struct ureg calculate_light_attenuation( struct tnl_program *p,

      emit_op2(p, OPCODE_DP3, spot, 0, negate(VPpli), spot_dir_norm);
      emit_op2(p, OPCODE_SLT, slt, 0, swizzle1(spot_dir_norm,W), spot);
-      emit_op2(p, OPCODE_POW, spot, 0, spot, swizzle1(attenuation, W));
+      emit_op2(p, OPCODE_POW, spot, 0, absolute(spot), swizzle1(attenuation, W));
      emit_op2(p, OPCODE_MUL, att, 0, slt, spot);

      release_temp(p, spot);
--- a/src/mesa/main/glformats.c
+++ b/src/mesa/main/glformats.c
@@ -2619,6 +2619,7 @@ _mesa_es3_effective_internal_format_for_format_and_type(GLenum format,
       * internal formats, they do not correspond to GL constants, so the base
       * format is returned instead.
       */
+      case GL_BGRA_EXT:
      case GL_LUMINANCE_ALPHA:
      case GL_LUMINANCE:
      case GL_ALPHA:
@@ -2738,8 +2739,19 @@ _mesa_es3_error_check_format_and_type(const struct gl_context *ctx,
      if (effectiveInternalFormat == GL_NONE)
         return GL_INVALID_OPERATION;

-      GLenum baseInternalFormat =
-         _mesa_base_tex_format(ctx, effectiveInternalFormat);
+      GLenum baseInternalFormat;
+      if (internalFormat == GL_BGRA_EXT) {
+         /* Unfortunately, _mesa_base_tex_format returns a base format of
+          * GL_RGBA for GL_BGRA_EXT.  This makes perfect sense if you're
+          * asking the question, "what channels does this format have?"
+          * However, if we're trying to determine if two internal formats
+          * match in the ES3 sense, we actually want GL_BGRA.
+          */
+         baseInternalFormat = GL_BGRA_EXT;
+      } else {
+         baseInternalFormat =
+            _mesa_base_tex_format(ctx, effectiveInternalFormat);
+      }

      if (internalFormat != baseInternalFormat)
         return GL_INVALID_OPERATION;
@@ -2748,6 +2760,11 @@ _mesa_es3_error_check_format_and_type(const struct gl_context *ctx,
   }

   switch (format) {
+   case GL_BGRA_EXT:
+      if (type != GL_UNSIGNED_BYTE || internalFormat != GL_BGRA)
+         return GL_INVALID_OPERATION;
+      break;
+
   case GL_RGBA:
      switch (type) {
      case GL_UNSIGNED_BYTE:
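
Note on the hunk above: the new GL_BGRA_EXT case accepts exactly one combination. A standalone sketch of that rule, assuming nothing beyond what the hunk shows (the function name and the SKETCH_ prefixed constants are hypothetical; the numeric values are the standard GL enum values, and GL_BGRA and GL_BGRA_EXT share one value):

```c
#include <stdbool.h>

/* Local copies of the GL enum values so the sketch builds without GL headers. */
enum {
   SKETCH_GL_UNSIGNED_BYTE          = 0x1401,
   SKETCH_GL_RGBA                   = 0x1908,
   SKETCH_GL_UNSIGNED_SHORT_4_4_4_4 = 0x8033,
   SKETCH_GL_BGRA_EXT               = 0x80E1
};

/* Mirrors the added case: for a BGRA format, the type must be
 * GL_UNSIGNED_BYTE and the internal format must itself be BGRA.
 * Returns true when the combination would not raise GL_INVALID_OPERATION.
 * Only meaningful when the format argument is BGRA; other formats are
 * handled by the remaining cases of the real switch.
 */
static bool bgra_combo_is_valid(unsigned type, unsigned internal_format)
{
   return type == SKETCH_GL_UNSIGNED_BYTE &&
          internal_format == SKETCH_GL_BGRA_EXT;
}
```

For example, BGRA data with a GL_RGBA internal format, or with a packed 4_4_4_4 type, is rejected under this rule.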
--- a/src/mesa/main/mipmap.c
+++ b/src/mesa/main/mipmap.c
@@ -1922,11 +1922,8 @@ generate_mipmap_uncompressed(struct gl_context *ctx, GLenum target,
      }

      /* get dest gl_texture_image */
-      dstImage = _mesa_get_tex_image(ctx, texObj, target, level + 1);
-      if (!dstImage) {
-         _mesa_error(ctx, GL_OUT_OF_MEMORY, "generating mipmaps");
-         return;
-      }
+      dstImage = _mesa_select_tex_image(texObj, target, level + 1);
+      assert(dstImage);

      if (target == GL_TEXTURE_1D_ARRAY) {
 	 srcDepth = srcHeight;
@@ -2110,7 +2107,19 @@ generate_mipmap_compressed(struct gl_context *ctx, GLenum target,
                                         srcWidth, srcHeight, srcDepth,
                                         &dstWidth, &dstHeight, &dstDepth);
      if (!nextLevel)
-	 break;
+	 goto end;
+
+      if (!_mesa_prepare_mipmap_level(ctx, texObj, level + 1,
+                                      dstWidth, dstHeight, dstDepth,
+                                      border, srcImage->InternalFormat,
+                                      srcImage->TexFormat)) {
+         /* all done */
+         goto end;
+      }
+
+      /* get dest gl_texture_image */
+      dstImage = _mesa_select_tex_image(texObj, target, level + 1);
+      assert(dstImage);

      /* Compute dst image strides and alloc memory on first iteration */
      temp_dst_row_stride = _mesa_format_row_stride(temp_format, dstWidth);
@@ -2124,13 +2133,6 @@ generate_mipmap_compressed(struct gl_context *ctx, GLenum target,
 	 }
      }

-      /* get dest gl_texture_image */
-      dstImage = _mesa_get_tex_image(ctx, texObj, target, level + 1);
-      if (!dstImage) {
-         _mesa_error(ctx, GL_OUT_OF_MEMORY, "generating mipmaps");
-         goto end;
-      }
-
      /* for 2D arrays, setup array[depth] of slice pointers */
      for (i = 0; i < srcDepth; i++) {
         temp_src_slices[i] = temp_src + temp_src_img_stride * i;
@@ -2149,14 +2151,6 @@ generate_mipmap_compressed(struct gl_context *ctx, GLenum target,
                                  dstWidth, dstHeight, dstDepth,
                                  temp_dst_slices, temp_dst_row_stride);

-      if (!_mesa_prepare_mipmap_level(ctx, texObj, level + 1,
-                                      dstWidth, dstHeight, dstDepth,
-                                      border, srcImage->InternalFormat,
-                                      srcImage->TexFormat)) {
-         /* all done */
-         goto end;
-      }
-
      /* The image space was allocated above so use glTexSubImage now */
      ctx->Driver.TexSubImage(ctx, 2, dstImage,
                              0, 0, 0, dstWidth, dstHeight, dstDepth,
--- a/src/mesa/state_tracker/st_cb_blit.c
+++ b/src/mesa/state_tracker/st_cb_blit.c
@@ -229,6 +229,7 @@ st_BlitFramebuffer(struct gl_context *ctx,
                  st_adjust_blit_for_msaa_resolve(&blit);

                  st->pipe->blit(st->pipe, &blit);
+                  dstRb->defined = true; /* front buffer tracking */
               }
            }
         }
@@ -266,6 +267,7 @@ st_BlitFramebuffer(struct gl_context *ctx,
                  st_adjust_blit_for_msaa_resolve(&blit);

                  st->pipe->blit(st->pipe, &blit);
+                  dstRb->defined = true; /* front buffer tracking */
               }
            }
         }
--- a/src/mesa/state_tracker/st_format.c
+++ b/src/mesa/state_tracker/st_format.c
@@ -1960,7 +1960,11 @@ st_ChooseTextureFormat(struct gl_context *ctx, GLenum target,
   else if (internalFormat == 3 || internalFormat == 4 ||
            internalFormat == GL_RGB || internalFormat == GL_RGBA ||
            internalFormat == GL_RGB8 || internalFormat == GL_RGBA8 ||
-            internalFormat == GL_BGRA)
+            internalFormat == GL_BGRA ||
+            internalFormat == GL_RGB16F ||
+            internalFormat == GL_RGBA16F ||
+            internalFormat == GL_RGB32F ||
+            internalFormat == GL_RGBA32F)
 	 bindings |= PIPE_BIND_RENDER_TARGET;

   /* GLES allows the driver to choose any format which matches
@@ -1 +1 @@
-11.0.2
+11.0.3