Compare commits

..

67 Commits

Author SHA1 Message Date
Emil Velikov
07571cd8cc Update version to 17.0.0-rc3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-06 13:18:13 +00:00
Lucas Stach
2fc362f147 etnaviv: force vertex buffers through the MMU
This fixes a vertex data corruption issue if some of the vertex streams
go through the MMU and some don't.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit e158b74971)
Nominated-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-02-03 11:18:53 +00:00
Christian König
89b51c7e43 st/va: make sure that we call begin_frame() only once v2
This fixes "st/va: delay calling begin_frame until we have all parameters".

v2: call begin frame after decoder (re)creation as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Tested-by: Andy Furniss <adf.lists@gmail.com>
(cherry picked from commit 1338d912f5)
2017-02-03 11:12:16 +00:00
Nayan Deshmukh
ac2337ee38 st/vdpau: only send buffers with B8G8R8A8 format to X
PresentPixmap only works if the pixmap depth matches with the
window depth, otherwise it returns a BadMatch protocol error.
Even if the depths match, the result won't look correctly
if the VDPAU RGB component order doesn't match the X11 one so
we only allow the X11 format.
For other buffers we copy them to a buffer which is send to X.

v2: only send buffers with format VDP_RGBA_FORMAT_B8G8R8A8
v3: reword commit message
v4: add comment explaining the code

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 31908d6a4a)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99637
Nominated-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Nominated-by: Michel Dänzer <michel.daenzer@amd.com> (IRC)
2017-02-03 11:09:00 +00:00
Mauro Rossi
77ec080710 android: fix llvm, elf dependencies for M, N releases
These changes set the correct llvm version and elf include path
which differ for Marshmallow and Nougat

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 9c45bb731c)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	Android.common.mk
2017-02-03 11:08:59 +00:00
Jason Ekstrand
eadbc95d64 anv: Improve flushing around STATE_BASE_ADDRESS
It is not clear from the docs exactly how pipelined STATE_BASE_ADDRESS
actually is.  We know from experimentation that we need to flush the
render cache prior to emitting STATE_BASE_ADDRESS and invalidate the
texture cache afterwards.  The only thing the PRM says is that, on gen8+
we're supposed to invalidate the state cache after STATE_BASE_ADDRESS
but experimentation has indicated that doing so does nothing whatsoever.

Since we don't really know, let's do just a bit more flushing in the
hopes that this won't be a problem again.  In particular:

 1) Do a CS stall before we emit STATE_BASE_ADDRESS since we don't
    really know whether or not it's pipelined.

 2) Do a data cache flush in case what runs before STATE_BASE_ADDRESS
    is a compute shader.

 3) Invalidate the state and constant caches after STATE_BASE_ADDRESS
    because the state may be getting cached there (we don't really know).

Reported-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 92128590bc)
2017-02-03 11:08:59 +00:00
Jason Ekstrand
69ec90ad24 anv: Flush render cache before STATE_BASE_ADDRESS on gen7
We had no good reason for *not* doing this on gen7 before but we didn't
know it was needed.  Recently, when trying update to Vulkan CTS version
1.0.2 in our CI system, Mark discovered GPU hangs on Haswell that appear
to be STATE_BASE_ADDRESS related.  This commit fixes them.

Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f1f9794118)
2017-02-03 11:08:59 +00:00
Jason Ekstrand
7abecef5c3 isl/formats: Only advertise sampling for A4B4G4R4 on Broadwell
This causes hangs on Broadwell if you try to render to it.  I have no
idea how we managed to not hit this earlier.

Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4871930451)
2017-02-03 11:08:59 +00:00
Jason Ekstrand
5d470a68e6 intel/blorp: Handle clearing of A4B4G4R4 on all platforms
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a0348b5a0b)
2017-02-03 11:08:59 +00:00
Wladimir J. van der Laan
3df060d953 etnaviv: Set SE.CLIP registers, add margins for scissor/clip registers
This fixes rendering of full-screen quads (and other screen-filling
geometry, e.g. ioquake3 walls up-close) on gc3000. It should be a no-op
on other hardware.

- It looks like SE_CLIP registers were not set at all.
  I'm amazed that rendering worked without them. Emit them to
  avoid issues on gc3000.

- Define constants
  ETNA_SE_SCISSOR_MARGIN_RIGHT (0x1119)
  ETNA_SE_SCISSOR_MARGIN_BOTTOM (0x1111)
  ETNA_SE_CLIP_MARGIN_RIGHT (0xffff)
  ETNA_SE_CLIP_MARGIN_BOTTOM (0xffff)

  These demarcate the margin (fixp16) between the computed sizes and the
  value sent to the chip. I have set these to the numbers used by the
  Vivante driver for gc2000. I am not sure whether any old hardware was
  relying on the old numbers, or whether those were just a guess. But if
  so, these need to be moved to the _specs structure.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit 56314f5baf)
2017-02-03 11:08:59 +00:00
Wladimir J. van der Laan
34cd53ca8c etnaviv: Generate new sin/cos instructions on GC3000
Shaders using sin/cos instructions were not working on GC3000.

The reason for this turns out to be that these chips implement sin/cos
in a different way (but using the same opcodes):

- Need their input scaled by 1/pi instead of 2/pi.

- Output an x and y component, which need to be multiplied to
  get the result.

- tex_amode needs to be set to 1.

Add a new bit to the compiler specs and generate these instructions
as necessary.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit fe3bb8cdb5)
2017-02-03 11:08:59 +00:00
Nanley Chery
05d1c8aa02 anv/cmd_buffer: Use the proper depth input attachment surface state
Commit 2852efcda4 moved the location of
the depth input attachment surface state from the render pass to the
image view, but failed to update the surface state location used when
emitting the binding table. Fix this by loading the surface state from
the correct location.

Fixes:
dEQP-VK.renderpass.formats.d16_unorm.input.*
dEQP-VK.renderpass.formats.d24_unorm_s8_uint.input.*
dEQP-VK.renderpass.formats.d32_sfloat.input.*
dEQP-VK.renderpass.formats.x8_d24_unorm_pack32.input.*
dEQP-VK.renderpass.attachment_allocation.input_output.93
dEQP-VK.renderpass.attachment_allocation.input_output.92
dEQP-VK.renderpass.attachment_allocation.input_output.82
dEQP-VK.renderpass.attachment_allocation.input_output.46

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 33e0c5d003)
2017-02-03 11:08:59 +00:00
Bartosz Tomczyk
ca222b7c18 glsl: fix heap-buffer-overflow
The `end+1` skips the ']', whereas the `strlen+1` includes the final
'\0' in the move to terminate the string.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit fc27181f9e)
2017-02-03 11:08:59 +00:00
Wladimir J. van der Laan
6c89a728d9 etnaviv: Cannot render to rb-swapped formats
Exposing rb swapped (or other swizzled) formats for rendering would
involve swizzing in the pixel shader. This is not the case at the
moment, so reject requests for creating such surfaces.

(GPUs that need an extra resolve step anyway due to multiple pixel
pipes, such as gc2000, might also do this swap in the resolve operation.
But this would be tricky to keep track of)

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
(cherry picked from commit 658568941d)
2017-02-03 11:08:59 +00:00
Christian Gmeiner
f3b7a51383 etnaviv: Avoid infinite loop in find_frame()
Use of unsigned loop control variable with '>= 0' would lead
to infinite loop.

Reported by clang:

etnaviv_compiler.c:1024:39: warning: comparison of unsigned expression
>= 0 is always true [-Wtautological-compare]
   for (unsigned sp = c->frame_sp; sp >= 0; sp--)
                                   ~~ ^  ~

v2: Simply use the same datatype as c->frame_sp is using.

CC: <mesa-stable@lists.freedesktop.org>
Reported-by: Rhys Kidd <rhyskidd@gmail.com>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
(cherry picked from commit 82fe240a99)
2017-02-03 11:08:59 +00:00
Dave Airlie
9ecfbafedb radv/ac: apply slice rounding to 1d arrays as well.
Fixes:
dEQP-VK.glsl.texture_functions.texture.*1darray*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8477aa71d9)
2017-02-03 11:08:58 +00:00
Dave Airlie
eaf311d90d radv/ac: implement txs for buffer textures.
This fixes a bunch of buffer related:
dEQP-VK.memory.pipeline_barrier.*
tests, that were crashing in LLVM due to this being missing.

Reviewed-by: Andres Rodriguez<andresx7@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 0ecd426490)
2017-02-03 11:08:58 +00:00
Dave Airlie
bbb4562def radv/ac: handle nir irem opcode.
This fixes:
dEQP-VK.spirv_assembly.instruction.compute.opsrem.*

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org"
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ecc3fa3ba3)
2017-02-03 11:08:58 +00:00
Dave Airlie
7083ca2625 radv/ac: fix multisample subpass image.
We weren't adding the fragment position properly.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 059dd17175)
2017-02-03 11:08:58 +00:00
Dave Airlie
8917af11f7 radv: handle transfer_write as a dst flag.
It appears we can get image barriers like:
    srcStageMask:                   VkPipelineStageFlags = 4096 (VK_PIPELINE_STAGE_TRANSFER_BIT)
    dstStageMask:                   VkPipelineStageFlags = 4096 (VK_PIPELINE_STAGE_TRANSFER_BIT)
    dependencyFlags:                VkDependencyFlags = 0
    memoryBarrierCount:             uint32_t = 0
    pMemoryBarriers:                const VkMemoryBarrier* = NULL
    bufferMemoryBarrierCount:       uint32_t = 0
    pBufferMemoryBarriers:          const VkBufferMemoryBarrier* = NULL
    imageMemoryBarrierCount:        uint32_t = 1
    pImageMemoryBarriers:           const VkImageMemoryBarrier* = 0x7ffc882367b0
        pImageMemoryBarriers[0]:        const VkImageMemoryBarrier = 0x7ffc882367b0:
            sType:                          VkStructureType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER (45)
            pNext:                          const void* = NULL
            srcAccessMask:                  VkAccessFlags = 4096 (VK_ACCESS_TRANSFER_WRITE_BIT)
            dstAccessMask:                  VkAccessFlags = 4096 (VK_ACCESS_TRANSFER_WRITE_BIT)
            oldLayout:                      VkImageLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL (7)
            newLayout:                      VkImageLayout = VK_IMAGE_LAYOUT_GENERAL (1)
            srcQueueFamilyIndex:            uint32_t = 4294967295
            dstQueueFamilyIndex:            uint32_t = 4294967295
            image:                          VkImage = 0x2df55e0
            subresourceRange:               VkImageSubresourceRange = 0x7ffc882367e0:
                aspectMask:                     VkImageAspectFlags = 1 (VK_IMAGE_ASPECT_COLOR_BIT)
                baseMipLevel:                   uint32_t = 0
                levelCount:                     uint32_t = 1
                baseArrayLayer:                 uint32_t = 0
                layerCount:                     uint32_t = 1

This fixes all the CTS dEQP-VK.memory.pipeline_barrier.transfer_dst tests here,
not sure if this is a too large hammer.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a1c1ba7d56)
2017-02-03 11:08:58 +00:00
Marek Olšák
b7f7dc7231 radeonsi: don't invoke DCC decompression in update_all_texture_descriptors
This fixes a bug uncovered by the 17-part patch series, specifically:
  "gallium/radeon: merge dirty_fb_counter and dirty_tex_descriptor_counter"

If dirty_tex_counter has been updated and set_shader_image invokes DCC
decompression, the DCC decompression itself checks the counter and updates
descriptors, which in turn invokes the same DCC decompression. The blitter
can't handle the recursion and the driver eventually crashes.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit a0740d59aa)
2017-02-03 11:08:58 +00:00
Bartosz Tomczyk
301c9b96f2 r600: Fix stack overflow
Commit 7b5878ee04 increased number of
outputs to 64, but left output array intact. This caused stack overflow
when number of outputs is bigger then 32. Found by ASAN.

Cc: "12.0 13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit a41f2527ae)
2017-02-03 11:08:58 +00:00
Kenneth Graunke
06b9bc66d5 i965: Support the force_glsl_version driconf option.
Gallium drivers have had this for a while.  It makes sense to support
it consistently across drivers, so expose it in i965 as well.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 2f7a7ae131)
2017-02-03 11:08:58 +00:00
Kenneth Graunke
270597d13f i965: Fix check for negative pitch in can_do_fast_copy_blit().
At this point, the pitch is in bytes.  We haven't yet divided the pitch
by 4 for tiled surfaces, so abs(pitch) may be larger than 32K.  This
means the bit 15 trick won't work.

The caller now has signed integers anyway, so just pass those through
and do the obvious check.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 02216a1ddf)
2017-02-03 11:08:58 +00:00
Kenneth Graunke
671dfe51a0 i965: Unbind deleted shaders from brw_context, fixing malloc heisenbug.
Applications may delete a shader program, create a new one, and bind it
before the next draw.  With terrible luck, malloc may randomly return a
chunk of memory for the new gl_program that happened to be the exact
same pointer as our previously bound gl_program.  In this case, our
logic to detect new programs in brw_upload_pipeline_state() would break:

      if (brw->vertex_program != ctx->VertexProgram._Current) {
         brw->vertex_program = ctx->VertexProgram._Current;
         brw->ctx.NewDriverState |= BRW_NEW_VERTEX_PROGRAM;
      }

Because the pointer is the same, we'd think it was the same program.
But it could be wildly different - a different stage altogether,
different sets of resources, and so on.  This causes utter chaos.

As unlikely as this seems, I believe I hit this when running a subset
of the CTS in a loop, in a group of tests that churns through simple
programs, deleting and rebuilding them.  Presumably malloc uses a
bucketing cache of sorts, and so freeing up a gl_program and allocating
a new one fairly quickly causes it to reuse that memory.

The result was that brw->vertex_program->info.num_ssbos claimed the
program had SSBOs, while brw->vs.base.prog_data.binding_table claimed
that there were none.  This was crazy, because the binding table is
calculated from info.num_ssbos - the shader info appeared to change
between shader compile time and draw time.  Careful use of watchpoints
revealed that it was being clobbered by rzalloc's memset when building
an entirely different program...

Fortunately, our 0xd0d0d0d0 canary for unused binding table entries
caused us to crash out of bounds when trying to upload SSBOs, or we
may have never discovered this heisenbug.

Fixes crashes in GL45-CTS.compute_shader.sso-case2 when using a hacked
cts-runner that only runs GL45-CTS.compute_shader.s* in EGL config ID 5
at 64x64 in a loop with 100 iterations.

Cc: "17.0 13.0 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 7c5629a269)
2017-02-03 11:08:58 +00:00
Bas Nieuwenhuizen
d7d772f903 radv/ac: Use base in push constant loads.
Apparently the source is not an address but an offset, so we actually
need to use the base.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 96c60b7f07)
2017-02-03 11:08:57 +00:00
Emil Velikov
522ee2cd7d configure.ac: list radeon in --with-vulkan-drivers help string
Analogous to what we do for the dri and gallium drivers.

Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@colllabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit cb6be5c8c0)
2017-02-03 11:08:57 +00:00
Emil Velikov
929b3bb6fe radv: automake: Don't install vk_platform.h or vulkan.h.
These files belong to the vulkan loader.

Identical to
045f38a507 vulkan: Don't install vk_platform.h or vulkan.h.

Cc: Dave Airlie <airlied@redhat.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 6f2dec0a23)
2017-02-03 11:08:57 +00:00
Emil Velikov
e6ea92b263 mesa/tests: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 091f2b8c98)
2017-02-03 11:08:57 +00:00
Emil Velikov
27e7e7e7e3 dri/osmesa: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 6ba96bdcab)
2017-02-03 11:08:57 +00:00
Emil Velikov
3919feee55 dri/swrast: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ede4ff9adc)
2017-02-03 11:08:57 +00:00
Emil Velikov
6ee946862c radeon, r200: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 5a0ba1e5de)
2017-02-03 11:08:57 +00:00
Emil Velikov
4e20356a6c mapi: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ee5de93269)
2017-02-03 11:08:57 +00:00
Emil Velikov
5236ab7bac loader: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit af860850a0)
2017-02-03 11:08:57 +00:00
Emil Velikov
4ea4e19ccb glx/windows: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 912b4f5472)
2017-02-03 11:08:57 +00:00
Emil Velikov
fad44e6aea glx/apple: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jeremy Sequoia <jeremyhu@apple.com>
(cherry picked from commit 5b874cee09)
2017-02-03 11:08:57 +00:00
Emil Velikov
a817d1e227 glx: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d66f9e6d93)
2017-02-03 11:08:57 +00:00
Emil Velikov
44ba34817c d3dadapter9: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d221bf9b91)
2017-02-03 11:08:56 +00:00
Emil Velikov
586b009cfe st/dri: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 517f34b4be)
2017-02-03 11:08:56 +00:00
Emil Velikov
89ce0721eb clover: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Aaron Watry <awatry@gmail.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 65d5a60cac)
2017-02-03 11:08:56 +00:00
Emil Velikov
87fc95c94c egl: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit c5921ae0d2)
2017-02-03 11:08:56 +00:00
Emil Velikov
042b3445b2 i915: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 90ac5c339e)
2017-02-03 11:08:56 +00:00
Emil Velikov
0a1ad5c916 i965: automake: include builddir prior to srcdir
The latter can contain stale generated file, which, as-is, we'll end up
using.

Fixes: bfd17c76c1 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 4622c75dfb)
2017-02-03 11:08:56 +00:00
Emil Velikov
fe1b2f7341 freedreno: automake: correctly set MKDIR_GEN
Analogous to previous commit.

Fixes: 4610e5ef28 "freedreno/ir3: fix sin/cos"
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Rob Clark <robclark@freedesktop.org>
Cc: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Reported-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
(cherry picked from commit a922c82125)
2017-02-03 11:08:56 +00:00
Emil Velikov
c22ee800d2 i965: automake: correctly set MKDIR_GEN
Otherwise we might end up w/o the respective folder (depending on
autotools version) and fail at build time.

Fixes: bfd17c76c1 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5eed48d237)
2017-02-03 11:08:56 +00:00
Jason Ekstrand
e79043bbb9 vulkan/wsi: Lower the maximum image sizes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit d6397dd625)
2017-02-03 11:08:56 +00:00
Jason Ekstrand
f14926027c vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetPresentModes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 659edd9f5c)
2017-02-03 11:08:56 +00:00
Jason Ekstrand
23ffeed7e0 vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetFormats
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit dc578ef060)
2017-02-03 11:08:56 +00:00
Emil Velikov
1e03b5e566 mesa: move variable declaration to where its used
The variable replacement was unused when building w/o
ENABLE_SHADER_CACHE. Since we can mix variable declarations and code,
move it to where its used.

Fixes: 9f8dc3bf03 "utils: build sha1/disk cache only with
Android/Autoconf"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

(cherry picked from commit 6a5850b04a)
2017-02-03 11:08:56 +00:00
Andreas Boll
58952675f6 configure.ac: Require LLVM for r300 only on x86 and x86_64
b3119a3 introduced a strict LLVM requirement for r300 on all
architectures and thus configure fails on architectures where LLVM is
not available or buggy.

r300 doesn't strictly require LLVM, but for performance reasons we
highly recommend LLVM usage. So require it at least on x86 and x86_64
architectures as we have done before b3119a3.

Fixes: b3119a3 ("configure.ac: Check gallium LLVM version in gallium_require_llvm")
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 1f2a890ace)
2017-02-03 11:08:55 +00:00
Lionel Landwerlin
fe44c532b2 spirv: handle undefined components for OpVectorShuffle
Fixes:
   dEQP-VK.spirv_assembly.instruction.compute.opspecconstantop.vector_related
   dEQP-VK.spirv_assembly.instruction.graphics.opspecconstantop.vector_related*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bbe8705c57)
2017-02-03 11:08:55 +00:00
Lionel Landwerlin
939c0c82e5 spirv: handle OpUndef as part of the variable parsing pass
Looking at the following bit of SPIRV shader :

...
%zero        = OpConstant %i32 0
%ivec3_0     = OpConstantComposite %ivec3 %zero %zero %zero
%vec3_undef  = OpUndef %ivec3
%sc_0        = OpSpecConstant %i32 0
%sc_1        = OpSpecConstant %i32 0
%sc_2        = OpSpecConstant %i32 0
...

Our compiler currently stops parsing variables & types on the OpUndef
and switches to instructions, leaving the following sc_[0-2] variables
untreated.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit df7063cba3)
2017-02-03 11:08:55 +00:00
Lionel Landwerlin
7c663b1d5e anv: fix descriptor pool internal size allocation
The size of the pool is slightly smaller than the size of the
structure containing the whole pool. We need to take that into account
on when setting up the internals.

Fixes a crash due to out of bound memory access in:
   dEQP-VK.api.descriptor_pool.out_of_pool_memory

v2: Drop debug traces (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c3421106ec)
2017-02-03 11:08:55 +00:00
Kenneth Graunke
2554c98d70 i965: Make intelEmitCopyBlit not truncate large strides.
When trying to blit larger tiled surfaces, the pitch can be larger than
32768 bytes, which means it won't fit in a GLshort.  Passing it in will
truncate the stride to 0, which has...surprising results.

The pitch can be up to 32,768 DWords, or 128kB.  We measure it in bytes,
but divide by 4 when programming it.  So we need to handle values up to
131,072.  Switch from GLshort to int32_t to avoid the truncation.

Fixes GL45-CTS.gtf30.GL3Tests.depth_texture.depth_texture_copyteximage
at widths greater than 8192.

v2: Use int32_t as negative values can be used (Jason).

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit f8f7ea508b)
2017-02-03 11:08:55 +00:00
Kenneth Graunke
31715781c6 i965: Use a UW source type for CS_OPCODE_CS_TERMINATE.
SIMD16 compute shaders use a send(16) with mlen 1 for the EOT message,
using a source of g127 for the single register.  With a UD type, this
supposedly could read g128, which doesn't exist, causing the simulator
to get cranky.  Use a UW type to avoid this.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit fcf723b647)
2017-02-03 11:08:55 +00:00
Iago Toral Quiroga
ebfe5e17ee anv/lower_input_attachments: honor sample index parameter to subpassLoad()
According to GL_KHR_vulkan_glsl, the signature of subpassLoad() is:

gvec4 subpassLoad(gsubpassInput   subpass);
gvec4 subpassLoad(gsubpassInputMS subpass, int sample);

So the multisampled case always receives an explicit sample index that we
should use. The current implementation was ignoring this parameter
and using gl_SampleID value instead.

Fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_id.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9b25769da6)
2017-02-03 11:08:55 +00:00
Kenneth Graunke
dcb3b24b86 i965: Fix fast depth clears for surfaces with a dimension of 16384.
I hadn't bothered to set this bit because I figured it would just
paper over us getting the rectangle wrong.  But it turns out that
there is a legitimate reason to use it, so let's do so.

The alternative would be to chop up 16k clears to multiple 8k clears,
which is pointlessly painful.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 5106df85da)
2017-02-03 11:08:55 +00:00
Lionel Landwerlin
5a806f7def anv: set command buffer to NULL when allocations fail
The spec section 5.2 says:

   "vkAllocateCommandBuffers can be used to create multiple command
   buffers. If the creation of any of those command buffers fails, the
   implementation must destroy all successfully created command buffer
   objects from this command, set all entries of the pCommandBuffers
   array to VK_NULL_HANDLE and return the error."

Fixes:
   dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_primary
   dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_secondary

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 25e21cb8d0)
2017-02-03 11:08:55 +00:00
Dave Airlie
c63652b8ac radv: program a default point size.
Along the lines of what
3b804819 anv: Default PointSize to 1.0 if not written by the shader
does for anv, program a default point size in the hw of 1.0.

This preempt fixes a bunch of geom shader tests.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 2ab2be092d)
2017-02-03 11:08:55 +00:00
Marek Olšák
651861d862 radeonsi: handle first_non_void correctly in si_create_vertex_elements
This fixes R11G11B10_FLOAT, because it's in the category of "OTHER",
meaning that it doesn't have any channel description.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit eac7df43ca)
2017-02-03 11:08:55 +00:00
Marek Olšák
d701877fb0 st/mesa: destroy pipe_context before destroying st_context (v2)
If radeonsi starts compiling an optimized shader variant asynchronously
with a GL debug callback set and the application destroys the GL context,
radeonsi crashes when trying to write shader stats into the debug output
of a non-existent context after compilation, because st/mesa was destroyed
before pipe_context.

Firefox with WebGL2 enabled hits this bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99456

v2: protect against a double destroy in st_create_context_priv and callers.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit d9ef549238)
2017-02-03 11:08:54 +00:00
Ian Romanick
b2bbfca79f mesa: Don't advertise GL_OES_read_format in core profile
OpenGL ES implementations are not allowed to ship ARB extensions, and
OpenGL implementations are not allowed to ship OES extensions.

The functionality is also included in GL_ARB_ES2_compatibility.  Ever
OpenGL core-profile driver currently exposes both extensions.  I don't
know of any applications that explicitly check for GL_OES_read_format,
so removing it seems very unlikely to cause problems.  No functionality
is removed.

I have left this extension in place for compatibility profile.  There
are still OpenGL 1.x drivers in Mesa, and adding code to check for
compatibility profile and not GL_ARB_ES2_compatibility for
GL_IMPLEMENTATION_COLOR_READ_TYPE and GL_IMPLEMENTATION_COLOR_READ_FORMAT
just feels dumb.

Three other other alternatives considered:

 - Remove the string from compatibility profile drivers but leave the
   functionality in place.

 - Add a flag to expose the extension string, and set it in every OpenGL
   driver that does not expose GL_ARB_ES2_compatibility (and those
   drivers only).  I tried this.  You can't have two instances of an
   extension in the extension table (one dummy_true for ES1 and one with
   a flag for compatibility profile), so the implementation requires a
   bit of effort.

 - Only expose the extension in compatibility if the version is less
   than 2.0.  I didn't see an easy way to do this.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c4a0c1efff)
2017-02-03 11:08:54 +00:00
Roland Scheidegger
140ad270c8 gallivm: (trivial) fix ddiv cpu implementation
we can't use the cpu implementation of fdiv, as this one uses different
lp_build_context, which causes assertion failure.
Just use default fdiv action (there is no fast rcp for doubles which we
could potentially use anyway).

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 25208949d7)
2017-02-03 11:08:54 +00:00
Roland Scheidegger
517fc3ef78 tgsi: implement ddiv opcode
softpipe (along with llvmpipe) claims to support arb_gpu_shader_fp64,
so we really need to support that opcode.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 3b575a955c)
2017-02-03 11:08:54 +00:00
Jason Ekstrand
450f6aa5b2 i965/blorp: Use the correct ISL format for combined depth/stencil
In brw_blorp_copyteximage, we use the format from the render buffer.
This could be a combined depth/stencil format.  In this case, we handle
stencil properly but we give blorp the wrong ISL format.  Specifically,
we would give blorp ISL_FORMAT_R32G32B32A32_FLOAT which is the wrong
size was causing GPU hangs.

Fixes: GL45-CTS.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_copyteximage

Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4c180f9633)
2017-02-03 11:08:54 +00:00
Topi Pohjolainen
d940b91f94 i965/blorp: Add also depth and stencil buffers to render cache
v2 (Jason, Curro): Add stencil also even though it is not
                   enabled yet.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit ba6399df94)
2017-02-03 11:08:54 +00:00
Emil Velikov
63f169d5d0 configure.ac: move require_dri_shared_libs_and_glapi() before its users
Otherwise we'll get a lovely message as below:
"require_dri_shared_libs_and_glapi: command not found"

Cc: Steven Newbury <steve@snewbury.org.uk>
Reported-by: Steven Newbury <steve@snewbury.org.uk>
Fixes: da410e6afa "configure: explicitly require shared glapi for
enable-dri"
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Steven Newbury <steve@snewbury.org.uk>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

(cherry picked from commit 5872850b88)
2017-02-03 11:08:54 +00:00
64 changed files with 450 additions and 175 deletions

View File

@@ -78,10 +78,22 @@ endif
ifeq ($(MESA_ENABLE_LLVM),true)
LOCAL_CFLAGS += \
-DHAVE_LLVM=0x0305 -DMESA_LLVM_VERSION_PATCH=2 \
-D__STDC_CONSTANT_MACROS \
-D__STDC_FORMAT_MACROS \
-D__STDC_LIMIT_MACROS
ifeq ($(MESA_ANDROID_MAJOR_VERSION),5)
LOCAL_CFLAGS += -DHAVE_LLVM=0x0305 -DMESA_LLVM_VERSION_PATCH=2
ELF_INCLUDES := external/elfutils/0.153/libelf
endif
ifeq ($(MESA_ANDROID_MAJOR_VERSION),6)
LOCAL_CFLAGS += -DHAVE_LLVM=0x0307 -DMESA_LLVM_VERSION_PATCH=0
ELF_INCLUDES := external/elfutils/src/libelf
endif
ifeq ($(MESA_ANDROID_MAJOR_VERSION),7)
LOCAL_CFLAGS += -DHAVE_LLVM=0x0308 -DMESA_LLVM_VERSION_PATCH=0
ELF_INCLUDES := external/elfutils/libelf
endif
endif
ifneq ($(LOCAL_IS_HOST_MODULE),true)

View File

@@ -1 +1 @@
17.0.0-rc2
17.0.0-rc3

View File

@@ -1436,6 +1436,22 @@ if test "x$enable_gallium_osmesa" = xyes; then
fi
fi
require_dri_shared_libs_and_glapi() {
if test "x$enable_static" = xyes; then
AC_MSG_ERROR([$1 cannot be build as static library])
fi
if test "x$enable_dri" != xyes; then
# There is only a single backend which won't be build/used otherwise.
# XXX: Revisit this as the egl/haiku is a thing.
AC_MSG_ERROR([$1 requires --enable-dri])
fi
if test "x$enable_shared_glapi" != xyes; then
AC_MSG_ERROR([$1 requires --enable-shared-glapi])
fi
}
if test "x$enable_dri" = xyes; then
require_dri_shared_libs_and_glapi "DRI"
@@ -1722,7 +1738,7 @@ fi
AC_ARG_WITH([vulkan-drivers],
[AS_HELP_STRING([--with-vulkan-drivers@<:@=DIRS...@:>@],
[comma delimited Vulkan drivers list, e.g.
"intel"
"intel,radeon"
@<:@default=no@:>@])],
[with_vulkan_drivers="$withval"],
[with_vulkan_drivers="no"])
@@ -1815,22 +1831,6 @@ AC_SUBST([OSMESA_LIB_DEPS])
AC_SUBST([OSMESA_PC_REQ])
AC_SUBST([OSMESA_PC_LIB_PRIV])
require_dri_shared_libs_and_glapi() {
if test "x$enable_static" = xyes; then
AC_MSG_ERROR([$1 cannot be build as static library])
fi
if test "x$enable_dri" != xyes; then
# There is only a single backend which won't be build/used otherwise.
# XXX: Revisit this as the egl/haiku is a thing.
AC_MSG_ERROR([$1 requires --enable-dri])
fi
if test "x$enable_shared_glapi" != xyes; then
AC_MSG_ERROR([$1 requires --enable-shared-glapi])
fi
}
dnl
dnl gbm configuration
dnl
@@ -2212,6 +2212,19 @@ gallium_require_llvm() {
fi
}
dnl
dnl r300 doesn't strictly require LLVM, but for performance reasons we
dnl highly recommend LLVM usage. So require it at least on x86 and x86_64
dnl architectures.
dnl
r300_require_llvm() {
case "$host" in *gnux32) return;; esac
case "$host_cpu" in
i*86|x86_64|amd64) gallium_require_llvm $1
;;
esac
}
dnl
dnl DRM is needed by X, Wayland, and offscreen rendering.
dnl Surfaceless is an alternative for the last one.
@@ -2298,7 +2311,7 @@ if test -n "$with_gallium_drivers"; then
HAVE_GALLIUM_R300=yes
PKG_CHECK_MODULES([RADEON], [libdrm_radeon >= $LIBDRM_RADEON_REQUIRED])
require_libdrm "r300"
gallium_require_llvm "r300"
r300_require_llvm "r300"
;;
xr600)
HAVE_GALLIUM_R600=yes

View File

@@ -55,7 +55,7 @@ LOCAL_C_INCLUDES := \
external/llvm/include \
external/llvm/device/include \
external/libcxx/include \
external/elfutils/$(if $(filter 5,$(MESA_ANDROID_MAJOR_VERSION)),0.153/,$(if $(filter 6,$(MESA_ANDROID_MAJOR_VERSION)),src/))libelf
$(ELF_INCLUDES)
LOCAL_STATIC_LIBRARIES := libLLVMCore

View File

@@ -1267,6 +1267,9 @@ static void visit_alu(struct nir_to_llvm_context *ctx, nir_alu_instr *instr)
src[1] = to_float(ctx, src[1]);
result = LLVMBuildFRem(ctx->builder, src[0], src[1], "");
break;
case nir_op_irem:
result = LLVMBuildSRem(ctx->builder, src[0], src[1], "");
break;
case nir_op_idiv:
result = LLVMBuildSDiv(ctx->builder, src[0], src[1], "");
break;
@@ -1745,9 +1748,12 @@ static LLVMValueRef visit_vulkan_resource_index(struct nir_to_llvm_context *ctx,
static LLVMValueRef visit_load_push_constant(struct nir_to_llvm_context *ctx,
nir_intrinsic_instr *instr)
{
LLVMValueRef ptr;
LLVMValueRef ptr, addr;
ptr = build_gep0(ctx, ctx->push_constants, get_src(ctx, instr->src[0]));
addr = LLVMConstInt(ctx->i32, nir_intrinsic_base(instr), 0);
addr = LLVMBuildAdd(ctx->builder, addr, get_src(ctx, instr->src[0]), "");
ptr = build_gep0(ctx, ctx->push_constants, addr);
ptr = cast_ptr(ctx, ptr, get_def_type(ctx, &instr->dest.ssa));
return LLVMBuildLoad(ctx->builder, ptr, "");
@@ -2238,7 +2244,7 @@ static int image_type_to_components_count(enum glsl_sampler_dim dim, bool array)
}
static LLVMValueRef get_image_coords(struct nir_to_llvm_context *ctx,
nir_intrinsic_instr *instr, bool add_frag_pos)
nir_intrinsic_instr *instr)
{
const struct glsl_type *type = instr->variables[0]->var->type;
if(instr->variables[0]->deref.child)
@@ -2253,6 +2259,8 @@ static LLVMValueRef get_image_coords(struct nir_to_llvm_context *ctx,
LLVMValueRef res;
int count;
enum glsl_sampler_dim dim = glsl_get_sampler_dim(type);
bool add_frag_pos = (dim == GLSL_SAMPLER_DIM_SUBPASS ||
dim == GLSL_SAMPLER_DIM_SUBPASS_MS);
bool is_ms = (dim == GLSL_SAMPLER_DIM_MS ||
dim == GLSL_SAMPLER_DIM_SUBPASS_MS);
@@ -2378,12 +2386,11 @@ static LLVMValueRef visit_image_load(struct nir_to_llvm_context *ctx,
} else {
bool is_da = glsl_sampler_type_is_array(type) ||
glsl_get_sampler_dim(type) == GLSL_SAMPLER_DIM_CUBE;
bool add_frag_pos = glsl_get_sampler_dim(type) == GLSL_SAMPLER_DIM_SUBPASS;
LLVMValueRef da = is_da ? ctx->i32one : ctx->i32zero;
LLVMValueRef glc = LLVMConstInt(ctx->i1, 0, false);
LLVMValueRef slc = LLVMConstInt(ctx->i1, 0, false);
params[0] = get_image_coords(ctx, instr, add_frag_pos);
params[0] = get_image_coords(ctx, instr);
params[1] = get_sampler_desc(ctx, instr->variables[0], DESC_IMAGE);
params[2] = LLVMConstInt(ctx->i32, 15, false); /* dmask */
if (HAVE_LLVM <= 0x0309) {
@@ -2442,7 +2449,7 @@ static void visit_image_store(struct nir_to_llvm_context *ctx,
LLVMValueRef slc = i1false;
params[0] = to_float(ctx, get_src(ctx, instr->src[2]));
params[1] = get_image_coords(ctx, instr, false); /* coords */
params[1] = get_image_coords(ctx, instr); /* coords */
params[2] = get_sampler_desc(ctx, instr->variables[0], DESC_IMAGE);
params[3] = LLVMConstInt(ctx->i32, 15, false); /* dmask */
if (HAVE_LLVM <= 0x0309) {
@@ -2502,7 +2509,7 @@ static LLVMValueRef visit_image_atomic(struct nir_to_llvm_context *ctx,
bool da = glsl_sampler_type_is_array(type) ||
glsl_get_sampler_dim(type) == GLSL_SAMPLER_DIM_CUBE;
coords = params[param_count++] = get_image_coords(ctx, instr, false);
coords = params[param_count++] = get_image_coords(ctx, instr);
params[param_count++] = get_sampler_desc(ctx, instr->variables[0], DESC_IMAGE);
params[param_count++] = i1false; /* r128 */
params[param_count++] = da ? i1true : i1false; /* da */
@@ -3154,6 +3161,15 @@ static void tex_fetch_ptrs(struct nir_to_llvm_context *ctx,
*fmask_ptr = get_sampler_desc(ctx, instr->texture, DESC_FMASK);
}
static LLVMValueRef apply_round_slice(struct nir_to_llvm_context *ctx,
LLVMValueRef coord)
{
coord = to_float(ctx, coord);
coord = ac_emit_llvm_intrinsic(&ctx->ac, "llvm.rint.f32", ctx->f32, &coord, 1, 0);
coord = to_integer(ctx, coord);
return coord;
}
static void visit_tex(struct nir_to_llvm_context *ctx, nir_tex_instr *instr)
{
LLVMValueRef result = NULL;
@@ -3211,6 +3227,11 @@ static void visit_tex(struct nir_to_llvm_context *ctx, nir_tex_instr *instr)
}
}
if (instr->op == nir_texop_txs && instr->sampler_dim == GLSL_SAMPLER_DIM_BUF) {
result = get_buffer_size(ctx, res_ptr, false);
goto write_result;
}
if (instr->op == nir_texop_texture_samples) {
LLVMValueRef res, samples, is_msaa;
res = LLVMBuildBitCast(ctx->builder, res_ptr, ctx->v8i32, "");
@@ -3310,15 +3331,16 @@ static void visit_tex(struct nir_to_llvm_context *ctx, nir_tex_instr *instr)
/* Pack texture coordinates */
if (coord) {
address[count++] = coords[0];
if (instr->coord_components > 1)
if (instr->coord_components > 1) {
if (instr->sampler_dim == GLSL_SAMPLER_DIM_1D && instr->is_array && instr->op != nir_texop_txf) {
coords[1] = apply_round_slice(ctx, coords[1]);
}
address[count++] = coords[1];
}
if (instr->coord_components > 2) {
/* This seems like a bit of a hack - but it passes Vulkan CTS with it */
if (instr->sampler_dim != GLSL_SAMPLER_DIM_3D && instr->op != nir_texop_txf) {
coords[2] = to_float(ctx, coords[2]);
coords[2] = ac_emit_llvm_intrinsic(&ctx->ac, "llvm.rint.f32", ctx->f32, &coords[2],
1, 0);
coords[2] = to_integer(ctx, coords[2]);
coords[2] = apply_round_slice(ctx, coords[2]);
}
address[count++] = coords[2];
}

View File

@@ -21,9 +21,7 @@
include Makefile.sources
vulkan_includedir = $(includedir)/vulkan
vulkan_include_HEADERS = \
noinst_HEADERS = \
$(top_srcdir)/include/vulkan/vk_platform.h \
$(top_srcdir)/include/vulkan/vulkan.h

View File

@@ -438,7 +438,8 @@ radv_emit_graphics_raster_state(struct radv_cmd_buffer *cmd_buffer,
raster->spi_interp_control);
radeon_set_context_reg_seq(cmd_buffer->cs, R_028A00_PA_SU_POINT_SIZE, 2);
radeon_emit(cmd_buffer->cs, 0);
unsigned tmp = (unsigned)(1.0 * 8.0);
radeon_emit(cmd_buffer->cs, S_028A00_HEIGHT(tmp) | S_028A00_WIDTH(tmp));
radeon_emit(cmd_buffer->cs, S_028A04_MIN_SIZE(radv_pack_float_12p4(0)) |
S_028A04_MAX_SIZE(radv_pack_float_12p4(8192/2))); /* R_028A04_PA_SU_POINT_MINMAX */
@@ -2605,6 +2606,7 @@ void radv_CmdPipelineBarrier(
break;
case VK_ACCESS_COLOR_ATTACHMENT_READ_BIT:
case VK_ACCESS_TRANSFER_READ_BIT:
case VK_ACCESS_TRANSFER_WRITE_BIT:
case VK_ACCESS_INPUT_ATTACHMENT_READ_BIT:
flush_bits |= RADV_CMD_FLUSH_AND_INV_FRAMEBUFFER | RADV_CMD_FLAG_INV_GLOBAL_L2;
default:

View File

@@ -535,7 +535,7 @@ private:
const char *str_end;
while((str_start = strchr(name_copy, '[')) &&
(str_end = strchr(name_copy, ']'))) {
memmove(str_start, str_end + 1, 1 + strlen(str_end));
memmove(str_start, str_end + 1, 1 + strlen(str_end + 1));
}
unsigned index = 0;

View File

@@ -1102,23 +1102,43 @@ vtn_handle_constant(struct vtn_builder *b, SpvOp opcode,
SpvOp opcode = get_specialization(b, val, w[3]);
switch (opcode) {
case SpvOpVectorShuffle: {
struct vtn_value *v0 = vtn_value(b, w[4], vtn_value_type_constant);
struct vtn_value *v1 = vtn_value(b, w[5], vtn_value_type_constant);
unsigned len0 = glsl_get_vector_elements(v0->const_type);
unsigned len1 = glsl_get_vector_elements(v1->const_type);
struct vtn_value *v0 = &b->values[w[4]];
struct vtn_value *v1 = &b->values[w[5]];
assert(v0->value_type == vtn_value_type_constant ||
v0->value_type == vtn_value_type_undef);
assert(v1->value_type == vtn_value_type_constant ||
v1->value_type == vtn_value_type_undef);
unsigned len0 = v0->value_type == vtn_value_type_constant ?
glsl_get_vector_elements(v0->const_type) :
glsl_get_vector_elements(v0->type->type);
unsigned len1 = v1->value_type == vtn_value_type_constant ?
glsl_get_vector_elements(v1->const_type) :
glsl_get_vector_elements(v1->type->type);
assert(len0 + len1 < 16);
unsigned bit_size = glsl_get_bit_size(val->const_type);
assert(bit_size == glsl_get_bit_size(v0->const_type) &&
bit_size == glsl_get_bit_size(v1->const_type));
unsigned bit_size0 = v0->value_type == vtn_value_type_constant ?
glsl_get_bit_size(v0->const_type) :
glsl_get_bit_size(v0->type->type);
unsigned bit_size1 = v1->value_type == vtn_value_type_constant ?
glsl_get_bit_size(v1->const_type) :
glsl_get_bit_size(v1->type->type);
assert(bit_size == bit_size0 && bit_size == bit_size1);
if (bit_size == 64) {
uint64_t u64[8];
for (unsigned i = 0; i < len0; i++)
u64[i] = v0->constant->values[0].u64[i];
for (unsigned i = 0; i < len1; i++)
u64[len0 + i] = v1->constant->values[0].u64[i];
if (v0->value_type == vtn_value_type_constant) {
for (unsigned i = 0; i < len0; i++)
u64[i] = v0->constant->values[0].u64[i];
}
if (v1->value_type == vtn_value_type_constant) {
for (unsigned i = 0; i < len1; i++)
u64[len0 + i] = v1->constant->values[0].u64[i];
}
for (unsigned i = 0, j = 0; i < count - 6; i++, j++) {
uint32_t comp = w[i + 6];
@@ -1132,11 +1152,14 @@ vtn_handle_constant(struct vtn_builder *b, SpvOp opcode,
}
} else {
uint32_t u32[8];
for (unsigned i = 0; i < len0; i++)
u32[i] = v0->constant->values[0].u32[i];
for (unsigned i = 0; i < len1; i++)
u32[len0 + i] = v1->constant->values[0].u32[i];
if (v0->value_type == vtn_value_type_constant) {
for (unsigned i = 0; i < len0; i++)
u32[i] = v0->constant->values[0].u32[i];
}
if (v1->value_type == vtn_value_type_constant) {
for (unsigned i = 0; i < len1; i++)
u32[len0 + i] = v1->constant->values[0].u32[i];
}
for (unsigned i = 0, j = 0; i < count - 6; i++, j++) {
uint32_t comp = w[i + 6];
@@ -2902,6 +2925,7 @@ vtn_handle_variable_or_type_instruction(struct vtn_builder *b, SpvOp opcode,
vtn_handle_constant(b, opcode, w, count);
break;
case SpvOpUndef:
case SpvOpVariable:
vtn_handle_variables(b, opcode, w, count);
break;

View File

@@ -1268,6 +1268,12 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
const uint32_t *w, unsigned count)
{
switch (opcode) {
case SpvOpUndef: {
struct vtn_value *val = vtn_push_value(b, w[2], vtn_value_type_undef);
val->type = vtn_value(b, w[1], vtn_value_type_type)->type;
break;
}
case SpvOpVariable: {
struct vtn_variable *var = rzalloc(b, struct vtn_variable);
var->type = vtn_value(b, w[1], vtn_value_type_type)->type;

View File

@@ -96,8 +96,8 @@ AM_CFLAGS += \
-I$(top_srcdir)/src/egl/drivers/dri2 \
-I$(top_srcdir)/src/gbm/backends/dri \
-I$(top_srcdir)/src/egl/wayland/wayland-egl \
-I$(top_srcdir)/src/egl/wayland/wayland-drm \
-I$(top_builddir)/src/egl/wayland/wayland-drm \
-I$(top_srcdir)/src/egl/wayland/wayland-drm \
-DDEFAULT_DRIVER_DIR=\"$(DRI_DRIVER_SEARCH_DIR)\" \
-D_EGL_BUILT_IN_DRIVER_DRI2

View File

@@ -34,7 +34,7 @@ LOCAL_C_INCLUDES += \
external/llvm/include \
external/llvm/device/include \
external/libcxx/include \
external/elfutils/$(if $(filter true,$(MESA_LOLLIPOP_BUILD)),0.153/)libelf
$(ELF_INCLUDES)
endif
include $(MESA_COMMON_MK)

View File

@@ -2624,7 +2624,6 @@ lp_set_default_actions_cpu(
bld_base->op_actions[TGSI_OPCODE_DSLT].emit = dslt_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_DSNE].emit = dsne_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_DDIV].emit = div_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_DRSQ].emit = drecip_sqrt_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_DSQRT].emit = dsqrt_emit_cpu;

View File

@@ -209,6 +209,16 @@ micro_dadd(union tgsi_double_channel *dst,
dst->d[3] = src[0].d[3] + src[1].d[3];
}
static void
micro_ddiv(union tgsi_double_channel *dst,
const union tgsi_double_channel *src)
{
dst->d[0] = src[0].d[0] / src[1].d[0];
dst->d[1] = src[0].d[1] / src[1].d[1];
dst->d[2] = src[0].d[2] / src[1].d[2];
dst->d[3] = src[0].d[3] / src[1].d[3];
}
static void
micro_ddx(union tgsi_exec_channel *dst,
const union tgsi_exec_channel *src)
@@ -5995,6 +6005,10 @@ exec_instruction(
exec_double_binary(mach, inst, micro_dadd, TGSI_EXEC_DATA_DOUBLE);
break;
case TGSI_OPCODE_DDIV:
exec_double_binary(mach, inst, micro_ddiv, TGSI_EXEC_DATA_DOUBLE);
break;
case TGSI_OPCODE_DMUL:
exec_double_binary(mach, inst, micro_dmul, TGSI_EXEC_DATA_DOUBLE);
break;

View File

@@ -1021,7 +1021,7 @@ label_mark_use(struct etna_compile *c, struct etna_compile_label *label)
static struct etna_compile_frame *
find_frame(struct etna_compile *c, enum etna_compile_frame_type type)
{
for (unsigned sp = c->frame_sp; sp >= 0; sp--)
for (int sp = c->frame_sp; sp >= 0; sp--)
if (c->frame_stack[sp].type == type)
return &c->frame_stack[sp];
@@ -1444,7 +1444,42 @@ static void
trans_trig(const struct instr_translater *t, struct etna_compile *c,
const struct tgsi_full_instruction *inst, struct etna_inst_src *src)
{
if (c->specs->has_sin_cos_sqrt) {
if (c->specs->has_new_sin_cos) { /* Alternative SIN/COS */
/* On newer chips alternative SIN/COS instructions are implemented,
* which:
* - Need their input scaled by 1/pi instead of 2/pi
* - Output an x and y component, which need to be multiplied to
* get the result
*/
/* TGSI lowering should deal with SCS */
assert(inst->Instruction.Opcode != TGSI_OPCODE_SCS);
struct etna_native_reg temp = etna_compile_get_inner_temp(c); /* only using .xyz */
emit_inst(c, &(struct etna_inst) {
.opcode = INST_OPCODE_MUL,
.sat = 0,
.dst = etna_native_to_dst(temp, INST_COMPS_Z),
.src[0] = src[0], /* any swizzling happens here */
.src[1] = alloc_imm_f32(c, 1.0f / M_PI),
});
emit_inst(c, &(struct etna_inst) {
.opcode = inst->Instruction.Opcode == TGSI_OPCODE_COS
? INST_OPCODE_COS
: INST_OPCODE_SIN,
.sat = 0,
.dst = etna_native_to_dst(temp, INST_COMPS_X | INST_COMPS_Y),
.src[2] = etna_native_to_src(temp, SWIZZLE(Z, Z, Z, Z)),
.tex = { .amode=1 }, /* Unknown bit needs to be set */
});
emit_inst(c, &(struct etna_inst) {
.opcode = INST_OPCODE_MUL,
.sat = inst->Instruction.Saturate,
.dst = convert_dst(c, &inst->Dst[0]),
.src[0] = etna_native_to_src(temp, SWIZZLE(X, X, X, X)),
.src[1] = etna_native_to_src(temp, SWIZZLE(Y, Y, Y, Y)),
});
} else if (c->specs->has_sin_cos_sqrt) {
/* TGSI lowering should deal with SCS */
assert(inst->Instruction.Opcode != TGSI_OPCODE_SCS);

View File

@@ -491,6 +491,23 @@ etna_emit_state(struct etna_context *ctx)
/*00C14*/ EMIT_STATE(SE_DEPTH_BIAS, rasterizer->SE_DEPTH_BIAS);
/*00C18*/ EMIT_STATE(SE_CONFIG, rasterizer->SE_CONFIG);
}
if (unlikely(dirty & (ETNA_DIRTY_SCISSOR | ETNA_DIRTY_FRAMEBUFFER |
ETNA_DIRTY_RASTERIZER | ETNA_DIRTY_VIEWPORT))) {
struct etna_rasterizer_state *rasterizer = etna_rasterizer_state(ctx->rasterizer);
uint32_t clip_right =
MIN2(ctx->framebuffer.SE_CLIP_RIGHT, ctx->viewport.SE_CLIP_RIGHT);
uint32_t clip_bottom =
MIN2(ctx->framebuffer.SE_CLIP_BOTTOM, ctx->viewport.SE_CLIP_BOTTOM);
if (rasterizer->scissor) {
clip_right = MIN2(ctx->scissor.SE_CLIP_RIGHT, clip_right);
clip_bottom = MIN2(ctx->scissor.SE_CLIP_BOTTOM, clip_bottom);
}
/*00C20*/ EMIT_STATE_FIXP(SE_CLIP_RIGHT, clip_right);
/*00C24*/ EMIT_STATE_FIXP(SE_CLIP_BOTTOM, clip_bottom);
}
if (unlikely(dirty & (ETNA_DIRTY_SHADER))) {
/*00E00*/ EMIT_STATE(RA_CONTROL, ctx->shader_state.RA_CONTROL);
}

View File

@@ -47,6 +47,17 @@
/* PE render targets must be aligned to 64 bytes */
#define ETNA_PE_ALIGNMENT (64)
/* These demarcate the margin (fixp16) between the computed sizes and the
value sent to the chip. These have been set to the numbers used by the
Vivante driver on gc2000. They used to be -1 for scissor right and bottom. I
am not sure whether older hardware was relying on these or they were just a
guess. But if so, these need to be moved to the _specs structure.
*/
#define ETNA_SE_SCISSOR_MARGIN_RIGHT (0x1119)
#define ETNA_SE_SCISSOR_MARGIN_BOTTOM (0x1111)
#define ETNA_SE_CLIP_MARGIN_RIGHT (0xffff)
#define ETNA_SE_CLIP_MARGIN_BOTTOM (0xffff)
/* GPU chip 3D specs */
struct etna_specs {
/* supports SUPERTILE (64x64) tiling? */
@@ -59,6 +70,8 @@ struct etna_specs {
unsigned has_sign_floor_ceil : 1;
/* can use VS_RANGE, PS_RANGE registers*/
unsigned has_shader_range_registers : 1;
/* has the new sin/cos functions */
unsigned has_new_sin_cos : 1;
/* can use any kind of wrapping mode on npot textures */
unsigned npot_tex_any_wrap;
/* number of bits per TS tile */
@@ -126,6 +139,8 @@ struct compiled_scissor_state {
uint32_t SE_SCISSOR_TOP;
uint32_t SE_SCISSOR_RIGHT;
uint32_t SE_SCISSOR_BOTTOM;
uint32_t SE_CLIP_RIGHT;
uint32_t SE_CLIP_BOTTOM;
};
/* Compiled pipe_viewport_state */
@@ -140,6 +155,8 @@ struct compiled_viewport_state {
uint32_t SE_SCISSOR_TOP;
uint32_t SE_SCISSOR_RIGHT;
uint32_t SE_SCISSOR_BOTTOM;
uint32_t SE_CLIP_RIGHT;
uint32_t SE_CLIP_BOTTOM;
uint32_t PE_DEPTH_NEAR;
uint32_t PE_DEPTH_FAR;
};
@@ -162,6 +179,8 @@ struct compiled_framebuffer_state {
uint32_t SE_SCISSOR_TOP;
uint32_t SE_SCISSOR_RIGHT;
uint32_t SE_SCISSOR_BOTTOM;
uint32_t SE_CLIP_RIGHT;
uint32_t SE_CLIP_BOTTOM;
uint32_t RA_MULTISAMPLE_UNK00E04;
uint32_t RA_MULTISAMPLE_UNK00E10[VIVS_RA_MULTISAMPLE_UNK00E10__LEN];
uint32_t RA_CENTROID_TABLE[VIVS_RA_CENTROID_TABLE__LEN];

View File

@@ -201,7 +201,10 @@ etna_resource_alloc(struct pipe_screen *pscreen, unsigned layout,
size = setup_miptree(rsc, paddingX, paddingY, msaa_xscale, msaa_yscale);
struct etna_bo *bo = etna_bo_new(screen->dev, size, DRM_ETNA_GEM_CACHE_WC);
uint32_t flags = DRM_ETNA_GEM_CACHE_WC;
if (templat->bind & PIPE_BIND_VERTEX_BUFFER)
flags |= DRM_ETNA_GEM_FORCE_MMU;
struct etna_bo *bo = etna_bo_new(screen->dev, size, flags);
if (unlikely(bo == NULL)) {
BUG("Problem allocating video memory for resource");
return NULL;

View File

@@ -469,8 +469,11 @@ etna_screen_is_format_supported(struct pipe_screen *pscreen,
return FALSE;
if (usage & PIPE_BIND_RENDER_TARGET) {
/* if render target, must be RS-supported format */
if (translate_rs_format(format) != ETNA_NO_MATCH) {
/* If render target, must be RS-supported format that is not rb swapped.
* Exposing rb swapped (or other swizzled) formats for rendering would
* involve swizzing in the pixel shader.
*/
if (translate_rs_format(format) != ETNA_NO_MATCH && !translate_rs_format_rb_swap(format)) {
/* Validate MSAA; number of samples must be allowed, and render target
* must have MSAA'able format. */
if (sample_count > 1) {
@@ -617,6 +620,8 @@ etna_get_specs(struct etna_screen *screen)
screen->model >= 0x1000 || screen->model == 0x880;
screen->specs.npot_tex_any_wrap =
VIV_FEATURE(screen, chipMinorFeatures1, NON_POWER_OF_TWO);
screen->specs.has_new_sin_cos =
VIV_FEATURE(screen, chipMinorFeatures3, HAS_FAST_TRANSCENDENTALS);
if (instruction_count > 256) { /* unified instruction memory? */
screen->specs.vs_offset = 0xC000;

View File

@@ -323,8 +323,10 @@ etna_set_framebuffer_state(struct pipe_context *pctx,
/* Scissor setup */
cs->SE_SCISSOR_LEFT = 0; /* affected by rasterizer and scissor state as well */
cs->SE_SCISSOR_TOP = 0;
cs->SE_SCISSOR_RIGHT = (sv->width << 16) - 1;
cs->SE_SCISSOR_BOTTOM = (sv->height << 16) - 1;
cs->SE_SCISSOR_RIGHT = (sv->width << 16) + ETNA_SE_SCISSOR_MARGIN_RIGHT;
cs->SE_SCISSOR_BOTTOM = (sv->height << 16) + ETNA_SE_SCISSOR_MARGIN_BOTTOM;
cs->SE_CLIP_RIGHT = (sv->width << 16) + ETNA_SE_CLIP_MARGIN_RIGHT;
cs->SE_CLIP_BOTTOM = (sv->height << 16) + ETNA_SE_CLIP_MARGIN_BOTTOM;
cs->TS_MEM_CONFIG = ts_mem_config;
@@ -345,13 +347,17 @@ etna_set_scissor_states(struct pipe_context *pctx, unsigned start_slot,
{
struct etna_context *ctx = etna_context(pctx);
struct compiled_scissor_state *cs = &ctx->scissor;
assert(ss->minx <= ss->maxx);
assert(ss->miny <= ss->maxy);
/* note that this state is only used when rasterizer_state->scissor is on */
ctx->scissor_s = *ss;
cs->SE_SCISSOR_LEFT = (ss->minx << 16);
cs->SE_SCISSOR_TOP = (ss->miny << 16);
cs->SE_SCISSOR_RIGHT = (ss->maxx << 16) - 1;
cs->SE_SCISSOR_BOTTOM = (ss->maxy << 16) - 1;
cs->SE_SCISSOR_RIGHT = (ss->maxx << 16) + ETNA_SE_SCISSOR_MARGIN_RIGHT;
cs->SE_SCISSOR_BOTTOM = (ss->maxy << 16) + ETNA_SE_SCISSOR_MARGIN_BOTTOM;
cs->SE_CLIP_RIGHT = (ss->maxx << 16) + ETNA_SE_CLIP_MARGIN_RIGHT;
cs->SE_CLIP_BOTTOM = (ss->maxy << 16) + ETNA_SE_CLIP_MARGIN_BOTTOM;
ctx->dirty |= ETNA_DIRTY_SCISSOR;
}
@@ -387,22 +393,14 @@ etna_set_viewport_states(struct pipe_context *pctx, unsigned start_slot,
/* Compute scissor rectangle (fixp) from viewport.
* Make sure left is always < right and top always < bottom.
*/
cs->SE_SCISSOR_LEFT = etna_f32_to_fixp16(MAX2(vs->translate[0] - vs->scale[0], 0.0f));
cs->SE_SCISSOR_TOP = etna_f32_to_fixp16(MAX2(vs->translate[1] - vs->scale[1], 0.0f));
cs->SE_SCISSOR_RIGHT = etna_f32_to_fixp16(MAX2(vs->translate[0] + vs->scale[0], 0.0f));
cs->SE_SCISSOR_BOTTOM = etna_f32_to_fixp16(MAX2(vs->translate[1] + vs->scale[1], 0.0f));
if (cs->SE_SCISSOR_LEFT > cs->SE_SCISSOR_RIGHT) {
uint32_t tmp = cs->SE_SCISSOR_RIGHT;
cs->SE_SCISSOR_RIGHT = cs->SE_SCISSOR_LEFT;
cs->SE_SCISSOR_LEFT = tmp;
}
if (cs->SE_SCISSOR_TOP > cs->SE_SCISSOR_BOTTOM) {
uint32_t tmp = cs->SE_SCISSOR_BOTTOM;
cs->SE_SCISSOR_BOTTOM = cs->SE_SCISSOR_TOP;
cs->SE_SCISSOR_TOP = tmp;
}
cs->SE_SCISSOR_LEFT = etna_f32_to_fixp16(MAX2(vs->translate[0] - fabsf(vs->scale[0]), 0.0f));
cs->SE_SCISSOR_TOP = etna_f32_to_fixp16(MAX2(vs->translate[1] - fabsf(vs->scale[1]), 0.0f));
uint32_t right_fixp = etna_f32_to_fixp16(MAX2(vs->translate[0] + fabsf(vs->scale[0]), 0.0f));
uint32_t bottom_fixp = etna_f32_to_fixp16(MAX2(vs->translate[1] + fabsf(vs->scale[1]), 0.0f));
cs->SE_SCISSOR_RIGHT = right_fixp + ETNA_SE_SCISSOR_MARGIN_RIGHT;
cs->SE_SCISSOR_BOTTOM = bottom_fixp + ETNA_SE_SCISSOR_MARGIN_BOTTOM;
cs->SE_CLIP_RIGHT = right_fixp + ETNA_SE_CLIP_MARGIN_RIGHT;
cs->SE_CLIP_BOTTOM = bottom_fixp + ETNA_SE_CLIP_MARGIN_BOTTOM;
cs->PE_DEPTH_NEAR = fui(0.0); /* not affected if depth mode is Z (as in GL) */
cs->PE_DEPTH_FAR = fui(1.0);

View File

@@ -9,6 +9,7 @@ AM_CFLAGS = \
$(GALLIUM_DRIVER_CFLAGS) \
$(FREEDRENO_CFLAGS)
MKDIR_GEN = $(AM_V_at)$(MKDIR_P) $(@D)
ir3/ir3_nir_trig.c: ir3/ir3_nir_trig.py $(top_srcdir)/src/compiler/nir/nir_algebraic.py
$(MKDIR_GEN)
$(AM_V_GEN) PYTHONPATH=$(top_srcdir)/src/compiler/nir $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/ir3/ir3_nir_trig.py > $@ || ($(RM) $@; false)

View File

@@ -2924,7 +2924,7 @@ static int r600_shader_from_tgsi(struct r600_context *rctx,
struct pipe_stream_output_info so = pipeshader->selector->so;
struct tgsi_full_immediate *immediate;
struct r600_shader_ctx ctx;
struct r600_bytecode_output output[32];
struct r600_bytecode_output output[ARRAY_SIZE(shader->output)];
unsigned output_done, noutput;
unsigned opcode;
int i, j, k, r = 0;

View File

@@ -660,7 +660,8 @@ si_mark_image_range_valid(const struct pipe_image_view *view)
static void si_set_shader_image(struct si_context *ctx,
unsigned shader,
unsigned slot, const struct pipe_image_view *view)
unsigned slot, const struct pipe_image_view *view,
bool skip_decompress)
{
struct si_screen *screen = ctx->screen;
struct si_images_info *images = &ctx->images[shader];
@@ -702,7 +703,7 @@ static void si_set_shader_image(struct si_context *ctx,
assert(!tex->is_depth);
assert(tex->fmask.size == 0);
if (uses_dcc &&
if (uses_dcc && !skip_decompress &&
(view->access & PIPE_IMAGE_ACCESS_WRITE ||
!vi_dcc_formats_compatible(res->b.b.format, view->format))) {
/* If DCC can't be disabled, at least decompress it.
@@ -776,10 +777,10 @@ si_set_shader_images(struct pipe_context *pipe,
if (views) {
for (i = 0, slot = start_slot; i < count; ++i, ++slot)
si_set_shader_image(ctx, shader, slot, &views[i]);
si_set_shader_image(ctx, shader, slot, &views[i], false);
} else {
for (i = 0, slot = start_slot; i < count; ++i, ++slot)
si_set_shader_image(ctx, shader, slot, NULL);
si_set_shader_image(ctx, shader, slot, NULL, false);
}
si_update_compressed_tex_shader_mask(ctx, shader);
@@ -1710,7 +1711,7 @@ void si_update_all_texture_descriptors(struct si_context *sctx)
view->resource->target == PIPE_BUFFER)
continue;
si_set_shader_image(sctx, shader, i, view);
si_set_shader_image(sctx, shader, i, view, true);
}
/* Sampler views. */

View File

@@ -3368,7 +3368,7 @@ static void *si_create_vertex_elements(struct pipe_context *ctx,
first_non_void = util_format_get_first_non_void_channel(elements[i].src_format);
data_format = si_translate_buffer_dataformat(ctx->screen, desc, first_non_void);
num_format = si_translate_buffer_numformat(ctx->screen, desc, first_non_void);
channel = &desc->channel[first_non_void];
channel = first_non_void >= 0 ? &desc->channel[first_non_void] : NULL;
v->rsrc_word3[i] = S_008F0C_DST_SEL_X(si_map_swizzle(desc->swizzle[0])) |
S_008F0C_DST_SEL_Y(si_map_swizzle(desc->swizzle[1])) |
@@ -3390,12 +3390,12 @@ static void *si_create_vertex_elements(struct pipe_context *ctx,
/* This isn't actually used in OpenGL. */
v->fix_fetch |= (uint64_t)SI_FIX_FETCH_A2_SINT << (4 * i);
}
} else if (channel->type == UTIL_FORMAT_TYPE_FIXED) {
} else if (channel && channel->type == UTIL_FORMAT_TYPE_FIXED) {
if (desc->swizzle[3] == PIPE_SWIZZLE_1)
v->fix_fetch |= (uint64_t)SI_FIX_FETCH_RGBX_32_FIXED << (4 * i);
else
v->fix_fetch |= (uint64_t)SI_FIX_FETCH_RGBA_32_FIXED << (4 * i);
} else if (channel->size == 32 && !channel->pure_integer) {
} else if (channel && channel->size == 32 && !channel->pure_integer) {
if (channel->type == UTIL_FORMAT_TYPE_SIGNED) {
if (channel->normalized) {
if (desc->swizzle[3] == PIPE_SWIZZLE_1)

View File

@@ -2,12 +2,12 @@ include Makefile.sources
AM_CPPFLAGS = \
-I$(top_srcdir)/include \
-I$(top_builddir)/src \
-I$(top_srcdir)/src \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/drivers \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/gallium/winsys \
-I$(top_builddir)/src \
-I$(srcdir)
if HAVE_CLOVER_ICD

View File

@@ -28,8 +28,8 @@ AM_CPPFLAGS = \
-I$(top_srcdir)/include \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mesa \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_builddir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
$(GALLIUM_CFLAGS) \
$(LIBDRM_CFLAGS) \
$(VISIBILITY_CFLAGS)

View File

@@ -81,7 +81,7 @@ vlVaBeginPicture(VADriverContextP ctx, VAContextID context_id, VASurfaceID rende
}
if (context->decoder->entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE)
context->decoder->begin_frame(context->decoder, context->target, &context->desc.base);
context->needs_begin_frame = true;
return VA_STATUS_SUCCESS;
}
@@ -178,6 +178,8 @@ handlePictureParameterBuffer(vlVaDriver *drv, vlVaContext *context, vlVaBuffer *
if (!context->decoder)
return VA_STATUS_ERROR_ALLOCATION_FAILED;
context->needs_begin_frame = true;
}
return vaStatus;
@@ -308,8 +310,11 @@ handleVASliceDataBufferType(vlVaContext *context, vlVaBuffer *buf)
sizes[num_buffers] = buf->size;
++num_buffers;
context->decoder->begin_frame(context->decoder, context->target,
&context->desc.base);
if (context->needs_begin_frame) {
context->decoder->begin_frame(context->decoder, context->target,
&context->desc.base);
context->needs_begin_frame = false;
}
context->decoder->decode_bitstream(context->decoder, context->target, &context->desc.base,
num_buffers, (const void * const*)buffers, sizes);
}

View File

@@ -261,6 +261,7 @@ typedef struct {
int target_id;
bool first_single_submitted;
int gop_coeff;
bool needs_begin_frame;
} vlVaContext;
typedef struct {

View File

@@ -75,6 +75,13 @@ vlVdpOutputSurfaceCreate(VdpDevice device,
memset(&res_tmpl, 0, sizeof(res_tmpl));
/*
* The output won't look correctly when this buffer is send to X,
* if the VDPAU RGB component order doesn't match the X11 one so
* we only allow the X11 format
*/
vlsurface->send_to_X = rgba_format == VDP_RGBA_FORMAT_B8G8R8A8;
res_tmpl.target = PIPE_TEXTURE_2D;
res_tmpl.format = VdpFormatRGBAToPipe(rgba_format);
res_tmpl.width0 = width;

View File

@@ -231,7 +231,7 @@ vlVdpPresentationQueueDisplay(VdpPresentationQueue presentation_queue,
vscreen = pq->device->vscreen;
pipe_mutex_lock(pq->device->mutex);
if (vscreen->set_back_texture_from_output)
if (vscreen->set_back_texture_from_output && surf->send_to_X)
vscreen->set_back_texture_from_output(vscreen, surf->surface->texture, clip_width, clip_height);
tex = vscreen->texture_from_drawable(vscreen, (void *)pq->drawable);
if (!tex) {
@@ -239,7 +239,7 @@ vlVdpPresentationQueueDisplay(VdpPresentationQueue presentation_queue,
return VDP_STATUS_INVALID_HANDLE;
}
if (!vscreen->set_back_texture_from_output) {
if (!vscreen->set_back_texture_from_output || !surf->send_to_X) {
dirty_area = vscreen->get_dirty_area(vscreen);
memset(&surf_templ, 0, sizeof(surf_templ));
@@ -289,7 +289,7 @@ vlVdpPresentationQueueDisplay(VdpPresentationQueue presentation_queue,
framenum++;
}
if (!vscreen->set_back_texture_from_output) {
if (!vscreen->set_back_texture_from_output || !surf->send_to_X) {
pipe_resource_reference(&tex, NULL);
pipe_surface_reference(&surf_draw, NULL);
}

View File

@@ -415,6 +415,7 @@ typedef struct
struct pipe_fence_handle *fence;
struct vl_compositor_state cstate;
struct u_rect dirty_area;
bool send_to_X;
} vlVdpOutputSurface;
typedef struct

View File

@@ -27,8 +27,8 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/loader \
-I$(top_srcdir)/src/mapi/ \
-I$(top_srcdir)/src/mesa/ \
-I$(top_srcdir)/src/mesa/drivers/dri/common/ \
-I$(top_builddir)/src/mesa/drivers/dri/common/ \
-I$(top_srcdir)/src/mesa/drivers/dri/common/ \
-I$(top_srcdir)/src/gallium/winsys \
-I$(top_srcdir)/src/gallium/state_trackers/nine \
$(GALLIUM_TARGET_CFLAGS) \

View File

@@ -37,10 +37,10 @@ AM_CFLAGS = \
-I$(top_srcdir)/include/GL/internal \
-I$(top_srcdir)/src \
-I$(top_srcdir)/src/loader \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mapi/glapi \
-I$(top_builddir)/src/mapi \
-I$(top_srcdir)/src/mapi \
-I$(top_builddir)/src/mapi/glapi \
-I$(top_srcdir)/src/mapi/glapi \
$(VISIBILITY_CFLAGS) \
$(SHARED_GLAPI_CFLAGS) \
$(EXTRA_DEFINES_XF86VIDMODE) \

View File

@@ -6,11 +6,11 @@ AM_CFLAGS = \
-I$(top_srcdir)/src \
-I$(top_srcdir)/include \
-I$(top_srcdir)/src/glx \
-I$(top_srcdir)/src/mesa \
-I$(top_builddir)/src/mesa \
-I$(top_srcdir)/src/mesa \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mapi/glapi \
-I$(top_builddir)/src/mapi/glapi \
-I$(top_srcdir)/src/mapi/glapi \
$(VISIBILITY_CFLAGS) \
$(SHARED_GLAPI_CFLAGS) \
$(DEFINES) \

View File

@@ -24,8 +24,8 @@ libwindowsglx_la_CFLAGS = \
-I$(top_srcdir)/src \
-I$(top_srcdir)/src/glx \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mapi/glapi \
-I$(top_builddir)/src/mapi/glapi \
-I$(top_srcdir)/src/mapi/glapi \
$(VISIBILITY_CFLAGS) \
$(SHARED_GLAPI_CFLAGS) \
$(DEFINES) \

View File

@@ -349,6 +349,29 @@ blorp_clear(struct blorp_batch *batch,
if (format == ISL_FORMAT_R9G9B9E5_SHAREDEXP) {
clear_color.u32[0] = float3_to_rgb9e5(clear_color.f32);
format = ISL_FORMAT_R32_UINT;
} else if (format == ISL_FORMAT_A4B4G4R4_UNORM) {
/* Broadwell and earlier cannot render to this format so we need to work
* around it by swapping the colors around and using B4G4R4A4 instead.
*/
/* First, we apply the swizzle. */
union isl_color_value old;
assert((unsigned)(swizzle.r - ISL_CHANNEL_SELECT_RED) < 4);
assert((unsigned)(swizzle.g - ISL_CHANNEL_SELECT_RED) < 4);
assert((unsigned)(swizzle.b - ISL_CHANNEL_SELECT_RED) < 4);
assert((unsigned)(swizzle.a - ISL_CHANNEL_SELECT_RED) < 4);
old.u32[swizzle.r - ISL_CHANNEL_SELECT_RED] = clear_color.u32[0];
old.u32[swizzle.g - ISL_CHANNEL_SELECT_RED] = clear_color.u32[1];
old.u32[swizzle.b - ISL_CHANNEL_SELECT_RED] = clear_color.u32[2];
old.u32[swizzle.a - ISL_CHANNEL_SELECT_RED] = clear_color.u32[3];
swizzle = ISL_SWIZZLE_IDENTITY;
/* Now we re-order for the new format */
clear_color.u32[0] = old.u32[1];
clear_color.u32[1] = old.u32[2];
clear_color.u32[2] = old.u32[3];
clear_color.u32[3] = old.u32[0];
format = ISL_FORMAT_B4G4R4A4_UNORM;
}
memcpy(&params.wm_inputs.clear_color, clear_color.f32, sizeof(float) * 4);

View File

@@ -218,9 +218,10 @@ static const struct surface_format_info format_info[] = {
SF(50, 50, x, x, x, x, x, x, x, x, P8A8_UNORM_PALETTE1)
SF( x, x, x, x, x, x, x, x, x, x, A1B5G5R5_UNORM)
/* According to the PRM, A4B4G4R4_UNORM isn't supported until Sky Lake
* but empirical testing indicates that it works just fine on Broadwell.
* but empirical testing indicates that at least sampling works just fine
* on Broadwell.
*/
SF(80, 80, x, x, 80, x, x, x, x, x, A4B4G4R4_UNORM)
SF(80, 80, x, x, 90, x, x, x, x, x, A4B4G4R4_UNORM)
SF(90, x, x, x, x, x, x, x, x, x, L8A8_UINT)
SF(90, x, x, x, x, x, x, x, x, x, L8A8_SINT)
SF( Y, Y, x, 45, Y, Y, Y, x, x, x, R8_UNORM)

View File

@@ -232,9 +232,12 @@ VkResult anv_AllocateCommandBuffers(
break;
}
if (result != VK_SUCCESS)
if (result != VK_SUCCESS) {
anv_FreeCommandBuffers(_device, pAllocateInfo->commandPool,
i, pCommandBuffers);
for (i = 0; i < pAllocateInfo->commandBufferCount; i++)
pCommandBuffers[i] = VK_NULL_HANDLE;
}
return result;
}

View File

@@ -329,18 +329,18 @@ VkResult anv_CreateDescriptorPool(
}
}
const size_t size =
sizeof(*pool) +
const size_t pool_size =
pCreateInfo->maxSets * sizeof(struct anv_descriptor_set) +
descriptor_count * sizeof(struct anv_descriptor) +
buffer_count * sizeof(struct anv_buffer_view);
const size_t total_size = sizeof(*pool) + pool_size;
pool = vk_alloc2(&device->alloc, pAllocator, size, 8,
pool = vk_alloc2(&device->alloc, pAllocator, total_size, 8,
VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (!pool)
return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
pool->size = size;
pool->size = pool_size;
pool->next = 0;
pool->free_list = EMPTY;

View File

@@ -100,11 +100,8 @@ try_lower_input_load(nir_function_impl *impl, nir_intrinsic_instr *load)
if (image_dim == GLSL_SAMPLER_DIM_SUBPASS_MS) {
tex->op = nir_texop_txf_ms;
nir_ssa_def *sample_id =
nir_load_system_value(&b, nir_intrinsic_load_sample_id, 0);
tex->src[2].src_type = nir_tex_src_ms_index;
tex->src[2].src = nir_src_for_ssa(sample_id);
tex->src[2].src = load->src[1];
}
nir_ssa_dest_init(&tex->instr, &tex->dest, 4, 32, NULL);

View File

@@ -55,8 +55,6 @@ genX(cmd_buffer_emit_state_base_address)(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_device *device = cmd_buffer->device;
/* XXX: Do we need this on more than just BDW? */
#if (GEN_GEN >= 8)
/* Emit a render target cache flush.
*
* This isn't documented anywhere in the PRM. However, it seems to be
@@ -65,9 +63,10 @@ genX(cmd_buffer_emit_state_base_address)(struct anv_cmd_buffer *cmd_buffer)
* clear depth, reset state base address, and then go render stuff.
*/
anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
pc.DCFlushEnable = true;
pc.RenderTargetCacheFlushEnable = true;
pc.CommandStreamerStallEnable = true;
}
#endif
anv_batch_emit(&cmd_buffer->batch, GENX(STATE_BASE_ADDRESS), sba) {
sba.GeneralStateBaseAddress = (struct anv_address) { NULL, 0 };
@@ -148,6 +147,8 @@ genX(cmd_buffer_emit_state_base_address)(struct anv_cmd_buffer *cmd_buffer)
*/
anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
pc.TextureCacheInvalidationEnable = true;
pc.ConstantCacheInvalidationEnable = true;
pc.StateCacheInvalidationEnable = true;
}
}
@@ -1177,9 +1178,9 @@ emit_binding_table(struct anv_cmd_buffer *cmd_buffer,
case VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT:
assert(stage == MESA_SHADER_FRAGMENT);
if (desc->image_view->aspect_mask == VK_IMAGE_ASPECT_STENCIL_BIT) {
/* For stencil input attachments, we treat it like any old texture
* that a user may have bound.
if (desc->image_view->aspect_mask != VK_IMAGE_ASPECT_COLOR_BIT) {
/* For depth and stencil input attachments, we treat it like any
* old texture that a user may have bound.
*/
surface_state = desc->image_view->sampler_surface_state;
assert(surface_state.alloc_size);
@@ -1187,9 +1188,9 @@ emit_binding_table(struct anv_cmd_buffer *cmd_buffer,
desc->image_view->image->aux_usage,
surface_state);
} else {
/* For depth and color input attachments, we create the surface
* state at vkBeginRenderPass time so that we can include aux
* and clear color information.
/* For color input attachments, we create the surface state at
* vkBeginRenderPass time so that we can include aux and clear
* color information.
*/
assert(binding->input_attachment_index < subpass->input_count);
const unsigned subpass_att = binding->input_attachment_index;

View File

@@ -39,8 +39,8 @@ libloader_la_LIBADD =
if HAVE_DRICOMMON
libloader_la_CPPFLAGS += \
-I$(top_srcdir)/src/mesa/drivers/dri/common/ \
-I$(top_builddir)/src/mesa/drivers/dri/common/ \
-I$(top_srcdir)/src/mesa/drivers/dri/common/ \
-I$(top_srcdir)/src/mesa/ \
-I$(top_srcdir)/src/mapi/ \
-DUSE_DRICONF

View File

@@ -46,8 +46,8 @@ AM_CPPFLAGS = \
$(SELINUX_CFLAGS) \
-I$(top_srcdir)/include \
-I$(top_srcdir)/src \
-I$(top_srcdir)/src/mapi \
-I$(top_builddir)/src/mapi
-I$(top_builddir)/src/mapi \
-I$(top_srcdir)/src/mapi
include Makefile.sources

View File

@@ -30,9 +30,9 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/mesa/ \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_builddir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/intel/server \
-I$(top_builddir)/src/mesa/drivers/dri/common \
$(DEFINES) \
$(VISIBILITY_CFLAGS) \
$(INTEL_CFLAGS)

View File

@@ -30,21 +30,22 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/mesa/ \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_builddir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/intel/server \
-I$(top_srcdir)/src/gtest/include \
-I$(top_srcdir)/src/compiler/nir \
-I$(top_srcdir)/src/intel \
-I$(top_builddir)/src/compiler/glsl \
-I$(top_builddir)/src/compiler/nir \
-I$(top_srcdir)/src/compiler/nir \
-I$(top_builddir)/src/intel \
-I$(top_builddir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/intel \
$(DEFINES) \
$(VISIBILITY_CFLAGS) \
$(INTEL_CFLAGS)
AM_CXXFLAGS = $(AM_CFLAGS)
MKDIR_GEN = $(AM_V_at)$(MKDIR_P) $(@D)
brw_nir_trig_workarounds.c: brw_nir_trig_workarounds.py $(top_srcdir)/src/compiler/nir/nir_algebraic.py
$(MKDIR_GEN)
$(AM_V_GEN) PYTHONPATH=$(top_srcdir)/src/compiler/nir $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/brw_nir_trig_workarounds.py > $@ || ($(RM) $@; false)

View File

@@ -284,8 +284,10 @@ brw_blorp_to_isl_format(struct brw_context *brw, mesa_format format,
case MESA_FORMAT_S_UINT8:
return ISL_FORMAT_R8_UINT;
case MESA_FORMAT_Z24_UNORM_X8_UINT:
case MESA_FORMAT_Z24_UNORM_S8_UINT:
return ISL_FORMAT_R24_UNORM_X8_TYPELESS;
case MESA_FORMAT_Z_FLOAT32:
case MESA_FORMAT_Z32_FLOAT_S8X24_UINT:
return ISL_FORMAT_R32_FLOAT;
case MESA_FORMAT_Z_UNORM16:
return ISL_FORMAT_R16_UNORM;

View File

@@ -910,6 +910,9 @@ brw_process_driconf_options(struct brw_context *brw)
ctx->Const.ForceGLSLExtensionsWarn =
driQueryOptionb(options, "force_glsl_extensions_warn");
ctx->Const.ForceGLSLVersion =
driQueryOptioni(options, "force_glsl_version");
ctx->Const.DisableGLSLLineContinuations =
driQueryOptionb(options, "disable_glsl_line_continuations");

View File

@@ -508,7 +508,7 @@ fs_generator::generate_cs_terminate(fs_inst *inst, struct brw_reg payload)
insn = brw_next_insn(p, BRW_OPCODE_SEND);
brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_UW));
brw_set_src0(p, insn, payload);
brw_set_src0(p, insn, retype(payload, BRW_REGISTER_TYPE_UW));
brw_set_src1(p, insn, brw_imm_d(0));
/* Terminate a compute shader by sending a message to the thread spawner.

View File

@@ -177,6 +177,49 @@ static struct gl_program *brwNewProgram(struct gl_context *ctx, GLenum target,
static void brwDeleteProgram( struct gl_context *ctx,
struct gl_program *prog )
{
struct brw_context *brw = brw_context(ctx);
/* Beware! prog's refcount has reached zero, and it's about to be freed.
*
* In brw_upload_pipeline_state(), we compare brw->foo_program to
* ctx->FooProgram._Current, and flag BRW_NEW_FOO_PROGRAM if the
* pointer has changed.
*
* We cannot leave brw->foo_program as a dangling pointer to the dead
* program. malloc() may allocate the same memory for a new gl_program,
* causing us to see matching pointers...but totally different programs.
*
* We cannot set brw->foo_program to NULL, either. If we've deleted the
* active program, Mesa may set ctx->FooProgram._Current to NULL. That
* would cause us to see matching pointers (NULL == NULL), and fail to
* detect that a program has changed since our last draw.
*
* So, set it to a bogus gl_program pointer that will never match,
* causing us to properly reevaluate the state on our next draw.
*
* Getting this wrong causes heisenbugs which are very hard to catch,
* as you need a very specific allocation pattern to hit the problem.
*/
static const struct gl_program deleted_program;
if (brw->vertex_program == prog)
brw->vertex_program = &deleted_program;
if (brw->tess_ctrl_program == prog)
brw->tess_ctrl_program = &deleted_program;
if (brw->tess_eval_program == prog)
brw->tess_eval_program = &deleted_program;
if (brw->geometry_program == prog)
brw->geometry_program = &deleted_program;
if (brw->fragment_program == prog)
brw->fragment_program = &deleted_program;
if (brw->compute_program == prog)
brw->compute_program = &deleted_program;
_mesa_delete_program( ctx, prog );
}

View File

@@ -477,6 +477,18 @@ gen8_hiz_exec(struct brw_context *brw, struct intel_mipmap_tree *mt,
break;
case BLORP_HIZ_OP_DEPTH_CLEAR:
dw1 |= GEN8_WM_HZ_DEPTH_CLEAR;
/* The "Clear Rectangle X Max" (and Y Max) fields are exclusive,
* rather than inclusive, and limited to 16383. This means that
* for a 16384x16384 render target, we would miss the last row
* or column of pixels along the edge.
*
* To work around this, we have to set the "Full Surface Depth
* and Stencil Clear" bit. We can do this in all cases because
* we always clear the full rectangle anyway. We'll need to
* change this if we ever add scissored clear support.
*/
dw1 |= GEN8_WM_HZ_FULL_SURFACE_DEPTH_CLEAR;
break;
case BLORP_HIZ_OP_NONE:
unreachable("Should not get here.");

View File

@@ -261,4 +261,8 @@ retry:
if (params->dst.enabled)
brw_render_cache_set_add_bo(brw, params->dst.addr.buffer);
if (params->depth.enabled)
brw_render_cache_set_add_bo(brw, params->depth.addr.buffer);
if (params->stencil.enabled)
brw_render_cache_set_add_bo(brw, params->stencil.addr.buffer);
}

View File

@@ -235,13 +235,9 @@ emit_miptree_blit(struct brw_context *brw,
* represented per scan lines worth of graphics data depends on the
* color depth.
*
* Furthermore, intelEmitCopyBlit (which is called below) uses a signed
* 16-bit integer to represent buffer pitch, so it can only handle buffer
* pitches < 32k. However, the pitch is measured in bytes for linear buffers
* and dwords for tiled buffers.
*
* As a result of these two limitations, we can only use the blitter to do
* this copy when the miptree's pitch is less than 32k linear or 128k tiled.
* The blitter's pitch is a signed 16-bit integer, but measured in bytes
* for linear surfaces and DWords for tiled surfaces. So the maximum
* pitch is 32k linear and 128k tiled.
*/
if (blt_pitch(src_mt) >= 32768 || blt_pitch(dst_mt) >= 32768) {
perf_debug("Falling back due to >= 32k/128k pitch\n");
@@ -480,11 +476,11 @@ static bool
can_fast_copy_blit(struct brw_context *brw,
drm_intel_bo *src_buffer,
int16_t src_x, int16_t src_y,
uintptr_t src_offset, uint32_t src_pitch,
uintptr_t src_offset, int32_t src_pitch,
uint32_t src_tiling, uint32_t src_tr_mode,
drm_intel_bo *dst_buffer,
int16_t dst_x, int16_t dst_y,
uintptr_t dst_offset, uint32_t dst_pitch,
uintptr_t dst_offset, int32_t dst_pitch,
uint32_t dst_tiling, uint32_t dst_tr_mode,
int16_t w, int16_t h, uint32_t cpp,
GLenum logic_op)
@@ -520,10 +516,8 @@ can_fast_copy_blit(struct brw_context *brw,
if (!_mesa_is_pow_two(cpp) || cpp > 16)
return false;
/* For Fast Copy Blits the pitch cannot be a negative number. So, bit 15
* of the destination pitch must be zero.
*/
if ((src_pitch >> 15 & 1) != 0 || (dst_pitch >> 15 & 1) != 0)
/* For Fast Copy Blits the pitch cannot be a negative number. */
if (src_pitch < 0 || dst_pitch < 0)
return false;
/* For Linear surfaces, the pitch has to be an OWord (16byte) multiple. */
@@ -577,12 +571,12 @@ xy_blit_cmd(uint32_t src_tiling, uint32_t src_tr_mode,
bool
intelEmitCopyBlit(struct brw_context *brw,
GLuint cpp,
GLshort src_pitch,
int32_t src_pitch,
drm_intel_bo *src_buffer,
GLuint src_offset,
uint32_t src_tiling,
uint32_t src_tr_mode,
GLshort dst_pitch,
int32_t dst_pitch,
drm_intel_bo *dst_buffer,
GLuint dst_offset,
uint32_t dst_tiling,

View File

@@ -31,12 +31,12 @@
bool
intelEmitCopyBlit(struct brw_context *brw,
GLuint cpp,
GLshort src_pitch,
int32_t src_pitch,
drm_intel_bo *src_buffer,
GLuint src_offset,
uint32_t src_tiling,
uint32_t src_tr_mode,
GLshort dst_pitch,
int32_t dst_pitch,
drm_intel_bo *dst_buffer,
GLuint dst_offset,
uint32_t dst_tiling,

View File

@@ -79,6 +79,7 @@ DRI_CONF_BEGIN
DRI_CONF_ALWAYS_FLUSH_CACHE("false")
DRI_CONF_DISABLE_THROTTLING("false")
DRI_CONF_FORCE_GLSL_EXTENSIONS_WARN("false")
DRI_CONF_FORCE_GLSL_VERSION(0)
DRI_CONF_DISABLE_GLSL_LINE_CONTINUATIONS("false")
DRI_CONF_DISABLE_BLEND_FUNC_EXTENDED("false")
DRI_CONF_DUAL_COLOR_BLEND_BY_LOCATION("false")

View File

@@ -34,9 +34,9 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/mesa/ \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_builddir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/r200/server \
-I$(top_builddir)/src/mesa/drivers/dri/common \
$(DEFINES) \
$(VISIBILITY_CFLAGS) \
$(RADEON_CFLAGS)

View File

@@ -35,9 +35,9 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/mesa/ \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_builddir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/radeon/server \
-I$(top_builddir)/src/mesa/drivers/dri/common \
$(DEFINES) \
$(VISIBILITY_CFLAGS) \
$(RADEON_CFLAGS)

View File

@@ -30,8 +30,8 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/mesa/ \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_builddir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
$(LIBDRM_CFLAGS) \
$(DEFINES) \
$(VISIBILITY_CFLAGS)

View File

@@ -28,8 +28,8 @@ AM_CPPFLAGS = \
-I$(top_srcdir)/src \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/mapi \
-I$(top_builddir)/src/mapi \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mesa/ \
$(DEFINES)
AM_CFLAGS = $(PTHREAD_CFLAGS) \

View File

@@ -363,7 +363,7 @@ EXT(OES_point_size_array , dummy_true
EXT(OES_point_sprite , ARB_point_sprite , x , x , ES1, x , 2004)
EXT(OES_primitive_bounding_box , OES_primitive_bounding_box , x , x , x , 31, 2014)
EXT(OES_query_matrix , dummy_true , x , x , ES1, x , 2003)
EXT(OES_read_format , dummy_true , GLL, GLC, ES1, x , 2003)
EXT(OES_read_format , dummy_true , GLL, x , ES1, x , 2003)
EXT(OES_rgb8_rgba8 , dummy_true , x , x , ES1, ES2, 2005)
EXT(OES_sample_shading , OES_sample_variables , x , x , x , 30, 2014)
EXT(OES_sample_variables , OES_sample_variables , x , x , x , 30, 2014)

View File

@@ -1741,8 +1741,6 @@ _mesa_ShaderSource(GLuint shaderObj, GLsizei count,
GLcharARB *source;
struct gl_shader *sh;
GLcharARB *replacement;
sh = _mesa_lookup_shader_err(ctx, shaderObj, "glShaderSourceARB");
if (!sh)
return;
@@ -1799,6 +1797,8 @@ _mesa_ShaderSource(GLuint shaderObj, GLsizei count,
source[totalLength - 2] = '\0';
#ifdef ENABLE_SHADER_CACHE
GLcharARB *replacement;
/* Dump original shader source to MESA_SHADER_DUMP_PATH and replace
* if corresponding entry found from MESA_SHADER_READ_PATH.
*/

View File

@@ -4,8 +4,8 @@ AM_CPPFLAGS = \
-I$(top_srcdir)/src/gtest/include \
-I$(top_srcdir)/src \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mesa \
-I$(top_builddir)/src/mesa \
-I$(top_srcdir)/src/mesa \
-I$(top_srcdir)/include \
$(DEFINES) $(INCLUDE_DIRS)

View File

@@ -278,7 +278,7 @@ void st_invalidate_state(struct gl_context * ctx, GLbitfield new_state)
static void
st_destroy_context_priv(struct st_context *st)
st_destroy_context_priv(struct st_context *st, bool destroy_pipe)
{
uint shader, i;
@@ -314,6 +314,10 @@ st_destroy_context_priv(struct st_context *st)
st_invalidate_readpix_cache(st);
cso_destroy_context(st->cso_context);
if (st->pipe && destroy_pipe)
st->pipe->destroy(st->pipe);
free( st );
}
@@ -503,7 +507,7 @@ st_create_context_priv( struct gl_context *ctx, struct pipe_context *pipe,
/* This can happen when a core profile was requested, but the driver
* does not support some features of GL 3.1 or later.
*/
st_destroy_context_priv(st);
st_destroy_context_priv(st, false);
return NULL;
}
@@ -579,7 +583,6 @@ destroy_tex_sampler_cb(GLuint id, void *data, void *userData)
void st_destroy_context( struct st_context *st )
{
struct pipe_context *pipe = st->pipe;
struct gl_context *ctx = st->ctx;
GLuint i;
@@ -608,11 +611,9 @@ void st_destroy_context( struct st_context *st )
/* This will free the st_context too, so 'st' must not be accessed
* afterwards. */
st_destroy_context_priv(st);
st_destroy_context_priv(st, true);
st = NULL;
pipe->destroy( pipe );
free(ctx);
}

View File

@@ -379,7 +379,8 @@ wsi_wl_surface_get_capabilities(VkIcdSurfaceBase *surface,
caps->currentExtent = (VkExtent2D) { -1, -1 };
caps->minImageExtent = (VkExtent2D) { 1, 1 };
caps->maxImageExtent = (VkExtent2D) { INT16_MAX, INT16_MAX };
/* This is the maximum supported size on Intel */
caps->maxImageExtent = (VkExtent2D) { 1 << 14, 1 << 14 };
caps->supportedTransforms = VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR;
caps->currentTransform = VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR;
caps->maxImageArrayLayers = 1;
@@ -409,25 +410,27 @@ wsi_wl_surface_get_formats(VkIcdSurfaceBase *icd_surface,
if (!display)
return VK_ERROR_OUT_OF_HOST_MEMORY;
uint32_t count = u_vector_length(&display->formats);
if (pSurfaceFormats == NULL) {
*pSurfaceFormatCount = count;
*pSurfaceFormatCount = u_vector_length(&display->formats);
return VK_SUCCESS;
}
assert(*pSurfaceFormatCount >= count);
*pSurfaceFormatCount = count;
uint32_t count = 0;
VkFormat *f;
u_vector_foreach(f, &display->formats) {
*(pSurfaceFormats++) = (VkSurfaceFormatKHR) {
if (count == *pSurfaceFormatCount)
return VK_INCOMPLETE;
pSurfaceFormats[count++] = (VkSurfaceFormatKHR) {
.format = *f,
/* TODO: We should get this from the compositor somehow */
.colorSpace = VK_COLORSPACE_SRGB_NONLINEAR_KHR,
};
}
assert(*pSurfaceFormatCount <= count);
*pSurfaceFormatCount = count;
return VK_SUCCESS;
}
@@ -441,11 +444,13 @@ wsi_wl_surface_get_present_modes(VkIcdSurfaceBase *surface,
return VK_SUCCESS;
}
assert(*pPresentModeCount >= ARRAY_SIZE(present_modes));
*pPresentModeCount = MIN2(*pPresentModeCount, ARRAY_SIZE(present_modes));
typed_memcpy(pPresentModes, present_modes, *pPresentModeCount);
*pPresentModeCount = ARRAY_SIZE(present_modes);
return VK_SUCCESS;
if (*pPresentModeCount < ARRAY_SIZE(present_modes))
return VK_INCOMPLETE;
else
return VK_SUCCESS;
}
VkResult wsi_create_wl_surface(const VkAllocationCallbacks *pAllocator,

View File

@@ -370,7 +370,8 @@ x11_surface_get_capabilities(VkIcdSurfaceBase *icd_surface,
*/
caps->currentExtent = (VkExtent2D) { -1, -1 };
caps->minImageExtent = (VkExtent2D) { 1, 1 };
caps->maxImageExtent = (VkExtent2D) { INT16_MAX, INT16_MAX };
/* This is the maximum supported size on Intel */
caps->maxImageExtent = (VkExtent2D) { 1 << 14, 1 << 14 };
}
free(err);
free(geom);