Compare commits

382 Commits

Author SHA1 Message Date
Timothy Arceri
b010fa8567 glsl: make sure UBO arrays are sized in ES
This check was removed in 5b2675093e; add it back in.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
https://bugs.freedesktop.org/show_bug.cgi?id=96349
2016-06-14 11:33:24 +10:00
Vedran Miletić
4825264f75 clover: Update OpenCL version string to match OpenGL
Change MESA into Mesa in CL_PLATFORM_VERSION and CL_DEVICE_VERSION. For
both, always append git version suffix from git_sha1.h.

v5: move semicolon to same line as MESA_GIT_SHA1.
v4: drop #ifdef guards.
v3: add missing include.
v2: change CL_DEVICE_VERSION as well.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-06-13 15:55:59 -07:00
Francisco Jerez
bd9f972651 i965/fs: Fix regs_written for SIMD-lowered instructions some more.
ISTR having suggested this during review of the recent FP64 changes to
the SIMD lowering pass, but it doesn't look like it was taken into
account in the end.  Using the fs_reg::component_size helper instead
of this open-coded variant makes sure that the stride is taken into
account correctly.  Fixes at least the following piglit tests with
spilling forced on (since otherwise regs_written would be calculated
incorrectly and the spilling code would be rather confused about how
much data needs to be spilled):

 spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-shader
 spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-mixed-shader

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-06-13 15:55:59 -07:00
Francisco Jerez
a84b5d43e2 i965: Fix cross-primitive scratch corruption when changing the per-thread allocation.
I haven't found any mention of this in the hardware docs, but
experimentally what seems to be going on is that when the per-thread
scratch slot size is changed between two pipelined draw calls, shader
invocations using the old and new scratch size setting may end up
being executed in parallel, causing their scratch offset calculations
to be based on a different partitioning of the scratch space, which
can cause their thread-local scratch space to overlap leading to
cross-thread scratch corruption.

I've been experimenting with alternative workarounds, like emitting a
PIPE_CONTROL with DC flush and CS stall between draw (or dispatch
compute) calls using different per-thread scratch allocation settings,
or avoiding reuse of the scratch BO if the per-thread scratch
allocation doesn't exactly match the original.  Both seem to be as
effective as this workaround, but they have potential performance
implications, while this should be basically for free.

Fixes over 40 failures in our CI system with spilling forced on
(including CTS, dEQP and Piglit failures) on a number of different
platforms from Gen4 to Gen9.  The 'glsl-max-varyings' piglit test
seems to be able to reproduce this bug consistently in the vertex
shader on at least Gen4, Gen8 and Gen9 with spilling forced on.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-06-13 15:55:58 -07:00
Francisco Jerez
d960284e44 i965: Keep track of the per-thread scratch allocation in brw_stage_state.
This will be used to find out what per-thread slot size a previously
allocated scratch BO was used with in order to fix a hardware race
condition without introducing additional stalls or memory allocations.
Instead of calling brw_get_scratch_bo() manually from the various
codegen functions, call a new helper function that keeps track of the
per-thread scratch size and conditionally allocates a larger scratch
BO.

v2: Handle BO allocation manually instead of relying on
    brw_get_scratch_bo (Ken).

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-06-13 15:55:58 -07:00
Francisco Jerez
013ae4a70a i965: Fix scratch overallocation if the original slot size was already a power of two.
The bitwise arithmetic trick used in brw_get_scratch_size() to clamp
the scratch allocation to 1KB has the unintended side effect that it
will cause us to allocate 2x the required amount of scratch space if
the original per-thread scratch size happened to be already a power of
two.  Instead use the obvious MAX2 idiom to clamp the scratch
allocation to the expected range.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-06-13 15:55:58 -07:00
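The overallocation described in 013ae4a70a can be illustrated with a minimal sketch. It assumes the old bitwise trick amounted to OR-ing the size with 1023 before rounding up to a power of two; the commit message does not spell the expression out, so scratch_size_old(), scratch_size_new() and next_pow2() are hypothetical names used for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Round up to the next power of two; a power of two is returned unchanged. */
static uint32_t next_pow2(uint32_t x)
{
   assert(x <= (1u << 31));
   uint32_t p = 1;
   while (p < x)
      p <<= 1;
   return p;
}

/* Assumed shape of the old trick: OR-ing in 1023 enforces the 1KB minimum,
 * but it also pushes an exact power of two (>= 1KB) past itself, so the
 * result is doubled. */
static uint32_t scratch_size_old(uint32_t size)
{
   return next_pow2(size | 1023);
}

/* The MAX2 idiom: round up first, then clamp to the 1KB minimum. */
static uint32_t scratch_size_new(uint32_t size)
{
   return MAX2(1024, next_pow2(size));
}
```

With these definitions a 2048-byte request yields 4096 bytes from the old variant and 2048 bytes from the new one, while sizes that are not powers of two give the same result in both.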
Kenneth Graunke
2df8f4a253 mesa: Make TexSubImage check negative dimensions sooner.
Two dEQP tests expect INVALID_VALUE errors for negative width/height
parameters, but get INVALID_OPERATION because they haven't actually
created a destination image.  This is arguably not a bug in Mesa, as
there's no specified ordering of error conditions.

However, it's also really easy to make the tests pass, and there's
no real harm in doing these checks earlier.

Fixes:
dEQP-GLES3.functional.negative_api.texture.texsubimage3d_neg_width_height
dEQP-GLES31.functional.debug.negative_coverage.get_error.texture.texsubimage3d_neg_width_height

v2: Drop redundant check (caught by Anuj Phogat).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-06-13 15:38:47 -07:00
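The reordering described in 2df8f4a253 can be sketched as follows. This is not Mesa's validation code: validate_subimage() and the EX_* names are hypothetical, and only the numeric values are the ones the OpenGL headers assign to GL_INVALID_VALUE and GL_INVALID_OPERATION.

```c
#include <stdbool.h>

/* Hypothetical names; the values match the OpenGL error enums. */
enum {
   EX_NO_ERROR          = 0,
   EX_INVALID_VALUE     = 0x0501,
   EX_INVALID_OPERATION = 0x0502
};

/* Check the dimensions before looking at the destination image, so that
 * a negative size is reported as INVALID_VALUE even when no destination
 * image exists yet. */
static int validate_subimage(bool have_dest_image,
                             int width, int height, int depth)
{
   if (width < 0 || height < 0 || depth < 0)
      return EX_INVALID_VALUE;

   if (!have_dest_image)
      return EX_INVALID_OPERATION;

   return EX_NO_ERROR;
}
```

Both orderings are permitted because the spec does not rank the two errors; this ordering simply matches what the dEQP tests expect.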
Brian Paul
cf9bb9acac util: update some assertions in util_resource_copy_region()
To cope with copies of compressed images which are not multiples of
the block size.  Suggested by Jose.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-13 13:30:19 -06:00
Kenneth Graunke
5a0d294d38 i965: Fix encode_slm_size() to take a generation, not a device info.
In the Vulkan driver, we have the generation number (a compile time
constant) but not necessarily the brw_device_info struct.  I meant
to rework the function to take a generation number instead of a
brw_device_info pointer to accommodate this.  But I forgot, and left
it taking a brw_device_info pointer, while making Vulkan pass the
generation number (8, 9, ...) directly.  This led to crashes.

Brown paper bag fix for commit 87d062a940.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96504
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2016-06-13 12:23:11 -07:00
Kenneth Graunke
667e5cec76 i965: Don't leak scratch BOs for TCS/TES.
These need to be freed too.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2016-06-13 12:22:06 -07:00
Nanley Chery
a4a5917248 anv/pipeline: Don't dereference NULL dynamic state pointers
Add guards to prevent dereferencing NULL dynamic pipeline state. Asserts
of pCreateInfo members are moved to the earliest points at which they
should not be NULL.

This fixes a segfault seen in the McNopper demo, VKTS_Example09.

v3 (Jason Ekstrand):
   - Fix disabled rasterization check
   - Revert opaque detection of color attachment usage

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-13 11:35:45 -07:00
Nanley Chery
a0d84a9ef9 anv: Document and rename anv_pipeline_init_dynamic_state()
To reduce confusion, clarify that the state being copied is not dynamic.

This agrees with the Vulkan spec's usage of the term. Various sections
specify that the pipeline states which have VkDynamicState enums
(e.g. viewport, scissor, etc.) may or may not be dynamic.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-13 11:35:45 -07:00
Samuel Pitoiset
7f257abc1b nvc0/ir: clamp the UBO index for compute on Kepler
We already check that the address is not "too far", but we should also
clamp the UBO index in order to avoid looking at the wrong place in the
driver cb. This is a pretty rare situation though.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-13 20:12:48 +02:00
Marek Olšák
6e1b12c788 radeonsi: enable scratch coalescing
This makes one particular compute shader 8x faster.

Latest LLVM git is required.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-13 18:13:51 +02:00
Jimmy Berry
0c0f841e5d st/va: hardlink driver instances to gallium_drv_video.so
Removes the need to set LIBVA_DRIVER_NAME=gallium for supported targets and is
consistent with vdpau and general gallium drivers.

Note: some versions of libva can detect the gallium name and use the
backend, although that behaviour seems inconsistent since it only works
for some platforms/backends.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:31:29 +01:00
Jan Vesely
1fb4179f92 vl: Fix trivial sign compare warnings
v2: add whitespace fixes

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
[Emil Velikov: squash a few more whitespace issues]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:31:29 +01:00
Rob Herring
112e988329 Android: move libdrm settings to top-level Android.common.mk
Fix warnings like these due to HAVE_LIBDRM being inconsistently defined:

external/libdrm/include/drm/drm.h:839:30: warning: redefinition of typedef 'drm_clip_rect_t' is a C11 feature [-Wtypedef-redefinition]
typedef struct drm_clip_rect drm_clip_rect_t;

HAVE_LIBDRM needs to be set project wide to fix this. This change also
harmlessly links libdrm with everything, but simplifies the makefiles a
bit.

Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:31:29 +01:00
Rob Herring
54e550ab8a Android: disable some noisy warnings
Turn off the -Wpointer-arith, -Wmissing-field-initializers,
-Winitializer-overrides, and -Wmismatched-tags warnings. These are all
deemed pointless, are triggered on purpose, or there are no plans to fix them.

Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:31:29 +01:00
Emil Velikov
db8790c0da st/mesa: inline _mesa_create_context() into its only caller
Inline the function into its only caller. This way it's more obvious
how the classic and gallium drivers (st/mesa) use _mesa_initialize_context.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-13 15:31:29 +01:00
Emil Velikov
a4fa8bf819 st/mesa: remove unneeded break from st_api_create_context()
We have return on the previous line, thus the break will never be
reached.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-13 15:31:28 +01:00
Emil Velikov
6406bc1592 st/mesa: use c99 initializer for st_gl_api
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-13 15:31:28 +01:00
Emil Velikov
15bc7856bf gallium: remove st_api::get_proc_address hook
It has been unused for a long time, plus makes the gallium dri modules
require an extra glapi symbol relative to their classic counterparts.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:31:28 +01:00
Emil Velikov
23a7fca6aa mesa: remove _mesa_init_get_hash()
The actual code of the function print_table_stats() is guarded
by an #ifdef GET_DEBUG, which has not been defined in years.

The last fix in 2013 (7db6b5aa91) indicates that it's rarely
used/tested, since the issue had gone unnoticed for a whole year
(broken with 2ad4a47547).

Let's remove it for now. We can always revive it at a later stage.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-13 15:31:28 +01:00
Emil Velikov
b81685eb32 mesa: kill off _mesa_do_init_remap_table()
... and inline its contents in _mesa_init_remap_table().

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-13 15:31:28 +01:00
Emil Velikov
bfbf286f7d mesa: use native types when possible
All of the functions and related data are internal, so there's no point
in using the GL types.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-13 15:31:28 +01:00
Emil Velikov
3f80c95f35 mesa: make _mesa_map_function_spec() static
Used only locally.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-13 15:31:28 +01:00
Emil Velikov
390678f27d mesa: remove unused _mesa_get_function_spec() and gl_function_remap
Final user was killed with last commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-13 15:31:28 +01:00
Emil Velikov
5b700059a8 mesa: remove unused _mesa_map_function_array()
Unused as of commit 5a175127f3 ("dri: Remove all extension enabling
utility functions") and the patch before the previous patch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-13 15:31:28 +01:00
Emil Velikov
5378ee8187 glapi: remap_helper.py: remove MESA_alt_functions
The final user was nuked with last commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-13 15:31:28 +01:00
Emil Velikov
b5dd8e0cf8 mesa: remove unused function _mesa_map_static_functions()
Unused as of commit 5a175127f3 ("dri: Remove all extension enabling
utility functions")

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-13 15:31:28 +01:00
Emil Velikov
07ae8c7df7 dri/common: remove unused libdri_test_stubs.la
... and associated file(s).

No longer needed since commit 057259655e ("i965: Don't link libmesa or
libdri_test_stubs into tests")

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-13 15:31:27 +01:00
Emil Velikov
fcb5a75a66 swr: automake: add missing -I flag
When building from a release tarball (where the generated/built files
are in srcdir) in an OOT fashion we need to have both builddir and
srcdir in the includes list.

Otherwise we'll error out, as the file (header gen_knobs.h in this case)
won't be in the location where we are looking.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:31:24 +01:00
Emil Velikov
f4d26856df automake: add SWR to `make distcheck' gallium drivers
This will allow us to catch missing files and build issues before getting
the tarball out for general consumption.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:24:44 +01:00
Emil Velikov
bab5ab6940 configure.ac: strip out the llvm-config -march/mtune flags
Otherwise drivers such as SWR that depend on providing their own values
will fail to build.

v2: Add -mcpu for good measure (Chuck)

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chuck Atkins <chuck.atkins@kitware.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
2016-06-13 15:24:44 +01:00
Chuck Atkins
c86fcaca72 swr: Add missing headers for package inclusion
CC: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:24:44 +01:00
Emil Velikov
8229fe68b5 automake: get in-tree `make distclean' working again.
With an earlier commit we've handled the `make distclean' out of tree
build, yet we failed to account for the fact that for in-tree builds the
test condition will return 1. Thus effectively the target will be
considered as "failed".

Fixes: b7f7ec7843 ("mesa: automake: distclean git_sha1.h when building
OOT")
Cc: <mesa-stable@lists.freedesktop.org>
Tested-by: Andy Furniss <adf.lists@gmail.com>
Reported-by: Andy Furniss <adf.lists@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-06-13 15:24:44 +01:00
Jan Vesely
ace70aedcf gallivm: Fix trivial sign warnings
v2: include whitespace fixes

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-06-13 09:23:09 -04:00
Julien Isorce
a04804746f st/va: use proper temp pipe_video_buffer template
Instead of changing the format on the existing template,
which makes error handling awkward and confuses Coverity.

CoverityID: 1337953

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-06-13 09:14:32 +01:00
Julien Isorce
6c43e0016e st/va: it is valid to release the VABuffer of an exported resource
pipe_resource_reference(&res, NULL) will decrement reference counting,
i.e. p_atomic_dec(res->count). But the va surface still has the initial
reference since it has created the resource. So calling vaDestroyImage
on a derived image calls VaDestroyBuffer but the decrementation won't
reach 0. It is just wrong for vlVaDestroyBuffer to rely on the
export_refcount flag. Finally the vaapi intel driver has the same logic.

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-06-13 09:14:32 +01:00
Timothy Arceri
30df78236c glsl: fix component overlap validation for doubles
This change makes sure to remove arrays when checking if type
is a double.

The check for the end of the first slot of a multi-slot double
is also fixed by bumping the check to 4 rather than 3.
Previously we were not reserving the last component.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-06-12 21:56:32 +10:00
Timothy Arceri
ad3def919e glsl: fix max varyings count for ARB_enhanced_layouts
Since this extension allows more than one varying to share a single
location we can't just count the number of slots a varying takes and
add it to the total.

Instead we now reuse the reserved varyings bitfield to determine how
many slots are reserved for explicit locations instead.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-06-12 21:56:28 +10:00
Kenneth Graunke
0fb85ac08d i965: Use the correct number of threads for compute shaders.
We were programming the number of threads per subslice, when we should
have been programming the total number of threads on the GPU as a whole.

Thanks to Curro and Jordan for helping track this down!

On Skylake GT3e:
- Improves performance in Unreal's Elemental Demo by roughly 1.5-1.7x.
- Improves performance in Synmark's Gl43CSDof by roughly 3.7x.
- Improves performance in Synmark's Gl43GSCloth by roughly 1.18x.

On Broadwell GT2:
- Improves performance in Unreal's Elemental Demo by roughly 1.2-1.5x.
- Improves performance in Synmark's Gl43CSDof by roughly 2.0x.
- Improves performance in Synmark's Gl43GSCloth by 1.47035% +/-
  0.255654% (n=25).

On Haswell GT3e:
- Improves performance in Unreal's Elemental Demo (in GL 4.3 mode)
  by roughly 1.10x.
- Improves performance in Synmark's Gl43CSDof by roughly 1.18x.
- Decreases performance in Synmark's Gl43CSCloth by -1.99484% +/-
  0.432771% (n=64).

On Ivybridge GT2:
- Improves performance in Unreal's Elemental Demo (in GL 4.2 mode)
  by roughly 1.03x.
- Improves performance in Synmark's Gl43CSDof by roughly 1.25x.
- No change in Synmark's Gl43CSCloth (n=28).

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-06-12 00:40:15 -07:00
Kenneth Graunke
1db37ebecf i965: Assert that the scratch spaces are in range.
I don't know that anything actually guarantees this, but if we exceed
the limits, we may end up overflowing and trashing random buffers that
happen to be nearby in the VMA space, leading to rendering corruption,
hangs, or worse.

We should really fix this properly.  However, the pitfall has existed
for ages, so for now we should at least detect it.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-06-12 00:40:15 -07:00
Kenneth Graunke
a42a93dc12 i965: Fix CS scratch size calculations on Ivybridge and Baytrail.
These are linear, not powers of two, and much more limited.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-06-12 00:40:14 -07:00
Kenneth Graunke
147a90d82a i965: Fix Haswell CS per-thread scratch space encoding.
Most scratch stages use power of two sizes, in kilobytes, where
0 means 1kB.  But compute shaders on Haswell have a minimum of 2kB,
and use a representation where 0 = 2kB.

This meant that we were effectively telling the hardware to allocate
each thread twice as much space as we meant to, while simultaneously
not allocating that much space in the buffer, leading to overflows.

Note that the existing code is completely wrong for Ivybridge,
but that will take additional work to sort out, so I've left it
as is for now.  A subsequent commit will take care of that.

Together with the previous patches, this fixes rendering corruption
on Synmark's Gl43CSDof on Haswell.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-06-12 00:40:14 -07:00
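The two encodings contrasted in 147a90d82a can be sketched as below. The function names are hypothetical and the code only models the arithmetic stated in the message (a power-of-two field where 0 means 1kB for most stages, and 0 means 2kB for Haswell compute); it is not the driver's state-emission code.

```c
#include <assert.h>
#include <stdint.h>

/* log2 of an exact power of two. */
static unsigned log2_pow2(uint32_t x)
{
   assert(x != 0 && (x & (x - 1)) == 0);
   unsigned n = 0;
   while (x >>= 1)
      n++;
   return n;
}

/* Most stages: field value 0 means 1kB, 1 means 2kB, and so on. */
static unsigned encode_scratch_generic(uint32_t bytes)
{
   assert(bytes >= 1024);
   return log2_pow2(bytes) - 10;
}

/* Haswell compute: field value 0 means 2kB, so the bias is one larger. */
static unsigned encode_scratch_hsw_cs(uint32_t bytes)
{
   assert(bytes >= 2048);
   return log2_pow2(bytes) - 11;
}

/* How Haswell compute hardware would interpret a field value. */
static uint32_t decode_scratch_hsw_cs(unsigned field)
{
   return 2048u << field;
}
```

Feeding the generic encoding of 4kB to the Haswell compute decoder gives 8kB, which is the "twice as much space as we meant to" mismatch the message describes.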
Kenneth Graunke
a7d029d3df i965: Account for poor address calculations in Haswell CS scratch size.
Curro figured this out by investigating the simulator.  Apparently
there's also a workaround in the Windows driver.  I'm not sure it's
actually documented anywhere.

We were underallocating the scratch buffer by a factor of 128/70.

v2: Rename threads_per_subslice to scratch_ids_per_subslice
    (suggested by Jordan Justen).

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-06-12 00:39:45 -07:00
Kenneth Graunke
2213ffdb4b i965: Allocate scratch space for the maximum number of compute threads.
We were allocating enough space for the number of threads per subslice,
when we should have been allocating space for the number of threads in
the entire GPU.

Even though we currently run with a reduced thread count (due to a bug),
we might still overflow the scratch buffer because the address
calculation is based on the FFTID, which can depend on exactly which
EUs and threads are executing.  We need to allocate enough
for every possible thread that could run.

Fixes rendering corruption in Synmark's Gl43CSDof on Gen8+.
Earlier platforms need additional bug fixes.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-06-12 00:38:50 -07:00
Kenneth Graunke
9cd8f95809 i965: Set subslice_total on Gen7/7.5 platforms.
We'll use this for compute shader thread counts and scratch space
calculations shortly.

Note that subslices are referred to as "half slices" on Ivybridge.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-06-12 00:38:47 -07:00
Kenneth Graunke
87d062a940 i965: Fix shared local memory size for Gen9+.
Skylake changes the representation of shared local memory size:

 Size   | 0 kB | 1 kB | 2 kB | 4 kB | 8 kB | 16 kB | 32 kB | 64 kB |
 -------------------------------------------------------------------
 Gen7-8 |    0 | none | none |    1 |    2 |     4 |     8 |    16 |
 -------------------------------------------------------------------
 Gen9+  |    0 |    1 |    2 |    3 |    4 |     5 |     6 |     7 |

The old formula would substantially underallocate the amount of space.
This fixes GPU hangs on Skylake when running with full thread counts.

v2: Fix the Vulkan driver too, use a helper function, and fix the table
    in the comments and commit message.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-06-12 00:38:26 -07:00
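The table in 87d062a940 can be reproduced with a short sketch. encode_slm_size_sketch() is a hypothetical stand-in, not the driver's helper, and rounding small requests up to the smallest encodable size (4kB before Gen9, 1kB on Gen9+) is an assumption drawn from the "none" entries in the table.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t round_up_pow2(uint32_t x)
{
   uint32_t p = 1;
   while (p < x)
      p <<= 1;
   return p;
}

static unsigned log2_u32(uint32_t x)
{
   unsigned n = 0;
   while (x >>= 1)
      n++;
   return n;
}

/* Map a shared local memory size in bytes to the hardware field value. */
static unsigned encode_slm_size_sketch(unsigned gen, uint32_t bytes)
{
   assert(bytes <= 64 * 1024);

   if (bytes == 0)
      return 0;

   if (gen >= 9) {
      /* Gen9+: 1kB -> 1, 2kB -> 2, 4kB -> 3, ..., 64kB -> 7. */
      uint32_t size = round_up_pow2(bytes < 1024 ? 1024 : bytes);
      return log2_u32(size) - 9;
   }

   /* Gen7-8: the field counts 4kB units: 4kB -> 1, 8kB -> 2, ..., 64kB -> 16. */
   uint32_t size = round_up_pow2(bytes < 4096 ? 4096 : bytes);
   return size / 4096;
}
```

Applying the Gen7-8 formula on Gen9+ would, for example, program 2 for an 8kB request, which Gen9+ reads as 2kB; that is the underallocation the message refers to.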
Ilia Mirkin
3f48548a6f nv50: reinstate dedicated constbuf push path
This was disabled due to occasionally incorrect behavior when trying to
upload data. It later became apparent that nvc0 also had a similar but
slightly different issue, which was resolved in commit e50c01d5. This
takes the same logic as nvc0 and applies it to nv50 (which has somewhat
different interfaces).

Unfortunately I did not note down precisely what was broken with UBOs
when removing the support from nv50, but I've tested a bunch of local
traces, and none of them appear to regress. This should hopefully
improve performance when UBOs are used, but this was not directly
verified.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-06-11 12:18:43 -04:00
Ilia Mirkin
f47845596b nv50: enable indirect addressing of fragment shader inputs
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-06-11 11:50:42 -04:00
Ilia Mirkin
7d7e015381 mesa: add drawbuffer argument to ClearNamedFramebufferfi
This was fixed in revision 47 of the ARB_dsa spec on Oct 22, 2015. Since
it's horrible to have differing APIs across library versions, we should
attempt to minimize the impact by backporting it as far as possible and
hope no one notices.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 20:32:03 -04:00
Ilia Mirkin
92351a71a8 GL: update glcorearb.h to svn 32433
This brings in the fixed glClearNamedFramebufferfi definition, as well
as a lot of GLsizei -> GLsizeiptr changes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 20:31:53 -04:00
Ilia Mirkin
f81374fd3e GL: update glext to svn 32957
This brings in defines from GL_EXT_window_rectangles and fixes the
glClearNamedFramebufferfi definition.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 20:24:53 -04:00
Brian Paul
5cfc91624c docs: GL_ARB_copy_image done for softpipe, llvmpipe
Signed-off-by: Brian Paul <brianp@vmware.com>
2016-06-10 15:50:55 -06:00
Brian Paul
e9b86bb92c llvmpipe: turn on pipe cap for GL_ARB_copy_image support
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-10 15:50:04 -06:00
Brian Paul
2db747cf26 llvmpipe: don't use 3-component formats, except 32-bit x 3 formats
This basically disallows all 8-bit x 3 and 16-bit x 3 formats for
textures and render targets.  Some 3-component formats were already
disallowed before.  This avoids problems with GL_ARB_copy_image.

v2: the previous version of this patch disallowed all 3-component formats

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-10 15:50:04 -06:00
Brian Paul
672e92a146 softpipe: turn on pipe cap for GL_ARB_copy_image support
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-10 15:50:04 -06:00
Brian Paul
d8fe6332d8 softpipe: don't use 3-component formats
Mesa and gallium don't have a complete set of matching 3-component
texture formats.  For example, 8-bit sRGB unorm.  To fully support
the GL_ARB_copy_image extension we need to have support for all of
these formats: RGB8_UNORM, RGB8_SNORM, RGB8_SRGB, RGB8_UINT, and
RGB8_SINT using the same component order.  Since we don't have that,
disable the 3-component formats for now.

v2: Simplify 3-component format check, per Marek.
Also check that target != PIPE_BUFFER.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-10 15:50:04 -06:00
Brian Paul
e295b4e800 st/mesa: tweak surface format mapping table
1. Try to choose R8G8B8A8 unorm/srgb formats before others in an
effort to try to match component ordering for UINT/SINT/etc.

2. If we can't get a format such as PIPE_FORMAT_A16_UNORM, try
PIPE_FORMAT_R16G16B16A16_UNORM before shallower formats.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-10 15:50:04 -06:00
Brian Paul
dd4be2e19a util: update util_resource_copy_region() for GL_ARB_copy_image
This primarily means adding support for copying between compressed
and uncompressed formats.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-10 15:50:04 -06:00
Anuj Phogat
466b320163 gallium: Fix region overlap conditions for rectangles with a shared edge
From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels":
"The pixels corresponding to these buffers are copied from the source
rectangle bounded by the locations (srcX0, srcY0) and (srcX1, srcY1)
to the destination rectangle bounded by the locations (dstX0, dstY0)
and (dstX1, dstY1). The lower bounds of the rectangle are inclusive,
while the upper bounds are exclusive."

So, the rectangles sharing just an edge shouldn't overlap.
 -----------
|           |
 ------- ---
|       |   |
|       |   |
 ------- ---

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-10 14:35:21 -07:00
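The half-open rule quoted in 466b320163 and f8679badd4 translates into strict comparisons in the overlap test. The sketch below is a generic interval check under that rule; regions_overlap_sketch() is a hypothetical name, not the function either commit touches.

```c
#include <stdbool.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Rectangles are half-open: lower bounds inclusive, upper bounds exclusive.
 * Two such rectangles overlap only if they share at least one pixel, so a
 * shared edge alone does not count. */
static bool regions_overlap_sketch(int srcX0, int srcY0, int srcX1, int srcY1,
                                   int dstX0, int dstY0, int dstX1, int dstY1)
{
   /* Normalize so that flipped rectangles are handled too. */
   int sx0 = MIN2(srcX0, srcX1), sx1 = MAX2(srcX0, srcX1);
   int sy0 = MIN2(srcY0, srcY1), sy1 = MAX2(srcY0, srcY1);
   int dx0 = MIN2(dstX0, dstX1), dx1 = MAX2(dstX0, dstX1);
   int dy0 = MIN2(dstY0, dstY1), dy1 = MAX2(dstY0, dstY1);

   /* Strict '<' is what excludes the shared-edge case. */
   return sx0 < dx1 && dx0 < sx1 && sy0 < dy1 && dy0 < sy1;
}
```

Using '<=' in place of '<' would report an overlap for rectangles that merely touch, which is the behaviour these two commits remove.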
Anuj Phogat
f8679badd4 mesa: Fix region overlap conditions for rectangles with a shared edge
From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels":
"The pixels corresponding to these buffers are copied from the source
 rectangle bounded by the locations (srcX0, srcY0) and (srcX1, srcY1)
 to the destination rectangle bounded by the locations (dstX0, dstY0)
 and (dstX1, dstY1). The lower bounds of the rectangle are inclusive,
 while the upper bounds are exclusive."

So, the rectangles sharing just an edge shouldn't overlap.
     -----------
    |           |
     ------- ---
    |       |   |
    |       |   |
     ------- ---

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-10 14:35:21 -07:00
Dave Airlie
1584918996 gallivm: more 64-bit integer prep work.
This converts one other place to using the new helper.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-11 06:44:30 +10:00
Dave Airlie
f550b6d296 radeonsi: convert to 64-bitness checks instead of doubles.
This converts to testing for 64-bit types and renames some things
in anticipation of 64-bit integer support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-11 06:44:21 +10:00
Dave Airlie
e5c57824ec gallivm: make non-float return code bitcast consistent.
This just uses the same form across the fetches.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-11 06:44:17 +10:00
Dave Airlie
3b97e50b9a gallium/gallivm: use 64-bit test instead of doubles.
This just makes some generic code that currently emits double
suitable for emitting 64-bit values.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-11 06:44:13 +10:00
Dave Airlie
213ab8db87 gallium/tgsi: add 64-bitness type check function.
Currently this just checks for doubles, but we'll convert users to this,
making it easier to add 64-bit integers.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-11 06:43:45 +10:00
Jason Ekstrand
8d37556ec9 anv/entrypoints: Rework #if guards
This reworks the #if guards a bit.  When Emil originally wrote them, he
just guarded everything.  However, part of what anv_entrypoints_gen.py
generates is a hash table for looking up entrypoints based on their name.
This table *cannot* get out of sync between C and python regardless of
preprocessor flags.  In order to prevent this, this commit makes us use
void pointers in the dispatch table for those entrypoints which aren't
available.  This means that the dispatch table size and entry order is
constant and it should never get out-of-sync with the python.
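
A minimal sketch of the idea, assuming a made-up table with three entries and
a made-up HAVE_PLATFORM_FOO guard (none of these names come from the
generator): the unavailable entry keeps its slot as a void pointer, so the
size and the order of the table do not depend on preprocessor flags.

```c
#include <stddef.h>

/* Hypothetical function pointer types, for illustration only. */
typedef int (*pfn_create)(int);
typedef void (*pfn_destroy)(int);

/* Pretend the platform providing CreateFoo is disabled in this build. */
#define HAVE_PLATFORM_FOO 0

struct dispatch_table {
   pfn_create Create;
#if HAVE_PLATFORM_FOO
   pfn_create CreateFoo;
#else
   void *CreateFoo;      /* placeholder: keeps the slot and the ordering */
#endif
   pfn_destroy Destroy;
};
```

On the usual platforms where function pointers and void pointers have the
same size, Destroy sits in slot 2 whether or not HAVE_PLATFORM_FOO is set.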

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 13:21:07 -07:00
Jason Ekstrand
9ed0d9dd06 anv/entrypoints: Use the function pointer types provided by vulkan.h
This is a bit cleaner than generating the types ourselves when making the
table.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 13:21:07 -07:00
Nicolai Hähnle
42624ea837 st/mesa: use base level size as "guess" when available
When an application specifies mip levels _before_ setting a mipmap texture
filter, we will initially guess a single texture level. When the second level
image is created, we try to allocate the full texture -- however, we get the
base level size guess wrong if that size is odd. This leads to yet another
re-allocation of the texture later during st_finalize_texture.

Even worse, this re-allocation breaks a (reasonable) assumption made by
st_generate_mipmaps, because the re-allocation in the finalization call will
again allocate a single-level pipe texture (based on the non-mipmap texture
filter!). As a result, mipmap generation fails in interesting ways.

All of this can be avoided by just using the fact that we already know the
size of the base level.
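
To see why an odd base size defeats the guess, consider this small C sketch.
The helper names are hypothetical and the shift-based guess is an assumption
about how such a guess would typically be made, not a copy of the state
tracker code.

```c
/* Size of mip level 'level' for a given base size (halved per level,
 * clamped to a minimum of 1). */
static unsigned
mip_level_size(unsigned base_size, unsigned level)
{
   unsigned size = base_size >> level;
   return size ? size : 1;
}

/* Naive guess of the base size from a higher level: the bits dropped by
 * the right shift above cannot be recovered. */
static unsigned
guess_base_size(unsigned level_size, unsigned level)
{
   return level_size << level;
}
```

For a base size of 5, level 1 has size 2, and guessing back from level 1
gives 4 rather than 5. Reading the size directly from the base level image
avoids the round trip.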

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95529
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-10 20:20:39 +02:00
Jason Ekstrand
a1e69930e4 anv: Remove the PhysicalDeviceLimits FINISHME
At this point, the limits are probably more-or-less correct.  If there is
an invalid limit, that's a bug, not a FINISHME.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 09:43:45 -07:00
Jason Ekstrand
4f5bbf804b anv/pipeline_cache: Allow for a zero-sized cache
This gets ANV_ENABLE_PIPELINE_CACHE=false working again.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 09:43:10 -07:00
Jason Ekstrand
a1a25db699 anv/pipeline: Store the (set, binding, index) triple in the bind map
This way the bind map (which we're caching) is mostly independent of
the pipeline layout.  The only coupling remaining is that we pull the array
size of a binding out of the layout.  However, that size is also specified
in the shader and should always match so it's not really coupled.  This
fixes rendering issues in Dota 2.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 09:43:07 -07:00
Jason Ekstrand
c13c5ac561 anv/descriptor_set: Ensure that bindings are always in increasing order
Since applications are allowed to specify some set of bindings which need
not be dense they also need not be in order.  For most things, this doesn't
matter, but it could result in getting the wrong dynamic offsets. This adds a
quick-and-dirty sort to ensure that everything is always in increasing
order of binding index.
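
A minimal sketch of such a sort, using qsort from the C library. The struct
layout and function names are hypothetical, and using qsort is an assumption
made for the example; the patch only says the sort is quick-and-dirty.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down binding description. */
struct binding_info {
   uint32_t binding;            /* binding index given by the application */
   uint32_t descriptor_count;
};

/* Order two bindings by increasing binding index. */
static int
compare_bindings(const void *pa, const void *pb)
{
   const struct binding_info *a = pa;
   const struct binding_info *b = pb;

   /* Avoids the overflow a plain subtraction could hit. */
   return (a->binding > b->binding) - (a->binding < b->binding);
}

static void
sort_bindings(struct binding_info *bindings, size_t count)
{
   qsort(bindings, count, sizeof(*bindings), compare_bindings);
}
```

Once the bindings are sorted, anything assigned while walking them, such as
dynamic offset slots, follows the binding index rather than the order in
which the application happened to list them.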

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 09:43:03 -07:00
Jason Ekstrand
e2265926f2 anv/descriptor_set: Add a type field in debug builds
This allows for some extra validation and makes it easier to see what's
going on when poking around in gdb.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 09:42:59 -07:00
Jason Ekstrand
cd21015abd anv/descriptor_set: Set array_size to zero for non-existent descriptors
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 09:42:45 -07:00
Leo Liu
2ad443e4cc vl/dri3: support receiving new pixmap for front buffer
With glx of gstreamer-vaapi, the temporary pixmap for the front buffer gets
renewed in each frame, so when we receive a new pixmap, we should get a new
front buffer for it.

This also fixes Totem player playback corruption.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 11:24:24 -04:00
Leo Liu
0ef8500aab vl/dri3: get Makefile properly
In the original commit, the macro "if HAVE_DRI3" was in Makefile.sources.
That file is shared with SCons, which is not able to parse this macro, so
the SCons build failed. Jose quickly gave two approaches and a quick fix
with his second approach; thanks Jose for the solutions and fixes.

This patch is Jose's first approach, and it's more proper, because the
dri3 C file should not be included in the build when DRI3 is not enabled.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 11:24:19 -04:00
Jose Fonseca
2b4cee0571 gallivm: Never emit llvm.fmuladd on LLVM 3.3.
Besides the old JIT bug, it seems the X86 backend on LLVM 3.3 doesn't
handle llvm.fmuladd and instead falls back to a C function, which in
turn causes a segfault on Windows.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-10 16:17:04 +01:00
Jose Fonseca
320d1191c6 gallivm: Use llvm.fmuladd.*.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-10 13:47:35 +01:00
Jose Fonseca
9e8edfa190 util,gallivm: Explicitly enable/disable fma attribute.
As suggested by Roland Scheidegger.

Use the same logic as f16c, since fma requires VEX encoding.

But disable FMA on LLVM 3.3 without MCJIT.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-10 13:47:35 +01:00
Bas Nieuwenhuizen
54f755fa0f radeonsi: Reinitialize all descriptors in CE preamble.
This fixes a problem with the CE preamble and restoring only stuff in the
preamble when needed.

To illustrate suppose we have two graphics IB's 1 and 2, which  are submitted in
that order. Furthermore suppose IB 1 does not use CE ram, but IB 2 does, and we
have a context switch at the start of IB 1, but not between IB 1 and IB 2.

The old code put the CE RAM loads in the preamble of IB 2. As the preamble of
IB 1 does not have the loads and the preamble of IB 2 does not get executed, the
old values are not loaded into CE RAM.

Fix this by always restoring the entire CE RAM.

v2: - Just load all descriptor set buffers instead of load and store the entire
      CE RAM.
    - Leave the ce_ram_dirty tracking in place for the non-preamble case.

v3: - Fixed parameter alignment.
    - Rebased to master (Nicolai's descriptor series).

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-10 12:18:29 +02:00
Jose Fonseca
f93c22109e mesa: Wrap extensions.h declarations with extern "C".
This should fix the MSVC linker failures that arose with commit
5e2d25894b.

Trivial.
2016-06-10 11:00:42 +01:00
Ilia Mirkin
f48f344700 st/mesa: fix type confusion with reladdrs
The reality is that this doesn't matter, because we manually emit the
ARL to the sampler reladdr, and those arguments don't get an extra load
later, so it's effectively just a boolean. However having the types be
wrong is confusing and could trigger very odd bugs should usage change
down the line.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-09 21:01:53 -04:00
Dave Airlie
f140ed6d95 glsl/ir: remove TABs in ir_constant_expression.cpp
Adding 64-bit integers support was going to make this file worse,
just remove the tabs from it now.

Acked-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-10 10:30:18 +10:00
Anuj Phogat
73a54e4892 i965/gen9: Don't change halign and valign to fit in fast copy blit
An update in graphics specs has deleted the halign and valign fields
from XY_FAST_COPY_BLT command. See mesa commit 97f0f91.

Cc: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-06-09 15:50:07 -07:00
Anuj Phogat
46c8967813 mesa: Add a helper function for shared code in get_tex_rgba_{un}compressed
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-09 15:50:07 -07:00
Samuel Pitoiset
5e2d25894b mesa: Let compute shaders work in compatibility profiles
The extension is already advertised in compatibility profile, but
the _mesa_has_compute_shaders only returns true in core profile.
If we advertise it, we should allow it to work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-06-09 21:03:28 +02:00
Tim Rowley
2c85128e01 swr: implement clipPlanes/clipVertex/clipDistance/cullDistance
v2: only load the clip vertex once

v3: fix clip enable logic, add cullDistance

v4: remove duplicate fields in vs jit key, fix test of clip fixup needed

v5: fix clipdistance linkage for slot!=0,4

v6: support clip+cull; passes most piglit clip (failures understood)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-09 13:28:35 -05:00
Daniel Czarnowski
cf804b4455 glx: fix crash with bad fbconfig
GLX documentation states:
	glXCreateNewContext can generate the following errors: (...)
	GLXBadFBConfig if config is not a valid GLXFBConfig

Function checks if the given config is a valid config and sets proper
error code.

Fixes currently crashing glx-fbconfig-bad Piglit test.

v2: coding style cleanups (Emil, Topi)
    use DefaultScreen macro (Emil)

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
2016-06-09 17:55:44 +03:00
Nayan Deshmukh
2d140ae70a st/vdpau: implement luma keying
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-06-09 14:23:24 +02:00
Nayan Deshmukh
f24eb5a178 vl: Apply luma key filter before CSC conversion
Apply the luma key filter to the YCbCr values during the CSC conversion
in video buffer shader. The initial values of max and min luma are set
to opposite values to disable the filter initially and will be set when
enabling it.

Add extra parameters min and max luma for the luma key filter in
vl_compositor_set_csc_matrix in va, xvmc. Setting them
to opposite values 1.f and 0.f respectively won't affect the CSC
conversion.

v2: -Squash 1,2 and 3 into one patch to avoid breaking build of
    other components. (Christian)
    -use ureg_swizzle. (Christian)
    -change name of the variables. (Christian)

v3: -Squash all patches in one to avoid breaking of build. (Emil)
    -wrap functions properly. (Emil)
    -use 0.0f and 1.0f instead of 0.f and 1.f respectively. (Emil)

v4: -Divide it in two patches one which introduces the functionality
     and assigns dummy values to the changed functions and second which
     implements the lumakey filter. (Christian)
    -use ureg_scalar instead ureg_swizzle. (Christian)

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-06-09 14:23:07 +02:00
Jason Ekstrand
037ce5d734 i965: Emit surface states for extra planes prior to gen8
When Kristian implemented GL_TEXTURE_EXTERNAL_OES, he hooked it up for gen8
but not for gen7 or earlier.  It all works, we just need to emit the states
for the extra planes.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-08 21:57:57 -07:00
Marc-André Lureau
dc81b3ad43 virgl: fix checking fences
When calling virgl_fence_wait() with timeout=0,
virgl_{drm,vtest}_resource_is_busy() is called. However, it returns TRUE
for a busy resource, whereas virgl_fence_wait() should return TRUE for
a completed (non-busy) resource.
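
The inversion can be sketched as follows. The types and function names here
are stand-ins invented for the example, not the virgl winsys interface.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a winsys resource backing a fence. */
struct fake_resource {
   bool busy;
};

/* Mirrors the "is busy" style of query: true while work is pending. */
static bool
resource_is_busy(const struct fake_resource *res)
{
   return res->busy;
}

/* Non-blocking fence check (timeout == 0): a fence wait reports true when
 * the fence has completed, so the busy result must be negated. */
static bool
fence_wait_nonblocking(const struct fake_resource *res)
{
   return !resource_is_busy(res);
}
```

Returning the busy flag unchanged would tell callers that a fence had
signalled exactly when it had not.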

This fixes running supertuxkart in a VM (I could not reproduce locally
with vtest though there is a similar fix)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-09 14:07:53 +10:00
Dave Airlie
15896a470b glsl/types: rename is_dual_slot_double to is_dual_slot_64bit.
In the future int64 support will have the same requirements.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-09 09:17:24 +10:00
Dave Airlie
45c901f7a3 st/glsl_to_tgsi: move to checking 64-bitness instead of double
This uses the new types interfaces to check for 64-bit types,
as futureproofing against int64 support.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-09 07:37:49 +10:00
Dave Airlie
bbbc45b8e1 st/glsl_to_tgsi: use enum glsl_base_type instead of unsigned
This is just some better type safety that I noticed while working
on 64-bit integer support.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-09 07:37:49 +10:00
Dave Airlie
152f5eea62 mesa: use new 64-bit checks instead of explicit double checks.
This just moves to the new interfaces in advance of int64.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-09 07:37:47 +10:00
Dave Airlie
2df46519e4 glsl/link_varyings: switch to 64bit check instead of double.
This is prep work for int64 support.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-09 07:37:43 +10:00
Dave Airlie
35616a9e0e glsl: use new interfaces for 64-bit checks.
This is just prep work for int64 support, changing
places where 64-bit matters, not doubles.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-09 07:37:19 +10:00
Dave Airlie
a82b8e8b36 compiler: use 64bit check for sizing instead of double check.
This just moves code to the new check in advance of int64 support.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-09 07:37:15 +10:00
Dave Airlie
246518154e compiler/types: add 64-bitness queries.
This adds an inline and type query for if a type is 64-bit.

For now this is equivalent to double, but int64 will change
this.
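
A minimal sketch of what such a query might look like. The enum and function
names are hypothetical, chosen for the example rather than taken from the
compiler's type headers.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down base type enum. */
enum base_type {
   TYPE_UINT,
   TYPE_INT,
   TYPE_FLOAT,
   TYPE_DOUBLE
};

/* True for types whose components take 64 bits. Callers ask about the
 * width instead of asking "is this a double?". */
static inline bool
base_type_is_64bit(enum base_type type)
{
   /* Only doubles today; a future int64 type would be added here. */
   return type == TYPE_DOUBLE;
}
```

Code that sizes or packs values based on this query would then pick up new
64-bit types without further changes at each call site.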

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-09 07:37:04 +10:00
Adam Jackson
a1c5cd426c glapi/glx: Add overflow checks to the client-side indirect code
Coverity complains that the computed sizes can lead to negative lengths
passed to memcpy. If that happens we've been handed invalid arguments
anyway, so just bomb out.

The funky "0%s" is because the size string for the variable-length part
of the request is of the form "+ safe_pad() ...", and a unary + would
coerce the result to always be positive, defeating the overflow check.
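
The kind of check involved can be sketched like this. These helpers are
written for the example under the assumption that a negative result signals
an invalid size; they are not a copy of the helpers in the GLX client code,
even though safe_pad shares a name with the one mentioned above.

```c
#include <limits.h>

/* Returns a + b, or -1 if either input is negative or the sum would
 * overflow an int. */
static int
safe_add(int a, int b)
{
   if (a < 0 || b < 0)
      return -1;
   if (INT_MAX - a < b)
      return -1;
   return a + b;
}

/* Rounds a up to the next multiple of 4, or returns -1 on bad input or
 * overflow. */
static int
safe_pad(int a)
{
   int padded = safe_add(a, 3);

   if (padded < 0)
      return -1;
   return padded & ~3;
}
```

A caller can then bail out before memcpy whenever the computed length is
negative, instead of passing a wrapped-around size along.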

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-06-08 14:39:46 -04:00
Marek Olšák
26b69ad250 radeonsi: improve the computation and comment of scratch_waves
2% isn't much. If you think the number should be decreased, please speak up.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-08 19:28:25 +02:00
Marek Olšák
1d9c1d9386 radeonsi: print the number of spilled VGPRs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-08 19:28:25 +02:00
Marek Olšák
2b18d67a1e gallium/radeon: remove dead code creating LLVMTargetMachine
This was for some old unsupported LLVM version.
Only si_create_context creates the target machine now.
r600g doesn't use this function.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-08 19:23:42 +02:00
Marek Olšák
a343ab55f7 radeonsi: don't enable scratch just for SGPR spills
Diff from shader-db:
  Scratch: 3221504 -> 17408 (-99.46 %) bytes per wave

v2: add "break;"

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-08 19:23:41 +02:00
Marek Olšák
55b097d004 st/mesa: try not to compile compute shader on the first use
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-06-08 19:23:41 +02:00
Marek Olšák
95288277d5 Revert "radeonsi: allow direct hw MSAA resolve for scanout surfaces"
This reverts commit ffd54d1936.

No, it doesn't work. The test case is "glxgears -samples 2".
2016-06-08 19:21:55 +02:00
Nicolai Hähnle
bd5c41fe5f st/mesa: directly compute level=0 texture size in st_finalize_texture
The width0/height0/depth0 on stObj may not have been set at this point.
Observed in a trace that set up levels 2..9 of a 2d texture, and set the base
level to 2, with height 1. This made the guess logic always bail.

Originally investigated by Ilia Mirkin, this patch gets rid of the somewhat
redundant storage of width0/height0/depth0 and makes sure we always compute
pipe texture sizes that are compatible with the base level image of the
GL texture.

Fixes the gl-1.2-texture-base-level piglit test provided by Brian Paul.

v2:
- try to re-use an existing pipe texture when possible
- handle a corner case where the base level is not level 0 and it is of
  size 1x1x1

v3:
- ptHeight = ptWidth in cube map 1x1 case (suggested by Brian)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-08 19:12:07 +02:00
Timothy Arceri
8c3ecde0e1 glsl: stop allocating memory for SSBOs and builtins
This just stops counting and assigning a storage location for
these uniforms, the count is only used to create the uniform storage.

These uniform types don't use this storage.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-06-08 13:19:32 +10:00
Ilia Mirkin
6e6fd911da st/mesa: use buffer usage history to set dirty flags for revalidation
We were previously unconditionally doing this for arrays and ubo's, and
ignoring texture/storage/atomic buffers. Instead use the usage history
to determine which atoms need to be revalidated.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-07 22:27:04 -04:00
Gurchetan Singh
d9546b0c5d i965: Integrate precise trig into configuration infrastructure
With this change, to enable precise SIN and COS instructions
on Intel hardware, one can put

<option name="precise_trig" value="true"/>

in the proper drirc file.

V2: Make option name more generic

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Stephane Marchesin <stephane.marchesin@gmail.com>
2016-06-07 15:42:21 -07:00
Marek Olšák
f39439d166 radeonsi: re-enable PBO ReadPixels acceleration
disabled by 4f1cccf570

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-08 00:22:45 +02:00
Marek Olšák
7c6e88b643 radeonsi: allow MSAA resolving into a texture that has DCC enabled
Since DCC is enabled almost everywhere now, it's important not to disable
this fast path.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
9a472a3e0b gallium/radeon: move DCC clearing into a separate function
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
ffd54d1936 radeonsi: allow direct hw MSAA resolve for scanout surfaces
No idea why this was disabled, but it works fine.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
4be46c7d9d radeonsi: don't allocate DCC for the temporary MSAA resolve surface
Allocating it has no effect, but it adds overhead (useless DCC clear).

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
c06246501e radeonsi: don't enable DCC in the sampler if first_level doesn't have it
If first_level > 0 and DCC is disabled for that level, let's skip DCC
reads entirely.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
00389100b6 winsys/amdgpu: enable DCC for mipmapped textures
Also add dcc_fast_clear_size for clearing only the necessary subset
of DCC. For no AA, it's equal to the size of the whole DCC level.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
c65361763c gallium/radeon: don't disable DCC because of SDMA
We want to keep DCC enabled to save bandwidth. It was a bad idea to disable
it here.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
2fd74a05bb radeonsi: don't flag renderbuffer feedback loop if DCC has just been disabled
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
aa7fe70443 radeonsi: add per-level dcc_enabled flags
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
60e93ddd06 radeonsi: compute DCC register parameters in si_emit_framebuffer_state
This will get more complicated with mipmapped DCC or when DCC is enabled
after allocation.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
a01536a29f gallium/radeon: add an assertion checking the validity of PIPE_BIND_SCANOUT
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
d4d733e39d gallium/radeon: don't allocate DCC for non-renderable texture formats
R9G9B9E5 is the only uncompressed one hopefully.

This fixes incorrect rendering not discovered (due to a lack of tests)
until DCC mipmapping was enabled.

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Nicolai Hähnle
b42bc90b6a radeonsi: enable WQM in PS prolog when needed
WQM is needed when the PS prolog computes a VGPR that is consumed by a shader
with (implicit or explicit) derivatives.

Depends on http://reviews.llvm.org/D20839 / LLVM r272063 for this to be
effective (otherwise it's just a no-op).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130
Cc: 12.0 <mesa-dev@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 23:46:02 +02:00
Nicolai Hähnle
d3a584defe tgsi/scan: add uses_derivatives (v2)
v2:
- TG4 does not calculate derivatives (Ilia)
- also handle SAMPLE* instructions (Roland)

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-07 23:45:17 +02:00
Nanley Chery
b7a0c0ec7f docs/devinfo: Expound on helpful extension tips
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-07 11:16:23 -07:00
Nanley Chery
9e7de50cab docs/devinfo: Update bullet in stale extension guide
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-07 11:16:23 -07:00
Nanley Chery
26b0f023d7 docs/devinfo: Add closing paragraph tag
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-07 11:16:23 -07:00
Tim Rowley
87f0a0448f swr: fix provoking vertex
Use rasterizer provoking vertex API.

Fix rasterizer provoking vertex for tristrips and quad list/strips.

v2: make provoking vertex tables static const

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-07 11:47:52 -05:00
Ilia Mirkin
c81b090c92 st/mesa: revalidate image atoms when a texture is updated
A texture may be redefined with _NEW_TEXTURE, which might have been
bound to a shader image slot. We have to revalidate the image atoms to
pick up on the new resource.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-07 10:18:34 -04:00
Ilia Mirkin
71ad8a173f gk104/ir: fix conditions for adding a texbar
Sometimes a register source can actually be double- or even quad-wide.
We must make sure that the inserted texbars take that width into
account.

Based on an earlier patch by Samuel Pitoiset.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
2016-06-07 10:18:13 -04:00
Nicolai Hähnle
8239da28e8 radeonsi: keep track of dirty descriptor sets
Reduces CPU load for draw calls that change none or few of the descriptors.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:18:10 +02:00
Nicolai Hähnle
d152c73712 radeonsi: move si_descriptors into a per-context array
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:18:07 +02:00
Nicolai Hähnle
a29c4f9ebd radeonsi: pass shader stage to si_disable_shader_image
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:18:05 +02:00
Nicolai Hähnle
4e0fb72786 radeonsi: access descriptor sets via local variables
This will simplify moving them to a per-context array.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:18:02 +02:00
Nicolai Hähnle
ba4a2840c7 radeonsi: add si_set_rw_buffer to be used for internal descriptors
So that callers outside of si_descriptors.c need to worry less about the
details of descriptor handling.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:17:59 +02:00
Nicolai Hähnle
c615a055f4 radeonsi: pass shader stage to si_set_shader_image
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:17:57 +02:00
Nicolai Hähnle
e6612a3e68 radeonsi: pass shader stage to si_set_sampler_view
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:17:55 +02:00
Nicolai Hähnle
c32cd4b78d radeonsi: move descriptor set begin_new_cs handling into a separate function
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:17:39 +02:00
Nicolai Hähnle
031b57bc2f radeonsi: move enabled_mask out of si_descriptors
This mask is irrelevant for the generic descriptor set handling, and having it
outside simplifies subsequent changes slightly.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:17:23 +02:00
Jason Ekstrand
d1e141a661 anv/entrypoints: Stop using the C preprocessor
Now that we emit guards for everything, we can just generate the files and
trust build flags to keep us safe.  This should also fix the tarball
problems.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-07 12:30:25 +01:00
Jason Ekstrand
d1a53f91ee anv/entrypoints: Emit #if guards for all platforms
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-07 12:30:25 +01:00
Haixia Shi
1ea233c6f3 platform_android: prevent deadlock in droid_swap_buffers
To avoid blocking other EGL calls, release the display mutex before
we enqueue buffer to android frameworks and re-acquire the mutex
upon return.

v2: moved lock/unlock inside droid_window_enqueue_buffer().

TEST=verify pinch zoom in Photos app no longer causes hangs

Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-07 12:30:25 +01:00
Emil Velikov
b7f7ec7843 mesa: automake: distclean git_sha1.h when building OOT
In the case of out-of-tree (OOT) builds, in particular when building
from tarball, we'll end up with the file in both srcdir and builddir.

We want the former to remain intact (since we need it on rebuild) while
the latter should be removed otherwise `make distclean' gets angry at
us.

Ideally there'll be a solution that feels a bit less of a hack. Until
then this does the job exactly as expected.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-07 12:30:23 +01:00
Emil Velikov
2c424e00c3 mesa: automake: ensure that git_sha1.h.tmp has the right attributes
... when copied from git_sha1.h.

As the latter file can be lacking the write attribute, one should set it
explicitly. Otherwise we'll get a warning/failure at cleanup stage.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-07 12:21:46 +01:00
Emil Velikov
359d9dfec3 mesa: automake: add directory prefix for git_sha1.h
Otherwise the build will assume that we're talking about builddir, which
is not the case in the else statement.

Here the file is already generated and is part of the tarball.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-07 12:21:45 +01:00
Emil Velikov
1816c837c1 egl: android: don't add the image loader extension for !render_node
With an earlier commit we introduced support for render_node devices, which
was coupled with the use of the image loader extension.

As the work was inspired by egl/wayland we (erroneously) added the
extension for the !render_node path as well.

That works for wayland, as the implementations of the DRI2 and IMAGE
loader extensions converge behind the scenes. As that is not yet
the case for Android we shouldn't expose the extension.

Fixes: 34ddef39ce ("egl: android: add dma-buf fd support")

Cc: <mesa-stable@lists.freedesktop.org>
Reported-by: Mauro Rossi <issor.oruam@gmail.com>
Tested-by: Mauro Rossi <issor.oruam@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-06-07 12:21:45 +01:00
Marek Olšák
095803a37a gallium/radeon: add support for sharing textures with DCC between processes
v2: use a function for calculating WORD1 of bo metadata

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-07 11:12:26 +02:00
Marek Olšák
9e5b5fbde0 gallium/radeon: don't discard DCC if an external user can write to it
We don't import textures with DCC now, but soon we will.

v2: if we can't disable DCC for image writes, at least decompress DCC
    at bind time

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-07 11:12:26 +02:00
Dave Airlie
c6b14bafa4 i915: fix typo CAP.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-07 18:31:14 +10:00
Jakob Sinclair
b450f29073 glsl: initialise pointer to NULL
Could cause issues if you tried to read from an uninitialised pointer.
This just initialises the pointer to null to avoid that being a problem.
Discovered by Coverity.

CID: 1343616

Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-06-07 08:13:25 +02:00
Dave Airlie
c295923d13 i965/gen8: fix cull distance emission for tessellation shaders.
This fixes some cases of:
GL45-CTS.cull_distance.functional
on Skylake.

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-07 11:52:17 +10:00
Ilia Mirkin
704bc0f0e9 nvc0: add support for VOTE tgsi opcodes
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-06-06 20:49:29 -04:00
Ilia Mirkin
f64c36e2d7 st/mesa: expose GL_ARB_shader_group_vote when supported by backend
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-06-06 20:49:29 -04:00
Ilia Mirkin
edfa7a4b25 gallium: add PIPE_CAP_TGSI_VOTE for when the VOTE ops are allowed
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-06-06 20:49:29 -04:00
Ilia Mirkin
30684b50d7 gallium: add VOTE_* opcodes to implement GL_ARB_shader_group_vote
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-06-06 20:49:28 -04:00
Ilia Mirkin
5189f0243a mesa: hook up core bits of GL_ARB_shader_group_vote
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-06-06 20:48:46 -04:00
Kenneth Graunke
13b859de04 glsl: Make opt_copy_propagation_elements actually propagate into loops.
We've had a FINISHME here since Eric originally wrote the code in 2011.
This patch implements his suggested approach, which makes us actually
able to copy propagate into the loops, at the unfortunate cost of making
this pass even more expensive.

The shader-db statistics are basically a wash:

   No change in instruction counts.

   total cycles in shared programs: 78685980 -> 78680730 (-0.01%)
   cycles in affected programs: 2102646 -> 2097396 (-0.25%)
   helped: 48
   HURT: 83

I figured if we're going to do this for one copy propagation pass,
we may as well do it in both.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-06-06 14:14:31 -07:00
Kenneth Graunke
0756e3a25c glsl: Make opt_copy_propagation actually propagate into loops.
We've had a FINISHME here since Eric originally wrote the code in 2010.
This patch implements his suggested approach, which makes us actually
able to copy propagate into the loops, at the unfortunate cost of making
this pass even more expensive.

The shader-db statistics are not terribly impressive:

   total instructions in shared programs: 9008589 -> 9008613 (0.00%)
   instructions in affected programs: 4293 -> 4317 (0.56%)
   helped: 0
   HURT: 10

   total cycles in shared programs: 78550978 -> 78575760 (0.03%)
   cycles in affected programs: 655426 -> 680208 (3.78%)
   helped: 75
   HURT: 88

   GAINED: 2

Most of the "regressions" appear to be us successfully copy propagating
uniforms, which i965 is loading as pull constants instead of push, so we
occasionally have two pulls instead of one.  That doesn't seem like this
pass's job - it's propagating correctly, and we should be smarter about
pull loads in the backend.

This patch is also useful for a couple of reasons:

1. It can clean up copies created by varying packing (previously, we
   couldn't if the uses were inside a loop).

   This fixes a bug when interpolateAt*() is used on a packed varying
   inside a loop: glsl_to_nir struggled to see through the extra copy
   and mistakenly believed the variable was not an input.

2. It will help propagate uniform array access created by
   lower_const_array_to_uniforms().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-06-06 14:14:31 -07:00
Samuel Pitoiset
08ddfe7b2f nv50/ir: use round toward 0 when converting doubles to integers
As with floats, we should use the round toward 0 mode instead of the
nearest one (which is the default) when converting doubles to integers.

This fixes all arb_gpu_shader_fp64 piglits which convert doubles to
integers (16 tests).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
2016-06-06 22:56:04 +02:00
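The difference between the two rounding modes named in the commit above can be illustrated on the CPU side. This is only a minimal sketch in plain C, not the nv50/ir code; the helper names are hypothetical, and the "nearest" variant breaks ties away from zero, which may differ from what the hardware does.

```c
/* Round toward zero: a C cast from double to int discards the fraction
 * (defined only for values that fit in an int). */
static int double_to_int_toward_zero(double d)
{
   return (int)d;
}

/* Round to nearest, ties away from zero, written without libm.
 * Shown only for contrast; hardware "nearest" modes may break ties
 * differently. */
static int double_to_int_nearest(double d)
{
   return (int)(d < 0.0 ? d - 0.5 : d + 0.5);
}
```

With these helpers, 2.7 converts to 2 under round toward zero and to 3 under round to nearest, which is the kind of mismatch a conversion test would notice.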
Marek Olšák
00e6899ae5 gallium/radeon: don't re-set BO metadata after CMASK deallocation
CMASK has no effect on metadata, because it's not sharable.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-06 22:50:55 +02:00
Marek Olšák
589d6b58c3 st/mesa: change SQRT lowering to fix the game Risen
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94627
(against nouveau)

Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-06-06 22:50:55 +02:00
Marek Olšák
991cbfcb14 radeonsi: add a performance tweak for 4 SE parts
Ported from Vulkan.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-06 22:50:55 +02:00
Marek Olšák
2802310c25 radeonsi: simplify PRIMGROUP_SIZE computation for tessellation
Ported from Vulkan.

v2: keep the comment

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-06 22:50:55 +02:00
Marek Olšák
014c8ec770 r600g: use hw MSAA resolve for non-trivial resolves
This improves MSAA resolve performance.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-06 22:50:55 +02:00
Marek Olšák
6b449783f6 radeonsi: use hw MSAA resolve for non-trivial resolves
This improves MSAA resolve performance.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-06 22:50:55 +02:00
Dave Airlie
07403014c3 mesa/program_resource: return -1 for index if no location.
The GL4.5 spec quote seems clear on this:
"The value -1 will be returned by either command if an error occurs,
if name does not identify an active variable on programInterface,
or if name identifies an active variable that does not have a valid
location assigned, as described above."

This fixes:
GL45-CTS.program_interface_query.output-built-in

[airlied: use _mesa_program_resource_location_index as
suggested by Eduardo]
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-07 06:10:19 +10:00
Nicolai Hähnle
ec2b52e2d9 radeonsi: set descriptor dirty mask on shader buffer unbind
Found randomly while skimming the code. This might have caused VM faults in
robustness tests.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-06 21:43:18 +02:00
Nicolai Hähnle
0f916d4ca7 st/mesa: fix resource leak in try_pbo_readpixels
Found by inspection after seeing
https://bugs.freedesktop.org/show_bug.cgi?id=96343

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-06 21:42:27 +02:00
Charmaine Lee
627e975896 tgsi: fix mixed data type comparison in tgsi_point_sprite.c
Cast the unsigned semantic index to integer datatype before comparing
to max_generic, otherwise, max_generic which is initialized to -1
will be converted to unsigned int before the comparison, causing a wrong
semantic index to be assigned to a shader output.

Fixes the assert running TurboCAD_gl.trace. (VMware bug 1667265)

Also tested with glretrace, mesa demos pointblast, spriteblast and pointcoord.

v2: use the original max_generic variable but add the (int) cast
    to the semantic index, as suggested by Brian.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-06 10:20:45 -06:00
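The mixed signed/unsigned comparison described in the commit above follows from C's usual arithmetic conversions. A minimal sketch, assuming a comparison of the form "index > max_generic"; the function names are hypothetical and this is not the code from tgsi_point_sprite.c.

```c
/* Mirrors what the compiler does implicitly when an unsigned value is
 * compared with an int: the int is converted to unsigned, so -1 becomes
 * UINT_MAX and no index can ever be greater than it. */
static int index_exceeds_max_buggy(unsigned semantic_index, int max_generic)
{
   return semantic_index > (unsigned)max_generic;
}

/* The fix described in the commit: cast the index to int so the
 * comparison is done on signed values. */
static int index_exceeds_max_fixed(unsigned semantic_index, int max_generic)
{
   return (int)semantic_index > max_generic;
}
```

With max_generic still at its initial value of -1, the first variant reports that index 0 does not exceed the maximum, while the second reports that it does.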
Charmaine Lee
304b5a1446 svga: print shader linkage info when tgsi debug bit is on
When TGSI debug flag is enabled, print the shader linkage info as well.

Tested with mesa demos with SVGA_DEBUG=tgsi

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-06 10:20:45 -06:00
Ilia Mirkin
4f1cccf570 st/mesa: check shader image format support before using PBO download
ARB_shader_image_load_store only requires a very fixed list of formats
to be supported, while textures may be in all kinds of formats, like
BGRA which are presently not supported on at least Kepler.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-06 12:05:59 -04:00
Lars Hamre
4163c71010 tgsi: use truncf in micro_trunc
Switches to using truncf in micro_trunc.

Fixes the following piglit tests (for softpipe):

/spec/glsl-1.30/execution/built-in-functions/...
fs-trunc-float
fs-trunc-vec2
fs-trunc-vec3
fs-trunc-vec4
vs-trunc-float
vs-trunc-vec2
vs-trunc-vec3
vs-trunc-vec4

/spec/glsl-1.50/execution/built-in-functions/...
gs-trunc-float
gs-trunc-vec2
gs-trunc-vec3
gs-trunc-vec4

Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-06 15:56:28 +02:00
Samuel Iglesias Gonsálvez
2b648ec17c i965/gs/scalar: Fix load input for doubles
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-06 12:37:16 +02:00
Samuel Iglesias Gonsálvez
2d6f82a294 i965/fs: fix offset when loading double vector input varyings
When we are not packing a double input varying, we might need to
read its data at an offset that is not 64-bit aligned, so we read
the wrong data. This happens when using explicit locations
in varyings because Mesa disables varying packing for that case.

const_index is in 32-bit size units but offset() is multiplying
it by destination type size units. When operating with double
input varyings, const_index value might not be aligned to 64 bits.
To fix it, we load the double vector as if it was a float based vector
with twice the number of components.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-06 12:37:16 +02:00
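The unit mismatch described in the commit above comes down to simple arithmetic. The following is a simplified model, not the i965 code: the helper names are hypothetical, and an index of 3 is picked purely for illustration.

```c
/* Byte offset reached when an index is scaled by an element size. */
static unsigned scaled_byte_offset(unsigned index, unsigned type_size_bytes)
{
   return index * type_size_bytes;
}

/* Number of 32-bit components needed to cover a vector of doubles. */
static unsigned float_components_for_doubles(unsigned double_components)
{
   return double_components * 2;
}
```

An index of 3 counted in 32-bit units means byte 12. Scaling it by the size of a double (8 bytes) lands on byte 24 instead, whereas scaling by the size of a float (4 bytes) gives the intended 12. Treating a dvec2 as four float components keeps the index and the element size in the same units.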
Samuel Iglesias Gonsálvez
cb30727648 i965/fs: fix FS_OPCODE_CINTERP for unpacked double input varyings
Data starts at suboffset 3 in 32-bit units (12 bytes), so it is not
64-bit aligned and the current implementation fails to read the data
properly. Instead, when there is a double input varying, read it as
a vector of floats with twice the number of components.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-06 12:37:16 +02:00
Dave Airlie
4c86399378 glsl: geom shader max_vertices layout must match.
From GLSL 4.5 spec, "4.4.2.3 Geometry Outputs".
"all geometry shader output vertex count declarations in a
program must declare the same count."

Fixes:
GL45-CTS.geometry_shader.output.conflicted_output_vertices_max

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-06 18:02:19 +10:00
Jason Ekstrand
ffcef720b7 anv/pipeline: Add support for caching the push constant map
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-06-06 00:44:32 -07:00
Dave Airlie
78659ade40 glsl: use enum glsl_interface_packing in more places. (v2)
Although the glsl_types.h stores this in a bitfield,
we should hide that from everyone else. Hide the cast
in an accessor method and use the enum everywhere.

This makes things a bit nicer in gdb, and improves type
safety.

v2: fix a few pieces of interface I missed that caused some
piglit regressions.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-06-06 15:58:37 +10:00
Dave Airlie
ff2e569153 i965: don't use NumLayers for 3D textures.
For 3D textures we shouldn't be using NumLayers, we need
to get it from the depth.

This fixes:
GL45-CTS.geometry_shader.layered_framebuffer.clear_call_support

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-06 13:07:07 +10:00
Dave Airlie
1f66a4b689 glsl: for anonymous struct matching use without_array() (v3)
With tessellation shaders we can have cases where we have
arrays of anon structs, so make sure we match using without_array().

Fixes:
GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_in

v2:
test lengths match as well (Ilia)
v3:
descend array lengths to check for matches as well (Ilia)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-06 12:54:41 +10:00
Dave Airlie
6702c15810 glsl/ast: don't crash when func_name is NULL
This fixes a crash in
GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types

If we can't find the func_name in one of these paths,
we have emitted an earlier error so just return here.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-06 12:54:30 +10:00
Dave Airlie
4336196b7f glsl: handle ast_aggregate in has_sequence_subexpression. (v2)
GL43-CTS.compute_shader.work-group-size does
uniform uint g_uniform[gl_WorkGroupSize.z + 20] = { 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24 };

The initializer triggers the GLSL 4.30/GLES3 tests
for constant sequence subexpressions, so it doesn't
happen unless you are using those, so just return
false as this path is now reachable.

v2: update commit msg with diagnosis
Acked-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-06 12:54:19 +10:00
Kenneth Graunke
f657a59d98 mesa: Try to unbreak the MSVC build.
PATH_MAX is apparently not a thing on Windows.  Borrow the hack from
pipe_loader.c to try and make this work.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2016-06-05 16:32:08 -07:00
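A common way to cope with a missing PATH_MAX is a guarded fallback definition. The log does not show the actual hack borrowed from pipe_loader.c, so the sketch below is an assumption about its general shape; the 4096 fallback and the path_buffer_size() helper are hypothetical.

```c
#include <limits.h>   /* defines PATH_MAX on most POSIX systems */
#include <stddef.h>
#include <stdlib.h>   /* defines _MAX_PATH with MSVC */

#ifndef PATH_MAX
#  ifdef _MSC_VER
#    define PATH_MAX _MAX_PATH
#  else
#    define PATH_MAX 4096   /* arbitrary fallback chosen for this sketch */
#  endif
#endif

/* Size to use for stack buffers that hold a file path. */
static size_t path_buffer_size(void)
{
   return (size_t)PATH_MAX;
}
```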
Kenneth Graunke
c417c0c9c3 mesa: Add MESA_SHADER_CAPTURE_PATH for writing .shader_test files.
This writes linked shader programs to .shader_test files to
$MESA_SHADER_CAPTURE_PATH in the format used by shader-db
(http://cgit.freedesktop.org/mesa/shader-db).

It supports both GLSL shaders and ARB programs.  All stages that
are linked together are written in a single .shader_test file.

This eliminates the need for shader-db's split-to-files.py, as Mesa
produces the desired format directly.  It's much more reliable than
parsing stdout/stderr, as those may contain extraneous messages, or
simply be closed by the application and unavailable.

We have many similar features already, but this is a bit different:
- MESA_GLSL=dump writes to stdout, not files.
- MESA_GLSL=log writes each stage to separate files (rather than
  all linked shaders in one file), at draw time (not link time),
  with uniform data and state flag info.
- Tapani's shader replacement mechanism (MESA_SHADER_DUMP_PATH and
  MESA_SHADER_READ_PATH) also uses separate files per shader stage,
  but allows reading in files to replace an app's shader code.

v2:  Dump ARB programs too, not just GLSL.
v3:  Don't dump bogus 0.shader_test file.
v4:  Add "GL_ARB_separate_shader_objects" to the [require] block.
v5:  Print "GLSL 4.00" instead of "GLSL 4.0" in the [require] block.
v6:  Don't hardcode /tmp/mesa.
v7:  Fix memoization of getenv().
v8:  Also print "SSO ENABLED" (suggested by Timothy).
v9:  Also handle ES shaders (suggested by Ilia).
v10: Guard against MESA_SHADER_CAPTURE_PATH being too long; add
     _mesa_warning calls on error handling (suggested by Ben).
v11: Fix crash when variable is unset introduced in v10.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-06-05 13:48:57 -07:00
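The revision notes above mention memoizing getenv(), guarding against an overlong path, and tolerating an unset variable. A minimal sketch of how those three points might fit together, assuming a hypothetical get_capture_path() helper and a hypothetical length limit; it is not the Mesa implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define CAPTURE_PATH_LIMIT 4096   /* hypothetical limit for this sketch */

/* Reads MESA_SHADER_CAPTURE_PATH once and caches the result.
 * Returns NULL when the variable is unset or the value is too long. */
static const char *get_capture_path(void)
{
   static bool read_env_var = false;
   static const char *path = NULL;

   if (!read_env_var) {
      path = getenv("MESA_SHADER_CAPTURE_PATH");
      read_env_var = true;

      /* Check for NULL before measuring the string, so an unset
       * variable does not crash. */
      if (path != NULL && strlen(path) >= CAPTURE_PATH_LIMIT)
         path = NULL;
   }

   return path;
}
```

Tracking "already read" in a separate flag, rather than testing the cached pointer for NULL, keeps an unset variable from triggering a new getenv() call on every lookup.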
Ilia Mirkin
092ec3920f nv50,nvc0: fix BGR10_A2UI vertex format
This is mostly academic as this is not reachable from GL, which only has
the packed RGB10_A2UI vertex format.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-05 15:13:46 -04:00
Samuel Pitoiset
be365f34f0 nvc0: do not clear surfaces bins in the validate function
We should not call nouveau_bufctx_reset() inside a validate function.
This only affects Fermi where images are aliased between 3D and CP.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-05 19:02:59 +02:00
Samuel Pitoiset
43d3ecfb33 nvc0: re-validate images after launching a grid on Fermi
Image invalidation is a bit weird on Fermi and there is already a hack
which forces invalidating all images when launching a compute shader
to help in fixing 3D<->CP interaction.

However, we need to re-validate images for compute because
nvc0_compute_invalidate_surfaces() will destroy the previous binding.
This is not really good for performance purposes but this might be
improved later.

This fixes the following piglits:
- spec/arb_compute_shader/execution/basic-uniform-access
- spec/arb_compute_shader/execution/mutiple-texture-reading
- spec/arb_compute_shader/execution/multiple-workgroups
- spec/glsl-4.30/execution/built-in-functions/cs-* (207 tests)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-05 18:48:02 +02:00
Marek Olšák
3b44864ab7 radeonsi: fix images with level > 0
This should fix spec@arb_shader_image_load_store@level.

Broken by:
    Commit: 95c5bbae66
    radeonsi: set some image descriptor fields at bind time

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-05 17:00:14 +02:00
Ilia Mirkin
fd6bbc2ee2 nvc0: reduce overhead from always marking images dirty
We would revalidate images when anything was touched at all. Which is
unfortunate, since the state tracker does not use CSO's to reduce the
workload. So instead implement a protocol to ensure that something has
changed before revalidating all the images.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-04 23:50:56 -04:00
Ilia Mirkin
0f673db6f0 nvc0: reduce overhead from always marking buffers dirty
We would revalidate buffers when anything was touched at all. Which is
unfortunate, since the state tracker does not use CSO's to reduce the
workload. So instead implement a protocol to ensure that something has
changed before revalidating all the SSBOs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-04 23:50:56 -04:00
Ilia Mirkin
e8ee161b16 nvc0: fix memory barrier flag handling
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-04 23:50:56 -04:00
Ilia Mirkin
29abbeecd8 nvc0: mark bound buffer range valid
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-04 23:50:56 -04:00
Dave Airlie
f018456901 anv/entrypoints: don't go using wayland/xcb unless they are configured
The fix in:
anv: let anv_entrypoints_gen.py generate proper Wayland/Xcb guards

breaks things if wayland headers aren't installed.

Separate things out properly to avoid that problem.

[airlied: fixed up to put in pre-existing sections].
Reported-by: Arjan van de Ven
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-05 07:03:12 +10:00
Marek Olšák
d5491a81ff gallium/radeon: don't use the DMA ring for pipelined buffer uploads
Submitting a DMA IB flushes the GFX IB and all GPU caches.

Vedran Miletić said:
  "On Tonga 380X, this improves The Talos Principle from 8.3 fps to 28.3 fps
   (all graphics settings Ultra, 4xAA, 1080p resolution with downsampling
   from 1200p)."

Some anonymous dude said:
   R9 390 results:
      Tomb Raider (normal settings): 80 -> 88 FPS
      Talos Principle (custom settings): 23 -> 56 FPS
      Metro Last Light Redux (default benchmark settings): 39 -> 40 FPS

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Vedran Miletić <vedran@miletic.net>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
9c35ec2042 r600g: don't flush caches when binding shader resources
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
eff94af794 r600g: only do necessary cache flushes in cp_dma_copy_buffer
The main impact is that {upload, draw, upload, draw, ..} doesn't flush
framebuffer caches before every upload.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
9e62012c30 r600g: only do necessary cache flushes in cp_dma_clear_buffer
The main impact is that fast color clear doesn't flush TC, CONST, DB.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
c92a3ae7e9 r600g: remove a CP DMA workaround that's not needed anymore
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
5ea5ed6050 r600g: fix CP DMA hazard with index buffer fetches (v3)
v3: use PFP_SYNC_ME on EG-CM only when supported by the kernel,
    otherwise use MEM_WRITE + WAIT_REG_MEM to emulate that

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
ade16e1f5d r600g: properly sync CP with CP DMA on R6xx
This will allow removing useless cache & IB flushes.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
7746903d3a r600g: write WAIT_UNTIL in the correct place
This has been wrong all along. Fixing this will allow removing useless
cache flushes.

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
ee0c96c11e gallium/radeon: rename allocator_so_filled_size -> allocator_zeroed_memory
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Marek Olšák
ada3d8f31e gallium/u_suballoc: allow different alignment for each allocation
Just move the alignment parameter from u_suballocator_create
to u_suballocator_alloc.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-06-04 15:42:33 +02:00
Jason Ekstrand
441194edd9 anv/blit: Use CLAMP_TO_EDGE for scaled blits
When upscaling you can end up interpolating between the edge pixel and one
past the edge.  Using CLAMP_TO_EDGE seems like the most reasonable thing to
do in this case.  This fixes two of the new Vulkan CTS tests in
dEQP-VK.api.copy_and_blit.blit_image.*

Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
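The effect of CLAMP_TO_EDGE on a texel coordinate that falls one past the edge can be sketched in a few lines. This is an integer-coordinate simplification with a hypothetical helper name, not the sampler behaviour of anv or the hardware.

```c
#include <assert.h>

/* Clamp an integer texel coordinate to the valid range [0, size - 1]. */
static int clamp_to_edge(int coord, int size)
{
   assert(size > 0);

   if (coord < 0)
      return 0;
   if (coord > size - 1)
      return size - 1;
   return coord;
}
```

For a 4-texel-wide image, a lookup at coordinate 4 (one past the edge) reads texel 3, so interpolation at the border blends the edge texel with itself.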
Jason Ekstrand
9313a56816 anv/copy: Account for the anv_surface.offset when creating a blit2d_surf
This was causing problems if the user tried to copy to/from the stencil
portion of a combined depth/stencil image.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
526a8de22d nir/spirv: Make a decoration switch complete
Getting rid of the default case makes the compiler warn if we are missing
cases.  While we're here, we also add the one missing case.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
62c6e94bd6 nir/spirv: Make unhandled decorations and capabilities non-fatal
glslang frequently throws bogus decorations into shaders.  While we are free
to assert-fail, it's a bit nicer to the application to just warn.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
ed14d21d04 nir/spirv: Add a way to print non-fatal warnings
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
2e46a5d155 nir/spirv: Add string lookup tables for a couple of SPIR-V enums
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
5a1e56f344 nir/spirv: Complete the list of capabilities
Previously we supported a subset of capabilities and just left a default
case for the others.  It's time to stop being lazy and actually audit the
capabilities.  This should bring them up-to-date with reality.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
9fa958e95b anv/pipeline: Add support for early depth stencil
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
66bd2e1133 mesa: Get rid of _mesa_active_fragment_shader_has_side_effects
It is no longer used.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
35bf4d9dc2 i965/ps_state: Use wm_prog_data.has_side_effects
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
3fb289f957 i965/fs: Add a wm_prog_data bit for has_side_effects
This is more accurate than calling
_mesa_active_fragment_shader_has_side_effects because it looks at whether
or not the SSBOs, images, or atomic buffers are actually written rather
than just existing in the program.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
4d3b8318a7 nir/info: Get rid of uses_interp_var_at_offset
We were using this briefly in the i965 driver to trigger recompiles but we
haven't been using it since we switched to the NIR y-transform lowering
pass.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
56a178922f anv/pipeline: Silently pass tests if depth or stencil is missing
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
bc7f7e1953 anv/pipeline: Unify gen7/8 emit_ds_state
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
fdc3c5dd05 genxml/gen6,7,75: s/BackFace/Backface
This is more consistent with gen8+

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
1f7b54ed29 nir/spirv: Handle the WorkgroupSize builtin decoration
This fixes the 7 dEQP-VK.pipeline.spec_constant.compute.local_size.* tests
in the latest dev version of the Vulkan CTS.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
b26cdd65e8 nir/spirv: Use breaks instead of returns in constant handling
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
a19ae36ce5 anv/pipeline: Refactor specialization constant handling a bit
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
45542f554c nir/lower_indirect_derefs: Use the direct array deref for recursion
This fixes about 100 of the new Vulkan CTS tests.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
Jason Ekstrand
59f06ac389 anv/clear: Handle ClearImage on 3-D images
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
Francisco Jerez
7244dc1e06 Revert "i965/fs: Allow scalar source regions on SNB math instructions."
This reverts commit c1107cec44.
Apparently the hardware spec text I quoted in the commit message was
outright lying about scalar source math being supported on SNB; the
hardware seems to load 32 contiguous bits of data for each channel
regardless of the regioning mode.  Fixes regressions in the following
CTS tests (which we didn't catch early due to CTS being temporarily
disabled in our CI system):

   es2-cts.gtf.gl.atan.atan_vec3_frag_xvary
   es2-cts.gtf.gl.cos.cos_vec2_frag_xvary
   es2-cts.gtf.gl.atan.atan_vec2_frag_xvary
   es2-cts.gtf.gl.pow.pow_vec2_frag_xvary_yconsthalf
   es2-cts.gtf.gl.cos.cos_float_frag_xvary
   es2-cts.gtf.gl.pow.pow_float_frag_xvary_yconsthalf
   es2-cts.gtf.gl.atan.atan_vec3_frag_xvaryyvary
   es2-cts.gtf.gl.pow.pow_vec3_frag_xvary_yconsthalf
   es2-cts.gtf.gl.cos.cos_vec3_frag_xvary
   es2-cts.gtf.gl.atan.atan_vec2_frag_xvaryyvary

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96346
Reported-by: Mark Janes <mark.a.janes@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2016-06-03 18:47:29 -07:00
Francisco Jerez
a2135c6fd9 i965/vec4: Fix cmod propagation not to propagate non-identity cmod into CMP(N).
The conditional mod of these instructions determines the semantics of
the comparison itself (rather than being evaluated based on the result
of the instruction as is usually the case for most other instructions
that allow conditional mods), so it's in general not legal to
propagate a conditional mod into a CMP instruction.  This prevents
cmod propagation from (mis)optimizing:

 cmp.z.f0 tmp, ...
 mov.z.f0 null, tmp

into:

 cmp.z.f0 tmp, ...

which gives the negation of the flag result of the original sequence.
I originally noticed this while working on SIMD32 in the scalar
back-end, but the same scenario is likely to be possible in vec4
programs so this commit ports the bugfix with the same name from the
scalar back-end to the vec4 cmod propagation pass.
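
The flag inversion can be illustrated with a small Python model.  This is
only a sketch, not i965 code: cmp_z and mov_z are hypothetical helpers, and
the model assumes that CMP writes all ones to its destination when the
comparison holds and zero otherwise.

```python
# Toy model of the two-instruction sequence; not actual i965 backend code.
# Assumption: CMP writes 0xFFFFFFFF to its destination when the comparison
# holds and 0 otherwise.

def cmp_z(a, b):
    """cmp.z.f0 tmp, a, b: the cmod selects the comparison itself (a == b)."""
    flag = (a == b)
    tmp = 0xFFFFFFFF if flag else 0
    return tmp, flag

def mov_z(src):
    """mov.z.f0 null, src: the cmod is evaluated on the result (src == 0)."""
    return src == 0

def original_sequence(a, b):
    """Flag after cmp.z followed by mov.z on the CMP result."""
    tmp, _ = cmp_z(a, b)
    return mov_z(tmp)

def misoptimized_sequence(a, b):
    """Flag after the MOV has been dropped and only the CMP remains."""
    _, flag = cmp_z(a, b)
    return flag
```

Under these assumptions the two functions return opposite flag values for
every input, which is why the MOV cannot simply be folded into the CMP.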

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-03 18:38:51 -07:00
Emil Velikov
7a3a0d9212 anv: add the X related and Wayland CFLAGS to VULKAN_ENTRYPOINT_CPPFLAGS
Otherwise we will fail to find the headers in some scenarios.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
2016-06-04 00:52:00 +01:00
Emil Velikov
a1256c0ea7 nir: automake: add nir_search_helpers.h to the sources list(s)
Fixes: dfbae7d64f ("nir/algebraic: support for power-of-two
optimizations")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-06-04 00:18:40 +01:00
Rob Clark
1535519e51 freedreno/ir3: do idiv lowering after main opt loop
Give algebraic-opt pass a chance to catch udiv by const power-of-two,
before running lower-idiv pass.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-03 16:05:03 -04:00
Rob Clark
dfbae7d64f nir/algebraic: support for power-of-two optimizations
Some optimizations, like converting integer multiply/divide into
left/right shifts, have additional constraints on the search expression,
such as requiring that a variable is a constant power of two.  Support
these cases by allowing a function name to be appended to the search
variable expression (i.e. "a#32(is_power_of_two)").
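
A rough Python sketch of the kind of constrained rewrite this enables.  The
helper names are hypothetical and are not the NIR API; the sketch only shows
the power-of-two test and the divide-to-shift replacement it guards.

```python
# Hypothetical helpers; the names are illustrative, not NIR's API.

def is_power_of_two(value):
    """True for positive integers with exactly one bit set."""
    return value > 0 and (value & (value - 1)) == 0

def lower_udiv(a, b):
    """Replace an unsigned divide by a constant power of two with a shift."""
    if is_power_of_two(b):
        shift = b.bit_length() - 1  # log2(b) when b is a power of two
        return a >> shift
    return a // b  # constraint not met; keep the division
```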

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-03 16:05:03 -04:00
Nicolai Hähnle
a64c7cd2ba radeonsi: mark buffer texture range valid for shader images
When a shader image view into a buffer texture can be written to, the buffer's
valid range must be updated, or subsequent transfers may incorrectly skip
synchronization.
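
The idea can be sketched in Python as follows.  ValidRange and needs_sync are
hypothetical names used for illustration only; the sketch assumes a transfer
may skip synchronization exactly when it does not touch the tracked range.

```python
# Illustrative model of valid-range tracking; not the gallium implementation.

class ValidRange:
    """Tracks the byte range of a buffer that may contain written data."""

    def __init__(self):
        self.start = None  # None means nothing has been written yet
        self.end = None

    def add(self, start, end):
        """Grow the range to cover [start, end)."""
        if self.start is None:
            self.start, self.end = start, end
        else:
            self.start = min(self.start, start)
            self.end = max(self.end, end)

    def intersects(self, start, end):
        return self.start is not None and start < self.end and end > self.start

def needs_sync(valid, start, end):
    """A transfer of [start, end) must synchronize if it may see written data."""
    return valid.intersects(start, end)
```

If a writable image view never calls add(), a later transfer over the same
bytes would wrongly conclude that no synchronization is needed.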

This fixes a bug that was exposed in Xephyr by PBO acceleration for glReadPixels,
reported by Michel Dänzer.

Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-03 14:11:05 +02:00
Marek Olšák
8c361e84ad Revert "egl: Check if API is supported when using eglBindAPI."
This reverts commit e8b38ca202.

It broke Glamor for Gallium at least.
2016-06-03 11:33:45 +02:00
Alejandro Piñeiro
9bdbb9c0e0 mesa/formatquery: expand NUM_SAMPLE_COUNTS OpenGL ES comment
For ES 3.0, the spec says that NUM_SAMPLE_COUNTS is always zero for some
formats, but in ES 3.1 it can be nonzero.

The current code is correctly checking exactly against version 3.0,
but the comment only mentions 3.0 spec. It is clearer mentioning both.

v2: better wording on the comment (Ian Romanick)

Acked-by: Eduardo Lima <elima@igalia.com>
Acked-by: Antia Puentes <apuentes@igalia.com>

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-03 07:38:25 +02:00
Dave Airlie
d10ae20b96 mesa/get: return correct value for layer provoking vertex.
This fixes:
GL45-CTS.geometry_shader.layered_rendering.layered_rendering

on Skylake.

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-03 12:33:34 +10:00
Plamena Manolova
0b67efaed2 egl: Account for default values of texture target and format
When validating attributes during surface creation we should account
for the default values of texture target and format (EGL_NO_TEXTURE)
since the user is not obligated to explicitly set both via the
attribute list passed to eglCreatePbufferSurface.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-06-02 16:07:31 -07:00
Samuel Pitoiset
28590eb949 nvc0: mark buffer texture range valid for shader images
Loosely based on radeonsi (Thanks to Nicolai).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
2016-06-03 00:12:23 +02:00
Mauro Rossi
278c2212ac isl: add support for Android libmesa_isl static library
isl library is needed to build i965, libmesa_isl static library is added
to fix related Android building errors.

Any attempt to build libmesa_genxml as phony package module failed to deliver
gen{7,75,8,9}_pack.h generated headers, needed for libmesa_isl_gen{7,75,8,9}

Due to constraints in Android Build System, libmesa_genxml is built as static,
at least one source is needed, so dummy.c is autogenerated for this scope,
libmesa_genxml dependency is declared using LOCAL_WHOLE_STATIC_LIBRARIES,
to avoid building errors due to missing genxml/gen{7,75,8,9}_pack.h headers.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-02 22:31:44 +01:00
Mauro Rossi
4143245c23 android: libmesa_glsl: add a dependency on libmesa_nir static
Fixes the following building error:

target  C++: libmesa_glsl <= external/mesa/src/compiler/glsl/glsl_to_nir.cpp
In file included from external/mesa/src/compiler/glsl/glsl_to_nir.h:28:0,
                 from external/mesa/src/compiler/glsl/glsl_to_nir.cpp:28:
external/mesa/src/compiler/nir/nir.h:42:25: fatal error: nir_opcodes.h: No such file or directory
compilation terminated.
build/core/binary.mk:432: recipe for target 'out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_glsl_intermediates/glsl/glsl_to_nir.o' failed
make: *** [out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_glsl_intermediates/glsl/glsl_to_nir.o] Error 1
make: *** Waiting for unfinished jobs....

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-02 22:31:00 +01:00
Emil Velikov
af1a0ae8ce isl: automake: don't include isl_format_layout.c in two lists.
Including the file in both ISL_FILES and ISL_GENERATED_FILES makes
the actual dependency list less obvious.

v2: Drop unrelated vulkan hunk (Jason).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-02 22:26:04 +01:00
Emil Velikov
af2637aa32 automake: bring back the .PHONY git_sha1.h.tmp rule
With earlier commit 3689ef32af ("automake: rework the git_sha1.h rule,
include in tarball") we/I erroneously removed the PHONY rule and the
temporary file.

The former is used to ensure that the header is regenerated on each
make invocation, while the latter helps us avoid the unneeded rebuild(s)
when the SHA1 hasn't changed.
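
The temporary-file pattern can be sketched in Python.  This is an
illustration of the idea, not the automake rule itself, and update_header is
a hypothetical name: the content is always regenerated, but the real header
is only replaced when the content differs, so its timestamp stays put and
dependents are not rebuilt.

```python
import os

def update_header(path, contents):
    """Write contents to path only if they differ; return True if replaced."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as tmp:
        tmp.write(contents)
    try:
        with open(path) as current:
            unchanged = current.read() == contents
    except FileNotFoundError:
        unchanged = False
    if unchanged:
        os.remove(tmp_path)  # keep the old file and its timestamp
        return False
    os.replace(tmp_path, path)  # content changed (or file was missing)
    return True
```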

Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-06-02 22:23:12 +01:00
Kenneth Graunke
f74a29188c i965: Add _NEW_POINT to a couple of comments.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-06-02 14:11:55 -07:00
Charmaine Lee
0cf0d7c02e svga: allow copy box in svga_transfer_dma_band()
Instead of only allowing a rectangle to be copied in svga_transfer_dma_band(),
this patch allows it to copy a box, and hence to copy a 3D texture
in one transfer.

Fixes black screen in running Heaven after commit fb9fe35. (Bug 1663282)

Tested with Heaven, glretrace, piglit.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-02 15:03:41 -06:00
Rob Clark
94d8fbd217 freedreno: fix bad bitshift warnings
Coverity doesn't realize idx will never be negative.  Throw in some
assert()s to help it out.

(Hopefully assert() isn't getting compiled out for coverity build.. but
there seems to be just one way to find out.  We might have to change
these to assume())

Fixes CID 1362442, 1362443

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 16:29:32 -04:00
Rob Clark
676c77a923 freedreno: assume builtin shaders do compile
Maybe we should switch to ureg to build the builtin shaders.  But at any
rate, if they fail to compile it is because someone messed them up (or
changed TGSI syntax?).

CID 1362444

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 16:29:32 -04:00
Francisco Jerez
060c8d245d i965/fs: Reindent emit_zip().
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-02 13:24:48 -07:00
Francisco Jerez
7aa76d66a1 i965/fs: Skip SIMD lowering destination zipping if possible.
Skipping the temporary allocation and copy instructions is easy (just
return dst), but the conditions used to find out whether the copy can
be optimized out safely without breaking the program are rather
complex: The destination must be exactly one component of at most the
execution width of the lowered instruction, and all source regions of
the instruction must be either fully disjoint from the destination or
be aligned with it group by group.

v2: Don't handle partial source-destination overlap for simplicity
    (Jason).  No instruction count regressions with respect to v1 in
    either shader-db or the few FP64 shader_runner test-cases with
    partial overlap I've checked manually.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-02 13:24:48 -07:00
Anuj Phogat
75da9c9933 blorp: Fix 16x multisample scaled blits
Piglit test ext_framebuffer_multisample_blit_scaled-blit-scaled
(with added 16x sample support) now passes with this patch.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-02 13:21:26 -07:00
Anuj Phogat
59c19b7687 meta: Fix indentation in shader code
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2016-06-02 13:21:26 -07:00
Dave Airlie
af7bf610cf mesa/copyimage: report INVALID_VALUE for missing cube face
The spec says INVALID_VALUE for exceeding dimensions,
which is really what is happening here.

This fixes:
GL45-CTS.copy_image.non_existent_mipmap

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-03 06:08:44 +10:00
Dave Airlie
c0856eacf1 mesa/copyimage: fix num samples check to handle renderbuffers.
This test was only happening for textures, but there is
nothing in the spec to say this, so test it for all cases.

This fixes:
GL45-CTS.copy_image.invalid_target

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-03 06:08:22 +10:00
Rob Clark
80c2886033 freedreno/a4xx: silence coverity warning
CID 1362451

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Rob Clark
9b854ce53c freedreno/a3xx+a4xx: fix potential null ptr deref
Coverity spotted the a3xx case (not sure why not the a4xx).

CID 1362452

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Rob Clark
27a97097e1 freedreno/ir3: fix coverity warning
CID 1362453

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Rob Clark
374ad2e2bd freedreno/ir3: use nir_shader_get_entrypoint() helper
Should also fix coverity warning: CID 1362454

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Rob Clark
df64cd6814 freedreno/a4xx: fix incorrect enum type
a4xx has its own enum, different from a2xx/a3xx.

Spotted by coverity: CID 1362458, 1362459

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Rob Clark
1632b0eac0 freedreno: fix coverity negative array index warning
This can never happen, since the query would not have been created in the
first place if pidx(query_type) returned a negative value.  Let's let
Coverity realize this.

CID 1362460

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Rob Clark
ba452d43e0 freedreno: fix dereference before null check
ptr can actually never be null so just drop the check.

CID 1362464 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking ptr suggests that it may be null,
but it has already been dereferenced on all paths leading to the check.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Rob Clark
228b2b36f4 gallium/util: remove u_staging
Unused, and fixes a couple of coverity warnings: CID 1362171, 1362170

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2016-06-02 15:44:07 -04:00
Rob Clark
18fb922faa freedreno/a3xx: only update/emit bordercolor state when needed
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Rob Clark
11f0652404 freedreno/a4xx: only update/emit bordercolor state when needed
I noticed in stk that it was contributing to a lot of overhead.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-02 15:44:07 -04:00
Matt Turner
0d81a684c1 i965: Add missing types to type_sz().
Coverity warns in multiple places about the potential for division by
zero, caused by this function's default case.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-06-02 11:34:09 -07:00
Nanley Chery
c06cef7f9b mesa/extensions: Fix ES1 extension reporting
Commit eda15abd84 unintentionally advertised these extensions in ES1
contexts. Undo this error.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-02 10:46:59 -07:00
Plamena Manolova
e8b38ca202 egl: Check if API is supported when using eglBindAPI.
According to the EGL specification, before binding an API
we must check whether it's supported. If not, eglBindAPI
should return EGL_FALSE and generate an EGL_BAD_PARAMETER error.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-02 07:45:19 -07:00
Eric Engestrom
17f4c723eb st/osmesa: remove double-write (overwriting)
These two lines have been here since the file was created.
I'm guessing the second one was just for testing during dev, so it's the
one that's going away.

CoverityID: 1296205

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-02 07:05:05 -06:00
Nayan Deshmukh
6c9a352d79 st/vdpau: check for null pointer in get/put bits.
Check for null pointer before accessing arrays in get/put bits
native/YCbCr/Indexed in VdpOutputSurface and VdpVideoSurface.

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-06-02 09:28:48 +02:00
Christian König
b3e75c3997 radeon/uvd: fix the H264 level for Tonga v2
We have supported 5.2 for a while now.

v2: we even support 5.2 for H264, 5.1 is for HEVC.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: <mesa-stable@lists.freedesktop.org>
2016-06-02 09:27:57 +02:00
Alejandro Piñeiro
b48c42cd1f mesa/formatquery: add a comment to clarify INTERNALFORMAT_PREFERRED
The comment clarifies that the driver is called only to try to get
a preferred internalformat, and that it was already checked if the
format is supported or not.

Acked-by: Eduardo Lima <elima@igalia.com>
Acked-by: Antia Puentes <apuentes@igalia.com>

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-02 08:54:17 +02:00
Alejandro Piñeiro
c1ceee6cc9 i965/formatquery: remove INTERNALFORMAT_PREFERRED implementation
Right now the implementation only checks if the internalformat is
supported or not. But that implementation is wrong, returning
unsupported for some internalformats. Additionally, checking if
the internalformat is supported or not is already done at mesa/main
before calling the driver hook, so this new check is not needed.

Acked-by: Eduardo Lima <elima@igalia.com>
Acked-by: Antia Puentes <apuentes@igalia.com>

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-02 08:54:10 +02:00
Alejandro Piñeiro
58617bcebe i965/eu: use simd8 when exec_size != EXECUTE_16
Among other things, fix a GPU hang when using INTEL_DEBUG=shader_time
for any shader.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-06-02 08:08:10 +02:00
Jordan Justen
0a3acff5b5 i965: Remove old CS local ID handling
The old method pushed each channel's uvec3 value of
gl_LocalInvocationID.

The new method pushes 1 dword of data that is a 'thread local ID'
value. Based on that value, we can generate gl_LocalInvocationIndex
and gl_LocalInvocationID with some calculations.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-01 19:29:02 -07:00
Jordan Justen
b1f22c6317 i965: Enable cross-thread constants and compact local IDs for hsw+
The cross thread constant support appears on Haswell. It allows us to
upload a set of uniform data for all threads without duplicating it
per thread.

One complication is that cross-thread constants are loaded into
registers before per-thread constants. Previously, our local IDs were
loaded before the uniform data and treated as 'payload' data, even
though they were actually pushed into the registers like the other
uniform data.

Therefore, in this patch we simultaneously enable a newer layout where
each thread now uses a single uniform slot for a unique local ID for
the thread. This uniform is handled specially to make sure it is added
last into the uniform push constant registers. This minimizes our
usage of push constant registers, and maximizes our ability to use
cross-thread constants for registers.
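
A back-of-the-envelope Python sketch of why placing the ID last helps.  The
numbers and the function are illustrative assumptions, not the driver's
actual accounting: it assumes 8 dwords per push constant register and that
only whole registers before the one holding the thread local ID can be
shared across threads.

```python
DWORDS_PER_REGISTER = 8  # assumption: 32-byte registers holding 8 dwords

def push_constant_registers(num_uniform_dwords, num_threads, cross_thread):
    """Estimate the push constant registers uploaded for one dispatch.

    Hypothetical model: the thread local ID takes one dword placed after all
    other uniforms.
    """
    total_dwords = num_uniform_dwords + 1
    if not cross_thread:
        # Everything is duplicated for every thread.
        per_thread_regs = (total_dwords + DWORDS_PER_REGISTER - 1) // DWORDS_PER_REGISTER
        return per_thread_regs * num_threads
    # Whole registers in front of the ID are uploaded once for all threads.
    cross_regs = num_uniform_dwords // DWORDS_PER_REGISTER
    per_thread_dwords = total_dwords - cross_regs * DWORDS_PER_REGISTER
    per_thread_regs = (per_thread_dwords + DWORDS_PER_REGISTER - 1) // DWORDS_PER_REGISTER
    return cross_regs + per_thread_regs * num_threads
```

With 20 uniform dwords and 8 threads, this model gives 24 registers when
everything is per-thread and 10 when the uniform part is shared.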

To swap from the old to the new layout, we also need to flip some
lowering pass switches to let our driver handle the lowering instead.
We also no longer force thread_local_id_index to -1.

v4:
 * Minimize size of patch that switches from the old local ID layout
   to the new layout (Jason)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-01 19:29:02 -07:00
Jordan Justen
3ba9594f32 anv: Support new local ID generation & cross-thread constants
The cross thread constant support appears on Haswell. It allows us to
upload a set of uniform data for all threads without duplicating it
per thread.

We also support per-thread data which allows us to store a per-thread
ID in one of the uniforms that can be used to calculate the
gl_LocalInvocationIndex and gl_LocalInvocationID variables.

v4:
 * Support the old local ID push constant layout as well (Jason)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-01 19:29:02 -07:00
Jordan Justen
30685392e0 i965: Support new local ID push constant & cross-thread constants
The cross thread constant support appears on Haswell. It allows us to
upload a set of uniform data for all threads without duplicating it
per thread.

We also support per-thread data which allows us to store a per-thread
ID in one of the uniforms that can be used to calculate the
gl_LocalInvocationIndex and gl_LocalInvocationID variables.

v4:
 * Support the old local ID push constant layout as well (Jason)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-01 19:29:02 -07:00
Jordan Justen
d437798ace i965: Add CS push constant info to brw_cs_prog_data
We need information about push constants in a few places for the GL
driver, and another couple places for the vulkan driver.

When we add support for uploading both a common (cross-thread) set of
push constants, combined with the previous per-thread push constant
data, things are going to get even more complicated. To simplify
things, we add push constant info into the cs prog_data struct.

The cross-thread constant support is added as of Haswell. To support
it we need to make sure all push constants with uniform values are
added to earlier registers. The register that varies per thread and
holds the thread invocation's unique local ID needs to be added last.

For now we add the code that would calculate cross-thread constant
information for hsw+, but we force it (cross_thread_supported) off
until the other parts of the driver support it.

v4:
 * Support older local ID push constant layout as well. (Jason)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-01 19:29:02 -07:00
Jordan Justen
1b79e7ebbd i965: Store number of threads in brw_cs_prog_data
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-01 19:29:02 -07:00
Jordan Justen
3ef0957dac i965: Add nir based intrinsic lowering and thread ID uniform
We add a lowering pass for nir intrinsics. This pass can replace nir
intrinsics with driver specific nir lower code.

We lower the gl_LocalInvocationIndex intrinsic based on a uniform
which is loaded with a thread specific ID.

We also lower the gl_LocalInvocationID based on
gl_LocalInvocationIndex.
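
The arithmetic behind both lowerings can be sketched in Python.  The function
names are hypothetical, and the sketch assumes the uniform holds the index of
the thread's first channel, so that adding the channel number yields
gl_LocalInvocationIndex.  The ID is then recovered by inverting the GLSL
definition index = z * size_x * size_y + y * size_x + x.

```python
def local_invocation_index(thread_local_id, channel):
    """Index of one channel, assuming the uniform is the thread's base index."""
    return thread_local_id + channel

def local_invocation_id(index, size_x, size_y):
    """Recover (x, y, z) from a flattened index and the workgroup size."""
    x = index % size_x
    y = (index // size_x) % size_y
    z = index // (size_x * size_y)
    return x, y, z
```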

v2:
 * Create variable during lowering pass. (Ken)

v3:
 * Don't create a variable, but instead just insert an intrinsic call
   to load a uniform from the allocated location. (Jason)

v4:
 * Don't run this pass if thread_local_id_index < 0

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-01 19:29:02 -07:00
Jordan Justen
04fc72501a i965: Put CS local thread ID uniform in last push register
This thread ID uniform will be used to compute the
gl_LocalInvocationIndex and gl_LocalInvocationID values.

It is important for this uniform to be added in the last push constant
register. fs_visitor::assign_constant_locations is updated to make
sure this happens.

The reason this is important is that the cross-thread push constant
registers are loaded first, and the per-thread push constant registers
are loaded after that. (Broadwell adds another push constant upload
mechanism which reverses this order, but we are ignoring this for
now.)

v2:
 * Add variable in intrinsics lowering pass
 * Make sure the ID is pushed last in assign_constant_locations, and
   that we save a spot for the ID in the push constants

v3:
 * Simplify code based on Jason's suggestions.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-01 19:29:02 -07:00
Jordan Justen
fa279dfbf0 i965: Add uniform for a CS thread local base ID
v4:
 * Force thread_local_id_index to -1 for now, and have
   fs_visitor::setup_cs_payload look at thread_local_id_index. This
   enables us to more easily cut over from the old local ID layout to
   the new layout, as suggested by Jason.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-01 19:29:02 -07:00
Jordan Justen
8f48d23e0f i965: Add nir channel_num system value
v2:
 * simd16/32 fixes (curro)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-01 19:29:02 -07:00
Jordan Justen
6f316c9d86 nir: Make lowering gl_LocalInvocationIndex optional
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-01 19:29:02 -07:00
Jordan Justen
7b9def3583 glsl: Add glsl LowerCsDerivedVariables option
v2:
 * Move lower flag to context constants. (Ken)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-01 19:29:02 -07:00
Jason Ekstrand
1205999c22 i965/fs: Copy the offset when lowering logical pull constant sends
This fixes 64 Vulkan CTS tests per gen

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96299
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-01 16:00:44 -07:00
Dave Airlie
8d4f4adfbd glsl/distance: make sure we use clip dist varying slot for lowered var.
When lowering, we always want to use the clip dist varying.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-02 07:09:21 +10:00
Nicolai Hähnle
c7877b9dab winsys/amdgpu: decay max_ib_size over time
So that memory use will eventually decrease again after a temporary peak.
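
A minimal Python sketch of such a decay.  The function name and the decay
rate (1/32 per submission) are assumptions chosen for illustration and may
not match the winsys code.

```python
def next_max_ib_size(max_ib_size, submitted_size, decay_shift=5):
    """Shrink the remembered maximum a little, unless the last IB was larger.

    decay_shift=5 (a 1/32 reduction per call) is an assumed rate.
    """
    decayed = max_ib_size - (max_ib_size >> decay_shift)
    return max(decayed, submitted_size)
```

After a single large submission the remembered size drifts back down over
subsequent calls instead of staying at the peak.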

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:20 +02:00
Nicolai Hähnle
6aff6377b1 winsys/amdgpu: implement IB chaining on the gfx ring
As a consequence, CE IB size never triggers a flush anymore.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:20 +02:00
Nicolai Hähnle
45be461f55 winsys/amdgpu: consolidate IB size management in amdgpu_ib_finalize
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:20 +02:00
Nicolai Hähnle
89ba076de4 radeon/winsys: introduce radeon_winsys_cs_chunk
We will chain multiple chunks together and will keep pointers to the older
chunks to support IB dumping.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:20 +02:00
Nicolai Hähnle
a7c26bfc0c radeonsi/sid: add packet definitions for IB chaining
While we're at it, add packet printing in si_debug.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:19 +02:00
Nicolai Hähnle
83a01cb498 winsys/amdgpu: start with smaller IBs, growing as necessary
This avoids allocating giant IBs from the outset, especially for CE and DMA.

Since we now limit max_dw only by the size that the buffer happens to be
(which, due to the buffer cache, can be even larger than the rounded-up size
we request), the new function amdgpu_ib_max_submit_dwords controls when we
submit an IB.

With this change, we effectively never flush prematurely due to the CE IB,
after an initial warm-up phase.

v2:
- clean up buffer_size calculation

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:19 +02:00
Nicolai Hähnle
f80c6abb9e winsys/amdgpu: add amdgpu_ib and amdgpu_cs_from_ib helper functions
The latter function allows getting the containing amdgpu_cs from any IB
(including non-main ones).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:19 +02:00
Nicolai Hähnle
9e5ed559ba winsys/amdgpu: extract IB big buffer allocation for re-use
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:19 +02:00
Nicolai Hähnle
9db851b5ee winsys/amdgpu: add IB buffer in amdgpu_get_new_ib
Adding the buffer when we start using it for the IB makes the logic for
chaining a bit simpler.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:19 +02:00
Nicolai Hähnle
d6211a61b0 gallium/radeon: use cs_check_space throughout
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:18 +02:00
Nicolai Hähnle
46ad3561be radeon/winsys: add cs_check_space
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:18 +02:00
Nicolai Hähnle
92d5d97b10 winsys/amdgpu: simplify interface of amdgpu_get_new_ib
We'll want to have an amdgpu_cs pointer for future changes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:18 +02:00
Nicolai Hähnle
8396ab4241 winsys/amdgpu: add amdgpu_cs_has_user_fence
v2: style change

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:52:18 +02:00
Kenneth Graunke
25e1b8d366 i965: Fix isoline reads in scalar TES.
Isolines aren't reversed.  commit 5b2d8c2273 fixed this for the vec4
TES backend, but not the scalar one.

Found while debugging GL45-CTS.tessellation_shader.
tessellation_control_to_tessellation_evaluation.gl_tessLevel.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2016-06-01 13:46:09 -07:00
Nicolai Hähnle
ed0e9862c5 st/mesa: implement PBO downloads for ReadPixels
v2: require PIPE_CAP_SAMPLER_VIEW_TARGET; technically only needed for some of
    the texture targets, but all hardware that has shader images should also
    have this cap.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:51 +02:00
Nicolai Hähnle
f3b62d4c74 st/mesa: hook up a no-op try_pbo_readpixels
For better bisectability, given that the order of some of the fallback tests
in the blit path is rearranged.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:48 +02:00
Nicolai Hähnle
1cb4be94ae st/mesa: add layer_offset to PBO fragment shader
This will be used to select a slice of a 3D texture.

v2: fix a comment (Marek)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:43 +02:00
Nicolai Hähnle
2bf6dfac8a st/mesa: create PBO download fragment shaders
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:40 +02:00
Nicolai Hähnle
852d3fcd3b st/mesa: add PBO download enable bit and fragment shaders
For downloads, the fragment shader must know the source texture target, hence
we may cache multiple fragment shaders.

v2: break long line (Marek)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:34 +02:00
Nicolai Hähnle
581c001532 st/mesa: move shareable parts of PBO upload state and draw to st_pbo.c
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:31 +02:00
Nicolai Hähnle
e16800226e st/mesa: move PBO buffer address calculation to st_pbo.c
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:28 +02:00
Nicolai Hähnle
21e069f7d4 st/mesa: move PBO upload fs creation to st_pbo.c
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:26 +02:00
Nicolai Hähnle
979688a027 st/mesa: rename pbo_upload to pbo
At the same time, rename members that are upload-specific to say so.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:23 +02:00
Nicolai Hähnle
be82065fbe st/mesa: move PBO vertex and geometry shader creation to st_pbo.c
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:20 +02:00
Nicolai Hähnle
4ecc32b0e1 st/mesa: begin moving PBO functions into their own file
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:18 +02:00
Nicolai Hähnle
d9893feb2c gallium/cso: allow saving the first fragment shader image slot
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:15 +02:00
Nicolai Hähnle
fc0352ff9c gallium/u_inlines: allow NULL src in util_copy_image_view
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:37:12 +02:00
Nicolai Hähnle
57f576f1fb gallium: add PIPE_BARRIER_ALL define
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-01 22:36:48 +02:00
Ian Romanick
a428c955ce glsl: Use Geom.VerticesOut == -1 to specify unset
Because apparently layout(max_vertices=0) is a thing.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-01 11:11:39 -07:00
Ian Romanick
b27dfa5403 i965: If control_data_header_size_bits is zero, don't do EndPrimitive
This can occur when max_vertices=0 is explicitly specified.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-01 11:11:39 -07:00
Ian Romanick
049bb94d2e mesa: Fix bogus strncmp
The string "[0]\0" is the same as "[0]" as far as the C string datatype
is concerned.  That string has length 3.  strncmp(s, length_3_string, 4)
is the same as strcmp(s, length_3_string), so make it be strcmp.

v2: Not the same as strncmp(..., 3).  Noticed by Ilia.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-01 11:11:25 -07:00
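
The string comparison described in the "Fix bogus strncmp" message above can be checked with a small standalone sketch. The helper names below are hypothetical and exist only for this illustration; they are not Mesa functions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helpers, not Mesa code: compare a string against "[0]". */

/* Old form: the literal "[0]\0" has strlen 3, and comparing 4 bytes
 * includes the terminating NUL, so only an exact "[0]" matches. */
static int matches_with_strncmp4(const char *s)
{
   assert(s != NULL);
   return strncmp(s, "[0]\0", 4) == 0;
}

/* Simplified form: behaves the same as the 4-byte comparison. */
static int matches_with_strcmp(const char *s)
{
   assert(s != NULL);
   return strcmp(s, "[0]") == 0;
}

/* A different test (the v2 note): comparing only 3 bytes is a prefix
 * match, so "[0].x" is accepted as well. */
static int matches_with_strncmp3(const char *s)
{
   assert(s != NULL);
   return strncmp(s, "[0]", 3) == 0;
}
```

For the input "[0].x", the first two helpers report no match while the third reports a match, which is the difference the v2 note points out.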
Marek Olšák
12740efd29 radeonsi: set correct stencil tile mode for texturing
Sadly, this doesn't affect SI and VI in any way.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-06-01 17:35:30 +02:00
Marek Olšák
ea68215c54 winsys/amdgpu: set flags correctly when allocating depth-stencil buffers
This mimics Vulkan. It also documents how to fix stencil texturing.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-06-01 17:35:30 +02:00
Marek Olšák
532a5af47f gallium/radeon: lower memory usage during texture transfers
This improves throughput by keeping TTM overhead down.

Some piglit tests such as texelFetch and streaming-texture-leak will
use less memory now.

v2: use gart_size / 4 as the threshold

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-01 17:35:30 +02:00
Marek Olšák
614e3c6272 gallium/radeon: invalidate busy linear textures for whole-texture uploads
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-01 17:35:30 +02:00
Marek Olšák
fc1479a954 gallium/radeon: degrade tiled textures mapped often to linear
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-01 17:35:30 +02:00
Marek Olšák
9927c8138a gallium/radeon: clean up and better comment use_staging_texture
Next commits will add other things around this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-01 17:35:30 +02:00
Marek Olšák
b033584299 radeonsi: set some colorbuffer register fields at emit time
to allow reallocating the texture storage with different parameters

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-01 17:35:30 +02:00
Marek Olšák
30b2b860b0 radeonsi: implement global resetting of texture descriptors
it will be used by texture reallocation

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-01 17:35:30 +02:00
Marek Olšák
28de7aec0c radeonsi: move code for setting one shader image into separate function
v2: fix set_shader_images(..., NULL). Found by Christoph Haag.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-01 17:35:30 +02:00
Marek Olšák
95c5bbae66 radeonsi: set some image descriptor fields at bind time
mainly the fields that can change by reallocating a texture and changing
the tile mode

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-01 17:35:30 +02:00
Marek Olšák
ef765d0789 gallium/radeon: strengthen some checking for DMA preparation
Just for consistency. This doesn't fix anything, because DCC is not
supported with non-mipmapped textures.

v1.1: fix the comment about DCC

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-01 17:35:30 +02:00
Marek Olšák
9d881cc0ac gallium/util: add util_texrange_covers_whole_level from radeon
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-01 17:35:30 +02:00
Ilia Mirkin
ca135a2612 nir: allow sat on all float destination types
With the introduction of fp64 and fp16 to nir, there are now a bunch of
float types running around. A F1 2015 shader ends up with an i2f.sat
operation, which has a nir_type_float32 destination. Allow sat on all
the float destination types.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-01 10:44:40 -04:00
Alex Deucher
bd85e4a041 radeonsi: fix the raster config setup for 1 RB iceland chips
I didn't realize there were 1 and 2 RB variants when this code
was originally added.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
2016-06-01 09:59:57 -04:00
Dave Airlie
6400144041 mesa/sampler: fix error codes for sampler parameters.
The initial ARB_sampler_objects spec had GL_INVALID_VALUE in it,
however version 8 of it fixed this, and the GL specs also have
the fixed value in them.

Fixes:
GL45-CTS.texture_border_clamp.samplerparameteri_non_gen_sampler_error

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-01 17:01:19 +10:00
Dave Airlie
0ebf4257a3 glsl: define some GLES3 constants in GLSL 4.1
The GLSL 4.1 spec adds:
gl_MaxVertexUniformVectors
gl_MaxFragmentUniformVectors
gl_MaxVaryingVectors

This fixes:
GL45-CTS.gtf31.GL3Tests.uniform_buffer_object.uniform_buffer_object_build_in_constants

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-01 17:01:13 +10:00
Topi Pohjolainen
6ca118d2f4 i965: Add norbc debug option
This INTEL_DEBUG option disables lossless compression (also known
as render buffer compression).

v2: (Matt) Use likely(!lossless_compression_disabled) instead of
           !likely(lossless_compression_disabled)
    (Grazvydas) Update docs/envvars.html

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-06-01 09:16:36 +03:00
Topi Pohjolainen
30e9e6bd07 i965/gen9: Configure rbc buffers as plain for non-rbc tex views
Fixes rendering in Shadow of Mordor with rbc. The application writes an
RGBA_UNORM texture, filling it with values that it later wants to treat
as SRGB_ALPHA.
The Intel driver enables lossless compression for the buffer at the time
of writing. However, the driver fails to make sure the buffer can be
sampled as something else later on, and unfortunately there is a
restriction in the hardware on using lossless compression for srgb
formats, which appears to extend to the sampling engine as well.
Requesting srgb to linear conversion on top of a compressed buffer
results in color values that are pretty much garbage.

Fortunately none of tracked benchmarks showed a regression with
this.

v2 (Matt): Add missing space

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-06-01 09:16:36 +03:00
Kenneth Graunke
a3dc99f3d4 i965: Fix the passthrough TCS for isolines.
We weren't setting up several of the uniform values for the patch
header, so we'd crash when uploading push constants.  We at least
need to initialize them to zero.  We also had the isoline parameters
reversed, so it would also render incorrectly (if it didn't crash).

Fixes a new Piglit test(*) (isoline-no-tcs), as well as crashes in
GL44-CTS.tessellation_shader.single.max_patch_vertices.

(*) https://lists.freedesktop.org/archives/piglit/2016-May/019866.html

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
2016-05-31 23:09:13 -07:00
Dave Airlie
ebb81cd683 i965/xfb: skip components in correct buffer.
The driver was adding the skip components but always for buffer 0.

This fixes:
GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_skip_multiple_buffers

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-01 15:53:00 +10:00
Dave Airlie
1fe7bbb911 glsl/linker: fix multiple streams transform feedback.
e2791b38b4
mesa/program_interface_query: fix transform feedback varyings.

caused a regression in
GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_multiple_streams
on radeonsi.

The problem was it was using the skip components varying to set
the stream id, when it should wait until a varying was written,
this just adds the varying checks in the right place.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-01 13:30:41 +10:00
Dave Airlie
e891f7cf55 mesa/bufferobj: use mapping range in BufferSubData.
According to the GL 4.5 spec:
An INVALID_OPERATION error is generated if any part of the specified
buffer range is mapped with MapBufferRange or MapBuffer (see section
6.3), unless it was mapped with MAP_PERSISTENT_BIT set in the
MapBufferRange access flags.

So we should use the path that checks whether the range is mapped.

This fixes:
GL45-CTS.buffer_storage.map_persistent_buffer_sub_data

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "12.0, 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-01 13:30:40 +10:00
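
The rule quoted in the mesa/bufferobj message above amounts to a range-overlap test with an exemption for persistent mappings. A minimal sketch follows, assuming a simplified, hypothetical `buffer_mapping` struct in place of Mesa's real buffer object state; the function name is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified buffer state; not Mesa's gl_buffer_object. */
struct buffer_mapping {
   bool mapped;        /* is a range currently mapped? */
   bool persistent;    /* was MAP_PERSISTENT_BIT set when mapping? */
   int64_t offset;     /* start of the mapped range, in bytes */
   int64_t length;     /* size of the mapped range, in bytes */
};

/* Returns true if a write to [offset, offset + size) touches a mapped,
 * non-persistent range and so should raise INVALID_OPERATION. */
static bool write_hits_mapped_range(const struct buffer_mapping *map,
                                    int64_t offset, int64_t size)
{
   assert(map != NULL && offset >= 0 && size >= 0);

   if (!map->mapped || map->persistent)
      return false;

   /* Two non-empty half-open ranges overlap when each one starts
    * before the other one ends. */
   return size > 0 && map->length > 0 &&
          offset < map->offset + map->length &&
          map->offset < offset + size;
}
```

Checking the whole buffer instead of the written range would reject writes that never touch the mapped bytes, which is the distinction the commit message draws.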
Ilia Mirkin
18d11c9989 nv50/ir: fix error finding free element in bitset in some situations
This really only hits for bitsets with a size of a multiple of 32. We
can end up with pos = -1 as a result of the ffs, which we in turn decide
is a valid position (since we fall through the loop and i == 1, we end
up adding 32 to it, so end up returning 31 again).

Up until recently this was largely unreachable, as the register file
sizes were all 63 or 255. However with the advent of compute shaders
which can restrict the number of registers, this can now happen.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-05-31 23:25:51 -04:00
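
The failure mode described in the nv50/ir bitset message above (a "no bit found" result of -1 being treated as a valid position) can be illustrated with a standalone sketch. The helpers below are hypothetical and are not the nv50/ir BitSet implementation; `lowest_set_bit` stands in for the usual `ffs(x) - 1` idiom.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Position of the lowest set bit, or -1 if the word is zero.
 * Same result as the "ffs(x) - 1" idiom, where ffs() returns 0 for 0. */
static int lowest_set_bit(uint32_t word)
{
   for (int bit = 0; bit < 32; bit++) {
      if (word & (UINT32_C(1) << bit))
         return bit;
   }
   return -1;
}

/* Hypothetical helper: find the first clear bit in a bitset of `size`
 * bits (a set bit means "in use"), or return -1 if every bit is in use. */
static int find_free_bit(const uint32_t *words, int size)
{
   assert(words != NULL && size >= 0);

   for (int i = 0; i < (size + 31) / 32; i++) {
      int pos = lowest_set_bit(~words[i]);
      if (pos < 0)
         continue; /* word is full: -1 must not be used as a position */
      pos += i * 32;
      return pos < size ? pos : -1;
   }
   return -1;
}
```

With a size that is a multiple of 32 and every word full, the loop ends without a candidate and the function returns -1 instead of turning -1 into a bogus position by adding a word offset.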
Ilia Mirkin
d873608bcf nv50/ir: print relevant file's bitset when showing RA info
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-31 23:25:50 -04:00
Timothy Arceri
98d40b4d11 Revert "glsl: fix xfb_offset unsized array validation"
This reverts commit aac90ba292.

The commit caused a regression in:
piglit.spec.glsl-1_50.compiler.gs-input-nonarray-named-block.geom

Also the CTS test it was meant to fix seems like it may be bogus.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-01 10:33:57 +10:00
Francisco Jerez
c1107cec44 i965/fs: Allow scalar source regions on SNB math instructions.
I haven't found any evidence that this isn't supported by the
hardware, in fact according to the SNB hardware spec:

 "The supported regioning modes for math instructions are align16,
  align1 with the following restrictions:
   - Scalar source is supported.
  [...]
   - Source and destination offset must be the same, except the case of
     scalar source."

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-31 15:57:41 -07:00
Francisco Jerez
06d8765bc0 i965/fs: Fix constant combining for instructions that cannot accept source mods.
This is the case for SNB math instructions so we need to be careful
and insert the literal value of the immediate into the table (rather
than its absolute value) if the instruction is unable to invert the
sign of the constant on the fly.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-31 15:57:41 -07:00
Francisco Jerez
303ec22ed6 i965/fs: Extend remove_duplicate_mrf_writes() to handle non-VGRF to MRF copies.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-31 15:57:41 -07:00
Francisco Jerez
4fe4f6e8a7 i965/fs: Fix compute_to_mrf() to coalesce VGRFs initialized by multiple single-GRF writes.
Which requires using a bitset instead of a boolean flag to keep track
of the GRFs we've seen a generating instruction for already.  The
search loop continues until all instructions initializing the value of
the source VGRF have been found, or it is determined that coalescing
is not possible.

Fixes a few piglit test cases on Gen4-6 which were regressed by
6956015aa5 due to the different (yet
perfectly valid) ordering in which copy instructions are emitted now
by the simd lowering pass, which had the side effect of causing this
optimization pass to start corrupting the program in cases where a
VGRF-to-MRF copy instruction would be eliminated but only the last
instruction writing to the source VGRF region would be rewritten to
point to the target MRF.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-31 15:57:41 -07:00
Francisco Jerez
1898673f58 i965/fs: Teach compute_to_mrf() about the COMPR4 address transformation.
This will be required to correctly transform the destination of 8-wide
instructions that write a single GRF of a VGRF to MRF copy marked
COMPR4.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-31 15:57:40 -07:00
Francisco Jerez
485fbaff03 i965/fs: Refactor compute_to_mrf() to split search and rewrite into separate loops.
This will allow compute_to_mrf to handle cases where the source of the
VGRF-to-MRF copy is initialized by more than one instruction.  In such
cases we cannot rewrite the destination of any of the generating
instructions until it's known whether the whole VGRF source region can
be coalesced into the destination MRF, which will imply continuing the
search until all generating instructions have been found or it has
been determined that the VGRF and MRF registers cannot be coalesced.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-31 15:57:40 -07:00
Francisco Jerez
4b0ec9f475 i965/fs: Fix compute-to-mrf VGRF region coverage condition.
Compute-to-mrf was checking whether the destination of scan_inst is
more than one component (making assumptions about the instruction data
type) in order to find out whether the result is being fully copied
into the MRF destination, which is rather inaccurate in cases where a
single-component instruction is only partially contained in the source
region, or when the execution size of the copy and scan_inst
instructions differ.  Instead check whether the destination region of
the instruction is really contained within the bounds of the source
region of the copy.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-31 15:57:40 -07:00
Francisco Jerez
bb61e24787 i965/fs: Simplify and improve accuracy of compute_to_mrf() by using regions_overlap().
Compute-to-mrf was being rather heavy-handed about checking whether
instruction source or destination regions interfere with the copy
instruction, which could conceivably lead to program miscompilation.
Fix it by using regions_overlap() instead of the open-coded and
dubiously correct overlap checks.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-31 15:56:54 -07:00
Francisco Jerez
88f380a2dd i965/fs: Teach regions_overlap() about COMPR4 MRF regions.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-31 15:22:04 -07:00
Dylan Baker
604010a7ed Don't use python 3
There are now no files that require python 3, so for now just remove
the python 3 dependency and use python 2. I think the right plan is to
just get all of the python ready for python 3, and then use whatever
python is available.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
2016-05-31 15:09:06 -07:00
Dylan Baker
ab31817fed genxml: change shebang to python 2
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
2016-05-31 15:09:06 -07:00
Dylan Baker
12c1a01c72 genxml: use the isalpha method rather than str.isalpha.
This fixes gen_pack_header to work on python 2, where name[0] is unicode
not str.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
2016-05-31 15:09:06 -07:00
Dylan Baker
a45a25418b genxml: require future imports for python2 compatibility.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
2016-05-31 15:09:06 -07:00
Dylan Baker
e5681e4d70 genxml: mark re strings as raw
This is a correctness issue.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
2016-05-31 15:09:06 -07:00
Dylan Baker
de2e9da2e9 genxml: Make classes descendants of object
This is the default in python3, but in python2 you get old style
classes. No one likes old-style classes.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
2016-05-31 15:09:06 -07:00
Dylan Baker
9f50e3572c genxml: mark gen_pack_header.py as encoded in utf-8
There is unicode in this file, and I'm actually surprised that the
python interpreter hasn't gotten grumpy.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
2016-05-31 15:09:06 -07:00
Bas Nieuwenhuizen
35818129a6 radeonsi: Decompress DCC textures in a render feedback loop.
By using a counter to quickly reject textures that are not
bound to a framebuffer, the performance impact when binding
sampler_views/images is not too large.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-31 21:43:04 +02:00
Bas Nieuwenhuizen
cbe3421f05 radeonsi: Add counter to check if a texture is bound to a framebuffer.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-31 21:43:00 +02:00
Rhys Kidd
8cb74dd4e6 vc4: Fix compiler warnings in fail_instr path of QIR validate pass
Introduced in 8e2d0843c0.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-05-31 10:56:02 -07:00
Emil Velikov
b8e1f59d62 anv: let anv_entrypoints_gen.py generate proper Wayland/Xcb guards
The generated sources should follow the example set by the vulkan
headers and our non-generated code. Namely: the code for all supported
platforms should be available, each one guarded by its respective
VK_USE_PLATFORM_*_KHR macro.

v2: Reword commit message.

Cc: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96285
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1 over IRC)
2016-05-31 18:41:28 +01:00
Brian Paul
6bea33008e svga: change enum pipe_resource_usage back to unsigned
This parameter is actually a bitmask of PIPE_TRANSFER_x flags.
Change it back to a simple unsigned type.  IIRC, some compilers
complain about masks of enum values.  Also, this make the function
signature match u_resource_vtbl::transfer_map() again.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-31 10:20:36 -06:00
Marek Olšák
7ca55d2da8 radeonsi: fix CP DMA hazard with index buffer fetches
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-31 16:59:32 +02:00
Marek Olšák
d427110882 r600g: do GL-compliant integer resolves
The GL spec has been clarified and the new rule says we should just
copy 1 sample. u_blitter does the right thing.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-31 16:48:55 +02:00
Marek Olšák
d5882bb0df radeonsi: do GL-compliant integer resolves
The GL spec has been clarified and the new rule says we should just
copy 1 sample. u_blitter does the right thing.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-31 16:48:54 +02:00
Marek Olšák
921ab0028e gallium/u_blitter: do GL-compliant integer resolves
The GL spec has been clarified and the new rule says we should just
copy 1 sample.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-31 16:48:53 +02:00
Marek Olšák
8a10192b4b mesa: fix crash in driver_RenderTexture_is_safe
This just fixes the crash seen with the apitrace in the bug report.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95246

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-31 16:43:34 +02:00
Marek Olšák
fc4896e686 radeonsi: don't flush TC at the end of IBs on DRM >= 3.2.0
It's not needed since it was fixed in the kernel.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-31 16:41:22 +02:00
Jakob Sinclair
877c00c653 gallium/radeon: fixed division by zero
Coverity is getting a false positive that a division by zero can occur
here. This change will silence the Coverity warnings as a division by zero
cannot occur in this case.

Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-05-31 12:51:20 +02:00
Eric Engestrom
35fd5282ea st/glsl_to_tgsi: prevent infinite loop
`unsigned j` would never fail `j >= 0`, leading to an infinite loop as
`j--` wraps around.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-05-31 11:46:30 +02:00
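
The wrap-around described in the st/glsl_to_tgsi message above is a general property of unsigned counters in C. The sketch below shows one common way to iterate downwards with an unsigned index; it uses a hypothetical helper and is not the code changed by the commit, whose actual fix may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the index of the last element equal to
 * `key`, or `count` if there is none. */
static size_t find_last(const int *values, size_t count, int key)
{
   assert(values != NULL || count == 0);

   /* Broken pattern: for (unsigned j = count - 1; j >= 0; j--) never
    * terminates, because an unsigned j is always >= 0 and wraps around.
    *
    * Testing before decrementing visits count-1 down to 0, then stops. */
   for (size_t j = count; j-- > 0; ) {
      if (values[j] == key)
         return j;
   }
   return count;
}
```

Switching the counter to a signed type is another option when the element count is known to fit in it.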
Dave Airlie
f87352d769 glsl/images: bounds check image unit assignment
The CTS test:
GL45-CTS.multi_bind.dispatch_bind_image_textures
binds 192 image uniforms, we reject this later,
but not until after we trash the contents of the
struct gl_shader.

Error now reads:
Too many compute shader image uniforms (192 > 16)
instead of
Too many compute shader image uniforms (2745344416 > 16)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-31 10:41:44 +10:00
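
The glsl/images message above describes writing past a fixed-size array before the limit check runs. A minimal sketch of the bounds-checked pattern follows, assuming a hypothetical `shader_images` struct and a made-up limit of 16; neither is taken from Mesa's struct gl_shader.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IMAGE_UNIFORMS 16 /* hypothetical per-stage limit */

/* Hypothetical, simplified shader state; not Mesa's struct gl_shader. */
struct shader_images {
   unsigned num_images;                       /* how many were requested */
   unsigned image_units[MAX_IMAGE_UNIFORMS];  /* unit assigned to each */
};

/* Record one image uniform. The count always advances so a later limit
 * check can report the real total, but the array is only written while
 * the index is in bounds. */
static void assign_image_unit(struct shader_images *sh, unsigned unit)
{
   assert(sh != NULL);

   if (sh->num_images < MAX_IMAGE_UNIFORMS)
      sh->image_units[sh->num_images] = unit;
   sh->num_images++;
}

/* Returns 1 if the shader stays within the limit, 0 otherwise. */
static int images_within_limit(const struct shader_images *sh)
{
   assert(sh != NULL);
   return sh->num_images <= MAX_IMAGE_UNIFORMS;
}
```

Because out-of-range writes are skipped, the count stays intact and a later error message can report the requested total (192 in the commit's example) rather than a corrupted value.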
Ilia Mirkin
4b1a167a2b nvc0/ir: fix spilling predicates to registers
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
2016-05-30 18:15:14 -04:00
Ilia Mirkin
1f895caba0 nvc0/ir: limit max number of regs based on availability in SM
This effectively limits registers to 32 and 64 for fermi and kepler when
1024 threads are used, but allows the full amount to be used with
smaller thread sizes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-05-30 18:15:10 -04:00
Ilia Mirkin
27a51ff9b4 nv50/ir: record number of threads in a compute shader
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-05-30 18:14:55 -04:00
Pierre Moreau
ae70879530 nv50/ir: Add missing handling of U64/S64 in inlines
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-30 16:12:12 -04:00
Emil Velikov
9074470d7b docs: rename release notes to 12.0.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 7ad2cb6f08)
2016-05-30 20:33:30 +01:00
Ilia Mirkin
68d135011b docs: move nvc0 out of individual lines of GL 4.2, 4.3, ES 3.1
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-30 15:18:32 -04:00
Emil Velikov
888cf6eea2 docs: add 12.1.0-devel release notes template, bump version
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 20:03:19 +01:00
Marek Olšák
4291229488 docs/GL3: mark radeonsi as all done up to GL 4.3 and GLES 3.1
2016-05-30 20:48:51 +02:00
Emil Velikov
922b471777 nir: add the SConscript.nir to the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 19:19:01 +01:00
579 changed files with 8721 additions and 11652 deletions

@@ -1,7 +1,6 @@
language: c
sudo: true
dist: trusty
sudo: false
cache:
directories:
@@ -16,11 +15,7 @@ addons:
- libexpat1-dev
- libxcb-dri2-0-dev
- libx11-xcb-dev
- llvm-3.5-dev
# llvm-config is not in the dev package?
- llvm-3.5
# LLVM packaging is broken and misses this dep.
- libedit-dev
- llvm-3.4-dev
- scons
env:
@@ -46,16 +41,6 @@ install:
- export PATH="/usr/lib/ccache:$PATH"
- pip install --user mako
# Since libdrm gets updated in configure.ac regularly, try to pick up the
# latest version from there.
- for line in `grep "^LIBDRM_.*_REQUIRED=" configure.ac`; do
old_ver=`echo $LIBDRM_VERSION | sed 's/libdrm-//'`;
new_ver=`echo $line | sed 's/.*REQUIRED=//'`;
if `echo "$old_ver,$new_ver" | tr ',' '\n' | sort -Vc 2> /dev/null`; then
export LIBDRM_VERSION="libdrm-$new_ver";
fi;
done
# Install dependencies where we require specific versions (or where
# disallowed by Travis CI's package whitelisting).
@@ -93,19 +78,22 @@ install:
- wget http://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2
- tar -jxvf $LIBDRM_VERSION.tar.bz2
- (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix --enable-vc4 && make install)
- (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2
- tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2
- (cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)
# Disabled LLVM (and therefore r300 and r600) because the build fails
# with "undefined reference to `clock_gettime'" and "undefined
# reference to `setupterm'" in llvmpipe.
script:
- if test "x$BUILD" = xmake; then
./autogen.sh --enable-debug
--disable-gallium-llvm
--with-egl-platforms=x11,drm
--with-dri-drivers=i915,i965,radeon,r200,swrast,nouveau
--with-gallium-drivers=svga,swrast,vc4,virgl,r300,r600
--disable-llvm-shared-libs
--with-gallium-drivers=svga,swrast,vc4,virgl
;
make && make check;
elif test x$BUILD = xscons; then

@@ -34,6 +34,10 @@ MESA_VERSION := $(shell cat $(MESA_TOP)/VERSION)
LOCAL_CFLAGS += \
-Wno-unused-parameter \
-Wno-date-time \
-Wno-pointer-arith \
-Wno-missing-field-initializers \
-Wno-initializer-overrides \
-Wno-mismatched-tags \
-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \
-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\" \
-DANDROID_VERSION=0x0$(MESA_ANDROID_MAJOR_VERSION)0$(MESA_ANDROID_MINOR_VERSION)
@@ -78,6 +82,12 @@ LOCAL_CFLAGS += \
-D__STDC_LIMIT_MACROS
endif
# add libdrm if there are hardware drivers
ifneq ($(filter-out swrast,$(MESA_GPU_DRIVERS)),)
LOCAL_CFLAGS += -DHAVE_LIBDRM
LOCAL_SHARED_LIBRARIES += libdrm
endif
LOCAL_CPPFLAGS += \
$(if $(filter true,$(MESA_LOLLIPOP_BUILD)),-D_USING_LIBCXX) \
-Wno-error=non-virtual-dtor \

@@ -40,7 +40,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \
--enable-vdpau \
--enable-xa \
--enable-xvmc \
--enable-llvm-shared-libs \
--disable-llvm-shared-libs \
--with-egl-platforms=x11,wayland,drm,surfaceless \
--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \
--with-gallium-drivers=i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,swr \
@@ -62,7 +62,6 @@ noinst_HEADERS = \
include/c99_math.h \
include/c11 \
include/D3D9 \
include/GL/wglext.h \
include/HaikuGL \
include/no_extern_c.h \
include/pci_ids

@@ -1 +1 @@
12.0.6
12.1.0-devel

@@ -37,8 +37,6 @@ cache:
- win_flex_bison-2.4.5.zip
- llvm-3.3.1-msvc2013-mtd.7z
os: Visual Studio 2013
environment:
WINFLEXBISON_ARCHIVE: win_flex_bison-2.4.5.zip
LLVM_ARCHIVE: llvm-3.3.1-msvc2013-mtd.7z
@@ -49,13 +47,11 @@ install:
- python -m pip --version
# Install Mako
- python -m pip install --egg Mako
# Install pywin32 extensions, needed by SCons
- python -m pip install pypiwin32
# Install SCons
- python -m pip install --egg scons==2.4.1
- scons --version
# Install flex/bison
- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "https://downloads.sourceforge.net/project/winflexbison/old_versions/%WINFLEXBISON_ARCHIVE%"
- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "http://downloads.sourceforge.net/project/winflexbison/%WINFLEXBISON_ARCHIVE%"
- 7z x -y -owinflexbison\ "%WINFLEXBISON_ARCHIVE%" > nul
- set Path=%CD%\winflexbison;%Path%
- win_flex --version

@@ -1,28 +0,0 @@
# The offending commit that this patch (part) reverts isn't in 12.0
be32a2132785fbc119f17e62070e007ee7d17af7 i965/compiler: Bring back the INTEL_PRECISE_TRIG environment variable
# The patch depends on the batch_cache work at least.
89f00f749fda4c1beca38f362c7f86bdc6e32785 a4xx: make sure to actually clamp depth as requested
# The patch depends on the 'generic' interoplation and location
# implementation introduced with 2d6dd30a9b30
114874b22beafb2d07006b197c62d717fc7f80cc i965/fs: Use sample interpolation for interpolateAtCentroid in persample mode
# VAAPI encode landed after the branch point.
a5993022275c20061ac025d9adc26c5f9d02afee st/va Avoid VBR bitrate calculation overflow v2
# EGL_KHR_debug landed after the branch point.
17084b6f9340f798111e53e08f5d35c7630cee48 egl: Fix missing unlock in eglGetSyncAttribKHR
# Depends on update_renderbuffer_read_surfaces at least
f2b9b0c730e345bcffa9eadabb25af3ab02642f2 i965: Add missing BRW_NEW_FS_PROG_DATA to render target reads.
# The commit in question hasn't landed in branch
1ef787339774bc7f1cc9c1615722f944005e070c Revert "egl/android: Set EGL_MAX_PBUFFER_WIDTH and EGL_MAX_PBUFFER_HEIGHT"
# Patches depend on the fence_finish() gallium API change and corresponding driver work
f240ad98bc05281ea7013d91973cb5f932ae9434 st/mesa: unduplicate st_check_sync code
b687f766fddb7b39479cd9ee0427984029ea3559 st/mesa: allow multiple concurrent waiters in ClientWaitSync
# Commit was reverted shortly after it landed in master
a39ad185932eab4f25a0cb2b112c10d8700ef242 configure.ac: honour LLVM_LIBDIR when linking against LLVM

@@ -40,7 +40,7 @@ else
for i in $urls
do
id=$(echo $i | cut -d'=' -f2)
summary=$(wget --quiet -O - $i | grep -e '<title>.*</title>' | sed -e 's/ *<title>[0-9]\+ &ndash; \(.*\)<\/title>/\1/')
summary=$(wget --quiet -O - $i | grep -e '<title>.*</title>' | sed -e 's/ *<title>Bug [0-9]\+ &ndash; \(.*\)<\/title>/\1/')
echo "<li><a href=\"$i\">Bug $id</a> - $summary</li>"
echo ""
done

@@ -14,7 +14,7 @@ git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*12\.0.*mesa-stable\)' HEAD..origin/master |\
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*mesa-stable\)' HEAD..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.
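
Note on the hunk above: the two `--grep` patterns differ in whether a `CC:` line must also name the `12.0` branch. A rough sketch of the difference, using plain `grep -i` as a stand-in for `git log -i --grep` (an assumption; the two are not guaranteed to behave identically) and two sample trailer lines in the style of the commit messages listed on this page. GNU grep is assumed for `\|`:

```shell
# Patterns copied from the two variants of the git log line.
versioned='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*12\.0.*mesa-stable\)'
unversioned='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*mesa-stable\)'

# Sample trailer lines: one tagged for 12.0, one tagged for stable in general.
tagged='Cc: "12.0" <mesa-stable@lists.freedesktop.org>'
untagged='Cc: <mesa-stable@lists.freedesktop.org>'

# Hypothetical helper: count how many of the two sample lines a pattern selects.
count_matches() {
    printf '%s\n%s\n' "$tagged" "$untagged" | grep -c -i -e "$1"
}

versioned_hits=$(count_matches "$versioned")
unversioned_hits=$(count_matches "$unversioned")

echo "versioned pattern matches:   $versioned_hits"
echo "unversioned pattern matches: $unversioned_hits"
```

The versioned pattern selects only nominations that mention `12.0`, while the unversioned one selects any `mesa-stable` nomination.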

View File

@@ -1,39 +0,0 @@
#!/bin/sh
# Script for generating a list of candidates which have typos in the nomination line
#
# Usage examples:
#
# $ bin/get-typod-pick-list.sh
# $ bin/get-typod-pick-list.sh > picklist
# $ bin/get-typod-pick-list.sh | tee picklist
# NB:
# This script intentionally _never_ checks for a specific version tag.
# Should we consider folding it into the original get-pick-list.sh?
# Grep for commits with "cherry picked from commit" in the commit message.
git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^CC:.*mesa-dev' HEAD..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.
if [ -f bin/.cherry-ignore ] ; then
if grep -q ^$sha bin/.cherry-ignore ; then
continue
fi
fi
# Check to see if it has already been picked over.
if grep -q ^$sha already_picked ; then
continue
fi
git log -n1 --pretty=oneline $sha | cat
done
rm -f already_picked

View File

@@ -225,7 +225,6 @@ AX_GCC_FUNC_ATTRIBUTE([packed])
AX_GCC_FUNC_ATTRIBUTE([pure])
AX_GCC_FUNC_ATTRIBUTE([returns_nonnull])
AX_GCC_FUNC_ATTRIBUTE([unused])
AX_GCC_FUNC_ATTRIBUTE([visibility])
AX_GCC_FUNC_ATTRIBUTE([warn_unused_result])
AX_GCC_FUNC_ATTRIBUTE([weak])
@@ -784,7 +783,6 @@ if test "x$enable_asm" = xyes; then
esac
fi
AC_HEADER_MAJOR
AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
@@ -1062,7 +1060,6 @@ xno)
;;
esac
AM_CONDITIONAL(HAVE_GLX, test "x$enable_glx" != xno)
AM_CONDITIONAL(HAVE_DRI_GLX, test "x$enable_glx" = xdri)
AM_CONDITIONAL(HAVE_XLIB_GLX, test "x$enable_glx" = xxlib)
AM_CONDITIONAL(HAVE_GALLIUM_XLIB_GLX, test "x$enable_glx" = xgallium-xlib)
@@ -1641,9 +1638,9 @@ esac
AC_ARG_WITH([vulkan-icddir],
[AS_HELP_STRING([--with-vulkan-icddir=DIR],
[directory for the Vulkan driver icd files @<:@${datarootdir}/vulkan/icd.d@:>@])],
[directory for the Vulkan driver icd files @<:@${sysconfdir}/vulkan/icd.d@:>@])],
[VULKAN_ICD_INSTALL_DIR="$withval"],
[VULKAN_ICD_INSTALL_DIR='${datarootdir}/vulkan/icd.d'])
[VULKAN_ICD_INSTALL_DIR='${sysconfdir}/vulkan/icd.d'])
AC_SUBST([VULKAN_ICD_INSTALL_DIR])
if test -n "$with_vulkan_drivers"; then
@@ -1999,8 +1996,8 @@ if test "x$with_egl_platforms" != "x" -a "x$enable_egl" != xyes; then
AC_MSG_ERROR([cannot build egl state tracker without EGL library])
fi
PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland-scanner],
WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland-scanner`,
PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland_scanner],
WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland_scanner`,
WAYLAND_SCANNER='')
if test "x$WAYLAND_SCANNER" = x; then
AC_PATH_PROG([WAYLAND_SCANNER], [wayland-scanner])
@@ -2184,10 +2181,6 @@ if test "x$enable_gallium_llvm" = xyes; then
LLVM_COMPONENTS="engine bitwriter mcjit mcdisassembler"
if $LLVM_CONFIG --components | grep -q inteljitevents ; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} inteljitevents"
fi
if test "x$enable_opencl" = xyes; then
llvm_check_version_for "3" "5" "0" "opencl"
@@ -2337,45 +2330,6 @@ swr_llvm_check() {
fi
}
swr_require_cxx_feature_flags() {
feature_name="$1"
preprocessor_test="$2"
option_list="$3"
output_var="$4"
AC_MSG_CHECKING([whether $CXX supports $feature_name])
AC_LANG_PUSH([C++])
save_CXXFLAGS="$CXXFLAGS"
save_IFS="$IFS"
IFS=","
found=0
for opts in $option_list
do
unset IFS
CXXFLAGS="$opts $save_CXXFLAGS"
AC_COMPILE_IFELSE(
[AC_LANG_PROGRAM(
[ #if !($preprocessor_test)
#error
#endif
])],
[found=1; break],
[])
IFS=","
done
IFS="$save_IFS"
CXXFLAGS="$save_CXXFLAGS"
AC_LANG_POP([C++])
if test $found -eq 1; then
AC_MSG_RESULT([$opts])
eval "$output_var=\$opts"
return 0
fi
AC_MSG_RESULT([no])
AC_MSG_ERROR([swr requires $feature_name support])
return 1
}
dnl Duplicates in GALLIUM_DRIVERS_DIRS are removed by sorting it after this block
if test -n "$with_gallium_drivers"; then
gallium_drivers=`IFS=', '; echo $with_gallium_drivers`
@@ -2445,20 +2399,29 @@ if test -n "$with_gallium_drivers"; then
xswr)
swr_llvm_check "swr"
swr_require_cxx_feature_flags "C++11" "__cplusplus >= 201103L" \
",-std=c++11" \
SWR_CXX11_CXXFLAGS
AC_SUBST([SWR_CXX11_CXXFLAGS])
AC_MSG_CHECKING([whether $CXX supports c++11/AVX/AVX2])
AVX_CXXFLAGS="-march=core-avx-i"
AVX2_CXXFLAGS="-march=core-avx2"
swr_require_cxx_feature_flags "AVX" "defined(__AVX__)" \
",-mavx,-march=core-avx" \
SWR_AVX_CXXFLAGS
AC_SUBST([SWR_AVX_CXXFLAGS])
AC_LANG_PUSH([C++])
save_CXXFLAGS="$CXXFLAGS"
CXXFLAGS="-std=c++11 $CXXFLAGS"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM()],[],
[AC_MSG_ERROR([c++11 compiler support not detected])])
CXXFLAGS="$save_CXXFLAGS"
swr_require_cxx_feature_flags "AVX2" "defined(__AVX2__)" \
",-mavx2 -mfma -mbmi2 -mf16c,-march=core-avx2" \
SWR_AVX2_CXXFLAGS
AC_SUBST([SWR_AVX2_CXXFLAGS])
save_CXXFLAGS="$CXXFLAGS"
CXXFLAGS="$AVX_CXXFLAGS $CXXFLAGS"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM()],[],
[AC_MSG_ERROR([AVX compiler support not detected])])
CXXFLAGS="$save_CXXFLAGS"
save_CFLAGS="$CXXFLAGS"
CXXFLAGS="$AVX2_CXXFLAGS $CXXFLAGS"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM()],[],
[AC_MSG_ERROR([AVX2 compiler support not detected])])
CXXFLAGS="$save_CXXFLAGS"
AC_LANG_POP([C++])
HAVE_GALLIUM_SWR=yes
;;
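
Note on the hunk above: `swr_require_cxx_feature_flags` walks a comma-separated list of candidate flag sets (the leading comma yields an empty first candidate, i.e. "no extra flags") and keeps the first set for which the preprocessor test passes. The loop can be sketched in plain POSIX sh, outside Autoconf, as follows. `first_working_flags` and `cxx_has_avx` are hypothetical names, not part of configure.ac, and the probe assumes a GCC- or Clang-style driver that accepts `-x c++ -fsyntax-only`:

```shell
# Hypothetical helper: print the first comma-separated flag set accepted by
# the probe command named in $2. The body runs in a subshell, so the IFS
# change does not leak into the caller.
first_working_flags() (
    option_list="$1"    # e.g. ",-mavx,-march=core-avx"
    probe="$2"          # function or command called with one candidate
    IFS=","
    for opts in $option_list; do
        unset IFS       # the list is already split; restore default splitting
        if "$probe" "$opts"; then
            printf '%s\n' "$opts"
            exit 0
        fi
    done
    exit 1
)

# Hypothetical probe: succeeds when the compiler, given the candidate flags,
# defines __AVX__. $1 is deliberately unquoted so that a multi-flag candidate
# such as "-mavx2 -mfma" is split into separate arguments.
cxx_has_avx() {
    printf '#if !(defined(__AVX__))\n#error\n#endif\n' |
        ${CXX:-c++} $1 -x c++ -fsyntax-only - >/dev/null 2>&1
}

# Example call (commented out because it needs a C++ compiler):
# SWR_AVX_CXXFLAGS=$(first_working_flags ",-mavx,-march=core-avx" cxx_has_avx)
```

Trying the empty candidate first means no flag is added when the compiler already enables the feature by default, which appears to be the intent of the leading comma in the option lists.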
@@ -2596,8 +2559,6 @@ fi
AM_CONDITIONAL(HAVE_LIBDRM, test "x$have_libdrm" = xyes)
AM_CONDITIONAL(HAVE_OSMESA, test "x$enable_osmesa" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_OSMESA, test "x$enable_gallium_osmesa" = xyes)
AM_CONDITIONAL(HAVE_COMMON_OSMESA, test "x$enable_osmesa" = xyes -o \
"x$enable_gallium_osmesa" = xyes)
AM_CONDITIONAL(HAVE_X86_ASM, test "x$asm_arch" = xx86 -o "x$asm_arch" = xx86_64)
AM_CONDITIONAL(HAVE_X86_64_ASM, test "x$asm_arch" = xx86_64)
@@ -2629,8 +2590,6 @@ AC_SUBST([XA_MINOR], $XA_MINOR)
AC_SUBST([XA_TINY], $XA_TINY)
AC_SUBST([XA_VERSION], "$XA_MAJOR.$XA_MINOR.$XA_TINY")
AC_SUBST([TIMESTAMP_CMD], '`test $(SOURCE_DATE_EPOCH) && echo $(SOURCE_DATE_EPOCH) || date +%s`')
AC_ARG_ENABLE(valgrind,
[AS_HELP_STRING([--enable-valgrind],
[Build mesa with valgrind support (default: auto)])],

View File

@@ -146,45 +146,45 @@ GL 4.1, GLSL 4.10 --- all DONE: nvc0, r600, radeonsi
GL_ARB_viewport_array DONE (i965, nv50, llvmpipe, softpipe)
GL 4.2, GLSL 4.20 -- all DONE: radeonsi
GL 4.2, GLSL 4.20 -- all DONE: nvc0, radeonsi
GL_ARB_texture_compression_bptc DONE (i965, nvc0, r600, radeonsi)
GL_ARB_texture_compression_bptc DONE (i965, r600)
GL_ARB_compressed_texture_pixel_storage DONE (all drivers)
GL_ARB_shader_atomic_counters DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_shader_atomic_counters DONE (i965, softpipe)
GL_ARB_texture_storage DONE (all drivers)
GL_ARB_transform_feedback_instanced DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_base_instance DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_shader_image_load_store DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_transform_feedback_instanced DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_base_instance DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_shader_image_load_store DONE (i965, softpipe)
GL_ARB_conservative_depth DONE (all drivers that support GLSL 1.30)
GL_ARB_shading_language_420pack DONE (all drivers that support GLSL 1.30)
GL_ARB_shading_language_packing DONE (all drivers)
GL_ARB_internalformat_query DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_internalformat_query DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_map_buffer_alignment DONE (all drivers)
GL 4.3, GLSL 4.30:
GL 4.3, GLSL 4.30 -- all DONE: nvc0, radeonsi
GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30)
GL_ARB_ES3_compatibility DONE (all drivers that support GLSL 3.30)
GL_ARB_clear_buffer_object DONE (all drivers)
GL_ARB_compute_shader DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_copy_image DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_compute_shader DONE (i965, softpipe)
GL_ARB_copy_image DONE (i965, nv50, r600, softpipe, llvmpipe)
GL_KHR_debug DONE (all drivers)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_fragment_layer_viewport DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe)
GL_ARB_framebuffer_no_attachments DONE (i965, nvc0, r600, radeonsi, softpipe)
GL_ARB_fragment_layer_viewport DONE (i965, nv50, r600, llvmpipe)
GL_ARB_framebuffer_no_attachments DONE (i965, r600, softpipe)
GL_ARB_internalformat_query2 DONE (all drivers)
GL_ARB_invalidate_subdata DONE (all drivers)
GL_ARB_multi_draw_indirect DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_multi_draw_indirect DONE (i965, r600, llvmpipe, softpipe, swr)
GL_ARB_program_interface_query DONE (all drivers)
GL_ARB_robust_buffer_access_behavior DONE (i965, nvc0, radeonsi)
GL_ARB_shader_image_size DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_shader_storage_buffer_object DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_stencil_texturing DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_texture_buffer_range DONE (nv50, nvc0, i965, r600, radeonsi, llvmpipe)
GL_ARB_robust_buffer_access_behavior DONE (i965)
GL_ARB_shader_image_size DONE (i965, softpipe)
GL_ARB_shader_storage_buffer_object DONE (i965, softpipe)
GL_ARB_stencil_texturing DONE (i965/gen8+, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_texture_buffer_range DONE (nv50, i965, r600, llvmpipe)
GL_ARB_texture_query_levels DONE (all drivers that support GLSL 1.30)
GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample)
GL_ARB_texture_view DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_texture_view DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_vertex_attrib_binding DONE (all drivers)
@@ -211,7 +211,7 @@ GL 4.5, GLSL 4.50:
GL_ARB_ES3_1_compatibility DONE (nvc0, radeonsi)
GL_ARB_clip_control DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_conditional_render_inverted DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_cull_distance DONE (i965, nv50, nvc0, llvmpipe, softpipe)
GL_ARB_cull_distance DONE (i965, nv50, nvc0, llvmpipe, softpipe, swr)
GL_ARB_derivative_control DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_direct_state_access DONE (all drivers)
GL_ARB_get_texture_sub_image DONE (all drivers)
@@ -222,32 +222,32 @@ GL 4.5, GLSL 4.50:
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)
These are the extensions cherry-picked to make GLES 3.1
GLES3.1, GLSL ES 3.1
GLES3.1, GLSL ES 3.1 -- all DONE: nvc0, radeonsi
GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30)
GL_ARB_compute_shader DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_draw_indirect DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_compute_shader DONE (i965, softpipe)
GL_ARB_draw_indirect DONE (i965, r600, llvmpipe, softpipe, swr)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_framebuffer_no_attachments DONE (i965, nvc0, r600, radeonsi, softpipe)
GL_ARB_framebuffer_no_attachments DONE (i965, r600, softpipe)
GL_ARB_program_interface_query DONE (all drivers)
GL_ARB_shader_atomic_counters DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_shader_image_load_store DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_shader_image_size DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_shader_storage_buffer_object DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_shader_atomic_counters DONE (i965, softpipe)
GL_ARB_shader_image_load_store DONE (i965, softpipe)
GL_ARB_shader_image_size DONE (i965, softpipe)
GL_ARB_shader_storage_buffer_object DONE (i965, softpipe)
GL_ARB_shading_language_packing DONE (all drivers)
GL_ARB_separate_shader_objects DONE (all drivers)
GL_ARB_stencil_texturing DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_texture_multisample (Multisample textures) DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_stencil_texturing DONE (i965/gen8+, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_texture_multisample (Multisample textures) DONE (i965, nv50, r600, llvmpipe, softpipe)
GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample)
GL_ARB_vertex_attrib_binding DONE (all drivers)
GS5 Enhanced textureGather DONE (i965, nvc0, r600, radeonsi)
GS5 Packing/bitfield/conversion functions DONE (i965, nvc0, r600, radeonsi)
GS5 Enhanced textureGather DONE (i965, r600)
GS5 Packing/bitfield/conversion functions DONE (i965, r600)
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)
Additional functionality not covered above:
glMemoryBarrierByRegion DONE
glGetTexLevelParameter[fi]v - needs updates DONE
glGetBooleani_v - restrict to GLES enums
gl_HelperInvocation support DONE (i965, nvc0, r600, radeonsi)
gl_HelperInvocation support DONE (i965, r600)
GLES3.2, GLSL ES 3.2
GL_EXT_color_buffer_float DONE (all drivers)

View File

@@ -684,9 +684,11 @@ To add a new GL extension to Mesa you have to do at least the following.
</li>
<li>
Add a new entry to the <code>gl_extensions</code> struct in mtypes.h
if the extension requires driver capabilities not already exposed by
another extension.
</li>
<li>
Update the <code>extensions.c</code> file.
Add a new entry to the src/mesa/main/extensions_table.h file.
</li>
<li>
From this point, the best way to proceed is to find another extension,
@@ -697,12 +699,18 @@ To add a new GL extension to Mesa you have to do at least the following.
If the new extension adds new GL state, the functions in get.c, enable.c
and attrib.c will most likely require new code.
</li>
<li>
To determine if the new extension is active in the current context,
use the auto-generated _mesa_has_##name_str() function defined in
src/mesa/main/extensions.h.
</li>
<li>
The dispatch tests check_table.cpp and dispatch_sanity.cpp
should be updated with details about the new extension's functions. These
tests are run using 'make check'.
</li>
</ul>
</p>

View File

@@ -14,7 +14,7 @@
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 12.0.0 Release Notes / July 8, 2016</h1>
<h1>Mesa 12.0.0 Release Notes / TBD</h1>
<p>
Mesa 12.0.0 is a new development release.
@@ -33,8 +33,7 @@ because compatibility contexts are not supported.
<h2>SHA256 checksums</h2>
<pre>
3b8fa4d86d78f8f6ec86055b92ad1afe869001483593b3dd4531184b8bc4fcfb mesa-12.0.0.tar.gz
0090c025219318935124292b482e3439bc43e8c074ad01086449fcad88547dc6 mesa-12.0.0.tar.xz
TBD.
</pre>
@@ -79,256 +78,11 @@ Note: some of the new features are only available with certain drivers.
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=42187">Bug 42187</a> - ES 1.1 conformance pntszary.c fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71789">Bug 71789</a> - [r300g] Visuals not found in (default) depth = 24</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81585">Bug 81585</a> - piglit spec_glsl-1.10_compiler_literals_invalid-float-suffix-capital-f.vert fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83036">Bug 83036</a> - [ILK]Piglit spec_ARB_copy_image_arb_copy_image-formats fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89607">Bug 89607</a> - Assertion hit in opt_array_splitting with recursive array indexing</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90513">Bug 90513</a> - Odd gray and red flicker in The Talos Principle on GK104</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91526">Bug 91526</a> - World of Warcraft (on Wine) has UI corruption with nouveau</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92363">Bug 92363</a> - [BSW/BDW] ogles1conform Gets test fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92628">Bug 92628</a> - HTTP site for Mesa downloads</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92743">Bug 92743</a> - Centroid shouldn't have to match between the FS and the VS</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92850">Bug 92850</a> - Segfault loading War Thunder</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93054">Bug 93054</a> - [BDW] DiRT Showdown and Bioshock Infinite only render half the screen (bottom left triangle)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93524">Bug 93524</a> - Clover doesn't build</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93551">Bug 93551</a> - Divinity: Original Sin Enhanced Edition(Native) crash on start</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93667">Bug 93667</a> - Crash in eglCreateImageKHR with huge texture size</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93767">Bug 93767</a> - Glitches with soft shadows and MSAA in Knights of the Old Republic 2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93840">Bug 93840</a> - [i965] Alien: Isolation fails with GL_ARB_compute_shader enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93962">Bug 93962</a> - [HSW, regression, bisected, CTS] ES2-CTS.gtf.GL2FixedTests.scissor.scissor - segfault/asserts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94081">Bug 94081</a> - [HSW] compute shader shared var + atomic op = fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94086">Bug 94086</a> - Multiple conflicting libGL libraries installed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94116">Bug 94116</a> - program interface queries not returning right data for UBO / GL_BLOCK_INDEX</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94129">Bug 94129</a> - Mesa's compiler should warn about undefined values</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94181">Bug 94181</a> - [regression] piglit.spec.ext_framebuffer_object.getteximage-formats init-by-clear-and-render</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94193">Bug 94193</a> - [llvmpipe] Line antialiasing looks different when GL_LINE_STIPPLE is enabled with pattern 0xffff</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94198">Bug 94198</a> - [HSW] segfault in copy image when copying from cubemap to 2d</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94199">Bug 94199</a> - Shader abort/crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94253">Bug 94253</a> - [llvmpipe] piglit gl-1.0-swapbuffers-behavior regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94254">Bug 94254</a> - [llvmpipe] [softpipe] piglit read-front regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94257">Bug 94257</a> - [softpipe] piglit glx-copy-sub-buffer regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94274">Bug 94274</a> - [swrast] piglit arb_occlusion_query2-render regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94284">Bug 94284</a> - [radeonsi] outlast segfault on start</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94291">Bug 94291</a> - llvmpipe tests fail if built on skylake i7-6700k</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94348">Bug 94348</a> - vkBindImageMemory doesn't take into account the offset when the image is used as a depth buffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94383">Bug 94383</a> - build error on i386 when enabling swr</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94388">Bug 94388</a> - r600_blit.c:281: r600_decompress_depth_textures: Assertion `tex-&gt;is_depth &amp;&amp; !tex-&gt;is_flushing_texture' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94412">Bug 94412</a> - Trine 3 misrender</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94447">Bug 94447</a> - glsl/glcpp/tests/glcpp-test-cr-lf regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94453">Bug 94453</a> - dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_{center,corner} fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94454">Bug 94454</a> - dEQP-GLES3.functional.clipping.point.wide_point_clip* fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94456">Bug 94456</a> - dEQP-GLES3.functional.state_query.floats.{blend_color,color_clear_value,depth_clear_value}_getinteger64 fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94458">Bug 94458</a> - dEQP-GLES3.functional.state_query.fbo.framebuffer_attachment_x_size_initial fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94468">Bug 94468</a> - [HSW, regression, bisected] numerous Sascha demos render incorrectly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94481">Bug 94481</a> - softpipe - access violation in img_filter_2d_nearest</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94485">Bug 94485</a> - dEQP-GLES3.functional.negative_api.shader.compile_shader and delete_shader broken by Meta</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94524">Bug 94524</a> - Wrong gl_TessLevelOuter interpretation for isolines</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94595">Bug 94595</a> - [Mesa AMD&amp;swrast] Texture views attached as framebuffers return their viewed tecture's color encoding and render incorrectly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94657">Bug 94657</a> - [llvmpipe] [softpipe] piglit arb_texture_view-getteximage-srgb regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94661">Bug 94661</a> - [bdw, skl] vk-cts: new test failing</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94671">Bug 94671</a> - [radeonsi] Blue-ish textures in Shadow of Mordor</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94713">Bug 94713</a> - [Gen8+] ES 3.1 Stencil texturing broken for 2DArray/Cubes</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94747">Bug 94747</a> - Convert phi nodes to logical operations</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94835">Bug 94835</a> - Increase fragment shader sample limits from 16 to 32 (AMD Linux - Mesa/RadeonSi)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94847">Bug 94847</a> - [ES3.1CTS] es31-cts.draw_buffers_indexed.color_masks fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94896">Bug 94896</a> - [vulkan] new CTS tests fail on i965</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94904">Bug 94904</a> - [vulkan, BSW] dEQP-VK.api.object_management.multithreaded_per_thread_device intermittent crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94907">Bug 94907</a> - codegen/nv50_ir_ra.cpp:1330:29: error: isinf was not declared in this scope</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94909">Bug 94909</a> - [llvmpipe] piglit fs-roundEven-float regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94917">Bug 94917</a> - radeonsi supports GL_ARB_shader_storage_buffer_object with 0 GL_MAX_COMBINED_SHADER_STORAGE_BLOCKS</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94924">Bug 94924</a> - [GEN8] Ungine Valley fails to run due to &quot;intel_do_flush_locked failed: Input/output error&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94925">Bug 94925</a> - Crash in egl_dri3_get_dri_context with Dolphin EGL/X11 in single-core mode</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94944">Bug 94944</a> - [regression, hswgt1] gpu hang on arb_shader_image_load_store</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94955">Bug 94955</a> - Uninitialized variables leads to random segfaults (valgrind log, apitrace attached)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94969">Bug 94969</a> - build fails because install-data-local doesn't follow $DESTDIR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94972">Bug 94972</a> - blend failures on llvmpipe with llvm 3.7 due to vector selects</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94979">Bug 94979</a> - dolphin-emu rendering broken on gallium/SWR + crashing often</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94984">Bug 94984</a> - XCom2 crashes with SIGSEGV on radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94994">Bug 94994</a> - OSMesaGetProcAdress always fails on mangled OSMesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94997">Bug 94997</a> - [vulkan, SKL,BDW,HSW] deqp-vk.spirv_assembly.instruction.compute.opcopymemory.array regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94998">Bug 94998</a> - [vulkan] deqp-vk.pipeline.push_constant.graphics_pipeline.count_3shader_vgf regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95001">Bug 95001</a> - [vulkan] deqp-vk.binding_model.shader_access regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95005">Bug 95005</a> - Unreal engine demos segfault after shader compilation error with OpenGL 4.3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95026">Bug 95026</a> - Alien Isolation segfault after initial loading screen/video</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95034">Bug 95034</a> - vkResetCommandPool should not destroy the command buffers.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95071">Bug 95071</a> - [bisected] Wrong colors in KDE/Qt applications</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95133">Bug 95133</a> - X-COM Enemy Within crashes when entering tactical mission with Bonaire</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95138">Bug 95138</a> - [deqp, 32bit, gen8+] deqp-gles31.functional.draw_indirect.negative</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95142">Bug 95142</a> - [ES3.1CTS,GEN8] ESEXT-CTS.draw_elements_base_vertex_tests.invalid_mapped_bos assertion</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95158">Bug 95158</a> - glx-test compilation fails in `make check`</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95164">Bug 95164</a> - GLSL compiler (linker I think) emits assertion upon call to glAttachShader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95180">Bug 95180</a> - rasterizer/memory/Convert.h:170:9: error: __builtin_isnan is not a member of std</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95198">Bug 95198</a> - Shadow of Mordor beta has missing geometry with gl 4.3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95203">Bug 95203</a> - Tonga GST/OMX/VCE encode broken since mesa: st/omx: Fix resource leak on OMX_ErrorNone</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95211">Bug 95211</a> - scons TypeError: 'tuple' object is not callable</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95246">Bug 95246</a> - Segfault in glBindFramebuffer()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95251">Bug 95251</a> - vdpau decoder capabilities: not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95252">Bug 95252</a> - [deqp] deqp-gles31.functional.debug.object_labels.query_length_only crashes</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95292">Bug 95292</a> - [IVB,SKL] vulkan: stride/tiling issue with vkCmdCopyBufferToImage from larger source buffer into destination image</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95296">Bug 95296</a> - nir_lower_double_packing.c:79:4: error: void function 'lower_double_pack_impl' should not return a value [-Wreturn-type]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95324">Bug 95324</a> - GL33-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo fails in one case on Haswell</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95370">Bug 95370</a> - [965GM] piglit fails many tests after a5d7e144</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95373">Bug 95373</a> - Suspicious warning in brw_blorp_clear.cpp</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95403">Bug 95403</a> - [GK110] misaligned_gpr spamming dmesg when playing victor vran</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95419">Bug 95419</a> - [HSW][regression][bisect] RPG Maker game gives &quot;invalid floating point operation&quot; at startup</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95456">Bug 95456</a> - glXGetFBConfigs has invalid screen bounds</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95462">Bug 95462</a> - [BXT,BSW] arb_gpu_shader_fp64 causes gpu hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95529">Bug 95529</a> - [regression, bisected] Image corruption in Chrome</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95537">Bug 95537</a> - Invalid argument in anv_ioctl called from anv_physical_device_init</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96221">Bug 96221</a> - nir/nir_lower_tex.c:202: error: unknown field f32 specified in initializer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96228">Bug 96228</a> - SSBO test regressions from mesa 5b267509</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96236">Bug 96236</a> - dri_interface.h:404: error: redefinition of typedef mesa_glinterop_device_info</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96238">Bug 96238</a> - swr fails to build outside of the main directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96239">Bug 96239</a> - [radeonsi tessellation] [R9 290/390] Random &quot;texture flickering&quot; (Shadow of Mordor, Tomb Raider, Unigine Heaven 4.0)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96258">Bug 96258</a> - [NVC0] Hang when running compute program</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96285">Bug 96285</a> - Mesa build broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96299">Bug 96299</a> - [vulkan] 64 regressions due to mesa d5f2f32</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96346">Bug 96346</a> - [SNB,CTS] es2-cts.gtf.gl.atan regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96349">Bug 96349</a> - [CTS,SKL,BSW,BDW,KBL,BXT] es31-cts.arrays_of_arrays.interactionuniformbuffers3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96351">Bug 96351</a> - [CTS,SKL,KBL,BXT] es2-cts.gtf.gl2extensiontests.egl_image.egl_image</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96358">Bug 96358</a> - SSO: wrong interface validation between GS and VS (regresion due to latest gles 3.1)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96425">Bug 96425</a> - [bisected] occasional dark render in The Talos Principle</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96504">Bug 96504</a> - [vulkancts] compute tests crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96516">Bug 96516</a> - [bisected: 482526] &quot;clover: Update OpenCL version string to match OpenGL&quot;: clover's build fails because of missing git_sha1.h</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96565">Bug 96565</a> - Clive Barker's Jericho displays strange,vivid colors when motion blur enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96607">Bug 96607</a> - [bisected] texture misrender / flicker in The Talos Principle on SKL</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96617">Bug 96617</a> - gl_SecondaryFragDataEXT doesn't work for extended blend func</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96629">Bug 96629</a> - dEQP-GLES2.functional.texture.completeness.cube.not_positive_level_0: Assertion `width &gt;= 1' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96639">Bug 96639</a> - st/mesa: transfer_map with too-high level with dEQP-GLES2.functional.texture.completeness.cube.extra_level</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96674">Bug 96674</a> - [SNB, ILK] spec.ext_image_dma_buf_import.ext_image_dma_buf_import-sample_nv1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96765">Bug 96765</a> - BindFragDataLocationIndexed on array fragment shader output.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96791">Bug 96791</a> - Cannot use image from swapchains for sampling</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96825">Bug 96825</a> - anv_device.c:31:27: fatal error: anv_timestamp.h: No such file or directory</li>
</ul>
TBD.
<h2>Changes</h2>
Radeon drivers (r600 and radeonsi) now require LLVM 3.6 as a minimum.
TBD.
</div>
</body>

@@ -1,403 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 12.0.2 Release Notes / September 2, 2016</h1>
<p>
Mesa 12.0.2 is a bug fix release which fixes bugs found since the 12.0.1 release.
</p>
<p>
Mesa 12.0.2 implements the OpenGL 4.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.3. OpenGL
4.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
a08565ab1273751ebe2ffa928cbf785056594c803077c9719d0763da780f2918 mesa-12.0.2.tar.gz
d957a5cc371dcd7ff2aa0d87492f263aece46f79352f4520039b58b1f32552cb mesa-12.0.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69622">Bug 69622</a> - eglTerminate then eglMakeCurrent crashes</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89599">Bug 89599</a> - symbol 'x86_64_entry_start' is already defined when building with LLVM/clang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91342">Bug 91342</a> - Very dark textures on some objects in indoors environments in Postal 2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92306">Bug 92306</a> - GL Excess demo renders incorrectly on nv43</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94148">Bug 94148</a> - Framebuffer considered invalid when a draw call is done before glCheckFramebufferStatus</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96274">Bug 96274</a> - [NVC0] Failure when compiling compute shader: Assertion `bb-&gt;getFirst()-&gt;serial &lt;= bb-&gt;getExit()-&gt;serial' failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96358">Bug 96358</a> - SSO: wrong interface validation between GS and VS (regression due to latest gles 3.1)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96381">Bug 96381</a> - Texture artifacts with immutable texture storage and mipmaps</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96762">Bug 96762</a> - [radeonsi,apitrace] Firewatch: nothing rendered in scrollable (text) areas</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96835">Bug 96835</a> - &quot;gallium: Force blend color to 16-byte alignment&quot; crash with &quot;-march=native -O3&quot; causes some 32bit games to crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96850">Bug 96850</a> - Crucible tests fail for 32bit mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96908">Bug 96908</a> - [radeonsi] MSAA causes graphical artifacts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96911">Bug 96911</a> - webgl2 conformance2/textures/misc/tex-mipmap-levels.html crashes 12.1 Intel driver</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96971">Bug 96971</a> - invariant qualifier is not valid for shader inputs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97039">Bug 97039</a> - The Talos Principle and Serious Sam 3 GPU faults</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97207">Bug 97207</a> - [IVY BRIDGE] Fragment shader discard writing to depth</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97214">Bug 97214</a> - X not running with error &quot;Failed to make EGL context current&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97225">Bug 97225</a> - [i965 on HD4600 Haswell] xcom switch to ingame cinematics cause segmentation fault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97231">Bug 97231</a> - GL_DEPTH_CLAMP doesn't clamp to the far plane</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97307">Bug 97307</a> - glsl/glcpp/tests/glcpp-test regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97331">Bug 97331</a> - glDrawElementsBaseVertex doesn't work in display list on i915</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97351">Bug 97351</a> - DrawElementsBaseVertex with VBO ignores base vertex on Intel GMA 9xx in some cases</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97426">Bug 97426</a> - glScissor gives vertically inverted result</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97476">Bug 97476</a> - Shader binaries should not be stored in the PipelineCache</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97567">Bug 97567</a> - [SNB, ILK] ctl, piglit regressions in mesa 12.0.2rc1</li>
</ul>
<h2>Changes</h2>
<p>Andreas Boll (1):</p>
<ul>
<li>configure.ac: Use ${datarootdir} for --with-vulkan-icddir help string too</li>
</ul>
<p>Bernard Kilarski (1):</p>
<ul>
<li>glx: fix error code when there is no context bound</li>
</ul>
<p>Brian Paul (4):</p>
<ul>
<li>svga: handle mismatched number of samplers, sampler views</li>
<li>mesa: use _mesa_clear_texture_image() in clear_texture_fields()</li>
<li>swrast: fix incorrectly positioned putImage() in swrast driver</li>
<li>mesa: fix format conversion bug in get_tex_rgba_uncompressed()</li>
</ul>
<p>Chad Versace (2):</p>
<ul>
<li>i965: Fix miptree layout for EGLImage-based renderbuffers</li>
<li>i965: Respect miptree offsets in intel_readpixels_tiled_memcpy()</li>
</ul>
<p>Christian König (1):</p>
<ul>
<li>st/mesa: fix reference counting bug in st_vdpau</li>
</ul>
<p>Chuck Atkins (1):</p>
<ul>
<li>swr: Refactor checks for compiler feature flags</li>
</ul>
<p>Daniel Scharrer (1):</p>
<ul>
<li>mesa: Fix fixed function spot lighting on newer hardware (again)</li>
</ul>
<p>Dave Airlie (2):</p>
<ul>
<li>anv: fix writemask on blit fragment shader.</li>
<li>st/glsl_to_tgsi: fix st_src_reg_for_double constant.</li>
</ul>
<p>Emil Velikov (15):</p>
<ul>
<li>docs: add sha256 checksums for 12.0.1</li>
<li>mesa: automake: list builddir before srcdir</li>
<li>mesa: scons: list builddir before srcdir</li>
<li>i965: store reference to the context within struct brw_fence (v2)</li>
<li>anv: remove internal 'validate' layer</li>
<li>anv: automake: use VISIBILITY_CFLAGS to restrict symbol visibility</li>
<li>anv: automake: build with -Bsymbolic</li>
<li>anv: do not export the Vulkan API</li>
<li>anv: remove dummy VK_DEBUG_MARKER_EXT entry points</li>
<li>isl: automake: use VISIBILITY_CFLAGS to restrict symbol visibility</li>
<li>cherry-ignore: temporary(?) drop "a4xx: make sure to actually clamp depth"</li>
<li>i915: Check return value of screen-&gt;image.loader-&gt;getBuffers</li>
<li>Revert "i965/miptree: Set logical_depth0 == 6 for cube maps"</li>
<li>glx/glvnd: list the strcmp arguments in correct order</li>
<li>Update version to 12.0.2</li>
</ul>
<p>Eric Anholt (4):</p>
<ul>
<li>vc4: Close our screen's fd on screen close.</li>
<li>vc4: Disable early Z with computed depth.</li>
<li>vc4: Fix a leak of the src[] array of VPM reads in optimization.</li>
<li>vc4: Fix leak of the bo_handles table.</li>
</ul>
<p>Francisco Jerez (3):</p>
<ul>
<li>i965: Emit SKL VF cache invalidation W/A from brw_emit_pipe_control_flush.</li>
<li>i965: Make room in the batch epilogue for three more pipe controls.</li>
<li>i965: Fix remaining flush vs invalidate race conditions in brw_emit_pipe_control_flush.</li>
</ul>
<p>Haixia Shi (1):</p>
<ul>
<li>platform_android: prevent deadlock in droid_swap_buffers</li>
</ul>
<p>Ian Romanick (5):</p>
<ul>
<li>mesa: Strip arrayness from interface block names in some IO validation</li>
<li>glsl: Pack integer and double varyings as flat even if interpolation mode is none</li>
<li>glcpp: Track the actual version instead of just the version_resolved flag</li>
<li>glcpp: Only disallow #undef of pre-defined macros on GLSL ES &gt;= 3.00 shaders</li>
<li>glsl: Mark cube map array sampler types as reserved in GLSL ES 3.10</li>
</ul>
<p>Ilia Mirkin (16):</p>
<ul>
<li>mesa: etc2 online compression is unsupported, don't attempt it</li>
<li>st/mesa: return appropriate mesa format for ETC texture formats</li>
<li>mesa: set _NEW_BUFFERS when updating texture bound to current buffers</li>
<li>nv50,nvc0: srgb rendering is only available for rgba/bgra</li>
<li>vbo: allow DrawElementsBaseVertex in display lists</li>
<li>gallium/util: add helper to compute zmin/zmax for a viewport state</li>
<li>nv50,nvc0: fix depth range when halfz is enabled</li>
<li>nv50/ir: fix bb positions after exit instructions</li>
<li>vbo: add basevertex when looking up elements for vbo splitting</li>
<li>a4xx: only disable depth clipping, not all clipping, when requested</li>
<li>nv50/ir: make sure cfg iterator always hits all blocks</li>
<li>main: add missing EXTRA_END in OES_sample_variables get check</li>
<li>nouveau: always enable at least one RC</li>
<li>nv30: only bail on color/depth bpp mismatch when surfaces are swizzled</li>
<li>a4xx: make sure to actually clamp depth as requested</li>
<li>gk110/ir: fix quadop dall emission</li>
</ul>
<p>Jan Ziak (2):</p>
<ul>
<li>egl/x11: avoid using freed memory if dri2 init fails</li>
<li>loader: fix memory leak in loader_dri3_open</li>
</ul>
<p>Jason Ekstrand (31):</p>
<ul>
<li>nir/spirv: Don't multiply the push constant block size by 4</li>
<li>anv: Add a stub for CmdCopyQueryPoolResults on Ivy Bridge</li>
<li>glsl/types: Fix function type comparison function</li>
<li>glsl/types: Use _mesa_hash_data for hashing function types</li>
<li>genxml: Make gen6-7 blending look more like gen8</li>
<li>anv/pipeline: Unify blend state setup between gen7 and gen8</li>
<li>anv: Enable independentBlend on gen7</li>
<li>anv: Add an align_down_npot_u32 helper</li>
<li>anv: Handle VK_WHOLE_SIZE properly for buffer views</li>
<li>i965/miptree: Enforce that height == 1 for 1-D array textures</li>
<li>i965/miptree: Set logical_depth0 == 6 for cube maps</li>
<li>nir: Add a nir_deref_foreach_leaf helper</li>
<li>nir/inline: Constant-initialize local variables in the callee if needed</li>
<li>anv/pipeline: Set up point coord enables</li>
<li>i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations</li>
<li>i965/vec4: Make opt_vector_float reset at the top of each block</li>
<li>anv/blit2d: Add a format parameter to bind_dst and create_iview</li>
<li>anv/blit2d: Add support for RGB destinations</li>
<li>anv/clear: Make cmd_clear_image take an actual VkClearValue</li>
<li>anv/clear: Clear E5B9G9R9 images as R32_UINT</li>
<li>anv: Include the pipeline layout in the shader hash</li>
<li>isl: Allow multisampled array textures</li>
<li>anv/descriptor_set: memset anv_descriptor_set_layout</li>
<li>anv/pipeline: Fix bind maps for fragment output arrays</li>
<li>anv/allocator: Correctly set the number of buckets</li>
<li>anv/pipeline: Properly handle OOM during shader compilation</li>
<li>anv: Remove unused fields from anv_pipeline_bind_map</li>
<li>anv: Add pipeline_has_stage guards a few places</li>
<li>anv: Add a struct for storing a compiled shader</li>
<li>anv/pipeline: Add support for caching the push constant map</li>
<li>anv: Rework pipeline caching</li>
</ul>
<p>José Fonseca (2):</p>
<ul>
<li>appveyor: Install pywin32 extensions.</li>
<li>appveyor: Force Visual Studio 2013 image.</li>
</ul>
<p>Kenneth Graunke (21):</p>
<ul>
<li>genxml: Add CLIPMODE_* prefix to 3DSTATE_CLIP's "Clip Mode" enum values.</li>
<li>genxml: Add APIMODE_D3D missing enum values and improve consistency.</li>
<li>anv: Fix near plane clipping on Gen7/7.5.</li>
<li>anv: Enable early culling on Gen7.</li>
<li>anv: Unify 3DSTATE_CLIP code across generations.</li>
<li>genxml: Rename "API Rendering Disable" to "Rendering Disable".</li>
<li>anv: Properly call gen75_emit_state_base_address on Haswell.</li>
<li>i965: Include VUE handles for GS with invocations &gt; 1.</li>
<li>nir: Add a base const_index to shared atomic intrinsics.</li>
<li>i965: Fix shared atomic intrinsics to pay attention to base.</li>
<li>mesa: Add GL_BGRA_EXT to the list of GenerateMipmap internal formats.</li>
<li>mesa: Don't call GenerateMipmap if Width or Height == 0.</li>
<li>glsl: Delete bogus ir_set_program_inouts assert.</li>
<li>glsl: Fix the program resource names of gl_TessLevelOuter/Inner[].</li>
<li>glsl: Fix location bias for patch variables.</li>
<li>glsl: Fix invariant matching in GLSL 4.30 and GLSL ES 1.00.</li>
<li>mesa: Fix uf10_to_f32() scale factor in the E == 0 and M != 0 case.</li>
<li>nir/builder: Add bany_inequal and bany helpers.</li>
<li>i965: Implement the WaPreventHSTessLevelsInterference workaround.</li>
<li>i965: Fix execution size of scalar TCS barrier setup code.</li>
<li>i965: Fix barrier count shift in scalar TCS backend.</li>
</ul>
<p>Leo Liu (2):</p>
<ul>
<li>st/omx/enc: check uninitialized list from task release</li>
<li>vl/dri3: fix a memory leak from front buffer</li>
</ul>
<p>Marek Olšák (7):</p>
<ul>
<li>glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast</li>
<li>radeonsi: add a workaround for a compute VGPR-usage LLVM bug</li>
<li>winsys/amdgpu: disallow DCC with mipmaps</li>
<li>gallium/util: fix align64</li>
<li>radeonsi: only set dual source blending for MRT0</li>
<li>radeonsi: fix VM faults due NULL internal const buffers on CIK</li>
<li>radeonsi: disable SDMA texture copying on Carrizo</li>
</ul>
<p>Matt Turner (4):</p>
<ul>
<li>mapi: Massage code to allow clang to compile.</li>
<li>i965/vec4: Ignore swizzle of VGRF for use by var_range_end().</li>
<li>mesa: Use AC_HEADER_MAJOR to include correct header for major().</li>
<li>nir: Walk blocks in source code order in lower_vars_to_ssa.</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>glx: Don't use current context in __glXSendError</li>
</ul>
<p>Miklós Máté (1):</p>
<ul>
<li>vbo: set draw_id</li>
</ul>
<p>Nanley Chery (5):</p>
<ul>
<li>anv/descriptor_set: Fix binding partly undefined descriptor sets</li>
<li>isl: Fix assert on raw buffer surface state size</li>
<li>anv/device: Fix max buffer range limits</li>
<li>isl: Fix isl_tiling_is_any_y()</li>
<li>anv/gen7_pipeline: Set PixelShaderKillPixel for discards</li>
</ul>
<p>Nicolai Hähnle (7):</p>
<ul>
<li>radeonsi: explicitly choose center locations for 1xAA on Polaris</li>
<li>radeonsi: fix Polaris MSAA regression</li>
<li>radeonsi: ensure sample locations are set for line and polygon smoothing</li>
<li>st_glsl_to_tgsi: only skip over slots of an input array that are present</li>
<li>glsl: fix optimization of discard nested multiple levels</li>
<li>radeonsi: flush TC L2 cache for indirect draw data</li>
<li>radeonsi: add si_set_rw_buffer to be used for internal descriptors</li>
</ul>
<p>Nicolas Boichat (6):</p>
<ul>
<li>egl/dri2: dri2_make_current: Set EGL error if bindContext fails</li>
<li>egl/wayland: Set disp-&gt;DriverData to NULL on error</li>
<li>egl/surfaceless: Set disp-&gt;DriverData to NULL on error</li>
<li>egl/drm: Set disp-&gt;DriverData to NULL on error</li>
<li>egl/android: Set dpy-&gt;DriverData to NULL on error</li>
<li>egl/dri2: Add reference count for dri2_egl_display</li>
</ul>
<p>Rob Herring (3):</p>
<ul>
<li>Android: add missing u_math.h include path for libmesa_isl</li>
<li>vc4: fix vc4_resource_from_handle() stride calculation</li>
<li>vc4: add hash table look-up for exported dmabufs</li>
</ul>
<p>Samuel Pitoiset (7):</p>
<ul>
<li>nvc0/ir: fix images indirect access on Fermi</li>
<li>nvc0: fix the driver cb size when draw parameters are used</li>
<li>gm107/ir: add missing NEG modifier for IADD32I</li>
<li>gm107/ir: make use of ADD32I for all immediates</li>
<li>nvc0: upload sample locations on GM20x</li>
<li>nvc0: invalidate textures/samplers on GK104+</li>
<li>nv50/ir: always emit the NDV bit for OP_QUADOP</li>
</ul>
<p>Stefan Dirsch (1):</p>
<ul>
<li>Avoid overflow in 'last' variable of FindGLXFunction(...)</li>
</ul>
<p>Stencel, Joanna (1):</p>
<ul>
<li>egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface.</li>
</ul>
<p>Tim Rowley (2):</p>
<ul>
<li>Revert "gallium: Force blend color to 16-byte alignment"</li>
<li>swr: switch from overriding -march to selecting features</li>
</ul>
<p>Tomasz Figa (8):</p>
<ul>
<li>gallium/dri: Add shared glapi to LIBADD on Android</li>
<li>egl/android: Remove unused variables</li>
<li>egl/android: Check return value of dri2_get_dri_config()</li>
<li>egl/android: Stop leaking DRI images</li>
<li>gallium/winsys/kms: Fix double refcount when importing from prime FD (v2)</li>
<li>gallium/winsys/kms: Fully initialize kms_sw_dt at prime import time (v2)</li>
<li>gallium/winsys/kms: Move display target handle lookup to separate function</li>
<li>gallium/winsys/kms: Look up the GEM handle after importing a prime FD</li>
</ul>
</div>
</body>
</html>

@@ -1,71 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 12.0.3 Release Notes / September 15, 2016</h1>
<p>
Mesa 12.0.3 is a bug fix release which fixes bugs found since the 12.0.2 release.
</p>
<p>
Mesa 12.0.3 implements the OpenGL 4.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.3. OpenGL
4.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
79abcfab3de30dbd416d1582a3cf6b1be308466231488775f1b7bb43be353602 mesa-12.0.3.tar.gz
1dc86dd9b51272eee1fad3df65e18cda2e556ef1bc0b6e07cd750b9757f493b1 mesa-12.0.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97781">Bug 97781</a> - [HSW, BYT, IVB] es2-cts.gtf.gl2extensiontests.depth_texture_cube_map.depth_texture_cube_map</li>
</ul>
<h2>Changes</h2>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: add sha256 checksums for 12.0.2</li>
<li>Revert "i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations"</li>
<li>Update version to 12.0.3</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>appveyor: Update winflexbison download URL.</li>
</ul>
</div>
</body>
</html>

@@ -1,321 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 12.0.4 Release Notes / November 10, 2016</h1>
<p>
Mesa 12.0.4 is a bug fix release which fixes bugs found since the 12.0.3 release.
</p>
<p>
Mesa 12.0.4 implements the OpenGL 4.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.3. OpenGL
4.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
22026ce4f1c6a7908b0d10ff057decec0a5633afe7f38a0cef5c08d0689f02a6 mesa-12.0.4.tar.gz
5d6003da867d3f54e5000b4acdfc37e6cce5b6a4459274fdad73e24bd2f0065e mesa-12.0.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71759">Bug 71759</a> - Intel driver fails with &quot;intel_do_flush_locked failed: No such file or directory&quot; if buffer imported with EGL_NATIVE_PIXMAP_KHR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94354">Bug 94354</a> - R9285 Unigine Valley perf regression since radeonsi: use re-Z</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96770">Bug 96770</a> - include/GL/mesa_glinterop.h:62: error: redefinition of typedef GLXContext</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97231">Bug 97231</a> - GL_DEPTH_CLAMP doesn't clamp to the far plane</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97233">Bug 97233</a> - vkQuake VkSpecializationMapEntry related bug</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97260">Bug 97260</a> - R9 290 low performance in Linux 4.7</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97549">Bug 97549</a> - [SNB, BXT] up to 40% perf drop from &quot;loader/dri3: Overhaul dri3_update_num_back&quot; commit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97887">Bug 97887</a> - llvm segfault in janusvr -render vive</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98025">Bug 98025</a> - [radeonsi] incorrect primitive restart index used</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98134">Bug 98134</a> - dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers wants a different GL error code</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98326">Bug 98326</a> - [dEQP, EGL] pbuffer depth/stencil tests fail</li>
</ul>
<h2>Changes</h2>
<p>Axel Davy (4):</p>
<ul>
<li>gallium/util: Really allow aliasing of dst for u_box_union_*</li>
<li>st/nine: Fix the calculation of the number of vs inputs</li>
<li>st/nine: Fix mistake in Volume9 UnlockBox</li>
<li>st/nine: Fix locking CubeTexture surfaces.</li>
</ul>
<p>Brendan King (1):</p>
<ul>
<li>configure.ac: fix the name of the Wayland Scanner pc file</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>st/mesa: fix swizzle issue in st_create_sampler_view_from_stobj()</li>
</ul>
<p>Chad Versace (3):</p>
<ul>
<li>egl: Fix truncation error in _eglParseSyncAttribList64</li>
<li>i965/sync: Fix uninitialized usage and leak of mutex</li>
<li>egl: Don't advertise unsupported platform extensions</li>
</ul>
<p>Chuanbo Weng (1):</p>
<ul>
<li>gbm: fix potential NULL deref of mapImage/unmapImage.</li>
</ul>
<p>Chuck Atkins (1):</p>
<ul>
<li>autoconf: Make header install distinct for various APIs (v2)</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>anv: initialise and increment send_sbc</li>
<li>anv/wsi: fix apps that acquire multiple images up front</li>
<li>Revert "st/vdpau: use linear layout for output surfaces"</li>
</ul>
<p>Emil Velikov (12):</p>
<ul>
<li>docs: add sha256 checksums for 12.0.3</li>
<li>cherry-ignore: add non-applicable i965 commit</li>
<li>cherry-ignore: add vaapi encode fix</li>
<li>cherry-ignore: add EGL_KHR_debug fix</li>
<li>cherry-ignore: add update_renderbuffer_read_surfaces()</li>
<li>isl/gen6: correctly check msaa layout samples count</li>
<li>egl/x11: don't crash if dri2_dpy-&gt;conn is NULL</li>
<li>get-pick-list.sh: Require explicit "12.0" for nominating stable patches</li>
<li>automake: don't forget to pick wglext.h in the tarball</li>
<li>cherry-ignore: add N/A EGL revert</li>
<li>cherry-ignore: add ClientWaitSync fixes</li>
<li>Update version to 12.0.4</li>
</ul>
<p>Eric Anholt (5):</p>
<ul>
<li>travis: Parse configure.ac to pick an updated LIBDRM_VERSION.</li>
<li>travis: Update to the Ubuntu Trusty image.</li>
<li>travis: Enable vc4 in libdrm to satisfy vc4 test build dependency.</li>
<li>travis: Upgrade LLVM dependency to 3.5 and enable LLVM drivers.</li>
<li>gallium: Fix install-gallium-links.mk on non-bash /bin/sh</li>
</ul>
<p>Hans de Goede (1):</p>
<ul>
<li>pipe_loader_sw: Fix fd leak when instantiated via pipe_loader_sw_probe_kms</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>glsl: Fix cut-and-paste bug in hierarchical visitor ir_expression::accept</li>
</ul>
<p>Ilia Mirkin (16):</p>
<ul>
<li>nv30: set usage to staging so that the buffer is allocated in GART</li>
<li>a3xx: make sure to actually clamp depth as requested</li>
<li>a3xx: make use of software clipping when hw can't handle it</li>
<li>a3xx: use window scissor to simulate viewport xy clip</li>
<li>main: GL_RGB10_A2UI does not come with GL 3.0/EXT_texture_integer</li>
<li>mesa/formatquery: limit ES target support, fix core context support</li>
<li>nir: fix definition of pack_uvec2_to_uint</li>
<li>gm107/ir: AL2P writes to a predicate register</li>
<li>st/mesa: fix is_scissor_enabled when X/Y are negative</li>
<li>nvc0/ir: fix overwriting of value backing non-constant gather offset</li>
<li>nv50/ir: copy over value's register id when resolving merge of a phi</li>
<li>nvc0/ir: fix textureGather with a single offset</li>
<li>gm107/ir: fix texturing with indirect samplers</li>
<li>gm107/ir: fix bit offset of tex lod setting for indirect texturing</li>
<li>nv50,nvc0: avoid reading out of bounds when getting bogus so info</li>
<li>nv50/ir: process texture offset sources as regular sources</li>
</ul>
<p>James Legg (1):</p>
<ul>
<li>radeonsi: Fix primitive restart when index changes</li>
</ul>
<p>Jason Ekstrand (9):</p>
<ul>
<li>nir/spirv: Swap the argument order for AtomicCompareExchange</li>
<li>nir/spirv: Use the correct sources for CompareExchange on images</li>
<li>nir/spirv: Break variable decoration handling into a helper</li>
<li>nir/spirv: Refactor variable decoration handling</li>
<li>nir/spirv/cfg: Handle switches whose break block is a loop continue</li>
<li>nir/spirv/cfg: Detect switch_break after loop_break/continue</li>
<li>nir: Add a nop intrinsic</li>
<li>nir/spirv/cfg: Use a nop intrinsic for tagging the ends of blocks</li>
<li>intel/blorp: Rework our usage of ralloc when compiling shaders</li>
</ul>
<p>Jonathan Gray (3):</p>
<ul>
<li>genxml: add generated headers to EXTRA_DIST</li>
<li>mapi: automake: set VISIBILITY_CFLAGS for shared glapi</li>
<li>mesa: automake: include mesa_glinterop.h in distfile</li>
</ul>
<p>Julien Isorce (1):</p>
<ul>
<li>st/va: also honors interlaced preference when providing a video format</li>
</ul>
<p>Kenneth Graunke (8):</p>
<ul>
<li>nir: Call nir_metadata_preserve from nir_lower_alu_to_scalar().</li>
<li>mesa: Expose RESET_NOTIFICATION_STRATEGY with KHR_robustness.</li>
<li>i965: Fix missing _NEW_TRANSFORM in Gen8+ 3DSTATE_DS atom.</li>
<li>i965: Add missing BRW_NEW_VS_PROG_DATA to 3DSTATE_CLIP.</li>
<li>i965: Move BRW_NEW_FRAGMENT_PROGRAM from 3DSTATE_PS to PS_EXTRA.</li>
<li>i965: Add missing BRW_NEW_CS_PROG_DATA to compute constant atom.</li>
<li>i965: Add missing BRW_CS_PROG_DATA to CS work group surface atom.</li>
<li>i965: Fix gl_InvocationID in dual object GS where invocations == 1.</li>
</ul>
<p>Marek Olšák (12):</p>
<ul>
<li>radeonsi: fix cubemaps viewed as 2D</li>
<li>radeonsi: take compute shader and dispatch indirect memory usage into account</li>
<li>radeonsi: fix FP64 UBO loads with indirect uniform block indexing</li>
<li>mesa: fix glGetFramebufferAttachmentParameteriv w/ on-demand FRONT_BACK alloc</li>
<li>radeonsi: fix interpolateAt opcodes for .zw components</li>
<li>radeonsi: fix texture border colors for compute shaders</li>
<li>radeonsi: disable ReZ</li>
<li>gallium/radeon: make sure the address of separate CMASK is aligned properly</li>
<li>winsys/amdgpu: fix radeon_surf::macro_tile_index for imported textures</li>
<li>egl: use util/macros.h</li>
<li>egl: make interop ABI visible again</li>
<li>glx: make interop ABI visible again</li>
</ul>
<p>Mario Kleiner (1):</p>
<ul>
<li>glx: Perform check for valid fbconfig against proper X-Screen.</li>
</ul>
<p>Martin Peres (2):</p>
<ul>
<li>loader/dri3: add get_dri_screen() to the vtable</li>
<li>loader/dri3: import prime buffers in the currently-bound screen</li>
</ul>
<p>Matt Whitlock (5):</p>
<ul>
<li>egl/android: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>
<li>gallium/auxiliary: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>
<li>st/dri: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>
<li>st/xa: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>
<li>gallium/winsys: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>
</ul>
<p>Max Staudt (1):</p>
<ul>
<li>r300g: Set R300_VAP_CNTL on RSxxx to avoid triangle flickering</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>loader/dri3: Overhaul dri3_update_num_back</li>
</ul>
<p>Nicholas Bishop (2):</p>
<ul>
<li>gbm: return appropriate error when queryImage() fails</li>
<li>st/dri: check pipe_screen-&gt;resource_get_handle() return value</li>
</ul>
<p>Nicolai Hähnle (10):</p>
<ul>
<li>gallium/radeon: cleanup and fix branch emits</li>
<li>st/glsl_to_tgsi: disable on-the-fly peephole for 64-bit operations</li>
<li>st/glsl_to_tgsi: simplify translate_tex_offset</li>
<li>st/glsl_to_tgsi: fix textureGatherOffset with indirectly loaded offsets</li>
<li>st/mesa: fix vertex elements setup for doubles</li>
<li>radeonsi: fix indirect loads of 64 bit constants</li>
<li>st/glsl_to_tgsi: fix atomic counter addressing</li>
<li>st/glsl_to_tgsi: fix block copies of arrays of doubles</li>
<li>st/mesa: only set primitive_restart when the restart index is in range</li>
<li>radeonsi: fix 64-bit loads from LDS</li>
</ul>
<p>Samuel Pitoiset (4):</p>
<ul>
<li>nvc0/ir: fix subops for IMAD</li>
<li>gk110/ir: fix wrong emission of OP_NOT</li>
<li>nvc0: use correct bufctx when invalidating CP textures</li>
<li>nvc0/ir: fix emission of IMAD with NEG modifiers</li>
</ul>
<p>Stencel, Joanna (1):</p>
<ul>
<li>egl/wayland: add missing destroy_window callback</li>
</ul>
<p>Tapani Pälli (5):</p>
<ul>
<li>egl: stop claiming support for pbuffer + msaa</li>
<li>egl/dri2: set max values for pbuffer width and height</li>
<li>egl: add check that eglCreateContext gets a valid config</li>
<li>mesa: fix error handling in DrawBuffers</li>
<li>egl: set preserved behavior for surface only if config supports it</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>configure.ac: add llvm inteljitevents component if enabled</li>
</ul>
<p>Vedran Miletić (1):</p>
<ul>
<li>clover: Fix build against clang SVN &gt;= r273191</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>Revert "mesa_glinterop: remove inclusion of GLX header"</li>
</ul>
</div>
</body>
</html>

@@ -1,138 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 12.0.5 Release Notes / December 5, 2016</h1>
<p>
Mesa 12.0.5 is a bug fix release which fixes bugs found since the 12.0.4 release.
</p>
<p>
Mesa 12.0.5 implements the OpenGL 4.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.3. OpenGL
4.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
44d08a27d98bfeacd864381189e434d98afbf451689d01f80380dc1d66450e5b mesa-12.0.5.tar.gz
2b0a972d8282860a11291c09c3ef01ac45171405951eb21a83c45ed2b4321924 mesa-12.0.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77662">Bug 77662</a> - Fail to render to different faces of depth-stencil cube map</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97779">Bug 97779</a> - [regression, bisected][BDW, GPU hang] stuck on render ring, always reproducible</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98415">Bug 98415</a> - Vulkan Driver JSON file contains incorrect field</li>
</ul>
<h2>Changes</h2>
<p>Adam Jackson (2):</p>
<ul>
<li>glx/glvnd: Don't modify the dummy slot in the dispatch table</li>
<li>glx/glvnd: Fix dispatch function names and indices</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>i965: Fix GPU hang related to multiple render targets and alpha testing</li>
</ul>
<p>Emil Velikov (4):</p>
<ul>
<li>docs: add release notes for 12.0.4</li>
<li>docs: add sha256 checksums for 12.0.4</li>
<li>cherry-ignore: add reverted LLVM_LIBDIR patch</li>
<li>Update version to 12.0.5</li>
</ul>
<p>Haixia Shi (1):</p>
<ul>
<li>mesa: change state query return value for RGB565</li>
</ul>
<p>Jason Ekstrand (3):</p>
<ul>
<li>i965/fs/generator: Don't use the address immediate for MOV_INDIRECT</li>
<li>anv/cmd_buffer: Take a command buffer instead of a batch in two helpers</li>
<li>anv/cmd_buffer: Enable a CS stall workaround for Sky Lake gt4</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>intel: Fix pixel shader scratch space allocation on Gen9+ platforms.</li>
</ul>
<p>Marek Olšák (13):</p>
<ul>
<li>gallium/radeon: fix behavior of GLSL findLSB(0)</li>
<li>gallium/radeon: make sure HTILE address is aligned properly</li>
<li>radeonsi: fix an assertion failure in si_decompress_sampler_color_textures</li>
<li>gallium/radeon: unify viewport emission code</li>
<li>gallium/radeon: set VPORT_ZMIN/MAX registers correctly</li>
<li>radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader</li>
<li>radeonsi: fix a crash in imageSize for cubemap arrays</li>
<li>radeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the kernel allows it</li>
<li>gallium/radeon: add support for sharing textures with DCC between processes</li>
<li>radeonsi: always set all blend registers</li>
<li>radeonsi: set CB_BLEND1_CONTROL.ENABLE for dual source blending</li>
<li>radeonsi: disable RB+ blend optimizations for dual source blending</li>
<li>radeonsi: silence runtime warnings with LLVM 3.9</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>anv: Replace "abi_versions" with correct "api_version".</li>
</ul>
<p>Nanley Chery (1):</p>
<ul>
<li>mesa/fbobject: Update CubeMapFace when reusing textures</li>
</ul>
<p>Steinar H. Gunderson (1):</p>
<ul>
<li>Fix races during _mesa_HashWalk().</li>
</ul>
<p>Tim Rowley (3):</p>
<ul>
<li>swr: [rasterizer jitter] cleanup supporting different llvm versions</li>
<li>swr: [rasterizer jitter] fix llvm-3.7 compile</li>
<li>swr: [rasterizer] add support for llvm-3.9</li>
</ul>
</div>
</body>
</html>

View File

@@ -1,148 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 12.0.6 Release Notes / January 23, 2017</h1>
<p>
Mesa 12.0.6 is a bug fix release which fixes bugs found since the 12.0.5 release.
</p>
<p>
Mesa 12.0.6 implements the OpenGL 4.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.3. OpenGL
4.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
65339ba5d76a45225b8b56f9a1da9db15c569e1d163760faa2921da0a8461741 mesa-12.0.6.tar.gz
7d6da9744c1022a4c2ab6ad01a206984d00443fb691568011d01b3dd97e36448 mesa-12.0.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92234">Bug 92234</a> - [BDW] GPU hang in Shogun2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95130">Bug 95130</a> - Derivatives of gl_Color wrong when helper pixels used</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98329">Bug 98329</a> - [dEQP, EGL, SKL, BDW, BSW] dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99030">Bug 99030</a> - [HSW, regression] transform feedback fails on Linux 4.8</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99354">Bug 99354</a> - [G71] &quot;Assertion `bkref' failed&quot; reproducible with glmark2</li>
</ul>
<h2>Changes</h2>
<p>Chad Versace (3):</p>
<ul>
<li>i965/mt: Disable aux surfaces after making miptree shareable</li>
<li>i965/mt: Disable HiZ when sharing depth buffer externally (v2)</li>
<li>anv: Handle vkGetPhysicalDeviceQueueFamilyProperties with count == 0</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>docs: add sha256 checksums for 12.0.5</li>
<li>get-typod-pick-list.sh: add new script</li>
<li>automake: use shared llvm libs for make distcheck</li>
<li>egl/wayland: use the destroy_window_callback for swrast</li>
<li>Update version to 12.0.6</li>
</ul>
<p>Fredrik Höglund (1):</p>
<ul>
<li>dri3: Fix MakeCurrent without a default framebuffer</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>nouveau: take extra push space into account for pushbuf_space calls</li>
</ul>
<p>Jason Ekstrand (19):</p>
<ul>
<li>spirv/nir: Fix some texture opcode asserts</li>
<li>spirv/nir: Add support for shadow samplers that return vec4</li>
<li>spirv/nir: Properly handle gather components</li>
<li>anv/pipeline: Set binding_table.gather_texture_start</li>
<li>nir: Add a helper for determining the type of a texture source</li>
<li>nir/lower_tex: Add some helpers for working with tex sources</li>
<li>nir/lower_tex: Add support for lowering coordinate offsets</li>
<li>i965/nir: Enable NIR lowering of txf and rect offsets</li>
<li>i965: Get rid of the do_lower_unnormalized_offsets pass</li>
<li>spirv/nir: Don't increment coord_components for array lod queries</li>
<li>anv/image: Assert that the image format is actually supported</li>
<li>spirv/nir: Move opcode selection higher up in handle_texture</li>
<li>spirv/nir: Refactor type handling in handle_texture</li>
<li>nir/spirv: Refactor coordinate handling in handle_texture</li>
<li>spirv/nir: Handle texture projectors</li>
<li>spirv/nir: Add support for ImageQuerySamples</li>
<li>anv/device: Return the right error for failed maps</li>
<li>anv/device: Implicitly unmap memory objects in FreeMemory</li>
<li>anv/descriptor_set: Write the state offset in the surface state free list.</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>spirv: Move cursor before calling vtn_ssa_value() in phi 2nd pass.</li>
<li>i965: Properly flush in hsw_pause_transform_feedback().</li>
</ul>
<p>Marek Olšák (6):</p>
<ul>
<li>cso: don't release sampler states that are bound</li>
<li>radeonsi: always restore sampler states when unbinding sampler views</li>
<li>radeonsi: fix incorrect FMASK checking in bind_sampler_states</li>
<li>radeonsi: disable CE on SI + AMDGPU</li>
<li>radeonsi: disable the constant engine (CE) on Carrizo and Stoney</li>
<li>gallium/radeon: fix the draw-calls HUD query</li>
</ul>
<p>Matt Turner (3):</p>
<ul>
<li>i965/fs: Rename opt_copy_propagate -&gt; opt_copy_propagation.</li>
<li>i965/fs: Add unit tests for copy propagation pass.</li>
<li>i965/fs: Reject copy propagation into SEL if not min/max.</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>cso: Don't restore nr_samplers in cso_restore_fragment_samplers</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>radeonsi: enable WQM in PS prolog when needed</li>
</ul>
</div>
</body>
</html>

View File

@@ -14,15 +14,15 @@
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 12.0.1 Release Notes / July 8, 2016</h1>
<h1>Mesa 12.1.0 Release Notes / TBD</h1>
<p>
Mesa 12.0.1 is a bug fix release which fixes bugs found since the 12.0.0 release.
Mesa 12.1.0 is a new development release.
People who are concerned with stability and reliability should stick
with a previous release or wait for Mesa 12.1.1.
</p>
<p>
Mesa 12.0.1 implements the OpenGL 4.3 API, but the version reported by
Mesa 12.1.0 implements the OpenGL 4.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.3. OpenGL
@@ -33,34 +33,27 @@ because compatibility contexts are not supported.
<h2>SHA256 checksums</h2>
<pre>
28dff9c045f4305c96a875a487b9f06c7e88d910511cd6016dbddcd1f53ade0d mesa-12.0.1.tar.gz
bab24fb79f78c876073527f515ed871fc9c81d816f66c8a0b051d8d653896389 mesa-12.0.1.tar.xz
TBD.
</pre>
<h2>New features</h2>
<p>None</p>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>GL_ARB_shader_group_vote on nvc0</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96864">Bug 96864</a> - Mesa 12.0 radeon build broken</li>
</ul>
TBD.
<h2>Changes</h2>
<p>Emil Velikov (4):</p>
<ul>
<li>docs: add sha256 checksums for 12.0.0</li>
<li>radeon: reference the correct cdw/max_dw</li>
<li>Update version to 12.0.1</li>
<li>docs: add release notes for 12.0.1</li>
</ul>
TBD.
</div>
</body>

View File

@@ -58,8 +58,8 @@ extern "C" {
#endif
/* Forward declarations to avoid inclusion of GL/glx.h */
struct _XDisplay;
struct __GLXcontextRec;
typedef struct _XDisplay Display;
typedef struct __GLXcontextRec *GLXContext;
/* Forward declarations to avoid inclusion of EGL/egl.h */
typedef void *EGLDisplay;
@@ -246,7 +246,7 @@ struct mesa_glinterop_export_out {
* \return MESA_GLINTEROP_SUCCESS or MESA_GLINTEROP_* != 0 on error
*/
int
MesaGLInteropGLXQueryDeviceInfo(struct _XDisplay *dpy, struct __GLXcontextRec *context,
MesaGLInteropGLXQueryDeviceInfo(Display *dpy, GLXContext context,
struct mesa_glinterop_device_info *out);
@@ -271,7 +271,7 @@ MesaGLInteropEGLQueryDeviceInfo(EGLDisplay dpy, EGLContext context,
* \return MESA_GLINTEROP_SUCCESS or MESA_GLINTEROP_* != 0 on error
*/
int
MesaGLInteropGLXExportObject(struct _XDisplay *dpy, struct __GLXcontextRec *context,
MesaGLInteropGLXExportObject(Display *dpy, GLXContext context,
struct mesa_glinterop_export_in *in,
struct mesa_glinterop_export_out *out);
@@ -286,11 +286,11 @@ MesaGLInteropEGLExportObject(EGLDisplay dpy, EGLContext context,
struct mesa_glinterop_export_out *out);
typedef int (PFNMESAGLINTEROPGLXQUERYDEVICEINFOPROC)(struct _XDisplay *dpy, struct __GLXcontextRec *context,
typedef int (PFNMESAGLINTEROPGLXQUERYDEVICEINFOPROC)(Display *dpy, GLXContext context,
struct mesa_glinterop_device_info *out);
typedef int (PFNMESAGLINTEROPEGLQUERYDEVICEINFOPROC)(EGLDisplay dpy, EGLContext context,
struct mesa_glinterop_device_info *out);
typedef int (PFNMESAGLINTEROPGLXEXPORTOBJECTPROC)(struct _XDisplay *dpy, struct __GLXcontextRec *context,
typedef int (PFNMESAGLINTEROPGLXEXPORTOBJECTPROC)(Display *dpy, GLXContext context,
struct mesa_glinterop_export_in *in,
struct mesa_glinterop_export_out *out);
typedef int (PFNMESAGLINTEROPEGLEXPORTOBJECTPROC)(EGLDisplay dpy, EGLContext context,

View File

@@ -36,8 +36,8 @@
*/
#if defined(_MSC_VER)
# if _MSC_VER < 1800 || (_MSC_FULL_VER < 180031101 && !defined(__clang__))
# error "Microsoft Visual Studio 2013 Update 4 or higher required"
# if _MSC_VER < 1800
# error "Microsoft Visual Studio 2013 or higher required"
# endif
/*

View File

@@ -13,8 +13,8 @@ all-local : .install-gallium-links
fi; \
$(MKDIR_P) $$link_dir; \
file_list="$(dri_LTLIBRARIES:%.la=.libs/%.so)"; \
file_list="$$file_list$(egl_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)"; \
file_list="$$file_list$(lib_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)"; \
file_list+="$(egl_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)"; \
file_list+="$(lib_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)"; \
for f in $$file_list; do \
if test -h .libs/$$f; then \
cp -d $$f $$link_dir; \

View File

@@ -25,13 +25,15 @@ git_sha1.h.tmp:
@# a gitlink file if $(top_srcdir) is a submodule checkout or a linked
@# worktree.
@# If we are building from a release tarball copy the bundled header.
@touch git_sha1.h.tmp
@if test -e $(top_srcdir)/.git; then \
if which git > /dev/null; then \
git --git-dir=$(top_srcdir)/.git log -n 1 --oneline | \
sed 's/^\([^ ]*\) .*/#define MESA_GIT_SHA1 "git-\1"/' \
> git_sha1.h.tmp ; \
fi \
else \
cp $(srcdir)/git_sha1.h git_sha1.h.tmp ;\
chmod u+w git_sha1.h.tmp; \
fi
git_sha1.h: git_sha1.h.tmp
@@ -43,34 +45,15 @@ git_sha1.h: git_sha1.h.tmp
fi
BUILT_SOURCES = git_sha1.h
CLEANFILES = $(BUILT_SOURCES)
# We want to keep the srcdir file since we need it on rebuild from tarball.
# At the same time `make distclean' gets angry at us if we don't cleanup the
# builddir one.
distclean-local:
( test $(top_srcdir) != $(top_builddir) && rm $(builddir)/git_sha1.h ) || true
SUBDIRS = . gtest util mapi/glapi/gen mapi
if HAVE_OPENGL
gldir = $(includedir)/GL
gl_HEADERS = \
$(top_srcdir)/include/GL/gl.h \
$(top_srcdir)/include/GL/glext.h \
$(top_srcdir)/include/GL/glcorearb.h \
$(top_srcdir)/include/GL/gl_mangle.h
endif
if HAVE_GLX
glxdir = $(includedir)/GL
glx_HEADERS = \
$(top_srcdir)/include/GL/glx.h \
$(top_srcdir)/include/GL/glxext.h \
$(top_srcdir)/include/GL/glx_mangle.h
pkgconfigdir = $(libdir)/pkgconfig
pkgconfig_DATA = mesa/gl.pc
endif
if HAVE_COMMON_OSMESA
osmesadir = $(includedir)/GL
osmesa_HEADERS = $(top_srcdir)/include/GL/osmesa.h
endif
# include only conditionally ?
SUBDIRS += compiler
@@ -117,8 +100,7 @@ SUBDIRS += gallium
endif
EXTRA_DIST = \
getopt hgl SConscript \
$(top_srcdir)/include/GL/mesa_glinterop.h
getopt hgl SConscript git_sha1.h
AM_CFLAGS = $(VISIBILITY_CFLAGS)
AM_CXXFLAGS = $(VISIBILITY_CXXFLAGS)

View File

@@ -231,11 +231,11 @@ NIR_FILES = \
nir/nir_phi_builder.c \
nir/nir_phi_builder.h \
nir/nir_print.c \
nir/nir_propagate_invariant.c \
nir/nir_remove_dead_variables.c \
nir/nir_repair_ssa.c \
nir/nir_search.c \
nir/nir_search.h \
nir/nir_search_helpers.h \
nir/nir_split_var_copies.c \
nir/nir_sweep.c \
nir/nir_to_ssa.c \

View File

@@ -2278,10 +2278,10 @@ precision_qualifier_allowed(const glsl_type *type)
* From this, we infer that GLSL 1.30 (and later) should allow precision
* qualifiers on sampler types just like float and integer types.
*/
const glsl_type *const t = type->without_array();
return (t->is_float() || t->is_integer() || t->contains_opaque()) &&
!t->is_record();
return (type->is_float()
|| type->is_integer()
|| type->contains_opaque())
&& !type->without_array()->is_record();
}
const glsl_type *
@@ -3393,7 +3393,7 @@ apply_layout_qualifier_to_variable(const struct ast_type_qualifier *qual,
(qual_component + components - 1) > 3) {
_mesa_glsl_error(loc, state, "component overflow (%u > 3)",
(qual_component + components - 1));
} else if (qual_component == 1 && type->is_double()) {
} else if (qual_component == 1 && type->is_64bit()) {
/* We don't bother checking for 3 as it should be caught by the
* overflow check above.
*/
@@ -4697,14 +4697,6 @@ ast_declarator_list::hir(exec_list *instructions,
apply_layout_qualifier_to_variable(&this->type->qualifier, var, state,
&loc);
if ((var->data.mode == ir_var_auto || var->data.mode == ir_var_temporary)
&& (var->type->is_numeric() || var->type->is_boolean())
&& state->zero_init) {
const ir_constant_data data = {0};
var->data.has_initializer = true;
var->constant_initializer = new(var) ir_constant(var->type, &data);
}
if (this->type->qualifier.flags.q.invariant) {
if (!is_varying_var(var, state->stage)) {
_mesa_glsl_error(&loc, state,
@@ -5002,8 +4994,13 @@ ast_declarator_list::hir(exec_list *instructions,
state->check_precision_qualifiers_allowed(&loc);
}
if (this->type->qualifier.precision != ast_precision_none &&
!precision_qualifier_allowed(var->type)) {
/* If a precision qualifier is allowed on a type, it is allowed on
* an array of that type.
*/
if (!(this->type->qualifier.precision == ast_precision_none
|| precision_qualifier_allowed(var->type->without_array()))) {
_mesa_glsl_error(&loc, state,
"precision qualifiers apply only to floating point"
", integer and opaque types");
@@ -6846,7 +6843,7 @@ ast_process_struct_or_iface_block_members(exec_list *instructions,
}
} else {
if (layout && layout->flags.q.explicit_xfb_offset) {
unsigned align = field_type->is_double() ? 8 : 4;
unsigned align = field_type->is_64bit() ? 8 : 4;
fields[i].offset = glsl_align(block_xfb_offset, align);
block_xfb_offset +=
MAX2(xfb_stride, (int) (4 * field_type->component_slots()));
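The last hunk of this file picks an 8-byte alignment for 64-bit fields and a 4-byte alignment otherwise, then rounds the running transform feedback offset up to that alignment. A minimal sketch of that arithmetic follows. `align_up` and `xfb_field_offset` are hypothetical names standing in for `glsl_align` and the surrounding code, and the rounding formula is an assumption about what such a helper does.

```c
#include <assert.h>
#include <stdbool.h>

/* Round value up to the next multiple of alignment (alignment > 0).
 * Hypothetical stand-in for the glsl_align() call in the hunk.
 */
unsigned align_up(unsigned value, unsigned alignment)
{
   return (value + alignment - 1) / alignment * alignment;
}

/* 64-bit field types align to 8 bytes, everything else to 4 bytes. */
unsigned xfb_field_offset(unsigned block_xfb_offset, bool is_64bit)
{
   unsigned align = is_64bit ? 8 : 4;
   return align_up(block_xfb_offset, align);
}
```

With a running offset of 4, a 32-bit field stays at offset 4 while a 64-bit field is pushed to offset 8.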

View File

@@ -528,6 +528,12 @@ barrier_supported(const _mesa_glsl_parse_state *state)
state->stage == MESA_SHADER_TESS_CTRL;
}
static bool
vote(const _mesa_glsl_parse_state *state)
{
return state->ARB_shader_group_vote_enable;
}
/** @} */
/******************************************************************************/
@@ -853,6 +859,8 @@ private:
ir_function_signature *_shader_clock(builtin_available_predicate avail,
const glsl_type *type);
ir_function_signature *_vote(enum ir_expression_operation opcode);
#undef B0
#undef B1
#undef B2
@@ -2935,6 +2943,10 @@ builtin_builder::create_builtins()
glsl_type::uvec2_type),
NULL);
add_function("anyInvocationARB", _vote(ir_unop_vote_any), NULL);
add_function("allInvocationsARB", _vote(ir_unop_vote_all), NULL);
add_function("allInvocationsEqualARB", _vote(ir_unop_vote_eq), NULL);
#undef F
#undef FI
#undef FIUD
@@ -5576,6 +5588,16 @@ builtin_builder::_shader_clock(builtin_available_predicate avail,
return sig;
}
ir_function_signature *
builtin_builder::_vote(enum ir_expression_operation opcode)
{
ir_variable *value = in_var(glsl_type::bool_type, "value");
MAKE_SIG(glsl_type::bool_type, vote, 1, value);
body.emit(ret(expr(opcode, value)));
return sig;
}
/** @} */
/******************************************************************************/
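The three built-ins registered in this file (`anyInvocationARB`, `allInvocationsARB`, `allInvocationsEqualARB`) each take one `bool` and return one `bool`. As a rough CPU-side model of the semantics described by GL_ARB_shader_group_vote, and not of how any driver implements them, the sketch below evaluates each vote over an array of per-invocation values. The function names are hypothetical, and every invocation in the group is assumed to be active.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical CPU model of the vote built-ins. values[i] is the bool
 * passed by invocation i; all invocations are assumed to be active.
 */

/* anyInvocationARB: true if at least one invocation passed true. */
bool vote_any(const bool *values, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      if (values[i])
         return true;
   }
   return false;
}

/* allInvocationsARB: true if every invocation passed true. */
bool vote_all(const bool *values, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      if (!values[i])
         return false;
   }
   return true;
}

/* allInvocationsEqualARB: true if every invocation passed the same value. */
bool vote_eq(const bool *values, size_t count)
{
   for (size_t i = 1; i < count; i++) {
      if (values[i] != values[0])
         return false;
   }
   return true;
}
```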

View File

@@ -37,11 +37,6 @@ static const struct gl_builtin_uniform_element gl_NumSamples_elements[] = {
{NULL, {STATE_NUM_SAMPLES, 0, 0}, SWIZZLE_XXXX}
};
/* only for TCS */
static const struct gl_builtin_uniform_element gl_PatchVerticesIn_elements[] = {
{NULL, {STATE_INTERNAL, STATE_TCS_PATCH_VERTICES_IN}, SWIZZLE_XXXX}
};
static const struct gl_builtin_uniform_element gl_DepthRange_elements[] = {
{"near", {STATE_DEPTH_RANGE, 0, 0}, SWIZZLE_XXXX},
{"far", {STATE_DEPTH_RANGE, 0, 0}, SWIZZLE_YYYY},
@@ -239,7 +234,6 @@ static const struct gl_builtin_uniform_element gl_NormalMatrix_elements[] = {
#define STATEVAR(name) {#name, name ## _elements, ARRAY_SIZE(name ## _elements)}
static const struct gl_builtin_uniform_desc _mesa_builtin_uniform_desc[] = {
STATEVAR(gl_PatchVerticesIn),
STATEVAR(gl_NumSamples),
STATEVAR(gl_DepthRange),
STATEVAR(gl_ClipPlane),
@@ -1035,14 +1029,9 @@ void
builtin_variable_generator::generate_tcs_special_vars()
{
add_system_value(SYSTEM_VALUE_PRIMITIVE_ID, int_t, "gl_PrimitiveID");
add_system_value(SYSTEM_VALUE_VERTICES_IN, int_t, "gl_PatchVerticesIn");
add_system_value(SYSTEM_VALUE_INVOCATION_ID, int_t, "gl_InvocationID");
if (state->ctx->Const.LowerTCSPatchVerticesIn) {
add_uniform(int_t, "gl_PatchVerticesIn");
} else {
add_system_value(SYSTEM_VALUE_VERTICES_IN, int_t, "gl_PatchVerticesIn");
}
add_output(VARYING_SLOT_TESS_LEVEL_OUTER, array(float_t, 4),
"gl_TessLevelOuter")->data.patch = 1;
add_output(VARYING_SLOT_TESS_LEVEL_INNER, array(float_t, 2),

View File

@@ -278,34 +278,10 @@ control_line_success:
HASH_TOKEN DEFINE_TOKEN define
| HASH_TOKEN UNDEF IDENTIFIER NEWLINE {
macro_t *macro;
/* Section 3.4 (Preprocessor) of the GLSL ES 3.00 spec says:
*
* It is an error to undefine or to redefine a built-in
* (pre-defined) macro name.
*
* The GLSL ES 1.00 spec does not contain this text.
*
* Section 3.3 (Preprocessor) of the GLSL 1.30 spec says:
*
* #define and #undef functionality are defined as is
* standard for C++ preprocessors for macro definitions
* both with and without macro parameters.
*
* At least as far as I can tell GCC allows '#undef __FILE__'.
* Furthermore, there are desktop OpenGL conformance tests
* that expect '#undef __VERSION__' and '#undef
* GL_core_profile' to work.
*
* Only disallow #undef of pre-defined macros on GLSL ES >=
* 3.00 shaders.
*/
if (parser->is_gles &&
parser->version >= 300 &&
(strcmp("__LINE__", $3) == 0
|| strcmp("__FILE__", $3) == 0
|| strcmp("__VERSION__", $3) == 0
|| strncmp("GL_", $3, 3) == 0))
if (strcmp("__LINE__", $3) == 0
|| strcmp("__FILE__", $3) == 0
|| strcmp("__VERSION__", $3) == 0
|| strncmp("GL_", $3, 3) == 0)
glcpp_error(& @1, parser, "Built-in (pre-defined)"
" macro names cannot be undefined.");
@@ -420,13 +396,13 @@ control_line_success:
_glcpp_parser_skip_stack_pop (parser, & @1);
} NEWLINE
| HASH_TOKEN VERSION_TOKEN integer_constant NEWLINE {
if (parser->version != 0) {
if (parser->version_resolved) {
glcpp_error(& @1, parser, "#version must appear on the first line");
}
_glcpp_parser_handle_version_declaration(parser, $3, NULL, true);
}
| HASH_TOKEN VERSION_TOKEN integer_constant IDENTIFIER NEWLINE {
if (parser->version != 0) {
if (parser->version_resolved) {
glcpp_error(& @1, parser, "#version must appear on the first line");
}
_glcpp_parser_handle_version_declaration(parser, $3, $4, true);
@@ -1369,7 +1345,7 @@ glcpp_parser_create(const struct gl_extensions *extensions, gl_api api)
parser->extensions = extensions;
parser->api = api;
parser->version = 0;
parser->version_resolved = false;
parser->has_new_line_number = 0;
parser->new_line_number = 1;
@@ -2305,10 +2281,10 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio
{
const struct gl_extensions *extensions = parser->extensions;
if (parser->version != 0)
if (parser->version_resolved)
return;
parser->version = version;
parser->version_resolved = true;
add_builtin_define (parser, "__VERSION__", version);
@@ -2491,6 +2467,9 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio
if (extensions->ARB_cull_distance)
add_builtin_define(parser, "GL_ARB_cull_distance", 1);
if (extensions->ARB_shader_group_vote)
add_builtin_define(parser, "GL_ARB_shader_group_vote", 1);
}
}
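The `#undef` rule in this file's first hunk applies the same four name tests in both of its variants. One variant additionally limits the error to GLSL ES 3.00 and later. The sketch below separates the two pieces so the difference is easy to see. The function names are hypothetical, and the parser state is reduced to two plain parameters.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True for names the preprocessor treats as built-in (pre-defined). */
bool is_builtin_macro_name(const char *name)
{
   return strcmp("__LINE__", name) == 0 ||
          strcmp("__FILE__", name) == 0 ||
          strcmp("__VERSION__", name) == 0 ||
          strncmp("GL_", name, 3) == 0;
}

/* Gated variant: only GLSL ES >= 3.00 rejects #undef of a built-in name. */
bool undef_is_error(const char *name, bool is_gles, unsigned version)
{
   return is_gles && version >= 300 && is_builtin_macro_name(name);
}
```

Under the gated variant, `#undef __VERSION__` in a `#version 110` desktop shader is accepted, while the same line in a `#version 300 es` shader is an error. The ungated variant corresponds to calling `is_builtin_macro_name` alone.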

View File

@@ -196,7 +196,7 @@ struct glcpp_parser {
int error;
const struct gl_extensions *extensions;
gl_api api;
unsigned version;
bool version_resolved;
bool has_new_line_number;
int new_line_number;
bool has_new_source_number;
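The field swapped in this struct is what the `#version` handling consults to decide whether a version has already been declared: either a numeric `version` compared against 0, or an explicit `version_resolved` flag. A small sketch of both checks follows, using a hypothetical cut-down struct that keeps both fields side by side purely for comparison.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, cut-down parser state holding both tracking styles. */
struct mini_parser {
   unsigned version;        /* 0 doubles as "no #version seen yet" */
   bool version_resolved;   /* explicit "a #version was handled" flag */
};

/* Numeric tracking: refuse if a non-zero version was already recorded. */
bool declare_version_numeric(struct mini_parser *p, unsigned version)
{
   if (p->version != 0)
      return false;
   p->version = version;
   return true;
}

/* Flag tracking: refuse if any version was already recorded. */
bool declare_version_flag(struct mini_parser *p, unsigned version)
{
   if (p->version_resolved)
      return false;
   p->version = version;
   p->version_resolved = true;
   return true;
}
```

The trade-off in this model: the numeric form keeps the declared value available for later checks such as `version >= 300`, but cannot tell a declared version of 0 from no declaration. The flag form makes that distinction but stores no value unless a separate field does.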

View File

@@ -1,4 +1,3 @@
#version 300 es
#undef __LINE__
#undef __FILE__
#undef __VERSION__

View File

@@ -1,7 +1,6 @@
0:1(1): preprocessor error: Built-in (pre-defined) macro names cannot be undefined.
0:2(1): preprocessor error: Built-in (pre-defined) macro names cannot be undefined.
0:3(1): preprocessor error: Built-in (pre-defined) macro names cannot be undefined.
0:4(1): preprocessor error: Built-in (pre-defined) macro names cannot be undefined.
#version 300 es

View File

@@ -1,4 +0,0 @@
#version 110
#undef __LINE__
#undef __FILE__
#undef __VERSION__

View File

@@ -1,4 +0,0 @@
#version 110

View File

@@ -348,10 +348,10 @@ isampler2DMSArray KEYWORD_WITH_ALT(150, 300, 150, 320, yyextra->ARB_texture_mul
usampler2DMSArray KEYWORD_WITH_ALT(150, 300, 150, 320, yyextra->ARB_texture_multisample_enable || yyextra->OES_texture_storage_multisample_2d_array_enable, USAMPLER2DMSARRAY);
/* keywords available with ARB_texture_cube_map_array_enable extension on desktop GLSL */
samplerCubeArray KEYWORD_WITH_ALT(400, 310, 400, 0, yyextra->ARB_texture_cube_map_array_enable, SAMPLERCUBEARRAY);
isamplerCubeArray KEYWORD_WITH_ALT(400, 310, 400, 0, yyextra->ARB_texture_cube_map_array_enable, ISAMPLERCUBEARRAY);
usamplerCubeArray KEYWORD_WITH_ALT(400, 310, 400, 0, yyextra->ARB_texture_cube_map_array_enable, USAMPLERCUBEARRAY);
samplerCubeArrayShadow KEYWORD_WITH_ALT(400, 310, 400, 0, yyextra->ARB_texture_cube_map_array_enable, SAMPLERCUBEARRAYSHADOW);
samplerCubeArray KEYWORD_WITH_ALT(400, 0, 400, 0, yyextra->ARB_texture_cube_map_array_enable, SAMPLERCUBEARRAY);
isamplerCubeArray KEYWORD_WITH_ALT(400, 0, 400, 0, yyextra->ARB_texture_cube_map_array_enable, ISAMPLERCUBEARRAY);
usamplerCubeArray KEYWORD_WITH_ALT(400, 0, 400, 0, yyextra->ARB_texture_cube_map_array_enable, USAMPLERCUBEARRAY);
samplerCubeArrayShadow KEYWORD_WITH_ALT(400, 0, 400, 0, yyextra->ARB_texture_cube_map_array_enable, SAMPLERCUBEARRAYSHADOW);
samplerExternalOES {
if (yyextra->OES_EGL_image_external_enable)

View File

@@ -1784,10 +1784,8 @@ type_qualifier:
* variables. As only outputs can be declared as invariant, an invariant
* output from one shader stage will still match an input of a subsequent
* stage without the input being declared as invariant."
*
* On the desktop side, this text first appears in GLSL 4.30.
*/
if (state->is_version(430, 300) && $$.flags.q.in)
if (state->es_shader && state->language_version >= 300 && $$.flags.q.in)
_mesa_glsl_error(&@1, state, "invariant qualifiers cannot be used with shader inputs");
}
| interpolation_qualifier type_qualifier

View File

@@ -74,7 +74,6 @@ _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct gl_context *_ctx,
/* Set default language version and extensions */
this->language_version = 110;
this->forced_language_version = ctx->Const.ForceGLSLVersion;
this->zero_init = ctx->Const.GLSLZeroInit;
this->es_shader = false;
this->ARB_texture_rectangle_enable = true;
@@ -595,6 +594,7 @@ static const _mesa_glsl_extension _mesa_glsl_supported_extensions[] = {
EXT(ARB_shader_bit_encoding, true, false, ARB_shader_bit_encoding),
EXT(ARB_shader_clock, true, false, ARB_shader_clock),
EXT(ARB_shader_draw_parameters, true, false, ARB_shader_draw_parameters),
EXT(ARB_shader_group_vote, true, false, ARB_shader_group_vote),
EXT(ARB_shader_image_load_store, true, false, ARB_shader_image_load_store),
EXT(ARB_shader_image_size, true, false, ARB_shader_image_size),
EXT(ARB_shader_precision, true, false, ARB_shader_precision),
@@ -1603,6 +1603,7 @@ ast_struct_specifier::ast_struct_specifier(const char *identifier,
name = identifier;
this->declarations.push_degenerate_list_at_head(&declarator_list->link);
is_declaration = true;
layout = NULL;
}
void ast_subroutine_list::print(void) const

View File

@@ -306,7 +306,6 @@ struct _mesa_glsl_parse_state {
bool es_shader;
unsigned language_version;
unsigned forced_language_version;
bool zero_init;
gl_shader_stage stage;
/**
@@ -576,6 +575,8 @@ struct _mesa_glsl_parse_state {
bool ARB_shader_clock_warn;
bool ARB_shader_draw_parameters_enable;
bool ARB_shader_draw_parameters_warn;
bool ARB_shader_group_vote_enable;
bool ARB_shader_group_vote_warn;
bool ARB_shader_image_load_store_enable;
bool ARB_shader_image_load_store_warn;
bool ARB_shader_image_size_enable;

View File

@@ -1284,9 +1284,6 @@ nir_visitor::visit(ir_expression *ir)
intrin->intrinsic == nir_intrinsic_interp_var_at_sample)
intrin->src[0] = nir_src_for_ssa(evaluate_rvalue(ir->operands[1]));
if (intrin->intrinsic == nir_intrinsic_interp_var_at_offset)
shader->info.uses_interp_var_at_offset = true;
unsigned bit_size = glsl_get_bit_size(deref->type);
add_instr(&intrin->instr, deref->type->vector_elements, bit_size);

View File

@@ -341,6 +341,12 @@ ir_expression::ir_expression(int op, ir_rvalue *op0)
this->type = glsl_type::int_type;
break;
case ir_unop_vote_any:
case ir_unop_vote_all:
case ir_unop_vote_eq:
this->type = glsl_type::bool_type;
break;
default:
assert(!"not reached: missing automatic type setup for ir_expression");
this->type = op0->type;
@@ -563,6 +569,9 @@ static const char *const operator_strs[] = {
"interpolate_at_centroid",
"get_buffer_size",
"ssbo_unsized_array_length",
"vote_any",
"vote_all",
"vote_eq",
"+",
"-",
"*",

View File

@@ -537,6 +537,10 @@ public:
return this->interface_type;
}
enum glsl_interface_packing get_interface_type_packing() const
{
return this->interface_type->get_interface_packing();
}
/**
* Get the max_ifc_array_access pointer
*
@@ -586,13 +590,6 @@ public:
return this->u.state_slots;
}
inline bool is_interpolation_flat() const
{
return this->data.interpolation == INTERP_QUALIFIER_FLAT ||
this->type->contains_integer() ||
this->type->contains_double();
}
inline bool is_name_ralloced() const
{
return this->name != ir_variable::tmp_name;
@@ -1484,10 +1481,17 @@ enum ir_expression_operation {
*/
ir_unop_ssbo_unsized_array_length,
/**
* Vote among threads on the value of the boolean argument.
*/
ir_unop_vote_any,
ir_unop_vote_all,
ir_unop_vote_eq,
/**
* A sentinel marking the last of the unary operations.
*/
ir_last_unop = ir_unop_ssbo_unsized_array_length,
ir_last_unop = ir_unop_vote_eq,
ir_binop_add,
ir_binop_sub,

File diff suppressed because it is too large

View File

@@ -147,7 +147,7 @@ ir_expression::accept(ir_hierarchical_visitor *v)
goto done;
case visit_stop:
return visit_stop;
return s;
}
}

View File

@@ -119,7 +119,7 @@ mark(struct gl_program *prog, ir_variable *var, int offset, int len,
/* double inputs read is only for vertex inputs */
if (stage == MESA_SHADER_VERTEX &&
var->type->without_array()->is_dual_slot_double())
var->type->without_array()->is_dual_slot())
prog->DoubleInputsRead |= bitfield;
if (stage == MESA_SHADER_FRAGMENT) {
@@ -260,19 +260,15 @@ ir_set_program_inouts_visitor::try_mark_partial_variable(ir_variable *var,
* lowering passes (do_vec_index_to_swizzle() gets rid of indexing into
* vectors, and lower_packed_varyings() gets rid of structs that occur in
* varyings).
*
* However, we don't use varying packing in all cases - tessellation
* shaders bypass it. This means we'll see varying structs and arrays
* of structs here. For now, we just give up so the caller marks the
* entire variable as used.
*/
if (!(type->is_matrix() ||
(type->is_array() &&
(type->fields.array->is_numeric() ||
type->fields.array->is_boolean())))) {
assert(!"Unexpected indexing in ir_set_program_inouts");
/* If we don't know how to handle this case, give up and let the
* caller mark the whole variable as used.
/* For safety in release builds, in case we ever encounter unexpected
* indexing, give up and let the caller mark the whole variable as used.
*/
return false;
}
@@ -310,7 +306,7 @@ ir_set_program_inouts_visitor::try_mark_partial_variable(ir_variable *var,
/* double element width for double types that take two slots */
if (this->shader_stage != MESA_SHADER_VERTEX ||
var->data.mode != ir_var_shader_in) {
if (type->without_array()->is_dual_slot_double())
if (type->without_array()->is_dual_slot())
elem_width *= 2;
}
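The final hunk of this file doubles the per-element width for dual-slot types everywhere except vertex shader inputs. A minimal sketch of that rule follows, with the stage and variable-mode tests collapsed into one hypothetical `is_vertex_input` parameter.

```c
#include <assert.h>
#include <stdbool.h>

/* Slots occupied by one element. Dual-slot types take twice the base
 * width, except when the variable is a vertex shader input.
 */
unsigned element_slot_width(unsigned base_width, bool is_dual_slot,
                            bool is_vertex_input)
{
   if (!is_vertex_input && is_dual_slot)
      return base_width * 2;
   return base_width;
}
```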

View File

@@ -453,6 +453,14 @@ ir_validate::visit_leave(ir_expression *ir)
assert(ir->operands[0]->type->base_type == GLSL_TYPE_SUBROUTINE);
assert(ir->type->base_type == GLSL_TYPE_INT);
break;
case ir_unop_vote_any:
case ir_unop_vote_all:
case ir_unop_vote_eq:
assert(ir->type == glsl_type::bool_type);
assert(ir->operands[0]->type == glsl_type::bool_type);
break;
case ir_binop_add:
case ir_binop_sub:
case ir_binop_mul:


@@ -167,8 +167,7 @@ link_uniform_block_active_visitor::visit(ir_variable *var)
* also considered active, even if no member of the block is
* referenced."
*/
if (var->get_interface_type()->interface_packing ==
GLSL_INTERFACE_PACKING_PACKED)
if (var->get_interface_type_packing() == GLSL_INTERFACE_PACKING_PACKED)
return visit_continue;
/* Process the block. Bail if there was an error.
@@ -258,8 +257,7 @@ link_uniform_block_active_visitor::visit_enter(ir_dereference_array *ir)
* std140 layout qualifier, all its instances have been already marked
* as used in link_uniform_block_active_visitor::visit(ir_variable *).
*/
if (var->get_interface_type()->interface_packing ==
GLSL_INTERFACE_PACKING_PACKED) {
if (var->get_interface_type_packing() == GLSL_INTERFACE_PACKING_PACKED) {
b->var = var;
process_arrays(this->mem_ctx, ir, b);
}


@@ -70,7 +70,7 @@ private:
}
virtual void enter_record(const glsl_type *type, const char *,
bool row_major, const unsigned packing) {
bool row_major, const enum glsl_interface_packing packing) {
assert(type->is_record());
if (packing == GLSL_INTERFACE_PACKING_STD430)
this->offset = glsl_align(
@@ -81,7 +81,7 @@ private:
}
virtual void leave_record(const glsl_type *type, const char *,
bool row_major, const unsigned packing) {
bool row_major, const enum glsl_interface_packing packing) {
assert(type->is_record());
/* If this is the last field of a structure, apply rule #9. The
@@ -106,7 +106,7 @@ private:
virtual void visit_field(const glsl_type *type, const char *name,
bool row_major, const glsl_type *,
const unsigned packing,
const enum glsl_interface_packing packing,
bool last_field)
{
assert(this->index < this->num_variables);


@@ -222,7 +222,7 @@ set_uniform_initializer(void *mem_ctx, gl_shader_program *prog,
val->array_elements[0]->type->base_type;
const unsigned int elements = val->array_elements[0]->type->components();
unsigned int idx = 0;
unsigned dmul = (base_type == GLSL_TYPE_DOUBLE) ? 2 : 1;
unsigned dmul = glsl_base_type_is_64bit(base_type) ? 2 : 1;
assert(val->type->length >= storage->array_elements);
for (unsigned int i = 0; i < storage->array_elements; i++) {
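
The `dmul` change in the hunk above can be read as: a 64-bit base type takes two 32-bit storage slots per component, so the initializer copy advances twice as far per element. A minimal standalone sketch of that idea follows; `BaseType`, `base_type_is_64bit`, and `storage_slots` are hypothetical stand-ins for illustration, not Mesa's definitions, and the sketch assumes only doubles count as 64-bit.

```cpp
#include <cassert>

// Hypothetical stand-in for a few GLSL base types; not Mesa's glsl_base_type.
enum class BaseType { Float, Int, Double };

// Assumption for this sketch: only Double is a 64-bit base type.
static bool base_type_is_64bit(BaseType t)
{
    return t == BaseType::Double;
}

// Number of 32-bit storage slots needed for `elements` components.
static unsigned storage_slots(BaseType t, unsigned elements)
{
    const unsigned dmul = base_type_is_64bit(t) ? 2 : 1;
    return elements * dmul;
}
```

Under these assumptions a `dvec3` needs six slots where a `vec3` needs three.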


@@ -65,7 +65,7 @@ program_resource_visitor::process(const glsl_type *type, const char *name)
unsigned record_array_count = 1;
char *name_copy = ralloc_strdup(NULL, name);
unsigned packing = type->interface_packing;
enum glsl_interface_packing packing = type->get_interface_packing();
recursion(type, &name_copy, strlen(name), false, NULL, packing, false,
record_array_count, NULL);
@@ -79,9 +79,9 @@ program_resource_visitor::process(ir_variable *var)
const bool row_major =
var->data.matrix_layout == GLSL_MATRIX_LAYOUT_ROW_MAJOR;
const unsigned packing = var->get_interface_type() ?
var->get_interface_type()->interface_packing :
var->type->interface_packing;
const enum glsl_interface_packing packing = var->get_interface_type() ?
var->get_interface_type_packing() :
var->type->get_interface_packing();
const glsl_type *t =
var->data.from_named_ifc_block ? var->get_interface_type() : var->type;
@@ -116,7 +116,7 @@ void
program_resource_visitor::recursion(const glsl_type *t, char **name,
size_t name_length, bool row_major,
const glsl_type *record_type,
const unsigned packing,
const enum glsl_interface_packing packing,
bool last_field,
unsigned record_array_count,
const glsl_struct_field *named_ifc_member)
@@ -228,7 +228,7 @@ void
program_resource_visitor::visit_field(const glsl_type *type, const char *name,
bool row_major,
const glsl_type *,
const unsigned,
const enum glsl_interface_packing,
bool /* last_field */)
{
visit_field(type, name, row_major);
@@ -243,13 +243,13 @@ program_resource_visitor::visit_field(const glsl_struct_field *field)
void
program_resource_visitor::enter_record(const glsl_type *, const char *, bool,
const unsigned)
const enum glsl_interface_packing)
{
}
void
program_resource_visitor::leave_record(const glsl_type *, const char *, bool,
const unsigned)
const enum glsl_interface_packing)
{
}
@@ -402,7 +402,9 @@ private:
* uniforms.
*/
this->num_active_uniforms++;
this->num_values += values;
if(!is_gl_identifier(name) && !is_shader_storage)
this->num_values += values;
}
struct string_to_uint_map *hidden_map;
@@ -660,7 +662,7 @@ private:
}
virtual void enter_record(const glsl_type *type, const char *,
bool row_major, const unsigned packing) {
bool row_major, const enum glsl_interface_packing packing) {
assert(type->is_record());
if (this->buffer_block_index == -1)
return;
@@ -673,7 +675,7 @@ private:
}
virtual void leave_record(const glsl_type *type, const char *,
bool row_major, const unsigned packing) {
bool row_major, const enum glsl_interface_packing packing) {
assert(type->is_record());
if (this->buffer_block_index == -1)
return;
@@ -687,7 +689,7 @@ private:
virtual void visit_field(const glsl_type *type, const char *name,
bool row_major, const glsl_type * /* record_type */,
const unsigned packing,
const enum glsl_interface_packing packing,
bool /* last_field */)
{
assert(!type->without_array()->is_record());
@@ -762,13 +764,14 @@ private:
current_var->data.how_declared == ir_var_hidden;
this->uniforms[id].builtin = is_gl_identifier(name);
/* Do not assign storage if the uniform is builtin */
if (!this->uniforms[id].builtin)
this->uniforms[id].storage = this->values;
this->uniforms[id].is_shader_storage =
current_var->is_in_shader_storage_block();
/* Do not assign storage if the uniform is builtin */
if (!this->uniforms[id].builtin &&
!this->uniforms[id].is_shader_storage)
this->uniforms[id].storage = this->values;
if (this->buffer_block_index != -1) {
this->uniforms[id].block_index = this->buffer_block_index;
@@ -819,7 +822,9 @@ private:
this->uniforms[id].row_major = false;
}
this->values += values_for_type(type);
if (!this->uniforms[id].builtin &&
!this->uniforms[id].is_shader_storage)
this->values += values_for_type(type);
}
/**
@@ -1251,7 +1256,8 @@ link_assign_uniform_locations(struct gl_shader_program *prog,
#ifndef NDEBUG
for (unsigned i = 0; i < num_uniforms; i++) {
assert(uniforms[i].storage != NULL || uniforms[i].builtin);
assert(uniforms[i].storage != NULL || uniforms[i].builtin ||
uniforms[i].is_shader_storage);
}
assert(parcel.values == data_end);


@@ -308,25 +308,7 @@ cross_validate_types_and_qualifiers(struct gl_shader_program *prog,
return;
}
/* The GLSL 4.30 and GLSL ES 3.00 specifications say:
*
* "As only outputs need be declared with invariant, an output from
* one shader stage will still match an input of a subsequent stage
* without the input being declared as invariant."
*
* while GLSL 4.20 says:
*
* "For variables leaving one shader and coming into another shader,
* the invariant keyword has to be used in both shaders, or a link
* error will result."
*
* and GLSL ES 1.00 section 4.6.4 "Invariance and Linking" says:
*
* "The invariance of varyings that are declared in both the vertex
* and fragment shaders must match."
*/
if (input->data.invariant != output->data.invariant &&
prog->Version < (prog->IsES ? 300 : 430)) {
if (!prog->IsES && input->data.invariant != output->data.invariant) {
linker_error(prog,
"%s shader output `%s' %s invariant qualifier, "
"but %s shader input %s invariant qualifier\n",
@@ -415,15 +397,15 @@ cross_validate_outputs_to_inputs(struct gl_shader_program *prog,
unsigned slot_limit = idx + num_elements;
unsigned last_comp;
if (var->type->without_array()->is_record()) {
if (type->without_array()->is_record()) {
/* The component qualifier can't be used on structs so just treat
* all component slots as used.
*/
last_comp = 4;
} else {
unsigned dmul = var->type->is_double() ? 2 : 1;
unsigned dmul = type->without_array()->is_64bit() ? 2 : 1;
last_comp = var->data.location_frac +
var->type->without_array()->vector_elements * dmul;
type->without_array()->vector_elements * dmul;
}
while (idx < slot_limit) {
@@ -443,7 +425,7 @@ cross_validate_outputs_to_inputs(struct gl_shader_program *prog,
for (unsigned j = 0; j < 4; j++) {
if (explicit_locations[idx][j] &&
(explicit_locations[idx][j]->type->without_array()
->base_type != var->type->without_array()->base_type)) {
->base_type != type->without_array()->base_type)) {
linker_error(prog,
"Varyings sharing the same location must "
"have the same underlying numerical type. "
@@ -461,7 +443,7 @@ cross_validate_outputs_to_inputs(struct gl_shader_program *prog,
* worry about components beginning at anything other than 0 as
* the spec does not allow this for dvec3 and dvec4.
*/
if (i == 3 && last_comp > 4) {
if (i == 4 && last_comp > 4) {
last_comp = last_comp - 4;
/* Bump location index and reset the component index */
idx++;
@@ -726,7 +708,7 @@ tfeedback_decl::assign_location(struct gl_context *ctx,
+ this->matched_candidate->toplevel_var->data.location_frac
+ this->matched_candidate->offset;
const unsigned dmul =
this->matched_candidate->type->without_array()->is_double() ? 2 : 1;
this->matched_candidate->type->without_array()->is_64bit() ? 2 : 1;
if (this->matched_candidate->type->is_array()) {
/* Array variable */
@@ -904,7 +886,7 @@ tfeedback_decl::store(struct gl_context *ctx, struct gl_shader_program *prog,
}
if (explicit_stride && explicit_stride[buffer]) {
if (this->is_double() && info->Buffers[buffer].Stride % 2) {
if (this->is_64bit() && info->Buffers[buffer].Stride % 2) {
linker_error(prog, "invalid qualifier xfb_stride=%d must be a "
"multiple of 8 as its applied to a type that is or "
"contains a double.",
@@ -1628,8 +1610,7 @@ varying_matches::compute_packing_class(const ir_variable *var)
unsigned packing_class = var->data.centroid | (var->data.sample << 1) |
(var->data.patch << 2);
packing_class *= 4;
packing_class += var->is_interpolation_flat()
? unsigned(INTERP_QUALIFIER_FLAT) : var->data.interpolation;
packing_class += var->data.interpolation;
return packing_class;
}
@@ -1956,7 +1937,7 @@ canonicalize_shader_io(exec_list *ir, enum ir_variable_mode io_mode)
* 64 bit map. Per-vertex and per-patch both have separate location domains
* with a max of MAX_VARYING.
*/
static uint64_t
uint64_t
reserved_varying_slot(struct gl_shader *stage, ir_variable_mode io_mode)
{
assert(io_mode == ir_var_shader_in || io_mode == ir_var_shader_out);
@@ -2018,7 +1999,8 @@ assign_varying_locations(struct gl_context *ctx,
struct gl_shader_program *prog,
gl_shader *producer, gl_shader *consumer,
unsigned num_tfeedback_decls,
tfeedback_decl *tfeedback_decls)
tfeedback_decl *tfeedback_decls,
const uint64_t reserved_slots)
{
/* Tessellation shaders treat inputs and outputs as shared memory and can
* access inputs and outputs of other invocations.
@@ -2196,10 +2178,6 @@ assign_varying_locations(struct gl_context *ctx,
}
}
const uint64_t reserved_slots =
reserved_varying_slot(producer, ir_var_shader_out) |
reserved_varying_slot(consumer, ir_var_shader_in);
const unsigned slots_used = matches.assign_locations(prog, reserved_slots);
matches.store_locations();
@@ -2282,14 +2260,16 @@ assign_varying_locations(struct gl_context *ctx,
bool
check_against_output_limit(struct gl_context *ctx,
struct gl_shader_program *prog,
gl_shader *producer)
gl_shader *producer,
unsigned num_explicit_locations)
{
unsigned output_vectors = 0;
unsigned output_vectors = num_explicit_locations;
foreach_in_list(ir_instruction, node, producer->ir) {
ir_variable *const var = node->as_variable();
if (var && var->data.mode == ir_var_shader_out &&
if (var && !var->data.explicit_location &&
var->data.mode == ir_var_shader_out &&
var_counts_against_varying_limit(producer->Stage, var)) {
/* outputs for fragment shader can't be doubles */
output_vectors += var->type->count_attribute_slots(false);
@@ -2324,14 +2304,16 @@ check_against_output_limit(struct gl_context *ctx,
bool
check_against_input_limit(struct gl_context *ctx,
struct gl_shader_program *prog,
gl_shader *consumer)
gl_shader *consumer,
unsigned num_explicit_locations)
{
unsigned input_vectors = 0;
unsigned input_vectors = num_explicit_locations;
foreach_in_list(ir_instruction, node, consumer->ir) {
ir_variable *const var = node->as_variable();
if (var && var->data.mode == ir_var_shader_in &&
if (var && !var->data.explicit_location &&
var->data.mode == ir_var_shader_in &&
var_counts_against_varying_limit(consumer->Stage, var)) {
/* vertex inputs aren't varying counted */
input_vectors += var->type->count_attribute_slots(false);


@@ -151,7 +151,7 @@ public:
return this->size;
else
return this->vector_elements * this->matrix_columns * this->size *
(this->is_double() ? 2 : 1);
(this->is_64bit() ? 2 : 1);
}
unsigned get_location() const {
@@ -160,7 +160,7 @@ public:
private:
bool is_double() const
bool is_64bit() const
{
switch (this->type) {
case GL_DOUBLE:
@@ -320,16 +320,22 @@ assign_varying_locations(struct gl_context *ctx,
struct gl_shader_program *prog,
gl_shader *producer, gl_shader *consumer,
unsigned num_tfeedback_decls,
tfeedback_decl *tfeedback_decls);
tfeedback_decl *tfeedback_decls,
const uint64_t reserved_slots);
uint64_t
reserved_varying_slot(struct gl_shader *stage, ir_variable_mode io_mode);
bool
check_against_output_limit(struct gl_context *ctx,
struct gl_shader_program *prog,
gl_shader *producer);
gl_shader *producer,
unsigned num_explicit_locations);
bool
check_against_input_limit(struct gl_context *ctx,
struct gl_shader_program *prog,
gl_shader *consumer);
gl_shader *consumer,
unsigned num_explicit_locations);
#endif /* GLSL_LINK_VARYINGS_H */


@@ -72,7 +72,6 @@
#include "ir.h"
#include "program.h"
#include "program/hash_table.h"
#include "program/prog_instruction.h"
#include "linker.h"
#include "link_varyings.h"
#include "ir_optimization.h"
@@ -2486,7 +2485,7 @@ resize_tes_inputs(struct gl_context *ctx,
ir->accept(&input_resize_visitor);
}
if (tcs || ctx->Const.LowerTESPatchVerticesIn) {
if (tcs) {
/* Convert the gl_PatchVerticesIn system value into a constant, since
* the value is known at this point.
*/
@@ -2495,22 +2494,9 @@ resize_tes_inputs(struct gl_context *ctx,
if (var && var->data.mode == ir_var_system_value &&
var->data.location == SYSTEM_VALUE_VERTICES_IN) {
void *mem_ctx = ralloc_parent(var);
var->data.mode = ir_var_auto;
var->data.location = 0;
var->data.explicit_location = false;
if (tcs) {
var->data.mode = ir_var_auto;
var->constant_value = new(mem_ctx) ir_constant(num_vertices);
} else {
var->data.mode = ir_var_uniform;
var->data.how_declared = ir_var_hidden;
var->allocate_state_slots(1);
ir_state_slot *slot0 = &var->get_state_slots()[0];
slot0->swizzle = SWIZZLE_XXXX;
slot0->tokens[0] = STATE_INTERNAL;
slot0->tokens[1] = STATE_TES_PATCH_VERTICES_IN;
for (int i = 2; i < STATE_LENGTH; i++)
slot0->tokens[i] = 0;
}
var->constant_value = new(mem_ctx) ir_constant(num_vertices);
}
}
}
@@ -2877,7 +2863,7 @@ assign_attribute_or_color_locations(gl_shader_program *prog,
* issue (3) of the GL_ARB_vertex_attrib_64bit behavior, this
* is optional behavior, but it seems preferable.
*/
if (var->type->without_array()->is_dual_slot_double())
if (var->type->without_array()->is_dual_slot())
double_storage_locations |= (use_mask << attr);
}
@@ -2954,7 +2940,7 @@ assign_attribute_or_color_locations(gl_shader_program *prog,
to_assign[i].var->data.is_unmatched_generic_inout = 0;
used_locations |= (use_mask << location);
if (to_assign[i].var->type->without_array()->is_dual_slot_double())
if (to_assign[i].var->type->without_array()->is_dual_slot())
double_storage_locations |= (use_mask << location);
}
@@ -3687,18 +3673,6 @@ create_shader_variable(struct gl_shader_program *shProg,
if (in->data.mode == ir_var_system_value &&
in->data.location == SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) {
out->name = ralloc_strdup(shProg, "gl_VertexID");
} else if ((in->data.mode == ir_var_shader_out &&
in->data.location == VARYING_SLOT_TESS_LEVEL_OUTER) ||
(in->data.mode == ir_var_system_value &&
in->data.location == SYSTEM_VALUE_TESS_LEVEL_OUTER)) {
out->name = ralloc_strdup(shProg, "gl_TessLevelOuter");
type = glsl_type::get_array_instance(glsl_type::float_type, 4);
} else if ((in->data.mode == ir_var_shader_out &&
in->data.location == VARYING_SLOT_TESS_LEVEL_INNER) ||
(in->data.mode == ir_var_system_value &&
in->data.location == SYSTEM_VALUE_TESS_LEVEL_INNER)) {
out->name = ralloc_strdup(shProg, "gl_TessLevelInner");
type = glsl_type::get_array_instance(glsl_type::float_type, 2);
} else {
out->name = ralloc_strdup(shProg, name);
}
@@ -3851,9 +3825,6 @@ add_interface_variables(struct gl_shader_program *shProg,
continue;
};
if (var->data.patch)
loc_bias = int(VARYING_SLOT_PATCH0);
/* Skip packed varyings, packed varyings are handled separately
* by add_packed_varyings.
*/
@@ -4783,7 +4754,6 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
;
lower_const_arrays_to_uniforms(prog->_LinkedShaders[i]->ir);
propagate_invariance(prog->_LinkedShaders[i]->ir);
}
/* Validation for special cases where we allow sampler array indexing
@@ -4880,9 +4850,12 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
*/
if (last < MESA_SHADER_FRAGMENT &&
(num_tfeedback_decls != 0 || prog->SeparateShader)) {
const uint64_t reserved_out_slots =
reserved_varying_slot(prog->_LinkedShaders[last], ir_var_shader_out);
if (!assign_varying_locations(ctx, mem_ctx, prog,
prog->_LinkedShaders[last], NULL,
num_tfeedback_decls, tfeedback_decls))
num_tfeedback_decls, tfeedback_decls,
reserved_out_slots))
goto done;
}
@@ -4900,6 +4873,9 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
gl_shader *const sh = prog->_LinkedShaders[last];
if (prog->SeparateShader) {
const uint64_t reserved_slots =
reserved_varying_slot(sh, ir_var_shader_in);
/* Assign input locations for SSO, output locations are already
* assigned.
*/
@@ -4907,7 +4883,8 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
NULL /* producer */,
sh /* consumer */,
0 /* num_tfeedback_decls */,
NULL /* tfeedback_decls */))
NULL /* tfeedback_decls */,
reserved_slots))
goto done;
}
@@ -4928,9 +4905,15 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
gl_shader *const sh_i = prog->_LinkedShaders[i];
gl_shader *const sh_next = prog->_LinkedShaders[next];
const uint64_t reserved_out_slots =
reserved_varying_slot(sh_i, ir_var_shader_out);
const uint64_t reserved_in_slots =
reserved_varying_slot(sh_next, ir_var_shader_in);
if (!assign_varying_locations(ctx, mem_ctx, prog, sh_i, sh_next,
next == MESA_SHADER_FRAGMENT ? num_tfeedback_decls : 0,
tfeedback_decls))
tfeedback_decls,
reserved_out_slots | reserved_in_slots))
goto done;
do_dead_builtin_varyings(ctx, sh_i, sh_next,
@@ -4939,11 +4922,14 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
/* This must be done after all dead varyings are eliminated. */
if (sh_i != NULL) {
if (!check_against_output_limit(ctx, prog, sh_i)) {
unsigned slots_used = _mesa_bitcount_64(reserved_out_slots);
if (!check_against_output_limit(ctx, prog, sh_i, slots_used)) {
goto done;
}
}
if (!check_against_input_limit(ctx, prog, sh_next))
unsigned slots_used = _mesa_bitcount_64(reserved_in_slots);
if (!check_against_input_limit(ctx, prog, sh_next, slots_used))
goto done;
next = i;
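
The limit checks in the hunk above now start from the number of explicitly reserved slots (the population count of the reserved-slot mask) and add only the varyings without an explicit location. A rough standalone sketch of that accounting follows; `count_reserved_slots` and `within_limit` are hypothetical helpers for illustration, with `std::bitset` standing in for `_mesa_bitcount_64`.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Count the set bits in a 64-bit slot mask (stand-in for _mesa_bitcount_64).
static unsigned count_reserved_slots(uint64_t mask)
{
    return static_cast<unsigned>(std::bitset<64>(mask).count());
}

// Hypothetical limit check: explicitly located slots seed the total, and
// only varyings without an explicit location add their slot counts on top.
static bool within_limit(uint64_t reserved_mask, unsigned implicit_slots,
                         unsigned max_slots)
{
    return count_reserved_slots(reserved_mask) + implicit_slots <= max_slots;
}
```

Counting from the mask means two variables that share one explicit location are charged for a single slot.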


@@ -156,7 +156,7 @@ protected:
*/
virtual void visit_field(const glsl_type *type, const char *name,
bool row_major, const glsl_type *record_type,
const unsigned packing,
const enum glsl_interface_packing packing,
bool last_field);
/**
@@ -180,10 +180,10 @@ protected:
virtual void visit_field(const glsl_struct_field *field);
virtual void enter_record(const glsl_type *type, const char *name,
bool row_major, const unsigned packing);
bool row_major, const enum glsl_interface_packing packing);
virtual void leave_record(const glsl_type *type, const char *name,
bool row_major, const unsigned packing);
bool row_major, const enum glsl_interface_packing packing);
virtual void set_buffer_offset(unsigned offset);
@@ -199,7 +199,7 @@ private:
*/
void recursion(const glsl_type *t, char **name, size_t name_length,
bool row_major, const glsl_type *record_type,
const unsigned packing,
const enum glsl_interface_packing packing,
bool last_field, unsigned record_array_count,
const glsl_struct_field *named_ifc_member);
};


@@ -114,7 +114,7 @@ lower_buffer_access::emit_access(void *mem_ctx,
/* For a row-major matrix, the next column starts at the next
* element.
*/
int size_mul = deref->type->is_double() ? 8 : 4;
int size_mul = deref->type->is_64bit() ? 8 : 4;
emit_access(mem_ctx, is_write, col_deref, base_offset,
deref_offset + i * size_mul,
row_major, deref->type->matrix_columns, packing,
@@ -125,7 +125,7 @@ lower_buffer_access::emit_access(void *mem_ctx,
/* std430 doesn't round up vec2 size to a vec4 size */
if (packing == GLSL_INTERFACE_PACKING_STD430 &&
deref->type->vector_elements == 2 &&
!deref->type->is_double()) {
!deref->type->is_64bit()) {
size_mul = 8;
} else {
/* std140 always rounds the stride of arrays (and matrices) to a
@@ -137,7 +137,7 @@ lower_buffer_access::emit_access(void *mem_ctx,
* machine units, the base alignment is 4N. For vec4, base
* alignment is 4N.
*/
size_mul = (deref->type->is_double() &&
size_mul = (deref->type->is_64bit() &&
deref->type->vector_elements > 2) ? 32 : 16;
}
@@ -159,7 +159,7 @@ lower_buffer_access::emit_access(void *mem_ctx,
is_write ? write_mask : (1 << deref->type->vector_elements) - 1;
insert_buffer_access(mem_ctx, deref, deref->type, offset, mask, -1);
} else {
unsigned N = deref->type->is_double() ? 8 : 4;
unsigned N = deref->type->is_64bit() ? 8 : 4;
/* We're dereffing a column out of a row-major matrix, so we
* gather the vector from each stored row.
@@ -328,7 +328,7 @@ lower_buffer_access::setup_buffer_access(void *mem_ctx,
bool *row_major,
int *matrix_columns,
const glsl_struct_field **struct_field,
unsigned packing)
enum glsl_interface_packing packing)
{
*offset = new(mem_ctx) ir_constant(0u);
*row_major = is_dereferenced_thing_row_major(deref);
@@ -358,7 +358,7 @@ lower_buffer_access::setup_buffer_access(void *mem_ctx,
* thread or SIMD channel is modifying the same vector.
*/
array_stride = 4;
if (deref_array->array->type->is_double())
if (deref_array->array->type->is_64bit())
array_stride *= 2;
} else if (deref_array->array->type->is_matrix() && *row_major) {
/* When loading a vector out of a row major matrix, the
@@ -367,7 +367,7 @@ lower_buffer_access::setup_buffer_access(void *mem_ctx,
* vector) is handled below in emit_ubo_loads.
*/
array_stride = 4;
if (deref_array->array->type->is_double())
if (deref_array->array->type->is_64bit())
array_stride *= 2;
*matrix_columns = deref_array->array->type->matrix_columns;
} else if (deref_array->type->without_array()->is_interface()) {


@@ -58,7 +58,7 @@ public:
ir_rvalue **offset, unsigned *const_offset,
bool *row_major, int *matrix_columns,
const glsl_struct_field **struct_field,
unsigned packing);
enum glsl_interface_packing packing);
};
} /* namespace lower_buffer_access */


@@ -57,7 +57,6 @@ public:
return progress;
}
ir_visitor_status visit_enter(ir_texture *);
void handle_rvalue(ir_rvalue **rvalue);
private:
@@ -65,25 +64,23 @@ private:
bool progress;
};
ir_visitor_status
lower_const_array_visitor::visit_enter(ir_texture *)
{
return visit_continue_with_parent;
}
void
lower_const_array_visitor::handle_rvalue(ir_rvalue **rvalue)
{
if (!*rvalue)
return;
ir_constant *con = (*rvalue)->as_constant();
ir_dereference_array *dra = (*rvalue)->as_dereference_array();
if (!dra)
return;
ir_constant *con = dra->array->as_constant();
if (!con || !con->type->is_array())
return;
void *mem_ctx = ralloc_parent(con);
char *uniform_name = ralloc_asprintf(mem_ctx, "constarray__%p", con);
char *uniform_name = ralloc_asprintf(mem_ctx, "constarray__%p", dra);
ir_variable *uni =
new(mem_ctx) ir_variable(con->type, uniform_name, ir_var_uniform);
@@ -96,7 +93,8 @@ lower_const_array_visitor::handle_rvalue(ir_rvalue **rvalue)
uni->data.max_array_access = uni->type->length - 1;
instructions->push_head(uni);
*rvalue = new(mem_ctx) ir_dereference_variable(uni);
ir_dereference_variable *varref = new(mem_ctx) ir_dereference_variable(uni);
*rvalue = new(mem_ctx) ir_dereference_array(varref, dra->array_index);
progress = true;
}


@@ -273,11 +273,11 @@ lower_packed_varyings_visitor::run(struct gl_shader *shader)
continue;
/* This lowering pass is only capable of packing floats and ints
* together when their interpolation mode is "flat". Treat integers as
* being flat when the interpolation mode is none.
* together when their interpolation mode is "flat". Therefore, to be
* safe, caller should ensure that integral varyings always use flat
* interpolation, even when this is not required by GLSL.
*/
assert(var->data.interpolation == INTERP_QUALIFIER_FLAT ||
var->data.interpolation == INTERP_QUALIFIER_NONE ||
!var->type->contains_integer());
/* Clone the variable for program resource list before
@@ -432,7 +432,7 @@ lower_packed_varyings_visitor::lower_rvalue(ir_rvalue *rvalue,
bool gs_input_toplevel,
unsigned vertex_index)
{
unsigned dmul = rvalue->type->is_double() ? 2 : 1;
unsigned dmul = rvalue->type->is_64bit() ? 2 : 1;
/* When gs_input_toplevel is set, we should be looking at a geometry shader
* input array.
*/
@@ -480,7 +480,7 @@ lower_packed_varyings_visitor::lower_rvalue(ir_rvalue *rvalue,
char right_swizzle_name[4] = { 0, 0, 0, 0 };
left_components = 4 - fine_location % 4;
if (rvalue->type->is_double()) {
if (rvalue->type->is_64bit()) {
/* We might actually end up with 0 left components! */
left_components /= 2;
}
@@ -607,7 +607,7 @@ lower_packed_varyings_visitor::get_packed_varying_deref(
if (this->packed_varyings[slot] == NULL) {
char *packed_name = ralloc_asprintf(this->mem_ctx, "packed:%s", name);
const glsl_type *packed_type;
if (unpacked_var->is_interpolation_flat())
if (unpacked_var->data.interpolation == INTERP_QUALIFIER_FLAT)
packed_type = glsl_type::ivec4_type;
else
packed_type = glsl_type::vec4_type;
@@ -627,8 +627,7 @@ lower_packed_varyings_visitor::get_packed_varying_deref(
packed_var->data.centroid = unpacked_var->data.centroid;
packed_var->data.sample = unpacked_var->data.sample;
packed_var->data.patch = unpacked_var->data.patch;
packed_var->data.interpolation = packed_type == glsl_type::ivec4_type
? unsigned(INTERP_QUALIFIER_FLAT) : unpacked_var->data.interpolation;
packed_var->data.interpolation = unpacked_var->data.interpolation;
packed_var->data.location = location;
packed_var->data.precision = unpacked_var->data.precision;
packed_var->data.always_active_io = unpacked_var->data.always_active_io;
@@ -677,7 +676,7 @@ lower_packed_varyings_visitor::needs_lowering(ir_variable *var)
return false;
type = type->without_array();
if (type->vector_elements == 4 && !type->is_double())
if (type->vector_elements == 4 && !type->is_64bit())
return false;
return true;
}


@@ -138,7 +138,7 @@ lower_shared_reference_visitor::handle_rvalue(ir_rvalue **rvalue)
bool row_major;
int matrix_columns;
assert(var->get_interface_type() == NULL);
const unsigned packing = GLSL_INTERFACE_PACKING_STD430;
const enum glsl_interface_packing packing = GLSL_INTERFACE_PACKING_STD430;
setup_buffer_access(mem_ctx, var, deref,
&offset, &const_offset,
@@ -206,7 +206,7 @@ lower_shared_reference_visitor::handle_assignment(ir_assignment *ir)
bool row_major;
int matrix_columns;
assert(var->get_interface_type() == NULL);
const unsigned packing = GLSL_INTERFACE_PACKING_STD430;
const enum glsl_interface_packing packing = GLSL_INTERFACE_PACKING_STD430;
setup_buffer_access(mem_ctx, var, deref,
&offset, &const_offset,
@@ -365,7 +365,7 @@ lower_shared_reference_visitor::lower_shared_atomic_intrinsic(ir_call *ir)
bool row_major;
int matrix_columns;
assert(var->get_interface_type() == NULL);
const unsigned packing = GLSL_INTERFACE_PACKING_STD430;
const enum glsl_interface_packing packing = GLSL_INTERFACE_PACKING_STD430;
buffer_access_type = shared_atomic_access;
setup_buffer_access(mem_ctx, var, deref,


@@ -61,7 +61,7 @@ public:
unsigned *const_offset,
bool *row_major,
int *matrix_columns,
unsigned packing);
enum glsl_interface_packing packing);
uint32_t ssbo_access_params();
ir_expression *ubo_load(void *mem_ctx, const struct glsl_type *type,
ir_rvalue *offset);
@@ -99,7 +99,7 @@ public:
ir_expression *emit_ssbo_get_buffer_size(void *mem_ctx);
unsigned calculate_unsized_array_stride(ir_dereference *deref,
unsigned packing);
enum glsl_interface_packing packing);
ir_call *lower_ssbo_atomic_intrinsic(ir_call *ir);
ir_call *check_for_ssbo_atomic_intrinsic(ir_call *ir);
@@ -273,7 +273,7 @@ lower_ubo_reference_visitor::setup_for_load_or_store(void *mem_ctx,
unsigned *const_offset,
bool *row_major,
int *matrix_columns,
unsigned packing)
enum glsl_interface_packing packing)
{
/* Determine the name of the interface block */
ir_rvalue *nonconst_block_index;
@@ -344,7 +344,7 @@ lower_ubo_reference_visitor::handle_rvalue(ir_rvalue **rvalue)
unsigned const_offset;
bool row_major;
int matrix_columns;
unsigned packing = var->get_interface_type()->interface_packing;
enum glsl_interface_packing packing = var->get_interface_type_packing();
this->buffer_access_type =
var->is_in_shader_storage_block() ?
@@ -557,7 +557,7 @@ lower_ubo_reference_visitor::write_to_memory(void *mem_ctx,
unsigned const_offset;
bool row_major;
int matrix_columns;
unsigned packing = var->get_interface_type()->interface_packing;
enum glsl_interface_packing packing = var->get_interface_type_packing();
this->buffer_access_type = ssbo_store_access;
this->variable = var;
@@ -666,7 +666,7 @@ lower_ubo_reference_visitor::emit_ssbo_get_buffer_size(void *mem_ctx)
unsigned
lower_ubo_reference_visitor::calculate_unsized_array_stride(ir_dereference *deref,
unsigned packing)
enum glsl_interface_packing packing)
{
unsigned array_stride = 0;
@@ -736,7 +736,7 @@ lower_ubo_reference_visitor::process_ssbo_unsized_array_length(ir_rvalue **rvalu
unsigned const_offset;
bool row_major;
int matrix_columns;
unsigned packing = var->get_interface_type()->interface_packing;
enum glsl_interface_packing packing = var->get_interface_type_packing();
int unsized_array_stride = calculate_unsized_array_stride(deref, packing);
this->buffer_access_type = ssbo_unsized_array_length_access;
@@ -970,7 +970,7 @@ lower_ubo_reference_visitor::lower_ssbo_atomic_intrinsic(ir_call *ir)
unsigned const_offset;
bool row_major;
int matrix_columns;
unsigned packing = var->get_interface_type()->interface_packing;
enum glsl_interface_packing packing = var->get_interface_type_packing();
this->buffer_access_type = ssbo_atomic_access;
this->variable = var;


@@ -93,7 +93,6 @@ public:
{
this->mem_ctx = ralloc_context(NULL);
this->variable_list.make_empty();
this->in_whole_array_copy = false;
}
~ir_array_reference_visitor(void)
@@ -105,8 +104,6 @@ public:
virtual ir_visitor_status visit(ir_variable *);
virtual ir_visitor_status visit(ir_dereference_variable *);
virtual ir_visitor_status visit_enter(ir_assignment *);
virtual ir_visitor_status visit_leave(ir_assignment *);
virtual ir_visitor_status visit_enter(ir_dereference_array *);
virtual ir_visitor_status visit_enter(ir_function_signature *);
@@ -116,8 +113,6 @@ public:
exec_list variable_list;
void *mem_ctx;
bool in_whole_array_copy;
};
} /* namespace */
@@ -162,34 +157,11 @@ ir_array_reference_visitor::visit(ir_variable *ir)
return visit_continue;
}
ir_visitor_status
ir_array_reference_visitor::visit_enter(ir_assignment *ir)
{
in_whole_array_copy =
ir->lhs->type->is_array() && ir->whole_variable_written();
return visit_continue;
}
ir_visitor_status
ir_array_reference_visitor::visit_leave(ir_assignment *ir)
{
in_whole_array_copy = false;
return visit_continue;
}
ir_visitor_status
ir_array_reference_visitor::visit(ir_dereference_variable *ir)
{
variable_entry *entry = this->get_variable_entry(ir->var);
/* Allow whole-array assignments on the LHS. We can split those
* by "unrolling" the assignment into component-wise assignments.
*/
if (in_assignee && in_whole_array_copy)
return visit_continue;
/* If we made it to here without seeing an ir_dereference_array,
* then the dereference of this array didn't have a constant index
* (see the visit_continue_with_parent below), so we can't split
@@ -378,33 +350,6 @@ ir_array_splitting_visitor::visit_leave(ir_assignment *ir)
*/
ir_rvalue *lhs = ir->lhs;
/* "Unroll" any whole array assignments, creating assignments for
* each array element. Then, do splitting on each new assignment.
*/
if (lhs->type->is_array() && ir->whole_variable_written() &&
get_splitting_entry(ir->whole_variable_written())) {
void *mem_ctx = ralloc_parent(ir);
for (unsigned i = 0; i < lhs->type->length; i++) {
ir_rvalue *lhs_i =
new(mem_ctx) ir_dereference_array(ir->lhs->clone(mem_ctx, NULL),
new(mem_ctx) ir_constant(i));
ir_rvalue *rhs_i =
new(mem_ctx) ir_dereference_array(ir->rhs->clone(mem_ctx, NULL),
new(mem_ctx) ir_constant(i));
ir_rvalue *condition_i =
ir->condition ? ir->condition->clone(mem_ctx, NULL) : NULL;
ir_assignment *assign_i =
new(mem_ctx) ir_assignment(lhs_i, rhs_i, condition_i);
ir->insert_before(assign_i);
assign_i->accept(this);
}
ir->remove();
return visit_continue;
}
handle_rvalue(&lhs);
ir->lhs = lhs->as_dereference();

@@ -72,14 +72,7 @@ opt_conditional_discard_visitor::visit_leave(ir_if *ir)
/* Move the condition and replace the ir_if with the ir_discard. */
ir_discard *discard = (ir_discard *) ir->then_instructions.head;
if (!discard->condition)
discard->condition = ir->condition;
else {
void *ctx = ralloc_parent(ir);
discard->condition = new(ctx) ir_expression(ir_binop_logic_and,
ir->condition,
discard->condition);
}
discard->condition = ir->condition;
ir->replace_with(discard);
progress = true;
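The two variants in this hunk differ only in what happens when the discard already carries its own condition: one ANDs it with the `ir_if` condition, the other overwrites it. A minimal sketch of why that matters, assuming plain bools as stand-ins for the IR conditions (the helper names below are hypothetical and do not exist in Mesa):

```cpp
#include <cassert>

// Hypothetical model: conditions are plain bools, not ir_rvalue trees.
// Original shape: if (if_cond) { discard [when inner_cond]; }
constexpr bool nested_discards(bool if_cond, bool has_inner, bool inner_cond)
{
    return if_cond && (has_inner ? inner_cond : true);
}

// Flattened shape that ANDs an existing discard condition with the if condition.
constexpr bool merged_discards(bool if_cond, bool has_inner, bool inner_cond)
{
    return has_inner ? (if_cond && inner_cond) : if_cond;
}

// Flattened shape that overwrites the discard condition with the if condition.
constexpr bool overwritten_discards(bool if_cond, bool /*has_inner*/, bool /*inner_cond*/)
{
    return if_cond;
}

// True when the merged form agrees with the nested form on every input.
inline bool merged_matches_nested()
{
    for (int bits = 0; bits < 8; bits++) {
        const bool a = bits & 1, h = bits & 2, c = bits & 4;
        if (nested_discards(a, h, c) != merged_discards(a, h, c))
            return false;
    }
    return true;
}
```

In this model the overwriting form discards when the `if` condition is true and the inner condition is false, which the nested form does not.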

@@ -138,14 +138,14 @@ public:
void
ir_constant_propagation_visitor::constant_folding(ir_rvalue **rvalue)
{
if (this->in_assignee || *rvalue == NULL)
if (*rvalue == NULL)
return;
if (ir_constant_fold(rvalue))
this->progress = true;
ir_dereference_variable *var_ref = (*rvalue)->as_dereference_variable();
if (var_ref && !var_ref->type->is_array()) {
if (var_ref) {
ir_constant *constant = var_ref->constant_expression_value();
if (constant) {
*rvalue = constant;

@@ -83,6 +83,7 @@ public:
}
virtual ir_visitor_status visit(class ir_dereference_variable *);
void handle_loop(class ir_loop *, bool keep_acp);
virtual ir_visitor_status visit_enter(class ir_loop *);
virtual ir_visitor_status visit_enter(class ir_function_signature *);
virtual ir_visitor_status visit_enter(class ir_function *);
@@ -252,21 +253,24 @@ ir_copy_propagation_visitor::visit_enter(ir_if *ir)
return visit_continue_with_parent;
}
ir_visitor_status
ir_copy_propagation_visitor::visit_enter(ir_loop *ir)
void
ir_copy_propagation_visitor::handle_loop(ir_loop *ir, bool keep_acp)
{
exec_list *orig_acp = this->acp;
exec_list *orig_kills = this->kills;
bool orig_killed_all = this->killed_all;
/* FINISHME: For now, the initial acp for loops is totally empty.
* We could go through once, then go through again with the acp
* cloned minus the killed entries after the first run through.
*/
this->acp = new(mem_ctx) exec_list;
this->kills = new(mem_ctx) exec_list;
this->killed_all = false;
if (keep_acp) {
/* Populate the initial acp with a copy of the original */
foreach_in_list(acp_entry, a, orig_acp) {
this->acp->push_tail(new(this->acp) acp_entry(a->lhs, a->rhs));
}
}
visit_list_elements(this, &ir->body_instructions);
if (this->killed_all) {
@@ -284,6 +288,20 @@ ir_copy_propagation_visitor::visit_enter(ir_loop *ir)
}
ralloc_free(new_kills);
}
ir_visitor_status
ir_copy_propagation_visitor::visit_enter(ir_loop *ir)
{
/* Make a conservative first pass over the loop with an empty ACP set.
* This also removes any killed entries from the original ACP set.
*/
handle_loop(ir, false);
/* Then, run it again with the real ACP set, minus any killed entries.
* This takes care of propagating values from before the loop into it.
*/
handle_loop(ir, true);
/* already descended into the children. */
return visit_continue_with_parent;
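The comments in `visit_enter(ir_loop *)` describe two passes: a conservative one that starts from an empty ACP set and records what the loop kills, then a second one seeded with the surviving entries. A rough sketch of the step between the passes, assuming a toy ACP set keyed by variable name and assuming an entry dies when either side of the copy is written in the loop (none of these types or helpers are Mesa's):

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model, not Mesa's data structures: the ACP ("available copy") set maps
// a variable name to the variable it is currently a copy of.
using Acp = std::map<std::string, std::string>;

// Pass 1 (conservative): walk the body starting from an empty ACP set and
// record every variable the loop writes. The body is reduced here to the
// list of written variable names.
inline std::set<std::string> first_pass_kills(const std::vector<std::string> &writes)
{
    return std::set<std::string>(writes.begin(), writes.end());
}

// Between the passes: drop every entry whose destination or source was
// killed (assumed kill rule for this sketch).
inline Acp surviving_acp(Acp acp, const std::set<std::string> &kills)
{
    for (auto it = acp.begin(); it != acp.end();) {
        if (kills.count(it->first) || kills.count(it->second))
            it = acp.erase(it);
        else
            ++it;
    }
    return acp;
}

// Pass 2 would re-walk the loop body seeded with this set.
inline Acp acp_for_second_pass(const Acp &before_loop,
                               const std::vector<std::string> &writes)
{
    return surviving_acp(before_loop, first_pass_kills(writes));
}
```

Only copies that no iteration can invalidate reach the second pass, which is what makes it safe to propagate them into the loop body.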

@@ -106,6 +106,7 @@ public:
ralloc_free(mem_ctx);
}
void handle_loop(ir_loop *, bool keep_acp);
virtual ir_visitor_status visit_enter(class ir_loop *);
virtual ir_visitor_status visit_enter(class ir_function_signature *);
virtual ir_visitor_status visit_leave(class ir_assignment *);
@@ -374,8 +375,8 @@ ir_copy_propagation_elements_visitor::visit_enter(ir_if *ir)
return visit_continue_with_parent;
}
ir_visitor_status
ir_copy_propagation_elements_visitor::visit_enter(ir_loop *ir)
void
ir_copy_propagation_elements_visitor::handle_loop(ir_loop *ir, bool keep_acp)
{
exec_list *orig_acp = this->acp;
exec_list *orig_kills = this->kills;
@@ -389,6 +390,13 @@ ir_copy_propagation_elements_visitor::visit_enter(ir_loop *ir)
this->kills = new(mem_ctx) exec_list;
this->killed_all = false;
if (keep_acp) {
/* Populate the initial acp with a copy of the original */
foreach_in_list(acp_entry, a, orig_acp) {
this->acp->push_tail(new(this->acp) acp_entry(a));
}
}
visit_list_elements(this, &ir->body_instructions);
if (this->killed_all) {
@@ -406,6 +414,13 @@ ir_copy_propagation_elements_visitor::visit_enter(ir_loop *ir)
}
ralloc_free(new_kills);
}
ir_visitor_status
ir_copy_propagation_elements_visitor::visit_enter(ir_loop *ir)
{
handle_loop(ir, false);
handle_loop(ir, true);
/* already descended into the children. */
return visit_continue_with_parent;

@@ -85,13 +85,10 @@ public:
{
ir_variable *var = ir->variable_referenced();
if (!var || var->data.mode != this->mode || !var->type->is_array() ||
!is_gl_identifier(var->name))
if (!var || var->data.mode != this->mode || !var->type->is_array())
return visit_continue;
/* Only match gl_FragData[], not gl_SecondaryFragDataEXT[] */
if (this->find_frag_outputs && var->data.location == FRAG_RESULT_DATA0 &&
var->data.index == 0) {
if (this->find_frag_outputs && var->data.location == FRAG_RESULT_DATA0) {
this->fragdata_array = var;
ir_constant *index = ir->array_index->as_constant();
@@ -146,8 +143,7 @@ public:
if (var->data.mode != this->mode || !var->type->is_array())
return visit_continue;
if (this->find_frag_outputs && var->data.location == FRAG_RESULT_DATA0 &&
var->data.index == 0) {
if (this->find_frag_outputs && var->data.location == FRAG_RESULT_DATA0) {
/* This is a whole array dereference. */
this->fragdata_usage |= (1 << var->type->array_size()) - 1;
this->lower_fragdata_array = false;

@@ -144,7 +144,7 @@ do_dead_code(exec_list *instructions, bool uniform_locations_assigned)
* layouts, do not eliminate it.
*/
if (entry->var->is_in_buffer_block()) {
if (entry->var->get_interface_type()->interface_packing !=
if (entry->var->get_interface_type_packing() !=
GLSL_INTERFACE_PACKING_PACKED)
continue;
}

@@ -1079,7 +1079,7 @@ function_key_compare(const void *a, const void *b)
const glsl_type *const key2 = (glsl_type *) b;
if (key1->length != key2->length)
return false;
return 1;
return memcmp(key1->fields.parameters, key2->fields.parameters,
(key1->length + 1) * sizeof(*key1->fields.parameters)) == 0;
@@ -1090,8 +1090,20 @@ static uint32_t
function_key_hash(const void *a)
{
const glsl_type *const key = (glsl_type *) a;
return _mesa_hash_data(key->fields.parameters,
(key->length + 1) * sizeof(*key->fields.parameters));
char hash_key[128];
unsigned size = 0;
size = snprintf(hash_key, sizeof(hash_key), "%08x", key->length);
for (unsigned i = 0; i < key->length; i++) {
if (size >= sizeof(hash_key))
break;
size += snprintf(& hash_key[size], sizeof(hash_key) - size,
"%p", (void *) key->fields.structure[i].type);
}
return _mesa_hash_string(hash_key);
}
const glsl_type *
@@ -1422,7 +1434,7 @@ glsl_type::can_implicitly_convert_to(const glsl_type *desired,
unsigned
glsl_type::std140_base_alignment(bool row_major) const
{
unsigned N = is_double() ? 8 : 4;
unsigned N = is_64bit() ? 8 : 4;
/* (1) If the member is a scalar consuming <N> basic machine units, the
* base alignment is <N>.
@@ -1540,7 +1552,7 @@ glsl_type::std140_base_alignment(bool row_major) const
unsigned
glsl_type::std140_size(bool row_major) const
{
unsigned N = is_double() ? 8 : 4;
unsigned N = is_64bit() ? 8 : 4;
/* (1) If the member is a scalar consuming <N> basic machine units, the
* base alignment is <N>.
@@ -1677,7 +1689,7 @@ unsigned
glsl_type::std430_base_alignment(bool row_major) const
{
unsigned N = is_double() ? 8 : 4;
unsigned N = is_64bit() ? 8 : 4;
/* (1) If the member is a scalar consuming <N> basic machine units, the
* base alignment is <N>.
@@ -1786,7 +1798,7 @@ glsl_type::std430_base_alignment(bool row_major) const
unsigned
glsl_type::std430_array_stride(bool row_major) const
{
unsigned N = is_double() ? 8 : 4;
unsigned N = is_64bit() ? 8 : 4;
/* Notice that the array stride of a vec3 is not 3 * N but 4 * N.
* See OpenGL 4.30 spec, section 7.6.2.2 "Standard Uniform Block Layout"
@@ -1804,7 +1816,7 @@ glsl_type::std430_array_stride(bool row_major) const
unsigned
glsl_type::std430_size(bool row_major) const
{
unsigned N = is_double() ? 8 : 4;
unsigned N = is_64bit() ? 8 : 4;
/* OpenGL 4.30 spec, section 7.6.2.2 "Standard Uniform Block Layout":
*
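The layout hunks above all derive `N` from whether the type is 64-bit, and the std430 comment notes that an array of vec3 steps by 4N rather than 3N. A small sketch of that arithmetic for scalars and vectors only, following the spec rules quoted in the comments (the function names are illustrative, not Mesa's):

```cpp
#include <cassert>

// N is the size in basic machine units of one scalar component:
// 8 for 64-bit types (e.g. double), 4 otherwise.
constexpr unsigned component_size(bool is_64bit)
{
    return is_64bit ? 8u : 4u;
}

// Base alignment of a scalar or vector with `components` in [1, 4]:
// N for scalars, 2N for two components, 4N for three or four.
constexpr unsigned vector_base_alignment(unsigned components, bool is_64bit)
{
    return (components == 1 ? 1u : components == 2 ? 2u : 4u) *
           component_size(is_64bit);
}

// Stride of an array of such vectors under std430: each element is padded to
// its base alignment, so a 3-component array steps by 4N, not 3N.
constexpr unsigned std430_vector_array_stride(unsigned components, bool is_64bit)
{
    return vector_base_alignment(components, is_64bit);
}
```

Matrices, structs, and the std140 rounding of array strides are deliberately left out of this sketch.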

@@ -64,6 +64,11 @@ enum glsl_base_type {
GLSL_TYPE_ERROR
};
static inline bool glsl_base_type_is_64bit(enum glsl_base_type type)
{
return type == GLSL_TYPE_DOUBLE;
}
enum glsl_sampler_dim {
GLSL_SAMPLER_DIM_1D = 0,
GLSL_SAMPLER_DIM_2D,
@@ -490,11 +495,19 @@ struct glsl_type {
}
/**
* Query whether a double takes two slots.
* Query whether a 64-bit type takes two slots.
*/
bool is_dual_slot_double() const
bool is_dual_slot() const
{
return base_type == GLSL_TYPE_DOUBLE && vector_elements > 2;
return is_64bit() && vector_elements > 2;
}
/**
* Query whether or not a type is 64-bit
*/
bool is_64bit() const
{
return glsl_base_type_is_64bit(base_type);
}
/**
@@ -745,6 +758,14 @@ struct glsl_type {
*/
bool record_compare(const glsl_type *b, bool match_locations = true) const;
/**
* Get the type interface packing.
*/
enum glsl_interface_packing get_interface_packing() const
{
return (enum glsl_interface_packing)interface_packing;
}
private:
static mtx_t mutex;

@@ -659,122 +659,6 @@ nir_copy_deref(void *mem_ctx, nir_deref *deref)
return NULL;
}
/* This is the second step in the recursion. We've found the tail and made a
* copy. Now we need to iterate over all possible leaves and call the
* callback on each one.
*/
static bool
deref_foreach_leaf_build_recur(nir_deref_var *deref, nir_deref *tail,
nir_deref_foreach_leaf_cb cb, void *state)
{
unsigned length;
union {
nir_deref_array arr;
nir_deref_struct str;
} tmp;
assert(tail->child == NULL);
switch (glsl_get_base_type(tail->type)) {
case GLSL_TYPE_UINT:
case GLSL_TYPE_INT:
case GLSL_TYPE_FLOAT:
case GLSL_TYPE_DOUBLE:
case GLSL_TYPE_BOOL:
if (glsl_type_is_vector_or_scalar(tail->type))
return cb(deref, state);
/* Fall Through */
case GLSL_TYPE_ARRAY:
tmp.arr.deref.deref_type = nir_deref_type_array;
tmp.arr.deref.type = glsl_get_array_element(tail->type);
tmp.arr.deref_array_type = nir_deref_array_type_direct;
tmp.arr.indirect = NIR_SRC_INIT;
tail->child = &tmp.arr.deref;
length = glsl_get_length(tail->type);
for (unsigned i = 0; i < length; i++) {
tmp.arr.deref.child = NULL;
tmp.arr.base_offset = i;
if (!deref_foreach_leaf_build_recur(deref, &tmp.arr.deref, cb, state))
return false;
}
return true;
case GLSL_TYPE_STRUCT:
tmp.str.deref.deref_type = nir_deref_type_struct;
tail->child = &tmp.str.deref;
length = glsl_get_length(tail->type);
for (unsigned i = 0; i < length; i++) {
tmp.arr.deref.child = NULL;
tmp.str.deref.type = glsl_get_struct_field(tail->type, i);
tmp.str.index = i;
if (!deref_foreach_leaf_build_recur(deref, &tmp.arr.deref, cb, state))
return false;
}
return true;
default:
unreachable("Invalid type for dereference");
}
}
/* This is the first step of the foreach_leaf recursion. In this step we are
* walking to the end of the deref chain and making a copy in the stack as we
* go. This is because we don't want to mutate the deref chain that was
* passed in by the caller. The downside is that this deref chain is on the
* stack and, if the caller wants to do anything with it, they will have to
* make their own copy because this one will go away.
*/
static bool
deref_foreach_leaf_copy_recur(nir_deref_var *deref, nir_deref *tail,
nir_deref_foreach_leaf_cb cb, void *state)
{
union {
nir_deref_array arr;
nir_deref_struct str;
} c;
if (tail->child) {
switch (tail->child->deref_type) {
case nir_deref_type_array:
c.arr = *nir_deref_as_array(tail->child);
tail->child = &c.arr.deref;
return deref_foreach_leaf_copy_recur(deref, &c.arr.deref, cb, state);
case nir_deref_type_struct:
c.str = *nir_deref_as_struct(tail->child);
tail->child = &c.str.deref;
return deref_foreach_leaf_copy_recur(deref, &c.str.deref, cb, state);
case nir_deref_type_var:
default:
unreachable("Invalid deref type for a child");
}
} else {
/* We've gotten to the end of the original deref. Time to start
* building our own derefs.
*/
return deref_foreach_leaf_build_recur(deref, tail, cb, state);
}
}
/**
* This function iterates over all of the possible derefs that can be created
* with the given deref as the head. It then calls the provided callback with
* a full deref for each one.
*
* The deref passed to the callback will be allocated on the stack. You will
* need to make a copy if you want it to hang around.
*/
bool
nir_deref_foreach_leaf(nir_deref_var *deref,
nir_deref_foreach_leaf_cb cb, void *state)
{
nir_deref_var copy = *deref;
return deref_foreach_leaf_copy_recur(&copy, &copy.deref, cb, state);
}
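The leaf walk above can be pictured as a recursion over the type: vectors and scalars are leaves, arrays recurse once per element, and structs recurse once per field, stopping early when the callback returns false. A simplified sketch under those assumptions, using a hypothetical `Type` tree and an index path in place of NIR's deref chain:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical type tree for this sketch; it is not glsl_type.
struct Type {
    enum Kind { Vector, Array, Struct } kind;
    unsigned length;             // array length; unused for other kinds
    std::vector<Type> children;  // Array: one element type; Struct: one per field
};

using LeafCallback = std::function<bool(const std::vector<unsigned> &)>;

// Calls cb once per leaf with the index path leading to it. The path is only
// valid during the callback; copy it if it needs to outlive the call.
inline bool foreach_leaf(const Type &type, std::vector<unsigned> &path,
                         const LeafCallback &cb)
{
    switch (type.kind) {
    case Type::Vector:
        return cb(path);
    case Type::Array:
        for (unsigned i = 0; i < type.length; i++) {
            path.push_back(i);
            const bool keep_going = foreach_leaf(type.children[0], path, cb);
            path.pop_back();
            if (!keep_going)
                return false;
        }
        return true;
    case Type::Struct:
        for (unsigned i = 0; i < type.children.size(); i++) {
            path.push_back(i);
            const bool keep_going = foreach_leaf(type.children[i], path, cb);
            path.pop_back();
            if (!keep_going)
                return false;
        }
        return true;
    }
    return true;
}

// Convenience wrapper: number of leaves reachable from `type`.
inline unsigned count_leaves(const Type &type)
{
    unsigned n = 0;
    std::vector<unsigned> path;
    foreach_leaf(type, path, [&n](const std::vector<unsigned> &) { n++; return true; });
    return n;
}
```

As in the original comment, the path handed to the callback is scratch storage owned by the walker, so a callback that wants to keep it has to copy it.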
/* Returns a load_const instruction that represents the constant
* initializer for the given deref chain. The caller is responsible for
* ensuring that there actually is a constant initializer.

@@ -1234,50 +1234,6 @@ nir_tex_instr_is_query(nir_tex_instr *instr)
}
}
static inline nir_alu_type
nir_tex_instr_src_type(nir_tex_instr *instr, unsigned src)
{
switch (instr->src[src].src_type) {
case nir_tex_src_coord:
switch (instr->op) {
case nir_texop_txf:
case nir_texop_txf_ms:
case nir_texop_txf_ms_mcs:
case nir_texop_samples_identical:
return nir_type_int;
default:
return nir_type_float;
}
case nir_tex_src_lod:
switch (instr->op) {
case nir_texop_txs:
case nir_texop_txf:
return nir_type_int;
default:
return nir_type_float;
}
case nir_tex_src_projector:
case nir_tex_src_comparitor:
case nir_tex_src_bias:
case nir_tex_src_ddx:
case nir_tex_src_ddy:
return nir_type_float;
case nir_tex_src_offset:
case nir_tex_src_ms_index:
case nir_tex_src_texture_offset:
case nir_tex_src_sampler_offset:
return nir_type_int;
default:
unreachable("Invalid texture source type");
}
}
static inline unsigned
nir_tex_instr_src_size(nir_tex_instr *instr, unsigned src)
{
@@ -1695,6 +1651,9 @@ typedef struct nir_shader_compiler_options {
/* lower {slt,sge,seq,sne} to {flt,fge,feq,fne} + b2f: */
bool lower_scmp;
/** enables rules to lower idiv by power-of-two: */
bool lower_idiv;
/* Does the native fdot instruction replicate its result for four
* components? If so, then opt_algebraic_late will turn all fdotN
* instructions into fdot_replicatedN instructions.
@@ -1764,9 +1723,6 @@ typedef struct nir_shader_info {
/* Whether or not this shader ever uses textureGather() */
bool uses_texture_gather;
/** Whether or not this shader uses nir_intrinsic_interp_var_at_offset */
bool uses_interp_var_at_offset;
/* Whether or not this shader uses the gl_ClipDistance output */
bool uses_clip_distance_out;
@@ -1967,10 +1923,6 @@ nir_deref_struct *nir_deref_struct_create(void *mem_ctx, unsigned field_index);
nir_deref *nir_copy_deref(void *mem_ctx, nir_deref *deref);
typedef bool (*nir_deref_foreach_leaf_cb)(nir_deref_var *deref, void *state);
bool nir_deref_foreach_leaf(nir_deref_var *deref,
nir_deref_foreach_leaf_cb cb, void *state);
nir_load_const_instr *
nir_deref_get_const_initializer_load(nir_shader *shader, nir_deref_var *deref);
@@ -2338,8 +2290,6 @@ bool nir_lower_returns(nir_shader *shader);
bool nir_inline_functions(nir_shader *shader);
bool nir_propagate_invariant(nir_shader *shader);
void nir_lower_var_copy_instr(nir_intrinsic_instr *copy, void *mem_ctx);
void nir_lower_var_copies(nir_shader *shader);
@@ -2388,16 +2338,6 @@ typedef struct nir_lower_tex_options {
*/
unsigned lower_txp;
/**
* If true, lower away nir_tex_src_offset for all texelfetch instructions.
*/
bool lower_txf_offset;
/**
* If true, lower away nir_tex_src_offset for all rect textures.
*/
bool lower_rect_offset;
/**
* If true, lower rect textures to 2D, using txs to fetch the
* texture dimensions and dividing the texture coords by the

@@ -76,6 +76,7 @@ class Value(object):
return Constant(val, name_base)
__template = mako.template.Template("""
#include "compiler/nir/nir_search_helpers.h"
static const ${val.c_type} ${val.name} = {
{ ${val.type_enum}, ${val.bit_size} },
% if isinstance(val, Constant):
@@ -84,6 +85,7 @@ static const ${val.c_type} ${val.name} = {
${val.index}, /* ${val.var_name} */
${'true' if val.is_constant else 'false'},
${val.type() or 'nir_type_invalid' },
${val.cond if val.cond else 'NULL'},
% elif isinstance(val, Expression):
${'true' if val.inexact else 'false'},
nir_op_${val.opcode},
@@ -113,7 +115,7 @@ static const ${val.c_type} ${val.name} = {
Variable=Variable,
Expression=Expression)
_constant_re = re.compile(r"(?P<value>[^@]+)(?:@(?P<bits>\d+))?")
_constant_re = re.compile(r"(?P<value>[^@\(]+)(?:@(?P<bits>\d+))?")
class Constant(Value):
def __init__(self, val, name):
@@ -150,7 +152,8 @@ class Constant(Value):
return "nir_type_float"
_var_name_re = re.compile(r"(?P<const>#)?(?P<name>\w+)"
r"(?:@(?P<type>int|uint|bool|float)?(?P<bits>\d+)?)?")
r"(?:@(?P<type>int|uint|bool|float)?(?P<bits>\d+)?)?"
r"(?P<cond>\([^\)]+\))?")
class Variable(Value):
def __init__(self, val, name, varset):
@@ -161,6 +164,7 @@ class Variable(Value):
self.var_name = m.group('name')
self.is_constant = m.group('const') is not None
self.cond = m.group('cond')
self.required_type = m.group('type')
self.bit_size = int(m.group('bits')) if m.group('bits') else 0
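The longer `_var_name_re` variant accepts an optional parenthesised condition after the type and bit-size suffix. A sketch of the same grammar in C++, kept in the language of the rest of this diff: `std::regex` has no named groups, so numbered groups are used; `[^)]` replaces the equivalent `[^\)]`; and `std::regex_match` requires the whole string to match, which is stricter than Python's `re.match`. The struct and function names, and the `is_power_of_two` predicate used in the usage note, are illustrative assumptions.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical result type; field names follow the Python attributes.
struct ParsedVariable {
    bool is_constant = false;
    std::string name;
    std::string required_type;  // empty when absent
    unsigned bit_size = 0;      // 0 when absent
    std::string cond;           // keeps its parentheses; empty when absent
};

// Parses strings of the form [#]name[@[type][bits]][(cond)].
inline bool parse_variable(const std::string &text, ParsedVariable &out)
{
    static const std::regex var_re(
        R"re((#)?(\w+)(?:@(int|uint|bool|float)?(\d+)?)?(\([^)]+\))?)re");
    std::smatch m;
    if (!std::regex_match(text, m, var_re))
        return false;
    out.is_constant = m[1].matched;
    out.name = m[2].str();
    out.required_type = m[3].str();
    out.bit_size = m[4].matched ? static_cast<unsigned>(std::stoul(m[4].str())) : 0;
    out.cond = m[5].str();
    return true;
}
```

For example, `parse_variable("#b@32(is_power_of_two)", v)` yields a constant named `b` with bit size 32 and condition `(is_power_of_two)`.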

@@ -317,25 +317,6 @@ nir_fdot(nir_builder *build, nir_ssa_def *src0, nir_ssa_def *src1)
return NULL;
}
static inline nir_ssa_def *
nir_bany_inequal(nir_builder *b, nir_ssa_def *src0, nir_ssa_def *src1)
{
switch (src0->num_components) {
case 1: return nir_ine(b, src0, src1);
case 2: return nir_bany_inequal2(b, src0, src1);
case 3: return nir_bany_inequal3(b, src0, src1);
case 4: return nir_bany_inequal4(b, src0, src1);
default:
unreachable("bad component size");
}
}
static inline nir_ssa_def *
nir_bany(nir_builder *b, nir_ssa_def *src)
{
return nir_bany_inequal(b, src, nir_imm_int(b, 0));
}
static inline nir_ssa_def *
nir_channel(nir_builder *b, nir_ssa_def *def, unsigned c)
{

@@ -57,10 +57,6 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, nir_shader *shader)
shader->info.gs.uses_end_primitive = 1;
break;
case nir_intrinsic_interp_var_at_offset:
shader->info.uses_interp_var_at_offset = 1;
break;
default:
break;
}

@@ -25,20 +25,6 @@
#include "nir_builder.h"
#include "nir_control_flow.h"
static bool
deref_apply_constant_initializer(nir_deref_var *deref, void *state)
{
struct nir_builder *b = state;
nir_load_const_instr *initializer =
nir_deref_get_const_initializer_load(b->shader, deref);
nir_builder_instr_insert(b, &initializer->instr);
nir_store_deref_var(b, deref, &initializer->def, 0xf);
return true;
}
static bool inline_function_impl(nir_function_impl *impl, struct set *inlined);
static void
@@ -188,35 +174,11 @@ inline_functions_block(nir_block *block, nir_builder *b,
/* Add copies of all in parameters */
assert(call->num_params == callee_copy->num_params);
b->cursor = nir_before_instr(&call->instr);
/* Before we insert the copy of the function, we need to lower away
* constant initializers on local variables. This is because constant
* initializers happen (effectively) at the top of the function and,
* since these are about to become locals of the calling function,
* initialization will happen at the top of the caller rather than at
* the top of the callee. This isn't usually a problem, but if we are
* being inlined inside of a loop, it can result in the variable not
* getting re-initialized properly for all loop iterations.
*/
nir_foreach_variable(local, &callee_copy->locals) {
if (!local->constant_initializer)
continue;
nir_deref_var deref;
deref.deref.deref_type = nir_deref_type_var,
deref.deref.child = NULL;
deref.deref.type = local->type,
deref.var = local;
nir_deref_foreach_leaf(&deref, deref_apply_constant_initializer, b);
local->constant_initializer = NULL;
}
exec_list_append(&b->impl->locals, &callee_copy->locals);
exec_list_append(&b->impl->registers, &callee_copy->registers);
b->cursor = nir_before_instr(&call->instr);
/* We now need to tie the two functions together using the
* parameters. There are two ways we do this: One is to turn the
* parameter into a local variable and do a shadow-copy. The other

@@ -41,8 +41,6 @@
#define ARR(...) { __VA_ARGS__ }
INTRINSIC(nop, 0, ARR(0), false, 0, 0, 0, xx, xx, xx,
NIR_INTRINSIC_CAN_ELIMINATE)
INTRINSIC(load_var, 0, ARR(0), true, 0, 1, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE)
INTRINSIC(store_var, 1, ARR(0), false, 0, 1, 1, WRMASK, xx, xx, 0)
@@ -268,16 +266,16 @@ INTRINSIC(ssbo_atomic_comp_swap, 4, ARR(1, 1, 1, 1), true, 1, 0, 0, xx, xx, xx,
* in shared_atomic_add, etc).
* 2: For CompSwap only: the second data parameter.
*/
INTRINSIC(shared_atomic_add, 2, ARR(1, 1), true, 1, 0, 1, BASE, xx, xx, 0)
INTRINSIC(shared_atomic_imin, 2, ARR(1, 1), true, 1, 0, 1, BASE, xx, xx, 0)
INTRINSIC(shared_atomic_umin, 2, ARR(1, 1), true, 1, 0, 1, BASE, xx, xx, 0)
INTRINSIC(shared_atomic_imax, 2, ARR(1, 1), true, 1, 0, 1, BASE, xx, xx, 0)
INTRINSIC(shared_atomic_umax, 2, ARR(1, 1), true, 1, 0, 1, BASE, xx, xx, 0)
INTRINSIC(shared_atomic_and, 2, ARR(1, 1), true, 1, 0, 1, BASE, xx, xx, 0)
INTRINSIC(shared_atomic_or, 2, ARR(1, 1), true, 1, 0, 1, BASE, xx, xx, 0)
INTRINSIC(shared_atomic_xor, 2, ARR(1, 1), true, 1, 0, 1, BASE, xx, xx, 0)
INTRINSIC(shared_atomic_exchange, 2, ARR(1, 1), true, 1, 0, 1, BASE, xx, xx, 0)
INTRINSIC(shared_atomic_comp_swap, 3, ARR(1, 1, 1), true, 1, 0, 1, BASE, xx, xx, 0)
INTRINSIC(shared_atomic_add, 2, ARR(1, 1), true, 1, 0, 0, xx, xx, xx, 0)
INTRINSIC(shared_atomic_imin, 2, ARR(1, 1), true, 1, 0, 0, xx, xx, xx, 0)
INTRINSIC(shared_atomic_umin, 2, ARR(1, 1), true, 1, 0, 0, xx, xx, xx, 0)
INTRINSIC(shared_atomic_imax, 2, ARR(1, 1), true, 1, 0, 0, xx, xx, xx, 0)
INTRINSIC(shared_atomic_umax, 2, ARR(1, 1), true, 1, 0, 0, xx, xx, xx, 0)
INTRINSIC(shared_atomic_and, 2, ARR(1, 1), true, 1, 0, 0, xx, xx, xx, 0)
INTRINSIC(shared_atomic_or, 2, ARR(1, 1), true, 1, 0, 0, xx, xx, xx, 0)
INTRINSIC(shared_atomic_xor, 2, ARR(1, 1), true, 1, 0, 0, xx, xx, xx, 0)
INTRINSIC(shared_atomic_exchange, 2, ARR(1, 1), true, 1, 0, 0, xx, xx, xx, 0)
INTRINSIC(shared_atomic_comp_swap, 3, ARR(1, 1, 1), true, 1, 0, 0, xx, xx, xx, 0)
#define SYSTEM_VALUE(name, components, num_indices, idx0, idx1, idx2) \
INTRINSIC(load_##name, 0, ARR(0), true, components, 0, num_indices, \

@@ -56,7 +56,6 @@ lower_reduction(nir_alu_instr *instr, nir_op chan_op, nir_op merge_op,
nir_alu_src_copy(&chan->src[1], &instr->src[1], chan);
chan->src[1].swizzle[0] = chan->src[1].swizzle[i];
}
chan->exact = instr->exact;
nir_builder_instr_insert(builder, &chan->instr);
@@ -230,7 +229,6 @@ lower_alu_instr_scalar(nir_alu_instr *instr, nir_builder *b)
nir_alu_ssa_dest_init(lower, 1, instr->dest.dest.ssa.bit_size);
lower->dest.saturate = instr->dest.saturate;
comps[chan] = &lower->dest.dest.ssa;
lower->exact = instr->exact;
nir_builder_instr_insert(b, &lower->instr);
}
@@ -254,9 +252,6 @@ nir_lower_alu_to_scalar_impl(nir_function_impl *impl)
lower_alu_instr_scalar(nir_instr_as_alu(instr), &builder);
}
}
nir_metadata_preserve(impl, nir_metadata_block_index |
nir_metadata_dominance);
}
void

@@ -38,39 +38,16 @@
#include "nir.h"
#include "nir_builder.h"
static int
tex_instr_find_src(nir_tex_instr *tex, nir_tex_src_type src_type)
{
for (unsigned i = 0; i < tex->num_srcs; i++) {
if (tex->src[i].src_type == src_type)
return i;
}
return -1;
}
static void
tex_instr_remove_src(nir_tex_instr *tex, unsigned src_idx)
{
assert(src_idx < tex->num_srcs);
/* First rewrite the source to NIR_SRC_INIT */
nir_instr_rewrite_src(&tex->instr, &tex->src[src_idx].src, NIR_SRC_INIT);
/* Now, move all of the other sources down */
for (unsigned i = src_idx + 1; i < tex->num_srcs; i++) {
tex->src[i-1].src_type = tex->src[i].src_type;
nir_instr_move_src(&tex->instr, &tex->src[i-1].src, &tex->src[i].src);
}
tex->num_srcs--;
}
static void
project_src(nir_builder *b, nir_tex_instr *tex)
{
/* Find the projector in the srcs list, if present. */
int proj_index = tex_instr_find_src(tex, nir_tex_src_projector);
if (proj_index < 0)
unsigned proj_index;
for (proj_index = 0; proj_index < tex->num_srcs; proj_index++) {
if (tex->src[proj_index].src_type == nir_tex_src_projector)
break;
}
if (proj_index == tex->num_srcs)
return;
b->cursor = nir_before_instr(&tex->instr);
@@ -125,57 +102,18 @@ project_src(nir_builder *b, nir_tex_instr *tex)
nir_src_for_ssa(projected));
}
tex_instr_remove_src(tex, proj_index);
}
static bool
lower_offset(nir_builder *b, nir_tex_instr *tex)
{
int offset_index = tex_instr_find_src(tex, nir_tex_src_offset);
if (offset_index < 0)
return false;
int coord_index = tex_instr_find_src(tex, nir_tex_src_coord);
assert(coord_index >= 0);
assert(tex->src[offset_index].src.is_ssa);
assert(tex->src[coord_index].src.is_ssa);
nir_ssa_def *offset = tex->src[offset_index].src.ssa;
nir_ssa_def *coord = tex->src[coord_index].src.ssa;
b->cursor = nir_before_instr(&tex->instr);
nir_ssa_def *offset_coord;
if (nir_tex_instr_src_type(tex, coord_index) == nir_type_float) {
assert(tex->sampler_dim == GLSL_SAMPLER_DIM_RECT);
offset_coord = nir_fadd(b, coord, nir_i2f(b, offset));
} else {
offset_coord = nir_iadd(b, coord, offset);
/* Now move the later tex sources down the array so that the projector
* disappears.
*/
nir_instr_rewrite_src(&tex->instr, &tex->src[proj_index].src,
NIR_SRC_INIT);
for (unsigned i = proj_index + 1; i < tex->num_srcs; i++) {
tex->src[i-1].src_type = tex->src[i].src_type;
nir_instr_move_src(&tex->instr, &tex->src[i-1].src, &tex->src[i].src);
}
if (tex->is_array) {
/* The offset is not applied to the array index */
if (tex->coord_components == 2) {
offset_coord = nir_vec2(b, nir_channel(b, offset_coord, 0),
nir_channel(b, coord, 1));
} else if (tex->coord_components == 3) {
offset_coord = nir_vec3(b, nir_channel(b, offset_coord, 0),
nir_channel(b, offset_coord, 1),
nir_channel(b, coord, 2));
} else {
unreachable("Invalid number of components");
}
}
nir_instr_rewrite_src(&tex->instr, &tex->src[coord_index].src,
nir_src_for_ssa(offset_coord));
tex_instr_remove_src(tex, offset_index);
return true;
tex->num_srcs--;
}
static nir_ssa_def *
get_texture_size(nir_builder *b, nir_tex_instr *tex)
{
@@ -506,12 +444,6 @@ nir_lower_tex_block(nir_block *block, nir_builder *b,
progress = true;
}
if ((tex->op == nir_texop_txf && options->lower_txf_offset) ||
(tex->sampler_dim == GLSL_SAMPLER_DIM_RECT &&
options->lower_rect_offset)) {
progress = lower_offset(b, tex) || progress;
}
if ((tex->sampler_dim == GLSL_SAMPLER_DIM_RECT) && options->lower_rect) {
lower_rect(b, tex);
progress = true;
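The `lower_offset` variant in this file folds the texel offset into the coordinate and, for array textures, leaves the array index untouched. A minimal sketch of that rule on plain integer coordinates, assuming a hypothetical fixed-size `Coord` type in place of NIR SSA values:

```cpp
#include <array>
#include <cassert>

// Hypothetical model of integer texel coordinates; unused trailing slots stay as given.
using Coord = std::array<int, 3>;

// Adds `offset` to `coord` component-wise. For array textures the last of the
// `coord_components` (which must be >= 1) is the array layer, and the offset
// is not applied to it.
inline Coord apply_texel_offset(Coord coord, const Coord &offset,
                                unsigned coord_components, bool is_array)
{
    const unsigned spatial = is_array ? coord_components - 1 : coord_components;
    for (unsigned i = 0; i < spatial && i < coord.size(); i++)
        coord[i] += offset[i];
    return coord;
}
```

The float path for rect textures in the hunk works the same way, except that the offset is converted to float before the add.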

@@ -471,7 +471,7 @@ lower_copies_to_load_store(struct deref_node *node,
return true;
}
/* Performs variable renaming
/* Performs variable renaming by doing a DFS of the dominance tree
*
* This algorithm is very similar to the one outlined in "Efficiently
* Computing Static Single Assignment Form and the Control Dependence
@@ -479,132 +479,133 @@ lower_copies_to_load_store(struct deref_node *node,
* SSA def on the stack per block.
*/
static bool
rename_variables(struct lower_variables_state *state)
rename_variables_block(nir_block *block, struct lower_variables_state *state)
{
nir_builder b;
nir_builder_init(&b, state->impl);
nir_foreach_block(block, state->impl) {
nir_foreach_instr_safe(instr, block) {
if (instr->type != nir_instr_type_intrinsic)
continue;
nir_foreach_instr_safe(instr, block) {
if (instr->type != nir_instr_type_intrinsic)
continue;
nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
switch (intrin->intrinsic) {
case nir_intrinsic_load_var: {
struct deref_node *node =
get_deref_node(intrin->variables[0], state);
switch (intrin->intrinsic) {
case nir_intrinsic_load_var: {
struct deref_node *node =
get_deref_node(intrin->variables[0], state);
if (node == NULL) {
/* If we hit this path then we are referencing an invalid
* value. Most likely, we unrolled something and are
* reading past the end of some array. In any case, this
* should result in an undefined value.
*/
nir_ssa_undef_instr *undef =
nir_ssa_undef_instr_create(state->shader,
intrin->num_components,
intrin->dest.ssa.bit_size);
if (node == NULL) {
/* If we hit this path then we are referencing an invalid
* value. Most likely, we unrolled something and are
* reading past the end of some array. In any case, this
* should result in an undefined value.
*/
nir_ssa_undef_instr *undef =
nir_ssa_undef_instr_create(state->shader,
intrin->num_components,
intrin->dest.ssa.bit_size);
nir_instr_insert_before(&intrin->instr, &undef->instr);
nir_instr_remove(&intrin->instr);
nir_ssa_def_rewrite_uses(&intrin->dest.ssa,
nir_src_for_ssa(&undef->def));
continue;
}
if (!node->lower_to_ssa)
continue;
nir_alu_instr *mov = nir_alu_instr_create(state->shader,
nir_op_imov);
mov->src[0].src = nir_src_for_ssa(
nir_phi_builder_value_get_block_def(node->pb_value, block));
for (unsigned i = intrin->num_components; i < 4; i++)
mov->src[0].swizzle[i] = 0;
assert(intrin->dest.is_ssa);
mov->dest.write_mask = (1 << intrin->num_components) - 1;
nir_ssa_dest_init(&mov->instr, &mov->dest.dest,
intrin->num_components,
intrin->dest.ssa.bit_size, NULL);
nir_instr_insert_before(&intrin->instr, &mov->instr);
nir_instr_insert_before(&intrin->instr, &undef->instr);
nir_instr_remove(&intrin->instr);
nir_ssa_def_rewrite_uses(&intrin->dest.ssa,
nir_src_for_ssa(&mov->dest.dest.ssa));
break;
nir_src_for_ssa(&undef->def));
continue;
}
case nir_intrinsic_store_var: {
struct deref_node *node =
get_deref_node(intrin->variables[0], state);
if (!node->lower_to_ssa)
continue;
if (node == NULL) {
/* Probably an out-of-bounds array store. That should be a
* no-op. */
nir_instr_remove(&intrin->instr);
continue;
}
nir_alu_instr *mov = nir_alu_instr_create(state->shader,
nir_op_imov);
mov->src[0].src = nir_src_for_ssa(
nir_phi_builder_value_get_block_def(node->pb_value, block));
for (unsigned i = intrin->num_components; i < 4; i++)
mov->src[0].swizzle[i] = 0;
if (!node->lower_to_ssa)
continue;
assert(intrin->dest.is_ssa);
assert(intrin->num_components ==
glsl_get_vector_elements(node->type));
mov->dest.write_mask = (1 << intrin->num_components) - 1;
nir_ssa_dest_init(&mov->instr, &mov->dest.dest,
intrin->num_components,
intrin->dest.ssa.bit_size, NULL);
assert(intrin->src[0].is_ssa);
nir_instr_insert_before(&intrin->instr, &mov->instr);
nir_instr_remove(&intrin->instr);
nir_ssa_def *new_def;
b.cursor = nir_before_instr(&intrin->instr);
nir_ssa_def_rewrite_uses(&intrin->dest.ssa,
nir_src_for_ssa(&mov->dest.dest.ssa));
break;
}
unsigned wrmask = nir_intrinsic_write_mask(intrin);
if (wrmask == (1 << intrin->num_components) - 1) {
/* Whole variable store - just copy the source. Note that
* intrin->num_components and intrin->src[0].ssa->num_components
* may differ.
*/
unsigned swiz[4];
for (unsigned i = 0; i < 4; i++)
swiz[i] = i < intrin->num_components ? i : 0;
case nir_intrinsic_store_var: {
struct deref_node *node =
get_deref_node(intrin->variables[0], state);
new_def = nir_swizzle(&b, intrin->src[0].ssa, swiz,
intrin->num_components, false);
} else {
nir_ssa_def *old_def =
nir_phi_builder_value_get_block_def(node->pb_value, block);
/* For writemasked store_var intrinsics, we combine the newly
* written values with the existing contents of unwritten
* channels, creating a new SSA value for the whole vector.
*/
nir_ssa_def *srcs[4];
for (unsigned i = 0; i < intrin->num_components; i++) {
if (wrmask & (1 << i)) {
srcs[i] = nir_channel(&b, intrin->src[0].ssa, i);
} else {
srcs[i] = nir_channel(&b, old_def, i);
}
}
new_def = nir_vec(&b, srcs, intrin->num_components);
}
assert(new_def->num_components == intrin->num_components);
nir_phi_builder_value_set_block_def(node->pb_value, block, new_def);
if (node == NULL) {
/* Probably an out-of-bounds array store. That should be a
* no-op. */
nir_instr_remove(&intrin->instr);
break;
continue;
}
default:
break;
if (!node->lower_to_ssa)
continue;
assert(intrin->num_components ==
glsl_get_vector_elements(node->type));
assert(intrin->src[0].is_ssa);
nir_ssa_def *new_def;
b.cursor = nir_before_instr(&intrin->instr);
unsigned wrmask = nir_intrinsic_write_mask(intrin);
if (wrmask == (1 << intrin->num_components) - 1) {
/* Whole variable store - just copy the source. Note that
* intrin->num_components and intrin->src[0].ssa->num_components
* may differ.
*/
unsigned swiz[4];
for (unsigned i = 0; i < 4; i++)
swiz[i] = i < intrin->num_components ? i : 0;
new_def = nir_swizzle(&b, intrin->src[0].ssa, swiz,
intrin->num_components, false);
} else {
nir_ssa_def *old_def =
nir_phi_builder_value_get_block_def(node->pb_value, block);
/* For writemasked store_var intrinsics, we combine the newly
* written values with the existing contents of unwritten
* channels, creating a new SSA value for the whole vector.
*/
nir_ssa_def *srcs[4];
for (unsigned i = 0; i < intrin->num_components; i++) {
if (wrmask & (1 << i)) {
srcs[i] = nir_channel(&b, intrin->src[0].ssa, i);
} else {
srcs[i] = nir_channel(&b, old_def, i);
}
}
new_def = nir_vec(&b, srcs, intrin->num_components);
}
assert(new_def->num_components == intrin->num_components);
nir_phi_builder_value_set_block_def(node->pb_value, block, new_def);
nir_instr_remove(&intrin->instr);
break;
}
default:
break;
}
}
for (unsigned i = 0; i < block->num_dom_children; ++i)
rename_variables_block(block->dom_children[i], state);
return true;
}
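
Note on the writemasked store_var path above: the per-channel merge can be sketched outside NIR as follows. This is a minimal illustration using plain integers instead of SSA values; `merge_writemasked_store` and its parameters are hypothetical names, not part of the NIR API.

```c
#include <assert.h>

/* Hypothetical helper (not a NIR function): build the new value of a
 * vector of up to four channels after a masked store. */
static void
merge_writemasked_store(unsigned out[4], const unsigned old_val[4],
                        const unsigned new_val[4],
                        unsigned num_components, unsigned wrmask)
{
   assert(num_components <= 4);

   for (unsigned i = 0; i < num_components; i++) {
      /* A set bit in wrmask means the store writes channel i;
       * otherwise the previous contents of that channel survive. */
      out[i] = (wrmask & (1u << i)) ? new_val[i] : old_val[i];
   }
}
```

With a mask of 0x5 only channels 0 and 2 take the stored value. A full mask, `(1 << num_components) - 1`, degenerates to a plain copy, which is the case the pass handles separately with a swizzle.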
@@ -736,7 +737,7 @@ nir_lower_vars_to_ssa_impl(nir_function_impl *impl)
}
}
rename_variables(&state);
rename_variables_block(nir_start_block(impl), &state);
nir_phi_builder_finish(state.phi_builder);

@@ -257,7 +257,7 @@ unpack_4x8("unorm")
unpack_2x16("half")
unop_horiz("pack_uvec2_to_uint", 1, tuint32, 2, tuint32, """
dst.x = (src0.x & 0xffff) | (src0.y << 16);
dst.x = (src0.x & 0xffff) | (src0.y >> 16);
""")
unop_horiz("pack_uvec4_to_uint", 1, tuint32, 4, tuint32, """
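
Note on the pack_uvec2_to_uint hunk above: the two variants differ only in shift direction. The sketch below, with hypothetical function names, assumes the goal is to place the low 16 bits of each input into one half of a 32-bit word, and shows what each expression computes.

```c
#include <assert.h>
#include <stdint.h>

/* Left-shift variant: low 16 bits of x in the low half, low 16 bits
 * of y in the high half. */
static uint32_t
pack_halves_shl(uint32_t x, uint32_t y)
{
   return (x & 0xffff) | (y << 16);
}

/* Right-shift variant: the low 16 bits of y are shifted out, and the
 * upper 16 bits of y are OR'd into the low half instead. */
static uint32_t
pack_halves_shr(uint32_t x, uint32_t y)
{
   return (x & 0xffff) | (y >> 16);
}

/* Small self-check of the difference for 16-bit inputs. */
static void
pack_halves_selfcheck(void)
{
   assert(pack_halves_shl(0x1234, 0xabcd) == 0xabcd1234u);
   assert(pack_halves_shr(0x1234, 0xabcd) == 0x00001234u);
}
```

For inputs that fit in 16 bits, the right-shift form drops the second component entirely.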

@@ -45,10 +45,11 @@ d = 'd'
# however, be used for backend-requested lowering operations as those need to
# happen regardless of precision.
#
# Variable names are specified as "[#]name[@type]" where "#" indicates that
# the given variable will only match constants and the type indicates that
# Variable names are specified as "[#]name[@type][(cond)]" where "#" indicates
# that the given variable will only match constants and the type indicates that
# the given variable will only match values from ALU instructions with the
# given output type.
# given output type, and (cond) specifies an additional condition function
# (see nir_search_helpers.h).
#
# For constants, you have to be careful to make sure that it is the right
# type because python is unaware of the source and destination types of the
@@ -62,6 +63,14 @@ d = 'd'
# constructed value should have that bit-size.
optimizations = [
(('imul', a, '#b@32(is_pos_power_of_two)'), ('ishl', a, ('find_lsb', b))),
(('imul', a, '#b@32(is_neg_power_of_two)'), ('ineg', ('ishl', a, ('find_lsb', ('iabs', b))))),
(('udiv', a, '#b@32(is_pos_power_of_two)'), ('ushr', a, ('find_lsb', b))),
(('idiv', a, '#b@32(is_pos_power_of_two)'), ('imul', ('isign', a), ('ushr', ('iabs', a), ('find_lsb', b))), 'options->lower_idiv'),
(('idiv', a, '#b@32(is_neg_power_of_two)'), ('ineg', ('imul', ('isign', a), ('ushr', ('iabs', a), ('find_lsb', ('iabs', b))))), 'options->lower_idiv'),
(('umod', a, '#b(is_pos_power_of_two)'), ('iand', a, ('isub', b, 1))),
(('fneg', ('fneg', a)), a),
(('ineg', ('ineg', a)), a),
(('fabs', ('fabs', a)), ('fabs', a)),
@@ -224,6 +233,8 @@ optimizations = [
(('~flog2', ('frcp', a)), ('fneg', ('flog2', a))),
(('~flog2', ('frsq', a)), ('fmul', -0.5, ('flog2', a))),
(('~flog2', ('fpow', a, b)), ('fmul', b, ('flog2', a))),
(('~fadd', ('flog2', a), ('flog2', b)), ('flog2', ('fmul', a, b))),
(('~fadd', ('flog2', a), ('fneg', ('flog2', b))), ('flog2', ('fdiv', a, b))),
(('~fmul', ('fexp2', a), ('fexp2', b)), ('fexp2', ('fadd', a, b))),
# Division and reciprocal
(('~fdiv', 1.0, a), ('frcp', a)),
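
Note on the power-of-two rules in the hunk above: the arithmetic behind the imul, udiv, idiv and umod rewrites can be checked on plain 32-bit integers. This is a sketch only; `find_lsb_u32` is a hypothetical stand-in for the find_lsb opcode, and the helpers assume the divisor or multiplier is a positive power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the lowest set bit. For a power of two this is log2(v). */
static unsigned
find_lsb_u32(uint32_t v)
{
   assert(v != 0);
   unsigned n = 0;
   while (!(v & 1u)) {
      v >>= 1;
      n++;
   }
   return n;
}

/* imul a, 2^k  ->  ishl a, k */
static uint32_t mul_pow2(uint32_t a, uint32_t b) { return a << find_lsb_u32(b); }

/* udiv a, 2^k  ->  ushr a, k */
static uint32_t udiv_pow2(uint32_t a, uint32_t b) { return a >> find_lsb_u32(b); }

/* umod a, 2^k  ->  iand a, (2^k - 1) */
static uint32_t umod_pow2(uint32_t a, uint32_t b) { return a & (b - 1); }

/* idiv a, 2^k  ->  isign(a) * (iabs(a) >> k), which truncates toward
 * zero like C integer division. INT32_MIN is excluded here because
 * its absolute value is not representable. */
static int32_t
idiv_pow2(int32_t a, uint32_t b)
{
   assert(a != INT32_MIN);
   int32_t sign = (a > 0) - (a < 0);
   uint32_t mag = (uint32_t)(a < 0 ? -a : a);
   return sign * (int32_t)(mag >> find_lsb_u32(b));
}
```

The signed case goes through the absolute value because an arithmetic right shift alone rounds toward negative infinity, so -7 shifted by one would give -4 rather than -3.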

@@ -44,8 +44,7 @@
* var.pb_val = nir_phi_builder_add_value(pb, var.defs)
*
* // Visit each block. This needs to visit dominators first;
* // nir_foreach_block() will be ok.
*
* // nir_for_each_block() will be ok.
* foreach block:
* foreach instruction:
* foreach use of variable var:

@@ -1,196 +0,0 @@
/*
* Copyright © 2016 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
#include "nir.h"
static void
add_src(nir_src *src, struct set *invariants)
{
if (src->is_ssa) {
_mesa_set_add(invariants, src->ssa);
} else {
_mesa_set_add(invariants, src->reg.reg);
}
}
static bool
add_src_cb(nir_src *src, void *state)
{
add_src(src, state);
return true;
}
static bool
dest_is_invariant(nir_dest *dest, struct set *invariants)
{
if (dest->is_ssa) {
return _mesa_set_search(invariants, &dest->ssa);
} else {
return _mesa_set_search(invariants, dest->reg.reg);
}
}
static void
add_cf_node(nir_cf_node *cf, struct set *invariants)
{
if (cf->type == nir_cf_node_if) {
nir_if *if_stmt = nir_cf_node_as_if(cf);
add_src(&if_stmt->condition, invariants);
}
if (cf->parent)
add_cf_node(cf->parent, invariants);
}
static void
add_var(nir_variable *var, struct set *invariants)
{
_mesa_set_add(invariants, var);
}
static bool
var_is_invariant(nir_variable *var, struct set * invariants)
{
return var->data.invariant || _mesa_set_search(invariants, var);
}
static void
propagate_invariant_instr(nir_instr *instr, struct set *invariants)
{
switch (instr->type) {
case nir_instr_type_alu: {
nir_alu_instr *alu = nir_instr_as_alu(instr);
if (!dest_is_invariant(&alu->dest.dest, invariants))
break;
alu->exact = true;
nir_foreach_src(instr, add_src_cb, invariants);
break;
}
case nir_instr_type_tex: {
nir_tex_instr *tex = nir_instr_as_tex(instr);
if (dest_is_invariant(&tex->dest, invariants))
nir_foreach_src(instr, add_src_cb, invariants);
break;
}
case nir_instr_type_intrinsic: {
nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
switch (intrin->intrinsic) {
case nir_intrinsic_copy_var:
/* If the destination is invariant then so is the source */
if (var_is_invariant(intrin->variables[0]->var, invariants))
add_var(intrin->variables[1]->var, invariants);
break;
case nir_intrinsic_load_var:
if (dest_is_invariant(&intrin->dest, invariants))
add_var(intrin->variables[0]->var, invariants);
break;
case nir_intrinsic_store_var:
if (var_is_invariant(intrin->variables[0]->var, invariants))
add_src(&intrin->src[0], invariants);
break;
default:
/* Nothing to do */
break;
}
}
case nir_instr_type_jump:
case nir_instr_type_ssa_undef:
case nir_instr_type_load_const:
break; /* Nothing to do */
case nir_instr_type_phi: {
nir_phi_instr *phi = nir_instr_as_phi(instr);
if (!dest_is_invariant(&phi->dest, invariants))
break;
nir_foreach_phi_src(src, phi) {
add_src(&src->src, invariants);
add_cf_node(&src->pred->cf_node, invariants);
}
break;
}
case nir_instr_type_call:
unreachable("This pass must be run after function inlining");
case nir_instr_type_parallel_copy:
default:
unreachable("Cannot have this instruction type");
}
}
static bool
propagate_invariant_impl(nir_function_impl *impl, struct set *invariants)
{
bool progress = false;
while (true) {
uint32_t prev_entries = invariants->entries;
nir_foreach_block_reverse(block, impl) {
nir_foreach_instr_reverse(instr, block)
propagate_invariant_instr(instr, invariants);
}
/* Keep running until we make no more progress. */
if (invariants->entries > prev_entries) {
progress = true;
continue;
} else {
break;
}
}
if (progress) {
nir_metadata_preserve(impl, nir_metadata_block_index |
nir_metadata_dominance |
nir_metadata_live_ssa_defs);
}
return progress;
}
bool
nir_propagate_invariant(nir_shader *shader)
{
/* Hash set of invariant things */
struct set *invariants = _mesa_set_create(NULL, _mesa_hash_pointer,
_mesa_key_pointer_equal);
bool progress = false;
nir_foreach_function(function, shader) {
if (function->impl && propagate_invariant_impl(function->impl, invariants))
progress = true;
}
_mesa_set_destroy(invariants, NULL);
return progress;
}
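
Note on propagate_invariant_impl above: the pass is a backward fixed-point iteration over a growing set. The sketch below reproduces that shape on a toy instruction list; `toy_instr`, `MAX_VALUES` and `propagate_marks` are hypothetical, and a flat bool array stands in for the pointer hash set.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_VALUES 16

/* Toy instruction: one destination value and up to two sources,
 * each identified by an index below MAX_VALUES (-1 = unused source). */
struct toy_instr {
   int dest;
   int srcs[2];
};

/* Walk the instructions in reverse, marking the sources of every
 * instruction whose destination is already marked. Repeat until a
 * full pass adds nothing. Returns the number of marked values. */
static unsigned
propagate_marks(const struct toy_instr *instrs, unsigned count,
                bool marked[MAX_VALUES])
{
   bool progress;
   do {
      progress = false;
      for (unsigned i = count; i-- > 0;) {
         assert(instrs[i].dest >= 0 && instrs[i].dest < MAX_VALUES);
         if (!marked[instrs[i].dest])
            continue;
         for (unsigned s = 0; s < 2; s++) {
            int src = instrs[i].srcs[s];
            if (src >= 0 && !marked[src]) {
               marked[src] = true;
               progress = true;
            }
         }
      }
   } while (progress);

   unsigned n = 0;
   for (unsigned v = 0; v < MAX_VALUES; v++)
      n += marked[v];
   return n;
}
```

In straight-line code a single reverse pass reaches everything. The repeat matters when a use can appear before its definition in walk order, as with phi sources coming from a loop back edge.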

@@ -127,6 +127,9 @@ match_value(const nir_search_value *value, nir_alu_instr *instr, unsigned src,
instr->src[src].src.ssa->parent_instr->type != nir_instr_type_load_const)
return false;
if (var->cond && !var->cond(instr, src, num_components, new_swizzle))
return false;
if (var->type != nir_type_invalid) {
if (instr->src[src].src.ssa->parent_instr->type != nir_instr_type_alu)
return false;

@@ -68,6 +68,16 @@ typedef struct {
* never match anything.
*/
nir_alu_type type;
/** Optional condition fxn ptr
*
* This is only allowed in search expressions, and allows additional
* constraints to be placed on the match. Typically used for 'is_constant'
* variables to require, for example, power-of-two in order for the search
* to match.
*/
bool (*cond)(nir_alu_instr *instr, unsigned src,
unsigned num_components, const uint8_t *swizzle);
} nir_search_variable;
typedef struct {

@@ -0,0 +1,94 @@
/*
* Copyright © 2016 Red Hat
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*
* Authors:
* Rob Clark <robclark@freedesktop.org>
*/
#ifndef _NIR_SEARCH_HELPERS_
#define _NIR_SEARCH_HELPERS_
#include "nir.h"
static inline bool
__is_power_of_two(unsigned int x)
{
return ((x != 0) && !(x & (x - 1)));
}
static inline bool
is_pos_power_of_two(nir_alu_instr *instr, unsigned src, unsigned num_components,
const uint8_t *swizzle)
{
nir_const_value *val = nir_src_as_const_value(instr->src[src].src);
/* only constant src's: */
if (!val)
return false;
for (unsigned i = 0; i < num_components; i++) {
switch (nir_op_infos[instr->op].input_types[src]) {
case nir_type_int:
if (val->i32[swizzle[i]] < 0)
return false;
if (!__is_power_of_two(val->i32[swizzle[i]]))
return false;
break;
case nir_type_uint:
if (!__is_power_of_two(val->u32[swizzle[i]]))
return false;
break;
default:
return false;
}
}
return true;
}
static inline bool
is_neg_power_of_two(nir_alu_instr *instr, unsigned src, unsigned num_components,
const uint8_t *swizzle)
{
nir_const_value *val = nir_src_as_const_value(instr->src[src].src);
/* only constant src's: */
if (!val)
return false;
for (unsigned i = 0; i < num_components; i++) {
switch (nir_op_infos[instr->op].input_types[src]) {
case nir_type_int:
if (val->i32[swizzle[i]] > 0)
return false;
if (!__is_power_of_two(abs(val->i32[swizzle[i]])))
return false;
break;
default:
return false;
}
}
return true;
}
#endif /* _NIR_SEARCH_HELPERS_ */
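
Note on the helpers above: the scalar tests they apply to each swizzled component can be exercised on their own. The sketch below uses hypothetical names and one deliberate difference from the header: it negates in unsigned arithmetic rather than calling abs(), which is an assumption made here so that INT32_MIN is handled without signed overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when exactly one bit of x is set. */
static bool
is_power_of_two_u32(uint32_t x)
{
   return x != 0 && !(x & (x - 1));
}

/* True when v is negative and its magnitude is a power of two. */
static bool
is_neg_power_of_two_i32(int32_t v)
{
   if (v >= 0)
      return false;
   /* 0u - (uint32_t)v is the magnitude of v, computed without
    * overflowing for INT32_MIN. */
   return is_power_of_two_u32(0u - (uint32_t)v);
}

/* Small self-check covering zero, the sign boundary and a non-power. */
static void
pow2_predicates_selfcheck(void)
{
   assert(!is_power_of_two_u32(0));
   assert(is_power_of_two_u32(64));
   assert(is_neg_power_of_two_i32(-8));
   assert(!is_neg_power_of_two_i32(-6));
}
```

A vector constant passes only if every selected component passes, which is why the header loops over num_components and returns false on the first failure.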

@@ -1335,9 +1335,54 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp opcode,
} else {
image_type = sampled.sampler->var->var->interface_type;
}
const enum glsl_sampler_dim sampler_dim = glsl_get_sampler_dim(image_type);
const bool is_array = glsl_sampler_type_is_array(image_type);
const bool is_shadow = glsl_sampler_type_is_shadow(image_type);
nir_tex_src srcs[8]; /* 8 should be enough */
nir_tex_src *p = srcs;
unsigned idx = 4;
bool has_coord = false;
switch (opcode) {
case SpvOpImageSampleImplicitLod:
case SpvOpImageSampleExplicitLod:
case SpvOpImageSampleDrefImplicitLod:
case SpvOpImageSampleDrefExplicitLod:
case SpvOpImageSampleProjImplicitLod:
case SpvOpImageSampleProjExplicitLod:
case SpvOpImageSampleProjDrefImplicitLod:
case SpvOpImageSampleProjDrefExplicitLod:
case SpvOpImageFetch:
case SpvOpImageGather:
case SpvOpImageDrefGather:
case SpvOpImageQueryLod: {
/* All these types have the coordinate as their first real argument */
struct vtn_ssa_value *coord = vtn_ssa_value(b, w[idx++]);
has_coord = true;
p->src = nir_src_for_ssa(coord->def);
p->src_type = nir_tex_src_coord;
p++;
break;
}
default:
break;
}
/* These all have an explicit depth value as their next source */
switch (opcode) {
case SpvOpImageSampleDrefImplicitLod:
case SpvOpImageSampleDrefExplicitLod:
case SpvOpImageSampleProjDrefImplicitLod:
case SpvOpImageSampleProjDrefExplicitLod:
(*p++) = vtn_tex_src(b, w[idx++], nir_tex_src_comparitor);
break;
default:
break;
}
/* For OpImageQuerySizeLod, we always have an LOD */
if (opcode == SpvOpImageQuerySizeLod)
(*p++) = vtn_tex_src(b, w[idx++], nir_tex_src_lod);
/* Figure out the base texture operation */
nir_texop texop;
@@ -1383,108 +1428,10 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp opcode,
break;
case SpvOpImageQuerySamples:
texop = nir_texop_texture_samples;
break;
default:
unreachable("Unhandled opcode");
}
nir_tex_src srcs[8]; /* 8 should be enough */
nir_tex_src *p = srcs;
unsigned idx = 4;
struct nir_ssa_def *coord;
unsigned coord_components;
switch (opcode) {
case SpvOpImageSampleImplicitLod:
case SpvOpImageSampleExplicitLod:
case SpvOpImageSampleDrefImplicitLod:
case SpvOpImageSampleDrefExplicitLod:
case SpvOpImageSampleProjImplicitLod:
case SpvOpImageSampleProjExplicitLod:
case SpvOpImageSampleProjDrefImplicitLod:
case SpvOpImageSampleProjDrefExplicitLod:
case SpvOpImageFetch:
case SpvOpImageGather:
case SpvOpImageDrefGather:
case SpvOpImageQueryLod: {
/* All these types have the coordinate as their first real argument */
switch (sampler_dim) {
case GLSL_SAMPLER_DIM_1D:
case GLSL_SAMPLER_DIM_BUF:
coord_components = 1;
break;
case GLSL_SAMPLER_DIM_2D:
case GLSL_SAMPLER_DIM_RECT:
case GLSL_SAMPLER_DIM_MS:
coord_components = 2;
break;
case GLSL_SAMPLER_DIM_3D:
case GLSL_SAMPLER_DIM_CUBE:
coord_components = 3;
break;
default:
assert("Invalid sampler type");
}
if (is_array && texop != nir_texop_lod)
coord_components++;
coord = vtn_ssa_value(b, w[idx++])->def;
p->src = nir_src_for_ssa(coord);
p->src_type = nir_tex_src_coord;
p++;
break;
}
default:
coord = NULL;
coord_components = 0;
break;
}
switch (opcode) {
case SpvOpImageSampleProjImplicitLod:
case SpvOpImageSampleProjExplicitLod:
case SpvOpImageSampleProjDrefImplicitLod:
case SpvOpImageSampleProjDrefExplicitLod:
/* These have the projector as the last coordinate component */
p->src = nir_src_for_ssa(nir_channel(&b->nb, coord, coord_components));
p->src_type = nir_tex_src_projector;
p++;
break;
default:
break;
}
unsigned gather_component = 0;
switch (opcode) {
case SpvOpImageSampleDrefImplicitLod:
case SpvOpImageSampleDrefExplicitLod:
case SpvOpImageSampleProjDrefImplicitLod:
case SpvOpImageSampleProjDrefExplicitLod:
case SpvOpImageDrefGather:
/* These all have an explicit depth value as their next source */
(*p++) = vtn_tex_src(b, w[idx++], nir_tex_src_comparitor);
break;
case SpvOpImageGather:
/* This has a component as its next source */
gather_component =
vtn_value(b, w[idx++], vtn_value_type_constant)->constant->value.u[0];
break;
default:
break;
}
/* For OpImageQuerySizeLod, we always have an LOD */
if (opcode == SpvOpImageQuerySizeLod)
(*p++) = vtn_tex_src(b, w[idx++], nir_tex_src_lod);
/* Now we need to handle some number of optional arguments */
if (idx < count) {
uint32_t operands = w[idx++];
@@ -1497,12 +1444,12 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp opcode,
if (operands & SpvImageOperandsLodMask) {
assert(texop == nir_texop_txl || texop == nir_texop_txf ||
texop == nir_texop_txs);
texop == nir_texop_txf_ms || texop == nir_texop_txs);
(*p++) = vtn_tex_src(b, w[idx++], nir_tex_src_lod);
}
if (operands & SpvImageOperandsGradMask) {
assert(texop == nir_texop_txl);
assert(texop == nir_texop_tex);
texop = nir_texop_txd;
(*p++) = vtn_tex_src(b, w[idx++], nir_tex_src_ddx);
(*p++) = vtn_tex_src(b, w[idx++], nir_tex_src_ddy);
@@ -1529,13 +1476,35 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp opcode,
memcpy(instr->src, srcs, instr->num_srcs * sizeof(*instr->src));
instr->coord_components = coord_components;
instr->sampler_dim = sampler_dim;
instr->is_array = is_array;
instr->is_shadow = is_shadow;
instr->is_new_style_shadow =
is_shadow && glsl_get_components(ret_type->type) == 1;
instr->component = gather_component;
instr->sampler_dim = glsl_get_sampler_dim(image_type);
instr->is_array = glsl_sampler_type_is_array(image_type);
instr->is_shadow = glsl_sampler_type_is_shadow(image_type);
instr->is_new_style_shadow = instr->is_shadow;
if (has_coord) {
switch (instr->sampler_dim) {
case GLSL_SAMPLER_DIM_1D:
case GLSL_SAMPLER_DIM_BUF:
instr->coord_components = 1;
break;
case GLSL_SAMPLER_DIM_2D:
case GLSL_SAMPLER_DIM_RECT:
case GLSL_SAMPLER_DIM_MS:
instr->coord_components = 2;
break;
case GLSL_SAMPLER_DIM_3D:
case GLSL_SAMPLER_DIM_CUBE:
instr->coord_components = 3;
break;
default:
assert("Invalid sampler type");
}
if (instr->is_array)
instr->coord_components++;
} else {
instr->coord_components = 0;
}
switch (glsl_get_sampler_result_type(image_type)) {
case GLSL_TYPE_FLOAT: instr->dest_type = nir_type_float; break;
@@ -1749,8 +1718,8 @@ vtn_handle_image(struct vtn_builder *b, SpvOp opcode,
break;
case SpvOpAtomicCompareExchange:
intrin->src[2] = nir_src_for_ssa(vtn_ssa_value(b, w[8])->def);
intrin->src[3] = nir_src_for_ssa(vtn_ssa_value(b, w[7])->def);
intrin->src[2] = nir_src_for_ssa(vtn_ssa_value(b, w[7])->def);
intrin->src[3] = nir_src_for_ssa(vtn_ssa_value(b, w[6])->def);
break;
case SpvOpAtomicISub:
@@ -1847,8 +1816,8 @@ fill_common_atomic_sources(struct vtn_builder *b, SpvOp opcode,
break;
case SpvOpAtomicCompareExchange:
src[0] = nir_src_for_ssa(vtn_ssa_value(b, w[8])->def);
src[1] = nir_src_for_ssa(vtn_ssa_value(b, w[7])->def);
src[0] = nir_src_for_ssa(vtn_ssa_value(b, w[7])->def);
src[1] = nir_src_for_ssa(vtn_ssa_value(b, w[8])->def);
break;
/* Fall through */

@@ -239,12 +239,12 @@ vtn_get_branch_type(struct vtn_block *block,
swcase->fallthrough == block->switch_case);
swcase->fallthrough = block->switch_case;
return vtn_branch_type_switch_fallthrough;
} else if (block == switch_break) {
return vtn_branch_type_switch_break;
} else if (block == loop_break) {
return vtn_branch_type_loop_break;
} else if (block == loop_cont) {
return vtn_branch_type_loop_continue;
} else if (block == switch_break) {
return vtn_branch_type_switch_break;
} else {
return vtn_branch_type_none;
}
@@ -443,19 +443,6 @@ vtn_cfg_walk_blocks(struct vtn_builder *b, struct list_head *cf_list,
vtn_order_case(swtch, case_block->switch_case);
}
enum vtn_branch_type branch_type =
vtn_get_branch_type(break_block, switch_case, NULL,
loop_break, loop_cont);
if (branch_type != vtn_branch_type_none) {
/* It is possible that the break is actually the continue block
* for the containing loop. In this case, we need to bail and let
* the loop parsing code handle the continue properly.
*/
assert(branch_type == vtn_branch_type_loop_continue);
return;
}
block = break_block;
continue;
}
@@ -527,12 +514,11 @@ vtn_handle_phi_second_pass(struct vtn_builder *b, SpvOp opcode,
nir_variable *phi_var = phi_entry->data;
for (unsigned i = 3; i < count; i += 2) {
struct vtn_ssa_value *src = vtn_ssa_value(b, w[i]);
struct vtn_block *pred =
vtn_value(b, w[i + 1], vtn_value_type_block)->block;
b->nb.cursor = nir_after_instr(&pred->end_nop->instr);
struct vtn_ssa_value *src = vtn_ssa_value(b, w[i]);
b->nb.cursor = nir_after_block_before_jump(pred->end_block);
vtn_local_store(b, src, nir_deref_var_create(b, phi_var));
}
@@ -590,9 +576,7 @@ vtn_emit_cf_list(struct vtn_builder *b, struct list_head *cf_list,
vtn_foreach_instruction(b, block_start, block_end, handler);
block->end_nop = nir_intrinsic_instr_create(b->nb.shader,
nir_intrinsic_nop);
nir_builder_instr_insert(&b->nb, &block->end_nop->instr);
block->end_block = nir_cursor_current_block(b->nb.cursor);
if ((*block->branch & SpvOpCodeMask) == SpvOpReturnValue) {
struct vtn_ssa_value *src = vtn_ssa_value(b, block->branch[1]);

@@ -149,8 +149,8 @@ struct vtn_block {
/** Points to the switch case started by this block (if any) */
struct vtn_case *switch_case;
/** Every block ends in a nop intrinsic so that we can find it again */
nir_intrinsic_instr *end_nop;
/** The last block in this SPIR-V block. */
nir_block *end_block;
};
struct vtn_function {

@@ -839,8 +839,8 @@ vtn_get_builtin_location(struct vtn_builder *b,
assert(*mode == nir_var_shader_in);
break;
case SpvBuiltInFrontFacing:
*location = SYSTEM_VALUE_FRONT_FACE;
set_mode_system_value(mode);
*location = VARYING_SLOT_FACE;
assert(*mode == nir_var_shader_in);
break;
case SpvBuiltInSampleId:
*location = SYSTEM_VALUE_SAMPLE_ID;
@@ -889,9 +889,81 @@ vtn_get_builtin_location(struct vtn_builder *b,
}
static void
apply_var_decoration(struct vtn_builder *b, nir_variable *nir_var,
const struct vtn_decoration *dec)
var_decoration_cb(struct vtn_builder *b, struct vtn_value *val, int member,
const struct vtn_decoration *dec, void *void_var)
{
struct vtn_variable *vtn_var = void_var;
/* Handle decorations that apply to a vtn_variable as a whole */
switch (dec->decoration) {
case SpvDecorationBinding:
vtn_var->binding = dec->literals[0];
return;
case SpvDecorationDescriptorSet:
vtn_var->descriptor_set = dec->literals[0];
return;
default:
break;
}
/* Now we handle decorations that apply to a particular nir_variable */
nir_variable *nir_var = vtn_var->var;
if (val->value_type == vtn_value_type_access_chain) {
assert(val->access_chain->length == 0);
assert(val->access_chain->var == void_var);
assert(member == -1);
} else {
assert(val->value_type == vtn_value_type_type);
if (member != -1)
nir_var = vtn_var->members[member];
}
/* Location is odd in that it can apply in three different cases: To a
* non-split variable, to a whole split variable, or to one structure
* member of a split variable.
*/
if (dec->decoration == SpvDecorationLocation) {
unsigned location = dec->literals[0];
bool is_vertex_input;
if (b->shader->stage == MESA_SHADER_FRAGMENT &&
vtn_var->mode == vtn_variable_mode_output) {
is_vertex_input = false;
location += FRAG_RESULT_DATA0;
} else if (b->shader->stage == MESA_SHADER_VERTEX &&
vtn_var->mode == vtn_variable_mode_input) {
is_vertex_input = true;
location += VERT_ATTRIB_GENERIC0;
} else if (vtn_var->mode == vtn_variable_mode_input ||
vtn_var->mode == vtn_variable_mode_output) {
is_vertex_input = false;
location += VARYING_SLOT_VAR0;
} else {
assert(!"Location must be on input or output variable");
}
if (nir_var) {
/* This handles the member and lone variable cases */
nir_var->data.location = location;
nir_var->data.explicit_location = true;
} else {
/* This handles the structure member case */
assert(vtn_var->members);
unsigned length =
glsl_get_length(glsl_without_array(vtn_var->type->type));
for (unsigned i = 0; i < length; i++) {
vtn_var->members[i]->data.location = location;
vtn_var->members[i]->data.explicit_location = true;
location +=
glsl_count_attribute_slots(vtn_var->members[i]->interface_type,
is_vertex_input);
}
}
return;
}
if (nir_var == NULL)
return;
switch (dec->decoration) {
case SpvDecorationRelaxedPrecision:
break; /* FIXME: Do nothing with this for now. */
@@ -1008,99 +1080,6 @@ apply_var_decoration(struct vtn_builder *b, nir_variable *nir_var,
}
}
static void
var_decoration_cb(struct vtn_builder *b, struct vtn_value *val, int member,
const struct vtn_decoration *dec, void *void_var)
{
struct vtn_variable *vtn_var = void_var;
/* Handle decorations that apply to a vtn_variable as a whole */
switch (dec->decoration) {
case SpvDecorationBinding:
vtn_var->binding = dec->literals[0];
return;
case SpvDecorationDescriptorSet:
vtn_var->descriptor_set = dec->literals[0];
return;
default:
break;
}
if (val->value_type == vtn_value_type_access_chain) {
assert(val->access_chain->length == 0);
assert(val->access_chain->var == void_var);
assert(member == -1);
} else {
assert(val->value_type == vtn_value_type_type);
}
/* Location is odd. If applied to a split structure, we have to walk the
* whole thing and accumulate the location. It's easier to handle as a
* special case.
*/
if (dec->decoration == SpvDecorationLocation) {
unsigned location = dec->literals[0];
bool is_vertex_input;
if (b->shader->stage == MESA_SHADER_FRAGMENT &&
vtn_var->mode == vtn_variable_mode_output) {
is_vertex_input = false;
location += FRAG_RESULT_DATA0;
} else if (b->shader->stage == MESA_SHADER_VERTEX &&
vtn_var->mode == vtn_variable_mode_input) {
is_vertex_input = true;
location += VERT_ATTRIB_GENERIC0;
} else if (vtn_var->mode == vtn_variable_mode_input ||
vtn_var->mode == vtn_variable_mode_output) {
is_vertex_input = false;
location += VARYING_SLOT_VAR0;
} else {
assert(!"Location must be on input or output variable");
}
if (vtn_var->var) {
/* This handles the member and lone variable cases */
vtn_var->var->data.location = location;
vtn_var->var->data.explicit_location = true;
} else {
/* This handles the structure member case */
assert(vtn_var->members);
unsigned length =
glsl_get_length(glsl_without_array(vtn_var->type->type));
for (unsigned i = 0; i < length; i++) {
vtn_var->members[i]->data.location = location;
vtn_var->members[i]->data.explicit_location = true;
location +=
glsl_count_attribute_slots(vtn_var->members[i]->interface_type,
is_vertex_input);
}
}
return;
} else {
if (vtn_var->var) {
assert(member <= 0);
apply_var_decoration(b, vtn_var->var, dec);
} else if (vtn_var->members) {
if (member >= 0) {
assert(vtn_var->members);
apply_var_decoration(b, vtn_var->members[member], dec);
} else {
unsigned length =
glsl_get_length(glsl_without_array(vtn_var->type->type));
for (unsigned i = 0; i < length; i++)
apply_var_decoration(b, vtn_var->members[i], dec);
}
} else {
/* A few variables, those with external storage, have no actual
* nir_variables associated with them. Fortunately, all decorations
* we care about for those variables are on the type only.
*/
assert(vtn_var->mode == vtn_variable_mode_ubo ||
vtn_var->mode == vtn_variable_mode_ssbo ||
vtn_var->mode == vtn_variable_mode_push_constant);
}
}
}
/* Tries to compute the size of an interface block based on the strides and
* offsets that are provided to us in the SPIR-V source.
*/
@@ -1194,7 +1173,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
case SpvStorageClassPushConstant:
var->mode = vtn_variable_mode_push_constant;
assert(b->shader->num_uniforms == 0);
b->shader->num_uniforms = vtn_type_block_size(var->type);
b->shader->num_uniforms = vtn_type_block_size(var->type) * 4;
break;
case SpvStorageClassInput:
var->mode = vtn_variable_mode_input;

@@ -61,12 +61,6 @@ ifeq ($(shell echo "$(MESA_ANDROID_VERSION) >= 4.2" | bc),1)
LOCAL_SHARED_LIBRARIES += libsync
endif
# add libdrm if there are hardware drivers
ifneq ($(filter-out swrast,$(MESA_GPU_DRIVERS)),)
LOCAL_CFLAGS += -DHAVE_LIBDRM
LOCAL_SHARED_LIBRARIES += libdrm
endif
ifeq ($(strip $(MESA_BUILD_CLASSIC)),true)
# require i915_dri and/or i965_dri
LOCAL_REQUIRED_MODULES += \

@@ -242,15 +242,6 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id,
return NULL;
break;
case __DRI_ATTRIB_MAX_PBUFFER_WIDTH:
_eglSetConfigKey(&base, EGL_MAX_PBUFFER_WIDTH,
_EGL_MAX_PBUFFER_WIDTH);
break;
case __DRI_ATTRIB_MAX_PBUFFER_HEIGHT:
_eglSetConfigKey(&base, EGL_MAX_PBUFFER_HEIGHT,
_EGL_MAX_PBUFFER_HEIGHT);
break;
default:
key = dri2_to_egl_attribute_map[attrib];
if (key != 0)
@@ -329,15 +320,6 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id,
surface_type &= ~EGL_PIXMAP_BIT;
}
/* No support for pbuffer + MSAA for now.
*
* XXX TODO: pbuffer + MSAA does not work and causes crashes.
* See QT bugreport: https://bugreports.qt.io/browse/QTBUG-47509
*/
if (base.Samples) {
surface_type &= ~EGL_PBUFFER_BIT;
}
conf->base.SurfaceType |= surface_type;
return conf;
@@ -775,99 +757,64 @@ dri2_create_screen(_EGLDisplay *disp)
/**
* Called via eglInitialize(), GLX_drv->API.Initialize().
*
* This must be guaranteed to be called exactly once, even if eglInitialize is
* called many times (without a eglTerminate in between).
*/
static EGLBoolean
dri2_initialize(_EGLDriver *drv, _EGLDisplay *disp)
{
EGLBoolean ret = EGL_FALSE;
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
/* In the case where the application calls eglMakeCurrent(context1),
* eglTerminate, then eglInitialize again (without a call to eglReleaseThread
* or eglMakeCurrent(NULL) before that), dri2_dpy structure is still
* initialized, as we need it to be able to free context1 correctly.
*
* It would probably be safest to forcibly release the display with
* dri2_display_release, to make sure the display is reinitialized correctly.
* However, the EGL spec states that we need to keep a reference to the
* current context (so we cannot call dri2_make_current(NULL)), and therefore
* we would leak context1 as we would be missing the old display connection
* to free it up correctly.
*/
if (dri2_dpy) {
dri2_dpy->ref_count++;
return EGL_TRUE;
}
/* not until swrast_dri is supported */
if (disp->Options.UseFallback)
return EGL_FALSE;
/* Nothing to initialize for a test only display */
if (disp->Options.TestOnly)
return EGL_TRUE;
switch (disp->Platform) {
#ifdef HAVE_SURFACELESS_PLATFORM
case _EGL_PLATFORM_SURFACELESS:
ret = dri2_initialize_surfaceless(drv, disp);
break;
if (disp->Options.TestOnly)
return EGL_TRUE;
return dri2_initialize_surfaceless(drv, disp);
#endif
#ifdef HAVE_X11_PLATFORM
case _EGL_PLATFORM_X11:
ret = dri2_initialize_x11(drv, disp);
break;
if (disp->Options.TestOnly)
return EGL_TRUE;
return dri2_initialize_x11(drv, disp);
#endif
#ifdef HAVE_DRM_PLATFORM
case _EGL_PLATFORM_DRM:
ret = dri2_initialize_drm(drv, disp);
break;
if (disp->Options.TestOnly)
return EGL_TRUE;
return dri2_initialize_drm(drv, disp);
#endif
#ifdef HAVE_WAYLAND_PLATFORM
case _EGL_PLATFORM_WAYLAND:
ret = dri2_initialize_wayland(drv, disp);
break;
if (disp->Options.TestOnly)
return EGL_TRUE;
return dri2_initialize_wayland(drv, disp);
#endif
#ifdef HAVE_ANDROID_PLATFORM
case _EGL_PLATFORM_ANDROID:
ret = dri2_initialize_android(drv, disp);
break;
if (disp->Options.TestOnly)
return EGL_TRUE;
return dri2_initialize_android(drv, disp);
#endif
default:
_eglLog(_EGL_WARNING, "No EGL platform enabled.");
return EGL_FALSE;
}
if (ret) {
dri2_dpy = dri2_egl_display(disp);
if (!dri2_dpy) {
return EGL_FALSE;
}
dri2_dpy->ref_count++;
}
return ret;
}
/**
* Decrement display reference count, and free up display if necessary.
* Called via eglTerminate(), drv->API.Terminate().
*/
static void
dri2_display_release(_EGLDisplay *disp) {
static EGLBoolean
dri2_terminate(_EGLDriver *drv, _EGLDisplay *disp)
{
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
unsigned i;
assert(dri2_dpy->ref_count > 0);
dri2_dpy->ref_count--;
if (dri2_dpy->ref_count > 0)
return;
_eglReleaseDisplayResources(drv, disp);
_eglCleanupDisplay(disp);
if (dri2_dpy->own_dri_screen)
@@ -922,21 +869,6 @@ dri2_display_release(_EGLDisplay *disp) {
}
free(dri2_dpy);
disp->DriverData = NULL;
}
/**
* Called via eglTerminate(), drv->API.Terminate().
*
* This must be guaranteed to be called exactly once, even if eglTerminate is
* called many times (without an eglInitialize in between).
*/
static EGLBoolean
dri2_terminate(_EGLDriver *drv, _EGLDisplay *disp)
{
/* Release all non-current Context/Surfaces. */
_eglReleaseDisplayResources(drv, disp);
dri2_display_release(disp);
return EGL_TRUE;
}
@@ -1256,16 +1188,10 @@ dri2_make_current(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *dsurf,
_EGLSurface *tmp_dsurf, *tmp_rsurf;
__DRIdrawable *ddraw, *rdraw;
__DRIcontext *cctx;
EGLBoolean unbind;
if (!dri2_dpy)
return _eglError(EGL_NOT_INITIALIZED, "eglMakeCurrent");
/* make new bindings */
if (!_eglBindContext(ctx, dsurf, rsurf, &old_ctx, &old_dsurf, &old_rsurf)) {
/* _eglBindContext already sets the EGL error (in _eglCheckMakeCurrent) */
if (!_eglBindContext(ctx, dsurf, rsurf, &old_ctx, &old_dsurf, &old_rsurf))
return EGL_FALSE;
}
/* flush before context switch */
if (old_ctx && dri2_drv->glFlush)
@@ -1280,21 +1206,14 @@ dri2_make_current(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *dsurf,
dri2_dpy->core->unbindContext(old_cctx);
}
unbind = (cctx == NULL && ddraw == NULL && rdraw == NULL);
if (unbind || dri2_dpy->core->bindContext(cctx, ddraw, rdraw)) {
if ((cctx == NULL && ddraw == NULL && rdraw == NULL) ||
dri2_dpy->core->bindContext(cctx, ddraw, rdraw)) {
if (old_dsurf)
drv->API.DestroySurface(drv, disp, old_dsurf);
if (old_rsurf)
drv->API.DestroySurface(drv, disp, old_rsurf);
if (!unbind)
dri2_dpy->ref_count++;
if (old_ctx) {
EGLDisplay old_disp = _eglGetDisplayHandle(old_ctx->Resource.Display);
if (old_ctx)
drv->API.DestroyContext(drv, disp, old_ctx);
dri2_display_release(old_disp);
}
return EGL_TRUE;
} else {
@@ -1312,11 +1231,7 @@ dri2_make_current(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *dsurf,
_eglPutSurface(old_rsurf);
_eglPutContext(old_ctx);
/* dri2_dpy->core->bindContext failed. We cannot tell for sure why, but
* setting the error to EGL_BAD_MATCH is surely better than leaving it
* as EGL_SUCCESS.
*/
return _eglError(EGL_BAD_MATCH, "eglMakeCurrent");
return EGL_FALSE;
}
}
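
Note on the hunks above: one side of this diff tracks display lifetime with a `ref_count` field that is incremented in `dri2_initialize` and decremented in `dri2_display_release`, and the other side drops that bookkeeping. The following is a minimal, self-contained sketch of the reference-counted pattern only. The `demo_*` names and both structs are hypothetical stand-ins, not Mesa types, and the sketch leaves out the extra references that the diff takes in `dri2_make_current`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct dri2_egl_display. */
struct demo_driver_data {
   int ref_count;
};

/* Hypothetical stand-in for _EGLDisplay; driver_data plays the role of
 * disp->DriverData. */
struct demo_display {
   struct demo_driver_data *driver_data;
};

/* The first call allocates the driver data; later calls only take a
 * reference, so repeated initialization does not rebuild the display. */
static bool
demo_initialize(struct demo_display *disp)
{
   if (disp->driver_data) {
      disp->driver_data->ref_count++;
      return true;
   }

   disp->driver_data = calloc(1, sizeof(*disp->driver_data));
   if (!disp->driver_data)
      return false;

   disp->driver_data->ref_count = 1;
   return true;
}

/* Drops one reference. The data is freed and the pointer cleared only when
 * the last reference goes away, so a stale pointer is never left behind. */
static void
demo_release(struct demo_display *disp)
{
   struct demo_driver_data *data = disp->driver_data;

   assert(data && data->ref_count > 0);
   if (--data->ref_count > 0)
      return;

   free(data);
   disp->driver_data = NULL;
}
```

Clearing the pointer after `free` mirrors the `disp->DriverData = NULL;` lines that appear in the cleanup paths of this diff: a later initialize call can then tell "never initialized" apart from "already freed".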


@@ -80,6 +80,8 @@
#include "eglimage.h"
#include "eglsync.h"
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
struct wl_buffer;
struct dri2_egl_driver
@@ -175,10 +177,6 @@ struct dri2_egl_display
const __DRI2interopExtension *interop;
int fd;
/* dri2_initialize/dri2_terminate increment/decrement this count, so does
* dri2_make_current (tracks if there are active contexts/surfaces). */
int ref_count;
int own_device;
int swap_available;
int invalidate_available;


@@ -29,7 +29,6 @@
#include <errno.h>
#include <dlfcn.h>
#include <fcntl.h>
#include <xf86drm.h>
#if ANDROID_VERSION >= 0x402
@@ -163,8 +162,6 @@ droid_window_dequeue_buffer(struct dri2_egl_surface *dri2_surf)
static EGLBoolean
droid_window_enqueue_buffer(_EGLDisplay *disp, struct dri2_egl_surface *dri2_surf)
{
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
/* To avoid blocking other EGL calls, release the display mutex before
* we enter droid_window_enqueue_buffer() and re-acquire the mutex upon
* return.
@@ -195,12 +192,6 @@ droid_window_enqueue_buffer(_EGLDisplay *disp, struct dri2_egl_surface *dri2_sur
dri2_surf->buffer = NULL;
mtx_lock(&disp->Mutex);
if (dri2_surf->dri_image) {
dri2_dpy->image->destroyImage(dri2_surf->dri_image);
dri2_surf->dri_image = NULL;
}
return EGL_TRUE;
}
@@ -289,8 +280,6 @@ droid_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
config = dri2_get_dri_config(dri2_conf, EGL_WINDOW_BIT,
dri2_surf->base.GLColorspace);
if (!config)
goto cleanup_surface;
dri2_surf->dri_drawable =
(*dri2_dpy->dri2->createNewDrawable)(dri2_dpy->dri_screen, config,
@@ -384,9 +373,6 @@ get_back_bo(struct dri2_egl_surface *dri2_surf)
int fourcc, pitch;
int offset = 0, fd;
if (dri2_surf->dri_image)
return 0;
if (!dri2_surf->buffer)
return -1;
@@ -445,8 +431,10 @@ droid_image_get_buffers(__DRIdrawable *driDrawable,
static EGLBoolean
droid_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw)
{
struct dri2_egl_driver *dri2_drv = dri2_egl_driver(drv);
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
struct dri2_egl_surface *dri2_surf = dri2_egl_surface(draw);
_EGLContext *ctx;
if (dri2_surf->base.Type != EGL_WINDOW_BIT)
return EGL_TRUE;
@@ -737,7 +725,7 @@ droid_open_device(void)
fd = -1;
}
return (fd >= 0) ? fcntl(fd, F_DUPFD_CLOEXEC, 3) : -1;
return (fd >= 0) ? dup(fd) : -1;
}
/* support versions < JellyBean */
@@ -883,7 +871,6 @@ cleanup_device:
close(dri2_dpy->fd);
cleanup_display:
free(dri2_dpy);
dpy->DriverData = NULL;
return _eglError(EGL_NOT_INITIALIZED, err);
}
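
Note on the `droid_open_device` hunk above: the two return statements differ in whether the duplicated descriptor carries the close-on-exec flag. `fcntl(fd, F_DUPFD_CLOEXEC, 3)` returns the lowest free descriptor at or above 3 with `FD_CLOEXEC` set, while `dup(fd)` returns a descriptor with that flag cleared. A small sketch that makes the difference observable, assuming a POSIX.1-2008 system; the `demo_*` helper names are hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Duplicate fd onto the lowest free descriptor >= 3 with FD_CLOEXEC set.
 * Returns -1 if fd is invalid or the duplication fails. */
static int
demo_dup_cloexec(int fd)
{
   return (fd >= 0) ? fcntl(fd, F_DUPFD_CLOEXEC, 3) : -1;
}

/* Report whether the close-on-exec flag is set on an open descriptor. */
static bool
demo_is_cloexec(int fd)
{
   int flags;

   assert(fd >= 0); /* caller must pass a descriptor it believes is open */
   flags = fcntl(fd, F_GETFD);
   return flags != -1 && (flags & FD_CLOEXEC) != 0;
}
```

With the flag set, the device descriptor is not inherited by a program started through `exec`, which is usually what a library wants for descriptors it opens on its own behalf.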


@@ -726,6 +726,5 @@ cleanup:
close(fd);
free(dri2_dpy);
disp->DriverData = NULL;
return EGL_FALSE;
}


@@ -157,7 +157,6 @@ cleanup_driver:
close(dri2_dpy->fd);
cleanup_display:
free(dri2_dpy);
disp->DriverData = NULL;
return _eglError(EGL_NOT_INITIALIZED, err);
}


@@ -118,13 +118,6 @@ resize_callback(struct wl_egl_window *wl_win, void *data)
(*dri2_dpy->flush->invalidate)(dri2_surf->dri_drawable);
}
static void
destroy_window_callback(void *data)
{
struct dri2_egl_surface *dri2_surf = data;
dri2_surf->wl_win = NULL;
}
/**
* Called via eglCreateWindowSurface(), drv->API.CreateWindowSurface().
*/
@@ -166,7 +159,6 @@ dri2_wl_create_surface(_EGLDriver *drv, _EGLDisplay *disp,
dri2_surf->wl_win->private = dri2_surf;
dri2_surf->wl_win->resize_callback = resize_callback;
dri2_surf->wl_win->destroy_window_callback = destroy_window_callback;
dri2_surf->base.Width = -1;
dri2_surf->base.Height = -1;
@@ -265,11 +257,8 @@ dri2_wl_destroy_surface(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf)
if (dri2_surf->throttle_callback)
wl_callback_destroy(dri2_surf->throttle_callback);
if (dri2_surf->wl_win) {
dri2_surf->wl_win->private = NULL;
dri2_surf->wl_win->resize_callback = NULL;
dri2_surf->wl_win->destroy_window_callback = NULL;
}
dri2_surf->wl_win->private = NULL;
dri2_surf->wl_win->resize_callback = NULL;
free(surf);
@@ -1249,7 +1238,6 @@ dri2_initialize_wayland_drm(_EGLDriver *drv, _EGLDisplay *disp)
wl_event_queue_destroy(dri2_dpy->wl_queue);
cleanup_dpy:
free(dri2_dpy);
disp->DriverData = NULL;
return EGL_FALSE;
}
@@ -1706,8 +1694,6 @@ dri2_wl_swrast_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
dri2_surf->format = WL_SHM_FORMAT_ARGB8888;
dri2_surf->wl_win = window;
dri2_surf->wl_win->private = dri2_surf;
dri2_surf->wl_win->destroy_window_callback = destroy_window_callback;
dri2_surf->base.Width = -1;
dri2_surf->base.Height = -1;
@@ -1897,7 +1883,6 @@ dri2_initialize_wayland_swrast(_EGLDriver *drv, _EGLDisplay *disp)
wl_event_queue_destroy(dri2_dpy->wl_queue);
cleanup_dpy:
free(dri2_dpy);
disp->DriverData = NULL;
return EGL_FALSE;
}
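
Note on the `destroy_window_callback` hunks above: the surface and the native window hold pointers to each other, so whichever is destroyed first has to clear the other's pointer. The sketch below illustrates that pattern in isolation. All `demo_*` types and functions are hypothetical; they are not the `wl_egl_window` API, only a model of the two-way unlinking seen in the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical native window: keeps an opaque back-pointer to its surface
 * and a hook to call when the window itself is destroyed. */
struct demo_window {
   void *private_data;
   void (*destroy_window_callback)(void *data);
};

/* Hypothetical surface that renders into a window it does not own. */
struct demo_surface {
   struct demo_window *win;
};

/* Runs when the window dies first: the surface forgets the window. */
static void
demo_on_window_destroyed(void *data)
{
   struct demo_surface *surf = data;
   surf->win = NULL;
}

static void
demo_surface_attach(struct demo_surface *surf, struct demo_window *win)
{
   surf->win = win;
   win->private_data = surf;
   win->destroy_window_callback = demo_on_window_destroyed;
}

/* Runs when the surface dies first: unhook from the window, but only if the
 * window is still alive (surf->win was cleared otherwise). */
static void
demo_surface_detach(struct demo_surface *surf)
{
   if (surf->win) {
      surf->win->private_data = NULL;
      surf->win->destroy_window_callback = NULL;
      surf->win = NULL;
   }
}

/* Window teardown: notify the attached surface, if any. */
static void
demo_window_destroy(struct demo_window *win)
{
   if (win->destroy_window_callback)
      win->destroy_window_callback(win->private_data);
   win->private_data = NULL;
   win->destroy_window_callback = NULL;
}
```

Without the callback and the NULL check, destroying the window before the surface would leave the surface writing through a dangling `wl_win` pointer during its own teardown.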


@@ -1231,7 +1231,6 @@ dri2_initialize_x11_swrast(_EGLDriver *drv, _EGLDisplay *disp)
xcb_disconnect(dri2_dpy->conn);
cleanup_dpy:
free(dri2_dpy);
disp->DriverData = NULL;
return EGL_FALSE;
}
@@ -1303,13 +1302,15 @@ dri2_initialize_x11_dri3(_EGLDriver *drv, _EGLDisplay *disp)
dri2_dpy->screen = DefaultScreen(dpy);
}
if (!dri2_dpy->conn || xcb_connection_has_error(dri2_dpy->conn)) {
if (xcb_connection_has_error(dri2_dpy->conn)) {
_eglLog(_EGL_WARNING, "DRI3: xcb_connect failed");
goto cleanup_dpy;
}
if (!dri3_x11_connect(dri2_dpy))
goto cleanup_conn;
if (dri2_dpy->conn) {
if (!dri3_x11_connect(dri2_dpy))
goto cleanup_conn;
}
if (!dri2_load_driver_dri3(disp))
goto cleanup_conn;
@@ -1337,8 +1338,10 @@ dri2_initialize_x11_dri3(_EGLDriver *drv, _EGLDisplay *disp)
disp->Extensions.WL_bind_wayland_display = EGL_TRUE;
#endif
if (!dri2_x11_add_configs_for_visuals(dri2_dpy, disp, false))
goto cleanup_configs;
if (dri2_dpy->conn) {
if (!dri2_x11_add_configs_for_visuals(dri2_dpy, disp, false))
goto cleanup_configs;
}
dri2_dpy->loader_dri3_ext.core = dri2_dpy->core;
dri2_dpy->loader_dri3_ext.image_driver = dri2_dpy->image_driver;
@@ -1367,7 +1370,6 @@ dri2_initialize_x11_dri3(_EGLDriver *drv, _EGLDisplay *disp)
xcb_disconnect(dri2_dpy->conn);
cleanup_dpy:
free(dri2_dpy);
disp->DriverData = NULL;
return EGL_FALSE;
}
@@ -1465,7 +1467,6 @@ dri2_initialize_x11_dri2(_EGLDriver *drv, _EGLDisplay *disp)
xcb_disconnect(dri2_dpy->conn);
cleanup_dpy:
free(dri2_dpy);
disp->DriverData = NULL;
return EGL_FALSE;
}


@@ -103,17 +103,6 @@ egl_dri3_get_dri_context(struct loader_dri3_drawable *draw)
return dri2_ctx->dri_context;
}
static __DRIscreen *
egl_dri3_get_dri_screen(struct loader_dri3_drawable *draw)
{
_EGLContext *ctx = _eglGetCurrentContext();
struct dri2_egl_context *dri2_ctx;
if (!ctx)
return NULL;
dri2_ctx = dri2_egl_context(ctx);
return dri2_egl_display(dri2_ctx->base.Resource.Display)->dri_screen;
}
static void
egl_dri3_flush_drawable(struct loader_dri3_drawable *draw, unsigned flags)
{
@@ -130,7 +119,6 @@ static struct loader_dri3_vtable egl_dri3_vtable = {
.set_drawable_size = egl_dri3_set_drawable_size,
.in_current_context = egl_dri3_in_current_context,
.get_dri_context = egl_dri3_get_dri_context,
.get_dri_screen = egl_dri3_get_dri_screen,
.flush_drawable = egl_dri3_flush_drawable,
.show_fps = NULL,
};


@@ -627,9 +627,7 @@ eglCreateContext(EGLDisplay dpy, EGLConfig config, EGLContext share_list,
_EGL_CHECK_DISPLAY(disp, EGL_NO_CONTEXT, drv);
if (config)
_EGL_CHECK_CONFIG(disp, conf, EGL_NO_CONTEXT, drv);
else if (!disp->Extensions.MESA_configless_context)
if (!config && !disp->Extensions.MESA_configless_context)
RETURN_EGL_ERROR(disp, EGL_BAD_CONFIG, EGL_NO_CONTEXT);
if (!share && share_list != EGL_NO_CONTEXT)
@@ -1939,7 +1937,7 @@ _eglLockDisplayInterop(EGLDisplay dpy, EGLContext context,
return MESA_GLINTEROP_SUCCESS;
}
PUBLIC int
int
MesaGLInteropEGLQueryDeviceInfo(EGLDisplay dpy, EGLContext context,
struct mesa_glinterop_device_info *out)
{
@@ -1961,7 +1959,7 @@ MesaGLInteropEGLQueryDeviceInfo(EGLDisplay dpy, EGLContext context,
return ret;
}
PUBLIC int
int
MesaGLInteropEGLExportObject(EGLDisplay dpy, EGLContext context,
struct mesa_glinterop_export_in *in,
struct mesa_glinterop_export_out *out)


@@ -34,8 +34,6 @@
#ifndef EGLDEFINES_INCLUDED
#define EGLDEFINES_INCLUDED
#include "util/macros.h"
#ifdef __cplusplus
extern "C" {
#endif
@@ -50,6 +48,7 @@ extern "C" {
#define _EGL_VENDOR_STRING "Mesa Project"
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#define MIN2(A, B) (((A) < (B)) ? (A) : (B))
#ifdef __cplusplus


@@ -53,16 +53,10 @@ struct _egl_global _eglGlobal =
/* ClientExtensionsString */
"EGL_EXT_client_extensions"
" EGL_EXT_platform_base"
#ifdef HAVE_WAYLAND_PLATFORM
" EGL_EXT_platform_wayland"
#endif
#ifdef HAVE_X11_PLATFORM
" EGL_EXT_platform_x11"
#endif
#ifdef HAVE_DRM_PLATFORM
" EGL_MESA_platform_gbm"
#endif
" EGL_KHR_client_get_all_proc_addresses"
" EGL_MESA_platform_gbm"
};
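
Note on the client extension string hunk above: one side of the diff wraps each platform extension name in an `#ifdef`, so the advertised string only lists platforms that were compiled in. This works because adjacent C string literals are concatenated at compile time. A minimal sketch of the technique, in which the `DEMO_HAVE_*` macros, the `demo_*` names, and the choice to enable only X11 are all assumptions made for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical build configuration: only the X11 platform is enabled. */
#define DEMO_HAVE_X11_PLATFORM 1

/* Adjacent string literals are merged by the compiler, so each #ifdef block
 * adds or omits one space-separated extension name. */
static const char demo_client_extensions[] =
   "EGL_EXT_client_extensions"
   " EGL_EXT_platform_base"
#ifdef DEMO_HAVE_WAYLAND_PLATFORM
   " EGL_EXT_platform_wayland"
#endif
#ifdef DEMO_HAVE_X11_PLATFORM
   " EGL_EXT_platform_x11"
#endif
   " EGL_KHR_client_get_all_proc_addresses";

/* Whole-token lookup in a space-separated extension list. A plain strstr()
 * is not enough because one extension name can be a prefix of another. */
static bool
demo_has_extension(const char *list, const char *name)
{
   size_t len;
   const char *p = list;

   assert(list && name);
   len = strlen(name);
   if (len == 0)
      return false;

   while ((p = strstr(p, name)) != NULL) {
      bool starts_token = (p == list) || (p[-1] == ' ');
      bool ends_token = (p[len] == '\0') || (p[len] == ' ');
      if (starts_token && ends_token)
         return true;
      p += len;
   }
   return false;
}
```

An application that checks the string before calling `eglGetPlatformDisplayEXT` for a given platform gets a truthful answer only when the list is built conditionally like this.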
