4925 Commits

Author SHA1 Message Date
Chad Versace
1c4238a8e5 vk/0.130: Bump header version to 0.130
All APIs have been updated. This eliminates the diff between the
work-in-progress header and the 0.130 header.
2015-07-10 20:06:09 -07:00
Chad Versace
f43a304dc6 vk/0.130: Update vkAllocMemory to use VkMemoryType 2015-07-10 17:35:52 -07:00
Chad Versace
df2a013881 vk/0.130: Implement vkGetPhysicalDeviceMemoryProperties() 2015-07-10 17:35:52 -07:00
Chad Versace
c7f512721c vk/gem: Change signature of anv_gem_get_aperture()
Replace the anv_device parameter with anv_physical_device, because this needs
querying before vkCreateDevice.
2015-07-10 17:35:52 -07:00
Chad Versace
8cda3e9b1b vk/device: Add member anv_physical_device::fd
During anv_physical_device_init(), we opened the DRM device to do some
queries, then promptly closed it. Now we keep it open for the lifetime
of the anv_physical_device so that we can query it some more during
vkGetPhysicalDevice*Properties() [which will happen in follow-up
commits].
2015-07-10 17:35:52 -07:00
Chad Versace
4422bd4cf6 vk/device: Add func anv_physical_device_finish()
Because in a follow-up patch I need to do some non-trivial teardown on
anv_physical_device. For now, however, anv_physical_device_finish() is
a no-op that's just called in the right place.

Also, rename function fill_physical_device -> anv_physical_device_init
for symmetry.
2015-07-10 17:35:52 -07:00
Jason Ekstrand
7552e026da vk/device: Add an explicit destructor for RenderPass 2015-07-10 12:33:04 -07:00
Jason Ekstrand
8b342b39a3 vk/image: Add an explicit DestroyImage function 2015-07-10 12:30:58 -07:00
Jason Ekstrand
b94b8dfad5 vk/image: Add explicit constructors for buffer/image view types 2015-07-10 12:26:31 -07:00
Jason Ekstrand
18340883e3 nir: Add C++ versions of NIR_(SRC|DEST)_INIT 2015-07-10 11:57:33 -07:00
Chad Versace
9e64a2a8e4 mesa: Fix generation of git_sha1.h.tmp for gitlinks
Don't assume that $(top_srcdir)/.git is a directory. It may be a
gitlink file [1] if $(top_srcdir) is a submodule checkout or a linked
worktree [2].

[1] A "gitlink" is a text file that specifies the real location of
    the gitdir.
[2] Linked worktrees are a new feature in Git 2.5.

Cc: "10.6, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 75784243df)
2015-07-10 11:24:25 -07:00
Jason Ekstrand
19f0a9b582 vk/query.c: Use the casting functions 2015-07-09 20:32:44 -07:00
Jason Ekstrand
6eb221c884 vk/pipeline.c: Use the casting functions 2015-07-09 20:28:08 -07:00
Jason Ekstrand
fb4e2195ec vk/formats.c: Use the casting functions 2015-07-09 20:24:17 -07:00
Jason Ekstrand
a52e208203 vk/image.c: Use the casting functions 2015-07-09 20:24:07 -07:00
Jason Ekstrand
b1de1d4f6e vk/device.c: One more use of a casting function 2015-07-09 20:23:46 -07:00
Jason Ekstrand
8739e8fbe2 vk/meta.c: Use the casting functions 2015-07-09 20:16:13 -07:00
Jason Ekstrand
92556c77f4 vk: Fix the build 2015-07-09 18:59:08 -07:00
Jason Ekstrand
098209eedf device.c: Use the cast helpers a bunch of places 2015-07-09 18:49:43 -07:00
Jason Ekstrand
73f9187e33 device.c: Use the cast helpers 2015-07-09 18:41:27 -07:00
Jason Ekstrand
7d24fab4ef vk/private.h: Add a bunch of static inline casting functions
We will need these as soon as we turn on type safety.  We might as well
define and start using them now rather than later.
2015-07-09 18:40:54 -07:00
Jason Ekstrand
5c49730164 vk/device.c: Fix whitespace issues 2015-07-09 18:20:28 -07:00
Jason Ekstrand
c95f9b61f2 vk/device.c: Use ANV_FROM_HANDLE a bunch of places 2015-07-09 18:20:10 -07:00
Jason Ekstrand
335e88c8ee vk/vulkan.h: Add the pEnabledFeatures field to DeviceCreateInfo 2015-07-09 16:21:31 -07:00
Jason Ekstrand
34871cf7f3 vk/vulkan.h: Change the MsCreateInfo structure to the 130 version
We do nothing with it at the moment, so this is a no-op.
2015-07-09 16:19:54 -07:00
Jason Ekstrand
8c2c37fae7 vk: Remove the old GetPhysicalDeviceInfo call 2015-07-09 16:14:37 -07:00
Jason Ekstrand
1f907011a3 vk: Add the new PhysicalDeviceQueue queries 2015-07-09 16:14:37 -07:00
Jason Ekstrand
977a469bce vk: Support GetPhysicalDeviceProperties 2015-07-09 16:14:37 -07:00
Jason Ekstrand
65e0b304b6 vk: Add support for GetPhysicalDeviceLimits 2015-07-09 16:14:37 -07:00
Jason Ekstrand
f6d51f3fd3 vk: Add GetPhysicalDeviceFeatures 2015-07-09 16:14:37 -07:00
Chad Versace
5b75dffd04 vk/device: Fix vkEnumeratePhysicalDevices()
The Vulkan spec says that pPhysicalDeviceCount is an out parameter if
pPhysicalDevices is NULL; otherwise it's an inout parameter.

Mesa incorrectly treated it unconditionally as an inout parameter, which
could have led to reading uninitialized data.
2015-07-09 15:53:21 -07:00
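The count semantics described in the commit above can be sketched as follows. This is a minimal illustration, not the driver's code: example_device, example_enumerate() and its 'available' array are hypothetical names invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device handle; not the real VkPhysicalDevice. */
typedef int example_device;

/* Illustrative only. 'available' and 'n_available' model the devices the
 * driver knows about; 'count' and 'out' play the roles of
 * pPhysicalDeviceCount and pPhysicalDevices. */
static int
example_enumerate(const example_device *available, uint32_t n_available,
                  uint32_t *count, example_device *out)
{
   assert(count != NULL);

   if (out == NULL) {
      /* Out parameter: never read *count, only write it. */
      *count = n_available;
      return 0;
   }

   /* Inout parameter: *count is the capacity of 'out' on entry and the
    * number of entries written on return. */
   uint32_t n = *count < n_available ? *count : n_available;
   for (uint32_t i = 0; i < n; i++)
      out[i] = available[i];
   *count = n;
   return 0;
}
```

A caller would typically call once with out == NULL to learn the count, allocate, then call again with the array. Because the first call never reads *count, the caller may leave it uninitialized.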
Chad Versace
fa915b661d vk/device: Move device enumeration to vkEnumeratePhysicalDevices()
Don't enumerate devices in vkCreateInstance(). That's where global,
device-independent initialization should happen. Move device enumeration
to the more logical location, vkEnumeratePhysicalDevices().
2015-07-09 15:41:17 -07:00
Chad Versace
c34d314db3 vk/device: Be consistent about path to DRM device
Function fill_physical_device() has a 'path' parameter, and struct
anv_physical_device has a 'path' member. Sometimes these are used;
sometimes hardcoded "/dev/dri/renderD128" is used instead.

Be consistent. Hardcode "/dev/dri/renderD128" in exactly one location,
during initialization of the physical device.
2015-07-09 15:27:26 -07:00
Connor Abbott
cff06bbe7d vk/compiler: create an empty parameters list
Prevents problems when initializing the sanity_param_count.
2015-07-09 14:29:23 -04:00
Connor Abbott
3318a86d12 nir/spirv: fix wrong writemask for ALU operations 2015-07-09 14:28:39 -04:00
Connor Abbott
b8fedc19f5 nir/spirv: fix memory context for builtin variable
Fixes valgrind errors with func.depthstencil.basic.
2015-07-08 22:03:30 -04:00
Connor Abbott
e4292ac039 nir/spirv: zero out value array
Before values are pushed or annotated with a name, decoration, etc.,
they need to have an invalid type, NULL name, NULL decoration, etc.
ralloc zeroes everything by accident, so this wasn't an issue in
practice, but we should be explicitly zeroing it.
2015-07-08 22:03:30 -04:00
Connor Abbott
997831868f vk/compiler: create the right kind of program struct
This fixes Valgrind errors and gets all the tests to pass with
--use-spir-v.
2015-07-08 22:03:30 -04:00
Connor Abbott
a841e2c747 vk/compiler: mark inputs/outputs as read/written
This doesn't handle inputs and outputs larger than a vec4, but we plan
to add a varying splitting/packing pass to handle those anyways.
2015-07-08 22:03:30 -04:00
Jason Ekstrand
8640dc12dc vk/vulkan.h: Copy the VkStructureType enum from version 130
We now have the exact same structs which require pType.
2015-07-08 17:45:52 -07:00
Jason Ekstrand
5a4ebf6bc1 vk: Move to the new pipeline creation API's 2015-07-08 17:30:18 -07:00
Chad Versace
4fcb32a17d vk/0.130: Remove VkImageViewCreateInfo::minLod
It's now set solely through VkSampler.
2015-07-08 14:48:22 -07:00
Jason Ekstrand
367b9ba78f vk/vulkan.h: Move renderPassContinue from GraphicsBeginInfo to BeginInfo 2015-07-08 14:37:30 -07:00
Jason Ekstrand
d29ec8fa36 vk/vulkan.h: Update to the new UpdateDescriptorSets api 2015-07-08 14:24:56 -07:00
Jason Ekstrand
c8577b5f52 vk: Add a macro for creating anv variables from vulkan handles
This is very helpful for doing the mass bunch of casts at the top of a
function.  It will also be invaluable when we get type safety in the API.
2015-07-08 14:24:14 -07:00
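One way a handle-casting macro of this kind can look, together with the static inline casting functions mentioned in the vk/private.h commit, is sketched below. Every name here (ExampleHandle, example_buffer, EXAMPLE_FROM_HANDLE) is hypothetical; the actual ANV_FROM_HANDLE definition is not shown in this log and may differ.

```c
#include <stdint.h>

typedef uint64_t ExampleHandle;           /* hypothetical opaque API handle */
struct example_buffer { uint32_t size; }; /* hypothetical driver object */

/* Typed casting helpers: one pair per object type. */
static inline struct example_buffer *
example_buffer_from_handle(ExampleHandle h)
{
   return (struct example_buffer *)(uintptr_t)h;
}

static inline ExampleHandle
example_buffer_to_handle(struct example_buffer *b)
{
   return (ExampleHandle)(uintptr_t)b;
}

/* Declares a typed local variable from a handle, so the casts at the top
 * of an entry point collapse to one line per object. */
#define EXAMPLE_FROM_HANDLE(type, name, handle) \
   struct type *name = type##_from_handle(handle)

static uint32_t
example_get_size(ExampleHandle _buffer)
{
   EXAMPLE_FROM_HANDLE(example_buffer, buffer, _buffer);
   return buffer->size;
}
```

Routing every conversion through a named helper means that a later change to the handle representation only has to touch the helpers, not every call site.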
Chad Versace
ccb27a002c vk/0.130: Update VkObjectType values
Don't import any new enum tokens from the 0.130 header. Just update the
values of existing enums. This reduces the diff by about 16 lines.
2015-07-08 12:53:49 -07:00
Chad Versace
8985dd15a1 vk/0.130: Remove VkDescriptorUpdateMode
Nowhere used.
2015-07-08 12:51:46 -07:00
Chad Versace
e02dfa309a vk/0.130: Remove VK_DEVICE_CREATE_MULTI_DEVICE_IQ_MATCH_BIT 2015-07-08 12:49:48 -07:00
Chad Versace
e9034ed875 vk/0.130: Update vkCmdBlitImage signature
Add VkTexFilter param. Ignored for now.
2015-07-08 12:47:48 -07:00
Jason Ekstrand
aae45ab583 vk/vulkan.h: Add packing parameters to BufferImageCopy 2015-07-08 11:51:34 -07:00
Chad Versace
b4ef7f354b vk/0.130: Remove msaa members of VkDepthStencilViewCreateInfo 2015-07-08 11:50:51 -07:00
Jason Ekstrand
522ab835d6 vk/vulkan.h: Move over to the new border color enums 2015-07-08 11:44:52 -07:00
Jason Ekstrand
7598329774 vk/vulkan.h: Move VkFormatProperties 2015-07-08 11:16:45 -07:00
Jason Ekstrand
52940e8fcf vk/vulkan.h: Add RenderPassBeginContents 2015-07-08 10:57:13 -07:00
Jason Ekstrand
e19d6be2a9 vk/vulkan.h: Add command buffer levels 2015-07-08 10:53:32 -07:00
Jason Ekstrand
c84f2d3b8c vk/vulkan.h: Import the VkPipeEvent enum from 130
Now, VkPipeEventFlags is back in sync with VkPipeEvent
2015-07-08 10:49:46 -07:00
Jason Ekstrand
b20cc72603 vk/vulkan.h: Remove VkFormatInfoType 2015-07-08 10:39:31 -07:00
Jason Ekstrand
8e05bbeee9 vk/vulkan.h: Update extension handling to rev 130 2015-07-08 10:38:07 -07:00
Jason Ekstrand
cc29a5f4be vk/vulkan.h: Move format querying to the physical device 2015-07-08 09:34:47 -07:00
Jason Ekstrand
719fa8ac74 vk/vulkan.h: Remove some peer opening structs and STRUCTURE_TYPE enums 2015-07-08 09:25:13 -07:00
Jason Ekstrand
fc6dcc6227 vk: Add a copy of the v90 header. 2015-07-08 09:23:29 -07:00
Jason Ekstrand
12119282e6 vk/vulkan.h: Remove an unneeded comment 2015-07-08 09:18:09 -07:00
Jason Ekstrand
3c65a1ac14 vk/vulkan.h: Remove the MemoryRange stubs and add sparse stubs 2015-07-08 09:16:48 -07:00
Jason Ekstrand
bb6567f5d1 vk/vulkan.h: Switch BindObjectMemory to a device function and remove the index 2015-07-08 09:04:16 -07:00
Jason Ekstrand
e7acdda184 vk/vulkan.h: Switch to the split ProcAddr functions in 130 2015-07-07 18:51:53 -07:00
Jason Ekstrand
db24afee2f vk/vulkan.h: Switch from GetImageSubresourceInfo to GetImageSubresourceLayout 2015-07-07 18:20:18 -07:00
Jason Ekstrand
ef8980e256 vk/vulkan.h: Switch from GetObjectInfo to GetMemoryRequirements 2015-07-07 18:16:42 -07:00
Jason Ekstrand
d9c2caea6a vk: Update memory flushing functions to 130
This involves updating the prototype for FlushMappedMemory, adding
InvalidateMappedMemoryRanges, and removing PinSystemMemory.
2015-07-07 17:22:31 -07:00
Jason Ekstrand
d5349b1b18 vk/vulkan.h: Constify the pFences parameter to ResetFences 2015-07-07 17:18:00 -07:00
Jason Ekstrand
6aa1b89457 vk/vulkan.h: Move the definitions of Create(Framebuffer|RenderPass)
This better matches the 130 header.
2015-07-07 17:13:10 -07:00
Jason Ekstrand
0ff06540ae vk: Implement the GetRenderAreaGranularity function
At the moment, we're just going to scissor clears so a granularity of 1x1
is all we need.
2015-07-07 17:11:37 -07:00
Jason Ekstrand
435b062b26 vk/vulkan.h: Add a PipelineLayout parameter to BindDescriptorSets 2015-07-07 17:06:10 -07:00
Jason Ekstrand
518ca9e254 vk/vulkan.h: Add a compareEnable parameter to SamplerCreateInfo
Our hardware doesn't actually need this, so adding it is a no-op.
2015-07-07 16:49:04 -07:00
Jason Ekstrand
672590710b vk/vulkan.h: Remove initialCount from SemaphoreCreateInfo 2015-07-07 16:42:42 -07:00
Jason Ekstrand
80046a7d54 vk/vulkan.h: Update clear color handling to 130 2015-07-07 16:37:43 -07:00
Jason Ekstrand
3e4b00d283 meta: Use the VkClearColorValue structure for the color attribute 2015-07-07 16:27:06 -07:00
Jason Ekstrand
a35fef1ab2 vk/vulkan.h: Remove the pass argument from EndRenderPass 2015-07-07 16:22:23 -07:00
Jason Ekstrand
d2ca7e24b4 vk/vulkan.h: Rename VertexInputStateInfo to VertexInputStateCreateInfo 2015-07-07 16:15:55 -07:00
Jason Ekstrand
abbb776bbe vk/vulkan.h: Remove programPointSize
Instead, we auto-detect whether or not your shader writes gl_PointSize.  If
it does, we take it from the shader; otherwise we use 1.0.
2015-07-07 16:00:46 -07:00
Chad Versace
e7ddfe03ab vk/0.130: Stub vkCmdClear*Attachment() funcs
vkCmdClearColorAttachment
vkCmdClearDepthStencilAttachment
2015-07-07 15:57:37 -07:00
Chad Versace
f89e2e6304 vk/0.130: Define enum VkImageAspectFlagBits 2015-07-07 15:57:37 -07:00
Chad Versace
55ab1737d3 vk/0.130: Define VkRect3D 2015-07-07 15:55:53 -07:00
Chad Versace
11901a9100 vk/0.130: Update name of vkCmdClearDepthStencilImage() 2015-07-07 15:53:35 -07:00
Chad Versace
dff32238c7 vk/0.130: Stub vkCmdExecuteCommands() 2015-07-07 15:51:55 -07:00
Chad Versace
85c0d69be9 vk/0.130: Update vkCmdWaitEvents() signature 2015-07-07 15:49:57 -07:00
Chad Versace
0ecb789b71 vk: Remove unused 'v' param from stub() macro 2015-07-07 15:47:24 -07:00
Chad Versace
f78d684772 vk: Stub vkCmdPushConstants() from 0.130 header 2015-07-07 15:46:19 -07:00
Chad Versace
18ee32ef9d vk: Update vkCmdPipelineBarrier to 0.130 header 2015-07-07 15:43:41 -07:00
Chad Versace
4af79ab076 vk: Add func anv_clear_mask()
A little helper func for inspecting and clearing bitmasks.
2015-07-07 15:43:41 -07:00
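A helper for "inspecting and clearing" a bitmask might look roughly like the sketch below. The name example_clear_mask() and its signature are assumptions made for illustration; the log does not show the real anv_clear_mask().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch: report whether any bit of 'clear_mask' is set in
 * '*inout_mask', and clear those bits if so. */
static inline bool
example_clear_mask(uint32_t *inout_mask, uint32_t clear_mask)
{
   if (*inout_mask & clear_mask) {
      *inout_mask &= ~clear_mask;
      return true;
   }
   return false;
}
```

This shape suits dirty-bit handling: a caller can write `if (example_clear_mask(&dirty, SOME_BIT)) { re-emit state }` and the bit is consumed in the same step.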
Jason Ekstrand
788a8352b9 vk/vulkan.h: Remove some unused fields.
In particular, the following are removed:

 - disableVertexReuse
 - clipOrigin
 - depthMode
 - pointOrigin
 - provokingVertex
2015-07-07 15:33:00 -07:00
Jason Ekstrand
7fbed521bb vk/vulkan.h: Remove the explicit primitive restart index
Unfortunately, this requires some non-trivial changes to the driver.  Now
that the primitive restart index isn't given explicitly by the client, we
always use ~0 for everything like D3D does.  Unfortunately, our hardware is
awesome and a 32-bit version of ~0 doesn't match any 16-bit values.  This
means, we have to set it to either UINT16_MAX or UINT32_MAX depending on
the size of the index type.  Since we get the index type from
CmdBindIndexBuffer and the rest of the VF packet from the pipeline, we need
to lazy-emit the VF packet.
2015-07-07 15:33:00 -07:00
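The restart-index selection and the lazy emission described in the commit above can be sketched as follows. All types and function names are hypothetical stand-ins; in particular, "emitting" is modeled here as storing a value rather than writing a hardware packet.

```c
#include <stdbool.h>
#include <stdint.h>

enum example_index_type {
   EXAMPLE_INDEX_TYPE_UINT16,
   EXAMPLE_INDEX_TYPE_UINT32,
};

/* Pick the all-ones value that matches the width of the index type. */
static uint32_t
example_restart_index(enum example_index_type type)
{
   switch (type) {
   case EXAMPLE_INDEX_TYPE_UINT16: return UINT16_MAX;
   case EXAMPLE_INDEX_TYPE_UINT32: return UINT32_MAX;
   }
   return UINT32_MAX;
}

/* Hypothetical command-buffer state. */
struct example_cmd_state {
   enum example_index_type index_type;
   uint32_t emitted_restart_index;
   bool vf_dirty;
};

/* Binding an index buffer only records the type and marks state dirty. */
static void
example_bind_index_buffer(struct example_cmd_state *s,
                          enum example_index_type type)
{
   s->index_type = type;
   s->vf_dirty = true;
}

/* The value is resolved lazily, just before a draw, once both the
 * pipeline and the index type are known. */
static void
example_flush_before_draw(struct example_cmd_state *s)
{
   if (!s->vf_dirty)
      return;
   s->emitted_restart_index = example_restart_index(s->index_type);
   s->vf_dirty = false;
}
```

Deferring the work to draw time avoids emitting a value that a later index-buffer bind would invalidate.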
Chad Versace
d6b840beff vk: Delete some comments not present in 0.130 header
Deleting the comments reduces diff noise.
2015-07-07 15:16:13 -07:00
Chad Versace
84a5bc25e3 vk: Pull in remaining 0.130 handle types
This pulls in the definition of VkShaderModule and VkPipelineCache,
which are not used anywhere yet.
2015-07-07 15:13:01 -07:00
Chad Versace
f2899b1af2 vk: Pull in #defines from 0.130 header
Despite not being used yet, pulling in the macros does diminish the
header diff.
2015-07-07 15:11:30 -07:00
Jason Ekstrand
962d6932fa vk/vulkan.h: Rename (min|max)Depth to (min|max)DepthBounds 2015-07-07 12:37:54 -07:00
Jason Ekstrand
1fb859e4b2 vk/vulkan.h: Remove client-settable pointSize from DynamicRsState 2015-07-07 12:35:32 -07:00
Jason Ekstrand
245583075c vk/vulkan.h: Remove UINT8 index buffers 2015-07-07 11:26:49 -07:00
Jason Ekstrand
0a42332904 vk/vulkan.h: Re-order the object declarations 2015-07-07 11:26:49 -07:00
Kristian Høgsberg Kristensen
a1eea996d4 vk: Emit 3DSTATE_SAMPLE_MASK
This was missing and was causing the driver to not work with
execlists. Presumably we get a different initial hw context with
execlists enabled, that has sample mask 0 initially.

Set this to 0xffff for now.  When we add MS support, we need to take the
value from VkPipelineMsStateCreateInfo::sampleMask.
2015-07-06 23:54:12 -07:00
Kristian Høgsberg Kristensen
c325bb24b5 vk: Pull in new generated headers
The new headers use stdbool for enable/disable fields which
implicitly converts expressions like (flags & 8) to 0 or 1.
Also handles MBO (must-be-one) fields by setting them to one,
corrects a bspec typo (_3DPRIM_LISTSTRIP_ADJ -> LINESTRIP) and
makes a few enum values less clashy.
2015-07-06 22:12:26 -07:00
Chad Versace
23075bccb3 vk/image: Validate vkCreateImageView more
Exhaustively validate the function input.  If it's not validated and
doesn't have an anv_finishme(), then I overlooked it.
2015-07-06 18:28:26 -07:00
Chad Versace
69e11adecc vk/image: Add more info to VkImageViewType table
Convert the table from the direct mapping
  VkImageViewType -> SurfaceType

into a mapping to an info struct
  VkImageViewType -> struct anv_image_view_info
2015-07-06 18:28:26 -07:00
Chad Versace
b844f542e0 vk: Update VkImageViewType to 0.130.0
This splits 1D and 1D_ARRAY, 2D and 2D_ARRAY, CUBE and CUBE_ARRAY.

The new tokens are unused. This is just a header update.
2015-07-06 18:28:26 -07:00
Chad Versace
5b04db71ff vk/image: Move validation for vkCreateImageView
Move the validation from anv_CreateImageView() and anv_image_view_init()
to anv_validate_CreateImageView(). No new validation is added.
2015-07-06 18:27:14 -07:00
Jason Ekstrand
1f1b26bceb vk/vulkan.h: Rename VkRect to VkRect2D 2015-07-06 17:47:18 -07:00
Jason Ekstrand
63c1190e47 vk/vulkan.h: Rename count to arraySize in VkDescriptorSetLayoutBinding 2015-07-06 17:43:58 -07:00
Jason Ekstrand
d84f3155b1 vk/vulkan.h: Remove the Vk(Memory|Semaphor|Image)OpenInfo structs
We already deleted the functions that need them.  The structs are just
dangling uselessly.
2015-07-06 17:37:13 -07:00
Jason Ekstrand
65f9ccb4e7 vk/vulkan.h: Remove VK_MEMORY_PROPERTY_PREFER_HOST_LOCAL_BIT
We weren't doing anything with it, so this is a no-op
2015-07-06 17:33:45 -07:00
Jason Ekstrand
68fa750f2e vk/vulkan.h: Replace DEVICE_COHERENT_BIT with DEVICE_NON_COHERENT_BIT 2015-07-06 17:32:28 -07:00
Jason Ekstrand
d5b5bd67f6 vk/vulkan.h: Use the query result bits from revision 130
None of the important bits or names actually changed.  It just
added/removed some no-op names.

No functional change.
2015-07-06 17:27:11 -07:00
Jason Ekstrand
d843418c2e vk/vulkan.h: One more quick enum refactor clean-up 2015-07-06 17:26:29 -07:00
Jason Ekstrand
2b37fc28d1 vk/vulkan.h: Get rid of VERTEX_INPUT_STEP_RATE_DRAW
We never supported it, so no functional change.
2015-07-06 17:24:26 -07:00
Jason Ekstrand
a75967b1bb vk/vulkan.h: Remove the CLEAR_OPTIMAL image layout 2015-07-06 17:21:19 -07:00
Jason Ekstrand
2b404e5d00 vk: Rename CPU_READ/WRITE_BIT to HOST_READ/WRITE_BIT 2015-07-06 17:18:25 -07:00
Jason Ekstrand
c57ca3f16f vk/vulkan.h: Remove VK_IMAGE_CREATE_CLONEABLE_BIT 2015-07-06 17:14:30 -07:00
Jason Ekstrand
2de388c49c vk: Remove SHAREABLE bits
They were removed from the Vulkan API and we don't really use them because
there are no multi-GPU i965 systems.
2015-07-06 17:12:51 -07:00
Jason Ekstrand
1b0c47bba6 vk/vulkan.h: Re-order the logic op enums 2015-07-06 17:08:11 -07:00
Jason Ekstrand
c7cef662d0 vk/vulkan.h: Reformat a bunch of enums to match revision 130
In theory, no functional change.
2015-07-06 17:06:02 -07:00
Jason Ekstrand
8c5e48f307 vk: Rename NUM_SHADER_STAGE to SHADER_STAGE_NUM
This is a refactor of more than just the header but it lets us finish
reformatting the shader stage enum.
2015-07-06 16:43:28 -07:00
Jason Ekstrand
d9176f2ec7 vk: Reformat a bunch of enums
This accounts for a number of differences between the generated headers and
the hand-written header.  Not all reformatting is done in this commit but
it does make the headers much more diffable.

In theory, no functional change.
2015-07-06 16:41:31 -07:00
Jason Ekstrand
e95bf93e5a vk: Pull the VkResult enum from revision 130 2015-07-06 16:15:12 -07:00
Jason Ekstrand
1b7b580756 vk: re-arrange enums to match the order in revision 130 2015-07-06 16:11:05 -07:00
Jason Ekstrand
2fb524b369 vk: Rename a parameter in CmdBindDynamicStateObject 2015-07-06 15:37:17 -07:00
Jason Ekstrand
c5ffcc9958 vk: Remove multi-device stuff 2015-07-06 15:34:55 -07:00
Jason Ekstrand
c5ab5925df vk: Remove ClearDescriptorSets 2015-07-06 15:32:40 -07:00
Jason Ekstrand
ea5fbe1957 vk: Remove begin/end descriptor pool update 2015-07-06 15:32:27 -07:00
Jason Ekstrand
9a798fa946 vk: Remove stub for CloneImageData 2015-07-06 15:30:05 -07:00
Jason Ekstrand
78a0d23d4e vk: Remove the stub support for memory priorities 2015-07-06 15:28:10 -07:00
Jason Ekstrand
11cf214578 vk: Remove the stub support for explicit memory references 2015-07-06 15:27:58 -07:00
Jason Ekstrand
0dc7d4ac8a vk/vulkan.h: Reformat structs to match revision 130
Structs in the old version were specified as

typedef struct VkSomeThing_
{
   type                                        field; // comment
} VkSomeThing;

However, in the generated headers, you have

typedef struct {
   type                                        field;
} VkSomeThing;

This commit also removes some unneeded whitespaces.
2015-07-06 15:19:12 -07:00
Jason Ekstrand
19aabb5730 vk/vulkan.h: Re-arrange structures to match the order in 130 2015-07-06 15:09:30 -07:00
Connor Abbott
f9dbc34a18 nir/spirv: fix some bugs 2015-07-06 15:00:37 -07:00
Connor Abbott
f3ea3b6e58 nir/spirv: add support for builtins inside structures
We may be able to revert this depending on the outcome of bug 14190, but
for now it gets vertex shaders working with SPIR-V.
2015-07-06 15:00:37 -07:00
Connor Abbott
15047514c9 nir/spirv: fix a bug with structure creation
We were creating 2 extra bogus fields.
2015-07-06 15:00:37 -07:00
Connor Abbott
73351c6a18 nir/spirv: fix a bad assertion in the decoration handling
We should be asserting that the parent decoration didn't hand us
a member if the child decoration did, but different child decorations
may obviously have different members.
2015-07-06 15:00:37 -07:00
Connor Abbott
70d2336e7e nir/spirv: pull out logic for getting builtin locations
Also add support for more builtins.
2015-07-06 15:00:37 -07:00
Connor Abbott
aca5fc6af1 nir/spirv: plumb through the type of dereferences
We need this to know if a deref is of a builtin.
2015-07-06 15:00:37 -07:00
Connor Abbott
66375e2852 nir/spirv: handle structure member builtin decorations 2015-07-06 15:00:37 -07:00
Connor Abbott
23c179be75 nir/spirv: add a vtn_type struct
This will handle decorations that aren't in the glsl_type.
2015-07-06 15:00:37 -07:00
Connor Abbott
f9bb95ad4a nir/spirv: move 'type' into the union
Since SSA values now have their own types, it's more convenient to make
'type' only used when we want to look up an actual SPIR-V type, since
we're going to change its type soon to support various decorations that
are handled at the SPIR-V -> NIR level.
2015-07-06 15:00:37 -07:00
Jason Ekstrand
d5dccc1e7a vk: Move CreateFramebuffer and CreateRenderPass higher in the header
This matches where they are in the 130 header.
2015-07-06 14:41:43 -07:00
Jason Ekstrand
4a42f45514 vk: Remove atomic counters stubs 2015-07-06 14:38:45 -07:00
Jason Ekstrand
630b19a1c8 vk: Make vulkan.h look more like vulkan-130.h
Most of these changes are insubstantial.  The only potentially substantial
change is that we added a few new #defines for API maximums.
2015-07-06 14:32:52 -07:00
Jason Ekstrand
2f9180b1b2 vk: Add a revision 130 header along-side the current header 2015-07-06 14:16:51 -07:00
Jason Ekstrand
1f1465f077 vk/meta: Add an initial implementation of ClearColorImage 2015-07-02 18:15:06 -07:00
Jason Ekstrand
8a6c8177e0 vk/meta: Factor the guts out of cmd_buffer_clear 2015-07-02 18:13:59 -07:00
Jason Ekstrand
beb0e25327 vk: Roll back to API v90
This is what version 0.1 of the Vulkan SDK is built against.
2015-07-01 16:44:12 -07:00
Jason Ekstrand
fa663c27f5 nir/spirv: Add initial structure member decoration support 2015-07-01 15:38:26 -07:00
Jason Ekstrand
e3d60d479b nir/spirv: Make vtn_handle_type match the other handler functions
Previously, the caller of vtn_handle_type had to handle actually inserting
the type.  However, this didn't really work if the type was decorated in
any way.
2015-07-01 15:34:10 -07:00
Jason Ekstrand
7a749aa4ba nir/spirv: Add basic support for Op[Group]MemberDecorate 2015-07-01 14:18:07 -07:00
Jason Ekstrand
682eb9489d vk/x11: Allow for the client querying the size of the format properties 2015-07-01 14:18:07 -07:00
Chad Versace
bba767a9af vk/formats: Fix entry for S8_UINT
I forgot to update this when fixing the depth formats.
2015-06-30 09:41:44 -07:00
Chad Versace
6720b47717 vk/formats: Document new meaning of anv_format::cpp
The way the code currently works is that anv_format::cpp is the cpp of
anv_format::surface_format.

Kristian and I disagree about how the code *should* work. Despite that,
I think it's in our discussion's best interest to document how the code
*currently* works. That should eliminate confusion.

If and when the code begins to work differently, then we'll update the
anv_format comments.
2015-06-30 09:41:41 -07:00
Chad Versace
709fa463ec vk/depth: Add a FIXME
3DSTATE_DEPTH_BUFFER.Width,Height are wrong.
2015-06-26 22:15:03 -07:00
Chad Versace
5b3a1ceb83 vk/image: Enable 2d single-sample color miptrees
What's been tested, for both image views and color attachment views:

    - VK_FORMAT_R8G8B8A8_UNORM
    - VK_IMAGE_VIEW_TYPE_2D
    - mipLevels: 1, 2
    - baseMipLevel: 0, 1
    - arraySize: 1, 2
    - baseArraySlice: 0, 1

What's known to be broken:

    - Depth and stencil miptrees. To fix this, anv_depth_stencil_view
      needs major rework.
    - VkImageViewType != 2D
    - MSAA

Fixes Crucible tests:

  func.miptree.view-2d.levels02.array01.*
  func.miptree.view-2d.levels01.array02.*
  func.miptree.view-2d.levels02.array02.*
2015-06-26 22:11:15 -07:00
Chad Versace
c6e76aed9d vk/image: Define anv_surface, refactor anv_image
This prepares for upcoming miptree support.

anv_surface is a proxy for color surfaces, depth surfaces, and stencil
surfaces.  Embed two instances of anv_surface into anv_image: the
primary surface (color or depth), and an optional stencil surface.
2015-06-26 21:45:53 -07:00
Chad Versace
127cb3f6c5 vk/image: Reformat function signatures
Reformat them to match Mesa code-style.
2015-06-26 20:12:42 -07:00
Chad Versace
fdcd71f71d vk/image: Embed VkImageCreateInfo* into anv_image_create_info
All function signatures that matched this pattern,
  old: f(const VkImageCreateInfo *, const struct anv_image_create_info *)

were rewritten as
  new: f(const struct anv_image_create_info *)
2015-06-26 20:06:08 -07:00
Chad Versace
ca6cef3302 vk/image: Drop some tmp vars in anv_image_view_init()
Variables 'tile_mode' and 'format' are unneeded.
2015-06-26 19:50:04 -07:00
Chad Versace
9c46ba9ca2 vk/image: Abort on stencil image views
The code doesn't work. Not even close.

Replace the broken code with a FINISHME and abort.
2015-06-26 19:23:21 -07:00
Chad Versace
667529fbaa vk: Reindent struct anv_image 2015-06-26 15:27:20 -07:00
Chad Versace
74e3eb304f vk: Define MIN(a, b) macro 2015-06-26 15:09:07 -07:00
Chad Versace
55752fe94a vk: Rename functions ALIGN_*32 -> align_*32
ALIGN_U32 and ALIGN_I32 are functions, not macros. So stop using
allcaps.
2015-06-26 15:07:59 -07:00
Connor Abbott
6ee082718f Merge branch 'wip/nir-vtn' into vulkan
Adds composites and matrix multiplication, plus some control flow fixes.
2015-06-26 12:14:05 -07:00
Chad Versace
37d6e04ba1 vk/formats: Remove the cpp=0 stencil hack
The format table defined cpp = 0 for stencil-only formats. The real cpp
is 1.

When code begins to lie, especially about stencil buffers, code becomes
increasingly fragile as time progresses, and the damage becomes
increasingly hard to undo. (For precedent, see the painful history of
stencil buffer cpp in the git log for gen6 and gen7 in the i965 driver).
Let's undo the stencil buffer cpp lie now to avoid future pain.

In the format table, set cpp = 1 for VK_FORMAT_S8; replace checks for
cpp == 0; and delete all comments about the hack.
2015-06-26 09:58:22 -07:00
Chad Versace
67a7659d69 vk/image: Refactor anv_image_create()
From my experience with intel_mipmap_tree.c, I learned that for structs
like anv_image and intel_mipmap_tree, which have sprawling
multi-function construction codepaths, it's easy to mistakenly use
uninitialized struct members during construction.

Let's eliminate the risk of using uninitialized anv_image members during
construction.  Fill the struct at the function bottom instead of
piecemeal throughout the constructor.
2015-06-26 09:32:59 -07:00
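The "fill the struct at the function bottom" pattern from the commit above can be sketched like this. struct example_image and its fields are hypothetical and much simpler than the real anv_image.

```c
#include <stdint.h>

/* Hypothetical, simplified image description. */
struct example_image {
   uint32_t width;
   uint32_t height;
   uint32_t stride;
   uint32_t size;
};

static void
example_image_init(struct example_image *image,
                   uint32_t width, uint32_t height, uint32_t cpp)
{
   /* Compute everything into locals first. Nothing here can read a
    * half-initialized member of 'image'. */
   const uint32_t stride = width * cpp;
   const uint32_t size = stride * height;

   /* Fill the whole struct in one place, at the bottom. */
   *image = (struct example_image) {
      .width = width,
      .height = height,
      .stride = stride,
      .size = size,
   };
}
```

A side benefit of the compound-literal assignment is that any member not named in the initializer is zeroed rather than left indeterminate.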
Chad Versace
5d7103ee15 vk/image: Group some assertions closer together
In anv_image_create(), group together the assertions on
VkImageCreateInfo.
2015-06-26 09:05:46 -07:00
Chad Versace
0349e8d607 vk/formats: #undef fmt at end of format table 2015-06-26 07:38:02 -07:00
Chad Versace
068b8a41e2 vk: Fix comment for anv_depth_stencil_view::stencil_qpitch
s/DEPTH/STENCIL/
2015-06-26 07:31:57 -07:00
Chad Versace
7ea707a42a vk/image: Add qpitch fields to anv_depth_stencil_view
For now, hard-code them to 0.
2015-06-25 20:10:16 -07:00
Chad Versace
b91a76de98 vk: Reindent and document struct anv_depth_stencil_view 2015-06-25 20:10:16 -07:00
Chad Versace
ebe1e768b8 vk/formats: Fix incorrect depth formats
anv_format::surface_format was incorrect for Vulkan depth formats.
For example, the format table mapped

    VK_FORMAT_D24_UNORM -> .surface_format = D24_UNORM_X8_UINT
    VK_FORMAT_D32_FLOAT -> .surface_format = D32_FLOAT

but should have mapped

    VK_FORMAT_D24_UNORM -> .surface_format = R24_UNORM_X8_TYPELESS
    VK_FORMAT_D32_FLOAT -> .surface_format = R32_FLOAT

The Crucible test func.depthstencil.basic passed despite the bug, but
only because it did not attempt to texture from the depth surface.

The core problem is that RENDER_SURFACE_STATE.SurfaceFormat and
3DSTATE_DEPTH_BUFFER.SurfaceFormat are distinct types. Considering them
as enum spaces, the two enum spaces have incompatible collisions.

Fix this by adding a new field 'depth_format' to struct anv_format.

Refer to brw_surface_formats.c:translate_tex_format() for precedent.
2015-06-25 20:10:16 -07:00
Chad Versace
45b804a049 vk/image: Rename local variable in anv_image_create()
This function has many local variables for info structs. Having one
named simply 'info' is confusing.  Rename it to 'format_info'.
2015-06-25 20:10:16 -07:00
Chad Versace
528071f004 vk/formats: Fix table entry for R8G8B8_SNORM
Now that anv_formats[] is formatted like a table, buggy entries are
easier to see.
2015-06-25 20:10:16 -07:00
Chad Versace
4c8146313f vk/formats: Rename anv_format::format -> surface_format
I misinterpreted anv_format::format as a VkFormat. Instead, it is
a hardware surface format (RENDER_SURFACE_STATE.SurfaceFormat). Rename
the field to 'surface_format' to make it unambiguous.
2015-06-25 20:10:16 -07:00
Chad Versace
4b8b451a1d vk/formats: Rename anv_format::channels -> num_channels
I misinterpreted anv_format::channels as a bitmask of channels.
Renaming it to 'num_channels' makes it unambiguous.
2015-06-25 20:10:16 -07:00
Chad Versace
af0ade0d6c vk: Reindent struct anv_format 2015-06-25 20:10:16 -07:00
Chad Versace
ae29fd1b55 vk/formats: Don't abbreviate tokens in the format table
Abbreviating the VK_FORMAT_* tokens doesn't help much. To the contrary,
it means grep and ctags can't find them.
2015-06-25 20:10:16 -07:00
Jason Ekstrand
d5e41a3a99 vk/compiler: Add the initial hacks to get SPIR-V up and going 2015-06-25 17:36:35 -07:00
Jason Ekstrand
c4c1d96a01 HACK: Get rid of sanity_param_count for FS 2015-06-25 17:36:34 -07:00
Jason Ekstrand
4f5ef945e0 i965: Don't print the GLSL IR if it doesn't exist 2015-06-25 17:36:34 -07:00
Jason Ekstrand
588acdb431 nir/spirv: Set the right location for shader input/outputs
We need to add FRAG_RESULT_DATA0 etc. to the input/output location.
2015-06-25 17:36:34 -07:00
Jason Ekstrand
333b8ddd6b nir/spirv: Set the interface type on uniform blocks 2015-06-25 17:36:34 -07:00
Jason Ekstrand
7e1792b1b7 nir/spirv: Set the system value mode on builtins 2015-06-25 17:36:34 -07:00
Jason Ekstrand
b72936fdad nir/spirv: Actually put variables on the right linked list 2015-06-25 17:36:34 -07:00
Jason Ekstrand
ee0a8f23e4 glsl: Move vert_attrib varying_slot and frag_result enums to shader_enums.h 2015-06-25 17:36:34 -07:00
Chad Versace
fa352969a2 vk/image: Check extent does not exceed surface type limits 2015-06-25 16:53:24 -07:00
Chad Versace
99031aa0f3 vk/image: Stop hardcoding SurfaceType of VkImageView
Instead, translate VkImageViewType to a gen SurfaceType.
2015-06-25 16:53:22 -07:00
Chad Versace
7ea121687c vk/image: Add anv_image::surf_type
This is the gen SurfaceType, such as SURFTYPE_2D.
2015-06-25 16:52:16 -07:00
Chad Versace
cb30acaced vk/image: Add tables for gen SurfaceType
Tables for mapping VkImageType and VkImageViewType to gen SurfaceType.
Tables are unused.
2015-06-25 16:52:16 -07:00
Chad Versace
1132080d5d vk/util: Add anv_loge() for logging error messages 2015-06-25 16:52:16 -07:00
Chad Versace
5f2d469e37 vk: Add func anv_is_aligned() 2015-06-25 16:52:16 -07:00
Chad Versace
f7fb7575ef vk: Add anv_minify() 2015-06-25 16:52:05 -07:00
Chad Versace
7cec6c5dfd vk: Define MAX(a, b) macro 2015-06-25 16:29:42 -07:00
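The three helper commits above (anv_is_aligned(), anv_minify(), MAX) carry no description, so here is a minimal sketch of what helpers with these names conventionally do. The `sketch_` names, parameter types, and the power-of-two precondition are assumptions for illustration and are not taken from the driver source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Classic two-argument maximum. Note that each argument is evaluated twice. */
#define SKETCH_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Size of a dimension at mip level `levels`: halve once per level,
 * but never drop below 1 for a non-empty dimension. */
static inline uint32_t sketch_minify(uint32_t n, uint32_t levels)
{
    if (n == 0)
        return 0;
    assert(levels < 32); /* shifting a 32-bit value by >= 32 is undefined */
    return SKETCH_MAX(n >> levels, 1u);
}

/* True if n is a multiple of a, where a must be a power of two. */
static inline bool sketch_is_aligned(uintmax_t n, uintmax_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (n & (a - 1)) == 0;
}
```

For example, a 16-texel-wide image minified by two levels is 4 texels wide, and a 5-texel-wide image never shrinks below 1.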
Jason Ekstrand
d178e15567 nir/spirv: Fix up some deref ralloc parenting 2015-06-24 21:39:07 -07:00
Jason Ekstrand
845002e163 i965/nir: Handle returns as long as they're at the end of a function 2015-06-24 21:38:49 -07:00
Jason Ekstrand
2ecac045a4 i965/nir: Split NIR shader handling into two functions
The brw_create_nir function takes a GLSL or ARB shader and turns it into a
NIR shader.  The guts of the optimization and lowering code is now split
into a new brw_process_shader function.
2015-06-24 21:22:07 -07:00
Jason Ekstrand
e369a0eb41 nir/spirv: Use vtn_ssa_value for texture coordinates 2015-06-24 20:39:37 -07:00
Jason Ekstrand
d0bd2bc604 nir/spirv: Add support for the Uniform storage class
This is kinda sketchy.  I'm not really sure this is the way it's supposed to
be used.
2015-06-24 20:32:05 -07:00
Jason Ekstrand
ba0d9d33d4 nir/spirv: Add support for some more decorations including built-in 2015-06-24 20:30:32 -07:00
Jason Ekstrand
1bc0a1ad98 nir/spirv: Make the header file C++ safe 2015-06-24 19:01:10 -07:00
Jason Ekstrand
88d02a1b27 vk: Build xmlconfig stuff into libi965_compiler 2015-06-24 15:59:09 -07:00
Kristian Høgsberg Kristensen
24dff4f8fa vk/headers: Handle MBO fields
These must be set to one.
2015-06-24 09:37:50 -07:00
Jason Ekstrand
a62edcce4e Merge remote-tracking branch 'mesa-public/master' into vulkan 2015-06-23 18:05:25 -07:00
Jason Ekstrand
6844d6b7f8 i965/fs: Get rid of an unused variable in emit_barrier()
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-06-23 17:06:05 -07:00
Jason Ekstrand
40801295d5 i965: Remove the brw_context from the visitors
As of this commit, nothing actually needs the brw_context.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-06-23 15:36:13 -07:00
Jason Ekstrand
bcaf4a3f07 i965/vec4_vs: Add an explicit use_legacy_snorm_formula flag
This way we can stop doing is_gles3 checks inside of the compiler.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-06-23 15:35:01 -07:00
Jason Ekstrand
924b15d7de i965/vec4: Turn some _mesa_problem calls into asserts
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-06-23 15:35:00 -07:00
Jason Ekstrand
663f8d121d i965/vs: Pass the current set of clip planes through run() and run_vs()
Previously, these were pulled out of the GL context conditionally based on
whether we were running ff/ARB or a GLSL program.  Now, we just pass them
in so that the visitor doesn't have to grab them itself.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-06-23 15:35:00 -07:00
Jason Ekstrand
4af62c0f5c i965/fs: Add a do_rep_send flag to run_fs
Previously, we were pulling it from brw->do_rep_send

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-06-23 15:35:00 -07:00
Jason Ekstrand
1b0f6ffa15 i965: Pull calls to get_shader_time_index out of the visitor
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-06-23 15:34:59 -07:00
Jason Ekstrand
c7893dc3c5 i965: Use a single index per shader for shader_time.
Previously, each shader took 3 shader time indices which were potentially
at arbitrary points in the shader time buffer.  Now, each shader gets a
single index which refers to 3 consecutive locations in the buffer.  This
simplifies some of the logic at the cost of having a magic 3 in a few places.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-06-23 15:33:16 -07:00
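The single-index scheme described in the commit above can be sketched as follows. This assumes that shader index i owns buffer locations 3*i through 3*i + 2; the function and constant names are hypothetical and the actual layout in the driver may differ.

```c
#include <assert.h>
#include <stdint.h>

/* The "magic 3": each shader owns three consecutive buffer locations. */
enum { SKETCH_SLOTS_PER_SHADER = 3 };

/* Map (shader index, slot within the shader) to a buffer location.
 * Assumption: shader i owns locations [3*i, 3*i + 2]. */
static inline uint32_t sketch_shader_time_slot(uint32_t shader_index,
                                               uint32_t which)
{
    assert(which < SKETCH_SLOTS_PER_SHADER);
    return shader_index * SKETCH_SLOTS_PER_SHADER + which;
}
```

With this layout only one integer has to be stored per shader, and the three locations are derived on demand.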
Jason Ekstrand
6e255a3299 i965: Add compiler options to brw_compiler
This creates the options at screen creation time and then we just copy them
into the context at context creation time.  We also move is_scalar to the
brw_compiler structure.

We also end up manually setting some values that the core would have set by
default for us.  Fortunately, there are only two non-zero shader compiler
option defaults that we aren't overriding anyway so this isn't a big deal.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-06-23 14:28:09 -07:00
Jason Ekstrand
073294d3ef i965/fs: Plumb compiler debug logging through brw_compiler
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-06-23 14:28:08 -07:00
Jason Ekstrand
3fd457c9dd i965/fs: Do the no16 perf logging directly in fs_visitor::no16()
While we're at it, we'll drop the note about 10-20% performance loss.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-06-23 14:28:08 -07:00
Jason Ekstrand
f45bf97f30 i965/fs: Make no16 non-variadic
We never used the fact that it was variadic anyway.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-23 14:28:08 -07:00
Jason Ekstrand
1bc3b62d4a i965: Move INTEL_DEBUG variable parsing to screen creation time
v2: Do bufmgr set_debug and set_aub_dump at screen time as well.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-23 14:28:08 -07:00
Jason Ekstrand
d7565b7d65 i965: Remove the dependence on brw_context from the generators
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-23 14:28:08 -07:00
Jason Ekstrand
e639a6f68e i965: Plumb compiler debug logging through a function pointer in brw_compiler
v2 (Ken): Make shader_debug_log a printf-like function.
v3 (Jason): Add a void * to pass the brw_context through

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-23 14:28:08 -07:00
Kenneth Graunke
b0ad3ce4e7 mesa: Add a va_args variant of _mesa_gl_debug().
This will be useful for wrapper functions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-06-23 14:28:08 -07:00
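Why a va_list variant helps wrapper functions can be illustrated with a generic C sketch: a variadic function cannot forward its `...` to another variadic function, so the core routine takes a va_list and thin variadic wrappers call it. The names below are hypothetical and write into a caller-supplied buffer; they are not the _mesa_gl_debug() signature.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Core routine: takes a va_list so that any wrapper can forward to it. */
static int sketch_vdebug(char *buf, size_t size, const char *fmt, va_list args)
{
    return vsnprintf(buf, size, fmt, args);
}

/* Plain variadic front end. */
static int sketch_debug(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    int n = sketch_vdebug(buf, size, fmt, args);
    va_end(args);
    return n;
}

/* A second wrapper that adds a prefix before forwarding its arguments. */
static int sketch_perf_debug(char *buf, size_t size, const char *fmt, ...)
{
    int used = snprintf(buf, size, "perf: ");
    if (used < 0 || (size_t)used >= size)
        return used;

    va_list args;
    va_start(args, fmt);
    int n = sketch_vdebug(buf + used, size - (size_t)used, fmt, args);
    va_end(args);
    return n < 0 ? n : used + n;
}
```

Both wrappers share one formatting path, which is the pattern a printf-like logging callback can build on.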
Jason Ekstrand
630764407a i965: Replace some instances of brw->gen with devinfo->gen 2015-06-23 14:28:08 -07:00
Matt Turner
ae097580ac i965: Initialize backend_shader::mem_ctx in its constructor.
We were initializing it in each subclasses' constructors for some
reason.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-06-23 12:24:42 -07:00
Matt Turner
d8eeb4917c i965: Assert that the GL primitive isn't out of range.
Coverity sees the if (mode >= BRW_PRIM_OFFSET (128)) test and assumes
that the else-branch might execute for mode up to 127, which would be out
of bounds.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-06-23 12:24:42 -07:00
Matt Turner
4d93a07c45 i965/cfg: Assert that cur_do/while/if pointers are non-NULL.
Coverity sees that the functions immediately below the new assertions
dereference these pointers, but is unaware that an ENDIF always follows
an IF, etc.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-06-23 12:24:42 -07:00
Matt Turner
04758d25b4 mesa: Delete unused ICEIL().
Can't find any uses of it in git history.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-06-23 12:24:42 -07:00
Matt Turner
a49328d58d i965/fs: Don't mess up stride for uniform integer multiplication.
If the stride is 0, the source is a uniform and we should not modify the
stride.

Cc: "10.6" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91047
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-23 12:24:42 -07:00
Boyan Ding
3fa9bb81ec egl/x11: Remove duplicate call to dri2_x11_add_configs_for_visuals
The call to dri2_x11_add_configs_for_visuals (previously
dri2_add_configs_for_visuals) was moved downwards in commit f8c5b8a1,
but appeared again in its original position after its rename in
d019cd81. Remove it.

Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-23 18:54:27 +01:00
Connor Abbott
dee4a94e69 nir/vtn: add support for phi nodes 2015-06-23 10:34:55 -07:00
Connor Abbott
fe1269cf28 nir/builder: add support for inserting before/after blocks 2015-06-23 10:34:22 -07:00
Ben Widawsky
20dca37a20 i965/gen9: Don't use encrypted MOCS
On gen9+ MOCS is an index into a table. It is 7 bits, and AFAICT, bit 0 is for
doing encrypted reads.

I don't recall how I decided to do this for BXT. I don't know whether this patch was
ever needed, since it seems nothing is broken today on SKL. Furthermore, this
patch may no longer be needed because of the ongoing changes with MOCS setup. It
is what is being used/tested, so it's included in the series.

The chosen values are the old values left shifted. That was also an arbitrary
choice.

v2: Use shift in MOCS to make it clear what we're doing. (Ken)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-23 10:22:07 -07:00
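The "old values left shifted" choice can be sketched as below. This takes the commit's own tentative reading at face value (a 7-bit field whose bit 0 selects encrypted reads), which leaves 6 bits for the table index; the function name is hypothetical and the field layout is an assumption, not a statement of the hardware documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bit 0 = encrypted-read flag, bits 6:1 = table index.
 * Shifting the index left by one keeps the flag bit clear. */
static inline uint32_t sketch_gen9_mocs(uint32_t table_index)
{
    assert(table_index < 64); /* 6 bits remain once bit 0 is reserved */
    return table_index << 1;
}
```

Writing the shift explicitly, as the v2 note asks, makes it visible that bit 0 is deliberately left at zero.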
Ilia Mirkin
78d58e6425 nv50,nvc0: make sure to pushbuf_refn before putting bo into pushbuf_data
Without first running the bo through pushbuf_refn, the nouveau drm
library will have uninitialized structures regarding this bo, and will
insert incorrect data.

This fixes supertuxkart 0.9 crash on start (where it ends up doing a lot
of indirect draws).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-06-23 12:08:34 -04:00
Ilia Mirkin
9fcbf515b4 nvc0: always put all tfb bufs into bufctx
Since we clear the TFB bufctx binding point above, we need to put all of
the active tfb's back in, even if they haven't changed since last time.
Otherwise the tfb may get moved into sysmem and the underlying mapping
will generate write errors.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-06-23 12:08:34 -04:00
Ilia Mirkin
fccf012adc glsl: binding point is a texture unit, which is a combined space
This fixes compilation failures in Dota 2 Reborn where a texture unit
binding point was used that was numerically higher than the max
per stage.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-06-23 12:08:34 -04:00
Emil Velikov
59f8d4ee79 android: egl: do not link against libglapi
The only reason we touch glapi is to dlopen it in order to:
 - make sure that the unresolved _glapi* symbols in the dri modules are
provided.
 - fetch glFlush() and use it at various stages in the dri2 driver.

Cc: Chih-Wei Huang <cwhuang@linux.org.tw>
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-23 17:08:05 +01:00
Emil Velikov
a0dc6b7824 gbm: do not (over)link against libglapi.so
The whole of GBM does not rely on even a single symbol from the GL
dispatch library, unsurprisingly. The only need for it comes from the
unresolved symbols in the DRI modules, which are now correctly handled
with Frank's commit.

Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-23 17:08:05 +01:00
Frank Henigman
828f13330c gbm: dlopen libglapi so gbm_create_device works
Dri driver libs are not linked to pull in libglapi so gbm_create_device()
fails when it tries to dlopen them (unless the application is linked
with something that does pull in libglapi, like libGL).
Until dri drivers can be fixed properly, dlopen libglapi before trying
to dlopen them.

Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Frank Henigman <fjhenigman@google.com>
[Emil Velikov: Drop misleading bugzilla link, mention that libname differs]
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-23 17:08:05 +01:00
Emil Velikov
6ed52f78a0 configure: drop unused variable GBM_BACKEND_DIRS
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-06-23 17:08:05 +01:00
Emil Velikov
994be5143a configure: error out when building libEGL without shared-glapi
The latter is a hard requirement and without it we'll error out later
on in the build.

Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-06-23 17:08:05 +01:00
Emil Velikov
ddc886b5bf configure: error out when building backend-less libEGL
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-06-23 17:07:32 +01:00
Emil Velikov
2752e629e7 drivers/x11: drop unneeded HAVE_X11_DRIVER check
Already handled in the Makefile which includes the drivers/x11 subdir.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-06-23 17:04:40 +01:00
Emil Velikov
92dc507862 configure: allow building shared-glapi powered libgl-xlib
Cc: Brian Paul <brianp@vmware.com>
Cc: Adam Jackson <ajax@redhat.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-23 17:04:34 +01:00
Emil Velikov
5c37ababae targets/libgl-xlib: fix the build against shared_glapi
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-23 17:04:29 +01:00
Emil Velikov
b92233f2a5 drivers/x11: fix the build against shared_glapi
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-23 17:04:21 +01:00
Emil Velikov
6d744aaf4e configure: warn about shared_glapi & xlib-glx only when both are set
Printing out the message when shared_glapi is disabled only leads to
confusion.

Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-06-23 17:02:50 +01:00
Emil Velikov
06109db47b glapi: remap_helper.py: remove unused argument 'es'
Identical to the previous commit - not used by any of the Autotools,
Android or SCons builds.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-23 16:57:27 +01:00
Emil Velikov
ec16bb62ac glapi: gl_table.py: remove unused variable 'es'
None of the three build systems ever set it, as such we can clear things
up a bit.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-23 16:56:50 +01:00
Derek Foreman
4f8f790525 egl: Use the loader_open_device() helper to do open with CLOEXEC
We've moved the open with CLOEXEC idiom into a helper function, so
call it instead of duplicating the code.

This also replaces a couple of opens that didn't properly do CLOEXEC.

Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-23 16:54:56 +01:00
Derek Foreman
324ee9b391 glx: Use loader_open_device() helper
We've moved the open with CLOEXEC idiom into a helper function, so
call it instead of duplicating the code here.

Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-23 16:54:53 +01:00
Derek Foreman
9c92746349 loader: Rename drm_open_device() to loader_open_device() and share it
This is already our common idiom for opening files with CLOEXEC and
it's a little ugly, so let's share this one implementation.

Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-23 16:54:51 +01:00
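The "open with CLOEXEC" idiom this commit refers to generally looks like the sketch below: try O_CLOEXEC first, and fall back to setting FD_CLOEXEC with fcntl() when the flag is unavailable or rejected. The function name is hypothetical and the body is an illustration of the idiom, not a copy of loader_open_device().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Open `path` read/write with the close-on-exec flag set.
 * Returns the file descriptor, or -1 on failure with errno set. */
static int sketch_open_device(const char *path)
{
    int fd;
#ifdef O_CLOEXEC
    /* Preferred path: the flag is applied atomically at open time. */
    fd = open(path, O_RDWR | O_CLOEXEC);
    if (fd == -1 && errno == EINVAL)
#endif
    {
        /* Fallback: open first, then set the flag. This leaves a small
         * window in which another thread could fork and exec. */
        fd = open(path, O_RDWR);
        if (fd != -1)
            fcntl(fd, F_SETFD, fcntl(fd, F_GETFD) | FD_CLOEXEC);
    }
    return fd;
}
```

Keeping this in one helper means every caller gets the fallback logic without repeating the preprocessor conditional.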
Derek Foreman
aaac913e90 egl/drm: Duplicate fd with F_DUPFD_CLOEXEC to prevent leak
Replacing dup() with fcntl F_DUPFD_CLOEXEC creates the duplicate
file descriptor with CLOEXEC so it won't be leaked to child
processes if the process fork()s and exec()s later.

Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-23 16:54:47 +01:00
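A minimal sketch of the replacement described above, using the POSIX fcntl() command F_DUPFD_CLOEXEC. The helper name and the choice of 3 as the lowest acceptable descriptor number are assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Duplicate fd so that the copy has FD_CLOEXEC set from the start.
 * The third argument is the lowest descriptor number the kernel may
 * return; 3 keeps the copy clear of stdin, stdout and stderr.
 * Returns the new descriptor, or -1 on failure with errno set. */
static int sketch_dup_cloexec(int fd)
{
    return fcntl(fd, F_DUPFD_CLOEXEC, 3);
}
```

Unlike dup() followed by a separate fcntl(F_SETFD), the flag is set atomically, so there is no window in which a concurrent fork and exec could inherit the descriptor.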
Jose Fonseca
be5f71d4a5 draw,tgsi: Assume TGSI_PROPERTY_GS_INVOCATIONS default of 1.
If the shader doesn't specify number of invocations, assume one.

This fixes geometry shaders on state trackers other than Mesa (and
probably graw tests too.)

Trivial.
2015-06-23 12:19:52 +01:00
Jose Fonseca
634cfb9a45 glsl: Specify the shader stage in linker errors due to too many in/outputs.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-23 12:06:39 +01:00
Dave Airlie
4731be701f docs: update GL3 with softpipe/llvmpipe gpu_shader5 pieces.
This just updates the bits I've added in the previous few patches.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-06-23 15:55:30 +10:00
Dave Airlie
1a71fbe28c draw/gallivm: add invocation ID support for llvmpipe.
This extends the draw code to add support for invocations.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-06-23 15:54:07 +10:00
Dave Airlie
40d225803e draw/tgsi: implement geom shader invocation support.
This is just for softpipe, llvmpipe won't work without
some changes.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-06-23 15:53:49 +10:00
Dave Airlie
24e77cb09f tgsi: handle indirect sampler arrays. (v2)
This is required for ARB_gpu_shader5 support in softpipe.

v2: add support to txd/txf/txq paths.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-06-23 15:52:48 +10:00
Kenneth Graunke
1762568fd3 nir: Allow vec2/vec3/vec4 instructions in the select peephole pass.
These are basically just moves, so they should be safe as well.

When disabling i965's GLSL IR level scalarizer (channel expressions)
pass, I started seeing NIR code like this:

        if ssa_21 {
                block block_1:
                /* preds: block_0 */
                vec4 ssa_120 = vec4 ssa_82, ssa_83, ssa_84, ssa_30
                /* succs: block_3 */
        } else {
                block block_2:
                /* preds: block_0 */
                /* succs: block_3 */
        }
        block block_3:
        /* preds: block_1 block_2 */
        vec4 ssa_33 = phi block_1: ssa_120, block_2: ssa_2

Previously, the GLSL IR scalarizer pass would break the vec4 into a
series of fmovs, which were allowed by the peephole pass.  But with
the vec4 operation, they were not.  We want to keep getting selects.

Normal i965 on Broadwell:
instructions in affected programs:     200 -> 176 (-12.00%)
helped:                                4

With brw_fs_channel_expressions() disabled:
instructions in affected programs:     1832 -> 1646 (-10.15%)
helped:                                30

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-06-22 14:08:36 -07:00
Kenneth Graunke
94e3864707 i965: Add and fix comments in brw_vue_map.c.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-06-22 14:05:44 -07:00
Kenneth Graunke
38eb9015e3 i965: Split VUE map handling out of brw_vs.c into brw_vue_map.c.
This was originally only used by the vertex shader, but it's now used by
the geometry shader as well, and will also eventually be used for
tessellation control and evaluation shaders.

I suspect it will be easier to find in a file named after the concept.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-06-22 14:05:44 -07:00
Connor Abbott
9a3dda101e nir/vtn: fix emitting code after loops
When we're done emitting the code for a loop, we need to visit the new
break block, which is the merge block of the current loop, rather than
the old merge block, which is the merge block of the loop containing the
one we just emitted code for.
2015-06-22 13:53:08 -07:00
Ben Widawsky
90754d2df0 i965/gen9: Implement Push Constant Buffer workaround
This implements a workaround (exact excerpt as a comment in the code). The docs
specify [clearly, after you struggle for a while] that the offset isn't relative
to state base. This actually makes sense. This fixes hangs on SKL.

Buffer #0 is meant to be used for normal uniforms.
Buffer #1 is typically used for gather constants when using RS.
Buffer #1-#3 could be used to push a bunch of UBO data which would just be
  somewhere in memory, and not relative to the dynamic state.

NOTE: I've moved away from the ternary operator for the new gen9 conditions.
Admittedly it's probably not great to do this, but I really want to fix this all
up in the subsequent patch and doing it here makes that diff a lot nicer. I want
to split out the gen8/9 code to make the function a bit more readable, but to
keep this easily cherry-pickable I am doing this fix first. If we decide not to
merge the cleanup patch then I can revisit this.

Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Valtteri Rantala <Valtteri.rantala@intel.com>
2015-06-22 12:11:41 -07:00
Connor Abbott
e9c21d0ca0 unbreak things 2015-06-22 11:59:55 -07:00
Brian Paul
2b07b8d104 mesa: use _mesa_lookup_enum_by_nr() in print_array()
Print GL_FLOAT, etc. instead of hex value.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-06-22 08:46:56 -06:00
Chia-I Wu
8787141429 ilo: emit 3DPRIMITIVE from gen6_3dprimitive_info
It allows us to remove ilo_ib_state::draw_start_offset and
ILO_PRIM_RECTANGLES.  gen6_3d_translate_pipe_prim() is also replaced by
ilo_translate_draw_mode().
2015-06-22 15:18:57 +08:00
Chia-I Wu
58f95b332d ilo: align vertex buffer size in buf_create()
With ilo_format.[ch] moved out of core, the aligning of vertex buffers does
not belong to core anymore.
2015-06-22 15:18:57 +08:00
Chia-I Wu
513bc5d90b ilo: move ilo_format.[ch] out of core
They provide PIPE_FORMAT_x to GEN6_FORMAT_x translation as well as some
convenient helpers.  Move them out of core.
2015-06-22 15:18:56 +08:00
Chia-I Wu
3547bb0783 ilo: add ilo_state_surface_valid_format()
Check if a surface format can be used for the specified access type.
2015-06-22 15:18:56 +08:00
Chia-I Wu
aa3e5e0dde ilo: add ilo_state_vf_valid_element_format()
Check if a surface format can be used as a VE format.
2015-06-22 15:18:56 +08:00
Alexandre Courbot
da8300cb03 nvc0: use NV_VRAM_DOMAIN() macro
Use the newly-introduced NV_VRAM_DOMAIN() macro to support alternative
VRAM domains for chips that do not have dedicated video memory.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Martin Peres <martin.peres@free.fr>
2015-06-22 01:00:02 -04:00
Alexandre Courbot
f22406837f nouveau: support for custom VRAM domains
Some GPUs (e.g. GK20A, GM20B) do not embed VRAM of their own and use
the system memory as a backend instead. For such systems, allocating
objects in VRAM results in errors since the kernel will not allow
VRAM objects allocations.

This patch adds a vram_domain member to struct nouveau_screen that can
optionally be initialized to an alternative domain to use for VRAM
allocations. If left untouched, NOUVEAU_BO_VRAM will be used for
systems that embed VRAM, and NOUVEAU_BO_GART will be used for VRAM-less
systems.

Code that uses GPU objects is then expected to use the NV_VRAM_DOMAIN()
macro in place of NOUVEAU_BO_VRAM to ensure correct behavior on
VRAM-less chips.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Martin Peres <martin.peres@free.fr>
2015-06-22 01:00:02 -04:00
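The vram_domain mechanism described above can be sketched roughly as follows. The struct, flag values and init helper are placeholders invented for this example (the real flags are NOUVEAU_BO_VRAM and NOUVEAU_BO_GART from libdrm, and the real member lives in struct nouveau_screen); only the overall shape follows the commit text.

```c
#include <stdint.h>

/* Placeholder domain flags standing in for NOUVEAU_BO_VRAM / NOUVEAU_BO_GART. */
#define SKETCH_BO_VRAM 0x1u
#define SKETCH_BO_GART 0x2u

/* Stand-in for the screen structure that gains a vram_domain member. */
struct sketch_screen {
    uint32_t vram_domain;
};

/* Callers use this in place of the hard-coded VRAM flag. */
#define SKETCH_VRAM_DOMAIN(screen) ((screen)->vram_domain)

/* Chips without dedicated video memory fall back to the GART domain. */
static void sketch_screen_init(struct sketch_screen *screen,
                               int has_dedicated_vram)
{
    screen->vram_domain = has_dedicated_vram ? SKETCH_BO_VRAM
                                             : SKETCH_BO_GART;
}
```

Allocation sites then pass SKETCH_VRAM_DOMAIN(screen) and need no per-chip conditionals of their own.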
Chia-I Wu
57bdcae9e0 ilo: add ilo_state_compute
Replace gen6_idrt_data with ilo_state_compute, which has a bunch of
validations and is now preferred.
2015-06-22 12:56:55 +08:00
Dave Airlie
2bf5a4211e r600g: ignore sampler views for now.
This fixes a regression in that r600 stopped working when
sampler views were pushed.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-06-22 14:02:49 +10:00
Rob Clark
66a93a0ff9 freedreno/ir3: pass sz to split_dest()
For query_levels, we generate a getinfo with writemask of (z), which RA
will consider as size==3.  But we were still generating four fanouts.
Which meant that RA would see it as two different register classes,
depending on the path to definer.  Ie. on the getinfo instruction itself
it would see size==3, but when chasing back through the fanouts it would
see size==4.

Easiest way to solve that is to just generate the chain of neighboring
fanouts to have the correct size in the first place.

Note: we may eventually want split_dest() to take start/end or wrmask
instead, since really we only need size==1.  But RA is not clever enough
for that, query_levels is not that common, and the other two registers
that get allocated are never used so those register slots can be
immediately re-used.  So bunch of work for probably no real gain.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 08:01:12 -04:00
Rob Clark
1ee4d51e7a freedreno/ir3/nir: add more opcodes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 08:01:06 -04:00
Rob Clark
43048c7093 freedreno/ir3: only unminify txf coords on a3xx
Seems like a4xx gets this right.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 08:01:05 -04:00
Rob Clark
0f008082b1 freedreno: remove int sampler shader variants
We get this information from NIR (which gets it from sview decl in tgsi
when translating from tgsi), so no need to maintain shader variants for
this.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 08:00:58 -04:00
Rob Clark
457f7c2a2a freedreno/ir3: block reshuffling and loops!
This shuffles things around to allow the shader to have multiple basic
blocks.  We drop the entire CFG structure from nir and just preserve the
blocks.  At scheduling we know whether to schedule conditional branches
or unconditional jumps at the end of the block based on the # of block
successors.  (Dropping jumps to the following instruction, etc.)

One slight complication is that variables (load_var/store_var, ie.
arrays) are not in SSA form, so we have to figure out where to put the
phis ourselves.  For this, we use the predecessor set information from
nir_block.  (We could perhaps use NIR's dominance frontier information
to help with this?)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 07:54:38 -04:00
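The rule "pick the block terminator from the number of successors, and drop jumps to the following instruction" can be sketched like this. The types and names are hypothetical and much simpler than the real ir3 structures; the sketch only shows the decision, not instruction emission.

```c
#include <stddef.h>

enum sketch_terminator {
    SKETCH_TERM_FALLTHROUGH, /* nothing to emit */
    SKETCH_TERM_JUMP,        /* unconditional jump */
    SKETCH_TERM_BRANCH,      /* conditional branch */
};

struct sketch_block {
    int index;                                /* position in emission order */
    const struct sketch_block *successors[2]; /* NULL when absent */
};

/* Two successors need a conditional branch, one successor needs a jump
 * unless it is simply the next block, and zero successors need nothing. */
static enum sketch_terminator
sketch_pick_terminator(const struct sketch_block *block)
{
    const struct sketch_block *s0 = block->successors[0];
    const struct sketch_block *s1 = block->successors[1];

    if (s0 && s1)
        return SKETCH_TERM_BRANCH;
    if (s0 && s0->index != block->index + 1)
        return SKETCH_TERM_JUMP;
    return SKETCH_TERM_FALLTHROUGH;
}
```

A loop back-edge therefore becomes a jump, while straight-line blocks cost no extra instruction.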
Rob Clark
660d5c1646 freedreno/ir3: a4xx encodes larger immed offset
Without this, negative branch/jump offsets look like very large positive
offsets.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 07:54:31 -04:00
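Why a field-width mismatch turns negative offsets into large positive ones can be shown with a generic sign-extension sketch. The widths used in the usage note (16 and 20 bits) are purely illustrative and are not the actual a3xx or a4xx encodings; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret the low `bits` bits of `field` as a two's-complement value. */
static inline int32_t sketch_sign_extend(uint32_t field, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    uint32_t mask = bits == 32 ? 0xffffffffu : ((1u << bits) - 1u);
    uint32_t sign = 1u << (bits - 1);

    field &= mask;
    /* Flipping and then subtracting the sign bit propagates it upward. */
    return (int32_t)((field ^ sign) - sign);
}
```

An offset of -1 written into a 16-bit field is 0xFFFF. Read back as 16 bits it is -1 again, but read as a 20-bit field it is 65535, which is the "very large positive offset" symptom the commit describes.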
Rob Clark
d646d3ae9d freedreno/ir3: simplify find_neighbors stop condition
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 07:54:16 -04:00
Rob Clark
c8fb5f8a01 freedreno/ir3: move inputs/outputs to shader
These belong in the shader, rather than the block.  Mostly a lot of
churn and nothing too interesting.  But splitting this out from the
rest of ir3_block reshuffling to cut down the noise in the later
patch.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 07:54:04 -04:00
Rob Clark
d52fb2f5ad freedreno/ir3/ra: use register_allocate
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 07:53:58 -04:00
Rob Clark
694beb8b83 freedreno/ir3: introduce ir3_compiler object
Right now, just provides a cleaner way to get at the gpu-id, given the
separation between compiler and context.  But we will need this also to
hold the reg-set for new register allocation.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 07:53:50 -04:00
Rob Clark
5c1e153467 freedreno/ir3: dump nocp option
No longer used, or even possible, with NIR frontend.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 07:53:43 -04:00
Rob Clark
7674ab12e8 freedreno/ir3: silence warnings
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 07:53:35 -04:00
Rob Clark
0f6faa8ff3 freedreno/ir3: remove tgsi f/e
Also remove ir3_flatten which was only used by tgsi f/e.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 07:53:25 -04:00
Rob Clark
7273cb4e93 freedreno/ir3/sched: convert to priority queue
Use a more standard priority-queue based scheduling algo.  It is simpler
and will make things easier once we have multiple basic blocks and flow
control.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 07:53:17 -04:00
Rob Clark
adf1659ff5 freedreno/ir3: use standard list implementation
Use standard list_head double-linked list and related iterators,
helpers, etc, rather than weird combo of instruction array and next
pointers depending on stage.  Now block has an instrs_list.  In
certain stages where we want to remove and re-add to the blocks list
we just use list_replace() to copy the list to a new list_head.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 07:53:09 -04:00
Rob Clark
67d994c676 freedreno/ir3: drop dot graph dumping
At least for now.. right now the instruction and instruction list
printing should suffice, and the re-working of ir3_block would require
a lot of changes in that code.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 07:52:58 -04:00
Rob Clark
5c8c2e2f97 freedreno/ir3: more builder helpers
Use ir3_MOV() builder in a couple of spots, rather than open-coding the
instruction construction.  Also add ir3_NOP() builder and use that
instead of open coding.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 07:52:41 -04:00
Rob Clark
b33015f889 gallium/ttn: add missing SNE
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-21 07:52:36 -04:00
Rob Clark
c79b2e626c util/list: add list_first/last_entry
I need an easier way to get at head/tail in ir3.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 07:52:36 -04:00
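Head and tail accessors for an intrusive doubly-linked list are usually built on a container_of-style macro, roughly as sketched below. All names carry a `sketch_` prefix because this is an illustration written for this note, not the contents of util/list.h, and the instruction struct is hypothetical.

```c
#include <stddef.h>

/* Intrusive list node; an empty list is a head pointing at itself. */
struct sketch_list_head {
    struct sketch_list_head *prev, *next;
};

/* Recover the enclosing struct from a pointer to its embedded node. */
#define sketch_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* First and last elements. Only valid on a non-empty list. */
#define sketch_list_first_entry(head, type, member) \
    sketch_container_of((head)->next, type, member)
#define sketch_list_last_entry(head, type, member) \
    sketch_container_of((head)->prev, type, member)

static void sketch_list_init(struct sketch_list_head *head)
{
    head->prev = head;
    head->next = head;
}

static void sketch_list_addtail(struct sketch_list_head *item,
                                struct sketch_list_head *head)
{
    item->next = head;
    item->prev = head->prev;
    head->prev->next = item;
    head->prev = item;
}

/* Hypothetical element type with the node embedded after other fields. */
struct sketch_instr {
    int opcode;
    struct sketch_list_head node;
};
```

Because the list is circular, the head's `next` and `prev` pointers give the first and last elements in constant time, with no traversal.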
Rob Clark
b3d2e36716 gallium/ttn: add texture-type support
v2: rebased on using SVIEW to hold type information

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-21 07:52:29 -04:00
Rob Clark
cb258c1dec glsl_to_tgsi: add SVIEW decl support
Freedreno needs sampler type information to deal with int/uint textures.
To accomplish this, start creating sampler-view declarations, as
suggested here:

 http://lists.freedesktop.org/archives/mesa-dev/2014-November/071583.html

create a sampler-view with index matching the sampler, to encode the
texture type (ie. SINT/UINT/FLOAT).  Ie:

   DCL SVIEW[n], 2D, UINT
   DCL SAMP[n]
   TEX OUT[1], IN[1], SAMP[n]

For tgsi texture instructions which do not take an explicit SVIEW
argument, the SVIEW index is implied by the SAMP index.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-21 07:52:22 -04:00
Rob Clark
93379748f7 util/blitter (and friends): generate appropriate SVIEW decls
Some hardware needs to know the sampler type.  Update the blit related
shaders to include SVIEW decl.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-21 07:52:16 -04:00
Rob Clark
e536992986 util/pstipple: updates for SVIEW decls
To allow for shaders which use SVIEW decls for TEX* instructions, we
need to preserve the constraint that the shader either has no SVIEW's or
it has one matching SVIEW for each SAMP.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-21 07:52:12 -04:00
Rob Clark
b516e68afb draw: updates to support SVIEW decls
To allow for shaders which use SVIEW decls for TEX* instructions, we
need to preserve the constraint that the shader either has no SVIEW's or
it has one matching SVIEW for each SAMP.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-21 07:52:07 -04:00
Rob Clark
f481af110e tgsi/transform: add support for SVIEW decls
TODO single return_type (use enum)

v2: single return_type arg, and use enum

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-21 07:52:02 -04:00
Rob Clark
b13135e066 tgsi: update docs for SVIEW usage with TEX* instructions
Based on mailing list discussion here:

http://lists.freedesktop.org/archives/mesa-dev/2014-November/071583.html

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-21 07:51:53 -04:00
Eric Anholt
717376155d mesa: Back out an accidental change I had in a VC4 commit.
This was a hack as part of debugging some glamor-on-GLES2 behavior that
ended up being an xserver bug.  I suspect we can just flip this extension
on for GLES2, but the spec says it requires 3.1.
2015-06-20 15:04:17 -07:00
Emil Velikov
104bff0376 docs: add news item and link release notes for mesa 10.5.8
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-20 16:42:21 +01:00
Emil Velikov
aa28423bcc docs: Add sha256sums for the 10.5.8 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit a81b1d5512)
2015-06-20 16:42:21 +01:00
Emil Velikov
97caf2054f Add release notes for the 10.5.8 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 24b043aab7)
2015-06-20 16:42:21 +01:00
Eric Anholt
c009038674 vc4: Use a defined t value for 1D textures.
This doesn't fix the broken 1D cases of texsubimage, but it does prevent
segfaulting when dumping the QIR code generated in fbo-1d.
2015-06-20 00:16:32 -07:00
Eric Anholt
bb107110a4 vc4: Fix write-only texsubimage when we had to align.
We need to make sure that when we store the aligned box, we've got
initialized contents in the border.  We could potentially just load the
border area, but for now let's get text rendering working in X (and fix
the GL_TEXTURE_2D errors in piglit's texsubimage test and
gl-2.1-pbo/test_tex_image)
2015-06-20 00:16:32 -07:00
Chia-I Wu
028590cbc7 ilo: clean up header includes
Core is more self-contained now.
2015-06-20 11:20:12 +08:00
Chia-I Wu
244caba250 ilo: avoid ilo_ib_state in genX_3DPRIMITIVE()
ilo_ib_state is not in core.
2015-06-20 11:18:30 +08:00
Chia-I Wu
dcb5bad3a3 ilo: move gen6_so_SURFACE_STATE() out of core
It does not belong to core.
2015-06-20 11:18:10 +08:00
Chia-I Wu
e3372c4bfb ilo: add ilo_state_sol_buffer
It serves the same purpose as ilo_state_vertex_buffer does.
2015-06-20 11:18:09 +08:00
Chia-I Wu
9904e647cc ilo: add ilo_state_index_buffer
It serves the same purpose as ilo_state_vertex_buffer does.
2015-06-20 11:18:07 +08:00
Chia-I Wu
da4878cb80 ilo: add ilo_state_vertex_buffer
Being a parameter-like state, we may want to get rid of
ilo_state_vertex_buffer_info or ilo_state_vertex_buffer eventually.  But we
want them now as they are how we do cross-validation right now.
2015-06-20 11:14:14 +08:00
Chia-I Wu
4555211028 ilo: add 3DSTATE_VF_INSTANCING to ilo_state_vf
3DSTATE_VF_INSTANCING specifies instancing enable and step rate.  They are
specified along with 3DSTATE_VERTEX_BUFFERS instead prior to Gen8.  Both
commands are added.
2015-06-20 11:14:14 +08:00
Chia-I Wu
e8d297b7a1 ilo: add 3DSTATE_VF to ilo_state_vf
3DSTATE_VF specifies cut index enable and cut index.  Cut index enable is
specified in 3DSTATE_INDEX_BUFFER instead prior to Gen7.5.  Both commands are
added.
2015-06-20 11:14:14 +08:00
Chia-I Wu
7b3432b62d ilo: embed pipe_index_buffer in ilo_ib_state
Make it obvious that we save a copy of pipe_index_buffer.
2015-06-20 11:14:10 +08:00
Chia-I Wu
73f0d6d22d ilo: fix a buffer overrun
Add missing parentheses in SURFTYPE_NULL initialization.
2015-06-20 11:13:20 +08:00
Chia-I Wu
aa3ec8bc46 ilo: fix a -Wmaybe-uninitialized warning
ilo_shader.c: In function ‘ilo_shader_select_kernel_sbe’:
ilo_shader.c:1140:27: warning: ‘src_skip’ may be used uninitialized in this
function [-Wmaybe-uninitialized]
2015-06-20 11:13:20 +08:00
Brian Paul
a1f84453a2 glsl: fix formatting glitch in _mesa_print_ir()
Print the closing ) before the newline.  Trivial.
2015-06-19 16:46:29 -06:00
Kristian Høgsberg Kristensen
9b9f973ca6 vk: Implement scratch buffers to make spilling work 2015-06-19 15:42:15 -07:00
Kristian Høgsberg Kristensen
9e59003fb1 vk: Undo relocs for scratch bos 2015-06-19 15:42:15 -07:00
Kristian Høgsberg Kristensen
b20794cfa8 vk/allocator: Get rid of non-memfd path
We can just use modern valgrind now.
2015-06-19 15:42:15 -07:00
Kristian Høgsberg Kristensen
aba75d0546 vk/headers: Make General State offsets relocations 2015-06-19 15:42:15 -07:00
Ben Widawsky
7c3da3592e i965/gen8: Use HALIGN_16 for single sample mcs buffers
The original code meant to do this, but was only checking num_samples == 1 to
figure out if a surface was fast clear capable. However, we can allocate single
sample miptrees with num_samples == 0 (when it's an internally created buffer).

This fixes a bunch of the piglit tests on gen8. Other gens should have been
fine.

Here is the order of events that allowed this to slip through:
t0: I wrote halign patches and tested them. These alignment assertions are for
   gen8 fast clear surfaces, basically.
t1: I pushed bogus perf patch which made fast clears never happen
t2: Reworked halign patches based on Chad's feedback and introduced the bug this
   patch fixes.
t2.5: I tested reworked patches, but assertion wasn't hit because of t1.
t3: Matt fixed issue in t1 which made fast clears happen here:
commit 22af95af83
Author: Matt Turner <mattst88@gmail.com>
Date:   Thu Jun 18 16:14:50 2015 -0700

    i965: Add missing braces around if-statement.

This logic should match that of the v1 of my halign patch series.

Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Matt Turner <mattst88@gmail.com>
Reported-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2015-06-19 11:25:00 -07:00
Ilia Mirkin
539cb2b76e mesa: move ARB_gs5 enums to core, EXT_polygon_offset_clamp to desktop
When adding EXT_polygon_offset_clamp, I first made it core-only, and
never moved the enum getter back to the GL/GL_CORE section. Similarly,
ARB_gs5 is a core-only extension, so move its getters to the GL_CORE
section.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-19 14:11:27 -04:00
Brian Paul
6ec4e9c28d u_vbuf: fix src_offset alignment in u_vbuf_create_vertex_elements()
If the driver says PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY=1,
the driver should never receive a pipe_vertex_element::src_offset value
that's not a multiple of four.  But the vbuf code wasn't actually adjusting
the src_offset value when creating the vertex element state object.

We just need to align the src_offset values put in the driver_attribs[]
array.

See the piglit gl-1.5-vertex-buffer-offsets test.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-06-19 10:54:24 -06:00
Brian Paul
c40f44cc99 gallium: whitespace, formatting clean-up in p_state.h
Remove trailing whitespace, move some braces, 78-column wrapping.
Trivial.
2015-06-19 08:45:00 -06:00
Brian Paul
4c11008eba st/wgl: fix WGL_SWAP_METHOD_ARB query
There are three possible return values (not two): WGL_SWAP_COPY_ARB,
WGL_SWAP_EXCHANGE_EXT and WGL_SWAP_UNDEFINED_ARB.

VMware bug 1431184

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2015-06-19 08:45:00 -06:00
Brian Paul
73bdf4ba86 stw: use new stw_get_nop_function() function to avoid Viewperf 12 crashes
Also, print a warning if we do return NULL from wglGetProcAddress() to
help spot this sort of problem in the future.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-06-19 08:45:00 -06:00
Brian Paul
8d005a643e stw: add some no-op functions for GL_EXT_dsa, GL_NV_half_float
Viewperf 12 calls wglGetProcAddress() to get pointers to some unsupported
DSA and half-float functions.  We return NULL but Viewperf doesn't check
for null before trying to jump through the pointer.  That causes a crash.

This patch adds no-op functions to call instead (used by the next patch).
This avoids the crash but the rendering is incorrect.

Some DSA functions are being added to Mesa at this time so we may be
able to remove some of these no-ops in the future.

More no-op functions may be added as needed.

VMware PR1383421

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-06-19 08:45:00 -06:00
Jose Fonseca
eee9247018 st/wgl: Don't return core profile for 3.1 contexts.
WGL_CONTEXT_PROFILE_MASK_ARB doesn't apply to desktop OpenGL versions
less than 3.2 -- applications can't specify whether they want a core or
a compat 3.1 context -- instead they are supposed to check whether the
returned context advertises the GL_ARB_compatibility extension.

Mesa doesn't support compatibility contexts for versions higher than 3.1,
so we used to return a core profile context, but this makes several Windows
applications unhappy, because they just assume they got a compatibility
context without checking.

So it seems safer on Windows to never return core profile for 3.1,
ie, just fail the context creation.

VMware PR1365920.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-19 08:45:00 -06:00
Brian Paul
528bd94432 st/wgl: set PIPE_BIND_SAMPLER_VIEW for window color buffers
To allow sampling from the surface for things like glCopyPixels
or glCopyTexSubImage.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2015-06-19 08:45:00 -06:00
Brian Paul
9405c1b3b0 st/wgl: add support for multisample pixel formats
Create pixel formats with 0, 4, 8 and 16 samples per pixel.
Add a SVGA_FORCE_MSAA env var to force creating all pixel formats
with a particular sample count.  This is useful for testing Mesa/GLUT/
etc. programs which don't ordinarily use multisample.

Reviewed-by: Matthew McClure <mcclurem@vmware.com>
2015-06-19 08:45:00 -06:00
Brian Paul
0925e5f5bc st/wgl: respect sample count when creating framebuffer surfaces
Use the visual/pixel format's sample count instead of zero.

Reviewed-by: Matthew McClure <mcclurem@vmware.com>
2015-06-19 08:45:00 -06:00
Brian Paul
b8249de646 st/wgl: fix WGL_SAMPLE_BUFFERS_ARB query
Only report 1 for WGL_SAMPLE_BUFFERS_ARB if the number of samples
per pixel > 1.

Reviewed-by: Matthew McClure <mcclurem@vmware.com>
2015-06-19 08:45:00 -06:00
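The rule stated in the WGL_SAMPLE_BUFFERS_ARB commit above can be sketched as a one-line predicate. This is a minimal illustration, not the st/wgl code; the function name sample_buffers_for() is hypothetical.

```c
/* Hypothetical helper: value to report for a "sample buffers" query.
 * A format counts as multisampled only when it has more than one
 * sample per pixel, so sample counts of 0 and 1 both report 0. */
static int sample_buffers_for(unsigned samples_per_pixel)
{
    return samples_per_pixel > 1 ? 1 : 0;
}
```

With the sample counts listed in the multisample pixel format commit (0, 4, 8 and 16), only the 0-sample formats would report 0.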
Brian Paul
5ad5d44af5 tgsi: add comments for ureg_emit_label() 2015-06-19 08:45:00 -06:00
Brian Paul
12c1c0706d tgsi: new comments, assertion for executing TGSI_OPCODE_CAL 2015-06-19 08:45:00 -06:00
Timothy Arceri
2ce2b80c6f docs: update developer info
Update piglit link to the current Piglit website.

Add note about updating patchwork when sending patch revisions.

Acked-by: Matt Turner <mattst88@gmail.com>
2015-06-19 18:27:40 +10:00
Jose Fonseca
afeb922206 llvmpipe: Truncate the binned constants to max const buffer size.
Tested with Ilia Mirkin's gzdoom.trace and
"arb_uniform_buffer_object-maxuniformblocksize fsexceed" piglit test
without my earlier fix to fail linkage when UBO exceeds
GL_MAX_UNIFORM_BLOCK_SIZE.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-06-19 07:21:06 +01:00
Jose Fonseca
f734d25560 glsl: Fail linkage when UBO exceeds GL_MAX_UNIFORM_BLOCK_SIZE.
It's not totally clear whether other Mesa drivers can safely cope with
over-sized UBOs, but at least for llvmpipe receiving a UBO larger than
its limit causes problems, as it won't fit into its internal display
lists.

This fixes piglit "arb_uniform_buffer_object-maxuniformblocksize
fsexceed" without regressions for llvmpipe.

NVIDIA driver also fails to link the shader from
"arb_uniform_buffer_object-maxuniformblocksize fsexceed".

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65525

PS: I don't recommend cherry-picking this for Mesa stable, as some app
might inadvertently have been relying on UBOs larger than
GL_MAX_UNIFORM_BLOCK_SIZE to work on other drivers, so even if this
commit is universally accepted it's probably best to let it mature in
master for a while.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-06-19 07:21:05 +01:00
Connor Abbott
841aab6f50 matrices matrices matrices 2015-06-18 18:52:44 -07:00
Connor Abbott
d0fc04aacf nir/types: be less strict about constructing matrix types 2015-06-18 18:51:51 -07:00
Ilia Mirkin
5974841fd0 glsl: guard gl_NumSamples enablement on ARB_sample_shading
gl_NumSamples should only be enabled when ARB_sample_shading is enabled.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-06-18 20:40:22 -04:00
Connor Abbott
22854a60ef nir/builder: add a nir_fdot() convenience function 2015-06-18 17:34:55 -07:00
Connor Abbott
0e86ab7c0a nir/types: add a helper to transpose a matrix type 2015-06-18 17:34:12 -07:00
Connor Abbott
de4c31a085 fix glsl450 for composites 2015-06-18 17:33:08 -07:00
Matt Turner
22af95af83 i965: Add missing braces around if-statement.
Fixes a performance problem caused by commit b639ed2f.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90895
2015-06-18 16:45:55 -07:00
Jordan Justen
2310a65c28 i965/compute: Fix undefined code with right_mask for SIMD32
Although we don't support SIMD32, krh pointed out that the left shift
by 32 is undefined by C/C++ for 32-bit integers.

Suggested-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-18 11:24:39 -07:00
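The undefined shift mentioned in the i965/compute commit above is a general C pitfall: for a 32-bit operand, a shift count of 32 is out of range. A minimal sketch of one common way to build a low-bits mask without it follows; low_bits_mask() is a hypothetical helper, and this is not claimed to be the code the driver uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return a mask with the low `bits` bits set,
 * for 1 <= bits <= 32.
 *
 * The obvious form, (1u << bits) - 1, shifts by 32 when bits == 32,
 * which is undefined for a 32-bit operand.  Shifting an all-ones value
 * right by (32 - bits) keeps the shift count in the range 0..31. */
static uint32_t low_bits_mask(unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    return UINT32_MAX >> (32 - bits);
}
```

The precondition excludes bits == 0, since that case would again need a shift by 32.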
Ilia Mirkin
770f141866 mesa: add GL_PROGRAM_PIPELINE support in KHR_debug calls
This was apparently missed when ARB_sso support was added.
Add label support to pipeline objects just like all the other
debug-related objects.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-06-18 13:21:44 -04:00
Ilia Mirkin
b6e238023c glsl: add version checks to conditionals for builtin variable enablement
A number of builtin variables have checks based on the extension being
enabled, but were missing enablement via a higher GLSL version.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-06-18 13:21:44 -04:00
Ilia Mirkin
c40e7ee7c4 glsl: handle conversions to double when comparing param matches
This allows mod(int, int) to become selected as float mod when doubles
are supported.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-06-18 13:21:44 -04:00
Emil Velikov
6b0378e483 ilo: remove missing ilo_fence.h from the sources list
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-18 12:59:28 +01:00
Boyan Ding
997fc807b2 egl/x11: Set version of swrastLoader to 2
which it actually implements instead of the newest version defined in
dri_interface.h

Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-18 12:57:59 +01:00
Eric Anholt
1d45e44b2f vc4: Move tile state/alloc allocation into the kernel.
This avoids a security issue where userspace could have written the tile
state/tile alloc behind the GPU's back, and will apparently be necessary
for fixing stability bugs (tile state buffers are missing some top bits
for the tile alloc's address).
2015-06-17 23:53:49 -07:00
Eric Anholt
9adcd2d80a vc4: Move RCL generation into the kernel.
There weren't that many variations of RCL generation, and this lets us
skip all the in-kernel validation for what we generated.
2015-06-17 23:53:49 -07:00
Eric Anholt
91c73a9a28 vc4: Add dumping of VC4_PACKET_TILE_BINNING_MODE_CONFIG. 2015-06-17 23:53:49 -07:00
Eric Anholt
dc1fbad2eb vc4: Fix memory leak from simple_list conversion.
I accidentally shadowed the outside declaration, so we always returned
NULL even when we'd found something in the cache.
2015-06-17 23:53:49 -07:00
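The shadowing mistake described in the vc4 memory leak commit above is easy to reproduce in isolation. The sketch below uses a hypothetical lookup table rather than the vc4 cache; all names in it are illustrative.

```c
#include <stddef.h>
#include <string.h>

struct entry {
    const char *name;
    int value;
};

/* Buggy version: the inner declaration creates a second variable named
 * `found`, so the outer one is never assigned and NULL is returned even
 * on a hit.  A caller that allocates on NULL then leaks on every call. */
static const struct entry *
lookup_shadowed(const struct entry *table, size_t n, const char *name)
{
    const struct entry *found = NULL;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0) {
            const struct entry *found = &table[i]; /* shadows the outer */
            (void)found;
        }
    }
    return found;
}

/* Fixed version: assign to the outer variable instead of redeclaring. */
static const struct entry *
lookup_fixed(const struct entry *table, size_t n, const char *name)
{
    const struct entry *found = NULL;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0) {
            found = &table[i];
            break;
        }
    }
    return found;
}
```

Compilers such as GCC and Clang can flag the first form when -Wshadow is enabled.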
Eric Anholt
62d153ea37 vc4: Track the number of BOs allocated and their size.
This is useful for BO leak debugging.
2015-06-17 23:53:49 -07:00
Iago Toral Quiroga
2b1cdb0edd i965: Fix textureGrad with cube samplers
We can't use sampler messages with gradient information (like
sample_g or sample_d) to deal with this scenario because according
to the PRM:

"The r coordinate and its gradients are required only for surface
types that use the third coordinate. Usage of this message type on
cube surfaces assumes that the u, v, and gradients have already been
transformed onto the appropriate face, but still in [-1,+1] range.
The r coordinate contains the faceid, and the r gradients are ignored
by hardware."

Instead, we should lower this to compute the LOD manually based on the
gradients and use a different sample message that takes the computed
LOD instead of the gradients. This is already being done in
brw_lower_texture_gradients.cpp, but it is restricted to shadow
samplers only, although there is a comment stating that we should
probably do this also for samplerCube and samplerCubeArray.

Because of this, both dEQP and Piglit test cases for textureGrad with
cube maps currently fail.

This patch does two things:
1) Activates the texturegrad lowering pass for all cube samplers.
2) Corrects the computation of the LOD value for cube samplers.

I had to do 2) because for cube maps the calculations implemented
in the lowering pass always compute a value of rho that is twice
the value we want (so we get a LOD value one unit larger than we
want). This only happens for cube map samplers (all kinds). I am
not sure about why we need to do this, but I suspect that it is
related to the fact that cube map coordinates, when transported
to a specific face in the cube, are in the range [-1, 1] instead of
[0, 1] so we probably need to divide the derivatives by 2 when
we compute the LOD. Doing that would produce the same result as
dividing the final rho computation by 2 (or removing a unit
from the computed LOD, which is what we are doing here).

Fixes the following piglit tests:
bin/tex-miplevel-selection textureGrad Cube -auto -fbo
bin/tex-miplevel-selection textureGrad CubeArray -auto -fbo
bin/tex-miplevel-selection textureGrad CubeShadow -auto -fbo

Fixes 10 dEQP tests in the following category:
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.*cube*

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-06-18 08:35:46 +02:00
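The equivalence claimed in the textureGrad commit above (halving rho versus subtracting one from the LOD) follows directly from the logarithm. As a short derivation, assuming the usual definition of the base LOD as the base-2 logarithm of rho, and writing \rho' for the doubled value the lowering pass is said to compute:

```latex
\rho' = 2\rho
\quad\Longrightarrow\quad
\lambda = \log_2 \rho
        = \log_2\!\left(\frac{\rho'}{2}\right)
        = \log_2 \rho' - 1
```

So dividing the final rho by 2 and removing one unit from the computed LOD give the same result.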
Kristian Høgsberg Kristensen
aedd3c9579 vk: Add missing gen7 RENDER_SURFACE_STATE struct 2015-06-17 21:42:29 -07:00
Ilia Mirkin
36e3eb6a95 nvc0/ir: can't have a join on a load with an indirect source
Triggers an INVALID_OPCODE warning on GK208. Seems rare enough to not
warrant verification on other chips. Fixes the new piglits:

  ubo_array_indexing/fs-nonuniform-control-flow.shader_test
  ubo_array_indexing/vs-nonuniform-control-flow.shader_test

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-06-17 22:23:20 -04:00
Connor Abbott
bf5a615659 composites composites composites 2015-06-17 16:25:38 -07:00
Kevin Rogovin
ff06901082 docs: mark GL_ARB_framebuffer_no_attachments done for i965
Mark GL_ARB_framebuffer_no_attachments as done for i965.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
2015-06-17 14:39:03 +03:00
Kevin Rogovin
8319999831 i965: enable ARB_framebuffer_no_attachments for Gen7+
Enable GL_ARB_framebuffer_no_attachments in i965 for Gen7 and higher.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
2015-06-17 14:39:03 +03:00
Kevin Rogovin
9ded636975 i965: execution of frag-shader when it has atomic buffer
Ensure that the GPU spawns the fragment shader thread for those
fragment shaders with atomic buffer access.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
2015-06-17 14:39:03 +03:00
Kevin Rogovin
bbb700967e mesa: function for testing if current frag-shader has atomics
Add a helper function that checks whether the currently active fragment
shader of a gl_context has atomic buffer access.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
2015-06-17 14:39:03 +03:00
Kevin Rogovin
41b6db225f i965: Use _mesa_geometric_ functions appropriately
Change references to gl_framebuffer::Width, Height, MaxNumLayers
and Visual::samples to use the _mesa_geometric_ convenience functions
for those places where the geometry of the gl_framebuffer is needed
(in contrast to the geometry of the intersection of the attachments
of the gl_framebuffer).

This patch is to pave the way to enable GL_ARB_framebuffer_no_attachments
on Gen7 and higher in i965.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
2015-06-17 14:39:03 +03:00
Kevin Rogovin
51f4b51151 mesa: helper function for scissor box of gl_framebuffer
Add helper convenience function that intersects the scissor values
against a passed bounding box. In addition, to avoid replicated code,
make the function _mesa_scissor_bounding_box() use this new function.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
2015-06-17 14:39:03 +03:00
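Intersecting a scissor rectangle against a bounding box, as the scissor helper commit above describes, amounts to clamping each edge. The sketch below is an illustration only: struct bbox and intersect_scissor() are hypothetical names and do not reproduce the Mesa helper's signature.

```c
#include <assert.h>

/* Hypothetical bounding box: [xmin, xmax) x [ymin, ymax). */
struct bbox {
    int xmin, xmax, ymin, ymax;
};

/* Shrink `box` to its intersection with the scissor rectangle given by
 * origin (x, y) and size (w, h).  If the two do not overlap, the box is
 * collapsed to an empty one (xmin == xmax or ymin == ymax). */
static void intersect_scissor(struct bbox *box, int x, int y, int w, int h)
{
    assert(w >= 0 && h >= 0);

    if (x > box->xmin)
        box->xmin = x;
    if (y > box->ymin)
        box->ymin = y;
    if (x + w < box->xmax)
        box->xmax = x + w;
    if (y + h < box->ymax)
        box->ymax = y + h;

    /* Keep the box well-formed when the intersection is empty. */
    if (box->xmin > box->xmax)
        box->xmin = box->xmax;
    if (box->ymin > box->ymax)
        box->ymin = box->ymax;
}
```

Taking the bounding box as a parameter means the same routine can serve both a framebuffer with attachments and one whose geometry comes from its default dimensions.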
Kevin Rogovin
74987977a3 mesa: add helper functions for geometry of gl_framebuffer
Add convenience helper functions for fetching geometry of gl_framebuffer
that return the geometry of the gl_framebuffer instead of the geometry of
the buffers of the gl_framebuffer when the gl_framebuffer has no
attachments.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
2015-06-17 14:39:03 +03:00
Kevin Rogovin
6aa12994bd mesa: Complete ARB_framebuffer_no_attachments in Mesa core
Implement GL_ARB_framebuffer_no_attachments in Mesa core
 - changes to conditions for framebuffer completeness
 - implement set/get functions for framebuffers for
   new functions in GL_ARB_framebuffer_no_attachments

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
2015-06-17 14:39:03 +03:00
Kevin Rogovin
c9d26f201a mesa: Constants and functions for ARB_framebuffer_no_attachments
Define the enumeration constants, function entry points and
glGet for the GL_ARB_framebuffer_no_attachments.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
2015-06-17 14:39:02 +03:00
Kevin Rogovin
da81999bee mesa: Define infrastructure for ARB_framebuffer_no_attachments
Define the infrastructure for the extension GL_ARB_framebuffer_no_attachments:
 - extension table
 - additions to gl_framebuffer

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
2015-06-17 14:39:02 +03:00
Eric Anholt
a0cd1a4060 vc4: Make sure that direct texture clamps have a minimum value of 0.
I was thinking of the MIN opcode in terms of unsigned math, but it's
signed, so if you used a negative array index, you could read before the
UBO.  Fixes segfaults under simulation in piglit array indexing tests with
mprotect-based guard pages.
2015-06-16 15:15:14 -07:00
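The signed-versus-unsigned point in the direct texture clamp commit above can be shown with plain integers. This is a sketch under the assumption that offsets are 32-bit signed values; clamp_offset() is a hypothetical helper, not the vc4 QIR sequence.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: clamp a signed offset into [0, max_offset].
 *
 * A signed MIN alone only enforces the upper bound: a negative offset
 * is already smaller than max_offset and passes through unchanged,
 * which would address memory before the start of the buffer.  A
 * second step (MAX with 0) enforces the lower bound. */
static int32_t clamp_offset(int32_t offset, int32_t max_offset)
{
    assert(max_offset >= 0);
    int32_t upper = offset < max_offset ? offset : max_offset; /* MIN */
    return upper > 0 ? upper : 0;                              /* MAX */
}
```

If MIN were unsigned, a negative offset would compare as a very large value and be clamped to max_offset, which is why the single-MIN form looks sufficient at first glance.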
Eric Anholt
d4d2736149 vc4: Swap around which src we spill to ra31/rb31.
I wanted to assert that src1 came from a non-unspilled register in shader
validation, and this easily gets us that.  And, as a bonus:

total instructions in shared programs: 93347 -> 92723 (-0.67%)
instructions in affected programs:     60524 -> 59900 (-1.03%)
2015-06-16 15:15:14 -07:00
Eric Anholt
507f3e708c vc4: R4 is not a valid register for clamped direct texturing.
Our array only goes to R3, and R4 is a special case that shouldn't be
used.
2015-06-16 15:15:14 -07:00
Eric Anholt
2eac356467 vc4: Factor out the live clamp register getter. 2015-06-16 15:15:14 -07:00
Eric Anholt
596532cc7d vc4: Drop the unused "stride" field of surfaces.
We're always looking at the slice anyway, when we would have needed it.
2015-06-16 15:15:14 -07:00
Eric Anholt
6dd55b4909 vc4: Handle refcounting the exec BO like we do in the kernel.
This reduces the diff to the kernel, and will be useful when I make the
kernel allocate more BOs as part of validation.
2015-06-16 15:15:14 -07:00
Eric Anholt
731ac05cc4 vc4: Use VC4_SET/GET_FIELD for some RCL packets. 2015-06-16 15:15:14 -07:00
Eric Anholt
e22a192784 vc4: Make symbolic values for packet sizes. 2015-06-16 15:15:14 -07:00
Eric Anholt
c2f8287601 vc4: Use symbolic values in texture ptype validation. 2015-06-16 15:15:14 -07:00
Eric Anholt
5fbbec9aae vc4: Move vc4_packet.h to the kernel/ directory, since it's also shared.
I want to notice discrepancies when I diff -u between Mesa and the kernel.
2015-06-16 15:15:14 -07:00
Anuj Phogat
e20345204d i965/gen9: Disable Mip Tail for YF/YS tiled surfaces
Disabling miptails fixed the buffer corruption happening in FBOs
which use a YF/YS tiled renderbuffer or texture as a color attachment.

Spec recommends disabling mip tails only for non-mip-mapped surfaces.
But, without disabling miptails I couldn't get correct data out of
mipmapped YF/YS tiled surface.

We need a better understanding of miptails before we start using them.
For now this patch helps move things forward.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-06-16 14:52:49 -07:00
Anuj Phogat
54591bb67f i965/gen9: Set vertical and horizontal surface alignments
Patch sets the alignments for texture and renderbuffer surfaces.

V3: Make changes inside horizontal_alignment() and
    vertical_alignment() (Topi)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-06-16 14:52:48 -07:00
Anuj Phogat
6c380d42b1 i965: Use BRW_SURFACE_* in place of GL_TEXTURE_*
Makes no functional changes in the code.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-06-16 14:52:48 -07:00
Anuj Phogat
af08530332 i965: Rename use_linear_1d_layout() and make it global
This function will be utilised in later patches.

V2: Make both pointers constants (Topi)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-06-16 14:52:48 -07:00
Anuj Phogat
0668756447 i965/gen9: Set tiled resource mode in surface state
This patch sets the tiled resource mode for texture and renderbuffer
surfaces.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-06-16 14:52:48 -07:00
Haixia Shi
6b8accb36b egl/dri2: implement platform_surfaceless
The surfaceless platform is for off-screen rendering only. Render node support
is required.

Only consider the render nodes. Do not use normal nodes as they require
auth hooks.

v3: change platform_null to platform_surfaceless
v4: make libdrm required for surfaceless
v5: remove modified include guards with defined(HAVE_SURFACELESS_PLATFORM)
v6: use O_CLOEXEC for drm fd

Signed-off-by: Haixia Shi <hshi@chromium.org>
Signed-off-by: Zach Reizner <zachr@google.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-16 13:55:26 -07:00
Neil Roberts
c753866cc4 i965/vec4: Fix the source register for indexed samplers
Previously when setting up the sample instruction for an indirect
sampler the vec4 backend was directly passing the pseudo opcode's
src0. However vec4_visitor::visit(ir_texture *) doesn't set the
texture operation's src0 -- it's left as BAD_FILE, which when
translated into a brw_reg gives the null register. In brw_SAMPLE,
gen6_resolve_implied_move() inserts a MOV from the inst->base_mrf and
sets the src0 appropriately. The indirect sampler case did not have a
call to gen6_resolve_implied_move().

The fs backend avoids this because the platforms that support dynamic
indexing of samplers (IVB+) have been converted to not use the
fake-MRF hack, and instead send from proper GRFs.

This patch makes it call gen6_resolve_implied_move before setting up
the indirect message. This is similar to what is done for constant
sampler numbers in brw_SAMPLE.

The Piglit tests for sampler array indexing didn't pick this up
because they were using a texture with a solid colour so it didn't
matter what texture coordinates were actually used. The tests have now
been changed to be more thorough in this commit:

http://cgit.freedesktop.org/piglit/commit/?id=4f9caf084eda7

With that patch the tests for gs and vs are currently failing on
Ivybridge, but this patch fixes them. There are no other changes to a
Piglit run on Ivybridge.

On Skylake the gs tests were failing even without the Piglit patch
because Skylake needs the source registers to work correctly in order
to send a message header to select SIMD4x2 mode.

(The explanation in the commit message is partially written by Matt
Turner)

Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-16 18:44:32 +01:00
Marek Olšák
aab55b0bc6 st/mesa: improve assertions in vp/fp translation
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-16 15:47:03 +02:00
Marek Olšák
42a3c1ec84 mesa: don't rebind constant buffers after every state change if GS is active
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-16 15:47:03 +02:00
Chris Forbes
358b6bb7a7 mesa: generalize sso stage interleaving check
For tessellation.

v2: cleanup by Marek Olšák

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-16 15:47:03 +02:00
Marek Olšák
8af11afc38 mesa: remove unused variables from gl_program
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-16 15:47:02 +02:00
Chris Forbes
fa49536ab1 glsl: add ir reader support for ir_barrier
Picked from the tessellation branch.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-16 15:47:02 +02:00
Marek Olšák
2f86c22e75 glsl: print locations of variables
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-16 15:47:02 +02:00
Marek Olšák
797f4eacea configure.ac: rename LLVM_VERSION_PATCH to avoid conflict with llvm-config.h
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-06-16 15:47:02 +02:00
Timothy Arceri
da6996485f Revert "glsl: remove restriction on unsized arrays in GLSL ES 3.10"
This reverts commit adee54f826.

Further down, the GLSL ES 3.10 spec says:

"If an array is declared as the last member of a shader storage block
and the size is not specified at compile-time, it is sized at run-time.
In all other cases, arrays are sized only at compile-time."

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-06-16 20:58:59 +10:00
Tapani Pälli
7d88ab42b9 mesa: set override_version per api version override
Before 9b5e92f get_gl_override was called only once, but now it is
called for multiple APIs (GLES2, GL), so the version always needs to be set.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90797
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
Tested-by: Martin Peres <martin.peres@linux.intel.com>
2015-06-16 13:52:01 +03:00
Neil Roberts
1a6220b416 i965: Fix aligning to the block size in intel_miptree_copy_slice
This function was trying to align the width and height to a multiple
of the block size for compressed textures. It was using align_w/h as a
shortcut to get the block size as up until Gen9 this always happens to
match. However in Gen9+ the alignment values are expressed as
multiples of the block size so in effect the alignment values are
always 4 for compressed textures as that is the minimum value we can
pick. This happened to work for most compressed formats because the
block size is also 4, but for FXT1 this was breaking because it has a
block width of 8.

This fixes some Piglit tests testing FXT1 such as

spec@3dfx_texture_compression_fxt1@fbo-generatemipmap-formats
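
As a rough illustration of the fix described above, the sketch below rounds a copy width up to the format's real block width instead of reusing an alignment value. The helper names (align_up, copy_width_in_texels) and the struct are hypothetical and are not taken from the i965 source; FXT1's block width of 8 is the only detail borrowed from the commit message.

```c
#include <stdint.h>

/* Hypothetical description of a compressed format's block size, in texels. */
struct block_size {
   uint32_t width;
   uint32_t height;
};

/* Round value up to the next multiple of alignment (alignment must be > 0). */
static uint32_t
align_up(uint32_t value, uint32_t alignment)
{
   return (value + alignment - 1) / alignment * alignment;
}

/* Width to copy for one slice: always a whole number of blocks. */
static uint32_t
copy_width_in_texels(uint32_t level_width, struct block_size bs)
{
   return align_up(level_width, bs.width);
}
```

For a 10-texel-wide level, aligning to 4 gives 12, which is not a whole number of 8-wide blocks, while aligning to the block width gives 16.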

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2015-06-16 11:28:44 +01:00
Ilia Mirkin
8b24388647 nv50,nvc0: clamp uniform size to 64k
The state tracker will pass through requests from buggy applications
which will have the buffer size larger than the max allowed (64k). Clamp
the size to 64k so that we don't get errors when uploading the constbuf
data.
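
A minimal sketch of the clamp, assuming the limit is expressed in bytes. The macro and function names are made up for illustration and do not come from the nv50/nvc0 drivers.

```c
#include <stdint.h>

/* Assumed limit: the 64k maximum named in the commit message, in bytes. */
#define EXAMPLE_MAX_CONSTBUF_SIZE (64u * 1024u)

/* Return the number of bytes to upload, never more than the limit. */
static uint32_t
example_clamp_constbuf_size(uint32_t requested_size)
{
   if (requested_size > EXAMPLE_MAX_CONSTBUF_SIZE)
      return EXAMPLE_MAX_CONSTBUF_SIZE;
   return requested_size;
}
```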

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-06-15 15:48:58 -04:00
Ilia Mirkin
a2af42c1d2 nvc0/ir: fix collection of first uses for texture barrier insertion
One of the places we have to insert texbars is in situations where the
result of the tex gets overwritten by a different instruction (e.g. in a
conditional statement). However in some situations it can actually
appear as though the original tex itself is an overwriting instruction.
This can naturally never really happen, so just ignore the tex
instruction when it comes up.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90347
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-06-15 14:31:00 -04:00
Eric Anholt
932d1613d1 egl: Drop check for driver != NULL.
Back in 2013, a patch was added (with 2 reviewers!) at the end of the
block to early exit the loop in this case, without noticing that the loop
already did.  I added another early exit case, again without noticing, but
Rob caught me.  Just drop the loop condition that apparently surprises
most of us, instead of leaving the end of the loop conspicuously not
exiting on success.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2015-06-15 10:32:23 -07:00
Eric Anholt
bcd8a64f32 gallium: Drop the gallium-specific Android sw winsys.
This was part of gallium_egl, and we now have the normal libEGL Android
winsys support to handle it.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-15 10:32:23 -07:00
Eric Anholt
6ce0b0e317 vc4: Add support for building on Android.
v2: Add a comment explaining why we link libmesa_glsl.  Drop warning
    option from freedreno.  Add vc4 to the documentation for
    BOARD_GPU_DRIVERS.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-15 10:32:23 -07:00
Eric Anholt
fd3234891f gallium: Enable build of NIR support on Android.
v2: Add a comment explaining why we link libmesa_glsl.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-15 10:32:23 -07:00
Eric Anholt
71aaf62fca egl/dri2: Fix Android Lollipop build on ARM.
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-15 10:32:23 -07:00
Anuj Phogat
8e9eec5cbf meta: Abort texture upload if pixels == null and no pixel unpack buffer set
in case of glTexImage{1,2,3}D(). Texture has already been allocated
at this point and we have no data to upload. Without this patch,
with create_pbo = true, we end up creating a temporary pbo and then
uploading uninitialized texture data.
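
The condition can be sketched as a small predicate. This is only an illustration of the check described in the message; the function name and its boolean parameter are hypothetical and do not mirror the meta code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical predicate: is there any source data for the upload? */
static bool
example_has_upload_source(const void *pixels, bool unpack_pbo_bound)
{
   /* With a pixel unpack buffer bound, pixels is an offset into that
    * buffer, so a NULL value is still a valid source (offset 0). */
   return pixels != NULL || unpack_pbo_bound;
}
```

A caller would bail out of the upload path early when this returns false, since the texture storage already exists and there is nothing to copy.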

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-06-15 09:07:28 -07:00
Anuj Phogat
a4ff47ade9 meta: Abort meta path if ReadPixels need rgb to luminance conversion
After recent addition of pbo testing in piglit test getteximage-luminance,
it fails on i965. This patch makes a sub test pass.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-06-15 09:07:28 -07:00
Anuj Phogat
ba2b1f8668 mesa: Turn need_rgb_to_luminance_conversion() in to a global function
This will be used by _mesa_meta_pbo_GetTexSubImage() in a later patch.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-06-15 09:07:28 -07:00
Anuj Phogat
0b13adcd08 mesa: Use helper function need_rgb_to_luminance_conversion()
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-06-15 09:07:28 -07:00
Anuj Phogat
82abdf209a mesa: Handle integer formats in need_rgb_to_luminance_conversion()
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-06-15 09:07:28 -07:00
Anuj Phogat
6c14b66e40 meta: Use is_power_of_two() helper function
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-06-15 09:07:28 -07:00
Anuj Phogat
278460279b i965: Check for miptree pitch alignment before using intel_miptree_map_movntdqa()
We have an assert() in intel_miptree_map_movntdqa() which expects
the pitch to be 16 byte aligned.
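
One way to picture the check, as a sketch only: test the pitch before picking the streaming-load path and fall back to a generic path otherwise. The enum, the function name, and the streaming_load_available parameter are assumptions made for this example, not the i965 code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mapping strategies. */
enum example_map_path {
   EXAMPLE_MAP_GENERIC,
   EXAMPLE_MAP_STREAMING_LOAD,
};

/* Pick the streaming-load path only when the pitch is 16-byte aligned. */
static enum example_map_path
example_choose_map_path(uint32_t pitch_in_bytes, bool streaming_load_available)
{
   bool pitch_aligned = (pitch_in_bytes % 16u) == 0;

   if (streaming_load_available && pitch_aligned)
      return EXAMPLE_MAP_STREAMING_LOAD;

   return EXAMPLE_MAP_GENERIC;
}
```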

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-06-15 09:07:28 -07:00
Anuj Phogat
84d27c32d2 i965: Remove break after return
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-06-15 09:07:28 -07:00
Jürgen Rühle
2e42deb29c nv50/ir: OP_JOIN is a flow instruction
OP_JOIN instructions are assumed to be flow instructions and mercilessly
cast to FlowInstruction.

This patch fixes an instance where an OP_JOIN is created as a plain
instruction. This can cause crashes in the ir printer.

[imirkin: add ->fixed = 1]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-06-15 11:46:32 -04:00
Emil Velikov
061c9bc204 docs: add news item and link release notes for mesa 10.6.0
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-15 08:57:56 +01:00
Emil Velikov
f9e0441328 docs: Add sha256sums for the 10.6.0 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 5d327b3735)
2015-06-15 08:57:55 +01:00
Emil Velikov
311abe7fbd docs: Update 10.6.0 release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 3b9cde5c81)
2015-06-15 08:57:55 +01:00
Chia-I Wu
94ab563671 ilo: add ilo_state_raster_{line,poly}_stipple
Initialize hardware stipple states when they are bound instead of on emission.
2015-06-15 15:06:11 +08:00
Chia-I Wu
7cb853d52a ilo: add ilo_state_sample_pattern
Move sample pattern initialization from ilo_render to
ilo_state_sample_pattern.
2015-06-15 15:06:11 +08:00
Chia-I Wu
8f37e8e64f ilo: add 3DSTATE_AA_LINE_PARAMETERS to ilo_state_raster
Utilize ilo_state_raster to avoid redundant state change.
2015-06-15 15:06:11 +08:00
Marek Olšák
b0a2280e45 gallium/util: add util_last_bit64
This will be needed by radeonsi.
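
For readers unfamiliar with the helper, here is a portable sketch of what a "last bit" function for 64-bit values typically computes. It assumes the fls-style convention (the least significant bit is position 1, and 0 means no bits are set); the name example_last_bit64 is hypothetical and this is not the gallium implementation.

```c
#include <stdint.h>

/* Return the 1-based position of the highest set bit, or 0 if value is 0. */
static unsigned
example_last_bit64(uint64_t value)
{
   unsigned position = 0;

   while (value != 0) {
      position++;
      value >>= 1;
   }

   return position;
}
```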

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-06-14 20:17:29 +02:00
Marek Olšák
2489054f66 glsl: fix "tesselation" typo
Trivial.
2015-06-14 20:17:29 +02:00
Marek Olšák
790510808e r600g: handle TGSI input/output array declarations correctly
Most of this code could be removed if r600g used tgsi_shader_info.
2015-06-14 20:17:29 +02:00
Chia-I Wu
117926debb ilo: merge ilo_state_3d*.[ch] to ilo_state.[ch]
With most of the code replaced by ilo_state_*, what was left no longer
belonged there.
2015-06-15 01:23:23 +08:00
Chia-I Wu
54e0a8ed5d ilo: add ilo_state_ps to ilo_shader_cso 2015-06-15 01:22:13 +08:00
Chia-I Wu
30fcb31c9b ilo: add ilo_state_{vs,hs,ds,gs} to ilo_shader_cso 2015-06-15 01:07:10 +08:00
Chia-I Wu
da6e45fcbc ilo: embed ilo_state_sbe in ilo_shader 2015-06-15 01:07:10 +08:00
Chia-I Wu
5a52627c4f ilo: embed ilo_state_vf in ilo_ve_state 2015-06-15 01:07:09 +08:00
Chia-I Wu
9bfa987fb0 ilo: embed ilo_state_urb in ilo_state_vector 2015-06-15 01:07:09 +08:00
Chia-I Wu
eaf2c73899 ilo: embed ilo_state_sol in ilo_shader 2015-06-15 01:07:09 +08:00
Chia-I Wu
960ca7d5e3 ilo: embed ilo_state_cc in ilo_blend_state 2015-06-15 01:07:09 +08:00
Chia-I Wu
402e155cd3 ilo: embed ilo_state_raster in ilo_rasterizer_state 2015-06-15 01:07:09 +08:00
Chia-I Wu
ded7d412d0 ilo: embed ilo_state_viewport in ilo_viewport_state 2015-06-15 01:06:45 +08:00
Chia-I Wu
4b5c0a8341 ilo: replace ilo_sampler_cso with ilo_state_sampler 2015-06-15 01:06:45 +08:00
Chia-I Wu
745ef2c07b ilo: replace ilo_view_surface with ilo_state_surface 2015-06-15 01:06:45 +08:00
Chia-I Wu
c10c1ac0cf ilo: replace ilo_zs_surface with ilo_state_zs 2015-06-15 01:06:44 +08:00
Chia-I Wu
6dad848d1a ilo: add ilo_state_ps
We want to make ilo_shader_cso a union of ilo_state_{vs,hs,ds,gs,ps}.
2015-06-15 01:06:44 +08:00
Chia-I Wu
df9f846ac6 ilo: add ilo_state_{vs,hs,ds,gs}
We want to make ilo_shader_cso a union of ilo_state_{vs,hs,ds,gs} and ps
payload.
2015-06-15 01:06:44 +08:00
Chia-I Wu
a0bb1c2d17 ilo: add ilo_state_sbe
We want to replace ilo_kernel_routing with ilo_state_sbe.
2015-06-15 01:06:44 +08:00
Chia-I Wu
1ccab943b6 ilo: add ilo_state_vf
We want to replace ilo_ve_state with ilo_state_vf.
2015-06-15 01:06:44 +08:00
Chia-I Wu
9c77ebef24 ilo: add ilo_state_urb 2015-06-15 01:06:44 +08:00
Chia-I Wu
3ff40be0ee ilo: add ilo_state_sol 2015-06-15 01:06:44 +08:00
Chia-I Wu
62bb643718 ilo: add ilo_state_cc
We want to replace ilo_dsa_state and ilo_blend_state with ilo_state_cc.
2015-06-15 01:06:44 +08:00
Chia-I Wu
6be8b6053d ilo: add ilo_state_raster
We want to replace ilo_rasterizer_state with ilo_state_raster.
2015-06-15 01:06:44 +08:00
Chia-I Wu
4fa7ed99a1 ilo: add ilo_state_viewport
We want to replace ilo_viewport_cso and ilo_scissor_state with
ilo_state_viewport.
2015-06-14 23:00:04 +08:00
Chia-I Wu
61fea171af ilo: add ilo_state_sampler
We want to replace ilo_sampler_cso with ilo_state_sampler.
2015-06-14 23:00:04 +08:00
Chia-I Wu
f5f2007322 ilo: add ilo_state_surface
We want to replace ilo_view_surface with ilo_state_surface.
2015-06-14 23:00:04 +08:00
Chia-I Wu
b91250a56b ilo: add ilo_state_zs
We want to replace ilo_zs_surface with ilo_state_zs.  One noteworthy
difference is that ilo_state_zs always aligns level 0 to 8x4 when HiZ is
enabled.  HiZ will not be enabled for 1D surfaces as a result.
2015-06-14 23:00:03 +08:00
Chia-I Wu
9af1fc590d ilo: update genhw headers
Generate these new enums

  enum gen_reorder_mode;
  enum gen_clip_mode;
  enum gen_front_winding;
  enum gen_fill_mode;
  enum gen_cull_mode;
  enum gen_pixel_location;
  enum gen_sample_count;
  enum gen_inputattr_select;
  enum gen_msrast_mode;
  enum gen_prefilter_op;

Correct the type of GEN6_SAMPLER_DW0_BASE_LOD.  Rename gen_logicop_function,
gen_sampler_mip_filter, gen_sampler_map_filter, gen_sampler_aniso_ratio, and
others.
2015-06-14 15:43:20 +08:00
Chia-I Wu
9cb0df4b50 ilo: add ilo_image_disable_aux()
When aux bo allocation fails, ilo_image_disable_aux() should be called to
disable aux buffer.
2015-06-14 15:43:20 +08:00
Chia-I Wu
f0de65cbc2 ilo: add array_size and level_count to ilo_image
We will use them for bounds checking.
2015-06-14 15:43:20 +08:00
Chia-I Wu
f9d2bbe967 ilo: add pipe_texture_target to ilo_image
Save the target in ilo_image instead of passing it around.
2015-06-14 15:43:20 +08:00
Chia-I Wu
9da9cf729f ilo: fix "Render Cache Read Write Mode"
It needs to be set to R/W only when using certain messages via DP render cache.
Since we only use RT writes with the render cache, we never need to set it.
2015-06-14 15:43:20 +08:00
Chia-I Wu
1885ac4908 ilo: avoid resource owning in core
It is up to the users whether to reference count the BOs or not.
2015-06-14 15:43:20 +08:00
Chia-I Wu
ab7229b9b6 ilo: assert core objects are zero-initialized
Core objects are usually embedded inside calloc()'ed objects and we expect
them to be zero-initialized.
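
A sketch of what such an assertion could look like, assuming a byte-wise check is acceptable. The helper example_is_zeroed and the example_state struct are invented for this illustration and are not the ilo code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if every byte in [ptr, ptr + size) is zero. */
static bool
example_is_zeroed(const void *ptr, size_t size)
{
   const unsigned char *bytes = ptr;

   for (size_t i = 0; i < size; i++) {
      if (bytes[i] != 0)
         return false;
   }

   return true;
}

/* Hypothetical core object that expects a calloc()'ed parent. */
struct example_state {
   int a;
   int b;
};

static void
example_state_init(struct example_state *state, int a)
{
   /* Catch callers that hand us stale or uninitialized memory. */
   assert(example_is_zeroed(state, sizeof(*state)));
   state->a = a;
}
```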
2015-06-14 15:43:20 +08:00
Tom Stellard
4d35eef326 radeon/llvm: Handle LLVM backend rename from R600 to AMDGPU
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-06-12 21:02:00 -07:00
Tom Stellard
3e74122337 gallivm: Only build lp_profile() body when PROFILE is defined
The only use of lp_profile() is wrapped in #if defined(PROFILE),
so there is no reason to build it unless this macro is defined.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-12 21:02:00 -07:00
Timothy Arceri
faf7670ee8 glsl: fix compile error message
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-06-13 12:21:26 +10:00
Kristian Høgsberg Kristensen
fa8a07748d vk: Compute CS exec mask and thread width max in pipeline
We compute the right mask and thread width max parameters as part of
pipeline creation and set them accordingly at vkCmdDispatch() and
vkCmdDispatchIndirect() time. These parameters depend only on the local
group size and the dispatch width of the program so we can figure this
out at pipeline create time.
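
A sketch of the arithmetic, under the assumption that "thread width max" is the number of SIMD threads per workgroup and the "right mask" enables the channels used by the last, possibly partial, thread. The struct and function names are hypothetical, and the exact values the hardware packet expects may differ (for example by an off-by-one encoding).

```c
#include <stdint.h>

/* Hypothetical container for the two values computed at pipeline creation. */
struct example_cs_params {
   uint32_t thread_width_max; /* SIMD threads needed per workgroup */
   uint32_t right_mask;       /* channel enables for the last thread */
};

/* Mask with the low `count` bits set; count must be in 1..32. */
static uint32_t
example_low_bits_mask(uint32_t count)
{
   return 0xffffffffu >> (32u - count);
}

static struct example_cs_params
example_compute_cs_params(uint32_t local_x, uint32_t local_y,
                          uint32_t local_z, uint32_t simd_size)
{
   uint32_t group_size = local_x * local_y * local_z;
   uint32_t remainder = group_size % simd_size;
   struct example_cs_params params;

   /* Round up so a partial thread still gets dispatched. */
   params.thread_width_max = (group_size + simd_size - 1) / simd_size;

   /* A full last thread enables all channels, a partial one only some. */
   params.right_mask =
      example_low_bits_mask(remainder != 0 ? remainder : simd_size);

   return params;
}
```

Both inputs (local group size and dispatch width) are fixed once the shader is compiled, which is why nothing here needs to wait for dispatch time.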
2015-06-12 18:21:50 -07:00
Kristian Høgsberg Kristensen
c103c4990c vk: Set binding table layout for CS
We weren't setting the binding table layout for the backend compiler.
2015-06-12 18:21:49 -07:00
Kristian Høgsberg Kristensen
2fdd17d259 vk: Generate CS prog_data into the pipeline instance
We were generating the prog_data into a local variable and never
initializing the pipeline->cs_prog_data one.
2015-06-12 18:21:49 -07:00
Ben Widawsky
935f1f60da i965/gen8+: Add aux buffer alignment assertions
This helped find the incorrect HALIGN values from the previous patches.

v2: Add PRM references for assertions (Chad)

v3: Remove duplicated part of commit message, assert num_samples > 1, instead of
num_samples > 0. (Chad)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-06-12 18:09:49 -07:00
Ben Widawsky
a2421623db i965/gen9: Set HALIGN_16 for all aux buffers
Just like the previous patch, but for the GEN9 constraints.

v2:
bugfix: Gen9 HALIGN was being set for all miptree buffers (Chad). To address
this, move the check to where the gen8 check is, and do the appropriate
conditional there.

v3:
Remove stray whitespace introduced in v2 (Chad)
Rework comment to show AUX_CCS and AUX_MCS specifically. Remove misworded part
about gen7 (Chad).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (v1)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-12 18:09:49 -07:00
Ben Widawsky
c4aa041a61 i965/gen8: Correct HALIGN for AUX surfaces
This restriction was attempted in this commit:
commit 4705346463
Author: Anuj Phogat <anuj.phogat@gmail.com>
Date:   Fri Feb 13 11:21:21 2015 -0800

   i965/gen8: Use HALIGN_16 if MCS is enabled for non-MSRT

However, the commit itself doesn't achieve the desired goal, as shown by the
asserts which the next patch adds. mcs_mt is NULL (never set) because we're
still in the process of allocating the mcs_mt miptree when we get to this
function. I didn't check, but perhaps this would work with blorp. With meta
clears, however, the miptree structure (which AFAICT also needs the alignment)
is allocated way before the aux buffer, because the renderbuffer is allocated
way before the aux buffer.

The restriction is referenced in a few places, but the most concise one [IMO]
from the spec is for Gen9. Gen8 loosens the restriction in that it only requires
this for non-msrt surfaces.

   When Auxiliary Surface Mode is set to AUX_CCS_D or AUX_CCS_E, HALIGN 16 must
   be used.

With the code before the miptree layout flag rework (patches preceding this),
accomplishing this workaround is very difficult.

v2:
bugfix: Don't set HALIGN16 for gens before 8 (Chad)

v3:
non-trivial rebase

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Cc: Neil Roberts <neil@linux.intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-12 18:09:49 -07:00
Ben Widawsky
e92fbdcf9c i965: Extract tiling from fast clear decision
There are several constraints when determining if one can fast clear a surface.
Some of these are alignment, pixel density, tiling formats, and others that vary
by generation. The helper function which exists today does a suitable job,
however it conflates "BO properties" with "Miptree properties" when using
tiling. I consider the former to be attributes of the physical surface, things
which are determined through BO allocation, and the latter being attributes
which are derived from the API, and having nothing to do with the underlying
surface.

Determining tiling properties and creating miptrees are related operations
(when we allocate a BO for a miptree) with some disjoint constraints. By
extracting the decisions into two distinct choices (tiling vs. miptree
properties), we gain flexibility throughout the code to make determinations
about when we can or cannot fast clear strictly on the miptree.

To signify this change, I've also renamed the function to indicate it is a
distinction made on the miptree. I am torn as to whether or not it was a good
idea to remove "non_msrt" since it's a really nice thing for grep.

v2:
Reword some comments (Chad)
intel_is_non_msrt_mcs_tile_supported->intel_tiling_supports_non_msrt_mcs (Chad)
Make full if ladder for gens in above function (Chad)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Cc: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2015-06-12 18:09:49 -07:00
Ben Widawsky
b91a110d5c i965/gen9: Only allow Y-Tiled MCS buffers
For GEN9, much of the logic to use X-Tiled buffers has been stripped out. It is
still supported in some places, but it's never desirable. Unfortunately we don't
yet have the ability to have Y-Tiled scanout (see:
http://patchwork.freedesktop.org/patch/46984/).

NOTE: This patch shouldn't actually do anything since SKL doesn't yet use fast
clears (they are disabled because they are causing regressions). Therefore, the
only case we can get to this function on SKL is by way of
intel_update_winsys_renderbuffer_miptree.

v2: Update commit message to be more clear that the NOTE is for SKL only.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-06-12 18:09:48 -07:00
Ben Widawsky
b5c5aac687 i965: Consolidate certain miptree params to flags
I think pretty much everyone agrees that having more than a single bool as a
function argument is bordering on a bad idea. What sucks about the current
code is in several instances it's necessary to propagate these boolean
selections down to lower layers of the code. This requires plumbing (mechanical,
but still churn) pretty much all of the miptree functions each time.  By
introducing the flags parameter, it is possible to add miptree constraints very
easily.

This is useful because, as is already the case, we sometimes have information
at the time we create the miptree that needs to be known all the way at the
lowest levels of the create/allocation; disable_aux_buffers is currently one
such example. There will be another example coming up in a few patches.
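
To make the idea concrete, here is a sketch of a flags word replacing several bool arguments. Every name below (the EXAMPLE_LAYOUT_* flags, example_miptree, and both functions) is hypothetical and chosen for illustration; the real i965 flag names and function signatures may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout flags, declared as an anonymous enum. */
enum {
   EXAMPLE_LAYOUT_ACCELERATED_UPLOAD = 1 << 0,
   EXAMPLE_LAYOUT_DISABLE_AUX        = 1 << 1,
};

struct example_miptree {
   uint32_t width;
   uint32_t height;
   bool accelerated_upload;
   bool aux_disabled;
};

/* Lowest layer: the only place that needs to interpret the flags. */
static void
example_miptree_layout(struct example_miptree *mt, uint32_t layout_flags)
{
   mt->accelerated_upload =
      (layout_flags & EXAMPLE_LAYOUT_ACCELERATED_UPLOAD) != 0;
   mt->aux_disabled = (layout_flags & EXAMPLE_LAYOUT_DISABLE_AUX) != 0;
}

/* Upper layer: passes the flags through without knowing what they mean. */
static struct example_miptree
example_miptree_create(uint32_t width, uint32_t height, uint32_t layout_flags)
{
   struct example_miptree mt = { .width = width, .height = height };

   example_miptree_layout(&mt, layout_flags);
   return mt;
}
```

Adding a new constraint then means adding one enum value and one check at the bottom layer, with no signature changes in between.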

v2:
Tab fix. (Ben)
Long line fixes (Topi)
Use anonymous enum instead of #define for layout flags (Chad)
Use 'X != 0' instead of !!X (everyone except Chad)

v3:
Some non-trivial conflict resolution on top of Anuj's patches.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Cc: "Pohjolainen, Topi" <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-06-12 18:09:48 -07:00
Timothy Arceri
0d2068a92d glsl: enforce restriction on AoA interface blocks in GLSL ES 3.10
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-06-13 08:31:21 +10:00
Timothy Arceri
94d669b0d2 glsl: enforce fragment shader input restrictions in GLSL ES 3.10
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-06-13 08:31:15 +10:00
Timothy Arceri
3d78bdea31 glsl: enforce output variable rules for GLSL ES 3.10
Some rules are already applied; this just adds the missing ones.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-06-13 08:31:09 +10:00
Jordan Justen
f0e772392f i965/nir: Support barrier intrinsic function
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-06-12 15:12:40 -07:00
Jordan Justen
f7ef8ec9d8 i965/fs: Implement support for ir_barrier
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-06-12 15:12:40 -07:00
Jordan Justen
7953c00073 i965: Add brw_barrier to emit a Gateway Barrier SEND
This will be used to implement the Gateway Barrier SEND needed to implement
the barrier function.

v2:
 * notify => gateway_notify (Ken)
 * combine short lines of brw_barrier proto/decl (mattst88)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-06-12 15:12:40 -07:00
Jordan Justen
0d250cc210 i965: Add brw_WAIT to emit wait instruction
This will be used to implement the barrier function.

v2:
 * Rename to brw_WAIT (mattst88)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-12 15:12:40 -07:00
Jordan Justen
b925f1a1df i965: Add notification register
This will be used by the wait instruction when implementing the barrier()
function.

v2:
 * Changes suggested by mattst88

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-12 15:12:40 -07:00
Jordan Justen
bdbbec33cf i965: Disassemble Gateway SEND messages
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-06-12 15:12:40 -07:00
Jordan Justen
69659546a6 i965/inst: Add gateway_notify and gateway_subfuncid fields
These fields will be used when emitting a send for the barrier function.

Reference: IVB PRM Volume 4, Part 2, Section 1.1.1 Message Descriptor

v2:
 * notify => gateway_notify (Ken)
 * define bits for gen4-gen6 (bwidawsk, Ken)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-12 15:12:40 -07:00
Jordan Justen
1b9cc257d4 i965: Add GATEWAY_SFID definitions
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-06-12 15:12:40 -07:00
Jordan Justen
2867f2e8cd nir: Add barrier intrinsic function
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-06-12 15:12:40 -07:00
Chris Forbes
86855365b4 glsl: Add builtin barrier() function
[jordan.l.justen@intel.com: Add CS support]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-12 15:12:39 -07:00
Chris Forbes
e7f628c2fc glsl: Add ir node for barrier
v2:
 * Changes suggested by mattst88

[jordan.l.justen@intel.com: Add nir support]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-06-12 15:12:39 -07:00
Jordan Justen
86b4acb409 i965/cs: Use exec all for CS terminate
This prevents an assertion from being hit with SIMD16:

Assertion `inst->exec_size == dispatch_width() || force_writemask_all' failed.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-06-12 15:12:39 -07:00
Chad Versace
cfc175b409 i965/fs: Fix unused variable warning
Annotate offset_components with attribute 'unused'.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-12 12:37:25 -07:00
Emil Velikov
d15c06b514 vc4: automake: enable subdir-objects
Silence the warnings about the future incompatibility with automake 2.0

Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-12 15:42:22 +01:00
Erik Faye-Lund
634f200256 mesa: build xmlconfig to a separate static library
As we use the file from both the dri modules and loader, we end up with
multiple definitions of the symbols provided in our gallium dri modules.
Additionally we compile the file twice.

Resolve both issues, effectively enabling the build on toolchains which
don't support -Wl,--allow-multiple-definition.

v2: [Emil Velikov]
 - Fix the Scons/Android build.
 - Resolve libgbm build issues (bring back the missing -lm)

Cc: Julien Isorce <j.isorce@samsung.com>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90310
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90905
Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-12 15:32:18 +01:00
Emil Velikov
83b5648a1e targets/nine: link against libnir/libglsl_util
Based on commit 101142c4010(xa: support for drivers which use NIR)

Cc: "10.6" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90466
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-12 15:32:18 +01:00
Emil Velikov
ba512cc7fa pipe-loader: add libnir and libglsl_util to the link
Based on commit 101142c4010(xa: support for drivers which use NIR)

Cc: Rob Clark <robclark@freedesktop.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90466
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-12 15:32:18 +01:00
Emil Velikov
1df5a6c71e mesa: add a dummy _mesa_error_no_memory() symbol to libglsl_util
Rather than forcing everyone to provide their own definition of the symbol
provide a common (dummy) one.

This helps us resolve the build of the standalone pipe-drivers (amongst
others), which are missing the symbol.

Cc: Rob Clark <robclark@freedesktop.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-12 15:32:18 +01:00
Emil Velikov
4722743f4b gallium: use $(top_builddir) when referencing static archives
Just like every other place in gallium.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-12 15:32:17 +01:00
Emil Velikov
3f5dc9b94f freedreno: use CXX linker rather than explicit link against libstdc++
Cc: Rob Clark <robclark@freedesktop.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-12 15:32:17 +01:00
Emil Velikov
0e55db3b8a egl/haiku: coding style fixes
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Acked-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-12 15:17:53 +01:00
Emil Velikov
b0f33e9736 egl/haiku: plug some obvious memory leaks
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Acked-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-12 15:17:48 +01:00
Emil Velikov
e77a32fcae egl/haiku: minor surface management cleanups
Drop the stub/unused function haiku_create_surface() and add some basic implementation for destroy_surface()

Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Acked-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-12 15:17:44 +01:00
Emil Velikov
d38a80ba6c egl/haiku: kill off haiku_log()
It's an incomplete copy of the default _eglLog() implementation. Just
use the default logger.

Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Acked-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-12 15:17:40 +01:00
Emil Velikov
667fe2f5e9 egl/haiku: we don't use src/loader, drop all the references to it
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Acked-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-12 15:17:37 +01:00
Emil Velikov
d0af283303 egl/haiku: remove unused variables in struct haiku_egl_driver
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Acked-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-12 15:17:34 +01:00
Emil Velikov
46f87b2c19 egl/haiku: handle memory allocation failure
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Acked-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-12 15:17:27 +01:00
Emil Velikov
ed9dcdf927 egl/haiku: use CALL/TRACE/ERROR over _eglLog() for haiku specifics
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Acked-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-12 15:17:24 +01:00
Emil Velikov
0b652fedb5 egl/haiku: remove commented out code
It serves little to no purpose. As the driver gets updated, one can
look at the existing implementation (dri2) for reference rather than
letting the commented functions bitrot.

Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Acked-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-12 15:17:20 +01:00
Emil Velikov
c3036f4bb1 egl/haiku: use correct version variable
An earlier commit folded the two separate variables into one, but forgot to
update the haiku driver.

Fixes: 0e4b564ef28(egl: combine VersionMajor and VersionMinor into one
variable)
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Acked-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-12 15:17:12 +01:00
Jose Fonseca
0dde821bcc trace: Add missing p_compiler.h include.
For boolean.

Trivial.
2015-06-12 12:14:11 +01:00
Francisco Jerez
8d3c48eed2 i965/fs: Remove one more fixed brw_null_reg() from the visitor.
Instead use fs_builder::null_reg_f() which has the correct register
width.  Avoids the assertion failure in fs_builder::emit() hit by the
"ES3-CTS.shaders.loops.for_dynamic_iterations.unconditional_break_fragment"
GLES3 conformance test introduced by 4af4cfba9e.

Reported-and-reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-06-12 11:17:25 +03:00
Kristian Høgsberg Kristensen
00494c6cb7 vk: Document how depth/stencil formats work in anv_image_create()
This reverts commits

  e17ed04 * vk/image: Don't double-allocate stencil buffers
  1ee2d1c * vk/image: Teach anv_image_choose_tile_mode about WMAJOR

and instead adds a comment to describe the subtlety of how we create
images for stencil only formats.
2015-06-11 22:07:16 -07:00
Kristian Høgsberg Kristensen
fbc9fe3c92 vk: Use compute pipeline layout when binding compute sets 2015-06-11 21:57:43 -07:00
Kenneth Graunke
16658f426d Revert "i965: Advertise a line width of 40.0 on Cherryview and Skylake."
This reverts commit f3b709c0ac.

The "dEQP-GLES3.functional.rasterization.fbo.rbo_multisample_4.
interpolation.lines_wide" test appears to be broken on Cherryview when
we expose line widths greater than 12.0.  I'm not sure why.

For now, just go back to the limits we used on older platforms.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90902
Acked-by: Matt Turner <mattst88@gmail.com>
2015-06-11 16:59:49 -07:00
Kristian Høgsberg Kristensen
765175f5d1 vk: Implement basic compute shader support 2015-06-11 15:31:42 -07:00
Kristian Høgsberg Kristensen
7637b02aaa vk: Emit PIPELINE_SELECT on demand 2015-06-11 15:21:49 -07:00
Kristian Høgsberg Kristensen
405697eb3d vk: Stop asserting we have a fragment shader
Even for graphics, this is not a requirement; we can have a depth-only output pipeline.
2015-06-11 15:07:38 -07:00
Kristian Høgsberg Kristensen
e7edde60ba vk: Defer setting viewport dynamic state
We can't emit this until we've done a 3D pipeline select.
2015-06-11 15:04:09 -07:00
Kristian Høgsberg Kristensen
f7fe06cf0a vk: Disable shader stages in the graphics pipeline batch
We need to move this into the graphics pipeline batch so we don't emit it
for compute pipelines.
2015-06-11 14:58:31 -07:00
Kristian Høgsberg Kristensen
9aae480cc4 vk: Don't emit STATE_SIP
We don't have a SIP kernel and don't enable exceptions.
2015-06-11 14:56:29 -07:00
Kristian Høgsberg Kristensen
923e923bbc vk: Compile fragment shader after VS and GS
Just moving code around to do shader stages in the natural order.
2015-06-11 14:55:50 -07:00
Kenneth Graunke
f4310cdbd0 i965: Re-index SSA definitions before printing NIR code.
This makes the SSA definitions use sequential numbers (0, 1, 2, ...)
instead of seemingly random ones.  There's not much point normally,
but it makes debug output much easier to read.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-06-11 11:17:52 -07:00
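The renumbering described in f4310cdbd0 above amounts to walking the definitions in program order and handing out consecutive indices. A minimal sketch in plain C, assuming a hypothetical `toy_ssa_def` struct and `toy_index_ssa_defs()` helper that stand in for the real NIR data structures:

```c
#include <stddef.h>

/* Hypothetical stand-in for an SSA definition; not the NIR type. */
struct toy_ssa_def {
   unsigned index;
};

/* Give every definition a sequential index (0, 1, 2, ...) in the order
 * the definitions appear, and return how many indices were handed out. */
static unsigned
toy_index_ssa_defs(struct toy_ssa_def *defs, size_t count)
{
   unsigned next = 0;

   for (size_t i = 0; i < count; i++)
      defs[i].index = next++;

   return next;
}
```

The old indices are simply overwritten, which is harmless when they only serve as labels in debug output.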
Jason Ekstrand
1dd63fcbed vk/entrypoints: Don't print every single function call 2015-06-11 10:10:13 -07:00
Brian Paul
1a6e4f46ed gallium: remove explicit values from PIPE_CAP_ enums
The other PIPE_CAPF_ and PIPE_SHADER_CAP_ enums don't have explicit values.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-11 10:27:17 -06:00
Kristian Høgsberg Kristensen
b581e924b6 vk: Remove left-over trp call 2015-06-11 09:26:49 -07:00
Kristian Høgsberg Kristensen
d76ea7644a vk: Set maximum point size range
We set both minimum and maximum point size to 0 in 3DSTATE_CLIP, which
will clip away all points.
2015-06-11 09:25:04 -07:00
Kristian Høgsberg Kristensen
a5b49d2799 vk: Use generated headers with fixed point support
The generated headers now convert float in the template struct to the
correct fixed point format.
2015-06-11 09:25:04 -07:00
Kristian Høgsberg Kristensen
ea7ef46cf9 vk: Regenerate headers with __gen_validate_value() 2015-06-11 09:25:03 -07:00
Jason Ekstrand
a566b1e08a vk/formats: Refactor format properties code
Along with the refactor, we now do the right thing when we hit an
unsupported format: Set the flags to 0 and return VK_SUCCESS.
2015-06-11 09:11:16 -07:00
Jose Fonseca
9fed4f9bf5 mesa/main: Don't use ONCE_FLAG_INIT as a r-value.
It should only be used as an initializer expression.

Trivial, and fixes Windows builds.

Nevertheless, overwriting a once_flag like this seems dangerous and
should be revised.
2015-06-11 13:35:23 +01:00
Iago Toral Quiroga
0f1fe649b7 i965/gen8: Fix antialiased line rendering with width < 1.5
The same fix Marius implemented for gen6 (commit a9b04d8a) and
gen7 (commit 24ecf37a).

Also, we need the same code to handle special cases of line width
in gen6, gen7 and now gen8, so put that in the helper function
we use to compute the line width.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-11 13:40:15 +02:00
Martin Peres
5b61cb1236 glsl: fix constructing a vector from a matrix
Without this patch, the following constructs (not an extensive list)
would crash mesa:

- mat2 foo = mat2(1); vec4 bar = vec4(foo);
- mat3 foo = mat3(1); vec4 bar = vec4(foo);
- mat3 foo = mat3(1); ivec4 bar = ivec4(foo);

The first case is explicitly allowed by the GLSL spec, as seen on
page 101 of the GLSL 4.40 spec:

	"vec4(mat2) // the vec4 is column 0 followed by column 1"

The other cases are implicitly allowed also.

The actual changes are quite minimal. We first split each column of
the matrix to a list of vectors and then use them to initialize the
vector. An additional check to make sure that we are not trying to
copy 0 elements of a vector fixes the (i)vec4(mat3) case, as the last
vector (3rd column) is not needed at all.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-06-11 14:04:29 +03:00
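The column-by-column flattening described in 5b61cb1236 above can be illustrated in plain C. This is only a sketch of the idea, not the GLSL compiler's IR code; the helper name `vec_from_matrix` and the flat column-major array layout are assumptions made for the example:

```c
#include <stddef.h>

/* Hypothetical helper: fill `out` (n_out components) from a column-major
 * matrix with `cols` columns of `rows` components each.  Columns are
 * consumed in order and copying stops once the vector is full, so
 * trailing columns (e.g. column 2 of a mat3 when building a vec4) are
 * never read and no zero-length copy is attempted. */
static void
vec_from_matrix(float *out, size_t n_out,
                const float *mat, size_t cols, size_t rows)
{
   size_t filled = 0;

   for (size_t c = 0; c < cols && filled < n_out; c++) {
      /* Take as many components from this column as still fit. */
      size_t take = rows;
      if (take > n_out - filled)
         take = n_out - filled;

      for (size_t r = 0; r < take; r++)
         out[filled++] = mat[c * rows + r];
   }
}
```

For a mat2 source this yields column 0 followed by column 1, matching the spec text quoted in the commit. For a mat3 source, a vec4 takes all of column 0 plus the first component of column 1.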
Tapani Pälli
83624c141d mesa/es3.1: enable DRAW_INDIRECT_BUFFER_BINDING for gles3.1
(increases ES31-CTS.draw_indirect.basic.* passing tests)

v2: only expose DRAW_INDIRECT_BUFFER_BINDING for GL core + ES3.1

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-06-11 13:39:44 +03:00
Juha-Pekka Heikkila
56e9f3b493 mesa/main: avoid null access in format_array_table_init()
If _mesa_hash_table_create failed we'd get a null pointer. Report
the error and return.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-06-11 13:17:11 +03:00
Juha-Pekka Heikkila
fd00c738c0 mesa/main: Remove _mesa_HashClone()
I didn't find this being used anywhere.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-06-11 13:17:11 +03:00
Alexander Monakov
bd38f91f8d i965: do_blit_drawpixels: decode array formats
Correct a regression introduced by commit 922c0c9fd5 by converting "array
format", if received from _mesa_format_from_format_and_type, to mesa_format.

References: https://bugs.freedesktop.org/show_bug.cgi?id=90839
Signed-off-by: Alexander Monakov <amonakov@gmail.com>
Tested-by: AnAkkk <anakin.cs@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2015-06-11 00:15:37 -07:00
Iago Toral Quiroga
f9a18acb56 i965: do not round line width when multisampling or antialiasing are enabled
In commit fe74fee8fa we rounded the line width to the nearest integer to
match the GLES3 spec requirements stated in section 13.4.2.1, but that seems
to break a dEQP test that renders wide lines in some multisampling scenarios.

Ian noted that the OpenGL 4.4 spec has the following similar text:

    "The actual width of non-antialiased lines is determined by rounding the
    supplied width to the nearest integer, then clamping it to the
    implementation-dependent maximum non-antialiased line width."

and suggested that when ES removed antialiased lines, they removed
"non-antialiased" from that paragraph but probably should not have.

Going by that note, this patch restricts the quantization implemented in
fe74fee8fa only to regular aliased lines. This seems to keep the
tests fixed with that commit passing while fixing the broken test.

v2:
  - Drop one of the clamps (Ken, Marius)
  - Add a rule to prevent advertising line widths that when rounded go beyond
    the limits allowed by the hardware (Ken)
  - Update comments in the code accordingly (Ian)
  - Put the code in a utility function (Ian)

Fixes:
dEQP-GLES3.functional.rasterization.fbo.rbo_multisample_max.primitives.lines_wide

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90749

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-06-11 08:32:07 +02:00
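The rule that f9a18acb56 above settles on (quantize only plain aliased lines, then clamp) can be sketched as a small C helper. The function name, its parameters, and the explicit `min_width`/`max_width` limits are assumptions for illustration; the actual i965 utility function is not reproduced here:

```c
#include <stdbool.h>

/* Hypothetical helper: compute the width actually used for rasterization.
 * Rounding to the nearest integer applies only to regular aliased lines;
 * multisampled or antialiased lines keep their fractional width.  The
 * result is then clamped to the limits passed in by the caller. */
static float
effective_line_width(float width, bool multisample, bool line_smooth,
                     float min_width, float max_width)
{
   if (!multisample && !line_smooth) {
      /* Widths are assumed non-negative, so truncating width + 0.5
       * rounds to the nearest integer. */
      width = (float)(int)(width + 0.5f);
   }

   if (width < min_width)
      width = min_width;
   if (width > max_width)
      width = max_width;

   return width;
}
```

With limits of 1.0 and 7.0, a requested width of 2.6 becomes 3.0 for an aliased line but stays 2.6 when multisampling is enabled.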
Jason Ekstrand
2a3c29698c vk/image: Add a bunch of asserts 2015-06-10 21:04:51 -07:00
Jason Ekstrand
c8b62d109b vk: Add a couple vk_error calls 2015-06-10 21:04:13 -07:00
Jason Ekstrand
7153b56abc vk/private: Add a non-fatal assert 2015-06-10 21:03:50 -07:00
Jason Ekstrand
29d2bbb2b5 vk/cmd: Add an initial implementation of PipelineBarrier
We may want to do something more intelligent here later such as actually
handling image layout transitions.  However, this should do for now.
2015-06-10 16:37:33 -07:00
Kenneth Graunke
f83b9e58f6 i965: Momentarily pretend to support ARB_texture_stencil8 for blits.
Broadwell's stencil blitting code attempts to bind a renderbuffer as a
texture, using dd->BindRenderbufferTexImage().

This calls _mesa_init_teximage_fields(), which then attempts to set
img->_BaseFormat = _mesa_base_tex_format(ctx, internalFormat), which
assert fails if internalFormat is GL_STENCIL_INDEX8 but
ARB_texture_stencil8 is unsupported.

To work around this, just pretend to support the extension momentarily,
during the blit.  Meta has already munged a variety of other things in
the context (including the API!), so it's not that much worse than what
we're already doing.

Fixes regressions since commit f7aad9da20
(mesa/teximage: use correct extension for accept stencil texture.).

v2: Add an XXX comment explaining the situation (requested by Jason
    Ekstrand and Martin Peres), and an assert that we don't support
    the extension so we remember to remove this hack (requested by
    Neil Roberts).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-06-10 14:24:49 -07:00
Jason Ekstrand
047ed02723 vk/emit: Use valgrind to validate every packed field 2015-06-10 12:43:02 -07:00
Brian Paul
7217faf39f llvmpipe: simplify lp_resource_copy()
Just implement it in terms of util_resource_copy_region().  Both the
original code and util_resource_copy_region() boil down to mapping,
calling util_copy_box() and unmapping.

No piglit regressions.  This will also help to implement GL_ARB_copy_image.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-10 08:20:58 -06:00
Tapani Pälli
5b0d6f5c1b mesa: add GL_RED, GL_RG support for floating point textures
Mesa supports EXT_texture_rg and OES_texture_float. This patch adds
support for using unsized enums GL_RED and GL_RG for floating point
targets and writes proper checks for internalformat when format is
GL_RED or GL_RG and type is of GL_FLOAT or GL_HALF_FLOAT.

Later, internalformat will get adjusted by adjust_for_oes_float_texture
after these checks.

v2: simplify to check vs supported enums
v3: follow the style and break out if internalFormat ok (Kenneth)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90748
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-10 13:00:30 +03:00
Tapani Pälli
07e4f12e66 mesa: allow unsized formats GL_RG, GL_RED for GLES 3.0 with half float
v2: && -> ||, we enable on gles3 or if ARB_texture_rg is enabled

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90748
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-10 12:59:50 +03:00
Timothy Arceri
adee54f826 glsl: remove restriction on unsized arrays in GLSL ES 3.10
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-06-10 18:54:43 +10:00
Jason Ekstrand
9cae3d18ac vk: Add valgrind checks in various emit functions
The check in batch_bo_finish should catch any undefined values in the batch
but isn't that great for debugging.  The checks in the various emit
functions will help get better granularity.
2015-06-09 21:51:37 -07:00
Jason Ekstrand
d5ad24e39b vk: Move the valgrind include and VG() macro to private.h 2015-06-09 21:51:37 -07:00
Dave Airlie
563706c146 st/dri: check pscreen is valid before querying param
we don't check the validity of pscreen until dri_init_screen_helper

hit this trying to init glamor on a device with no driver (udl).

Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-06-10 14:10:33 +10:00
Dave Airlie
c6877c9e59 nouveau: set imported buffers to what the kernel gives us
When we import a dma-buf fd from another driver the kernel
gives us the right info, and this trashes it.

Convert the kernel bo flags into the domain flags.

This helps getting reverse prime and glamor working.

Cc: mesa-stable@lists.freedesktop.org
Acked-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-06-10 14:10:01 +10:00
Chad Versace
e17ed04b03 vk/image: Don't double-allocate stencil buffers
If the main surface has format S8_UINT, then don't allocate the
auxiliary stencil surface.
2015-06-09 16:39:28 -07:00
Chad Versace
1ee2d1c3fc vk/image: Teach anv_image_choose_tile_mode about WMAJOR 2015-06-09 16:38:55 -07:00
Chad Versace
2d2e148952 vk/util: Add anv_abortf(), anv_abortfv()
Convenience functions to print an error message then abort.
2015-06-09 16:38:50 -07:00
Chad Versace
ffb1ee5d20 vk: Define anv_noreturn macro 2015-06-09 16:38:46 -07:00
Chad Versace
f1db3b3869 vk/image: Factor tile mode selection into separate function
Because it will eventually need to get smarter.
2015-06-09 16:38:42 -07:00
Jason Ekstrand
11e941900a vk/device: Actually allow destruction 2015-06-09 16:28:46 -07:00
Jason Ekstrand
5d4b6a01af vk/cmd_buffer: Properly initialize/reset dynamic states 2015-06-09 16:27:55 -07:00
Jason Ekstrand
634a6150b9 vk/pipeline: Zero out the depth-stencil state when not in use 2015-06-09 16:26:55 -07:00
Jason Ekstrand
919e7b7551 vk/device: Use anv_CreateDynamicViewportState instead of the vk one 2015-06-09 16:01:56 -07:00
Jason Ekstrand
0599d39dd9 vk/device: Dedent the vkCreateDynamicViewportState call 2015-06-09 15:53:26 -07:00
Chad Versace
d57c4cf999 vk/util: Annotate anv_finishme() as printflike 2015-06-09 14:46:49 -07:00
Chad Versace
822cb16abe vk: Define anv_printflike() macro 2015-06-09 14:46:45 -07:00
Chad Versace
081f617b5a vk/image: Stop hardcoding alignment of stencil surfaces
Look up the alignment from anv_tile_info_table.
2015-06-09 14:16:56 -07:00
Chad Versace
e6bd568f36 vk/image: Rewrite tile info table
- Reduce the number of table lookups in anv_image_create from 4 to 1.
- Add field for surface alignment.
- Shorten field names tile_width, tile_height -> width, height.
2015-06-09 14:16:45 -07:00
Chad Versace
5b777e2bcf vk/image: Delete an old comment 2015-06-09 14:14:29 -07:00
Jason Ekstrand
d842a6965f vk/compiler: Free the GL errors data 2015-06-09 12:36:23 -07:00
Jason Ekstrand
9f292219bf vk/compiler: Free more of prog_data when tearing down a pipeline 2015-06-09 12:36:23 -07:00
Jason Ekstrand
66b00d5e5a vk/queue: Embed the queue in and allocate it with the device 2015-06-09 12:36:23 -07:00
Jason Ekstrand
38f5eef59d vk/device: Free border color states when we have valgrind 2015-06-09 12:36:23 -07:00
Jason Ekstrand
999b56c507 vk/device: Destroy all batch buffers
Due to a copy+paste error, we were destroying all but the first batch or
surface state buffer.  Now we destroy them all.
2015-06-09 12:36:23 -07:00
Jason Ekstrand
3a38b0db5f vk/meta: Clean up temporary objects 2015-06-09 12:36:23 -07:00
Jason Ekstrand
9d6f55dedf vk/surface_view: Add a destructor 2015-06-09 12:36:23 -07:00
Eric Anholt
9dca3beb62 vc4: Drop qir include from vc4_screen.h
We didn't need any of it except for the list header, and qir.h pulls in
nir.h, which is not really interesting to winsys.
2015-06-09 12:25:50 -07:00
Eric Anholt
8d10b2a046 vc4: Drop subdirectory in vc4 build.
Just because we put the source in a subdir, doesn't mean we need helper
libraries in the build.  This will also simplify the Android build setup.
2015-06-09 12:25:50 -07:00
Eric Anholt
e67b12eaf8 vc4: Update to current kernel validation code.
After profiling on real hardware, I found a few ways to cut down the
kernel overhead.
2015-06-09 12:25:50 -07:00
Chih-Wei Huang
c5e11e5f7f android: build with libcxx on android lollipop
On Lollipop, apparently stlport is gone and libcxx must be used instead.
We still support stlport when building on earlier android releases.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-06-09 12:25:50 -07:00
Chih-Wei Huang
1842832660 android: enable the radeonsi driver
Based on the nice work of Paulo Sergio Travaglia <pstglia@gmail.com>.

The main modifications are:

- Include paths for LLVM header files and shared/static libraries
- Set C++ flag "c++11" to avoid compiling errors on LLVM header files
- Set defines for LLVM
- Add GALLIVM source files
- Change path of libelf library for lollipop

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Anholt <eric@anholt.net>
2015-06-09 12:25:50 -07:00
Chih-Wei Huang
1e4081f54a android: generate files by $(call es-gen)
Use the pre-defined macro es-gen to generate new added files
instead of writing new rules manually. The handmade rules
that may generate the files before the directory is created
result in such an error:

/bin/bash: out/target/product/x86/gen/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/main/format_pack.c: No such file or directory
make: *** [out/target/product/x86/gen/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/main/format_pack.c] Error 1

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-06-09 12:25:49 -07:00
Chih-Wei Huang
c3b5afbd4e android: try to load gallium_dri.so directly
This avoids needing hardlinks between all of the DRI driver .so names,
since we're the only loader on the system.

v2: Add early exit on success (like previous block) and log message on
    failure.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-06-09 12:25:15 -07:00
Chad Versace
e6162c2fef vk/image: Add anv_image::h_align,v_align
Use the new fields to compute RENDER_SURFACE_STATE.Surface*Alignment.
We still hardcode them to 4, though.
2015-06-09 12:19:24 -07:00
Chih-Wei Huang
ac296aee58 android: Depend on gallium_dri from EGL, instead of linking in gallium.
The Android gallium build used to use gallium_egl, which was removed back
in March.  Instead, we will now use a normal Mesa libEGL loader with
dlopen()ing of a DRI module.

v2: add a clean step to rebuild all dri modules properly.
v3: Squish the 2 patches doing this together (change by anholt).

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-06-09 11:38:45 -07:00
Chih-Wei Huang
933df3d335 android: add rules to build a gallium_dri.so
This single .so includes all of the enabled gallium drivers.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-06-09 11:38:45 -07:00
Chih-Wei Huang
f4f609b27e android: add rules to build gallium/state_trackers/dri
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-06-09 11:38:45 -07:00
Chih-Wei Huang
581aa208fa android: export more dirs from libmesa_dri_common
The include paths of libmesa_dri_common are also used by modules
that need libmesa_dri_common.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-06-09 11:38:44 -07:00
Chih-Wei Huang
b8213bbe4c android: loader: export the path to be included
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-06-09 11:38:44 -07:00
Ben Widawsky
30ba4faf5d i965/gen9: Use raw PS invocation count for queries
Previously the number needed to be divided by 4 to get the proper results. Now
the hardware does the right thing. Through experimentation it seems Braswell
(CHV) does also need the division by 4.

Fixes piglit test:
arb_pipeline_statistics_query-frag

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-06-09 11:17:37 -07:00
Brian Paul
c10dc485f3 glsl: fix comment typo: s/accpet/accept/ 2015-06-09 10:49:35 -06:00
Brian Paul
37e0677870 mesa: remove some MAX_NV_FRAGMENT_PROGRAM_* macros
GL_NV_fragment_program support was removed a while ago.  This is just
some clean-up.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 10:49:35 -06:00
Jason Ekstrand
670862a506 fs/reg_allocate: Remove the MRF hack helpers from fs_visitor
These are helpers that only exist in this one file.  No reason to put them
in the visitor.

Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-06-09 09:22:56 -07:00
Jason Ekstrand
86e5afbfee i965/fs: Don't let the EOT send message interfere with the MRF hack
Previously, we just put the message for the EOT send as high in the file as
it would go.  This is because the register pre-filling hardware will stomp
all over the early registers in the file in preparation for the next thread
while you're still sending the last message.  However, if something happens
to spill, then the MRF hack interferes with the EOT send message and, if
things aren't scheduled nicely, will stomp on it.

Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90520
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-06-09 09:22:56 -07:00
Jose Fonseca
65bd4159b3 rtasm: Generalize executable memory allocator to all Unices.
We're only using fairly portable standard Unix calls here, so might as
well save ourselves future trouble by enabling on all Unices by default.

https://bugs.freedesktop.org/show_bug.cgi?id=90904

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-06-09 16:18:16 +01:00
Francisco Jerez
698c391521 i965/fs: Drop fs_inst::force_uncompressed.
This is now unused.  Saves a whole bit of memory per instruction.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:35 +03:00
Francisco Jerez
44928b799a i965/fs: Remove dead IR construction code from the visitor.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:35 +03:00
Francisco Jerez
51948085a2 i965/fs: Migrate test_fs_cmod_propagation to the IR builder.
v2: Use set_predicate/condmod.  Use fs_builder::OPCODE instead of
    ::emit.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:34 +03:00
Francisco Jerez
76c8142d0a i965/fs: Migrate test_fs_saturate_propagation to the IR builder.
v2: Use set_saturate.  Use fs_builder::OPCODE instead of ::emit.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:34 +03:00
Francisco Jerez
bf83a1a219 i965/fs: Migrate translation of NIR texturing instructions to the IR builder.
v2: Don't remove assignments of base_ir just yet.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:34 +03:00
Francisco Jerez
979fe2ffee i965/fs: Migrate translation of NIR intrinsics to the IR builder.
v2: Use fs_builder::SEL instead of ::emit.  Use set_condmod().

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:34 +03:00
Francisco Jerez
fe88c7ae38 i965/fs: Migrate translation of NIR ALU instructions to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:34 +03:00
Francisco Jerez
3632c28bde i965/fs: Migrate translation of NIR control flow to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:34 +03:00
Francisco Jerez
9976731485 i965/fs: Migrate NIR variable handling to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:34 +03:00
Francisco Jerez
09733f220a i965/fs: Migrate NIR emit_percomp() to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:34 +03:00
Francisco Jerez
d5cb2e5137 i965/fs: Migrate CS terminate message to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:34 +03:00
Francisco Jerez
e522f12f03 i965/fs: Migrate VS output writes to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:33 +03:00
Francisco Jerez
e32c16c47f i965/fs: Migrate FS framebuffer writes to the IR builder.
The explicit call to fs_builder::group() in emit_single_fb_write() is
required by the builder (otherwise the assertion in fs_builder::emit()
would fail) because the subsequent LOAD_PAYLOAD and FB_WRITE
instructions are in some cases emitted with a non-native execution
width.  The previous code would always use the channel enables for the
first quarter, which is dubious but probably worked in practice
because FB writes are never emitted inside non-uniform control flow
and we don't pass the kill-pixel mask via predication in the cases
where we have to fall-back to SIMD8 writes.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:33 +03:00
Francisco Jerez
840cbef416 i965/fs: Migrate FS alpha test to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:33 +03:00
Francisco Jerez
ad68853f17 i965/fs: Migrate FS discard handling to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:33 +03:00
Francisco Jerez
46f264638a i965/fs: Migrate FS gl_SamplePosition/ID computation code to the IR builder.
v2: Use fs_builder::AND/SHR/MOV instead of ::emit.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:33 +03:00
Francisco Jerez
31477226ec i965/fs: Migrate FS interpolation code to the IR builder.
v2: Fix some preexisting trivial codestyle issues.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:33 +03:00
Francisco Jerez
d3c10ad427 i965/fs: Migrate shader time to the IR builder.
v2: Change null register destination type to UD so it can be compacted.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:33 +03:00
Francisco Jerez
35e64f2a76 i965/fs: Migrate untyped surface read and atomic to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:33 +03:00
Francisco Jerez
db83d9d2d0 i965/fs: Migrate texturing implementation to the IR builder.
v2: Remove tabs from modified lines.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:33 +03:00
Francisco Jerez
546839ef63 i965/fs: Migrate pull constant loads to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:32 +03:00
Francisco Jerez
8f626c1498 i965/fs: Migrate Gen4 send dependency workarounds to the IR builder.
v2: Change brw_null_reg() to bld.null_reg_f().

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:32 +03:00
Francisco Jerez
4af4cfba9e i965/fs: Migrate lower_integer_multiplication to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:32 +03:00
Francisco Jerez
efa60e49f2 i965/fs: Migrate lower_load_payload to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:32 +03:00
Francisco Jerez
8f8c6b7bda i965/fs: Migrate register spills and fills to the IR builder.
Yes, it's incorrect to use the 0-th channel enable group
unconditionally without considering the execution and regioning
controls of the instruction that uses the spilled value, but it
matches the previous behaviour exactly, the builder just makes the
preexisting problem more obvious because emitting an instruction of
non-native SIMD width without having called .group() or .exec_all()
explicitly would have led to an assertion failure.

I'll fix the problem in a follow-up series, as the solution is going
to be non-trivial.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:32 +03:00
Francisco Jerez
3e6ac0bced i965/fs: Migrate try_replace_with_sel to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:32 +03:00
Francisco Jerez
6114ba4dcc i965/fs: Migrate opt_sampler_eot to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:32 +03:00
Francisco Jerez
a800ec04ad i965/fs: Migrate opt_peephole_sel to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:32 +03:00
Francisco Jerez
78f7c9edeb i965/fs: Create and emit instructions in one step in opt_peephole_sel.
This simplifies opt_peephole_sel() slightly by emitting the SEL
instructions immediately after they are created, which makes the
sel_inst and mov_imm_inst arrays unnecessary and will make it possible
to get rid of the explicit inserts when the pass is migrated to the IR
builder.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:32 +03:00
Francisco Jerez
74c2458ecf i965/fs: Migrate opt_cse to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:32 +03:00
Francisco Jerez
e7069fbc70 i965/fs: Don't drop force_writemask_all and _sechalf when copying a CSE temporary.
LOAD_PAYLOAD instructions need the same treatment as any other
generator instructions, at least FB writes and typed surface messages
will need a payload built with non-zero execution controls.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:31 +03:00
Francisco Jerez
497d238ae7 i965/vec4: Take into account all instruction fields in CSE instructions_match().
Most of these fields affect the behaviour of the instruction, but
apparently we currently don't CSE the kind of instructions for which
these fields could make a difference in the VEC4 back-end.  That's
likely to change soon though when we start using send-from-GRF for
texture sampling and surface access messages.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:31 +03:00
Francisco Jerez
8013b8147a i965/fs: Take into account all instruction fields in CSE instructions_match().
Most of these fields affect the behaviour of the instruction so it
could actually break the program if we CSE a pair of otherwise
matching instructions with different values of these fields.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:31 +03:00
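The point of 8013b8147a above is that two instructions are only interchangeable for CSE if every field that affects behaviour agrees. A minimal sketch of such a comparison, using a hypothetical `toy_inst` struct whose fields are chosen for illustration and do not mirror the real fs_inst layout:

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified instruction; not the i965 fs_inst. */
struct toy_inst {
   int opcode;
   int src[2];
   bool saturate;
   int conditional_mod;
   bool force_writemask_all;
   unsigned exec_size;
};

/* Two instructions match only if the opcode, the sources and every
 * behaviour-affecting control field are identical.  Skipping a field here
 * would let CSE merge instructions that compute different results. */
static bool
toy_instructions_match(const struct toy_inst *a, const struct toy_inst *b)
{
   return a->opcode == b->opcode &&
          a->src[0] == b->src[0] &&
          a->src[1] == b->src[1] &&
          a->saturate == b->saturate &&
          a->conditional_mod == b->conditional_mod &&
          a->force_writemask_all == b->force_writemask_all &&
          a->exec_size == b->exec_size;
}
```

Comparing field by field, rather than with memcmp(), avoids false mismatches caused by padding bytes in the struct.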
Francisco Jerez
d86c2e6e53 i965/fs: Migrate opt_peephole_predicated_break to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:31 +03:00
Francisco Jerez
35e5f118a5 i965/fs: Migrate opt_combine_constants to the IR builder.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:31 +03:00
Francisco Jerez
e04b4156a7 i965/fs: Allocate a common IR builder object in fs_visitor.
v2: Call fs_builder::at_end() to point the builder at the end of the
    program explicitly.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:18:31 +03:00
Francisco Jerez
8ea8f83c8f i965/fs: Introduce FS IR builder.
The purpose of this change is threefold: First, it improves the
modularity of the compiler back-end by separating the functionality
required to construct an i965 IR program from the rest of the visitor
god-object, which in turn will reduce the coupling between other
components and the visitor, allowing a more modular design.  This patch
doesn't yet remove the equivalent functionality from the visitor
classes, as it involves major back-end surgery.

Second, it improves consistency between the scalar and vector
back-ends.  The FS and VEC4 builders can both be used to generate
scalar code with a compatible interface or they can be used to
generate natural vector width code -- 1 or 4 components respectively.

Third, the approach to IR construction is somewhat different to what
the visitor classes currently do.  All parameters affecting code
generation (execution size, half control, point in the program where
new instructions are inserted, etc.) are encapsulated in a stand-alone
object rather than being quasi-global state (yes, anything defined in
one of the visitor classes is effectively global due to the tight
coupling with virtually everything else in the compiler back-end).
This object is lightweight and can be copied, mutated and passed
around, making helper IR-building functions more flexible because they
can now simply take a builder object as argument and will inherit its
IR generation properties in exactly the same way that a discrete
instruction would from the same builder object.

The emit_typed_write() function from my image-load-store branch is an
example that illustrates the usefulness of the latter point: Due to
hardware limitations the function may have to split the untyped
surface message in 8-wide chunks.  That means that the several
functions called to help with the construction of the message payload
are themselves required to set the execution width and half control
correctly on the instructions they emit, and to allocate all registers
with half the default width.  With the previous approach this would
require the used helper functions to be aware of the parameters that
might differ from the default state and explicitly set the instruction
bits accordingly.  With the new approach they would get a modified
builder object as argument that would influence all instructions
emitted by the helper function as if it were the default state.

Another example is the fs_visitor::VARYING_PULL_CONSTANT_LOAD()
method.  It doesn't actually emit any instructions, they are simply
created and inserted into an exec_list which is returned for the
caller to emit at some location of the program.  This sort of two-step
emission becomes unnecessary with the builder interface because the
insertion point is one more of the code generation parameters which
are part of the builder object.  The caller can simply pass
VARYING_PULL_CONSTANT_LOAD() a modified builder object pointing at the
location of the program where the effect of the constant load is
desired.  This two-step emission (which pervades the compiler back-end
and is in most cases redundant) goes away: E.g. ADD() now actually
adds two registers rather than just creating an ADD instruction in
memory, emit(ADD()) is no longer necessary.

v2: Drop scalarizing VEC4 builder.
v3: Take a backend_shader as constructor argument.  Improve handling
    of debug annotations and execution control flags.
v4: Drop Gen6 IF with inline comparison.  Rename "instr" variable.
    Initialize cursor to NULL by default and add method to explicitly
    point the builder at the end of the program.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-09 15:07:18 +03:00
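
The builder idea described in 8ea8f83c8f can be sketched roughly as follows. This is a minimal illustration, not the i965 code: `Inst`, `Program`, `Builder`, `half()`, `at()` and `emit_payload()` are hypothetical names chosen for the example. It assumes only what the message states: code generation parameters (execution size, half control, insertion point) live in a small copyable object, and emitting an instruction inserts it immediately.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

// Hypothetical stand-ins for illustration; none of these are the real i965 types.
struct Inst {
    std::string opcode;
    unsigned exec_size;   // number of channels the instruction executes
    bool second_half;     // "half control": which half of the channels is targeted
};

using Program = std::list<Inst>;

class Builder {
public:
    Builder(Program &prog, unsigned exec_size)
        : prog_(&prog), cursor_(prog.end()),
          exec_size_(exec_size), second_half_(false) {}

    // Return a modified copy that emits at half the width for one half.
    Builder half(unsigned idx) const {
        Builder b = *this;
        b.exec_size_ = exec_size_ / 2;
        b.second_half_ = (idx != 0);
        return b;
    }

    // Return a copy whose insertion point is just before `pos`.
    Builder at(Program::iterator pos) const {
        Builder b = *this;
        b.cursor_ = pos;
        return b;
    }

    // Create the instruction and insert it right away: no separate emit step.
    Inst &emit(const std::string &opcode) const {
        return *prog_->insert(cursor_, Inst{opcode, exec_size_, second_half_});
    }

private:
    Program *prog_;
    Program::iterator cursor_;
    unsigned exec_size_;
    bool second_half_;
};

// A helper takes a builder and inherits its width and insertion point;
// it does not need to know whether the caller split the message.
inline void emit_payload(const Builder &bld) {
    bld.emit("MOV");
    bld.emit("ADD");
}
```

A caller that has to split a message into 8-wide chunks would pass `bld.half(0)` and `bld.half(1)` to the same helper, and a caller that wants the instructions elsewhere in the program would pass `bld.at(pos)`.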
Francisco Jerez
6e04065729 i965: Define consistent interface to enable instruction result saturation.
v2: Use set_ prefix.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-09 13:56:06 +03:00
Francisco Jerez
7624f8410f i965: Define consistent interface to enable instruction conditional modifiers.
v2: Use set_ prefix.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-09 13:56:06 +03:00
Francisco Jerez
239dfc5410 i965: Define consistent interface to predicate an instruction.
v2: Use set_ prefix.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-09 13:56:06 +03:00
Francisco Jerez
f9367191b3 mesa: Drop include of simple_list.h from mtypes.h.
simple_list.h defines a number of macros with short non-namespaced
names that can easily collide with other declarations (first_elem,
last_elem, next_elem, prev_elem, at_end), and according to the comment
it was only being included because of struct simple_node, which is no
longer used in this file.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-09 13:56:06 +03:00
Francisco Jerez
277b94f172 dri/nouveau: Include simple_list.h explicitly in nv*_state_tnl.c.
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-09 13:56:06 +03:00
Francisco Jerez
7065c8153b tnl: Include simple_list.h explicitly in t_context.c.
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-09 13:56:06 +03:00
Francisco Jerez
08a1046f67 mesa: Include simple_list.h explicitly in errors.c.
This seems to be the only user of simple_list in core mesa not
including the header explicitly.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-09 13:56:05 +03:00
Jason Ekstrand
58afc24e57 vk/allocator: Remove the concept of a slave block pool
This reverts commit d24f8245db.
2015-06-08 17:46:32 -07:00
Jason Ekstrand
b6363c3f12 vk/device: Remove the binding table pools/streams 2015-06-08 17:45:57 -07:00
Dave Airlie
f7aad9da20 mesa/teximage: use correct extension for accept stencil texture.
This was using the wrong extension, ARB_stencil_texturing
doesn't mention any changes in this area.

Fixes "dEQP-GLES3.functional.fbo.completeness.renderable.texture.
stencil.stencil_index8."

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90751
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-08 15:47:09 -07:00
Jason Ekstrand
531549d9fc vk/pipeline: Move freeing the program stream to pipeline.c
It's created in pipeline.c so we should free it there.
2015-06-08 14:27:04 -07:00
Anuj Phogat
556b2fbd24 i965: Make a helper function intel_miptree_set_total_width_height()
and some more code refactoring. No functional changes in this patch.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-06-08 13:57:11 -07:00
Anuj Phogat
9111377978 i965/gen9: Set vertical alignment for the miptree
v3: Use ffs() and a switch loop in
    tr_mode_horizontal_texture_alignment() (Ben)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-06-08 13:57:11 -07:00
Anuj Phogat
447410b664 i965/gen9: Set horizontal alignment for the miptree
v3: Use ffs() and a switch loop in
    tr_mode_vertical_texture_alignment() (Ben)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-06-08 13:57:11 -07:00
Anuj Phogat
126078faca i965/gen9: Set tiled resource mode for the miptree
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-06-08 13:57:11 -07:00
Anuj Phogat
ef6b9985ea i965: Pass miptree pointer as function parameter in intel_vertical_texture_alignment_unit
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-06-08 13:57:11 -07:00
Anuj Phogat
9edac38f2a i965: Move intel_miptree_choose_tiling() to brw_tex_layout.c
and change the name to brw_miptree_choose_tiling().

V3: Remove redundant function parameters. (Topi)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-06-08 13:57:11 -07:00
Anuj Phogat
2cbe730ac5 i965: Choose tiling in brw_miptree_layout() function
This refactoring is required by later patches in this series.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-06-08 13:57:11 -07:00
Jason Ekstrand
66a4dab89a vk/pipeline: Don't destroy the program stream
It's freed in compiler.cpp and we don't want to free it twice.
2015-06-08 13:53:19 -07:00
Jason Ekstrand
920fb771d4 vk/allocator: Make the use of NULL_BLOCK in state_stream_finish explicit 2015-06-08 13:53:19 -07:00
Ben Widawsky
4f2f5c8d81 i965: Disallow saturation for MACH operations.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2015-06-08 12:43:28 -07:00
Chris Wilson
922c0c9fd5 i965: Export format comparison for blitting between miptrees
Since the introduction of

commit 536003c11e
Author: Boyan Ding <boyan.j.ding@gmail.com>
Date:   Wed Mar 25 19:36:54 2015 +0800

    i965: Add XRGB8888 format to intel_screen_make_configs

winsys buffers no longer have an alpha channel. This causes
_mesa_format_matches_format_and_type() to reject previously working BGRA
uploads from using the BLT fast path. Instead of using the generic
routine for matching formats exactly, export the slightly more relaxed
check from intel_miptree_blit() which importantly allows the blitter
routine to apply a small number of format conversions.

References: https://bugs.freedesktop.org/show_bug.cgi?id=90839
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Alexander Monakov <amonakov@gmail.com>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2015-06-08 17:56:14 +01:00
Chris Wilson
c2d0606827 i915: Blit RGBX<->RGBA drawpixels
The blitter already has code to accommodate filling in the alpha channel
for BGRX destination formats, so expand this to also allow filling the
alpha channel in RGBX formats.

More importantly for the next patch is moving the test into its own
function for the purpose of exporting the check to the callers.

v2: Fix alpha expansion as spotted by Alexander with the fix suggested by
Kenneth

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Alexander Monakov <amonakov@gmail.com>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2015-06-08 17:56:10 +01:00
Chris Wilson
8da79b8378 i965: Fix HW blitter pitch limits
The BLT pitch is specified in bytes for linear surfaces and in dwords
for tiled surfaces. In both cases the programmable limit is 32,767, so
adjust the check to compensate for the effect of tiling.

v2: Tweak whitespace for functions (Kenneth)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2015-06-08 17:55:56 +01:00
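
The pitch rule stated in 8da79b8378 amounts to a unit conversion before the comparison. The sketch below is an assumption-laden illustration, not the driver's check: `blt_pitch_ok` and `kMaxBltPitchField` are hypothetical names, and the pitch is assumed to be supplied in bytes in both cases.

```cpp
#include <cassert>
#include <cstdint>

// Largest programmable pitch value, per the commit message.
constexpr uint32_t kMaxBltPitchField = 32767;

// Hypothetical helper: `pitch_bytes` is the surface pitch in bytes.
// For tiled surfaces the field is programmed in dwords (4 bytes each),
// so the byte pitch is converted before comparing against the limit.
inline bool blt_pitch_ok(uint32_t pitch_bytes, bool tiled) {
    const uint32_t field = tiled ? pitch_bytes / 4 : pitch_bytes;
    return field <= kMaxBltPitchField;
}
```

Under these assumptions a tiled surface may have a byte pitch up to four times larger than a linear one before the blitter has to be avoided.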
Kristian Høgsberg Kristensen
52637c0996 vk: Quiet a few warnings 2015-06-08 08:51:40 -07:00
Kristian Høgsberg Kristensen
9eab70e54f vk: Create a minimal context for the compiler
This avoids the full brw context initialization and just sets up context
constants, initializes extensions and sets a few driver vfuncs for the
front-end GLSL compiler.
2015-06-08 08:51:40 -07:00
Martin Peres
8614b9e489 softpipe/query: force parenthesis around a logical not
This makes GCC5 happy.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-06-08 12:38:08 +03:00
Martin Peres
184e4de3a1 main/version: make sure all the output variables get set in get_gl_override
This fixes 2 warnings in gcc 5.1.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-06-08 12:37:42 +03:00
Michel Dänzer
56e38edc96 radeonsi: Add CIK SDMA support
Based on the corresponding SI support. Same as that, this is currently
only enabled for one-dimensional buffer copies due to issues with
multi-dimensional SDMA copies.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-08 18:13:22 +09:00
Michel Dänzer
79f2acb8f8 r600g,radeonsi: Assert that there's enough space after flushing
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-06-08 18:10:35 +09:00
Emil Velikov
9538902c4f docs: add news item and link release notes for mesa 10.5.7
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-06-07 13:44:37 +01:00
Emil Velikov
f7db7fe6ea docs: Add sha256sums for the 10.5.7 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit eb3a704bb0)
2015-06-07 13:42:48 +01:00
Emil Velikov
56efe81ab1 Add release notes for the 10.5.7 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 495bcbc48c)
2015-06-07 13:42:46 +01:00
Kenneth Graunke
7b8f20ec55 prog_to_nir: Fix fragment depth writes.
In the ARB_fragment_program specification, the result.depth output
variable is treated as a vec4, where the fragment depth is stored in the
.z component, and the other three components are undefined.

This is different than GLSL, which uses a scalar value (gl_FragDepth).

To make this consistent for driver backends, this patch makes
prog_to_nir use a scalar output variable for FRAG_RESULT_DEPTH,
moving result.depth.z into the first component.

Fixes Glean's fragProg1 "Z-write test" subtest.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90000
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-06 13:26:10 -07:00
Chris Forbes
52e5ad7bf8 i965: Set max texture buffer size to hardware limit
Previously we were leaving this at the default of 64K, which meets the
spec but is too small for some real uses. The hardware can handle up to
128M.

User was complaining about this on freenode ##OpenGL today.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-06 18:40:33 +12:00
Jason Ekstrand
ce00233c13 vk/cmd_buffer: Use the dynamic state stream in emit_dynamic and merge_dynamic 2015-06-05 17:26:41 -07:00
Jason Ekstrand
e69588b764 vk/device: Use a 64-byte alignment for CC state 2015-06-05 17:26:26 -07:00
Jason Ekstrand
c2eeab305b vk/pipeline: Actually free the program stream and dynamic pool 2015-06-05 17:26:26 -07:00
Jason Ekstrand
ed2ca020f8 vk/allocator: Avoid double-free in the bo pool 2015-06-05 17:12:28 -07:00
Jason Ekstrand
aa523d3c62 vk/gem: Call VALGRIND_FREELIKE_BLOCK before unmapping 2015-06-05 16:41:49 -07:00
Ben Widawsky
b639ed2f1b i965: Add gen8 fast clear perf debug
In an ideal world I would just implement this instead of adding the perf debug.
There are some errata involved which lead me to believe it won't be so simple as
flipping a few bits.

There is room to add a thing for Gen9's flexibility, but since I am actively
working on that I have opted to ignore it.

Example:
Multi-LOD fast clear - giving up (256x128x8).

v2: Use braces for if statements because they are multiple lines (Ken)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-05 14:25:47 -07:00
Ben Widawsky
77a44512d9 i965: Add buffer sizes to perf debug of fast clears
When we cannot do the optimized fast clear it's important to know the buffer
size since a small buffer will have much less performance impact.

A follow-on patch could restrict printing the message to only certain sizes.

Example:
Failed to fast clear 1400x1056 depth because of scissors.  Possible 5% performance win if avoided.

Recommended-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-05 14:25:47 -07:00
Marek Olšák
6acb61fc9c clover: clarify and fix the EGL interop error case
Cc: 10.6 <mesa-stable@lists.freedesktop.org>
2015-06-05 19:44:33 +02:00
Marek Olšák
a1cb407b04 egl: expose EGL 1.5 if all requirements are met
There's no driver support yet, because EGL_KHR_gl_colorspace isn't
implemented.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-05 19:44:33 +02:00
Marek Olšák
51c8c66e1d egl: return correct invalid-type error from eglCreateSync
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-05 19:44:33 +02:00
Marek Olšák
820a4d402a egl: add new platform functions (v2)
These are just wrappers around the existing extension functions.

v2: return BAD_ALLOC if _eglConvertAttribsToInt fails

Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-05 19:44:33 +02:00
Marek Olšák
515f04ed6f egl: add eglCreateImage (v2)
v2: - use calloc
    - return BAD_ALLOC if calloc fails

Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-05 19:44:33 +02:00
Marek Olšák
1e79e054e7 egl: add eglGetSyncAttrib (v2)
v2: - don't modify "value" in eglGetSyncAttribKHR after an error
    - rename _egl_api::GetSyncAttribKHR -> GetSyncAttrib
    - rename GetSyncAttribKHR_t -> GetSyncAttrib_t
    - rename _eglGetSyncAttribKHR to _eglGetSyncAttrib

Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-05 19:44:33 +02:00
Marek Olšák
7524592da6 egl: add eglWaitSync
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-05 19:44:33 +02:00
Marek Olšák
2885ba0e4c egl: add EGL 1.5 functions that don't need any changes from extensions
Declare the functions without the suffix, so that the core names are exported.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-05 19:44:33 +02:00
Marek Olšák
d333d30632 egl: use EGL 1.5 types without suffixes
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-05 19:44:33 +02:00
Marek Olšák
706466f461 egl: add context attribs from EGL 1.5
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-05 19:44:33 +02:00
Marek Olšák
f9f894447e egl: fix setting context flags
Cc: 10.6 10.5 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-05 19:44:33 +02:00
Marek Olšák
0e4b564ef2 egl: combine VersionMajor and VersionMinor into one variable
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-05 19:44:33 +02:00
Marek Olšák
efda9c5649 egl: set the EGL version in common code
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-05 19:44:33 +02:00
Marek Olšák
3a83adeb7c egl: remove unused _egl_global::ClientExtensions
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-05 19:44:33 +02:00
Marek Olšák
20249d3559 egl: import platform headers from registry (v2)
v2: don't remove local Mesa changes

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-05 19:44:33 +02:00
Marek Olšák
6b31f22338 egl: import eglext.h from registry and cleanup eglmesaext.h (v2)
v2: include mesa and chromium extensions in eglext.h so as not to break
    existing users
v3: keep PFNEGLSWAPBUFFERSREGIONNOK because piglit uses it

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-05 19:44:32 +02:00
Marek Olšák
49ae822183 egl: import egl.h from registry (v2)
v2: split the commit into 3 patches

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-06-05 19:44:32 +02:00
Marek Olšák
f52e8572ae mesa: remove unused gl_config::colorIndexMode
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-05 19:44:32 +02:00
Marek Olšák
4312b4f570 mesa: use GL_GEOMETRY_PROGRAM_NV instead of MESA_GEOMETRY_PROGRAM
There's no reason to use our own definition.
Tessellation will use the NV definitions too.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-05 19:44:32 +02:00
Marek Olšák
3b2721ce11 mesa: use _mesa_has_geometry_shader in get_programiv
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-05 19:44:32 +02:00
Marek Olšák
b7ef7903b8 mesa: remove useless gl_compute_program_state::Current
This is for user assembly shaders only (not GLSL). We won't support those.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-05 19:44:32 +02:00
Marek Olšák
e8b040477e mesa: remove unused geometry shader variables
These states are for GS assembly shaders only. We don't support those.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-05 19:44:32 +02:00
Marek Olšák
3d16b5af1d tgsi/ureg: fix a coverity defect in emit_decls
Reported by Ilia Mirkin.
2015-06-05 19:44:32 +02:00
Marek Olšák
6aff87bb01 r600g: fix a coverity defect in streamout code
Reported by Ilia Mirkin.
2015-06-05 19:44:32 +02:00
Marek Olšák
6bf3729a3f glsl_to_tgsi: use TGSI array declarations for VS,GS arrays of outputs (v2)
v2: don't use PIPE_MAX_SHADER_ARRAYS
2015-06-05 19:44:32 +02:00
Marek Olšák
9b1921100e glsl_to_tgsi: use TGSI array declarations for GS,FS arrays of inputs (v2)
v2: don't use PIPE_MAX_SHADER_ARRAYS
2015-06-05 19:44:32 +02:00
Marek Olšák
26c8a49bc4 glsl_to_tgsi: remove some emit functions by using C++ default values 2015-06-05 19:44:32 +02:00
Marek Olšák
85cd1cf4b8 glsl_to_tgsi: rename emit -> emit_asm
My editor thinks "emit" is a keyword, which breaks code indexing.
2015-06-05 19:44:32 +02:00
Marek Olšák
30b74c02cd glsl_to_tgsi: remove memset after calloc 2015-06-05 19:44:32 +02:00
Marek Olšák
6ae3bc2569 glsl_to_tgsi: don't use a static array size for st_translate::arrays 2015-06-05 19:44:32 +02:00
Marek Olšák
57c98e22db glsl_to_tgsi: don't use a static array size for "array_sizes" 2015-06-05 19:44:32 +02:00
Marek Olšák
b6ebe7eabf tgsi/ureg: don't emit in/out arrays if drivers don't support ranged declarations
Softpipe, llvmpipe, r300g, and radeonsi pass tests. Other drivers need testing.

Freedreno and nv30 are definitely broken. Other drivers seem to be alright.
2015-06-05 19:44:32 +02:00
Marek Olšák
a015b3952f tgsi/ureg: add support for output array declarations 2015-06-05 19:44:32 +02:00
Marek Olšák
1fa6c99e24 tgsi/ureg: add support for GS input array declarations 2015-06-05 19:44:32 +02:00
Marek Olšák
d3fbc65986 tgsi/ureg: merge input and fs_input arrays 2015-06-05 19:44:32 +02:00
Marek Olšák
3b1d157751 tgsi/ureg: rename and simplify ureg_DECL_gs_input
There is nothing special about it and it's used for tessellation shaders
too.
2015-06-05 19:44:32 +02:00
Marek Olšák
918ca4031f tgsi/ureg: add support for FS input array declarations 2015-06-05 19:44:32 +02:00
Marek Olšák
cf2c9265a3 tgsi/scan: get more information about arrays and handle arrays correctly (v2)
v2: use less memory for the information
2015-06-05 19:44:32 +02:00
Tapani
78395dbf9f mesa: fix program resource queries for builtin variables
Patch fixes special cases with gl_VertexID and sets all builtin
variables locations as '-1' as specified by the extension spec.

Fixes ES 3.1 conformance test failure:
	ES31-CTS.program_interface_query.input-built-in

v2: comments + use is_gl_identifier() (Martin)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-06-05 08:39:59 +03:00
Alan Coopersmith
cb277cde6f glsl_compiler: Remove unused extra argument to printf in usage_fail
Flagged by Oracle's parfait static analyzer:

Error: Format string argument mismatch (CWE 628)
   In call to printf with format string "usage: %s [options] <file.vert | file.geom | file.frag>\n\nPossible options are:\n"
      Too many arguments for format string (got more than 1 arguments)
        at line 285 of src/glsl/main.cpp in function 'usage_fail'.

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-04 19:01:16 -07:00
Roland Scheidegger
00d8733120 docs: add note about llvmpipe supporting GL_ARB_shader_stencil_export 2015-06-05 02:25:03 +02:00
Roland Scheidegger
6e5970ffee draw: (trivial) fix NULL pointer dereference
This probably got broken when the samplers were converted to be indexed
by shader type.
Seen when looking at bug 89819 though I'm not sure if that really was what
the bug was about...

Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-05 02:20:35 +02:00
Kenneth Graunke
c820407ef0 i965/fs: Print mlen in dump_instructions() output.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-06-04 15:34:01 -07:00
Kenneth Graunke
15a12795c6 prog_to_nir: Make RSQ properly take the absolute value of its argument.
I just botched this when writing the original code.

From the ARB_vertex_program specification:
"The RSQ instruction approximates the reciprocal of the square root of
 the absolute value of the scalar operand and replicates it to all four
 components of the result vector."

Fixes a Glean vertProg1 subtest:
RSQ test 2 (reciprocal square root of negative value)

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90547
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-06-04 15:32:46 -07:00
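
The RSQ semantics quoted in 15a12795c6 can be written down as a small reference function. This is only a scalar model of the quoted specification text, not the NIR code emitted by prog_to_nir; `rsq_ref` is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Reference model: approximate 1 / sqrt(|x|) and replicate the result
// to all four components, as the ARB_vertex_program text describes.
inline std::array<float, 4> rsq_ref(float x) {
    const float r = 1.0f / std::sqrt(std::fabs(x));
    return {r, r, r, r};
}
```

Without the absolute value, a negative operand would produce NaN from the square root, which is the case the Glean subtest exercises.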
Chad Versace
87d98e1935 vk: Fix 2 incorrect typecasts
The compiler didn't find the cast errors because all Vulkan types are
just integers.
2015-06-04 14:32:22 -07:00
Chad Versace
b981379bcf vk: Make make clean remove generated spirv headers 2015-06-04 14:26:46 -07:00
Jason Ekstrand
8d930da35d vk/allocator: Remove an unneeded VG() wrapper 2015-06-04 09:14:33 -07:00
Jason Ekstrand
7f90e56e42 vk/device: Disallow device destruction 2015-06-04 09:14:33 -07:00
Chad Versace
9cd42b3dea vk: Fix build
Commit 1286bd, which deleted vk.c, broke the build. Update the Makefile
to fix it.
2015-06-04 09:01:30 -07:00
Martin Peres
71e9457877 main: fix a regression in uniform handling introduced by 87a4bc5
The comment was accurate but the condition was reversed...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-06-04 15:42:06 +03:00
Martin Peres
87a4bc5118 mesa: reference built-in uniforms into gl_uniform_storage
This change introduces a new field in gl_uniform_storage to
explicitly say that a uniform is built-in. In the case where it is,
no storage is defined to make it clear that it is read-only from the
mesa side. I fixed all the places in the code that made use of the
structure that I changed. Any place making a wrong assumption and using
the storage straight away will just crash.

This patch seems to implement the path of least resistance towards
listing built-in uniforms in GL_ACTIVE_UNIFORM (and other APIs).

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-06-04 09:25:00 +03:00
Roland Scheidegger
4fd42a7c27 llvmpipe: Implement stencil export
Pretty trivial, fixes the issue that we're expected to be able to blit
stencil surfaces (as the blit just relies on util blitter code which needs
stencil export to do it).
2 piglits skip->pass, 11 fail->pass

v2: prettify, keep different stencil ref value handling out of depth/stencil
test itself.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-06-04 03:56:19 +02:00
Jason Ekstrand
251aea80b0 vk/DS: Mask stencil masks to 8 bits 2015-06-03 16:59:13 -07:00
Connor Abbott
47bd462b0c awesome control flow bugfixes/clarifications 2015-06-03 14:10:28 -04:00
Matt Turner
d46d04529b i965: Use UW-typed immediate in multiply inst.
Some hardware reads only the low 16-bits even if the type is UD, but
other hardware like Cherryview can't handle this.

Fixes spec@arb_gpu_shader5@execution@sampler_array_indexing@fs-simple on
Cherryview.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90830
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-06-03 10:47:41 -07:00
Matt Turner
54a70a8ef2 program: Replace gl_inst_opcode with enum prog_opcode.
Both were introduced at the same time. I'm not sure why we needed two.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-03 10:40:59 -07:00
Matt Turner
fb011d3157 program: Remove dead Aux field from prog_instruction.
Appears to have been last used by the i965 driver (removed by commit
098acf6c).

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-03 10:40:59 -07:00
Matt Turner
ef3f89e53e program: Shrink and rename SaturateMode field to Saturate.
It was 2 bits to accommodate SATURATE_PLUS_MINUS_ONE (removed by commit
09b566e1). A similar change was made to TGSI recently in commit
e1c4e8aa.

Reducing the size from 2 bits to 1 reduces the size of the bit fields
from 17 bits to 16, which is a much nicer number.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-06-03 10:40:59 -07:00
Brian Paul
56b2b3d385 mesa: move no-change glDepthFunc check earlier
If the incoming func matches the current state it must be a legal
value so we can do this before the switch statement.

Signed-off-by: Brian Paul <brianp@vmware.com>
2015-06-03 11:35:46 -06:00
Brian Paul
4dd72fe70d mesa: restore GL_EXT_depth_bounds_test state in glPopAttrib()
Spotted by inspection.  Untested (no piglit test).

Signed-off-by: Brian Paul <brianp@vmware.com>
2015-06-03 11:35:46 -06:00
Brian Paul
6139195606 mesa: fix glPushAttrib(0) / glPopAttrib() error
If the glPushAttrib() mask value was zero we didn't actually push
anything onto the attribute stack.  A subsequent glPopAttrib() call
would generate a GL_STACK_UNDERFLOW error.  Now push a dummy attribute
in that case to prevent the error.

Mesa now matches nvidia's behavior.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-03 11:35:46 -06:00
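
The behaviour described in 6139195606 can be illustrated with a toy attribute stack. Everything here is hypothetical (`AttribStack`, `Error`, the use of `std::vector`); it only shows why recording an entry for a zero mask keeps push and pop balanced.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature of an attribute stack; not Mesa's data structures.
enum class Error { None, StackUnderflow };

struct AttribStack {
    std::vector<unsigned> saved_masks;

    void push(unsigned mask) {
        // Record an entry even when mask == 0 so that a later pop
        // finds something on the stack.
        saved_masks.push_back(mask);
    }

    Error pop() {
        if (saved_masks.empty())
            return Error::StackUnderflow;
        // A real implementation would restore the state groups named by
        // the mask here; for mask 0 there is nothing to restore.
        saved_masks.pop_back();
        return Error::None;
    }
};
```

With this shape, `push(0)` followed by `pop()` succeeds, while a `pop()` with no matching push still reports an underflow.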
Kristian Høgsberg Kristensen
a37d122e88 vk: Set color/blend state in meta clear if not set yet 2015-06-02 23:08:05 -07:00
Kristian Høgsberg Kristensen
1286bd3160 vk: Delete vk.c test case
We now have crucible up and running and all vk sub-cases have been moved
over. Delete this crufty old hack of a test case.
2015-06-02 22:57:42 -07:00
Kristian Høgsberg Kristensen
2f6aa424e9 vk: Update generated headers with support for 64 bit fields 2015-06-02 22:57:42 -07:00
Kristian Høgsberg Kristensen
5744d1763c vk: Set cb_state to NULL at cmd buffer create time
Dynamic color/blend state can be NULL in case we're not rendering to
color targets (only output to depth and/or stencil). Initialize
cmd_buffer->cb_state to NULL so we can reliably detect whether it's been
set or not.
2015-06-02 22:57:42 -07:00
Kristian Høgsberg Kristensen
c8f078537e vk: Implement vertexOffset parameter of vkCmdDrawIndexed()
As exposed by the func.draw_indexed test, we were ignoring the argument
and hardcoding 0.
2015-06-02 22:57:42 -07:00
Timothy Arceri
86a74e9b6b nir: use src for ssa helper
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-06-03 06:50:39 +10:00
Timothy Arceri
5f7b8fa481 nir: remove extra semicolon
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-06-03 06:50:33 +10:00
Matt Turner
5da809d70f prog_to_nir: Remove OPCODE_MOV special case.
OPCODE_MOV is in the op_trans[] array.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-02 12:22:42 -07:00
Matt Turner
576f7241b6 prog_to_nir: Remove from op_trans[] opcodes handled in the switch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-06-02 12:22:42 -07:00
Jason Ekstrand
e702197e3f vk/formats: Add a name to the metadata and better logging 2015-06-02 11:30:39 -07:00
Jason Ekstrand
fbafc946c6 vk/formats: Rework the formats table 2015-06-02 11:30:39 -07:00
Eduardo Lima Mitev
5b226a1242 nir: prevent use-after-free condition in should_lower_phi()
lower_phis_to_scalar() pass recurses the instruction dependence graph to
determine if all the sources of a given instruction are scalarizable.
To prevent cycles, it temporarily marks the phi instruction before recursing in,
then updates the entry with the resulting value. However, it does not consider
that the entry value may have changed after a recursion pass, hence causing
a use-after-free situation and a crash.

This patch fixes this by reloading the entry corresponding to the 'phi'
after recursing and before updating its value.

The crash can be reproduced ~20% of the time with the dEQP test:

dEQP-GLES3.functional.shaders.loops.while_constant_iterations.nested_sequence_fragment

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-06-02 20:21:49 +02:00
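Editorial sketch for 5b226a1242: the pattern of re-fetching a table entry after recursing, shown on a toy memo table. Everything here is an assumption made for illustration: the growable array stands in for whatever container the pass uses, and the dependency rule (node n depends on node n - 1) is invented. It is not the NIR code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy memo table: a growable array, so an insert can move every entry. */
struct entry { int key; bool value; };
struct table { struct entry *entries; size_t count, capacity; };

struct entry *table_find(struct table *t, int key)
{
   for (size_t i = 0; i < t->count; i++)
      if (t->entries[i].key == key)
         return &t->entries[i];
   return NULL;
}

void table_insert(struct table *t, int key, bool value)
{
   if (t->count == t->capacity) {
      size_t cap = t->capacity ? t->capacity * 2 : 1;
      struct entry *grown = realloc(t->entries, cap * sizeof(*grown));
      assert(grown != NULL);
      t->entries = grown;
      t->capacity = cap;
   }
   t->entries[t->count].key = key;
   t->entries[t->count].value = value;
   t->count++;
}

/* Hypothetical rule: node 0 is scalarizable, node n depends on node n - 1. */
bool should_lower(struct table *t, int node)
{
   struct entry *e = table_find(t, node);
   if (e)
      return e->value;

   /* Mark the node before recursing so that cycles terminate. */
   table_insert(t, node, false);

   bool result = (node == 0) ? true : should_lower(t, node - 1);

   /* The recursion may have grown the table, so a pointer taken before
    * it could be stale. Look the entry up again before writing. */
   e = table_find(t, node);
   assert(e != NULL);
   e->value = result;
   return result;
}
```

Keeping a pointer from before the recursive call and writing through it afterwards is the failure mode the commit message describes; the second lookup is the fix.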
Kenneth Graunke
762395736b i965: Add Gen8+ VS dispatch_mode assertion.
Suggested by Ben Widawsky.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-06-01 22:08:54 -07:00
Kristian Høgsberg Kristensen
f98c89ef31 vk: Move query related functionality to new file query.c 2015-06-01 21:52:45 -07:00
Kenneth Graunke
a2655e0dd4 i965: Drop LOAD_PAYLOAD workaround in fs_visitor::emit_urb_writes().
Now that Jason's LOAD_PAYLOAD improvements have landed, we don't need
this.  Passing 1 for the number of header registers already takes care
of setting force_writemask_all on the header copy.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-06-01 12:45:41 -07:00
Kenneth Graunke
386bf336c4 i965: Use proper pitch for scalar GS pull constants and UBOs.
See the corresponding code in brw_vs_surface_state.c.

v2: const more things (requested by Topi Pohjolainen)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-06-01 12:45:40 -07:00
Kenneth Graunke
0f8ec779dd i965: Create a shader_dispatch_mode enum to replace VS/GS fields.
We used to store the GS dispatch mode in brw_gs_prog_data while
separately storing the VS dispatch mode in brw_vue_prog_data::simd8.

This patch introduces an enum to represent all possible dispatch modes,
and stores it in brw_vue_prog_data::dispatch_mode, unifying the two.

Based on a suggestion by Matt Turner.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-06-01 12:45:40 -07:00
Kenneth Graunke
9945573d65 i965: Drop "Vector Mask Enable" bit from 3DSTATE_GS on Gen8+.
The documentation makes it pretty clear that we shouldn't use this:

   "Under normal conditions SW shall specify DMask, as the GS stage
    will provide a Dispatch Mask appropriate to SIMD4x2 or SIMD8 thread
    execution (as a function of dispatch mode).  E.g., for SIMD4x2
    execution, the GS stage will generate a Dispatch Mask that is equal
    to what the EU would use as the Vector Mask.  For SIMD8 execution
    there is no known usage model for use of Vector Mask (as there is
    for PS shaders)."

I also managed to find descriptions of DMask and VMask, in the "State
Register" (sr0.2/3) field descriptions:

   "Dispatch Mask (DMask).  This 32-bit field specifies which channels
    are active at Dispatch time."

   "Vector Mask (VMask).  This 32-bit field contains, for each 4-bit
    group, the OR of the corresponding 4-bit group in the dispatch
    mask."

SIMD4x2 shaders process one or two vec4 values, with each 4-bit group
corresponding to xyzw channel enables (either all on, or all off).
Thus, DMask = VMask in SIMD4x2 mode.  But in SIMD8 mode, 4-bit groups
are meaningless, so it just messes up your values.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-06-01 12:45:40 -07:00
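Editorial sketch for 9945573d65: one reading of the quoted VMask description, assuming "the OR of the corresponding 4-bit group" means that a group is fully enabled in VMask whenever any of its bits is set in DMask. The function name is hypothetical and the hardware is not consulted.

```c
#include <assert.h>
#include <stdint.h>

/* For each 4-bit group, OR the group's bits together and replicate the
 * result across the group (an interpretation of the quoted text). */
uint32_t vmask_from_dmask(uint32_t dmask)
{
   uint32_t vmask = 0;
   for (unsigned group = 0; group < 8; group++) {
      uint32_t nibble = (dmask >> (group * 4)) & 0xf;
      if (nibble != 0)
         vmask |= 0xfu << (group * 4);
   }
   return vmask;
}
```

Under this reading a SIMD4x2 mask such as 0x0f or 0xff maps to itself, while a SIMD8 mask such as 0x35 (channels 0, 2, 4 and 5) becomes 0xff and would enable channels that were never dispatched.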
Jason Ekstrand
08748e3a0c i965: Use NIR by default for vertex shaders on GEN8+
GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:

   total instructions in shared programs: 2742062 -> 2681339 (-2.21%)
   instructions in affected programs:     1514770 -> 1454047 (-4.01%)
   helped:                                5813
   HURT:                                  1120

The gained programs are ARB vertex programs that were previously going
through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
programs can go through the scalar backend so they show up as "gained" in
the shader-db results.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-06-01 12:25:58 -07:00
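Editorial sketch for 08748e3a0c: the percentages in the shader-db summary can be recomputed from the instruction counts. The helper name percent_change is hypothetical; only the four counts come from the commit message.

```c
#include <assert.h>

/* Relative change from before to after, in percent (negative = fewer). */
double percent_change(double before, double after)
{
   return (after - before) / before * 100.0;
}
```

Both deltas are 60723 instructions: 60723 / 2742062 is about 2.21% of all shared programs and 60723 / 1514770 is about 4.01% of the affected ones, which matches the quoted figures.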
Jason Ekstrand
d4cbf6a728 vk/compiler: Add an index_count to the bind map and check for OOB 2015-06-01 12:25:58 -07:00
Jason Ekstrand
510b5c3bed vk/HACK: Plumb real descriptor set/index into textures 2015-06-01 12:25:58 -07:00
Jason Ekstrand
aded32bf04 NIR: Add a helper for doing sampler lowering for vulkan 2015-06-01 12:25:58 -07:00
Brian Paul
f97166e550 docs: update GL_ARB_copy_image, GL_ARB_clear_texture gallium status
VMware is working on these.

Signed-off-by: Brian Paul <brianp@vmware.com>
2015-06-01 07:47:25 -06:00
Brian Paul
51d08d55f4 gallium/util: silence unused var warnings for non-debug build
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-01 07:42:05 -06:00
Brian Paul
54070a9d1d egl/dri2: silence uninitialized variable warnings
And update assertions to be more informative.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-01 07:42:04 -06:00
Brian Paul
87813c504a gallivm: silence unused var warnings for non-debug build
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-01 07:42:03 -06:00
Brian Paul
71afc13eda pipebuffer: silence unused var warnings for non-debug build
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-01 07:42:02 -06:00
Brian Paul
8759185871 st/mesa: silence unused var warnings for non-debug build
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-01 07:42:02 -06:00
Brian Paul
ae5d6db924 draw: silence unused var warnings for non-debug build
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-06-01 07:42:01 -06:00
Jose Fonseca
512117ce0e gallivm: Remove stub disassemblerSymbolLookupCB.
It's incomplete -- it wasn't filling ReferenceType so it was causing
garbage on the disassembly.  Furthermore it seems impossible to get the
jump information through this interface.

The solution for the function size problem is to effectively book-keep the
machine code start and end address while JIT'ing.
2015-06-01 10:43:28 +01:00
Kristian Høgsberg Kristensen
5caa408579 vk: Indent tables to align '=' at column 48 2015-05-31 22:36:26 -07:00
Kristian Høgsberg Kristensen
76bb658518 vk: Add support for anisotropic bits 2015-05-31 22:15:34 -07:00
Kristian Høgsberg Kristensen
dc56e4f7b8 vk: Implement support for sampler border colors
This supports the three Vulkan border color types for float color
formats. The support for integer formats is a little trickier, as we
don't know the format of the texture at this time.
2015-05-31 17:20:48 -07:00
Jason Ekstrand
e497ac2c62 vk/device: Only flush the texture cache when setting state base address
After further examination, it appears that the other flushes and stalls
weren't actually needed.
2015-05-30 18:04:50 -07:00
Neil Roberts
7f62fdae16 i965: Don't add base_binding_table_index if it's zero
When calculating the binding table index for non-constant sampler
array indexing it needs to add the base binding table index which is a
constant within the generated code. Often this base is zero so we can
avoid a redundant instruction in that case.

It looks like nothing in shader-db is doing non-constant sampler array
indexing so this patch doesn't make any difference but it might be
worth having anyway.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
2015-05-31 00:48:57 +01:00
Neil Roberts
6c846dc57b i965: Don't use a temporary when generating an indirect sample
Previously when generating the send instruction for a sample
instruction with an indirect sampler it would use the destination
register as a temporary store. This breaks when used in combination
with the opt_sampler_eot optimisation because that forces the
destination to be null. This patch fixes that by avoiding the temp
register altogether.

The reason the temporary register was needed was because it was trying
to ensure the binding table index doesn't overflow a byte by and'ing
it with 0xff. The result is then or'd with sampler_index<<8. This patch
instead just and's the whole thing by 0xfff. This will ensure that a
bogus sampler index won't overflow into the rest of the message
descriptor but unlike the previous code it won't ensure that the
binding table index doesn't overflow into the sampler index. It
doesn't seem like that should matter very much though because if the
shader is generating a bogus sampler index then it's going to just get
garbage out either way.

Instead of doing sampler_index<<8|(sampler_index+base_table_index) the
new code avoids one operation by doing
sampler_index*0x101+base_table_index which should be equivalent.
However if we wanted to avoid the multiply for some reason we could do
this by adding an extra or instruction still without needing the
temporary register.

This fixes a number of Piglit tests on Skylake that were using
indirect samplers such as:

 spec@arb_gpu_shader5@execution@sampler_array_indexing@fs-simple

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-05-31 00:48:57 +01:00
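Editorial sketch for 6c846dc57b: a check of the claim that sampler_index*0x101+base_table_index matches sampler_index<<8|(sampler_index+base_table_index). The function names are hypothetical, and the field layout (binding table index in bits 0..7, sampler index in bits 8..11) is inferred from the 0xff and 0xfff masks in the message.

```c
#include <assert.h>
#include <stdint.h>

/* Old form: mask the binding table index to a byte, then OR in the
 * sampler index shifted into bits 8..11. */
uint32_t descriptor_shift_or(uint32_t sampler_index, uint32_t base)
{
   return (sampler_index << 8) | ((sampler_index + base) & 0xff);
}

/* New form: one multiply-add, then a single mask over all 12 bits. */
uint32_t descriptor_mul_add(uint32_t sampler_index, uint32_t base)
{
   return (sampler_index * 0x101 + base) & 0xfff;
}

/* Returns 1 if both forms agree for every 4-bit sampler index whose
 * binding table index (sampler_index + base) still fits in a byte. */
int forms_agree_in_range(void)
{
   for (uint32_t s = 0; s < 16; s++)
      for (uint32_t b = 0; s + b <= 0xff; b++)
         if (descriptor_shift_or(s, b) != descriptor_mul_add(s, b))
            return 0;
   return 1;
}
```

The two forms diverge only when the binding table index overflows a byte: for sampler_index 1 and base 0xff the old form yields 0x100 and the new one 0x200, which is the overflow into the sampler index field that the message accepts as harmless.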
Jason Ekstrand
2251305e1a vk/cmd_buffer: Track descriptor set dirtying per-stage 2015-05-30 10:07:29 -07:00
Jason Ekstrand
33cccbbb73 vk/device: Emit PIPE_CONTROL flushes surrounding new STATE_BASE_ADDRESS
According to the bspec, you're supposed to emit a PIPE_CONTROL with a CS
stall and a render target flush prior to changing STATE_BASE_ADDRESS.  A
little experimentation, however, shows that this is not enough.  It also
appears as if you have to flush the texture cache after changing base
address or things won't propagate properly.
2015-05-30 08:08:07 -07:00
Eric Anholt
ec1c72d38e vc4: Don't bother with safe list traversal in CSE.
We don't remove or move instructions.
2015-05-29 22:09:53 -07:00
Eric Anholt
78c773bb36 vc4: Convert from simple_list.h to list.h
list.h is a nicer and more familiar set of list functions/macros.
2015-05-29 22:09:53 -07:00
Jason Ekstrand
b2b9fc9fad vk/allocator: Don't call VALGRIND_MALLOCLIKE_BLOCK on fresh gem_mmap's 2015-05-29 21:15:47 -07:00
Jason Ekstrand
03ffa9ca31 vk: Don't crash on partial descriptor sets 2015-05-29 20:43:10 -07:00
Eric Anholt
21a22a61c0 vc4: Make sure we allocate idle BOs from the cache.
We were returning the most recently freed BO, without checking if it
was idle yet.  This meant that we generally stalled immediately on the
previous frame when generating a new one.  Instead, allocate new BOs
when the *oldest* BO is still busy, so that the cache scales with how
much is needed to keep some frames outstanding, as originally
intended.

Note that if you don't have some throttling happening, this means that
you can accidentally run the system out of memory.  The kernel is now
applying some throttling on all execs, to hopefully avoid this.
2015-05-29 18:15:00 -07:00
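Editorial sketch for 21a22a61c0: the allocation policy described above, reduced to a fixed-size queue. The struct layout, the busy flag and the function names are assumptions for illustration; the real cache asks the kernel whether a BO is idle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_SLOTS 8

struct bo { int id; bool busy; };

/* Freed BOs in order of release: slot 0 is the oldest. */
struct bo_cache { struct bo *slots[CACHE_SLOTS]; size_t count; };

void cache_put(struct bo_cache *c, struct bo *bo)
{
   assert(c->count < CACHE_SLOTS);
   c->slots[c->count++] = bo;
}

/* Returns the oldest cached BO if it is idle. Returns NULL when the
 * cache is empty or the oldest BO is still busy, in which case the
 * caller is expected to allocate a fresh BO instead of stalling. */
struct bo *cache_get(struct bo_cache *c)
{
   if (c->count == 0 || c->slots[0]->busy)
      return NULL;

   struct bo *bo = c->slots[0];
   for (size_t i = 1; i < c->count; i++)
      c->slots[i - 1] = c->slots[i];
   c->count--;
   return bo;
}
```

Checking only the oldest entry keeps the lookup cheap, and the cache grows until enough BOs exist to cover the frames in flight.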
Eric Anholt
c821ccf0e3 vc4: Fix return value handling for BO waits.
If the wait ever returned -ETIME, we'd abort because the errno was
stored in errno and not drmIoctl()'s return value.
2015-05-29 18:15:00 -07:00
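Editorial sketch for c821ccf0e3: the return-value convention at issue. fake_wait_ioctl is a hypothetical stand-in that follows the ioctl(2) convention (return -1 on failure, error code in errno); bo_wait is likewise illustrative and not the vc4 function.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the wait ioctl: returns 0 on success, or
 * -1 with errno set on failure, as ioctl(2) does. */
int fake_wait_ioctl(int timed_out)
{
   if (timed_out) {
      errno = ETIME;
      return -1;
   }
   return 0;
}

/* Returns 1 if the BO is idle and 0 if the wait timed out. */
int bo_wait(int timed_out)
{
   int ret = fake_wait_ioctl(timed_out);
   if (ret == 0)
      return 1;

   /* ret is only ever -1 here, so a test such as "ret == -ETIME" can
    * never match. The error code has to be read from errno. */
   if (errno == ETIME)
      return 0;

   assert(!"unexpected wait failure");
   return 0;
}
```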
Jason Ekstrand
4ffbab5ae0 vk/device: Allow for starting a new surface state buffer
This commit allows for us to create a whole new surface state buffer when
the old one runs out of room.  We simply re-emit the state base address for
the new state, re-emit binding tables, and keep going.
2015-05-29 17:49:41 -07:00
Jason Ekstrand
c4bd5f87a0 vk/device: Do lazy surface state emission for binding tables
Before, we were emitting surface states up-front when binding tables were
updated.  Now, we wait to emit the surface states until we emit the binding
table.  This makes meta simpler and should make it easier to deal with
swapping out the surface state buffer.
2015-05-29 16:51:11 -07:00
Timothy Arceri
fcc79af9e2 mesa: remove unused function declaration
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-30 07:24:02 +10:00
Brian Paul
82305f7b00 dri_util: make version var unsigned to silence warnings
_mesa_override_gl_version_contextless() takes an unsigned version
parameter.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-29 13:36:39 -06:00
Ben Widawsky
b307921c3f i965: Disable compaction for EOT send messages
AFAICT, there is no real way to make sure a send message with EOT is properly
excluded from compaction, nor can I see a way to actually encode EOT while
compacting. Before the single send optimization we'd always bail because we hit
the is_immediate && !is_compactable_immediate case. However, with single send,
is_immediate is not true, and so we end up trying to compact the uncompactable.

Without this, any compacting single send instruction will hang because the EOT
isn't there. I am not sure how I didn't hit this when I originally enabled the
optimization.  I didn't check if some surrounding code changed.

I know Neil and Matt were both looking into this. I did a quick search and
didn't see any patches out there to handle this. Please ignore if this has
already been sent by someone. (Direct me to it and I will review it).

Reported-by: Neil Roberts <neil@linux.intel.com>
Reported-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-29 11:55:10 -07:00
Kristian Høgsberg Kristensen
4aecec0bd6 vk: Store dynamic slot index with struct anv_descriptor_slot
We need to make sure we use the right index into dynamic offset
array. Dynamic descriptors can be present or not in different stages and
to get the right offset, we need to compute the index at
vkCreateDescriptorSetLayout time.
2015-05-29 11:32:53 -07:00
Roland Scheidegger
c0d2b83f0b gallivm: make sampling more robust when the sampler setup is bogus
Pure integer formats cannot be sampled with linear tex / mip filters. In GL
such a setup would make the texture incomplete.
We shouldn't rely on the state tracker though to filter that out, just return
all zeros instead of dying in the lerp.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-05-29 19:33:19 +02:00
Jose Fonseca
0ad15e55bf configure.ac: Link mcdisassembler component.
gallivm now depends on it. And depending on particular LLVM version /
configure options, the build can fail without this change due to
undefined reference to `LLVM*Disasm*' symbols.

Trivial.
2015-05-29 12:17:16 +01:00
Jose Fonseca
9119cd7d2c configure.ac: Don't bother checking whether LLVM's MCJIT component is available.
Now that we require LLVM 3.3, MCJIT is guaranteed to be available.

Trivial.
2015-05-29 12:14:34 +01:00
Jose Fonseca
0db4ef9df1 gallivm: Use the LLVM's C disassembly interface.
It doesn't do everything we want.  In particular it doesn't allow us to
detect jumps or return opcodes.  Currently we detect the x86's RET
opcode.

Even though it's worse for LLVM 3.3, it's an improvement for LLVM 3.7,
which was totally busted.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-05-29 11:20:58 +01:00
Jose Fonseca
29203e7738 gallivm: Disable frame pointer omission on LLVM 3.7.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-05-29 11:20:58 +01:00
Marek Olšák
dd048543e9 configure.ac: enable building GLES1 and GLES2 by default
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-29 11:52:44 +02:00
Marek Olšák
25e9ae2b79 st/dri: fix postprocessing crash when there's no depth buffer
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89131

Cc: 10.6 10.5 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-05-29 11:52:44 +02:00
Marek Olšák
7116250b7a radeon/llvm: reset temps_count on deallocation
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-05-29 11:52:44 +02:00
Marek Olšák
7afc992c20 radeon/llvm: don't use a static array size for radeon_llvm_context::arrays (v2)
v2: - don't use realloc (tgsi_shader_info provides the size)

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-05-29 11:52:44 +02:00
Kristian Høgsberg Kristensen
fad418ff47 vk: Implement dynamic buffer offsets
We do this by creating a surface state on the fly that incorporates the
dynamic offset. This patch also refactors the descriptor set layout
constructor a bit to be less clever with switch statement fall
through. Instead of duplicating the subtle code to update the sampler
and surface slot map, we just use two switch statements.
2015-05-28 22:41:20 -07:00
Dave Airlie
065978d36b softpipe: fix offset wrapping calculations (v2)
Roland pointed out my previous attempt was lacking, so I enhanced the
texwrap piglit test, and tested them. This fixes the offset calculations
in a number of areas by adding the offset first, it also fixes the fastpaths,
which I forgot to address in the previous commit.

v2: try and avoid divides in most paths, the repeat mirror path
really was ugly no matter which way I went, so I left it having
the divide.
Also fix the gather lod calculation bug.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-29 13:15:47 +10:00
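Editorial sketch for 065978d36b: why the texel offset has to be added before the wrap mode is applied. This is an integer simplification with hypothetical function names; softpipe works on float coordinates, and the sketch uses a plain modulo even though the commit tries to avoid divides in most paths.

```c
#include <assert.h>

/* Repeat wrap: add the offset first, then fold into [0, size). */
int wrap_repeat_with_offset(int texel, int offset, int size)
{
   int t = (texel + offset) % size;
   return t < 0 ? t + size : t;
}

/* Wrong order, shown for contrast: wrapping first and adding the
 * offset afterwards can leave the valid range. */
int wrap_repeat_then_offset(int texel, int offset, int size)
{
   int t = texel % size;
   if (t < 0)
      t += size;
   return t + offset;
}
```

For an 8-texel row, texel 7 with offset 2 wraps to texel 1 when the offset is added first, but lands on the out-of-range texel 9 when it is added last.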
Jason Ekstrand
b95ec49e57 i965/vs: Rework the logic for generating NIR from ARB vertex programs
Whether or not to use NIR is now equivalent to brw->scalar_vs.  We can
simplify the logic and make it far less confusing.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-28 17:07:01 -07:00
Jason Ekstrand
78644ffc4d i965/fs: Remove the ir_visitor code
Now that everything is running through NIR, this is all dead.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-28 17:07:01 -07:00
Jason Ekstrand
66a03a4c4b i965: Remove the old fragment program code
Now that everything is running through NIR, this is all dead.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-28 17:07:00 -07:00
Jason Ekstrand
114497afff i965: Make NIR non-optional for scalar shaders
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-28 17:07:00 -07:00
Jason Ekstrand
8b9ecfff36 i965: Make fs/vec4_visitor inherit from ir_visitor directly
This is using multiple inheritance in C++.  However, ir_visitor is really
just an interface with no data so it shouldn't be so bad.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-28 17:07:00 -07:00
Jason Ekstrand
99cb423320 i965: Rename backend_visitor to backend_shader
The backend_shader class really is a representation of a shader.  The fact
that it inherits from ir_visitor is somewhat immaterial.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-28 17:07:00 -07:00
Ian Romanick
1ca60de4c0 mesa: Enable ARB_direct_state_access by default for core profile
And core profile only.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 17:02:54 -07:00
Ian Romanick
ef4dd0fc3e dispatch_sanity: Validate the compatibility profile dispatch table too
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 17:02:47 -07:00
Ian Romanick
49ab670f52 dispatch_sanity: Split list of GL 3.1 functions into core and common
The next patch will add a test for compatibility profile dispatch, and
it seems to make more sense to share the lists.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:32 -07:00
Ian Romanick
a6fa74e6bb mesa: Don't install glVertexAttribL* functions in compatibility profile
GL_ARB_vertex_attrib_64bit is exclusive to core profile, and none of the
other functions added by the extension are advertised in other profiles.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:32 -07:00
Ian Romanick
4e5efa9e7d glapi: Make GL_ARB_direct_state_access functions exclusive to core profile
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Dylan Baker <baker.dylan.c@gmail.com>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:32 -07:00
Ian Romanick
f20899b727 glapi: Store exec table version info outside the XML
Currently only the functions that are exclusive to core-profile are
implemented.  The remainder continue to live in the XML.  Additional
functions can be moved later.

The functions for GL_ARB_draw_indirect and GL_ARB_multi_draw_indirect
are put in the dispatch table inside the VBO module, so they do not need
to be moved over.

The diff of src/mesa/main/api_exec.c before and after this patch is as
expected.  All of the functions listed in apiexec.py moved out of a 'if
(_mesa_is_desktop(ctx))' block into a new 'if (ctx->API ==
API_OPENGL_CORE)' block.

v2: Remove stray shebang line in apiexec.py.  Suggested by Ilia.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Dylan Baker <baker.dylan.c@gmail.com>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:32 -07:00
Ian Romanick
5c4aab58ee Revert "mesa: Add an extension flag for ARB_direct_state_access"
This reverts commit 30dcaaec35.

Acked-by: Fredrik Höglund <fredrik@kde.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:32 -07:00
Ian Romanick
832ea2345a mesa: Use the profile instead of an extension bit to validate GL_TEXTURE_CUBE_MAP
The extension on which this depends will always be enabled in core
profile, and the extension bit is about to be removed.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:32 -07:00
Ian Romanick
90e98ea215 Revert "mesa: Add ARB_direct_state_access checks in XFB functions"
This reverts commit 7d212765a4.

Acked-by: Fredrik Höglund <fredrik@kde.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:32 -07:00
Ian Romanick
cab233f277 Revert "mesa: Add ARB_direct_state_access checks in buffer object functions"
This reverts commit 339ed0984d.

Acked-by: Fredrik Höglund <fredrik@kde.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:32 -07:00
Ian Romanick
8bcd14fab9 Revert "mesa: Add ARB_direct_state_access checks in FBO functions"
This reverts commit 6ad0b7e07a.

Acked-by: Fredrik Höglund <fredrik@kde.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:32 -07:00
Ian Romanick
f3e8596a37 Revert "mesa: Add ARB_direct_state_access checks in renderbuffer functions"
This reverts commit cb49940766.

Acked-by: Fredrik Höglund <fredrik@kde.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:32 -07:00
Ian Romanick
1ac6a8f1d1 Revert "mesa: Add ARB_direct_state_access checks in texture functions"
This reverts commit 8940957238.

Acked-by: Fredrik Höglund <fredrik@kde.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:32 -07:00
Ian Romanick
92e362191e Revert "mesa: Add ARB_direct_state_access checks in VAO functions"
This reverts commit 36b0579337.

Acked-by: Fredrik Höglund <fredrik@kde.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:32 -07:00
Ian Romanick
ae54577544 Revert "mesa: Add ARB_direct_state_access checks in sampler object functions"
This reverts commit 9e7149c898.

Acked-by: Fredrik Höglund <fredrik@kde.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:32 -07:00
Ian Romanick
a9dcf45cd8 Revert "mesa: Add ARB_direct_state_access checks in program pipeline functions"
This reverts commit bebf3c6ab3.

Acked-by: Fredrik Höglund <fredrik@kde.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:32 -07:00
Ian Romanick
a9f678a8f4 Revert "mesa: Add ARB_direct_state_access checks in query object functions"
This reverts commit d3368e0c9e.

Acked-by: Fredrik Höglund <fredrik@kde.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:32 -07:00
Ian Romanick
f1fcf79e3c Revert "i915: Enable ARB_direct_state_access"
This reverts commit 121030eed8.

Acked-by: Fredrik Höglund <fredrik@kde.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:32 -07:00
Ian Romanick
4bc00b1a4b Revert "i965: Enable ARB_direct_state_access"
This reverts commit a57feba0a3.

Acked-by: Fredrik Höglund <fredrik@kde.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:32 -07:00
Ian Romanick
73cf10e623 Revert "st/mesa: Enable ARB_direct_state_access"
This reverts commit 357bf80caa.

Acked-by: Fredrik Höglund <fredrik@kde.org>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-28 16:56:31 -07:00
Ian Romanick
9b5e92f4cc mesa: Allow overriding the version of ES2+ contexts
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-28 16:56:31 -07:00
Ian Romanick
03fd6704db mesa: Add support for a new override string MESA_GLES_VERSION_OVERRIDE
The string is only applied when the context is API_OPENGLES2.

The bulk of the change is to prevent overriding the context to
API_OPENGL_CORE based on the requested version.  If the context is
API_OPENGL_ES2, don't change it.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-28 16:56:31 -07:00
Ian Romanick
464c56d3d5 dri_util: Use _mesa_override_gl_version_contextless
Remove _mesa_get_gl_version_override.  We don't need two functions that
do basically the same thing.  This change seemed easier (esp. with the
next patch) than going the other way.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-28 16:56:31 -07:00
Ian Romanick
1fe243938b mesa/es3.1: Enable ES 3.1 API and shading language version
This is a bit of a hack for now.  Several of the extensions required for
OpenGL ES 3.1 have no support, at all, in Mesa.  However, with this
patch and a patch to allow MESA_GL_VERSION_OVERRIDE to work with ES
contexts, people can begin testing the ES "version" of the functionality
that is supported.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-28 16:56:31 -07:00
Ian Romanick
366ceacf72 gles/es3.1: Enable dispatch of almost all new GLES 3.1 functions
A couple functions are missing because there are no implementations of
them yet.  These are:

      glFramebufferParameteri (from GL_ARB_framebuffer_no_attachments)
      glGetFramebufferParameteriv (from GL_ARB_framebuffer_no_attachments)
      glMemoryBarrierByRegion

v2: Rebase on updated dispatch_sanity.cpp test.

v3: Add support for glDraw{Arrays,Elements}Indirect in vbo_exec_array.c.
The updated dispatch_sanity.cpp test discovered this omission.

v4: Rebase on glapi changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-28 16:56:31 -07:00
Jason Ekstrand
9ffc1bed15 vk/device: Split state base address emit into its own function 2015-05-28 15:34:08 -07:00
Jason Ekstrand
468c89a351 vk/device: Use anv_batch_emit for MI_BATCH_BUFFER_START 2015-05-28 15:25:02 -07:00
Jason Ekstrand
8bbe7fa7a8 i965/fs: Properly handle explicit depth in SIMD16 with dual-source blend
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90629
Tested-by: Markus Wick <markus@selfnet.de>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-28 13:33:09 -07:00
Jason Ekstrand
2dc0f7fe5b vk/device: Actually destroy batch buffers 2015-05-28 13:08:21 -07:00
Matt Turner
e354cc9b79 i965: Silence warning in 3-src type-setting.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-05-28 12:59:04 -07:00
Matt Turner
0596134410 i965/fs: Fix lowering of integer multiplication with cmod.
If the multiplication's result is unused, except by a conditional_mod,
the destination will be null. Since the final instruction in the lowered
sequence is a partial-write, we can't put the conditional mod on it and
we have to store the full result to a register and do a MOV with a
conditional mod.

Cc: "10.6" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90580
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-28 12:58:50 -07:00
Jason Ekstrand
8cf932fd25 vk/query: Don't emit a CS stall by itself
Both the bspec and the simulator don't like this.  I'm not sure if stalling
at the scoreboard is right but it at least shuts up the simulator.
2015-05-28 10:27:53 -07:00
Jason Ekstrand
730ca0efb1 vk/device: Fixups for batch buffer chaining
Somehow these didn't get merged with the other batch buffer chaining
stuff.  Oh well, it's here now.
2015-05-28 10:26:11 -07:00
Jason Ekstrand
de221a672d meta: Add a default ds_state and use it when no ds state is set 2015-05-28 10:06:45 -07:00
Jason Ekstrand
6eefeb1f84 vk/meta: Share the dummy RS and CB state between clear and blit 2015-05-28 10:00:38 -07:00
Iago Toral Quiroga
2231cf0ba3 nir: Fix output swizzle in get_mul_for_src
When we compute the output swizzle we want to consider the number of
components in the add operation. So far we were using the writemask
of the multiplication for this instead, which is not correct.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-28 18:25:37 +02:00
Jose Fonseca
09d6243aed gallivm: Workaround LLVM PR23628.
Temporarily undefine DEBUG macro while including LLVM C++ headers,
leveraging the push/pop_macro pragmas, which are supported both by GCC
and MSVC.

https://bugs.freedesktop.org/show_bug.cgi?id=90621

Trivial.
2015-05-28 10:12:55 +01:00
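Editorial sketch for 09d6243aed: the push_macro/pop_macro pattern named above, in a self-contained form. The function third_party_debug_id is a hypothetical stand-in for the LLVM C++ headers; it only demonstrates code that would not compile while DEBUG is defined as a macro.

```c
#include <assert.h>

#define DEBUG 1

/* Save the current definition of DEBUG, then hide it. */
#pragma push_macro("DEBUG")
#undef DEBUG

/* Stand-in for the third-party headers: this code uses DEBUG as an
 * ordinary identifier, which breaks if DEBUG expands to 1. */
int third_party_debug_id(void)
{
   enum { DEBUG = 7 };
   return DEBUG;
}

/* Restore the saved definition for the rest of the file. */
#pragma pop_macro("DEBUG")

int our_debug_level(void)
{
#ifdef DEBUG
   return DEBUG;
#else
   return 0;
#endif
}
```

After the pop, DEBUG expands to 1 again, so code below the include sees the same definition it had before.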
Kristian Høgsberg Kristensen
5a317ef4cb vk: Initialize dynamic state binding points to NULL
We rely on these being initialized to NULL so meta can reliably detect
whether or not they've been set. ds_state is also allowed to not be
present so we need a well-defined value for that.
2015-05-27 22:13:48 -07:00
Eric Anholt
10aacf5ae8 vc4: Just stream out fallback IB contents.
The idea I had when I wrote the original shadow code was that you'd see a
set_index_buffer to the IB, then a bunch of draws out of it.  What's
actually happening in openarena is that set_index_buffer occurs at every
draw, so we end up making a new shadow BO every time, and converting more
of the BO than is actually used in the draw.

While I could maybe come up with a better caching scheme, for now just
do the simple thing that doesn't result in a new shadow IB allocation
per draw.

Improves performance of isosurf in drawelements mode by 58.7967% +/-
3.86152% (n=8).
2015-05-27 17:29:11 -07:00
Eric Anholt
f8de6277bf vc4: Don't try to put our dmabuf-exported BOs into the BO cache.
We'd sometimes try to reallocate something that X was using as a new
pipe_resource, and potentially conflict in our rendering.  But even
worse, if we reallocated the BO as a shader, the kernel would reject
rendering using the shader.
2015-05-27 17:29:11 -07:00
Eric Anholt
b0edc19a52 vc4: Don't forget to make our raster shadow textures non-raster.
Not sure what happened in my testing that made the previous shadow
code fix glxgears swapbuffering, but this also fixes lots of CopyArea
in X (like dragging xlogo around in metacity).
2015-05-27 17:29:11 -07:00
Samuel Pitoiset
41630c0653 vc4: make vc4_begin_query() return a boolean
I forgot to make the change in 96f164f6f0.
This fixes a warning with GCC and probably an error with Clang.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-05-27 17:29:03 -07:00
Ben Widawsky
e2d84d99f5 i965: Emit 3DSTATE_MULTISAMPLE before WM_HZ_OP (gen8+)
Starting with GEN8, there is documentation that the multisample state command
must be emitted before the 3DSTATE_WM_HZ_OP command any time the multisample
count changes. The 3DSTATE_WM_HZ_OP packet gets emitted as a result of an
intel_hiz_exec(), which is called upon a fast clear and/or a resolve. This can
happen before the state atoms are checked, and so the multisample state must be
put directly in the function.

v1:
- In v0, I was always emitting the command, but Ken came up with the condition to
determine whether or not the sample count actually changed.
- Ken's recommendation was to set brw->num_multisamples after emitting
3DSTATE_MULTISAMPLE. This doesn't work. I put my best guess as to why in the XXX
(it was causing 7 regressions on BDW).

v2:
Flag NEW_MULTISAMPLE state. As Ken found, in state upload we check for the
multisample change to determine whether or not to emit certain packets. Since
the hiz code doesn't actually care about the number of multisamples, set the
flag and let the later code take care of it.

Jenkins results:
http://otc-mesa-ci.jf.intel.com/view/dev/job/bwidawsk/136/

Fixes around 200 piglit tests on SKL. I'm somewhat surprised that it seems to
have no impact on BDW as the restriction is needed there as well.

Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Neil Roberts <neil@linux.intel.com> (v0)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2)
2015-05-27 17:08:08 -07:00
Vinson Lee
147ffd4816 gallivm: Do not use NoFramePointerElim with LLVM 3.7.
TargetOptions::NoFramePointerElim was removed in llvm-3.7.0svn r238244
"Remove NoFramePointerElim and NoFramePointerElimOverride from
TargetOptions and remove ExecutionEngine's dependence on CodeGen. NFC."

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-05-27 17:01:51 -07:00
Chad Versace
1435bf4bc4 .gitignore: Ignore spirv2nir binary 2015-05-27 17:01:09 -07:00
Chad Versace
f559fe9134 .gitignore: Scope Vulkan's generated source files
Don't ignore any file named entrypoints.{c,h}. Ignore it only if it's in
src/vulkan.
2015-05-27 16:59:53 -07:00
Chad Versace
ca385dcf2a vk: gitignore generated source files 2015-05-27 16:57:31 -07:00
Chad Versace
466f61e9f6 vk/glsl_scraper: Replace adhoc arg parsing with argparse 2015-05-27 16:56:02 -07:00
Chad Versace
fab9011c44 vk/image: Assert that VkImageTiling is valid 2015-05-27 16:21:04 -07:00
Chad Versace
c0739043b3 vk/image: Remove trailing whitespace 2015-05-27 16:15:47 -07:00
Chad Versace
4514e63893 vk/glsl: Reject invalid options
The script incorrectly interpreted --blah as the input filename.
2015-05-27 16:14:26 -07:00
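A minimal sketch of the behavior commit 4514e63893 is after: an unknown option is rejected instead of being taken as the input filename. It uses Python's argparse, which commit 466f61e9f6 in this log adopts for the scraper. The positional name `infile` and the `-o/--outfile` option are assumptions for illustration, not the script's real interface.

```python
import argparse


def parse_args(argv):
    """Parse a hypothetical scraper command line.

    The argument names here are illustrative assumptions only.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("infile", help="input source file")
    parser.add_argument("-o", "--outfile", default=None, help="output file")
    # parse_args() reports unrecognized options and exits with status 2,
    # so "--blah" can never be mistaken for the input filename.
    return parser.parse_args(argv)
```

With this shape, `parse_args(["--blah", "shader.c"])` terminates with an "unrecognized arguments" error rather than silently reading a file named `--blah`.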
Chad Versace
fd8b5e0df2 vk/glsl_scraper: Indent large text blocks
Indent them to the same level as if the text was code.

No changes in entrypoints.{c,h} after a clean build.
2015-05-27 16:09:31 -07:00
Chad Versace
df4b02f4ed vk/glsl_scraper: Fix code style for imports
Python style is one module imported per line, and imports are at the top
of the file.
2015-05-27 16:04:12 -07:00
Kenneth Graunke
70c6f2323e i965: Remove _NEW_MULTISAMPLE dirty bit from 3DSTATE_PS_EXTRA.
BRW_NEW_NUM_SAMPLES is sufficient.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-05-27 12:20:25 -07:00
Kenneth Graunke
bb18df008e i965: Delete GS scratch space workaround warning.
This workaround is documented in the 3DSTATE_GS documentation.  It
appears to only apply to early steppings of Broadwell and Skylake.

I don't think it ever affected production hardware, so at this point it
probably makes sense to delete it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-27 12:20:18 -07:00
Jason Ekstrand
b23885857f vk/meta: Actually create the CB state for blits 2015-05-27 12:06:30 -07:00
Jason Ekstrand
da8f148203 vk: Rework anv_batch and use chaining batch buffers
This mega-commit primarily does two things.  The first is to turn anv_batch
into a better abstraction of a batch.  Instead of actually having a BO, it
now has a few pointers to some piece of memory that are used to add data to
the "batch".  If it gets to the end, there is a function pointer that it
can call to attempt to grow the batch.

The second change is to start using chained batch buffers.  When the end of
the current batch BO is reached, it automatically creates a new one and
inserts an MI_BATCH_BUFFER_START command to chain to it.  In this way, our
batch buffers are effectively infinite in length.
2015-05-27 11:48:28 -07:00
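The chaining scheme described in da8f148203 can be modeled roughly as follows. This is a toy sketch in Python, not the driver's C code: the class, its fields, and the use of list slots in place of bytes are all assumptions made for illustration. Only the idea (reserve room for a chain command, then continue in a fresh buffer) comes from the commit message.

```python
# Toy model of a chained batch. Names are hypothetical, not anv's API.
BATCH_BUFFER_START = "MI_BATCH_BUFFER_START"


class ChainedBatch:
    def __init__(self, bo_size):
        # Each buffer object (BO) holds bo_size slots; one is reserved
        # for the chaining command, so at least two slots are needed.
        assert bo_size >= 2
        self.bo_size = bo_size
        self.bos = [[]]

    def emit(self, cmd):
        current = self.bos[-1]
        if len(current) == self.bo_size - 1:
            # The current BO is full: chain to a new one and continue there.
            current.append((BATCH_BUFFER_START, len(self.bos)))
            self.bos.append([])
            current = self.bos[-1]
        current.append(cmd)

    def commands(self):
        """Walk the chain in execution order, following chain links."""
        out = []
        index = 0
        while True:
            jumped = False
            for cmd in self.bos[index]:
                if isinstance(cmd, tuple) and cmd[0] == BATCH_BUFFER_START:
                    index = cmd[1]
                    jumped = True
                    break
                out.append(cmd)
            if not jumped:
                return out
```

From the caller's point of view the batch never fills up: `emit()` can be called any number of times, and the commands come back in order when the chain is walked.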
Jason Ekstrand
59def43fc8 Fixup for growable reloc lists 2015-05-27 11:48:28 -07:00
Jason Ekstrand
1c63575de8 vk/cmd_buffer: Allocate the surface_bo from device->batch_bo_pool 2015-05-27 11:48:28 -07:00
Jason Ekstrand
403266be05 vk/device: Make reloc lists growable 2015-05-27 11:48:28 -07:00
Jason Ekstrand
5ef81f0a05 vk/device: Use a bo pool for batch buffers 2015-05-27 11:48:28 -07:00
Jason Ekstrand
6f3e3c715a vk/allocator: Add a BO pool 2015-05-27 11:48:28 -07:00
Jason Ekstrand
59328bac10 vk/allocator: Add a free list that acts on pointers instead of offsets 2015-05-27 11:48:28 -07:00
EdB
40665362fd clover: Log build options when dumping clc source.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-05-27 15:33:58 +03:00
Ian Romanick
2b8c51834b glapi: Encapsulate nop table knowledge in new _mesa_new_nop_table function
Encapsulate the knowledge about how to build the nop table in a new
_mesa_new_nop_table function.  This makes it easier for dispatch_sanity
to keep working now and in the future.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Cc: 10.6 <mesa-stable@lists.freedesktop.org>
2015-05-26 18:25:41 -07:00
Kristian Høgsberg
a1d30f867d vk: Add support for dynamic and pipeline color blend state 2015-05-26 17:12:37 -07:00
Kristian Høgsberg
2514ac5547 vk/test: Create and use color/blend dynamic and pipeline state 2015-05-26 17:12:37 -07:00
Kristian Høgsberg
1cd8437b9d vk/meta: Allocate and set color/blend state
For color blend, we have to set our own state to avoid inheriting bogus
blend state.
2015-05-26 17:12:37 -07:00
Thomas Helland
8d813d14e1 docs: Fix some typos in the developer notes
Found when double-checking my review on Brian's series.

Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-05-26 15:14:04 -06:00
Kristian Høgsberg
610e6291da vk: Allocate samplers from dynamic stream 2015-05-26 11:50:34 -07:00
Kristian Høgsberg
b29f44218d vk: Emit color calc state
This involves pulling stencil ref values out of DS dynamic state and the
blend constant out of CB dynamic state.
2015-05-26 11:27:31 -07:00
Kristian Høgsberg
5e637c5d5a vk/pack: Generate length macros for structs 2015-05-26 11:27:31 -07:00
Kristian Høgsberg
998837764f vk: Program depth bias
This makes 3DSTATE_RASTER a split state command.
2015-05-26 11:27:31 -07:00
Kristian Høgsberg
0dbed616af vk: Add support for texture component swizzle
This also drops the shared create_surface_state helper and moves filling
out SURFACE_STATE directly into anv_image_view_init() and
anv_color_attachment_view_init().
2015-05-26 11:27:29 -07:00
Brian Paul
be71bbfaa2 mesa: do not use _glapi_new_nop_table() for DRI builds
Commit 4bdbb588a9 introduced new _glapi_new_nop_table() and
_glapi_set_nop_handler() functions in the glapi dispatcher (which
live in libGL.so).  The calls to those functions from context.c
would be undefined (i.e. an ABI break) if the libGL used at runtime
was older.

For the time being, use the old single generic_nop() function for
non-Windows builds to avoid this problem.  At some point in the future
it should be safe to remove this work-around.  See comments for more
details.

v2: Incorporate feedback from Emil.  Use _WIN32 instead of
GLX_DIRECT_RENDERING to control behavior, move comments.

Cc: 10.6 <mesa-stable@lists.freedesktop.org>
Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
2015-05-26 12:16:48 -06:00
Brian Paul
2ab0ca36c1 docs: add information about reviewing patches
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-26 12:16:36 -06:00
Brian Paul
c6184f84b7 docs: update the coding style information
This hasn't been updated in a long time and from recent discussion on
the mailing list, it's not always clear what's expected.  Hopefully,
this will help a bit.

v2: document function brace placement, per Thomas Helland.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2015-05-26 10:02:59 -06:00
Brian Paul
d959885b91 docs: update documentation about patch formatting, testing, etc
v2: correctly escape < and > chars.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2015-05-26 10:02:59 -06:00
Brian Paul
98f2f47f7a docs: reorganize devnotes.html file
Move "Adding Extensions" to the end.  Add a simple table of contents
at the top.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2015-05-26 10:02:59 -06:00
Brian Paul
eec904d29c xlib: fix X_GLXCreateContextAtrribs/Attribs typo
In case the glproto.h file isn't up to date, we provide the #define
for X_GLXCreateContextAttribsARB.

v2: fix other occurrences, improve #ifndef test, per Jose.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-05-26 09:58:09 -06:00
Brian Paul
dce53a7d24 mesa: add some comments in copyimage.c
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-05-26 09:58:09 -06:00
Brian Paul
0b76541ce0 mesa: move decls, add const qualifiers in copyimage.c
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-05-26 09:58:09 -06:00
Brian Paul
8369675a55 mesa: code clean-ups in textureview.[ch]
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-05-26 09:58:09 -06:00
Brian Paul
3ddd1cf7d1 mesa: const qualify, return bool for _mesa_texture_view_compatible_format()
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-05-26 09:58:09 -06:00
Brian Paul
09eabf5be6 mesa: add const qualifier on _mesa_is_compressed_format()
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-05-26 09:58:09 -06:00
Jose Fonseca
b787f48ed2 glapi: Avoid argparse type argument for API XML input files.
argparse type is a nice time saver for simple data types, but it doesn't
look like a good fit for the input XML file:

- Certain implementations of argparse (particularly python 2.7.3's)
  invoke the type constructor for the default argument even when an
  option is passed in the command line, causing `No such file or
  directory: 'gl_API.xml'` when the current dir is not
  src/mapi/glapi/gen.

- The parser takes multiple arguments.  This is currently worked around
  using lambdas, but that is unnecessarily complex and hard to read.
  Furthermore it's odd to have a side-effect as heavy as parsing XML
  happening deep inside the argument parsing.

https://bugs.freedesktop.org/show_bug.cgi?id=90600

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-05-26 15:26:03 +01:00
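A minimal sketch of the approach b787f48ed2 argues for: keep the option a plain string and parse the file only after argument parsing has finished. `parse_api` is a hypothetical stand-in for the real XML parser, and the option names are assumptions, not the scripts' actual interface.

```python
import argparse


def parse_api(path, factory=None):
    """Hypothetical stand-in for the XML parser; it only records its inputs."""
    return {"path": path, "factory": factory}


def build_parser():
    parser = argparse.ArgumentParser()
    # No type= callable here: the default is just a string, so nothing
    # touches the filesystem while arguments are being parsed.
    parser.add_argument("-f", "--filename", default="gl_API.xml",
                        help="path to the API XML file")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # The heavy work happens explicitly, after parsing, and the parser can
    # take extra arguments without being wrapped in a lambda.
    return parse_api(args.filename, factory="default")
```

Because the default is never opened unless it is actually used, running from a directory other than src/mapi/glapi/gen with an explicit `-f` does not trip over a missing `gl_API.xml`.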
Marek Olšák
224a77cc60 radeonsi: use a switch statement in si_delete_shader_selector
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-05-26 12:42:37 +02:00
Marek Olšák
0c5a309cee radeonsi: use a switch statement in si_shader_selector_key
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-05-26 12:42:37 +02:00
Marek Olšák
fa7f606e89 radeonsi: fix scratch buffer setup for geometry shaders
Cc: 10.6 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-05-26 12:42:37 +02:00
Marek Olšák
f41517242a radeonsi: remove unused cases from si_shader_io_get_unique_index
These can't occur between VS and GS, because GS is only supported
in the core profile.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-05-26 12:42:37 +02:00
Marek Olšák
af4b9c7c2e radeonsi: don't count special outputs for the VS export count
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-05-26 12:42:36 +02:00
Marek Olšák
e4339bc988 radeonsi: add support for PIPE_CAP_TGSI_TEXCOORD
Without it, texcoords are mapped to GENERIC[0..7], PointCoord is mapped to
GENERIC[8], and user-defined varyings start from GENERIC[9]. Since texcoords
can only be used between VS and PS, and PointCoord is PS-only, it's silly to
always start from GENERIC[9] in all other shaders (such as LS, HS, ES, GS).

This adds support for TEXCOORD and PCOORD semantics. As a result, st/mesa
will use GENERIC[0] as a base for user-defined varyings, which should make
linking ES and GS as well as tessellation shaders at runtime easier.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-05-26 12:42:31 +02:00
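The slot assignment described in e4339bc988 can be written out as a small mapping. This is an illustrative Python sketch, not radeonsi or st/mesa code: the function name and the `kind` strings are assumptions, while the slot numbers are taken from the commit message.

```python
def varying_semantic(kind, index, has_texcoord_cap):
    """Return (semantic_name, semantic_index) for a varying.

    kind is one of "texcoord", "pointcoord" or "user" (hypothetical labels).
    """
    if has_texcoord_cap:
        # Dedicated semantics exist, so user varyings start at GENERIC[0].
        if kind == "texcoord":
            return ("TEXCOORD", index)
        if kind == "pointcoord":
            return ("PCOORD", 0)
        return ("GENERIC", index)
    # Without the cap everything shares the GENERIC namespace:
    # texcoords use 0..7, PointCoord uses 8, user varyings start at 9.
    if kind == "texcoord":
        return ("GENERIC", index)
    if kind == "pointcoord":
        return ("GENERIC", 8)
    return ("GENERIC", 9 + index)
```

For example, the first user-defined varying lands in GENERIC[9] without the cap and in GENERIC[0] with it.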
Marek Olšák
3d35027fdc tgsi/ureg: enable creating tessellation shaders with ureg_create_shader
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-26 11:46:28 +02:00
Marek Olšák
c1266f28d6 tgsi/text: enable parsing tessellation shaders
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-26 11:46:28 +02:00
Marek Olšák
0d84b6cf84 gallium: rename TGSI tessellation processor types to match pipe shader names
I forgot to do this when pushing the interface changes.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-26 11:46:28 +02:00
Marek Olšák
92c31bb0dd gallium: use const in set_tess_state
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-26 11:46:28 +02:00
Koop Mast
967825d053 clover: Build fix for FreeBSD.
Cc: 10.6 10.5 <mesa-stable@lists.freedesktop.org>
2015-05-26 11:46:28 +02:00
Neil Roberts
5ae6c7bfce i965/skl: Add a message header for the TXF_MCS instruction in vec4vs
When using SIMD4x2 on Skylake, the sampler instructions need a message
header to select the correct mode. This was added for most sample
instructions in 0ac4c2727 but the TXF_MCS instruction is emitted
separately and it was missed.

This fixes a bunch of Piglit tests which test texelFetch in a geometry
shader, for example:

 spec/arb_texture_multisample/texelfetch/2-gs-sampler2dms

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-26 10:22:27 +01:00
Kristian Høgsberg
cbe7ed416e vk: Implement dynamic and pipeline ds state 2015-05-25 20:20:31 -07:00
Kristian Høgsberg
37743f90bc vk: Set up depth and stencil buffers 2015-05-25 20:20:31 -07:00
Kristian Høgsberg
7c0d0021eb vk/test: Add new depth-stencil test
Not yet a depth stencil test, but will become one.
2015-05-25 20:20:31 -07:00
Kristian Høgsberg
0997a7b2e3 vk: Add basic MOCS settings
This matches what we do for GL.
2015-05-25 20:20:31 -07:00
Kristian Høgsberg
c03314bdd3 vk: Update to header files with nested struct support
This will let us do MOCS settings right.
2015-05-25 20:20:31 -07:00
Ilia Mirkin
3ec1815285 nv30: falling back to draw path for edgeflag does no good
The problem is that the EDGEFLAG has to be toggled at vertex submission
time. This can be done from either the draw or the regular paths. Avoid
falling back to draw just because there's an edgeflag.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-05-25 21:45:31 -04:00
Ilia Mirkin
25be70462d nv30/draw: switch varying hookup logic to know about texcoords
Commit 8acaf862df switched things over to use TEXCOORD instead of
GENERIC, but did not update the nv30 swtnl draw paths. This teaches the
draw logic about TEXCOORD.

Among other things, this fixes a crash in demos/arbocclude when using
swtnl. Curiously enough, the point-sprite piglit works without this.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-05-25 21:45:31 -04:00
Ilia Mirkin
c3d36a2e1a nv30/draw: allocate vertex buffers in gart
These are only used once per draw, so it makes sense to keep them in
GART. Also take this opportunity to modernize the buffer mapping API
usage.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-05-25 21:45:22 -04:00
Ilia Mirkin
fdad7dfbda nv30/draw: only use the DMA1 object (GART) if the bo is not in VRAM
Instead of always having it in the data, let the bo placement decide it.
This fixes glxgears with swtnl forced on.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-05-25 21:45:08 -04:00
Ilia Mirkin
3600439897 nv30/draw: fix indexed draws with swtnl path and a resource index buffer
The map = assignment was missing.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-05-25 20:16:51 -04:00
Jason Ekstrand
ae8c93e023 vk/cmd_buffer: Initialize the pipeline pointer to NULL
If a meta operation is called before the pipeline is set, this can cause
uses of undefined values.  They *should* be harmless, but we might as well
shut up valgrind on this one too.
2015-05-25 17:14:49 -07:00
Jason Ekstrand
912944e59d vk/device: Use the correct number of viewports when creating default VP state
Fixes valgrind uninitialized value errors
2015-05-25 17:14:49 -07:00
Jason Ekstrand
1b211feb6c vk/compiler: Zero out the vs_prog_data struct when VS is disabled
Prevents uninitialized value errors
2015-05-25 17:14:49 -07:00
Ilia Mirkin
5646f0f18a glsl: avoid leaking linked gl_shader when there's a late linker error
This makes piglit mixing-clip-distance-and-clip-vertex-disallowed have 0
definitely lost blocks with valgrind. (Same non-0 number of possibly
lost blocks though.)

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-05-25 16:52:11 -04:00
Roland Scheidegger
6a111e54d7 llvmpipe: (trivial) add parentheses in (!x == y) expression
Apparently some compilers think we probably wanted to do !(x == y) instead
and issue a warning, so just shut it up... No functional change, obviously.

Cc: <mesa-stable@lists.freedesktop.org>
2015-05-25 22:24:42 +02:00
Jason Ekstrand
903bd4b056 vk/compiler: Fix up the binding hack and make it work in NIR 2015-05-25 12:57:32 -07:00
Ilia Mirkin
bb973723a5 st/mesa: don't leak glsl_to_tgsi object on link failure
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-05-25 15:45:12 -04:00
Ilia Mirkin
147816375d nv30/draw: draw expects constbuf size in bytes, not vec4 units
This fixes glxgears with NV30_SWTNL=1 forced on. Probably fixes a bunch
of other situations where we fall back to the swtnl path.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-05-25 14:11:16 -04:00
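The unit mismatch fixed in 147816375d comes down to one multiplication. A minimal sketch, assuming each constant is a vec4 of 32-bit floats; the function name is hypothetical:

```python
FLOAT_SIZE = 4        # bytes in a 32-bit float (assumed constant type)
VEC4_COMPONENTS = 4   # x, y, z, w


def constbuf_size_bytes(num_vec4):
    """Convert a constant-buffer size from vec4 units to bytes."""
    return num_vec4 * VEC4_COMPONENTS * FLOAT_SIZE
```

Passing the vec4 count where bytes are expected under-reports the buffer by a factor of 16 under these assumptions.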
Ilia Mirkin
89585edf3c nv30/draw: avoid leaving stale pointers in draw state
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-05-25 14:11:16 -04:00
Jason Ekstrand
cc3d275557 Fix an unused variable warning
Trivial.  Deleted the 2 unneeded lines.
2015-05-25 09:27:10 -07:00
Tobias Klausmann
843ff4ba2a docs: Mark ARB_cull_distance as in progress
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2015-05-25 16:27:09 +02:00
Iago Toral Quiroga
3dec892d9b docs: Mark ARB_shader_storage_buffer_object as in progress
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-05-25 10:26:38 +02:00
Ilia Mirkin
7518fc3c66 nv30: fix clip plane uploads and enable changes
nv30_validate_clip depends on the rasterizer state. Also we should
upload all the new clip planes on change since next time the plane data
won't have changed, but the enables might.

This fixes fixed-clip-enables and vs-clip-vertex-enables shader tests.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-05-24 12:00:03 -04:00
Ilia Mirkin
aba3392541 nv30: avoid doing extra work on clear and hitting unexpected states
Clearing can happen at a time when various state objects are incoherent
and not ready for a draw. Some of the validation functions don't handle
this well, so only flush the framebuffer state. This has the advantage
of also not doing extra work.

This works around some crashes that can happen when clearing.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
2015-05-24 12:00:03 -04:00
Emil Velikov
207ae2b0ef docs: add news item and link release notes for mesa 10.5.6
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-24 10:47:54 +01:00
Emil Velikov
81d5d78573 docs: Add sha256sums for the 10.5.6 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 8cb28bc49d)
2015-05-24 10:45:38 +01:00
Emil Velikov
3ab4556b84 Add release notes for the 10.5.6 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit b1cf9cfb16)
2015-05-24 10:45:35 +01:00
Ilia Mirkin
9870ed05dd nv30: avoid leaking render state and draw shaders
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-05-24 02:26:29 -04:00
Ilia Mirkin
605ce36d7f nv30: don't leak fragprog consts
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-05-24 01:33:06 -04:00
Ilia Mirkin
fa7f9f123b nv50/ir: avoid messing up arg1 of PFETCH
There can be scenarios where the "indirect" arg of a PFETCH becomes
known, and so the code will attempt to propagate it. Use this
opportunity to just fold it into the first argument, and prevent the
load propagation pass from touching PFETCH further.

This fixes gs-input-array-vec4-index-rd.shader_test and
vs-output-array-vec4-index-wr-before-gs.shader_test on nvc0 at least.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-05-23 22:15:15 -04:00
Grigori Goronzy
f972b223c4 clover: try userptr for CL_MEM_USE_HOST_PTR
According to spec, CL_MEM_USE_HOST_PTR should directly use host memory,
if possible. This is just what userptr is for, so use it.

In case the memory cannot be mapped, a fallback similar to
CL_MEM_COPY_HOST_PTR is used.

v2: constify, drop unneeded cast

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-05-24 01:14:49 +02:00
Grigori Goronzy
5c495e8638 clover: implement CL_MEM_ALLOC_HOST_PTR
This flag is typically used to request pinned host memory, to avoid
any copies between GPU and CPU.

This improves throughput with an older OpenCL app which I unfortunately
can't publish due to its licensing.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-05-24 01:14:48 +02:00
Ilia Mirkin
c922758685 nv30: check nouveau_bo_map output of notify bo
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-23 19:10:07 -04:00
Ilia Mirkin
921917c8d8 nvc0: a geometry shader can have up to 1024 vertices output
The 1024 is already reported everywhere, not sure where this 0x1ff came
from.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-05-23 17:55:21 -04:00
Jason Ekstrand
6ca67f62e8 i965/fs: Fix implied_mrf_writes for scratch writes
We build the entire message in the generator so all the MRF writes are
implied.

Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-23 12:09:24 -07:00
Jason Ekstrand
58aed1031d prog_to_nir: Use a variable for uniform data
Previously, the prog_to_nir pass was directly generating uniform load/store
intrinsics.  This converts it to use a single giant "parameters" variable
and we now depend on lowering to get the uniform load/store intrinsics.
One advantage of this is that we now have one code-path after we do the
initial conversion into NIR.

No shader-db changes.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-23 12:09:08 -07:00
Samuel Pitoiset
c783fd476c nv50: fix PIPE_QUERY_TIMESTAMP_DISJOINT, based on nvc0
PIPE_QUERY_TIMESTAMP_DISJOINT could not work because q->ready was always
set to FALSE. To fix this issue, add more query states, following what
nvc0 does.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-23 19:00:55 +02:00
Ilia Mirkin
217301843a nvc0/ir: LOAD's can't be used for shader inputs
We forgot to convert to VFETCH in case of indirect access. Fix that.

This avoids crashes on the new gs-input-array-vec4-index-rd and
vs-output-array-vec4-index-wr-before-gs but they still fail.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-05-22 19:08:24 -04:00
Ilia Mirkin
0bab3962f5 nv50/ir: guess that the constant offset is the starting slot of array
When we get something like IN[ADDR[0].x+5], we will now guess that we
should look at IN[5] for the "base" information.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-05-22 19:08:14 -04:00
Jason Ekstrand
57153da2d5 vk: Actually implement some sort of destructor for all object types 2015-05-22 15:15:08 -07:00
Ilia Mirkin
d1eea18a59 nvc0/ir: set ftz when sources are floats, not just destinations
In the case of a compare, the destination might be a predicate, but we
still want to flush denorms.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
2015-05-22 16:51:05 -04:00
Ilia Mirkin
a85aba190d nv50/ir: allow OP_SET to merge with OP_SET_AND/etc as well as a neg
This covers the pattern where a KILL_IF is used, which triggers a
comparison of -x to 0. This can usually be folded into the comparison whose
result is being compared to 0, however it may, itself, have already been
combined with another comparison. That shouldn't impact the logic of
this pass however. With this and the & 1.0 change, code like

00000020: 001c0001 80081df4     set b32 $r0 lt f32 $r0 0x3e800000
00000028: 001c0000 201fc000     and b32 $r0 $r0 0x3f800000
00000030: 7f9c001e dd885c00     set $p0 0x1 lt f32 neg $r0 0x0
00000038: 0000003c 19800000     $p0 discard

becomes

00000020: 001c001d b5881df4     set $p0 0x1 lt f32 $r0 0x3e800000
00000028: 0000003c 19800000     $p0 discard

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-22 16:51:05 -04:00
Ilia Mirkin
d2a474e8d4 nvc0/ir: optimize set & 1.0 to produce boolean-float sets
This has started to happen more now that the backend is producing
KILL_IF more often.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
2015-05-22 16:51:05 -04:00
Ilia Mirkin
e5ad19a46e nvc0/ir: allow iset to produce a boolean float
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-22 16:51:05 -04:00
Ilia Mirkin
0ec6b8ea8c nvc0/ir: avoid jumping to a sched instruction
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-22 16:51:05 -04:00
Brian Paul
491adb61d2 glx: fix Scons build
Replace -h with --header-tag as was done for the Makefile build.

Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>
2015-05-22 14:38:33 -06:00
Dylan Baker
3f823cc55a glapi: glX_proto_size.py: use a main function
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:28 -07:00
Dylan Baker
9ace0b5422 glapi: glX_proto_size.py: use argparse instead of getopt
This is roughly equivalent to the original getopt, except that it
removes the '-h' short option, which argparse reserves for
auto-generated help messages. It does retain the long option specified
by the getopt version, and changes the makefile to use that.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:28 -07:00
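The `-h` conflict mentioned in 9ace0b5422 is easy to reproduce. In this sketch only the `--header-tag` long option comes from this log (see 491adb61d2); the `dest`, `metavar` and help text are assumptions.

```python
import argparse


def build_parser():
    # ArgumentParser registers -h/--help itself (add_help defaults to True).
    parser = argparse.ArgumentParser()
    # Only the long option is offered; a "-h" short form would collide
    # with the built-in help flag.
    parser.add_argument("--header-tag", dest="header_tag", metavar="TAG",
                        default=None, help="tag for the generated header")
    return parser
```

Trying to register `-h` on such a parser raises `argparse.ArgumentError` because of the conflicting option string, which is why callers such as the Makefile and SCons build have to switch to the long spelling.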
Dylan Baker
1c7cc67778 glapi: glX_proto_recv.py: Use a main function
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:28 -07:00
Dylan Baker
d986cb7c70 glapi: glX_proto_recv.py: use argparse instead of getopt
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:28 -07:00
Dylan Baker
67d3ec0bb8 glapi: gl_genexec.py: use a main function
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:28 -07:00
Dylan Baker
79c4e595bc glapi: gl_genexec.py: use argparse instead of getopt
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:28 -07:00
Dylan Baker
9097a4a103 glapi: glX_proto_send.py: use a main function.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:28 -07:00
Dylan Baker
9eed4e6232 glapi: glX_proto_send.py: use argparse instead of getopt
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:28 -07:00
Dylan Baker
dddac8cac3 glapi: glX_server_table.py: use argparse instead of getopt
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:28 -07:00
Dylan Baker
952bd305c6 glapi: gl_SPARC_asm.py: use main function
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:28 -07:00
Dylan Baker
86c9fb526e glapi: gl_SPARC_asm.py: use argparse instead of getopt
Also drop -m switch, which only accepted a single value or raised an
error, and was unused in the makefile.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:28 -07:00
Dylan Baker
f2e78bd697 glapi: gl_x86-64_asm.py: Use a main function
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:28 -07:00
Dylan Baker
2e3da443f1 glapi: gl_x86-64_asm.py: Use argparse instead of getopt
Also removes the redundant -m argument, which could only be set to
'generic', or it would raise an exception. This option wasn't used in
the makefile.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:28 -07:00
Dylan Baker
4892456799 glapi: gl_x86_asm.py: use a main function
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:27 -07:00
Dylan Baker
fc96122fb6 glapi: gl_x86_asm.py: use argparse instead of getopt
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:27 -07:00
Dylan Baker
5998d32f09 glapi: gl_gentable.py: use a main function
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:27 -07:00
Dylan Baker
d36fa4472e glapi: gl_gentable.py: Replace getopt with argparse
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:27 -07:00
Dylan Baker
3317cea048 glapi: gl_apitemp.py: Use a main function
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:27 -07:00
Dylan Baker
24ec03bd05 glapi: gl_apitemp.py: Convert to argparse instead of getopt
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:27 -07:00
Dylan Baker
6c4dcef6dc glapi: gl_enums.py: use main() function for if __name__ == "__main__"
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:27 -07:00
Dylan Baker
fd5f1dd6c7 glapi: gl_enums.py: use argparse instead of getopt.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:27 -07:00
Dylan Baker
e51530ba16 glapi: gl_procs.py: Use argparse rather than getopt
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:27 -07:00
Dylan Baker
28ecdd6be7 glapi: gl_procs.py: Fix a few low hanging style things
Shuts up analysis tools to make them return actual problems.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:27 -07:00
Dylan Baker
622fee43c8 glapi: remap_helper.py: use argparse instead of optparse
Make the code simpler, cleaner, and easier to work with.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:27 -07:00
Dylan Baker
bdae3bc1ff glapi: remap_helper.py: Fix some low hanging style issues
This makes the tools shut up about a bunch of problems, making them more
useful for catching actual problems.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:27 -07:00
Dylan Baker
cf718cc964 glapi: gl_table.py: replace getopt with argparse.
This results in slightly less code, but code that is much more readable.
It has the advantage of putting everything together in one place, all of
the code is self documenting, help messages are auto-generated, choices
are automatically enforced, and the syntax is much less C like, taking
advantage of python features and idioms.
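
As a rough illustration of the points above (this is a sketch, not the
actual gl_table.py code), an argparse-based front end can look like the
following. The option names, the default file name, and the mode choices
are hypothetical placeholders.

```python
import argparse


def build_parser():
    """Declare every option in one place; help text is generated from it."""
    parser = argparse.ArgumentParser(description="Hypothetical table generator")
    parser.add_argument("-f", "--filename", default="gl_API.xml",
                        help="path to the API XML file (illustrative default)")
    parser.add_argument("-m", "--mode", choices=["table", "remap_table"],
                        default="table",
                        help="kind of output to generate")
    return parser


def main(argv=None):
    """Parse argv (or sys.argv[1:] when argv is None) and report the settings."""
    args = build_parser().parse_args(argv)
    # argparse has already rejected any mode that is not listed in `choices`.
    return "{}:{}".format(args.mode, args.filename)
```

Calling main(["-m", "remap_table"]) returns the selected mode and file name,
while an unknown mode makes argparse print a usage message and exit.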

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:27 -07:00
Dylan Baker
b6298c7a71 glapi: gl_table.py: Fix some low hanging style issues
Making the tools shut up about worthless errors so you can see real ones
is very useful.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-22 11:31:27 -07:00
Matt Turner
a1c070c1a7 i965/disasm: Skip swizzle disassembly when using 3-src repctrl.
... since it's always .x, and also always print the subreg offset when
using repctrl.
2015-05-22 11:26:37 -07:00
Matt Turner
5614bcc416 nir: Remove sRGB colorspace conversion round-trip.
Some shaders in Civilization V and Beyond Earth do

   pow(pow(x, 2.2), 0.454545)

which is converting to and from sRGB colorspace.

A more general rule that replaces pow(pow(a, b), c) with pow(a, b * c)
actually regresses two shaders in Sun Temple in which the result of the
inner pow is used twice, once by another pow and once by another
instruction. Also, since 2.2 * 0.454545 isn't exactly one, the more
general pattern would have still left us with a pow, and I'm 2.2 *
0.454545 percent sure that's not what they want.
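
The arithmetic behind that remark can be checked with a small sketch. This
is plain floating-point Python, not NIR; the function names are made up for
illustration.

```python
import math

GAMMA = 2.2
INV_GAMMA = 0.454545

# Product of the two exponents: close to, but not exactly, 1.0.
product = GAMMA * INV_GAMMA


def round_trip(x):
    """The pattern seen in the shaders: pow(pow(x, 2.2), 0.454545)."""
    return math.pow(math.pow(x, GAMMA), INV_GAMMA)


def merged(x):
    """The more general rewrite: pow(a, b * c). A pow is still left over."""
    return math.pow(x, product)
```

For x = 0.5 both functions return a value within about 1e-6 of x, which is
why replacing the round trip with x itself is a reasonable approximation,
whereas the general rule would still leave a pow with an exponent of roughly
0.999999.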

instructions in affected programs:     934 -> 886 (-5.14%)
helped:                                16
2015-05-22 11:26:36 -07:00
Samuel Pitoiset
a21d23e191 nv50: fix PIPELINE_STATISTICS with HUD, based on nvc0
Tested on NVA8. No regression for ARB_pipeline_statistics piglit tests.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-22 11:39:23 +02:00
Samuel Pitoiset
867fd2b5f5 nv50: fix 64-bit queries with HUD, based on nvc0
A sequence number is written for 32-bit queries to make sure they are
ready, but not for 64-bit queries. Instead, we have to use a fence in
order to fix the HUD because it doesn't wait until the result is ready.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-22 11:39:23 +02:00
Christian König
6921ea42a1 radeon/vce: adapt new firmware interface changes
v2: make this also compatible with original released firmware
v3 (chk): switch to original idea of separate files for fw versions

Signed-off-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v2)
2015-05-22 10:17:24 +02:00
Christian König
2b40c306d2 radeon/vce: move CPB handling function into common code
They are not firmware version dependent.

Signed-off-by: Christian König <christian.koenig@amd.com>
2015-05-22 10:17:24 +02:00
Jason Ekstrand
0f0b5aecb8 vk/pipeline: Track VB's that are actually used by the pipeline
Previously, we just blasted out whatever VB's we had marked as "dirty"
regardless of which ones were used by the pipeline.  Given that the stride
of the VB is embedded in the pipeline this can cause problems.  One problem
is if the pipeline doesn't use the given VB binding we emit a bogus stride.
Another problem is that we weren't properly resetting the dirty bits when
the pipeline changed.
2015-05-21 16:58:53 -07:00
Jason Ekstrand
0a54751910 vk/device: Memset descriptor sets to 0 and handle descriptor set holes 2015-05-21 16:33:04 -07:00
Dave Airlie
7c1a00174b u_math: uses assert, include assert.h
this fixes a build problem found on RHEL s390.

not sure what configure options caused it, I couldn't get it on
x86 here.

Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.6" mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-22 09:19:58 +10:00
Jason Ekstrand
519fe765e2 vk: Do relocations in surface states when they are created
Previously, we waited until later and did a pass through the used surfaces
and did the relocations then.  This led to double relocations, which
caused us to get bogus surface offsets.
2015-05-21 15:55:29 -07:00
Timothy Arceri
d67515b7be glsl: remove element_type() helper
We now have is_array() and without_array() that make the
code much clearer and remove the need for this.

For all remaining calls to this we already knew that
the type was an array so returning a null wasn't adding any value.

v2: use without_array() in _mesa_ast_array_index_to_hir() and don't use
 without_array() in lower_clip_distance_visitor() as we want to make sure the
 array is 2D.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-22 08:35:45 +10:00
Jason Ekstrand
ccf2bf9b99 vk/test: Use the glsl_scraper for building shaders 2015-05-21 12:24:02 -07:00
Jason Ekstrand
f3d70e4165 vk/glsl_scraper: Use the LunarG back-door for GLSL source 2015-05-21 12:22:44 -07:00
Jason Ekstrand
cb56372eeb vk/glsl_scraper: Use a fake GLSL version that glslang will accept 2015-05-21 12:21:02 -07:00
Jason Ekstrand
0e441cde71 vk: Bake the GLSL_VK_SHADER macro into the scraper output file 2015-05-21 12:21:00 -07:00
Jason Ekstrand
f17e835c26 vk/meta: Use glsl_scraper for our GLSL source
We are not yet using SPIR-V for meta but this is a first step.
2015-05-21 11:39:54 -07:00
Jason Ekstrand
b13c0f469b vk: More out-of-tree build fixes 2015-05-21 11:32:59 -07:00
Jason Ekstrand
f294154e42 vk: Fix for out-of-tree builds 2015-05-21 10:23:18 -07:00
Matt Turner
51ccdb6346 glsl: Use AM_V_GEN/AM_V_at in NIR rules. 2015-05-21 09:43:43 -07:00
Kristian Høgsberg
f9e66ea621 vk: Remove render pass stub call
This isn't really a stub.
2015-05-20 20:34:52 -07:00
Kristian Høgsberg
a29df71dd2 vk: Add WSI implementation 2015-05-20 20:34:52 -07:00
Kristian Høgsberg
f886647b75 vk: Add debug stubs 2015-05-20 20:34:52 -07:00
Kristian Høgsberg
63da974529 vk: Mark remaining unsupported formats as such 2015-05-20 20:34:52 -07:00
Kristian Høgsberg
387a1bb58f vk: Mark VK_FORMAT_UNDEFINED as 1 cpp, 1 channel 2015-05-20 20:34:52 -07:00
Kristian Høgsberg
a1bd426393 vk: Stream surface state instead of using the surface pool
Since the binding table pointer is only 16 bits, we can only have 64kb
of binding table state allocated at any given time. With a block size of
1kb, that amounts to just 64 command buffers, which is not enough.
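
The limit quoted above follows from simple arithmetic, sketched here. The
constant names are illustrative, and the sketch assumes, as the message
does, one 1kb block per command buffer.

```python
# A 16-bit binding table pointer can address 2**16 bytes.
BINDING_TABLE_POINTER_BITS = 16
BLOCK_SIZE = 1024  # 1kb per block, as stated in the message

addressable_bytes = 1 << BINDING_TABLE_POINTER_BITS
max_command_buffers = addressable_bytes // BLOCK_SIZE
```

Here addressable_bytes is 65536 (64kb) and max_command_buffers is 64.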
2015-05-20 20:34:52 -07:00
Kristian Høgsberg
01504057f5 vk: Use surface_format_info from dri driver for vkGetFormatInfo 2015-05-20 20:34:52 -07:00
Chad Versace
a61f307996 vk: Fix result of vkCreateInstance
When fill_physical_device() fails, don't return VK_SUCCESS.
2015-05-20 19:51:10 -07:00
Ilia Mirkin
6cdb29d52f freedreno/a3xx: set .zw of sprite coords to .01
Fixes non-determinism in bin/point-sprite rendering, and the stars on
the intro screen to neverball.

Cc: "10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-20 21:54:00 -04:00
Ilia Mirkin
3e7bc67285 freedreno/ir3: fix immediate usage in tgsi tex fe
get_immediate will return a const reference, the requested immediate
isn't necessarily in the x slot. Make sure to use the swizzle.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2015-05-20 21:53:59 -04:00
Jason Ekstrand
14929046ba vk/compiler: Add shader language detection
This commit adds support for the LunarG GLSL back-door as well as detecting
regular GLSL and SPIR-V.  The SPIR-V path doesn't exist yet, so that will
cause an assert-fail.
2015-05-20 17:05:41 -07:00
Jason Ekstrand
47c1cf5ce6 vk/test: Add a test for testing buffer copies 2015-05-20 16:20:04 -07:00
Jason Ekstrand
bea66ac5ad vk/meta: Add support for copying arbitrary size buffers 2015-05-20 16:20:04 -07:00
Jason Ekstrand
9557b85e3d vk/meta: Use the biggest format possible for buffer copies
This should substantially improve throughput of buffer copies.
2015-05-20 16:20:04 -07:00
Jason Ekstrand
13719e9225 vk/meta: Fix buffer copy extents 2015-05-20 16:20:04 -07:00
Emil Velikov
36438f0db6 targets/osmesa: drop the -module tag from LDFLAGS
Gallium equivalent of commit 06ff751f97f(darwin: Fix install name of
libOSMesa)

Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-20 21:56:36 +01:00
Jeremy Huddleston Sequoia
06ff751f97 darwin: Fix install name of libOSMesa
Passing -module to glibtool causes the resulting library to be called
libSomething.so rather than libSomething.dylib on darwin.

Regardless if libOSMesa is a library or a module, it has been used as
the former for quite some time. Update the build to reflect that and
resolve the naming issue.

Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
[Emil Velikov: Tweak the commit message.]
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-20 21:56:32 +01:00
Alan Coopersmith
31cd2d75dc swrast: Build fix for Solaris
Fixes regression from commit 5b2d3480f5

Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2015-05-20 21:44:21 +01:00
Jason Ekstrand
2126c68e5c nir: Get rid of the array elements parameter on load/store intrinsics
Previously, we used intrinsic->const_index[1] to represent "the number of
array elements to load" for load/store intrinsics.  However, this is set to 1
by every pass that ever creates a load/store intrinsic.  Also, while it
might make some sense for registers, it makes no sense whatsoever in SSA.
On top of that, the i965 backend was the only backend to ever support it;
freedreno and vc4 just assert that it's always 1.  Let's just delete it.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-05-20 09:28:06 -07:00
Marek Olšák
e1c4e8aaaa gallium: remove TGSI_SAT_MINUS_PLUS_ONE
It's a remnant of some old NV extension. Unused.

I also have a patch that removes predicates if anyone is interested.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-05-20 15:40:46 +02:00
Marek Olšák
e4201bb618 cso: add context cleanup code from st/mesa
This fixes a crash in nouveau which can't handle
set_constant_buffer(PIPE_SHADER_TESS_*).

Cc: 10.6 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-20 15:39:20 +02:00
Samuel Iglesias Gonsalvez
4ee69a97bb mesa/main: validate name syntax for array variables only
From ARB_program_interface_query:

 "Note that if an interface enumerates a single active resource list
 entry for an array variable (e.g., "a[0]"), a <name> identifying
 any array element other than the first (e.g., "a[1]") is not
 considered to match."

It doesn't apply to arrays of interface blocks but just to array
variables.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-20 07:24:53 +02:00
Dave Airlie
1b05290676 GL3.txt: update softpipe ARB_gpu_shader5 status
texture gather and it already supported the new instructions.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-20 12:36:14 +10:00
Dave Airlie
55a7b5165d softpipe: start adding gather support (v2)
This adds both ARB_texture_gather and the enhanced gather
for ARB_gpu_shader5.

This passes all the piglit tests, it relies on the GLSL
lowering pass to make textureGatherOffsets work.

v2: use inline to get gather component (Brian)
fix function name, add asserts (Brian)

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-20 12:32:59 +10:00
Dave Airlie
0108eae291 softpipe: use arrays to make gather easier
This is a prep change for gather, and it makes more sense
to use an array in these cases.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-20 12:32:55 +10:00
Dave Airlie
a6861ecfc9 tgsi: handle TG4 opcode in tgsi exec
This just adds a new modifier interface for drivers to implement.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-20 12:32:51 +10:00
Dave Airlie
3f5c67d651 softpipe: add textureOffset support.
This was an oversight when GLSL1.30 was enabled, I think my
misunderstanding.

This fixes a bunch of tex-miplevel-selection tests under softpipe,
and is required for textureGather support.

I'm not sure this won't make sampling slower, but it's softpipe,
correctness first and all that.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-20 12:32:47 +10:00
Dave Airlie
8bec83a307 softpipe: move control into a filter args struct
more stuff for offsets and gather will go in here later.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-20 12:32:44 +10:00
Dave Airlie
99e583120c softpipe: move some image filter parameters into a struct
This moves some of the image filter args into a struct,
and passes that instead, this is prep work for adding texture
gather support which needs new arguments.

review: make filter args const.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-20 12:32:27 +10:00
Jason Ekstrand
d7044a19b1 vk/meta: Use texture() instead of texture2D() 2015-05-19 12:44:35 -07:00
Jason Ekstrand
edff076188 vk: Use binding instead of index in uniform layout qualifiers
This more closely matches what the Vulkan docs say to do.
2015-05-19 12:44:22 -07:00
Jason Ekstrand
e37a89136f vk/glsl_scraper: Add a --glsl-only option 2015-05-19 11:29:07 -07:00
Jason Ekstrand
4bcf58a192 vk/glsl_scraper: Use the line number from the end of the macro
We used to use the line number from the start of the macro but this doesn't
seem to match the C preprocessor.
2015-05-19 11:29:07 -07:00
Jason Ekstrand
1573913194 vk/glsl_scraper: Don't open files until needed
This prevents us from writing an empty file when the compile failed.
2015-05-19 11:29:07 -07:00
Emil Velikov
b9b516248e Post-branch version bump to 10.7.0-devel, add release notes template
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-19 13:23:05 +01:00
Emil Velikov
0c9e0b7a6c glapi: track GL_ARB_program_interface_query.xml
Add the file to the API_XML list, otherwise there will be no knowledge
by the build that it should be included in the tarball.

Thus the (scons) build will fail.

Fixes: b297fc27aa9(glapi: add GL_ARB_program_interface_query skeleton)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-19 13:23:05 +01:00
Emil Velikov
0148c0ae6a i965: add brw_cs.h to the sources list
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-19 12:39:05 +01:00
Kristian Høgsberg
e4c11f50b5 vk: Call finish for binding table state stream 2015-05-18 21:12:13 -07:00
Jason Ekstrand
851495d344 vk/meta: Use the new *view_init functions and stack-allocated views
This should save us a good deal of the leakage that meta currently has.
2015-05-18 20:57:43 -07:00
Jason Ekstrand
4668bbb161 vk/image: Factor view creation out into separate *_init functions
The *_init functions work basically the same as the Vulkan entrypoints
except that they act on an already-created view and take an optional
command buffer option.  If a command buffer is given, the surface state is
allocated out of the command buffer's state stream.
2015-05-18 20:57:43 -07:00
Jason Ekstrand
7c9f209427 Revert "vk/allocator: Don't use memfd when valgrind is detected"
This reverts commit b6ab076d6b.

It turns out setting USE_MEMFD to 0 is really bad because it means we can't
resize the pool.  Besides, valgrind SVN handles memfd so we really don't
need this fallback for valgrind anymore.
2015-05-18 20:57:43 -07:00
Jason Ekstrand
923691c70d vk: Use a separate block pool and state stream for binding tables
The binding table pointers packet only allows for a 16-bit binding table
address so all binding tables have to be in the first 64 KB of the surface
state BO.  We solve this by adding a slave block pool that pulls off the
first 64 KB worth of blocks and reserves them for binding tables.
2015-05-18 20:57:43 -07:00
Jason Ekstrand
d24f8245db vk/allocator: Add a concept of a slave block pool
We probably need a better name but this will do for now.
2015-05-18 20:57:43 -07:00
Kristian Høgsberg
997596e4c4 vk/test: Add test that prints format features 2015-05-18 20:52:44 -07:00
Kristian Høgsberg
241b59cba0 vk/test: Test timestamps and occlusion queries 2015-05-18 20:52:44 -07:00
Kristian Høgsberg
ae9ac47c74 vk: Make timestamp command work correctly
This was using the wrong timestamp register and needs to write a 64 bit
value.
2015-05-18 20:52:43 -07:00
Kristian Høgsberg
82ddab4b18 vk: Make occlusion query work, both copy and get functions 2015-05-18 20:52:43 -07:00
Kristian Høgsberg
1d40e6ade8 vk: Update generated header files
This fixes a problem where register addresses were incorrectly shifted.
2015-05-18 20:52:43 -07:00
Kristian Høgsberg
f330bad545 vk: Only fill render targets for meta clear
Clear inherits the render targets from the current render pass. This
means we need to fill out the binding table after switching to meta
bindings. However, meta copies etc happen outside a render pass and
break when we try to fill in the render targets. This change fills the
render targets only for meta clear.
2015-05-18 20:52:43 -07:00
Alexander von Gluck IV
7af2601a07 mesa/driver/haiku: Drop Mesa swrast renderer
This just created extra upkeep and the push to move extern
C's into mesa code would mean a large number of extern's
in core Mesa driver interfaces. The Haiku Gallium renderers
are mostly insulated via the C-based Haiku state tracker.

As any future hardware support in Haiku will be gallium
based, let's just drop swrast.

Haiku has a Mesa 7.12 fork for gcc2 that uses swrast.

This commit fixes the last of the Haiku build issues.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-18 21:02:25 -04:00
Jason Ekstrand
b6c7d8c911 vk/pipeline: Use a state_stream for storing programs
Previously, we were effectively using a state_stream, it was just
hand-rolled based on a block pool.  Now we actually use the data structure.
2015-05-18 15:58:20 -07:00
Jason Ekstrand
4063b7deb8 vk/allocator: Add support for valgrind tracking of state pools and streams
We leave the block pool untracked so that reads/writes to freed blocks
will get caught and do the tracking at the state pool/stream level.  We
have to do a few extra gymnastics for streams because valgrind works in
terms of pointers and we work in terms of separate map and offset.
Fortunately, the users of the state pool and stream should always be using
the map pointer provided in the anv_state structure.  We just have to
track, per block, the map that was used when we initially got the block.
Then we can make sure we always use that map and valgrind should stay
happy.
2015-05-18 15:58:20 -07:00
Jason Ekstrand
b6ab076d6b vk/allocator: Don't use memfd when valgrind is detected 2015-05-18 15:58:20 -07:00
Jason Ekstrand
682d11a6e8 vk/allocator: Assert that block_pool_grow succeeds 2015-05-18 15:48:19 -07:00
Jason Ekstrand
42298b05d1 i965: Use NIR by default for vertex shaders on GEN8+
GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:

   total instructions in shared programs: 2742062 -> 2681339 (-2.21%)
   instructions in affected programs:     1514770 -> 1454047 (-4.01%)
   helped:                                5813
   HURT:                                  1120

The gained programs are ARB vertex programs that were previously going
through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
programs can go through the scalar backend so they show up as "gained" in
the shader-db results.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-05-18 15:32:00 -07:00
Rob Clark
e6f912f07e freedreno: fence fix
A fence can outlive the ctx, so we shouldn't deref the ctx to get at the
screen.  We need some updates in libdrm_freedreno API to completely
handle fences properly, but this is at least an improvement.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-05-18 17:47:54 -04:00
Jason Ekstrand
28804fb9e4 vk/gem: VG_CLEAR the padding for the gem_mmap struct 2015-05-18 12:05:17 -07:00
Ben Widawsky
8427ad9125 i965: Add gen8 blend state
OLD:
0x00007340:      0x00800000:    BLEND:
0x00007344:      0x84202100:    BLEND:

NEW:
0x00007340:      0x00800000:    BLEND: Alpha blend/test
0x00007344:      0x0000000b84202100: BLEND_ENTRY00:
                        Color Buffer Blend factor ONE,ONE,ONE,ONE (src,dst,src alpha, dst alpha)
                        function ADD,ADD (color, alpha), Disables: ----
0x0000734c:      0x0000000b84202100: BLEND_ENTRY01:
                        Color Buffer Blend factor ONE,ONE,ONE,ONE (src,dst,src alpha, dst alpha)
                        function ADD,ADD (color, alpha), Disables: ----
0x00007354:      0x0000000b84202100: BLEND_ENTRY02:
                        Color Buffer Blend factor ONE,ONE,ONE,ONE (src,dst,src alpha, dst alpha)
                        function ADD,ADD (color, alpha), Disables: ----
0x0000735c:      0x0000000b84202100: BLEND_ENTRY03:
                        Color Buffer Blend factor ONE,ONE,ONE,ONE (src,dst,src alpha, dst alpha)
                        function ADD,ADD (color, alpha), Disables: ----
0x00007364:      0x0000000b84202100: BLEND_ENTRY04:
                        Color Buffer Blend factor ONE,ONE,ONE,ONE (src,dst,src alpha, dst alpha)
                        function ADD,ADD (color, alpha), Disables: ----
0x0000736c:      0x0000000b84202100: BLEND_ENTRY05:
                        Color Buffer Blend factor ONE,ONE,ONE,ONE (src,dst,src alpha, dst alpha)
                        function ADD,ADD (color, alpha), Disables: ----
0x00007374:      0x0000000b84202100: BLEND_ENTRY06:
                        Color Buffer Blend factor ONE,ONE,ONE,ONE (src,dst,src alpha, dst alpha)
                        function ADD,ADD (color, alpha), Disables: ----
0x0000737c:      0x0000000b84202100: BLEND_ENTRY07:
                        Color Buffer Blend factor ONE,ONE,ONE,ONE (src,dst,src alpha, dst alpha)
                        function ADD,ADD (color, alpha), Disables: ----

v2: Line length fixes, and const usage (Topi)
Safer initialization of name string (Topi)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-05-18 12:02:18 -07:00
Ben Widawsky
fa284d6f2f i965: Add renderbuffer surface indexes to debug
This patch is optional in the series. It does make the output much cleaner, but
there is some risk.

Sample output (v3):
0x00007e80:      0x231d7000:  SURF000: 2D R8G8B8A8_UNORM  VALIGN4 HALIGN4 Y-tiled
0x00007e84:      0x05000000:  SURF000: MOCS: 0x5 Base MIP: 0.0 (0 mips) Surface QPitch: 0
0x00007e88:      0x009f009f:  SURF000: 160x160 [AUX_NONE]
0x00007e8c:      0x0000027f:  SURF000: 1 slices (depth), pitch: 640
0x00007e90:      0x00000000:  SURF000: min array element: 0, array extent 1, MULTISAMPLE_1
0x00007e94:      0x00000000:  SURF000: x,y offset: 0,0, min LOD: 0
0x00007e98:      0x00000000:  SURF000: AUX pitch: 0 qpitch: 0
0x00007e9c:      0x09770000:  SURF000: Clear color: R(0)G(0)B(0)A(0)
0x00007ea0:      0x00001000:  SURF000: 0x00001000
0x00007ea4:      0x00000000:  SURF000: 0x00000000
0x00007ea8:      0x00000000:  SURF000: 0x00000000
0x00007eac:      0x00000000:  SURF000: 0x00000000
0x00007e40:      0x234df000:  SURF001: 2D R11G11B10_FLOAT  VALIGN4 HALIGN16 Y-tiled
0x00007e44:      0x09000000:  SURF001: MOCS: 0x9 Base MIP: 0.0 (0 mips) Surface QPitch: 0
0x00007e48:      0x009f009f:  SURF001: 160x160 [AUX_CCS_D (Uncompressed, MULTISAMPLE_COUNT=1)]
0x00007e4c:      0x0000027f:  SURF001: 1 slices (depth), pitch: 640
0x00007e50:      0x00000000:  SURF001: min array element: 0, array extent 1, MULTISAMPLE_1
0x00007e54:      0x00000000:  SURF001: x,y offset: 0,0, min LOD: 0
0x00007e58:      0x00000001:  SURF001: AUX pitch: 0 qpitch: 0
0x00007e5c:      0x09770000:  SURF001: Clear color: R(0)G(0)B(0)A(0)
0x00007e60:      0x0002b000:  SURF001: 0x0002b000
0x00007e64:      0x00000000:  SURF001: 0x00000000
0x00007e68:      0x0002a000:  SURF001: 0x0002a000
0x00007e6c:      0x00000000:  SURF001: 0x00000000

v2: Rebased on Topi's recent series which changed around some of the gen8
surface setup code.

v3: Use ralloc_asprintf instead of asprintf to be more friendly to non-GNU
platforms.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2015-05-18 12:02:18 -07:00
Ben Widawsky
c14bb07230 i965: Add Gen9 surface state decoding
Gen9 surface state is very similar to the previous generation. The important
changes here are aux mode, and the way clear colors work.

NOTE: There are some things intentionally left out of this decoding.

v2: Redo the string for the aux buffer type to address compressed variants.

v3: Use the shift for compression enable (instead of compression mode) (Topi)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-05-18 12:02:18 -07:00
Ben Widawsky
313abbb8ca i965: Add gen8 surface state debug info
AFAICT, none of the old data was wrong (the gen7 decoder), but it was missing a
bunch of stuff.

Adds a tick (') to denote the beginning of the surface state for easier reading.
This will be replaced later with some better, but more risky code.

OLD:
0x00007980:      0x23016000:     SURF: 2D BRW_SURFACEFORMAT_B8G8R8A8_UNORM
0x00007984:      0x18000000:     SURF: offset
0x00007988:      0x00ff00ff:     SURF: 256x256 size, 0 mips, 1 slices
0x0000798c:      0x000003ff:     SURF: pitch 1024, tiled
0x00007990:      0x00000000:     SURF: min array element 0, array extent 1
0x00007994:      0x00000000:     SURF: mip base 0
0x00007998:      0x00000000:     SURF: x,y offset: 0,0
0x0000799c:      0x09770000:     SURF:
0x00007940:      0x231d7000:     SURF: 2D BRW_SURFACEFORMAT_R8G8B8A8_UNORM
0x00007944:      0x78000000:     SURF: offset
0x00007948:      0x001f001f:     SURF: 32x32 size, 0 mips, 1 slices
0x0000794c:      0x0000007f:     SURF: pitch 128, tiled
0x00007950:      0x00000000:     SURF: min array element 0, array extent 1
0x00007954:      0x00000000:     SURF: mip base 0
0x00007958:      0x00000000:     SURF: x,y offset: 0,0
0x0000795c:      0x09770000:     SURF:

NEW (v1):
0x00007980:      0x23016000:    SURF': 2D B8G8R8A8_UNORM  VALIGN4 HALIGN4 X-tiled
0x00007984:      0x18000000:     SURF: MOCS: 0x18 Base MIP: 0.0 (0 mips) Surface QPitch: 0
0x00007988:      0x00ff00ff:     SURF: 256x256 [AUX_NONE]
0x0000798c:      0x000003ff:     SURF: 1 slices (depth), pitch: 1024
0x00007990:      0x00000000:     SURF: min array element: 0, array extent 1, MULTISAMPLE_1
0x00007994:      0x00000000:     SURF: x,y offset: 0,0, min LOD: 0
0x00007998:      0x00000000:     SURF: AUX pitch: 0 qpitch: 0
0x0000799c:      0x09770000:     SURF: Clear color: ----
0x00007940:      0x231d7000:    SURF': 2D R8G8B8A8_UNORM  VALIGN4 HALIGN4 Y-tiled
0x00007944:      0x78000000:     SURF: MOCS: 0x78 Base MIP: 0 (0 mips) Surface QPitch: ff0000
0x00007948:      0x001f001f:     SURF: 32x32 [AUX_NONE]
0x0000794c:      0x0000007f:     SURF: 1 slices (depth), pitch: 128
0x00007950:      0x00000000:     SURF: min array element: 0, array extent 1, MULTISAMPLE_1
0x00007954:      0x00000000:     SURF: x,y offset: 0,0, min LOD: 0
0x00007958:      0x00000000:     SURF: AUX pitch: 0 qpitch: 0
0x0000795c:      0x09770000:     SURF: Clear color: ----
0x00007920:      0x00007980:    BIND0: surface state address
0x00007924:      0x00007940:    BIND1: surface state address
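
The size and pitch fields in the sample output above appear to be stored
"minus one" (0x00ff00ff decodes to 256x256, 0x000003ff to a pitch of 1024).
A minimal sketch of that decoding follows. The helper names and the exact
bit ranges are assumptions inferred from the sample, not taken from the
decoder itself.

```python
def get_bits(value, high, low):
    """Return bits high..low (inclusive) of a 32-bit dword."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


def decode_size(dword):
    """Assumed layout: width-1 in bits 13:0, height-1 in bits 29:16."""
    width = get_bits(dword, 13, 0) + 1
    height = get_bits(dword, 29, 16) + 1
    return width, height


def decode_pitch(dword):
    """Assumed layout: pitch-1 in bits 17:0."""
    return get_bits(dword, 17, 0) + 1
```

With these assumptions, decode_size(0x001f001f) gives (32, 32) and
decode_pitch(0x0000007f) gives 128, matching the second surface in the dump.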

v2: Style cleanups (Matt)
Fix aux mode dword 7->6 (Topi)
Use exp2 instead of pow (Matt)
Add dwords 8-12 to the dump

v3: Needed to update the surface format name getter for the change in the first
patch in the series

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Cc: Matt Turner <mattst88@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-05-18 12:02:18 -07:00
Ben Widawsky
7f0c7a5f90 i965: Add gen7+ sampler state to batch debug
OLD:
0x00007e00:      0x10000000: WM SAMP0: filtering
0x00007e04:      0x000d0000: WM SAMP0: wrapping, lod
0x00007e08:      0x00000000: WM SAMP0: default color pointer
0x00007e0c:      0x00000090: WM SAMP0: chroma key, aniso

NEW:
0x00007e00:      0x10000000: SAMPLER_STATE 0: Disabled = no, Base Mip: 0.0, Mip/Mag/Min Filter: NONE/NEAREST/NEAREST, LOD Bias: 0.0
0x00007e04:      0x000d0000: SAMPLER_STATE 0: Min LOD: 0.0, Max LOD: 13.0
0x00007e08:      0x00000000: SAMPLER_STATE 0: Border Color
0x00007e0c:      0x00000090: SAMPLER_STATE 0: Max aniso: RATIO 2:1, TC[XYZ] Address Control: CLAMP|CLAMP|WRAP

v2: Move GET_BITS macro to here (with paren protection) Ben/Topi
Add const to the sampler pointer (Topi)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-05-18 12:02:18 -07:00
Ben Widawsky
1fa0789a94 i965: Add viewport extents (gen8) to batch decode
0x00007da0:      0xc1da740e: SF_CLIP VP: guardband xmin = -27.306667
0x00007da4:      0x41da740e: SF_CLIP VP: guardband xmax = 27.306667
0x00007da4:      0x41da740e: SF_CLIP VP: guardband ymin = -23.405714
0x00007da8:      0xc1bb3ee7: SF_CLIP VP: guardband ymax = 23.405714
0x00007db0:      0x00000000: SF_CLIP VP: Min extents: 0.00x0.00
0x00007db8:      0x00000000: SF_CLIP VP: Max extents: 299.00x349.00

While here, fix the wrong offsets for the guardband (I didn't check if it used
to be valid on GEN4).
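
The guardband dwords in the dump are IEEE-754 single-precision floats
printed as raw hex. A small sketch of how such a dword maps to the decoded
value, using only the Python standard library (the function name is made up
for illustration):

```python
import struct


def dword_to_float(dword):
    """Reinterpret a 32-bit unsigned integer as an IEEE-754 float."""
    return struct.unpack("<f", struct.pack("<I", dword))[0]


xmin = dword_to_float(0xc1da740e)  # about -27.306667
xmax = dword_to_float(0x41da740e)  # about  27.306667
```

The same conversion applied to 0xc1bb3ee7 yields about -23.405714.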

v2: Remove leftover GET_BITS which belongs later in the series. (Topi)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-18 12:02:18 -07:00
Ben Widawsky
e45a292556 i965: Add all surface types to the batch decode
It's true that not all surfaces apply for every gen, but for the most part this
is what we want. (The unfortunate case is when we use a valid surface, but not
for the specific GEN).

This was automated with a vim macro.

v2: Shortened common forms such as R8G8B8A8->RGBA8. Note that this makes some of
the sample output in subsequent commits slightly incorrect.

v3: Use the name from the table (Ken). This requires declaring the surface
format array as extern, and declaring the struct in the .h file.

v4: Move the struct back and create a helper function to obtain the name (Ken)
Get rid of the now useless helper in the state_dump.c

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> (v3)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-18 12:02:18 -07:00
Ben Widawsky
421e396bb7 i965: Add string for surface format to table
Recommended-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-05-18 12:02:18 -07:00
Jason Ekstrand
8440b13f55 vk/meta: Rework the indentation style
No functional change.
2015-05-18 10:43:51 -07:00
Kristian Høgsberg
5286ef7849 vk: Provide more realistic values for device info 2015-05-18 10:27:08 -07:00
Kristian Høgsberg
69fd473321 vk: Use a temporary buffer for formatting in finishme
This is more likely to avoid breaking up the message when racing with
other threads.
2015-05-18 10:27:08 -07:00
Jason Ekstrand
cd7ab6ba4e vk/meta: Add an initial implementation of vkCmdCopyBuffer
Compile-tested only
2015-05-18 10:27:08 -07:00
Jason Ekstrand
c25ce55fd3 vk/meta: Add an initial implementation of vkCmdCopyBufferToImage
Compile-tested only
2015-05-18 10:27:08 -07:00
Jason Ekstrand
08bd554cda vk/meta: Add an initial implementation of vkCmdBlitImage
Compile-tested only
2015-05-18 10:27:08 -07:00
Jason Ekstrand
fb27d80781 vk/meta: Add an initial implementation of vkCmdCopyImage
Compile-tested only
2015-05-18 10:27:08 -07:00
Jason Ekstrand
c15f3834e3 vk/gem: Set the gem_mmap.flags parameter to 0 if it exists 2015-05-18 10:27:08 -07:00
Jason Ekstrand
f7b0f922be vk/gem: Only VK_CLEAR the addr_ptr in gen_mmap 2015-05-18 10:27:07 -07:00
Kristian Høgsberg
ca7e62d421 vk: Add a logger wrapper for the generated entrypoint 2015-05-18 10:27:07 -07:00
Kristian Høgsberg
eb92745b2e vk/gem: Just return -1 from anv_gem_wait() on error
We were returning -errno, unlike all the other gem functions.
2015-05-18 10:27:07 -07:00
Kristian Høgsberg
05754549e8 vk: Fix vkGetObjectInfo return values
We weren't properly returning the allocation count.
2015-05-18 10:27:07 -07:00
Kristian Høgsberg
6afb26452b vk: Implement fences
This basic implementation uses a throw-away bo for synchronization.
2015-05-18 10:27:07 -07:00
Kristian Høgsberg
e26a7ffbd9 vk/meta: Use anv_* internal entrypoints 2015-05-18 10:27:07 -07:00
Kristian Høgsberg
b7fac7a7d1 vk: Implement allocation count query 2015-05-18 10:27:07 -07:00
Kristian Høgsberg
783e6217fc vk: Change pData/pDataSize semantics
We now always copy the entire struct unless pData is NULL and
unconditionally write back the struct size. It's not clear this is
useful if the structs may grow over time, but it seems to be the
expected behaviour for now.
2015-05-18 10:27:07 -07:00
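The pData/pDataSize convention described in commit 783e6217fc can be modelled in a few lines. This is a toy sketch under stated assumptions: `get_info`, the packed two-field struct, and the use of a Python `bytearray` as the caller's buffer are all hypothetical stand-ins, and the caller is assumed to supply a buffer that is large enough.

```python
import struct


def get_info(info_bytes, data=None):
    """Toy model of the query convention described above.

    info_bytes: the packed struct the driver wants to return.
    data: a caller-provided bytearray, or None to query only the size.
    Returns the struct size, which is reported unconditionally.
    """
    size = len(info_bytes)
    if data is not None:
        # Always copy the entire struct; the buffer is assumed to hold
        # at least `size` bytes.
        data[:size] = info_bytes
    return size


# Hypothetical struct: two little-endian uint32 fields.
info = struct.pack("<II", 1, 4096)
needed = get_info(info)          # size query only, nothing copied
buf = bytearray(needed)
written = get_info(info, buf)    # full copy plus size write-back
```

A caller would typically make the first call to learn the size and the second to fetch the data, which is the pattern the last four lines show.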
Kristian Høgsberg
b4b3bd1c51 vk: Return VK_SUCCESS from vkAllocDescriptorSets
This should've been returning VK_SUCCESS all along.
2015-05-18 10:27:07 -07:00
Kristian Høgsberg
a9f2115486 vk: Return VK_SUCCESS for all descriptor pool entry points 2015-05-18 10:27:07 -07:00
Kristian Høgsberg
60ebcbed54 vk: Start Implementing vkGetFormatInfo()
We move the format table and vkGetFormatInfo to their own file in the
process.
2015-05-18 10:27:07 -07:00
Kristian Høgsberg
454345da1e vk: Add script for generating ifunc entry points
This lets us generate a hash table for vkGetProcAddress and lets us call
public functions internally without the public entrypoint overhead.
2015-05-18 10:27:02 -07:00
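Commit 454345da1e mentions a generated hash table for vkGetProcAddress plus direct internal calls that skip the public entry point. The sketch below shows one way such a table could look. The hash function, the linear probing scheme, the table size, and the `anv_*` stand-in functions are assumptions made for illustration and do not describe the generated code.

```python
def name_hash(name):
    """Simple multiplicative string hash (an arbitrary choice for this sketch)."""
    h = 0
    for ch in name.encode("ascii"):
        h = (h * 31 + ch) & 0xFFFFFFFF
    return h


def build_table(entrypoints, size):
    """Build an open-addressing table whose slots hold entry point indices."""
    # Size must be a power of two and leave at least one empty slot.
    assert size > len(entrypoints) and (size & (size - 1)) == 0
    table = [None] * size
    for index, (name, _func) in enumerate(entrypoints):
        slot = name_hash(name) & (size - 1)
        while table[slot] is not None:      # linear probing on collision
            slot = (slot + 1) & (size - 1)
        table[slot] = index
    return table


def get_proc_address(entrypoints, table, name):
    """Look up a function by name; returns None for unknown names."""
    mask = len(table) - 1
    slot = name_hash(name) & mask
    while table[slot] is not None:
        candidate, func = entrypoints[table[slot]]
        if candidate == name:
            return func
        slot = (slot + 1) & mask
    return None


# Hypothetical internal implementations standing in for driver functions.
def anv_create_device():
    return "device"


def anv_alloc_memory():
    return "memory"


ENTRYPOINTS = [("vkCreateDevice", anv_create_device),
               ("vkAllocMemory", anv_alloc_memory)]
TABLE = build_table(ENTRYPOINTS, 8)
```

Internal callers can invoke `anv_create_device()` directly, while external lookups go through `get_proc_address(ENTRYPOINTS, TABLE, "vkCreateDevice")`.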
Matt Turner
f7df169ba1 i965/fs: Implement integer multiply without mul/mach.
Ivybridge and Baytrail can't use mach with 2Q quarter control, so just
do it without the accumulator. Stupid accumulator.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-18 10:11:36 -07:00
Matt Turner
0a9e3a0160 i965/fs: Rework compression control selection.
The next commit uses an add(16) with a UW destination with a stride of
2, which needs compression control since it's writing two registers. The
old code would have failed to set compression control correctly.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-18 10:11:36 -07:00
Matt Turner
4ec09c7747 i965/fs: Support integer multiplication in SIMD16 on Haswell.
Ivybridge (and presumably Baytrail) have a bug that prevents this from
working.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-18 10:11:36 -07:00
Matt Turner
0592ee457d i965/fs: Add set_sechalf() method.
Used in the next commit.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-18 10:11:36 -07:00
Matt Turner
81deefc45b i965/fs: Unrestrict constant propagation into integer multiply.
Gen8+'s MUL instruction doesn't ignore the high 16-bits of one source
like on earlier platforms, so we can constant propagate into it without
worry. Integer multiplies (not into the accumulator, which is done for
imul_high) are lowered in lower_integer_multiplication(), so it's safe
there as well.

On Broadwell, fragment shaders only:
total instructions in shared programs: 4377769 -> 4377451 (-0.01%)
instructions in affected programs:     48064 -> 47746 (-0.66%)
helped:                                156

On Broadwell, vertex shaders only:
total instructions in shared programs: 2858885 -> 2856313 (-0.09%)
instructions in affected programs:     26380 -> 23808 (-9.75%)
helped:                                134

On Broadwell, vertex shaders only (with INTEL_USE_NIR=1):
total instructions in shared programs: 2911688 -> 2865984 (-1.57%)
instructions in affected programs:     1421715 -> 1376011 (-3.21%)
helped:                                6186

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-18 10:11:36 -07:00
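The percentages in the shader statistics above are relative changes in instruction count. A small sketch that reproduces the "affected programs" figures from the numbers quoted in the commit message (the helper name is made up for this example):

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Instruction counts copied from the commit message above.
results = {
    "FS, affected programs": (48064, 47746),
    "VS, affected programs": (26380, 23808),
    "VS with INTEL_USE_NIR=1, affected programs": (1421715, 1376011),
}
for label, (before, after) in results.items():
    print(f"{label}: {before} -> {after} ({percent_change(before, after):.2f}%)")
```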
Matt Turner
1e4e17fbd9 i965/fs: Lower integer multiplication after optimizations.
32-bit x 32-bit integer multiplication requires multiple instructions
until Broadwell. This patch just lets us treat the MUL instruction in
the FS backend like it operates on Broadwell, and after optimizations
we lower it into a sequence of instructions on older platforms.

Doing this will allow us to do some extra optimization on integer
multiplies.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-18 10:11:36 -07:00
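To see why a 32-bit x 32-bit multiply needs several instructions when the multiplier only reads 16 bits of one source (as the earlier commit notes for pre-Gen8 MUL), the arithmetic can be sketched as below. This models only the math of splitting one operand into 16-bit halves; the actual instruction sequence emitted by the backend is not shown in the log, and the function names here are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def mul_32x16(a, b16):
    """Model of a multiplier that reads only the low 16 bits of its second source."""
    return (a * (b16 & 0xFFFF)) & MASK32


def lowered_mul(a, b):
    """Low 32 bits of a * b built from two 32x16 multiplies, a shift and an add."""
    low = mul_32x16(a, b)            # a * b[15:0]
    high = mul_32x16(a, b >> 16)     # a * b[31:16]
    return (low + (high << 16)) & MASK32


a, b = 0x12345678, 0x9ABCDEF0
print(hex(lowered_mul(a, b)), hex((a * b) & MASK32))
```

The identity used is a * b = a * b_lo + (a * b_hi) * 2^16, taken modulo 2^32.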
Ilia Mirkin
ae405d429f gk110/ir: switch to gk104-style sched codes rather than all-in-one
Matches change to envydis/envyas tools.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-18 12:59:52 -04:00
Tapani Pälli
9f4eaba36f glsl: add stage references for UBO uniforms
Patch marks uniforms inside UBO properly referenced by stages.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90397
2015-05-18 15:23:09 +03:00
Iago Toral Quiroga
845ad2667a i965: Fix textureSize for Lod > 0 with non-mipmap filters
Currently, when the MinFilter is GL_LINEAR or GL_NEAREST we hide the
actual miplevel count from the hardware (and we avoid re-creating
the miptree structure with all the levels), since we don't expect
levels other than the base level to be needed. Unfortunately,
GLSL's textureSize() function is an exception to this rule. This
function takes a lod parameter that we need to use to return the
size of the appropriate miplevel (if it exists). The spec only
requires that the miplevel exists, so even if the sampler is
configured with a linear or nearest MinFilter, as long as the user
has uploaded miplevels for the texture, textureSize() should return
the appropriate sizes.

This patch fixes this by exposing the actual miplevel count for all
sampling engine textures while keeping the original implementation
for render targets (for render target textures we do not provide
the miplevel count but the actual LOD we are writing to, so we
want to make sure that we make this the base level).

Fixes 28 dEQP tests in the following category:
dEQP-GLES3.functional.shaders.texture_functions.texturesize.*

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-05-18 11:23:17 +02:00
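The behaviour the commit above argues for can be summarised in a short model: the size of a mip level depends only on the base size and on whether that level was uploaded, not on the min filter. The function names and the choice to return None for a missing level are assumptions of this sketch.

```python
def mip_level_size(base_width, base_height, lod):
    """Dimensions of mip level `lod`, halving per level and clamping at 1."""
    return max(1, base_width >> lod), max(1, base_height >> lod)


def texture_size(base_width, base_height, uploaded_levels, lod):
    """Return the size of level `lod` if it was uploaded, else None.

    The min filter is deliberately not a parameter: per the commit
    message, only the existence of the level matters.
    """
    if 0 <= lod < uploaded_levels:
        return mip_level_size(base_width, base_height, lod)
    return None  # this sketch's way of flagging a level that does not exist


print(texture_size(256, 64, 9, 2))
```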
Kristian Høgsberg
333bcc2072 vk: Fix vulkan header inconsistency
The function pointer typedef and the function prototype for
vkCmdClearColorImage() didn't agree. Fix the typedef to match the
prototype.
2015-05-17 21:08:31 -07:00
Kristian Høgsberg
b9eb56a404 vk: Add function pointer typedef for intel extension
Also guard function prototype by VK_PROTOTYPES.
2015-05-17 21:08:30 -07:00
Kristian Høgsberg
75cb85c56a vk: Add missing VKAPI for vkQueueRemoveMemReferences 2015-05-17 21:08:30 -07:00
Jason Ekstrand
a924ea0c75 Merge remote-tracking branch 'fdo-personal/wip/nir-vtn' into vulkan
This adds the SPIR-V -> NIR translator.
2015-05-16 12:43:16 -07:00
Jason Ekstrand
a63952510d nir/spirv: Don't assert that the current block is empty
It's possible that someone will give us SPIR-V code in which someone
needlessly branches to new blocks.  We should handle that ok now.
2015-05-16 12:34:34 -07:00
Jason Ekstrand
4e44dcc312 nir/spirv: Add initial support for samplers 2015-05-16 12:34:15 -07:00
Jason Ekstrand
d6f52dfb3e nir/spirv: Move Exp and Log to the list of currently unhandled ALU ops
NIR doesn't have the native opcodes for them anymore
2015-05-16 12:33:32 -07:00
Jason Ekstrand
a53e795524 nir/types: Add support for sampler types 2015-05-16 12:32:58 -07:00
Jason Ekstrand
0fa9211d7f nir/spirv: Make the global constants in spirv.h static
I've been promised in a bug that this will be fixed in a future version of
the header.  However, in the interest of my branch building, I'm adding
these changes in myself for the moment.
2015-05-16 11:16:34 -07:00
Jason Ekstrand
036a4b1855 nir/spirv: Handle jump-to-loop in a more general way 2015-05-16 11:16:34 -07:00
Jason Ekstrand
56f533b3a0 nir/spirv: Handle boolean uniforms correctly 2015-05-16 11:16:34 -07:00
Jason Ekstrand
64bc58a88e nir/spirv: Handle control-flow with loops 2015-05-16 11:16:34 -07:00
Jason Ekstrand
3a2db9207d nir/spirv: Set a name on temporary variables 2015-05-16 11:16:34 -07:00
Jason Ekstrand
a28f8ad9f1 nir/spirv: Use the correct length for copying string literals 2015-05-16 11:16:34 -07:00
Jason Ekstrand
7b9c29e440 nir/spirv: Make vtn_ssa_value handle constants as well as ssa values 2015-05-16 11:16:33 -07:00
Jason Ekstrand
b0d1854efc nir/spirv: Add initial support for GLSL 4.50 builtins 2015-05-16 11:16:33 -07:00
Jason Ekstrand
1da9876486 nir/spirv: Split the core datastructures into a header file 2015-05-16 11:16:33 -07:00
Jason Ekstrand
98d78856f6 nir/spirv: Use the builder for all instructions
We don't actually use it to create all the instructions but we do use it
for insertion always.  This should make things far more consistent for
implementing extended instructions.
2015-05-16 11:16:33 -07:00
Jason Ekstrand
ff828749ea nir/spirv: Add support for a bunch of ALU operations 2015-05-16 11:16:33 -07:00
Jason Ekstrand
d2a7972557 nir/spirv: Add support for indirect array accesses 2015-05-16 11:16:33 -07:00
Jason Ekstrand
683c99908a nir/spirv: Explicitly type constants and SSA values 2015-05-16 11:16:33 -07:00
Jason Ekstrand
c5650148a9 nir/spirv: Handle OpBranchConditional
We do control-flow handling as a two-step process.  The first step is to
walk the instructions list and record various information about blocks and
functions.  This is where the actual nir_function_overload objects get
created.  We also record the start/stop instruction for each block.  Then
a second pass walks over each of the functions and over the blocks in each
function in a way that's NIR-friendly and actually parses the instructions.
2015-05-16 11:16:33 -07:00
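The two-step control-flow handling described in commit c5650148a9 can be illustrated with a toy instruction stream. The opcode names are real SPIR-V opcodes, but the tuple encoding, the dictionaries, and the straight in-order second pass are simplifications assumed for this sketch; the real translator works on binary words and walks blocks in an order suited to NIR.

```python
def record_blocks(instructions):
    """Pass 1: record functions and the start/end index of each block."""
    functions = []
    current_func = None
    current_block = None
    for i, (opcode, *operands) in enumerate(instructions):
        if opcode == "OpFunction":
            current_func = {"name": operands[0], "blocks": []}
            functions.append(current_func)
        elif opcode == "OpLabel":
            current_block = {"label": operands[0], "start": i + 1, "end": None}
            current_func["blocks"].append(current_block)
        elif opcode in ("OpBranch", "OpBranchConditional", "OpReturn"):
            current_block["end"] = i   # index of the block terminator
            current_block = None
        elif opcode == "OpFunctionEnd":
            current_func = None
    return functions


def parse_bodies(instructions, functions):
    """Pass 2: visit each function's blocks and handle their instructions."""
    parsed = []
    for func in functions:
        for block in func["blocks"]:
            body = instructions[block["start"]:block["end"]]
            parsed.append((func["name"], block["label"], [op for op, *_ in body]))
    return parsed


STREAM = [
    ("OpFunction", "main"),
    ("OpLabel", "entry"),
    ("OpLoad", "x"),
    ("OpBranchConditional", "x", "then", "merge"),
    ("OpLabel", "then"),
    ("OpStore", "y"),
    ("OpBranch", "merge"),
    ("OpLabel", "merge"),
    ("OpReturn",),
    ("OpFunctionEnd",),
]
FUNCTIONS = record_blocks(STREAM)
PARSED = parse_bodies(STREAM, FUNCTIONS)
```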
Jason Ekstrand
ebc152e4c9 nir/spirv: Add a helper for getting a value as an SSA value 2015-05-16 11:16:33 -07:00
Jason Ekstrand
f23afc549b nir/spirv: Split instruction handling into preamble and body sections 2015-05-16 11:16:33 -07:00
Jason Ekstrand
ae6d32c635 nir/spirv: Implement load/store instructions 2015-05-16 11:16:33 -07:00
Jason Ekstrand
88f6fbc897 nir: Add a helper for getting the tail of a deref chain 2015-05-16 11:16:33 -07:00
Jason Ekstrand
06acd174f3 nir/spirv: Actually add variables to the function or shader 2015-05-16 11:16:33 -07:00
Jason Ekstrand
5045efa4aa nir/spirv: Add a vtn_untyped_value helper 2015-05-16 11:16:33 -07:00
Jason Ekstrand
01f3aa9c51 nir/spirv: Use vtn_value in the types code and fix an off-by-one error 2015-05-16 11:16:33 -07:00
Jason Ekstrand
6ff0830d64 nir/types: Add an is_vector_or_scalar helper 2015-05-16 11:16:33 -07:00
Jason Ekstrand
5acd472271 nir/spirv: Add support for deref chains 2015-05-16 11:16:33 -07:00
Jason Ekstrand
7182597e50 nir/types: Add a scalar type constructor 2015-05-16 11:16:32 -07:00
Jason Ekstrand
eccd798cc2 nir/spirv: Add support for OpLabel 2015-05-16 11:16:32 -07:00
Jason Ekstrand
a6cb9d9222 nir/spirv: Add support for declaring functions 2015-05-16 11:16:32 -07:00
Jason Ekstrand
8ee23dab04 nir/types: Add accessors for function parameter/return types 2015-05-16 11:16:32 -07:00
Jason Ekstrand
707b706d18 nir/spirv: Add support for declaring variables
Deref chains and variable load/store operations are still missing.
2015-05-16 11:16:32 -07:00
Jason Ekstrand
b2db85d8e4 nir/spirv: Add support for constants 2015-05-16 11:16:32 -07:00
Jason Ekstrand
3f83579664 nir/spirv: Add basic support for types 2015-05-16 11:16:32 -07:00
Jason Ekstrand
e9d3b1e694 nir/types: Add more helpers for creating types 2015-05-16 11:16:32 -07:00
Jason Ekstrand
fe550f0738 glsl/types: Expose the function_param and struct_field structs to C
Previously, they were hidden behind a #ifdef __cplusplus so C wouldn't find
them.  This commit simply moves the ifdef.
2015-05-16 11:16:32 -07:00
Jason Ekstrand
053778c493 glsl/types: Add support for function types 2015-05-16 11:16:32 -07:00
Jason Ekstrand
7b63b3de93 glsl: Add GLSL_TYPE_FUNCTION to the base types enums 2015-05-16 11:16:32 -07:00
Jason Ekstrand
2b570a49a9 nir/spirv: Rework the way values are added
Instead of having functions to add values and set various things, we just
have a function that does a few asserts and then returns the value.  The
caller is then responsible for setting the various fields.
2015-05-16 11:16:32 -07:00
Jason Ekstrand
f9a31ba044 nir/spirv: Add stub support for extension instructions 2015-05-16 11:16:32 -07:00
Jason Ekstrand
4763a13b07 REVERT: Add a simple helper program for testing SPIR-V -> NIR translation 2015-05-16 11:16:32 -07:00
Jason Ekstrand
cae8db6b7e glsl/compiler: Move the error_no_memory stub to standalone_scaffolding.cpp 2015-05-16 11:16:32 -07:00
Jason Ekstrand
98452cd8ae nir: Add the start of a SPIR-V to NIR translator
At the moment, it can handle the very basics of strings and can ignore
debug instructions.  It also has basic support for decorations.
2015-05-16 11:16:32 -07:00
Jason Ekstrand
573ca4a4a7 nir: Import the revision 30 SPIR-V header from Khronos 2015-05-16 11:16:31 -07:00
Fredrik Höglund
5a55f681f6 mesa: Check the lookup_framebuffer return value in NamedFramebufferRenderbuffer
Found by Coverity.

Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-16 19:55:00 +02:00
Jason Ekstrand
057bef8a84 vk/device: Use bias rather than layers for computing binding table size
Because we statically use the first 8 binding table entries for render
targets, we need to create a table of size 8 + surfaces.
2015-05-16 10:42:53 -07:00
Jason Ekstrand
22e61c9da4 vk/meta: Make clear a no-op if no layers need clearing
Among other things, this prevents recursive meta.
2015-05-16 10:30:05 -07:00
Jason Ekstrand
120394ac92 vk/meta: Save and restore the old bindings pointer
If we don't do this then recursive meta is completely broken.  What happens
is that the outer meta call may change the bindings pointer and the inner
meta call will change it again and, when it exits, set it back to the
default.  However, the outer meta call may be relying on it being left
alone so it uses the non-meta descriptor sets instead of its own.
2015-05-16 10:28:04 -07:00
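The recursive-meta problem from commit 120394ac92 comes down to restoring whatever was active on entry instead of resetting to the default. A minimal sketch of that save/restore pattern, using a hypothetical `CmdBuffer` class and a context manager that are not part of the driver:

```python
from contextlib import contextmanager


class CmdBuffer:
    """Toy command buffer holding a reference to the active bindings."""

    def __init__(self):
        self.default_bindings = {"owner": "application"}
        self.bindings = self.default_bindings


@contextmanager
def meta_bindings(cmd_buffer, temporary):
    saved = cmd_buffer.bindings          # whatever was active, not the default
    cmd_buffer.bindings = temporary
    try:
        yield
    finally:
        cmd_buffer.bindings = saved      # restore, so an outer meta op is unaffected


cmd = CmdBuffer()
outer = {"owner": "outer meta"}
inner = {"owner": "inner meta"}
with meta_bindings(cmd, outer):
    with meta_bindings(cmd, inner):
        during_inner = cmd.bindings
    after_inner = cmd.bindings           # still the outer meta bindings
after_outer = cmd.bindings
```

Had the inner exit path assigned `cmd.default_bindings` instead of `saved`, `after_inner` would point at the application's bindings, which is the failure the commit message describes.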
Jason Ekstrand
4223de769e vk/device: Simplify surface_count calculation 2015-05-16 10:23:09 -07:00
Ilia Mirkin
d7081828cc tgsi/dump: fix declaration printing of tessellation inputs/outputs
mareko: only output second dimension for non-patch semantics

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:51:23 +02:00
Ilia Mirkin
dfc3bced2c tgsi/ureg: allow ureg_dst to have dimension indices
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:51:23 +02:00
Marek Olšák
ec67d73a73 tgsi/ureg: use correct limit for max input count
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:51:23 +02:00
Ilia Mirkin
93c940736f tgsi/sanity: set implicit in/out array sizes based on patch sizes
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:51:23 +02:00
Ilia Mirkin
5b45cbe7e2 tgsi/scan: allow scanning tessellation shaders
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:51:22 +02:00
Marek Olšák
2420ee497a gallium: disable tessellation shaders for meta ops
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:51:22 +02:00
Marek Olšák
ed1b273ffc gallium/cso: set NULL shaders at context destruction
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:51:22 +02:00
Marek Olšák
2a7da1bddb gallium/cso: add support for tessellation shaders
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:51:22 +02:00
Marek Olšák
267ad27ab6 gallium/u_blitter: disable tessellation for all operations
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:51:22 +02:00
Marek Olšák
66630290df gallium/util: print vertices_per_patch in util_dump_draw_info
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:51:22 +02:00
Marek Olšák
369aca1b4a trace: implement new tessellation functions
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:51:22 +02:00
Ilia Mirkin
6b26206120 gallium: add set_tess_state to configure default tessellation parameters
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:51:22 +02:00
Ilia Mirkin
4dbfe6b627 gallium: add vertices_per_patch to draw info
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:51:15 +02:00
Ilia Mirkin
9e1ba1d689 gallium: add tessellation shader properties
v2: Marek: rename tess spacing definitions

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:48:54 +02:00
Ilia Mirkin
18bce2f194 gallium: add interfaces for controlling tess program state
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:48:54 +02:00
Marek Olšák
7ffc1fb928 gallium: bump shader input and output limits
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:48:54 +02:00
Ilia Mirkin
018aa27953 gallium: add new semantics for tessellation
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:48:54 +02:00
Ilia Mirkin
88c4f5d0a5 gallium: add new PATCHES primitive type
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:48:54 +02:00
Ilia Mirkin
398b0b3e36 gallium: add tessellation shader types
v2: Marek: rename shader types

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-05-16 14:48:54 +02:00
Jason Ekstrand
eb1952592e vk/glsl_helpers: Fix GLSL_VK_SHADER with respect to commas
Previously, the GLSL_VK_SHADER macro didn't work if the shader contained
commas outside of parentheses due to the way the C preprocessor works.
This commit fixes this by making it variadic again and doing it correctly
this time.
2015-05-15 22:17:07 -07:00
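The comma problem in commit eb1952592e follows from how the C preprocessor splits macro arguments: commas nested inside parentheses are protected, while any other comma starts a new argument. The sketch below imitates that splitting rule in Python to show why a shader body containing a declaration like `vec2 a, b;` arrives as more than one argument. The function is illustrative only and ignores string literals and other preprocessor details.

```python
def split_macro_args(text):
    """Split text at commas that are not nested inside parentheses."""
    args, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            args.append(text[start:i].strip())
            start = i + 1
    args.append(text[start:].strip())
    return args


shader = "vec2 a, b; gl_Position = vec4(a, b);"
pieces = split_macro_args(shader)
print(pieces)
```

A variadic macro sidesteps the split because the whole trailing argument list, commas included, can be stringified as one unit.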
Ian Romanick
35c28103b0 glapi: Remove offset from the DTD
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-15 20:23:34 -07:00
Ian Romanick
a75910071e glapi: Whitespace clean up after the previous commit
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-15 20:23:33 -07:00
Ian Romanick
f507d33d4f glapi: Remove all offset tags from the XML
Changes generated by:

    cd src/mapi/glapi/gen
    for i in *.xml; do
        cat $i |\
        sed 's/[[:space:]]*offset="[^"]*">/>/' |\
        sed 's/[[:space:]]*offset="[^"]*"[[:space:]]*$//' |\
        sed 's/[[:space:]]*offset="[^"]*"[[:space:]]*/ /' > x
        mv x $i
    done

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-15 20:23:31 -07:00
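For readers less familiar with sed, the three substitutions in commit f507d33d4f can be mirrored with Python's `re` module. This is a sketch of the same per-line logic, not a replacement for the script in the commit; the function name and the `attr` parameter are additions for illustration (the same shape would apply to the `static_dispatch` cleanup elsewhere in the series).

```python
import re


def strip_attribute(line, attr="offset"):
    """Remove one attr="..." occurrence from a line, mirroring the three sed passes."""
    quoted = re.escape(attr) + r'="[^"]*"'
    # 1. attribute directly before the closing '>' of the tag
    line = re.sub(r"\s*" + quoted + ">", ">", line, count=1)
    # 2. attribute at the end of the line
    line = re.sub(r"\s*" + quoted + r"\s*$", "", line, count=1)
    # 3. attribute in the middle of the tag: leave a single space
    line = re.sub(r"\s*" + quoted + r"\s*", " ", line, count=1)
    return line


# Made-up sample lines in the style of the glapi XML.
for sample in ('<function name="Foo" offset="12">',
               '    <function name="Bar" offset="7"',
               '<function name="Baz" offset="9" alias="Qux">'):
    print(strip_attribute(sample))
```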
Ian Romanick
2b419e0db9 glapi: Use the offsets from static_data.py instead of from the XML
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>
2015-05-15 20:23:24 -07:00
Ian Romanick
0fe7eab8d9 glapi: Add a list of functions that are not used but still need dispatch slots
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-15 20:23:20 -07:00
Ian Romanick
d2ee60cd52 glapi: Remove static dispatch for functions that didn't exist in NVIDIA
Comparing the output of

    nm -D libGL.so.349.16 | grep ' T gl[^X]' | sed 's/.* T //'

between NVIDIA 349.16 and this commit, the only change is a bunch
of functions that NVIDIA exports that Mesa does not.

If a function is not statically exported by either of the major binary
drivers on Linux, there is almost zero chance that any application
statically links with it.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-15 20:23:19 -07:00
Ian Romanick
4adfc6ed31 glapi: Remove static dispatch for functions that didn't exist in fglrx
Comparing the output of

    nm -D arch/x86_64/usr/X11R6/lib64/fglrx/fglrx-libGL.so.1.2 |\
        grep ' T gl[^X]' | sed 's/.* T //'

between Catalyst 14.6 Beta and this commit, the only change is a bunch
of functions that AMD exports that Mesa does not and some OpenGL ES
1.1 functions that Mesa exported but AMD does not.

The OpenGL ES 1.1 functions (e.g., glAlphaFuncx) are added by extensions
in desktop.  Our infrastructure doesn't allow us to statically export a
function in one lib and not in another.  The GLES1 conformance tests
expect to be able to link with these functions, so we have to export
them.

If a function is not statically exported by either of the major binary
drivers on Linux, there is almost zero chance that any application
statically links with it.

As a side note... I find it odd that AMD exports glTextureBarrierNV but
not glTextureBarrier.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-15 20:23:18 -07:00
Ian Romanick
90a1a4e234 glapi: Remove static dispatch for functions that didn't exist in 10.3
Comparing the output of

    nm libGL.so | grep ' T gl[^X]' | sed 's/.* T //'

between 10.3.7 and this commit, the only change is the removal of
glFramebufferTextureFaceARB.  This function was removed a couple commits
previously.

glClipControl was, at the time 10.3 shipped, a very new function.  It
was added by GL_ARB_clip_control.  That extension was ratified by the
Khronos Board of Promoters on August 7, 2014.  It's less than a year
old, and I don't think it's likely that there are many applications
using that extension... much less statically linking with the function.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-15 20:23:16 -07:00
Ian Romanick
c1ad2bac71 glapi: Remove static dispatch for functions that didn't exist in 10.4
Comparing the output of

    nm libGL.so | grep ' T gl[^X]' | sed 's/.* T //'

between 10.4.7 and this commit, the only change is the removal of
glFramebufferTextureFaceARB.  This function was removed a couple commits
previously.

None of these functions are particularly new.  If applications were not
statically linking them with 10.4.7, there's approximately zero chance
they will for 10.6.

Almost all of these functions are for GL_ARB_direct_state_access.
Since the whole DSA API wasn't statically exported (and the extension
wasn't enabled!), I think there's exactly zero chance anyone linked
against these symbols.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-15 20:23:15 -07:00
Ian Romanick
832d43bbb6 glapi: Remove static dispatch for functions that didn't exist in 10.5
Comparing the output of

    nm libGL.so | grep ' T gl[^X]' | sed 's/.* T //'

between 10.5.5 and this commit, the only change is the removal of
glFramebufferTextureFaceARB.  This function was removed a couple commits
previously.

None of these functions are particularly new.  If applications were not
statically linking them with 10.5.5, there's approximately zero chance
they will for 10.6.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-15 20:23:13 -07:00
Ian Romanick
ea54b3ea1a glapi: Remove static_dispatch from the DTD
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-15 20:22:43 -07:00
Ian Romanick
7a22e78704 glapi: Whitespace clean up after the previous commit
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>
2015-05-15 20:22:40 -07:00
Ian Romanick
44e67398cc glapi: Remove all static_dispatch tags from the XML
Changes generated by:

    cd src/mapi/glapi/gen
    for i in *.xml; do
        cat $i |\
        sed 's/[[:space:]]*static_dispatch="[^"]*">/>/' |\
        sed 's/[[:space:]]*static_dispatch="[^"]*"[[:space:]]*$//' |\
        sed 's/[[:space:]]*static_dispatch="[^"]*"[[:space:]]*/ /' > x
        mv x $i
    done

Comparing the output of

        nm libGL.so | grep ' T gl[^X]' | sed 's/.* T //'

before and after this commit showed no differences.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>
2015-05-15 20:22:36 -07:00
Ian Romanick
d9be1db4b6 glapi: Store list of functions with static dispatch in a separate table
The set of functions with static dispatch is (supposed to be) defined by
the Linux OpenGL ABI.  We export quite a few more functions than that
for historical reasons.  However, this list should never grow.

This table is used instead of the static_dispatch tag in the XML to
generate the static dispatch functions.  I used

    nm libGL.so | grep ' T gl[^X]' | sed 's/.* T //'

before and after the change.  diff showed no differences.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>
2015-05-15 20:22:32 -07:00
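The before/after check used throughout this series (nm piped through grep and sed, then diffed) can be expressed as a small Python sketch. The regular expressions follow the shell pipeline quoted in the commit messages; the sample nm output and the function name are invented for illustration.

```python
import re


def exported_gl_symbols(nm_output):
    """Names of 'T' symbols starting with 'gl' but not 'glX'."""
    names = []
    for line in nm_output.splitlines():
        if re.search(r" T gl[^X]", line):                      # grep ' T gl[^X]'
            names.append(re.sub(r".* T ", "", line, count=1))  # sed 's/.* T //'
    return names


# Made-up nm output for illustration only.
BEFORE = """\
0000000000041a20 T glAccum
0000000000041a40 T glActiveTexture
0000000000062000 T glXChooseVisual
0000000000070000 t local_helper
"""
AFTER = """\
0000000000041a20 T glAccum
0000000000062000 T glXChooseVisual
"""
removed = sorted(set(exported_gl_symbols(BEFORE)) - set(exported_gl_symbols(AFTER)))
print(removed)
```

An empty `removed` list (and an empty list for the reverse difference) corresponds to the "diff showed no differences" result reported in the commit.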
Ian Romanick
d649fcf727 glapi: Store static dispatch offsets in a separate table
Since the set of functions with static dispatch will never change, there is no
reason to store it in the XML.  It's just one of those fields that
confuses people adding new functions.

This is split out from the rest of the series so that in-code assertions
can be used to verify that the data in the Python code matches the XML.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>
2015-05-15 20:22:26 -07:00
Ian Romanick
5aaabd7630 mesa: Remove all vestiges of glFramebufferTextureFaceARB
Mesa does not (and probably never will) support GL_ARB_geometry_shader4,
so this function will never exist.  Having a function that is
exec="skip" and offset="assign" is just weird.

There are still a couple 'exec="skip" offset="assign"' functions
remaining.  These remain because we either support GLX protocol for them
(glSampleMaskSGIS and glSamplePatternSGIS) or older DRI drivers still
need them in the dispatch table (glResizeBuffersMESA).  The SGIS
functions can be removed later.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-15 20:22:23 -07:00
Ian Romanick
0784bb01b5 glapi: Mark a couple functions "ignore" for GLX
Without this the next patch will try to put these functions in the
dispatch table in indirect_init.c.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-15 20:21:36 -07:00
Kristian Høgsberg
3b9f32e893 vk: Make cmd_buffer->bindings a pointer
This lets us save and restore efficiently by just moving the pointer to
a temporary bindings struct for meta.
2015-05-15 18:12:07 -07:00
Kristian Høgsberg
9540130c41 vk: Move vertex buffers into struct anv_bindings 2015-05-15 16:34:31 -07:00
Kristian Høgsberg
0cfc493775 vk: Fix GLSL_VK_SHADER macro
Stringify doesn't work with __VA_ARGS__. The last macro argument swallows
up excess arguments and as such we can just stringify that.
2015-05-15 16:15:04 -07:00
Kristian Høgsberg
af45f4a558 vk: Fix warning from missing initializer
Struct initializers need to be { 0, } to zero out the variable they're
initializing.
2015-05-15 16:07:17 -07:00
Kristian Høgsberg
bf096c9ec3 vk: Build binding tables at bind descriptor time
This changes the way descriptor sets and layouts work so that we fill
out binding table contents at the time we bind descriptor sets. We
manipulate the binding table contents and sampler state in a shadow-copy
in anv_cmd_buffer. At draw time, we allocate the actual binding table
and sampler state and flush the anv_cmd_buffer copies.
2015-05-15 16:05:31 -07:00
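The shadow-copy scheme from commit bf096c9ec3 can be pictured with a toy model: binding descriptor sets only edits a CPU-side table, and the real table is allocated and filled when a draw happens. The class, its field names, and the dirty flag are assumptions of this sketch rather than details taken from the driver.

```python
class CmdBuffer:
    """Toy command buffer with a shadow binding table."""

    def __init__(self, table_size):
        self.shadow_table = [None] * table_size   # CPU-side copy, edited at bind time
        self.dirty = False                        # assumed optimisation, not from the commit
        self.emitted_tables = []                  # stands in for GPU-visible allocations

    def bind_descriptor_set(self, first_slot, surfaces):
        """Bind time: write entries into the shadow copy only."""
        for i, surface in enumerate(surfaces):
            self.shadow_table[first_slot + i] = surface
        self.dirty = True

    def draw(self):
        """Draw time: allocate a real table and flush the shadow copy into it."""
        if self.dirty:
            self.emitted_tables.append(list(self.shadow_table))
            self.dirty = False
        return self.emitted_tables[-1] if self.emitted_tables else None


cmd = CmdBuffer(4)
cmd.bind_descriptor_set(0, ["tex0", "tex1"])
first = cmd.draw()
cmd.bind_descriptor_set(1, ["tex2"])
second = cmd.draw()
```

Each draw that follows a bind gets its own snapshot, so a table already handed to an earlier draw is never modified afterwards.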
Kristian Høgsberg
1f6c220b45 vk: Update the bind map length to reflect MAX_SETS 2015-05-15 15:22:29 -07:00
Kristian Høgsberg
b806e80e66 vk: Flip back to using memfd for the allocators 2015-05-15 15:22:29 -07:00
Kristian Høgsberg
0a775e1eab vk: Rename dyn_state_pool to dynamic_state_pool
Given that we already tolerate surface_state_pool and the even longer
instruction_state_pool, there's no reason to arbitrarily abbreviate
dynamic.
2015-05-15 15:22:29 -07:00
Kristian Høgsberg
f5b0f1351f vk: Consolidate image, buffer and color attachment views
These are all just surface state, offset and a bo.
2015-05-15 15:22:29 -07:00
Fredrik Höglund
b3059bb7c5 st/mesa: Flush the bitmap cache in st_BlitFramebuffer
With DSA we can no longer rely on this being done in st_validate_state
in response to the framebuffer bindings having changed.

This fixes the ext_framebuffer_multisample-bitmap piglit test.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-05-15 22:12:05 +02:00
Ian Romanick
d43aed9646 i965: Fix FS unit tests
Commit 3687d75 changed the fs_visitor constructors, but it didn't update
all the users.  As a result, 'make check' fails.

I added the explicit cast to the gl_program* parameter to make it more
clear which NULL was which.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@Whitecape.org>
2015-05-15 12:31:15 -07:00
Alexander von Gluck IV
7de484871d target/haiku-softpipe: Move api init into st code
We also reduce the amount of need-to-know information about st_api
to require one less extern "C" in st_manager.h

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-05-15 13:55:59 -04:00
Alexander von Gluck IV
9b5da7f06a st/hgl: Move st_api creation to st and extern "C" it
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-05-15 13:55:59 -04:00
Alexander von Gluck IV
73aef2d1d8 winsys/hgl: Add needed extern "C" to hgl winsys
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-05-15 13:55:59 -04:00
Alexander von Gluck IV
624b38add9 gallium/drivers: Add extern "C" wrappers to public entry
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-05-15 13:55:59 -04:00
Alexander von Gluck IV
40a8b2f92a gallium/aux: Add needed extern "C" wrappers
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-05-15 13:55:59 -04:00
Kenneth Graunke
3687d752e5 i965/fs: Combine the fs_visitor constructors.
For scalar GS support, we either need to add a fourth constructor which
takes the GS structures, or combine the existing two and pass the shader
stage.

Given that they're not significantly different, I opted for the latter.

v2: Remove more stuff from the .h file (Jason and Jordan).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-05-14 21:19:48 -07:00
Jason Ekstrand
41db8db0f2 vk: Add a GLSL scraper utility
This new utility, glsl_scraper.py, scrapes C files for instances of the
GLSL_VK_SHADER macro, pulls out the shader source, and compiles it to
SPIR-V.  The compilation is done using glslangValidator.  The result is then
placed into another C file as arrays of dwords that can be easily handed
to a Vulkan driver.
2015-05-14 19:18:57 -07:00
Jason Ekstrand
79ace6def6 vk/meta: Add a magic GLSL shader source macro 2015-05-14 19:07:34 -07:00
Emil Velikov
0c4eef6a2c egl: remove remaining EGL_MESA_copy_context skeleton
With earlier commit (7a58262e58 egl: Remove skeleton implementation of
EGL_MESA_screen_surface) we've removed the skeleton implementation of
eglCopyContextMESA(). Just like EGL_MESA_screen_surface this extension
was never implemented in mesa.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2015-05-14 21:05:16 +00:00
Emil Velikov
448e01b291 egl/main: fix EGL_KHR_get_all_proc_addresses
The extension requires that the address of the core functions should be
available via eglGetProcAddress. Currently the list is guarded by
_EGL_GET_CORE_ADDRESSES, which was only set for the scons (windows)
build.

Unconditionally enable it for all the builds (automake, android and
haiku) considering that the extension is not platform specific and is
always enabled.

v2: Drop the _EGL_GET_CORE_ADDRESSES macro altogether.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-14 21:00:05 +00:00
Marc-André Lureau
ffc94e32a3 egl: more define fixes for EGL_MESA_image_dma_buf_export
s/EGL_MESA_dma_buf_image_export/EGL_MESA_image_dma_buf_export as defined by the spec
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-05-14 21:00:05 +00:00
Emil Velikov
e3cc5ad49d egl/main: expose only core EGL functions statically
The EGL 1.3, 1.4 and 1.5 specs (as quoted below) explicitly mention that
providing static symbols for functions provided by EGL extensions is not
portable. Considering that relatively recently we've seen a non-mesa
desktop EGL implementation, the fact that we opt for such behaviour has
gone unnoticed.

From the EGL 1.5 specification:
    For functions that are queryable with eglGetProcAddress,
    implementations may choose to also export those functions
    statically from the object libraries implementing those
    functions. However, portable clients cannot rely on this
    behavior.

To discourage devs from writing such non-portable code, let's hide the
symbols similar to the official binary driver from NVIDIA.

v2: Quote the EGL 1.5 spec, as suggested by Chad.

Cc: Brian Paul <brianp@vmware.com>
Cc: Chad Versace <chad.versace@intel.com>
Cc: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-05-14 21:00:05 +00:00
Emil Velikov
f9bf9133cc egl: fix the EGL_MESA_image_dma_buf_export header declarations
Similar to other EGL extensions - guard the function prototypes by
EGL_EGLEXT_PROTOTYPES as the libEGL library does (should) not provide
the symbols statically.

Instead users should call eglGetProcAddress, which returns the function
pointer. The latter of which was missing the type declaration (typedef).

Cc: Dave Airlie <airlied@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-05-14 21:00:04 +00:00
Emil Velikov
9790988123 egl/main: Update README.txt
The driver search/load is not done at eglGetDisplay (or eglOpenDisplay
as the readme called it) time, but during eglInitialize().

Drop _eglMain (available only for external drivers) reference. Mention
we use function(s), specific to the built-in driver(s).

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-05-14 21:00:04 +00:00
Emil Velikov
1fac38ee32 egl/main: cleanup function prototypes
Cleanup the function prototypes which were part of the previous EGL
drivers.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-05-14 21:00:04 +00:00
Emil Velikov
209360bbb9 egl/main: drop support for external egl drivers
The only user (egl_gallium) is no longer around.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-05-14 21:00:04 +00:00
Rob Clark
4925c35660 freedreno: fix bug in tile/slot calculation
This was causing corruption with hw binning on a306.  Unlikely that it
is a306 specific, but rather the smaller gmem size resulted in different
tile configuration which was triggering the bug at certain resolutions.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4" and "10.5" and "10.6" <mesa-stable@lists.freedesktop.org>
2015-05-14 14:46:14 -04:00
Rob Clark
fcc7d6323b freedreno: enable a306
Whitelist adreno 306 (as found in msm8916/apq8016).  Works pretty much
out of the box, although the smaller GMEM size requires more tiles to
fit 1920x1080, so bump up the max # of tiles as well.

Since it is just whitelist + trivial change, it makes sense to land on
all the active release branches.

Note that a305c ends up with gpu-id "306", hence a306 ends up with
gpu-id of "307".  Apparently that is what happens when you let the
marketing dept name things.

Cc: "10.4" and "10.5" and "10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-05-14 14:46:14 -04:00
Jason Ekstrand
018a0c1741 vk/meta: Add a better comment about the VS for blits 2015-05-14 11:39:32 -07:00
Alexander von Gluck IV
0fbf49ce57 egl/haiku: Drop extern "C". No longer needed
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-05-14 14:08:40 -04:00
Alexander von Gluck IV
8362068c1b egl: Add needed extern "C" for C++ access
* Haiku's egl driver is C++ due to the interface natively being C++

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-05-14 14:08:37 -04:00
Samuel Pitoiset
175cbb447a nvc0: remove unused nv50_tsc_wrap_mode() function
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-14 13:27:44 -04:00
Samuel Pitoiset
ac1ac94b38 nv50/ir: silence compiler warnings about mismatched tags
These warnings have been detected by Clang 3.6.

codegen/nv50_ir_from_tgsi.cpp:1319:10: warning: struct 'Source' was
previously declared as a class [-Wmismatched-tags] const struct tgsi::Source *code;

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-14 13:27:44 -04:00
Samuel Pitoiset
70651b7041 nv50/ir: remove unused private field cycle from SchedDataCalculator
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-14 13:27:43 -04:00
Samuel Pitoiset
7469f2fd23 nv30: remove unused nvfx_fp_memcpy() function and comment nv40_fp_bra()
The nv40_fp_bra() function in the same file is also unused but this is
the only place where the nv30/nv40 isa is documented.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-14 13:27:43 -04:00
Samuel Pitoiset
48c84a36dd nvc0: do not expose MP counters for nvf0 (GK110+)
This fixes a crash when trying to monitor MP counters because compute
support is not implemented for nvf0.

Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-14 13:27:43 -04:00
Fredrik Höglund
b9cb7c1980 docs/relnotes: Mark off ARB_direct_state_access for 10.6
v2: Make it clear that ARB_direct_state_access is only available on
    drivers that support GL 2.0+

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2015-05-14 15:48:18 +02:00
Fredrik Höglund
d9109cc211 docs: Update the ARB_direct_state_access status
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2015-05-14 15:48:18 +02:00
Fredrik Höglund
357bf80caa st/mesa: Enable ARB_direct_state_access
Assume that all drivers that advertise support for NPOT textures
are able to support GL 2.0.

v2: Add a comment.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2015-05-14 15:48:18 +02:00
Fredrik Höglund
a57feba0a3 i965: Enable ARB_direct_state_access
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2015-05-14 15:48:18 +02:00
Fredrik Höglund
121030eed8 i915: Enable ARB_direct_state_access
This extension requires OpenGL 2.0, so enable it on gen3 and later.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2015-05-14 15:48:18 +02:00
Fredrik Höglund
d3368e0c9e mesa: Add ARB_direct_state_access checks in query object functions
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2015-05-14 15:48:17 +02:00
Fredrik Höglund
bebf3c6ab3 mesa: Add ARB_direct_state_access checks in program pipeline functions
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2015-05-14 15:48:17 +02:00
Fredrik Höglund
9e7149c898 mesa: Add ARB_direct_state_access checks in sampler object functions
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2015-05-14 15:48:17 +02:00
Fredrik Höglund
36b0579337 mesa: Add ARB_direct_state_access checks in VAO functions
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2015-05-14 15:48:17 +02:00
Fredrik Höglund
8940957238 mesa: Add ARB_direct_state_access checks in texture functions
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2015-05-14 15:48:17 +02:00
Fredrik Höglund
cb49940766 mesa: Add ARB_direct_state_access checks in renderbuffer functions
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2015-05-14 15:48:17 +02:00
Fredrik Höglund
6ad0b7e07a mesa: Add ARB_direct_state_access checks in FBO functions
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2015-05-14 15:48:17 +02:00
Fredrik Höglund
339ed0984d mesa: Add ARB_direct_state_access checks in buffer object functions
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2015-05-14 15:48:16 +02:00
Fredrik Höglund
7d212765a4 mesa: Add ARB_direct_state_access checks in XFB functions
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2015-05-14 15:48:16 +02:00
Fredrik Höglund
03420eac0c mesa: Make GL_TEXTURE_CUBE_MAP valid in FramebufferTextureLayer
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2015-05-14 15:48:16 +02:00
Fredrik Höglund
30dcaaec35 mesa: Add an extension flag for ARB_direct_state_access
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2015-05-14 15:48:16 +02:00
Laura Ekstrand
9de7a81626 main: Add entry point for NamedFramebufferDrawBuffers.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:16 +02:00
Laura Ekstrand
68c6964b37 main: Refactor DrawBuffers.
This could have added a new DD table entry for DrawBuffers that takes an
arbitrary draw buffer, but, after looking at the existing DD functions,
Kenneth Graunke recommended that we just skip calling the DD functions in the
case of ARB_direct_state_access.  The DD implementations for DrawBuffer(s)
have limited functionality, especially with respect to
ARB_direct_state_access.

[Fredrik: Call the driver function when fb is the bound draw buffer]

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:15 +02:00
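The bracketed note above can be sketched roughly as follows, assuming a simplified context with an optional driver hook. `struct framebuffer`, `struct context`, `driver_draw_buffer` and `set_draw_buffer` are hypothetical names, not Mesa's real dd_function_table layout:

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the Mesa structures. */
struct framebuffer {
    unsigned draw_buffer;
};

struct context {
    struct framebuffer *bound_draw_fb;
    /* Optional driver hook; it only understands the bound framebuffer. */
    void (*driver_draw_buffer)(struct context *ctx, unsigned buffer);
    int driver_calls;
};

static void
set_draw_buffer(struct context *ctx, struct framebuffer *fb, unsigned buffer)
{
    /* Core state is always updated on the framebuffer that was named. */
    fb->draw_buffer = buffer;

    /* The driver is notified only when fb is the bound draw buffer;
     * for any other framebuffer the hook is skipped. */
    if (fb == ctx->bound_draw_fb && ctx->driver_draw_buffer != NULL)
        ctx->driver_draw_buffer(ctx, buffer);
}

/* Example hook that just counts how often it is called. */
static void
count_driver_call(struct context *ctx, unsigned buffer)
{
    (void)buffer;
    ctx->driver_calls++;
}
```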
Laura Ekstrand
1f0a5f32d3 main: Add entry point for NamedFramebufferReadBuffer.
[Fredrik: Fix the name of the buf parameter in the XML file]

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:15 +02:00
Laura Ekstrand
7518c6b5b2 main: Refactor _mesa_ReadBuffer.
This could have added a new DD table entry for ReadBuffer that takes an
arbitrary read buffer, but, after looking at the existing DD functions,
Kenneth Graunke recommended that we just skip calling the DD functions in the
case of ARB_direct_state_access.  The DD implementations for ReadBuffer
have limited functionality, especially with respect to
ARB_direct_state_access.

[Fredrik: Call the driver function when fb is the bound read buffer]

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:15 +02:00
Laura Ekstrand
642fb71277 main: Add entry point for NamedFramebufferDrawBuffer.
[Fredrik: Fix the name of the buf parameter in the XML file]

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:15 +02:00
Laura Ekstrand
2f32e4847d main: Refactor _mesa_DrawBuffer.
This could have added a new DD table entry for DrawBuffer that takes an
arbitrary draw buffer, but, after looking at the existing DD functions,
Kenneth Graunke recommended that we just skip calling the DD functions in the
case of ARB_direct_state_access.  The DD implementations for DrawBuffer(s)
have limited functionality, especially with respect to
ARB_direct_state_access.

[Fredrik: Call the driver function when fb is the bound draw buffer]

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:15 +02:00
Laura Ekstrand
f8fd8dfee8 main: Refactor _mesa_drawbuffers.
[Fredrik: Whitespace fix]

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:15 +02:00
Laura Ekstrand
9f1db78a83 main: Add stubs for [Get]NamedFramebufferParameteri[v].
The ARB_direct_state_access specification says (as of 2015.02.05):
   "Interactions with OpenGL 4.3 or ARB_framebuffer_no_attachments

       If neither OpenGL 4.3 nor ARB_framebuffer_no_attachments are supported,
       ignore the support for NamedFramebufferParameteri and
       GetNamedFramebufferParameteriv."

This commit adds stubs for these entry points.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:15 +02:00
Laura Ekstrand
a0329c7b40 main: Fake entry point for glClearNamedFramebufferfi.
Mesa's ClearBuffer framework is very complicated and thoroughly married to the
object binding model.  Moreover, the OpenGL spec for ClearBuffer is also very
complicated.  At some point, we should implement buffer clearing for arbitrary
framebuffer objects, but for now, we will just wrap ClearBuffer.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:15 +02:00
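A rough sketch of what wrapping a binding-model clear could look like, assuming the wrapper temporarily rebinds the draw framebuffer. This is an assumption about the approach rather than a copy of the Mesa code, and `struct gl_state`, `clear_bound` and `clear_named` are hypothetical names:

```c
/* Hypothetical model of the relevant GL state. */
struct gl_state {
    unsigned draw_fb_binding;  /* currently bound draw framebuffer */
    unsigned last_cleared_fb;  /* records which framebuffer was cleared */
};

/* Stand-in for a ClearBuffer-style call: it can only act on whatever
 * framebuffer is currently bound. */
static void
clear_bound(struct gl_state *st)
{
    st->last_cleared_fb = st->draw_fb_binding;
}

/* "Fake" named entry point: save the binding, bind the requested
 * framebuffer, reuse the bound-object path, then restore the binding. */
static void
clear_named(struct gl_state *st, unsigned framebuffer)
{
    unsigned saved = st->draw_fb_binding;

    st->draw_fb_binding = framebuffer;
    clear_bound(st);
    st->draw_fb_binding = saved;
}
```

The caller's binding is unchanged afterwards, which is what makes the wrapper look like a direct-state-access call from the outside.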
Laura Ekstrand
bbd9c55d02 main: Fake entry point for glClearNamedFramebufferfv.
Mesa's ClearBuffer framework is very complicated and thoroughly married to the
object binding model.  Moreover, the OpenGL spec for ClearBuffer is also very
complicated.  At some point, we should implement buffer clearing for arbitrary
framebuffer objects, but for now, we will just wrap ClearBuffer.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:15 +02:00
Laura Ekstrand
43db4b8465 main: Fake entry point for glClearNamedFramebufferuiv.
Mesa's ClearBuffer framework is very complicated and thoroughly married to the
object binding model.  Moreover, the OpenGL spec for ClearBuffer is also very
complicated.  At some point, we should implement buffer clearing for arbitrary
framebuffer objects, but for now, we will just wrap ClearBuffer.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:15 +02:00
Laura Ekstrand
6236c47799 main: Fake entry point for glClearNamedFramebufferiv.
Mesa's ClearBuffer framework is very complicated and thoroughly married to the
object binding model.  Moreover, the OpenGL spec for ClearBuffer is also very
complicated.  At some point, we should implement buffer clearing for arbitrary
framebuffer objects, but for now, we will just wrap ClearBuffer.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:14 +02:00
Laura Ekstrand
d890fc710f main: Add entry points for InvalidateNamedFramebuffer[Sub]Data.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:14 +02:00
Laura Ekstrand
65d4a20f1c main: Refactor invalidate_framebuffer_storage.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:14 +02:00
Laura Ekstrand
b4368ac09d main: Complete error conditions for glInvalidate*Framebuffer.
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-05-14 15:48:14 +02:00
Laura Ekstrand
6b284f08ab main: _mesa_blit_framebuffer updates its arbitrary framebuffers.
Previously, we used _mesa_update_state to update the currently bound
framebuffers prior to performing a blit.  Now that _mesa_blit_framebuffer
uses arbitrary framebuffers, _mesa_update_state is not specific enough.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:14 +02:00
Laura Ekstrand
47b910d275 main: Add entry point for BlitNamedFramebuffer.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:14 +02:00
Laura Ekstrand
b590c61725 main: Refactor _mesa_update_draw_buffer_bounds.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:14 +02:00
Laura Ekstrand
39be0c5f6c main: Refactor _mesa_get_clamp_read_color.
This wasn't necessary for ARB_direct_state_access, but felt like a good idea
for the sake of completeness.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:14 +02:00
Laura Ekstrand
2cabfd9636 main: Refactor _mesa_[update|get]_clamp_fragment_color.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:14 +02:00
Laura Ekstrand
c1fe8d841c main: Refactor _mesa_[update|get]_clamp_vertex_color.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:14 +02:00
Laura Ekstrand
9036a6c0aa main: Refactor _mesa_update_framebuffer.
_mesa_update_framebuffer now operates on arbitrary read and draw framebuffers.
This allows BlitNamedFramebuffer to update the state of its arbitrary read and
draw framebuffers.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:14 +02:00
Laura Ekstrand
1a314f3c51 main: Refactor glBlitFramebuffer.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:13 +02:00
Laura Ekstrand
df032ef7e0 main: Fix whitespace in blit.c
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:13 +02:00
Laura Ekstrand
f22fa307de main: Add entry point GetNamedFramebufferAttachmentParameteriv.
[Fredrik: - Update one of the error messages to reflect that the
            framebuffer might not be the bound framebuffer.
          - Whitespace fixes.]

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:13 +02:00
Laura Ekstrand
f93f95928d main: Add entry point for CheckNamedFramebufferStatus.
[Fredrik: - Retain the debugging code in CheckFramebufferStatus.
          - Whitespace fixes.]

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:13 +02:00
Laura Ekstrand
80e9bf2641 main: Fix indents in former get_texture_for_framebuffer functions.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:13 +02:00
Laura Ekstrand
085c67dc77 main: Major refactor of get_texture_for_framebuffer.
This splits off the (still) rather large chunk that is
get_texture_for_framebuffer into lots of smaller functions specialized to
service the wide variety of unique needs of *FramebufferTexture* entry points.
The result is much cleaner because, rather than having a pile of branches and
confusing conditions (like the boolean layered), the uniqueness is baked into
the entry points. The entry points know whether or not they are layered or use
a textarget.

[Fredrik: - Mention the value of <textarget> in the error message.
          - Rename check_zoffset to check_layer, and zoffset to layer.
            The zoffset parameter was renamed to layer in
            ARB_framebuffer_object.
          - Make layered a GLboolean since the value is visible to the API.
          - Remove EXT suffixes in refactored code.
          - Whitespace fixes.]

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:13 +02:00
Laura Ekstrand
d78c831a14 main: Add entry points for glNamedFramebufferTexture[Layer].
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:13 +02:00
Laura Ekstrand
a602b21f94 main: Fix indentation in get_texture_for_framebuffer.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:13 +02:00
Laura Ekstrand
a9f73f7f42 main: Refactor get_texture_for_framebuffer.
This moves a few blocks around so that the control flow is more obvious.  If
the texture is 0, just return true at the beginning of the function.
Likewise, if the texObj is NULL, return true at the beginning of the function
as well.

[Fredrik: Fix the texObj NULL check]

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:13 +02:00
Laura Ekstrand
a245e3bdeb main: Split framebuffer_texture.
Split apart utility function framebuffer_texture to better prepare for
implementing NamedFramebufferTexture and NamedFramebufferTextureLayer.  This
should also pave the way for some future cleanup work.

[Fredrik: - Mention which limit was exceeded when <layer> is out of range.
          - Update a comment to reflect that <fb> might not be the bound
            framebuffer.
          - Make it clear that the error message in glFramebufferTexture*D
            refers to the <textarget> parameter.
          - Remove EXT suffixes.]

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:12 +02:00
Laura Ekstrand
69bdc9dcb8 main: Fix an error generated by FramebufferTexture
gl*FramebufferTexture should generate GL_INVALID_VALUE when the
texture doesn't exist.

[Fredrik: Split this change out from the next commit]

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-05-14 15:48:12 +02:00
Fredrik Höglund
8ba7ad8abc mesa: Generate GL_INVALID_VALUE in framebuffer_texture when layer < 0
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-05-14 15:48:12 +02:00
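As an illustration of the check this commit describes, here is a minimal sketch. The helper `check_layer_sketch` and its `max_layers` parameter are hypothetical; the two error codes are defined locally with the values used by the standard GL headers so the snippet stands alone:

```c
/* Local copies of the standard GL error codes. */
#define GL_NO_ERROR      0
#define GL_INVALID_VALUE 0x0501

/* Hypothetical validation helper: a negative layer is rejected with
 * GL_INVALID_VALUE. The upper bound is an assumed, caller-supplied
 * limit and is not taken from the commit. */
static unsigned
check_layer_sketch(int layer, int max_layers)
{
    if (layer < 0)
        return GL_INVALID_VALUE;
    if (layer >= max_layers)
        return GL_INVALID_VALUE;
    return GL_NO_ERROR;
}
```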
Fredrik Höglund
f9f5c82284 main: Require that the texture exists in framebuffer_texture
Generate GL_INVALID_OPERATION if the texture hasn't been created.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-05-14 15:48:12 +02:00
Laura Ekstrand
8f78c6889d main: Fix the indentation in framebuffer_texture
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:12 +02:00
Laura Ekstrand
a29318bf0a main: Add entry point for NamedFramebufferRenderbuffer.
[Fredrik: - Remove the DummyRenderbuffer checks now that they are
            done in _mesa_lookup_renderbuffer_err.
          - Fix the <renderbuffertarget> name in error messages.
          - Make the error message in _mesa_framebuffer_renderbuffer
            reflect that <fb> might not be the bound framebuffer.
          - Remove EXT suffixes from GL tokens.]

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:12 +02:00
Laura Ekstrand
3d100372f1 main: Rename framebuffer renderbuffer software fallback.
Rename _mesa_framebuffer_renderbuffer to _mesa_FramebufferRenderbuffer_sw in
preparation for adding the ARB_direct_state_access backend function for
FramebufferRenderbuffer and NamedFramebufferRenderbuffer to share.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:12 +02:00
Laura Ekstrand
2bb138e7ec main: Add utility function _mesa_lookup_renderbuffer_err.
[Fredrik: Generate an error for non-existent renderbuffers]

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:12 +02:00
Laura Ekstrand
f868de7d6b main: Add glCreateFramebuffers.
[Fredrik: Whitespace fixes]

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:12 +02:00
Laura Ekstrand
6d8eff4af7 main: Add utility function _mesa_lookup_framebuffer_err.
[Fredrik: Generate an error for non-existent framebuffers]

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-05-14 15:48:12 +02:00
Jason Ekstrand
8c92701a69 vk/test: Use VK_IMAGE_TILING_OPTIMAL for the render target 2015-05-13 22:27:38 -07:00
Jason Ekstrand
4fb8bddc58 vk/test: Do a copy of the RT into a linear buffer and write that to a PNG 2015-05-13 22:23:30 -07:00
Jason Ekstrand
bd5b76d6d0 vk/meta: Add the start of a blit implementation
Currently, we only implement CopyImageToBuffer
2015-05-13 22:23:30 -07:00
Jason Ekstrand
94b8c0b810 vk/pipeline: Default to a SamplerCount of 1 for PS 2015-05-13 22:23:30 -07:00
Jason Ekstrand
d3d4776202 vk/pipeline: Add an extra flag for force-disabling the vertex shader
This way we can pass in a vertex shader and yet have the pipeline emit an
empty 3DSTATE_VS packet.  We need this for meta because we need to trick
the compiler into not deleting our inputs but at the same time disable the
VS so that we can use a rectlist.  This should go away once we actually get
SPIR-V.
2015-05-13 22:23:30 -07:00
Jason Ekstrand
a1309c5255 vk/pass: Emit a flushing pipe control at the end of the pass
This is rather crude but it at least makes sure that all the render targets
get flushed at the end of the pass.  We probably actually want to do
something based on image layout transitions, but this will work for now.
2015-05-13 22:23:30 -07:00
Jason Ekstrand
07943656a7 vk/compiler: Set the binding table texture_start
This is by no means a complete solution to the binding table problems.
However, it does make texturing actually work.  Before, we were texturing
from the render target since they were both starting at 0.
2015-05-13 22:23:30 -07:00
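The collision described above can be pictured with a small sketch, assuming a binding table whose first slots hold render targets. `MAX_RTS_SKETCH`, `struct bind_map_sketch` and both helpers are hypothetical, and the limit of 8 is chosen only for the example:

```c
/* Hypothetical limit on render targets, for illustration only. */
enum { MAX_RTS_SKETCH = 8 };

struct bind_map_sketch {
    unsigned render_target_start;  /* first slot used by render targets */
    unsigned texture_start;        /* first slot used by textures */
};

static unsigned
render_target_slot(const struct bind_map_sketch *map, unsigned rt)
{
    return map->render_target_start + rt;
}

static unsigned
texture_slot(const struct bind_map_sketch *map, unsigned tex)
{
    return map->texture_start + tex;
}
```

With both start values at 0, texture 0 and render target 0 resolve to the same slot, which matches the symptom of texturing from the render target. Giving `texture_start` a value past the render target range separates them.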
Jason Ekstrand
cd197181f2 vk/compiler: Zero the prog data
We use prog_data[stage] != NULL to determine whether or not we need to
clean up that stage.  Make sure it defaults to NULL.
2015-05-13 22:22:59 -07:00
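A minimal sketch of why the array has to start out as NULL, assuming cleanup frees every non-NULL entry. `STAGE_COUNT_SKETCH`, `struct pipeline_sketch` and the two functions are hypothetical names:

```c
#include <stdlib.h>

/* Hypothetical stage count, for illustration only. */
enum { STAGE_COUNT_SKETCH = 5 };

struct pipeline_sketch {
    void *prog_data[STAGE_COUNT_SKETCH];
};

/* Every slot starts as NULL so cleanup can tell used stages from
 * unused ones. */
static void
pipeline_init(struct pipeline_sketch *p)
{
    for (int s = 0; s < STAGE_COUNT_SKETCH; s++)
        p->prog_data[s] = NULL;
}

/* Frees only the stages that were populated; returns how many were
 * freed. With uninitialized slots this would free garbage pointers. */
static int
pipeline_cleanup(struct pipeline_sketch *p)
{
    int freed = 0;

    for (int s = 0; s < STAGE_COUNT_SKETCH; s++) {
        if (p->prog_data[s] != NULL) {
            free(p->prog_data[s]);
            p->prog_data[s] = NULL;
            freed++;
        }
    }
    return freed;
}
```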
Jason Ekstrand
1f7dcf9d75 vk/image: Stash more information in images and views 2015-05-13 22:22:59 -07:00
Jason Ekstrand
43126388cd vk/meta: Save/restore more stuff in cmd_buffer_restore 2015-05-13 22:22:59 -07:00
Chad Versace
50806e8dec vk: Install headers
I need this for building a testsuite.
2015-05-13 17:49:26 -07:00
Kristian Høgsberg
83c7e1f1db vk: Add support for sampler descriptors 2015-05-13 14:47:11 -07:00
Kristian Høgsberg
4f9eaf77a5 vk: Use a typesafe anv_descriptor struct 2015-05-13 14:47:11 -07:00
Kristian Høgsberg
5c9d77600b vk: Create and bind a sampler in vk.c 2015-05-13 14:47:11 -07:00
Kristian Høgsberg
18acfa7301 vk: Fix copy-n-paste sType in vkCreateSampler 2015-05-13 14:47:11 -07:00
Kristian Høgsberg
a1ec789b0b vk: Add a dynamic state stream to anv_cmd_buffer
We'll need this for sampler state.
2015-05-13 14:47:11 -07:00
Kristian Høgsberg
3f52c016fa vk: Move struct anv_sampler to private.h 2015-05-13 14:47:11 -07:00
Kristian Høgsberg
a77229c979 vk: Allocate layout->count number of descriptors
layout->count is the number of descriptors the application
requested. layout->total is the number of entries we need across all
stages.
2015-05-13 14:47:11 -07:00
Kristian Høgsberg
a3fd136509 vk: Fill out sampler state from API values 2015-05-13 14:47:11 -07:00
Roland Scheidegger
adcf8f8a13 softpipe: enable ARB_texture_view
Some bits were already there for texture views but some were missing.
In particular for cube map views things needed to change a bit.
For simplicity I ended up removing the separate face addr bit (just use
the z bit) - cube arrays didn't use it already, so just follow the same
logic there. (In theory using separate bits could allow for better hash
function but I don't think anyone ever did some measurements of that so
probably not worth the trouble; if we reintroduced it we'd certainly
want to use the same logic for cube arrays and cube maps.)
Also extend the seamless cube sampling to cube arrays - as there were no
piglit failures before this is apparently untested, but things now generally
work quite the same for cube textures and cube array textures so there
hopefully shouldn't be any trouble...

49 new piglits, 47 pass, 2 fail (both due to fake multisampling).

v2: incorporate Brian's feedback, add sampler view validation,
function rename, formatting fixes.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-05-13 22:57:50 +02:00
Roland Scheidegger
e6c66f4fb0 llvmpipe: enable ARB_texture_view
All the functionality was pretty much there, just not tested.
Trivially fix up the missing pieces (take target info from view not
resource), and add some missing bits for cubes.
Also add some minimal debug validation to detect uninitialized target values
in the view...

49 new piglits, 47 pass, 2 fail (both related to fake multisampling,
not texture_view itself). No other piglit changes.

v2: move sampler view validation to sampler view creation, update docs.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-05-13 22:57:50 +02:00
Roland Scheidegger
2712f70d57 gallium/util: fix blitter sampler view target initialization
This was missing, and drivers relying on the target in the view could get
into quite some trouble.

Signed-off-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-05-13 22:57:50 +02:00
Alexander von Gluck IV
cf71e7093c glapi/hgl: Drop extern "C" as it was added to glapi
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-05-13 15:26:29 -04:00
Alexander von Gluck IV
d27b114eaf glapi: Add extern "C" to glapi_priv.h
* The Haiku glapi has a C++ wrapper around the dispatch code.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-05-13 15:26:26 -04:00
Chad Versace
828817b88f vk: Ignore vk executable 2015-05-13 12:05:38 -07:00
Alexander von Gluck IV
915d808a56 gallium/st + hgl: Build fixes for Haiku
* No impact risk to any other platforms
* Tracing printf needs stdio.h now due to child header change
* Add missing #/src include directory for util/macros.h
2015-05-13 09:41:30 -05:00
Francisco Jerez
d247615e0d i965: Fix PBO cache coherency issue after _mesa_meta_pbo_GetTexSubImage().
This problem can easily be reproduced with a number of
ARB_shader_image_load_store piglit tests, which use a buffer object as
PBO for a pixel transfer operation and later on bind the same buffer
to the pipeline as shader image -- The problem is not exclusive to
images though, and is likely to affect other kinds of buffer objects
that can be bound to the 3D pipeline, including vertex, index,
uniform, atomic counter buffers, etc.

CC: 10.5 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-05-13 14:28:25 +03:00
Tapani Pälli
58715b7239 i965/fs: set execution size to 8 with simd8 ddy instruction
Commit dd5c825 changed how the execution size for instructions
gets set. Previously it was based on destination register width, now
it is set explicitly when emitting instructions.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90258
2015-05-13 09:08:47 +03:00
Dave Airlie
71fc52072b i965/cs: drop explicit initialisers in C++ file
gcc 4.4.7 really doesn't like them, and they aren't standard
C++, they seem to be a gcc extension.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-13 10:09:48 +10:00
Ilia Mirkin
c696a318ef nouveau: document nouveau_heap
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-12 18:58:49 -04:00
Ilia Mirkin
d06ce2f1df nvc0: switch mechanism for shader eviction to be a while loop
This aligns it to work similarly to nv50. However there's no library
code there, so the whole thing can be freed. Here we end up with an
allocated node that's not attached to a specific program.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86792
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2015-05-12 18:47:17 -04:00
Ilia Mirkin
380f7611b5 st/mesa: update stencil surface if it comes from texture
Now that ARB_texture_stencil8 is supported, this might happen.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-05-12 18:41:11 -04:00
Kristian Høgsberg
2b7a060178 vk: Fix stale error handling in vkQueueSubmit 2015-05-12 14:38:58 -07:00
Kristian Høgsberg
cb986ef597 vk: Submit all cmd buffers passed to vkQueueSubmit 2015-05-12 14:38:12 -07:00
Kristian Høgsberg
9905481552 vk: Add generated header for HSW and IVB (GEN75 and GEN7) 2015-05-12 14:29:04 -07:00
Jason Ekstrand
ffe9f60358 vk: Add stub() and stub_return() macros and mark piles of functions as stubs 2015-05-12 13:45:02 -07:00
Jason Ekstrand
d3b374ce59 vk/util: Add an anv_finishme function/macro 2015-05-12 13:43:36 -07:00
Jason Ekstrand
7727720585 vk/meta: Break setting up meta clear state into its own function 2015-05-12 13:03:50 -07:00
Jason Ekstrand
4336a1bc00 vk/pipeline: Add support for disabling the scissor in "extra" 2015-05-12 12:53:01 -07:00
Alex Deucher
71ba30f778 radeonsi: add new bonaire pci id
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2015-05-12 14:46:42 -04:00
Marek Olšák
0ea1047d8c st/mesa: translate st_api robustness flags to gl_context flags
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 19:38:45 +02:00
Marek Olšák
f1c42475a5 st/dri: add support for create_context_robustness GLX and EGL extensions
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 19:38:45 +02:00
Marek Olšák
a0ad185803 st/mesa: implement GetGraphicsResetStatus
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 19:38:45 +02:00
Marek Olšák
79ffc08ae8 gallium: add PIPE_CAP_DEVICE_RESET_STATUS_QUERY
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 19:38:31 +02:00
Marek Olšák
cacd0e290a gallium: add an interface for querying a device reset status
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 19:34:20 +02:00
Francisco Jerez
a533d4edf1 clover: Implement locking of the wait_count, _chain and _status members of event.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
CC: 10.5 <mesa-stable@lists.freedesktop.org>
2015-05-12 15:47:57 +03:00
Francisco Jerez
4022a468b2 clover: Wrap event::_status in a method to prevent unlocked access.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
CC: 10.5 <mesa-stable@lists.freedesktop.org>
2015-05-12 15:47:57 +03:00
Francisco Jerez
2232b929fd clover: Refactor event::trigger and ::abort to prevent deadlock and reentrancy issues.
Refactor ::trigger and ::abort to split out the operations that access
concurrently modified data members and require locking from the
recursive and possibly re-entrant part of these methods.  This will
avoid some deadlock situations when locking is implemented.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
CC: 10.5 <mesa-stable@lists.freedesktop.org>
2015-05-12 15:47:57 +03:00
Francisco Jerez
d91d6b3f03 nir: Translate memory barrier intrinsics from GLSL IR.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 15:47:57 +03:00
Francisco Jerez
f8f8b31847 nir: Translate image load, store and atomic intrinsics from GLSL IR.
v2: Undefine coordinate components not applicable to the target.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 15:47:57 +03:00
Francisco Jerez
6de78e6b0c nir: Fix indexing of atomic counter arrays with a constant value.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 15:47:57 +03:00
Francisco Jerez
f1269a3e01 nir: Add memory barrier intrinsic.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 15:47:57 +03:00
Francisco Jerez
d9e930997f nir: Define image load, store and atomic intrinsics.
v2: Undefine coordinate components not applicable to the target.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 15:47:57 +03:00
Francisco Jerez
ee1a8b5a8c i965/fs: Have component() set the register stride to zero.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 15:47:56 +03:00
Francisco Jerez
4171ef371a i965/fs: Fix offset() for registers with zero stride.
stride == 0 implies that the register has one channel per vector
component.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-12 15:47:56 +03:00
Francisco Jerez
0db663503e i965: Don't forget the force_sechalf flag in lower_load_payload().
Regression from commit 41868bb682.
Fixes a bunch of ARB_shader_image_load_store tests.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-05-12 15:47:56 +03:00
Francisco Jerez
cbf204069d i965: Document brw_mask_reg().
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-12 15:47:56 +03:00
Tapani Pälli
95774ca258 nir: fix sampler lowering pass for arrays
This fixes bugs with special cases where we have arrays of
structures containing samplers or arrays of samplers.

I've verified that patch results in calculating same index value as
returned by _mesa_get_sampler_uniform_value for IR. Patch makes
following ES3 conformance test pass:

	ES3-CTS.shaders.struct.uniform.sampler_array_fragment

v2: remove unnecessary comment (Topi)
    simplify changes and the overall code (Jason)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90114
2015-05-12 14:28:16 +03:00
Neil Roberts
426023050d i965: Use predicate enable bit for conditional rendering w/o stalling
Previously whenever a primitive is drawn the driver would call
_mesa_check_conditional_render which blocks waiting for the result of
the query to determine whether to render. On Gen7+ there is a bit in
the 3DPRIMITIVE command which can be used to disable the primitive
based on the value of a state bit. This state bit can be set based on
whether two registers have different values using the MI_PREDICATE
command. We can load these two registers with the pixel count values
stored in the query begin and end to implement conditional rendering
without stalling.

Unfortunately these two source registers were not in the whitelist of
available registers in the kernel driver until v3.19. This patch uses
the command parser version from intel_screen to detect whether to
attempt to set the predicate data registers.

The predicate enable bit is currently only used for drawing 3D
primitives. For blits, clears, bitmaps, copypixels and drawpixels it
still causes a stall. For most of these it would probably just work to
call the new brw_check_conditional_render function instead of
_mesa_check_conditional_render because they already work in terms of
rendering primitives. However it's a bit trickier for blits because it
can use the BLT ring or the blorp codepath. I think these operations
are less useful for conditional rendering than rendering primitives so
it might be best to leave it for a later patch.

v2: Use the command parser version to detect whether we can write to
    the predicate data registers instead of trying to execute a
    register load command.
v3: Simple rebase
v4: Changes suggested by Kenneth Graunke: Split the
    load_64bit_register function out to a separate patch so it can be
    a shared public function. Avoid calling
    _mesa_check_conditional_render if we've already determined that
    there's no query object. Some styling fixes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 11:20:47 +01:00
Neil Roberts
9585879d46 i965: Add a function to load a 64-bit register from a buffer
Adds brw_load_register_mem64 which is similar to brw_load_register_mem
except that it queues two GEN7_MI_LOAD_REGISTER_MEM commands in order
to load both halves of a 64-bit register. The function is implemented
by splitting the 32-bit version into an internal helper function which
takes a size.

This will later be used to set the 64-bit predicate source registers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 11:20:35 +01:00
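The split described in this commit message (a size-taking internal helper shared by the 32-bit and 64-bit entry points) might look roughly like the sketch below. It is a simplified model, not the driver code: `demo_batch`, `demo_load_cmd`, and the `demo_*` functions are hypothetical, commands are recorded in an array instead of being encoded into a batch buffer, and the 4-byte step between the two halves is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

enum { DEMO_MAX_CMDS = 8 };

/* One recorded "load register from memory" command (hypothetical model). */
struct demo_load_cmd {
   uint32_t reg;      /* register offset to write */
   uint32_t offset;   /* byte offset in the source buffer */
};

struct demo_batch {
   struct demo_load_cmd cmds[DEMO_MAX_CMDS];
   size_t count;
};

/* Internal helper: queue `size` consecutive 32-bit loads. */
static void demo_load_register_mem_n(struct demo_batch *b, uint32_t reg,
                                     uint32_t offset, unsigned size)
{
   for (unsigned i = 0; i < size && b->count < DEMO_MAX_CMDS; i++) {
      b->cmds[b->count].reg = reg + i * 4;        /* assumed: halves are 4 bytes apart */
      b->cmds[b->count].offset = offset + i * 4;
      b->count++;
   }
}

/* 32-bit variant: a single load. */
static void demo_load_register_mem(struct demo_batch *b, uint32_t reg,
                                   uint32_t offset)
{
   demo_load_register_mem_n(b, reg, offset, 1);
}

/* 64-bit variant: two loads, one per half. */
static void demo_load_register_mem64(struct demo_batch *b, uint32_t reg,
                                     uint32_t offset)
{
   demo_load_register_mem_n(b, reg, offset, 2);
}
```

Keeping the loop in one helper means the two public functions differ only in the count they pass.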
Neil Roberts
8a59f2f26f i965: Store the command parser version number in intel_screen
In order to detect whether the predicate source registers can be used
in a later patch we will need to know the version number for the
command parser. This patch just adds a member to intel_screen and does
an ioctl to get the version.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 11:20:35 +01:00
Kristian Høgsberg
d77c34d1d2 vk: Add clear load-op for render passes 2015-05-11 23:25:29 -07:00
Kristian Høgsberg
b734e0bcc5 vk: Add support for driver-internal custom pipelines
This lets us disable the viewport, use rect lists and repclear.
2015-05-11 23:25:29 -07:00
Kristian Høgsberg
ad132bbe48 vk: Fix 3DSTATE_VERTEX_BUFFER emission
Set VertexBufferIndex to the attribute binding, not the location.
2015-05-11 23:25:29 -07:00
Kristian Høgsberg
6a895c6681 vk: Add 32 bpc signed and unsigned integer formats 2015-05-11 23:25:29 -07:00
Kristian Høgsberg
55b9b703ea vk: Add anv_batch_emit_merge() helper macro
This lets us emit a state packet by merging two half-baked versions,
typically one from the pipeline object and one from a dynamic state
object.
2015-05-11 23:25:28 -07:00
Kristian Høgsberg
099faa1a2b vk: Store bo pointer in anv_image and anv_buffer
We don't need to point back to the memory object the bo came from.
Pointing directly to a bo lets us bind images and buffers to other
bos - like our allocator bos.
2015-05-11 23:25:28 -07:00
Kristian Høgsberg
4f25f5d86c vk: Support not having a vertex shader
This lets us bypass the vertex shader and pass data straight into
the rasterizer part of the pipeline.
2015-05-11 23:25:28 -07:00
Kristian Høgsberg
20ad071190 vk: Allow NULL as a valid pipeline layout
Vertex buffers and render targets aren't part of the layout so having
an empty layout is pretty common.
2015-05-11 22:12:56 -07:00
Roland Scheidegger
971be2b7c9 docs/GL3: (trivial) mark some tf extensions as done for softpipe/llvmpipe
Those extensions were enabled for ages already.
2015-05-12 04:48:48 +02:00
Emil Velikov
95089bfaeb docs: add news item and link release notes for mesa 10.5.5
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-11 22:07:46 +01:00
Emil Velikov
d4125c41f9 docs: Add sha256 sums for the 10.5.5 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 8ee1a1c08b)
2015-05-11 22:06:13 +01:00
Emil Velikov
22aaa746bd Add release notes for the 10.5.5 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit d88fb40505)
2015-05-11 22:06:11 +01:00
Ilia Mirkin
2b5355c8ab st/mesa: make sure to create a "clean" bool when doing i2b
i2b has to work for all integers, not just 1. INEG would not necessarily
result with all bits set, which is something that other operations can
rely on by e.g. using AND (or INEG for b2i).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: mesa-stable@lists.freedesktop.org
2015-05-11 15:52:17 -04:00
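The problem this commit message describes can be illustrated with plain integer arithmetic. The sketch assumes that "true" is represented as all bits set; the `demo_*` names are hypothetical, and the compare-against-zero form is one way to obtain a clean boolean, not necessarily the instruction sequence the commit emits.

```c
#include <stdint.h>

/* Assumed boolean encoding: true is all bits set, false is all bits clear. */
#define DEMO_TRUE  UINT32_C(0xFFFFFFFF)
#define DEMO_FALSE UINT32_C(0)

/* i2b by negation: only yields a clean boolean for inputs 0 and 1. */
static uint32_t demo_i2b_ineg(uint32_t x)
{
   return (uint32_t)0 - x;
}

/* i2b by comparing against zero: clean for every input. */
static uint32_t demo_i2b_clean(uint32_t x)
{
   return x != 0 ? DEMO_TRUE : DEMO_FALSE;
}

/* b2i implemented with AND, which relies on the boolean being clean. */
static uint32_t demo_b2i_and(uint32_t b)
{
   return b & UINT32_C(1);
}
```

For an input of 2, negation gives 0xFFFFFFFE, so a later AND with 1 produces 0 even though the value was non-zero; the compare-based version gives 0xFFFFFFFF and the AND produces 1.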
Tom Stellard
9c4dc98b29 clover: Fix a bug with multi-threaded events v2
It was possible for some events never to get triggered if one thread
was creating events and another thread was waiting for them.

This patch consolidates soft_event::wait() and hard_event::wait()
into event::wait() so that hard_event objects will now wait for
all their dependencies to be submitted before flushing the command
queue.

v2:
  - Rename variables
  - Use mutable variables so we can keep event::wait() const
  - Open code signalled() call so mutex can be added to signalled
    without deadlocking.

CC: 10.5 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-05-11 18:52:29 +00:00
Tom Stellard
f546902d95 clover: Add a mutex to guard queue::queued_events
This fixes a potential crash on a sequence like this:

Thread 0: Check if queue is not empty.
Thread 1: Remove item from queue, making it empty.
Thread 0: Do something assuming queue is not empty.

CC: 10.5 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-05-11 18:52:18 +00:00
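The check-then-act race in the sequence above goes away when the emptiness check and the removal happen under one lock. The sketch below shows that pattern in C with POSIX threads rather than clover's C++; `demo_queue` and its functions are hypothetical, and the container is a small fixed-size array that pops the most recently pushed item.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

enum { DEMO_QUEUE_CAP = 8 };

struct demo_queue {
   pthread_mutex_t lock;
   int items[DEMO_QUEUE_CAP];
   size_t count;
};

static void demo_queue_init(struct demo_queue *q)
{
   pthread_mutex_init(&q->lock, NULL);
   q->count = 0;
}

/* Append an item; returns false if the container is full. */
static bool demo_queue_push(struct demo_queue *q, int item)
{
   bool ok = false;
   pthread_mutex_lock(&q->lock);
   if (q->count < DEMO_QUEUE_CAP) {
      q->items[q->count++] = item;
      ok = true;
   }
   pthread_mutex_unlock(&q->lock);
   return ok;
}

/* The emptiness check and the removal share one critical section, so no
 * other thread can empty the container between the two steps. */
static bool demo_queue_try_pop(struct demo_queue *q, int *out)
{
   bool ok = false;
   pthread_mutex_lock(&q->lock);
   if (q->count > 0) {
      *out = q->items[--q->count];
      ok = true;
   }
   pthread_mutex_unlock(&q->lock);
   return ok;
}
```

A caller that gets `false` from `demo_queue_try_pop` knows nothing was removed, instead of acting on a stale "not empty" observation.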
Matt Turner
73f4010082 i965/fs: Add missing initializer in fs_visitor(). 2015-05-11 11:25:03 -07:00
Adam Jackson
7a58262e58 egl: Remove skeleton implementation of EGL_MESA_screen_surface
No backend wires this up to anything, and the extension spec has been
marked obsolete for 4+ years.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2015-05-11 14:19:37 -04:00
Axel Davy
13fa84e1bc egl/swrast: Enable config extension for swrast
Enables the use of dri config options for swrast, such as vblank_mode.

Reviewed-by: Dave Airlie <airlied@redhat.com>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-05-11 19:31:44 +02:00
Axel Davy
cdcfe48fb0 egl/wayland: Implement swrast support
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-05-11 19:31:44 +02:00
Axel Davy
cd25e52f6b egl/wayland: Simplify dri2_wl_create_surface
This function is always used with EGL_WINDOW_BIT. Pixmaps are forbidden
for Wayland, and PBuffers are unimplemented.

Reviewed-by: Daniel Stone <daniels@collabora.com>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-05-11 19:31:44 +02:00
Axel Davy
f1cc478d89 egl/x11: move dri2_x11_swrast_create_image_khr to egl_dri2_fallback.h
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-05-11 19:31:44 +02:00
Axel Davy
4cd546df82 egl/wayland: Implement DRI_PRIME support
When the server gpu and requested gpu are different:
. They likely don't support the same tiling modes
. They likely do not have fast access to the same locations

Thus we do:
. render to a tiled buffer we do not share with the server
. Copy the content at every swap to a buffer with no tiling
that we share with the server.

This is similar to the glx dri3 DRI_PRIME implementation.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-05-11 19:31:44 +02:00
Axel Davy
fb0960a14b egl/wayland: Add support for render-nodes
It is possible the server advertises a render-node.
In that case no authentication is needed,
and Gem names are forbidden.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>

Signed-off-by: Axel Davy <axel.davy@ens.fr>

v2: do not check for __DRI_IMAGE_DRIVER, but instead
do not advertise __DRI_DRI2_LOADER when on a render-node.
2015-05-11 19:31:44 +02:00
Axel Davy
c4ff6d00cd glx/dri3: Add additional check for gpu offloading case
Checks blitImage is implemented.
Initially having the __DRIimageExtension extension
at version 9 at least meant blitImage was supported.
However some implementations do advertise version >= 9
without implementing it.

CC: 10.5 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-05-11 19:31:44 +02:00
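The extra check described in this commit message amounts to testing the function pointer as well as the advertised version. A minimal sketch, assuming a hypothetical `demo_image_extension` table that stands in for the real extension struct (whose layout is not reproduced here):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a versioned extension table. */
struct demo_image_extension {
   int version;
   int (*blit_image)(int src, int dst);   /* may be NULL even at version >= 9 */
};

/* Placeholder blit implementation used only to populate the table. */
static int demo_blit_stub(int src, int dst)
{
   (void)src;
   (void)dst;
   return 0;
}

/* Offloading is usable only if the version is high enough AND the entry
 * point is actually implemented. */
static bool demo_can_blit(const struct demo_image_extension *ext)
{
   return ext != NULL && ext->version >= 9 && ext->blit_image != NULL;
}
```

Checking the pointer guards against implementations that bump the version number without filling in the entry point.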
Axel Davy
05ac39ac49 doc/egl: Remove deprecated EGL_SOFTWARE
EGL_SOFTWARE is not supported anywhere in the code,
whereas LIBGL_ALWAYS_SOFTWARE is.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-05-11 19:31:44 +02:00
Axel Davy
6aaf09b93b egl/wayland: properly destroy wayland objects
The allocated wl_registry and wl_queue weren't destroyed.

CC: 10.5 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-05-11 19:31:43 +02:00
Neil Roberts
bfdae9149e i965/fs: Disable opt_sampler_eot for textureGather
The opt_sampler_eot optimisation seems to break when the last
instruction is SHADER_OPCODE_TG4. A bunch of Piglit tests end up doing
this so it causes a lot of regressions. I can't find any documentation
or known workarounds to indicate that this is expected behaviour, but
considering that this is probably a pretty unlikely situation in a
real use case we might as well disable it in order to avoid the
regressions. In total this fixes 451 tests.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-05-11 12:09:20 +01:00
Tapani Pälli
abf3fefa1a mesa: use _mesa_has_compute_shaders instead of extension check
This was really the original purpose, for enabling the path for
ES3.1 tests without the extension being set. Set also fallthrough
comment for Coverity (caught by Matt).

v2: .. and test the right way, not wrong one (Ilia Mirkin)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-11 08:14:51 +03:00
Marta Lofstedt
4a8cd2799c main: glGetIntegeri_v fails for GL_VERTEX_BINDING_STRIDE
The return type for GL_VERTEX_BINDING_STRIDE is missing,
which causes glGetIntegeri_v to fail.

Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-05-11 08:00:30 +03:00
Dave Airlie
9ab90c058f r600: use pipe->hw prim convert from radeonsi
This prevents future additions to PIPE_PRIM_ from causing regressions
on r600g.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-11 06:43:18 +10:00
Rob Clark
1cbdafc47a freedreno/ir3/nir: fix build break after f752effa
Our lower if/else pass was missed when converting NIR to use linked
lists rather than hashsets to track use/def sets.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-05-10 06:03:53 -04:00
Kristian Høgsberg
769785c497 Add vulkan driver for BDW 2015-05-09 11:38:32 -07:00
Ilia Mirkin
da136dc07d nv50/ir: only enable mul saturate on G200+
Commit 44673512a8 enabled support for saturating fmul. However
experimentally this does not seem to work on the older chips. Restrict
the feature to G200 (NVA0) and later.

Reported-by: Pierre Moreau <pierre.morrow@free.fr>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90350
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Cc: mesa-stable@lists.freedesktop.org
2015-05-09 13:41:51 -04:00
Ilia Mirkin
7892210400 nvc0: reset the instanced elements state when doing blit using 3d engine
Since we update num_vtxelts here, we could otherwise end up with stale
instancing information in the upper bits which wouldn't otherwise get
reset. (Also we run the risk of the previous draw having set the first
element as instanced.)

This appears as one of the causes for the test pointed out in fdo#90363
to fail on nvc0.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90363
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2015-05-09 13:36:23 -04:00
Ilia Mirkin
e9b1ea29bf nvc0: keep track of PGRAPH state in nvc0_screen
See identical commit for nv50. Destroying the current context and then
creating a new one or switching to another existing context would cause
the "current" state to not be properly initialized, so we save it off in
the screen.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2015-05-09 13:36:23 -04:00
Ilia Mirkin
f617029db3 nv50: keep track of PGRAPH state in nv50_screen
Normally this is kept in nv50_context, and on switching the active
context, the state is copied from the previous context. However when the
last context is destroyed, this is lost, and a new context might later
be created. When the currently-active context is destroyed, save its
state in the screen, and restore it when setting the current context.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90363
Reported-by: Matteo Bruni <matteo.mystral@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Matteo Bruni <matteo.mystral@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2015-05-09 13:36:23 -04:00
Kenneth Graunke
d6fb155f30 nir: Fix aggressive typos in nir_from_ssa.c.
s/agressive/aggressive/g

Trivial.
2015-05-08 19:38:14 -07:00
Jason Ekstrand
fb5f411248 nir/search: Save/restore the variables_seen bitmask when matching
Shader-db results on Broadwell:

   total instructions in shared programs: 7152330 -> 7137006 (-0.21%)
   instructions in affected programs:     1330548 -> 1315224 (-1.15%)
   helped:                                5797
   HURT:                                  76
   GAINED:                                0
   LOST:                                  8

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:29:15 -07:00
Jason Ekstrand
e0cfe59c37 nir/search: Assert that variable id's are in range
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:29:15 -07:00
Jason Ekstrand
13facfbd5b nir/search: handle explicitly sized sources in match_value
Previously, this case was being handled in match_expression prior to
calling match_value.  However, there is really no good reason for this
given that match_value has all of the information it needs.  Also, they
weren't being handled properly in the commutative case and putting it in
match_value gives us that for free.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:29:14 -07:00
Jason Ekstrand
f752effa08 nir/nir: Use a linked list instead of a hash set for use/def sets
This commit switches us from the current setup of using hash sets for
use/def sets to using linked lists.  Doing so should save us quite a bit of
memory because we aren't carrying around 3 hash sets per register and 2 per
SSA value.  It should also save us CPU time because adding/removing things
from use/def sets is 4 pointer manipulations instead of a hash lookup.

Running shader-db 50 times with USE_NIR=0, NIR, and NIR + use/def lists:

   GLSL IR Only:        586.4 +/- 1.653833
   NIR with hash sets:  675.4 +/- 2.502108
   NIR + use/def lists: 641.2 +/- 1.557043

I also ran a memory usage experiment with Ken's patch to delete GLSL IR and
keep NIR.  This patch cuts an additional 42.9 MiB of ralloc'd memory over
and above what we gained by deleting the GLSL IR on the same dota trace.

On the code complexity side of things, some things are now much easier and
others are a bit harder.  One of the operations we perform constantly in
optimization passes is to replace one source with another.  Due to the fact
that an instruction can use the same SSA value multiple times, we had to
iterate through the sources of the instruction and determine if the use we
were replacing was the only one before removing it from the set of uses.
With this patch, uses are per-source not per-instruction so we can just
remove it safely.  On the other hand, trying to iterate over all of the
instructions that use a given value is more difficult.  Fortunately, the
two places we do that are the ffma peephole where it doesn't matter and GCM
where we already gracefully handle duplicate visits to an instruction.

Another aspect here is that using linked lists in this way can be tricky to
get right.  With sets, things were quite forgiving and the worst that
happened if you didn't properly remove a use was that it would get caught
in the validator.  With linked lists, it can lead to linked list corruption
which can be harder to track.  However, we do just as much validation of
the linked lists as we did of the sets so the validator should still catch
these problems.  While working on this series, the vast majority of the
bugs I had to fix were caught by assertions.  I don't think the lists are
going to be that much worse than the sets.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
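The per-source use tracking described in this commit message can be sketched with a kernel-style intrusive list. All names below (`demo_list`, `demo_src`, `demo_ssa_def`, and the helper functions) are hypothetical and much simpler than the real NIR structures; the point is only that each source owns its own list link, so rewriting one source never has to inspect the instruction's other sources.

```c
#include <stddef.h>

/* Intrusive doubly linked list node; an empty list points at itself. */
struct demo_list {
   struct demo_list *prev, *next;
};

static void demo_list_init(struct demo_list *head)
{
   head->prev = head;
   head->next = head;
}

/* Insert right after head: four pointer writes, no hashing. */
static void demo_list_add(struct demo_list *item, struct demo_list *head)
{
   item->prev = head;
   item->next = head->next;
   head->next->prev = item;
   head->next = item;
}

/* Unlink an item from whatever list it is on. */
static void demo_list_del(struct demo_list *item)
{
   item->prev->next = item->next;
   item->next->prev = item->prev;
   item->prev = item->next = NULL;
}

static unsigned demo_list_length(const struct demo_list *head)
{
   unsigned n = 0;
   for (const struct demo_list *it = head->next; it != head; it = it->next)
      n++;
   return n;
}

struct demo_ssa_def {
   struct demo_list uses;       /* one entry per source that reads this value */
};

struct demo_src {
   struct demo_list use_link;   /* this source's slot in its def's use list */
   struct demo_ssa_def *def;
};

/* Point a fresh source at a definition. */
static void demo_src_use(struct demo_src *src, struct demo_ssa_def *def)
{
   src->def = def;
   demo_list_add(&src->use_link, &def->uses);
}

/* Rewrite one source: unlink it from the old def and link it to the new one. */
static void demo_src_rewrite(struct demo_src *src, struct demo_ssa_def *new_def)
{
   demo_list_del(&src->use_link);
   demo_src_use(src, new_def);
}
```

If one instruction reads the same value through two sources, that value's use list holds two entries, which is why walking the list can visit the same instruction more than once.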
Jason Ekstrand
2c2cd368aa util/list: Add a list validation function
Acked-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
addcf41066 util/list: Add list_empty and list_length functions
v2: Don't use C99 when iterating over the list

Acked-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
b31d8983ba util/list: Add C99-based iterator macros
v2: Use LIST_ENTRY instead of container_of in iterators

Acked-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
7a30668ad6 util: Move gallium's linked list to util
The linked list in gallium is pretty much the kernel list and we would like
to have a C-based linked list for all of mesa.  Let's not duplicate and
just steal the gallium one.

Acked-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
258b4194c8 gallium/double_list: s/INLINE/inline and remove the p_compiler include
Acked-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
ecc2cfc8b6 nir: Use nir_instr_rewrite_src in copy propagation
We were rolling our own rewrite_src variant in copy-propagation.  Let's
stop doing that and use the ones in core NIR.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
f72a8d1cf0 nir: Add a function for rewriting the condition of an if statement
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
300d729436 nir: Add and use initializer #defines for nir_src and nir_dest
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
6702ebce57 nir: Modernize the out-of-SSA pass
The out-of-SSA pass was one of the first passes written when getting SSA
up-and-going (for obvious reasons).  As such, it came before a lot of the
nifty SSA-based helpers were introduced.  This commit modernizes it so that
we're no longer doing nearly as much manual banging on use/def sets.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
7ee0216e2d nir/validate: Validate SSA def parent instructions
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
Ilia Mirkin
c4ac09e30e nv50/ir: only propagate saturate up if some actual folding took place
The former logic would copy the saturate up to any mul with an immediate
if there was a subsequent mul with a saturate. However we only want to
do that if we collapsed 2 muls by multiplying their immediates (or were
able to put the immediate in as a post-multiplier).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2015-05-08 18:56:56 -04:00
Ian Romanick
3bdbc1e436 nir: Delete all traces of nir_op_flog
Nothing produces it, and nothing can consume it.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-08 12:12:54 -07:00
Ian Romanick
ad51f9b421 nir: Don't produce nir_op_flog from GLSL IR
All paths that produce GLSL IR for NIR lower ir_unop_log.  All paths
that consume NIR will explode if they get a nir_op_flog.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-08 12:12:54 -07:00
Ian Romanick
e0a17f6e31 nir: Delete all traces of nir_op_fexp
Nothing produces it, and nothing can consume it.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-08 12:12:54 -07:00
Ian Romanick
a45d55f17c nir: Don't produce nir_op_fexp from GLSL IR
All paths that produce GLSL IR for NIR lower ir_unop_exp.  All paths
that consume NIR will explode if they get a nir_op_fexp.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-08 12:12:54 -07:00
Ian Romanick
5e0dca62a7 prog_to_nir: OPCODE_EXP is not nir_op_fexp
It's a weird thing that provides some values related to 2**x.  It's also
already handled by a case in the switch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-08 12:12:54 -07:00
Neil Roberts
f98c3f3e44 i965/fs: Improve a comment about stripping trailing zeroes
Originally I wrote that removing the first parameter doesn't work but
I didn't know why. I now found a mention of this in the PRM so it's
probably worth adding it to the comment.
2015-05-08 16:16:56 +01:00
Fredrik Höglund
b004510072 docs: Update the ARB_direct_state_access status
Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:04 +02:00
Fredrik Höglund
97b268f1de mesa: Implement GetVertexArrayIndexed[64]iv
v2: Fix the name of the entry point in the error messages.

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:04 +02:00
Fredrik Höglund
2ad0268871 mesa: Add support for querying GL_VERTEX_ATTRIB_ARRAY_LONG
This parameter was added in OpenGL 4.3 and GL_ARB_direct_state_access.

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:04 +02:00
Fredrik Höglund
4f5160300d mesa: Add a vao parameter to get_vertex_array_attrib
This is needed to implement glGetVertexArrayIndexediv and
glGetVertexArrayIndexed64iv.

v2: Make the vao parameter const.

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:04 +02:00
Fredrik Höglund
1085c01121 mesa: Implement GetVertexArrayiv
Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:03 +02:00
Fredrik Höglund
0a895c379e mesa: Implement VertexArrayBindingDivisor
Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:03 +02:00
Fredrik Höglund
f2ef09d44a mesa: Add a vao parameter to vertex_binding_divisor
This is needed to implement VertexArrayBindingDivisor.

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:03 +02:00
Fredrik Höglund
dc2eaaf912 mesa: Implement VertexArrayAttribBinding
Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:03 +02:00
Fredrik Höglund
ade0179f77 mesa: Add a vao parameter to vertex_attrib_binding
This is needed to implement VertexArrayAttribBinding.

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:03 +02:00
Fredrik Höglund
f0030b0f1f mesa: Implement VertexArrayAttrib[I|L]Format
Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:03 +02:00
Fredrik Höglund
fa350eadfb mesa: Add a vao parameter to update_array_format
This is needed to implement VertexArrayAttrib*Format.

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:03 +02:00
Fredrik Höglund
bc6668e35d mesa: Refactor VertexAttrib[I|L]Format
The only difference between these functions is the legal types and
sizes, so consolidate the code into a single vertex_attrib_format()
function and call it from all three entry points.

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:03 +02:00
Fredrik Höglund
308926853d mesa: Implement VertexArrayVertexBuffers
Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:03 +02:00
Fredrik Höglund
cc9b68e9c9 mesa: Implement VertexArrayVertexBuffer
Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:03 +02:00
Fredrik Höglund
c59b5317fc mesa: Add a vao parameter to bind_vertex_buffer
This is needed to implement VertexArrayVertexBuffer and
VertexArrayVertexBuffers.

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:03 +02:00
Fredrik Höglund
7ccc4f3f23 mesa: Implement VertexArrayElementBuffer
v2: Add a doxygen comment.

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:03 +02:00
Fredrik Höglund
c99efbd3c2 mesa: Implement EnableVertexArrayAttrib
Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:02 +02:00
Fredrik Höglund
96b6463463 mesa: Implement DisableVertexArrayAttrib
Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:02 +02:00
Fredrik Höglund
6c37acfbed mesa: Keep track of the last looked-up VAO
This saves the cost of repeated hash table lookups when the same
vertex array object is referenced in a sequence of calls such as:

    glVertexArrayAttribFormat(vao, ...);
    glVertexArrayAttribBinding(vao, ...);
    glEnableVertexArrayAttrib(vao, ...);
    ...

Note that VAOs are container objects that are not shared between
contexts.

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:02 +02:00
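The caching idea in 6c37acfbed can be sketched roughly as below. This is a Python illustration only, not the Mesa code: the class `VaoTable`, its methods, and the `table_lookups` counter are hypothetical names, and clearing the cached entry on removal is an assumption about what any such cache would need.

```python
class VaoTable:
    """Hypothetical stand-in for a per-context VAO name table."""

    def __init__(self):
        self._objects = {}       # VAO name -> object
        self._last_id = None     # name of the most recently looked-up VAO
        self._last_obj = None    # the object that name resolved to
        self.table_lookups = 0   # counts real table lookups, for illustration

    def insert(self, vao_id, obj):
        self._objects[vao_id] = obj

    def remove(self, vao_id):
        self._objects.pop(vao_id, None)
        # Assumption: a stale cache entry must be dropped with the object.
        if vao_id == self._last_id:
            self._last_id = None
            self._last_obj = None

    def lookup(self, vao_id):
        # Fast path: same name as last time, skip the table entirely.
        if self._last_obj is not None and vao_id == self._last_id:
            return self._last_obj
        self.table_lookups += 1
        obj = self._objects.get(vao_id)
        if obj is not None:
            self._last_id, self._last_obj = vao_id, obj
        return obj
```

A run of calls against the same name then pays for one table lookup. Because the table is per-context in this sketch, no other context can invalidate the cached entry, which matches the note that VAOs are not shared between contexts.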
Fredrik Höglund
2830c2fbeb mesa: Add _mesa_lookup_vao_err
This is a convenience function that generates GL_INVALID_OPERATION
when the array object doesn't exist.

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:02 +02:00
Fredrik Höglund
a1f48268b4 mesa: Implement CreateVertexArrays
v2: Update the documentation for gen_vertex_arrays().

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-05-08 15:31:02 +02:00
Neil Roberts
e51bad669a i965/skl: In opt_sampler_eot always set destination register to null
opt_sampler_eot enables a direct write to framebuffer from a sample.
In order to do this the sample message needs to have a message header
so if there wasn't one already then the function adds one. In addition
the function sets the destination register to null because it's no
longer used. However it was only doing this in cases where it was
adding a message header. This patch just moves setting the destination
so that it happens even if there's a message header. In practice this
doesn't seem to make any difference but it's a bit cleaner.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-05-08 12:40:22 +01:00
Neil Roberts
1c5de556c5 i965/fs: Set the header_size on LOAD_PAYLOAD in opt_sampler_eot
Commit 94ee908448 added a header size parameter to the function to
create the LOAD_PAYLOAD instruction. However this broke
opt_sampler_eot which manually constructs the instruction and so
wasn't setting the header_size. This ends up making the parameters for
the send message all have the wrong location and it all falls apart.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-05-08 12:40:14 +01:00
Martin Peres
e4b2973607 docs: document the LIBGL_DRI3_DISABLE environment variable
Suggested-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-05-08 13:36:52 +03:00
Dave Airlie
ff64411c84 docs: update ARB_vertex_attrib_64bit status
Add to GL3.txt and release notes.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-08 10:22:12 +10:00
Dave Airlie
ef83c9b762 st/mesa: add double input support including lowering (v3.1)
This takes a different approach from before: we cannot index into the
inputMapping with anything but the mesa attribute index, so we can't use
the just-add-one-to-the-index trick; we need more info to add one to it
after we've mapped the input.

(Fixed copy propagation and cleaned up a little)

v2: drop float64 format check, just attr->Doubles.
merge enable patch.
v3: cleanup code a bit.
v3.1: minor review fixups (comment, newline) (Ilia)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-08 10:21:02 +10:00
Dave Airlie
c4254ee526 mesa/vbo: add support for 64-bit vertex attributes. (v1)
This adds support in the vbo and array code to handle
double vertex attributes.

v0.2: merge code to handle doubles in vbo layer.
v1: don't use v0, merge api_array elt code.

Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-08 10:21:01 +10:00
Dave Airlie
ad208d975a glsl: check total count of multi-slot double vertex attribs
The spec is vague all over the place about this, but this seems
to be the intent, we can probably make this optional later if
someone makes hw that cares and writes a driver.

Basically we need to double count some of the d types but
only for totalling not for slot number assignment.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-08 10:21:01 +10:00
Dave Airlie
023fc344da glsl: track which program inputs are doubles
instead of doing the attempts at dual slot handling here,
let the backend do it.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-08 10:21:01 +10:00
Dave Airlie
5d6190e496 glsl: add ARB_vertex_attrib_64bit support. (v2)
Just more boilerplate stuff.

v2:
bad fallthrough on versioning,
this is my ugly but self contained solution (Ian)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-08 10:21:01 +10:00
Dave Airlie
fc71ae7c57 mesa: add ARB_vertex_attrib_64bit to extensions. (v2)
Just add the boilerplate bits.

v2: add to version.c

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-08 10:21:01 +10:00
Dave Airlie
5a7f04925f mapi: add GL_ARB_vertex_attrib_64bit support
This just adds the glapi bits.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-08 10:21:01 +10:00
Dave Airlie
731b7c49bb st/glsl_to_tgsi: fix ir_assignment hack doing bad things for doubles
This hack for fixing gl_FragDepth apparently caused a GLSL shader
outputting a single double to try and output a dvec4, but we hadn't
assigned outputs for the secondary bit.

This avoids going into the hack code for scalar doubles.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-08 10:21:01 +10:00
Topi Pohjolainen
b1119ce838 i965/wm/gen6: Add option for disabling statistics collection
Normally this is always needed but for internal blits and clears
we need to be able to disable it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-05-07 22:30:18 +03:00
Topi Pohjolainen
dae7183cdd i965/wm/gen6: Refactor state setup
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-05-07 22:30:17 +03:00
Anuj Phogat
d14f3e14b4 i965: Remove unused variables
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-05-07 11:43:01 -07:00
Anuj Phogat
15259d63e8 i965: Change the order of conditions tested in if
Reduces the number of conditions tested in if to one in case of
non-integer formats. Makes no functional changes.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-07 11:43:01 -07:00
Matt Turner
8e029105c2 nir: Allow feq/fne/ieq/ine to be optimized with inot.
instructions in affected programs:     380 -> 376 (-1.05%)
helped:                                2

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-05-07 10:51:05 -07:00
Matt Turner
f5cf74d8ba nir: Recognize (a < c || b < c) as min(a, b) < c.
... and (a >= c) || (b >= c) as max(a, b) >= c.

Similar to commit 97e6c1b9.

total instructions in shared programs: 6182276 -> 6182180 (-0.00%)
instructions in affected programs:     6400 -> 6304 (-1.50%)
helped:                                68
HURT:                                  4

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-05-07 10:51:05 -07:00
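The two identities behind f5cf74d8ba can be checked by brute force over a small range. The sketch below is plain Python, not NIR's pattern syntax; the function name is made up for illustration, and only integers are covered (floating-point NaN behaviour is outside its scope).

```python
from itertools import product

def check_minmax_identities(values):
    """Return True if both identities hold for every (a, b, c) drawn from values."""
    for a, b, c in product(values, repeat=3):
        # (a < c || b < c)  <=>  min(a, b) < c
        if ((a < c) or (b < c)) != (min(a, b) < c):
            return False
        # (a >= c || b >= c)  <=>  max(a, b) >= c
        if ((a >= c) or (b >= c)) != (max(a, b) >= c):
            return False
    return True
```

Either rewrite trades two comparisons and a logical OR for one min/max and one comparison, which is where the instruction savings reported above would come from.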
Matt Turner
ceb8b739ce nir: Recognize trivial min/max.
No changes, but does prevent some regressions in the next commit.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-05-07 10:51:05 -07:00
Matt Turner
8ae559971a nir: Recognize i2b(b2i(x)) as x.
Helps the same set of programs as the previous commit.

instructions in affected programs:     4490 -> 4346 (-3.21%)
helped:                                8

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-05-07 10:51:05 -07:00
Matt Turner
74697e2844 nir: Recognize imul(b2i(a), b2i(b)) as a logical AND.
Four shaders in Unreal 4's Sun Temple are helped, and gain SIMD16
because we avoid an integer multiplication.

instructions in affected programs:     2353 -> 2245 (-4.59%)
helped:                                4
GAINED:                                4

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-05-07 10:51:05 -07:00
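Why the rewrite in 74697e2844 is valid: if b2i yields 0 or 1, the product of two such values is 1 exactly when both inputs are true. A small Python model of that claim follows; `b2i`, `mul_form`, and `and_form` are illustrative names, and the 0/1 encoding of b2i is an assumption of the sketch.

```python
def b2i(x):
    """Model of bool-to-int conversion: False -> 0, True -> 1 (assumed encoding)."""
    return 1 if x else 0

def mul_form(a, b):
    # The original shape: an integer multiply of two converted booleans.
    return b2i(a) * b2i(b)

def and_form(a, b):
    # The replacement shape: one logical AND, converted once.
    return b2i(a and b)
```

Both forms agree on all four input combinations, so the multiply can be dropped.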
Chad Versace
c636284ee8 i965/sync: Implement DRI2_Fence extension
This enables EGL_KHR_fence_sync and EGL_KHR_wait_sync.

Below is the difference in piglit results, before and after this patch.
No regressions and several tests improve from 'skip' to 'pass'. Out of
EGL_KHR_fence_sync tests, two of the multithreaded tests skip; all other
tests pass.

  cmdline: piglit run -p gbm -t sync tests/quick.py
  mesa: master@1ac7db0
  piglit: 4069bec
  hw: Ivybridge

        | before after
  ------+-------------
   pass |     32    46
   fail |      0     0
  crash |      0     0
   skip |     35    21
  total |     67    67

v2:
  - Set fence->signalled = true in brw_fence_has_completed() too.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-07 08:11:22 -07:00
Chad Versace
2516d835b1 i965/sync: Replace prefix 'intel_sync' -> 'intel_gl_sync'
I'm about to implement DRI2_Fence in intel_syncobj.c.  To prevent
madness, we need to prefix functions for GL_ARB_sync with 'gl' and
functions for DRI2_Fence with 'dri'. Otherwise, the file will become
a jumble of similarly named functions.

For example:
    old-name:      intel_client_wait_sync()
    new-name:      intel_gl_client_wait_sync()
    soon-to-come:  intel_dri_client_wait_sync()

I wrote this renaming commit separately from the commit that implements
DRI2_Fence because I wanted the latter diff to be reviewable.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-07 08:11:21 -07:00
Chad Versace
19b5a82fda i915/sync: Return early when calloc fails
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-07 08:11:21 -07:00
Chad Versace
00f3c7baeb i965/sync: Return NULL when calloc fails
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-07 08:11:21 -07:00
Chad Versace
9cf9a2dec5 i915/sync: Don't crash when deleting sync object
Don't pass NULL to drm_intel_bo_unreference(). It doesn't like that.

Bug found by code inspection.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-07 08:11:21 -07:00
Chad Versace
a93ab73a07 i965/sync: Don't crash when deleting sync object
Don't pass NULL to drm_intel_bo_unreference(). It doesn't like that.

Bug found by code inspection.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-07 08:11:21 -07:00
Chad Versace
a6bfdd7b46 egl/dri2: Fix codestyle in a comment
Pointed out by Kenneth Graunke. Trivial fix.
2015-05-07 08:09:07 -07:00
Martin Peres
cedd5008da glx: report which DRI version is used when in verbose debug mode
This should make it more obvious in bug reports while also removing
any sort of guesswork for developers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-05-07 16:56:14 +03:00
Vinson Lee
cf5e015f71 glapi: Add positional argument specifier.
Fix build error introduced with commit 1c5a57a "glapi/es3.1: Add support
for GLES versions > 3.0" with Python < 2.7.

  File "src/mapi/glapi/gen/gl_genexec.py", line 230, in <module>
    printer.Print(api)
  File "src/mapi/glapi/gen/gl_XML.py", line 120, in Print
    self.printBody(api)
  File "src/mapi/glapi/gen/gl_genexec.py", line 187, in printBody
    condition_parts.append('(ctx->API == API_OPENGLES2 && ctx->Version >= {})'.format(int(f.api_map['es2'] * 10)))
ValueError: zero length field name in format

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-05-06 23:26:21 -07:00
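The fix in cf5e015f71 comes down to the difference between an auto-numbered field (`{}`) and an explicitly indexed one (`{0}`) in str.format. A minimal sketch, where the helper name `version_check` is hypothetical and the string mirrors the one in the traceback above:

```python
# Auto-numbered field: accepted by Python 2.7 and later, but Python 2.6
# raises "ValueError: zero length field name in format".
#   '(... ctx->Version >= {})'.format(...)

def version_check(version):
    """Build the condition string using an explicit positional index."""
    # '{0}' names the first argument explicitly, so it also works on
    # interpreters that predate auto-numbering.
    return '(ctx->API == API_OPENGLES2 && ctx->Version >= {0})'.format(
        int(version * 10))
```

The explicit index costs nothing on newer interpreters, so it is the safer spelling for build scripts that may run on old ones.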
Ilia Mirkin
55b66dc4de nv50/ir: add SHL to the list of U32 opcodes
Having the wrong inferred type prevents a number of optimizations,
including constant propagation (since float immediates work differently
than integer immediates).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-06 20:50:03 -04:00
Ian Romanick
51e3453785 i965: Sort extension enable lists
Sort by GEN, then sort by extension name.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-06 13:05:18 -07:00
Vinson Lee
382b1a36e3 r600g: Fix Clang return-type build error.
Fix Clang return-type error introduced with commit
96f164f6f0 "gallium: make
pipe_context::begin_query return a boolean".

  CC       r600_query.lo
r600_query.c:443:3: error: non-void function 'r600_begin_query' should return a value [-Wreturn-type]
                return;
                ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-05-06 12:21:34 -07:00
Kenneth Graunke
0c0ca55711 i965/fs: Allow copy propagation on ATTR file registers.
This especially helps with NIR because we currently emit MOVs at the top
of the shader to copy from various ATTR registers to a giant VGRF array
of all inputs.  (This could potentially be done better, but since
there's only ever one write to each register, it should be trivial to
copy propagate away...)

With NIR - only vertex shaders:
total instructions in shared programs: 3129373 -> 2889581 (-7.66%)
instructions in affected programs:     3119717 -> 2879925 (-7.69%)
helped:                                20833

Without NIR - only vertex shaders:
total instructions in shared programs: 2745901 -> 2724483 (-0.78%)
instructions in affected programs:     693426 -> 672008 (-3.09%)
helped:                                3516

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-06 10:29:30 -07:00
Jason Ekstrand
7a75b55a01 i965/fs_inst: Get rid of the effective_width field
The effective_width field was an ill-conceived hack to get around issues in
the LOAD_PAYLOAD instruction.  Now that the LOAD_PAYLOAD instruction is far
more sane, this field can die.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-06 10:29:30 -07:00
Jason Ekstrand
41868bb682 i965/fs: Rework the fs_visitor LOAD_PAYLOAD instruction
The newly reworked instruction is far more straightforward than the
original.  Before, the LOAD_PAYLOAD instruction was lowered by a
complicated and broken-by-design pile of heuristics to try and guess
force_writemask_all, exec_size, and a number of other factors on the
sources.

Instead, we use the header_size on the instruction to denote which sources
are "header sources".  Header sources are required to be a single physical
hardware register that is copied verbatim.  The registers that follow are
considered the actual payload registers and have a width that corresponds
to the LOAD_PAYLOAD's exec_size and are treated as being per-channel.  This
gives us a fairly straightforward lowering:

 1) All header sources are copied directly using force_writemask_all and,
    since they are guaranteed to be a single register, there are no
    force_sechalf issues.

 2) All non-header sources are copied using the exact same force_sechalf
    and force_writemask_all modifiers as the LOAD_PAYLOAD operation itself.

 3) In order to accommodate older gens that need interleaved colors,
    lower_load_payload detects when the destination is a COMPR4 register
    and automatically interleaves the non-header sources.  The
    lower_load_payload pass does the right thing here regardless of whether
    or not the hardware actually supports COMPR4.

This commit itself is made up of a bunch of smaller changes squashed
together.  Individual change descriptions follow:

i965/fs: Rework fs_visitor::LOAD_PAYLOAD

   We rework LOAD_PAYLOAD to verify that all of the sources that count as
   headers are, indeed, exactly one register and that all of the non-header
   sources match the destination width.  We then take the exec_size for
   LOAD_PAYLOAD directly from the destination width.

i965/fs: Make destinations of load_payload have the appropriate width

i965/fs: Rework fs_visitor::lower_load_payload

   v2: Don't allow the saturate flag on LOAD_PAYLOAD instructions

i965/fs_cse: Support the new-style LOAD_PAYLOAD

i965/fs_inst::is_copy_payload: Support the new-style LOAD_PAYLOAD

i965/fs: Simplify setup_color_payload

   Previously, setup_color_payload was a big helper function that did a
   lot of gen-specific special casing for setting up the color sources of
   the LOAD_PAYLOAD instruction.  Now that lower_load_payload is much more
   sane, most of that complexity isn't needed anymore.  Instead, we can do
   a simple fixup pass for color clamps and then just stash sources
   directly in the LOAD_PAYLOAD.  We can trust lower_load_payload to do the
   right thing with respect to COMPR4.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-06 10:29:30 -07:00
Jason Ekstrand
94ee908448 i965/fs: Make LOAD_PAYLOAD take a header size
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-06 10:29:30 -07:00
Jason Ekstrand
74dccdad4b i965/fs: Make emit_single_fb_write take an explicit exec_size
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-06 10:29:30 -07:00
Jason Ekstrand
32af7d4188 i965/fs_inst: Add an is_copy_payload helper
This commit adds a new is_copy_payload helper to fs_inst that takes the
place of the similarly named functions in cse and register coalesce.  The
two is_copy_payload functions in CSE and register coalesce were subtly
different and potentially subtly broken.  The new version unifies the two
and should be more correct.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-06 10:29:30 -07:00
Jason Ekstrand
76c1086f2d i965: Change header_present to header_size in backend_instruction
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-06 10:29:30 -07:00
Jason Ekstrand
a9ccb14d14 i965/fs_cse: Factor out code to create copy instructions
v2: Get rid of the block parameter and make src a const reference

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-06 10:29:29 -07:00
Jason Ekstrand
cf4607e853 i965/fs: Make half(fs_reg, unsigned) handle register files more explicitly
Previously, we had a special case for uniforms and immediates and then a
bunch of asserts for various other pessimal things.  This commit changes it
so that it explicitly does something on each register file.  Some of them
are disallowed and others are treated properly.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-06 10:29:29 -07:00
Francisco Jerez
88414de45e i965/fs: Fix passing an immediate to half().
Immediates are generally uniform, they yield the same value to both
halves of any instruction.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-06 10:29:29 -07:00
Jeremy Huddleston Sequoia
5b2d3480f5 swrast: Build fix for darwin
Fixes regression from commit 64b1dc4449

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90147
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
CC: jon.turney@dronecode.org.uk
CC: ionic@macports.org
2015-05-06 10:04:05 -06:00
Chad Versace
b0f410a2a0 egl/dri2: Check return value of __DRI2fence::create_fence()
If it returns NULL, then return early with an error.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-05-06 07:55:41 -07:00
Roland Scheidegger
b8a1495106 draw: (trivial) fix out-of-bounds vector initialization
Was off-by-one. llvm says inserting an element with an index higher than the
number of elements yields undefined results. Previously such inserts were
ignored but as of llvm revision 235854 the vector gets replaced with undef,
causing failures.
This fixes piglit gl-3.2-layered-rendering-gl-layer, as mentioned in
https://llvm.org/bugs/show_bug.cgi?id=23424.

Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: mesa-stable@lists.freedesktop.org
2015-05-06 16:51:09 +02:00
Martin Peres
9891fc329b main/queryobj: add GL_QUERY_TARGET support to GetQueryObjectiv()
This was missing from my patchset to support the query-related entry
points of Direct State Access.

Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-05-06 15:26:12 +03:00
Chia-I Wu
ef5d4bcc3a ilo: silence a compiler warning
Silence

  ilo_query.c:120:7: warning: 'return' with no value, in function returning non-void

since commit 96f164f6.
2015-05-06 16:35:30 +08:00
Tapani Pälli
818cc90535 mesa: support compute stage in _mesa_program_resource_prop
Increases pass rate of ES31-CTS.*program_interface_query* tests
when run with MESA_EXTENSION_OVERRIDE='GL_ARB_compute_shader'. Many
of the negative tests that happen to use compute stage in queries
start passing.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-05-06 11:12:01 +03:00
Tapani Pälli
3706e5dbc9 glsl: mark special built-in inputs referenced by vertex stage
Refactoring done on active attribute queries did not take into
account special built-in inputs for the vertex stage. This commit
sets them referenced by vertex stage so that they get enumerated
properly.

Fixes Piglit test 'get-active-attrib-returns-all-inputs' failure.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90243
Acked-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-By: Martin Peres <martin.peres@linux.intel.com>
2015-05-06 11:10:51 +03:00
Chris Forbes
1fcdb2ce79 relnotes: Note support for viewport arrays on i965/Gen6.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2015-05-06 19:05:17 +12:00
Chris Forbes
5fc23375e8 i965/gen6: Enable ARB_viewport_array and AMD_vertex_shader_viewport_index
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-06 19:01:58 +12:00
Chris Forbes
c41f625200 i965/gen6: Upload all the SF viewports
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-06 19:01:57 +12:00
Chris Forbes
2a8835d485 i965/gen6: Upload all the clip viewports
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-06 19:01:55 +12:00
Chris Forbes
0374159b0c i965/gen6: setup limits for ARB_viewport_array
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-06 19:01:38 +12:00
Brian Paul
212f26bb60 st/mesa: fix pipe_query_result result initializer
Fixes MSVC build error.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-05 16:00:54 -06:00
Brian Paul
062e2b06b2 st/mesa: fix st_NewPerfMonitor() declaration
Was missing the context parameter.  Fixes MSVC warning.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-05 16:00:53 -06:00
Brian Paul
0beaf1cd9a glsl: add parens in shader_integer_mix() to silence compiler warning
Silences gcc warning:
builtin_functions.cpp:204:23: warning: suggest parentheses around '&&'
within '||' [-Wparentheses]

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-05 16:00:53 -06:00
Brian Paul
f7bdb2f372 st/mesa: also try PIPE_FORMAT_R10G10B10A2_UNORM for GL_RGB10
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-05 15:27:52 -06:00
Samuel Pitoiset
cea910bc28 nvc0: all queries use an unsigned 64-bits integer by default
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Martin Peres <martin.peres@free.fr>
2015-05-06 00:03:36 +03:00
Samuel Pitoiset
35a9286be6 nvc0: make begin_query return false when all MP counters are used
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Martin Peres <martin.peres@free.fr>
2015-05-06 00:03:36 +03:00
Samuel Pitoiset
3a365df665 docs: mark GL_AMD_performance_monitor on nvc0 for the 10.6.0 release
Other drivers which want to enable this extension must expose groups of
GPU hardware performance counters.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
2015-05-06 00:03:36 +03:00
Samuel Pitoiset
ed7d3886cc nvc0: define driver-specific query groups
This patch defines "Driver statistics" and "MP counters" groups, but
only the latter will be exposed through GL_AMD_performance_monitor.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
2015-05-06 00:03:36 +03:00
Christoph Bumiller
4cd1cfb983 st/mesa: implement GL_AMD_performance_monitor
This is based on the original patch of Christoph Bumiller.

v2 (Samuel Pitoiset):
 - improve Gallium interface for this extension
 - rewrite some parts of the original code
 - fix compilation errors and piglit tests

v3:
 - only enable this extension when the underlying driver exposes GPU counters
 - get rid of the ring buffer of queries

v4:
 - add a debug message when the maximum number of counters has been
   reached

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
2015-05-06 00:03:36 +03:00
Samuel Pitoiset
96f164f6f0 gallium: make pipe_context::begin_query return a boolean
GL_AMD_performance_monitor must return an error when a monitoring
session cannot be started.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
2015-05-06 00:03:36 +03:00
Samuel Pitoiset
546ec980f8 gallium: replace pipe_driver_query_info::max_value by a union
This allows queries to return different numeric types.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
2015-05-06 00:03:35 +03:00
Samuel Pitoiset
d5b2832c11 gallium: add new numeric types to pipe_query_result
This will be used by GL_AMD_performance_monitor.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
2015-05-06 00:03:35 +03:00
Samuel Pitoiset
b620829b5e gallium: add new fields to pipe_driver_query_info
According to the spec of GL_AMD_performance_monitor, valid type values
returned are UNSIGNED_INT, UNSIGNED_INT64_AMD, PERCENTAGE_AMD, FLOAT.
This also introduces the new field group_id in order to categorize
queries into groups.

v2: add PIPE_DRIVER_QUERY_TYPE_BYTES

v3: fix incorrect query type for radeon and svga drivers

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
2015-05-06 00:03:35 +03:00
Samuel Pitoiset
f137f5c691 gallium: add pipe_screen::get_driver_query_group_info
Driver queries are organized as a single hierarchy where queries are
categorized into groups. Each group has a list of queries and a maximum
number of queries that can be sampled. The list of available groups can
be obtained using pipe_screen::get_driver_query_group_info.

This will be used by GL_AMD_performance_monitor.

v2: add group type (CPU/GPU)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
2015-05-06 00:03:35 +03:00
Tim Rowley
ce01c0af70 mesa: fix shininess check for ffvertex_prog v2
Switch to using VERT_BIT_GENERIC macro, as varying_vp_inputs is a
bitmask.
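
As an illustration of why the macro matters, here is a sketch with a hypothetical attribute numbering (ATTR_GENERIC0 and bit_generic() are stand-ins, not Mesa's definitions). An attribute index has to be converted into its bit before it is tested against a bitmask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical attribute numbering; not Mesa's actual values. */
enum { ATTR_GENERIC0 = 16 };

/* Convert a generic attribute number into its bit in an input mask. */
static inline uint64_t bit_generic(unsigned i)
{
   assert(ATTR_GENERIC0 + i < 64);
   return UINT64_C(1) << (ATTR_GENERIC0 + i);
}

/* Test the mask against a bit, not against the raw attribute index. */
static inline bool uses_generic(uint64_t inputs_mask, unsigned i)
{
   return (inputs_mask & bit_generic(i)) != 0;
}
```

Passing the raw index 17 as if it were a mask would test bits 0 and 4 rather than bit 17, which is the kind of mismatch the macro avoids.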

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2015-05-05 16:20:08 -04:00
Marius Predut
24ecf37ac0 i965/aa: fixing anti-aliasing bug for thinnest width lines - GEN7
On SNB and IVB hardware, for a line thickness of 1 pixel or less,
the general anti-aliasing algorithm gives up and a garbage line is generated.
Setting a Line Width of 0.0 specifies the rasterization of
the “thinnest” (one-pixel-wide), non-antialiased lines.
Lines rendered with zero Line Width are rasterized using
Grid Intersection Quantization rules as specified
by bspec section 6.3.12.1 Zero-Width (Cosmetic) Line Rasterization.

v2: Daniel Stone: Fix = used instead of == in an if-statement.
v3: Ian Romanick: Use "._Enabled" flag instead of ".Enabled".
    Add code comments. Re-wrap the commit message.
    Add a complete Bugzilla list.
    Improve the hardcoded values to produce better results.
v4: Matt Turner: typo fixes and adjust <= 1.49 to become < 1.5

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28832
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=9951
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=27007
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60797
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=15006

Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Marius Predut <marius.predut@intel.com>
2015-05-05 11:56:20 -07:00
Kenneth Graunke
d376c3549b i965: Fix missing type in local variable declaration.
Trivial.  Fixes the following compiler warning (from GCC 5.1.0):

brw_context.c:629:10: warning: type defaults to ‘int’ in declaration
of ‘simd_size’ [-Wimplicit-int]

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-05 11:52:15 -07:00
Matt Turner
07b49f126a i965/vec4: Use same type for immediate, for compaction. 2015-05-05 11:44:37 -07:00
Marius Predut
a9b04d8a0d i965/aa: fixing anti-aliasing bug for thinnest width lines - GEN6
On SNB and IVB hardware, for a line thickness of 1 pixel or less,
the general anti-aliasing algorithm gives up and a garbage line is generated.
Setting a Line Width of 0.0 specifies the rasterization of
the “thinnest” (one-pixel-wide), non-antialiased lines.
Lines rendered with zero Line Width are rasterized using
Grid Intersection Quantization rules as specified
by bspec section 6.3.12.1 Zero-Width (Cosmetic) Line Rasterization.

v2: Daniel Stone: Fix = used instead of == in an if-statement.
v3: Ian Romanick: Use "._Enabled" flag instead of ".Enabled".
    Add code comments. Re-wrap the commit message.
    Add a complete Bugzilla list.
    Improve the hardcoded values to produce better results.
v4: Matt Turner: typo fixes and adjust <= 1.49 to become < 1.5

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28832
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=9951
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=27007
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60797
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=15006

Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Marius Predut <marius.predut@intel.com>
2015-05-05 11:44:37 -07:00
Matt Turner
6da2d71888 i965: Remove end-of-thread SEND alignment code.
This was present in Eric's initial implementation of the compaction code
for Sandybridge (commit 077d01b6). There is no documentation saying this
is necessary, and removing it causes no regressions in piglit on any
platform.
2015-05-05 11:44:37 -07:00
Boyan Ding
28090b30dd i965: Add XRGB8888 format to intel_screen_make_configs
Some applications, such as the DRM backend of Weston, use an XRGB8888 config
by default. i965 doesn't provide this format, but before commit 65c8965d,
the DRM platform of EGL treated ARGB8888 as XRGB8888. Now that commit
65c8965d makes EGL recognize formats correctly, Weston won't start
because it can't find XRGB8888. Add the XRGB8888 format to i965, just as
other drivers do.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89689
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-05-05 14:43:18 +01:00
Emil Velikov
8da47e8a69 nir: add nir_array.h to the sources list
Otherwise `make distcheck' will fail.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-05 14:39:16 +01:00
Samuel Iglesias Gonsalvez
08a4639e81 glsl: don't lower fragdata array if the output data types don't match
Commit 7e414b5864 broke the gl_FragData array
into separate gl_FragData[i] variables, so drivers can eliminate useless
writes to gl_FragData improving their performance.

The problem occurs when GLSL IR code is linked in the following case:

* The FS output variable base data type does not match gl_FragData one (float
  vector)
* The FS output variable is replaced by gl_out_FragDataX because of commit
  7e414b5864 with X from 0 to GL_MAX_DRAW_BUFFERS.

Then the FS output variable base data type is lost in the resulting GLSL IR,
so the driver assigns to the gl_out_FragData components incorrectly
because the data types do not match.

This patch reverts the fragdata array lowering when the output var base data type
doesn't match gl_out_FragData, i.e., when output variable base data type is
not a float or a float vector.

This patch fixes 250 dEQP tests (tested on an Intel Haswell machine)

dEQP-GLES3.functional.fragment_out.random.* (22 failed tests)
dEQP-GLES3.functional.fragment_out.array.uint.* (120 failed tests)
dEQP-GLES3.functional.fragment_out.array.int.* (108 failed tests)

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-05 12:50:22 +02:00
Neil Roberts
4ab8d59a23 i965/skl: Align compressed textures to four times the block size
On Skylake it is possible to choose your own alignment values for
compressed textures but they are expressed as a multiple of the block
size. The minimum alignment value we can use is 4 so we effectively
have to align to 4 times the block size. This patch makes it initially
set mt->align_[wh] to the large alignment value and then later divides
it by the block size so that it can be uploaded as part of the surface
state.
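
The arithmetic can be sketched as follows. This is a simplified illustration with hypothetical helper names, not the driver code: the alignment is first kept in pixels as four times the block size, and is divided by the block size only when it is written out as a multiple of the block size.

```c
#include <assert.h>

/* Minimum alignment multiplier mentioned in the message above. */
enum { MIN_ALIGN_IN_BLOCKS = 4 };

/* Initial alignment in pixels: 4 times the compressed block size. */
static unsigned initial_align(unsigned block_size)
{
   assert(block_size > 0);
   return MIN_ALIGN_IN_BLOCKS * block_size;
}

/* Alignment expressed as a multiple of the block size, for upload. */
static unsigned align_in_blocks(unsigned align, unsigned block_size)
{
   assert(block_size > 0 && align % block_size == 0);
   return align / block_size;
}
```

For a format with 4-pixel-wide blocks, the initial alignment is 16 pixels and the uploaded value is 4.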

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-05-05 10:19:16 +01:00
Dave Airlie
b5045e2991 egl: image_dma_buf_export - use KHR 64-bit type
After talking to Jon Leech he suggested this should be fine.

Update the spec to the version in the registry.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-05 12:19:40 +10:00
Ian Romanick
1c5a57aee1 glapi/es3.1: Add support for GLES versions > 3.0
Make the checks in the Python script and the generated code more generic
to support arbitrary GLES versions >= 2.0.

The updated dispatch_sanity.cpp test discovered this problem.  Without
this, the next patch would erroneously enable GLES 3.1 functions in GLES
2.0 and GLES 3.0.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-05-04 13:49:58 -07:00
Ian Romanick
23d2f63b58 glsl/es3.1: Allow misc ARB_gpu_shader5 built-ins in GLSL ES 3.10
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 13:49:58 -07:00
Ian Romanick
cea605d373 glsl/es3.1: Allow textureGather and textureGatherOffset in GLSL ES 3.10
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-04 13:49:58 -07:00
Ian Romanick
0e1655c6bd glsl/es3.1: Allow enhanced packing functions in GLSL ES 3.10
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-04 13:49:58 -07:00
Ian Romanick
ad14f44b3e glsl/es3.1: Allow integer mix built-ins in GLSL ES 3.10
v2: Add missing lexer support.  Noticed by Tapani.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> [v1]
2015-05-04 13:49:58 -07:00
Ian Romanick
dd61475d56 glsl/es3.1: Allow separate shader objects in GLSL ES 3.10
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-04 13:49:58 -07:00
Ian Romanick
2dcc535300 glsl/es3.1: Allow explicit uniform locations in GLSL ES 3.10
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-04 13:49:58 -07:00
Ian Romanick
7038370bd1 glsl/es3.1: Allow 3.10 ES shaders in a GLES 3.1 context
Currently no 3.10 ES features (beyond 3.00 ES) are enabled.  That will
come later.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-04 13:49:58 -07:00
Ian Romanick
7efc11e071 mesa/es3.1: Add _mesa_is_gles31 helper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-04 13:49:58 -07:00
Ian Romanick
56030a75ed docs/GL3: Update GLES 3.1 dependencies
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-04 13:49:58 -07:00
Ian Romanick
6c9c317caf glsl: Add glsl_parser_state::has_atomic_counters helper
v2: Change GL version from 400 to 420.  Noticed by Tapani and Ilia.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-04 13:49:58 -07:00
Ian Romanick
fa3475b269 mesa: Use bool in _mesa_is_ helpers instead of GLboolean
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-04 13:49:58 -07:00
Ian Romanick
1ec6523fcf mesa: Trivial coding standards cleanups
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-04 13:49:57 -07:00
Ian Romanick
5a845cf898 mesa: Use bool instead of GLboolean
v2: Squash in whitespace fixes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-05-04 13:49:44 -07:00
Ian Romanick
8b103cf636 glsl: Silence unused parameter warnings
I opted to comment out "last_field" because it was not obvious what the
meaning of the dangling bool would be.  For the other parameters, the
meaning was more intuitive without the name.

link_uniform_blocks.cpp:70:65: warning: unused parameter 'name' [-Wunused-parameter]
    virtual void enter_record(const glsl_type *type, const char *name,
                                                                 ^
link_uniform_blocks.cpp:77:65: warning: unused parameter 'name' [-Wunused-parameter]
    virtual void leave_record(const glsl_type *type, const char *name,
                                                                 ^
link_uniform_blocks.cpp:93:62: warning: unused parameter 'record_type' [-Wunused-parameter]
                             bool row_major, const glsl_type *record_type,
                                                              ^
link_uniform_blocks.cpp:94:34: warning: unused parameter 'last_field' [-Wunused-parameter]
                             bool last_field)
                                  ^
link_uniforms.cpp:547:65: warning: unused parameter 'name' [-Wunused-parameter]
    virtual void enter_record(const glsl_type *type, const char *name,
                                                                 ^
link_uniforms.cpp:556:65: warning: unused parameter 'name' [-Wunused-parameter]
    virtual void leave_record(const glsl_type *type, const char *name,
                                                                 ^
link_uniforms.cpp:567:34: warning: unused parameter 'last_field' [-Wunused-parameter]
                             bool last_field)
                                  ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-04 13:49:04 -07:00
Ian Romanick
778c7f149a mesa: Restore functionality to dispatch sanity test
Along with a couple secondary goals, the dispatch sanity test had two
major, primary goals.

1. Ensure that all functions part of an API version are set in the
   dispatch table.

2. Ensure that functions that cannot be part of an API version are not
   set in the dispatch table.

Commit 4bdbb58 removed the test's ability to fulfill either of its
primary goals by removing anything that used _mesa_generic_nop().  It
seems like the problem on Windows could have been resolved by adding the
NULL context pointer check from nop_handler to _mesa_generic_nop().
There is, however, some debugging benefit to actually getting the
(supposed) function name logged in the "unsupported function called"
message.

The preceding commit added a function, _glapi_new_nop_table, that
allocates a table of per-entry point no-op functions.  Restore the
ability to actually validate the sanity of the dispatch table by using
_glapi_new_nop_table.

Before this commit, removing a function from one of the
*_functions_possible lists would not cause the test to fail.  With this
commit, removing such a function will result in failure, as is expected.
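
The check that per-entry-point no-ops make possible can be sketched like this. It is a simplified illustration with hypothetical names and a flat function-pointer table, not the actual test code: because every slot has its own no-op, a slot counts as "set" exactly when it differs from the no-op for that slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*entry_fn)(void);

/* Hypothetical entry points: one real function and two per-slot no-ops. */
static void real_draw(void) {}
static void nop_slot0(void) {}
static void nop_slot1(void) {}

/* Returns true when every slot is set exactly where it is expected to be:
 * expected slots must not hold their no-op, all other slots must. */
static bool dispatch_table_is_sane(const entry_fn *table,
                                   const entry_fn *nop_table,
                                   const bool *expected, size_t count)
{
   assert(table && nop_table && expected);
   for (size_t i = 0; i < count; i++) {
      const bool is_set = table[i] != nop_table[i];
      if (is_set != expected[i])
         return false;
   }
   return true;
}
```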

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-05-04 13:27:21 -07:00
Francisco Jerez
e1ae0c3bc3 i965: Fix variable indexing of sampler arrays under non-uniform control flow.
ARB_gpu_shader5 requires sampler array indexing expressions to be
dynamically uniform.  This, however, implies nothing about whether the
control flow that leads to the evaluation of that expression is
uniform.  Use emit_uniformize() to obtain an arbitrary live value from
the binding table index calculation instead of assuming that the first
channel is always live.

Fixes the following Piglit test cases:
  arb_gpu_shader5/execution/sampler_array_indexing/fs-nonuniform-control-flow.shader_test
  arb_gpu_shader5/execution/sampler_array_indexing/vs-nonuniform-control-flow.shader_test

part of the series:
  http://lists.freedesktop.org/archives/piglit/2015-February/014615.html

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:17 +03:00
Francisco Jerez
b234537cc3 i965: Fix variable indexing of UBO arrays under non-uniform control flow.
ARB_gpu_shader5 requires UBO array indexing expressions to be
dynamically uniform.  This, however, implies nothing about whether the
control flow that leads to the evaluation of that expression is
uniform.  Use emit_uniformize() to obtain an arbitrary live value from
the binding table index calculation instead of assuming that the first
channel is always live.

Fixes the following Piglit tests:
  arb_gpu_shader5/execution/ubo_array_indexing/fs-nonuniform-control-flow.shader_test
  arb_gpu_shader5/execution/ubo_array_indexing/vs-nonuniform-control-flow.shader_test

part of the series:
  http://lists.freedesktop.org/archives/piglit/2015-February/014616.html

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:17 +03:00
Francisco Jerez
046abc998c i965: Define helper function to copy an arbitrary live component from some register.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:17 +03:00
Francisco Jerez
3da9f708d4 i965: Perform basic optimizations on the FIND_LIVE_CHANNEL opcode.
v2: Save some CPU cycles by doing 'return progress' rather than
    'depth++' in the discard jump special case.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-04 17:44:17 +03:00
Francisco Jerez
715bc6d8b1 i965: Introduce the FIND_LIVE_CHANNEL pseudo-opcode.
This instruction calculates the index of an arbitrary channel enabled
in the current execution mask.  It's expected to be used as input for
the BROADCAST opcode, but it's implemented as a separate instruction
rather than being baked into BROADCAST because FIND_LIVE_CHANNEL has
no dependencies so it can always be CSE'ed with other instances of the
same instruction within a basic block.
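
A CPU-side model of the semantics, as a sketch only: the execution mask is represented as a plain 32-bit integer, and the lowest enabled channel is chosen. Which live channel the real instruction returns is not stated here, so the choice of the lowest one is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Return the index of some enabled channel in exec_mask.  This model
 * picks the lowest set bit; any set bit would satisfy the description. */
static unsigned find_live_channel(uint32_t exec_mask)
{
   assert(exec_mask != 0); /* at least one channel must be enabled */
   unsigned i = 0;
   while (!(exec_mask & (UINT32_C(1) << i)))
      i++;
   return i;
}
```

The result depends only on the execution mask, which is why two instances in the same basic block can be merged.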

v2: Whitespace fixes.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:17 +03:00
Francisco Jerez
f2fad0dc80 i965: Perform basic optimizations on the BROADCAST opcode.
v2: Style fixes.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:17 +03:00
Francisco Jerez
c74511f5dc i965: Introduce the BROADCAST pseudo-opcode.
The BROADCAST instruction picks the channel from its first source
given by an index passed in as second source.  This will be used in
situations where all channels from the same SIMD thread have to agree
on the value of something, e.g. a surface binding table index.

This is in particular the case for UBO, sampler and image arrays,
which can be indexed dynamically with the restriction that all active
SIMD channels access the same index, provided to the shared unit as
part of a single scalar field of the message descriptor.  Simply
taking the index value from the first channel as we were doing until
now is incorrect, because it might contain an uninitialized value if
the channel had previously been disabled by non-uniform control flow.
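
The semantics can be modelled on the CPU as below. This is a sketch with a hypothetical fixed SIMD width of 8 and plain arrays standing in for registers, not the generator code: the value of one chosen channel is copied to all channels, so a stale value in a disabled channel is never used as long as the index names a live one.

```c
#include <assert.h>
#include <stdint.h>

#define SIMD_WIDTH 8 /* hypothetical width for this model */

/* Copy src[index] into every channel of dst. */
static void broadcast(uint32_t dst[SIMD_WIDTH],
                      const uint32_t src[SIMD_WIDTH], unsigned index)
{
   assert(index < SIMD_WIDTH);
   const uint32_t value = src[index];
   for (unsigned i = 0; i < SIMD_WIDTH; i++)
      dst[i] = value;
}
```

If channels 0 and 1 are disabled and hold garbage, broadcasting from channel 2 yields the agreed value in every channel, whereas reading channel 0 would not.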

v2: Minor style fixes.  Improve commit message.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:17 +03:00
Francisco Jerez
ce0e151721 glsl: Keep track of the early_fragment_tests flag in gl_shader.
And rename _mesa_glsl_parse_state::early_fragment_tests to
fs_early_fragment_tests for consistency with other FS-specific flags in the
same struct.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:17 +03:00
Francisco Jerez
6c1f6f8291 glsl: Error out on invalid uses of the early_fragment_tests layout qualifier.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:16 +03:00
Francisco Jerez
b5994d24d8 glsl: Forbid use of image qualifiers in declarations of type other than image.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:16 +03:00
Francisco Jerez
3f8558650d glsl: Split off memory qualifiers from storage qualifiers.
Image memory qualifiers (coherent, volatile, restrict, readonly and writeonly)
follow slightly different rules from storage qualifiers, e.g. the uniqueness
rule doesn't apply.  Make them a separate non-terminal.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:16 +03:00
Francisco Jerez
f64edfdc44 glsl: Forbid opaque variables as operands of the ternary operator.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:16 +03:00
Francisco Jerez
b5854ee72b mesa: Update image unit state when glBindImageTexture is called with texture=0.
There's no indication in the spec that the image unit state other than the
bound texture object shouldn't be updated when glBindImageTexture() is called
passing the zero texture as argument.  It's very unlikely that any application
would ever have relied on this, but it's easy to get right, and it fixes the
"state" ARB_shader_image_load_store piglit test.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:16 +03:00
Francisco Jerez
b663d6bc6f mesa: Initialize image units to default state on context creation.
This is the required initial image unit state according to "Table 23.45. Image
State (state per image unit)" of the OpenGL 4.3 specification.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:16 +03:00
Francisco Jerez
1b9990e373 mesa: Implement image uniform queries.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:16 +03:00
Francisco Jerez
cad0cf4cee mesa: Validate original image internal format rather than derived mesa format.
This matches what _mesa_BindImageTextures() does.  The derived image format
(gl_texture_image::TexFormat) isn't necessarily equivalent to the internal
format of the texture image.  If a forbidden internal format has been
specified we need to mark the image unit as invalid as required by the spec,
regardless of the derived format.  Fixes the "invalid"
ARB_shader_image_load_store piglit test.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:16 +03:00
Francisco Jerez
4e4855f1de mesa: Call _mesa_test_texobj_completeness() before using _MaxLevel in image validation.
gl_texture_object::_MaxLevel doesn't have any meaningful value until
_mesa_test_texobj_completeness() has been run.  Fixes the "level"
ARB_shader_image_load_store piglit test.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:16 +03:00
Francisco Jerez
f74ba58f84 mesa: Add support for binding a buffer texture to a shader image unit.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:15 +03:00
Francisco Jerez
8424cafbac mesa: Add extern "C" guards to shaderimage.h to allow inclusion from C++ code.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:15 +03:00
Francisco Jerez
dded5271e4 mesa: Export shader image format to mesa format conversion function.
This function will be useful for back-ends to translate an image internal
format as specified in GLSL code into a mesa format.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-04 17:44:15 +03:00
Iago Toral Quiroga
96142a3e87 swrast: Fix rgba_draw_pixels with GL_COLOR_INDEX
When we implemented the format conversion rewrite we forgot to handle
GL_COLOR_INDEX here, which needs special handling.

Fixes the following piglit test:
bin/gl-1.0-drawpixels-color-index -auto -fbo

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90213

Tested-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-04 16:08:41 +02:00
Francisco Jerez
f1d1d17db6 i965: Add memory fence opcode.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-05-04 15:05:21 +03:00
Francisco Jerez
f118e5d15f i965: Add typed surface access opcodes.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-05-04 15:05:21 +03:00
Francisco Jerez
0775d8835a i965: Add untyped surface write opcode.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-04 15:05:21 +03:00
Francisco Jerez
c97a7705ea i965: Reorder sources of the untyped atomic opcode.
This is consistent with the untyped surface read opcode.  From now on
all typed and untyped surface access opcodes will follow the same
pattern: src[0] will be the message payload, src[1] will be the
surface index and src[2] will be a control immediate (atomic operation
for atomic opcodes and number of vector components for surface read
and write opcodes).

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-04 15:05:20 +03:00
Francisco Jerez
ac747ca5f7 i965: Pass the number of components as a source of the untyped surface read opcode.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-04 15:05:20 +03:00
Francisco Jerez
20915130ac i965/vec4: Add support for untyped surface message sends from GRF.
This doesn't actually enable untyped surface message sends from GRF
yet, the upcoming atomic counter and image intrinsic lowering code
will.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-04 15:05:20 +03:00
Francisco Jerez
8865fe309d i965: Don't request untyped atomic writeback message if the destination is null.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-04 15:05:20 +03:00
Francisco Jerez
0519a6259b i965: Simplify generator code for untyped surface messages.
The generate_untyped_*() methods do nothing useful other than calling
the corresponding function from brw_eu_emit.c.  The calls to
brw_mark_surface_used() will go away too in a future commit.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-04 15:05:20 +03:00
Francisco Jerez
2f1c16df3e i965: Fix the untyped surface opcodes to deal with indirect surface access.
Change brw_untyped_atomic() and brw_untyped_surface_read() to take the
surface index as a register instead of a constant and to use
brw_send_indirect_message() to emit the indirect variant of send with
a dynamically calculated message descriptor.  This will be required to
support variable indexing of image arrays for
ARB_shader_image_load_store.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-05-04 15:05:20 +03:00
Chia-I Wu
4348046a2f ilo: use ilo_image exclusively in core
Initialize ilo_view_surface and ilo_zs_surface from ilo_image instead of
ilo_texture.
2015-05-02 22:28:31 +08:00
Chia-I Wu
9b705ec32d ilo: add ilo_image_can_enable_aux()
It replaces ilo_texture_can_enable_hiz().
2015-05-02 22:14:07 +08:00
Chia-I Wu
430594c34f ilo: make ilo_image more self-contained
Add depth0, sample_count, and scanout to ilo_image.
2015-05-02 22:14:06 +08:00
Chia-I Wu
f6ca4084c7 ilo: add ilo_image_init_for_imported()
It replaces ilo_image_update_for_imported_bo() and enables more error
checks for imported textures.
2015-05-02 22:14:06 +08:00
Chia-I Wu
938c9b8cea ilo: prepare for image init for imported bo
Refactoring in preparation for ilo_image_init_for_imported().
2015-05-02 22:14:06 +08:00
Chia-I Wu
3f9415077b ilo: constify ilo_image_params
Make ilo_image_params const in functions that do not modify it.
2015-05-02 22:14:06 +08:00
Chia-I Wu
c209aa7a8f ilo: improve readability of ilo_image
Improve docs, rename struct fields, and reorder walk types.  No real changes.
2015-05-02 22:14:06 +08:00
Chia-I Wu
9b72bf5bd2 ilo: move command builder to core 2015-05-02 22:14:06 +08:00
Chia-I Wu
9e24c49e64 ilo: move ilo_state_3d* to core
ilo state structs (struct ilo_xxx_state) are moved as well.
2015-05-02 22:14:06 +08:00
Chia-I Wu
8ab18262c5 ilo: add ilo_buffer.h to core
Rename the original ilo_buffer to ilo_buffer_resource to avoid name conflict.
2015-05-02 22:14:06 +08:00
Chia-I Wu
3afbeb115a ilo: move BOs from ilo_texture to ilo_image
We want to work with ilo_image instead of ilo_texture in core.
2015-05-02 22:14:06 +08:00
Chia-I Wu
ac47563cb4 ilo: move ilo_layout.[ch] to core as ilo_image.[ch]
Move files and s/layout/image/.
2015-05-02 22:14:06 +08:00
Chia-I Wu
8252765532 ilo: add ilo_format.[ch] to core
The original ilo_format.[ch] are removed.
2015-05-02 22:14:06 +08:00
Chia-I Wu
9b7080c8b3 ilo: add ilo_fence.h to core
Implement pipe_fence_handle on top of ilo_fence.
2015-05-02 22:14:06 +08:00
Chia-I Wu
2182beb431 ilo: add ilo_dev_init() to core
Move init_dev() from ilo_screen.c to core.
2015-05-02 22:14:06 +08:00
Chia-I Wu
7562f9e907 ilo: rename ilo_dev_info to ilo_dev
With intel_winsys being embedded in it, drop the "_info" suffix.
2015-05-02 22:14:06 +08:00
Chia-I Wu
19351af53d ilo: move intel_winsys to ilo_dev_info
We want to use ilo_dev_info instead of ilo_screen in core.
2015-05-02 22:14:06 +08:00
Chia-I Wu
b3197fe5f4 ilo: add ilo_dev.h to core
Move what remains in ilo_common.h (that is, ilo_dev_*) to ilo_dev.h.
2015-05-02 22:14:06 +08:00
Chia-I Wu
7bb4fa72c0 ilo: add ilo_debug.[ch] to core
They consist of the debug helpers that used to live in ilo_common.h and
ilo_screen.c.
2015-05-02 22:14:06 +08:00
Chia-I Wu
a5797873d0 ilo: add ilo_core.h to core
ilo_core.h includes the common gallium headers that were included in
ilo_common.h.
2015-05-02 22:14:05 +08:00
Chia-I Wu
bbe91576b7 ilo: move intel_winsys.h to core
Add a new subdirectory and start moving files that do not depend on
ilo_screen/ilo_context to it.
2015-05-02 22:14:05 +08:00
Jordan Justen
eeee212e53 i965: Upload atomic buffer state for compute shaders
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:50:00 -07:00
Jordan Justen
5328ffbe79 i965/cs: Emit MEDIA_STATE_FLUSH after WALKER
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:50:00 -07:00
Jordan Justen
8d87070af2 i965/cs: Implement brw_emit_gpgpu_walker
Tested on Ivybridge, Haswell and Broadwell.

v2:
 * Use SET_FIELD. (Ken)
 * Use simd_size / 16 to support SIMD8/16/32. Ken suggested
   that we might be able to do it arithmetically rather than just
   supporting SIMD8 and SIMD16 with a conditional.
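
The arithmetic form can be sketched as a one-line helper (simd_size_field() is a hypothetical name, and how the hardware interprets the resulting value is not stated in this message): integer division by 16 maps the three widths to three distinct small values without a conditional.

```c
#include <assert.h>

/* Map a SIMD width of 8, 16 or 32 to 0, 1 or 2 by integer division. */
static unsigned simd_size_field(unsigned simd_size)
{
   assert(simd_size == 8 || simd_size == 16 || simd_size == 32);
   return simd_size / 16;
}
```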

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:50:00 -07:00
Jordan Justen
0e0e23ef53 i965/state: Emit pipeline select when changing pipelines
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:50:00 -07:00
Paul Berry
013031b229 i965: Implement DispatchCompute() back-end
brw_emit_gpgpu_walker will be implemented in a subsequent patch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:50:00 -07:00
Paul Berry
8f1423b2c4 main/cs: Implement front end code for glDispatchCompute().
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:50:00 -07:00
Paul Berry
4d0f3d2319 mesa/cs: Add DispatchCompute() to driver function table.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:50:00 -07:00
Jordan Justen
5f70b49d4b i965/cs: Emit state base address
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:49:59 -07:00
Jordan Justen
b750e14fbb i965/fs: Add CS shader time support
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:49:59 -07:00
Jordan Justen
6b1b484b60 i965/cs: Upload brw_cs_state
v3:
 * Add defines. Misc cleanup suggestions. (Ken)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:49:59 -07:00
Jordan Justen
6ec6c1581c i965/cs: Support CS program precompile
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:49:59 -07:00
Jordan Justen
17233f9bbc i965: Add brw_setup_tex_for_precompile. Use in VS, GS & FS.
Suggested-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:49:59 -07:00
Jordan Justen
932045061b i965/cs: Emit compute shader code and upload programs
v2:
 * Don't bother checking for 'gen > 5' (krh)
 * Populate sampler data in key (krh)

v3:
 * Drop no8 support, and simplify code in several places (Ken)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:49:59 -07:00
Jordan Justen
cb18f3f021 i965/cs: Set invocation counts based on max_cs_threads
For ES, we set the max counts based on SIMD8, which is currently
accurate.

For desktop GL, we set the max counts based on SIMD16, which can fail
in some cases where a SIMD16 program is not currently supported.
Therefore, this value is not currently accurate, but will work fine in
many cases, and lets us run more test cases. Eventually we want to
always be able to generate a SIMD16 program.
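
A rough sketch of how such a limit could be derived, assuming the maximum invocation count is simply the SIMD width multiplied by max_cs_threads. The function name and that formula are assumptions made for illustration, not the driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: upper bound on invocations per work group, assuming
 * a work group must fit in max_cs_threads hardware threads and each thread
 * runs simd_width invocations. */
static inline uint32_t
max_cs_invocations(uint32_t simd_width, uint32_t max_cs_threads)
{
   assert(simd_width == 8 || simd_width == 16);
   return simd_width * max_cs_threads;
}
```

Under this assumption the SIMD16 limit is twice the SIMD8 limit, which is why it is only reachable when a SIMD16 program can actually be generated.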

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:49:59 -07:00
Jordan Justen
73cb2d3a73 i965/cs: Add max_cs_threads
Add values for gen7 & gen8. These are the number of threads in a
subslice.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:49:59 -07:00
Jordan Justen
ea888c771c i965: Remove comment about chv device numbers being preliminary
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:34:29 -07:00
Jordan Justen
c380973a95 i965/fs: Support compute programs in fs_visitor
v2:
 * Clean out some unneeded code copied from run_fs (krh)
 * Always use NIR
 * Split shader time out into a separate commit

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:34:28 -07:00
Jordan Justen
ae6308a41e i965/cache: Add support for CS in program state cache
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:34:28 -07:00
Paul Berry
92a57e7207 i965/cs: Add brw_cs_prog_data, brw_cs_prog_key and brw_context::cs.
jordan.l.justen@intel.com:
 * Added brw_cs_prog_key structure
 * Added brw_cs_prog_data::dispatch_grf_start_reg_16
 * Added brw_cs_prog_data::local_size
 * Added brw_cs_prog_data::simd_size

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:34:28 -07:00
Jordan Justen
2a4df9c524 i965/cs: Add generator support for CS_OPCODE_CS_TERMINATE
v2:
 * Don't rely on brw_eu* to generate the send instruction. We now
   generate the send here, and drop the "i965/cs: Add support for the
   SEND message that terminates a CS thread" brw_eu* patch.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:34:28 -07:00
Jordan Justen
dff4a42676 i965/cs: Mark g0 as used by CS_OPCODE_CS_TERMINATE
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:34:28 -07:00
Jordan Justen
d79cdee1d9 i965/fs: Add emit_cs_terminate to emit CS_OPCODE_CS_TERMINATE
v2:
 * Do more work at the visitor level. g0 is loaded and sent to the
   generator now.

v3:
 * Use Ken's comment explaining g0 usage

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:34:28 -07:00
Jordan Justen
eeb4b68224 i965/cs: Add CS_OPCODE_CS_TERMINATE
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:34:28 -07:00
Jordan Justen
f002176d5d i965/cs: Add BRW_NEW_CS_PROG_DATA and BRW_CACHE_CS_PROG
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:34:28 -07:00
Paul Berry
d94a9e7041 i965: Add an INTEL_DEBUG=cs option.
At the moment it's not wired up to anything.  Later patches will hook
it up to the compute shader back-end.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:34:28 -07:00
Paul Berry
bf058dad6b mesa/cs: Add compute support to update_program().
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:34:28 -07:00
Paul Berry
abb049dab6 mesa/cs: Update program.c for compute shaders.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:34:28 -07:00
Paul Berry
56d5c5ab5c mesa/cs: Add inline functions for dealing with compute shaders.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:34:28 -07:00
Paul Berry
6ee4dac1ef i965/cs: Add BRW_NEW_COMPUTE_PROGRAM state flag.
Also add code to brw_upload_state to set it when the compute program
changes.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-02 00:34:28 -07:00
Neil Roberts
02e9773bc8 i965/fs: Strip trailing constant zeroes in sample messages
If a send message is emitted with a message length that is less than
required for the message then the remaining parameters default to
zero. We can take advantage of this to save a register when a shader
passes constant zeroes as the final coordinates to the sample
function.
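
The idea can be sketched as below. The struct and function names are hypothetical, and keeping at least one parameter is an assumption of this sketch rather than something stated in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of one sample-message parameter. */
struct sample_param {
   bool is_constant_zero;
};

/* Returns how many parameters still need to be sent once trailing constant
 * zeroes are dropped. The omitted ones default to zero in hardware.
 * Assumption of this sketch: at least one parameter is always kept. */
static unsigned
strip_trailing_zeroes(const struct sample_param *params, unsigned count)
{
   assert(count > 0);
   while (count > 1 && params[count - 1].is_constant_zero)
      count--;
   return count;
}
```

Only zeroes at the end of the list can be dropped. A zero followed by a non-zero parameter has to stay, because the message length can only be shortened from the tail.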

I think this might be useful for GLES applications that are using 2D
textures to simulate 1D textures.

On Skylake it will be useful for shaders that do
texelFetch(tex,something,0) which I think is fairly common. This helps
more on Skylake because in that case the order of the instruction
operands is u,v,lod,r which is good for 2D textures whereas before
it was u,lod,v,r which is only good for 1D textures.

On Haswell:
total instructions in shared programs: 8535730 -> 8533261 (-0.03%)
instructions in affected programs:     236968 -> 234499 (-1.04%)
helped:                                1174

On Skylake:
total instructions in shared programs: 10345646 -> 10341237 (-0.04%)
instructions in affected programs:     293011 -> 288602 (-1.50%)
helped:                                1218

Reviewed-by: Matt Turner <mattst88@gmail.com>

v2: Applied suggestions by Kenneth Graunke:
    - Only apply on Gen5+
    - Apply to all texture opcodes, not just TEX and TXF.
    Moved the optimisation into the loop as suggested by Matt Turner.
    Fix the array index when there is a header.
2015-05-01 11:46:28 +01:00
Neil Roberts
be119e80c9 i965/skl: Force the exec size to 8 when initing header for SIMD4x2
On Gen9+ there needs to be a header when sampling using SIMD4x2. The
header is set up by copying from the g0 register. Commit 07c571a39f
tried to fix this mov instruction to always use an exec size of 8
because previously it was incorrectly using 4. It did this by casting
the type of the destination register to vec8. This was done because
there is code in brw_set_dest to guess the exec size based on the
width of the dest register. However I misunderstood how this works
because it is actually only used when the width is less than 8. That
means the patch actually changed it to use the default exec size which
on SIMD16 would be 16 and the MOV would clobber over the first
register in the send message. This patch makes it additionally set the
default exec size to 8. This is similar to how the message is set up
in fs_generator::generate_tex.

I think this wasn't picked up by any Piglit tests because we don't
have any fragment shaders that hit this code path so nothing was using
SIMD16. However the patch caused failures in deqp tests.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90153
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
2015-05-01 11:46:22 +01:00
Kenneth Graunke
1ac7db07b3 i965: Unhardcode a few more stage names and abbreviations.
The stage_abbrev and stage_name fields in backend_visitor provide what
we need without any additional effort.  It also means we'll get the
right names for compute shaders, SIMD8 geometry shaders, and both kinds
of tessellation shaders.

This does unfortunately change the capitalization of the stage
abbreviation in the INTEL_DEBUG=optimizer output filenames.  It doesn't
seem worth adding code to handle, though.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-04-30 11:49:50 -07:00
Marek Olšák
1db5d3c19e docs/relnotes: document the new EGL sync extensions 2015-04-30 14:38:38 +02:00
Marek Olšák
e70de9b032 st/dri: implement the fence interface for CL events 2015-04-30 14:38:38 +02:00
Marek Olšák
952b5e84db gallium,clover: add OpenCL interoperability support for CL events
v2: - move interop.cpp to clover/api
    - change intptr_t to void* in the interface
    - add a virtual function fence() to simplify some code

v3: - use bool in the interface
v4: - enclose the last two interop functions in try..catch

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-04-30 14:38:38 +02:00
Marek Olšák
7070b0dd66 st/dri: implement the fence interface 2015-04-30 14:38:38 +02:00
Marek Olšák
a2557b30d8 egl/dri2: return the latest sync status in eglGetSyncAttribKHR 2015-04-30 14:38:38 +02:00
Marek Olšák
290a3eb750 egl/dri2: implement EGL_KHR_cl_event2 (v2)
v2: fix the SYNC_CONDITION query
2015-04-30 14:38:38 +02:00
Marek Olšák
a8617cc042 egl/dri2: implement EGL_KHR_wait_sync 2015-04-30 14:38:38 +02:00
Marek Olšák
9a0bda2430 egl/dri2: implement EGL_KHR_fence_sync 2015-04-30 14:38:38 +02:00
Marek Olšák
592ee249a1 mesa: add GL_OES_EGL_sync
This is an empty extension whose presence means that EGL sync objects can be
used with ES contexts.
2015-04-30 14:38:38 +02:00
Marek Olšák
b02a5bf3ba dri_interface: add an interface for fences 2015-04-30 14:38:38 +02:00
Marek Olšák
396cbabbef egl/dri: don't expose configs with an accumulation buffer 2015-04-30 14:38:38 +02:00
Ilia Mirkin
33f0d1138d nvc0/ir: fix predicated PFETCH for real
Commit a9d08a250 accidentally didn't make use of the new src1 variable.
Use it.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2015-04-30 02:02:47 -04:00
Ilia Mirkin
db269ae495 nv50/ir: fix asFlow() const helper for OP_JOIN
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2015-04-29 23:34:30 -04:00
Ilia Mirkin
a9d08a250a nvc0/ir: fix predicated PFETCH emission
src1 would contain the predicate, which would get emitted as a register
source by an undiscerning srcId helper. Work around this in the same way
as in emitTEX.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2015-04-29 23:34:22 -04:00
Ilia Mirkin
515ac907e6 gk110/ir: fix set with a register dest to not auto-set the abs flag
This was causing src0 to always have the absolute value flag set.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2015-04-29 18:03:19 -04:00
Topi Pohjolainen
13670e8bad i965/blorp: Prepare drawing rectangle for flipped coordinates
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:49 +03:00
Topi Pohjolainen
dfd896699d i965/blorp: Add support for layered rendering
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:49 +03:00
Topi Pohjolainen
91daf9f09b i965/blorp: Allow blend state to be set for multiple render targets
Original blorp writes only one buffer per shader invocation. Once
the launch mechanism is shared with glsl-based programs there will
be need for supporting multiple render targets.

Also drop the always constant color write disable settings.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:48 +03:00
Topi Pohjolainen
7fb0db4dd1 i965/blorp: Prepare for attributes other than render position
Note that the magic number of one in gen7 logic is replaced by
BRW_SF_URB_ENTRY_READ_OFFSET ( == 1 also) for clarity.

On gen6 the change from zero to one (BRW_SF_URB_ENTRY_READ_OFFSET)
has no effect for native blorp as blorp doesn't use any
additional attributes. In fact, regular pipeline setup always
uses BRW_SF_URB_ENTRY_READ_OFFSET even when there are no additional
attributes. Hence the change makes the two (blorp and regular)
consistent.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:48 +03:00
Topi Pohjolainen
25ce6c6943 i965/blorp: Remove unused arguments
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:48 +03:00
Topi Pohjolainen
dce1972945 i965/gen7/blorp: Remove unused arguments
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:48 +03:00
Topi Pohjolainen
4de0bef7f4 i965/blorp: Allow caller to provide sampler settings
v2 (Ken): s/use_unorm_coords/non_normalized_coords/

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:48 +03:00
Topi Pohjolainen
bfdacac86c i965/blorp: Refactor vertex buffer state setup
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:48 +03:00
Topi Pohjolainen
d271a13ba3 i965/blorp: Remove constant parameter
This was still needed when we had support for blorp clears but now
this is fixed to nop.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:48 +03:00
Topi Pohjolainen
d7e49fba9a i965/gen8: Expose state base address setup
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:48 +03:00
Topi Pohjolainen
fea168f495 i965/ps/gen8: Refactor state uploading
v2: Use SET_FIELD() for sampler count, and for that reason
    added GEN7_PS_SAMPLER_COUNT_MASK.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:47 +03:00
Topi Pohjolainen
4047420ec4 i965/ps/gen7: Refactor state uploading
Now the uploading depends only on the input parameters instead
of consulting the current gl-state.

v2: Rebased on top of sampler count clamping

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:47 +03:00
Topi Pohjolainen
02dbc79297 i965: Refactor sampler state setup
v2 (Matt): Moved * to the name.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:47 +03:00
Topi Pohjolainen
47f32cb50d i965: Remove dependency to tex object in default color setup
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:47 +03:00
Topi Pohjolainen
21071afc43 i965: Refactor and expose brw_upload_binding_table()
Read and write parts of the state stage are also split into
explicit arguments allowing future patches to use constant
program data.

v2 (Ken): s/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:47 +03:00
Topi Pohjolainen
c15e20d8f6 i965: Expose and refactor brw_update_renderbuffer_surfaces()
Note that brw_update_renderbuffer_surfaces() already had a helper
variable which was used in parallel to direct access of the current
draw buffer of the context.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:47 +03:00
Topi Pohjolainen
c8b0d890c0 i965: Refactor rb surface setup to allow caller to store offsets
Note that in gen7_wm_surface_state.c there is also an indentation
change in the surrounding code, removing tabs.

v2 (Matt): Fixed whitespace: tabs -> spaces

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:47 +03:00
Topi Pohjolainen
d6c83c9d86 i965/gen8: Use constant pointers for reading miptree details
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:47 +03:00
Topi Pohjolainen
f39846fb57 i965/ps: Use SET_FIELD() for sampler count
The value is actually clamped to 0-16, as the sampler state pointer
can be used to support more than 16 samplers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-30 00:28:33 +03:00
Ian Romanick
2c7e289d8b glx: Massive update of comments in struct extension_info
In response to another patch, Emil asked for some clarification how this
stuff works.  Rather than just reply to the e-mail, I decided to update
the explanation in the code.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-29 13:18:59 -07:00
Marek Olšák
a582b22c63 winsys/radeon: add a private interface for radeon_surface 2015-04-29 21:51:40 +02:00
Marek Olšák
dcfbc006b6 winsys/radeon: move radeon_winsys.h to drivers/radeon 2015-04-29 21:51:40 +02:00
EdB
d8f817ae7f clover: remove util/compat
Acked-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-04-29 14:25:42 +00:00
Neil Roberts
5d4f085a43 i965: Don't try to apply the opt_sampler_eot extension for vs
The opt_sampler_eot optimisation of fs_visitor effectively assumes
that it is running on a fragment shader because it casts the program
key to a brw_wm_prog_key. However on Skylake fs_visitor can also be
used for vertex shaders. It looks like this usually works anyway
because the optimisation is skipped if key->nr_color_regions != 1.
However for a vertex shader the key is actually a brw_vs_prog_key so
the space for nr_color_regions is probably taken up by
key->base.program_string_id. This can end up making nr_color_regions
be 1 in which case the function will later assert when the last
instruction is not FS_OPCODE_FB_WRITE. This was making the DEQP test
suite assert. Presumably this only happens there because that compiles
a lot of shaders so it would end up with a high value for
program_string_id.
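
The hazard and the guard can be sketched as follows. All types and names here are simplified stand-ins invented for illustration, not the real brw_* structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum shader_stage { STAGE_VERTEX, STAGE_FRAGMENT };

/* Stand-in for a fragment program key. */
struct fragment_key {
   unsigned nr_color_regions;
};

/* Hypothetical visitor state: 'key' points at a stage-specific struct. */
struct visitor {
   enum shader_stage stage;
   const void *key;
};

static bool
can_apply_sampler_eot(const struct visitor *v)
{
   assert(v != NULL && v->key != NULL);

   /* Check the stage first: only then is the cast below meaningful. */
   if (v->stage != STAGE_FRAGMENT)
      return false;

   const struct fragment_key *key = v->key;
   return key->nr_color_regions == 1;
}
```

Without the stage check, whatever bytes sit at the same offset in a vertex key would be read as nr_color_regions, which is the failure described above.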

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-29 15:31:45 +01:00
Emil Velikov
b124dc2b70 r300: do not link against libdrm_intel
Accidentally added since the introduction of the file.

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-04-29 15:15:19 +01:00
EdB
2d112ed961 clover: make module::symbol::name a string
Acked-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-04-29 12:45:07 +00:00
EdB
5ca9b23319 clover: remove compat::string
Acked-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-04-29 12:45:00 +00:00
EdB
1b4a1d0049 clover: remove compat classes that match std one
Acked-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-04-29 12:44:53 +00:00
EdB
3c61ff0d89 clover: compile all sources with c++11
Later we can remove the compat code

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-04-29 12:43:55 +00:00
Axel Davy
231be57ee2 st/nine: Remove Managed texture hack.
Previously binding an uninitialized managed texture
was causing a crash, and a workaround was added to
prevent the crash.

This patch removes this workaround and instead sets the initial
state of managed textures to dirty, so that when the texture is bound
for the first time, it is always initialized.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:11 +02:00
Axel Davy
58d295d41e st/nine: Enforce LOD 0 for D3DUSAGE_AUTOGENMIPMAP
For D3DUSAGE_AUTOGENMIPMAP textures, applications can only
lock/copy from/get surface descriptor for/etc the first level.
Thus it makes sense to restrict the LOD to 0, and use only the first
level to generate the sublevels.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:11 +02:00
Axel Davy
6f57e01436 st/nine: Some D3DUSAGE_AUTOGENMIPMAP fixes
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:11 +02:00
Axel Davy
24eca6a30d st/nine: util_gen_mipmap doesn't need we reset states.
util_gen_mipmap uses pipe->blit, so we don't need to
restore all states after using it.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:11 +02:00
Axel Davy
7a7758c552 st/nine: D3DUSAGE_AUTOGENMIPMAP is forbidden for volumes
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:11 +02:00
Axel Davy
ec411d9b74 st/nine: Fix NineBaseTexture9_PreLoad
It wasn't uploading the texture when the lod
had changed.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:11 +02:00
Axel Davy
b45fa97a22 st/nine: Rewrite Managed texture uploads
That part of the code was quite obscure.
This new implementation tries to make it clearer
by separating the different parts, and commenting more.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:11 +02:00
Axel Davy
090ebc7638 st/nine: Bound the dirty regions to resource size
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:11 +02:00
Axel Davy
520e36f89c st/nine: Simplify Surface9 Managed resources implementation
Remove the Surface9 code for dirty rects, used only for Managed
resources. Instead convey the information to the parent texture.

According to documentation, this seems to be the expected behaviour,
and if documentation is wrong there, that's not a problem since it can
only lead to more texture updates in corner cases.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:11 +02:00
Axel Davy
4c2247ac60 st/nine: Remove impossible cases with Managed textures
Copying to/from a Managed texture is forbidden.
Rendering to a Managed texture is forbidden.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:11 +02:00
Axel Davy
e558ce98f2 st/nine: Encapsulate variables for MANAGED resource
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:11 +02:00
Axel Davy
35fe920e1e st/nine: Rework texture data allocation
Some applications assume the memory for multilevel
textures is allocated in contiguous blocks.

This patch implements that behaviour.

v2: cache offsets

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:11 +02:00
Axel Davy
54f8e8a18d st/nine: Fix update_vertex_elements bad rebase
This code was supposed to be removed, but a rebase seems to have
made it stay.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:11 +02:00
Axel Davy
87868d3832 st/nine: Add debug warning when application uses sw processing
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:11 +02:00
Axel Davy
4acbf420d1 st/nine: Rework update_vertex_buffers
The previous code tried to call set_vertex_buffers on big packets,
and thus avoid as many calls as possible.

However in practice doing so won't be faster (drivers implement
set_vertex_buffers by a loop over the buffers we want to bind).

When we want to unbind a buffer, we were calling set_vertex_buffers
on a buffer with vtxbuf->buffer = NULL. It works on some drivers,
but not on all of them, because it isn't in Gallium spec.
This patch fixes that.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:11 +02:00
Xavier Bouchoux
5beb411bf7 st/nine: Fix computation of const_used_size
Was sometimes too large for PS.

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>
2015-04-29 08:28:10 +02:00
Axel Davy
559342d01d gallium/svga: Remove useless ARRAY_SIZE declaration
This is already declared in util/macros.h

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:10 +02:00
Axel Davy
64880d073a util/macros: Move DIV_ROUND_UP to util/macros.h
Move DIV_ROUND_UP to a shared location accessible everywhere
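
For reference, a round-up division macro is commonly written as below. This is an illustrative definition only: the exact spelling in util/macros.h may differ, and the blocks_needed helper is hypothetical.

```c
#include <assert.h>

/* Illustrative definition: divide A by B, rounding up. Arguments are
 * evaluated more than once, so avoid side effects in them. */
#define DIV_ROUND_UP(A, B) (((A) + (B) - 1) / (B))

/* Hypothetical usage: number of fixed-size blocks needed to hold 'size'. */
static inline unsigned
blocks_needed(unsigned size, unsigned block_size)
{
   assert(block_size > 0);
   return DIV_ROUND_UP(size, block_size);
}
```

This form can overflow when A + B - 1 exceeds the range of the type, which is one reason implementations sometimes prefer a remainder-based variant.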

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:10 +02:00
Xavier Bouchoux
405c7d7511 st/nine: Fix behaviour of D3DUSAGE_QUERY_POSTPIXELSHADER_BLENDING
Ignore D3DUSAGE_QUERY_POSTPIXELSHADER_BLENDING when
D3DUSAGE_RENDERTARGET is not specified.

This behaviour matches windows drivers.

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>
2015-04-29 08:28:10 +02:00
Xavier Bouchoux
d838fe8243 st/nine: Improve D3DQUERYTYPE_TIMESTAMP
Avoid blocking when retrieving D3DQUERYTYPE_TIMESTAMP result with
NineQuery9_GetData(), when D3DGETDATA_FLUSH is not specified.
This mimics Win behaviour and gives slightly better performance
for some games.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>
2015-04-29 08:28:10 +02:00
Xavier Bouchoux
851abb9145 st/nine: Fix D3DQUERYTYPE_TIMESTAMPFREQ query
D3DQUERYTYPE_TIMESTAMPFREQ is supposed to give the frequency
at which the clock of D3DQUERYTYPE_TIMESTAMP runs.

PIPE_QUERY_TIMESTAMP returns a value in ns, thus the corresponding
frequency is 1000000000.
PIPE_QUERY_TIMESTAMP_DISJOINT returns the frequency at which
PIPE_QUERY_TIMESTAMP value is updated. It isn't always
1000000000.
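
The relationship between a timestamp and its frequency can be sketched as below. The macro and helper names are hypothetical. The only fact taken from the text above is that the timestamp is in nanoseconds, which gives a frequency of 10^9 Hz.

```c
#include <assert.h>
#include <stdint.h>

/* A clock that counts nanoseconds ticks 10^9 times per second. */
#define TIMESTAMP_FREQ_HZ UINT64_C(1000000000)

/* Hypothetical helper: microseconds elapsed between two timestamps taken
 * from a clock running at freq_hz. The multiplication can overflow for
 * very long intervals; this sketch does not handle that. */
static uint64_t
elapsed_us(uint64_t start, uint64_t end, uint64_t freq_hz)
{
   assert(freq_hz > 0 && end >= start);
   return (end - start) * UINT64_C(1000000) / freq_hz;
}
```

An application that divides tick differences by the reported frequency gets wrong durations if that frequency does not match the unit of the timestamps, which is the bug this commit addresses.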

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>
2015-04-29 08:28:10 +02:00
Tiziano Bacocco
31bb4cd5c6 st/nine: Change x86 FPU Control word on device creation
As on wined3d and windows, when D3DCREATE_FPU_PRESERVE is not
specified, change the fpu control word to all exceptions masked,
single precision, round to nearest.
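
In terms of the x87 control word layout, that setting can be sketched as a pure bit manipulation. The macro and function names are hypothetical, and actually loading the value into the FPU (for example with fldcw) is left out, so this is not the patch's code.

```c
#include <assert.h>
#include <stdint.h>

#define X87_CW_EXCEPTION_MASKS 0x003fu /* bits 0-5: IM, DM, ZM, OM, UM, PM */
#define X87_CW_PRECISION       0x0300u /* bits 8-9: 00 = single precision */
#define X87_CW_ROUNDING        0x0c00u /* bits 10-11: 00 = round to nearest */

/* Hypothetical helper: computes the control word with all exceptions
 * masked, single precision and round to nearest, leaving the other bits
 * of 'cw' untouched. */
static uint16_t
fpu_cw_for_d3d9(uint16_t cw)
{
   cw &= (uint16_t)~(X87_CW_PRECISION | X87_CW_ROUNDING);
   cw |= X87_CW_EXCEPTION_MASKS;
   assert((cw & (X87_CW_PRECISION | X87_CW_ROUNDING)) == 0);
   return cw;
}
```

Starting from the usual power-on default of 0x037f (extended precision), this yields 0x007f.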

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>
2015-04-29 08:28:10 +02:00
Axel Davy
e7b1a1e57c st/nine: Do not advertise D3DDEVCAPS_TEXTURESYSTEMMEMORY
No major vendor advertises it, and we weren't supporting it.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:10 +02:00
Axel Davy
907f28f87e st/nine: Fix comment in update_viewport
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:10 +02:00
Axel Davy
6e825b69bd st/nine: Workaround barycentrics issue on some cards
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:10 +02:00
Xavier Bouchoux
f3fd06e94d st/nine: Clear struct pipe_blit_info before use.
render_condition_enable was uninitialized.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>
2015-04-29 08:28:10 +02:00
Patrick Rudolph
77a38d2088 st/nine: NineDevice9_Clear skip fastpath for bigger depth-buffers
This adds an additional check to make sure the bound depth-buffer doesn't
exceed the rendertarget size when clearing depth and color buffer at once.
D3D9 clears only a rectangle with the same dimensions as the viewport, leaving
other parts of the depth-buffer intact.
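
The added condition can be sketched as below. The struct and function names are hypothetical, and the real check in NineDevice9_Clear may involve further conditions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical surface dimensions. */
struct extent {
   unsigned width;
   unsigned height;
};

/* The combined color+depth fast path is only safe when the depth buffer
 * does not extend past the render target; otherwise the clear would touch
 * depth texels that D3D9 expects to stay intact. */
static bool
can_use_clear_fastpath(struct extent render_target, struct extent depth)
{
   return depth.width <= render_target.width &&
          depth.height <= render_target.height;
}
```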

This fixes failing WINE test visual.c:depth_buffer_test()

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:10 +02:00
Axel Davy
716bef2643 st/nine: Fix wrong assert in nine_shader
The sampler src index was wrong for texldl and texldd

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:10 +02:00
Axel Davy
8d3e063e68 st/nine: Handle special LIT case
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-04-29 08:28:10 +02:00
Jose Fonseca
114ac39a88 mesa: Fix glGetProgramiv(GL_ACTIVE_ATTRIBUTES).
It's returning random values, because RESOURCE_VAR() is casting
different objects into ir_variable pointers.

This updates _mesa_count_active_attribs to filter the resources with the
same logic used in _mesa_longest_attribute_name_length.
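
The filtering can be sketched as below. The enum, struct and function names are simplified stand-ins, and the real filter may check more than the resource type.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for program resource kinds. */
enum resource_type { RES_PROGRAM_INPUT, RES_UNIFORM, RES_PROGRAM_OUTPUT };

struct resource {
   enum resource_type type;
   const void *data;   /* type-specific payload; only valid for its type */
};

/* Counts only the resources that really are program inputs, instead of
 * treating every entry's payload as if it were an input variable. */
static unsigned
count_active_attribs(const struct resource *list, size_t count)
{
   unsigned active = 0;

   assert(list != NULL || count == 0);
   for (size_t i = 0; i < count; i++) {
      if (list[i].type == RES_PROGRAM_INPUT)
         active++;
   }
   return active;
}
```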

https://bugs.freedesktop.org/show_bug.cgi?id=90207

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-04-29 06:42:12 +01:00
Marc-André Lureau
c66c158e59 egl: misc fixes for EGL_MESA_image_dma_buf_export
Fix define and a function argument name introduced in commit
8f7338f284

Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-04-29 15:13:19 +10:00
Ilia Mirkin
6fe0d4f035 nvc0/ir: flush denorms to zero in non-compute shaders
This will set the FTZ flag (flush denorms to zero) on all opcodes that
can take it.

This resolves issues in Unigine Heaven 4.0 where there were solid-filled
boxes popping up.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89455
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-28 20:17:03 -04:00
Brian Paul
66985d2a6d meta: remove unneeded #include colortab.h
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-04-28 12:27:48 -06:00
Brian Paul
7e8de8219f mesa: remove unneeded #include colortab.h
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-04-28 12:27:48 -06:00
Brian Paul
7c1be009b7 mesa: remove unused options var in compile_shader()
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-04-28 12:27:48 -06:00
Brian Paul
3597a0de94 docs: more details about Viewperf 12 medical-01 test issues 2015-04-28 12:27:48 -06:00
Ilia Mirkin
e312a69958 nvc0: expose GLSL version 410
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-28 12:48:22 -04:00
Ilia Mirkin
b5947984cd st/mesa: allow glsl version up to 410, enable ARB_shader_precision
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-04-28 12:48:22 -04:00
Leo Liu
2d4a890c0b st/va: add h264 decoder level support
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-04-28 17:42:52 +02:00
Leo Liu
b2596efeb7 st/omx/dec: add h264 decoder level support
v2: use sps level idc as level to driver

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-04-28 17:42:45 +02:00
Leo Liu
1a5e2bb5ce vl: add level idc in sps
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-04-28 17:42:39 +02:00
Leo Liu
ef1ae703a9 st/omx/dec: separate create_video_codec to different codecs
v2: get frame size from port info

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-04-28 17:42:35 +02:00
Leo Liu
d043b51ba4 st/vdpau: add h264 decoder level support
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-04-28 17:42:29 +02:00
Leo Liu
4509fc8b94 gallium/util: get h264 level based on number of max references and resolution
v2: add comments for the limitation on the max number of references,
and what the calculation is based on

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-04-28 17:42:25 +02:00
Marek Olšák
6d05396b00 r600g,radeonsi: add a driver query returning GPU load
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-04-28 16:05:45 +02:00
Marek Olšák
0b8e73a6ae r600g,radeonsi: add driver queries for GPU temperature and shader+memory clocks
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-04-28 16:05:45 +02:00
Ilia Mirkin
9143940da2 gm107/ir: add lane/vertex count sysvals
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-27 21:25:29 -04:00
Ilia Mirkin
89e0b08794 gk110/ir: add support for writing per-patch and shader outputs
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-27 21:25:28 -04:00
Ilia Mirkin
52614f59b7 freedreno/a3xx: color masking works like a blend for some formats
When there is a colormask active that does not cover all the channels,
enable reading in the destination like with a combining blend
operation. This fixes fbo-blending-formats on a3xx.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-27 20:17:07 -04:00
Ilia Mirkin
9fc3f47278 freedreno/a3xx: add support for S8 and Z32F_S8
Enables ARB_depth_buffer_float. There is no sampling support for
interleaved Z32F_S8, so we store the two textures separately, one as
Z32F, the other as S8. As a result, we need a lot of additional logic
for restores and transfers.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-27 20:17:07 -04:00
Ilia Mirkin
1571da6ac3 freedreno/a3xx: add Z32F support
32-bit depth buffers are stored as unorm, and thus need special handling
when moving to and from gmem. They are copied into gmem by writing
depth, and resolved from gmem using a special resolve bit which
apparently float-ifies the data.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-27 20:17:07 -04:00
Ilia Mirkin
0a4cb00c77 freedreno: add fd_transfer to wrap around pipe_transfer
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-27 20:17:07 -04:00
Ilia Mirkin
f5c1101996 freedreno/a3xx: add support for disabling depth clipping
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-27 20:17:07 -04:00
Kenneth Graunke
dffc1a0ae3 i965/vs: Remove unnecessary NULL check on generate_code() result.
Code generation is not allowed to fail for any reason - in fact,
fs_generator has no mechanism for failing.  The visitor is responsible
for that.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-27 14:59:06 -07:00
Timothy Arceri
d795cc6508 glsl: fix packing support for arrays of doubles
Broken in commit f00c5f85b8, which added
support for multidimensional arrays.

Reviewed-by: Ilia Mirkin <imirkin at alum.mit.edu>
2015-04-28 07:49:32 +10:00
Matt Turner
ff6ee39c19 i965: Enable ARB_gpu_shader5 on Gen8+.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-27 14:44:32 -07:00
Matt Turner
0c06d019bc i965/fs: Fix code emission for imul_high in NIR.
Copy over from brw_fs_visitor.cpp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-27 14:44:32 -07:00
Matt Turner
ecf428aa59 i965/fs: Fix stride for multiply in macro.
We have to use W/UW type for src1 of the multiply in the MUL/MACH macro,
but in order to read the low 16-bits of each 32-bit integer, we need to
set the appropriate stride.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-27 14:44:32 -07:00
Matt Turner
b3e29a2022 Revert "i965/fs: Allow SIMD16 borrow/carry/64-bit multiply on Gen > 7."
This reverts commit 9f5e5bd34d.

I have no idea what made me believe these didn't apply to Gen > 7. They
do, and without them we generate bad code that causes failures on Gen 8.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-27 14:44:32 -07:00
Olivier Pena
b94a4e8498 scons: Support LLVM 3.5 and 3.6 on windows.
llvm/Config/llvm-config.h is parsed instead of llvm/Config/config.h for
detecting LLVM version
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-June/073707.html).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-04-27 21:47:27 +01:00
Ilia Mirkin
dfb0b36e8f mesa: fix up GLSL version when computing GL version
In some situations it is convenient for a driver to expose a higher GLSL
version while some extensions are still incomplete. However in that
situation, it would report a GLSL version that was higher than the GL
version. Avoid that situation by limiting the GLSL version to the GL
version.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-04-27 16:03:16 -04:00
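The clamping described in the commit above might look roughly like the following C sketch. Both function names are hypothetical, the GL version is assumed to be encoded as major * 10 + minor, and this is not the code Mesa actually uses.

```c
/* Hypothetical helper: highest GLSL version defined for a desktop GL
 * version encoded as major * 10 + minor (e.g. 33 for GL 3.3).
 * Assumes gl_version >= 20. */
static unsigned max_glsl_for_gl(unsigned gl_version)
{
    switch (gl_version) {
    case 20: return 110;
    case 21: return 120;
    case 30: return 130;
    case 31: return 140;
    case 32: return 150;
    default: return gl_version * 10;   /* GL 3.3 and later: 330, 400, 410, ... */
    }
}

/* Never report a GLSL version above what the GL version allows. */
static unsigned clamp_glsl_version(unsigned glsl_version, unsigned gl_version)
{
    const unsigned limit = max_glsl_for_gl(gl_version);
    return glsl_version < limit ? glsl_version : limit;
}
```

With this sketch, a driver advertising GLSL 410 while only reaching GL 3.3 would report 330.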
Roland Scheidegger
7c3d1c132e softpipe: fix another stencil-as-float issue
Hopefully this is the last one now (for texture X32_S8X24_UINT views).
+4 piglits.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90167

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-04-27 18:51:30 +02:00
Ilia Mirkin
dfb274af4c mesa: the function name appears to have a gl prefix already
Currently we're producing errors like

User error: GL_INVALID_OPERATION in glglDeleteProgramsARB(invalid call)

And noop_warn appears to be called with the full function name. Don't
prepend a gl prefix.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-04-27 12:06:54 -04:00
Zoë Blade
05e7f7f438 Fix a few typos
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-04-27 17:28:29 +03:00
Francisco Jerez
e17dc004fd i965/gen8: Factor out texture surface state set-up from gen8_update_texture_surface().
This moves most of the surface state set-up logic that can be shared
between textures and shader images to a separate function.
2015-04-27 17:28:29 +03:00
Francisco Jerez
6f26ffaf66 i965/gen7: Factor out texture surface state set-up from gen7_update_texture_surface().
This moves most of the surface state set-up logic that can be shared
between textures and shader images to a separate function.
2015-04-27 17:28:28 +03:00
Francisco Jerez
e94c80c08b i965: Add helper functions to calculate the slice pitch of an array or 3D miptree. 2015-04-27 17:28:28 +03:00
Olivier Pena
f9965347dc scons: add target osmesa using gallium state tracker.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-04-27 15:18:36 +01:00
Marek Olšák
db2415189a radeonsi: set an optimal value for DB_Z_INFO.ZRANGE_PRECISION
Required because of a VI hw bug.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-04-27 15:57:07 +02:00
Marek Olšák
bed98eef9a radeonsi: remove deprecated and useless registers
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-04-27 15:56:27 +02:00
Marek Olšák
393b0e0531 radeonsi: remove useless includes
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-04-27 15:56:27 +02:00
Marek Olšák
d8269be1ce gallium/radeon: print winsys info with R600_DEBUG=info
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-04-27 15:56:27 +02:00
Marek Olšák
96bbdc5188 winsys/radeon: make radeon_bo_vtbl static
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-04-27 15:56:27 +02:00
Timothy Arceri
ca9e280d89 glsl: replace while loop with without_array function
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-27 21:31:08 +10:00
Timothy Arceri
f00c5f85b8 glsl: support packing of arrays of arrays
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-27 21:31:01 +10:00
Timothy Arceri
fda5f7bb2f glsl: add arrays of arrays support to without_array function
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-27 21:30:54 +10:00
Martin Peres
9ea38ee96d docs/GL3: started adding support for shader_image_size
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-27 10:13:49 +03:00
Gediminas Jakutis
6fc0cd2f52 gallium/hud: add more options to customize HUD panes
Extends the syntax of GALLIUM_HUD environment variable to:
- Add options to set the size and exact location of each pane.
- Add an option to limit the maximum allowed value of the X axis on a
  pane, clamping the graph down to not go above this value.
- Add an option to auto-adjust the value of the Y axis down to the
  highest value still visible on the graph.

v2:
- Make the patch simpler and smaller.
- With dynamic auto-adjusting on, adjust the Y axis once per pane
  update instead of updating once every several seconds.
- No longer mishandle pane height when having more than one graph per
  pane.
2015-04-26 00:40:08 +02:00
Kenneth Graunke
30c8d8a831 i965: Fill out the rest of brw_debug_recompile_sampler_key().
This makes INTEL_DEBUG=perf report shader recompiles due to CMS vs.
UMS/IMS differences and Sandybridge textureGather workarounds.

Previously, we just flagged them as "Something else".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-25 09:48:59 -07:00
Kenneth Graunke
19165e3b6e i965: Disassemble sampler message names on Gen5+.
Previously, sampler messages were decoded as

sampler (1, 0, 2, 2) mlen 6 rlen 8              { align1 1H };

I don't know how much time we've collectively wasted trying to read this
format.  I can never recall which number is the surface index, sampler
index, message type, or...whatever that other number is.  Figuring out
the message name from the numerical code is also painful.

Now they decode as:

sampler sample_l SIMD16 Surface = 1 Sampler = 0 mlen 6 rlen 8 { align1 1H };

This is easy to read at a glance, and matches the format I used for
render target formats.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-04-25 09:47:29 -07:00
Matt Turner
7f5a8ac155 i965/fs: Disallow constant propagation into POW on Gen 6.
Fixes assertion failures in three piglit tests on Gen 6 since commit
0087cf23e.
2015-04-25 02:15:35 -07:00
Ilia Mirkin
67ba388dc0 mesa: add support for exposing up to GL4.2
Add the 4.0/4.1/4.2 extensions lists to compute_version. A couple of
extensions aren't in mesa yet, so those are marked with 0 until they
become supported.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-24 21:55:14 -04:00
Matt Turner
11d2305d7f i965/fs: Add missing pixel_x/y to brw_instruction_name().
Forgotten in commit 529064f6.
2015-04-24 16:25:02 -07:00
Matt Turner
51c61fff8f i965/fs: Don't constant propagate into integer math instructions.
Constant combining won't promote non-floats, so this isn't safe.

Fixes regressions since commit 0087cf23e.
2015-04-24 16:25:02 -07:00
Emil Velikov
e170185896 docs: add news item and link release notes for mesa 10.5.4
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-24 23:00:14 +01:00
Emil Velikov
196cf8db65 docs: Add sha256 sums for the 10.5.4 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit adb47b5b27)
2015-04-24 23:00:14 +01:00
Emil Velikov
5b39cb4736 Add release notes for the 10.5.4 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit ea0d1f575c)
2015-04-24 23:00:14 +01:00
Brian Paul
13b2e6a520 mesa: put more info in glTexImage GL_OUT_OF_MEMORY error message
Give the user some idea about the size of the texture which caused
the GL_OUT_OF_MEMORY error.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-24 14:48:54 -06:00
Matt Turner
0087cf23e8 i965/fs: Allow 2-src math instructions to have immediate src1.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-24 11:39:01 -07:00
Matt Turner
f251ea393b nir: Transform pow(x, 4) into (x*x)*(x*x). 2015-04-24 11:39:01 -07:00
Matt Turner
9b577d5702 glsl: Transform pow(x, 4) into (x*x)*(x*x).
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-24 11:39:01 -07:00
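The two commits above perform this rewrite inside the compilers. Written as plain C instead of a compiler pass, the transformed expression amounts to the sketch below; the function name is made up for illustration.

```c
/* Compute x^4 with two multiplies instead of a pow() call.
 * Illustrative only; the commits rewrite IR, not C source. */
static float pow4(float x)
{
    const float x2 = x * x;   /* first multiply: x^2 */
    return x2 * x2;           /* second multiply: (x*x)*(x*x) */
}
```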
Tapani Pälli
18f44d3030 mesa: fix glGetActiveUniformsiv regression
Commit 7519ddb caused a regression in glGetActiveUniformsiv.
This patch adds back the validation loop over all given uniforms
before writing any values. Leaving params untouched in case of
errors is tested by the conformance suite.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90149
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-24 13:42:24 +03:00
Tapani Pälli
a563689a40 mesa: refactor active attrib queries for glGetProgramiv
Main motivation here is to get rid of iterating IR and
encapsulate queries within program resources.
No functional changes.

Piglit tests calling the modified functionality:

   - gl-get-active-attrib-returns-all-inputs
   - glsl-1.50-get-active-attrib-array
   - getactiveattrib

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-24 13:41:54 +03:00
Jason Ekstrand
d5a15a89f0 i965: Add an INTEL_DEBUG=spill option to test spilling
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-23 18:08:21 -07:00
Jason Ekstrand
bf55096207 i965/debug: Use the ull specifier for DEBUG enum defines
The INTEL_DEBUG variable is a uint64_t, so an enum value higher than
32 bits needs the ull suffix.  We might as well use it for all of them.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-23 18:08:20 -07:00
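A minimal sketch of why the suffix matters, using made-up flag names (only the `1ull << n` pattern comes from the commit above). With a plain `int` constant, shifting by 32 or more bits is undefined behaviour in C, so a flag above bit 31 cannot be written as `1 << n`.

```c
#include <stdint.h>

/* Hypothetical flag names for illustration. */
#define DEBUG_EXAMPLE_LOW   (1ull << 3)
#define DEBUG_EXAMPLE_HIGH  (1ull << 40)   /* not representable as (1 << 40) */

/* Test whether a flag is set in a 64-bit debug mask. */
static int debug_flag_set(uint64_t debug_mask, uint64_t flag)
{
    return (debug_mask & flag) != 0;
}
```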
Kenneth Graunke
5957da1edb i965: Disallow linear blits that are not cacheline aligned.
The BLT engine on Gen8+ requires linear surfaces to be cacheline
aligned.  This restriction was added as part of converting the BLT to
use 48-bit addressing.

The main user, intel_emit_linear_blit, now handles this properly.
But we might also have linear miptrees; just refuse to blit those.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88521
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2015-04-23 14:16:57 -07:00
Kenneth Graunke
8c17d53823 i965: Make intel_emit_linear_blit handle Gen8+ alignment restrictions.
The BLT engine on Gen8+ requires linear surfaces to be cacheline
aligned.  This restriction was added as part of converting the BLT to
use 48-bit addressing.

intel_emit_linear_blit needs to handle blits that are not cacheline
aligned, as we use it for arbitrary glBufferSubData calls and subrange
mappings.

Since intel_emit_linear_blit uses 1 byte per pixel, we can use the src/dst
pixel X offset field to represent the unaligned portion, and subtract
that from the address so it's cacheline aligned.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88521
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2015-04-23 14:05:41 -07:00
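The address adjustment described in the commit above could be sketched as follows. The struct, the function name and the 64-byte cacheline size are assumptions made for this example; the actual driver code programs the result into blitter commands.

```c
#include <assert.h>
#include <stdint.h>

#define CACHELINE_SIZE 64u   /* assumption: 64-byte cachelines */

/* Hypothetical result type: an aligned base plus the leftover offset. */
struct linear_blit_addr {
    uint64_t base;       /* cacheline-aligned address */
    uint32_t x_offset;   /* unaligned remainder; bytes == pixels at 1 byte per pixel */
};

/* Split an arbitrary byte address into an aligned base and an X offset. */
static struct linear_blit_addr split_blit_address(uint64_t address)
{
    struct linear_blit_addr out;
    out.x_offset = (uint32_t)(address & (CACHELINE_SIZE - 1));
    out.base = address - out.x_offset;
    assert(out.base % CACHELINE_SIZE == 0);
    return out;
}
```

Because the blit uses one byte per pixel, the remainder can be expressed directly as a pixel X offset, and `base + x_offset` always reproduces the original address.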
Pali Rohár
29f0f976bd mapi: Adding missing string.h include.
glapi_entrypoint.c calls memcpy() but does not include string.h, so
compilation can fail with "error: implicit declaration of function
'memcpy'".

Signed-off-by: Jose Fonseca <jfonseca@vmware.com>
2015-04-23 22:02:07 +01:00
Jose Fonseca
525be9c079 os/os_memory_aligned.h: Handle integer overflow.
This code is only used when our memory debugging wrappers are enabled,
as we use the C runtime functions directly elsewhere.

Tested llvmpipe on Windows w/ memory debugging enabled.

VMware PR894263.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-04-23 21:59:43 +01:00
Roland Scheidegger
f2a7fd9943 draw: fix prim ids when there's no gs
We were resetting the prim id count for each run of the prim assembler,
hence this only worked when the draw calls were very small (the exact limit
depending on the vertex size), since larger draw calls get split up.
So, do the same as we do already if there's a gs, reset it to zero explicitly
for every new instance (this possibly could use the same variable but that
isn't doable without some heavy refactoring and I'm not sure it makes sense).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90130.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

CC: <mesa-stable@lists.freedesktop.org>
2015-04-23 18:14:22 +02:00
Marek Olšák
ecc7f2ed91 gallium/radeon: don't crash when getting out-of-bounds TEMP references
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-04-23 16:14:39 +02:00
Jason Ekstrand
125574d1ef nir/lower_source_mods: Don't propagate register sources
The nir_lower_source_mods pass does a weak form of copy propagation to
clean up all of the mov-with-negate's that get generated.  However, we
weren't properly checking that the sources were SSA and so we could end up
moving a register read which is not, in general, valid.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:41 -07:00
Jason Ekstrand
296131f467 nir: Rewrite instr_rewrite_src
The old code wasn't correctly handling the case where the new value of the
source contains an indirect.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:41 -07:00
Jason Ekstrand
d61bd972d8 nir/locals_to_regs: Handle indirect accesses of length-1 arrays
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:41 -07:00
Jason Ekstrand
06f3c98b9d nir/locals_to_regs: Initialize registers with constant initializers
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:41 -07:00
Jason Ekstrand
4e9b376594 nir/locals_to_regs: Pass around the nir_shader rather than a void * mem_ctx
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:41 -07:00
Jason Ekstrand
f50f59d3d9 nir: Add a simple growing array data structure
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:41 -07:00
Jason Ekstrand
8b900e7405 nir/types: Make glsl_get_length smarter
Previously, this function returned the number of elements for structures
and arrays and 0 for everything else.  In NIR, this is almost never what
you want because we also treat matrices as arrays so you have to
special-case constantly.  This commit makes glsl_get_length treat matrices
as an array of columns by returning the number of columns instead of 0.

This also fixes a bug in locals_to_regs caused by not checking for the
matrix case in one place.

v2: Only special-case for matrices and return a length of 0 for vectors as
    we did before.  This was needed to not break the TGSI-based drivers and
    doesn't really affect NIR at the moment.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Tested-by: Rob Clark <robclark@freedesktop.org>
2015-04-22 18:10:40 -07:00
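The behaviour described in the glsl_get_length commit above can be illustrated with a simplified stand-in type. Everything named `example_*` or `KIND_*` below is hypothetical and much smaller than Mesa's real glsl_type.

```c
/* Hypothetical, simplified stand-in for a GLSL type. */
enum example_kind {
    KIND_SCALAR, KIND_VECTOR, KIND_MATRIX, KIND_ARRAY, KIND_STRUCT
};

struct example_type {
    enum example_kind kind;
    unsigned columns;   /* matrices: number of column vectors */
    unsigned length;    /* arrays: element count; structs: field count */
};

static unsigned example_get_length(const struct example_type *type)
{
    switch (type->kind) {
    case KIND_MATRIX:
        return type->columns;   /* a matrix is treated as an array of columns */
    case KIND_ARRAY:
    case KIND_STRUCT:
        return type->length;
    default:
        return 0;               /* scalars and vectors still return 0 (see v2) */
    }
}
```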
Jason Ekstrand
7e1d21edbf nir: Move get_const_initializer_load from vars_to_ssa to NIR core
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:40 -07:00
Jason Ekstrand
ba88760202 nir/lower_vars_to_ssa: Pass around the nir_shader instead of a void mem_ctx
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:40 -07:00
Jason Ekstrand
c68364ac34 i965/nir: Use the correct offsets when handling register indirects
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:40 -07:00
Jason Ekstrand
e79120afdc nir/print: Print the closing paren on load_const instructions
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:40 -07:00
Jason Ekstrand
02f03fc0f1 nir/tex: Use the correct return size for query_levels and lod
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:40 -07:00
Jason Ekstrand
94669cb534 nir: Refactor tex_instr_dest_size to use a switch statement
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:40 -07:00
Jason Ekstrand
73cc76362d nir/lower_vars_to_ssa: Actually look for indirects when determining aliasing
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:39 -07:00
Dave Airlie
734bceed86 docs: mark off texture_stencil8 (v2.1)
copy drivers from the stencil_texturing list,
softpipe is definitely broken for stencil texturing
since it uses float, but I'll look at that later.

v2.1: update relnotes

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-04-23 10:11:27 +10:00
Dave Airlie
6cc49c4ce1 st/mesa: add ARB_texture_stencil8 support (v4)
If we support stencil texturing, enable texture_stencil8.
There is no requirement to support native S8 for this;
the texture can be converted to x24s8 fine.

v2: fold fixes from Marek in:
   a) put S8 last in the list
   b) fix renderable to always test for d/s renderable
    fixup the texture case to use a stencil only format
    for picking the format for the texture view.
v3: hit fallback for getteximage
v4: put s8 back in front, it shouldn't get picked now (Ilia)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-04-23 10:11:27 +10:00
Dave Airlie
782e71cc07 mesa: finish implementing ARB_texture_stencil8 (v5)
Parts of this were implemented previously, so finish it off.

v2: fix getteximage falling into the integer check
    add fixes for the FBO paths, (fbo-stencil8 test).

v3: fix getteximage path harder.
v4: remove swapbytes from getteximage path (Ilia)
v5: brown paper bag the swapbytes removal. (Ilia)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-04-23 10:11:26 +10:00
Jason Ekstrand
1948880720 mesa: remove the gl_sl_pragmas structure
This code was added by Brian Paul in 2009 but, as far as Matt and I can
tell, it's been dead ever since the new GLSL compiler was added.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-04-22 16:00:35 -07:00
Jason Ekstrand
ae3870df70 i965: Add a brw_compiler structure and store the register sets in it
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:34 -07:00
Jason Ekstrand
a85c4c9b3f i965: Rename brw_compile to brw_codegen
This name better matches what it's actually used for.  The patch was
generated with the following command:

for file in *; do
sed -i -e s/brw_compile/brw_codegen/g $file
done

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:34 -07:00
Jason Ekstrand
cfc56fcee3 i965: Use device_info instead of the context for computing vue maps
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:34 -07:00
Jason Ekstrand
02ccb19495 i965: Use device_info instead of the context in instruction scheduling
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:34 -07:00
Jason Ekstrand
28e9601d0e i965: Add a devinfo field to backend_visitor and use it for gen checks
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:34 -07:00
Jason Ekstrand
73bf8f3d6b i965: Remove remaining uses of ctx->Const.UniformBooleanTrue in visitors
Since commit 2881b123, we have used 0/~0 for representing booleans on all
gens.  However, we still had a bunch of places in the visitor code where we
were still referring to ctx->Const.UniformBooleanTrue.  Since this is
always ~0, we can just remove them.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:33 -07:00
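As a loose illustration of the 0/~0 boolean convention mentioned in the commit above (not code from the i965 visitors), an all-ones "true" lets a boolean be used directly as a bit mask. The function names here are invented for the example.

```c
#include <stdint.h>

/* Produce a boolean in the 0 / ~0 representation. */
static uint32_t bool_from_less_than(int a, int b)
{
    return a < b ? ~0u : 0u;
}

/* Branch-free select: only correct because "true" has every bit set. */
static uint32_t select_u32(uint32_t cond, uint32_t if_true, uint32_t if_false)
{
    return (cond & if_true) | (~cond & if_false);
}
```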
Jason Ekstrand
2bf207b473 i965/vec4: Add a devinfo field to the generator and use it for gen checks
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:33 -07:00
Jason Ekstrand
5bda1ff1be i965/fs: Add a devinfo field to the generator and use it for gen checks
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:33 -07:00
Jason Ekstrand
38dc2ddab4 i965/device_info: Add a supports_simd16_3src flag
This also involves moving revision checking to screen creation time and
passing that into brw_get_device_info so that we can get the right
device_info for early versions of SKL.  Since the only place we used
revision was to check for SIMD16 3-src instruction support, it's safe to
remove the revision field from brw_context.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:33 -07:00
Jason Ekstrand
85db2aca52 i965/device_info: Add a HSW_FEATURES macro
It's basically just a copy of GEN7_FEATURES only with is_haswell set

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:33 -07:00
Jason Ekstrand
9c89e47806 i965: Make the annotation code take a device_info instead of a context
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:33 -07:00
Jason Ekstrand
5cb91db619 i965/fs: Remove the GL context from the generator
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:33 -07:00
Jason Ekstrand
61c4702489 i965: Remove the context field from brw_compiler
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:32 -07:00
Jason Ekstrand
639314d40e i965: Make the disassembler take a device_info instead of a context
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:32 -07:00
Jason Ekstrand
c3e5f32840 i965: Make instruction compaction take a device_info instead of a context
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:32 -07:00
Jason Ekstrand
4e9c79c847 i965: Make the brw_inst helpers take a device_info instead of a context
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:32 -07:00
Jason Ekstrand
6219a8f098 i965/eu: Add a devinfo parameter to brw_compile
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:32 -07:00
Jason Ekstrand
a921475c22 i965: Do better fake context setup in unit tests
In future tests, we will start relying on devinfo and not just brw in the
compiler.  Changing this now keeps these tests from failing in the future.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:32 -07:00
Jason Ekstrand
ceb6e5eebe i965: Remove the context parameter from brw_texture_offset
It wasn't really being used anyway.  We used it to assert that gpu_shader5
is supported in the back-end but that should be caught by the front-end.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-22 16:00:31 -07:00
Dave Airlie
8a41cd2407 softpipe: fix stencil write to use an integer value
This fixes a number of regressions since commit 61393bdcdc
("u_tile: fix stencil texturing tests under softpipe").

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89960
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-04-23 08:32:30 +10:00
Anuj Phogat
2c08e3b8ea mesa: Fix typo in a comment
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-04-22 15:24:43 -07:00
Rob Clark
cb24d3b7ad freedreno: misc minor cleanups
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-22 13:20:28 -04:00
Rob Clark
1b58d8c2bf freedreno/a4xx: (partial) gl_FragCoord.zw
The bit to enable .z is still commented out, as it is triggering gpu
hangs in 0ad.  But at least gl_FragCoord.w works now, and we know what
bits we are *supposed* to set for .z (with that uncommented all piglit
fragcoord tests are passing).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-22 13:20:28 -04:00
Rob Clark
a869183123 freedreno/a4xx: primitive-restart
This was the missing bit to get dolphin-emu working on a4xx.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-22 13:20:28 -04:00
Rob Clark
632ea2a113 freedreno/nir: sysval fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-22 13:20:28 -04:00
Rob Clark
13527df143 freedreno/a4xx: wire up integer texture sampling
Similar to a3xx, the compiler needs to know the return type of the sam,
etc, instructions.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-22 13:20:28 -04:00
Rob Clark
48a651e98c freedreno/a4xx: formats updates/fixes
Update formats table with new formats that Ilia has figured out, and fix
sampling from srgb texture and integer vbo's.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-22 13:20:28 -04:00
Rob Clark
21ceedfd8b freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-22 13:20:27 -04:00
Emil Velikov
9450bd56be gallium/targets/d3dadapter9: drop the libdrm prefix for drm.h
The path is provided by libdrm.pc and already used appropriately by
the build system.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-22 16:03:01 +01:00
Brian Paul
02e93be55e cso: minor comment fix 2015-04-22 08:58:05 -06:00
Brian Paul
31667e6237 glsl: rewrite glsl_type::record_key_hash() to avoid buffer overflow
This should be more efficient than the previous snprintf() solution.
But more importantly, it avoids a buffer overflow bug that could result
in crashes or unpredictable results when processing very large interface
blocks.

For the app in question, key->length = 103 for some interfaces.  The check
if size >= sizeof(hash_key) was insufficient to prevent overflows of the
hash_key[128] array because it didn't account for the terminating zero.
In this case, this caused the call to hash_table_string_hash() to return
different results for identical inputs, and then shader linking failed.

This new solution also takes all structure fields into account instead
of just the first 15 when sizeof(pointer)==8.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-22 08:58:05 -06:00
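The commit above replaces a string-building hash with one that needs no fixed-size buffer. The sketch below shows the general idea using an FNV-1a style hash over every field; this is a swapped-in technique chosen for illustration, not necessarily the hash Mesa uses, and `example_field` is a hypothetical type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record field; a real key would hold type pointers and names. */
struct example_field {
    const void *type;
};

/* FNV-1a style hash over every field, with no intermediate string buffer,
 * so there is nothing to overflow and no limit on the number of fields. */
static uint32_t example_record_hash(const struct example_field *fields,
                                    size_t count)
{
    uint32_t hash = 2166136261u;            /* FNV offset basis */

    for (size_t i = 0; i < count; i++) {
        uintptr_t value = (uintptr_t)fields[i].type;
        for (size_t byte = 0; byte < sizeof value; byte++) {
            hash ^= (uint32_t)(value & 0xffu);
            hash *= 16777619u;              /* FNV prime */
            value >>= 8;
        }
    }
    return hash;
}
```

Identical field lists always hash to the same value here, which is the property the old fixed-buffer approach lost once the buffer overflowed.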
Brian Paul
bd4dbdfa51 mesa: add check for NV_texture_barrier in _mesa_TextureBarrierNV()
If an app called glTextureBarrierNV() without checking if the
extension was available, we'd crash with some gallium drivers
in st_TextureBarrier() because the pipe_context::texture_barrier()
pointer was NULL.

Generate GL_INVALID_OPERATION instead.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-04-22 08:58:05 -06:00
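The guard added by the commit above follows a common pattern: check for the optional feature before calling through a function pointer that may be NULL. A self-contained sketch, with hypothetical `example_*` types standing in for the GL context and pipe_context:

```c
#include <stddef.h>

#define EXAMPLE_NO_ERROR          0
#define EXAMPLE_INVALID_OPERATION 0x0502   /* numeric value of GL_INVALID_OPERATION */

/* Hypothetical stand-ins for the driver structures. */
struct example_pipe {
    void (*texture_barrier)(struct example_pipe *pipe);   /* may be NULL */
};

struct example_context {
    int has_texture_barrier;      /* extension advertised? */
    struct example_pipe *pipe;
};

/* Report an error instead of crashing when the hook is unavailable. */
static int example_texture_barrier(struct example_context *ctx)
{
    if (!ctx->has_texture_barrier || ctx->pipe->texture_barrier == NULL)
        return EXAMPLE_INVALID_OPERATION;

    ctx->pipe->texture_barrier(ctx->pipe);
    return EXAMPLE_NO_ERROR;
}
```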
Brian Paul
b260d9d91f main: silence missing return value warning in array_index_of_resource()
v2: return -1 instead of 0, per Emil Velikov.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-04-22 08:58:05 -06:00
Chih-Wei Huang
0b1823f5be android: re-build all mesa binaries properly
The clean steps ensure both 32-bit and 64-bit objects are cleaned.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-22 15:57:00 +01:00
Emil Velikov
36e59215ba android: xmlpool: cleanup the generation rules
 - Do not attempt to create the same folder twice: both dir $@ and
   PRIVATE_LOCALEDIR point to the same place.
 - Use @ and $(hide), for mkdir and python, to avoid spamming the
   output.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-22 15:56:32 +01:00
Chih-Wei Huang
98c8997fe5 android: xmlpool: Get rid of the last use of intermediates-dir-for
v2 [Emil Velikov]
 - Keep the PRIVATE_LOCALEDIR variable.
 - Do not use $(@D) but the more widespread $(dir $@)

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-22 15:54:51 +01:00
Chih-Wei Huang
5b8d61b0cc android: export the path of the generated headers
Modules that need the headers can get the path automatically.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-22 15:53:36 +01:00
Chih-Wei Huang
b0e33c2256 android: fix the building rules for Android 5.0
Android 5.0 allows modules to generate source into $OUT/gen, which will
then be copied into $OUT/obj and $OUT/obj_$(TARGET_2ND_ARCH) as necessary.
Modules will need to change calls to local-intermediates-dir into
local-generated-sources-dir.

The patch changes local-intermediates-dir into local-generated-sources-dir.
If the Android version is less than 5.0, fall back to local-intermediates-dir.

The patch also fixes the 64-bit building issue of Android 5.0.

v2 [Emil Velikov]
 - Keep the LOCAL_UNSTRIPPED_PATH variable.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
2015-04-22 15:53:35 +01:00
Chih-Wei Huang
671a550846 android: fix building issues of host binaries
Define _GNU_SOURCE to enable features (__USE_XOPEN2K and __USE_UNIX98)
required to build the host binaries.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-22 15:53:35 +01:00
Chih-Wei Huang
076edc6a03 android: fix a building error of libmesa_program
Add libmesa_glsl to LOCAL_STATIC_LIBRARIES to get
its exported include path (for nir_opcodes.h).

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
2015-04-22 15:53:35 +01:00
Emil Velikov
8098bf8e7a android: mesa: fold the ARCH_X86_HAVE_SSE4_1 conditionals
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-22 15:53:11 +01:00
Emil Velikov
669cfc267a android: mesa: fix the path of the SSE4_1 optimisations
Commit dd6f641303c(mesa: Build with subdir-objects.) removed the SRCDIR
variable, but forgot to update all references of it.

v2: Fix path - must be relative to LOCAL_PATH. (Chih-Wei)

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
2015-04-22 15:52:02 +01:00
Emil Velikov
64171c2d24 android: build the Mesa IR -> NIR translator
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-22 14:53:22 +01:00
Emil Velikov
c734261dcf android: nir: add build rules for nir_builder_opcodes.h
Missed out with commit 2a135c470e3(nir: Add an ALU op builder kind of
like ir_builder.h)

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-22 14:51:31 +01:00
Mauro Rossi
06619749a1 android: add initial NIR build
Required by the i965 driver.

v2:
 - Split out the nir_builder_opcodes.h rules.
 - Do not unconditionally hide the python command - use $(hide)
 - Use LOCAL_EXPORT_C_INCLUDE_DIRS to manage includes for the generated
sources.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
[Emil Velikov: Split from a larger commit, v2]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
2015-04-22 14:49:39 +01:00
Emil Velikov
618885f71f android: dri: link against libmesa_util
The dri modules depend on symbols provided by it.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
2015-04-22 14:37:32 +01:00
Emil Velikov
0afbd2df04 android: add $(mesa_top)/src/mesa/main to the includes list
Required by the format_{un,}pack rework. Otherwise the build will fail
to locate the respective headers - format_{un,}pack.h

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
2015-04-22 14:37:17 +01:00
Emil Velikov
39a175e0c7 android: add HAVE__BUILTIN_* and HAVE_FUNC_ATTRIBUTE_* defines
All of those are available on gcc 4.5 and later with the current android
build using gcc 4.7.

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
2015-04-22 14:36:34 +01:00
Emil Velikov
94cab35ee9 android: add gallium dirs to more places in the tree
Similar to e8c5cbfd921(mesa: Add gallium include dirs to more parts of
the tree.)

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
2015-04-22 14:36:25 +01:00
Emil Velikov
8d90bfb724 android: dri/common: conditionally include drm_cflags/set __NOT_HAVE_DRM_H
Otherwise we'll fail to find the drm.h header.

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-22 14:36:18 +01:00
Emil Velikov
2d06791f6f android: egl: add libsync_cflags to the build
... via local_shared_libraries. Otherwise the sync/sync.h header won't
be found.

Note: 10.5 and earlier will need similar change in st/egl.

v2: Append the library to the local_shared_libraries list. (Chih-Wei)

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
2015-04-22 14:36:05 +01:00
Mauro Rossi
5f7081eb90 android: mesa: generate the format_{un,}pack.[ch] sources
Missed out with commit e1fdcddafe9(mesa: Autogenerate format_unpack.c)

v2: Conditionally print the python commands - s/@/$(hide) / (Chih-Wei)

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
[Emil Velikov: Split out from a larger commit.]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-22 14:32:07 +01:00
Emil Velikov
6fb8017866 android: add $(mesa_top)/src include to the whole of mesa
Many parts of mesa already have the include, while others depend on it
but lack it. Add it once at the top makefile and be done with it.

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
2015-04-22 14:26:22 +01:00
Emil Velikov
ba3bc1eea2 android: use := operator for assigning MESA_VERSION
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-22 14:25:58 +01:00
Chih-Wei Huang
6c2c5f74a2 util: android: optimize the rules to generate format_srgb.c
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-22 14:24:22 +01:00
Chih-Wei Huang
63a76c15d8 android: simplify the subdirs including rules
Use the macro defined in the Android build system.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-22 14:24:13 +01:00
Emil Velikov
86919352e3 android: use LOCAL_SHARED_LIBRARIES over TARGET_OUT_HEADERS
... to manage the LIBDRM*_CFLAGS. The former is the recommended approach
by the Android build system developers while the latter has been
deprecated for quite some time.

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-22 14:23:28 +01:00
Emil Velikov
413bc0a618 ilo: remove unused include from Android.mk
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
2015-04-22 14:18:47 +01:00
Kenneth Graunke
00bf7d2e9c drirc: Add "Second Life" quirk (allow_glsl_extension_directive_midshader).
Appears to fix shader compilation.  Tested by starting the client,
dragging the "quality and speed" slider back and forth, and watching the
console output - instead of piles of "shader failed to compile", the CPU
seems to be busy compiling shaders.  I haven't actually tried to play.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69226
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71591
Cc: mesa-stable@lists.freedesktop.org
2015-04-21 22:16:30 -07:00
Kenneth Graunke
44461e7098 nir: Fix per-component negation in prog_to_nir's SWZ handling.
I missed the fact that the ARB_fragment_program SWZ instruction allows
per-component negation.  To fix this, move Abs/Negate handling into both
the simple case and the SWZ case's per-component loop.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90000
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-21 12:01:36 -07:00
Tapani Pälli
ed10f9cfad glsl: correct indentation of comment, Trivial.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
2015-04-21 20:11:43 +03:00
Matt Turner
529064f6a8 i965/fs: Combine pixel center calculation into one inst.
The X and Y values come interleaved in g1 (.4-.11 inclusive), so we can
calculate them together with a single add(32) instruction on some
platforms like Broadwell and newer or in SIMD8 elsewhere.

Note that I also moved the PIXEL_X/PIXEL_Y virtual opcodes from before
LINTERP to after it. That's because the writes_accumulator_implicitly()
function in backend_instruction tests for <= LINTERP for determining
whether the instruction indeed writes the accumulator implicitly. The
old FS_OPCODE_PIXEL_X/Y emitted ADD instructions, which did, but the new
opcodes just emit MOVs, which don't. It doesn't matter, since we don't
use these opcodes on Gen4/5 anymore, but in the case that we do...

On Broadwell:
total instructions in shared programs: 7192355 -> 7186224 (-0.09%)
instructions in affected programs:     1190700 -> 1184569 (-0.51%)
helped:                                6131

On Haswell:
total instructions in shared programs: 6155979 -> 6152800 (-0.05%)
instructions in affected programs:     652362 -> 649183 (-0.49%)
helped:                                3179

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-21 09:24:48 -07:00
Matt Turner
5af0604d52 i965/fs: Calculate delta_x and delta_y together.
This lets SIMD16 programs on G45 and Gen5 use the PLN instruction.

On Ironlake:

total instructions in shared programs: 5634757 -> 5518055 (-2.07%)
instructions in affected programs:     1745837 -> 1629135 (-6.68%)
helped:                                11439
HURT:                                  4

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-21 09:24:48 -07:00
Matt Turner
fde3100fe6 i965/fs: Emit ADDs for gl_FragCoord, not virtual opcodes.
These were used only on Gen4 and 5. emit_interpolation_setup_gen6() emits
ADDs directly. The virtual opcodes weren't providing anything useful.

I'm going to repurpose these opcodes, so deleting and readding them makes
it simpler to see what's going on.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-21 09:24:48 -07:00
Matt Turner
b14313e452 i965/fs: Manually set source regioning on PLN instructions.
Like LINE (commit 92346db0), src0 must have a scalar region. Setting
src1's region to <8,8,1> lets us pass a properly sized combined delta_xy
argument in a few commits without getting a bogus <16,16,1> region.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-21 09:24:48 -07:00
Matt Turner
a1dd2f0bb6 i965/fs: Add LINTERP's src0 to fs_inst::regs_read().
LINTERP's src0 is PLN's src1, and PLN's src1 reads exec_size / 4
registers.

Having that information lets us drop the delta_x/y special case code in
split_virtual_grfs().

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-21 09:24:48 -07:00
Matt Turner
8bc49f9536 i965/fs: Set compression only if writing two registers.
We don't want to set compression control on a SIMD16 instruction
operating on words or smaller.
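
A minimal sketch of the rule, assuming 32-byte registers and a hypothetical helper name: compression is requested only when one instruction's destination data spans more than a single register.

```c
#include <assert.h>

#define SKETCH_REG_SIZE 32u   /* bytes per register; an assumption of this sketch */

/* Nonzero when exec_size channels of type_size bytes each do not fit
 * in one register, i.e. the instruction writes two registers. */
static int
sketch_needs_compression(unsigned exec_size, unsigned type_size)
{
   assert(exec_size > 0 && type_size > 0);
   return exec_size * type_size > SKETCH_REG_SIZE;
}
```

Under these assumptions a SIMD16 instruction on dwords (16 * 4 = 64 bytes) is compressed, while a SIMD16 instruction on words (16 * 2 = 32 bytes) is not.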

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-21 09:24:48 -07:00
Matt Turner
b5a5b63548 i965/fs: Allow an execution size of 32.
In a few commits, we'll start emitting an add(32) instruction on some
platforms.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-21 09:24:48 -07:00
Matt Turner
45a1348612 i965: Make type_sz() return unsigned.
Avoids annoying warnings when comparing with sizeof(...).
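
A small sketch of why the return type matters, using a hypothetical `sketch_type_sz()` in place of the real function: sizeof yields an unsigned size_t, so an unsigned return value compares against it without a signed/unsigned mismatch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register types for this sketch. */
enum sketch_type { SKETCH_TYPE_W, SKETCH_TYPE_D, SKETCH_TYPE_Q };

/* Returning unsigned keeps comparisons against sizeof() (a size_t)
 * free of signed/unsigned mismatch warnings. */
static unsigned
sketch_type_sz(enum sketch_type type)
{
   switch (type) {
   case SKETCH_TYPE_W: return 2;
   case SKETCH_TYPE_D: return 4;
   case SKETCH_TYPE_Q: return 8;
   }
   assert(0 && "unknown type");
   return 0;
}

/* Both operands of the comparison are unsigned. */
static int
sketch_fits_in_dword(enum sketch_type type)
{
   return sketch_type_sz(type) <= sizeof(uint32_t);
}
```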

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-21 09:24:48 -07:00
Matt Turner
dd5c825053 i965: Replace guess_execution_size with something simpler.
guess_execution_size() does two things:

   1. Cope with small destination registers.
   2. Cope with SIMD8 vs SIMD16 mode.

This patch replaces the first with a simple if block in brw_set_dest: if
the destination register width is less than 8, you probably want the
execution size to match.  (I didn't put this in the 3src block because
it doesn't seem to matter.)

Since only the FS compiler cares about SIMD16 mode, it's easy to just
set the default execution size there.
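
The two pieces can be sketched roughly as follows. The enum encoding and the `sketch_*` names are hypothetical, chosen only so that narrower widths compare as smaller; this is not the brw_set_dest code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical encoding: log2 of the channel count. */
enum sketch_width {
   SKETCH_WIDTH_1 = 0,
   SKETCH_WIDTH_2 = 1,
   SKETCH_WIDTH_4 = 2,
   SKETCH_WIDTH_8 = 3,
   SKETCH_WIDTH_16 = 4
};

struct sketch_inst {
   enum sketch_width exec_size;
};

/* Narrow destinations pull the execution size down to match;
 * otherwise the caller-provided default (SIMD8 or SIMD16) stands. */
static void
sketch_set_dest(struct sketch_inst *inst, enum sketch_width dest_width,
                enum sketch_width default_exec_size)
{
   assert(inst != NULL);
   inst->exec_size = default_exec_size;
   if (dest_width < SKETCH_WIDTH_8)
      inst->exec_size = dest_width;
}
```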

This pattern has already been proven in the Gen8+ generator, but we
didn't port it back to the existing generator when we combined the two.

This is based on a patch from Ken from about a year ago. I've rebased it
and fixed a few bugs.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-21 09:24:48 -07:00
Matt Turner
3b4abdae04 i965/fs: Ensure delta_x/y are even-aligned registers on Gen6.
The BSpec says this applies to Gen6 as well.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-21 09:24:48 -07:00
Marius Predut
958b4965a2 main: remove __FUNCTION__ defined because it is obsolete
Consistently just use C99's __func__ everywhere.
No functional changes.
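
For illustration, a short sketch of using the C99 identifier directly. The helper names are hypothetical; `__func__` itself is standard C99 and needs no compatibility define.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* __func__ is a predefined identifier in C99: inside a function body
 * it names the enclosing function. */
static const char *
sketch_current_function(void)
{
   return __func__;
}

/* Typical use: prefix a diagnostic with the caller's name. */
static void
sketch_warn(const char *func, const char *msg)
{
   assert(func != NULL && msg != NULL);
   fprintf(stderr, "%s: %s\n", func, msg);
}
```

A caller would write `sketch_warn(__func__, "unexpected format")` where it previously passed `__FUNCTION__`.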

Signed-off-by: Marius Predut <marius.predut@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-21 13:05:30 +00:00
Marius Predut
d8b14a57a9 radeon: replace __FUNCTION__ with __func__
Consistently just use C99's __func__ everywhere.
No functional changes.

Signed-off-by: Marius Predut <marius.predut@intel.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-21 13:05:03 +00:00
Tapani Pälli
ad5ae271e7 mesa: add missing break in switch statement
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-By: Martin Peres <martin.peres@linux.intel.com>
2015-04-21 14:38:59 +03:00
Tapani Pälli
5917ca349a glsl: add fallthrough comment on switch
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-By: Martin Peres <martin.peres@linux.intel.com>
2015-04-21 14:38:10 +03:00
Tapani Pälli
054c7dc7eb mesa: fix UBO queries for active uniforms
Commit 34df5eb introduced a regression to GetActiveUniformBlockiv
when querying one of the following properties:

   GL_UNIFORM_BLOCK_ACTIVE_UNIFORMS
   GL_UNIFORM_BLOCK_ACTIVE_UNIFORM_INDICES

Implementation counted all uniforms in ubo directly while query should
check first if the uniform in question is _active_.
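
A minimal sketch of the corrected counting, assuming a simplified per-uniform record with an `active` flag. The struct and function names are hypothetical, not Mesa's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one uniform inside a block. */
struct sketch_uniform {
   const char *name;
   bool active;   /* referenced by the linked program */
};

/* Count only the active members, rather than every declared one. */
static unsigned
sketch_count_active(const struct sketch_uniform *uniforms, size_t count)
{
   unsigned active = 0;

   assert(uniforms != NULL || count == 0);
   for (size_t i = 0; i < count; i++) {
      if (uniforms[i].active)
         active++;
   }
   return active;
}
```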

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90109
Reviewed-By: Martin Peres <martin.peres@linux.intel.com>
2015-04-21 14:37:09 +03:00
Neil Roberts
7004632b28 i965/skl: Fix the qpitch value
On Skylake the qpitch value is uploaded as part of the surface state
so we don't need to add the extra rows that are done for other
generations. However for 3D textures it needs to be aligned to the
tile height and for depth/stencil textures it needs to be a multiple
of 8. Unlike previous generations the qpitch is measured as a multiple
of the block size for compressed surfaces. When the horizontal mipmap
layout is used for 1D textures then the qpitch is measured in pixels
instead of rows.

v2: Align the depth/stencil textures to a multiple of 8
v3: Add an assert that ALL_SLICES_AT_EACH_LOD is not used. Ignore the
    vertical alignment when picking the qpitch for 1D_ARRAY textures.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-04-20 22:03:21 -07:00
Neil Roberts
584f8e1ec5 i965/skl: Don't use ALL_SLICES_AT_EACH_LOD
The render surface state command for Skylake doesn't have the surface
array spacing bit so it's not possible to select this layout. I think
it was only used in order to make it pick a tightly-packed qpitch
value that doesn't include space for the mipmaps. However this won't
be necessary after the next patch because it will automatically pick a
packed qpitch value whenever first_level==last_level. It is better to
remove this layout entirely on Gen8+ because although it can
effectively be implemented with a small qpitch value when there are no
mipmaps it isn't possible to support the case where there are mipmaps
because in that case the layout is very different.

It could be good to make a similar change for Gen8 if we also change
the layouting code to pick the qpitch value in a similar way.

v2: Make the commit message and comments more convincing

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Ben Widawsky <ben@bwidawsk.net>
2015-04-20 22:03:21 -07:00
EdB
c1485f4b7d clover: remove pre llvm 3.5.0 compatibility code
Acked-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-04-20 18:11:04 +00:00
EdB
f39cd71618 clover: make llvm >= 3.5.0 and c++11 mandatory
Clover no longer compiles with llvm < 3.5.0 since e1d363b3.
e1d363b3 requires c++11, which the llvm 3.5.0 CXXFLAGS provide.
No one seems to have noticed, so it's now official.

Acked-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-04-20 18:10:29 +00:00
Dave Airlie
3282e57bcf docs/GL3.txt: update ARB_shader_subroutine status
Admit to having started working on this, I don't admit to ever finishing it

Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-04-20 18:43:36 +10:00
Nick Sarnie
645f77fe50 gallivm: Fix build against LLVM 3.7 SVN r235265
LLVM removed JITEmitDebugInfo from TargetOptions since they weren't used

v2: Be consistent with the LLVM version check (Aaron Watry)

Signed-off-by: Nick Sarnie <commendsarnex@gmail.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2015-04-20 13:34:45 +09:00
Ian Romanick
c015008ee0 doc: Add GL_ARB_shader_image_size dependency for OpenGL ES 3.1
imageSize() is in the GLSL ES 3.1 spec.  Trivial.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-20 08:30:28 +09:00
Ilia Mirkin
b2e871bd48 indices: fix provoking vertex for quads/quadstrips
This allows drivers to provide consistent flat shading for quads.
Otherwise a driver that only supported tris would have to force last
provoking vertex when drawing quads (and would have to say that quads
don't follow the provoking vertex convention).
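
One way to picture the idea is a quad-to-triangle split that keeps the provoking vertex in the matching slot of both triangles. This is a sketch, not the generated index translation code: it assumes the quad's provoking vertex is v0 under the first-vertex convention and v3 under the last-vertex convention, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Split quad (v0, v1, v2, v3) into two triangles, writing six indices
 * to out[]. Both triangles keep the quad's provoking vertex in the
 * slot the chosen convention reads (first or last), and both keep the
 * quad's winding order. */
static void
sketch_quad_to_tris(const unsigned quad[4], bool pv_first, unsigned out[6])
{
   assert(quad != NULL && out != NULL);
   if (pv_first) {
      /* v0 leads both triangles. */
      out[0] = quad[0]; out[1] = quad[1]; out[2] = quad[2];
      out[3] = quad[0]; out[4] = quad[2]; out[5] = quad[3];
   } else {
      /* v3 ends both triangles. */
      out[0] = quad[0]; out[1] = quad[1]; out[2] = quad[3];
      out[3] = quad[1]; out[4] = quad[2]; out[5] = quad[3];
   }
}
```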

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-04-18 18:27:22 -04:00
Ilia Mirkin
1cdb01d716 primconvert: select pv convention only from flatshade_first
This should match to how drivers program hardware. flatshade relates to
whether color inputs are interpolated, not the provoking vertex
convention.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-04-18 18:27:09 -04:00
Ilia Mirkin
0904774af1 freedreno/a3xx: enable polymode setting with non-fill modes
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-18 17:35:23 -04:00
Ilia Mirkin
6357601628 freedreno/a3xx: fix integer and 32-bit float border colors
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-18 17:35:23 -04:00
Ilia Mirkin
6895c3554e freedreno/a3xx: add support for float R/RG render targets
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-18 17:35:23 -04:00
Connor Abbott
1eac3ae1a6 mesa: add .mesa-install-links files to gitignore
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-17 15:24:14 -04:00
Connor Abbott
65f13352b9 mesa/main: add autogenerated format-info.c to gitignore
v2: move to right after format-info.h

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-17 15:24:06 -04:00
Kenneth Graunke
1d6829813e i965: Issue perf_debug messages for unsynchronized maps on !LLC systems.
We haven't implemented proper unsynchronized map support on !LLC systems
(pre-SNB, Atom).  MapBufferRange with GL_MAP_UNSYNCHRONIZED_BIT will
actually do a synchronized map, probably killing performance.

Also warn on BufferSubData, when we should be doing an unsynchronized
upload, but instead have to do a synchronous map.

v2: Only complain if the buffer is actually busy - we use unsynchronized
    maps internally for vertex upload and such, but expect those to not
    be busy.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Ben Widawsky <ben@bwidawsk.net>
2015-04-17 12:14:52 -07:00
Kenneth Graunke
cd9058fae3 i965: Make shader_time store names/ids instead of referencing shaders.
Jason noticed that shader_time was bumping the reference count on the
gl_shader_program and gl_program structures, in code called during
compilation.

Not only were these never unreferenced, but it meant fragment shaders
might be referenced twice (SIMD8 and SIMD16)...or only once.

We don't actually need the programs.  We just need their numeric ID and
their language (GLSL/ARB/FF) or KHR_debug label.  If there's a label, we
have to strdup it since the underlying program could be deleted.

To be fair, we're not exactly cleaning that up either, but we at least
ralloc it out of the shader_time arrays, so if we ever bother cleaning
those up, they'll go away properly.
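
A rough sketch of the record this describes, assuming a hypothetical entry type. It uses malloc/memcpy as a portable substitute for the ralloc-based string copy, so the ownership model differs from the driver's.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: keeps an ID and an owned copy of the label
 * instead of holding a reference to the program object. */
struct sketch_shader_time_entry {
   unsigned id;     /* 0 means "fixed function" */
   char *label;     /* owned copy, or NULL if the program had none */
};

/* Returns 0 on success, -1 if the label copy could not be allocated. */
static int
sketch_entry_init(struct sketch_shader_time_entry *entry,
                  unsigned id, const char *label)
{
   assert(entry != NULL);
   entry->id = id;
   entry->label = NULL;
   if (label != NULL) {
      size_t len = strlen(label) + 1;
      entry->label = malloc(len);
      if (entry->label == NULL)
         return -1;
      memcpy(entry->label, label, len);
   }
   return 0;
}

static void
sketch_entry_finish(struct sketch_shader_time_entry *entry)
{
   assert(entry != NULL);
   free(entry->label);
   entry->label = NULL;
}
```

Because the entry owns its copy, the label stays valid after the program that supplied it is deleted or relabelled.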

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-04-17 12:07:35 -07:00
Kenneth Graunke
eb6e770889 i965: Delete some unnecessary code in brw_report_shader_time().
It is true that a gl_shader_program with ID 0 will be a fixed-function
fragment program; a gl_program with ID 0 but NULL gl_shader_program
means that it's a fixed-function vertex shader.

But that's not terribly interesting or relevant to what we're doing.
We just need to know that ID 0 means "fixed function".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-04-17 12:07:33 -07:00
Kenneth Graunke
e9efd667de i965: Make shader_time use 0 instead of -1 for "no meaningful ID".
0 is not a valid GLSL shader or ARB program ID.  For some reason,
shader_time used -1 instead...so we had code to detect 0, then override
it to -1.

We can just delete that.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-04-17 12:06:08 -07:00
Tobias Nygren
cfab4ea9c6 adjust a couple of ifdefs to handle NetBSD correctly
Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tobias Nygren <tnn@NetBSD.org>
2015-04-17 12:04:48 -07:00
Tobias Nygren
52e4e4712f configure.ac: fix bashism
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tobias Nygren <tnn@NetBSD.org>
2015-04-17 12:04:21 -07:00
Anuj Phogat
79010c9a53 i965: Render R16G16B16X16 as R16G16B16A16
This enables using _mesa_meta_pbo_TexSubImage() to upload data
to R16G16B16X16 texture. Earlier it fell back to slower paths.

Jenkins run shows no piglit regressions.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-17 11:48:38 -07:00
Anuj Phogat
c6b0922c31 i965: Update the comment about platforms supporting blorp
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-04-17 11:48:38 -07:00
Matt Turner
4dacb212fd nir: Allow abs/neg in select peephole pass.
total instructions in shared programs: 4314531 -> 4308949 (-0.13%)
instructions in affected programs:     429085 -> 423503 (-1.30%)
helped:                                1680
HURT:                                  0
GAINED:                                0
LOST:                                  111

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-17 11:01:34 -07:00
Jason Ekstrand
472ef9a02f i965/fs: Change SEL and MOV types as needed to propagate source modifiers
SEL and MOV instructions, as long as they don't have source modifiers, are
just copying bits around.  This commit adds support to copy propagation to
switch the type of a SEL or MOV instruction as needed so that it can
propagate source modifiers.  This is needed because NIR generates integer
SEL and MOV instructions whenever it doesn't know what else to generate.

shader-db results with NIR:
total FS instructions in shared programs: 4360910 -> 4360186 (-0.02%)
FS instructions in affected programs:     59094 -> 58370 (-1.23%)
helped:                                   341
HURT:                                     0
GAINED:                                   2
LOST:                                     0

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-17 11:01:34 -07:00
Jason Ekstrand
bb99a58e77 i965/fs: Use the source type when looking for UD negations in copy prop
There can be problems with floats and conditional modifiers when
copy-propagating a negated UD source.  The problem arises when a source
modifier is applied to a UD value.  In this case, a 33-bit representation
is internally used.  If you do the following:

   1: mov foo:UD 7U
   2: mov bar:UD -foo:UD
   3: mov out:F bar:UD

the out register will have the value (float)(uint32_t)-7 which is some very
large floating-point number.  However, if we allow copy-propagation of the
second mov, we get

   1: mov foo:UD 7U
   3: mov out:F -foo:UD

and, since the negation is computed in 33-bits, we get a value of -7.0f
which is clearly not the same.  There is a similar problem if the
instruction has a conditional modifier where the 33-bit value is used in
the comparison and not the 32-bit version.

Previously, we checked the source to be copied for the negate and then
checked the source being propagated to for the type.  This isn't quite what
we want because we are really just looking for negated UD sources.  A check
later in the file ensures that both ends of the propagate have the right
type so it works.  However, if we relax the restriction that both ends of
the propagation have the same type, it ends up causing us to bail early in
cases we don't want.
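
The mismatch between the two instruction sequences can be reproduced in plain C. This is only a model: int64_t stands in for the 33-bit internal representation, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Original sequence: the negation is stored to a 32-bit UD register
 * first, so it wraps modulo 2^32 before the conversion to float. */
static float
sketch_negate_then_convert(uint32_t foo)
{
   uint32_t bar = (uint32_t)0 - foo;   /* mov bar:UD -foo:UD */
   return (float)bar;                  /* mov out:F bar:UD   */
}

/* After copy propagation: the source modifier is applied at the wider
 * internal precision, so the sign survives into the float result. */
static float
sketch_convert_negated_source(uint32_t foo)
{
   int64_t wide = -(int64_t)foo;       /* 33-bit intermediate, modelled */
   return (float)wide;                 /* mov out:F -foo:UD             */
}
```

For foo = 7 the first function yields a value near 2^32 and the second yields -7.0f, which is why the propagation has to be rejected for negated UD sources.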

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-17 11:01:33 -07:00
Rob Clark
95e68adcd9 freedreno/ir3/nir: few little fixes
isaml needs to scale up coords based on LoD.  Also fix bogus bary.f
varying # when there are non-bary frag shader inputs.  And use sub.s of
a positive immediate rather than add.s of negative (since CP is better
about figuring out that those can be collapsed into the cat2 instr).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-17 11:40:14 -04:00
Rob Clark
efbf14e893 freedreno/ir3/nir: lower if/else
For now, completely flatten if/else blocks.  That will almost certainly
change once we have flow control.
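
As a sketch of what flattening means here (in plain C rather than NIR, with hypothetical function names): both sides are evaluated unconditionally and a select picks the result. This is only equivalent when neither side has side effects.

```c
#include <assert.h>

/* Structured form: only one side executes. */
static int
sketch_branchy(int cond, int a, int b)
{
   if (cond)
      return a + 1;
   else
      return b * 2;
}

/* Flattened form: both sides are computed unconditionally and a
 * select picks the result, so no flow control is required. */
static int
sketch_flattened(int cond, int a, int b)
{
   int then_val = a + 1;
   int else_val = b * 2;
   return cond ? then_val : else_val;
}
```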

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-17 11:40:14 -04:00
Rob Clark
e5e11b5baf freedreno/a4xx: support for large shaders
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-17 10:40:50 -04:00
Rob Clark
20ea698c49 freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-17 10:40:44 -04:00
Rob Clark
57f0d3b3c6 freedreno/ir3/nir: UBO support
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-17 10:40:36 -04:00
Rob Clark
87807e5cc5 freedreno/ir3: move out helper
We'll also want it in NIR f/e for implementing UBO support.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-17 10:40:28 -04:00
Rob Clark
70b2f872ea freedreno/a4xx: sysvals and UBOs
Basically just sync up the cmdstream emit parts to match the changes
already done on a3xx.

Also, fix scheduling for mem instructions.  This is needed on a4xx, and
I am a bit surprised it isn't needed for a3xx.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-17 10:40:18 -04:00
Rob Clark
e14af4c067 nir/builder: add nir_builder_insert_after_instr()
For lowering if/else, I need a way to insert at the end of the previous
block.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-17 10:34:15 -04:00
Rob Clark
7a9063e7c7 gallium/ttn: fix TXF
There is a level param stashed away in the .w component of the first
src.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-17 10:34:15 -04:00
Rob Clark
ef7c4f39bf gallium/ttn: add UBO support
v2: move ishl into ttn (instead of driver backend) to keep the units
    consistent between immediate and indirect offsets

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-17 10:34:15 -04:00
Rob Clark
8efe20467b gallium/ttn: minor cleanup
v2: also use ttn_src_for_indirect() everywhere for addr access, rather
    than open-coding it for INPUT/CONST srcs
v3: move ralloc out of ttn_src_for_indirect() into the one call site
    that needs a ptr

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-17 10:34:15 -04:00
Rob Clark
a3cce7a38e gallium/ttn: add support for TXL2
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-17 10:34:15 -04:00
Rob Clark
f44d836d7a gallium/ttn: add support for texture offsets
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-17 10:34:14 -04:00
Brian Paul
e050a19af8 mesa/st: Free st_translate with FREE macro.
To match CALLOC_STRUCT macro.

Fixes memory corruption on Windows when u_memory's memory debugging is
enabled.
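
A sketch of why the pairing matters, assuming a hypothetical debugging allocator that stores a header in front of each block. This is not u_memory's implementation, only the general shape of the problem.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical debugging allocator: each block carries a header in
 * front of the pointer handed to the caller. */
struct sketch_header {
   size_t size;
   unsigned magic;
};

#define SKETCH_MAGIC 0x1590u

/* Zero-initialised allocation; the caller sees memory past the header. */
static void *
sketch_calloc(size_t size)
{
   struct sketch_header *hdr = calloc(1, sizeof(*hdr) + size);
   if (hdr == NULL)
      return NULL;
   hdr->size = size;
   hdr->magic = SKETCH_MAGIC;
   return hdr + 1;
}

/* Must be paired with sketch_calloc(): plain free() on the returned
 * pointer would hand the C library an address it never allocated. */
static void
sketch_free(void *ptr)
{
   if (ptr == NULL)
      return;
   struct sketch_header *hdr = (struct sketch_header *)ptr - 1;
   assert(hdr->magic == SKETCH_MAGIC);
   hdr->magic = 0;
   free(hdr);
}
```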

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-04-17 15:14:23 +01:00
Jose Fonseca
8638e3ae1b libgl-gdi: Prevent "pure virtual method called" error when.
When running piglit w/ llvmpipe on Windows several tests terminate
abnormally just when the test exits.

The problem was that LLVMContextDispose was being called
after LLVM global destructors.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-04-16 20:37:34 +01:00
Ville Syrjälä
4fc645aed1 i965: Add marketing names for CHV
All CHV devices will be branded as "Intel(r) HD Graphics".

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2015-04-16 21:32:41 +03:00
Ian Romanick
94aab6cde6 nir: Convert the if-test for num_inputs == 2 to an assertion
Suggested by Jason on a different patch after some comments /
questions by Ilia.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-16 09:56:49 -07:00
Marek Olšák
61293bfced configure.ac: print LLVM_LDFLAGS
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-16 18:36:29 +02:00
Marek Olšák
0d46440c3a glsl_to_tgsi: only associate the uniform storage once at link time
This hack is no longer needed. (see the previous commit)

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-04-16 18:36:29 +02:00
Marek Olšák
bb5df7350b glsl_to_tgsi: add STATE_FB_WPOS_Y_TRANSFORM at link time
This will allow removing the uniform storage re-association during
TGSI generation at draw time.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-04-16 18:36:29 +02:00
Marek Olšák
e2066a4344 glsl_to_tgsi: add assertions for detecting out-of-bounds immediates access
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-04-16 18:36:29 +02:00
Marek Olšák
dcc74d47c4 glsl_to_tgsi: don't use a potentially-undefined immediate for ir_query_levels
Cc: 10.4 10.5 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-04-16 18:36:29 +02:00
Marek Olšák
14c5bc3b9a glsl_to_tgsi: fix out-of-bounds constant access and crash for uniforms
This fixes piglit shaders@glsl-fs-uniform-array-loop-unroll with immediate
shader compilation - it's a compiler test, so it has never been translated
to TGSI before.

Cc: 10.4 10.5 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-04-16 18:36:29 +02:00
Marek Olšák
d3045d391b glsl_to_tgsi: cleanup includes
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-04-16 18:36:29 +02:00
Marek Olšák
76c2d4498d mesa/program: remove dead code
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-16 18:36:29 +02:00
Marek Olšák
b79c620663 radeonsi: add a debug option to compile shaders when they're created
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2015-04-16 18:36:29 +02:00
Marek Olšák
99eef3b8b3 st/mesa: add a debug option to compile shaders at link time
v2: fix crashes

Tested-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-04-16 18:36:29 +02:00
Kristian Høgsberg
993a6288f7 i965: Rewrite ir_tex to ir_txl with lod 0 for vertex shaders
The ir_tex opcode turns into a sample or sample_c message, which will try to
compute derivatives to determine the lod. This produces garbage for
non-fragment shaders where the sample coordinates don't correspond to
subspans.

We fix this by rewriting the opcode from ir_tex to ir_txl and setting the
lod to 0.
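
As a rough illustration of the rewrite described above (not the actual i965
code), the sketch below lowers an implicit-lod sample to an explicit-lod
sample with lod 0 outside the fragment stage. The enum, struct and function
names are hypothetical stand-ins for the real IR types.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real IR types; names are illustrative. */
enum tex_opcode { TEX_IMPLICIT_LOD, TEX_EXPLICIT_LOD };
enum shader_stage { STAGE_VERTEX, STAGE_GEOMETRY, STAGE_FRAGMENT };

struct tex_instr {
   enum tex_opcode op;
   float lod;            /* only meaningful for TEX_EXPLICIT_LOD */
};

/* Rewrites an implicit-lod sample to an explicit lod of 0 when the stage
 * has no subspans to compute derivatives from. Returns true if the
 * instruction was changed.
 */
static bool
lower_implicit_lod(struct tex_instr *tex, enum shader_stage stage)
{
   if (stage == STAGE_FRAGMENT || tex->op != TEX_IMPLICIT_LOD)
      return false;

   tex->op = TEX_EXPLICIT_LOD;
   tex->lod = 0.0f;
   return true;
}
```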

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89457
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kristian Høgsberg <kristian.h.kristensen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-16 09:18:30 -07:00
Emil Velikov
a7d018accf radeonsi: remove bogus r600-- triple
As mentioned by Michel Dänzer, for LLVM >= 3.6 we create the
LLVMTargetMachine (with triple amdgcn--) when we set up the radeonsi
context. For older LLVM or hardware (r600), the triple is always r600--
and is created at a later stage, in radeon_llvm_compile().

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-04-16 14:15:19 +01:00
Neil Roberts
33f73e93ff i965/skl: Add the header for constant loads outside of the generator
Commit 5a06ee738 added a step to the generator to set up the message
header when generating the VS_OPCODE_PULL_CONSTANT_LOAD_GEN7
instruction. That pseudo opcode is implemented in terms of multiple
actual opcodes, one of which writes to one of the source registers in
order to set up the message header. This causes problems because the
scheduler isn't aware that the source register is written to and it
can end up reorganising the instructions incorrectly such that the
write to the source register overwrites a needed value from a previous
instruction. This problem was presenting itself as a rendering error
in the weapon in Enemy Territory: Quake Wars.

Since commit 588859e1 there is an additional problem that the double
register allocated to include the message header would end up being
split into two. This wasn't happening previously because the code to
split registers was explicitly avoided for instructions that are
sending from the GRF.

This patch fixes both problems by splitting the code to set up the
message header into a new pseudo opcode so that it will be done
outside of the generator. This new opcode has the header register as a
destination so the scheduler can recognise that the register is
written to. This has the additional benefit that the scheduler can
optimise the message header slightly better by moving the mov
instructions further away from the send instructions.

On Skylake it appears to fix the following three Piglit tests without
causing any regressions:

 gs-float-array-variable-index
 gs-mat3x4-row-major
 gs-mat4x3-row-major

I think we actually may need to do something similar for the fs
backend and possibly for message headers from regular texture sampling
but I'm not entirely sure.

v2: Make sure the exec-size is retained as 8 for the mov instruction
    to initialise the header from g0. This was accidentally lost
    during a rebase on top of 07c571a39f.
    Split the patch into two so that the helper function is a separate
    change.
    Fix emitting the MOV instruction on Gen7.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89058
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-04-16 13:02:26 +01:00
Neil Roberts
a9e4cf5d32 i965/vec4: Add a helper function to emit VS_OPCODE_PULL_CONSTANT_LOAD
There were three places in the visitor that had a similar chunk of
code to emit the VS_OPCODE_PULL_CONSTANT_LOAD opcode using a register
for the offset. This patch combines the chunks into a helper function
to reduce the code duplication. It will also be useful in the next
patch to expand what happens on Gen9+. This shouldn't introduce any
functional changes.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-04-16 13:01:43 +01:00
Jose Fonseca
037e0e78ab mesa,glsl: rename interface to programInterface.
`interface` is a define on Windows -- an alias for `struct` keyword,
used when declaring COM interfaces in C or C++.

So use `programInterface` instead, matching the name used
in GL_ARB_program_interface_query spec/headers, which was renamed for
exactly the same reason:

  "Revision 10, May 10, 2012 (pbrown)
     - Rename the formal parameter <interface> used by the functions in this
       extension to <programInterface>.  Certain versions of the Microsoft
       C/C++ compiler and/or its headers cause "interface" to be treated as a
       reserved keyword."

Trivial.
2015-04-16 10:23:24 +01:00
Flora Cui
f78b2c432f gbm: Add GBM_BO_USE_LINEAR flag
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-04-16 15:49:15 +09:00
Tapani Pälli
7c154bbe60 mesa: refactor GetUniformBlockIndex
Use _mesa_program_resource_index to get index.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:57 +03:00
Tapani Pälli
1b256eb0ec mesa: refactor GetUniformIndices
Use _mesa_program_resource_index to get indices.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:57 +03:00
Tapani Pälli
51313f567d mesa: refactor GetUniformLocation
Use _mesa_program_resource_location to get location.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:57 +03:00
Tapani Pälli
45637e9c1f mesa: refactor GetActiveUniformBlockName
Use _mesa_get_program_resource_name to get name.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:57 +03:00
Tapani Pälli
284003e1f1 mesa: remove unused _mesa_get_uniform_name
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:57 +03:00
Tapani Pälli
8d6fa52e33 mesa: refactor GetActiveUniformName
Use _mesa_get_program_resource_name to get name.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:57 +03:00
Tapani Pälli
17dc939f75 mesa: refactor GetActiveUniform
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:57 +03:00
Tapani Pälli
dc39d843d2 mesa: refactor GetTransformFeedbackVarying
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:57 +03:00
Tapani Pälli
7519ddb4d8 mesa: refactor GetActiveUniformsiv, use _mesa_program_resource_prop
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-16 07:55:57 +03:00
Tapani Pälli
34df5ebd77 mesa: mesa_bufferiv utility function for buffer objects
Patch adds new function 'mesa_bufferiv' and refactors existing
GetActiveUniformBlockiv and GetActiveAtomicCounterBufferiv to
use it.

corresponding Piglit tests:
   arb_uniform_buffer_object*
   arb_shader_atomic_counters*

(Many tests hit the corresponding queries.)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:57 +03:00
Tapani Pälli
4e7f134f89 mesa: refactor GetFragDataIndex
Use _mesa_program_resource_location_index to fetch index.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:57 +03:00
Tapani Pälli
62057c77f1 mesa: refactor GetFragDataLocation
Use program_resource_location to fetch location.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:57 +03:00
Tapani Pälli
3d1544cc91 mesa: refactor GetAttribLocation
Use program_resource_location to fetch location.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:57 +03:00
Tapani Pälli
26c0394a96 mesa: refactor GetActiveAttrib
Instead of iterating IR, retrieve required information through
the new program resource functions.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-16 07:55:57 +03:00
Tapani Pälli
41c230cd98 mesa: enable GL_ARB_program_interface_query extension
(and mark it as DONE in docs/GL3.txt + 10.6.0 relnotes)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:57 +03:00
Tapani Pälli
2ab8de2181 mesa: implementation of glGetProgramResourceiv
Patch adds required helper functions to shaderapi.h and
the actual implementation.

The property query functionality can be tested with tests for
following functions that are refactored by later patches:

   GetActiveAtomicCounterBufferiv
   GetActiveUniformBlockiv
   GetActiveUniformsiv

v2: code cleanup (Ilia Mirkin)
    add bufSize < 0 check and error out
    fix is_resource_referenced to return bool
    check for propCount and bufSize, fixes in buffer_prop

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:57 +03:00
Tapani Pälli
9367ade331 mesa: glGetProgramResourceLocationIndex
Patch adds required helper functions to shaderapi.h and
the actual implementation.

The added functionality can be tested by tests for following
functions that are refactored by later patches:

   GetFragDataIndex

v2: return -1 if output not referenced by fragment stage
    (Ilia Mirkin)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:56 +03:00
Tapani Pälli
e0e4d77f01 mesa: glGetProgramResourceLocation
Patch adds required helper functions to shaderapi.h and
the actual implementation.

corresponding Piglit test:
   arb_program_interface_query-resource-location

The added functionality can be tested by tests for following
functions that are refactored by later patches:

   GetAttribLocation
   GetUniformLocation
   GetFragDataLocation

v2: code cleanup, changes to array element
    syntax checking (Ilia Mirkin)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:56 +03:00
Tapani Pälli
2a5a0d19d6 mesa: glGetProgramResourceName
Patch adds required helper functions to shaderapi.h and
the actual implementation.

Name generation copied from '_mesa_get_uniform_name' which can
be removed later by refactoring functions to use resource list.

The added functionality can be tested by tests for following
functions that are refactored by later patches:

   GetActiveUniformName
   GetActiveUniformBlockName

v2: no index for geometry shader inputs (Ilia Mirkin)
    add bufSize < 0 check and error out
    validate enum

corresponding Piglit test:
   arb_program_interface_query-getprogramresourcename

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.org>
2015-04-16 07:55:56 +03:00
Tapani Pälli
161f57f610 mesa: glGetProgramResourceIndex
Patch adds required helper functions to shaderapi.h and
the actual implementation.

v2: code cleanup (Ilia Mirkin)

corresponding Piglit test:
   arb_program_interface_query-getprogramresourceindex

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.org>
2015-04-16 07:55:56 +03:00
Tapani Pälli
4d3b98bc58 mesa: glGetProgramInterfaceiv
Patch adds required helper functions to shaderapi.h and
the actual implementation.

v2: code cleanup (Ilia Mirkin)
    fix array size for xfb varyings
    validate programInterface and throw error

v3: put GL_MAX_NUM_COMPATIBLE_SUBROUTINES where
    it belongs

corresponding Piglit test:
   arb_program_interface_query-getprograminterfaceiv

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:56 +03:00
Tapani Pälli
c796ce4108 mesa/glsl: build list of program resources during linking
Patch adds ProgramResourceList to gl_shader_program structure.
List contains references to active program resources and is
constructed during linking phase.

This list will be used by follow-up patches to implement hooks
for GL_ARB_program_interface_query. It can also be used to
implement any of the older shader program query APIs.

v2: code cleanups + note for SSBO and subroutines (Ilia Mirkin)
v3: code cleanups + assert(MESA_SHADER_STAGES < 8) (Martin Peres)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:55:35 +03:00
Tapani Pälli
b297fc27aa glapi: add GL_ARB_program_interface_query skeleton
v2: update dispatch_sanity test (Jason Ekstrand)
    + small code cleanups

v3: xml and Makefile fixes (Ilia Mirkin, Matt Turner)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:30:12 +03:00
Tapani Pälli
993b9b6adb linker: fix varying linking if SSO program has only gs and fs
Previously the linker did not take into account the case where one would
have only gs and fs (with SSO). This patch adds the case by refactoring
code around assign_varying_locations. This makes sure locations for
gs get populated correctly.

This was found with some of the SSO subtests of Martin's upcoming
GetProgramInterfaceiv Piglit test which passes with the patch, no
Piglit regressions.

v2: code cleanups (Martin Peres)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-16 07:30:12 +03:00
Glenn Kennard
17d69862a9 r600g/sb: Skip empty ALU clause while scheduling
Fixes assert triggered by
ext_transform_feedback-intervening-read output use_gs
piglit test.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-04-16 12:43:20 +10:00
Ian Romanick
4cf5ca5ca5 nir: Try commutative sources in CSE
Shader-db results:

GM45 NIR:
total instructions in shared programs: 4082044 -> 4081919 (-0.00%)
instructions in affected programs:     27609 -> 27484 (-0.45%)
helped:                                44

Iron Lake NIR:
total instructions in shared programs: 5678776 -> 5678646 (-0.00%)
instructions in affected programs:     27406 -> 27276 (-0.47%)
helped:                                45

Sandy Bridge NIR:
total instructions in shared programs: 7329995 -> 7329096 (-0.01%)
instructions in affected programs:     142035 -> 141136 (-0.63%)
helped:                                406
HURT:                                  19

Ivy Bridge NIR:
total instructions in shared programs: 6769314 -> 6768359 (-0.01%)
instructions in affected programs:     140820 -> 139865 (-0.68%)
helped:                                423
HURT:                                  2

Haswell NIR:
total instructions in shared programs: 6183693 -> 6183298 (-0.01%)
instructions in affected programs:     96538 -> 96143 (-0.41%)
helped:                                303
HURT:                                  4

Broadwell NIR:
total instructions in shared programs: 7501711 -> 7498170 (-0.05%)
instructions in affected programs:     266403 -> 262862 (-1.33%)
helped:                                705
HURT:                                  5
GAINED:                                4

v2: Rebase on top of Connor's fix.

v3: Convert the if-test for num_inputs == 2 to an assertion.  Suggested
by Jason after some comments / questions by Ilia.
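
A minimal sketch of the idea, assuming a much-simplified instruction
representation (the struct and function below are hypothetical, not the
real nir_alu_instr or the real CSE comparison): two commutative
instructions are treated as equal if their sources match either in order
or swapped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified ALU instruction; not the real NIR type. */
struct alu_instr {
   int  op;
   bool commutative;     /* assumed to be a property of the opcode */
   int  num_inputs;
   int  src[2];          /* SSA value ids */
};

static bool
alu_instrs_equal(const struct alu_instr *a, const struct alu_instr *b)
{
   if (a->op != b->op || a->num_inputs != b->num_inputs)
      return false;

   if (a->commutative) {
      /* Commutative opcodes are assumed to be binary. */
      assert(a->num_inputs == 2);
      return (a->src[0] == b->src[0] && a->src[1] == b->src[1]) ||
             (a->src[0] == b->src[1] && a->src[1] == b->src[0]);
   }

   for (int i = 0; i < a->num_inputs; i++) {
      if (a->src[i] != b->src[i])
         return false;
   }
   return true;
}
```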

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> [v1]
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Connor Abbott <cwabbott0@gmail.com>
2015-04-15 18:15:59 -07:00
Ian Romanick
8957c9e448 glx: Create proper server dependency for GLX_EXT_create_context_es2_profile
Previously GLX_EXT_create_context_es2_profile was marked as "direct
only" so that it would not depend on server support.  Since the
extension required functions that are part of
GLX_ARB_create_context_profile, support for the EXT was disabled if the
ARB was not supported.

This was complete rubbish.  If the server supported the ARB but not the
EXT, sending a request with GLX_CONTEXT_ES2_PROFILE_BIT_EXT would result
in GLXBadProfileARB.

Instead of the misguided hack, make GLX_EXT_create_context_es2_profile
properly depend on server support by not marking it as "direct only."

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-15 18:11:54 -07:00
Eric Anholt
b229e6c7de vc4: Don't try to use color load/stores to blit across format changes.
We could potentially support the right combination of 8888 to 565, but the
important thing for now is to not mix up our orderings of 8888.  Fixes
fbo-copyteximage regressions.
2015-04-15 16:50:23 -07:00
Eric Anholt
cff2e08c4c vc4: Don't try to use color load/stores to do depth/stencil blits.
Fixes regressions in fbo-generatemipmap-formats on depth/stencil (which
does blits to work around baselevel/lastlevel).
2015-04-15 16:50:23 -07:00
Eric Anholt
3a728d4dfb vc4: Update the shadow texture for public textures on every draw.
We don't know who else has written to it, so we'd better update it every
time.  This makes the gears spin in X again.
2015-04-15 16:50:23 -07:00
Eric Anholt
bd957b1b79 vc4: Hook up VC4_DEBUG=perf to some useful printfs. 2015-04-15 16:50:22 -07:00
Brian Paul
e1d095053b st/mesa: log shaders, GLSL info log with _mesa_log()
As with previous patch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-04-15 16:30:49 -06:00
Brian Paul
011cad806a mesa: log shaders, GLSL info log with _mesa_log()
Now, if we set MESA_LOG_FILE and MESA_GLSL=dump, all the shader info
will get logged to the named file instead of stderr.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-04-15 16:30:49 -06:00
Brian Paul
2926bbfb28 mesa: add _mesa_log(), _mesa_get_log_file() functions
_mesa_log() simply writes log information to stderr or MESA_LOG_FILE.
_mesa_get_log_file() returns the file handle to use for logging.

This will be used for shader dumping/logging instead of always printing
to stderr.
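
The behaviour described above could look roughly like the sketch below.
Only the MESA_LOG_FILE variable comes from this commit message; the
function names get_log_file() and log_printf() and the caching strategy
are assumptions made for illustration, not the actual Mesa implementation.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

static FILE *log_file;   /* cached after the first call */

/* Returns the file named by MESA_LOG_FILE, falling back to stderr if the
 * variable is unset or the file cannot be opened.
 */
static FILE *
get_log_file(void)
{
   if (!log_file) {
      const char *name = getenv("MESA_LOG_FILE");
      if (name)
         log_file = fopen(name, "w");
      if (!log_file)
         log_file = stderr;
   }
   return log_file;
}

/* printf-style logging to whichever file get_log_file() selected. */
static void
log_printf(const char *fmt, ...)
{
   va_list args;
   va_start(args, fmt);
   vfprintf(get_log_file(), fmt, args);
   va_end(args);
   fflush(get_log_file());
}
```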

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-04-15 16:30:49 -06:00
Brian Paul
11bfee4c3a tgsi: also dump label for TGSI_OPCODE_BGNSUB opcode
So we can see the label associated with subroutines.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-04-15 16:30:49 -06:00
Jose Fonseca
1aa50339d8 st/wgl: Couple of fixes to opengl32.dll's wglCreateContext/wglDeleteContext dispatch.
- Use GetModuleHandle instead of LoadLibrary to avoid incrementing the
  opengl32.dll reference count (otherwise the opengl32.dll will linger
  in memory forever.)

- Ensure we use our fake wglCreateContext/wglDeleteContext when using
  Mesa as a drop-in replacement for opengl32.dll

Untested.  Just noticed by accident.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-04-15 09:58:38 +01:00
Jose Fonseca
6635fb6cae mesa: Enable _mesa_dlopen on MSVC too.
As pointed out by Shervin Sharifi.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-04-15 09:58:27 +01:00
Samuel Iglesias Gonsalvez
3cbefe3cf4 glsl: fix assignment of multiple scalar and vecs to matrices.
When a vec has more elements than row components in a matrix, the
code could end up failing an assert inside assign_to_matrix_column().

This patch makes sure that when there is still room in the matrix for
more elements (but in other columns of the matrix), the data is actually
assigned.
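
To illustrate the idea only (this is not the GLSL compiler code): when a
matrix is built from several scalars and vectors, each source keeps
filling components where the previous one stopped, crossing into the next
column when the current one is full. The helper name and signature below
are hypothetical.

```c
#include <stddef.h>

/* Copies `count` components from `src` into a column-major matrix with
 * `rows` components per column, starting at flat component index `start`.
 * A source may span a column boundary. Returns the next free flat index.
 */
static size_t
append_to_matrix(float *mat, size_t rows, size_t cols, size_t start,
                 const float *src, size_t count)
{
   size_t idx = start;

   for (size_t i = 0; i < count && idx < rows * cols; i++, idx++) {
      size_t col = idx / rows;   /* column receiving this component */
      size_t row = idx % rows;   /* row within that column */
      mat[col * rows + row] = src[i];
   }
   return idx;
}
```

For a mat4x2 (four columns of two components) built from a float, a
4-component vector, a 2-component vector and a scalar, the 4-component
vector starts in the second half of column 0 and spills into columns 1
and 2.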

This patch fixes the following dEQP tests:

  dEQP-GLES3.functional.shaders.conversions.matrix_combine.float_bvec4_ivec2_bool_to_mat4x2_vertex
  dEQP-GLES3.functional.shaders.conversions.matrix_combine.float_bvec4_ivec2_bool_to_mat4x2_fragment

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-04-15 08:11:18 +02:00
Ian Romanick
bc672e261c nir: Fix typo in "ushr by 0" algebraic replacement
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: "10.5" <mesa-stable@lists.freedestkop.org>
2015-04-14 16:41:04 -07:00
Ian Romanick
67a8610caf nir: Silence unused parameter warnings
nir/nir.h: In function 'nir_validate_shader':
nir/nir.h:1567:56: warning: unused parameter 'shader' [-Wunused-parameter]
 static inline void nir_validate_shader(nir_shader *shader) { }
                                                        ^
nir/nir_opt_cse.c: In function 'src_is_ssa':
nir/nir_opt_cse.c:165:32: warning: unused parameter 'data' [-Wunused-parameter]
 src_is_ssa(nir_src *src, void *data)
                                ^
nir/nir_opt_cse.c: In function 'dest_is_ssa':
nir/nir_opt_cse.c:171:35: warning: unused parameter 'data' [-Wunused-parameter]
 dest_is_ssa(nir_dest *dest, void *data)
                                   ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-14 16:41:04 -07:00
Connor Abbott
47a1b4841d nir/cse: fix bug with comparing non-per-component sources
We weren't comparing the right number of components when checking
swizzles. Use nir_ssa_alu_instr_num_src_components() to do the right
thing.

No piglit regressions, and no fixes either.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-14 19:07:44 -04:00
Ben Widawsky
b069f9eafd i965/fs: Combine tex/fb_write operations (opt)
Certain platforms support the ability to sample from a texture, and write it out
to the file RT - thus saving a costly send instruction (note that this is a
potential win if one wanted to backport to a tag that didn't have the patch
from Topi which removed excess MOVs from LOAD_PAYLOAD - 97caf5fa04).

v2: Modify the algorithm. Instead of iterating in reverse through blocks and
insts, only look at the last block/inst, since it is the only thing which can
benefit. Rebased on top of Ken's patch modifying is_last_send.

v3: Rebased over almost 2 months, and incorporated feedback from Matt:
Some comment typo fixes and rewordings.
Whitespace
Move the optimization pass outside of the optimize loop

v4: Some cosmetic changes requested from Ken. These changes ensured that the
optimization function always returned true when an optimization occurred, and
false when one did not. This behavior did not exist with the original patch. As
a result, having the separate helper function which Matt did not like no longer
made sense, and so now I believe everyone should be happy.

Benchmark (n=20)   %diff
*OglBatch5         -1.4
*OglBatch7         -1.79
OglFillTexMulti    5.57
OglFillTexSingle   1.16
OglShMapPcf        0.05
OglTexFilterAniso  3.01
OglTexFilterTri    1.94

No piglit regressions:
(http://otc-gfxtest-01.jf.intel.com:8080/view/dev/job/bwidawsk/112/)

[*] I believe my measurements are incorrect for Batch5-7. If I add this new
optimization, but never emit the new instruction I see similar results.

v5: Remove declaration of combine_tex_header since v4 dropped that function
(Ben)
Remove check for impossible case of an empty block (Matt)
Set dest earlier to avoid extra special-casing in generate_tex (Matt)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-14 15:22:47 -07:00
Ben Widawsky
6866378cf4 i965/fs: Only emit FS_OPCODE_PLACEHOLDER_HALT if there are discards
Based originally on a patch from Ken in May 2014 of the same title. Things
changed enough that I didn't feel comfortable leaving his authorship.

v2: Replace fp->UsesKill with wm_prog_data->uses_kill. Since Ken took the time
to also explain the difference to me, here is his explanation for posterity:

"fp->UsesKill indicates that a ARB_fragment_program shader uses the KIL
instruction, or that a GLSL shader uses the "discard" instruction
(which are analogous).

On Gen4-5, we sometimes have to simulate OpenGL's "Alpha Test" feature
by emitting shader code that implicitly does a "discard" instruction.

In the key setup, we do:

   /* key->alpha_test_func means simulating alpha testing via discards,
    * so the shader definitely kills pixels.
    */
   prog_data.uses_kill = fp->program.UsesKill || key->alpha_test_func;

Even though the shader may not technically contain a "discard", we need
to act as if it does.

I've also been trying to move the i965 state setup code to use
brw_wm_prog_key for everything, rather than poking at core Mesa's
gl_program/gl_fragment_program/gl_shader/gl_shader_program structures.

--Ken"

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-14 15:22:47 -07:00
Ben Widawsky
38707e1478 i965/fs: Create a has_side_effects for fs_inst
When an instruction has a side effect, it impacts the available options when
reordering an instruction. As the EOT flag is an implied write to the render
target in the FS, it can be considered a side effect.

This patch shouldn't actually have any impact on the current code since the EOT
flag implies that the opcode is already one with side effects,
FS_OPCODE_FB_WRITE. The next patch however will introduce an optimization
whereby the EOT flag can occur with an opcode SHADER_OPCODE_TEX, and as that
instruction will perform the same implied write to the render target, it cannot
be reordered.
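
A small sketch of the rule described above, using a hypothetical,
cut-down opcode list and instruction struct rather than the real fs_inst:
an instruction has side effects if its opcode does, or if it carries the
EOT flag.

```c
#include <stdbool.h>

/* Hypothetical subset of opcodes, for illustration only. */
enum opcode { OP_MOV, OP_TEX, OP_FB_WRITE };

struct inst {
   enum opcode op;
   bool eot;   /* end-of-thread: implies a write to the render target */
};

static bool
opcode_has_side_effects(enum opcode op)
{
   return op == OP_FB_WRITE;
}

/* EOT counts as a side effect regardless of the opcode, so a texture
 * sample carrying EOT must not be reordered either.
 */
static bool
inst_has_side_effects(const struct inst *inst)
{
   return inst->eot || opcode_has_side_effects(inst->op);
}
```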

v2: Remove extra whitespace (Matt)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-14 15:22:47 -07:00
Marius Predut
28d9e90428 i965: replace __FUNCTION__ with __func__
Consistently just use C99's __func__ everywhere.
No functional changes.
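
For reference, a minimal example of the C99 identifier this series
switches to. __func__ behaves like a static const char array holding the
name of the enclosing function; the LOG_ENTRY macro and the function name
are made up for this illustration.

```c
#include <stdio.h>

/* Illustrative logging macro built on the standard __func__ identifier. */
#define LOG_ENTRY() fprintf(stderr, "%s called\n", __func__)

static const char *
current_function_name(void)
{
   LOG_ENTRY();
   return __func__;
}
```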

Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Marius Predut <marius.predut@intel.com>
2015-04-14 12:23:53 -07:00
Marius Predut
139e6c7c4a i915: replace __FUNCTION__ with __func__
Consistently just use C99's __func__ everywhere.
No functional changes.

Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Marius Predut <marius.predut@intel.com>
2015-04-14 12:23:51 -07:00
Marius Predut
fc57222f60 glx: replace __FUNCTION__ with __func__
Consistently just use C99's __func__ everywhere.
No functional changes.

Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Marius Predut <marius.predut@intel.com>
2015-04-14 12:23:41 -07:00
Marius Predut
6f4d9418b4 main: replace __FUNCTION__ with __func__
Consistently just use C99's __func__ everywhere.
The patch was verified with Microsoft Visual Studio 2013
redistributable package (RTM version number: 18.0.21005.1).
The next MSVC versions intend to support __func__.
No functional changes.

Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Marius Predut <marius.predut@intel.com>
2015-04-14 12:23:41 -07:00
Marius Predut
50cb780f7f state_tracker: replace __FUNCTION__ with __func__
Consistently just use C99's __func__ everywhere.
The patch was verified with Microsoft Visual Studio 2013
redistributable package (RTM version number: 18.0.21005.1).
The next MSVC versions intend to support __func__.
No functional changes.

Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Marius Predut <marius.predut@intel.com>
2015-04-14 12:23:41 -07:00
Marius Predut
d02942cc77 swrast: replace __FUNCTION__ with __func__
Consistently just use C99's __func__ everywhere.
The patch was verified with Microsoft Visual Studio 2013
redistributable package (RTM version number: 18.0.21005.1).
The next MSVC versions intend to support __func__.
No functional changes.

Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Marius Predut <marius.predut@intel.com>
2015-04-14 12:23:41 -07:00
Marius Predut
e1231159bc vbo: replace __FUNCTION__ with __func__
Consistently just use C99's __func__ everywhere.
The patch was verified with Microsoft Visual Studio 2013
redistributable package (RTM version number: 18.0.21005.1).
The next MSVC versions intend to support __func__.
No functional changes.

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Marius Predut <marius.predut@intel.com>
2015-04-14 12:23:41 -07:00
Marius Predut
f0e693efb3 tnl: replace __FUNCTION__ with __func__
Consistently just use C99's __func__ everywhere.
The patch was verified with Microsoft Visual Studio 2013
redistributable package (RTM version number: 18.0.21005.1).
The next MSVC versions intend to support __func__.
No functional changes.

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Marius Predut <marius.predut@intel.com>
2015-04-14 12:23:41 -07:00
Matt Turner
3ca17e75e4 i965/fs: Correct mistake in determining whether a MUL is negated.
a * b is equivalent to -a * -b, and the previous code was failing at
that.
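
The identity can be sketched as follows, assuming each source carries a
single negate flag (the struct and function are hypothetical, not the
actual i965 code): the product is negated only when exactly one of the
two sources is negated.

```c
#include <stdbool.h>

/* Hypothetical source operand with a negate modifier. */
struct src {
   bool negate;
};

/* a * b has the same sign as -a * -b, so the product is negated only
 * when the two negate flags differ.
 */
static bool
mul_is_negated(struct src a, struct src b)
{
   return a.negate != b.negate;
}
```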

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89961
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-14 12:16:03 -07:00
Neil Roberts
07c571a39f i965/skl: Use an exec size of 8 to initialise the message header
Commit e93566a15c changed the message header code needed to
make Skylake use SIMD4x2 so that it uses a register with width 4
instead of 8 as the source register in the send message. However it
also changed the width for the dest in the MOV instruction which is
used to initialise the header register with the values from g0. The
width of the destination is used to determine the exec size in
brw_set_dest so this would end up making the MOV have an exec size of
4. I think this would end up leaving the top half of the register
uninitialised. The top half of the header has meaningful values so
this probably isn't a good idea.

This patch just casts the dest register for the MOV instruction back
to a vec8 to fix it. It doesn't cause any changes to a Piglit run.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-04-14 19:20:28 +01:00
Ian Romanick
05a1d84491 i965/fs: Always invert predicate of SEL with swapped arguments
Commit b616164 added an optimization of b2f generation of a comparison.
It also included an extra optimization for the case where one of the
comparison values is a constant zero.  The trick was that some value was
known to be zero, so that value could be used in the SEL instruction
instead of potentially loading 0.0 into a register.

This change switched the order of the arguments to the SEL, and, for
some unknown reason, I thought that the predicate should therefore
only be inverted for the == case.  Clearly, it should always be
inverted.
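
The reasoning can be checked with a tiny model of a predicated select.
The sel() helper below is only a stand-in for the hardware SEL
instruction: swapping the two value arguments always requires inverting
the predicate to keep the result unchanged, whatever comparison produced
the predicate.

```c
#include <stdbool.h>

/* Stand-in for a predicated select: picks a when pred is set, else b. */
static float
sel(bool pred, float a, float b)
{
   return pred ? a : b;
}

/* Same result as sel(pred, a, b), but with the value arguments swapped,
 * which is only correct because the predicate is inverted as well.
 */
static float
sel_swapped(bool pred, float a, float b)
{
   return sel(!pred, b, a);
}
```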

Fixes piglit fs-notEqual-of-expression.shader_test and
fs-equal-of-expression.shader_test.

v2: Don't do the "register already has zero" optimization for the '== 0'
case.  In that case, the register does not have zero when we want to
produce a zero result.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89722
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Tested-by: Lu Hua <huax.lu@intel.com>
2015-04-14 08:35:10 -07:00
Tom Stellard
e0994e0f97 radeon/llvm: Improve codegen for KILL_IF
Rather than emitting one kill instruction per component of KILL_IF's src
reg, we now or the components of the src register together and use the
result as a condition for just one kill instruction.
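
As a rough illustration of the idea (not the LLVM IR the backend actually emits), the per-component tests can be folded into one condition. The helper name is hypothetical, and the sketch assumes the usual KILL_IF meaning of "discard if any source component is negative".

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when the fragment should be discarded.
 * Assumes KILL_IF discards when any component of src is negative. */
bool kill_if_condition(const float src[4])
{
    bool cond = false;

    /* OR the per-component tests together so that a single kill
     * can be issued on the combined condition. */
    for (int i = 0; i < 4; i++)
        cond |= (src[i] < 0.0f);

    return cond;
}
```

One combined condition means one kill instruction instead of up to four, which is where the register and code size savings would come from.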

shader-db stats (bonaire):

979 shaders
Totals:
SGPRS: 34872 -> 34848 (-0.07 %)
VGPRS: 20696 -> 20676 (-0.10 %)
Code Size: 749032 -> 748452 (-0.08 %) bytes
LDS: 11 -> 11 (0.00 %) blocks
Scratch: 12288 -> 12288 (0.00 %) bytes per wave

Totals from affected shaders:
SGPRS: 1184 -> 1160 (-2.03 %)
VGPRS: 600 -> 580 (-3.33 %)
Code Size: 13200 -> 12620 (-4.39 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Scratch: 0 -> 0 (0.00 %) bytes per wave

Increases:
SGPRS: 2 (0.00 %)
VGPRS: 0 (0.00 %)
Code Size: 0 (0.00 %)
LDS: 0 (0.00 %)
Scratch: 0 (0.00 %)

Decreases:
SGPRS: 5 (0.01 %)
VGPRS: 5 (0.01 %)
Code Size: 25 (0.03 %)
LDS: 0 (0.00 %)
Scratch: 0 (0.00 %)

*** BY PERCENTAGE ***

Max Increase:

SGPRS: 32 -> 40 (25.00 %)
VGPRS: 0 -> 0 (0.00 %)
Code Size: 0 -> 0 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Scratch: 0 -> 0 (0.00 %) bytes per wave

Max Decrease:

SGPRS: 32 -> 24 (-25.00 %)
VGPRS: 16 -> 12 (-25.00 %)
Code Size: 116 -> 96 (-17.24 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Scratch: 0 -> 0 (0.00 %) bytes per wave

*** BY UNIT ***

Max Increase:

SGPRS: 64 -> 72 (12.50 %)
VGPRS: 0 -> 0 (0.00 %)
Code Size: 0 -> 0 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Scratch: 0 -> 0 (0.00 %) bytes per wave

Max Decrease:

SGPRS: 32 -> 24 (-25.00 %)
VGPRS: 16 -> 12 (-25.00 %)
Code Size: 424 -> 356 (-16.04 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Scratch: 0 -> 0 (0.00 %) bytes per wave

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-04-14 13:37:12 +00:00
Tom Stellard
c6d79ed289 radeon/llvm: Run LLVM's instruction combining pass
This should improve code quality in general and will help with some
future changes to how we emit kill instructions.

shader-db shows a few regressions, but these don't seem to be the result
of deficiencies in instcombine.  They're mostly caused by the scheduler
making different decisions than before.

shader-db stats (bonaire):

979 shaders
Totals:
SGPRS: 35056 -> 34872 (-0.52 %)
VGPRS: 20624 -> 20696 (0.35 %)
Code Size: 764372 -> 749032 (-2.01 %) bytes
LDS: 11 -> 11 (0.00 %) blocks
Scratch: 12288 -> 12288 (0.00 %) bytes per wave

Totals from affected shaders:
SGPRS: 13264 -> 13072 (-1.45 %)
VGPRS: 8248 -> 8316 (0.82 %)
Code Size: 486320 -> 470992 (-3.15 %) bytes
LDS: 11 -> 11 (0.00 %) blocks
Scratch: 11264 -> 11264 (0.00 %) bytes per wave

Increases:
SGPRS: 6 (0.01 %)
VGPRS: 20 (0.02 %)
Code Size: 14 (0.01 %)
LDS: 0 (0.00 %)
Scratch: 0 (0.00 %)

Decreases:
SGPRS: 32 (0.03 %)
VGPRS: 8 (0.01 %)
Code Size: 244 (0.25 %)
LDS: 0 (0.00 %)
Scratch: 0 (0.00 %)

*** BY PERCENTAGE ***

Max Increase:

SGPRS: 32 -> 48 (50.00 %)
VGPRS: 12 -> 20 (66.67 %)
Code Size: 216 -> 224 (3.70 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Scratch: 0 -> 0 (0.00 %) bytes per wave

Max Decrease:

SGPRS: 40 -> 32 (-20.00 %)
VGPRS: 16 -> 12 (-25.00 %)
Code Size: 368 -> 280 (-23.91 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Scratch: 0 -> 0 (0.00 %) bytes per wave

*** BY UNIT ***

Max Increase:

SGPRS: 32 -> 48 (50.00 %)
VGPRS: 28 -> 36 (28.57 %)
Code Size: 39320 -> 40132 (2.07 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Scratch: 0 -> 0 (0.00 %) bytes per wave

Max Decrease:

SGPRS: 72 -> 64 (-11.11 %)
VGPRS: 48 -> 40 (-16.67 %)
Code Size: 6272 -> 5852 (-6.70 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Scratch: 0 -> 0 (0.00 %) bytes per wave

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-04-14 13:37:05 +00:00
Tom Stellard
2569c7109d radeonsi: Add header and footer to shader stat dump
This makes it easier to parse.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-04-14 13:36:59 +00:00
Kenneth Graunke
406df68736 i965: Fix software primitive restart with indirect draws.
new_prim was declared as a stack variable within a nested scope; we
tried to retain a pointer to that data beyond the scope, which is bogus.

GCC with -O1 eliminated most of the code that set new_prim's fields.

Move the declaration to fix the bug.
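
A minimal sketch of the fixed pattern, using made-up names (struct prim, consume_prim and draw_with_restart are illustrative only, not the driver's real code): the replacement object is declared at function scope, so a pointer taken inside the nested block stays valid afterwards.

```c
struct prim {
    int start;
    int count;
};

/* Hypothetical consumer standing in for the actual draw call. */
static int consume_prim(const struct prim *p)
{
    return p->start + p->count;
}

int draw_with_restart(const struct prim *prim, int indirect)
{
    /* Declared at function scope so that it outlives the block below.
     * Declaring it inside the `if` and keeping a pointer to it afterwards
     * would leave `prim` dangling once the block ends. */
    struct prim new_prim;

    if (indirect) {
        new_prim = *prim;
        new_prim.start = 0;
        prim = &new_prim;
    }

    return consume_prim(prim);
}
```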

v2: Also fix new_ib (thanks to Matt Turner and Ben Widawsky).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81025
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Cc: mesa-stable@lists.freedesktop.org
2015-04-14 01:49:02 -07:00
Kenneth Graunke
f55ded764c i965: Implement proper workaround for Gen4 GPU CONSTANT_BUFFER hangs.
I finally managed to dig up some information on our mysterious GPU hangs.
A wiki page from the Crestline validation team mentions that they found
a GPU hang in "Serious Sam 2" (on Windows) with remarkably similar
conditions to the ones we've seen in Google Chrome and glmark2.

Apparently, if WM_STATE has "PS Use Source Depth" enabled, CC_STATE has
most depth state disabled, and you issue a CONSTANT_BUFFER command and
immediately draw, the depth interpolator makes a small mistake that
leads to hangs.

Most of the traces I looked at contained a CONSTANT_BUFFER packet
followed by 3DPRIMITIVE, either immediately or after very few packets.
It appears they also have "PS Use Source Depth" enabled - either at the
hang, or a little before it.  So I think this is our bug.

The workaround is to emit a non-pipelined state packet after issuing a
CONSTANT_BUFFER packet.  This is really similar to the workaround I
developed in commit c4fd0c9052.

v2: Fix word-wrapping issues.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-14 01:49:00 -07:00
Kenneth Graunke
21d29124a7 i965: Fix INTEL_DEBUG=shader_time for SIMD8 VS.
In commit 4ebeb71573, I deleted the
emit_shader_time_end() call in emit_urb_writes().  But I failed to add
it to run_vs(), as I intended.  So no data was recorded at all.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-14 01:47:14 -07:00
Eric Anholt
1be329e64c vc4: Add a blitter path using just the render thread.
This accelerates the path for generating the shadow tiled texture when
asked to sample from a raster texture (typical in glamor).
2015-04-13 23:20:46 -07:00
Eric Anholt
76d56752cc vc4: Allow submitting jobs with no bin CL in validation.
For blitting, we want to fire off an RCL-only job.  This takes a bit of
tweaking in our validation and the simulator support (and corresponding
new code in the kernel).
2015-04-13 23:20:45 -07:00
Eric Anholt
43b20795b7 vc4: Move the blit code to a separate file.
There will be other blit code showing up, and it seems like the place
you'd look.
2015-04-13 23:20:45 -07:00
Eric Anholt
e214a59635 vc4: Separate out a bit of code for submitting jobs to the kernel.
I want to be able to have multiple jobs being set up at the same time (for
example, a render job to do a little fixup blit in the course of doing a
render to the main FBO).
2015-04-13 23:20:45 -07:00
Eric Anholt
44b63cf5c0 vc4: When asked to sample from a raster texture, make a shadow tiled copy.
So, it turns out my simulator doesn't *quite* match the hardware.  And the
errata about raster textures tells you most of what's wrong, but there's
still stuff wrong after that.  Instead, if we're asked to sample from
raster, we'll just blit it to a tiled temporary.

Raster textures should only be screen scanout, and word is that it's
faster to copy to tiled using the tiling engine first than to texture from
an entire raster texture, anyway.
2015-04-13 22:34:06 -07:00
Eric Anholt
d04b07f8e2 vc4: Fix off-by-one in branch target validation. 2015-04-13 22:34:06 -07:00
Eric Anholt
7fa2f2e366 vc4: Use NIR-level lowering for idiv.
This fixes the idiv tests in piglit.
2015-04-13 21:36:40 -07:00
Eric Anholt
84ebaff1b7 vc4: Add a bunch of type conversions.
These are required to get piglit's idiv tests working.  The
unsigned<->float conversions are wrong, but are good enough to get
piglit's small ranges of values working.
2015-04-13 21:36:40 -07:00
Dave Airlie
cc5860e407 st/mesa: align cube map arrays layers
We create textures internally for texsubimage, and we use
the values from sub image to create a new texture, however
we don't align these to valid sizes, and cube map arrays
must have an array size aligned to 6.
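
The alignment itself is plain round-up arithmetic. A small sketch, assuming the layer count is rounded up to the next multiple of 6 (the helper name is hypothetical and the commit may compute this differently):

```c
/* Round a layer count up to the next multiple of 6, i.e. to a whole
 * number of cubes (6 faces each). */
unsigned align_cube_layers(unsigned layers)
{
    return (layers + 5) / 6 * 6;
}
```

For example, a sub image covering 7 layers would be backed by a 12-layer cube map array.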

This fixes texsubimage cube_map_array on CAYMAN at least,
(it was causing a GPU hang and bad values), it probably
also fixes it on radeonsi and evergreen.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89957
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-04-14 08:04:41 +10:00
Dave Airlie
5ed79312ed st/mesa: convert sub image for cube map arrays to 2d arrays for upload
Since we can subimage upload a number of cube map array layers,
that aren't a complete cube map array, we should specify things
as a 2D array and blit from that.

Suggested by Ilia Mirkin as an alternate fix for texsubimage
cube map array issues.

Seems to work just as well.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-04-14 08:04:21 +10:00
Mathias Froehlich
1e1d5456ba i965: Flush batchbuffer containing the query on glQueryCounter.
This change fixes a regression with timer queries introduced with
commit 3eb6258. There the pending batchbuffer is flushed
only if glEndQuery is executed. This present change adds such
a flush to glQueryCounter which also schedules a value query
just like glEndQuery does. The patch fixes GPU timer queries
going mad from within osgviewer.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
Cc: mesa-stable@lists.freedesktop.org
2015-04-13 20:30:23 +02:00
Eric Anholt
adae027260 vc4: Use the blit interface for updating shadow textures.
This lets us plug in a better blit implementation and have it impact the
shadow update, too.
2015-04-13 10:39:24 -07:00
Eric Anholt
39b6f7e76c vc4: Remove dead fields from vc4_surface. 2015-04-13 10:39:24 -07:00
Eric Anholt
5100221ff7 vc4: Skip sending down the clear colors if not clearing. 2015-04-13 10:39:24 -07:00
Eric Anholt
725620f21d vc4: Sync with kernel changes to relax BCL versus RCL validation.
There was no reason to tie the two packets' values together.
2015-04-13 10:39:23 -07:00
Eric Anholt
cb88d2cfcb vc4: Fix another space allocation mistake.
We're over-allocating our BCL in vc4_draw.c, so this never mattered.
However, new RCL-only blit support might end up here without having set up
any BCL contents.
2015-04-13 10:39:02 -07:00
Eric Anholt
8eb9304ee7 vc4: Add missed accounting for the size of the semaphore.
This wouldn't have mattered except in the worst case scenario RCL setup.
2015-04-13 10:33:30 -07:00
Matt Turner
89b140dfae swrast: Mark MAX_GLUINT literal with u suffix.
Coverity is confused by the "float < int / 2" expression and suggests
casting MAX_GLUINT to unsigned, which I believe it was supposed to have
been already.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-04-13 09:03:02 -07:00
Matt Turner
1c9db39d54 i965: Don't bother freeing NULL.
Commit e16c5c90 was replacing 'region' with 'mt', leaving this
nonsensical code.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-13 09:03:02 -07:00
Chad Versace
a76dc15b2b i965: Lift some restrictions on dma_buf EGLImages
Allow glEGLImageTargetRenderbufferStorageOES and
glEGLImageTargetTexture2DOES for dma_buf EGLImages if the image is
a single RGBA8 unorm plane. This is safe, despite fast color clears,
because i965 disables allocation of auxiliary buffers for EGLImages.

Chrome OS needs this, because its compositor uses dma_buf EGLImages for
its scanout buffers.

Testing:
  - Tested on Ivybridge Chromebook Pixel with WebGL Aquarium and
    YouTube.
  - No Piglit regressions on Broadwell with `piglit run -p gbm
    tests/quick.py`, with my Piglit patches that update the
    EGL_EXT_image_dma_buf_import tests.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-13 07:36:32 -07:00
Chad Versace
2943b15ce7 i965: Disable aux buffers for EGLImage-backed miptrees
EGL does not yet have extensions to manage the flushing and invalidating
of driver-internal aux buffers. So we must disable aux buffers of
dma_buf-backed EGLImages in order to safely render into them.

This patch is obviously needed for renderbuffers. It's also needed for
textures because the user can attach the texture to a framebuffer and
because the driver sometimes renders to textures for internal reasons.

Testing:
  - Tested on Ivybridge Chromebook Pixel with WebGL Aquarium and
    YouTube.
  - No Piglit regressions on Broadwell with `piglit run -p gbm
    tests/quick.py`.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-13 07:36:32 -07:00
Chad Versace
bf504b6127 i965: Change intel_miptree_create_for_bo() signature
Add parameter 'bool disable_aux_buffers'.

This is a refactor patch. The patch changes no behavior because the new
parameter is false in every call.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-13 07:36:32 -07:00
Chad Versace
d3b042f359 i965: Add field intel_mipmap_tree::disable_aux_buffers
The new field disables allocation of auxiliary buffers, such as the HiZ
buffer and MCS buffer. This is useful for sharing the miptree bo with an
external client that doesn't understand auxiliary buffers.

We need this field to safely render to a buffer that was imported with
EGL_EXT_image_dma_buf_import, because EGL does not yet have extensions
to manage flushing and invalidating auxiliary buffers.

Nothing yet enables this field. That's left to follow-up patches.

Testing:
  - Tested on Ivybridge Chromebook Pixel with WebGL Aquarium and
    YouTube.
  - No Piglit regressions on Broadwell with `piglit run -p gbm
    tests/quick.py`.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-13 07:36:29 -07:00
Chad Versace
e1338f267f i965: Refactor brw_is_hiz_depth_format()
Every caller of this function uses it to determine if the current
miptree needs a hiz buffer to be allocated. Strangely, the function
doesn't take a miptree argument. So, this function effectively decides
if and when a miptree's hiz buffer gets allocated without inspecting the
miptree itself.  Luckily, the driver behaves correctly despite
brw_is_hiz_depth_format's quirk.

I will soon make some changes to the miptree that will require
inspecting the miptree to determine if it needs a hiz buffer. So this
patch renames
    brw_is_hiz_depth_format -> intel_miptree_wants_hiz_buffer
and gives it a miptree parameter.

This patch shouldn't change any behavior.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-13 07:32:02 -07:00
Chad Versace
5776d65114 i965: Declare intel_miptree_create_layout() as static
It's not used outside intel_mipmap_tree.c.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-13 07:32:02 -07:00
Chad Versace
1ef4bf7191 i965: Declare intel_miptree_alloc_mcs() as static
It's not used outside of intel_mipmap_tree.c, nor should it ever be.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-04-13 07:32:02 -07:00
Jose Fonseca
36ceda4ece docs: Improve LLVM_USE_CRT_xxx instructions. 2015-04-13 13:08:13 +01:00
Jose Fonseca
fa1b3e1501 glx: Include util/macros.h instead of redefining PRINTFLIKE.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-04-13 12:03:33 +01:00
Jose Fonseca
978753e843 util/ralloc: Fix extern "C" usage.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-13 12:03:26 +01:00
Jose Fonseca
85dd46d90c mesa: Remove pointless USE_EXTERNAL_DXTN_LIB macro.
I'm not sure what was the original intention, but currently
USE_EXTERNAL_DXTN_LIB always ends up defined, one way or another.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-04-13 12:02:52 +01:00
Emil Velikov
5ddeab8a06 docs: add news item and link release notes for mesa 10.5.3
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-12 23:16:42 +01:00
Emil Velikov
a94f8e712f docs: Add sha256 sums for the 10.5.3 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 65776421fe)
2015-04-12 23:14:31 +01:00
Emil Velikov
794b9bf26a Add release notes for the 10.5.3 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit c4b8bff6e2)
2015-04-12 23:14:28 +01:00
Emil Velikov
61c6cc4a4a docs: remove the --with-max-{width,height} note
Missed out with commit d99135b2e9b (configure: nuke
--with-max-{width,height})

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-04-12 23:14:07 +01:00
Emil Velikov
0e742b1cb3 configure.ac: remove deprecated --with-libclc-path
The option was deprecated with commit 959e83d6507 (clover: Adapt libclc's
INCLUDEDIR and LIBEXECDIR to make use of the new introduced libclc.pc.)
back in 2012 with mesa 9.2.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-04-12 23:12:03 +01:00
Kenneth Graunke
b6354d9bb0 i965/nir: Make INTEL_DEBUG=ann work with NIR.
Now that we store a copy of the NIR shader, and don't immediately free
it, we can use it in annotations as well.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-11 12:35:47 -07:00
Kenneth Graunke
89c1feb78d i965: Create NIR during LinkShader() and ProgramStringNotify().
Previously, we translated into NIR and did all the optimizations and
lowering as part of running fs_visitor.  This meant that we did all of
that work twice for fragment shaders - once for SIMD8, and again for
SIMD16.  We also had to redo it every time we hit a state based
recompile.

We now generate NIR once at link time.  ARB programs don't have linking,
so we instead generate it at ProgramStringNotify time.

Mesa's fixed function vertex program handling doesn't bother to inform
the driver about new programs at all (which is rather mean), so we
generate NIR at the last minute, if it hasn't happened already.
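
The "last minute, if it hasn't happened already" fallback is a lazy-initialization pattern. A sketch with invented names (struct shader_prog, translate_to_nir and get_nir are not the driver's real API):

```c
#include <stddef.h>

struct shader_prog {
    void *nir;          /* cached IR, NULL until it has been generated */
    int   translations; /* how many times translation actually ran */
};

/* Hypothetical stand-in for the expensive translate/optimize/lower work. */
static int dummy_ir;

static void *translate_to_nir(struct shader_prog *prog)
{
    prog->translations++;
    return &dummy_ir;
}

/* Generate the IR on first use and return the cached copy afterwards. */
void *get_nir(struct shader_prog *prog)
{
    if (prog->nir == NULL)
        prog->nir = translate_to_nir(prog);

    return prog->nir;
}
```

Calling get_nir() for both the SIMD8 and SIMD16 compiles then pays for the translation only once.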

shader-db runs ~9.4% faster on my i7-5600U, with a release build.

v2: Check NirOptions != NULL in ProgramStringNotify().  Don't bother
    using _mesa_program_enum_to_shader_stage as we already know it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-11 12:35:33 -07:00
Kenneth Graunke
b3e286c457 nir: Store num_direct_uniforms in the nir_shader.
Storing this here is pretty sketchy - I don't know if any driver other
than i965 will want to use it.  But this will make it a lot easier to
generate NIR code at link time.  We'll probably rework it anyway.

(Ian suggested making nir_assign_var_locations_scalar_direct_first
 simply modify the nir_shader's fields, rather than passing pointers
 to them.  If this stays long term, we should do that.  But Jason and
 I suspect we'll be reworking this area again in the near future.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-11 11:39:48 -07:00
Kenneth Graunke
f41f07f685 i965: Move lower_output_reads to brw_link_shader().
This makes it so emit_nir_code() doesn't modify the GLSL IR.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-11 11:38:14 -07:00
Matt Turner
8e414cbdec glsl: Mark path as unreachable. 2015-04-11 10:23:05 -07:00
Matt Turner
ea0c35faf8 i965: Remove useless null check.
If it were null, we'd have just dereferenced it two lines above.
2015-04-11 09:59:47 -07:00
Matt Turner
024ecc783b i965/fs/nir: Mark fallthrough. 2015-04-11 09:59:47 -07:00
Matt Turner
1ac230975e i965: Remove useless reg_offset >= 0 tests.
Commit eb9bd3a1 changed the type of this field to uint16_t.
2015-04-11 09:59:46 -07:00
Rob Clark
b98c0262d1 freedreno/ir3/nir: couple little fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-11 11:41:03 -04:00
Rob Clark
1b936bb9f8 freedreno/ir3/nir: handle system values
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-11 11:40:57 -04:00
Rob Clark
715b2e0dbb freedreno/ir3/nir: handle txs and query_levels tex ops
These correspond to the tgsi TXQ opcode

(plus sneak in a fix for two-sided color)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-11 11:40:43 -04:00
Rob Clark
97e8fc3fdd freedreno/ir3/nir: split out tex helpers
We'll need these in one or two other spots.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-11 11:40:36 -04:00
Rob Clark
6e8160d6e3 freedreno/ir3/nir: simplify emit_tex()
Just build up arrays for src0/src1, and use create_collect()..

Also add back missing .3d flag for 3d/cube textures.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-11 11:40:28 -04:00
Rob Clark
d5357c16cc freedreno/ir3/cp: handle indirect properly
I noticed some cases where we were trying to copy-propagate indirect
src's into places they cannot go, like 2nd src for cat3 (mad, etc).
Expand out valid_flags() to be aware of relativ flag, and fix up a few
related spots.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-11 11:40:21 -04:00
Rob Clark
49be76166b freedreno/ir3/sched: avoid getting stuck on addr conflicts
When we get in a scenario where we cannot schedule any more instructions
due to address register conflict, clone the instruction that writes the
address register, and switch the remaining unscheduled users for the
current address register over to the new clone.

This is simpler and more robust than the previous attempt (which tried
and sometimes failed to ensure all other dependencies of users of the
address register were scheduled first).. hint it would try to schedule
instructions that were not actually needed for any output value.

We probably need to do the same with predicate register, although so far
it isn't so heavily used so we aren't running into problems with it
(yet).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-11 11:40:15 -04:00
Rob Clark
4cf4006674 freedreno/ir3/nir: add variable-indexing support
A bit fugly.. try and make this cleaner..  note if we hoist all the
get_addr() out of the loop we can drop the hashtable and just use
create_addr()..

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-11 11:40:09 -04:00
Rob Clark
972ce757d7 freedreno/ir3/asm: change assert to warning
It probably *should* be an assert, but for now TGSI f/e isn't very good
about dealing w/ CONST vs ABS/NEG.  So for debug builds, print a warning
instead of crashing with an assert for now.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-11 11:40:03 -04:00
Rob Clark
09cbd97a47 freedreno/ir3/nir: set first_driver_param
Without this, a3xx breaks.. a4xx would too if it had already implemented
support for passing driver params.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-11 11:39:56 -04:00
Rob Clark
f0e9a632a1 freedreno/ir3/cp: support to swap mad src's
For a normal MAD (ie. not MADSH), if first source is gpr and second
source is const, we can swap the first two sources to avoid needing a
mov instruction.

This gives back the biggest advantage TGSI f/e had over NIR f/e for
common shaders, since TGSI f/e had this logic in the f/e.  Note that
doing this in copy-prop step has the advantage that it will also work
for cases like:

   MOV TEMP[b], CONST[x]
   MAD TEMP[d], TEMP[a], TEMP[b], TEMP[c]
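
A sketch of the swap condition, with a made-up instruction layout (struct mad_instr and mad_swap_srcs are illustrative, not ir3's real data structures). Since a*b + c equals b*a + c, exchanging the first two sources does not change the result.

```c
#include <stdbool.h>

enum src_file { SRC_GPR, SRC_CONST };

struct mad_instr {
    enum src_file file[3];  /* register file of each source */
    int           index[3]; /* register index of each source */
};

/* Swap src0 and src1 when src0 is a GPR and src1 is a const.
 * Returns true if the sources were swapped. */
bool mad_swap_srcs(struct mad_instr *mad)
{
    if (mad->file[0] != SRC_GPR || mad->file[1] != SRC_CONST)
        return false;

    enum src_file tmp_file = mad->file[0];
    int tmp_index = mad->index[0];

    mad->file[0] = mad->file[1];
    mad->index[0] = mad->index[1];
    mad->file[1] = tmp_file;
    mad->index[1] = tmp_index;

    return true;
}
```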

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-11 11:39:46 -04:00
Rob Clark
f596135616 nir: fix bit of cargo-culting in lower_idiv
I guess I was looking too much at how lower_system_values worked when
writing lower_idiv.

Since ttn wasn't emitting load_var for sysvals and the only drivers
using lower_idiv were using ttn, I think nothing was broken as a result.
But might as well fix this before it becomes a problem.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-11 10:43:16 -04:00
Rob Clark
58add76791 nir: split out lower_sub from lower_negate
Originally you had to have one or the other.  But actually I don't want
either.  (Or rather I want whatever is the minimum # of instructions.)

TODO: not sure where the best place to insert a check that driver hasn't
set *both* lower_negate and lower_sub?

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-11 10:43:16 -04:00
Rob Clark
fd65122a90 gallium/ttn: add support for system values
So far just the system values that freedreno supports, so we may add
more later.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-11 10:43:16 -04:00
Rob Clark
2faa878f13 gallium/ttn: fix TXD
With TXD we also have the ddx/ddy sources (before the sampler).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-11 10:43:16 -04:00
Rob Clark
ca3ae90490 gallium/ttn: add TXQ support (v2)
Split out from ttn_tex() since it is kind of a weird instruction that
maps to two NIR opcodes, and it was cleaner this way.

v2: query_levels doesn't take any args

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-11 10:43:15 -04:00
Rob Clark
0b71451920 gallium/ttn: split out helper to get texture info
We'll need this as well for TXQ.  Split this out first to reduce noise
in the next patch.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-11 10:43:15 -04:00
Rob Clark
96c0f9328d gallium/ttn: add support for temp arrays
Since the rest of NIR really would rather have these as variables rather
than registers, create a nir_variable per array.  But rather than
completely re-arrange ttn to be variable based rather than register
based, keep the registers.  In the cases where there is a matching var
for the reg, ttn_emit_instruction will append the appropriate intrinsic
to get things back from the shadow reg into the variable.

NOTE: this doesn't quite handle TEMP[ADDR[]] when the DCL doesn't give
an array id.  But those just kinda suck, and should really go away.
AFAICT we don't get those from glsl.  Might be an issue for some other
state tracker.

v2: rework to use load_var/store_var with deref chains
v3: create new "burner" reg for temporarily holding the (potentially
writemask'd) dest after each instruction; add load_var to initialize
temporary dest in case not all components are overwritten
v4: review comments: asserts and use ttn_src_for_indirect() in
ttn_array_deref() so we can drop later patch converting to use vec1 for
addr reg (since ttn_src_for_indirect() handles the imov to vec1 from
tgsi addr component that we want)
v5: rebase: new requirements about parent mem ctx for derefs

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-11 10:41:45 -04:00
Rob Clark
b91d987140 gallium/ttn: minor cleanup
Extract tgsi_dst->Index into a local.. split out from 'gallium/ttn: add
support for temp arrays' for noise reduction..

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-11 10:24:50 -04:00
Jason Ekstrand
d47405eb70 i965: Use NIR by default for fragment shaders
GLSL IR vs. NIR shader-db results on i965:

   total instructions in shared programs: 2889747 -> 2890782 (0.04%)
   instructions in affected programs:     2425446 -> 2426481 (0.04%)
   helped:                                3698
   HURT:                                  5341

GLSL IR vs. NIR shader-db results on g4x:

   total instructions in shared programs: 2547252 -> 2550440 (0.13%)
   instructions in affected programs:     1984482 -> 1987670 (0.16%)
   helped:                                2844
   HURT:                                  4776

GLSL IR vs. NIR shader-db results on Iron Lake:

   total instructions in shared programs: 4053381 -> 4063828 (0.26%)
   instructions in affected programs:     3026601 -> 3037048 (0.35%)
   helped:                                4110
   HURT:                                  8331
   GAINED:                                1287
   LOST:                                  9

GLSL IR vs. NIR shader-db results on Sandy Bridge:

   total instructions in shared programs: 5307041 -> 5236666 (-1.33%)
   instructions in affected programs:     3442908 -> 3372533 (-2.04%)
   helped:                                11829
   HURT:                                  5604
   GAINED:                                33
   LOST:                                  18

GLSL IR vs. NIR shader-db results on Ivy Bridge:

   total instructions in shared programs: 4926333 -> 4857017 (-1.41%)
   instructions in affected programs:     3144042 -> 3074726 (-2.20%)
   helped:                                11559
   HURT:                                  4774
   GAINED:                                46
   LOST:                                  25

GLSL IR vs. NIR shader-db results on Bay Trail:

   total instructions in shared programs: 4926333 -> 4857017 (-1.41%)
   instructions in affected programs:     3144042 -> 3074726 (-2.20%)
   helped:                                11559
   HURT:                                  4774
   GAINED:                                46
   LOST:                                  25

GLSL IR vs. NIR shader-db results on Haswell:

   total instructions in shared programs: 4392487 -> 4293476 (-2.25%)
   instructions in affected programs:     2800180 -> 2701169 (-3.54%)
   helped:                                13073
   HURT:                                  3383
   GAINED:                                46
   LOST:                                  23

GLSL IR vs. NIR shader-db results on Broadwell (FS only):

   total instructions in shared programs: 4378113 -> 4283025 (-2.17%)
   instructions in affected programs:     2743209 -> 2648121 (-3.47%)
   helped:                                12470
   HURT:                                  3609
   GAINED:                                64
   LOST:                                  27
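
For reference, the percentages in these tables are relative changes against the "before" count. A small sketch of that arithmetic (the function name is hypothetical; shader-db's own report script is not shown here):

```c
/* Relative change from `before` to `after`, in percent.
 * Negative values mean fewer instructions after the change. */
double percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

With the Broadwell totals, percent_change(4378113, 4283025) comes out at roughly -2.17, which agrees with the figure reported above.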

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-10 17:19:54 -07:00
Kenneth Graunke
c2a0600d5b i965: Don't set NirOptions for stages that will use the vec4 backend.
We've started using NirOptions != NULL to mean "we're using NIR for this
stage."  However, when INTEL_USE_NIR=1, we set it for a bunch of stages
that still use the vec4 backend, and thus definitely aren't using NIR.

For example, if INTEL_USE_NIR=1 we disable the GLSL IR cubemap
normalization pass, even for vertex shaders and geometry shaders.  This
is wrong, but breaks a very uncommon case.

When I started deleting GLSL IR for stages where we claimed to be using
NIR, this bug quickly became apparent.

For now, only set it for fragment shaders, and vertex shaders if
brw->scalar_vs is set.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-10 16:22:48 -07:00
Nick Sarnie
f9048ee3c8 gallivm: Fix build since llvm-3.7.0svn r234495
Revert 50e9fa2ed6 as LLVM reverted their
change.

Signed-off-by: Nick Sarnie <commendsarnex@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
2015-04-10 13:30:23 -04:00
Ville Syrjälä
50db8bd1b5 i965/disasm: Print the type after the swizzle also for 3src src operands
The disassembly currently has the swizzle after the type for 3src source
operands, and the other way around for 2src. Flip the type and swizzle
around for 3src so that the output matches 2src.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2015-04-10 14:53:12 +03:00
Kenneth Graunke
ae17f34850 i965: Move brw_link_shader's GLSL IR transformations into a helper.
This function was getting a bit large and unwieldy.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-10 02:16:37 -07:00
Kenneth Graunke
10d85ffc5a i965: Change brw_shader to gl_shader in brw_link_shader().
Nothing actually wanted brw_shader fields - we just had to type
shader->base all over the place for no reason.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-10 02:16:35 -07:00
Kenneth Graunke
500da98e0b nir: Constify nir_lower_sampler's gl_shader_program pointer.
Now that we're not generating linker errors, we don't actually modify
this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-10 02:16:33 -07:00
Kenneth Graunke
709b88ccd8 nir: Remove linker_error calls from nir_lower_samplers().
These should never happen.  Plus, NIR passes really shouldn't be
reporting linker errors - this is past link time.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-10 02:16:31 -07:00
Kenneth Graunke
99264b7f37 nir: Make nir_lower_samplers take a gl_shader_stage, not a gl_program *.
We don't actually need a gl_program struct.  We only used it to
translate prog->Target (i.e. GL_VERTEX_PROGRAM) to the gl_shader_stage
(i.e. MESA_SHADER_VERTEX).  We may as well just pass that.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-10 02:16:29 -07:00
Kenneth Graunke
4b27391cad nir: Move gl_shader_stage enum from mtypes.h to shader_enums.h.
I want to use this in some code that doesn't currently include mtypes.h.
It seems like a better place for it anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-10 02:16:27 -07:00
Kenneth Graunke
feafe70399 nir: Fix #include guards in shader_enums.h.
This header was originally going to be called pipeline.h, but it got
renamed at the last minute.  Make the include guards match.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-10 02:16:25 -07:00
Kenneth Graunke
d0f39a2fcd nir: Constify prog_to_nir's gl_program pointer.
prog_to_nir should not modify the incoming Mesa IR program - just
translate it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-10 02:15:58 -07:00
Vinson Lee
50e9fa2ed6 gallivm: Fix build since llvm-3.7.0svn r234460.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89963
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-04-09 10:41:26 -07:00
Roland Scheidegger
a873b79fa5 draw: (trivial) don't print the shader twice with GALLIVM_DEBUG=tgsi (or ir)
Neither the shader nor the key change when doing elts or linear variant, so
this was just annoying (probably mildly useful at some point when we printed
the IR per function too).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-04-09 01:32:30 +02:00
Roland Scheidegger
586536a4e1 gallivm: don't use control flow when doing indirect constant buffer lookups
llvm goes crazy when doing that, using way more memory and time, though there's
probably more to it - this points to a very much similar issue as fixed in
8a9f5ecdb1. In any case I've seen a quite
plain looking vertex shader with just ~50 simple tgsi instructions (but with a
dozen or so such indirect constant buffer lookups) go from a terribly high
~440ms compile time (consuming 25MB of memory in the process) down to a still
awful ~230ms and 13MB with this fix (with llvm 3.3), so there's still obvious
improvements possible (but I have no clue why it's so slow...).
The resulting shader is most likely also faster (certainly seemed so though
I don't have any hard numbers as it may have been influenced by compile times)
since generally fetching constants outside the buffer range is most likely an
app error (that is we expect all indices to be valid).
It is possible this fixes some mysterious vertex shader slowdowns we've seen
ever since we are conforming to newer apis at least partially (the main draw
loop also has similar looking conditionals which we probably could do without -
if not for the fetch at least for the additional elts condition.)

v2: use static vars for the fake bufs, minor code cleanups

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-04-09 01:32:30 +02:00
Brian Paul
09e7e2016b glsl: check for forced_language_version in is_version()
This is a follow-on fix from the earlier "glsl: allow ForceGLSLVersion
to override #version directives" change.  Since we're not changing
the language_version field, we have to check forced_language_version
here.
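
A minimal sketch of the check described above, not the actual Mesa code.
The struct, the es_shader field and the two-parameter signature are
assumptions made for illustration; only language_version and
forced_language_version are named in this message.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for the parser state. */
struct parse_state {
   bool es_shader;
   unsigned language_version;         /* from the #version directive */
   unsigned forced_language_version;  /* 0 when no override is set */
};

/* A required version of 0 means "not available in this flavour". */
static bool
is_version(const struct parse_state *s,
           unsigned required_glsl, unsigned required_es)
{
   unsigned required = s->es_shader ? required_es : required_glsl;
   /* Prefer the forced version, since language_version is left untouched. */
   unsigned current = s->forced_language_version
                         ? s->forced_language_version
                         : s->language_version;
   return required != 0 && current >= required;
}
```

With language_version 120 and a forced version of 130, a check for 1.30
features passes even though the shader's own directive says 1.20.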

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-08 17:03:16 -06:00
Neil Roberts
4deca1274c i965/skl: Fix the order of the arguments for the LD sampler message
In Skylake the order of the arguments for sample messages with the LD
type is u, v, lod, r, whereas previously it was u, lod, v, r.
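
The reordering can be sketched as follows.  The enum and helper are
hypothetical names for illustration and do not reflect how the i965
backend actually builds the message payload.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum ld_arg { LD_U, LD_V, LD_LOD, LD_R };

/* Fill order[] with the payload order of the LD message arguments. */
static void
ld_arg_order(bool is_skylake, enum ld_arg order[4])
{
   assert(order != NULL);
   order[0] = LD_U;
   order[1] = is_skylake ? LD_V : LD_LOD;
   order[2] = is_skylake ? LD_LOD : LD_V;
   order[3] = LD_R;
}
```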

This fixes 144 Piglit tests including ones that directly use
texelFetch and also some using the meta stencil blit path which
appears to use texelFetch in its shader.

v2: Fix sampling 1D textures

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-04-08 12:08:41 +01:00
Zhenyu Wang
eb51c6d55f i965: Fix depth field setting in surface state for raw buffer on Gen7/8
On Gen7/8, for the RAW surface format the depth field (surf[3]) in the
surface state holds bits [30:21] of the number of entries, unlike other
surface formats, which use only bits [26:21].
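
A small sketch of the two bit ranges, assuming the entry count is passed
in as-is.  The helper names are hypothetical, and whether the hardware
wants the count or the count minus one is not covered here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract bits [hi:lo] (inclusive) of value. */
static uint32_t
bits(uint32_t value, unsigned hi, unsigned lo)
{
   assert(hi >= lo && hi < 32);
   unsigned width = hi - lo + 1;
   uint32_t mask = (width == 32) ? 0xffffffffu : ((1u << width) - 1u);
   return (value >> lo) & mask;
}

/* Depth-field value for a buffer surface: RAW uses a wider bit range. */
static uint32_t
buffer_depth_field(uint32_t num_entries, bool is_raw)
{
   return is_raw ? bits(num_entries, 30, 21) : bits(num_entries, 26, 21);
}
```

For an entry count with bit 28 set, the RAW variant keeps that bit while
the narrower [26:21] range drops it.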

Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-04-08 13:20:17 +08:00
Dave Airlie
6b722c390b u_tile: fix warnings about incompatible casts.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-04-08 10:31:42 +10:00
Glenn Kennard
f2947807c8 r600g/sb: Enable SB for geometry shaders
Add SV_GEOMETRY_EMIT special variable type to track the
implicit dependencies between CUT/EMIT_VERTEX/MEM_RING
instructions so GCM/scheduler doesn't reorder them.

Mark emit instructions as unkillable so DCE doesn't eat them.

Enable only for evergreen/cayman as there are a few
unexplained GS piglit regressions on R6xx/R7xx with SB
enabled otherwise.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-04-08 08:18:35 +10:00
Glenn Kennard
06bb68da4a r600g/sb: Update last_cf for loops
CF_END could end up emitted in the middle of a shader on cayman
when there was a loop at the very end.

Fixes glsl-1.50-geometry-end-primitive and
ext_transform_feedback-geometry-shaders-basic piglit tests.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-04-08 08:18:17 +10:00
Dave Airlie
61393bdcdc u_tile: fix stencil texturing tests under softpipe
arb_stencil_texturing-draw failed under softpipe because we got a float
back from the texturing function and then tried to U2F it. Stencil
texturing returns ints, so fix the tiling to retrieve the stencil
values as integers, not floats.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-04-08 08:17:32 +10:00
Jason Ekstrand
11694737fc nir: Make nir_*_instr_create take a nir_shader instead of a void * context
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-07 14:34:21 -07:00
Kenneth Graunke
a10d493715 nir: Implement a nir_sweep() pass.
This pass performs a mark and sweep pass over a nir_shader's associated
memory - anything still connected to the program will be kept, and any
dead memory we dropped on the floor will be freed.

The expectation is that this will be called when finished building and
optimizing the shader.  However, it's also fine to call it earlier, and
many times, to free up memory earlier.
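
As a rough illustration of the mark-and-sweep idea only: the real pass
works on ralloc contexts and NIR data structures, whereas this toy uses
an explicit mark bit and a hypothetical node type with one outgoing
reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct node {
   struct node *all_next;  /* every allocation, live or dead */
   struct node *child;     /* single outgoing reference, may be NULL */
   bool marked;
};

struct heap {
   struct node *all;
};

static struct node *
heap_alloc(struct heap *h, struct node *child)
{
   struct node *n = calloc(1, sizeof(*n));
   assert(n != NULL);
   n->child = child;
   n->all_next = h->all;
   h->all = n;
   return n;
}

/* Mark everything reachable from root. */
static void
mark(struct node *root)
{
   for (struct node *n = root; n != NULL && !n->marked; n = n->child)
      n->marked = true;
}

/* Free every unmarked node, clear marks on survivors, return freed count. */
static unsigned
sweep(struct heap *h)
{
   unsigned freed = 0;
   struct node **link = &h->all;
   while (*link != NULL) {
      struct node *n = *link;
      if (n->marked) {
         n->marked = false;
         link = &n->all_next;
      } else {
         *link = n->all_next;
         free(n);
         freed++;
      }
   }
   return freed;
}
```

Calling mark() on the root and then sweep() can be repeated at any
point; a second sweep with nothing new dropped frees nothing.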

v2: (feedback from Jason Ekstrand)
- Skip sweeping impl->start_block, as it's already in the CF list.
- Don't sweep SSA defs (they're owned by their defining instruction)
- Don't steal phi sources (they're owned by nir_phi_instr).
- Don't steal tex->src (it's owned by the tex_inst itself)
- Don't sweep dereference chains (top-level dereferences are owned by
  the instruction; sub-dereferences are owned by the parent deref).
- Don't sweep sources and destinations (SSA defs are handled as part of
  the defining instruction, and registers are handled as part of
  function implementations).
- Just steal instructions; don't walk them (no longer required).

v3: (feedback from Jason Ekstrand)
- Steal indirect sources from nir_src/nir_dest.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-07 14:34:14 -07:00
Kenneth Graunke
de2014cf1e nir: Allocate dereferences out of their parent instruction or deref.
Jason pointed out that variable dereferences in NIR are really part of
their parent instruction, and should have the same lifetime.

Unlike in GLSL IR, they're not used very often - just for intrinsic
variables, call parameters & return, and indirect samplers for
texturing.  Also, nir_deref_var is the top-level concept, and
nir_deref_array/nir_deref_record are child nodes.

This patch attempts to allocate nir_deref_vars out of their parent
instruction, and any sub-dereferences out of their parent deref.
It enforces these restrictions in the validator as well.

This means that freeing an instruction should free its associated
dereference chain as well.  The memory sweeper pass can also happily
ignore them.

v2: Rename make_deref to evaluate_deref and make it take a nir_instr *
    instead of void *.  This involves adding &instr->instr everywhere.
    (Requested by Jason Ekstrand.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-07 14:34:14 -07:00
Kenneth Graunke
4f4b04b7c7 nir: Allocate nir_ssa_def::uses/if_uses out of the instruction.
We can't allocate them out of the nir_ssa_def itself, because it may not
be ralloc'd (for example, nir_dest embeds a nir_ssa_def).

However, allocating them out of the instruction should work.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-07 14:34:13 -07:00
Kenneth Graunke
900498bd11 nir: Allocate nir_phi_src values out of the nir_phi_instr.
Phi sources are part of the phi instruction and should have the same
lifetime.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-07 14:34:13 -07:00
Kenneth Graunke
b05d53404c nir: Allocate nir_call_instr::params out of the nir_call itself.
The lifetime of the params array needs to match the nir_call_instr
itself.  So, allocate it using the instruction itself as the context.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-07 14:34:13 -07:00
Kenneth Graunke
73d106822e i965: Add the ability to render to I8/L8 and I16/L16 UNORM formats.
This allows those formats to work with the meta PBO upload path.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-04-07 14:34:02 -07:00
Kenneth Graunke
60dcd97257 i965: Use SET_FIELD in 3DSTATE_STREAMOUT packets.
Suggested by Topi Pohjolainen.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-04-07 14:34:02 -07:00
Jason Ekstrand
2e3b35a1cb nir/lower_tex_projector: Don't use designated initializers
These don't work in MSVC or in older versions of GCC

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89899
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2015-04-07 11:49:39 -07:00
Tapani Pälli
1aa5738e66 glsl: relax input->output validation for SSO programs
Commit 18004c3 introduced more restrictive validation between inputs
and outputs in the linker. This patch skips the additional check for
programs that use GL_ARB_separate_shader_objects, where inputs and
outputs need not match exactly at link time, only when the final
pipeline is constructed.

The stricter check made some of the GL_ARB_program_interface_query test
shaders fail to link; those tests can be used to verify the change.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-04-07 08:11:07 +03:00
Ilia Mirkin
ae720c66cb nv50,nvc0: limit the y-tiling of 3d textures to the first level's tiling
We limit y-tiling to 0x20 when depth is involved. However the function is
run for each miplevel, and the hardware expects miplevel 0 to have the
highest tiling settings. Perform the y-tiling limit on all levels of a
3d texture, not just the ones that have depth.

Fixes:
  texelFetch fs sampler3D 98x129x1-98x129x9

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Nick Tenney <nick.tenney@gmail.com> # GT216
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-04-06 23:06:55 -04:00
Dave Airlie
ad84689f73 r600g: fix op3 abs issue
The code to handle absolute values on op3 srcs was a bit too simple:
it really needs a temp reg per src, not one per channel. Make it
easier and let sb clean up the mess.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89831

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-04-07 11:40:16 +10:00
Iago Toral Quiroga
2042a2f961 i965: Do not render primitives in non-zero streams when TF is disabled
Haswell hardware seems to ignore Render Stream Select bits from
3DSTATE_STREAMOUT packet when the SOL stage is disabled even if
the PRM says otherwise. Because of this, all primitives are sent
down the pipeline for rasterization, which is wrong. If SOL is
enabled, Render Stream Select is honored and primitives bound to
non-zero streams are discarded after stream output.

Since the only purpose of primitives sent to non-zero streams is to
be recorded by transform feedback, we can simply discard all geometry
bound to non-zero streams when transform feedback is disabled
to prevent it from ever reaching the rasterization stage.

Notice that this patch introduces a small change in the behavior we
get when a geometry shader emits more vertices than the maximum declared:
before, a vertex that was emitted to a non-zero stream when TF was
disabled would still count for the purposes of checking that we don't
exceed the maximum number of output vertices declared by the shader. With
this change, these vertices are completely ignored and won't increase
the output vertex count, making more room for other (hopefully more
useful) vertices.

Fixes piglit test arb_gpu_shader5-emitstreamvertex_nodraw on Haswell
and Broadwell.

v2 (Ken): Drop is_haswell check in favor of doing this unconditionally.
Broadwell needs the workaround as well, and it doesn't hurt to do it in
general.  Also tweak comments - the Haswell PRM does actually mention
this ("Command Reference: Instructions" page 797).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83962
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2015-04-06 16:00:41 -07:00
Kenneth Graunke
f368d0fa1f i965: Add forgotten multi-stream code to Gen8 SOL state.
Fixes Piglit's arb_gpu_shader5-xfb-streams-without-invocations.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
2015-04-06 14:07:28 -07:00
Kenneth Graunke
f9e5dc0a85 i965: Fix instanced geometry shaders on Gen8+.
Jordan added this in commit 741782b594 for
Gen7 platforms.  I missed this when adding the Broadwell code.

Fixes Piglit's spec/arb_gpu_shader5/invocation-id-{basic,in-separate-gs}
with MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader5 set.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
2015-04-06 14:06:26 -07:00
Kenneth Graunke
a09c5b8527 i965: Free dead GLSL IR one last time.
While working on NIR's memory allocation model, I realized the GLSL IR
memory model was broken.

During glCompileShader, we allocate everything out of the
_mesa_glsl_parse_state context, and reparent it to gl_shader at the end.

During glLinkProgram, we allocate everything out of a temporary context,
then reparent it to the exec_list containing the linked IR.

But during brw_link_shader - the driver's final opportunity to do
lowering and optimization - we just allocated everything out of the
permanent context given to us by the linker.  That memory stayed
forever.

Notably, passes like brw_fs_channel_expressions cause us to churn the
majority of the code, so we really want to free dead IR here.

Saves 125MB of memory when replaying a Dota 2 trace on Broadwell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-06 14:03:43 -07:00
Kenneth Graunke
797d606127 i965: Implement SIMD16 texturing on Gen4.
This allows SIMD16 mode to work for a lot more programs.  Texturing is
also more efficient in SIMD16 mode than SIMD8.  Several messages don't
actually exist in SIMD8 mode, so we did SIMD16 messages and threw away
half of the data.  Now we compute real data in both halves.

Also, the SIMD16 "sample" message doesn't require all three coordinate
components to exist (like the SIMD8 one), so we can shorten the message
lengths, cutting register usage a bit.

I chose to implement the visitor functionality in a separate function,
since mixing true SIMD16 with SIMD8 code that uses SIMD16 fallbacks
seemed like a mess.  The new code bails on a few cases where we'd
have to do two SIMD8 messages - we just fall back to SIMD8 for now.

Improves performance in "Shadowrun: Dragonfall - Director's Cut" by
about 20% on GM45 (measured with LIBGL_SHOW_FPS=1 while standing around
in the first mission).

v2: Add ir_txf to the has_lod case (caught by Jordan Justen).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-04-06 13:49:02 -07:00
Kenneth Graunke
8aee87fe4c i965: Use SIMD16 instead of SIMD8 on Gen4 when possible.
Gen5+ systems allow you to specify multiple shader programs - both SIMD8
and SIMD16 - and the hardware will automatically dispatch to the most
appropriate one, given the number of subspans to be processed.

However, that is not the case on Gen4.  Instead, you program a single
shader.  If you enable multiple dispatch modes (SIMD8 and SIMD16), the
shader is supposed to contain a series of jump instructions at the
beginning.  The hardware will launch the shader at a small offset,
hitting one of the jumps.

We've always thought that sounds like a pain, and weren't clear how it
affected performance - is it worth having multiple shader types?  So,
we never bothered with SIMD16 until now.

This patch takes a simpler approach: try and compile a SIMD16 shader.
If possible, set the no_8 flag, telling the hardware to just use the
SIMD16 variant all the time.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-04-06 13:49:02 -07:00
Kenneth Graunke
108b92b1e9 i965: Respect the no_8 flag on Gen4-5.
This flag means to ignore the SIMD8 program and only use the SIMD16 one.
It was originally meant for repdata clear shaders, but I plan to use it
for other things on Gen4 as well.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-04-06 13:49:02 -07:00
Kenneth Graunke
62050886c8 i965/fp: Set coord_components correctly for cube textures.
I've no idea why this was 4.  It certainly seems wrong.

Prevents assertion failures in fp-incomplete-tex with some upcoming
patches of mine.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-04-06 13:49:01 -07:00
Ian Romanick
dd7d068784 glsl/cse: Maintain a list of free ae_entry objects
The CSE algorithm will continuously allocate new ae_entry objects.  As
each new basic block is exited, all of the previously allocated objects
are dumped.  Instead, put them in a free list and re-use them in the
next basic block.  Reduce, reuse, recycle!
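
A minimal sketch of the free-list idea, assuming a simplified entry type.
The struct and function names are hypothetical and the real ae_entry
carries different fields.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the pass's ae_entry. */
struct entry {
   struct entry *next;
   int payload;
};

struct pool {
   struct entry *in_use;  /* entries live in the current basic block */
   struct entry *free;    /* entries recycled from earlier blocks */
};

/* Take an entry from the free list if possible, else allocate a new one. */
static struct entry *
pool_get(struct pool *p, int payload)
{
   struct entry *e = p->free;
   if (e != NULL)
      p->free = e->next;
   else
      e = malloc(sizeof(*e));
   assert(e != NULL);

   e->payload = payload;
   e->next = p->in_use;
   p->in_use = e;
   return e;
}

/* On leaving a basic block, move every live entry to the free list. */
static void
pool_end_block(struct pool *p)
{
   while (p->in_use != NULL) {
      struct entry *e = p->in_use;
      p->in_use = e->next;
      e->next = p->free;
      p->free = e;
   }
}
```

After pool_end_block(), the next pool_get() hands back a recycled entry
instead of calling malloc() again.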

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2015-04-06 11:53:59 -07:00
Matt Turner
d131630c08 nir: Remove fsin_reduced/fcos_reduced.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-06 10:13:22 -07:00
Matt Turner
c8d65dd713 st/mesa: Remove unused emit_scs().
Was only used by the sin_reduced/cos_reduced cases, which themselves
were impossible to reach.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-06 10:13:22 -07:00
Matt Turner
5fb735b756 program: Remove unused emit_scs().
Was only used by the sin_reduced/cos_reduced cases, which themselves
were impossible to reach.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-06 10:13:22 -07:00
Matt Turner
cdb1eb9a3f i965/vec4: Remove emit_scs() prototype.
This has never existed.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-06 10:13:22 -07:00
Matt Turner
5c71cf8531 glsl: Remove never used sin_reduced/cos_reduced.
These were added in commit f2616e56, presumably in preparation for
translating ARB vp/fp into GLSL IR. That never happened, and neither did
a lowering pass that actually generated these instructions.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-06 10:13:22 -07:00
Antia Puentes
490621f0f2 glsl: Update the #line behaviour on GLSL 3.30+ and GLSL ES+
From GLSL 3.30 and GLSL ES 1.00 on, after processing the line
directive (including its new-line), the implementation should
behave as if it is compiling at the line number passed as
argument. In previous versions, it behaved as if compiling
at the passed line number + 1.
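
The rule can be sketched as a small helper.  The function name and its
parameters are illustrative and are not the preprocessor's real
interface.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Line number assigned to the source line that follows "#line <n>".
 * version is the GLSL version times 100 (e.g. 330); is_es selects GLSL ES.
 */
static int
line_after_directive(int n, int version, bool is_es)
{
   assert(n >= 0);
   if (is_es || version >= 330)
      return n;      /* GLSL 3.30+ and all GLSL ES versions */
   return n + 1;     /* earlier desktop GLSL versions */
}
```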

Partially fixes https://bugs.freedesktop.org/show_bug.cgi?id=88815

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-06 08:55:10 +02:00
Antia Puentes
c0a7014601 glsl: respect the source number set by #line <line> <source>
From GLSL 1.30.10, section 3.3 (Preprocessor):
"#line line source-string-number ... After processing this directive
(including its new-line), the implementation will behave as if it is
compiling at ... source string number source-string-number. Subsequent
source strings will be numbered sequentially, until another #line
directive overrides that numbering."

In the previous implementation the source number was always zero.
Subsequent source strings are still not numbered sequentially, because
in the glShaderSource implementation we are concatenating the source code
strings into one long string.

Partially fixes https://bugs.freedesktop.org/show_bug.cgi?id=88815

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-06 08:50:41 +02:00
Iago Toral Quiroga
47597f8f5c i965: Make sure we always mark array surfaces as such
Even if they only have one slice, otherwise textureSize() won't
produce correct results for the depth value.

Fixes 10 dEQP tests in this category:
dEQP-GLES3.functional.shaders.texture_functions.texturesize.sampler2darray*

Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2015-04-06 08:07:42 +02:00
Rob Clark
8b0b81339b freedreno/ir3: add NIR compiler
The NIR compiler frontend is an alternative to the TGSI f/e, producing
the same ir3 IR and using the same backend passes for scheduling, etc.

It is not enabled by default yet, as there are still some regressions.
To enable, use 'FD_MESA_DEBUG=nir'.  It works well enough to run, for
example, xonotic or supertuxkart.

With the NIR f/e, scalarizing and a number of other lowering steps
happen in NIR, so we don't have to do them in ir3.  Which simplifies the
f/e and allows the lowered instructions to pass through other
optimization stages.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-05 16:36:40 -04:00
Ilia Mirkin
700d949ea1 freedreno/a3xx: don't decode srgb on mem2gmem
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-05 16:36:35 -04:00
Ilia Mirkin
b060b56772 freedreno/a3xx: pass sprite coord mode through to program emit
Use the correct sprite replacement depending on the flip of the coord
mode, using either T or 1-T depending on whether we have an upper-left or
lower-left coordinate origin. This fixes all the point sprite piglits.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-05 16:36:35 -04:00
Ilia Mirkin
1de72dfc8a freedreno/a3xx: add UBO support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-05 16:36:35 -04:00
Ilia Mirkin
c7811f56c2 freedreno/ir3: insert nop between sfu/mem operations
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-05 16:36:35 -04:00
Ilia Mirkin
14dfd8cc43 freedreno: dirty context when reallocating a bound bo
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-05 16:36:35 -04:00
Ilia Mirkin
bde2045fa2 freedreno: keep track of buffer valid ranges
Copies nouveau_buffer and radeon_buffer. This allows a write to proceed
to an uninitialized part of a buffer even when the GPU is using the
previously-initialized portions.
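
A sketch of the valid-range idea under simplifying assumptions: one
half-open byte range per buffer and hypothetical helper names, rather
than the structures the driver actually uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Half-open range [start, end) of bytes that have ever been written. */
struct valid_range {
   unsigned start;
   unsigned end;
};

static void
range_init(struct valid_range *r)
{
   r->start = ~0u;   /* empty: start > end */
   r->end = 0;
}

/* Grow the range so that it also covers [start, end). */
static void
range_add(struct valid_range *r, unsigned start, unsigned end)
{
   assert(start <= end);
   if (start < r->start)
      r->start = start;
   if (end > r->end)
      r->end = end;
}

static bool
range_intersects(const struct valid_range *r, unsigned start, unsigned end)
{
   return start < r->end && end > r->start;
}

/* A write must wait for the GPU only if it touches initialized bytes. */
static bool
write_needs_sync(const struct valid_range *r, bool gpu_busy,
                 unsigned start, unsigned end)
{
   return gpu_busy && range_intersects(r, start, end);
}
```

A write to bytes outside the recorded range can proceed immediately even
while the GPU is busy; the caller then extends the range with
range_add().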

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-05 16:36:35 -04:00
Ilia Mirkin
dacf22e0a3 freedreno: mark resources as being read so that writes flush the queue
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-05 16:36:34 -04:00
Ilia Mirkin
2e1445c8f3 freedreno: don't bother setting resource timestamps
Waiting on a bo being ready is handled in fd_bo_cpu_prep. No need to
keep separate timestamps around.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-05 16:36:34 -04:00
Ilia Mirkin
1fee3061d5 freedreno: add a reading flag to indicate gpu is reading rsc
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-05 16:36:34 -04:00
Ilia Mirkin
ea0952a9db freedreno: fix resource flushing confusion
A resource flush is an upload of a hypothetically-staging texture to the
GPU. For a UMA system, this will largely be a no-op or
cache-maintenance. Move the render flush logic into transfer_map where
it belongs, and clear out the transfer_flush function.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-05 16:36:34 -04:00
Ilia Mirkin
bfb0a8eb69 freedreno: remove tex_resource
pipe_sampler_view already contains a texture, remove the redundant
tex_resource member which pointed at the same thing.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-05 16:36:34 -04:00
Rob Clark
6cd9c94ce4 freedreno/ir3: handle FRAG IN's without interpolation specified
Fall back to picking based on semantic name.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-05 16:36:34 -04:00
Rob Clark
f513f006ce freedreno/ir3/cmdline: add @const headers for immediates
Since NIR f/e currently encodes immediates in instructions (rather than
passing via const), we need to ensure that when consts are used they get
initialized to the proper values.  Otherwise comparing NIR to TGSI
compiler, it will use proper immediate values in one case, and randomly
initialize values in the other.  Which confuses ir3test.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-05 16:36:34 -04:00
Rob Clark
6bc12bb5fd freedreno/ir3/cmdline: remove hack for old compiler
Since we dropped the old compiler, we don't need this hack anymore.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-05 16:36:34 -04:00
Rob Clark
f370e95421 freedreno/ir3: handle const/immed/abs/neg in cp
Be smarter about propagating copies from const or immed, or with abs/neg
modifiers.  Also, realize that absneg.s and absneg.f are really "fancy"
mov instructions.

This opens up the possibility to remove more copies.  It helps the TGSI
frontend a bit, but will be really needed for the NIR f/e which builds
everything up in SSA form (ie. will *always* insert a mov from const or
immediate).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-05 16:36:34 -04:00
Rob Clark
104713d9f2 freedreno/ir3: split float/int abs/neg
Even though in the end, they map to the same bits, the backend will need
to be able to differentiate float abs/neg vs integer abs/neg.  Rather
than making the backend figure it out based on instruction opcode (which
when combined with mov/absneg instructions, can be awkward), just split
out different flags for each so the frontend can signal its intentions
more clearly.  Also, since (neg) for bitwise op's is actually a bitwise-
not, split it out into bnot flag.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-05 12:44:01 -04:00
Rob Clark
203f37540a freedreno/ir3: add ir3 builder helpers
Add helpers for constructing SSA forms of instructions.

Only partial cat5/cat6 coverage.. but we can add stuff as needed.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-05 12:44:01 -04:00
Rob Clark
b1c9fb9fca freedreno/ir3: fix sam argument order comment
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-05 12:44:01 -04:00
Rob Clark
101142c401 xa: support for drivers which use NIR
We need to pull in libnir.la and its dependency libglsl_util.la.  Also,
_mesa_error_no_memory() must be defined.

Fortunately with libnir.la (vs pulling in all of libglsl.la) we don't
also need libstdc++.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-05 09:24:17 -04:00
Rob Clark
1c857727a1 build: add libnir.la
If we want to use NIR from state trackers that don't already pull in the
whole of glsl (ie. anything other than mesa state tracker), we need a
separate more minimal libnir.  Possibly NIR should be better split out
from glsl, but for now, generate a second smaller libnir.la for those
who just want NIR but not all of glsl.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-05 09:24:17 -04:00
Rob Clark
52282fa42d gallium/ttn: MOD is an integer instruction
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-05 09:24:17 -04:00
Rob Clark
7579ae422a gallium/ttn: add UMAD
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-05 09:24:17 -04:00
Rob Clark
f2ecc95e44 nir: add lowering for idiv/udiv/umod
Based on the algo from NV50LegalizeSSA::handleDIV() and handleMOD().
See also trans_idiv() in freedreno/ir3/ir3_compiler.c (which was an
adaptation of the nv50 code from Ilia Mirkin).

A python/numpy script which implements the same algorithm (and is
possibly useful for debugging or analysis) can be found here:

  http://people.freedesktop.org/~robclark/div-lowering.py

I've tested this on i965 hacked up to insert the idiv lowering pass,
and on freedreno with NIR frontend.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Eric Anholt <eric@anholt.net> (vc4)
2015-04-05 09:20:35 -04:00
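As a rough illustration of what f2ecc95e44 has to compute, the sketch below reduces signed division to unsigned division on magnitudes and derives umod from udiv. It is not the NV50 reciprocal-based algorithm the commit ports: `udiv` here simply uses Python's integer division as a stand-in, 32-bit wraparound is not modeled, and division by zero is not handled. All function names are illustrative.

```python
def udiv(n, d):
    """Unsigned division; a stand-in for the lowered hardware sequence."""
    return n // d


def umod(n, d):
    """Unsigned remainder derived from the quotient."""
    return n - udiv(n, d) * d


def idiv(n, d):
    """Signed division, truncating toward zero, built on udiv."""
    q = udiv(abs(n), abs(d))
    # The quotient is negative exactly when the operand signs differ.
    return -q if (n < 0) != (d < 0) else q
```

The linked div-lowering.py script remains the reference for the actual algorithm.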
Rob Clark
7880bea2fb nir: fix typo for f2b/i2b/b2i expressions (v2)
v2: discovered that i2b/b2i are also confused

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-05 08:56:24 -04:00
Rob Clark
6829d76e02 nir: add option to lower slt/sge/seq/sne
In freedreno these get implemented as the matching f* instruction plus a
u2f to convert the result to float 1.0/0.0.  But it is fewer lines of code
to just let nir_opt_algebraic handle this for us, and it opens up a small
window for other opt passes to improve things (e.g. if some shader ended up
with both a flt and an slt with the same src args).

v2: use b2f rather than u2f

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-05 08:56:24 -04:00
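The lowering in 6829d76e02 can be modeled roughly as follows. This is a plain-Python sketch of the semantics, not the nir_opt_algebraic rule itself; the functions only borrow the NIR opcode names.

```python
def b2f(cond):
    """Model of b2f: convert a boolean to float 1.0 / 0.0."""
    return 1.0 if cond else 0.0


# Each "set-on" opcode is the matching float comparison followed by b2f.
def slt(a, b):
    return b2f(a < b)


def sge(a, b):
    return b2f(a >= b)


def seq(a, b):
    return b2f(a == b)


def sne(a, b):
    return b2f(a != b)
```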
Mathias Froehlich
24b78fe54e mesa: Remove unused variables left over from 107ae27e57.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2015-04-05 09:40:47 +02:00
Mathias Fröhlich
fdd90fcb15 i965: Implement support for ARB_clip_control.
Switch between the two clip space definitions already available
in hardware. Update winding order dependent state according
to the clip control state.
This change did not introduce new piglit quick.test regressions on
an Ivybridge Mobile and a GM45 Express chipset.
Also it enables and passes the clip-control and clip-control-depth-precision
tests on these two chipsets.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2015-04-05 08:01:47 +02:00
Mathias Froehlich
107ae27e57 mesa: Remove the _WindowMap from gl_viewport_attrib.
The _WindowMap can be dropped from gl_viewport_attrib now.
Simplify gl_viewport_attrib handling where possible.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2015-04-05 08:01:47 +02:00
Mathias Froehlich
29e6c7dbc5 tnl: Maintain the _WindowMap matrix in TNLcontext v2.
This is the only real user of _WindowMap which has the depth
buffer scaling multiplied in. Maintain the _WindowMap of the
one and only viewport inside TNLcontext.

v2:
Remove unneeded parentheses.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2015-04-05 08:01:47 +02:00
Mathias Froehlich
472913ea75 radeon: Make use of _mesa_get_viewport_xform v2.
Instead of _WindowMap just use the translation and scale
of the viewport transform directly. Thereby avoid dividing by
_DepthMaxF again.

v2:
Change order of assignments.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2015-04-05 08:01:46 +02:00
Mathias Froehlich
a8ceb8e450 i965: Make use of _mesa_get_viewport_xform.
Instead of _WindowMap just use the translation and scale
of the viewport transform directly. Thereby avoid dividing by
_DepthMaxF again.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2015-04-05 08:01:46 +02:00
Ilia Mirkin
ba353935a3 nv50: allocate more offset space for occlusion queries
Commit 1a170980a0 started writing to q->data[4]/[5] but kept the
per-query space at 16, which meant that in some cases we would write
past the end of the buffer. Rotate by 32, like nvc0 does. This ensures
that we always have 32 bytes in front of us, and the data writes will go
within the allocated space.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89679
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Nick Tenney <nick.tenney@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-04-04 11:30:03 -04:00
Jason Ekstrand
9c53e80b9b nir/lower_samplers: Use the right memory context for realloc'ing tex sources
As of da5ec2a, we allocate instruction sources out of the instruction
itself.  When we realloc the texture sources we need to use the right
memory context or ralloc will get angry and assert-fail.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-03 17:02:20 -07:00
Jason Ekstrand
1bd1fc248c i965: Use brw_nir_cubemap_normalize for NIR shaders
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-04-03 14:12:49 -07:00
Jason Ekstrand
52e718097f nir: Add a cubemap normalizing pass
This commit adds a pass to L1-normalize cube-map coordinates.  Some hardware
such as i965 requires that the largest cube-map coordinate is +-1.  We had a
pass to perform this normalization in GLSL IR but we need it in NIR for
cube maps on ARB programs to work correctly.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

v2 (Suggested by Eric):
 - Do a vector fabs and split into components later
 - Move to core NIR

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-03 14:12:49 -07:00
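The requirement stated in 52e718097f (largest coordinate is +-1) suggests a normalization along these lines. This is a minimal sketch assuming the pass divides every component by the largest absolute component; the function name is made up here, and the all-zero vector is not handled.

```python
def cubemap_normalize(x, y, z):
    """Scale (x, y, z) so its largest-magnitude component becomes +-1."""
    largest = max(abs(x), abs(y), abs(z))
    return (x / largest, y / largest, z / largest)
```

For example, `cubemap_normalize(2.0, -4.0, 1.0)` scales by 1/4, so the y component becomes -1.0.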
Jason Ekstrand
bff4213326 i965: Check the INTEL_USE_NIR environment variable once at context creation
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-04-03 14:12:49 -07:00
Jason Ekstrand
dccc57eaba nir/from_ssa: Don't set reg->parent_instr for ssa_undef instructions
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-04-03 14:04:31 -07:00
Jason Ekstrand
7bdba4a245 nir: Add a src_get_parent_instr function
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-04-03 14:04:12 -07:00
Eric Anholt
cb966fb2be i965: Use the tex projector lowering pass instead of hand-rolling it.
This only impacts the ARB_fp path.  We can't quite disable the GLSL-level
lowering pass, because it needs to apply before
brw_do_lower_unnormalized_offset().

total instructions in shared programs: 5667857 -> 5667847 (-0.00%)
instructions in affected programs:     1114 -> 1104 (-0.90%)
helped:                                16
HURT:                                  6

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-03 11:50:27 -07:00
Eric Anholt
ea811b7868 nir: Add a lowering pass for texture projectors.
Not much hardware wants them these days, and it might give us a chance to
do CSE or algebraic at the NIR level.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-03 11:50:24 -07:00
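Conceptually, lowering a texture projector as in ea811b7868 means dividing the coordinate by the projector before sampling. The sketch below assumes this is done with one reciprocal and per-component multiplies; `lower_projector` is a hypothetical name, and details such as shadow comparators are ignored.

```python
def lower_projector(coord, projector):
    """Return the coordinate with the projective divide applied up front."""
    inv = 1.0 / projector          # one reciprocal ...
    return tuple(c * inv for c in coord)  # ... then a multiply per component
```

Once the divide is explicit like this, the reciprocal and multiplies are ordinary ALU operations that generic passes can see.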
Eric Anholt
64bdfc698d nir: Add an interface to turn a nir_src into a nir_ssa_def.
We use nir_ssa_defs for nir_builder args, so this takes a nir_src and
makes one so it can be passed in.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-03 11:50:22 -07:00
Eric Anholt
ec02970205 nir: Add an interface for the builder to insert instructions before.
So far we'd only used nir_builder to build brand new programs.  But if
we're doing modifications to instructions (like in a lowering pass), then
we want to generate new stuff before the instruction we're modifying.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-03 11:50:18 -07:00
Jose Fonseca
328375d274 gallium: fix gcc compile errors when using _XOPEN_SOURCE=600 but not std=c99
The fpclassify stuff either needs std=c99 or _XOPEN_SOURCE=600 passed
to gcc, but when using the latter the lrint family of functions will be defined
too.
2015-04-03 19:22:09 +02:00
Carl Worth
b9b66985c3 i965: Rename do_<stage>_prog to brw_compile_<stage>_prog (and export)
This is in preparation for these functions to be called from other
files.

This commit is intended to have no functional change. It exists in
preparation for some upcoming code movement in preparation for the
shader cache.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-02 22:15:45 -07:00
Carl Worth
a57672f18d i965: Split out per-stage dirty-bit checking into separate functions
The dirty-bit checking from each brw_upload_<stage>_prog function is
split out into a new brw_<stage>_state_dirty function.

This commit is intended to have no functional change. It exists in
preparation for some upcoming code movement in preparation for the
shader cache.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-02 22:15:45 -07:00
Carl Worth
28510d69ff i965: Split out brw_<stage>_populate_key into their own functions
This commit splits portions of the existing brw_upload_vs_prog and
brw_upload_gs_prog functions into new brw_vs_populate_key and
brw_gs_populate_key functions. This follows the same style as is
already present for all other stages, (see brw_wm_populate_key, etc.).

This commit is intended to have no functional change. It exists in
preparation for some upcoming code movement in preparation for the
shader cache.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-02 22:15:45 -07:00
Ilia Mirkin
01d3b750b3 nv50/ir: avoid folding immediates into imad operations
Commit 09ee907266 added logic to fold immediates into mad operations,
but the emission code is only there for fmad. Only allow it on float
types.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-02 18:42:31 -04:00
Ilia Mirkin
603d28f32c nv50/ir: fix imad emission when dst == src2
Commit fb63df2215 added 4-byte mad support, but only supported
emission for floats. Disable it for ints for now.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-02 18:35:59 -04:00
Kenneth Graunke
da5ec2ac0b nir: Allocate nir_tex_instr::sources out of the instruction itself.
The lifetime of the sources array needs to match the nir_tex_instr
itself.  So, allocate it using the instruction itself as the context.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-02 14:20:03 -07:00
Kenneth Graunke
7380c641b1 nir: Allocate predecessor and dominance frontier sets from block itself.
These sets are part of the block, and their lifetime needs to match the
block itself.  So, allocate them using the block itself as the context.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-02 14:20:02 -07:00
Kenneth Graunke
131444e1c5 nir: Allocate register fields out of the register itself.
The lifetime of each register's use/def/if_use sets needs to match the
register itself.  So, allocate them using the register itself as the
context.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-02 14:20:01 -07:00
Kenneth Graunke
587b3a20a1 nir: Make nir_create_function() strdup the function name.
glsl_to_nir passes in the ir_function's name field; we were copying the
pointer, but not duplicating the memory.

We want to be able to free the linked GLSL IR program after translating
to NIR, so we'll need to create a copy of the function name that the NIR
shader actually owns.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-02 14:20:00 -07:00
Kenneth Graunke
f61b6c3e48 nir: Free dead variables when removing them.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-02 14:19:58 -07:00
Kenneth Graunke
f4e4491080 nir: Combine remove_dead_local_vars() and remove_dead_global_vars().
We can just pass a pointer to the list of variables, and reuse the code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-02 14:19:56 -07:00
Kenneth Graunke
33f0f68d59 ralloc: Implement a new ralloc_adopt() API.
ralloc_adopt() reparents all children from one context to another.
Conceptually, ralloc_adopt(new_ctx, old_ctx) behaves like this
pseudocode:

   foreach child of old_ctx:
      ralloc_steal(new_ctx, child)

However, ralloc provides no way to iterate over a memory context's
children, and ralloc_adopt does this task more efficiently anyway.

One potential use of this is to implement a memory-sweeper pass: first,
steal all of a context's memory to a temporary context.  Then, walk over
anything that should be kept, and ralloc_steal it back to the original
context.  Finally, free the temporary context.  This works when the
context is something that can't be freed (i.e. an important structure).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-02 14:19:41 -07:00
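The ralloc_adopt() semantics and the memory-sweeper idea from 33f0f68d59 can be modeled with a toy context tree. This is only a Python model of the behavior described in the commit message: the `Ctx` class and the `steal`, `adopt` and `sweep` helpers are hypothetical and do not mirror the ralloc C API or its internals.

```python
class Ctx:
    """Toy stand-in for a ralloc context: a node with a parent and children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = None
        self.children = []
        if parent is not None:
            steal(parent, self)


def steal(new_parent, node):
    """Model of ralloc_steal: reparent a single allocation."""
    if node.parent is not None:
        node.parent.children.remove(node)
    node.parent = new_parent
    new_parent.children.append(node)


def adopt(new_ctx, old_ctx):
    """Model of ralloc_adopt: reparent every child of old_ctx to new_ctx."""
    for child in list(old_ctx.children):
        steal(new_ctx, child)


def sweep(ctx, keep):
    """Memory-sweeper pattern: returns the names of the 'freed' nodes."""
    tmp = Ctx("tmp")
    adopt(tmp, ctx)            # 1. move everything to a temporary context
    for node in keep:          # 2. steal the survivors back
        steal(ctx, node)
    freed = [c.name for c in tmp.children]
    tmp.children.clear()       # 3. "free" the temporary context
    return freed
```

Usage: build a context with three children, call `sweep(ctx, [a, c])`, and only `b` is released while `ctx` itself stays alive.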
Jason Ekstrand
ca3b4d6d17 nir/opt_peephole_ffma: Fix a couple typos in a comment
Acked-by: Matt Turner <mattst88@gmail.com>
2015-04-02 11:09:37 -07:00
Ilia Mirkin
4609ba6ea3 mesa: add ARB_depth_buffer_float to ES3.0 required extension list
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-02 13:35:18 -04:00
Eric Anholt
a9152376b4 vc4: Add support for nir_iabs.
Tested using the GLSL 1.30 tests for integer abs().  Not currently used,
but it was one of the new opcodes used by robclark's idiv lowering.
2015-04-02 10:32:35 -07:00
Jason Ekstrand
e50cf5faa5 i965/generator: Get rid of the ! in the unreachable statement
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2015-04-02 10:21:18 -07:00
Jason Ekstrand
0573d0e484 nir/print: Correctly print swizzles for explicitly sized alu sources
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-02 10:21:18 -07:00
Ilia Mirkin
4a3c0e9950 freedreno/a3xx: add MRT support
The hardware only supports 4 MRTs. It should be possible to emulate
support for 8, but doesn't seem worth the trouble.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-02 00:09:14 -04:00
Ilia Mirkin
6f4c1976f4 freedreno: convert blit program to array for each number of rts
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-02 00:09:14 -04:00
Ilia Mirkin
d9992ab35a freedreno: add support for laying out MRTs in gmem
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-02 00:09:14 -04:00
Ilia Mirkin
602bc6c88d freedreno: add core infrastructure support for MRTs
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-02 00:09:14 -04:00
Ilia Mirkin
d13803c76f freedreno/ir3: add support for FS_COLOR0_WRITES_ALL_CBUFS property
This will enable the driver to tell which regids to link up to which
MRT outputs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-02 00:09:14 -04:00
Ilia Mirkin
f27ec59084 freedreno/a3xx: add independent blend function support
This is needed for MRT support

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-02 00:09:14 -04:00
Ilia Mirkin
8efa3e340d freedreno: remove alpha key from ir3_shader
This complication is unnecessary and makes MRTs more complicated and
likely to generate tons of variants.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-02 00:09:14 -04:00
Stéphane Marchesin
70eed78cac i915g: Implement EGL_EXT_image_dma_buf_import
This adds all the plumbing to get EGL_EXT_image_dma_buf_import in
i915g.

Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
2015-04-01 20:13:37 -07:00
Matt Turner
a03d0ba78f i965/fs: Relax type check in cmod propagation.
The thing we want to avoid is int/float comparisons, but int/unsigned
comparisons with 0 are equivalent.

total instructions in shared programs: 6194829 -> 6193996 (-0.01%)
instructions in affected programs:     117192 -> 116359 (-0.71%)
helped:                                471

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-01 13:43:57 -07:00
Matt Turner
781badee7a nir: Remove useless ftrunc inside f2i/f2u.
No shader-db changes, probably because they're all removed by the GLSL
compiler optimization added in commit 69ad5fd4.

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-01 13:43:57 -07:00
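The reasoning behind 781badee7a is that a float-to-int conversion already discards the fractional part, so truncating first changes nothing. A small check, assuming f2i rounds toward zero (modeled here with Python's `int()`); the function names only mimic the opcodes.

```python
import math


def ftrunc(x):
    """Model of ftrunc: drop the fractional part, keep a float."""
    return float(math.trunc(x))


def f2i(x):
    """Model of f2i: float-to-int conversion, rounding toward zero."""
    return int(x)


def f2i_of_ftrunc_is_redundant(x):
    return f2i(ftrunc(x)) == f2i(x)
```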
Matt Turner
97e6c1b957 nir: Recognize (a < b || a < c) as a < max(b, c).
Doesn't work for analogous && cases, because of NaNs.

total instructions in shared programs: 6195712 -> 6194829 (-0.01%)
instructions in affected programs:     42000 -> 41117 (-2.10%)
helped:                                403

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-01 13:43:57 -07:00
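The NaN remark in 97e6c1b957 can be checked with a small model. The sketch assumes fmax/fmin return the non-NaN operand when exactly one input is NaN (as C's fmax/fmin do); whether NIR's opcodes behave exactly this way on every backend is an assumption here.

```python
import math


def fmax(b, c):
    """max that returns the non-NaN operand if exactly one input is NaN."""
    if math.isnan(b):
        return c
    if math.isnan(c):
        return b
    return max(b, c)


def fmin(b, c):
    """min with the same NaN handling as fmax above."""
    if math.isnan(b):
        return c
    if math.isnan(c):
        return b
    return min(b, c)


def or_form(a, b, c):
    return a < b or a < c


def max_form(a, b, c):
    return a < fmax(b, c)


def and_form(a, b, c):
    return a < b and a < c


def min_form(a, b, c):
    return a < fmin(b, c)
```

With a = 0.0, b = NaN, c = 2.0 the `&&` form is false (a < NaN fails) while a < min(b, c) is true, so the analogous rewrite is not safe; the `||` form agrees with the max form for every combination under this model.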
Matt Turner
a2b6e908cf nir: Add addition/multiplication identities of exp/log.
instructions in affected programs:     2858 -> 2808 (-1.75%)
helped:                                12

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-01 13:43:57 -07:00
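The message for a2b6e908cf does not list the exact rules it adds. Identities of the kind the title suggests are exp(a) * exp(b) = exp(a + b) and log(a) + log(b) = log(a * b). The numeric check below uses base-2 helpers named after NIR's fexp2/flog2 as an assumption about which opcodes are involved.

```python
import math


def fexp2(x):
    return 2.0 ** x


def flog2(x):
    return math.log2(x)


def mul_of_exps(a, b):
    return fexp2(a) * fexp2(b)   # two exps and a multiply


def exp_of_sum(a, b):
    return fexp2(a + b)          # one add and one exp


def sum_of_logs(a, b):
    return flog2(a) + flog2(b)   # two logs and an add


def log_of_product(a, b):
    return flog2(a * b)          # one multiply and one log
```

Each rewritten form needs one fewer transcendental operation, which is presumably where the instruction savings come from.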
Matt Turner
099c729b4c nir: Add identities for the log function.
The rcp(log(x)) pattern affects instruction counts.

instructions in affected programs:     144 -> 138 (-4.17%)
helped:                                6

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-01 13:43:57 -07:00
Matt Turner
8a6ae384b2 nir: Add identities for the exponential function.
No changes in shader-db.

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-01 13:43:57 -07:00
Matt Turner
e26783d445 nir: Recognize another open coded lrp.
total instructions in shared programs: 6195924 -> 6195768 (-0.00%)
instructions in affected programs:     4876 -> 4720 (-3.20%)
helped:                                58
HURT:                                  10

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-01 13:43:57 -07:00
Matt Turner
e82437e141 nir: Recognize open coded lrp.
total instructions in shared programs: 6197614 -> 6195924 (-0.03%)
instructions in affected programs:     34773 -> 33083 (-4.86%)
helped:                                147
HURT:                                  6

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-01 13:43:57 -07:00
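The two lrp commits (e82437e141 and e26783d445) do not spell out the matched patterns. As an assumption, the sketch below shows the linear-interpolation definition and one common open-coded form that is algebraically equal to it; the function names are illustrative.

```python
def flrp(x, y, t):
    """Linear interpolation: x when t == 0, y when t == 1."""
    return x * (1.0 - t) + y * t


def open_coded_lrp(x, y, t):
    """An equivalent hand-written form: start at x, move t of the way to y."""
    return x + t * (y - x)
```

The two forms are equal in exact arithmetic; in floating point they can differ in the last bits, which is one reason such rewrites are judged by shader-db results.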
Kenneth Graunke
25e214db00 nir: Use _mesa_flsll(InputsRead) in prog->nir.
InputsRead is a 64-bit bitfield.  Using _mesa_fls would silently
truncate off the high bits, claiming inputs 32..56 (VARYING_SLOT_MAX)
were never read.

Using <= here was a hack I threw in at the last minute to fix programs
which happened to use input slot 32.  Switch back to using < now that
the underlying problem is fixed.

Fixes crashes in "Euro Truck Simulator 2" when using prog->nir, which
uses input slot 33.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-01 13:30:13 -07:00
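The truncation bug fixed in 25e214db00 is easy to reproduce in a model. The sketch assumes fls-style semantics (1-based index of the highest set bit, 0 for no bits) and models the 32-bit variant by masking; the Python function names only echo _mesa_fls and _mesa_flsll.

```python
def fls(value):
    """Model of a 32-bit find-last-set: high bits are silently dropped."""
    return (value & 0xFFFFFFFF).bit_length()


def flsll(value):
    """Model of a 64-bit find-last-set."""
    return (value & 0xFFFFFFFFFFFFFFFF).bit_length()


# A shader that only reads input slot 33, as in the crash described above.
inputs_read = 1 << 33
```

With this bitfield the 32-bit version reports that no inputs are read at all, while the 64-bit version reports 34, i.e. slots 0..33 must be covered.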
Kenneth Graunke
3d166b313d mesa: Implement _mesa_flsll().
This is _mesa_fls() for 64-bit values.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-01 13:30:13 -07:00
Kenneth Graunke
4b38c5c783 nir: In prog->nir, don't wrap dot products with ptn_channel(..., X).
ptn_move_dest and nir_fadd already take care of replicating the last
channel out, so we can just use a scalar and skip splatting it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-04-01 13:30:13 -07:00
Jason Ekstrand
218e45e2f7 i965: Use the same nir options for all gens
If we tell NIR to split ffma's, then we don't need separate options
anymore.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-01 12:51:04 -07:00
Jason Ekstrand
b9d7454571 i965/nir: Run DCE again before going out of SSA
We run lowering and optimization passes that might leave garbage lying
around. This keeps the FS cse from having to clean it up.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-01 12:51:04 -07:00
Jason Ekstrand
37703040a1 i965/nir: Run the ffma peephole after the rest of the optimizations
The idea here is that fusing multiply-add combinations too early can reduce
our ability to perform CSE and value-numbering.  Instead, we split ffma
opcodes up-front, hope CSE cleans up, and then fuse after-the-fact.
Unless an algebraic pass does something silly where it inserts something
between the multiply and the add, splitting and re-fusing should never
cause a problem.  We run the late algebraic optimizations after this so
that things like compare-with-zero don't hurt our ability to fuse things.

shader-db results for fragment shaders on Haswell:
total instructions in shared programs: 4390538 -> 4379236 (-0.26%)
instructions in affected programs:     989359 -> 978057 (-1.14%)
helped:                                5308
HURT:                                  97
GAINED:                                78
LOST:                                  5

This does, unfortunately, cause some substantial hurt to a shader in Kerbal
Space Program.  However, the damage is caused by changing a single
instruction from a ffma to an add.  This, in turn, *decreases* register
pressure in one part of the program causing it to fail to register allocate
and spill.  Given the overwhelmingly positive results in other shaders and
the fact that the NIR for the Kerbal shaders is actually better, this
should be considered a positive.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-01 12:51:04 -07:00
Jason Ekstrand
7f344721b1 nir/peephole_ffma: Be less aggressive about fusing multiply-adds
shader-db results for fragment shaders on Haswell:
total instructions in shared programs: 4395688 -> 4389623 (-0.14%)
instructions in affected programs:     355876 -> 349811 (-1.70%)
helped:                                1455
HURT:                                  14
GAINED:                                5
LOST:                                  0

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-01 12:51:04 -07:00
Jason Ekstrand
a8c8b3b872 nir: Add a dedicated ffma peephole optimization
i965/nir: Use the dedicated ffma peephole

total instructions in shared programs: 4418748 -> 4394618 (-0.55%)
instructions in affected programs:     1292790 -> 1268660 (-1.87%)
helped:                                5999
HURT:                                  457
GAINED:                                4
LOST:                                  9

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-01 12:51:04 -07:00
Jason Ekstrand
e06a3d0282 nir: Move the compare-with-zero optimizations to the late section
total instructions in shared programs: 4422307 -> 4422363 (0.00%)
instructions in affected programs:     4230 -> 4286 (1.32%)
helped:                                0
HURT:                                  12

While this does hurt some things, the losses are minor and it prevents the
compare-with-zero optimization from fighting with ffma which is much more
important.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-01 12:51:03 -07:00
Jason Ekstrand
da294f9b2f nir/algebraic: Add a separate section for "late" optimizations
i965/nir: Use the late optimizations

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-01 12:51:03 -07:00
Jason Ekstrand
1779dc060f nir/algebraic: Remove a duplicate optimization
This optimization is repeated verbatim above

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-01 12:51:03 -07:00
Jason Ekstrand
22ee7eeb4e nir/algebraic: #define around structure definitions
Previously, we couldn't generate two algebraic passes in the same file
because of multiple structure definitions.  To solve this, we play the
age-old header file trick and just #define around it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-01 12:51:03 -07:00
Jason Ekstrand
793a94d6b5 nir/print: Don't print extra swizzle components
Previously, NIR would just print 4 swizzle components if the swizzle was
anything other than foo.xyzw.  This creates lots of noise if, for example,
you have a one-component element with a swizzle of foo.xxxx.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-01 12:49:49 -07:00
Emil Velikov
d99135b2e9 configure: nuke --with-max-{width,height}
Unused as of commit 630ab0d27ba(mesa: remove last of MAX_WIDTH,
MAX_HEIGHT). Update all the remaining references to the defines.

v2: Use the correct variable name in the comments

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-04-01 19:43:34 +00:00
Emil Velikov
bd4925c6ac gallium: ship tgsi_to_nir.h in the tarball
Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-01 19:33:37 +00:00
Emil Velikov
4008975e6f configure.ac: error out if python/mako is not found when required
In case of using a distribution tarball (or a dirty git tree) one can
have the generated sources locally. Make configure.ac error out
otherwise, to alert about the unmet requirement(s) of python/mako.

v2: Check only for a single file for each dependency.

Suggested-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-01 19:33:37 +00:00
Matt Turner
3384179faa glsl: Make sure not to dereference NULL.
Found by Coverity.
2015-04-01 12:25:29 -07:00
Laura Ekstrand
142909f19d main: create_buffers unlocks mutex when throwing OUT_OF_MEMORY.
Ilia Mirkin found that I had forgotten to free the mutex in the error case.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-04-01 12:07:28 -07:00
Jose Fonseca
3321724c10 automake,scons: Put NIR source files in a separate var to fix SCons build.
SCons does not build NIR yet.

Trivial.
2015-04-01 19:49:09 +01:00
Jose Fonseca
7f0682cebf automake: Fix out-of-source builds.
Add include path for generated nir_opcodes.h.

Trivial.
2015-04-01 19:48:09 +01:00
Brian Paul
1625d7a87a mesa: don't include colormac.h in format code
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2015-04-01 12:04:28 -06:00
Brian Paul
2768a0b1b4 mesa: remove unneeded #include of colormac.h
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2015-04-01 12:04:28 -06:00
Brian Paul
f1d55017d7 tnl: remove unneeded #include of colormac.h
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2015-04-01 12:04:28 -06:00
Brian Paul
8ac9407a83 swrast: remove unneeded #include of colormac.h
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2015-04-01 12:04:28 -06:00
Brian Paul
2ad8af1a0c mesa: remove unused macros from colormac.h
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2015-04-01 12:04:28 -06:00
Eric Anholt
15b03b7964 nir: Recognize a pattern of bool frobbing from TGSI KILL_IF.
TGSI's conditional discards take a float arg and negate it, so GLSL to TGSI
generates a b2f and negates that value.  Only, in NIR we want a proper
bool once again, so we compare with 0.  This is a lot of pointless extra
instructions.

total instructions in shared programs: 39735 -> 39702 (-0.08%)
instructions in affected programs:     1342 -> 1309 (-2.46%)

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-01 10:57:01 -07:00
Eric Anholt
6e8d4a2f80 nir: Recognize a pattern for doing b2f without the opcode.
Since we have patterns based on b2f, generate them if we see the b2f
equivalent using an iand.  This is common when generating NIR from TGSI.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-01 10:57:01 -07:00
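The "b2f equivalent using an iand" in 6e8d4a2f80 relies on the boolean representation. The sketch below assumes booleans are 32-bit all-ones / zero values, so ANDing with the bit pattern of 1.0f yields the float directly; the constant and function names are illustrative.

```python
import struct

TRUE_BITS = 0xFFFFFFFF   # assumed boolean "true": all bits set
FALSE_BITS = 0x00000000  # assumed boolean "false"

# Bit pattern of the 32-bit float 1.0.
ONE_BITS = struct.unpack("<I", struct.pack("<f", 1.0))[0]


def bits_to_float(bits):
    """Reinterpret a 32-bit integer as an IEEE-754 float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def b2f_via_iand(b):
    """iand(b, 1.0f): all-ones keeps the 1.0 pattern, zero clears it."""
    return bits_to_float(b & ONE_BITS)
```

Recognizing this pattern and turning it into a real b2f lets the existing b2f-based rules apply to shaders that came from TGSI.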
Eric Anholt
26261bca21 vc4: Add shader-db dumping of NIR instruction count.
I was previously using temporary disables of VC4 optimization to show the
benefits of improved NIR optimization, but this can get me quick and dirty
numbers for NIR-only improvements without having to add hacks to disable
VC4's code (disabling of which might hide ways that the NIR changes would
hurt actual VC4 codegen).
2015-04-01 10:57:01 -07:00
Eric Anholt
73e2d4837d vc4: Convert to consuming NIR.
NIR brings us better optimization than I would have bothered to write
within the driver, developers sharing future optimization work, and the
ability to share device-specific lowering code that we and other
GLES2-level drivers need.

total uniforms in shared programs: 13421 -> 13422 (0.01%)
uniforms in affected programs:     62 -> 63 (1.61%)
total instructions in shared programs: 39961 -> 39707 (-0.64%)
instructions in affected programs:     15494 -> 15240 (-1.64%)

v2: Add missing imov support, and assert that there are no dest saturates.
v3: Rebase on the target-specific algebraic series.
v4: Rebase on gallium-includes-from-NIR changes in master.
v5: Rebase on variables being in lists instead of hash tables.
v6: Squash in intermediate changes that used the NIR-to-TGSI pass (which
    I'm not committing)
2015-04-01 10:57:01 -07:00
Eric Anholt
783ad697d2 gallium: Add tgsi_to_nir to get a nir_shader for a TGSI shader.
This will be used by the VC4 driver for doing device-independent
optimization, and hopefully eventually replacing its whole IR.  It also
may be useful to other drivers for the same reason.

v2: Add all of the instructions I was relying on tgsi_lowering to remove,
    and more.
v3: Rebase on SSA rework of the builder.
v4: Use the NIR ineg operation instead of doing a src modifier.
v5: Don't use ineg for fnegs.  (infer_src_type on MOV doesn't do what I
    expect, again).
v6: Fix handling of multi-channel KILL_IF sources.
v7: Make ttn_get_f() return a swizzle of a scalar load_const, rather than
    a vector load_const.  CSE doesn't recognize that srcs out of those
    channels are actually all the same.
v8: Rebase on nir_builder auto-sizing, make the scalar arguments to
    non-ALU instructions actually be scalars.
v9: Add support for if/loop instructions, additional texture targets, and
    untested support for indirect addressing on temps.
v10: Rebase on master, drop bad comment about control flow and just choose
     the X channel, use int comparison opcodes in LIT for now, drop unused
     pipe_context argument.
v11: Fix translation of LRP (previously missed because I mis-translated
     back out), use nir_builder init helpers.
v12: Rebase on master, adding explicit include of mtypes.h to get
     INTERP_QUALIFIER_*
v13: Rebase on variables being in lists instead of hash tables, drop use
     of mtypes.h in favor of util/pipeline.h.  Use Ken's nir_builder
     swizzle and fmov/imov_alu helpers, drop "struct" in front of
     nir_builder, use nir_builder directly as the function arg in a lot of
     cases, drop redundant members of ttn_compile that are also in
     nir_builder, drop some half-baked malloc failure handling.
v14: The indirect uniform src0 should be scalar, not vector (noticed as
     odd by robclark, confirmed by cwabbott).  Apply Ken's review to
     initialize s->num_uniforms and friends, skip ttn_channel for dot
     products, and use the simpler discard_if intrinsic.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v13)
Acked-by: Rob Clark <robclark@freedesktop.org>
2015-04-01 10:57:01 -07:00
Eric Anholt
486dcfbbd9 vc4: Tell shader-db how big our UBOs are, if present.
I had regressed them for a while with the NIR work.
2015-04-01 10:57:01 -07:00
Eric Anholt
a3a07d46d1 mesa: Make a shared header for 3D pipeline enum / #defines.
NIR uses these enums/#defines in nir_variables and associated intrinsics,
but I want to be able to use them from TGSI->NIR and NIR->TGSI.
Otherwise, we had to pull in all of mtypes.h.

This doesn't cover all of the enums we might want from a shared compiler
core (like varying slots or vert attribs), but it at least covers what I
need at the moment (system values and interp qualifiers).

v2: Move to src/glsl since util/ is really vague.  Include in Makefile.am
    list.  Use plain bitshifts and stdint types instead of undefined
    BITFIELD64_BIT.
v3: Rename to shader_enums.h. Move it into Makefile.sources.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2, with
             recommendation to rename)
2015-04-01 10:57:01 -07:00
Emil Velikov
5604d7675e nir: add nir_builder.h to the tarball
The header was added with commit 2a135c470e3(nir: Add an ALU op builder
kind of like ir_builder.h) but did not make it into the sources list.

Fortunately it remained unused until a recent commit faf6106c6f6(nir:
Implement a Mesa IR -> NIR translator.)

v2: Remove the bogus dependency. Tweak commit message.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-01 14:46:42 +01:00
Emil Velikov
4984cb7ef8 xmlpool: remove the clean target
... by folding it into CLEANFILES. Don't worry about $(LANG) as it is
essentially the first folder of $(POS), and the latter is already handled.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-01 14:46:41 +01:00
Emil Velikov
a665b9b3c8 xmlpool: don't forget to ship the MOS
This will allow us to finally remove python from the build time
dependencies list. Considering that you're building from a release
tarball of course :-)

Cc: Bernd Kuhls <bernd.kuhls@t-online.de>
Reported-by: Bernd Kuhls <bernd.kuhls@t-online.de>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-04-01 14:46:41 +01:00
Emil Velikov
c07df0f201 osmesa: don't try to bundle osmesa.def SConscript
Both of which were removed with commit 69db422218b(scons: Don't build
osmesa.)

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-04-01 14:46:41 +01:00
Emil Velikov
1d36c52f5d docs: note that classic osmesa/libEGL no longer builds with scons
Plus nuke the final reference to osmesa from README.WIN32.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-04-01 14:46:35 +01:00
Iago Toral Quiroga
3818dfcf3c i965: Handle scratch accesses where reladdr also points to scratch space
This is a problem when we have IR like this:

(array_ref (var_ref temps) (swiz x (expression ivec4 bitcast_f2i
   (swiz xxxx (array_ref (var_ref temps) (constant int (2)) ) )) )) ) )

where we are indexing an array with the result of an expression that
accesses the same array.

In this scenario, temps will be moved to scratch space and we will need
to add scratch reads/writes for all accesses to temps, however, the
current implementation does not consider the case where a reladdr pointer
(obtained by indexing into temps through an expression) points to a register
that is also stored in scratch space (as in this case, where the expression
used to index temps accesses temps[2]), and thus requires a scratch read
before it is accessed.
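
The recursive case described above can be sketched as follows. This is a toy model in Python, not the i965 code: registers are `(name, reladdr)` pairs, and `resolve`, `in_scratch` and the `scratch_read` tuple are hypothetical names chosen for illustration.

```python
def resolve(reg, in_scratch, emitted):
    """Return a register that is safe to read, emitting scratch reads as needed.

    `reg` is (name, reladdr), where reladdr is None or another such pair.
    `in_scratch` is the set of register names that were moved to scratch.
    """
    name, reladdr = reg
    if reladdr is not None:
        # The index itself may live in scratch space, so resolve it first.
        reladdr = resolve(reladdr, in_scratch, emitted)
    if name in in_scratch:
        tmp = f"tmp{len(emitted)}"  # hypothetical fresh temporary
        emitted.append(("scratch_read", tmp, name, reladdr))
        return (tmp, None)
    return (name, reladdr)
```

For `temps[temps[...]]` with `temps` in scratch, the inner read is emitted before the outer one, which is the ordering the commit message asks for.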

v2 (Francisco Jerez):
 - Handle also recursive reladdr addressing.
 - Do not memcpy dst_reg into src_reg when rewriting reladdr.

v3 (Francisco Jerez):
 - Reduce complexity by moving recursive reladdr scratch access handling
   to a separate recursive function.
 - Do not skip demoting reladdr index registers to scratch space if the
   top level GRF has already been visited.

v4 (Francisco Jerez)
 - Remove redundant checks.
 - Simplify code by making emit_resolve_reladdr return a register with
   the original src data except for reg, reg_offset and reladdr.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89508
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-04-01 15:35:23 +02:00
Roland Scheidegger
e3252defd2 gallivm: (trivial) fix the logic deciding if function call should be used...
Copy and paste bug with the img filter decision. Since there's only 2 different
filters anyway just drop this bit.
2015-04-01 13:26:19 +02:00
Martin Peres
59af7ed28c mesa/fbo: lock ctx->Shared->Mutex when allocating renderbuffers
This mutex is used to make sure the shared context does not change
while some shared code is looking into it.

Calling BindRenderbufferEXT or BindRenderbuffer with a gles context
would not take the mutex before allocating an entry. Commit a34669b
then moved the allocation out of bind_renderbuffer into
allocate_renderbuffer before using it for the CreateRenderBuffer
entry point. This thus also made this entry point unsafe.
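
A minimal sketch of the locking pattern, written in Python rather than Mesa's C. `SharedState` and this `allocate_renderbuffer` are stand-ins invented for illustration; only the idea of taking the shared mutex around the allocation comes from the commit message.

```python
import threading


class SharedState:
    """Stand-in for ctx->Shared: a name table guarded by one mutex."""

    def __init__(self):
        self.mutex = threading.Lock()
        self.renderbuffers = {}


def allocate_renderbuffer(shared, name):
    """Insert a new entry while holding the shared mutex."""
    with shared.mutex:
        # Check-then-insert must be atomic, or two contexts sharing this
        # state could both allocate an entry for the same name.
        if name not in shared.renderbuffers:
            shared.renderbuffers[name] = {"name": name}
        return shared.renderbuffers[name]
```

Every entry point that reaches the allocation has to go through the locked path, which is why moving the allocation into a helper made a second entry point unsafe.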

The issue has been hinted by Ilia Mirkin.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-01 09:36:27 +03:00
Martin Peres
fa38321551 mesa/fbo: do not assign a value that is never read later on
The issue has been detected by Coverity.

v2:
- move the declaration of obj to the else clause (Brian Paul)

v3: Review by Brian Paul
- get rid of the obj declaration in favor of a direct reference

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-04-01 09:36:27 +03:00
Dave Airlie
8f7338f284 egl: add initial EGL_MESA_image_dma_buf_export v2.4
At the moment to get an EGL image to a dma-buf file descriptor,
you have to use EGL_MESA_drm_image, and then use libdrm to
convert this to a file descriptor.

This extension just provides an API modelled on EGL_MESA_drm_image,
to return a dma-buf file descriptor.

v2: update spec for new API proposal
add internal queries to get the fourcc back from intel driver.

v2.1: add gallium pieces.

v2.2: add offsets to spec and API, rename fd->fds, stride->strides
in API. rewrite spec a bit more, add some q/a

v2.3:
add modifiers to query interface and 64-bit type for that (Daniel Stone)
specify what happens to num fds vs num planes differences. (Chad Versace)

v2.4:
fix grammar (Daniel Stone)

Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-04-01 14:10:04 +10:00
Jordan Justen
22ccdf12dd i965/state: Remove brw->state.dirty
We now use brw->NewGLState and brw->ctx.NewDriverState instead.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-31 16:40:24 -07:00
Jordan Justen
7ecf3530d8 i965/state: Don't use brw->state.dirty.mesa
Now, we only use brw->NewGLState.

I used this bash & sed command in the i965 directory:
  for file in *.[ch] *.[ch]pp; do
    sed -i -e 's/brw->state\.dirty\.mesa/brw->NewGLState/g' $file
  done

Followed by manual changes to brw_state_upload.c.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-31 16:40:24 -07:00
Jordan Justen
4e56a9ad46 i965/state: Don't use brw->state.dirty.brw
Now, we only use ctx->NewDriverState.

I used this bash & sed command in the i965 directory:
  for file in *.[ch] *.[ch]pp; do
    sed -i -e 's/state\.dirty\.brw/ctx.NewDriverState/g' $file
  done

Followed by manual changes to brw_state_upload.c.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-31 16:40:24 -07:00
Jordan Justen
20ef23b227 i965/state: Add compute pipeline with empty atom lists
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-31 16:40:24 -07:00
Jordan Justen
a8e39e1903 i965/state: Only upload render programs for render state uploads
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-31 16:40:24 -07:00
Jordan Justen
d70f4e6daf i965/state: Create separate dirty state bits for each pipeline
When clearing the state for a pipeline, we will save changed state for
the other pipelines.

v3:
 * Adjust brw_upload_pipeline_state
   * Don't pull pipeline state bits into common state bits
   * Don't clear pipeline state bits
 * Adjust 'clear' phase
   * brw_clear_dirty_bits is now brw_render_state_finished
   * Move cross-pipeline state flagging to brw_pipeline_state_finished
   * Move pipeline clears to brw_pipeline_state_finished
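
The bookkeeping can be sketched like this. It is an illustrative Python model under assumptions: `DirtyTracker`, `flag` and `upload` are hypothetical names, and the real driver splits this work across the functions listed above.

```python
RENDER, COMPUTE = 0, 1


class DirtyTracker:
    def __init__(self):
        self.new_state = 0        # bits flagged since the last upload
        self.pipelines = [0, 0]   # saved dirty bits, one word per pipeline

    def flag(self, bits):
        self.new_state |= bits

    def upload(self, pipeline):
        """Return the bits `pipeline` must re-emit, then mark it clean."""
        # Save newly flagged state for every pipeline before clearing anything.
        for p in range(len(self.pipelines)):
            self.pipelines[p] |= self.new_state
        self.new_state = 0
        dirty = self.pipelines[pipeline]
        self.pipelines[pipeline] = 0   # only this pipeline is now clean
        return dirty
```

State flagged while only the render pipeline is uploading stays pending for the compute pipeline until its own upload.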

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-31 16:40:24 -07:00
Jordan Justen
db11955072 i965/state: Support multiple pipelines in brw->num_atoms
brw->num_atoms is converted to an array, but currently just an array
of length 1.

Adds brw_copy_pipeline_atoms which copies the atoms for a pipeline,
and sets brw->num_atoms[p] for pipeline p.

v2:
 * Rename brw->atoms[] to render_atoms
 * Rename brw_add_pipeline_atoms to brw_copy_pipeline_atoms
 * Rename brw_pipeline_first_atom to brw_get_pipeline_atoms

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-31 16:40:23 -07:00
Jordan Justen
736a31d462 i965/state: Rename brw_clear_dirty_bits to brw_render_state_finished
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-31 16:40:23 -07:00
Jordan Justen
2c02baa487 i965/state: Rename brw_upload_state to brw_upload_render_state
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-31 16:40:23 -07:00
Roland Scheidegger
611bd80f3b gallivm: do some hack heuristic to disable texture functions
We've seen some cases where performance can suffer quite a bit.
Technically, the more simple the function the more overhead there is
for using a function for this (and the less benefits this provides).
Hence don't do this if we expect the generated code to be simple.
There's an even more important reason why this hurts performance,
which is shaders reusing the same unit with some of the same inputs,
as llvm cannot figure out the calculations are the same if they
are performed in the function (even just reusing the same unit without
any input being the same provides such optimization opportunities though
not very much). This is something which would need to be handled by IPO
passes however.
2015-04-01 00:56:12 +02:00
Matt Turner
47c4b38540 i965/fs: Allow CSE to handle MULs with negated arguments.
mul x, -y is equivalent to mul -x, y; and mul x, y is the negation of
mul x, -y.
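
A rough Python sketch of the equivalence, not the i965 pass itself: fold both negate flags into one sign bit so that all four sign variants of a MUL share a CSE key. `canonical_mul` and `cse_muls` are hypothetical helpers, and treating the operands as commutative is an extra assumption of this sketch.

```python
def canonical_mul(a, a_neg, b, b_neg):
    """Return (key, negated) for a MUL whose operands carry negate flags.

    The key ignores operand order and negation; `negated` records whether
    the result is the negation of the canonical product a * b.
    """
    key = ("mul",) + tuple(sorted((a, b)))
    return key, a_neg != b_neg  # XOR of the two negate flags


def cse_muls(instructions):
    """Map each redundant destination to (earlier destination, needs_negate)."""
    seen = {}    # key -> (dest, negated) of the first matching MUL
    reuse = {}
    for dest, a, a_neg, b, b_neg in instructions:
        key, negated = canonical_mul(a, a_neg, b, b_neg)
        if key in seen:
            first_dest, first_negated = seen[key]
            reuse[dest] = (first_dest, negated != first_negated)
        else:
            seen[key] = (dest, negated)
    return reuse
```

Here `mul x, -y` and `mul -x, y` reuse the same value directly, while `mul x, y` reuses it with a negate on the source.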

With NIR:
total instructions in shared programs: 6167779 -> 6161193 (-0.11%)
instructions in affected programs:     983511 -> 976925 (-0.67%)
helped:                                4106
HURT:                                  16
GAINED:                                18
LOST:                                  7

Without NIR:
total instructions in shared programs: 6192323 -> 6185299 (-0.11%)
instructions in affected programs:     987875 -> 980851 (-0.71%)
helped:                                4146
HURT:                                  16
GAINED:                                16
LOST:                                  0
2015-03-31 14:14:36 -07:00
Matt Turner
438c1c0080 i965: Mark brw_inst_bits' brw_inst* parameter const.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-31 14:14:36 -07:00
Matt Turner
ac6102bcc5 glsl: Remove bogus Makefile dependency. 2015-03-31 14:14:36 -07:00
Matt Turner
2c38f891ad glsl: Reassociate multiplication of mat*mat*vec.
The typical case of mat4*mat4*vec4 is 80 scalar multiplications, but
mat4*(mat4*vec4) is only 32.
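
The counts can be checked with a few lines of Python. `mat_mul_cost` is a helper made up for this check; it counts scalar multiplications only and ignores additions.

```python
def mat_mul_cost(rows, inner, cols):
    """Scalar multiplications for a (rows x inner) * (inner x cols) product."""
    return rows * inner * cols


# (mat4 * mat4) * vec4: one matrix-matrix product, then one matrix-vector.
left_to_right = mat_mul_cost(4, 4, 4) + mat_mul_cost(4, 4, 1)
# mat4 * (mat4 * vec4): two matrix-vector products.
right_to_left = mat_mul_cost(4, 4, 1) + mat_mul_cost(4, 4, 1)
print(left_to_right, right_to_left)  # prints: 80 32
```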

On HSW (with vec4 vertex shaders):
instructions in affected programs:     4420 -> 3194 (-27.74%)

On BDW (with scalar vertex shaders):
instructions in affected programs:     12756 -> 6726 (-47.27%)

Implementing a general matrix chain ordering is harder (or at least
tedious) because of having to walk the GLSL IR to create a list of
multiplicands. I'm guessing that this patch handles 90+% of cases, but
of course to tell definitively you'd have to implement the general
thing.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-03-31 14:01:15 -07:00
Matt Turner
cf2dc1624f glsl: Implement type inferencing of matrix types.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-03-31 14:01:15 -07:00
Matt Turner
73f6f9b9be glsl: Factor out a get_mul_type() function.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-03-31 14:01:15 -07:00
Marcin Ślusarz
f9e2295560 nouveau: synchronize "scratch runout" destruction with the command stream
When nvc0_push_vbo calls nouveau_scratch_done it does not mean
scratch buffers can be freed immediately. It means "when hardware
advances to this place in the command stream the scratch buffers
can be freed".

To fix it, just postpone scratch runout destruction until after the
current fence is signalled.
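
The deferral can be sketched as follows. This is a toy Python model, not the nouveau code: `Fence`, `add_work` and this `scratch_done` are hypothetical names used only to show the ordering.

```python
class Fence:
    """Toy fence: queued callbacks run once the GPU reaches this point."""

    def __init__(self):
        self.signalled = False
        self._work = []

    def add_work(self, callback):
        if self.signalled:
            callback()
        else:
            self._work.append(callback)

    def signal(self):
        self.signalled = True
        for callback in self._work:
            callback()
        self._work.clear()


def scratch_done(current_fence, runout_buffers, free):
    """Defer freeing the runout buffers until `current_fence` signals."""
    buffers = list(runout_buffers)
    runout_buffers.clear()  # the caller may start a new runout list at once

    def release():
        for buf in buffers:
            free(buf)

    current_fence.add_work(release)
```

The buffers stay alive while commands that reference them may still be in flight and are released only when the fence signals.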

The bug existed for a very long time. Nobody noticed, because
"scratch runout" code path is rarely executed.

Fixes hang at the very beginning of first mission in "Serious Sam 3"
on nve7/gk107. It manifested as:

nouveau E[   PFIFO][0000:01:00.0] read fault at 0x000a9e0000 [PTE] from GR/GPC0/PE_2 on channel 0x007f853000 [Sam3[17056]]

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-31 22:04:31 +02:00
Brian Paul
3db0317351 docs: document Viewperf 12 issues
Signed-off-by: Brian Paul <brianp@vmware.com>
2015-03-31 11:50:20 -06:00
Neil Roberts
fe026d7ce5 i965/skl: Avoid using the 1D stencil layout for stencil-only images
Commit cf67ca9ffa made the layouting code pick a special layout for
1D images on Skylake. This should not be used for depth and stencil
buffers because these need to be treated as 2D tiled images. However
the patch was missing a check for images with a base format of
GL_STENCIL_INDEX. In practice I don't think it's currently possible to
hit this because Mesa doesn't support GL_ARB_texture_stencil8 and it's
not possible to create a 1D renderbuffer, but it'll be good to be
ready for when the extension is supported.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-31 18:22:01 +01:00
Tom Stellard
fda7558057 clover: Return CL_BUILD_ERROR for CL_PROGRAM_BUILD_STATUS when compilation fails v2
v2:
  - Don't use _errs map

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-03-31 15:40:51 +00:00
Tom Stellard
4c53d2acbb radeonsi/compute: Default to the same PIPE_SHADER_CAP values as other shader types v2
v2:
  - Fix typo

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-03-31 15:40:51 +00:00
Leo Liu
a714fbacf7 radeon/vce: implement video usability information support
This will help encoding VUI into the bitstream

v2: make backward compatible

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-03-31 12:31:58 -04:00
Leo Liu
8e3668a7c0 st/omx/enc: export framerate to vce driver
The framerate will be used for video usability info support by VCE driver

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-03-31 12:31:58 -04:00
Roland Scheidegger
489866938f llvmpipe: enable ARB_texture_gather
Just announce support for 4 components.
While here also increase the max/min texel offsets (the limit is completely
artificial, was chosen because that's what other hardware did, however there's
other drivers using larger limits).
Over a thousand little piglits skip->pass.

v2: update docs/GL3.txt

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-31 17:23:51 +02:00
Roland Scheidegger
0753b135f6 gallivm: implement TG4 for ARB_texture_gather
This is quite trivial, essentially just follow all the same code you'd
use with linear min/mag (and no mip) filter, then just skip the filtering
after looking up the texels in favor of direct assignment of the right channel
to the result. (This is though not true for the multi-offset version if we'd
want to support it - for this would probably need to do something along the
lines of 4x nearest sampling due to the necessity of doing coord wrapping
individually per texel.)
Supports multi-channel formats.
From the SM5 gather cap bit, should support non-constant offsets, plus shadow
comparisons (the former untested), but not component selection (should be
easy to implement but all this stuff is not really exposable anyway for now).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-31 17:23:51 +02:00
Roland Scheidegger
73c6914195 gallivm: add gather support to sampler interface
Luckily thanks to the revamped interface this is a lot less work now...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-31 17:23:51 +02:00
Roland Scheidegger
1863ed21ff gallivm: simplify sampler interface
This has got a bit out of control with more and more parameters added.
Worse, whenever something in there changes all callees have to be updated
for that, even though they don't really do much with any parameter in there
except pass it on to the actual sampling function.
Hence simply put almost everything into a struct. Also instead of relying
on some arguments being NULL, be explicit and set this in a key (which is
just reused for function generation for simplicity). (The code still relies
on them being NULL in the end for now.)
Technically there is a minimal functional change here for shadow sampling:
whether shadow sampling is done is now determined explicitly by the texture
function (either sample_c or the gl-style tex func inherit this from target)
instead of the static texture state. These two should always match, however.
Otherwise, it should generate all the same code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-31 17:23:51 +02:00
Jose Fonseca
0fc5b80e7a util/debug: Update MgwHelp link, drop BfdHelp link. 2015-03-31 09:42:06 +01:00
Michel Dänzer
b8797a7875 gallivm: Fix build against LLVM 3.7 SVN r233648
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-31 15:05:01 +09:00
Eric Anholt
1dcc1ee314 vc4: Drop integer multiplies with 0 to moves of 0.
This cleans up more instructions generated by uniform array indexing
multiplies.
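
As an illustration only, the rewrite amounts to something like the following Python sketch. The tuple instruction format and the `mul24`/`mov` opcode names are assumptions of the sketch, not the vc4 IR.

```python
def simplify_mul_by_zero(instrs):
    """Turn integer multiplies with a constant-zero source into 'mov dest, 0'.

    Each instruction is (dest, op, src0, src1); sources are ints (immediates)
    or strings (temporaries).
    """
    out = []
    for dest, op, s0, s1 in instrs:
        if op == "mul24" and (s0 == 0 or s1 == 0):
            out.append((dest, "mov", 0, None))
        else:
            out.append((dest, op, s0, s1))
    return out
```

This is safe for integer multiplies because there is no NaN or infinity case to preserve.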

total instructions in shared programs: 39989 -> 39961 (-0.07%)
instructions in affected programs:     896 -> 868 (-3.12%)
2015-03-30 12:57:45 -07:00
Eric Anholt
8c5dcdbccb vc4: Add a constant folding pass.
This cleans up some pointless operations generated by the in-driver mul24
lowering (commonly generated by making a vec4 index for a matrix in a
uniform array).

I could fill in other operations, but pretty much anything else ought to
be getting handled at the NIR level, I think.
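
A minimal constant folding pass might look like the sketch below. It is Python over a made-up tuple IR; the opcode table is hypothetical and the real pass may fold a different set of operations.

```python
MASK32 = 0xFFFFFFFF

# Hypothetical opcode table for the sketch.
FOLDERS = {
    "add": lambda a, b: (a + b) & MASK32,
    "shl": lambda a, b: (a << (b & 31)) & MASK32,
    "and": lambda a, b: a & b,
}


def fold_constants(instrs):
    """Replace ops whose sources are all known constants with a 'mov'.

    Each instruction is (dest, op, src0, src1); sources are ints (immediates)
    or strings (temporaries).
    """
    known = {}   # temporaries whose value is a compile-time constant
    out = []
    for dest, op, s0, s1 in instrs:
        s0 = known.get(s0, s0)   # substitute constants for known temporaries
        s1 = known.get(s1, s1)
        if op in FOLDERS and isinstance(s0, int) and isinstance(s1, int):
            value = FOLDERS[op](s0, s1)
            known[dest] = value
            out.append((dest, "mov", value, None))
        else:
            if op == "mov" and isinstance(s0, int):
                known[dest] = s0
            out.append((dest, op, s0, s1))
    return out
```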

total uniforms in shared programs: 13423 -> 13421 (-0.01%)
uniforms in affected programs:     346 -> 344 (-0.58%)
2015-03-30 12:57:45 -07:00
Brian Paul
dbe67d76e0 glsl: allow ForceGLSLVersion to override #version directives
Previously, the ctx->Const.ForceGLSLVersion setting only worked if
the shader lacked a #version directive.  Now, the ForceGLSLVersion
setting will override the #version directive too.

This change should be safe since it should be rare to have an app
that has a mix of shader versions and we only wanted to override
the #version for shaders which lacked the #version directive.
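
The new precedence can be sketched as below. This is illustrative Python, not the GLSL compiler's code: `effective_glsl_version` is a hypothetical function, `force_version` models ctx->Const.ForceGLSLVersion with 0 meaning unset, and the default of 110 for shaders without a directive is an assumption of the sketch.

```python
import re


def effective_glsl_version(source, force_version=0, default_version=110):
    """Pick the language version used to compile `source`."""
    match = re.search(r"^\s*#\s*version\s+(\d+)", source, re.MULTILINE)
    if force_version:
        return force_version      # now overrides any #version directive
    if match:
        return int(match.group(1))
    return default_version
```

Before this change, the forced version would only have applied in the branch where no directive is found.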

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-30 11:25:39 -06:00
Eric Anholt
c519c4d85e vc4: Don't bother masking out the low 24 bits for integer multiplies
The hardware just uses the low 24 lines, saving us an AND to drop the high
bits.
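
To see why the AND is unnecessary, here is a Python model of building a 32-bit multiply from 24-bit ones. `mul24` is an assumed model of the hardware operation (multiply the low 24 bits of each input, keep the low 32 bits of the product), and `umul32` is a hypothetical name for the lowering.

```python
MASK24 = 0xFFFFFF
MASK32 = 0xFFFFFFFF


def mul24(a, b):
    """Assumed hardware model: only the low 24 bits of each input are used."""
    return ((a & MASK24) * (b & MASK24)) & MASK32


def umul32(a, b):
    """32-bit multiply built from mul24 with no explicit masking of inputs."""
    lo = mul24(a, b)                    # a_lo * b_lo
    cross1 = mul24(a >> 24, b) << 24    # a_hi * b_lo, shifted into place
    cross2 = mul24(a, b >> 24) << 24    # a_lo * b_hi, shifted into place
    # a_hi * b_hi would be shifted by 48 bits and vanishes modulo 2**32.
    return (lo + cross1 + cross2) & MASK32
```

Because `mul24` already ignores the upper 8 bits of its inputs, passing the unmasked `a` and `b` gives the same result as masking them first.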

total uniforms in shared programs: 13433 -> 13423 (-0.07%)
uniforms in affected programs:     356 -> 346 (-2.81%)
total instructions in shared programs: 40003 -> 39989 (-0.03%)
instructions in affected programs:     910 -> 896 (-1.54%)
2015-03-30 09:23:39 -07:00
Eric Anholt
5df8bf86fe vc4: Make integer multiply use 24 bits for the low parts.
The hardware uses the low 24 bits in integer multiplies, so we can have
fewer high bits (and so probably drop them more frequently).
2015-03-30 09:23:39 -07:00
Samuel Iglesias Gonsalvez
18004c338f glsl: fail when a shader's input var has not an equivalent out var in previous
GLSL ES 3.00 spec, 4.3.10 (Linking of Vertex Outputs and Fragment Inputs),
page 45 says the following:

"The type of vertex outputs and fragment input with the same name must match,
otherwise the link command will fail. The precision does not need to match.
Only those fragment inputs statically used (i.e. read) in the fragment shader
must be declared as outputs in the vertex shader; declaring superfluous vertex
shader outputs is permissible."
[...]
"The term static use means that after preprocessing the shader includes at
least one statement that accesses the input or output, even if that statement
is never actually executed."

And it includes a table with all the possibilities.

Similar table or content is present in other GLSL specs: GLSL 4.40, GLSL 1.50,
etc but for more stages (vertex and geometry shaders, etc).
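
The rule quoted above can be modelled as a small check. This Python sketch is not the linker's implementation; `check_varying_link` and its dictionary-based inputs are assumptions made for illustration.

```python
def check_varying_link(vs_outputs, fs_inputs, fs_static_uses):
    """Return a list of link errors for a vertex/fragment shader pair.

    vs_outputs and fs_inputs map variable names to type names;
    fs_static_uses is the set of fragment inputs the shader statically reads.
    """
    errors = []
    for name, fs_type in fs_inputs.items():
        if name in vs_outputs:
            if vs_outputs[name] != fs_type:
                errors.append(f"type mismatch for '{name}'")
        elif name in fs_static_uses:
            errors.append(f"'{name}' is read but not written by the previous stage")
    return errors
```

An input that is declared but never statically used links fine without a matching output, and superfluous vertex outputs are ignored.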

This patch detects that case and returns a link error. It fixes the following
dEQP test:

  dEQP-GLES3.functional.shaders.linkage.varying.rules.illegal_usage_1

However, it adds a new regression in piglit because the test has no
vertex shader and it checks the link status.

bin/glslparsertest \
tests/spec/glsl-1.50/compiler/gs-also-uses-smooth-flat-noperspective.geom pass \
1.50 --check-link

This piglit test is wrong according to the spec wording above, so if this patch
is merged it should be updated.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-03-30 13:29:05 +02:00
Michel Dänzer
d64adc3a79 radeonsi: Cache LLVMTargetMachineRef in context instead of in screen
Fixes a crash in genymotion with several threads compiling shaders
concurrently.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89746

Cc: 10.5 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-03-30 15:15:10 +09:00
Tapani Pälli
ce83a6ec81 glsl: fix unreachable(!"") to unreachable("")
Correct error with commit 151fb1e where assert was renamed
to unreachable without removing ! from string argument.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-30 08:16:00 +03:00
Emil Velikov
938b17940f docs: add news item and link release notes for mesa 10.5.2
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-28 19:21:31 +00:00
Emil Velikov
dc8d8a2951 docs: Add sha256 sums for the 10.5.2 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit ff87ae1e00)
2015-03-28 19:21:31 +00:00
Emil Velikov
6e19f6b4d0 Add release notes for the 10.5.2 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 5e59f895c4)
2015-03-28 19:21:31 +00:00
Ilia Mirkin
ee670c9efa freedreno/a3xx: add support for point sprite coordinate replacement
This does not (yet) support different coordinate origins, so the tests
still fail due to fbo flipping.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-28 14:54:41 -04:00
Ilia Mirkin
995f55a6ce freedreno/a3xx: make vs-set point size work
This appears to need the A2XX version of the point list, so select it at
draw time if necessary.

Experimentally, always using the A2XX version causes hangs when PSIZE
isn't actually emitted.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-28 14:54:41 -04:00
Ilia Mirkin
7fc5da8b93 freedreno/a3xx: point size should not be divided by 2
The division is probably a holdover from the days when the fixed point
inline functions generated by headergen were broken.

Also reduce the maximum point size to 4092 (vs 4096), which is what the
blob does.

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-28 14:54:41 -04:00
Ilia Mirkin
738c8319ac freedreno/a3xx: fix 3d texture layout
The SZ2 field contains the layer size of a lower miplevel. It only
contains 4 bits, which limits the maximum layer size it can describe. In
situations where the next miplevel would be too big, the hardware
appears to keep minifying the size until it hits one of that size.
Unfortunately the hardware's ideas about sizes can differ from
freedreno's which can still lead to issues. Minimize those by stopping
to minify as soon as possible.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-03-28 14:54:41 -04:00
Ilia Mirkin
3735643df3 freedreno/a3xx: LAYERSZ2 appears to have no effect on arrays
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-28 14:54:40 -04:00
Kenneth Graunke
72b06fb08e nir: Fix copy and pasted error message in nir_validate.
These are nir_cf_nodes, not ALU instructions.
Also, use unreachable() to preempt said review feedback.

v2: Do it right (thanks Ilia).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-28 09:36:46 -07:00
Kenneth Graunke
31dc63d5ca i965/nir: Use NIR for ARB_vertex_program support on Gen8+.
Everything is already in place; we simply have to take the scalar code
generation path.  This gives us SIMD8 VS programs, instead of SIMD4x2.

v2: Rebase on the patch that drops brw->gen >= 8.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-03-27 21:16:51 -07:00
Kenneth Graunke
ac69ab7302 i965: Move env_var_as_boolean to intel_debug.c.
I need to use this in brw_vec4.cpp, so it can't be static anymore.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-27 21:16:43 -07:00
Kenneth Graunke
826d3afb8f i965/fs: Add ARB_fragment_program support to the NIR backend.
Use prog_to_nir where we would normally call glsl_to_nir, handle program
parameter lists, and skip a few things that don't exist.

Using NIR generates much better shader code than Mesa IR, since we get
real optimizations, as opposed to prog_optimize:

total instructions in shared programs: 314007 -> 279892 (-10.86%)
instructions in affected programs:     285173 -> 251058 (-11.96%)
helped:                                2001
HURT:                                  67
GAINED:                                4
LOST:                                  7

v2: Change early return in nir_setup_uniforms to if/else (Jordan).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-03-27 21:16:34 -07:00
Kenneth Graunke
bf2c3bc316 nir: Lower subtraction to add with negation when !lower_negate.
prog->nir will generate fsub opcodes, but i965 doesn't implement them.
We may as well lower them at the NIR level, since it's trivial to do.
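
The lowering itself is small. A Python sketch over a made-up tuple IR follows; the `_neg` temporary naming is hypothetical, and NIR expresses this as an algebraic rewrite rather than a loop like this one.

```python
def lower_sub(instrs):
    """Rewrite (dest, 'fsub', a, b) as a negate followed by an add."""
    out = []
    for dest, op, a, b in instrs:
        if op == "fsub":
            tmp = f"{dest}_neg"                 # hypothetical fresh temporary
            out.append((tmp, "fneg", b, None))  # tmp = -b
            out.append((dest, "fadd", a, tmp))  # dest = a + (-b)
        else:
            out.append((dest, op, a, b))
    return out
```

On hardware with source negate modifiers, the separate `fneg` can typically be folded into the add's source.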

Suggested by Connor Abbott.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-03-27 21:16:34 -07:00
Kenneth Graunke
faf6106c6f nir: Implement a Mesa IR -> NIR translator.
Shamelessly ripped off from Eric Anholt's tgsi_to_nir pass.

This is not built on SCons, like the rest of NIR.

v2:
- Delete redundant c->s, c->impl, and c->cf_node_list pointers (Ken)
- Use nir_builder directly instead of ptn_compile in more places (Ken)
- Drop 'struct' keyword in front of nir_builder (ken)
- Add a file level Doxygen comment (Ken)
- Use scalar constants instead of splatting (Eric)
- Use nir_builder helpers for constants, moves, and swizzles (Connor)

v3: Minor indentation improvements.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-03-27 21:16:34 -07:00
Kenneth Graunke
06f7bea96a nir: Add builder helpers for MOVs with ALU sources and swizzling MOVs.
These will be useful for prog->nir and tgsi->nir.

v2: Don't forget to mark nir_swizzle as inline (Eric).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-03-27 21:16:33 -07:00
Kenneth Graunke
75c922e0fe nir: Add nir_builder helpers for creating load_const intrinsics.
Both prog->nir and tgsi->nir will want to use these.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-03-27 21:16:33 -07:00
Ben Widawsky
74fd226e34 i965/skl: Don't use the PMA depth stall workaround
The PMA depth stall must be enabled (optimization turned off) under certain
circumstances on gen8. This was supposedly fixed for Gen9, which means we do not
need to check, or toggle the state. The hardware is supposed to enable the
hardware optimization by default, unlike BDW, so we also don't need to set it at
init. For whatever reason this improves stability on ETQW with the bug mentioned
below.

References: https://bugs.freedesktop.org/show_bug.cgi?id=89039 (doesn't fix)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Anuj Phogat <anuj.phogat@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-27 21:04:41 -07:00
Ben Widawsky
9d32d35850 i965/skl: Disable partial resolve in VC
Recomendation [sic] is to set this field to 1 always. Programming it to default
value of 0, may have -ve impact on performance for MSAA WLs.

Another don't suck bit which needs to get set.

The patch wasn't as well tested as I would have liked, primarily I don't have
perf numbers for it, but it's getting to a point where it is in danger of being
lost.

v2: v1 was a mix of two patches. Since 0x7004 is masked, we only need to set it
once at initialization and make sure the pma workaround doesn't set the mask bit
(which it doesn't).
Move LRI to init gpu state (Ken)
Add a comment.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-27 21:04:37 -07:00
Roland Scheidegger
b2424fb030 llvmpipe: simplify address calculation for 4x4 blocks
These functions looked quite complicated, even though what they actually did
was trivial (ever since we dropped swizzled rendering). Also drop lookup of
format block per bytes done for each block, and do it once per scene instead.
This improves everybody's favorite "benchmark" by 3% or so, though
lp_rast_shade_quads_all() which calls this shows up still quite high for a
function which does little more than call the jit function.
(This would most likely be much better handled by the jit function itself,
the strides are passed through anyway already, though for being able to
handle layers it would definitely add some complexity.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-28 02:59:42 +01:00
Roland Scheidegger
764fc2be5a gallivm: fix texture function name (key) when using txf/ld
When using the texel fetch functions rather than ordinary texturing,
the arguments are all int vecs instead of float vecs, not to mention
the actual function would look completely different. Hence this must
be included in the texture function name (which serves as the key)
otherwise things crash badly when a shader accesses the same texture
and sampler unit with both txf/ld and ordinary texturing instructions
with otherwise matching keys.
2015-03-28 02:58:43 +01:00
Timothy Arceri
2cb149c289 glsl: mark uniform and input interface blocks as read only
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-28 10:18:40 +11:00
Ilia Mirkin
58030a8f99 nv50/ir/gk110: fix offset flag position for TXD opcode
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-27 19:02:19 -04:00
Ilia Mirkin
49b86007aa nv50/ir: take postFactor into account when doing peephole optimizations
Multiply operations can have a post-factor on them, which other ops
don't support. Only perform the peephole optimizations when there is no
post-factor involved.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89758
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-27 19:02:19 -04:00
Jan Vesely
a99a16a0cf gallivm: Fix build since llvm r233411
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-03-27 18:57:02 -04:00
Eric Anholt
afa9fc1561 nir: Add optional lowering of flrp.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-03-27 13:29:48 -07:00
Roland Scheidegger
56076be2ac gallivm: use llvm function calls for texturing instead of inlining
There are issues with inlining everything, most notably llvm will use much
more memory (and be slower) when compiling. Ideally we'd probably use
functions for shader functions too but texture sampling usually is responsible
for quite some IR (it can easily reach 80% of total IR instructions) so this
seems like a good start.
This still generates a different function for all different combinations just
like before, however it is possible llvm is missing some optimization
opportunities - it is believed though such opportunities should be somewhat
rare, but at least for now it can still be switched off (at compile time only).
It should probably make compiled code also smaller because the same function
should be used for different variants in the same module (so for the
opaque/partial or linear/elts variants).
No piglit change (though it does indeed speed up unrealistic tests like
fp-indirections2 by a factor of 30 or so).
Has a small negative performance impact in openarena - I suspect this could
be fixed by running some IPO passes (despite the private linkage, llvm right
now does NO optimization at all wrt anything going past the call, even if
there's just one caller - so things like values stored before the call and then
always written by the function etc. will not be optimized away, nor will dead
arguments (which we mostly shouldn't have) be eliminated, always constant
arguments promoted etc.).

v2: use proper return values instead of pointer function arguments.
llvm supports aggregate return values, which do wonders here eliminating
unnecessary stack variables - everything in fact will be returned in registers
even without any IPO optimizations. It makes the code simpler too.
With this I could not measure a performance impact in openarena any longer
(though since there's still no constant value propagation etc. into the tex
functions this does not mean it couldn't have a negative impact elsewhere).

v3: fix some minor issues suggested by Jose, and do disassembly (and the
profiling) without hacks.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-27 19:25:53 +01:00
Roland Scheidegger
8dad9455ff gallivm: pass jit_context pointer through to sampling
The callbacks used for getting the dynamic texture/sampler state were using
the jit_context from the generated jit function. This works just fine, however
that way it's impossible to generate separate functions for texture sampling,
as will be done in the next commit. Hence, pass this pointer through all
interfaces so it can be passed to a separate function (technically, it would
probably be possible to extract this pointer from the current function instead,
but this feels hacky and would probably require some more hacks if we'd use
real functions instead of inlining all shader functions at some point).
There should be no difference in the generated code for now.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-27 19:25:53 +01:00
Christian König
787aa26cb7 gallium/vl: partially revert "Use util_cpu_to_le{16,32} in many more places."
The data in memory is in big endian format and needs to be converted
into CPU byte order. So the patch actually reversed what needs to be done.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-27 11:30:32 +01:00
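
To illustrate the direction of the conversion described in 787aa26cb7 (big-endian data in memory, converted to CPU byte order), here is a minimal sketch. The helper name read_be32 is hypothetical and is not the function gallium/vl uses; it only shows a big-endian-to-CPU read that behaves the same on little- and big-endian hosts.

```c
#include <stdint.h>

/* Interpret four bytes stored in big-endian order as a 32-bit value in the
 * CPU's native representation.  Because the value is assembled from
 * individual bytes, the result does not depend on the host's endianness. */
static uint32_t read_be32(const uint8_t *p)
{
   return ((uint32_t)p[0] << 24) |
          ((uint32_t)p[1] << 16) |
          ((uint32_t)p[2] << 8)  |
           (uint32_t)p[3];
}
```

A cpu-to-little-endian helper applied to such data would go the wrong way on both counts, which is presumably what the partial revert addresses.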
Ilia Mirkin
626434893a tgsi: fix out-of-bounds access for cube arrays
The CUBE_ARRAY case uses r[4]. Make sure that the stack variable is
there.

Noticed by Coverity.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-03-26 21:02:09 -04:00
Ilia Mirkin
f95a6b2ff4 st/mesa: initialize have_fma in constructor
Spotted by Coverity.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-26 21:02:09 -04:00
Ilia Mirkin
1b87d73a9f gallium/util: remove u_linkage
Does not appear to be used in tree. Coverity spotted some errors in the
bitmask stuff, but the whole thing appears to be unused.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-26 21:02:09 -04:00
Ilia Mirkin
2e34faaf06 gallium/hud: avoid overflowing hud graph name size
Spotted by Coverity.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-03-26 21:02:08 -04:00
Ilia Mirkin
9d1b5febb6 st/mesa: update arrays when the current attrib has been updated
Fixes the recently-sent gl-2.0-vertex-const-attr piglit test. Makes sure
to revalidate arrays when only the current attribute has been updated
via glVertexAttrib*.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89754
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-03-26 21:01:59 -04:00
Dave Airlie
91e3533481 st_glsl_to_tgsi: only do mov copy propagation on temps (v2)
Don't propagate ARRAYs

This should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=89759

v2: just specify arrays so we get input propagation
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-26 12:03:44 +10:00
Kenneth Graunke
ef09cfb51e i965: Drop unnecessary brw->gen >= 8 check from scalar VS code.
brw->scalar_vs already implies that brw->gen >= 8.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-03-25 16:19:26 -07:00
Kenneth Graunke
649173b473 i965/fs: Implement texture projection support.
Our fragment program backend implements support for TXP directly, and
there's no NIR lowering pass to remove the projection.  When we switch
fragment program support over to NIR, we need to support it somehow.

It's easy enough to support directly.

v2: Split out offset/tex_offset rename (requested by Jordan).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-03-25 16:17:19 -07:00
Kenneth Graunke
0a9bcf9e39 i965/fs: Rename offset to tex_offset to avoid shadowing offset().
fs_visitor::nir_emit_texture() created an fs_reg variable called offset,
which shadowed the offset() helper function in brw_ir_fs.h.

Rename the variable to tex_offset so we can still call offset().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-03-25 16:17:19 -07:00
Kenneth Graunke
3120345f40 nir: Add glsl_float_type() wrapper.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-03-25 16:17:19 -07:00
Matt Turner
871f1080d0 glsl: Use INFINITY instead of std::numeric_limits<float>::infinity().
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-25 15:06:48 -07:00
Emil Velikov
5dc573e5de automake: add missing egl files to the tarball
Namely the Haiku EGL driver backend and the SConscript for the dri2 EGL
driver backend.

Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-25 21:00:02 +00:00
Ian Romanick
6075780247 glsl: Constify ir_instruction::equals
v2: Don't be lazy.  Constify the as_foo functions and use those instead
of ugly casts.  Suggested by Curro.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-03-25 10:41:08 -07:00
Ian Romanick
dec9664e35 glsl: Constify the as_foo functions
Now that they're all implemented using macros, this is trivial.

v2: Remove redundant parentheses.  Suggested by Curro.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-03-25 10:40:52 -07:00
Ian Romanick
0c4ee62045 glsl: Implement remaining as_foo functions with macros
The downcast functions for non-leaf classes were previously implemented
"by hand."  Now they are implemented using macros based on the is_foo
functions added in the previous patch.

v2: Remove redundant parentheses.  Suggested by Curro (on the next
patch).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-03-25 10:39:09 -07:00
Ian Romanick
a284c63006 glsl: Add is_rvalue, is_dereference, and is_jump methods
These functions determine when an IR node is one of the non-leaf
classes.

v2: Adjust indentation to line up.  Suggested by Matt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-03-25 10:34:59 -07:00
Jose Fonseca
25d6cdd2ff util/u_atomic: Ignore warnings interlocked accesses.
These are due to how we implemented the atomic tests, not the atomic
implementation itself.  It's also difficult to refactor the code to
avoid the warnings due to the use of macros -- the code would be quite
hairy.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-25 10:42:48 +00:00
Jose Fonseca
28c54400af scons: Disable MSVC warnings about inconsistent function annotation.
Somehow, merely including any of the *intrin.h headers causes dozens of
these warnings (when compiling pretty much every source file).  MSVC does
not always complain the same -- so it's possible we're doing something
weird --, but silence these warnings in the meanwhile.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-25 10:42:45 +00:00
Jose Fonseca
cb88edbd4e mesa: Avoid MSVC C6334 warning in /analyze mode.
MSVC's implementation of signbit(x) uses sizeof(x) with expressions to
dispatch to an internal function based on the argument's type (float,
double, etc), but that raises a flag with MSVC's own static analyzer,
and because this is an inline function in a header it causes substantial
warning spam.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-25 10:42:43 +00:00
Jose Fonseca
fdb507e3d6 c99_math: Don't reimplement lrint and friends on MSVC 2013.
MSVC 2013 declares these functions, both for C and C++ source files.

This was caught with MSVC in analyze mode.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-25 10:42:41 +00:00
Jose Fonseca
69db422218 scons: Don't build osmesa.
There doesn't seem to be much interest in osmesa on Windows, particularly classic osmesa.

If there is indeed interest in osmesa on Windows, we should instead
integrate src/gallium/targets/osmesa into SCons.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-25 10:42:38 +00:00
Jose Fonseca
47870d658b scons: Don't build loader on Windows.
EGL was the last user.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-25 10:42:35 +00:00
Jose Fonseca
f9b8c9299d scons: Don't build egl on Windows.
Useless, as there are no drivers for it.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-25 10:42:32 +00:00
Jose Fonseca
5db57b8a55 scons: Fix git_sha1.h generation fallback.
I didn't mean to remove the 'if not os.path.exists(filename)' statement.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-25 10:42:26 +00:00
Martin Peres
31a30fb342 docs: Update progress on ARB_direct_state_access.
v2:
- Fix the state of the Program pipelines and Query objects (Laura)

Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-25 10:05:45 +02:00
Martin Peres
bf11c195a5 main: Added entry points for NamedRenderbufferStorage/Multisample
v2: Review from Laura Ekstrand
- get rid of a change that should not have happened in this patch
- improve the error messages
- fix alignments
- fix a capitalization in a function name in an error message

v3: Review from Laura Ekstrand
- move the test for the validity of the renderbuffer to less generic
  functions
- get rid of some changes that accidentally landed in the wrong commit
- revert some alignment fixes

v4: Review from Laura Ekstrand
- check that the lookup returns a valid renderbuffer
- cosmetic changes to some error messages

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-25 10:05:45 +02:00
Martin Peres
245e5c4813 main: Added entry point for glGetNamedRenderbufferParameteriv
v2:
- improve an error message

v3:
- move a test to less generic functions
- fix an alignment

v4:
- take the caller as a parameter instead of bool dsa
- check that the lookup returns a valid renderbuffer

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-25 10:05:45 +02:00
Martin Peres
a34669b961 main: Added entry point for glCreateRenderbuffers
v2:
- refactor bindRenderBuffer and create_render_buffers to fix an assertion

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-25 10:05:45 +02:00
Martin Peres
73a9d0fbe5 main: Added entry point for glCreateSamplers
Because of the way the code is currently architected, there is no
functional difference between the DSA and the non-DSA path.

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-25 10:05:45 +02:00
Martin Peres
b09f2ee8f7 main: Added entry point for glCreateProgramPipelines
v2:
- add spaces in an error message (Laura)

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-25 10:05:45 +02:00
Martin Peres
19e6efc0ad main: Added entry points for glGetQueryBufferObject*
These entry points will be fleshed out when the GL_ARB_query_buffer_object
extension gets implemented. In the meantime, return GL_INVALID_OPERATION as
suggested by Ian.

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-25 10:05:45 +02:00
Martin Peres
c3c1ed874e main: Added entry point for glCreateQueries
v2:
- display the name of the target instead of its id (Laura)

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-25 10:05:45 +02:00
Martin Peres
6ead10d08f main: Added entry point for glGetTransformFeedbacki64_v
v2: Review from Laura Ekstrand
- use the transform feedback object lookup wrapper

v3:
- use the new name of _mesa_lookup_transform_feedback_object_err

v4: Review from Laura Ekstrand
- fix some alignment problems

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-25 10:05:45 +02:00
Martin Peres
8799ecddb6 main: Added entry point for glGetTransformFeedbacki_v
v2: Review from Laura Ekstrand
- use the transform feedback object lookup wrapper

v3:
- use the new name of _mesa_lookup_transform_feedback_object_err

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-25 10:05:45 +02:00
Martin Peres
e59d2434a0 main: Added entry point for glGetTransformFeedbackiv
v2: Review from Laura Ekstrand
- use the transform feedback object lookup wrapper

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-25 10:05:45 +02:00
Martin Peres
296d82376e main: Added entry point for glTransformFeedbackBufferRange
v2: review from Laura Ekstrand
- use the refactored code to lookup the objects
- improve some error messages
- factor out the gl method name computation
- better handle the spec differences between the DSA and non-DSA cases
- quote the spec a little more

v3: review from Laura Ekstrand
- use the new name of _mesa_lookup_bufferobj_err
- swap the comments around the offset and size checks

v4: review from Laura Ekstrand
- add more spec quotes
- properly fix the comments around the offset and size checks

v5: review from Laura Ekstrand
- add quotes on the spec citations
- revert some changes in the printf format

v6: review from Laura Ekstrand
- remove a redundant "gl" in a method name

Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-03-25 10:05:45 +02:00
Martin Peres
a5d165afed main: Added entry point for glTransformFeedbackBufferBase
v2: Review from Laura Ekstrand
- give more helpful error messages
- factor the lookup code for the xfb and objBuf
- replace some already-existing tabs with spaces
- add comments to explain the cases where xfb == 0 or buffer == 0
- fix the condition for binding the transform buffer or not

v3: Review from Laura Ekstrand
- rename _mesa_lookup_bufferobj_err to
  _mesa_lookup_transform_feedback_bufferobj_err and make it static to avoid a
  future conflict
- make _mesa_lookup_transform_feedback_object_err static

v4: Review from Laura Ekstrand
- add the pdf page number when quoting the spec
- rename some of the symbols to follow the public/private conventions

v5: Review from Laura Ekstrand
- properly rename some of the symbols to follow the public/private conventions
- fix some alignments
- add quotes around a spec citation
- add back a newline I accidentally deleted
- add spaces around the ternary operator usages

Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-03-25 10:05:45 +02:00
Martin Peres
c86cb2da25 main: Added entry point for glCreateTransformFeedbacks
v2: Review from Laura Ekstrand
- generate the name of the gl method once
- shorten some lines to stay in the 78 chars limit

v3: Review from Fredrik Höglund <fredrik@kde.org>
- rename gl_mthd_name to func
- set EverBound in _mesa_create_transform_feedbacks in the dsa case

v4:
- rename _mesa_create_transform_feedbacks to create_transform_feedbacks and
  make it static

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-25 10:05:45 +02:00
Martin Peres
fc76fac419 main: fix the validation of the number of samples
Maybe this should be the job of the dispatch layer.

v2:
- add the section name and pdf page number of the quote (Laura)
- OpenGL 3.0 core does not exist, get rid of "core"

Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-25 10:05:45 +02:00
Martin Peres
7bd8b48084 main: replace tabs by 8 spaces in fbobject.c
Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-25 10:05:45 +02:00
Martin Peres
cd0763b78f main: replace tabs by 8 spaces in bufferobj.c
Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-25 10:05:45 +02:00
Kristian Høgsberg
169b389a34 mesa: Apply visibility flags to src/Makefile.am targets
We were building libglsl_util.la without our visibility flags and
leaking hash_table_* symbols.

Signed-off-by: Kristian Høgsberg <kristian.h.kristensen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-24 22:02:57 -07:00
Matt Turner
babd0fa3e2 nir: Fix typo. 2015-03-24 19:14:40 -07:00
Matt Turner
3fb56805f0 nir: Recognize sat(add(b2f(a), b2f(b))) as a logical OR.
Transform this into b2f(or(a, b)).

instructions in affected programs:     432 -> 430 (-0.46%)
helped:                                2

Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-24 14:43:37 -07:00
Matt Turner
c31158d2cb nir: Recognize mul(b2f(a), b2f(b)) as a logical AND.
Transform this into b2f(and(a, b)).

total instructions in shared programs: 6205448 -> 6204391 (-0.02%)
instructions in affected programs:     284030 -> 282973 (-0.37%)
helped:                                903
HURT:                                  6

Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-03-24 14:43:37 -07:00
Matt Turner
b481ebbe39 glsl: Recognize sat(add(b2f(a), b2f(b))) as a logical OR.
Transform this into b2f(or(a, b)).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-24 14:43:37 -07:00
Matt Turner
c8e8f66036 glsl: Recognize mul(b2f(a), b2f(b)) as a logical AND.
Transform this into b2f(and(a, b)).

total instructions in shared programs: 6190291 -> 6189225 (-0.02%)
instructions in affected programs:     267247 -> 266181 (-0.40%)
helped:                                866

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-24 14:43:37 -07:00
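
The two rewrites in the commits above (mul(b2f(a), b2f(b)) to b2f(and(a, b)), and sat(add(b2f(a), b2f(b))) to b2f(or(a, b))) can be checked exhaustively, since each input takes only two values. The sketch below assumes b2f maps false/true to 0.0/1.0 and sat clamps to [0, 1]; the function names are illustrative and are not the compiler's own.

```c
#include <stdbool.h>

/* b2f: boolean to float, false -> 0.0f, true -> 1.0f. */
static float b2f(bool x)
{
   return x ? 1.0f : 0.0f;
}

/* sat: clamp a float to the range [0, 1]. */
static float sat(float x)
{
   return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Returns true if both algebraic rewrites hold for all four input pairs. */
static bool rewrites_hold(void)
{
   for (int a = 0; a <= 1; a++) {
      for (int b = 0; b <= 1; b++) {
         /* mul(b2f(a), b2f(b)) == b2f(and(a, b)) */
         if (b2f(a) * b2f(b) != b2f(a && b))
            return false;
         /* sat(add(b2f(a), b2f(b))) == b2f(or(a, b)) */
         if (sat(b2f(a) + b2f(b)) != b2f(a || b))
            return false;
      }
   }
   return true;
}
```

The saturate is what makes the OR case work: without it, a = b = true would give 2.0 rather than 1.0.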
Matt Turner
95729d2458 nir: Handle mixed scalar/vector arguments to logical and/or/xor.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-03-24 14:43:37 -07:00
Matt Turner
c8acbd1bfd glsl: Allow vector logic ops to be generated.
They're not accessible from the source language, but optimizations are
allowed to generate them.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-24 14:42:51 -07:00
Emil Velikov
248eb54eb6 configure.ac: move AC_MSG_RESULT reporting back into the m4 macro
The one who does AC_MSG_CHECKING should provide the AC_MSG_RESULT.

Fixes: ced9425327 ("configure: Introduce new output variable to
ax_check_python_mako_module.m4")

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89328
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
2015-03-24 20:49:32 +00:00
Jonathan Gray
726d99b197 gallium/util: Use HAVE___BUILTIN_FFS* macros.
Make use of the builtin ffs macros and split out ffsll
to a separate block.  Needed for at least OpenBSD which
does not have ffsll in libc.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-24 20:49:32 +00:00
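
A minimal sketch of the pattern 726d99b197 describes: use the compiler builtin when the build system reports it, otherwise fall back to a portable loop so that platforms without ffsll in libc still build. It assumes a configure-defined macro named HAVE___BUILTIN_FFSLL (following the HAVE___BUILTIN_FFS* naming in the subject line); the wrapper name sketch_ffsll is hypothetical and is not the gallium/util code.

```c
/* Find-first-set for 64-bit values: returns the 1-based index of the least
 * significant set bit, or 0 if no bit is set. */
static int sketch_ffsll(long long val)
{
#ifdef HAVE___BUILTIN_FFSLL
   /* Assumed to be defined by the build system when the builtin exists. */
   return __builtin_ffsll(val);
#else
   /* Portable fallback for platforms lacking both the builtin and ffsll(). */
   unsigned long long v = (unsigned long long)val;
   int bit = 1;

   if (v == 0)
      return 0;

   while ((v & 1) == 0) {
      v >>= 1;
      bit++;
   }
   return bit;
#endif
}
```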
Emil Velikov
8cce7b05f1 i965: add the remaining files to the tarball
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-24 20:49:31 +00:00
Emil Velikov
9950eec173 glsl: add the remaining files to the tarball
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-24 20:49:31 +00:00
Emil Velikov
b2439602be makefile: add all headers to the tarball
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-24 20:49:31 +00:00
Emil Velikov
113d59fb55 gbm: remove gbm_gallium_drm from the loader
No longer used as of commit 48c7461d5a0 (st/gbm: remove state-tracker).

v2: Add commit message.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> (v1)
2015-03-24 20:49:31 +00:00
Anuj Phogat
d8208312a3 glsl: Generate link error for non-matching gl_FragCoord redeclarations
in different fragment shaders. This also applies to a case when gl_FragCoord
is redeclared with no layout qualifiers in one fragment shader and not
declared but used in other fragment shader.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Khronos Bug#12957
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-03-24 11:16:31 -07:00
Eric Anholt
7bc39c8418 vc4: Add a dump-the-surface-contents routine.
This has been useful once again while trying to debug stride issues
between render targets and texturing.
2015-03-24 10:39:12 -07:00
Eric Anholt
4df13f55b6 vc4: Allow DRI3 on simulation, as well.
The problem I'd seen before seems to be gone.
2015-03-24 10:39:12 -07:00
Eric Anholt
7f797e3d17 vc4: Fix pitch alignment of linear textures.
Fixes some non-power-of-two texture rendering when I force ARGB8888 to
raster.
2015-03-24 10:39:12 -07:00
Eric Anholt
b3ea377f86 vc4: Write the alignment of level width consistently in validation.
16 / cpp happens to be the same as utile_w on the only raster format
supported (4 bytes per pixel), but simulator/hw source code generally
talks in terms of utiles.
2015-03-24 10:39:12 -07:00
Eric Anholt
8975a09494 vc4: Fix use of a bool as an enum.
The enum compared to was 0, so it worked out, but it sure looked wrong.
2015-03-24 10:39:12 -07:00
Eric Anholt
04605c21f6 vc4: Decide the HW's format before laying out the miptree.
I'm experimenting with a workaround for raster texture misrendering on
hardware, and this lets me look at the format chosen when computing
strides.
2015-03-24 10:39:12 -07:00
Eric Anholt
25d60763d9 vc4: Use our device-specific ioctls for create/mmap.
They don't do anything special for us, but I've been told by kernel
maintainers that relying on dumb for my acceleration-capable buffers
is not OK.
2015-03-24 10:39:12 -07:00
Eric Anholt
af3d747194 vc4: Make a new #define for making code conditional on the simulator.
I'd like to compile as much of the device-specific code as possible
when building for simulator, and using if (using_simulator) instead of
ifdefs helps.
2015-03-24 10:39:12 -07:00
Eric Anholt
9bafcf630a vc4: Add some useful debug printfs for miptrees.
I keep rewriting these.
2015-03-24 10:39:12 -07:00
Ilia Mirkin
baa22c8825 glsl: avoid calling base_alignment when samplers are involved
Earlier commit 53bf7c8fd2 changed the logic to always call
base_alignment on structs. 1ec715ce8b hacked the function to return 0
for sampler fields, but didn't handle sampler arrays. Instead of
extending the hack, avoid calling base_alignment in the first place on
non-UBO uniforms.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89726
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Palli <tapani.palli@intel.com>
2015-03-24 10:10:13 -04:00
Ilia Mirkin
43277fcd59 Revert "nv50,nvc0: remove bogus 64_FLOAT formats"
This reverts commit 20346808cf.

The conversion is actually done since these are the *B macro variants
and no vtx format is supplied, which makes them go through the translate
module.

This restores the following piglit tests to passing:

  draw-vertices user
  gl-2.0-vertexattribpointer

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-23 20:57:52 -04:00
Mario Kleiner
1110113a7f mapi: Make private copies of name strings provided by client.
glXGetProcAddress("glFoo") ends up in stub_add_dynamic() to
create dynamic stubs for dynamic functions. stub_add_dynamic()
doesn't store the caller provided name string "Foo" in a mesa
private copy, but just stores a pointer to the "glFoo" string
passed to glXGetProcAddress - a pointer into arbitrary memory
outside mesa's control.

If the caller passes some dynamically allocated/changing
memory buffer to glXGetProcAddress(), or the caller gets unmapped
from memory, e.g., some dynamically loaded application
plugin which uses OpenGL, this ends badly - with a dangling
pointer.

strdup() the name string provided by the client to avoid
this problem.

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-23 22:17:03 +00:00
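
The fix described in 1110113a7f amounts to owning a copy of the name instead of keeping the caller's pointer. A minimal sketch follows; struct dyn_stub and its helpers are hypothetical stand-ins, not mapi's actual types, and the copy is written out with malloc/memcpy, which is equivalent to what strdup() does.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a dynamic stub entry. */
struct dyn_stub {
   char *name;   /* privately owned copy of the function name, or NULL */
};

/* Store a private copy of name so the entry stays valid after the caller's
 * buffer changes or goes away.  Returns 0 on success, -1 on allocation
 * failure (in which case the previous name is left untouched). */
static int dyn_stub_set_name(struct dyn_stub *stub, const char *name)
{
   size_t len = strlen(name) + 1;
   char *copy = malloc(len);

   if (!copy)
      return -1;

   memcpy(copy, name, len);
   free(stub->name);
   stub->name = copy;
   return 0;
}

/* Release the owned copy. */
static void dyn_stub_clear(struct dyn_stub *stub)
{
   free(stub->name);
   stub->name = NULL;
}
```

With the copy in place, a caller that overwrites or unmaps its own buffer after the lookup no longer affects the stored name.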
Tom Stellard
dfb1ae9d91 clover: Return 0 as storage size for local kernel args that are not set v2
The storage size for local kernel args can be queried before the
arguments are set by using the CL_KERNEL_LOCAL_MEM_SIZE param
of clGetKernelWorkGroupInfo().

The spec says that if local kernel arguments have not been specified,
then we should assume their size is 0.

v2:
  - Implement using c++11 member initialization.

Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
2015-03-23 17:20:21 +00:00
Tom Stellard
769b366b83 gallivm: Use MCInstrInfo in the disassembler for querying instruction info
This fixes the build since llvm r232885 and also simplifies the code.
2015-03-23 14:43:10 +00:00
Giuseppe Bilotta
7932b30892 clover: use get_device_vendor instead of get_vendor
The pipe's get_vendor method returns something more akin to a driver
vendor string in most cases, instead of the actual device vendor. Use
get_device_vendor instead, which was introduced specifically for this
purpose.

Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-03-23 13:25:34 +00:00
Giuseppe Bilotta
76039b38f0 gallium: implement get_device_vendor() for existing drivers
The only hackish ones are llvmpipe and softpipe, which currently return
the same string as for get_vendor(), while ideally they should return
the CPU vendor.

Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-03-23 13:25:34 +00:00
Giuseppe Bilotta
31d4e6fbff gallium: introduce get_device_vendor() entrypoint for pipes
This will be needed by Clover to return the correct information
to CL_DEVICE_VENDOR info queries.

Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-23 13:25:34 +00:00
Giuseppe Bilotta
9280f17e82 gallium: remove trailing whitespace in p_screen.h
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-23 13:25:34 +00:00
Tom Stellard
6e17936bf8 clover: The unit for CL_DEVICE_MEM_BASE_ADDR_ALIGN is bits not bytes
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-03-23 13:22:42 +00:00
Tom Stellard
2b12b1752a clover: Add all the mandatory 1.1 extensions to the extension string
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-03-23 13:22:42 +00:00
Tom Stellard
96f9cc9181 clover: Add a space at the end of CL_DEVICE_OPENCL_C_VERSION
This is required by the spec.

Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-03-23 13:22:42 +00:00
Francisco Jerez
3d1bba7c9b i965/vec4: Fix handling of multiple register reads and writes in dead_code_eliminate().
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:52:57 +02:00
Francisco Jerez
2babde35b9 i965/vec4: Calculate live intervals with subregister granularity.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:52:57 +02:00
Francisco Jerez
e6e655ef76 i965/vec4: Define helpers to calculate the common live interval of a range of variables.
These will be especially useful when we start keeping track of
liveness information for each subregister.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:52:49 +02:00
Francisco Jerez
eddb87402e i965/vec4: Define helper functions to convert a register to a variable index.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:13:05 +02:00
Francisco Jerez
ce030a6399 i965/vec4: Don't lose the force_writemask_all flag during CSE.
And set it in the MOV instructions that copy the temporary to the
original destination if the generator instruction had it set.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:13:00 +02:00
Francisco Jerez
1db9c0cd0c i965/vec4: Fix handling of multiple register reads and writes in opt_cse().
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:12:56 +02:00
Francisco Jerez
d041a43c0f i965/vec4: Fix handling of multiple register reads and writes during copy propagation.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:12:52 +02:00
Francisco Jerez
588859e18c i965/vec4: Fix handling of multiple register reads and writes in split_virtual_grfs().
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:12:48 +02:00
Francisco Jerez
9304f60cbe i965/vec4: Fix handling of multiple register reads and writes in opt_register_coalesce().
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:12:40 +02:00
Francisco Jerez
74c7e5d351 i965: Define method to check whether a backend_reg is inside a given range.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:12:36 +02:00
Francisco Jerez
bf6eb37e0b i965/vec4: Remove dependency of vec4_live_variables on the visitor.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:12:13 +02:00
Francisco Jerez
2e7622a487 i965/vec4: Trivial copy propagate clean-up.
Fix typo and punctuation in a comment, break long line and add space
before curly bracket.

Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2015-03-23 14:09:33 +02:00
Francisco Jerez
7526ee36bc i965/vec4: Add argument index and type checks to SEL saturate propagation.
SEL saturate propagation already implicitly relies on these
assumptions.

Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2015-03-23 14:09:33 +02:00
Francisco Jerez
24073b2cd7 i965/vec4: Fix broken saturate mask check in copy propagation.
try_copy_propagate() was checking the bit of the saturate mask for the
arg-th component of the source to decide whether the whole source
should be saturated (WTF?).  We need to swizzle the original saturate
mask and check that for all enabled channels the saturate flag is
either set or unset, as we cannot saturate a subset of destination
components only.
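
The check described above can be sketched in C as follows. This is only an illustration under assumed encodings (a swizzle as an array of four component indices, masks with one bit per channel); the helper names are hypothetical and are not the functions used in the i965 back-end.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed encoding: swz[i] names the source component (0 = X .. 3 = W)
 * read by channel i; masks have bit i set for channel i. */

/* Move each saturate bit to the channel that reads its component. */
static unsigned
swizzle_mask(unsigned saturate_mask, const unsigned swz[4])
{
   unsigned result = 0;

   for (unsigned i = 0; i < 4; i++) {
      assert(swz[i] < 4);
      if (saturate_mask & (1u << swz[i]))
         result |= 1u << i;
   }

   return result;
}

/* Return true if the enabled channels agree on saturation, storing the
 * common value in *saturate.  Return false if only a subset of the
 * enabled channels would need saturating. */
static bool
saturate_is_uniform(unsigned saturate_mask, const unsigned swz[4],
                    unsigned writemask, bool *saturate)
{
   const unsigned enabled = swizzle_mask(saturate_mask, swz) & writemask;

   if (enabled == 0) {
      *saturate = false;
      return true;
   }

   if (enabled == writemask) {
      *saturate = true;
      return true;
   }

   return false;
}
```

With a YYZZ swizzle and a YZ writemask, a saturate mask covering Y and Z is accepted, while a mask covering only Y is rejected because Z would be left unsaturated.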

Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2015-03-23 14:09:33 +02:00
Francisco Jerez
18dc59c212 i965/vec4: Don't lose copy propagation saturate bits for not written components.
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2015-03-23 14:09:33 +02:00
Francisco Jerez
a3733defbe Revert "i965/vec4: Don't lose the saturate modifier in copy propagation."
This reverts commit 0dfec59a27.  The
change prevented propagation of copies with the saturate flag set,
making the whole saturate mask tracking completely useless.  A proper
fix follows.

Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2015-03-23 14:09:33 +02:00
Francisco Jerez
21c829e5cc i965/vec4: Remove unused method definition.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:09:33 +02:00
Francisco Jerez
516d45f78a i965/vec4: Some more trivial swizzle clean-up.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:09:33 +02:00
Francisco Jerez
430c6bf70e i965/vec4: Improve src_reg/dst_reg conversion constructors.
This simplifies the src_reg/dst_reg conversion constructors using the
swizzle utils introduced in a previous patch.  It also makes them more
useful by changing their semantics slightly: dst_reg(src_reg) used to
set the writemask to XYZW if the src_reg swizzle was anything other
than XXXX, which was almost certainly not what the caller intended if
the swizzle was non-trivial.  After this patch the same components
that are present in the swizzle will be enabled in the resulting
writemask.

src_reg(dst_reg) used to set the first components of the swizzle to
the enabled components of the writemask and then replicate the last
enabled component to fill the swizzle, which, in cases where the
writemask didn't have exactly the first n components set, would in
general not be compatible with the original dst_reg.  E.g.:

| ADD(tmp, src_reg(tmp), src_reg(1));

would *not* do what one would expect (add one to each of the enabled
components of tmp) if tmp didn't have a writemask of the described
form (e.g. YZ, YW, XZW would all fail).  This pattern actually occurs
in many different places in the VEC4 back-end; it's a wonder that it
hasn't caused piglit failures until now.  After this patch
src_reg(dst_reg) will construct a swizzle with each enabled component
at its natural position (e.g. Y at the second position, Z at the
third, and so on).  The resulting swizzle will behave like the
identity when used in any instruction with the original writemask.
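
As a rough illustration of the two semantics, the following C sketch builds both swizzles from a writemask. The array encoding and the function names are assumptions made for this example, and the choice of filler for disabled slots in the "natural" variant is likewise an assumption, not necessarily what the back-end does.

```c
#include <assert.h>

/* Assumed encoding: writemask bit i enables channel i (0 = X .. 3 = W),
 * and swz[i] is the component read by channel i. */

/* Old behaviour as described: pack the enabled components into the
 * first slots, then replicate the last enabled component. */
static void
swizzle_packed(unsigned writemask, unsigned swz[4])
{
   unsigned n = 0, last = 0;

   assert(writemask < 16);
   for (unsigned i = 0; i < 4; i++) {
      if (writemask & (1u << i))
         swz[n++] = last = i;
   }

   while (n < 4)
      swz[n++] = last;
}

/* New behaviour as described: every enabled component sits at its
 * natural position.  Disabled slots reuse a neighbouring enabled
 * component (an assumption; any valid component would do). */
static void
swizzle_natural(unsigned writemask, unsigned swz[4])
{
   unsigned last = 0;

   assert(writemask < 16);
   /* Seed with the first enabled component for leading disabled slots. */
   for (unsigned i = 0; i < 4; i++) {
      if (writemask & (1u << i)) {
         last = i;
         break;
      }
   }

   for (unsigned i = 0; i < 4; i++) {
      if (writemask & (1u << i))
         last = i;
      swz[i] = last;
   }
}

/* Return 1 if every enabled channel reads its own component. */
static int
is_identity_under_mask(unsigned writemask, const unsigned swz[4])
{
   for (unsigned i = 0; i < 4; i++) {
      if ((writemask & (1u << i)) && swz[i] != i)
         return 0;
   }

   return 1;
}
```

For a YZ writemask the packed variant yields YZZZ, so the Y channel ends up reading Z, whereas the natural variant yields YYZZ and behaves like the identity under that writemask.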

I've manually verified that *none* of the callers of either conversion
constructor was relying on the previous broken semantics.  There are
no piglit regressions on any generation.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:09:33 +02:00
Francisco Jerez
62fd335338 i965/vec4: Pass argument by reference to src_reg/dst_reg conversion constructors.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:09:32 +02:00
Francisco Jerez
23bda945f5 i965/vec4: Remove swizzle_for_size() in favour of brw_swizzle_for_size().
It could be objected that swizzle_for_size() is "faster" than
brw_swizzle_for_size().  It's not measurably better in any reasonable
CPU-bound benchmark on VLV according to the Finnish benchmarking
system (including the SynMark2 DrvShComp shader compilation
benchmark).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:09:32 +02:00
Francisco Jerez
5bcca9f8dc i965/vec4: Remove broken vector size deduction in setup_builtin_uniform_values().
This seemed to be trying to deduce the number of uniform vector
components from the parameter swizzle, but the algorithm would always
give 4 as result.  Instead grab the correct number of components from
the GLSL type.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:09:32 +02:00
Francisco Jerez
132cdcc468 i965/vec4: Simplify visitor handling of swizzles using the swizzle utils.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:09:32 +02:00
Francisco Jerez
9a17e4e900 i965/vec4: Simplify opt_register_coalesce() using the swizzle utils.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:09:32 +02:00
Francisco Jerez
05ec72d8ec i965/vec4: Simplify reswizzle() using the swizzle utils.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:09:32 +02:00
Francisco Jerez
7b30493dc4 i965/vec4: Simplify opt_reduce_swizzle() using the swizzle utils.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:09:32 +02:00
Francisco Jerez
eb9bd3a1b0 i965: Fix signedness of backend_reg::reg_offset.
And make it 16-bit so it packs nicely with the previous field.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:09:32 +02:00
Francisco Jerez
7e816c7feb i965/vec4: Fix signedness of dst_reg::writemask.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:09:32 +02:00
Francisco Jerez
7678fb9c63 i965/vec4: Don't use GL types in the IR data structures.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:09:32 +02:00
Francisco Jerez
7bc02c786d i965/vec4: Fix signedness of brw_is_single_value_swizzle() argument.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:09:32 +02:00
Francisco Jerez
cff670b009 i965: Define some useful swizzle helper functions.
This defines helper functions implementing some common swizzle
transformations that are usually open-coded in the compiler back-end,
causing a lot of clutter.  Some optimization passes will become almost
trivial implemented in terms of these functions (e.g.
vec4_visitor::opt_reduce_swizzle()).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 14:09:29 +02:00
Tapani Pälli
3cf99701ba glsl: fix names in lower_constant_arrays_to_uniforms
This patch changes the lowering pass to use a unique name for each
uniform so that arrays from different stages cannot end up having the
same name.

v2: instead of global counter, use pointer to achieve
    unique name (Kenneth Graunke)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89590
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
2015-03-23 11:18:39 +02:00
Jason Ekstrand
a6d4a108d2 i965/nir: Use signed integer type for booleans
FS instructions with NIR on i965:
total instructions in shared programs: 2663561 -> 2619051 (-1.67%)
instructions in affected programs:     1612965 -> 1568455 (-2.76%)
helped:                                5455
HURT:                                  12

FS instructions with NIR on g4x:
total instructions in shared programs: 2352633 -> 2307908 (-1.90%)
instructions in affected programs:     1441842 -> 1397117 (-3.10%)
helped:                                5463
HURT:                                  11

FS instructions with NIR on ilk:
total instructions in shared programs: 3997305 -> 3934278 (-1.58%)
instructions in affected programs:     2189409 -> 2126382 (-2.88%)
helped:                                8969
HURT:                                  22

FS instructions with NIR on hsw (snb and ivb were similar):
total instructions in shared programs: 4109389 -> 4109242 (-0.00%)
instructions in affected programs:     109869 -> 109722 (-0.13%)
helped:                                339
HURT:                                  190

No SIMD16 programs were gained or lost on any platform

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 01:01:14 -07:00
Jason Ekstrand
41d64fa184 i965/nir: Do boolean resolves on GEN <= 5
v2: A couple comment clean-ups from Matt

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 01:01:14 -07:00
Jason Ekstrand
a55af2699f i965: Add a NIR analysis pass for determining when a boolean resolve is needed
v2: Fix the spelling of analyze and re-arrange code for better readability
    as per Connor's comments.
v3: Make the naming of things more consistent and add a pile of comments
v4: Stop trying to avoid vectors

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-03-23 01:01:14 -07:00
Jason Ekstrand
2612e569e0 i965/nir: Properly set the predicate on the SEL used in min/max
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 01:01:14 -07:00
Jason Ekstrand
80390f91a0 i965/nir: Use NIR lowering for ffma for gen < 6
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-23 01:01:14 -07:00
Jason Ekstrand
235c728020 i965/nir: Use emit_lrp for emitting flrp
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-23 01:01:14 -07:00
Jason Ekstrand
a3e05898e9 i965/fs: Make emit_lrp return an fs_inst
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-23 01:01:14 -07:00
Dave Airlie
484f9f4fcd i965: define I915_PARAM_REVISION
We are broken against the specified libdrm 2.4.60 minimum,
so fix it for now.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-23 09:55:33 +10:00
Jose Fonseca
397b491173 gallivm: Silence unused variable warnings on release builds.
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-22 08:23:24 +00:00
Jose Fonseca
06ac717810 scons: Silence conversion from 'size_t' to 'type', possible loss of data on MSVC.
Most cases seem harmless, though that might not always be the case.  Maybe
one day we can get gcc to complain about these and fix them throughout
the code, but until then let's silence them.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-22 08:23:24 +00:00
Jose Fonseca
15c5595bb1 scons: Ensure inttypes.h is always pre-included on MSVC.
It's a bit hackish, but I couldn't find another solution.  See the code
comment for details.  The warning is useful, so universally disabling it
doesn't sound like a good idea.

Fixes

   warning C4005: 'xxx' : macro redefinition

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-22 08:23:24 +00:00
Jose Fonseca
e4d95982ee scons: Silence MSVC C4351 warning.
It warns about change in MSVC behavior -- array initialisation used to
be non-standard, but is standard now, assuming I understand correctly
http://en.cppreference.com/w/cpp/language/zero_initialization .

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-22 08:23:24 +00:00
Jose Fonseca
e518d97d7e scons: Match some of LLVM warning options.
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-22 08:23:24 +00:00
Jose Fonseca
31e47a59ad scons: Cleanup flex/bison settings specification.
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-22 08:23:24 +00:00
Jose Fonseca
9c1c657e19 scons: Prefer winflexbison, and use --wincompat when available.
This avoids the MSVC warning

  warning C4013: 'isatty' undefined; assuming extern returning int

with certain versions of flex.

Reviewed-by: Brian Paul <brianp@vmware.com>

v2: Add win flex-bison link to docs/install.html.
2015-03-22 08:23:24 +00:00
Jose Fonseca
015e8b6384 scons: Define YY_USE_CONST on MSVC.
This prevents MSVC from emitting

  warning C4090: 'function' : different 'const' qualifiers

when compiling flex generated lexers.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-22 08:23:24 +00:00
Jose Fonseca
357d1fc81a scons: Tell MSVC STL library to not use exceptions.
MSVC defaults to no exceptions unless /EH option is passed (which we don't), while
MSVC's STL defaults to use exceptions unless _HAS_EXCEPTIONS=0 is defined,
which we didn't.

This fixes

  warning C4530: C++ exception handler used, but unwind semantics are not enabled. Specify /EHsc

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-22 08:23:24 +00:00
Jose Fonseca
e6330f9f56 scons: Ensure git_sha1.h's directory exists.
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-22 08:23:24 +00:00
Jose Fonseca
8f0274c6c6 configure: Bail out with MinGW targets.
We only support native Windows builds with SCons.

Tested with:

  ./configure --host=i686-w64-mingw32

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-22 08:23:24 +00:00
Jose Fonseca
8d5c303ab9 include: Ensure float.h is included for DBL_MAX.
I didn't actually hit the issue in practice, but just happened to notice
it while looking at the code.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-22 08:23:24 +00:00
Jose Fonseca
60eff44277 st/vdpau: Avoid constness cast warnings.
Fixes MSVC

  warning C4090: '=' : different 'const' qualifiers

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-03-22 08:23:24 +00:00
Jose Fonseca
fb78cccd7b glsl: Disable MSVC switch warning on a per-file basis.
This addresses

  ...\glsl_parser.cpp(...) : warning C4065: switch statement contains 'default' but no 'case' labels

This is in code generated by bison, over which we have little control.

It seems useful to have this warning otherwise enabled.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-22 08:23:23 +00:00
Jose Fonseca
d01a7cdae5 glsl: Avoid GLboolean vs bool arithmetic MSVC warnings.
Note that GLboolean is an alias for unsigned char, which lacks the
implicit true/false semantics that C++/C99 bool have.

Reviewed-by: Brian Paul <brianp@vmware.com>

v2: Change gl_shader::IsES and gl_shader_program::IsES to be bool as
recommended by Ian Romanick.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-22 08:23:23 +00:00
Emil Velikov
7c7954b09d galahad: actually remove the driver
Should have been part of 429a4355259 (galahad: remove driver). Seems like
I've erroneously committed the trimmed patch.

Reported-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-21 22:35:27 +00:00
Emil Velikov
bbaf22a998 egl: cut down static storage size for {Version,ClientAPI}String
Both seem to be excessively long, namely:

ClientAPIString can get up to 47 based on current code, while the name
of the driver can dictate the length of the VersionString, currently it
is around 11. Let's pad each to 100, rather than the current 1000.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-21 17:22:19 +00:00
Emil Velikov
0bff196b22 docs: note the removal of gbm_gallium, galahad and identity
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-21 17:21:30 +00:00
Emil Velikov
429a435525 galahad: remove driver
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-03-21 17:18:28 +00:00
Emil Velikov
84041bab3f gallium/docs: remove information about identity driver
Removed from tree.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-03-21 17:18:25 +00:00
Emil Velikov
55029b2bab docs: update the egl_platforms list
Add the missing wayland, null, android and haiku platforms.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-03-21 17:16:44 +00:00
Emil Velikov
0d6e7620f3 egl/main: drop platform fbdev specific code
st/egl was the only one which had support for this platform.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-03-21 17:16:41 +00:00
Emil Velikov
65a8d4d6dd winsys/sw/fbdev: remove unused software winsys
st/egl was its only user.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-03-21 17:16:38 +00:00
Emil Velikov
1081ed9dc3 winsys/sw/wayland: remove unused winsys
st/egl was its only user.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-03-21 17:16:35 +00:00
Emil Velikov
48c7461d5a st/gbm: remove state-tracker
st/egl was its only user.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-03-21 17:16:27 +00:00
Roland Scheidegger
e8039208c4 llvmpipe: use global llvm context for PIPE_SUBSYSTEM_EMBEDDED
There are two reasons why we'd want to use the global context:
1) There still seems to be one memory "leak" left when using multiple llvm
contexts (it is not a true leak as the memory disappears into some still
addressable pool but nevertheless the memory consumption grows). See
http://cgit.freedesktop.org/~jrfonseca/llvm-jitstress/
2) These contexts get kinda big - even when disposing modules etc. after
compiling a shader the LLVMContext can easily be over 100kB. So when there's
lots of llvm contexts around it adds up.

The downside is that at least right now this is absolutely not thread safe,
so this only works safely in environments where multiple pipe contexts are not
used concurrently.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-21 01:52:03 +01:00
Emil Velikov
b2dccfd17e docs: add news item and link release notes for mesa 10.4.7
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-21 00:54:14 +00:00
Emil Velikov
0030eef62b docs: Add sha256 sums for the 10.4.7 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit cb154bb221)
2015-03-21 00:53:22 +00:00
Emil Velikov
befb5d1c94 Add release notes for the 10.4.7 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit d26f3c1f86)
2015-03-21 00:53:21 +00:00
Dave Airlie
ad6ede260f mesa: reorder gl_light_attrib
reduces from 2664->2656.

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-21 08:14:41 +10:00
Dave Airlie
b99c7defac mesa: reorder gl_framebuffer
this reduces it from 1088 -> 1080 bytes

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-21 08:14:38 +10:00
Dave Airlie
727eb4c4e7 mesa: fix hole in vertex_array_object
this just removes 4 bytes from this object.

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-21 08:14:33 +10:00
Dave Airlie
974e4783a5 mesa: repack gl_texture_attrib.
This removes a hole, and puts the large allocation at the end.
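
The effect of this kind of reordering can be illustrated with a small C sketch. The structures and field names below are invented for the example and do not correspond to any actual Mesa structure; the exact sizes depend on the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with holes: each small field is followed by
 * padding so that the next, larger field is suitably aligned. */
struct attrib_holey {
   uint8_t enabled;
   uint64_t pointer;
   uint8_t normalized;
   uint32_t stride;
};

/* Same fields ordered by decreasing alignment, leaving at most some
 * tail padding. */
struct attrib_packed {
   uint64_t pointer;
   uint32_t stride;
   uint8_t enabled;
   uint8_t normalized;
};

/* Number of bytes saved per object by the reordering. */
static size_t
bytes_saved(void)
{
   assert(sizeof(struct attrib_packed) <= sizeof(struct attrib_holey));
   return sizeof(struct attrib_holey) - sizeof(struct attrib_packed);
}
```

On x86-64 Linux this should give 24 bytes for the first layout and 16 for the second. A tool such as pahole can report the holes in real structures.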

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-21 08:14:29 +10:00
Dave Airlie
2dbd8284e7 mesa: reduce gl_colorbuffer_attrib and gl_fog_attrib
These shrink from 392->388 and 72->68 bytes, respectively.

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-21 08:14:25 +10:00
Dave Airlie
2c016ed35f mesa: reorder gl_image_unit
reduces 40->32
but reduces use in context from 7680->6144.

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-21 08:14:21 +10:00
Dave Airlie
0ff4726a06 mesa: reorder gl_program, gl_shader, gl_shader_program
gl_program : 1344->1336
gl_shader: 488->472
gl_shader_program: 352->344.

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-21 08:14:16 +10:00
Dave Airlie
7b634fed59 mesa: reorder gl_transform_feedback_object
Reduces size from 184 to 176 bytes.

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-21 08:14:09 +10:00
Dave Airlie
e17b0435c5 mesa: reorder prog_instruction
reduces size from 64 to 56 bytes.

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-21 08:14:05 +10:00
Dave Airlie
401b11843b mesa: reorder gl_array_attrib
drops 80 bytes to 72.

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-21 08:14:00 +10:00
Dave Airlie
b3f6e0bb58 mesa: reorder gl_client_array
drops from 56 to 48 bytes,
drops gl_vertex_array_object from 4584 to 4320 bytes

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-21 08:13:56 +10:00
Dave Airlie
cbaff50828 mesa: reorder gl_texture_unit
drops size from 520 -> 512 bytes,
which then makes gl_texture_attrib go from 99984 to 98440.

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-21 08:13:51 +10:00
Dave Airlie
83606b4904 mesa: reorder gl_point_attrib
this drops the size from 52 bytes to 48 bytes.

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-21 08:13:47 +10:00
Dave Airlie
684c914014 mesa: reorder gl_multisample_attrib
drops size from 28 bytes to 20.

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-21 08:13:17 +10:00
Ian Romanick
a04b520890 i965/fs: Use correct null destination register in cmod tests
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89670
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: Vinson Lee <vlee@freedesktop.org>
2015-03-20 12:27:02 -07:00
Connor Abbott
ccb9cbc849 i965/fs: bail on move-to-flag in sel peephole
Fixes a piglit regression
(shaders/glsl-fs-vec4-indexing-temp-dst-in-nested-loop-combined) with
my series for GVN.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
2015-03-20 11:53:11 -04:00
Francisco Jerez
1cc00f1875 i965: Mask out unused Align16 components in brw_untyped_atomic.
This is currently not a problem because the vec4 visitor happens to
mask out unused components from the destination, but it might become
an issue when we start using atomics without writeback message.  In
any case it seems sensible to set it again here because the
consequences of setting the wrong writemask (random graphics memory
corruption) are difficult to debug and can easily go unnoticed.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-20 17:01:35 +02:00
Francisco Jerez
959d16e38e i965: Pass number of components explicitly to brw_untyped_atomic and _surface_read.
And calculate the message response size based on the number of
components rather than the other way around.  This simplifies their
interface somewhat and allows the caller to request a writeback
message with more than one vector component in SIMD4x2 mode.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-20 17:01:35 +02:00
Francisco Jerez
a815cd8449 i965: Don't disable exec masking for sampler message sends.
This was telling the sampler to do texture fetches for *all* channels
in the non-constant surface index case, which could have reduced
throughput unnecessarily when some of the channels were disabled by
control flow.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-20 17:01:35 +02:00
Francisco Jerez
a902a5d6ba i965: Factor out logic to build a send message instruction with indirect descriptor.
This is going to be useful because the Gen7+ uniform and varying pull
constant, texturing, typed and untyped surface read, write, and atomic
generation code on the vec4 and fs back-end all require the same logic
to handle conditionally indirect surface indices.  In pseudocode:

|   if (surface.file == BRW_IMMEDIATE_VALUE) {
|      inst = brw_SEND(p, dst, payload);
|      set_descriptor_control_bits(inst, surface, ...);
|   } else {
|      inst = brw_OR(p, addr, surface, 0);
|      set_descriptor_control_bits(inst, ...);
|      inst = brw_SEND(p, dst, payload);
|      set_indirect_send_descriptor(inst, addr);
|   }

This patch abstracts out this frequently recurring pattern so we can
now write:

| inst = brw_send_indirect_message(p, sfid, dst, payload, surface)
| set_descriptor_control_bits(inst, ...);

without worrying about handling the immediate and indirect surface
index cases explicitly.

v2: Rebase.  Improve documentation and commit message. (Topi)
    Preserve UW destination type cargo-cult. (Topi, Ken, Matt)

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-20 17:01:35 +02:00
Francisco Jerez
fd149628e1 i965: Set nr_params to the number of uniform components in the VS/GS path.
Both do_vs_prog and do_gs_prog initialize brw_stage_prog_data::nr_params to
the number of uniform *vectors* required by the shader rather than the number
of uniform components, contradicting the comment.  This is inconsistent with
what the state upload code and scalar path expect but it happens to work until
Gen8 because vec4_visitor interprets it as a number of vectors on construction
and later on overwrites its original value with the number of uniform
components referenced by the shader.

Also there's no need to add the number of samplers, they're not actually
passed in as uniforms.

Fixes a memory corruption issue on BDW with SIMD8 VS.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-20 16:55:36 +02:00
Kenneth Graunke
706b916960 i965/skl: Break down SIMD16 3-source instructions when required.
Several steppings of Skylake fail when using SIMD16 with 3-source
instructions (such as MAD).

This implements WaDisableSIMD16On3SrcInstr and fixes ~190 Piglit
tests.

Based on a patch by Neil Roberts.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-03-20 13:25:41 +00:00
Neil Roberts
bc4b18d297 i965: Refactor SIMD16-to-2xSIMD8 checks.
The places that were checking whether 3-source instructions are
supported have now been combined into a small helper function. This
will be used in the next patch to add an additional restriction.

Based on a patch by Kenneth Graunke.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-20 13:25:41 +00:00
Neil Roberts
c02c4b567c i965: Store the GPU revision number in brw_context
brwContextInit now queries the GPU revision number via a new parameter
for DRM_I915_GETPARAM. This new parameter requires a kernel patch and
a patch to libdrm. If the kernel doesn't support it then it will
continue but set the revision number to -1. The intention is to use
this to implement workarounds that are only needed on certain
steppings of the GPU.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-03-20 13:25:40 +00:00
Fredrik Höglund
2fd21d8a84 mesa: Make sure the buffer exists in _mesa_lookup_bufferobj_err
Generate GL_INVALID_OPERATION and return NULL when the buffer object
hasn't been created.  All callers expect this.

v2: Use a more concise error message.

Cc: Laura Ekstrand <laura@jlekstrand.net>
Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-03-20 01:25:29 +01:00
Dave Airlie
9d97cd2e3e u_primconvert: add primitive restart support
This adds primitive restart support to the prim conversion.

This involves changing the API for the translate functions
as we need to pass the prim restart index and the original
number of indices into the translate functions.

Primitive restart is supported for quads, quad strips
and polygons.

This deals with the case where the actual number of output
primitives is less than the initially calculated number
by filling the rest of the output primitives with the restart
index. The other option is to reduce the output prim number,
but that will make the generator code a bit messier.
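
The padding strategy described above can be sketched in C for the simplest case, a quad list converted to triangles. This is a hypothetical stand-alone function written for illustration; it is not the generated translate code, and the triangle split chosen here is just one possibility.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert a quad list with primitive restart into a triangle list.
 * out_nr is the initially calculated output size, (in_nr / 4) * 6.
 * Incomplete quads cut short by a restart are dropped, and whatever
 * is left of the output is filled with the restart index.  Returns
 * the number of real (non-padding) indices written. */
static size_t
quads_to_tris_restart(const uint32_t *in, size_t in_nr,
                      uint32_t restart_index,
                      uint32_t *out, size_t out_nr)
{
   size_t written = 0;
   size_t start = 0;   /* first index of the quad being assembled */

   assert(in != NULL && out != NULL);

   for (size_t i = 0; i < in_nr; i++) {
      if (in[i] == restart_index) {
         start = i + 1;   /* discard the partial quad */
         continue;
      }

      if (i - start == 3) {
         const uint32_t *v = &in[start];

         if (written + 6 > out_nr)
            break;

         /* Quad (v0 v1 v2 v3) becomes (v0 v1 v3) and (v1 v2 v3). */
         out[written++] = v[0];
         out[written++] = v[1];
         out[written++] = v[3];
         out[written++] = v[1];
         out[written++] = v[2];
         out[written++] = v[3];
         start = i + 1;
      }
   }

   const size_t used = written;

   /* Pad the unused tail with the restart index. */
   while (written < out_nr)
      out[written++] = restart_index;

   return used;
}
```

The caller can keep submitting out_nr indices: the padded tail consists only of restart indices and so produces no primitives as long as primitive restart stays enabled for the converted draw.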

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-20 09:46:30 +10:00
Kenneth Graunke
201aef9d13 i965/fp: Emit discard jumps.
This should improve the performance of any shaders using the KIL
instruction.  I'm a bit surprised we missed this.

Unfortunately, I have not been able to measure any performance
improvements from this patch.  It does make ARB_fragment_program
behave similarly to GLSL code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-19 16:14:51 -07:00
Kenneth Graunke
8a0946f3b1 i965/fs: Make an emit_discard_jump() function to reduce duplication.
This is already copied in two places, and I want to copy it to a third
place.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-19 16:14:51 -07:00
Laura Ekstrand
09bfa326a9 main: Add TEXTURE_CUBE_MAP support in CopyTextureSubImage3D.
So it turns out that this doesn't actually fix any bugs or add any features,
strictly speaking. However, it does avoid a lot of kludginess.  Previously, if
you called

glCopyTextureSubImage3D(texcube, 0, 0, 0, zoffset = 3, ...

it would grab the texture image object for face = 0 in teximage.c instead of
the desired face = 3.  But Line 274 of brw_blorp_blit.cpp would correct for
this by updating the slice to 3.

This commit does the correct thing before calling any drivers,
which should make the functionality much more robust and uniform across all
drivers.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-19 16:07:57 -07:00
Laura Ekstrand
037e36a8aa main: Simplify debug messages for CopyTex*SubImage*D.
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-19 16:07:44 -07:00
Ian Romanick
a44b95cd57 glsl: Annotate as_foo functions that the this pointer cannot be NULL
We use the idiom

   ir_foo *x = y->as_foo();
   if (x == NULL)
      return;

all over the place.  GCC generates some quite lovely code for this.
One such example:

  340a5b:       83 7d 18 04             cmpl   $0x4,0x18(%rbp)
  340a5f:       0f 85 06 04 00 00       jne    340e6b
  340a65:       48 85 ed                test   %rbp,%rbp
  340a68:       0f 84 fd 03 00 00       je     340e6b

This case used as_expression() (ir_type_expression is 4).  Note that it
checks the ir_type, then checks that the pointer isn't NULL.  There is
some disconnect in GCC around the condition in the as_foo functions.

      return ir_type == ir_type_##TYPE ? (ir_##TYPE *) this : NULL; \

It believes "this" could be NULL, so it emits a check outside the function
just for fun.

This patch uses assume() to tell GCC that it need not bother with extra
NULL checking of the pointer returned by the as_foo functions.
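
A minimal C sketch of the idea follows. The ASSUME macro and the node types are stand-ins invented for this example (the patch itself works on the C++ ir_instruction hierarchy); the sketch relies on the GCC/Clang __builtin_unreachable() builtin and falls back to a plain assert elsewhere.

```c
#include <assert.h>
#include <stddef.h>

/* Tell the optimizer that expr always holds.  Evaluating a false
 * expression here is undefined behaviour, so use with care. */
#if defined(__GNUC__)
#define ASSUME(expr) ((expr) ? (void) 0 : __builtin_unreachable())
#else
#define ASSUME(expr) assert(expr)
#endif

enum node_type {
   NODE_EXPRESSION,
   NODE_CONSTANT
};

struct node {
   enum node_type type;
};

/* Checked downcast in the style of the as_foo() helpers: returns the
 * node if it has the requested type and NULL otherwise. */
static inline struct node *
as_expression(struct node *n)
{
   ASSUME(n != NULL);
   return n->type == NODE_EXPRESSION ? n : NULL;
}
```

A caller then only needs to test the returned pointer against NULL; the hint lets the compiler drop any additional check on the input pointer.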

   text	   data	    bss	    dec	    hex	filename
4836430	 158688	  26248	5021366	 4c9eb6	i965_dri-before.so
4836173	 158688	  26248	5021109	 4c9db5	i965_dri-after.so

v2: Replace 'if (this == NULL) unreachable("this cannot be NULL")' with
assume(this != NULL).  Suggested by Ilia Mirkin.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-19 15:35:42 -07:00
Paul Berry
bf9d921936 main: Change the type argument of use_shader_program() to gl_shader_stage.
This allows it to be called from a loop.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-03-19 13:38:51 -07:00
Paul Berry
57b2652322 main: Clean up a strange construction in use_shader_program().
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-03-19 13:38:51 -07:00
Jason Ekstrand
46c35c61e9 i965/nir: Sort uniforms direct-first and use two different uniform registers
Previously, we put all the uniforms into one big array.  The problem with
this approach is that, as soon as there was one indirect array access, the
backend would decide that the entire large array should be pull constants.
This commit splits the array in half: first direct-only uniforms and then
potentially-indirect uniforms.  This may not be optimal, but it does let
the backend promote things to push constants.
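
A minimal sketch of the direct-first ordering, assuming a flat array of uniforms with a per-variable "indirectly accessed" flag. The struct and function names are hypothetical and do not match the NIR API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical uniform record; not the NIR variable type. */
struct uniform {
   const char *name;
   unsigned size;              /* size in slots */
   bool indirectly_accessed;
   unsigned location;          /* filled in by assign_locations() */
};

/* Place direct-only uniforms first, then potentially-indirect ones.
 * Returns the number of slots used by direct-only uniforms, which is
 * also the offset where the potentially-indirect block starts. */
static unsigned
assign_locations(struct uniform *vars, size_t count, unsigned *total_size)
{
   unsigned next = 0;

   for (size_t i = 0; i < count; i++) {
      if (!vars[i].indirectly_accessed) {
         vars[i].location = next;
         next += vars[i].size;
      }
   }

   unsigned num_direct = next;

   for (size_t i = 0; i < count; i++) {
      if (vars[i].indirectly_accessed) {
         vars[i].location = next;
         next += vars[i].size;
      }
   }

   *total_size = next;
   return num_direct;
}
```

Only the second block has to be treated as pull constants when an indirect access shows up; the first block remains eligible for push constants.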

Shader-db results on HSW:
total instructions in shared programs: 4114840 -> 4112172 (-0.06%)
instructions in affected programs:     43316 -> 40648 (-6.16%)
helped:                                116
HURT:                                  0

v2: Set param_size[num_direct_uniforms] only if we have indirect uniforms.
    This caused a bug that, strangely enough, only showed up on Broadwell
    vertex shaders.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-03-19 13:18:39 -07:00
Jason Ekstrand
8a33f95b7a nir/lower_io: Add an assign_locations function that sorts by [in]direct use
v2: Delete the set of indirectly accessed variables when we're done with it
v3: Rename from _packed to _scalar

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-03-19 13:18:39 -07:00
Jason Ekstrand
25db44a845 nir/lower_io: Make variable location assignment a manual operation
Previously, we just assigned variable locations in nir_lower_io.  Now, we
force the user to assign variable locations for us.  This gives the backend
a bit more control over where variables are placed.

v2: Rename from _packed to _scalar

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-03-19 13:18:39 -07:00
Jason Ekstrand
639115123e nir: Use a list instead of a hash_table for inputs, outputs, and uniforms
We never did a single hash table lookup in the entire NIR code base that I
found so there was no real benefit to doing it that way.  I suppose that
for linking, we'll probably want to be able to look up by name but we can
leave building that hash table to the linker.  In the meantime this was
causing problems with GLSL IR -> NIR because GLSL IR doesn't guarantee us
unique names of uniforms, etc.  This was causing massive rendering issues in
the unreal4 Sun Temple demo.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-03-19 13:18:38 -07:00
Brian Paul
8f255f948b gallivm: remove unused 'builder' variable
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-19 12:56:35 -06:00
Brian Paul
1cd3745911 mesa: use more descriptive error messages for glUniform errors
Different errors for type mismatches, size mismatches and matrix/
non-matrix mismatches.  Use a common format of "uniformName"@location
in the messages.

Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-19 12:56:35 -06:00
Matt Turner
b0d422cd2a i965/fs: Print spills:fills and number of promoted constants.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-03-19 11:15:57 -07:00
Ian Romanick
b616164c95 i965/fs: Emit better b2f of an expression on GEN4 and GEN5
On platforms that do not natively generate 0u and ~0u for Boolean
results, b2f expressions that look like

    f = b2f(expr cmp 0)

will generate better code by pretending the expression is

    f = ir_triop_sel(0.0, 1.0, expr cmp 0)

This is because the last instruction of "expr" can generate the
condition code for the "cmp 0".  This avoids having to do the "-(b & 1)"
trick to generate 0u or ~0u for the Boolean result.  This means code like

    mov(16)         g16<1>F         1F
    mul.ge.f0(16)   null            g6<8,8,1>F      g14<8,8,1>F
    (+f0) sel(16)   m6<1>F          g16<8,8,1>F     0F

will be generated instead of

    mul(16)         g2<1>F          g12<8,8,1>F     g4<8,8,1>F
    cmp.ge.f0(16)   g2<1>D          g4<8,8,1>F      0F
    and(16)         g4<1>D          g2<8,8,1>D      1D
    and(16)         m6<1>D          -g4<8,8,1>D     0x3f800000UD
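
For readers less familiar with the "-(b & 1)" trick, the two strategies can be sketched in plain C as follows. This assumes IEEE-754 single precision (0x3f800000 is the bit pattern of 1.0f); the function names are made up for the example.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float (assumes IEEE-754 floats). */
static float
bits_to_float(uint32_t bits)
{
   float f;
   memcpy(&f, &bits, sizeof(f));
   return f;
}

/* Mask strategy: only the low bit of b is meaningful.  Expand it to
 * 0u or ~0u, then AND with the bit pattern of 1.0f. */
static float
b2f_mask(uint32_t b)
{
   uint32_t mask = 0u - (b & 1u);              /* 0u or ~0u */
   return bits_to_float(mask & 0x3f800000u);
}

/* Select strategy: let the comparison choose between 1.0 and 0.0. */
static float
b2f_select(float x, float y)
{
   return (x * y >= 0.0f) ? 1.0f : 0.0f;
}
```

The select form corresponds to the three-instruction sequence, where the multiply itself sets the condition flag; the mask form corresponds to the four-instruction sequence.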

v2: When the comparison is either == 0.0 or != 0.0, use the knowledge
that the true (or false) case already results in zero to allow better
code generation by possibly avoiding a load-immediate instruction.

v3: Apply the optimization even when neither comparator is zero.

Shader-db results:

GM45 (0x2A42):
total instructions in shared programs: 3551002 -> 3550829 (-0.00%)
instructions in affected programs:     33269 -> 33096 (-0.52%)
helped:                                121

Iron Lake (0x0046):
total instructions in shared programs: 4993327 -> 4993146 (-0.00%)
instructions in affected programs:     34199 -> 34018 (-0.53%)
helped:                                129

No change on other platforms.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Palli <tapani.palli@intel.com>
2015-03-19 10:21:08 -07:00
Matt Turner
036e347f3c util: Optimize _mesa_roundeven with SSE 4.1.
The SSE 4.1 ROUND instructions let us implement roundeven directly.
Otherwise we assume that the rounding mode has not been modified (as we
do in the rest of Mesa) and use rint().

glibc uses the ROUND instruction in rint() after a cpuid check. This
patch just lets us inline it directly when we're already building for
SSE 4.1.

Reviewed-by: Carl Worth <cworth@cworth.org>
2015-03-18 21:06:26 -07:00
Matt Turner
5de86102f9 util: Add a roundeven test.
Reviewed-by: Carl Worth <cworth@cworth.org>
2015-03-18 21:06:26 -07:00
Matt Turner
dd0d3a2c0f mesa: Replace _mesa_round_to_even() with _mesa_roundeven().
Eric's initial patch adding constant expression evaluation for
ir_unop_round_even used nearbyint. The open-coded _mesa_round_to_even
implementation came about without much explanation after a reviewer
asked whether nearbyint depended on the application not modifying the
rounding mode. Of course (as Eric commented) we rely on the application
not changing the rounding mode from its default (round-to-nearest) in
many other places, including the IROUND function used by
_mesa_round_to_even!

Worse, IROUND() is implemented using the trunc(x + 0.5) trick which
fails for x = nextafterf(0.5, 0.0).
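
The failure can be reproduced with a small sketch. The helper below is a hypothetical function using the trunc(x + 0.5) trick for non-negative inputs, not the actual IROUND() macro, and it assumes IEEE-754 floats with the default round-to-nearest mode. The hex literal 0x1.fffffep-2f is the largest float below 0.5, i.e. nextafterf(0.5, 0.0).

```c
/* Hypothetical helper built on the trunc(x + 0.5) trick (x >= 0 only). */
static int
iround_trunc(float x)
{
   /* volatile forces the sum to be rounded to float precision even on
    * targets that evaluate float expressions with extra precision. */
   volatile float sum = x + 0.5f;
   return (int)sum;   /* float -> int conversion truncates toward zero */
}
```

For x = 0x1.fffffep-2f the exact sum 1 - 2^-25 lies halfway between two floats and rounds up to 1.0f, so the helper returns 1 even though x is below 0.5 and should round to 0.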

Still worse, _mesa_round_to_even unexpectedly returns an int. I suspect
that could cause problems when rounding large integral values not
representable as an int in ir_constant_expression.cpp's
ir_unop_round_even evaluation. Its use of _mesa_round_to_even is clearly
broken for doubles (as noted during review).

The constant expression evaluation code for the packing built-in
functions also mistakenly assumed that _mesa_round_to_even returned a
float, as can be seen by the cast through a signed integer type to an
unsigned (since negative float -> unsigned conversions are undefined).

rint() and nearbyint() implement the round-half-to-even behavior we want
when the rounding mode is set to the default round-to-nearest. The only
difference between them is that nearbyint() raises the inexact
exception.

This patch implements _mesa_roundeven{f,}, a function similar to the
roundeven function added by a yet unimplemented technical specification
(ISO/IEC TS 18661-1:2014), with a small difference in behavior -- we
don't bother raising the inexact exception, which I don't think we care
about anyway.

At least recent Intel CPUs can quickly change a subset of the bits in
the x87 floating-point control register, but the exception mask bits are
not included. rint() does not need to change these bits, but nearbyint()
does (twice: save old, set new, and restore old) in order to raise the
inexact exception, which would incur some penalty.

Reviewed-by: Carl Worth <cworth@cworth.org>
2015-03-18 21:06:26 -07:00
Matt Turner
bb22aa08e4 i965/fs: Ignore type in cmod prop if scan_inst is CMP.
total instructions in shared programs: 6263270 -> 6203091 (-0.96%)
instructions in affected programs:     2606529 -> 2546350 (-2.31%)
helped:                                14301
GAINED:                                5
LOST:                                  3

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-18 21:03:09 -07:00
Jason Ekstrand
e1f3ddef8c i965/nir: Make our environment variable checking smarter
Before, we enabled NIR if you set INTEL_USE_NIR to anything which meant that
INTEL_USE_NIR=false would actually turn on NIR.  In preparation for turning
NIR on by default, this commit makes it smarter by allowing the
INTEL_USE_NIR variable to work as either a force-enable or a force-disable.
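
One way such a check could look, as a sketch only: the helper names and the accepted spellings ("1"/"true"/"yes" and "0"/"false"/"no") are assumptions for illustration, not necessarily what the driver accepts.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Interpret a string as a boolean; unset or unrecognized values fall
 * back to default_value instead of silently meaning "on". */
static bool
parse_boolean(const char *str, bool default_value)
{
   if (str == NULL)
      return default_value;

   if (!strcmp(str, "1") || !strcmp(str, "true") || !strcmp(str, "yes"))
      return true;

   if (!strcmp(str, "0") || !strcmp(str, "false") || !strcmp(str, "no"))
      return false;

   return default_value;
}

/* Hypothetical wrapper: force-enable or force-disable via environment. */
static bool
env_var_as_boolean(const char *name, bool default_value)
{
   return parse_boolean(getenv(name), default_value);
}
```

Because the default is a parameter, flipping NIR on by default later only changes the value passed by the caller, e.g. env_var_as_boolean("INTEL_USE_NIR", true).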

Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2015-03-18 16:40:22 -07:00
Dave Airlie
37e3a116f8 egl: don't fill client apis string forever.
We never reset the string on eglTerminate, so it grows
forever across repeated eglInitialize calls.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-19 08:28:38 +10:00
Jose Fonseca
cebc62f106 swrast: Use BITFIELD64_BIT for arrayAttribs.
As VARYING_SLOT_MAX can be bigger than 32.
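
A short sketch of why the 64-bit form matters. SKETCH_BITFIELD64_BIT is a hypothetical macro written for this example and is not claimed to match the definition of BITFIELD64_BIT in the tree.

```c
#include <stdint.h>

/* Widen the constant before shifting.  A plain (1 << slot) is an int
 * shift and is undefined once slot reaches the width of int. */
#define SKETCH_BITFIELD64_BIT(b) ((uint64_t)1 << (b))

/* Set the bit for one attribute slot in a 64-bit mask. */
static uint64_t
mark_slot(uint64_t mask, unsigned slot)
{
   return mask | SKETCH_BITFIELD64_BIT(slot);
}
```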

I'll probably stop building swrast with MSVC in the near future, but this
seems a real bug regardless.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-18 21:51:54 +00:00
Jose Fonseca
d3e9aa8d88 scons: Don't link program_lexer.l/y twice.
program/lex.yy.c and program/program_parse.tab.c are already included in
the PROGRAM_FILES variable.

We still need to specify the dependency relationship though.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-18 21:51:54 +00:00
Jose Fonseca
a56f1a8b32 gallivm: Use INFINITY directly.
Already done below.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-18 21:51:40 +00:00
Jose Fonseca
1d30fd85dd scons: Silence MSVC warnings about overflows in constant arithmetic.
These get triggered even when using the standard C99 INFINITY/NAN
constants.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-18 21:51:40 +00:00
José Fonseca
bbac03ecca scons: Disable MSVC signed/unsigned mismatch warnings.
By default gcc ignores the issue, and as result code that mixes
signed/unsigned is so widespread through the code base that it ends up
being little more than noise, potentially obscuring more pertinent
warnings.

Maybe one day we enable the corresponding gcc warnings and cleanup, but
until then, this change disables them.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-03-18 21:51:40 +00:00
Laura Ekstrand
2ccfce3f4c docs: Update progress on ARB_direct_state_access.
Acked-by: Matt Turner <mattst88@gmail.com>
2015-03-18 13:59:39 -07:00
Brian Paul
627991dbf7 dri: add _glapi_set_nop_handler(), _glapi_new_nop_table() to dri_test.c
I wasn't aware of these _glapi_ stub functions when I committed
4bdbb588a9.  Fixes "make check"

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89662
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2015-03-18 12:46:11 -06:00
Brian Paul
9263986401 mesa: remove MSVC warning pragmas
Removing this block of pragmas doesn't seem to increase the number of
warnings generated by MSVC.  Other than signed/unsigned comparison warnings
there are very few other warnings nowadays.

Acked-by: Matt Turner <mattst88@gmail.com>
2015-03-18 09:01:50 -06:00
Brian Paul
ea1b066a34 mesa: add void to format_array_format_table_init() declaration
Silences an MSVC warning where it's called from call_once().

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-18 09:01:50 -06:00
Brian Paul
9fbbd60c1d mapi: move some #includes from .h file to .c files
Just include things where they're needed.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-18 09:01:50 -06:00
Brian Paul
4009d22b61 mesa: make _mesa_alloc_dispatch_table() static
Never called from outside of context.c

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-18 09:01:50 -06:00
Brian Paul
4bdbb588a9 mesa: reimplement dispatch table no-op function handling
Use the new _glapi_new_nop_table() and _glapi_set_nop_handler() to
improve how we handle calling no-op GL functions.

If there's a current context for the calling thread, generate a
GL_INVALID_OPERATION error.  This will happen if the app calls an
unimplemented extension function or it calls an illegal function
between glBegin/glEnd.

If there's no current context, print an error to stdout if it's a debug
build.

The dispatch_sanity.cpp file has some previous checks removed since
the _mesa_generic_nop() function no longer exists.

This fixes the piglit gl-1.0-dlist-begin-end and gl-1.0-beginend-coverage
tests on Windows.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-18 09:01:50 -06:00
Brian Paul
201e36e77d mapi: add new _glapi_new_nop_table() and _glapi_set_nop_handler()
_glapi_new_nop_table() creates a new dispatch table populated with
pointers to no-op functions.

_glapi_set_nop_handler() is used to register a callback function which
will be called from each of the no-op functions.

Now we always generate a separate no-op function for each GL entrypoint.
This allows us to do proper stack clean-up for Windows __stdcall and
lets us report the actual function name in error messages.  Before this
change, for non-Windows release builds we used a single no-op function
for all entrypoints.
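
The shape of this design can be sketched as below. Everything here is simplified and hypothetical: real entrypoints have differing signatures and calling conventions, while the sketch uses void(void) stubs, three invented slots, and a small recording handler.

```c
#include <stddef.h>

typedef void (*nop_handler_fn)(const char *func_name);
typedef void (*entry_fn)(void);

static nop_handler_fn nop_handler;

/* Register the callback invoked by every no-op stub. */
static void
set_nop_handler(nop_handler_fn handler)
{
   nop_handler = handler;
}

/* One distinct stub per entrypoint, so each can report its own name. */
#define DEFINE_NOP(name) \
   static void nop_##name(void) { if (nop_handler) nop_handler(#name); }

DEFINE_NOP(glBegin)
DEFINE_NOP(glEnd)
DEFINE_NOP(glFlush)

enum { SLOT_glBegin, SLOT_glEnd, SLOT_glFlush, SLOT_COUNT };

/* Fill a dispatch table with pointers to the per-entrypoint stubs. */
static void
new_nop_table(entry_fn table[SLOT_COUNT])
{
   table[SLOT_glBegin] = nop_glBegin;
   table[SLOT_glEnd] = nop_glEnd;
   table[SLOT_glFlush] = nop_glFlush;
}

/* Example handler: remember which stub was called. */
static const char *last_nop_name;

static void
record_name(const char *func_name)
{
   last_nop_name = func_name;
}
```

A single shared stub could not know which entrypoint was hit, which is why the per-entrypoint stubs make the error messages more useful.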

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-18 09:01:50 -06:00
Rob Clark
aee26d292f freedreno/ir3: fix infinite recursion in sched
One more case we need to handle.  One of the src instructions for the
indirect could also end up being ourself.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-18 10:42:33 -04:00
Rob Clark
62cc003b7d freedreno: fix spelling
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-18 10:42:33 -04:00
Marek Olšák
42715ad793 docs/GL3: don't list nv30
Suggested by Ilia Mirkin.
2015-03-18 12:04:27 +01:00
Marek Olšák
4e46af0195 docs/GL3: don't list swrast
Let's face it: This driver is unlikely to get more love.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-18 12:04:27 +01:00
Marek Olšák
2b5379651f docs/GL3: don't list r300
r300g already supports everything it can. There's no point in listing
the driver here.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-18 12:04:27 +01:00
Marek Olšák
a984abdad3 radeonsi: increase coords array size for radeon_llvm_emit_prepare_cube_coords
radeon_llvm_emit_prepare_cube_coords uses coords[4] in some cases (TXB2 etc.)

Discovered by Coverity. Reported by Ilia Mirkin.

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-18 12:04:27 +01:00
Jonathan Gray
8475526a38 configure: check if compiler supports -Werror=vla.
Check if the compiler supports -Werror=vla before using it.
-Wvla was introduced with GCC 4.3 and is not present in 4.2.
Fixes the build on OpenBSD.

v2: Fix statement order, and quote $save_CFLAGS.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89433
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Signed-off-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-18 10:53:20 +00:00
Chris Wilson
eeb504e0ae i965: Defer the throttle until we submit new commands
Currently, we throttle before the user begins preparing commands for the
next frame when we acquire the draw/read buffers. However, construction
of the command buffer can itself take significant time relative to the
frame time. If we move the throttle from the buffer acquire to the
command submit phase we can allow the user to improve concurrency
between the CPU and GPU (i.e. reduce the amount of time we waste inside
the throttle).

v2: Whitespace + delay throttling until after the next submission for
greater parallelism

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Chad Versace <chad.versace@linux.intel.com>
Cc: Ian Romanick <idr@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com> [v1]
2015-03-18 09:33:33 +00:00
Chris Wilson
64788b2e8d i965: Throttle to the previous frame
In order to facilitate the concurrency offered by triple buffering and to
offset the latency induced by swapping via an external process, which
may incur extra rendering itself, only throttle to the previous frame
and not the last. The second issue that mostly affects swap benchmarks,
but also can incur jitter in the throttling, is that the throttle bo is
closer to the next SwapBuffers rather than immediately after the previous
SwapBuffers. Throttling to the previous frame doubles the maximum possible
latency at the benefit of improving throughput and reducing jitter.
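
A rough model of throttling to the previous frame, under the assumption that two batch slots are tracked per context. The types, the wait_for() stand-in, and the helper names are all hypothetical; the real driver waits on buffer objects through the kernel.

```c
#include <stddef.h>

struct batch {
   int frame;
   int completed;
};

struct context {
   /* [0]: batch recorded for the most recent frame
    * [1]: batch recorded for the frame before that */
   struct batch *throttle_batch[2];
};

/* Stand-in for blocking until the GPU has finished a batch. */
static void
wait_for(struct batch *b)
{
   b->completed = 1;
}

/* Remember the first batch submitted after a swap. */
static void
record_batch(struct context *ctx, struct batch *b)
{
   if (ctx->throttle_batch[0] == NULL)
      ctx->throttle_batch[0] = b;
}

/* At a throttle point, wait on the older slot only, then shift. */
static void
throttle(struct context *ctx)
{
   if (ctx->throttle_batch[0] == NULL)
      return;

   if (ctx->throttle_batch[1] != NULL)
      wait_for(ctx->throttle_batch[1]);

   ctx->throttle_batch[1] = ctx->throttle_batch[0];
   ctx->throttle_batch[0] = NULL;
}
```

In this model the CPU may run up to one full frame ahead of the GPU before it blocks, which is the latency-for-throughput trade described above.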

v2: Rename "first_post_swapbuffer" batches array to a plain
throttle_batch[] as the pluralisation was contorting the name and not
making it clear as to whether it was the first batch or first_post_swap
batch. Not least of which was that not all throttle points are SwapBuffers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Chad Versace <chad.versace@linux.intel.com>
Cc: Ian Romanick <idr@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2015-03-18 09:33:33 +00:00
Chris Wilson
8b9bd19021 i965: Throttle rendering to an fbo
When rendering to an fbo, even though it may be acting as a winsys
frontbuffer or just generally, we never throttle. However, when rendering
to an fbo, there is no natural frame boundary. Conventionally we use
SwapBuffers and glFinish, but potential callers often avoid glFinish for
being too heavy-handed (waiting on all outstanding rendering to complete).
The kernel provides a soft-throttling option for this case that waits for
rendering older than 20ms to be complete (that's a little too lax to be
used for swapbuffers, but is here a useful safety net). The remaining
choice is then either never to throttle, throttle after every draw call,
or at an intermediate user-defined point such as glFlush and thus all the
implied flushes. This patch opts for the latter as that is the current
method used for flushing to front buffers.

v2: Defer the throttling from inside the flush to the next
intel_prepare_render() and switch non-fbo frontbuffer throttling over to
use the same lax method. The issue being that
glFlush()/intel_prepare_read() is just as likely to be called inside a
tight loop and not at "frame" boundaries.

v3: Rename from need_front_throttle to need_flush_throttle to avoid any
ambiguity between front buffer rendering and fbo rendering. (Chad)

v4: Whitespace

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Chad Versace <chad.versace@linux.intel.com>
Cc: Ian Romanick <idr@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2015-03-18 09:33:33 +00:00
Jason Ekstrand
27bf37ba05 nir/peephole_select: Allow uniform/input loads and load_const
Shader-db results on HSW:

total instructions in shared programs: 4174156 -> 4157291 (-0.40%)
instructions in affected programs:     145397 -> 128532 (-11.60%)
helped:                                383
HURT:                                  0
GAINED:                                20
LOST:                                  22

There are two more tests lost than gained.  However, comparing this with
GLSL IR vs. NIR results, the overall delta is reduced from 85/44
gained/lost on current master to 71/32 with this commit.  Therefore, I
think it's probably a boon since we are getting "closer" to where we were
before.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-03-17 17:11:05 -07:00
Jason Ekstrand
1be862c0c4 nir/peephole_select: Copy instructions into the block before the if
Previously we tried to do poor-man's copy propagation as we created the
select instructions.  Instead, this commit just moves the instructions from
the blocks inside the if into the block before.  Copy propagation will take
care of making sure we don't have any extra mov's in there for us.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-03-17 17:11:05 -07:00
Jason Ekstrand
8cf40ed05d nir/peephole_select: Rename are_all_move_to_phi and use a switch
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-03-17 17:11:05 -07:00
Mario Kleiner
cc5ddd584d glx: Handle out-of-sequence swap completion events correctly. (v2)
The code for emitting INTEL_swap_events swap completion
events needs to translate from 32-Bit sbc on the wire to
64-Bit sbc for the events and handle wraparound accordingly.

It assumed that events would be sent by the server in the
order their corresponding swap requests were emitted from
the client, iow. sbc count should be always increasing. This
was correct for DRI2.

This is not always the case under the DRI3/Present backend,
where the Present extension can execute presents and send out
completion events in a different order than the submission
order of the present requests, due to client code specifying
targetMSC target vblank counts which are not strictly
monotonically increasing. This confused the wraparound
handling. This patch fixes the problem by handling 32-Bit
wraparound in both directions. As long as successive swap
completion events' real 64-Bit sbc's don't differ by more
than 2^30, this should be able to do the right thing.

How this is supposed to work:

awire->sbc contains the low 32-Bits of the true 64-Bit sbc
of the current swap event, transmitted over the wire.

glxDraw->lastEventSbc contains the low 32-Bits of the 64-Bit
sbc of the most recently processed swap event.

glxDraw->eventSbcWrap is a 64-Bit offset which tracks the upper
32-Bits of the current sbc. The final 64-Bit output sbc
aevent->sbc is computed from the sum of awire->sbc and
glxDraw->eventSbcWrap.

Under DRI3/Present, swap completion events can be received
slightly out of order due to non-monotonic targetMsc specified
by client code, e.g., present request submission:

Submission sbc:   1   2   3
targetMsc:        10  11  9

Reception of completion events:
Completion sbc:   3   1   2

The completion sequence 3, 1, 2 would confuse the old wraparound
handling made for DRI2 as 1 < 3 --> Assumes a 32-Bit wraparound
has happened when it hasn't.

The client can queue multiple present requests, in the case of
Mesa up to n requests for n-buffered rendering, e.g., n =  2-4 in
the current Mesa GLX DRI3/Present implementation. In the case of
direct Pixmap presents via xcb_present_pixmap() the number n is
limited by the amount of memory available.

We reasonably assume that the number of outstanding requests n is
much less than 2 billion due to memory constraints and common sense.
Therefore while the order of received sbc's can be a bit scrambled,
successive 64-Bit sbc's won't deviate by much, a given sbc may be
a few counts lower or higher than the previous received sbc.

Therefore any large difference between the incoming awire->sbc and
the last recorded glxDraw->lastEventSbc will be due to 32-Bit
wraparound and we need to adapt glxDraw->eventSbcWrap accordingly
to adjust the upper 32-Bits of the sbc.

Two cases, corresponding to the two if-statements in the patch:

a) Previous sbc event was below the last 2^32 boundary, in the previous
glxDraw->eventSbcWrap epoch, the new sbc event is in the next 2^32
epoch, therefore the low 32-Bit awire->sbc wrapped around to zero,
or close to zero --> awire->sbc is apparently much lower than the
glxDraw->lastEventSbc recorded for the previous epoch

--> We need to increment glxDraw->eventSbcWrap by 2^32 to adjust
the current epoch to be one higher than the previous one.

--> Case a) also handles the old DRI2 behaviour.

b) Previous sbc event was above closest 2^32 boundary, but now a
late event from the previous 2^32 epoch arrives, with a true sbc
that belongs to the previous 2^32 segment, so the awire->sbc of
this late event has a high count close to 2^32, whereas
glxDraw->lastEventSbc is closer to zero --> awire->sbc is much
greater than glxDraw->lastEventSbc.

--> We need to decrement glxDraw->eventSbcWrap by 2^32 to adjust
the current epoch back to the previous lower epoch of this late
completion event.

We assume such a wraparound to a higher (a) epoch or lower (b)
epoch has happened if awire->sbc and glxDraw->lastEventSbc differ
by more than 2^30 counts, as such a difference can only happen
on wraparound, or if somehow 2^30 present requests would be pending
for a given drawable inside the server, which is rather unlikely.
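
The two cases can be condensed into a small standalone sketch. The struct and function names are hypothetical; the fields mirror lastEventSbc and eventSbcWrap as described above, and the 2^30 threshold follows the text.

```c
#include <stdint.h>

struct swap_tracker {
   uint32_t last_event_sbc;   /* low 32 bits of the most recent event's sbc */
   int64_t event_sbc_wrap;    /* multiple of 2^32 carrying the upper bits */
};

/* Turn a 32-bit on-the-wire sbc into a 64-bit sbc, tolerating events
 * that arrive slightly out of order around a 2^32 boundary. */
static int64_t
unwrap_sbc(struct swap_tracker *t, uint32_t wire_sbc)
{
   const int64_t threshold = INT64_C(1) << 30;
   const int64_t epoch = INT64_C(1) << 32;
   const int64_t now = wire_sbc;
   const int64_t last = t->last_event_sbc;

   /* Case a): wrapped forward into the next epoch. */
   if (now < last - threshold)
      t->event_sbc_wrap += epoch;

   /* Case b): late event that belongs to the previous epoch. */
   if (now > last + threshold)
      t->event_sbc_wrap -= epoch;

   t->last_event_sbc = wire_sbc;
   return now + t->event_sbc_wrap;
}
```

Feeding the completion sequence 3, 1, 2 through unwrap_sbc() leaves the wrap offset untouched, because none of the differences comes anywhere near 2^30.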

v2: Explain the reason for this patch and the new wraparound handling
    much more extensively in commit message, no code change wrt. initial
    version.

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-17 23:54:02 +00:00
Emil Velikov
3f94a5afcb r600g: constify r600_shader_tgsi_instruction lists.
Massive list of constant data. Annotate it as such.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-03-17 23:52:39 +00:00
Emil Velikov
63cf2b4448 r600g: kill off r600_shader_tgsi_instruction::{tgsi_opcode,is_op3}
Both of which are no longer used. Use designated initializer to make
things obvious as people add/remove TGSI_OPCODEs.
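
As an illustration of the designated-initializer style, here is a tiny table keyed by an opcode enum. The opcodes, fields, and callbacks are invented for the example and are unrelated to the real r600g tables.

```c
#include <stddef.h>

enum sketch_opcode { SKETCH_OP_MOV, SKETCH_OP_ADD, SKETCH_OP_MUL, SKETCH_OP_COUNT };

struct sketch_instruction {
   unsigned hw_op;                  /* made-up hardware opcode */
   int (*process)(int a, int b);    /* made-up per-opcode callback */
};

static int do_mov(int a, int b) { (void)b; return a; }
static int do_add(int a, int b) { return a + b; }
static int do_mul(int a, int b) { return a * b; }

/* Each entry names its own index, so ordering mistakes cannot creep in
 * when opcodes are added or removed. */
static const struct sketch_instruction sketch_table[SKETCH_OP_COUNT] = {
   [SKETCH_OP_ADD] = { 0x00, do_add },
   [SKETCH_OP_MUL] = { 0x01, do_mul },
   [SKETCH_OP_MOV] = { 0x19, do_mov },
};
```

Since the index is spelled out in the initializer, the entry itself no longer needs to store its opcode, which fits with dropping the tgsi_opcode member.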

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-03-17 23:52:35 +00:00
Emil Velikov
5e68c6b322 r600g: use the tgsi opcode from parse.FullToken.FullInstruction
... rather than the local one in inst_info->tgsi_opcode.

This will allow us to simplify struct r600_shader_tgsi_instruction.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-03-17 23:52:32 +00:00
Ian Romanick
6db5e134b6 i965/fs: Apply gl_FrontFacing ? -1 : 1 optimization only for floats
At the very least, unreal4/sun-temple/102.shader_test uses this pattern
for a signed integer result.  However, that shader did not hit the
optimization in the first place because it uses !gl_FrontFacing.  I
changed the shader to remove the logical-not and reverse the other
operands.  I verified that incorrect code is generated before this
change and correct code is generated after.

Fixes fs-frontfacing-ternary-1-neg-1.shader_test.

No shader-db changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-17 15:01:44 -07:00
Ian Romanick
4a53445b0d i965/fs: Change try_opt_frontfacing_ternary to eliminate asserts
If we check for the case that is actually necessary, the asserts
become superfluous.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-17 15:00:28 -07:00
Ian Romanick
ce3f46397d i965/fs: Handle CMP.nz ... 0 and AND.nz ... 1 similarly in cmod propagation
Especially on platforms that do not natively generate 0u and ~0u for
Boolean results, we generate a lot of sequences where a CMP is
followed by an AND with 1.  emit_bool_to_cond_code does this, for
example.  On ILK, this results in a sequence like:

    add(8)          g3<1>F          g8<8,8,1>F      -g4<0,1,0>F
    cmp.l.f0(8)     g3<1>D          g3<8,8,1>F      0F
    and.nz.f0(8)    null            g3<8,8,1>D      1D
    (+f0) iff(8)    Jump: 6

The AND.nz is obviously redundant.  By propagating the cmod, we can
instead generate

    add.l.f0(8)     null            g8<8,8,1>F      -g4<0,1,0>F
    (+f0) iff(8)    Jump: 6

Existing code already handles the propagation from the CMP to the ADD.

Shader-db results:

GM45 (0x2A42):
total instructions in shared programs: 3550829 -> 3550788 (-0.00%)
instructions in affected programs:     10028 -> 9987 (-0.41%)
helped:                                24

Iron Lake (0x0046):
total instructions in shared programs: 4993146 -> 4993105 (-0.00%)
instructions in affected programs:     9675 -> 9634 (-0.42%)
helped:                                24

Ivy Bridge (0x0166):
total instructions in shared programs: 6291870 -> 6291794 (-0.00%)
instructions in affected programs:     17914 -> 17838 (-0.42%)
helped:                                48

Haswell (0x0426):
total instructions in shared programs: 5779256 -> 5779180 (-0.00%)
instructions in affected programs:     16694 -> 16618 (-0.46%)
helped:                                48

Broadwell (0x162E):
total instructions in shared programs: 6823088 -> 6823014 (-0.00%)
instructions in affected programs:     15824 -> 15750 (-0.47%)
helped:                                46

No change on Sandy Bridge or on any platform when NIR is used.

v2: Add unit tests suggested by Matt.  Remove spurious writes_flag()
check on scan_inst when scan_inst is known to be BRW_OPCODE_CMP (also
suggested by Matt).

v3: Fix some comments and remove some explicit int() casts in fs_reg
constructors in the unit tests.  Both suggested by Matt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-17 14:59:43 -07:00
Matt Turner
d35720da9b i965: Mark paths in linear <-> tiled functions as unreachable().
text    data     bss     dec     hex filename
9663       0       0    9663    25bf intel_tiled_memcpy.o   before
8215       0       0    8215    2017 intel_tiled_memcpy.o   after

Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-03-17 14:09:56 -07:00
Matt Turner
6c6e2a15aa egl: Remove eglQueryString virtual dispatch.
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-03-17 14:09:56 -07:00
Laura Ekstrand
827da841a1 main: Correct _mesa_error with no format in bufferobj.c.
This fixes Bug 89616, a build failure due to line 1639 of bufferobj.c:
_mesa_error(ctx, GL_INVALID_OPERATION, func);
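
The general hazard, sketched with a hypothetical printf-style reporter (report_error() is not the Mesa function, and the "%s" form is shown as the usual remedy rather than as the exact change made here):

```c
#include <stdarg.h>
#include <stdio.h>

static char last_error[128];

/* Hypothetical printf-style error reporter. */
static void report_error(const char *fmt, ...)
#if defined(__GNUC__) || defined(__clang__)
   __attribute__((format(printf, 1, 2)))
#endif
   ;

static void
report_error(const char *fmt, ...)
{
   va_list args;

   va_start(args, fmt);
   vsnprintf(last_error, sizeof(last_error), fmt, args);
   va_end(args);
}

static void
report_function_error(const char *func)
{
   /* report_error(func) would use func as the format string: compilers
    * can reject it under -Werror=format-security, and a '%' inside func
    * would be misinterpreted.  Passing it as an argument avoids both. */
   report_error("%s", func);
}
```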

Trivial.
2015-03-17 13:30:54 -07:00
Laura Ekstrand
579297c8bd main: Cosmetic changes to GetBufferSubData.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
23eab47bbe main: Add entry point for GetNamedBufferSubData.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
3706ace244 main: Cosmetic updates to GetBufferPointerv.
v3: Review from Fredrik Hoglund
   -Split cosmetic refactor of GetBufferPointerv out into a separate commit

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
105ddc6aea main: Add entry point for GetNamedBufferPointerv.
v3: Review from Fredrik Hoglund
   -Split cosmetic refactor of GetBufferPointerv out into a separate commit

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
1e45752aaf main: Add entry points for GetNamedBufferParameteri[64]v.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
efcb830d49 main: Refactor GetBufferParameteri[64]v.
v2: Split into a refactor commit and an entry point commit.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
1cfc18da8d main: Add entry point for FlushMappedNamedBufferRange.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
ee5fae6e89 main: Refactor FlushMappedBufferRange.
v2:-Remove "_mesa" from in front of static software fallback.
   -Split out the refactor from the addition of the DSA entry points.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
f7f5df9954 main: Add entry point for UnmapNamedBuffer.
v2: review from Ian Romanick
   - Restore VBO_DEBUG and BOUNDS_CHECK
   - Remove _mesa from static software fallback unmap_buffer.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
a0cc03929e main: Add entry points for MapNamedBuffer[Range].
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
4f513bc330 main: Refactor MapBuffer[Range].
v2: review from Jason Ekstrand
   - Split refactor from addition of DSA entry points.
    review from Ian Romanick
   - Remove "_mesa" from static software fallback map_buffer_range
   - Restore VBO_DEBUG and BOUNDS_CHECK

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
16244525fb main: Minor whitespace fixes in ClearNamedBuffer[Sub]Data.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
5030d0a4f7 main: Add entry points for ClearNamedBuffer[Sub]Data.
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-17 10:18:33 -07:00
Laura Ekstrand
9fa6c3637a main: Refactor ClearBuffer[Sub]Data.
v2: review by Jason Ekstrand
   - Split refactor of clear buffer sub data from addition of DSA entry
     points.

Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-17 10:18:33 -07:00
Laura Ekstrand
4adaad5fcc main: Add entry point for CopyNamedBufferSubData.
v2: remove _mesa in front of static software fallback.

Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-17 10:18:33 -07:00
Laura Ekstrand
9cb732b8e9 main: Improve errors and style in BufferSubData.
- More explicit error reporting.
- Removed legacy style.

Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-17 10:18:33 -07:00
Laura Ekstrand
566ccdf11b main: Add entry point for NamedBufferSubData.
v2: review by Ian Romanick
   - Remove "_mesa" from name of static software fallback buffer_sub_data.
   - Remove mappedRange from _mesa_buffer_sub_data.
   - Removed some cosmetic changes to a separate commit.

Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-17 10:18:33 -07:00
Laura Ekstrand
cb56835f87 main: Add entry point for NamedBufferData.
v2: review from Ian Romanick
   - Fix space in ARB_direct_state_access.xml.
   - Remove "_mesa" from the name of buffer_data static fallback.
   - Restore VBO_DEBUG and BOUNDS_CHECK.
   - Fix beginning of comment to start on same line as /*

Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-17 10:18:33 -07:00
Laura Ekstrand
a76808dc19 main: Add entry point for NamedBufferStorage.
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-17 10:18:33 -07:00
Laura Ekstrand
2cf48c37c1 main: Add entry point for CreateBuffers.
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-17 10:18:33 -07:00
Laura Ekstrand
44ecf0793d Revert "main: _mesa_cube_level_complete checks NumLayers."
This reverts commit 1ee000a0b6.
Failures with the GLES3 conformance suite and Synmark2 OGLHdrBloom revealed
that this commit was in error.

Extensive testing with Piglit prior to patch review and upstreaming did not
reveal this problem because, in the few Piglit tests that test for cube
completeness, NumLayers = 6.  This is because all of the existing tests use
TextureStorage to initialize the texture, which sets NumLayers.

A new Piglit test has been sent to the mailing list that reproduces the bug
related to this patch ("texturing: Testing
glGenerateMipmap(GL_TEXTURE_CUBE_MAP) without glTexStorage2D").

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-17 10:04:10 -07:00
Neil Roberts
5a06ee7384 i965/skl: Send a message header when doing constant loads SIMD4x2
Commit 0ac4c27275 made it add a header for the send message when
using SIMD4x2 on Skylake because without this it will end up using
SIMD8D. However the patch missed the case when a sampler is being used
to implement constant loads from a buffer surface in a SIMD4x2 vertex
shader.

This fixes 29 Piglit tests, mostly related to the ARL instruction in
vertex programs.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-17 16:32:11 +00:00
Tapani Pälli
627c683086 i965/fs: in MAD optimizations, switch last argument to be immediate
Commit bb33a31 introduced optimizations that transform cases of MAD
into simpler forms but it did not take into account that src[0]
cannot be immediate and did not report progress. Patch switches
src[0] and src[1] if src[0] is immediate and adds progress
reporting. If both sources are immediates, this is taken care of by
the same opt_algebraic pass on later run.

v2: Fix for all cases, use temporary fs_reg (Matt, Kenneth)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89569
Reviewed-by: Francisco Jerez <currojerez@riseup.net> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
2015-03-17 07:59:30 +02:00
Vinson Lee
60f77b22b1 common.py: Fix PEP 8 issues.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-16 22:55:08 -07:00
Roland Scheidegger
2372275d2f gallivm: abort properly when running out of buffer space in lp_disassembly
Before this actually ran into an infinite loop printing out "invalid"...

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-17 00:46:48 +01:00
Marek Olšák
9d1682d619 docs/GL3: also mark GLES3/GS5 for radeonsi as done 2015-03-16 23:27:25 +01:00
Emil Velikov
c066669b8d st/dri: remove unused include from the automake/scons build
st/dri/common hasn't been around for a while.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-16 20:59:52 +00:00
Emil Velikov
55f0c0a29f auxiliary/os: fix the android build - s/drm_munmap/os_munmap/
Squash this silly typo introduced with commit c63eb5dd5ec (auxiliary/os: get
the mmap/munmap wrappers working with android)

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-16 20:59:36 +00:00
Emil Velikov
5664f57df3 gallium/sw/kms: trivial cleanups
Remove the forward declaration and make use of the DEBUG_PRINT macro for
debug builds.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-16 20:59:22 +00:00
Emil Velikov
771cd266b9 loader: include <sys/stat.h> for non-sysfs builds
Required by fstat(), otherwise we'll error out due to implicit function
declaration.

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89530
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reported-by: Vadim Rutkovsky <vrutkovs@redhat.com>
Tested-by: Vadim Rutkovsky <vrutkovs@redhat.com>
2015-03-16 20:48:07 +00:00
Felix Janda
aead7fe2e2 c11/threads: Use PTHREAD_MUTEX_RECURSIVE by default
Previously PTHREAD_MUTEX_RECURSIVE_NP had been used on linux for
compatibility with old glibc. Since mesa defines __GNU_SOURCE__
on linux PTHREAD_MUTEX_RECURSIVE is also available since at least
1998. So we can unconditionally use the portable version
PTHREAD_MUTEX_RECURSIVE.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88534
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-16 20:41:39 +00:00
Marek Olšák
b5f19db976 radeonsi: implement TGSI_OPCODE_BFI (v2)
v2: Don't use the intrinsics, the shader backend can recognize these
    patterns and generates optimal code automatically.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-03-16 14:58:19 +01:00
Marek Olšák
d3723c614f radeonsi: add a helper for extracting bitfields from parameters (v2)
This will be used a lot (especially by tessellation).

v2: don't use the bfe intrinsic

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-03-16 14:58:19 +01:00
Antia Puentes
9735a62a2c i965: Emit IF/ELSE/ENDIF/WHILE JIP with type W on Gen7
IvyBridge and Haswell PRM say that the JIP should be emitted
with type W but we were using UD. The previous implementation
did not show adverse effects, but IMHO it is safer to follow
the specification thoroughly.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Antia Puentes <apuentes@igalia.com>
2015-03-16 12:56:17 +01:00
Marek Olšák
dc39413640 radeonsi: move scratch reloc state setup
- move it to its own function
- do it after all states are emitted
- bump SI_MAX_DRAW_CS_DWORDS

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
567c8d7300 radeonsi: don't emit PA_SC_LINE_STIPPLE if not rendering lines
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
1f4bb38264 radeonsi: don't emit PA_SC_LINE_STIPPLE after every rasterizer state change
Do it only when the line stipple state is changed.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
f5832f3f9d radeonsi: move PA_SU_SC_MODE_CNTL to rasterizer state
This requires enabling the optional GL provoking vertex behavior for quads.

+ some cosmetic changes, so that the register is set exactly the same as
on r600.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
98a2398222 radeonsi: implement line and polygon smoothing
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
303d23e10d radeonsi: add shader code for smoothing
The fragment shader multiplies the alpha channel with gl_SampleMaskIn.
If blending is enabled, it looks like MSAA.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
4f20a8f278 radeonsi: split sample locations into its own state atom
Sample locations are not updated as often as framebuffers.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
f7796a966d radeonsi: add basic code for overrasterization
This will be used for line and polygon smoothing.
This is GCN-only even though it's in shared code.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
1921fa4304 radeonsi: small cleanup in si_shader_selector_key
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
52ff1edc51 radeonsi: simplify accessing alpha pointer in si_llvm_emit_fs_epilogue
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
955ebf2890 radeonsi: add support for easy opcodes from ARB_gpu_shader5
I have to use the BFE intrinsics, because BFE is one of the most complex
instructions that can't be matched easily. BFE has 3 conditional branches
and one of them is quite big.

In the isel DAG, lowered BFE has 27 nodes (including leafs).
2015-03-16 12:54:18 +01:00
Marek Olšák
755a2907a3 radeonsi: implement bit-finding opcodes from ARB_gpu_shader5
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
ca90cde81e radeonsi: implement gl_SampleMaskIn
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
f9fd0c4a55 radeonsi: add support for SQRT
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
d73c1c1304 radeonsi: add support for FMA
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
dfea35666e gallium/radeon: don't use LLVMReadOnlyAttribute for ALU
None of the instructions use a pointer argument.
(+ small cosmetic changes)

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
9da9c8e3f4 tgsi: handle bitwise opcodes in tgsi_opcode_infer_type (v2)
v2: set the same types as the destination type in tgsi_exec

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-16 12:54:18 +01:00
Marek Olšák
216543ea54 gallium: add FMA and DFMA opcodes (v3)
Needed by ARB_gpu_shader5.

v2: select DMAD for FMA with double precision
v3: add and select DFMA

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-16 12:54:18 +01:00
Rob Clark
e92bc6b38e freedreno: update generated headers
Fix a3xx texture layer-size.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-03-15 18:00:19 -04:00
Rob Clark
d3fb949c03 freedreno/ir3: remove old compiler
Now that piglit is no longer falling back to old compiler for any tests,
we can remove it.  Hurray \o/

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-15 13:27:03 -04:00
Rob Clark
feb858b788 freedreno/ir3: avoid scheduler deadlock
Deadlock can occur if we schedule an address register write, yet some
instructions which depend on that address register value also depend on
other unscheduled instructions that depend on a different address
register value.  To solve this, before scheduling an address register
write, ensure that all the other dependencies of the instructions which
consume this address register are already scheduled.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-15 13:26:56 -04:00
Rob Clark
7208e96bb8 freedreno/ir3: bit of cleanup
Add an array_insert() macro to simplify inserting into dynamically sized
arrays, add a comment, and remove unused prototype inherited from the
original freedreno.git/fdre-a3xx test code, etc.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-15 13:26:44 -04:00
Kenneth Graunke
db095eb43b i965: De-duplicate is_expression_commutative() functions.
Create a backend_inst::is_commutative() method to replace two static
functions that did the exact same thing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-03-15 03:14:53 -07:00
Chris Forbes
f68a973dfb i965/gen4-5: Cope with immutable-format texture revalidation
This is unfortunately sometimes necessary due to rebasing levels when
rendering into them.

16 piglits crash -> pass, when building mesa with debug enabled.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-14 15:55:17 +13:00
Emil Velikov
8ed1b65b62 docs: add news item and link release notes for mesa 10.5.1
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-13 23:36:33 +00:00
Emil Velikov
5f72847a88 docs: Add sha256 sums for the 10.5.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 2abba086ca)
2015-03-13 23:35:02 +00:00
Emil Velikov
6c96608937 Add release notes for the 10.5.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 11c0ff60ef)
2015-03-13 23:35:00 +00:00
Ilia Mirkin
620e29b748 freedreno: fix slice pitch calculations
For example if width were 65, the first slice would get 96 while the
second would get 32. However the hardware appears to expect the second
pitch to be 64, based on halving the 96 (and aligning up to 32).
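
The arithmetic in the example above can be reproduced with a small sketch.
It assumes a 32-unit pitch alignment as stated in the message; the helper
names (align_up, minify, pitch_from_level_width, pitch_from_prev_pitch) are
hypothetical and are not the driver's actual functions.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of a (a must be a power of two). */
uint32_t align_up(uint32_t v, uint32_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (v + a - 1) & ~(a - 1);
}

/* Halve a dimension for the next mip level, never going below 1. */
uint32_t minify(uint32_t v)
{
    return v > 1 ? v >> 1 : 1;
}

/* Behaviour described as wrong: align each level's own minified width. */
uint32_t pitch_from_level_width(uint32_t width0, unsigned level)
{
    uint32_t w = width0;
    for (unsigned i = 0; i < level; i++)
        w = minify(w);
    return align_up(w, 32);
}

/* Behaviour the hardware appears to expect: halve the previous pitch,
 * then align the result up to 32 again. */
uint32_t pitch_from_prev_pitch(uint32_t width0, unsigned level)
{
    uint32_t pitch = align_up(width0, 32);
    for (unsigned i = 0; i < level; i++)
        pitch = align_up(minify(pitch), 32);
    return pitch;
}
```

For width 65, both helpers give 96 at level 0; at level 1 the first gives 32
and the second gives 64, matching the numbers quoted in the message.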

This fixes texelFetch piglit tests on a3xx below a certain size. Going
higher they break again, but most likely due to unrelated reasons.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-03-13 16:05:16 -04:00
Ilia Mirkin
89b26d5a36 freedreno/a3xx: use the same layer size for all slices
We only program in one layer size per texture, so that means that all
levels must share one size. This makes the piglit test

bin/texelFetch fs sampler2DArray

have the same breakage as its non-array version instead of being
completely off, and makes

bin/ext_texture_array-gen-mipmap

start passing.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-03-13 16:05:16 -04:00
Ian Romanick
e76a8dc8ed i965/vs: Add missing resolve_bool_comparison calls on GEN4 and GEN5
The ir_unop_any problem was discovered by some later optimization passes
that generate ir_triop_csel.  I was also able to reproduce it by
modifying the gl-2.0-vertexattribpointer vertex shader to generate its
result using

   color = mix(vec4(0, 1, 0, 0),
               vec4(1, 0, 0, 0),
               bvec4(any(greaterThan(diff, vec4(tolerance)))));

instead of an if-statement.  This also required using #version 130 and
MESA_GLSL_VERSION_OVERRIDE=130.

I have not nominated this for stable releases because I don't think
there's any way to trigger the problem without GLSL 1.30 or
optimizations that don't exist in stable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@intel.com>
2015-03-13 12:57:32 -07:00
Chris Forbes
21ff9bfe1c i965/disasm: Fix format strings
Most of the brw_inst_* api returns 64bit values. This fixes disassembly
of sampler messages, etc.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-14 07:51:18 +13:00
Chris Forbes
7c3095d6b7 i965/disasm: Mark format() as being printf-style.
This allows us to get warnings from GCC when we mess up the format
strings.
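
As a rough illustration of the annotation, the sketch below marks a
hypothetical variadic helper (format_into, not the disassembler's format())
as printf-style. The PRINTFLIKE macro is also an assumption made for this
example; it expands to the GCC/Clang format attribute and to nothing elsewhere.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper macro: only GCC-compatible compilers get the attribute. */
#if defined(__GNUC__)
#define PRINTFLIKE(f, a) __attribute__((format(printf, f, a)))
#else
#define PRINTFLIKE(f, a)
#endif

/* Argument 3 is the format string; the variadic arguments start at 4. */
int format_into(char *buf, size_t size, const char *fmt, ...) PRINTFLIKE(3, 4);

int format_into(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    int n = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return n;
}
```

With the attribute in place, a call such as format_into(buf, size, "%d", "text")
should draw a -Wformat warning at compile time instead of misbehaving at run
time.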

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-14 07:50:48 +13:00
Matt Turner
97399fc751 docs: List ARB_shading_language_packing/EXT_shader_integer_mix.
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-03-13 10:42:38 -07:00
Matt Turner
8d3aa5926b glsl: Expose built-in packing functions under GLSL 4.2.
ARB_shading_language_packing is part of GLSL 4.2, not 4.0 as I
mistakenly believed. The following functions are available only with
ARB_shading_language_packing, GLSL 4.2 (not GLSL 4.0), or ES 3.0:

   - packSnorm2x16
   - unpackSnorm2x16
   - packHalf2x16
   - unpackHalf2x16

Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-03-13 10:42:38 -07:00
Matt Turner
dac2e7deaa egl: Create queryable strings in eglInitialize().
Creating/recreating the strings in eglQueryString() is extra work and
isn't thread-safe, as exhibited by shader-db's run.c using libepoxy.

Multiple threads in run.c call eglReleaseThread() around the same time.
libepoxy calls eglQueryString() to determine whether eglReleaseThread()
exists, and our EGL implementation passes a pointer to the version
string to libepoxy while simultaneously overwriting the string, leading
to a failure in libepoxy.

Moreover, the EGL spec says (emphasis mine):

"eglQueryString returns a pointer to a *static*, zero-terminated string"

This patch moves some auxiliary functions from eglmisc.c to eglapi.c so
that they may be used to create the extension, API, and version strings
once during eglInitialize(). The auxiliary functions are renamed from
_eglUpdate* to _eglCreate*, and some checks made unnecessary by calling
the functions from eglInitialize() are removed.
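
The idea of building the strings once and only reading them afterwards can
be sketched as follows. The struct and function names here are hypothetical
stand-ins, not the actual EGL implementation; the point is only that the
query path never writes to the buffer.

```c
#include <stdio.h>

/* Hypothetical display record; the string is stored alongside the display. */
struct display {
    int major, minor;
    char version_string[32];
};

/* Called once at initialization: the only place the string is written. */
void display_initialize(struct display *d, int major, int minor)
{
    d->major = major;
    d->minor = minor;
    snprintf(d->version_string, sizeof d->version_string,
             "%d.%d", major, minor);
}

/* Query path: returns the prebuilt string without modifying it, so
 * concurrent callers never observe a half-written buffer. */
const char *display_query_version(const struct display *d)
{
    return d->version_string;
}
```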

Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-03-13 10:42:38 -07:00
Samuel Iglesias Gonsalvez
b43bbfa90a glsl: optimize (0 cmp x + y) into (-x cmp y).
The optimization done by commit 34ec1a24d did not take it into account.

Fixes:

dEQP-GLES3.functional.shaders.random.all_features.fragment.20

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-03-13 16:40:20 +01:00
Eduardo Lima Mitev
cf6f33ee68 mesa: Check for valid PBO access in gl(Compressed)Tex(Sub)Image calls
This patch adds two types of checks to the gl(Compressed)Tex(Sub)Image family
of functions when a pixel buffer object is bound to GL_PIXEL_UNPACK_BUFFER:

- That the buffer is not mapped.
- The total data size is within the boundaries of the buffer size.

It does so by calling auxiliary validation functions from PBO API:
_mesa_validate_pbo_source() for non-compressed texture calls, and
_mesa_validate_pbo_source_compressed() for compressed texture calls.
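
A minimal sketch of the two checks, under the assumption of a simplified
buffer record. The struct buffer, enum check_result and check_unpack_buffer
names are hypothetical and do not mirror the real _mesa_validate_pbo_*
signatures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a buffer object. */
struct buffer {
    size_t size;   /* total size in bytes */
    bool mapped;   /* true while any part of the buffer is mapped */
};

enum check_result { CHECK_OK, CHECK_MAPPED, CHECK_OUT_OF_BOUNDS };

/* Check 1: the buffer must not be mapped.
 * Check 2: [offset, offset + data_size) must lie inside the buffer. */
enum check_result check_unpack_buffer(const struct buffer *buf,
                                      size_t offset, size_t data_size)
{
    if (buf->mapped)
        return CHECK_MAPPED;
    /* Written as a subtraction so offset + data_size cannot overflow. */
    if (offset > buf->size || data_size > buf->size - offset)
        return CHECK_OUT_OF_BOUNDS;
    return CHECK_OK;
}
```

In the GL API both failures would map to INVALID_OPERATION; the sketch keeps
them separate only to make the two checks visible.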

The first check is defined in Section 6.3.2 'Effects of Mapping Buffers
on Other GL Commands' of the GLES 3.1 spec, page 57:

    "Any GL command which attempts to read from, write to, or change the
     state of a buffer object may generate an INVALID_OPERATION error if all
     or part of the buffer object is mapped. However, only commands which
     explicitly describe this error are required to do so. If an error is not
     generated, using such commands to perform invalid reads, writes, or
     state changes will have undefined results and may result in GL
     interruption or termination."

Similar wording exists in GL 4.5 spec, page 76.

In the case of gl(Compressed)Tex(Sub)Image(2,3)D, the specification doesn't force
implementations to throw an error. However, since Mesa doesn't currently implement
checks to determine when it is safe to read/write from/to a mapped PBO, we
should always return the error if all or parts of it are mapped.

The 2nd check is defined in Section 8.5 'Texture Image Specification' of the
OpenGL 4.5 spec, page 203:

    "An INVALID_OPERATION error is generated if a pixel unpack buffer object
     is bound and storing texture data would access memory beyond the end of
     the pixel unpack buffer."

Fixes 4 dEQP tests:
* dEQP-GLES3.functional.negative_api.texture.compressedteximage2d_invalid_buffer_target
* dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage2d_invalid_buffer_target
* dEQP-GLES3.functional.negative_api.texture.compressedteximage3d_invalid_buffer_target
* dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage3d_invalid_buffer_target

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-03-13 16:40:20 +01:00
Eduardo Lima Mitev
7c084752c6 mesa: Separate PBO validation checks from buffer mapping, to allow reuse
Internal PBO functions such as _mesa_map_validate_pbo_source() and
_mesa_validate_pbo_compressed_teximage() perform validation and buffer mapping
within the same call.

This patch takes out the validation into separate functions to allow reuse
of functionality by other code (i.e., gl(Compressed)Tex(Sub)Image).

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-03-13 16:40:20 +01:00
Eduardo Lima Mitev
7b5bb97cef mesa: Set the correct image size in _mesa_validate_pbo_access()
_mesa_validate_pbo_access() provides a generic way to check that a
requested pixel transfer operation on a PBO falls within the
boundaries of the buffer. It is used in various other places, and
depending on the caller, some arguments are used or not.

In particular, the 'clientMemSize' argument is used only by calls
that are knowledgeable of the total size of the user data involved
in a pixel transfer, such as the case of compressed texture image
calls. Other calls don't provide 'clientMemSize' directly since it
is made implicit from the size and format of the texture, and its
data type. In these cases, a sufficiently big value is passed to
'clientMemSize' (INT_MAX) to avoid an incorrect constraint.

The problem is that _mesa_validate_pbo_access() uses uint
pointers to make the calculations, which are 64 bits long on 64-bit
platforms, while the dummy INT_MAX passed in 'clientMemSize'
is just 32 bits. This causes a constraint that is not desired.

This patch fixes that by checking that if 'clientMemSize' is INT_MAX,
then UINTPTR_MAX is assumed instead.
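
The workaround can be pictured with a small sketch, assuming a simplified
bounds check. The range_fits function and its parameters are hypothetical
and are not the real _mesa_validate_pbo_access() signature.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified bounds check: does [offset, offset + length)
 * fit within client_mem_size bytes? */
bool range_fits(uintptr_t offset, uintptr_t length, int client_mem_size)
{
    if (client_mem_size < 0)
        return false;

    /* INT_MAX is the callers' "no explicit limit" sentinel, so widen it
     * to the largest pointer-sized value instead of a 32-bit bound. */
    uintptr_t limit = (client_mem_size == INT_MAX)
                          ? UINTPTR_MAX
                          : (uintptr_t)client_mem_size;

    if (offset > limit)
        return false;
    return length <= limit - offset;
}
```

Without the widening, a transfer that starts at offset INT_MAX would be
rejected even though the caller never meant INT_MAX as a real size.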

This is an ugly workaround to the fact that _mesa_validate_pbo_access()
intends to be a one function fits all. The clean solution here would
be to break it into different functions that provide the adequate API
for each of the possible code paths and validation needs.

Since there are callers relying on passing INT_MAX to 'clientMemSize',
this patch is necessary to deal with the problem above until a cleaner
implementation of the PBO API is available.

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-03-13 16:40:20 +01:00
Eduardo Lima Mitev
f6f7bfb5e1 meta: Remove error checks for texture <-> pixel-buffer transfers that don't belong in driver code
The implementation of texture <-> pixel-buffer transfers in the drivers' common
layer includes certain error checks and argument validation that don't belong
there, considering how the Mesa codebase is laid out. These are higher level
validations that, if necessary, should be performed earlier (i.e., in GL API
entry points).

This patch simply removes these error checks from driver code.

For more information, see discussion at
http://lists.freedesktop.org/archives/mesa-dev/2015-February/077417.html.

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
2015-03-13 16:40:20 +01:00
Brian Paul
558dcd8770 util: convert slab macros to inline functions
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-13 08:03:43 -06:00
Brian Paul
d24a20e967 egl: fix cast to silence compiler warning
eglcurrent.c: In function '_eglSetTSD':
eglcurrent.c:57:4: warning: passing argument 2 of 'tss_set' discards
'const' qualifier from pointer target type [enabled by default]
    tss_set(_egl_TSD, (const void *) t);
    ^
In file included from ../../../include/c11/threads.h:72:0,
                 from eglcurrent.c:32:
../../../include/c11/threads_posix.h:357:1: note: expected 'void *'
but argument is of type 'const void *'
 tss_set(tss_t key, void *val)
 ^

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-13 08:03:43 -06:00
Alexandre Demers
a38e6c4fbd gallivm: (trivial) Fix typo in comment introduced by 70dc8a
Fix typo in comment introduced by 70dc8a

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-13 13:52:52 +00:00
Seán de Búrca
1a469a34d5 mesa: improve ARB_copy_image internal format compat check
The memory layout of compatible internal formats may differ in bytes per
block, so TexFormat is not a reliable measure of compatibility. For example,
GL_RGB8 and GL_RGB8UI are compatible formats, but GL_RGB8 may be laid out in
memory as B8G8R8X8. If GL_RGB8UI has a 3 byte-per-block memory layout, the
existing compatibility check will fail.

Additionally, the current check allows any two compressed textures which share
block size to be used, whereas the spec gives an explicit table of compatible
formats.

v2: Use a switch instead of array iteration for block class and show the
    correct GL error when internal formats are mismatched.
v3: Include spec citations for new compatibility checks, rearrange check
    order to ensure that compressed, view-compatible formats return the
    correct result, and make style fixes. Original commit message amended
    for clarity.
v4: Reformatted spec citations.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-12 16:40:03 -07:00
Kenneth Graunke
f3e4b2c9d2 nir: Fix non-determinism in nir_lower_vars_to_ssa().
Previously, we stored derefs in a hash table, using the malloc'd pointer
as the key.  Then, we walked through the hash table and generated code,
based on the order of the hash table's elements.

Memory addresses returned by malloc are pretty much random, which meant
that the hash was random, and the hash table's elements would be walked
in some random order.  This led to successive compiles of the same
shader using different variable names and slightly different orderings
of phi-nodes.  Code could not be diff'd, and the final assembly would
sometimes change slightly too.

It turns out the only point of the hash table was to avoid inserting
the same node multiple times for different dereferences.  We never
actually searched the hash table!  This patch uses an intrusive
linked list instead.  Since exec_list uses head and tail sentinels,
checking prev or next against NULL will tell us whether the node is
already in the list.
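
The membership test described above can be sketched with a simplified
doubly linked list that uses two separate sentinel nodes. This is an
assumption-laden stand-in, not exec_list itself (whose sentinel layout
differs); all names below are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct node {
    struct node *next, *prev;
};

/* Head and tail sentinels bracket the real nodes, so a linked node
 * always has non-NULL next and prev pointers. */
struct list {
    struct node head;
    struct node tail;
};

void list_init(struct list *l)
{
    l->head.prev = NULL;
    l->head.next = &l->tail;
    l->tail.prev = &l->head;
    l->tail.next = NULL;
}

/* An unlinked node is marked by NULL pointers. */
void node_init(struct node *n)
{
    n->next = NULL;
    n->prev = NULL;
}

bool node_is_linked(const struct node *n)
{
    return n->next != NULL;
}

/* Append n unless it is already in a list; returns true if it was added.
 * Insertion order is preserved, so iteration order is deterministic. */
bool list_push_tail_once(struct list *l, struct node *n)
{
    if (node_is_linked(n))
        return false;
    n->next = &l->tail;
    n->prev = l->tail.prev;
    l->tail.prev->next = n;
    l->tail.prev = n;
    return true;
}
```

Because the walk order is the insertion order rather than a function of
malloc'd addresses, repeated runs visit the nodes identically.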

Pair programming with Jason Ekstrand.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-03-12 13:25:39 -07:00
Jason Ekstrand
67388c1ef2 util: Fix foreach_list_typed_safe when exec_node is not at offset 0.
__next and __prev are pointers to the structure containing the exec_node
link, not the embedded exec_node.  NULL checks would fail unless the
embedded exec_node happened to be at offset 0 in the parent struct.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   Use "(__node)->__field.next != NULL" to check for the end of the list
   instead of the "&__next->__field != NULL".  The former is far more
   obviously correct as it matches what the non-safe versions do.  The
   original code tried to avoid any use of __next as the client code may
   delete it during its execution.  However, since the looping condition is
   checked after the iteration clause but before the client code is
   executed, we know that __node is valid during the looping condition.
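
The offset problem can be illustrated with a simplified, NULL-terminated
singly linked list (an assumption; exec_list uses sentinels instead). The
types and the ITEM_FROM_LINK macro are hypothetical. The loop tests the
embedded link pointer for the end of the list and only then converts it to
the containing struct, since a container pointer derived from a NULL link
is not NULL when the link is not at offset 0.

```c
#include <stddef.h>

struct link {
    struct link *next;
};

struct item {
    int value;          /* pushes the embedded link away from offset 0 */
    struct link link;
};

/* Recover the containing item from a pointer to its embedded link. */
#define ITEM_FROM_LINK(l) \
    ((struct item *)((char *)(l) - offsetof(struct item, link)))

/* "Safe" traversal: the loop body may unlink the current element, so the
 * successor is saved first. The termination test is on the link pointer,
 * never on the derived container pointer. */
int sum_and_unlink(struct link *first)
{
    int sum = 0;
    struct link *next;
    for (struct link *l = first; l != NULL; l = next) {
        next = l->next;
        struct item *it = ITEM_FROM_LINK(l);
        sum += it->value;
        l->next = NULL;   /* simulate client code removing the node */
    }
    return sum;
}
```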

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-12 13:25:39 -07:00
Kenneth Graunke
547c760964 i965: Use NIR for scalar VS when INTEL_USE_NIR is set.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-12 08:29:49 -07:00
Kenneth Graunke
7ef0b6b367 i965/fs: Add VS output support to nir_setup_outputs().
Adapted from fs_visitor::visit(ir_variable *).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-12 08:29:49 -07:00
Kenneth Graunke
eb137117b7 i965/fs: Handle VS inputs in the NIR backend.
(Jason noted that this is not a good long term solution, and we should
instead improve nir_lower_io so that this extra set of MOVs is
unnecessary.  I tend to agree, but decided we could do that as a
follow-up improvement.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-12 08:29:48 -07:00
Kenneth Graunke
a5c4e7fcf5 i965/fs: Refactor fs_visitor::nir_setup_inputs().
No functional change.  In preparation for supporting vertex shaders,
this adds a switch statement on shader stage (since vertex attributes
and fragment shader varyings will need different handling).  It also
renames "varying" to "input", to be more general.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-12 08:29:48 -07:00
Kenneth Graunke
34628a838a i965: Implement NIR intrinsics for loading VS system values.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-12 08:29:48 -07:00
Kenneth Graunke
2c79f6f9c3 nir: Add intrinsics for SYSTEM_VALUE_BASE_VERTEX and VERTEX_ID_ZERO_BASE
Ian and I added these around the time Connor was developing NIR.  Now
that both exist, we should make them work together!

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-12 08:29:48 -07:00
Kenneth Graunke
b9dea9bc45 i965/nir: Lower to registers a bit later.
We can't safely call nir_optimize() with registers present, since several
passes called in the loop can't handle registers, and will fail asserts.

Notably, nir_lower_vec_alus() and nir_opt_algebraic() really don't want
registers.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-12 08:29:48 -07:00
Kenneth Graunke
1f0067811c i965/nir: Optimize after nir_lower_var_copies().
Array variable copy splitting generates a bunch of stuff we want to
clean up before proceeding.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-12 08:29:48 -07:00
Kenneth Graunke
1d8ef6ba60 i965/fs: Store a pointer to brw_sampler_prog_key_data in the visitor.
The NIR backend hardcodes brw_wm_prog_key at the moment, which won't
work when we support scalar VS.  We could use get_tex(), but it's a
static method.  I was going to promote it to fs_visitor, but then
realized that both parameters (stage and key) are already members.

It then occurred to me that we could just set up a pointer in the
constructor, and skip having a function altogether.

This patch also converts all existing users to use key_tex.

v2: Make key_tex a "const brw_sampler_prog_key_data *" instead of
    non-const; word-wrap some lines.  (Review comments from Topi.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-12 08:29:48 -07:00
Brian Paul
48b0a3c1c9 tnl: HAVE_LE32_VERTS is never defined, remove associated code
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-12 07:52:45 -06:00
Brian Paul
6d3b86c3af mesa: move LONGSTRING into generated enums.c
enums.c is the only place this directive is needed.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-12 07:52:45 -06:00
Brian Paul
f8ed0bbfef mesa: remove _ASMAPI, ASMAPIP
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-12 07:52:45 -06:00
Brian Paul
09ffa04cd9 mesa: remove _XFORMAPI
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-12 07:52:45 -06:00
Brian Paul
10035361b5 swrast: remove _BLENDAPI
_BLENDAPI boils down to __cdecl on Windows, but __cdecl is the default
calling convention so this serves no purpose.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-12 07:52:45 -06:00
Brian Paul
6ca5eaf49c mesa: use ARRAY_SIZE in _mesa_QueryMatrixxOES()
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-12 07:52:45 -06:00
Brian Paul
c3984c1155 mesa: remove register keyword, add const in _mesa_QueryMatrixxOES()
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-12 07:52:45 -06:00
Brian Paul
97f6d50f72 mesa: reindent querymatrix.c
Use 3-space indents, not 4.  Move some comments after the case statements.

Acked-by: Matt Turner <mattst88@gmail.com>
2015-03-12 07:52:45 -06:00
Brian Paul
be4e198be0 mesa: move fpclassify work-arounds into c99_math.h
v2: Use #error in the #else clause, per Jose.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-12 07:52:35 -06:00
Jose Fonseca
70dc8a9930 gallivm: Prevent double delete on LLVM 3.6
std::unique_ptr takes ownership of MM, and a double delete could ensue
in case of an error, as pointed out by Chris Vine in
https://bugs.freedesktop.org/show_bug.cgi?id=89387

Reviewed-by: Chris Vine <chris@cvine.freeserve.co.uk>
2015-03-12 10:01:09 +00:00
Emil Velikov
30916a5ef0 autogen.sh: pass --force to autoreconf, quote ORIGDIR
By passing --force autoreconf will update all the aux files, which would
otherwise be ignored if one updates autoconf/automake.

Quote the ORIGDIR variable to prevent breakage when the directory name
contains a space.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-11 23:28:26 +00:00
Emil Velikov
a385d18598 glx: remove support for non-multithreaded platforms
Implicitly required for a while, although commit 9385c592c6 (mapi:
remove u_thread.h) was the one that put the final nail in the
coffin.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-11 23:28:26 +00:00
Emil Velikov
42144170d1 glx: remove final reference to THREADS
Left over from commit 18db13f5865 (mapi: THREADS was always defined,
remove it)

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-11 23:28:26 +00:00
Emil Velikov
39f90e6b9b configure: require pthreads for POSIX builds
This has been an implicit rule for building mesa for a long time. Let's
make it official and just bail out at configure time. This way we can
clean up some of our glx code.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-11 23:28:25 +00:00
Emil Velikov
a806df3f23 egl/main: convert thread management to use c11 threads
Convert the code to use the C11 threads implementation, and nuke the
Windows non-pthreads code-path. The c11/threads_win32.h abstraction
should be better than the current code.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-11 23:28:25 +00:00
Emil Velikov
efe87f1a80 egl/main: use c11/threads' mutex directly
Remove the inline wrappers/abstraction layer.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-11 23:28:25 +00:00
Jason Ekstrand
90e50908d7 nir/worklist: Don't change the start index when computing the tail index
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2015-03-11 15:18:16 -07:00
Thomas Helland
8fb8fe46fa nir: Optimize a + neg(a)
Shader-db i965 instructions:
total instructions in shared programs: 1711180 -> 1711159 (-0.00%)
instructions in affected programs:     825 -> 804 (-2.55%)
helped:                                9
HURT:                                  0
GAINED:                                3
LOST:                                  3

Shader-db NIR instructions:
total instructions in shared programs: 606187 -> 606179 (-0.00%)
instructions in affected programs:     298 -> 290 (-2.68%)
helped:                                4
HURT:                                  0
GAINED:                                0
LOST:                                  0

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
2015-03-11 14:21:05 -07:00
Thomas Helland
0525f2e851 nir: Optimize (a*b)+(a*c) -> a*(b+c)
Shader-db i965 instructions:
total instructions in shared programs: 1715894 -> 1710802 (-0.30%)
instructions in affected programs:     443080 -> 437988 (-1.15%)
helped:                                1502
HURT:                                  13
GAINED:                                4
LOST:                                  4

Shader-db NIR instructions:
total instructions in shared programs: 607710 -> 606187 (-0.25%)
instructions in affected programs:     208285 -> 206762 (-0.73%)
helped:                                769
HURT:                                  8
GAINED:                                0
LOST:                                  0

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
2015-03-11 14:21:05 -07:00
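As a rough illustration of the two NIR rewrites above, a + neg(a) and (a*b)+(a*c) -> a*(b+c), the C sketch below computes each expression before and after the rewrite on 32-bit unsigned integers, where arithmetic wraps modulo 2^32 and both identities hold exactly. The function names are hypothetical and are not NIR code; for floating point the factored form can round differently, which this sketch does not model.

```c
#include <stdint.h>

/* Hypothetical helpers, not NIR code: each pair computes the same value
 * before and after the rewrite. */

/* a + neg(a): unsigned negation wraps, so the sum is always 0. */
static uint32_t add_neg_before(uint32_t a) { return a + (0u - a); }
static uint32_t add_neg_after(uint32_t a)  { (void)a; return 0u; }

/* (a*b) + (a*c): two multiplies and one add. */
static uint32_t distribute_before(uint32_t a, uint32_t b, uint32_t c)
{
   return (a * b) + (a * c);
}

/* a*(b+c): one multiply and one add. */
static uint32_t distribute_after(uint32_t a, uint32_t b, uint32_t c)
{
   return a * (b + c);
}
```

The second form saves one multiply per match, which is consistent with the instruction-count reductions reported in the shader-db numbers.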
Marius Predut
09b0325409 vbo: improve the code style by adjust the preprocessing c code directives
Brian Paul's review suggestion: there's more macro use here than necessary.
Removed and redefined some #define preprocessing directives.
Removed the directive input parameter 'T'.
No functional changes.

Signed-off-by: Marius Predut <marius.predut@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-11 09:34:25 -06:00
Brian Paul
9816acff2c mesa: remove CPU_TO_LE32() for AIX
This is the only remnant of AIX-specific code in Mesa.  Probably long
unused.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-11 09:34:25 -06:00
Brian Paul
3158b3abb3 mesa: remove #define __volatile
Not actually used anywhere in Mesa.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-11 09:34:24 -06:00
Brian Paul
d7193ce42c mesa: use strdup() instead of _mesa_strdup()
We were already using strdup() in various places in Mesa.  Get rid
of the _mesa_strdup() wrapper.  All the callers pass a non-NULL
argument so the NULL check isn't needed either.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-11 09:34:24 -06:00
Brian Paul
5376bc74cc st/glx: use strdup() instead of _mesa_strdup()
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-11 09:34:24 -06:00
Brian Paul
279c5965aa xlib: use strdup() instead of _mesa_strdup()
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-11 09:34:24 -06:00
Brian Paul
14ba6c9325 i915: add parens to silence operator precedence warning
Signed-off-by: Brian Paul <brianp@vmware.com>
2015-03-11 09:34:07 -06:00
Iago Toral Quiroga
6ac1bc90c4 i965: Fix out-of-bounds accesses into pull_constant_loc array
The piglit test glsl-fs-uniform-array-loop-unroll.shader_test was designed
to do an out-of-bounds access into a uniform array to make sure that we
handle that situation gracefully inside the driver, however, as Ken describes
in bug 79202, Valgrind reports that this is leading to an out-of-bounds access
in fs_visitor::demote_pull_constants().

Before accessing the pull_constant_loc array we should make sure that
the uniform we are trying to access is valid.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79202
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-11 08:03:40 +01:00
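The kind of validity check described above can be sketched in C as follows. This is only an illustration under assumptions: the function name, the parameters, and the convention that a negative entry means "not a pull constant" are hypothetical and are not taken from fs_visitor::demote_pull_constants().

```c
#include <stdbool.h>

/* Hypothetical lookup: returns true and stores the pull-constant slot in
 * *loc_out only when uniform_index is inside the table. */
static bool
lookup_pull_constant_loc(const int *pull_constant_loc, unsigned num_uniforms,
                         unsigned uniform_index, int *loc_out)
{
   /* Out-of-range uniform: bail out rather than read past the array. */
   if (uniform_index >= num_uniforms)
      return false;

   /* Assumption for this sketch: a negative entry means the uniform was
    * not demoted to a pull constant. */
   if (pull_constant_loc[uniform_index] < 0)
      return false;

   *loc_out = pull_constant_loc[uniform_index];
   return true;
}
```

The range test comes first so that the table is never indexed with an invalid uniform.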
Jordan Justen
5750595ca9 i965/gen6 gs: Convert brw_imm_ud/brw_imm_d to src_reg
Same idea as this patch, only for gen6_gs_visitor:

commit 49a938a265
Author: Jordan Justen <jordan.l.justen@intel.com>
Date:   Fri Feb 20 12:12:25 2015 -0800
    i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data

Suggested-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-10 00:14:53 -07:00
Jordan Justen
e5269ca28e i965/fs: Use unsigned for CS/VS atomics pixel mask immediate data
brw_imm_ud(0xffff) should have been converted to fs_reg(0xffffu) to
make sure the uint32_t fs_reg constructor was matched.

commit 49a938a265
Author: Jordan Justen <jordan.l.justen@intel.com>
Date:   Fri Feb 20 12:12:25 2015 -0800
    i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-10 00:13:08 -07:00
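The distinction this fix relies on is visible in plain C as well: with a 32-bit int, the literal 0xffff fits in int and therefore has type int, while the u suffix makes 0xffffu an unsigned int. The sketch below uses C11 _Generic as a stand-in for C++ overload resolution; the macro and function names are hypothetical, and fs_reg itself is not modeled.

```c
/* Hypothetical stand-in for overload selection: reports which "constructor"
 * an immediate would select based on its static type (C11 _Generic). */
#define IMM_KIND(x) _Generic((x),      \
   int:          "signed",             \
   unsigned int: "unsigned",           \
   default:      "other")

/* 0xffff fits in a 32-bit int, so its type is int. */
static const char *kind_plain(void)    { return IMM_KIND(0xffff); }

/* The u suffix forces unsigned int. */
static const char *kind_suffixed(void) { return IMM_KIND(0xffffu); }
```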
Jordan Justen
6626e3548b i965/gen8: Don't allocate hiz miptree structure
We now skip allocating a hiz miptree for gen8. Instead, we calculate
the required hiz buffer parameters and allocate a bo directly.

v2:
 * Update hz_height calculation as suggested by Topi
v3:
 * Bail if we failed to create the bo (Ben)
v4:
 * CEILING => DIV_ROUND_UP
 * Make sure mt->logical_depth0 being 0 would not cause trouble
 * Fail if Y tiling is not returned

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67564
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-03-09 23:56:51 -07:00
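The v4 note above swaps CEILING for DIV_ROUND_UP. A common way to write such a ceiling-division macro is sketched below; whether Mesa defines it in exactly this form is an assumption, and rows_to_blocks with its factor of 8 is a hypothetical example, not the actual hiz height calculation.

```c
/* Ceiling division for non-negative integers; assumes b > 0 and that
 * a + b - 1 does not overflow. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical helper: number of 8-row blocks needed to cover height rows. */
static unsigned rows_to_blocks(unsigned height)
{
   return DIV_ROUND_UP(height, 8u);
}
```

With this form a zero input yields zero blocks, which matters for the "logical_depth0 being 0" case mentioned in the same note.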
Jordan Justen
81124aefe8 i965/gen7: Don't allocate hiz miptree structure
We now skip allocating a hiz miptree for gen7. Instead, we calculate
the required hiz buffer parameters and allocate a bo directly.

v2:
 * Update hz_height calculation as suggested by Topi
v3:
 * Bail if we failed to create the bo (Ben)
v4:
 * CEILING => DIV_ROUND_UP
 * Make sure mt->logical_depth0 being 0 would not cause trouble
 * Fail if Y tiling is not returned

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67564
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-03-09 23:56:51 -07:00
Jordan Justen
31b851dccb i965/gen8: Don't rely directly on the hiz miptree structure
We are still allocating a miptree for hiz, but we only use fields from
intel_miptree_aux_buffer. This will allow us to switch over to not
allocating a miptree.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-03-09 23:56:51 -07:00
Jordan Justen
26eabd189d i965/gen7: Don't rely directly on the hiz miptree structure
We are still allocating a miptree for hiz, but we only use fields from
intel_miptree_aux_buffer. This will allow us to switch over to not
allocating a miptree.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-03-09 23:56:51 -07:00
Jordan Justen
aedcd466bb i965/hiz: Start to separate miptree out from hiz buffers
Today we allocate a miptree for the hiz buffer. We needed this in
the past because we would point the hardware at offsets of the hiz
buffer. Since the hiz format is not documented, this is not a good
idea.

Since moving to support layered rendering on Gen7+, we no longer point
at an offset into the buffer on Gen7+.

Therefore, to support hiz on Gen7+, we don't need a full miptree
structure allocated.

This patch starts to create a new auxiliary buffer structure
(intel_miptree_aux_buffer) that can be a more simplistic miptree
side-band buffer associated with a miptree. (For example, to serve the
needs of the hiz buffer.)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-03-09 23:56:50 -07:00
Dave Airlie
4d318b61fc mesa/scissor: fix typos in debug names
Just noticed this when working on virgl.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-10 16:45:45 +10:00
Samuel Pitoiset
e5cd42ed9a nvc0: fix wrong max value for driver queries
The maximum value of a Gallium HUD's panel is automatically adjusted
when the current value is greater than the max. If we set the
pipe_query_driver_info::max_value to UINT64_MAX, the maximum value is
never adjusted and this results in a flat line instead of a pretty curve
which is correctly scaled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-09 20:47:05 -04:00
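The scaling behaviour described above can be sketched in C. This is a simplified model with hypothetical names (struct pane, pane_add_sample, pane_scale); it is not the Gallium HUD code, only an illustration of why a UINT64_MAX starting maximum flattens the graph.

```c
#include <stdint.h>

/* Hypothetical graph pane: tracks the largest value seen so the vertical
 * axis can be scaled to it. */
struct pane {
   uint64_t max_value;
};

/* Raise the pane's maximum only when a sample exceeds it. */
static void pane_add_sample(struct pane *p, uint64_t value)
{
   if (value > p->max_value)
      p->max_value = value;
}

/* Height of a sample as a fraction of the pane, in [0, 1]. */
static double pane_scale(const struct pane *p, uint64_t value)
{
   return p->max_value ? (double)value / (double)p->max_value : 0.0;
}
```

Starting from a small maximum, the axis grows with the data; starting from UINT64_MAX, no sample can exceed it, so every sample scales to nearly zero.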
Vinson Lee
13f4963ed2 i965: Silence GCC maybe-uninitialized warning.
brw_shader.cpp: In function ‘bool brw_saturate_immediate(brw_reg_type, brw_reg*)’:
brw_shader.cpp:618:31: warning: ‘sat_imm.brw_saturate_immediate(brw_reg_type, brw_reg*)::<anonymous union>::ud’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       reg->dw1.ud = sat_imm.ud;
                               ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-09 17:28:39 -07:00
Vinson Lee
282f67becd i915: Fix GCC unused-but-set-variable warning in release build.
i915_fragprog.c: In function ‘i915ValidateFragmentProgram’:
i915_fragprog.c:1453:11: warning: variable ‘k’ set but not used [-Wunused-but-set-variable]
       int k;
           ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-09 17:28:39 -07:00
Vinson Lee
5f759836ad Add macro for unused function attribute.
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-09 17:28:39 -07:00
Ben Widawsky
7aba4ab1f3 meta: Plug memory leak
It looks like this has existed since
commit f5a477ab76
Author: Ian Romanick <ian.d.romanick@intel.com>
Date:   Mon Dec 16 11:54:08 2013 -0800

    meta: Refactor shader generation code out of mipmap generation path

Valgrind was complaining on fbo-generatemipmap-formats

v2: Instead, do the allocation after the early return block

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-09 16:32:33 -07:00
Kenneth Graunke
e95969cd95 i965/fs: Don't issue FB writes for bound but unwritten color targets.
We used to loop over all color attachments, and emit FB writes for each
one, even if the shader didn't write to a corresponding output variable.
Those color attachments would be filled with garbage (undefined values).

Football Manager binds a framebuffer with 4 color attachments, but draws
to it using a shader that only writes to gl_FragData[0..2].  This meant
that color attachment 3 would be filled with garbage, resulting in
rendering artifacts.  Now we skip writing to it, fixing rendering.

Writes to gl_FragColor initialize outputs[0..nr_color_regions-1] to
GRFs, while writes to gl_FragData[i] initialize outputs[i].

Thanks to Jason Ekstrand for tracking this down.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86747
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2015-03-09 16:07:04 -07:00
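A minimal C sketch of the skipping logic described above, assuming a per-target flag that records whether the shader wrote the corresponding output. The names count_fb_writes, output_written, and MAX_COLOR_TARGETS are hypothetical; the real code works on fs_reg outputs rather than booleans.

```c
#include <stdbool.h>

#define MAX_COLOR_TARGETS 8  /* hypothetical limit for this sketch */

/* Count the framebuffer writes to emit: one per bound color target that the
 * shader actually wrote. Assumes nr_color_regions <= MAX_COLOR_TARGETS. */
static unsigned
count_fb_writes(const bool output_written[MAX_COLOR_TARGETS],
                unsigned nr_color_regions)
{
   unsigned writes = 0;

   for (unsigned target = 0; target < nr_color_regions; target++) {
      if (!output_written[target])
         continue; /* bound but unwritten: leave the attachment untouched */
      writes++;
   }

   return writes;
}
```

For the case described above (four attachments, outputs 0..2 written), this yields three writes and leaves attachment 3 alone.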
Kenneth Graunke
4ebeb71573 i965/fs: Make emit_shader_time_end() insert before EOT.
Previously, we emitted the shader-time epilogue from emit_fb_writes(),
during the middle of looping through color regions (or emit_urb_writes
for the VS).  This is duplicated several times and rather awkward.

I need to fix a bug in our FB write handling, and it will be a lot
easier if we move emit_shader_time_end() out of there.

Now, we simply emit FB writes/URB writes, and subsequently have
emit_shader_time_end() insert instructions before the final SEND with
EOT.  Not only is this simpler, it's actually a slight improvement:
we now include the MOVs to set up the final FB write payload in our
shader-time measurements.

Note that INTEL_DEBUG=shader_time only exists on Gen7+, and uses
send-from-GRF.  (In the past, we might have hit trouble where both
attempt to use MRFs for messages; that's not a problem now.)

v2: Rebase on v3 of the previous patch and other shader_time fixes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> [v1]
Acked-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2015-03-09 16:07:04 -07:00
Kenneth Graunke
e43af8d09f i965/fs: Make get_timestamp() pass back the MOV rather than emitting it.
This makes another part of the INTEL_DEBUG=shader_time code emittable
at arbitrary locations, rather than just at the end of the instruction
stream.

v2: Don't lose smear!  Caught by Topi Pohjolainen.
v3: Don't set smear on the destination of the MOV.  Thanks Topi!

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2015-03-09 16:07:04 -07:00
Kenneth Graunke
bea854c7f3 i965/fs: Make emit_shader_time_write return rather than emit.
Instead of emit_shader_time_write, we now do emit(SHADER_TIME_ADD(...)).
The advantage is that we can also insert a shader time write at an
arbitrary location in the instruction stream, rather than being
restricted to emitting at the end.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2015-03-09 16:07:04 -07:00
Kenneth Graunke
f1adc45dbe i965/fs: Set smear on shader_time diff register.
The ADD(diff, diff, fs_reg(-2u)) instruction reads diff, which is a
width 1 register.  We need to read it as <0,1,0> with a subreg of 0,
which is what smear accomplishes.

Fixes assertion:
brw_eu_emit.c:285: validate_reg: Assertion `hstride == 0' failed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2015-03-09 16:07:03 -07:00
Kenneth Graunke
ef9cc7d0c1 i965/fs: Set force_writemask_all on shader_time instructions.
These computations don't have anything to do with the currently
executing channels, so they should use force_writemask_all.

This fixes assert failures.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2015-03-09 16:07:03 -07:00
Alexandre Demers
7a37d5c3a4 r600g: Use R600_MAX_VIEWPORTS instead of 16
Let's define R600_MAX_VIEWPORTS instead of using 16 here and there
in the code when looping through viewports and scissors. It is
easier to understand what this number represents.

v2: Missed a case where R600_MAX_VIEWPORTS should have been used.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-03-09 23:02:05 +01:00
Ian Romanick
85df48b45a i915: Remove unused IS_GEN2 macro
Inspired by Damien's recent libdrm changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-03-09 14:09:21 -07:00
Ian Romanick
07a062997a i915: Remove (mostly) unused IS_915 macro
Inspired by Damien's recent libdrm changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-03-09 14:09:16 -07:00
Ian Romanick
117288dbf3 i915: Remove (mostly) unused IS_PNV, IS_PNVG, and IS_PNVGM macros
Inspired by Damien's recent libdrm changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-03-09 14:09:06 -07:00
Ian Romanick
19fda9fc83 i915: Remove IS_9XX macro
Since the i915 / i965 split, IS_9XX just means IS_GEN3.  Inspired by
Damien's recent libdrm changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-03-09 14:08:57 -07:00
Ian Romanick
6d41316b79 i915: Remove unused IS_MOBILE macro
Inspired by Damien's recent libdrm changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-03-09 14:08:49 -07:00
Ian Romanick
e7d94be1ec i965: Don't write past the end of the application supplied buffer
Both the AMD and Intel APIs provide a dataSize parameter, and this
function would merrily ignore it.  Neither API specifies what to do when
the buffer isn't big enough.  I take the easy route of writing all the
complete bits of data that will fit.  With more complete specs, we could
probably do something different.

I noticed this while looking into an unused parameter warning.  The
warning was actually useful!

brw_performance_monitor.c: In function 'brw_get_perf_monitor_result':
brw_performance_monitor.c:1261:37: warning: unused parameter 'data_size' [-Wunused-parameter]
                             GLsizei data_size,
                                     ^

v2: Fix checks to include offset in the calculation.  Noticed by Jan.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
2015-03-09 14:07:14 -07:00
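The "write all the complete bits of data that will fit" approach described above can be sketched in C as follows. The record layout and the names result_record and write_results are hypothetical and do not reflect the actual brw_get_perf_monitor_result() output format; the point is only that the size check includes the current offset, as the v2 note requires.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: one counter result as written to the app's buffer. */
struct result_record {
   uint32_t group_id;
   uint32_t counter_id;
   uint64_t value;
};

/* Copy as many whole records as fit in data_size bytes.
 * Returns the number of bytes written; a partial record is never written. */
static size_t
write_results(void *data, size_t data_size,
              const struct result_record *results, size_t count)
{
   size_t offset = 0;

   for (size_t i = 0; i < count; i++) {
      /* Include the current offset in the check, not just the record size. */
      if (offset + sizeof(results[i]) > data_size)
         break;

      memcpy((char *)data + offset, &results[i], sizeof(results[i]));
      offset += sizeof(results[i]);
   }

   return offset;
}
```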
Ian Romanick
78a211cee5 i965: Silence unused parameter warning
All dd functions take a gl_context as the first parameter.  Instead of
removing it, just silence the warning.

brw_performance_monitor.c: In function 'brw_new_perf_monitor':
brw_performance_monitor.c:1354:41: warning: unused parameter 'ctx' [-Wunused-parameter]
 brw_new_perf_monitor(struct gl_context *ctx)
                                         ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-03-09 14:07:14 -07:00
Ian Romanick
3a6a732c43 i965: Silence many 'static' is not at beginning of declaration warnings
What a useful warning. #ThanksGCC

brw_performance_monitor.c:153:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
 const static struct gl_perf_monitor_counter gen5_raw_chaps_counters[] = {
 ^
brw_performance_monitor.c:185:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
 const static int gen5_oa_snapshot_layout[] =
 ^
brw_performance_monitor.c:221:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
 const static struct gl_perf_monitor_group gen5_groups[] = {
 ^
brw_performance_monitor.c:240:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
 const static struct gl_perf_monitor_counter gen6_raw_oa_counters[] = {
 ^
brw_performance_monitor.c:281:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
 const static int gen6_oa_snapshot_layout[] =
 ^
brw_performance_monitor.c:317:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
 const static struct gl_perf_monitor_counter gen6_statistics_counters[] = {
 ^
brw_performance_monitor.c:332:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
 const static int gen6_statistics_register_addresses[] = {
 ^
brw_performance_monitor.c:346:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
 const static struct gl_perf_monitor_group gen6_groups[] = {
 ^
brw_performance_monitor.c:356:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
 const static struct gl_perf_monitor_counter gen7_raw_oa_counters[] = {
 ^
brw_performance_monitor.c:402:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
 const static int gen7_oa_snapshot_layout[] =
 ^
brw_performance_monitor.c:470:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
 const static struct gl_perf_monitor_counter gen7_statistics_counters[] = {
 ^
brw_performance_monitor.c:493:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
 const static int gen7_statistics_register_addresses[] = {
 ^
brw_performance_monitor.c:515:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
 const static struct gl_perf_monitor_group gen7_groups[] = {
 ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-03-09 14:07:14 -07:00
Ian Romanick
c82c8b2201 i965/fs: Silence unused parameter warning
I don't think opt_cmod_propagation_local ever used the fs_visitor.

brw_fs_cmod_propagation.cpp:52:40: warning: unused parameter 'v' [-Wunused-parameter]
 opt_cmod_propagation_local(fs_visitor *v, bblock_t *block)
                                        ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-03-09 14:07:14 -07:00
Ian Romanick
f9779e4a8f i965/fs: Silence unused parameter warning
Unused since b18fd23.

brw_fs.cpp:2878:44: warning: unused parameter 'dispatch_width' [-Wunused-parameter]
 clear_deps_for_inst_src(fs_inst *inst, int dispatch_width, bool *deps,
                                            ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-03-09 14:07:13 -07:00
Ian Romanick
e4f26acc08 i965/fs: Silence unused parameter warning
brw_fs_visitor.cpp:2162:56: warning: unused parameter 'offset_components' [-Wunused-parameter]
                          fs_reg offset_value, unsigned offset_components,
                                                        ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-03-09 14:07:13 -07:00
Laura Ekstrand
1e552db522 main: Add entry point for TextureBufferRange.
v2: Review by Martin Peres
   - Get rid of difficult-to-follow code copied and pasted from
     the original TexBufferRange

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-09 13:33:54 -07:00
Laura Ekstrand
311b3686fe main: Add check_texture_buffer_target.
Creates a shared function to ensure that texture buffer target is
GL_TEXTURE_BUFFER. Helps to clean up the Tex[ture]Buffer[Range] functions.

v2: Review from Anuj Phogat
   - Split rebase of Tex[ture]Buffer[Range]

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-09 13:33:54 -07:00
Laura Ekstrand
5f8c6eabbe main: Add check_texture_buffer_range.
Creates a shared function that TexBufferRange and TextureBufferRange can use
to check the buffer range. This cleans up TexBufferRange considerably.

v2: Review from Anuj Phogat
   - Split rebase of Tex[ture]Buffer[Range]

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-09 13:33:54 -07:00
Laura Ekstrand
0f6372946b main: Cosmetic changes for Texture Buffers.
Adds a useful comment and some whitespace. Fixes an error message.

v2: Review from Anuj Phogat
   - Split rebase of Tex[ture]Buffer[Range]

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-09 13:33:54 -07:00
Laura Ekstrand
6b78a1fb89 main: Refactor _mesa_texture_buffer_range.
Changes how the caller is identified in error messages, moves a check for
ARB_texture_buffer_object from the entry points to the shared code in
_mesa_texture_buffer_range, and removes an unused argument (GLenum target).

v2: Review from Anuj Phogat
   - Split rebase of Tex[ture]Buffer[Range]

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-09 13:33:54 -07:00
Laura Ekstrand
d03337306a main: Use _mesa_lookup_bufferobj_err to simplify Tex[ture]Buffer[Range].
v2: Review from Anuj Phogat
   - Split rebase of Tex[ture]Buffer[Range]
   - Closing curly brace on the same line as else

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-09 13:33:54 -07:00
Laura Ekstrand
768ca8b83e main: Add utility function _mesa_lookup_bufferobj_err.
This function is exposed to mesa driver internals so that texture buffer
objects and array objects can use it.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-09 13:33:53 -07:00
Laura Ekstrand
ff011340a4 main: Checking for cube completeness in GetCompressedTextureImage.
v2: Review from Anuj Phogat
   - Remove redundant copies of the cube map block comment
   - Replace redundant "if (!texImage) return;" statements with
     assert(texImage)

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-09 13:33:53 -07:00
Laura Ekstrand
4080c330fa main: Add TEXTURE_CUBE_MAP support for glCompressedTextureSubImage3D.
v2: Review from Anuj Phogat
   - Remove redundant copies of the cube map block comment
   - Replace redundant "if (!texImage) return;" statements with
     assert(texImage)

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-09 13:33:53 -07:00
Laura Ekstrand
70eab80f80 main: assert(texImage) in ARB_DSA texture cube map functions.
ARB_direct_state_access functions that deal with texture cube
maps need to make sure that texture images are not NULL before operating on
them. In the following cases, the error check functions already throw an
error if texImage == NULL, so an assert can be raised instead.

v2: Review from Anuj Phogat
   - Replace redundant "if (!texImage) return;" statements with
     assert(texImage)

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-09 13:33:53 -07:00
Laura Ekstrand
c3e92faeb4 main: Remove redundant copy of cube map block comment in GetTextureImage.
The comment describing why ARB_direct_state_access texture cube map functions
use _mesa_cube_level_complete is very long.  To save room in the files,
readers are now referred to one central comment on texturesubimage in
teximage.c.

v2: Review from Anuj Phogat
   - Remove redundant copies of the cube map block comment

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-09 13:33:53 -07:00
Laura Ekstrand
8979368f12 main: Remove redundant NumLayers checks.
ARB_direct_state_access texture functions that operate on cube maps no longer
need to verify that cube map texture objects contain six texture images
because _mesa_cube_level_complete now does that for them.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-09 13:33:53 -07:00
Laura Ekstrand
1ee000a0b6 main: _mesa_cube_level_complete checks NumLayers.
_mesa_cube_level_complete now verifies that a cube map texture object actually
has six texture images before proceeding.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-09 13:33:53 -07:00
Marek Olšák
c939231e72 r300g: fix sRGB->sRGB blits
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
2015-03-09 21:22:22 +01:00
Marek Olšák
9953586af2 r300g: fix a crash when resolving into an sRGB texture
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
2015-03-09 21:03:49 +01:00
Marek Olšák
113601086d r300g: use memset for clearing the shader key 2015-03-09 20:58:32 +01:00
Marek Olšák
4815c187b7 r300g: remove the broken SNORM->UNORM shader lowering pass
Not used anymore.
2015-03-09 20:58:32 +01:00
Marek Olšák
74a757f92f r300g: fix RGTC1 and LATC1 SNORM formats
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
2015-03-09 20:58:32 +01:00
Stefan Dösinger
f710b99071 r300g: Fix the ATI1N swizzle (RGTC1 and LATC1)
This fixes the GL_COMPRESSED_RED_RGTC1 part of piglit's rgtc-teximage-01
test as well as the precision part of Wine's 3dc format test (fd.o bug
89156).

The Z component seems to contain a lower precision version of the
result, probably a temporary value from the decompression computation.
The Y and W component contain different data that depends on the input
values as well, but I could not make sense of them (Not that I tried
very hard).

GL_COMPRESSED_SIGNED_RED_RGTC1 still seems to have precision problems in
piglit, and both formats are affected by a compiler bug if they're
sampled by the shader with a swizzle other than .xyzw. Wine uses .xxxx,
which returns random garbage.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89156
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
2015-03-09 20:58:32 +01:00
Tom Stellard
51b43c559f radeonsi: Add additional information to shader dumps
This adds SGPR count, VGPR count, shader size, LDS size, and scratch
usage to shader dumps.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-03-09 13:53:33 +00:00
Tom Stellard
bbfa1c3239 radeonsi/compute: Use value from compiler for COMPUTE_PGM_RSRC1.FLOAT_MODE
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-03-09 13:53:33 +00:00
Tom Stellard
a646b00cfc clover: Return the minimum required value for CL_DEVICE_SINGLE_FP_CONFIG v2
This means dropping CL_FP_DENORM from the current return value.

v2:
  - Add comments about minimum values for OpenCL 1.2.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
2015-03-09 13:53:33 +00:00
Ilia Mirkin
cb3eb43ad6 freedreno/ir3: get the # of miplevels from getinfo
This fixes ARB_texture_query_levels to actually return the desired
value.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-03-09 10:50:39 -04:00
Ilia Mirkin
8ac957a51c freedreno/ir3: fix array count returned by TXQ
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-03-09 10:50:39 -04:00
Ilia Mirkin
f3dfe6513c freedreno: move fb state copy after checking for size change
Fixes: 1f3ca56b ("freedreno: use util_copy_framebuffer_state()")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-03-09 10:50:39 -04:00
Kenneth Graunke
b9c2fa15e3 nir: Make the printer include nir_variable::location too.
Being able to see both location and driver_location can be useful when
debugging IO mistakes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-03-09 01:34:03 -07:00
Iago Toral Quiroga
a72fb69604 i965/fs: Implement SIMD16 dual source blending.
From the SNB PRM, volume 4, part 1, page 193:

"The dual source render target messages only have SIMD8 forms due to
 maximum message length limitations. SIMD16 pixel shaders must send two of
 these messages to cover all of the pixels. Each message contains two colors
 (4 channels each) for each pixel in the message payload."

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82831
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-09 08:15:13 +01:00
Kenneth Graunke
8dcc1f2c10 nir: Only do gl_FrontFacing workaround in glsl_to_nir for the FS.
Vertex shaders can have shader inputs where location happens to be
VARYING_SLOT_FACE.  Without predicating this on the shader stage,
we suddenly end up with load_front_face intrinsics in vertex shaders,
which is nonsensical.
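
A minimal sketch of the predicate, assuming hypothetical enum names and a
made-up slot value (this is not the glsl_to_nir code itself):

```c
/* Hypothetical, simplified stand-ins for the real enums. */
enum shader_stage { STAGE_VERTEX, STAGE_FRAGMENT };
enum { SLOT_FACE = 12 };   /* illustrative value only */
enum input_kind { INPUT_LOAD_VARYING, INPUT_LOAD_FRONT_FACE };

/* Only a fragment shader input at the face slot is gl_FrontFacing.
 * A vertex shader input whose location happens to collide with that
 * slot is an ordinary input. */
static enum input_kind classify_input(enum shader_stage stage, int location)
{
    if (stage == STAGE_FRAGMENT && location == SLOT_FACE)
        return INPUT_LOAD_FRONT_FACE;
    return INPUT_LOAD_VARYING;
}
```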

Fixes spec/arb_vertex_buffer_object/pos-array when using NIR for VS.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-08 20:04:02 -07:00
Kenneth Graunke
c6f2abe67e nir: Plumb the shader stage into glsl_to_nir().
The next commit needs to know the shader stage in glsl_to_nir().
To facilitate that, we pass the gl_shader rather than the raw exec_list
of instructions.  This has both the exec_list and the stage.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-08 20:04:01 -07:00
Kenneth Graunke
b200cbb0a4 nir: Add native_integers to nir_shader_compiler_options.
glsl_to_nir, tgsi_to_nir, and prog_to_nir all want to know whether the
driver supports native integers.  Presumably other passes may as well.

Adding this to nir_shader_compiler_options is an easy way to provide
that information, as it's accessible via nir_shader::options.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-08 20:03:57 -07:00
Kenneth Graunke
a55da73be4 nir: Try to make sense of the nir_shader_compiler_options code.
The code in glsl_to_nir is entirely dead, as we translate from GLSL to
NIR at link time, when there isn't a _mesa_glsl_parse_state to pass,
so every caller passes NULL.

glsl_to_nir seems like the wrong place to try and create the shader
compiler options structure anyway - tgsi_to_nir, prog_to_nir, and other
translators all would have to duplicate that code.  The driver should
set this up once with whatever settings it wants, and pass it in.

Eric also added a NirOptions field to ctx->Const.ShaderCompilerOptions[]
and left a comment saying: "The memory for the options is expected to be
kept in a single static copy by the driver."  This suggests the plan was
to do exactly that.  That pointer was not marked const, however, and the
dead code used a mix of static structures and ralloced ones.

This patch deletes the dead code in glsl_to_nir, instead making it take
the shader compiler options as a mandatory argument.  It creates an
(empty) options struct in the i965 driver, and makes NirOptions point
to that.  It marks the pointer const so that we can actually do so
without generating "discards const qualifier" compiler warnings.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-03-08 20:03:46 -07:00
Kenneth Graunke
2561aea6b3 nir: Delete nir_shader::user_structures and num_user_structures.
Nothing actually uses these, and the only caller of glsl_to_nir()
(brw_fs_nir.cpp) always passes NULL for the _mesa_glsl_parse_state
pointer, meaning they'll always be NULL and 0, respectively.

Just delete them.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-08 20:03:44 -07:00
Kenneth Graunke
9f1e250e77 glsl: Mark array access when copying to a temporary for the ?: operator.
Piglit's spec/glsl-1.20/compiler/structure-and-array-operations/
array-selection.vert test contains the following code:

   gl_Position = (pick_from_a_or_b ? a : b)[i];

where "a" and "b" are uniform vec4[2] variables.

ast_to_hir creates a temporary vec4[2] variable, conditional_tmp, and
generates an if-block to copy one or the other:

   (declare (temporary) (array vec4 2) conditional_tmp)
   (if (var_ref pick_from_a_or_b)
     ((assign () (var_ref conditional_tmp) (var_ref a)))
     ((assign () (var_ref conditional_tmp) (var_ref b))))

However, we failed to update max_array_access for "a" and "b", so it
remained 0 - here, the whole array is being accessed.  At link time,
update_array_sizes() used this bogus information to change the types
of "a" and "b" to vec4[1].  We then had assignments from a vec4[1] to
a vec4[2], which is highly illegal.
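
The failure mode can be sketched in plain C, assuming a simplified
variable record (struct var and both helpers are hypothetical, not the
ir_variable API):

```c
/* Hypothetical, simplified view of an array variable. */
struct var {
    unsigned array_length;       /* declared size, e.g. 2 for vec4[2] */
    unsigned max_array_access;   /* highest index known to be accessed */
};

/* A whole-array copy touches every element, so the highest accessed
 * index must be raised to the last element. */
static void mark_whole_array_access(struct var *v)
{
    if (v->array_length > 0 && v->max_array_access < v->array_length - 1)
        v->max_array_access = v->array_length - 1;
}

/* What a link-time resize would conclude from the recorded accesses. */
static unsigned shrunk_length(const struct var *v)
{
    return v->max_array_access + 1;
}
```

Without the marking step, a vec4[2] whose recorded maximum access is
still 0 would be resized to length 1, which is the mismatch described
above.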

This tripped assertions in nir_split_var_copies with scalar VS.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2015-03-08 20:03:36 -07:00
Kenneth Graunke
a84f66a9b6 i965/nir: Resolve source modifiers on Gen8+ logic operations.
On Gen8+, AND/OR/XOR/NOT don't support the abs() source modifier, and
negate changes meaning to bitwise-not (~, not -).  This isn't what NIR
expects, so we should resolve the source modifiers via a MOV.
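
To illustrate the difference in plain C (a sketch, not the i965 code; the
function names are hypothetical):

```c
#include <stdint.h>

/* What NIR means by a negate modifier on an integer source:
 * arithmetic negation applied before the AND. */
static inline uint32_t and_with_arith_negate(uint32_t a, uint32_t b)
{
    return (0u - a) & b;
}

/* What the negate bit means on Gen8+ logic operations: bitwise NOT. */
static inline uint32_t and_with_bitwise_not(uint32_t a, uint32_t b)
{
    return ~a & b;
}

/* Resolving the modifier first, as a separate MOV, keeps NIR's meaning. */
static inline uint32_t and_with_resolved_negate(uint32_t a, uint32_t b)
{
    uint32_t tmp = 0u - a;   /* MOV tmp, -a */
    return tmp & b;          /* AND without any source modifier */
}
```

For a = 1 and b = 0xff the first and third forms give 0xff, while the
bitwise-not interpretation gives 0xfe.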

+30 Piglits (fs-op-bit{and,or,xor}-not-abs-*).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-03-08 20:03:35 -07:00
Dave Airlie
7c25a4a84d st/mesa: drop unused texture function
This has no users.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-09 10:43:27 +10:00
Dave Airlie
c5e69409d7 mesa/st: remove unused TexData
This isn't hooked up to anything at all from what I can see.

Seems like a leftover from commit 5d67d4fbebb (st/mesa: remove
st_TexImage(), use core Mesa code instead).

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-09 09:48:49 +10:00
Rob Clark
fd17db6fe5 freedreno: replace glsl130 debug flag with glsl120
Now that relative-dst works, we should never fall back to the old
compiler.  (Which is almost true, other than a couple edge case sched
fails in piglit).

So replace glsl130 flag to force GLSL 130 and integers on a3xx/a4xx with
a glsl120 flag to force GLSL 120 and !integers.

If this commit breaks any game/app/etc use FD_MESA_DEBUG=glsl120 as a
workaround and please let me know.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-08 17:42:43 -04:00
Rob Clark
0e8d58b80a gallium/docs: add some freedreno compiler docs
Enable the 'sphinx.ext.graphviz' extension, and add in a section for
driver specific docs, with freedreno compiler docs beneath.  The
goal is for more complete compiler docs, and hopefully some docs about
other parts of the driver (such as how tiling works, etc).

Note that there is also a Distribution -> Drivers section, although
that appears to be just a list of drivers.  Not sure if that
should move under the 'Drivers' section or be left alone.  I did add a
one-line section for freedreno in the existing Distribution -> Drivers
section.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-08 17:42:43 -04:00
Rob Clark
060d349920 freedreno/ir3: relative dst
To simplify RA, assign arrays that are written to first.  Enough
dependency information is in the graph to preserve the order of reads and
writes of the array, so all SSA names for the array collapse into one and
we can just assign the entire thing by array-id.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-08 17:42:43 -04:00
Rob Clark
b7703212d8 freedreno/ir3: split out array_fanin() helper
We'll need this too for relative dst..

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-08 17:42:43 -04:00
Rob Clark
17754b70d7 freedreno/ir3: drop deref nodes
The meta-deref instruction doesn't really do what we need for relative
destination.  Instead, since each instruction can reference at most a
single address value, track the dependency on the address register via
instr->address.  This lets us express the dependency regardless of
whether it is used for dst and/or src.

The foreach_ssa_src{_n} iterator macros now also iterate the address
register so, at least in SSA form, the address register behaves as an
additional virtual src to the instruction.  Which is pretty much what
we want, as far as scheduling/etc.

TODO:
For now, the foreach_src{_n} iterators are unchanged.  We could wrap
the address in an ir3_register and make the foreach_src{_n} iterators
behave the same way.  But that seems unnecessary at this point, since
we mainly care about the address dependency when in SSA form.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-08 17:42:43 -04:00
Rob Clark
f8f7548f46 freedreno/ir3: helpful iterator macros
I remembered that we are using c99.. which makes some sugary iterator
macros easier.  So introduce iterator macros to iterate all src
registers and all SSA src instructions.  The _n variants also return
the src #, since there are a handful of places that need this.
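
For readers unfamiliar with the idiom, here is a sketch of what such
macros can look like.  The struct layout and macro bodies are simplified
assumptions for illustration, not the actual ir3 definitions:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the ir3 types. */
struct reg {
    int num;
};

struct instr {
    unsigned regs_count;      /* regs[0] is the dst, the rest are srcs */
    struct reg *regs[8];
};

/* Iterate over all src registers, also exposing the src index as n_.
 * Declaring the loop variable inside for (...) is what needs C99. */
#define foreach_src_n(srcreg_, n_, instr_)                       \
    for (unsigned n_ = 0;                                        \
         (n_ + 1) < (instr_)->regs_count &&                      \
             ((srcreg_) = (instr_)->regs[n_ + 1], 1);            \
         n_++)

/* Variant for callers that do not need the index. */
#define foreach_src(srcreg_, instr_) \
    foreach_src_n(srcreg_, i_, instr_)

/* Example user: sum the register numbers of all srcs. */
static inline int sum_src_nums(struct instr *instr)
{
    struct reg *src;
    int sum = 0;

    foreach_src(src, instr)
        sum += src->num;

    return sum;
}
```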

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-08 17:42:43 -04:00
Rob Clark
26b79ac3e4 freedreno/ir3: fix register usage calculations
For cat1 instructions, use reg() as well for relative src, to ensure
proper accounting of register usage.  Also, for relative instructions,
use reg->size rather than reg->wrmask to determine the number of
components read/written.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-08 17:42:43 -04:00
Rob Clark
3ecc834e75 freedreno/ir3: couple tweaks for cmdline compiler
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-08 17:42:43 -04:00
Rob Clark
0f797f7b7d freedreno/ir3: split up ssa_dst
And a couple other trivial renames, to prepare for relative dst.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-08 17:42:43 -04:00
Rob Clark
27648efa20 freedreno/ir3: fix failed assert in grouping
Turns out there are scenarios where we need to insert mov's in "front"
of an input.  Triggered by shaders like:

  VERT
  DCL IN[0]
  DCL IN[1]
  DCL OUT[0], POSITION
  DCL OUT[1], GENERIC[9]
  DCL SAMP[0]
  DCL TEMP[0], LOCAL
    0: MOV TEMP[0].xy, IN[1].xyyy
    1: MOV TEMP[0].w, IN[1].wwww
    2: TXF TEMP[0], TEMP[0], SAMP[0], 1D_ARRAY
    3: MOV OUT[1], TEMP[0]
    4: MOV OUT[0], IN[0]
    5: END

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-08 17:42:43 -04:00
Jon TURNEY
72d4f6c67f c99_alloca.h: Also use <alloca.h> for cygwin
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-07 18:18:32 +00:00
Vinson Lee
1ca39ec03c i915: Fix GCC unused-variable warning in release build.
i915_debug_fp.c: In function ‘i915_disassemble_program’:
i915_debug_fp.c:302:11: warning: unused variable ‘size’ [-Wunused-variable]
    GLuint size = program[0] & 0x1ff;
           ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
2015-03-06 21:41:46 -08:00
Mark Janes
b28c037d64 r300g: Fix build, invalid extern "C" around header inclusion.
A previous patch to fix header inclusion within extern "C" neglected
to fix the occurrences of this pattern in r300 files.

When the helper to detect this issue was pushed to master, it broke
the build for the r300 driver.  This patch fixes the r300 build.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89477
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-06 22:08:44 -05:00
Mark Janes
c4b91a1f5c nouveau: Fix build, invalid extern "C" around header inclusion.
A previous patch to fix header inclusion within extern "C" neglected
to fix the occurrences of this pattern in nouveau files.

When the helper to detect this issue was pushed to master, it broke
the build for the nouveau driver.  This patch fixes the nouveau build.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89477
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-06 22:08:11 -05:00
Ilia Mirkin
20346808cf nv50,nvc0: remove bogus 64_FLOAT formats
There is no HW support for these and the VBO pusher doesn't know about
them. No need to, either, since the st will be lowering them to 2x32.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-06 22:06:05 -05:00
Emil Velikov
1e5f833a0d docs: add news item and link release notes for mesa 10.4.6/10.5.0
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-07 00:33:06 +00:00
Emil Velikov
ac9679b1c5 docs: Add sha256 sums for the 10.5.0 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 0d3e4ed134)
2015-03-07 00:25:05 +00:00
Emil Velikov
b48774e7d8 docs: Update 10.5.0 release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 97357d475f)
2015-03-07 00:25:01 +00:00
Emil Velikov
19c5bee101 docs: Add sha256 sums for the 10.4.6 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit fc9dd495b2)
2015-03-07 00:24:57 +00:00
Emil Velikov
9fe27c7b99 Add release notes for the 10.4.6 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 542a754524)
2015-03-07 00:24:54 +00:00
Chia-I Wu
bca6c8572f ilo: clarify valid and preferred tilings
We did it right until the switch to gen_surface_tiling, which has
GEN8_TILING_W.  Generally, GEN8_TILING_W may be valid but not preferred.
2015-03-07 04:32:39 +08:00
Chia-I Wu
bf061a3d2e ilo: clean up Gen6 WAs
Add a helper function for each WA and make PIPE_CONTROL flags match the WA
descriptions.  Call gen6_wa_pre_pipe_control() only before PIPE_CONTROLs.
Fix missing gen6_wa_pre_3dstate_vs_toggle() in the rectlist path.
2015-03-07 02:17:54 +08:00
Chia-I Wu
ba5670fc50 ilo: add generic ilo_render_3dprimitive()
It replaces gen[6-8]_3dprimitive().
2015-03-07 01:45:52 +08:00
Chia-I Wu
8b2eecfbf8 ilo: add generic ilo_render_pipe_control()
It replaces gen[6-8]_pipe_control() and a direct gen6_PIPE_CONTROL() call in
ilo_render_emit_flush().
2015-03-07 01:40:23 +08:00
Chia-I Wu
35b713ad75 ilo: fix padding of linear sampler views
Should use the temporary variable in the loop instead of layout->bo_height.
2015-03-07 01:38:35 +08:00
Chia-I Wu
dda4823844 ilo: do not check for interleaved_samples
interleaved_samples is only zero-initialized when layout_want_mcs() is called.
We should not check for it.  There is also no need to.
2015-03-07 01:38:35 +08:00
Emil Velikov
56ede80940 Revert "egl/main: use c11/threads' mutex directly"
This reverts commit 6cee785c69.

Not meant to go in yet. Lacking review.
2015-03-06 17:07:40 +00:00
Emil Velikov
eb14d28e6d Revert "egl/main: convert thread management to use c11 threads"
This reverts commit 33eff85336.

Not meant to go in yet. Lacking review.
2015-03-06 17:07:34 +00:00
Emil Velikov
3b1d69910d Revert "configure: require pthreads for POSIX builds"
This reverts commit 50714cec2b.

Not meant to go in yet. Lacking review.
2015-03-06 17:07:29 +00:00
Emil Velikov
8f2eaae10c Revert "glx: remove final reference to THREADS"
This reverts commit 8b15a883e0.

Not meant to go in yet. Lacking review.
2015-03-06 17:07:23 +00:00
Emil Velikov
5e3276f5c7 Revert "glx: remove support for non-multithreaded platforms"
This reverts commit 38591295cd.

Not meant to go in yet. Lacking review.
2015-03-06 17:07:11 +00:00
Emil Velikov
1c1fd82b4b glx: remove unneeded ifdef _WIN32 guard
The C99 header exists on other platforms as well.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-06 16:49:03 +00:00
Emil Velikov
3f16751639 util: rework _MSC_VER >= 1200 checks
Replace the _MSC_VER >= 1200 with defined (_MSC_VER) and compact if/else
statements. We require MSVC 2008 or later with commit 46110c5d564.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-06 16:48:50 +00:00
Emil Velikov
38591295cd glx: remove support for non-multithreaded platforms
Implicitly required for a while, although commit 9385c592c6 (mapi:
remove u_thread.h) was the one that put the final nail on the
coffin.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 16:46:18 +00:00
Emil Velikov
8b15a883e0 glx: remove final reference to THREADS
Left over from commit 18db13f5865 (mapi: THREADS was always defined,
remove it)

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 16:46:18 +00:00
Emil Velikov
50714cec2b configure: require pthreads for POSIX builds
This has been an implicit rule for building mesa for a long time. Let's
make it official and just bail out at configure time. This way we can
clean up some of our glx code.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 16:46:17 +00:00
Emil Velikov
33eff85336 egl/main: convert thread management to use c11 threads
Convert the code to use the C11 threads implementation, and nuke the
Windows non-pthreads code-path. The c11/threads_win32.h abstraction
should be better than the current code.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 16:46:17 +00:00
Emil Velikov
6cee785c69 egl/main: use c11/threads' mutex directly
Remove the inline wrappers/abstraction layer.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 16:46:17 +00:00
José Fonseca
bfb4db83b6 include: Add helper header to help trap includes inside extern C.
This is just to help reproduce and fix these issues with any C++ compiler.

Committing this will of course wait until all issues are addressed.

$ scons src/glsl/
scons: Reading SConscript files ...
Checking for GCC ...  yes
Checking for Clang ...  no
Checking for X11 (x11 xext xdamage xfixes glproto >= 1.4.13)... yes
Checking for XCB (x11-xcb xcb-glx >= 1.8.1 xcb-dri2 >= 1.8)... yes
Checking for XF86VIDMODE (xxf86vm)... yes
Checking for DRM (libdrm >= 2.4.38)... yes
Checking for UDEV (libudev >= 151)... yes
warning: LLVM disabled: not building llvmpipe
scons: done reading SConscript files.
scons: Building targets ...
scons: building associated VariantDir targets: build/linux-x86_64-debug/glsl
  Compiling src/glsl/ast_array_index.cpp ...
  Compiling src/glsl/ast_expr.cpp ...
  Compiling src/glsl/ast_function.cpp ...
  Compiling src/glsl/ast_to_hir.cpp ...
  Compiling src/glsl/ast_type.cpp ...
  Compiling src/glsl/builtin_functions.cpp ...
In file included from include/c99_compat.h:28:0,
                 from src/mapi/u_compiler.h:4,
                 from src/mapi/u_thread.h:47,
                 from src/mapi/glapi/glapi.h:47,
                 from src/mesa/main/mtypes.h:42,
                 from src/mesa/main/errors.h:47,
                 from src/mesa/main/imports.h:41,
                 from src/mesa/main/core.h:44,
                 from src/glsl/builtin_functions.cpp:58:
include/no_extern_c.h:48:1: error: template with C linkage
 template<class T> class _IncludeInsideExternCNotPortable;
 ^
In file included from include/c99_compat.h:28:0,
                 from include/c11/threads.h:38,
                 from src/mapi/u_thread.h:49,
                 from src/mapi/glapi/glapi.h:47,
                 from src/mesa/main/mtypes.h:42,
                 from src/mesa/main/errors.h:47,
                 from src/mesa/main/imports.h:41,
                 from src/mesa/main/core.h:44,
                 from src/glsl/builtin_functions.cpp:58:
include/no_extern_c.h:48:1: error: template with C linkage
 template<class T> class _IncludeInsideExternCNotPortable;
 ^
  Compiling src/glsl/builtin_types.cpp ...
  Compiling src/glsl/builtin_variables.cpp ...
scons: *** [build/linux-x86_64-debug/glsl/builtin_functions.os] Error 1
scons: building terminated because of errors.

Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2015-03-06 12:38:55 +00:00
Iago Toral Quiroga
7f10e1678e i965: free scratch buffers when destroying the context
If scratch space is needed for a shader stage we try to reuse the last scratch
buffer bound to that stage. If we can't, we free the old scratch buffer and
allocate a new one. This means we always keep the last scratch buffer for a
particular shader stage around for the entire life span of the context.

These buffers are being reported by Valgrind as definitely lost after
destroying the OpenGL context. For example, for the geometry shader stage:

==18350== 248 bytes in 1 blocks are definitely lost in loss record 85 of 150
==18350==    at 0x4C2CC70: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==18350==    by 0xA1B35D6: drm_intel_gem_bo_alloc_internal (intel_bufmgr_gem.c:724)
==18350==    by 0xA1B383F: drm_intel_gem_bo_alloc (intel_bufmgr_gem.c:794)
==18350==    by 0xA1AEFA3: drm_intel_bo_alloc (intel_bufmgr.c:52)
==18350==    by 0x9D08E31: brw_get_scratch_bo (brw_program.c:226)
==18350==    by 0x9D2A0F2: do_gs_prog (brw_vec4_gs.c:280)
==18350==    by 0x9D2A635: brw_gs_precompile (brw_vec4_gs.c:401)
==18350==    by 0x9D14F68: brw_shader_precompile(gl_context*, gl_shader_program*) (brw_shader.cpp:76)
==18350==    by 0x9D157B8: brw_link_shader (brw_shader.cpp:269)
==18350==    by 0x9B0941E: _mesa_glsl_link_shader (ir_to_mesa.cpp:3038)
==18350==    by 0x99AE4ED: link_program (shaderapi.c:917)
==18350==    by 0x99AF365: _mesa_LinkProgram (shaderapi.c:1385)

So make sure that by the time we destroy the context we check if we have live
scratch buffers for the various stages and release them if that is the case.
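
The lifetime described above can be sketched as follows.  All names are
hypothetical, plain malloc/free stands in for the buffer manager, and the
"big enough" reuse condition is an assumption made for the example:

```c
#include <stdlib.h>

enum { STAGE_VS, STAGE_GS, STAGE_FS, STAGE_COUNT };

/* Hypothetical stand-in for a scratch buffer object. */
struct scratch_bo {
    size_t size;
};

struct context {
    struct scratch_bo *scratch[STAGE_COUNT];
};

/* Reuse the stage's last scratch buffer if it is large enough (assumed
 * condition); otherwise free it and allocate a new one.  Either way the
 * context keeps holding the most recent buffer for that stage. */
static struct scratch_bo *get_scratch_bo(struct context *ctx, int stage,
                                         size_t size)
{
    struct scratch_bo *old = ctx->scratch[stage];

    if (old && old->size >= size)
        return old;

    free(old);
    struct scratch_bo *bo = calloc(1, sizeof(*bo));
    if (bo)
        bo->size = size;
    ctx->scratch[stage] = bo;
    return bo;
}

/* The missing step: release whatever is still held at destroy time. */
static void context_destroy(struct context *ctx)
{
    for (int i = 0; i < STAGE_COUNT; i++) {
        free(ctx->scratch[i]);
        ctx->scratch[i] = NULL;
    }
}
```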

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-03-06 13:13:24 +01:00
Ville Syrjälä
970dc23603 i965: Fix URB size for CHV
Increase the device info .urb.size for CHV to match the default URB
size (192kB).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2015-03-06 11:50:49 +02:00
Samuel Iglesias Gonsalvez
ced9425327 configure: Introduce new output variable to ax_check_python_mako_module.m4
This output variable gives more flexibility for future changes
in autoconf to detect if it is needed to auto-generate files and
check for the auto-generation dependencies.

It still returns an error when Python is not installed.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org>
2015-03-06 09:39:41 +01:00
Andrey Sudnik
0dfec59a27 i965/vec4: Don't lose the saturate modifier in copy propagation.
Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89224
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-05 15:47:19 -08:00
Matt Turner
78df9d5e30 i965/vec4: Handle saturate in dump_instruction().
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-05 15:47:18 -08:00
Chia-I Wu
ebad062e9a ilo: enable L3 cache in MOCS
This enables L3 cache in MOCS almost everywhere.
2015-03-06 04:50:19 +08:00
Chia-I Wu
c7d17f8a80 ilo: track if a ilo_view_surface is a scanout
Scanouts require a different cache type.
2015-03-06 04:43:20 +08:00
Chia-I Wu
e7c74ef43d ilo: clean up SURFACE_STATE and BINDING_TABLE_STATE
Add ilo_builder_surface_pointer() to replace ilo_builder_surface_write().
Make Gen8+ take a different path in gen6_SURFACE_STATE().
2015-03-06 04:43:20 +08:00
Brian Paul
8b2c845ea0 mapi: actually remove unused u_thread.h
I thought this was in the previous commit in the series.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-05 13:39:22 -07:00
Rob Clark
60096ed906 freedreno/ir3: fix silly typo for binning pass shaders
Was resulting in gl_PointSize write being optimized out, causing
particle system type shaders to hang if hw binning enabled.

Fixes neverball, OGLES2ParticleSystem, etc.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-05 15:36:47 -05:00
Timothy Arceri
1a96d9ef1c glsl: let interface linking code validate its arrays
Currently intrastage arrays are validated twice for interface blocks.

Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2015-03-06 07:26:57 +11:00
Timothy Arceri
c5a56a63f9 glsl: use common intrastage array validation
Use common intrastage array validation for interface blocks.

This change also allows us to support interface blocks
that are arrays of arrays.

V2: Reinsert unsized array asserts in interstage_match()

Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2015-03-06 07:26:50 +11:00
Timothy Arceri
50859c688c glsl: move array validation into its own function
V2: return true when var->type is unsized but max access is within valid range

Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2015-03-06 07:26:41 +11:00
Kenneth Graunke
aa0705c06c i965: Split Gen4-5 BlitFramebuffer code; prefer BLT over Meta.
A while back I switched intel_blit_framebuffer to prefer Meta over the
BLT.  This meant that Gen8 platforms would start using the 3D engine
for blits, just like we do on Gen6-7.5.

However, I hadn't considered Gen4-5 when making that change.  The BLT
engine appears to be substantially faster on 965GM than using Meta to
drive the 3D engine.  This isn't too surprising: original Gen4 doesn't
support tile offsets (that came on G45), and the level/layer fields
don't work for cubemap rendering, so for inconvenient miplevel
alignments, we end up blitting or copying data to/from temporaries
in order to render to it.  We may as well just use the blitter.

I chose to use the BLT on Gen4-5 because they use the same ring for
both 3D and BLT; Gen6+ splits it out.

Fixes regressions on 965GM due to botched tile offset code (we should
fix those properly as well, but they're longstanding bugs - for now,
put things back to the status quo).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89430
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
2015-03-05 10:36:03 -08:00
Chia-I Wu
4ddd981e40 ilo: add more convenient intel_bo_{ref,unref}()
They both check for NULL and intel_bo_ref() returns the referenced bo.  They
replace intel_bo_{reference,unreference}().
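
A sketch of the shape of such helpers, assuming a simple refcounted
struct (struct bo, bo_ref() and bo_unref() are illustrative names, not
the ilo winsys API):

```c
#include <stddef.h>

/* Hypothetical refcounted buffer object. */
struct bo {
    int refcount;
    int destroyed;   /* set instead of freeing, to keep the sketch simple */
};

/* NULL-safe; returns its argument so it can be used in an assignment. */
static inline struct bo *bo_ref(struct bo *bo)
{
    if (bo)
        bo->refcount++;
    return bo;
}

/* NULL-safe; the last unref destroys the object. */
static inline void bo_unref(struct bo *bo)
{
    if (bo && --bo->refcount == 0)
        bo->destroyed = 1;   /* a real implementation would free here */
}
```

Returning the bo allows call sites of the form dst = bo_ref(src) without
a separate NULL check.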
2015-03-06 02:25:03 +08:00
Chia-I Wu
70ef171e91 ilo: add intel_bo_set_tiling()
Make intel_winsys_alloc_bo() always allocate a linear bo, and add
intel_bo_set_tiling() to set the tiling.  Document the purpose of tiling.
2015-03-06 02:25:03 +08:00
Chia-I Wu
0ac706535a ilo: replace intel_tiling_mode by gen_surface_tiling
The former is used by the kernel driver to set up fence registers and to pass
tiling info across processes.  It lacks INTEL_TILING_W, which made our code
less expressive.
2015-03-06 02:25:03 +08:00
Chia-I Wu
eb32ac1956 ilo: update genhw headers
The main change is non-inline <enum>s are now generated as C enums.
2015-03-06 02:25:03 +08:00
Mark Janes
237dcb4aa7 Fix invalid extern "C" around header inclusion.
System headers may contain C++ declarations, which cannot be given C
linkage.  For this reason, include statements should never occur
inside extern "C".

This patch moves the C linkage statements to enclose only the
declarations within a single header.
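
The resulting header layout looks roughly like this (EXAMPLE_H and
example_align() are made-up names for illustration):

```c
#ifndef EXAMPLE_H
#define EXAMPLE_H

/* Includes stay outside the linkage block: when compiled as C++, a
 * system header may declare templates or overloads, which cannot have
 * C linkage. */
#include <stddef.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Only this header's own declarations get C linkage. */
size_t example_align(size_t value, size_t alignment);

#ifdef __cplusplus
}
#endif

#endif /* EXAMPLE_H */

/* Definition, which would normally live in a separate .c file:
 * round value up to the next multiple of alignment. */
size_t example_align(size_t value, size_t alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}
```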

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-05 10:21:40 -08:00
Matt Turner
2e4c95dfe2 i965: Tell intel_get_memcpy() which direction the memcpy() is going.
The SSSE3 swizzling code was written for fast uploads to the GPU and
assumed the destination was always 16-byte aligned. When we began using
this code for fast downloads as well we didn't do anything to account
for the fact that the destination pointer given by glReadPixels() or
glGetTexImage() is not guaranteed to be suitably aligned.

With SSSE3 enabled (at compile-time), some applications would crash when
an SSE aligned-store instruction tried to store to an unaligned
destination (or an assertion that the destination is aligned would
trigger).

To remedy this, tell intel_get_memcpy() whether we're uploading or
downloading so that it can select whether to assume the destination or
source is aligned, respectively.

Cc: 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89416
Tested-by: Uriy Zhuravlev <stalkerg@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-05 10:18:28 -08:00
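The idea behind commit 2e4c95dfe2 can be sketched as follows, assuming a simplified selector. The names `copy_direction`, `select_copy`, and the two copy variants are hypothetical, and plain `memcpy()` stands in for the SSSE3 code; the real `intel_get_memcpy()` signature is not shown in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical direction flag passed by the caller. */
enum copy_direction { COPY_UPLOAD, COPY_DOWNLOAD };

typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15) == 0;
}

/* Upload path: the destination (GPU mapping) is assumed 16-byte aligned. */
static void *copy_dst_aligned(void *dst, const void *src, size_t n)
{
    assert(is_aligned16(dst));
    return memcpy(dst, src, n); /* stand-in for aligned stores */
}

/* Download path: only the source (GPU mapping) is assumed aligned; the
 * destination comes from the application and may not be. */
static void *copy_src_aligned(void *dst, const void *src, size_t n)
{
    assert(is_aligned16(src));
    return memcpy(dst, src, n); /* stand-in for aligned loads */
}

/* The caller states the direction so the right assumption is chosen. */
copy_fn select_copy(enum copy_direction dir)
{
    return dir == COPY_UPLOAD ? copy_dst_aligned : copy_src_aligned;
}
```

With this shape, a download into an arbitrarily aligned application pointer never reaches the variant that assumes an aligned destination.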
Mark Janes
5f9ee6a02f mesa/x86: missing stdio inclusions
Several patches added include statements where required by the m64
build.  Some files are only compiled for m32, and require similar
changes.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-05 10:16:25 -08:00
Tom Stellard
c97e902a1a clover: Enable cl_khr_fp64 for devices that support doubles v4
v2:
  - Report correct values for CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE
    and CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE.
  - Only define cl_khr_fp64 if the extension is supported.
  - Remove trailing space from extension string.
  - Rename device query function from cl_khr_fp64() to
    has_doubles().

v3:
  - Return 0 for device::doubled_fp_confg() when doubles aren't
    supported.

v4:
  - Remove device query for double fp_config.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-03-05 14:07:37 +00:00
Emil Velikov
8d8ca64c28 xmlpool: make sure we ship options.h
The header is included in ../xmlpool.h, which is used directly in a
number of places in mesa.
Note that we can also add it (alongside t_option.h) to noinst_HEADERS,
but neither solution fixes the issue that brought us here, namely:
do not regenerate the header if it already exists.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-05 14:45:55 +00:00
Emil Velikov
fe5fddd7e2 mapi: fix *glapi dependency tracking
I.e. add {shared-,}glapi/glapi_mapi_tmp.h to the SOURCES list. Otherwise
there will be no knowledge that the file is required by others for the
build. Thus autotools won't pick it up for the distribution tarball.

v2: Don't forget about the static glapi. Spotted by Matt.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-05 14:45:55 +00:00
Emil Velikov
2c0f72d538 mesa: drop Makefile from get_hash.h dependency list
Not required. Additionally this had the side effect of generating the
file, despite its existence.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-05 14:45:55 +00:00
Emil Velikov
d22391cb16 mesa: fix dependency tracking of generated sources
Some of the generated files were not in the SOURCES variable, so
although they were generated prior to compilation, the dependency
tracking was incomplete. As a result, the files were missing from the
distribution tarball.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-05 14:45:54 +00:00
Emil Velikov
3f6c28f2a9 mesa: rename format_info.c to format_info.h
The file is auto-generated, and #included by formats.c. Let's rename it
to reflect the latter. This will also help us fix the dependency
tracking by adding it to the _SOURCES variable, without the side effect
of it being compiled (twice).

v2: Update .gitignore to reflect the rename.

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-05 14:45:54 +00:00
Emil Velikov
abae3434c4 mesa/main: update .gitignore
Drop the no longer present get_es{1,2}.c from the list.

v2: Keep the format_info.c rename hunk out of this patch.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-05 14:45:54 +00:00
Emil Velikov
d1fbea038b egl/main: remove no-longer needed definition of stdint types
All the users directly include the header, plus we have in-tree
replacements for non-C99 compilers, which we already use.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-05 14:45:54 +00:00
Emil Velikov
bf0e4d219a egl/drivers: include stdint.h where needed
Currently these files include it indirectly via eglcompiler.h, which
will be removed in follow-up commits.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-05 14:45:54 +00:00
Emil Velikov
74c40b9b56 egl/main: drop the declaration of PUBLIC keyword.
Should no longer be used. As many places indirectly include
eglcompiler.h keep this change separate, so that it can be easily
reverted, if needed.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-05 14:45:54 +00:00
Emil Velikov
dd438ae34b egl/main: no longer export internal function
With the split of the gallium egl module we had previously, it required
access to some of the internal functions. As the only build (automake)
that did this no longer builds it, we can now appropriately hide those
functions.

Cc: 10.5 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-05 14:45:53 +00:00
Emil Velikov
d780012cd7 egl/main: replace __FUNCTION__ with __func__
The latter is a C99 standard, and our current wrapper c99_compat.h
should handle non-compliant compilers.
Drop the c99_compat.h inclusion from eglcompiler.h altogether, as it's
no longer required.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-05 14:45:53 +00:00
Emil Velikov
7bd1693877 egl/main: replace INLINE with inline
Drop the custom keyword in favour of the C99 one. All the places using
it now directly include c99_compat.h which should handle things on
platforms which lack it.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-05 14:45:53 +00:00
Brian Paul
9385c592c6 mapi: remove u_thread.h
Just use c11 threads directly.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:43 -07:00
Brian Paul
262cd683e2 mapi: use c11 call_once() instead of pthread_once()
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:43 -07:00
Brian Paul
18db13f586 mapi: THREADS was always defined, remove it
THREADS was defined if HAVE_PTHREADS or _WIN32 was defined.  That's
always the case.  The build would die in c11/threads.h otherwise.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:43 -07:00
Brian Paul
fac77912b5 mesa: remove THREADS check, printf calls in debug.c
THREADS is going away in the next commit.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:43 -07:00
Brian Paul
458c7490c2 mapi: rewrite u_current_init() function without u_thread_self()
Remove u_thread_self() since u_thread.h is going away soon.
Create a simple thread ID abstraction which wraps WIN32 or c11 threads.
This also gets rid of the questionable casting of thrd_t to an unsigned
long.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:43 -07:00
Brian Paul
6b5eb7bce6 mapi: fix preprocessor check in u_current_destroy()
So it matches the preprocessor check around the u_current_init_tsd() code.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:43 -07:00
Brian Paul
c3f352e836 mapi: remove u_macros.h
Only U_STRINGIFY() is used in entry.c

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:43 -07:00
Brian Paul
83926b8193 osmesa: include stdio.h
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:43 -07:00
Brian Paul
80524549f0 xlib: include stdio.h
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:43 -07:00
Brian Paul
8f1a11bfc4 st/osmesa: include stdio.h
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:43 -07:00
Brian Paul
8c68987d09 st/xlib: include stdio.h
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:43 -07:00
Brian Paul
68579c4a5c st/xlib: include stdio.h
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:43 -07:00
Brian Paul
fe976ceb76 st/mesa: include stdio.h where needed
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:43 -07:00
Brian Paul
2655afc7e6 swrast: include stdio.h where needed
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:42 -07:00
Brian Paul
78ee6fdb23 nouveau: include stdio.h where needed
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:42 -07:00
Brian Paul
f330ab9383 dri/common: include stdio.h where needed
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:42 -07:00
Brian Paul
db9a088d32 glsl: include stdio.h where needed
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:42 -07:00
Brian Paul
db29869205 mesa: include stdio.h where needed
Instead of relying on glapi.h or some other header to provide it.

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:42 -07:00
Brian Paul
028968a3ce mesa: include c11/threads.h in mtypes.h
Let's directly include c11/threads.h instead of relying on glapi.h
to provide it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-03-05 06:59:42 -07:00
Neil Roberts
7286a68991 meta: Fix the y offset for 1D_ARRAY in _mesa_meta_pbo_TexSubImage
The yoffset needs to be interpreted as a slice offset for 1D array
textures. This patch implements that by moving the yoffset into
zoffset similar to how it moves the height into depth.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
2015-03-05 13:24:53 +00:00
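The remapping described in commit 7286a68991 can be sketched as below. This is an illustration under assumptions: `struct tex_region` and `remap_1d_array_region` are hypothetical, and the real function works on GL texture parameters rather than a struct like this.

```c
/* Hypothetical region type used only for this sketch. */
struct tex_region {
    int yoffset, zoffset;
    int height, depth;
};

/* For a 1D array texture, "y" addresses array slices, so both the
 * extent and the offset move from the y axis to the z axis. */
void remap_1d_array_region(struct tex_region *r)
{
    r->depth = r->height;
    r->height = 1;
    r->zoffset = r->yoffset;
    r->yoffset = 0;
}
```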
Neil Roberts
a08bff1e98 meta: Allow GL_UN/PACK_IMAGE_HEIGHT in _mesa_meta_pbo_Get/TexSubImage
Now that a layered source PBO is interpreted as a single tall 2D image
it's quite easy to accept the image height packing option by just
creating an image that is tall enough to include the image padding.

I'm not sure whether the image height property should affect 1D_ARRAY
textures. My intuition and interpretation of the GL spec (which is a
bit vague) would be that it shouldn't. However the software fallback
path in Mesa uses the property for packing but not for unpacking. The
binary NVidia driver uses it for both. This patch doesn't use it for
either case so it is different from the software fallback. There is
some discussion about this here:

http://lists.freedesktop.org/archives/mesa-dev/2015-February/077925.html

This is tested by the texsubimage Piglit test with the array and pbo
arguments. Previously this test was skipping this code path because it
always sets the image height.

I've also tested it by modifying the getteximage-targets test. It
wasn't using this code path before because it was using the default
texture object so this code couldn't successfully create a frame
buffer. I also modified it to add some image padding with the image
height in the PBO.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
2015-03-05 13:24:45 +00:00
Neil Roberts
7d10d2feee Revert "common: Fix PBOs for 1D_ARRAY."
This reverts commit 546aba143d.

I think the changes to the calls to glBlitFramebuffer from this patch
are no different to what it was doing previously because it used to
set height to 1 before doing the blits. However it was introducing
some problems with the blit for layer 0 because this was no longer
special cased. It didn't fix problems with the yoffset which needs to
be interpreted as a slice offset. I think a better solution would be
to modify the original if statement to cope with the yoffset.

Conflicts:
	src/mesa/drivers/common/meta_tex_subimage.c

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-05 13:23:10 +00:00
Vinson Lee
29c23644cc glsl: Fix GCC unused-variable warning in release build.
CXX      ast_array_index.lo
ast_array_index.cpp: In function ‘void update_max_array_access(ir_rvalue*, int, YYLTYPE*, _mesa_glsl_parse_state*)’:
ast_array_index.cpp:86:30: warning: unused variable ‘interface_type’ [-Wunused-variable]
             const glsl_type *interface_type =
                              ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
2015-03-04 17:20:25 -08:00
Chia-I Wu
b5eb6f769d ilo: improve WA handling in rectlist path
Add wrappers for 3DPRIMITIVE to make sure we clear current_pipe_control_dw1
and deferred_pipe_control_dw1 after it.  Add missing
gen7_wa_post_ps_and_later().
2015-03-04 15:28:05 -07:00
Chia-I Wu
1424bdd61b ilo: clean up Gen7.5 WAs
These WAs

  gen7_wa_post_3dstate_push_constant_alloc_ps()
  gen7_wa_pre_vs()
  gen7_wa_pre_3dstate_sf_depth_bias()
  first half of gen7_wa_pre_depth()
  gen7_wa_post_ps_and_later()

are Gen7-specific.  Update copy-and-pasted gen8_wa_pre_depth() also.
2015-03-04 15:28:05 -07:00
Tom Stellard
a398168f72 clover: Fix build since llvm r231270 2015-03-04 13:10:56 -08:00
Chia-I Wu
68d2e395d9 ilo: add ILO_DEBUG=hang
When set, detect and dump the hanging batch buffer.
2015-03-05 04:52:49 +08:00
Chia-I Wu
af4cff5d6f ilo: add some more winsys functions
Add intel_winsys_get_reset_stats(), intel_winsys_import_userptr(), and
intel_bo_map_async().  The latter two are stubs, but we are not going to use
them immediately either.
2015-03-04 13:42:17 -07:00
Matt Turner
1e128e9b69 i965/fs: Don't propagate cmod to inst with different type.
Cc: 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89317
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-04 12:37:34 -08:00
Matt Turner
ade0b580e7 r300g: Check return value of snprintf().
Would have at least prevented the crash the previous patch fixed.

Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540970
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-03-04 11:15:09 -08:00
Matt Turner
f5e2aa1324 r300g: Use PATH_MAX instead of limiting ourselves to 100 chars.
When built with Gentoo's package manager, the Mesa source directory
exists seven directories deep. The path to the .test file is too long
and is silently truncated, leading to a crash. Just use PATH_MAX.

Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540970
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-03-04 11:15:09 -08:00
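The two r300g commits above (f5e2aa1324 and ade0b580e7) both concern a path that was silently truncated. A minimal sketch of the return-value check, assuming a hypothetical helper `build_test_path` that is not part of the patches:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: joins a directory and a file name into out.
 * Returns false if snprintf() failed or the result did not fit. */
bool build_test_path(char *out, size_t out_size, const char *dir, const char *file)
{
    int n = snprintf(out, out_size, "%s/%s", dir, file);

    /* n < 0 signals an output error; n >= out_size means truncation. */
    return n >= 0 && (size_t)n < out_size;
}
```

A caller would typically size the buffer with PATH_MAX from `<limits.h>` and bail out when the helper returns false, instead of opening a truncated path.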
Brian Paul
67e0a4f6e8 glx/tests: add -I src/ to fix make check
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-04 11:02:09 -07:00
Kristian Høgsberg
10c82c6c5f i965: Fix uint64_t overflow in intel_client_wait_sync()
DRM_IOCTL_I915_GEM_WAIT takes an int64_t for the timeout value but
GL_ARB_sync takes an uint64_t.  Further, the ioctl used to wait
indefinitely when passed a negative timeout, but it's been broken and
now returns immediately in that case.  Thus, if an application passes
UINT64_MAX to wait forever, we overflow to -1LL and return immediately.
Work around this mess by clamping the wait timeout to INT64_MAX.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-03-04 09:55:31 -08:00
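The clamping described in commit 10c82c6c5f can be sketched as a small conversion helper. The name `clamp_timeout_ns` is hypothetical; the commit message does not show the actual code.

```c
#include <stdint.h>

/* Hypothetical helper: converts an unsigned GL_ARB_sync style timeout
 * into the signed value a kernel wait ioctl would take. */
int64_t clamp_timeout_ns(uint64_t timeout)
{
    /* Anything above INT64_MAX would become negative when converted,
     * so saturate instead. */
    return timeout > (uint64_t)INT64_MAX ? INT64_MAX : (int64_t)timeout;
}
```

With this, a request to wait "forever" using UINT64_MAX turns into the longest representable positive wait rather than -1.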
Daniel Stone
65c8965d03 egl: Take alpha bits into account when selecting GBM formats
This fixes piglit when using PIGLIT_PLATFORM=gbm

Tom Stellard:
  - Fix ARGB2101010 format

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-03-04 15:48:18 +00:00
Rob Clark
b709adf7cc freedreno/ir3: fix old compiler after f6b2e8af74
If first_driver_param is left as zero (calloc'd struct), the result is
c0 getting clobbered.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-04 11:37:58 -05:00
Brian Paul
34ff9bc669 gallivm: init MM = NULL to silence warning
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-04 08:33:48 -07:00
Brian Paul
8aa9191878 mapi: remove u_compiler.h
Just include c99_compat.h or util/macros.h where needed.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-04 08:33:48 -07:00
Brian Paul
4ab713423f mapi: use util/macros.h instead of locally defined macros
The next step is to get rid of u_compiler.h completely.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-04 08:33:48 -07:00
Brian Paul
41c87cc566 mapi: replace INLINE with inline
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-04 08:33:48 -07:00
Brian Paul
5bebd7099a mesa: consolidate PUBLIC macro definition
Define the macro in src/util/macros.h rather than in two different
places.  Note that USED isn't actually used anywhere at this time.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-04 08:33:48 -07:00
Brian Paul
25656753d7 st/xlib: include p_compiler.h to get PUBLIC definition
To prevent build break with following changes.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-04 08:33:48 -07:00
Brian Paul
25a847d9cc mapi: remove unneeded ARRAY_SIZE #define
include util/macros.h instead.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-04 08:33:48 -07:00
Brian Paul
0339e7dbda glx: use ARRAY_SIZE from macros.h
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-04 08:33:48 -07:00
Jose Fonseca
6e836d2c86 scons: Update for the fact that we require GCC 4.2
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-04 15:12:22 +00:00
Jose Fonseca
d0b1c74b73 svga: Set MSVC2013 compat flags.
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-04 15:12:19 +00:00
Jose Fonseca
2c25008e8e softpipe,trace: Set MSVC 2008 compat flags.
Although we don't deploy these, we need to use them for debugging.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-04 15:12:17 +00:00
Jose Fonseca
00faf9f000 scons: Use -Werror MSVC compatibility flags per-directory.
Matching what we already do with autotools builds.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-04 15:12:06 +00:00
Jose Fonseca
3acd7a34ab st/vega: Remove.
OpenVG API seems to have dwindled away.  The code
would still be interesting if we wanted to implement NV_path_rendering
but given the trend of the next gen graphics APIs, it seems
unlikely that this becomes ARB or core.

v2: Remove a few "openvg" references left, per Emil Velikov.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

v3: Update release notes.
2015-03-04 11:01:45 +00:00
Jose Fonseca
5564c361b5 st/egl: Remove.
Largely superseded by src/egl, and
WGL/GLX_EXT_create_context_es_profile extensions.

Note this will break Android.mk with gallium drivers -- somebody
familiar with that build infrastructure will need to update it to use
gallium drivers through egl_dri2.

v2: Remove the _EGL_BUILT_IN_DRIVER_GALLIUM define from
src/egl/main/Android.mk; and update the src/egl/main/Sconscript to
create a SharedLibrary, add versioning, create symlink - copy the bits
from egl-static, per Emil Velikov.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

v3: Disallow undefined symbols in libEGL.so.  Update release notes
2015-03-04 11:01:42 +00:00
Jose Fonseca
17b2825d76 windows/gdi: Remove.
This classic driver is so far behind the Gallium softpipe/llvmpipe based
one that it's hard to imagine it ever being useful.

v2: Drop drivers/windows from src/mesa/Makefile.am:EXTRA_DIST per Emil
Velikov.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

v3: Update release notes.
2015-03-04 11:01:38 +00:00
Jose Fonseca
40a4797384 nir: Use helper macros for dealing with VLAs.
v2:
- Single statement, by using memset return value as suggested by Ian
Romanick.
- No internal declaration, as suggested by Jason Ekstrand.
- Move macros to a header.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-04 10:52:02 +00:00
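The "single statement, by using memset return value" point in commit 40a4797384 can be sketched as follows. The macros `EX_VLA` and `EX_VLA_ZERO` and the function `count_zeroed` are hypothetical and are not the macros added by the patch; `alloca()` is assumed to be available through `<alloca.h>`, which is non-standard.

```c
#include <alloca.h>   /* non-standard; assumed available on this platform */
#include <string.h>

/* Hypothetical macro: stack-allocates an array of the given length. */
#define EX_VLA(type, name, length) \
    type *name = alloca((length) * sizeof *name)

/* memset() returns its first argument, so allocation and zero fill fit
 * in a single declaration statement. */
#define EX_VLA_ZERO(type, name, length) \
    type *name = memset(alloca((length) * sizeof *name), 0, \
                        (length) * sizeof *name)

/* Counts how many elements of a freshly zeroed array are zero. */
int count_zeroed(unsigned n)
{
    EX_VLA_ZERO(int, flags, n);
    int zeros = 0;

    for (unsigned i = 0; i < n; i++)
        zeros += (flags[i] == 0);
    return zeros;
}
```

Hiding the allocation behind a macro keeps the call sites identical whether the platform offers true VLAs or only `alloca()`.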
Marc-Andre Lureau
073a5d2e84 gallium/auxiliary/indices: fix start param
Since commit 28f3f8d, the indices generators take a start parameter. However, some
index values have been left to start at 0.

This fixes the glean/fbo test with the virgl driver, and copytexsubimage
with freedreno.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-03-04 00:15:22 -05:00
Vinson Lee
b77576edc1 scons: Define _DEFAULT_SOURCE.
Fix GCC cpp warnings with glibc >= 2.19.

/usr/include/features.h:148:3: warning: #warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE" [-Wcpp]
 # warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE"
   ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-03 17:23:48 -08:00
Frank Henigman
e43729943e intel: fix EGLImage renderbuffer _BaseFormat
Correctly set _BaseFormat field when creating a gl_renderbuffer
with EGLImage storage.

Change-Id: I8c9f7302d18b617f54fa68304d8ffee087ed8a77
Signed-off-by: Frank Henigman <fjhenigman@google.com>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-03-03 10:58:42 -08:00
Rob Clark
8e67fd798e freedreno/a4xx: re-enable int (conditional on glsl130)
Re-enable integer, now that we can handle flat varyings.  Still, ofc,
conditional on FD_MESA_DEBUG=glsl130, until we can deprecate _old
compiler..

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-03 10:41:00 -05:00
Rob Clark
e9f2abe349 freedreno/ir3: handle flat bypass for a4xx
We may not need this for later a4xx patchlevels, but we do at least need
this for patchlevel 0.  Bypass bary.f for fetching varyings when flat
shading is needed (rather than configure via cmdstream).  This requires
a special dummy bary.f w/ (ei) flag to signal to scheduler when all
varyings are consumed.  And requires shader variants based on rasterizer
flatshade state to handle TGSI_INTERPOLATE_COLOR.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-03 10:41:00 -05:00
Rob Clark
9d732d3125 freedreno/ir3: add support for memory (cat6) instructions
Scheduled basically the same as texture (cat5) instructions, using (sy)
flag for synchronization.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-03 10:41:00 -05:00
Rob Clark
20b50a0712 freedreno/ir3: fix up cat6 instruction encodings
I think there is at least one more sub-encoding, but these two should be
enough to cover the common load/store instructions.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-03 10:41:00 -05:00
Rob Clark
4abb789bca tgsi/lowering: don't forget interp for BCOLOR inputs
To lower two sided color, tgsi_lowering creates additional BCOLOR inputs
(matching up to the BCOLOR outputs on the vert shader).  These inputs
should copy the interpolation state of their matching COLOR input.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-03 10:41:00 -05:00
Rob Clark
583a8a8f65 freedreno/a3xx,a4xx: silence some warnings
fd3_emit.c: In function ‘fd3_emit_vertex_bufs’:
  fd3_emit.c:377:11: warning: unused variable ‘semantic’ [-Wunused-variable]
     uint8_t semantic = sem2name(vp->inputs[i].semantic);

and

  fd4_emit.c: In function ‘fd4_emit_vertex_bufs’:
  fd4_emit.c:304:11: warning: unused variable ‘semantic’ [-Wunused-variable]
     uint8_t semantic = sem2name(vp->inputs[i].semantic);

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-03 10:41:00 -05:00
Brian Paul
5ece288876 c99_alloca.h: add case for __sun
Reviewed-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2015-03-03 08:40:13 -07:00
Jose Fonseca
80c5bd7ef0 configure: Leverage gcc warn options to enable safe use of C99 features where possible.
The main objective of this change is to enable Linux developers to use
more of C99 throughout Mesa, with confidence that the portions that need
to be built with MSVC -- and only those portions --, stay portable.

This is achieved by using the appropriate -Werror= options only on the
places they need to be used.

Unfortunately we still need MSVC 2008 on a few portions of the code
(namely llvmpipe and its dependencies).  I hope to eventually eliminate
this so that we can use C99 everywhere, but there are technical/logistic
challenges (specifically, newer Windows SDKs no longer bundle MSVC,
instead require a full installation of Visual Studio, and that has
hindered adoption of newer MSVC versions on our build processes.)
Thankfully we have more direct control over our OpenGL driver, which is
why we're now able to migrate to MSVC 2013 for most of the tree.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-03 09:25:11 +00:00
Ben Widawsky
3d4d77a5dc i965: Fix assertion in brw_reg_type_letters
While using various debugging features (optimization debug, instruction dumping,
etc) this function is called in order to get a readable letter for the type of
unit.

On GEN8, two new units were added, the Qword and the Unsigned Qword (Q, and UQ
respectively). The existing assertion tries to determine that the argument
passed in is within the correct boundary, however, it was using UQ as the upper
limit instead of Q.

To my knowledge you can only hit this case with the branch I am currently
working on, so it doesn't fix any known issues.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-02 19:55:20 -08:00
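The bounds problem described in commit 3d4d77a5dc can be sketched with a shortened, hypothetical type list. The enum `ex_reg_type` and the function `ex_reg_type_letters` are illustrative only; the real enum has more entries, and the only property taken from the commit message is that Q, not UQ, is the last valid value.

```c
#include <assert.h>

/* Hypothetical, shortened register type list. */
enum ex_reg_type { EX_TYPE_UD, EX_TYPE_D, EX_TYPE_F, EX_TYPE_UQ, EX_TYPE_Q };

const char *ex_reg_type_letters(unsigned type)
{
    static const char *const letters[] = { "UD", "D", "F", "UQ", "Q" };

    /* Bounding by EX_TYPE_UQ here would wrongly reject EX_TYPE_Q. */
    assert(type <= EX_TYPE_Q);
    return letters[type];
}
```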
Ben Widawsky
37c2687645 i965: Rename some PIPE_CONTROL flags
I'm not really sure of the origins of the existing flag names. Modern docs have
some slightly different names. Having the correct names makes it easier to
determine if existing PIPE_CONTROL flag settings are correct, as well as making
adding new PIPE_CONTROLs easier.

This originally came up while I was trying to implement workarounds and spotted
some things called, "flush" which should have been called "invalidate."

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-02 19:28:43 -08:00
Matt Turner
e214000f25 i965/fs: Don't use backend_visitor::instructions after creating the CFG.
This is a fix for a regression introduced in commit a9f8296d ("i965/fs:
Preserve the CFG in a few more places.").

The errata this code works around is described in a comment before the function:

   "[DevBW, DevCL] Errata: A destination register from a send can not be
    used as a destination register until after it has been sourced by an
    instruction with a different destination register."

The framebuffer write's sources must be in message registers, which SEND
instructions cannot have as a destination. There's no way for this
errata to affect anything at the end of the program. Just remove the
code.

Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84613
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-02 18:13:28 -08:00
Jason Ekstrand
c4925d7f3b main/base_tex_format: Properly handle STENCIL_INDEX1/4/16
This takes "fbo-stencil blit GL_STENCIL_INDEX1/4/16" from crash to pass on
BDW.

Cc: 10.5 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-02 11:06:44 -08:00
Jason Ekstrand
b1ab02d9c0 meta/TexSubImage: Stash everything other than PIXEL_TRANSFER/store in meta_begin
Previously, there were bugs where if the app set a scissor it could affect
the area of the texture that was downloaded.  There was also potential that
the framebuffer SRGB state could affect downloads.  This ensures that those
will get saved/restored and can't affect the texture download.

Cc: 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89292
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-03-02 11:06:37 -08:00
Matt Turner
93a8c702a6 i915: Remove hand-rolled memcpy implementation.
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-03-02 10:38:49 -08:00
Matt Turner
54d7925012 i965: Remove hand-rolled memcpy implementation.
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-03-02 10:38:49 -08:00
Matt Turner
da20bf068e i965: Consider scratch writes to have side effects.
We could do better by tracking scratch reads and writes.

Cc: 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88793
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-02 10:24:49 -08:00
Matt Turner
491d42135a mesa: Correct backwards NULL check.
Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-03-02 10:24:33 -08:00
Matt Turner
87109acbed mesa: Free memory allocated for luminance in readpixels.
Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-03-02 10:24:18 -08:00
Matt Turner
2b2fa18652 mesa: Indent break statements and add a missing one.
Always indenting break statements makes spotting missing ones easier.

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-03-02 10:24:16 -08:00
Vinson Lee
3de01d2fe4 c99_alloca.h: Include stdlib.h on all non-Windows.
Fix build on FreeBSD.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89364
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Brian Paul <brianp@vmware.com>
2015-03-02 09:26:36 -07:00
Brian Paul
6f0e9c2e39 mesa: remove extra definition of ARRAY_SIZE in src/mesa/main/macros.h
Already defined in src/util/macros.h

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-02 08:55:31 -07:00
Brian Paul
e1437d6c0a mesa: remove the Elements() macro definition
No longer used.

Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-02 08:55:31 -07:00
Brian Paul
692bd4a1ab util: replace Elements() with ARRAY_SIZE()
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-02 08:55:31 -07:00
Brian Paul
6633271159 radeon: replace Elements() with ARRAY_SIZE()
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-02 08:55:31 -07:00
Brian Paul
9775dbc335 r200: replace Elements() with ARRAY_SIZE()
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-02 08:55:30 -07:00
Brian Paul
ea760c2090 nouveau: replace Elements() with ARRAY_SIZE()
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-02 08:55:30 -07:00
Brian Paul
49a7f8c919 i965: replace Elements() with ARRAY_SIZE()
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-02 08:55:30 -07:00
Brian Paul
b565771003 i915: replace Elements() with ARRAY_SIZE()
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-02 08:55:30 -07:00
Brian Paul
0a77ffcd5a mapi: replace Elements() with ARRAY_SIZE()
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-02 08:55:30 -07:00
Brian Paul
c16c719647 glsl: replace Elements() with ARRAY_SIZE()
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-02 08:55:30 -07:00
Brian Paul
70b401029c st/dri: replace Elements() with ARRAY_SIZE()
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-02 08:55:30 -07:00
Brian Paul
2f0143ca96 st/mesa: replace Elements() with ARRAY_SIZE()
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-02 08:55:30 -07:00
Brian Paul
c7136ff646 mesa/program: replace Elements() with ARRAY_SIZE()
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-02 08:55:30 -07:00
Brian Paul
16f7b77275 mesa/swrast: replace Elements() with ARRAY_SIZE()
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-02 08:55:30 -07:00
Brian Paul
766f5cf8f8 mesa/vbo: replace Elements() with ARRAY_SIZE()
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-02 08:55:30 -07:00
Brian Paul
c2e130f820 mesa/main: replace Elements() with ARRAY_SIZE()
We've been using a mix of these two macros for a while now.  Let's
just use the latter everywhere.  It seems to be the convention used
by other open-source projects.

Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-02 08:55:30 -07:00
Brian Paul
cd6db1989a mesa: trim down #includes in api_loopback.h
Acked-by: Matt Turner <mattst88@gmail.com>
2015-03-02 08:55:30 -07:00
Brian Paul
775049b6ad mesa: trim down includes of compiler.h
In some cases, glheader.h is the right #include.
Also remove some instances of struct _glapi_table declarations.

Acked-by: Matt Turner <mattst88@gmail.com>
2015-03-02 08:55:30 -07:00
Jose Fonseca
fa5140bb18 scons: Fix HAVE___* definition.
These definitions must be moved before `cppdefines` is used to have effect.

Trivial.
2015-03-02 14:23:51 +00:00
Jose Fonseca
9a07435ff8 identity: Remove.
It's unmaintained, and most likely broken: I use trace driver every now
and then, and every time I do I need to fix it up.

It's also unused: identity_screen_create is never called.

Above all, it's dead weight: if identity driver had the infrastructure
for other pass-through drivers (like trace and rbug), then it would make
sense in its own right.  But as it is implemented, it's just another
driver to (forget) to update whenever there is a gallium interface
change.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-03-02 14:12:46 +00:00
Francisco Jerez
7bfbaf4a5a i965: Remove the create_raw_surface vtbl hook.
It's a wrapper around emit_buffer_surface_state with format=RAW, pitch=1,
rw=true and the remaining arguments ordered differently.  There's no point in
having a separate vtbl pointer for that.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-03-02 14:33:13 +02:00
Francisco Jerez
65f9b83e05 i965: Add missing defines for render cache messages.
And remove duplicated definition of OWORD_DUAL_BLOCK_WRITE.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2015-03-02 14:33:13 +02:00
Neil Roberts
cf67ca9ffa i965/skl: Lay out a 1D miptree horizontally
On Gen9+ the 1D miptree is laid out with all of the mipmap levels in a
horizontal line.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-03-02 11:57:37 +00:00
Neil Roberts
0f1e86afd6 i965/skl: Lay out 3D textures the same as array textures
On Gen9+ the 3D textures use the same mipmap layout as 2D array
textures.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-03-02 11:57:37 +00:00
Neil Roberts
aef8a48979 i965/skl: Fix the maximum thread count format for the PS
According to the bspec for some reason the format of the maximum
number of threads field has changed from U8-2 to U8-1 for the PS.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-02 11:57:37 +00:00
Marek Olšák
27a34f62ba draw: fix division-by-zero for empty geometry shaders
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89372

Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-03-02 12:46:36 +01:00
Chris Forbes
b51ff50a76 i965/gs: Check newly-generated GS-out VUE map against correct stage
Previously, we compared our new GS-out VUE map to the existing *VS*-out
VUE map, which is bogus.

This would mostly manifest as redundant dirty flagging where the GS is
in use but the VS and GS output layouts differ; but there is a scary
case where we would fail to flag a GS-out layout change if it happened
to match the VS-out layout.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.5, 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88885
2015-03-01 11:13:35 +13:00
Brian Paul
213c41bf5d i965: add GLSL_TYPE_DOUBLE switch case to silence warning
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-28 13:39:58 -07:00
Brian Paul
7783131a51 mesa: include macros.h in stencil.h
Since it uses the CLAMP macro.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-28 13:39:58 -07:00
Brian Paul
8a25e73df3 mesa: move finite macro to imports.h
Move it to the only place it's used.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-28 13:39:57 -07:00
Brian Paul
977c56df09 mesa: remove _NORMAPI, _NORMAPIP macros
Was only used in one place.  Use equivalent _XFORMAPIP there instead.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-28 13:39:57 -07:00
Brian Paul
61d344ebba mesa: move FLT_MAX_EXP to c99_math.h
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-28 13:39:57 -07:00
Brian Paul
20dc94ba3c mesa: move ONE_DIV_SQRT_LN2 to prog_statevars.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-28 13:39:57 -07:00
Brian Paul
cbf788a348 mesa: remove unused uninitialized_var() macro
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-28 13:39:57 -07:00
Matt Turner
e71a7f8013 mesa: Check return value of __get_cpuid().
The use of the uninitialized_var() macro was to silence an uninitialized
variable warning that I assumed stemmed from gcc being unable to see
inside __get_cpuid() or understand its inline assembly.

In fact, it was because the __get_cpuid() function can fail, and not
initialize its arguments. Instead, check for failure and return early.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-28 12:20:31 -08:00
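The failure mode described in the commit above can be sketched as follows. This is only an illustration, assuming GCC or Clang and their <cpuid.h> header; the helper name cpu_has_sse2() is hypothetical and is not taken from Mesa.

```c
#include <stdbool.h>
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

/* Hypothetical helper: report whether CPUID leaf 1 advertises SSE2. */
static bool cpu_has_sse2(void)
{
#if defined(__i386__) || defined(__x86_64__)
   unsigned int eax, ebx, ecx, edx;

   /* __get_cpuid() returns 0 when the requested leaf is unsupported and
    * then leaves the outputs unwritten, so return before reading them. */
   if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
      return false;

   return (edx & (1u << 26)) != 0;   /* EDX bit 26 of leaf 1 is SSE2 */
#else
   return false;                      /* not an x86 target */
#endif
}
```

Checking the return value removes the need to silence the warning, because the outputs are only read on the path where they were written.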
Matt Turner
5666d9266f i965/fs/nir: Mark fallthrough. 2015-02-28 10:46:41 -08:00
Matt Turner
54cd2f7c96 i965/fs/nir: Mark fallthrough. 2015-02-28 10:38:21 -08:00
Matt Turner
d528907fd2 i965: Avoid applying negate to wrong MAD source.
For some given GLSL IR like (+ (neg x) (* 1.2 x)), the try_emit_mad
function would see that one of the +'s sources was a negate expression
and set mul_negate = true without confirming that it was actually a
multiply.

Cc: 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89315
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89095
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-27 20:24:12 -08:00
Matt Turner
43ef2657a0 i965/vec4: Fix implementation of i2b.
I broke this in commit 2881b123d. I must have misread i2b as b2i.

Cc: 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88246
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-27 20:24:12 -08:00
Ian Romanick
b8a1637119 i965/fs/nir: Use emit_math for nir_op_fpow
It appears that all the other instructions that need it already use it.
This one just got missed.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
2015-02-27 18:47:04 -08:00
Matt Turner
76cd0f00f4 mapi: Don't rely on GNU void pointer arithmetic.
Commit 79daa510c added -Werror=pointer-arith to CFLAGS, which makes
arithmetic on void pointers an error.

See https://gcc.gnu.org/onlinedocs/gcc/Pointer-Arith.html

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-27 16:57:10 -08:00
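A minimal sketch of the portable alternative to GNU void pointer arithmetic: cast to char * before adding a byte offset. The helper name offset_bytes() is hypothetical and only illustrates the idea; it is not the code changed in mapi.

```c
#include <stddef.h>

/* Hypothetical helper: advance a generic pointer by a byte offset.
 * Arithmetic directly on void * is a GNU extension; casting to char *
 * first is standard C and gives the same address. */
static void *offset_bytes(void *base, size_t offset)
{
   return (char *)base + offset;
}
```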
Kenneth Graunke
982723dfa2 Revert "configure: Leverage gcc warn options to enable safe use of C99 features where possible."
This reverts commit 79daa510c7.

I apparently hadn't done a clean build when testing this; it broke the
build for Tom, Ben, and myself.  We like the idea; let's try a v2.
2015-02-27 16:13:10 -08:00
Jonathan Gray
7983a3d2e0 auxiliary/os: correct sysctl use in os_get_total_physical_memory()
The length argument passed to sysctl was the size of the pointer
not the type.  The result of this is sysctl calls would fail on
32 bit BSD/Mac OS X.

Additionally the wrong pointer was passed as an argument to store
the result of the sysctl call.

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-27 23:17:22 +00:00
Brian Paul
667dac9d40 glsl: silence uninitialized var warning on MinGW
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-27 15:22:25 -07:00
Brian Paul
bf8d049488 mesa: silence unused var warning in get_tex_rgba_uncompressed()
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-27 15:22:25 -07:00
Brian Paul
48f229d759 mesa: move declaration before code
To fix MinGW warning.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-27 15:22:24 -07:00
Brian Paul
5b089e5f15 meta: silence declaration after code warning on MinGW
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-27 15:22:24 -07:00
Brian Paul
544f56b75a meta: silence uninitialized variable warnings for MinGW
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-27 15:22:24 -07:00
Brian Paul
098e5bf3b3 c99_alloca.h: fix #include for MinGW
As with MSVC, include malloc.h but don't redefine alloca.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89364
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-02-27 15:22:24 -07:00
Brian Paul
943784bbcd gallium/util: add debug_print_usage_enum() debug helper
Signed-off-by: Brian Paul <brianp@vmware.com>
2015-02-27 15:22:04 -07:00
Brian Paul
b14cec0b8e gallium/util: fix 'statement with no effect' warning
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-02-27 15:20:15 -07:00
Kenneth Graunke
53295bebc8 i965: Fix I/L/LA SNORM formats.
_mesa_choose_tex_format (texformat.c) tries I8_SNORM, L8_SNORM, and
either L8A8_SNORM or A8L8_SNORM, none of which are supported by our
driver.  Failing that, it falls back to RGBX for luminance, and RGBA for
intensity and luminance alpha.  So, we need to use swizzle overrides
to obtain the correct values.

Fixes Piglit's EXT_texture_snorm/fbo-blending-formats and
fbo-clear-formats.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-02-27 11:36:27 -08:00
Kenneth Graunke
ea696be5ac i965/fs: Patch the instruction generating discards; don't use CMP.Z.
CMP.Z doesn't work on Gen4-5 because the boolean isn't guaranteed to be
0 or 0xFFFFFFFF - only the low bit is defined.

We can call emit_bool_to_cond_code to generate the condition in f0.0;
the last instruction will generate the flag value.  We can patch it to
use f0.1, and negate the condition.

Fixes discard tests on Gen4-5.

Haswell shader-db stats:
total instructions in shared programs: 5770279 -> 5769112 (-0.02%)
instructions in affected programs:     64342 -> 63175 (-1.81%)
helped:                                1069

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-27 11:36:24 -08:00
Kenneth Graunke
4ebacf8aa6 i965/fs: Introduce brw_negate_cmod().
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-27 11:36:08 -08:00
Laura Ekstrand
0fad07af9a main: Fix whitespace in teximage.c.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-27 11:11:45 -08:00
Tom Stellard
da85ab4b65 radeonsi/compute: Enable PIPE_SHADER_CAP_DOUBLES v2
v2:
  - Simplify ifdef

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-27 14:57:52 +00:00
Tom Stellard
75514555aa clover: Don't unconditionally define cl_khr_fp64
This should be done by the frontend for devices that support this
extension.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-02-27 14:57:44 +00:00
Tom Stellard
ed07255149 pipe-loader: Fix build with dri drivers enabled, and vl state trackers disabled
Configure arguments:

./configure --disable-dri3 --disable-xvmc --enable-opencl
            --with-gallium-drivers=r300,r600,radeonsi
            --with-egl-platforms=drm

Build error:

make[3]: *** No rule to make target
`../../../../src/gallium/auxiliary/libgalliumvlwinsys.la', needed by
`pipe_r300.la'.  Stop.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-27 14:51:33 +00:00
Jose Fonseca
79daa510c7 configure: Leverage gcc warn options to enable safe use of C99 features where possible.
The main objective of this change is to enable Linux developers to use
more of C99 throughout Mesa, with confidence that the portions that need
to be built with MSVC -- and only those portions --, stay portable.

This is achieved by using the appropriate -Werror= options only on the
places they need to be used.

Unfortunately we still need MSVC 2008 on a few portions of the code
(namely llvmpipe and its dependencies).  I hope to eventually eliminate
this so that we can use C99 everywhere, but there are technical/logistic
challenges (specifically, newer Windows SDKs no longer bundle MSVC,
instead require a full installation of Visual Studio, and that has
hindered adoption of newer MSVC versions on our build processes.)
Thankfully we have more direct control over our OpenGL driver, which is
why we're now able to migrate to MSVC 2013 for most of the tree.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-27 14:30:36 +00:00
Jose Fonseca
f320ecf218 nir: Use alloca instead of variable length arrays.
This is to enable the code to build with -Werror=vla in the short term,
and enable the code to build with MSVC2013 soon after.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-27 14:30:36 +00:00
Brian Paul
84a1e3d61e mesa: restore #include stdarg.h in imports.h
https://bugs.freedesktop.org/show_bug.cgi?id=89345
Signed-off-by: Brian Paul <brianp@vmware.com>
2015-02-27 07:04:49 -07:00
Brian Paul
06ed81044f c99_math.h: add defines for M_PI, M_E, M_LOG2E
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89342
Signed-off-by: Brian Paul <brianp@vmware.com>
2015-02-27 07:04:49 -07:00
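The usual shape of such fallback defines is sketched below. It is an assumption about the pattern, not a copy of c99_math.h; degrees_to_radians() is a hypothetical user of the constant.

```c
#include <math.h>

/* Provide the constants only when <math.h> did not define them. */
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif
#ifndef M_E
#define M_E 2.7182818284590452354
#endif
#ifndef M_LOG2E
#define M_LOG2E 1.4426950408889634074
#endif

/* Hypothetical helper that relies on M_PI being available. */
static double degrees_to_radians(double deg)
{
   return deg * (M_PI / 180.0);
}
```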
Vinson Lee
8170eba7e7 r300g/tests: Include stdio.h.
Fix build error.

  CC       compiler/tests/r300_compiler_tests-radeon_compiler_regalloc_tests.o
compiler/tests/radeon_compiler_regalloc_tests.c: In function ‘test_runner_rc_regalloc’:
compiler/tests/radeon_compiler_regalloc_tests.c:57:3: error: implicit declaration of function ‘fprintf’ [-Werror=implicit-function-declaration]
   fprintf(stderr, "Failed to load program\n");
   ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2015-02-26 21:01:32 -08:00
Brian Paul
40cfa0c347 radeon/compiler: include stdio.h
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89343
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-02-26 17:53:05 -07:00
Laura Ekstrand
549078cb5a main: Fix target checking for CompressedTexSubImage*D.
This fixes a dEQP test failure.  In the test,
glCompressedTexSubImage2D was called with target = 0 and failed to throw
INVALID ENUM. This failure was caused by _mesa_get_current_tex_object(ctx,
target) being called before the target checking.  To remedy this, target
checking was made into its own function and called prior to
_mesa_get_current_tex_object.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89311

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-26 14:24:11 -08:00
Laura Ekstrand
ca65764d60 main: Fix target checking for CopyTexSubImage*D.
This fixes a dEQP test failure.  In the test,
glCopyTexSubImage2D was called with target = 0 and failed to throw
INVALID ENUM. This failure was caused by _mesa_get_current_tex_object(ctx,
target) being called before the target checking.  To remedy this, target
checking was separated from the main error-checking function and
called prior to _mesa_get_current_tex_object.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89312

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-26 13:31:59 -08:00
Brian Paul
688d7656c5 c99: in c99_math.h check that _USE_MATH_DEFINES is defined with MSVC
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-26 12:21:30 -07:00
Brian Paul
fb2ddef157 mesa: remove unused INLINE macro from compiler.h
We now use 'inline' everywhere in Mesa.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-02-26 11:02:14 -07:00
Brian Paul
164b3cd757 st/mesa: replace INLINE with inline
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-02-26 11:02:14 -07:00
Brian Paul
0dc6b72455 swrast: replace INLINE with inline
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-02-26 11:02:14 -07:00
Brian Paul
f51f2af76d radeon: replace INLINE with inline
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-02-26 11:02:14 -07:00
Brian Paul
bbedb85898 r200: replace INLINE with inline
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-02-26 11:02:13 -07:00
Brian Paul
8e9fe53ce9 i915: replace INLINE with inline
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-02-26 11:02:13 -07:00
Jose Fonseca
46110c5d56 include,auxiliary: Remove support for MSVC older than 2008.
MSVC 2008 (shipped with Windows SDK 7.0.7600) is the oldest we
need to support, at least for the llvmpipe, gallium/auxiliary, and util
modules.  The remaining modules (in particular all OpenGL-specific
code) can be built with MSVC 2013.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-26 16:53:16 +00:00
Brian Paul
fd090fdadd mesa: don't include stdint.h in compiler.h
Not needed.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-26 08:38:39 -07:00
Brian Paul
95855dd32f mesa: don't include math.h in compiler.h
Not needed by anything in that header.  Include math.h or c99_math.h
where needed instead.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-26 08:38:39 -07:00
Brian Paul
4f25a18011 mesa: trim down #includes in compiler.h
Don't include stuff we don't need.  Fix a few #includes elsewhere to
keep things building.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-26 08:38:39 -07:00
Brian Paul
538e13d4a1 r300g: remove dependency on compiler.h
It only needs typical stdio.h and stdlib.h functions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-26 08:38:38 -07:00
Brian Paul
609cb60d4b mesa: don't include limits.h in compiler.h
Not needed.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-26 08:38:38 -07:00
Brian Paul
13730bcaf3 mesa: don't include float.h in compiler.h
Not needed.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-26 08:38:38 -07:00
Brian Paul
ddf4b2e363 mesa: only include ctype.h where it's used
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-26 08:38:38 -07:00
Brian Paul
135b8c6530 mesa: include stdarg.h only where it's used
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-26 08:38:38 -07:00
Brian Paul
6b06697b0d mesa: remove M_PI, M_E, M_LOG2E macro definitions
Should be defined in math.h.  If not, we can add them to c99_math.h

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-26 08:38:38 -07:00
Brian Paul
6cb431c19c glsl: #include c99_math.h instead of core.h
We only need the M_LOG2E definition.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-26 08:38:38 -07:00
Brian Paul
36ea81d067 gallium: whitespace, comment formatting fixes in p_defines.h
Just to keep things consistent.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-02-26 08:38:38 -07:00
Brian Paul
e09fe38935 util: add debug_print_bind_flags() debug helper
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-02-26 08:38:38 -07:00
Brian Paul
2069f2c7fa gallium: renumber PIPE_BIND_ flags
Note that PIPE_BIND_COMMAND_ARGS_BUFFER and PIPE_BIND_LINEAR were both
bit 21 before.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-02-26 08:38:38 -07:00
Neil Roberts
a44606eb81 meta: In pbo_{Get,}TexSubImage don't repeatedly rebind the source tex
A layered PBO image is now interpreted as a single tall 2D image so
the z argument in _mesa_meta_bind_fbo_image is ignored. Therefore this
was just redundantly rebinding the same image repeatedly.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-26 12:04:21 +00:00
Marius Predut
1a93e7690d mesa: use fi_type in vertex attribute code
For 32-bit builds, floating point operations use x86 FPU registers,
not SSE registers.  If we're actually storing an integer in a float
variable, the value might get modified when written to memory.  This
patch changes the VBO code to use the fi_type (float/int union) to
store/copy vertex attributes.

Also, this can improve performance on x86 because moving floats with
integer registers instead of FP registers is faster.

Neil Roberts review:
- include changes on all places that are storing attribute values.
- check with and without -O3 compiler flag.
Brian Paul review:
- use fi_type type instead gl_constant_value type
- fix a bunch of nit-picks.
- fix compiler warnings

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82668
Signed-off-by: Marius Predut <marius.predut@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-25 16:35:49 -07:00
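The idea of moving attribute bits through an integer view can be sketched like this. The union layout and the names fi_type_example and copy_attr() are assumptions made for illustration; they are not Mesa's definitions.

```c
#include <stdint.h>

/* Illustrative float/int union. The real fi_type may differ. */
typedef union {
   float    f;
   int32_t  i;
   uint32_t u;
} fi_type_example;

/* Copy attributes through the integer member so the bit pattern is
 * preserved even when it is not a well-behaved float (for example an
 * integer or a signaling NaN stored in the same 32 bits). */
static void copy_attr(fi_type_example *dst, const fi_type_example *src,
                      unsigned count)
{
   for (unsigned k = 0; k < count; k++)
      dst[k].u = src[k].u;
}
```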
Anuj Phogat
4705346463 i965/gen8: Use HALIGN_16 if MCS is enabled for non-MSRT
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-02-25 14:11:59 -08:00
Anuj Phogat
84199fa647 i965: Pass pointer to miptree as function parameter in intel_horizontal_texture_alignment_unit
This will be used by next patch in the series.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-02-25 14:11:53 -08:00
Anuj Phogat
94d88cb468 i965: Allocate texture buffer in intelTexImage
before calling _mesa_meta_pbo_TexSubImage(). This will be used in
later patches and will be required in Skylake to get the tile
resource mode of miptree before calling _mesa_meta_pbo_TexSubImage().

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-02-25 14:11:46 -08:00
Anuj Phogat
82f6d17300 i965: Make a function to check the conditions to use the blitter
No functional changes in the patch. Just makes the code look cleaner.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-02-25 14:11:41 -08:00
Anuj Phogat
6960a3962c i965: Move the comment to the right place
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-02-25 14:11:37 -08:00
Anuj Phogat
524a729f68 i965: Fix condition to use Y tiling in blitter in intel_miptree_create()
Y tiling is supported in blitter on SNB+.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-02-25 14:11:32 -08:00
Anuj Phogat
688309374d meta: Pass null pointer for the pixel data to avoid unnecessary data upload
to a temporary pbo created in _mesa_meta_pbo_GetTexSubImage().

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-02-25 14:11:28 -08:00
Anuj Phogat
068ba4ac78 meta: Fix buffer object assignment to account for both pack and unpack bo's
create_texture_for_pbo() is shared by _mesa_meta_pbo_GetTexSubImage()
and _mesa_meta_pbo_TexSubImage() functions. So, we need to account
for both pack and unpack buffer objects.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-02-25 14:11:23 -08:00
Anuj Phogat
618c4c4b6a meta: Use GL_STREAM_READ for pbo created with GL_PIXEL_PACK_BUFFER
create_texture_for_pbo() is used by both _mesa_meta_pbo_GetTexSubImage()
and _mesa_meta_pbo_TexSubImage() functions with different PBO targets.
Use GL_STREAM_READ with GL_PIXEL_PACK_BUFFER and GL_STREAM_DRAW with
GL_PIXEL_UNPACK_BUFFER.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-02-25 14:11:14 -08:00
Anuj Phogat
8d6ae49a8b meta: Add assertion check for ctx->Meta->SaveStackDepth
before using it for dereferencing.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-02-25 14:10:59 -08:00
Anuj Phogat
0a4ea87344 meta: Do power of two samples check only for samples > 0
otherwise samples=0 passes the check, which is invalid.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-02-25 14:10:47 -08:00
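Why samples=0 slips through a bare power-of-two test can be sketched as follows. The helper names are hypothetical and the bit trick is a common idiom assumed here, not necessarily the exact check used in meta.

```c
#include <stdbool.h>

/* Common bit trick: true for powers of two, but also true for 0. */
static bool is_pow2_or_zero(unsigned n)
{
   return (n & (n - 1)) == 0;
}

/* Guarding with samples > 0 rejects the zero case. */
static bool is_valid_sample_count(unsigned samples)
{
   return samples > 0 && is_pow2_or_zero(samples);
}
```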
Matt Turner
cb25087c7b glsl: Rewrite and fix min/max to saturate optimization.
There were some bugs, and the code was really difficult to follow. We
would optimize

   min(max(x, b), 1.0) into max(sat(x), b)

but not pay attention to the order of min/max and also do

   max(min(x, b), 1.0) into max(sat(x), b)

Corrects four shaders from Champions of Regnum that do

   min(max(x, 1), 10)

and corrects rendering of Mass Effect under VMware Workstation.

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89180
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-25 08:44:49 -08:00
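A small sketch of why the order of min and max matters, assuming 0 <= b <= 1. The helper names are hypothetical and this is plain scalar arithmetic, not the GLSL optimizer's code.

```c
static float min_f(float a, float b) { return a < b ? a : b; }
static float max_f(float a, float b) { return a > b ? a : b; }

/* sat(x): clamp x to [0, 1]. */
static float sat_f(float x) { return min_f(max_f(x, 0.0f), 1.0f); }

/* min(max(x, b), 1.0): clamps x to [b, 1]. */
static float min_of_max(float x, float b) { return min_f(max_f(x, b), 1.0f); }

/* max(min(x, b), 1.0): never less than 1.0. */
static float max_of_min(float x, float b) { return max_f(min_f(x, b), 1.0f); }

/* max(sat(x), b): the saturate form. */
static float max_of_sat(float x, float b) { return max_f(sat_f(x), b); }
```

With b = 0.25 and x = 0.5, min_of_max() and max_of_sat() both give 0.5, while max_of_min() gives 1.0, so only the first form can be rewritten into the saturate form.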
Rob Clark
864340219b freedreno: drop ARRAY_SIZE macro
Since now ARRAY_SIZE has been added to util/macros.h.  Fixes a bunch of:

  freedreno_util.h:79:0: warning: "ARRAY_SIZE" redefined
   #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
   ^
  In file included from ../../../../src/gallium/include/pipe/p_compiler.h:36:0,
                   from ../../../../src/gallium/include/pipe/p_context.h:31,
                   from freedreno_context.h:32,
                   from freedreno_context.c:29:
  ../../../../src/util/macros.h:29:0: note: this is the location of the previous definition
   #  define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
   ^

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-25 08:37:58 -05:00
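For reference, a minimal use of the macro quoted in the warning above. The table and the function name are hypothetical; only the macro body follows the src/util/macros.h definition shown in the message.

```c
#include <stddef.h>

/* Same shape as the src/util/macros.h definition quoted above. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))

/* Hypothetical table, used only for illustration. */
static const int example_table[] = { 2, 3, 5, 7, 11 };

/* Sum the table, letting the macro supply the loop bound. */
static int sum_example_table(void)
{
   int total = 0;
   for (size_t i = 0; i < ARRAY_SIZE(example_table); i++)
      total += example_table[i];
   return total;
}
```

The macro only works on true arrays; applied to a pointer it silently yields sizeof(pointer) / sizeof(element).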
Neil Roberts
67e3302497 i965: Don't force x-tiling for 16-bpp formats on Gen>7
Sandybridge doesn't support y-tiling for surface formats with 16 or
more bpp. There was previously an override to explicitly allow this
for Gen7. However, this restriction is also removed in Gen8+ so we
should use y-tiling there too.

This is important to do for Skylake which doesn't support x-tiling for
3D surfaces.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-02-25 13:19:34 +00:00
Andreas Boll
6d164f65c5 glx: Fix returned values of GLX_RENDERER_PREFERRED_PROFILE_MESA
If the renderer supports the core profile, the query incorrectly returned
0x8 as the value, because it was using (1U << __DRI_API_OPENGL_CORE) for the
returned value.

The same happened with the compatibility profile. It returned 0x1
(1U << __DRI_API_OPENGL) instead of 0x2.

Internal DRI defines:
   dri_interface.h: #define __DRI_API_OPENGL       0
   dri_interface.h: #define __DRI_API_OPENGL_CORE  3

Those two bits are meant for internal use only and should be
translated to GLX_CONTEXT_CORE_PROFILE_BIT_ARB (0x1) for a preferred
core context profile and GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB (0x2)
for a preferred compatibility context profile.

This patch implements the above translation in the glx module.

v2: Fix the incorrect behavior in the glx module

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-25 08:23:38 +01:00
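The translation described in the commit above can be sketched as follows. The numeric values come from the commit message; the EX_-prefixed names are local stand-ins for the internal DRI defines, and translate_preferred_profile() is a hypothetical function, not the one in the glx module.

```c
/* Local copies of the internal DRI API numbers quoted in the message. */
#define EX_DRI_API_OPENGL        0
#define EX_DRI_API_OPENGL_CORE   3

/* GLX profile bits, with the values given in the message. */
#define GLX_CONTEXT_CORE_PROFILE_BIT_ARB           0x1
#define GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB  0x2

/* Map the internal DRI bit to the GLX profile bit. */
static unsigned int translate_preferred_profile(unsigned int dri_mask)
{
   if (dri_mask == (1u << EX_DRI_API_OPENGL_CORE))
      return GLX_CONTEXT_CORE_PROFILE_BIT_ARB;
   if (dri_mask == (1u << EX_DRI_API_OPENGL))
      return GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB;
   return 0;   /* unknown value: this sketch reports no profile */
}
```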
Andreas Boll
06924972d5 dri/common: Update comment about driQueryRendererIntegerCommon
Since 87d3ae0b45
driQueryRendererIntegerCommon handles __DRI2_RENDERER_PREFERRED_PROFILE
too.

Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-25 08:23:33 +01:00
Ilia Mirkin
720ba6ca97 glsl: add double support for packing varyings
Doubles are always packed, but a single double will never cross a slot
boundary -- single slots can still be wasted in some situations.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-24 22:07:29 -05:00
Laura Ekstrand
546aba143d common: Fix PBOs for 1D_ARRAY.
Corrects the way that _mesa_meta_pbo_TexSubImage and
_mesa_meta_pbo_GetTexSubImage handle 1D_ARRAY textures.  Fixes a failure in
the Piglit arb_direct_state_access/gettextureimage-targets test.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Tested-by: Laura Ekstrand <laura@jlekstrand.net>

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
2015-02-24 17:33:44 -08:00
Laura Ekstrand
ccc5ce6f72 common: Correct PBO 2D_ARRAY handling.
Changes PBO uploads and downloads to use a tall (height * depth) 2D texture
for blitting. This fixes the bug where 2D_ARRAY, 3D, and CUBE_MAP_ARRAY
textures are not properly uploaded and downloaded.

Removes the option to use a 2D ARRAY texture for the PBO during upload and
download.  This option didn't work because the miptree couldn't be set up
reliably.

v2: Review from Jason Ekstrand and Neil Roberts:
   -Delete the depth parameter from create_texture_for_pbo
   -Abandon the option to create a 2D ARRAY texture in create_texture_for_pbo

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
2015-02-24 17:30:13 -08:00
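Treating a layered image as one tall (height * depth) 2D image amounts to a simple row mapping, sketched below. The helper name tall_2d_row() is hypothetical, and the layer-major ordering is an assumption about how the slices are stacked.

```c
/* Hypothetical helper: row index of texel row y in the given layer,
 * when all layers are stacked vertically in one 2D image that is
 * height * depth rows tall. */
static unsigned tall_2d_row(unsigned y, unsigned layer, unsigned height)
{
   return layer * height + y;
}
```

For a 16-row image, row 3 of layer 2 lands on row 35 of the tall image.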
Laura Ekstrand
06084652fe common: Correct texture init for meta pbo uploads and downloads.
This moves the line setting immutability for the texture to after
_mesa_initialize_texture_object so that the initializer function will not
cancel it out. Moreover, because of the ARB_texture_view extension, immutable
textures must have NumLayers > 0, or depth will equal (0-1)=0xFFFFFFFF during
SURFACE_STATE setup, which triggers assertions.

v2: Review from Kenneth Graunke:
   - Include more explanation in the commit message.
   - Make texture setup bug fixes into a separate patch.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
2015-02-24 17:27:52 -08:00
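The wraparound mentioned in the commit above is ordinary unsigned arithmetic. A minimal sketch, assuming the depth field is programmed as the layer count minus one; depth_field_from_layers() is a hypothetical name, not a function in the driver.

```c
#include <stdint.h>

/* Assumed encoding: the hardware field holds (number of layers - 1).
 * With zero layers the unsigned subtraction wraps to 0xFFFFFFFF. */
static uint32_t depth_field_from_layers(uint32_t num_layers)
{
   return num_layers - 1;
}
```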
Brian Paul
88ff8dee02 mesa: remove DEG2RAD macro
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 17:10:28 -07:00
Brian Paul
ab68219a59 mesa: remove MAX_GLUSHORT, move MAX_GLUINT
The latter is only used in one place in swrast.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 17:10:28 -07:00
Brian Paul
f847ddb64d mesa: move signbit() macro to c99_math.h
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 17:10:28 -07:00
Brian Paul
612143b2d0 mesa: remove unused isblank() function
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 17:10:28 -07:00
Brian Paul
e033d2c642 glcpp: remove unneeded #include of core.h
isblank() is not used in the code.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 17:10:28 -07:00
Brian Paul
9fd7e9d831 mesa: remove sqrtf macro
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 17:10:28 -07:00
Kenneth Graunke
ee3f674572 i965: Remove redundant discard jumps.
With the previous optimization in place, some shaders wind up with
multiple discard jumps in a row, or jumps directly to the next
instruction.  We can remove those.

Without NIR on Haswell:
total instructions in shared programs: 5777258 -> 5775872 (-0.02%)
instructions in affected programs:     20312 -> 18926 (-6.82%)
helped:                                716

With NIR on Haswell:
total instructions in shared programs: 5773163 -> 5771785 (-0.02%)
instructions in affected programs:     21040 -> 19662 (-6.55%)
helped:                                717

v2: Use the CFG rather than the old instructions list.  Presumably
    the placeholder halt will be in the last basic block.

v3: Make sure placeholder_halt->prev isn't the head sentinel (caught
    twice by Eric Anholt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-24 15:24:53 -08:00
Kenneth Graunke
30f51f1a1a glsl: Optimize "if (cond) discard;" to a conditional discard.
st_glsl_to_tgsi and ir_to_mesa have handled conditional discards for a
long time; the previous patch added that capability to i965.

i965 (Haswell) shader-db stats:

Without NIR:
total instructions in shared programs: 5792133 -> 5776360 (-0.27%)
instructions in affected programs:     737585 -> 721812 (-2.14%)
helped:                                6300
HURT:                                  68
GAINED:                                2

With NIR:
total instructions in shared programs: 5787538 -> 5769569 (-0.31%)
instructions in affected programs:     767843 -> 749874 (-2.34%)
helped:                                6522
HURT:                                  35
GAINED:                                6

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-24 15:24:53 -08:00
Kenneth Graunke
8eb6c10999 i965/fs: Handle conditional discards.
The discard condition tells us which channels we want killed.  We want
to invert that condition to get the channels that should survive (remain
live) in f0.1.  Emit a CMP to negate it.

Nothing generates these today, but that will change shortly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-24 15:24:52 -08:00
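The inversion described in the commit above (discard condition in, surviving channels out) can be sketched with plain bitmasks. This assumes one bit per channel and a 16-wide mask; the function name is hypothetical and the real driver emits a CMP instruction rather than doing this on the CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: each bit is one SIMD channel.
 * live         = channels still alive before the discard
 * discard_cond = channels for which the discard condition is true
 * Returns the channels that stay alive afterwards. */
static uint16_t surviving_channels(uint16_t live, uint16_t discard_cond)
{
    /* Invert the condition: a channel survives only if it was live
     * and the discard condition is false for it. */
    uint16_t survivors = live & (uint16_t)~discard_cond;

    /* A discard can never bring a dead channel back to life. */
    assert((survivors & (uint16_t)~live) == 0);
    return survivors;
}
```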
Kenneth Graunke
8e62bd52f8 nir: Introduce nir_intrinsic_discard_if.
This is a conditional discard, which takes a boolean source.

Note that we don't generate ir_discard::condition today, so this
shouldn't break drivers (since none implement this intrinsic yet).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-24 15:24:52 -08:00
Kenneth Graunke
23d42b46e3 glsl: Delete dead discard conditions in constant folding.
opt_constant_folding() already detects conditional assignments where the
condition is constant, and either deletes the assignment or the
condition.

Make it handle discards in the same fashion.

Spotted happening in the wild in Tropico 5 shaders.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-24 15:24:52 -08:00
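The folding rule from the commit above can be sketched as a small decision function. The types and names here are hypothetical stand-ins, not the GLSL IR classes; the sketch only shows the three possible outcomes for a conditional discard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a discard condition after constant folding. */
struct sketch_condition {
    bool is_constant; /* did the expression fold to a constant? */
    bool value;       /* the constant value, if is_constant */
};

enum fold_action {
    KEEP_AS_IS,     /* condition is not constant: nothing to do */
    DROP_CONDITION, /* condition is always true: discard unconditionally */
    DELETE_DISCARD  /* condition is always false: the discard is dead */
};

static enum fold_action fold_discard(const struct sketch_condition *cond)
{
    /* Unconditional discards have no condition and nothing to fold. */
    assert(cond != NULL);

    if (!cond->is_constant)
        return KEEP_AS_IS;
    return cond->value ? DROP_CONDITION : DELETE_DISCARD;
}
```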
Kenneth Graunke
d77b186871 glsl: Handle conditional discards in lower_discard_flow().
This pass wasn't prepared to handle conditional discards.

Instead of initializing the "discarded" temporary to "true", set it to
the condition.  Then, refer to the variable for the condition, to avoid
duplicating the expression tree.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-24 15:24:52 -08:00
Kenneth Graunke
44b45da994 glsl: Make ir_rvalue_visitor visit ir_discard::condition.
This was forgotten.

I omitted the NULL check since we don't check ir_assignment::condition
either.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-24 15:24:52 -08:00
Kenneth Graunke
926d8b0510 glsl: Make ir_validate check the type of ir_discard::condition.
Copy and pasted from the ir_if::condition handling, plus a NULL check.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-24 15:24:52 -08:00
Matt Turner
6f5604601c Revert "i965/fs: Remove force_writemask_all assertion for execsize < 8."
This reverts commit 0d8f27eab7.

"This doesn't seem to be necessary." <- I was wrong!

Tested-by: Markus Wick <markus@selfnet.de>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-02-24 14:08:04 -08:00
Matt Turner
2c7a703b05 i965/fs: Emit MOV(1) instructions with force_writemask_all.
Fixes rendering with Dolphin.

Tested-by: Markus Wick <markus@selfnet.de>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-02-24 14:08:04 -08:00
Matt Turner
467077b834 i965/fs: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0.
total instructions in shared programs: 5695356 -> 5689775 (-0.10%)
instructions in affected programs:     486231 -> 480650 (-1.15%)
helped:                                2604
LOST:                                  1
2015-02-24 14:08:04 -08:00
Matt Turner
b8582d18e6 i965/fs/nir: Optimize integer multiply by a 16-bit constant.
Gen8+ support was just broken, since MUL now consumes 32-bits from both
sources. Fixes 986 piglit tests on my BDW.

total instructions in shared programs: 7753873 -> 7753522 (-0.00%)
instructions in affected programs:     28164 -> 27813 (-1.25%)
helped:                                77
GAINED:                                47

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-24 14:08:04 -08:00
Matt Turner
7a997a3863 i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0.
total instructions in shared programs: 7756214 -> 7753873 (-0.03%)
instructions in affected programs:     455452 -> 453111 (-0.51%)
helped:                                2333

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-24 14:08:04 -08:00
Jason Ekstrand
c750ecaa12 nir/register: Add a parent_instr field
This adds a parent_instr field similar to the one for ssa_def.  The
difference here is that the parent_instr field on a nir_register can be
NULL if the register does not have a unique definition or if that
definition does not dominate all its uses.  We set this field in the
out-of-SSA pass so that backends can get SSA-like information even after
they have gone out of SSA.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-24 14:08:04 -08:00
Marek Olšák
fc59695b92 st/mesa: remove unused/broken function st_print_shaders
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-02-24 22:59:57 +01:00
Brian Paul
a86054bac7 st/mesa: remove struct qualifier from st_src_reg parameter
It's a class.  Silences MSVC warning.
2015-02-24 14:44:19 -07:00
Brian Paul
a2b366b92c mesa: remove INV_SQRTF() macro
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 14:44:19 -07:00
Brian Paul
bbb2d84032 mesa: remove ceilf, floorf macros
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 14:44:19 -07:00
Brian Paul
bdd0402ca3 mesa: remove expf macro
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 14:44:19 -07:00
Brian Paul
cffedcf163 mesa: remove logf macro
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 14:44:19 -07:00
Brian Paul
f5816d77e2 mesa: remove powf macro
Use the wrapper in c99_math.h if needed.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 14:44:19 -07:00
Brian Paul
bad154e677 mesa: remove unused exp2f, log2f, truncf wrappers
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 14:44:19 -07:00
Brian Paul
aeabf4ede5 mesa: remove unused acosf, asinf, atan2f, etc. macros
Not used anywhere.  If any of these are needed, they should be added
to c99_math.h

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 14:44:19 -07:00
Brian Paul
bd7f7aac56 mesa: replace FABSF with fabsf
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 14:44:19 -07:00
Brian Paul
46ce78d4c6 mesa: replace FLOORF with floorf
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 14:44:19 -07:00
Brian Paul
b2c13534f7 mesa: remove unused CEILF macro
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 14:44:19 -07:00
Brian Paul
79b480ccc0 mesa: replace LOGF, EXPF with logf, expf
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 14:44:19 -07:00
Brian Paul
e25f7772ca mesa: replace FREXPF, LDEXPF with frexpf, ldexpf
Start getting rid of some imports.h macros.  Use the c99 functions instead.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 14:44:19 -07:00
Brian Paul
e6eddbb96a targets/libgl-xlib: add src/ include dir to fix build 2015-02-24 14:44:19 -07:00
Brian Paul
a55831e8fa swrast: fix a few release build warnings 2015-02-24 14:44:19 -07:00
Marek Olšák
1180e61a1b r600g,radeonsi: fix streamout after pipeline stats have been used
EVENT_TYPE_PIPELINESTAT_STOP disables streamout queries too.

Luckily, pipeline stats are enabled by default, so we don't even have to
emit EVENT_TYPE_PIPELINESTAT_START.

Tested on Hawaii, Bonaire, Redwood, RV730.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-24 21:21:04 +01:00
Marek Olšák
fdf2c04737 radeonsi: small cleanup around current_rast_prim
- remove the last parameter of si_emit_rasterizer_prim_state
- remove the last unused parameter of si_emit_draw_registers
- use current_rast_prim in si_emit_draw_registers

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-24 21:21:04 +01:00
Marek Olšák
0b1f31ab7f radeonsi: set current_rast_prim in the right place
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-24 21:21:04 +01:00
Marek Olšák
4eb0ccf9e7 radeonsi: simplify obtaining a shader property in si_emit_clip_regs
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-24 21:21:04 +01:00
Marek Olšák
5349437154 radeonsi: only preload VertexID for the GS copy shader
The copy shader doesn't use any other preloaded VGPRs.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-24 21:21:04 +01:00
Marek Olšák
ffd701e677 radeonsi: dump the shader key when dumping shaders
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-24 21:21:04 +01:00
Marek Olšák
93daf5a2f6 r600g,radeonsi: cleanup of hex literals
0x3F800000 -> fui(1.0)
0x00000000 -> 0

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-24 21:21:04 +01:00
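The equivalence 0x3F800000 -> fui(1.0) from the commit above follows from the IEEE 754 single-precision encoding of 1.0. A minimal sketch, assuming a 32-bit float; sketch_fui is a hypothetical stand-in written with memcpy, not the helper used in the tree.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a float-to-bits helper: returns the raw
 * bit pattern of a float as an unsigned 32-bit integer. */
static uint32_t sketch_fui(float f)
{
    uint32_t bits;

    assert(sizeof(bits) == sizeof(f));
    memcpy(&bits, &f, sizeof(bits)); /* copy bits, no value conversion */
    return bits;
}
```

On a platform with IEEE 754 floats, sketch_fui(1.0f) yields 0x3F800000 and sketch_fui(0.0f) yields 0, matching the two replacements listed in the commit.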
Marek Olšák
fa913a2dc6 radeonsi: set PA_SU_HARDWARE_SCREEN_OFFSET to 0
It was probably 0 already, but it doesn't hurt to set it.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-24 21:21:04 +01:00
Marek Olšák
558f51f1c5 st/mesa: cleanup st_translate_geometry_program
Mostly dead code or code that didn't do anything.

Computing gs_num_outputs at the end was also useless. It's already set
correctly.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-02-24 21:21:04 +01:00
Marek Olšák
94746cadc0 st/mesa: inline st_free_tokens
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-02-24 21:21:04 +01:00
Marek Olšák
b039302fb7 st/mesa: cleanup st_geometry_program structure
It's full of unused variables and variables only used
in st_translate_geometry_program.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-02-24 21:21:04 +01:00
Marek Olšák
002aa75022 mesa: add a missing GS support check in GetActiveUniformBlockiv
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-24 21:21:04 +01:00
Glenn Kennard
d80701df8a r600g: Implement GL_ARB_draw_indirect for EG/CM
Requires Evergreen/Cayman and radeon kernel module
2.41.0 or newer.

Expected piglit fails due to hardware limitations:
* arb_draw_indirect-draw-arrays-prim-restart
  Restarts not applied for DrawArrays commands
* arb_draw_indirect-vertexid
  Base vertex offset is not included in vertex id

Marek: bump vgt_state num_dw by 3 (= space needed for one register write)

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-02-24 21:21:04 +01:00
Rob Clark
dd70e78674 freedreno/a4xx: aniso filtering
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-24 14:23:38 -05:00
Rob Clark
c70097ae86 freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-24 14:23:38 -05:00
Rob Clark
daccbd27ce freedreno/a4xx: add ARB_instanced_arrays support
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-24 14:23:38 -05:00
Rob Clark
e13398714c freedreno/a4xx: handle index_bias (i.e. base_vertex)
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-24 14:23:38 -05:00
Rob Clark
283bb4848e freedreno/a4xx: add support for vertexid and instanceid sysvals
The ir3 bits of it are already in place from the a3xx patch.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-24 14:23:38 -05:00
Rob Clark
4aef0d79ee freedreno/a4xx: pass number of instances to draw
a4xx has its own draw packet, so it needs an update equivalent to what
a3xx already got.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-24 14:23:38 -05:00
Emil Velikov
86d88e2fbb docs: add news item and link release notes for mesa 10.4.5
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-24 16:10:52 +00:00
Emil Velikov
d60c628f2a docs: Add sha256 sums for the 10.4.5 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 41bdeda102)
2015-02-24 16:10:52 +00:00
Emil Velikov
1d761be43a Add release notes for the 10.4.5 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit a5c608e951)
2015-02-24 16:10:52 +00:00
Leo Liu
9c7b343bc0 st/omx/dec/h264: fix picture out-of-order with poc type 0 v2
poc counter should be reset with IDR frame,
otherwise there would be a re-order issue with
frames before and after IDR

v2: add commit message

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-02-24 10:39:49 -05:00
Emil Velikov
fece147be5 install-lib-links: remove the .install-lib-links file
With earlier commit (install-lib-links: don't depend on .libs directory)
we moved the location of the file from .libs/ to the current dir.
However, we did not notice that in the former case autotools was
doing us a favour by removing the file. Explicitly remove the file at
clean-local time, otherwise we'll end up with dangling files.

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-24 15:33:25 +00:00
Francisco Jerez
f8f3aa78d8 clover: Set appropriate flag defaults on memory object creation.
According to the spec when no device access mode is specified
clCreateBuffer and clCreateImage* should default to read/write, and
clCreateSubBuffer should default to the parent's device access flags.

clCreateSubBuffer is also required to inherit the host access and
host pointer flags from the parent.

Reviewed-and-tested-by: EdB <edb+mesa@sigluy.net>
2015-02-24 16:18:14 +02:00
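The defaulting rules from the commit above can be sketched as follows. The flag constants and function names are hypothetical (they are not the OpenCL or clover definitions), and only the device access flags are modelled; inheritance of the host access and host pointer flags is left out.

```c
#include <assert.h>

/* Hypothetical flag bits, for illustration only. */
enum {
    SKETCH_MEM_READ_WRITE = 1 << 0,
    SKETCH_MEM_WRITE_ONLY = 1 << 1,
    SKETCH_MEM_READ_ONLY  = 1 << 2,
    SKETCH_DEV_ACCESS_MASK = SKETCH_MEM_READ_WRITE |
                             SKETCH_MEM_WRITE_ONLY |
                             SKETCH_MEM_READ_ONLY
};

/* Buffer or image creation: no device access mode means read/write. */
static unsigned buffer_flags(unsigned requested)
{
    if ((requested & SKETCH_DEV_ACCESS_MASK) == 0)
        requested |= SKETCH_MEM_READ_WRITE;
    return requested;
}

/* Sub-buffer creation: no device access mode means "same as parent". */
static unsigned sub_buffer_flags(unsigned requested, unsigned parent)
{
    /* The parent is assumed to have had its defaults applied already. */
    assert((parent & SKETCH_DEV_ACCESS_MASK) != 0);

    if ((requested & SKETCH_DEV_ACCESS_MASK) == 0)
        requested |= parent & SKETCH_DEV_ACCESS_MASK;
    return requested;
}
```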
EdB
0e8460a528 clover: Add CL_MEM_HOST_* flag checks.
Those flags have been introduced in OpenCL 1.2.

[ Francisco Jerez: Rebase.  Throw CL_INVALID_VALUE from
  clCreateSubBuffer if the subbuffer drops access flags from its
  parent.  Use single function taking the set of allowed host access
  flags to validate memory transfer operands. ]

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-02-24 16:17:18 +02:00
Francisco Jerez
80d3c1e537 clover: Factor out memory object flags validation to a helper function.
And define constants for commonly used subsets of flags to save some
typing.

Reviewed-and-tested-by: EdB <edb+mesa@sigluy.net>
2015-02-24 16:15:48 +02:00
Eric Anholt
49d3c6a8e6 vc4: Update to current kernel sources.
New BO create and mmap ioctls are added.  The submit ABI gains a flags
argument, and the pointers are fixed at 64-bit.  Shaders are now fixed at
the start of their BOs.
2015-02-24 13:49:12 +00:00
Eric Anholt
1d1e820a6d r600: Fix build after 984f306937
Same as for the CLAMP macro, undef it before including a header file that
tries to make fields with that name.
2015-02-24 13:49:12 +00:00
Tobias Klausmann
98ae01c822 st/nine: Mark end of non-void function unreachable
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-24 12:21:00 +00:00
Tobias Klausmann
984f306937 gallium: include util/macros.h
The most common macros are defined there, so there is no use duplicating
them. Clean up the already redefined macros.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-24 12:20:59 +00:00
Alex Henrie
9913ce14e7 driconf: Update Catalan translation
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
2015-02-24 09:03:45 +00:00
Alex Henrie
d28a4b523d driconf: Update Spanish translation
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
2015-02-24 09:03:45 +00:00
Eduardo Lima Mitev
0c47e5492b mesa: Add missing error checks to GetProgramInfoLog, GetShaderInfoLog and GetProgramiv
Fixes 3 dEQP tests:
* dEQP-GLES3.functional.negative_api.state.get_program_info_log
* dEQP-GLES3.functional.negative_api.state.get_shader_info_log
* dEQP-GLES3.functional.negative_api.state.get_programiv

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-24 08:58:54 +01:00
Iago Toral Quiroga
fe74fee8fa i965: Fix non-AA wide line rendering with fractional line widths
"(...)Let w be the width rounded to the nearest integer (...). If the
line segment has endpoints given by (x0,y0) and (x1,y1) in window
coordinates, the segment with endpoints (x0,y0-(w-1)/2) and
(x1,y1-(w-1)/2) is rasterized, (...)"

The hardware is not rounding the line width, so we should do it.

Also, we should be careful not to go beyond the hardware limits
for the line width after it gets rounded. Gen6-7 define a maximum line
width slightly below 8.0, so we should advertise a maximum line
width lower than 7.5 to make sure that 7.0 is the maximum integer
line width that we can select. Since the line width granularity in these
platforms is 0.125, we choose 7.375. Other platforms advertise rounded
maximum line widths, so those are fine.

Fixes the following 3 dEQP tests:
dEQP-GLES3.functional.rasterization.primitives.lines_wide
dEQP-GLES3.functional.rasterization.fbo.texture_2d.primitives.lines_wide
dEQP-GLES3.functional.rasterization.fbo.rbo_singlesample.primitives.lines_wide

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-24 08:58:54 +01:00
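The arithmetic in the commit above (advertise 7.375 so that the largest rounded width is 7.0) can be checked with a small sketch. The function below is hypothetical and is not the driver code; it clamps to the advertised maximum, rounds to the nearest integer, and, as an assumption taken from the GL rule for wide lines, treats a rounded width of 0 as 1.

```c
#include <assert.h>

/* Hypothetical helper: non-antialiased line width as the hardware
 * should see it, given the advertised maximum width. */
static float rounded_line_width(float width, float max_width)
{
    assert(max_width >= 1.0f);

    /* Clamp to the advertised range first. */
    if (width > max_width)
        width = max_width;
    if (width < 0.0f)
        width = 0.0f;

    /* Round to nearest; width is non-negative here, so adding 0.5 and
     * truncating is sufficient. */
    float rounded = (float)(int)(width + 0.5f);

    /* Assumption: a width that rounds to 0 is treated as 1. */
    return rounded < 1.0f ? 1.0f : rounded;
}
```

With a maximum of 7.375, any requested width at or above the maximum ends up as 7.0, which stays below the Gen6-7 hardware limit of slightly under 8.0.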
Iago Toral Quiroga
6148e3aae7 mesa: Fix ctx->Texture.CubeMapSeamless
The intel driver code, and apparently all other Mesa drivers, call
_mesa_initialize_context early in the CreateContext hook. That
function will end up calling _mesa_init_texture which will do:

ctx->Texture.CubeMapSeamless = _mesa_is_gles3(ctx);

But this won't work at this point, since _mesa_is_gles3 requires
ctx->Version to be set and that will not happen until late
in the CreateContext hook, when _mesa_compute_version is called.

We can't just move the call to _mesa_compute_version before
_mesa_initialize_context, since it requires the available extensions
to have been computed, which in turn requires other things to be
initialized, etc. Instead, we enable seamless cube maps starting with
GLES2, which should work for most implementations, and expect
drivers that don't support this to disable it manually as part
of their context initialization setup.

Fixes the following 192 dEQP tests:
dEQP-GLES3.functional.texture.filtering.cube.formats.*
dEQP-GLES3.functional.texture.filtering.cube.sizes.*
dEQP-GLES3.functional.texture.filtering.cube.combinations.*
dEQP-GLES3.functional.texture.mipmap.cube.*
dEQP-GLES3.functional.texture.vertex.cube.filtering.*
dEQP-GLES3.functional.texture.vertex.cube.wrap.*
dEQP-GLES3.functional.shaders.texture_functions.texturelod.samplercube_fixed_*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-24 08:58:54 +01:00
Eduardo Lima Mitev
dccdf1d687 mesa: Return error if BeginQuery is called with an existing object of different type
Section 2.14 Asynchronous Queries, page 84 of the OpenGL ES 3.0.4
spec states:

  "BeginQuery generates an INVALID_OPERATION error if any of the
   following conditions hold: [...] id is the name of an
   existing query object whose type does not match target; [...]"

Similar wording exists in the OpenGL 4.5 spec, section 4.2. QUERY
OBJECTS AND ASYNCHRONOUS QUERIES, page 43.

Fixes 1 dEQP test:
* dEQP-GLES3.functional.negative_api.fragment.begin_query

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-24 08:58:53 +01:00
Eduardo Lima Mitev
3699866463 mesa: Return INVALID_OPERATION when querying a never bound Query obj
Section 2.14 Asynchronous Queries, page 84 of the OpenGL ES 3.0.4 states:

"The command void GenQueries( sizei n, uint *ids ); returns n previously unused
query object names in ids. These names are marked as used, for the purposes of
GenQueries only, but no object is associated with them until the first time they
are used by BeginQuery."

This means that any attempt to use or query a Query object id before it has ever
been bound by calling glBeginQuery should be treated as referring to an invalid
object.

Fixes 1 dEQP test:
* dEQP-GLES3.functional.negative_api.state.get_query_objectuiv

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-24 08:58:53 +01:00
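The two query-object rules quoted in the two commits above can be sketched together. The struct, enum and function names are hypothetical and much simpler than the real state tracking; the sketch only models "a name exists but has no object until BeginQuery" and "an existing object has a fixed type".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_error { SKETCH_NO_ERROR, SKETCH_INVALID_OPERATION };

/* Hypothetical model of a query name returned by GenQueries. */
struct sketch_query {
    bool has_object; /* false until the first BeginQuery on this name */
    unsigned target; /* type of the object, valid if has_object */
};

/* BeginQuery: an existing object must match the requested target. */
static enum sketch_error check_begin_query(const struct sketch_query *q,
                                           unsigned target)
{
    assert(q != NULL);
    if (q->has_object && q->target != target)
        return SKETCH_INVALID_OPERATION;
    return SKETCH_NO_ERROR;
}

/* GetQueryObject*: a name that was never bound has no object to query. */
static enum sketch_error check_get_query_object(const struct sketch_query *q)
{
    assert(q != NULL);
    return q->has_object ? SKETCH_NO_ERROR : SKETCH_INVALID_OPERATION;
}
```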
Iago Toral Quiroga
4db4a559ad mesa: Add _mesa_is_array_texture helper
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-24 08:58:53 +01:00
Eduardo Lima Mitev
2aa71e9485 mesa: Fix error validating args for TexSubImage3D
The zoffset and depth values were not being considered when calling
error_check_subtexture_dimensions().

Fixes 2 dEQP tests:
* dEQP-GLES3.functional.negative_api.texture.texsubimage3d_neg_offset
* dEQP-GLES3.functional.negative_api.texture.texsubimage3d_invalid_offset

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-02-24 08:58:53 +01:00
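The missing check described in the commit above amounts to validating the z range the same way the x and y ranges are validated. A minimal sketch, assuming a borderless image and a hypothetical function name (this is not error_check_subtexture_dimensions() itself):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical check: does [zoffset, zoffset + depth) fit inside an
 * image that is image_depth slices deep? Borders are ignored. */
static bool subimage_z_range_ok(int zoffset, int depth, int image_depth)
{
    assert(image_depth >= 0);

    if (zoffset < 0 || depth < 0)
        return false;

    /* Written as a subtraction to avoid overflowing zoffset + depth. */
    return zoffset <= image_depth && depth <= image_depth - zoffset;
}
```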
Samuel Iglesias Gonsalvez
fbd6eba72b i965/blorp: round to nearest when converting float into integer
Fixes:

dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_linear
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_y_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_y_linear
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_y_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_y_linear
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_x_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_x_linear
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_y_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_y_linear

No piglit regressions.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 08:58:53 +01:00
Carl Worth
4a6c6c49a7 i965: Perform program state upload outside of atom handling
Across the board of the various generations, the initial few atoms in
all of the atom lists are basically the same, (performing uploads for
the various programs). The only difference is that prior to gen6
there's an ff_gs upload in place of the later gs upload.

In this commit, instead of using the atom lists for this program state
upload, we add a new function brw_upload_programs that calls into the
per-stage upload functions which in turn check dirty bits and return
immediately if nothing needs to be done.

This commit is intended to have no functional change. The motivation
is that future code, (such as the shader cache), wants to have a
single function within which to perform various operations before and
after program upload, (with some local variables holding state across
the upload).

It may be worth looking at whether some of the other functionality
currently handled via atoms might also be more cleanly handled in a
similar fashion.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-23 14:54:15 -08:00
Vivek Kasireddy
1e96eece30 egl, wayland: RGB565 format support on Back-buffer
In the current code, the color format is always hardcoded to
__DRI_IMAGE_FORMAT_ARGB8888 when a buffer or DRI image is
allocated in get_back_bo and dri2_get_buffers, regardless of the
current target's color format. This may lead to an incorrect render
pitch calculation, which results in wrong pixel offsets in the
frame buffer when the image has a different color format from the
DRI surface's, especially one with a different bpp (e.g. RGB565, 16bpp).

This patch simply adds RGB565 and XRGB8888 cases to the two
functions noted above to resolve the issue.

v2: added a case of XRGB8888, format and bpp selection is done
    via switch-case (not "if-else" anymore)

Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
2015-02-23 14:07:02 -08:00
Brian Paul
cbd287f094 mesa: move math-related function into new c99_math.h file
The alternative would be to include math.h in c99_compat.h but that
seems heavy-handed.

This patch also replaces INLINE with inline in the c99 math function
wrappers.

Fixes MSVC build.

Acked-by: Matt Turner <mattst88@gmail.com>
2015-02-23 14:45:14 -07:00
Jason Ekstrand
9b9ef2aeee nir/gcm: Add some missing break statements
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-23 13:20:13 -08:00
Jason Ekstrand
cb4b2ad44a nir: Copy-propagate vecN operations that are actually moves
We were already doing this for ALU operations but not for non-ALU
operations.  This changes that.

total NIR instructions in shared programs: 2039883 -> 2022338 (-0.86%)
NIR instructions in affected programs:     1768850 -> 1751305 (-0.99%)
helped:                                    14244
HURT:                                      124

total FS instructions in shared programs: 4083960 -> 4084036 (0.00%)
FS instructions in affected programs:     7302 -> 7378 (1.04%)
helped:                                   12
HURT:                                     51

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-23 13:19:05 -08:00
Francisco Jerez
f80af89d48 ra: Disable round-robin strategy for optimistically colorable nodes.
The round-robin allocation strategy is expected to decrease the amount
of false dependencies created by the register allocator and give the
post-RA scheduling pass more freedom to move instructions around.  On
the other hand it has the disadvantage of increasing fragmentation and
decreasing the number of equally-colored nearby nodes, which increases
the likelihood of failure in the presence of optimistically colorable
nodes.

This patch disables the round-robin strategy for optimistically
colorable nodes.  These typically arise in situations of high register
pressure or for registers with large live intervals, in both cases the
task of the instruction scheduler shouldn't be constrained excessively
by the dense packing of those nodes, and a spill (or on Intel hardware
a fall-back to SIMD8 mode) is invariably worse than a slightly less
optimal scheduling.

Shader-db results on the i965 driver:

total instructions in shared programs: 5488539 -> 5488489 (-0.00%)
instructions in affected programs:     1121 -> 1071 (-4.46%)
helped:                                1
HURT:                                  0
GAINED:                                49
LOST:                                  5

v2: Re-enable round-robin already for the lowest one of the nodes
    pushed optimistically onto the stack (Connor).
v3: Use UINT_MAX instead of ~0, open-code MIN2 (Jason, Connor).

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-23 20:55:40 +02:00
Francisco Jerez
34c93fd7f1 i965/fs: Fix lower_load_payload() not to use an incorrect half for immediates and uniforms.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-23 20:55:40 +02:00
Francisco Jerez
ea7b4d25c8 i965/fs: Fix lower_load_payload() to take into account non-zero reg_offset.
Fixes metadata guess when instructions in the program specify a
destination register with non-zero reg_offset and when the payload of
a LOAD_PAYLOAD spans several registers.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-23 20:55:40 +02:00
Francisco Jerez
08b4c8f7bf i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().
MRFs cannot be read from anyway so they cannot possibly be a valid
source of LOAD_PAYLOAD.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-23 20:55:40 +02:00
Francisco Jerez
8e47f51a5a i965/fs: Less broken handling of force_writemask_all in lower_load_payload().
It's perfectly fine to read the second half of a register written with
force_writemask_all from a first half MOV instruction or vice versa, and
lower_load_payload shouldn't mark the whole MOV as belonging to the second
half in that case.  Replicate the same metadata to both halves of the
destination when writemasking is disabled.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-23 20:55:40 +02:00
Matt Turner
57d80d11b1 mesa/vbo: Use unreachable to silence uninitialized var warning.
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-23 10:49:57 -08:00
Matt Turner
bb2a897dbc mesa: Move START/END_FAST_MATH macros to their only use.
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-23 10:49:48 -08:00
Matt Turner
08bc7cf8f6 mesa: Remove definition of NULL.
If your stdlib.h doesn't define this you should fix your stdlib.h.

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-23 10:49:47 -08:00
Matt Turner
bfcdb84383 mesa: Use assert() instead of ASSERT wrapper.
Acked-by: Eric Anholt <eric@anholt.net>
2015-02-23 10:49:47 -08:00
Matt Turner
52049f8fd8 mesa: Remove CHECK macro.
There's some commentary about how it's defined by other "modules", and
maybe that was true in 2000 when the code was added.

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-23 10:41:22 -08:00
Matt Turner
6a587a4461 mesa: Remove dead CAPI define.
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-23 10:41:22 -08:00
Matt Turner
14ded5ee61 gallium: Use util_cpu_to_le{16,32} in many more places.
... and util_le{16,32}_to_cpu. I think I've used the right ones for
describing the actual operation performed (even though they're both just
"byte-swap this if I'm on big-endian").

The Linux Kernel has typedefs __le32/__be32 and friends that static
analysis tools can use to check that byte-orderings are correct. It
might be interesting to apply that here as well.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-23 10:41:22 -08:00
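The "byte-swap this if I'm on big-endian" behaviour mentioned in the commit above can be sketched portably. The names below are hypothetical stand-ins with a sketch_ prefix, not the gallium helpers, and the endianness is detected at run time purely to keep the example self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns nonzero if the lowest-addressed byte is the most significant. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    assert(first == 0 || first == 1);
    return first == 0;
}

static uint32_t sketch_bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* CPU order to little-endian: identity on little-endian hosts. */
static uint32_t sketch_cpu_to_le32(uint32_t v)
{
    return host_is_big_endian() ? sketch_bswap32(v) : v;
}

/* Little-endian to CPU order: the same operation, named for intent. */
static uint32_t sketch_le32_to_cpu(uint32_t v)
{
    return sketch_cpu_to_le32(v);
}
```

Both directions perform the same swap; keeping two names documents which byte order a value is in at each point, which is the distinction the commit message draws.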
Matt Turner
3492e88090 gallium/util: Use HAVE___BUILTIN_* macros.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-23 10:41:22 -08:00
Matt Turner
5a191f49ad mesa: Move C99 MSVC compatibility code from u_math.h to c99_compat.h.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-23 10:41:21 -08:00
Matt Turner
0b6d43e329 i965: Link test programs with gtest before pthreads.
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540962
2015-02-23 10:41:21 -08:00
Brian Paul
5dc6c8c570 osmesa: add gallium include dirs to Makefile.am
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89260
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-23 10:07:48 -07:00
Brian Paul
44375a3b13 util: move pipe_prim_names array into u_prim_name()
Also, wrapping the array in #ifdef DEBUG / #endif doesn't seem necessary.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-23 10:02:39 -07:00
Brian Paul
f1c67e37e6 util: rewrite debug_print_transfer_flags() using debug_dump_flags()
Add missing PIPE_TRANSFER_PERSISTENT, PIPE_TRANSFER_COHERENT flags.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-23 10:02:39 -07:00
Eduardo Lima Mitev
0bfe21e8e0 mesa: Adds missing error condition in _mesa_check_sample_count()
This corrects a trivial error introduced in commit
19252fee46. That patch was merged recently
and omits one condition (that 'samples' is greater than zero) in one of
the error checks. That error will definitely cause regressions.

Also corrects the reference to the specification above the error check,
which was wrongly quoting OpenGL instead of OpenGL-ES.

Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-02-23 15:04:26 +01:00
Marek Olšák
050bf75c8b radeonsi: fix a warning caused by previous commit
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
2015-02-23 11:45:00 +01:00
Marek Olšák
7820a11e3d radeonsi: fix point sprites
Broken by a27b74819a.

This fix is critical and should be ported to stable ASAP.

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
2015-02-23 11:40:55 +01:00
Ben Widawsky
6e62a52865 i965/skl: Use 1 register for uniform pull constant payload
When under dispatch_width=16 the previous code would allocate 2 registers for
the payload when only one is needed. This manifested itself through bugs on SKL
which needs to mess with this instruction.

Ken thought this might impact shader-db, but apparently it doesn't.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89118
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88999
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Timo Aaltonen <timo.aaltonen@canonical.com>
2015-02-22 12:27:35 -08:00
Eric Anholt
4359954d84 nir: Generalize the optimization of subs of subs from 0.
I initially wrote this based on the "(('fneg', ('fneg', a)), a)" above,
but we can generalize it and make it more potentially useful.  In the
specific original case of a 0 for our new 'a' argument, it'll get further
algebraic optimization once the 0 is an argument to the new add.

No shader-db effects.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-21 14:57:14 -08:00
Eric Anholt
345c2b288a nir: Collapse repeated bcsels on the same argument.
vc4 results:
total instructions in shared programs: 39881 -> 39794 (-0.22%)
instructions in affected programs:     6302 -> 6215 (-1.38%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-21 14:57:14 -08:00
Eric Anholt
a38038ca5e nir: When faced with a csel on !condition, just flip the arguments.
total NIR instructions in shared programs: 39426 -> 39411 (-0.04%)
NIR instructions in affected programs:     3748 -> 3733 (-0.40%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-21 14:57:14 -08:00
Eric Anholt
8e1152cb33 nir: Allow nir_opt_algebraic to see booleanness through &&, ||, ^, !.
We have some useful optimizations to drop things like 'ine a, 0' on a
boolean argument, but if 'a' came from logical operations on bools, it
couldn't tell.  These kinds of constructs appear as a result of TGSI->NIR
quite frequently (at least with if flattening), so being a little more
aggressive in detecting booleans can pay off.

v2: Add ixor as a booleanness-preserving op (Suggestion by Connor).

vc4 results:
total instructions in shared programs: 40207 -> 39881 (-0.81%)
instructions in affected programs:     6677 -> 6351 (-4.88%)

Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-21 14:57:14 -08:00
Eric Anholt
dc982f4a85 nir: Add a couple of simplifications of csel operations.
vc4 was already cleaning these up, but it does shave 4 NIR instructions in
shader-db.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-21 14:57:14 -08:00
Ilia Mirkin
c2ece77678 glsl: ensure that enter/leave record get a record type
May make life easier for tools like Coverity.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-21 17:27:24 -05:00
Ilia Mirkin
1763494b31 tgsi: avoid returning pointer to local var, make it static
Spotted by Coverity.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-02-21 17:27:24 -05:00
Rob Clark
51e335742e freedreno/a4xx: set PC_PRIM_VTX_CNTL.VAROUT properly
Fixes xonotic, some webgl stuff, and really pretty much anything with
more than 4 varyings.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-21 17:11:02 -05:00
Rob Clark
fb1301e40a freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-21 17:11:02 -05:00
Rob Clark
bdf023482a freedreno/a4xx: bit of cleanup
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-21 17:11:02 -05:00
Rob Clark
9153dd4b7e loader: not having a pci-id should not be a warn
If there is no pci-id, which is valid for vc4 and freedreno, just emit
an info msg.  Keep malformed but existing pci-id's as a warning.

Mostly just to clean up a warning that confuses users for the non-pci
devices.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-21 17:11:02 -05:00
Rob Clark
e17437386c freedreno: implement fence
I never actually implemented the stubbed out fence stuff back in the
early days.  Fix that.

We'll need a few libdrm_freedreno changes to handle timeout properly,
so ignore that for now to avoid a libdrm_freedreno dependency bump.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-21 17:11:02 -05:00
Rob Clark
6855226653 freedreno/a2xx: fix increment in assert
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88883
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-21 17:11:01 -05:00
Jordan Justen
49a938a265 i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data
The brw_imm_ud will yield a HW_REG which then will introduce a barrier
for certain optimization opportunities.

No piglit regressions seen with gen8 (simd8vs).

Suggested-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-21 11:40:53 -08:00
Jordan Justen
17fbd854e0 i965/fs: Set pixel/sample mask for compute shaders atomic ops
For fragment programs, we pull this mask from the payload header. The same
mask doesn't exist for compute shaders, so we set all bits to enabled.

Previously we were setting 0xff to support SIMD8 VS, but with CS we
support SIMD16, and therefore we change this to 0xffff.

Related commits for SIMD8 VS:

commit d9cd982d55
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Sun Feb 15 20:06:59 2015 -0800
    i965/simd8vs: Fix SIMD8 atomics

commit 4a95be9772
Author: Jordan Justen <jordan.l.justen@intel.com>
Date:   Tue Feb 17 09:57:35 2015 -0800
    i965/simd8vs: Fix SIMD8 atomics (read-only)

Note: this mask is ANDed with the execution mask, so some channels may not end
up issuing the atomic operation.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-02-21 11:40:53 -08:00
Chia-I Wu
9fe81879c5 ilo: R32G32B32_FLOAT need no special care on Gen8+
Gen8+ must use VALIGN_4.  Unlike prior Gens, R32G32B32_FLOAT should supposedly
support VALIGN_4.
2015-02-21 11:33:54 +08:00
Chia-I Wu
226109436f ilo: 128 BPP formats can use TiledY on Gen7.5+
The restriction is lifted.
2015-02-21 11:33:54 +08:00
Ilia Mirkin
f8e4792b22 nvc0: enable double support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-20 19:51:50 -05:00
Ilia Mirkin
5491458843 nvc0/ir: remove merge/split pairs to allow normal propagation to occur
Because the TGSI interface creates merges for each instruction source
and then splits them back out, there are a lot of unnecessary
merge/split pairs which do essentially nothing. The various modifier/etc
propagation doesn't know how to walk through those, so just remove them
when they're unnecessary.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-20 19:51:50 -05:00
Ilia Mirkin
93812dc10a nvc0/ir: add support for new TGSI double opcodes
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-20 19:51:43 -05:00
Ilia Mirkin
ef8f09be33 nvc0/ir: handle zero and negative sqrt arguments
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-20 19:30:28 -05:00
Ilia Mirkin
88127874a3 nvc0/ir: no instruction can load a double immediate
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-20 19:30:28 -05:00
Ilia Mirkin
b87b498b88 nvc0/ir: fix lowering of RSQ/RCP/SQRT/MOD to work with F64
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-20 19:30:28 -05:00
Ilia Mirkin
93ebe91bae gm107/ir: fix F2F flipped stype/dtype flags
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-20 19:30:27 -05:00
Ilia Mirkin
dbf4a674b9 gm107/ir: fix DSET boolean float flag
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-20 19:30:27 -05:00
Ilia Mirkin
727018bb0c gm107/ir: fix DMUL opcode encoding
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-20 19:30:27 -05:00
Ilia Mirkin
493ad88e1b gk110/ir: add emission of dadd/dmul/dmad opcodes
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-20 19:30:27 -05:00
Ilia Mirkin
fd0b1a4cbf nvc0/ir: add emission of dadd/dmul/dmad opcodes, fix minmax
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-20 19:30:27 -05:00
Roland Scheidegger
88305dfd0b mesa: don't enable NV_fragment_program_option with swrast
Since dropping some NV_fragment_program opcodes (commits
868f95f1da, a3688d686f)
we can no longer parse all opcodes necessary for this extension, leading
to bugs (https://bugs.freedesktop.org/show_bug.cgi?id=86980).
Hence don't announce support for it in swrast (no other driver enabled it).
(Note that remnants of some NV_fp/vp extensions remain; they could be
dropped but are required as hacks for getting viewperf11 catia to run.)
2015-02-21 01:23:00 +01:00
Brian Paul
9dbe5e1dca drivers/x11: add gallium include dirs to Makefile.am
Fixes xlib driver build after e8c5cbfd92.

Acked-by: Matt Turner <mattst88@gmail.com>
2015-02-20 16:25:07 -07:00
Marek Olšák
0feb0b7373 vbo: fix an uninitialized-variable warning
It looks like a bug to me.

Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-21 00:16:35 +01:00
Marek Olšák
41f49a2fd4 gallium/sw/kms: fix a type-mismatch warning
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-21 00:16:35 +01:00
Marek Olšák
1a44566132 gallium/sw/kms: don't redefine DEBUG
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-21 00:16:35 +01:00
Marek Olšák
f900233928 targets/d3dadapter9: remove an unused variable
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-21 00:16:35 +01:00
Marek Olšák
ab947d2dd8 tgsi: fix type-mismatch warning
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-21 00:16:34 +01:00
Marek Olšák
6f273ec408 gallivm: fix uninitialized-variable warnings
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-21 00:16:34 +01:00
Matt Turner
b21ad12485 mesa: Have configure define NDEBUG, not mtypes.h.
mtypes.h had been defining NDEBUG (used by assert) if DEBUG was not
defined. Confusing and bizarre that you don't get NDEBUG if you don't
include mtypes.h.

... which is just what happened in commit bef38f62e.

Let's let configure define this for us if not using --enable-debug.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-20 14:10:38 -08:00
Kenneth Graunke
b6393d7040 nir: Fix the Mesa build without -DDEBUG.
With -DDEBUG -UNDEBUG, this assert uses reg_state::stack_size, which
doesn't exist, breaking the build:

assert(state->states[index].index < state->states[index].stack_size);

Switch it to ifndef NDEBUG, so the field will exist if the assertion
actually generates code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-20 13:43:44 -08:00
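The fix above relies on a debug-only field and the assert() that reads it being controlled by the same macro. A minimal sketch of that pattern follows; the struct and function names are hypothetical and only loosely modelled on the quoted assertion.

```c
#include <assert.h>

/* Hypothetical state struct: stack_size exists only when assert() is live. */
struct sketch_reg_state {
   int index;
#ifndef NDEBUG
   int stack_size;   /* guarded by the same macro that enables assert() */
#endif
};

static void sketch_reg_state_init(struct sketch_reg_state *s, int stack_size)
{
   s->index = 0;
#ifndef NDEBUG
   s->stack_size = stack_size;
#else
   (void)stack_size;   /* nothing to record when assertions are compiled out */
#endif
}

static int sketch_reg_state_push(struct sketch_reg_state *s)
{
   /* With NDEBUG defined this expands to nothing, so the missing field is
    * never referenced and the build cannot break. */
   assert(s->index < s->stack_size);
   return s->index++;
}
```

Guarding the field with DEBUG while assert() is controlled by NDEBUG leaves a combination of flags in which the assertion is compiled but the field is not, which is the kind of mismatch the commit describes.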
Eric Anholt
bef38f62e0 nir: Drop dependency on mtypes.h for core NIR.
One less new directory necessary for gallium code that wants to interact
with NIR.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-20 11:36:34 -08:00
Eric Anholt
90b4bf2e6e glsl: Only include mtypes from glsl_types.h for the C++ code that needs it.
It's used in one of the methods, not in the structure definitions.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-20 11:36:34 -08:00
Eric Anholt
b53d035825 util: Move Mesa's bitset.h to util/.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-20 11:36:34 -08:00
Eric Anholt
8aa381e3cd mesa: Make bitset.h not rely on Mesa-specific types and functions.
Note that we can't use u_math.h's align() because it's a function instead
of a macro, while BITSET_DECLARE needs a constant expression for nouveau's
usage in global declarations.

v2: Stick some parens around the bits macro argument usage (review by Jose).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-20 11:36:34 -08:00
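The point about needing a macro rather than a function can be seen in a small sketch: an array declared at file scope needs a constant expression for its size, which a function call cannot provide. The SKETCH_* macros below are hypothetical stand-ins, not the contents of bitset.h.

```c
#include <assert.h>
#include <limits.h>

/* Number of bits held by one word of the set. */
#define SKETCH_BITSET_WORDBITS (sizeof(unsigned) * CHAR_BIT)

/* Round up to whole words. Note the parentheses around the macro argument,
 * so that an expression such as "n + 1" is handled correctly. */
#define SKETCH_BITSET_WORDS(bits) \
   (((bits) + SKETCH_BITSET_WORDBITS - 1) / SKETCH_BITSET_WORDBITS)

#define SKETCH_BITSET_DECLARE(name, bits) \
   unsigned name[SKETCH_BITSET_WORDS(bits)]

/* File-scope declaration: the array size must be a constant expression,
 * so the rounding has to be done by a macro, not by a function call. */
static SKETCH_BITSET_DECLARE(sketch_global_set, 70);

static void sketch_bitset_set(unsigned *set, unsigned bit)
{
   assert(set);
   set[bit / SKETCH_BITSET_WORDBITS] |= 1u << (bit % SKETCH_BITSET_WORDBITS);
}

static int sketch_bitset_test(const unsigned *set, unsigned bit)
{
   assert(set);
   return (set[bit / SKETCH_BITSET_WORDBITS] >>
           (bit % SKETCH_BITSET_WORDBITS)) & 1u;
}
```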
Eric Anholt
41b1882ed4 mesa: Use u_math.h from macros.h
This avoids duplication of some macros and other definitions across the
tree.

Note that COPY_4FV switches from a memcpy-based implementation to an
assignment of 4 floats.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-20 11:36:34 -08:00
Eric Anholt
5ca019358f gallium/util: Don't include unused debug functions from u_math.h
It introduces references to gallium util/ symbols which means we don't get
to include it from outside-of-gallium code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-20 11:36:34 -08:00
Eric Anholt
e8c5cbfd92 mesa: Add gallium include dirs to more parts of the tree.
v2: Try to patch up the scons bits.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-20 11:36:34 -08:00
Marek Olšák
f5ac5e20b1 gallium/radeon: fix an uninitialized-variable warning 2015-02-20 20:20:10 +01:00
Ilia Mirkin
c85a686d02 gallium: add new double-related shader caps to all the getters
Missed a few drivers in the earlier changes; this should fix up all the
ones that print unknown caps or don't have a default statement.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-02-20 14:09:25 -05:00
Brian Paul
71b155a2cb svga: add missing _DROUND,DFRACEXP_DLDEXP_SUPPORTED switch cases
To silence unhandled switch case warnings.
2015-02-20 08:09:40 -07:00
Marek Olšák
7692704b14 radeonsi: don't use SQC_CACHES to flush ICACHE and KCACHE on SI
This reverts 73c2b0d18c.

It doesn't seem to be reliable. It's probably missing a wait packet or
something, because it's just a register write and doesn't wait for anything.
SURFACE_SYNC at least seems to wait until the flush is done. Just guessing.

Let's not complicate things and revert this.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88561

Cc: 10.5 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-02-20 12:06:22 +01:00
Iago Toral Quiroga
2a06728ba0 i965/gen6: Fix GL_GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB
In gen6 we need to compute the primitive count in the generated GS program.
The current implementation only counts full primitives, that is, if the
output primitive type is a triangle strip, it won't count individual
triangles in the strip, only complete strips.

If we want to count basic primitives instead we have two options: rework
the assembly code we generate for strip primitives or simply use
CL_INVOCATION_COUNT to resolve the query and let the hardware do that work
for us. This patch implements the latter approach.

Fixes the following piglit test:
bin/arb_pipeline_statistics_query-geom -auto

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89210
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-02-20 11:24:11 +01:00
Eduardo Lima Mitev
097b933b55 mesa: Check that draw buffers are valid for glDrawBuffers on GLES3
Section 4.2 (Whole Framebuffer Operations) of the OpenGL 3.0 specification
says:

    "Each buffer listed in bufs must be BACK, NONE, or one of the values from
     table 4.3 (NONE, COLOR_ATTACHMENTi)".

Fixes 1 dEQP test:
* dEQP-GLES3.functional.negative_api.buffer.draw_buffers

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-20 09:35:12 +01:00
Samuel Iglesias Gonsalvez
fe1e89a026 glsl: don't allow invariant qualifiers for interface blocks
The GLSL 1.50 and GLSL 4.40 specs both say the same thing in the
"Interface Blocks" section:

"If optional qualifiers are used, they can include interpolation qualifiers,
auxiliary storage qualifiers, and storage qualifiers and they must declare
an input, output, or uniform member consistent with the interface qualifier
of the block"

From GLSL ES 3.0, chapter 4.3.7 "Interface Blocks", page 38:

"GLSL ES 3.0 does not support interface blocks for shader inputs or outputs."

and from GLSL ES 3.0, chapter 4.6.1 "The invariant qualifier", page 52.

"Only variables output from a shader can be candidates for invariance."

This patch fixes the following dEQP tests:

dEQP-GLES3.functional.shaders.declarations.invalid_declarations.invariant_uniform_block_2_vertex
dEQP-GLES3.functional.shaders.declarations.invalid_declarations.invariant_uniform_block_2_fragment

No piglit regressions.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>

v2:

- Enable this check for GLSL.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-20 09:35:08 +01:00
Eric Anholt
85316d059c vc4: Keep an array of pointers to instructions defining the temps around.
The optimization passes are always regenerating it and throwing it away,
but it's not hard to keep track of.
2015-02-19 23:35:17 -08:00
Eric Anholt
877b48a531 vc4: Move qir_uniform() and the constant-value versions to vc4_qir.c/h.
I may want them in optimization passes, and they're not really particular
to the program translation stage.
2015-02-19 23:35:17 -08:00
Eric Anholt
14dc281c13 vc4: Enforce one-uniform-per-instruction after optimization.
This lets us more intelligently decide which uniform values should be put
into temporaries, by choosing the most reused values to push to temps
first.

total uniforms in shared programs: 13457 -> 13433 (-0.18%)
uniforms in affected programs:     1524 -> 1500 (-1.57%)
total instructions in shared programs: 40198 -> 40019 (-0.45%)
instructions in affected programs:     6027 -> 5848 (-2.97%)

I noticed this opportunity because with the NIR work, some programs were
happening to make different uniform copy propagation choices that
significantly increased instruction counts.
2015-02-19 23:35:17 -08:00
Eric Anholt
09c844fcd9 vc4: Rename add_uniform() to qir_uniform(). 2015-02-19 23:35:17 -08:00
Eric Anholt
96f6efc561 vc4: Shut up runtime warnings about new pipe caps. 2015-02-19 23:35:13 -08:00
Matt Turner
e0137fd6f7 i965/vec4: Add and use byte-MOV instruction for unpack 4x8.
Previously we were using a B/UB source in an Align16 instruction, which
is illegal. For some reason it works on all platforms except Broadwell.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86811
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-19 21:16:44 -08:00
Matt Turner
dada30462b i965/blorp: Emit MADs.
Low hanging fruit: cuts a couple of instructions.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-19 21:16:43 -08:00
Matt Turner
30ec53f30e i965/blorp: Optimize clamping tex coords.
Each emit_cond_mov() emits a CMP of its first two arguments using the
specified conditional mod, followed by a predicated MOV of the fifth
argument into the fourth. In all four cases here, it was just
implementing MIN/MAX which we can do in a single SEL instruction.

Also reorder the instructions for a slightly better schedule.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-19 21:16:43 -08:00
Matt Turner
3b7f683f3b i965: Use greater-equal cmod to implement maximum.
The docs specifically call out SEL with .l and .ge as the
implementations of MIN and MAX respectively. Among other things, SEL
with these conditional mods are commutative.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-19 21:16:43 -08:00
Matt Turner
f8b435ae6a i965: Don't emit saturates for instructions without destinations.
We were special casing OPCODE_END but no other instructions that have no
destination, like OPCODE_KIL, leading us to emitting MOVs with null
destinations.

total instructions in shared programs: 5705243 -> 5701539 (-0.06%)
instructions in affected programs:     124104 -> 120400 (-2.98%)
helped:                                904

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-19 21:16:43 -08:00
Matt Turner
7f8dd91d16 i965/fs: Consider MOV.SAT to interfere if it has a source modifier.
The saturate propagation pass recognizes that the second instruction
below does not interfere with an attempt to propagate the saturate
modifier from instruction 3 to 1.

 1:  add(8)     dst0   src0  src1
 2:  mov.sat(8) dst1   dst0
 3:  mov.sat(8) dst2   dst0

Unfortunately, we did not consider the case of instruction 2 having a
source modifier on dst0. Take for instance:

 1:  add(8)     dst0   src0  src1
 2:  mov.sat(8) dst1  -dst0
 3:  mov.sat(8) dst2   dst0

Consider such an instruction to interfere. This increases instruction counts
in Anomaly 2, which could be a bug fix depending on the values the first
instruction produces.

instructions in affected programs:     53228 -> 53934 (1.33%)
HURT:                                  360

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-19 21:16:43 -08:00
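A small worked example may help show why the negated read in instruction 2 interferes. The sketch below models the three instructions with plain floats; the function names are hypothetical and this is not the i965 pass itself.

```c
#include <assert.h>

/* Saturate: clamp to [0, 1]. */
static float sketch_sat(float x)
{
   return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Original program: instruction 2 computes sat(-dst0), dst0 = src0 + src1. */
static float sketch_inst2_original(float sum)
{
   return sketch_sat(-sum);
}

/* If the saturate from instruction 3 were moved onto the add, dst0 would
 * already hold sat(sum) by the time instruction 2 negates it. */
static float sketch_inst2_propagated(float sum)
{
   return sketch_sat(-sketch_sat(sum));
}

/* For a negative sum the two versions disagree, so the propagation is unsafe. */
static void sketch_check_interference(void)
{
   const float sum = -0.5f;
   assert(sketch_inst2_original(sum) == 0.5f);
   assert(sketch_inst2_propagated(sum) == 0.0f);
}
```

Without the source modifier the extra saturate is harmless, since sat(sat(x)) equals sat(x); the negation is what makes the intermediate clamp observable.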
Matt Turner
871ad3f08b i965/fs: Use fs_inst::overwrites_reg() in saturate propagation.
This is safer and matches the conditional_mod propagation pass.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-19 21:16:43 -08:00
Matt Turner
bf3389ec49 i965/fs: Add unit tests for saturate propagation pass.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-19 21:16:43 -08:00
Timothy Arceri
9acb011a3e glsl: Use the without_array predicate
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-20 16:11:15 +11:00
Ilia Mirkin
5000a5f67b nv50: add PIPELINE_STATISTICS query support, based on nvc0
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Nick Tenney <nick.tenney@gmail.com>
2015-02-19 23:12:35 -05:00
Ilia Mirkin
f883df74e0 svga: add missing :
Fixes: 924ee3f408 ("gallium: add shader cap for dldexp/dfracexp support")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 20:18:02 -05:00
Jason Ekstrand
c7002fad90 nir/GCM: Pull unpinned instructions out of blocks while pinning
This lets us be slightly more efficient by not walking the CFG extra times.
Also, it may make it easier to ensure that GVN happens on only unpinned
instructions.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
8dfe6f672f nir/GCM: Use pass_flags instead of bitsets for tracking visited/pinned
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
190073c737 nir: Add a global code motion (GCM) pass
v2 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Use nir_dominance_lca for computing least common ancestors
 - Use the block index for comparing dominance tree depths
 - Pin things that do partial derivatives

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
a52a4b5223 nir/instr: Change "live" to a more generic "pass_flags" field
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
3d25afc51c nir: Make nir_[cf_node/instr]_[prev/next] return null if at the end
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
902b0ccc9a nir/from_ssa: Don't try to read an invalid instruction
Right now, the nir_instr_prev function blindly looks up the
previous element in the exec list and casts it to an instruction even if
it's the tail sentinel.  The next commit will change this to return null if
it's the first instruction.  Making this change first avoids getting a
segfault between commits.  The only reason we never noticed is that, thanks
to the way things are laid out in nir_block, the casted instruction's type
was never parallel_copy.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
0281fd0786 nir/validate: Validate SSA defs the same way we do for registers
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
34952b5671 nir/validate: Validate if_uses on registers
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
98ecb25f89 nir: Properly clean up CF nodes when we remove them
Previously, if you removed a CF node that still had instructions in it, none
of the use/def information from those instructions would get cleaned up.
Also, we weren't removing if statements from the if_uses of the
corresponding register or SSA def.  This commit fixes both of these
problems.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
e025943134 nir: use nir_foreach_ssa_def for indexing ssa defs
This is both simpler and more correct.  The old code didn't properly index
load_const instructions.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
0167c38cac nir/from_ssa: Use the nir_block_dominance function instead of our own
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
f481a9425c nir/dominance: Add a constant-time mechanism for comparing blocks
This is mostly thanks to Connor.  The idea is to do a depth-first search
that computes pre and post indices for all the blocks.  We can then figure
out if one block dominates another in constant time by two simple
comparison operations.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
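The pre/post index idea described above can be sketched as follows. This is an illustration under stated assumptions, not the NIR code: the struct layout, the fixed child array and all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_CHILDREN 4

/* Hypothetical block: children in the dominance tree plus DFS indices. */
struct sketch_block {
   struct sketch_block *dom_children[SKETCH_MAX_CHILDREN];
   unsigned num_dom_children;
   unsigned dom_pre_index;
   unsigned dom_post_index;
};

/* Depth-first walk of the dominance tree: stamp each block on the way in
 * (pre index) and again on the way out (post index). */
static void sketch_calc_dfs_indices(struct sketch_block *block,
                                    unsigned *counter)
{
   assert(block->num_dom_children <= SKETCH_MAX_CHILDREN);
   block->dom_pre_index = (*counter)++;
   for (unsigned i = 0; i < block->num_dom_children; i++)
      sketch_calc_dfs_indices(block->dom_children[i], counter);
   block->dom_post_index = (*counter)++;
}

/* parent dominates child iff child's [pre, post] interval nests inside
 * parent's interval: two comparisons, no tree walk. */
static bool sketch_block_dominates(const struct sketch_block *parent,
                                   const struct sketch_block *child)
{
   return child->dom_pre_index >= parent->dom_pre_index &&
          child->dom_post_index <= parent->dom_post_index;
}
```

The indices have to be recomputed whenever the dominance tree changes, which is the price paid for the constant-time query.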
Jason Ekstrand
b4c5489c8a nir/dominance: Expose the dominance intersection function
Being able to find the least common ancestor in the dominance tree is a
useful thing that we may want to do in other passes.  In particular, we
need it for GCM.

v2: Handle NULL inputs by returning the other block

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:16 -08:00
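One common way to find the least common ancestor in a dominance tree is the two-finger "intersect" walk over immediate dominators. The sketch below shows that technique, including the NULL handling mentioned in v2. It assumes every block's index is larger than its immediate dominator's index (for example a reverse-postorder numbering); the struct and names are hypothetical, and whether NIR uses exactly this form is not stated in the commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block: immediate dominator plus an ordering index.
 * Assumption: index(block) > index(block->imm_dom) for every non-root. */
struct sketch_dom_block {
   struct sketch_dom_block *imm_dom;
   unsigned index;
};

static struct sketch_dom_block *
sketch_dominance_lca(struct sketch_dom_block *b1, struct sketch_dom_block *b2)
{
   /* A NULL input yields the other block, as in the v2 note. */
   if (b1 == NULL)
      return b2;
   if (b2 == NULL)
      return b1;

   /* Walk whichever finger is deeper up the tree until both meet. */
   while (b1 != b2) {
      while (b1->index > b2->index) {
         assert(b1->imm_dom != NULL);
         b1 = b1->imm_dom;
      }
      while (b2->index > b1->index) {
         assert(b2->imm_dom != NULL);
         b2 = b2->imm_dom;
      }
   }
   return b1;
}
```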
Ilia Mirkin
6316c90cc0 st/mesa: lower DFRACEXP/DLDEXP when they are not supported
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-02-19 19:39:15 -05:00
Ilia Mirkin
e4a3f48a45 st/mesa: disable lowering of dops to dfrac when dround is available
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-02-19 19:38:26 -05:00
Ilia Mirkin
e556bfc8ff st/mesa: add support for new double opcodes
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-02-19 19:32:55 -05:00
Ilia Mirkin
924ee3f408 gallium: add shader cap for dldexp/dfracexp support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-02-19 19:32:52 -05:00
Ilia Mirkin
899d779cb7 gallium: add a cap to enable double rounding opcodes
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-02-19 19:32:49 -05:00
Ilia Mirkin
12dedca523 gallium: add some more double opcodes to avoid unnecessary lowering
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-02-19 19:32:35 -05:00
Dave Airlie
1759689d18 docs/GL3.txt: softpipe now supports GL_ARB_gpu_shader_fp64
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-02-20 10:12:00 +10:00
Dave Airlie
8c6a0ebaad st/mesa: add st fp64 support (v7.1)
This adds support to the state tracker for
ARB_gpu_shader_fp64.

The details are explained in comments
within the code.

v2 : add double to int/unsigned conversion
v3: handle fp64 consts better
v4: use DRSQ
v4.1: add d2b
v4.2: drop DDIV

v5: split out some prep patches.
v5.1: add some comments.
v5.2: more comments

v6: simplify down the double instruction
    generation loop.

v7: Merge Ilia's two cleanup patches.
v7.1: minor fixups for Ilia patch + cleanups

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-02-20 10:06:56 +10:00
Dave Airlie
0178358a2d mesa/st_tgsi_to_glsl: prepare add_constant for fp64
This just moves stuff around a little to make the next patch
cleaner.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-02-20 10:06:47 +10:00
Dave Airlie
12150a5bee st/glsl_to_tgsi: convert dst to an array
This is just prep work for fp64 support where we need
an array of 2 dst values.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-02-20 10:05:52 +10:00
Dave Airlie
c442d0961e i965: just avoid warnings with fp64
This just fills in some blanks to avoid warnings in the i965 driver.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-02-20 09:44:28 +10:00
Kenneth Graunke
75f6ed617f glsl: Add compute to _mesa_shader_stage_to_string(); use unreachable.
This is basically Ian's review feedback for my patch that added
_mesa_shader_stage_to_abbrev() - it just makes both consistent again.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-19 15:15:46 -08:00
Kenneth Graunke
5cdfa839c2 i965/vec4: Print "VS" or "GS" when compiles fail, not "vec4".
This is now trivial to do right.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-19 15:15:46 -08:00
Kenneth Graunke
e60318fbcd i965/vec4: Replace debug_flag with debug_enabled.
backend_visitor now handles this, so we can delete the vec4_visitor
specific code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-19 15:15:45 -08:00
Kenneth Graunke
eeacbc1a02 i965: Make scheduler cycle estimates use the proper stage name.
Previously, the vec4 backend labeled shaders as "vec4" - now it uses the
specific names "VS" and "GS".

The FS backend now correctly prints "VS" for vertex shaders (rather than
"fs").  It also prints "FS" instead of "fs" for fragment shaders;
preserving that behavior didn't seem essential.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-19 15:15:45 -08:00
Kenneth Graunke
2bd139e18c i965/fs: Un-hardcode DEBUG_WM, "FS", and "fragment".
These code paths can (or will) be used for other shader stages.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-19 15:15:45 -08:00
Kenneth Graunke
7e35a81264 i965: Create backend_visitor fields for debugging messages.
We introduce three new fields in backend_visitor:
- debug_enabled: whether or not INTEL_DEBUG & DEBUG_<stage flag>
- stage_name: "vertex", "fragment", etc. for use in messages
- stage_abbrev: "VS", "FS", etc. for use in messages

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-19 15:15:45 -08:00
Kenneth Graunke
7c891e8ddd i965: Add a function to translate MESA_SHADER_* into DEBUG_* enums.
When compiling, we have a gl_shader_stage (MESA_SHADER_*) enum, and want
to know whether debugging is enabled for that stage.  This allows us to
easily translate it into the corresponding debug flag.
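
As a rough illustration of the translation described above, a table lookup is one way to do it. This is only a sketch: the `sketch_*` enum, the flag values and both function names are hypothetical stand-ins, not the driver's actual MESA_SHADER_* or DEBUG_* definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the gl_shader_stage (MESA_SHADER_*) values. */
enum sketch_stage {
    SKETCH_STAGE_VERTEX,
    SKETCH_STAGE_GEOMETRY,
    SKETCH_STAGE_FRAGMENT,
    SKETCH_STAGE_COMPUTE,
    SKETCH_STAGE_COUNT
};

/* Hypothetical debug bits; the real DEBUG_* values are not shown here. */
#define SKETCH_DEBUG_VS (UINT64_C(1) << 0)
#define SKETCH_DEBUG_GS (UINT64_C(1) << 1)
#define SKETCH_DEBUG_FS (UINT64_C(1) << 2)
#define SKETCH_DEBUG_CS (UINT64_C(1) << 3)

/* Translate a stage into the debug bit that controls it. */
static uint64_t sketch_debug_flag_for_stage(enum sketch_stage stage)
{
    static const uint64_t flags[SKETCH_STAGE_COUNT] = {
        [SKETCH_STAGE_VERTEX]   = SKETCH_DEBUG_VS,
        [SKETCH_STAGE_GEOMETRY] = SKETCH_DEBUG_GS,
        [SKETCH_STAGE_FRAGMENT] = SKETCH_DEBUG_FS,
        [SKETCH_STAGE_COMPUTE]  = SKETCH_DEBUG_CS,
    };

    assert((int)stage >= 0 && (int)stage < SKETCH_STAGE_COUNT);
    return flags[stage];
}

/* True when the debug mask has the bit for this stage set. */
static bool sketch_debug_enabled(uint64_t debug_mask, enum sketch_stage stage)
{
    return (debug_mask & sketch_debug_flag_for_stage(stage)) != 0;
}
```

A caller would pass the current debug mask and the stage being compiled, for example `sketch_debug_enabled(mask, SKETCH_STAGE_FRAGMENT)`.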

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-19 15:15:45 -08:00
Kenneth Graunke
7555d1bafb glsl: Create a _mesa_shader_stage_to_abbrev() function.
This is similar to _mesa_shader_stage_to_string(), but returns "VS"
instead of "vertex".

v2: Use unreachable() and add MESA_SHADER_COMPUTE (requested by Ian).
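
A minimal sketch of what such a helper might look like follows. The enum and function names are hypothetical, the "GS" and "CS" strings are assumed by analogy with "VS" and "FS", and the assert/abort pair merely stands in for an unreachable() marker.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the MESA_SHADER_* stage values. */
enum abbrev_stage {
    ABBREV_STAGE_VERTEX,
    ABBREV_STAGE_GEOMETRY,
    ABBREV_STAGE_FRAGMENT,
    ABBREV_STAGE_COMPUTE
};

/* Return a short, upper-case name for the stage, e.g. "VS" rather than "vertex". */
static const char *abbrev_stage_to_abbrev(enum abbrev_stage stage)
{
    switch (stage) {
    case ABBREV_STAGE_VERTEX:   return "VS";
    case ABBREV_STAGE_GEOMETRY: return "GS"; /* assumed abbreviation */
    case ABBREV_STAGE_FRAGMENT: return "FS";
    case ABBREV_STAGE_COMPUTE:  return "CS"; /* assumed abbreviation */
    }

    /* Stand-in for unreachable(): every valid stage returned above. */
    assert(0 && "unknown shader stage");
    abort();
}
```

Leaving out a default case lets the compiler warn when a new stage is added to the enum but not to the switch.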

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-19 15:15:45 -08:00
Kenneth Graunke
231267bf01 i965/fs: Use VARYING_SLOT checks rather than strcmp().
Comparing the location field is equivalent and more efficient.

We'll also need this when we start using NIR for ARB programs, as our
NIR converter will set the location field correctly, but probably won't
use the GLSL names for these concepts.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-19 15:15:45 -08:00
Kenneth Graunke
a07cd42f1e i965/fs: Remove type parameter from emit_vs_system_value().
Every VS system value has type D.  We can always add this back if that
changes, but for now, it's extra typing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-19 15:15:45 -08:00
Dave Airlie
2e9f4eadfb glsl: add lowering for double divide to rcp/mul
It looks like no hw does div anyways, so we should just
lower at the GLSL level.
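
In scalar terms the lowering amounts to the rewrite below. This is only a sketch in plain C, not the GLSL IR pass itself; `sketch_rcp` is a hypothetical stand-in for a hardware reciprocal.

```c
#include <assert.h>

/* Hypothetical stand-in for a hardware reciprocal on doubles. */
static double sketch_rcp(double x)
{
    return 1.0 / x;
}

/* a / b rewritten as a * rcp(b), which is what the lowering produces. */
static double sketch_lowered_div(double a, double b)
{
    return a * sketch_rcp(b);
}

/* Small usage example with values that are exact in binary floating point. */
static void sketch_div_example(void)
{
    assert(sketch_lowered_div(6.0, 2.0) == 3.0);
}
```

Because the reciprocal and the multiply each round separately, the lowered form can differ from a true division in the last bit for some inputs.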

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-02-20 08:58:06 +10:00
Dave Airlie
0e82817247 softpipe/tgsi: expose doubles for softpipe.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-02-20 08:52:11 +10:00
Dave Airlie
fa43e0443e tgsi: add support for flt64 constants
These act like flt32 except they take up two slots, and you
can only add 2 x flt64 constants in one slot.
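
To make the slot arithmetic concrete, here is a sketch that assumes an immediate holds four 32-bit components, so that two flt64 values fill it. The struct and function names are hypothetical and are not the TGSI data structures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical immediate with four 32-bit slots (an assumption of this sketch). */
struct sketch_imm {
    uint32_t slot[4];
};

_Static_assert(sizeof(double) == 2 * sizeof(uint32_t),
               "this sketch assumes a 64-bit double");

/* Store one double into slot pair 0 or 1; each double occupies two slots. */
static void sketch_imm_store_f64(struct sketch_imm *imm, unsigned index, double value)
{
    assert(index < 2);
    memcpy(&imm->slot[index * 2], &value, sizeof(value));
}

/* Read back both halves together so the pair is treated as a single value. */
static double sketch_imm_load_f64(const struct sketch_imm *imm, unsigned index)
{
    double value;

    assert(index < 2);
    memcpy(&value, &imm->slot[index * 2], sizeof(value));
    return value;
}
```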

The main reason they are different is that we don't want to match half of a
flt64 constant against a flt32 constant in the matching code; we need to make
sure we treat both parts of the flt64 as a single structure.

Cleaned up printing/parsing by Ilia Mirkin <imirkin@alum.mit.edu>

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-02-20 08:51:49 +10:00
Dave Airlie
3cd1338534 gallium: add double opcodes and TGSI execution (v4.2)
This patch adds support for a set of double opcodes
to TGSI. It is an update of work done originally
by Michal Krol on the gallium-double-opcodes branch.

The opcodes have a hint where they came from in the
header file.

v2: add unsigned/int <-> double
v2.1:  update docs.

v3: add DRSQ (Glenn), fix review comments (Glenn).

v4: drop DDIV
v4.1: cleanups, fix some docs bugs, (Ilia)
      rework store_dest and fetch_source fns. (Ilia)
v4.2: fixup float comparisons (Ilia)

This is based on code by Michal Krol <michal@vmware.com>

Roland and Glenn also reviewed earlier versions.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-02-20 08:49:12 +10:00
Brian Paul
14b9bf630c gallium/util: indentation fix 2015-02-19 15:36:59 -07:00
Brian Paul
21c57a697f st/mesa: add GLSL_TYPE_DOUBLE, new ir_unop_* switch cases
To silence compiler warnings about unhandled switch cases.
v2: move GLSL_TYPE_DOUBLE case to the "Invalid type in type_size" section,
per Ilia.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 15:36:59 -07:00
Brian Paul
2f5597787c nir: add missing GLSL_TYPE_DOUBLE case in type_size()
To silence compiler warning about unhandled switch case.
v2: move GLSL_TYPE_DOUBLE to the "not reached" section, per Ilia.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 15:36:59 -07:00
Brian Paul
62a8883f32 st/mesa: fix sampler view reference counting bug in glDraw/CopyPixels
Use pipe_sampler_view_reference() instead of ordinary assignment.
Also add a new sanity check assertion.
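
The difference between a plain assignment and a reference-taking helper can be sketched as follows. This is a generic reference-counting illustration with hypothetical `sketch_*` names; it is not the gallium implementation of pipe_sampler_view_reference().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object standing in for a sampler view. */
struct sketch_view {
    int refcount;
};

/* Counts destroyed views so the example can be checked. */
static int sketch_views_destroyed = 0;

static struct sketch_view *sketch_view_create(void)
{
    struct sketch_view *view = malloc(sizeof(*view));

    assert(view != NULL);
    view->refcount = 1; /* the creator holds the first reference */
    return view;
}

/*
 * Make *dst point at src while keeping the counts right: take a reference
 * on the new object, drop the one held on the old object, and free the old
 * object when its last reference goes away. A plain "*dst = src" would skip
 * all of this and leave the counts wrong.
 */
static void sketch_view_reference(struct sketch_view **dst, struct sketch_view *src)
{
    struct sketch_view *old = *dst;

    if (old == src)
        return;
    if (src)
        src->refcount++;
    if (old && --old->refcount == 0) {
        free(old);
        sketch_views_destroyed++;
    }
    *dst = src;
}
```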

Fixes piglit gl-1.0-drawpixels-color-index test crash.  But note
that the test still fails.

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 15:36:59 -07:00
Brian Paul
89c96afe3c swrast: fix multiple color buffer writing
If a fragment program wrote to more than one color buffer, the
first fragment color got replicated to all dest buffers.  This
fixes 5 piglit FBO tests, including fbo-drawbuffers-arbfp.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45348
Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-19 15:36:59 -07:00
Brian Paul
fbac86ad2a mesa: remove unused _math_trans_4chan()
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-19 15:36:59 -07:00
Lucas Stach
5c1aac17ad install-lib-links: don't depend on .libs directory
This snippet can be included in Makefiles that may, depending on the
project configuration, not actually build any installable libraries.

In that case we don't have anything to depend on and this part of
the makefile may be executed before the .libs directory is created,
so do not depend on it being there.

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2015-02-19 10:02:02 -08:00
Francisco Jerez
6c34fd20be i965/vec4: Calculate register allocation q values manually.
This fixes a regression in the running time of Piglit introduced by
commit 78e9043475, which increased the
number of register allocation classes set up by the VEC4 back-end
from 2 to 16.  The algorithm used by ra_set_finalize() to calculate
them is unnecessarily expensive, so do it manually like the FS back-end
does.

Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-19 14:09:12 +02:00
Francisco Jerez
35a77a148f i965: Don't compact instructions with unmapped bits.
Some instruction bits don't have a mapping defined to any compacted
instruction field.  If they're ever set and we end up compacting the
instruction they will be forced to zero.  Avoid using compaction in such
cases.
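
The check described above boils down to a mask test, sketched here. The instruction layout and the choice of unmapped bits are hypothetical and chosen only for illustration; they do not reflect the real i965 encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A 128-bit instruction held as two 64-bit halves. */
struct sketch_inst {
    uint64_t lo;
    uint64_t hi;
};

/*
 * Hypothetical mask of bits that have a compacted mapping. For this sketch,
 * bit 3 of the low half and bit 60 of the high half are treated as unmapped.
 */
static const struct sketch_inst sketch_mapped_bits = {
    ~(UINT64_C(1) << 3),
    ~(UINT64_C(1) << 60)
};

/* Compaction is only safe when no unmapped bit is set in the instruction. */
static bool sketch_can_compact(const struct sketch_inst *inst)
{
    return (inst->lo & ~sketch_mapped_bits.lo) == 0 &&
           (inst->hi & ~sketch_mapped_bits.hi) == 0;
}
```

An instruction that fails the test is simply left uncompacted, so the bits that would otherwise be forced to zero are preserved.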

v2: Align multiple lines of an expression to the same column.  Change
    conditional compaction of 3-source instructions to an
    assertion. (Matt)
v3: The 3-source instruction bit 105 is part of SourceIndex on CHV.
    Add assertion that reserved bit 7 is not set. (Matt)
    Document overlap with UIP and 64-bit immediate fields.
v4: Make some more unmapped bit checks assertions. (Matt)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-19 14:06:42 +02:00
Francisco Jerez
6c07279e5a i965: Handle F16TO32/F32TO16 with dword src/dst consistently on both back-ends.
Due to the way it's implemented in hardware, the F16TO32/F32TO16
instructions require the source/destination register to be of some
16-bit type in Align1 mode, while they require it to be some 32-bit
type in Align16 mode (and as an undocumented feature the high 16 bits
of the destination register are zeroed out in the case of the F32TO16
instruction on Gen7).  Make their behaviour consistent so you can
specify a 32 bit register type as source or destination and get
predictable results in the most significant bits no matter what access
mode is being used.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-19 14:06:42 +02:00
Francisco Jerez
437d401e63 i965/gen8: Fix F32TO16 in vec4 mode if the source and destination registers alias.
We cannot zero out the destination register if it overlaps with the
source.  Use an Align1 instruction instead to zero out the high 16
bits after the conversion to half float.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-19 14:06:42 +02:00
Francisco Jerez
509f58740c i965/fs: Replace ud_reg_to_w() with a more general helper function.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-19 14:06:42 +02:00
Francisco Jerez
63d6d09a3b i965/vec4: Don't attempt to reduce swizzles of send from GRF instructions.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-19 14:06:42 +02:00
Francisco Jerez
bda7698fce i965/vec4: Fix constant propagation across different types.
If the source type differs from the original type of the constant we
need to bit-cast it before propagating, otherwise the original type
information will be lost.  If the constant was a vector float there
isn't much we can do, because the result of bit-casting the component
values of a vector float cannot itself be represented as an immediate.
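
For a scalar constant, the bit-cast mentioned above can be sketched like this. The helper name is hypothetical and the example assumes IEEE 754 single-precision floats; it is not the vec4 back-end code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

/*
 * Reinterpret the bits of a float constant as an unsigned integer. This is
 * a bit-cast, not a value conversion: 1.0f becomes 0x3f800000, not 1.
 */
static uint32_t sketch_float_bits(float value)
{
    uint32_t bits;

    memcpy(&bits, &value, sizeof(bits));
    return bits;
}
```

Propagating the constant into a source of a different type without this step would reinterpret the immediate under the wrong type.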

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-19 14:06:42 +02:00
Samuel Iglesias Gonsalvez
187ace73a9 glsl: A shader cannot redefine or overload built-in functions in GLSL ES 3.00
Create a new search function to look for matching built-in functions by name
and use it for built-in function redefinition or overload in GLSL ES 3.00.

GLSL ES 3.0 spec, chapter 6.1 "Function Definitions", page 71

  "A shader cannot redefine or overload built-in functions."

While in GLSL ES 1.0 specification, chapter 8 "Built-in Functions"

  "User code can overload the built-in functions but cannot redefine them."

So this check is specific to GLSL ES 3.00.

This patch fixes the following dEQP tests:

dEQP-GLES3.functional.shaders.functions.invalid.overload_builtin_function_vertex
dEQP-GLES3.functional.shaders.functions.invalid.overload_builtin_function_fragment
dEQP-GLES3.functional.shaders.functions.invalid.redefine_builtin_function_vertex
dEQP-GLES3.functional.shaders.functions.invalid.redefine_builtin_function_fragment

No piglit regressions.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-19 10:05:33 +01:00
Eduardo Lima Mitev
19252fee46 mesa: Adds check for integer internal format and num samples in glRenderbufferStorageMultisample
Per GLES3 specification, section 4.4 Framebuffer objects page 198, "If
internalformat is a signed or unsigned integer format and samples is greater
than zero, then the error INVALID_OPERATION is generated.".
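
The rule quoted above could be expressed roughly as follows. The function, its parameters and the `SKETCH_*` constants are hypothetical; only the numeric value 0x0502 is taken from GL_INVALID_OPERATION.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NO_ERROR          0x0000
#define SKETCH_INVALID_OPERATION 0x0502 /* same value as GL_INVALID_OPERATION */

/*
 * On GLES3, an integer internal format combined with samples > 0 is
 * rejected; every other combination passes this particular check.
 */
static unsigned sketch_check_multisample(bool is_gles3, bool format_is_integer,
                                         int samples)
{
    assert(samples >= 0); /* negative counts are outside this sketch */

    if (is_gles3 && format_is_integer && samples > 0)
        return SKETCH_INVALID_OPERATION;
    return SKETCH_NO_ERROR;
}
```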

Fixes 1 dEQP test:
* dEQP-GLES3.functional.negative_api.buffer.renderbuffer_storage_multisample

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-19 09:35:41 +01:00
Eduardo Lima Mitev
dbc160a3f8 mesa: Returns correct error values from gl(Get)SamplerParameter*() on GL-ES 3.0+
'3.8.2 Sampler Objects' section of the GL-ES 3.0 specification states:

    "An INVALID_OPERATION error is generated if sampler is not the name
    of a sampler object previously returned from a call to GenSamplers."

In desktop GL, a GL_INVALID_VALUE error is returned instead.

Fixes 6 dEQP failing tests:
* dEQP-GLES3.functional.negative_api.shader.get_sampler_parameteriv
* dEQP-GLES3.functional.negative_api.shader.get_sampler_parameterfv
* dEQP-GLES3.functional.negative_api.shader.sampler_parameteri
* dEQP-GLES3.functional.negative_api.shader.sampler_parameteriv
* dEQP-GLES3.functional.negative_api.shader.sampler_parameterf
* dEQP-GLES3.functional.negative_api.shader.sampler_parameterfv

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-19 09:35:37 +01:00
Ilia Mirkin
e8e22cf65f glsl: remove bogus 'd' constant qualifiers
0.0 is a double anyways. Apparently my version of gcc was happy with
0.0d as well, but this is not true of all compilers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89218

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 01:45:54 -05:00
Ilia Mirkin
0cade4ea2b st/mesa: don't die for ETC2 formats when no driver support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 01:41:28 -05:00
Eric Anholt
2a135c470e nir: Add an ALU op builder kind of like ir_builder.h
v2: Rebase on the nir_opcodes.h python code generation support.
v3: Use SSA values, and set an appropriate writemask on dot products.
v4: Make the arguments be SSA references as well.  This lets you stack up
    expressions in the arguments of other expressions, at the cost of
    having to insert a fmov/imov if you want to swizzle.  Also, add
    the generated file to NIR_GENERATED_FILES.
v5: Use more pythonish style for iterating the list.
v6: Infer the size of the dest from the size of the srcs, and auto-swizzle
    a single small src out to the appropriate size.
v7: Add little helpers for initializing the struct, add a typedef for the
    struct like other nir types have.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v6)
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v7)
2015-02-18 22:28:42 -08:00
Ilia Mirkin
de798bb937 docs: mark ARB_gpu_shader_fp64 as done in core
No driver support... yet. But core is ready.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:35 -05:00
Ilia Mirkin
e790a3c910 glsl/tests: add DOUBLE types
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-02-19 00:28:35 -05:00
Ilia Mirkin
2e7e7b8af6 glsl: add a lowering pass for frexp/ldexp with double arguments
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-02-19 00:28:35 -05:00
Dave Airlie
fffbf37124 glsl: lower double optional passes (v2)
These lowering passes are optional for the backend to request. Currently
the TGSI softpipe backend, and most likely the r600g backend, would want to
use these passes as is. They aim to hit the gallium opcodes from the standard
rounding/truncation functions.

v2: also lower floor in mod_to_floor

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:35 -05:00
Dave Airlie
e6354a2850 glsl: implement double builtin functions
This implements the bulk of the builtin functions for fp64 support.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:35 -05:00
Dave Airlie
2e626318e0 glsl/lower_instructions: add double lowering passes
This lowers double dot product and lrp to fma.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:35 -05:00
Dave Airlie
8be5ee23de glsl: enable/disable certain lowering passes for doubles
We want to restrict some lowering passes to floats only,
and enable others for doubles.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:35 -05:00
Tapani Pälli
3bbaf71994 glsl: validate output types for shader stages
Patch fixes Piglit test:
   arb_gpu_shader_fp64/preprocessor/fs-output-double.frag

and adds additional validation for shader outputs.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:35 -05:00
Dave Airlie
94f9ed701a glsl: add double support to lower_mat_op_to_vec
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:35 -05:00
Dave Airlie
3773072169 glsl: Linking support for doubles
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:35 -05:00
Dave Airlie
7aa3ffe2c5 glsl: Support double loop control
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:34 -05:00
Dave Airlie
53383476d1 glsl: Support double inouts
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:34 -05:00
Dave Airlie
a10275f762 glsl/lexer: Support double floats
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:34 -05:00
Dave Airlie
942574bb24 glsl/parser: Support double floats
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:34 -05:00
Dave Airlie
ba3bab264d glsl/ast: Support double floats
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:34 -05:00
Dave Airlie
24626444c3 glsl: Add ubo lowering support for doubles
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:34 -05:00
Dave Airlie
8609b53716 glsl: Add support for doubles in optimization passes
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:34 -05:00
Dave Airlie
41e9adfd83 glsl/ir: Add builder support for functions with double floats
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:34 -05:00
Dave Airlie
eeae6251be glsl/ir: Add builtin constant function support for doubles
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:34 -05:00
Dave Airlie
753ba6b999 glsl/ir: Add cloning support for doubles
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:34 -05:00
Dave Airlie
57c6c3d3bd glsl/ir: Add printing support for doubles
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:34 -05:00
Dave Airlie
5a69bdb599 glsl/ir: Add builtin function support for doubles
v2: add d2b, more ir_constant stuff (Ilia)

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:34 -05:00
Ilia Mirkin
53bf7c8fd2 glsl: fix uniform linking logic in the presence of structs
Add an enter/leave record callback so that the offset may be aligned to
the proper value. Otherwise only leaf fields are called, and the first
field needs to be aligned to the outer struct's base alignment while the
last field needs to be aligned to the inner struct's base alignment.
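
The alignment step itself is the usual round-up, sketched below with hypothetical names. The enter/leave helpers only illustrate the idea of aligning the running offset at record boundaries and are not the linker's actual callbacks.

```c
#include <assert.h>

/* Round offset up to the next multiple of alignment. */
static unsigned sketch_align(unsigned offset, unsigned alignment)
{
    assert(alignment != 0);
    return (offset + alignment - 1) / alignment * alignment;
}

/* On entering a record, align the running offset to its base alignment. */
static unsigned sketch_enter_record(unsigned offset, unsigned base_alignment)
{
    return sketch_align(offset, base_alignment);
}

/* On leaving a record, pad the running offset to its base alignment again. */
static unsigned sketch_leave_record(unsigned offset, unsigned base_alignment)
{
    return sketch_align(offset, base_alignment);
}
```

For example, a record with a base alignment of 16 that is entered at offset 20 starts at offset 32.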

This removes most usage of the last field/record type values passed into
visit_field.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-02-19 00:28:34 -05:00
Ilia Mirkin
1ec715ce8b glsl: teach std140_base_alignment about samplers
These functions are about to be used more aggressively for determining
uniform layout. Samplers may be inside of structs, and it's easier to
reuse the existing base alignment logic.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-02-19 00:28:34 -05:00
Dave Airlie
fe23bb85ba glsl: Uniform linking support for doubles
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:34 -05:00
Dave Airlie
3af8db94cd glsl: Add double builtin type generation
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:33 -05:00
Dave Airlie
277f4d75a7 glsl: add ARB_gpu_shader_fp64 to the glsl extensions. (v2)
v2: add define bit (Tapani Pälli)

Patch makes following Piglit tests pass:
   arb_gpu_shader_fp64/preprocessor/define.vert
   arb_gpu_shader_fp64/preprocessor/define.frag

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:33 -05:00
Dave Airlie
5cc486b4e3 mesa: add double uniform support. (v5)
This adds support for the new uniform interfaces
from ARB_gpu_shader_fp64.

v2:
support ARB_separate_shader_objects ProgramUniform*d* (Ian)
don't allow boolean uniforms to be updated (issue 15) (Ian)

v3: fix size_mul
v4: Teach uniform update to take into account double precision (Topi)
v5: add transpose for double case (Ilia)

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:33 -05:00
Dave Airlie
bf257d2c90 glsl: Add double builtin type
This causes a lot of warnings about unchecked type in
switch statements - fix them later.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:33 -05:00
Dave Airlie
6227af2690 mesa: add ARB_gpu_shader_fp64 extension info (v2)
This just adds the entries to extensions.c and mtypes.h

v2: use core profile only (Ian)

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:33 -05:00
Dave Airlie
3c915e5c16 glapi: add ARB_gpu_shader_fp64 (v2)
Just add the xml file covering this extension,
and dummy interface files in mesa, and fix up
sanity tests.

v2:
Enable ProgramUniform*d* from ARB_separate_shader_objects (Ian)
use 40 instead of 43 for dispatch_sanity.cpp (Chris)
uncomment PU sanity tests.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:28:33 -05:00
Ilia Mirkin
069dab7576 freedreno: add missing PIPE_CAP_RESOURCE_FROM_USER_MEMORY to switch
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:25:03 -05:00
Ilia Mirkin
92fc8f04d6 freedreno/a3xx: add ARB_instanced_arrays support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:25:03 -05:00
Ilia Mirkin
f6b2e8af74 freedreno/a3xx: add support for vertexid and instanceid sysvals
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:25:03 -05:00
Ilia Mirkin
2c6e3d822b freedreno: pass number of instances to draw
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:25:03 -05:00
Ilia Mirkin
e4ddfeea65 freedreno/a3xx: add ETC2 decoding support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 00:25:03 -05:00
Ilia Mirkin
33edda7d97 st/mesa: pass etc2 textures to driver if supported
If the driver actually supports ETC2, don't decode it in software.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-02-19 00:25:03 -05:00
Ilia Mirkin
845b9e4294 llvmpipe,softpipe: only support ETC1, not the upcoming ETC2
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-18 22:32:25 -05:00
Ilia Mirkin
0821efcb33 gallium: add ETC2 format support
No actual decoding is added, similar faking mechanism to bptc.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-02-18 22:32:25 -05:00
Ilia Mirkin
d622afdbc3 freedreno/a3xx: add hardware ETC1 support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-18 22:32:25 -05:00
Eric Anholt
935ee6b652 gallium/dri: Shut up a compiler warning.
The compiler doesn't see that buffers is set in the !image case and used
in the !image case.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-18 15:15:29 -08:00
Eric Anholt
6eadde51bb nir: Recognize and reduce duplicated fsats.
No effect on vc4 shader-db.

v2: Rebase to master (no TGSI->NIR present)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2015-02-18 14:47:51 -08:00
Eric Anholt
1907a3a7ee nir: Add a flag for lowering fsat.
vc4 cse/algebraic-disabled stats:
total instructions in shared programs: 44356 -> 44354 (-0.00%)
instructions in affected programs:     55 -> 53 (-3.64%)

v2: Rebase to master (no TGSI->NIR present)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2015-02-18 14:47:51 -08:00
Eric Anholt
e5ecf8e427 nir: Add a flag for lowering ffma.
vc4 cse/algebraic-disabled stats:
total uniforms in shared programs: 13966 -> 13791 (-1.25%)
uniforms in affected programs:     435 -> 260 (-40.23%)
total instructions in shared programs: 44732 -> 44356 (-0.84%)
instructions in affected programs:     9599 -> 9223 (-3.92%)

v2: Rebase to master (no TGSI->NIR present)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2015-02-18 14:47:51 -08:00
Eric Anholt
42a8ace66e nir: Add a flag for lowering fneg/ineg.
vc4 cse/algebraic-disabled stats:
total instructions in shared programs: 44911 -> 44732 (-0.40%)
instructions in affected programs:     11371 -> 11192 (-1.57%)

v2: Fix broken iabs(isub(0, a)) transformation.
v3: Rebase to master (no TGSI->NIR present)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2015-02-18 14:47:51 -08:00
Eric Anholt
cb95a228e8 nir: Add a flag for lowering fsqrt(x) to frcp(frsqrt(x)).
vc4 cse/algebraic-disabled stats:
total uniforms in shared programs: 13972 -> 13966 (-0.04%)
uniforms in affected programs:     408 -> 402 (-1.47%)
total instructions in shared programs: 44973 -> 44911 (-0.14%)
instructions in affected programs:     1551 -> 1489 (-4.00%)

v2: Rebase to master (no TGSI->NIR present)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2015-02-18 14:47:50 -08:00
Eric Anholt
ccf14bca4b nir: Add lowering of POW instructions if the lower flag is set.
This could be done in a separate pass like we do in GLSL IR, but it seems
to me like having the definitions of the transformations in the two
directions next to each other makes a lot of sense.

v2: Reorder the comment about the transformation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-18 14:47:50 -08:00
Eric Anholt
8e9dbfff17 nir: Conditionalize the POW reconstruction on shader compiler options.
Mesa has a shader compiler struct flagging whether GLSL IR's opt_algebraic
and other passes should try and generate certain types of opcodes or
patterns.  Extend that to NIR by defining our own struct, which is
automatically generated from the Mesa struct in glsl_to_nir and provided
directly by the driver in TGSI-to-NIR.

v2: Split out the previous two prep patches.
v3: Rebase to master (no TGSI->NIR present)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2)
2015-02-18 14:47:50 -08:00
Eric Anholt
955a6bb57d nir: Add an optional expression controlling nir_algebraic xforms.
This will be used so that we can customize the transforms for the target
GPU, so we don't un-lower expressions that had already been lowered (or
introduce new lowering transformations that not all GPUs want)

v2: Drop the complication of having the condition->index dictionary, since
    we don't actually expect there to be many different conditions (change
    by Kenneth).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-18 14:47:50 -08:00
Eric Anholt
f90bb54734 nir: Add a nir_shader_compiler_options struct pointed to by the shaders.
This will be used to give the optimization passes a chance to customize
behavior for the particular target device.

v2: Rebase to master (no TGSI->NIR present)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2015-02-18 14:47:50 -08:00
Jordan Justen
4a95be9772 i965/simd8vs: Fix SIMD8 atomics (read-only)
An update for d9cd982d55.

A similar change was needed for CS to allow the piglit test
tests/spec/arb_compute_shader/execution/simple-barrier-atomics.shader_test
to pass.

The previous change (d9cd982d) should fix cases that write atomics,
such as atomicCounterIncrement, and this change will fix cases that
only read atomics, such as atomicCounter.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-02-18 14:33:36 -08:00
Chia-I Wu
b0e26173b2 ilo: fix PCB alloc asserts on Gen7.5 GT3
GT3 has two slices and all limits are doubled.
2015-02-18 14:20:29 -07:00
Chia-I Wu
68573f57ee ilo: fix compiler warnings
Fix -Wmaybe-uninitialized warnings.  The change to
ilo_blit_resolve_slices_for_hiz() is a potential bug fix.
2015-02-18 14:20:29 -07:00
Adam Jackson
b290330e3b i915: For the love of all that is holy, stop saying "IGD"
a001 and a011 are pineview chips.  Say so.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2015-02-18 14:51:16 -05:00
Emil Velikov
8a71fd8d49 auxiliary/vl: honour the DRI2PROTO_CFLAGS
Otherwise for non-default installations the build will fail to find the
headers and error out.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-18 11:02:50 +00:00
Emil Velikov
dd7b6670a2 auxiliary/vl: Build vl_winsys_dri.c only when needed.
With commit c39dbfdd0f7 (auxiliary/vl: bring back the VL code for the dri
targets) we did not fully consider users of dri-swrast alone. Thus we
ended up trying to compile the dri2 specific code on platforms which lack
it - Cygwin for example.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Reported-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>
2015-02-18 11:02:50 +00:00
Emil Velikov
3018c4a56a automake: Use AM_DISTCHECK_CONFIGURE_FLAGS
Currently we use DISTCHECK_CONFIGURE_FLAGS, which is reserved for
the user. As with other variables, one should use the AM_ variable
within the makefile.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-18 11:02:44 +00:00
Emil Velikov
b0eada1707 glx: do not leak the dri2 extension information
The XExtensionInfo is allocated dynamically (if the pointer is NULL)
in the XEXT_GENERATE_FIND_DISPLAY macro. On the other hand the
macro XEXT_GENERATE_CLOSE_DISPLAY does not check/free the memory.

Follow the example set by dri1 and appledri, and use a static variable.

Spotted while hunting "still reachable" leaks in Waffle.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-18 11:02:25 +00:00
Michel Dänzer
4db985a5fa Revert "radeon/llvm: enable unsafe math for graphics shaders"
This reverts commit 0e9cdedd2e.

It caused the grass to disappear in The Talos Principle.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89069
Cc: "10.5 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-02-18 17:06:32 +09:00
Ilia Mirkin
b7a85bee83 st/mesa: add ARB_pipeline_statistics_query support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-02-18 02:10:47 -05:00
Ben Widawsky
e206785b57 i965: implement ARB_pipeline_statistics_query
NOTE: The implementation was initially one patch, this. All the history is kept
here, even though all the core mesa changes were moved to the parent of this
patch.

This patch implements ARB_pipeline_statistics_query. This addition to GL does
not add a new API. Instead, it adds new tokens to the existing query APIs. The
work to hook up the new tokens is trivial due to its similarity to the previous
work done for the query APIs. I've implemented all the new tokens to some
degree, but have stubbed out the untested ones at the entry point for Begin().
Doing this should allow the remainder of the code to be left in.

The new tokens give GL clients a way to obtain stats about the GL pipeline.
Generally, you get the number of things going in, invocations, and number of
things coming out, primitives, of the various stages. There are two immediate
uses for this, performance information, and debugging various types of
misrendering. I doubt one can use these for debugging very complex applications,
but for piglit tests, it should be quite useful.

Tessellation shaders, and compute shaders are not addressed in this patch
because there is no upstream implementation. I've implemented how I believe
tessellation shader stats will work for Intel hardware (though there is a bit of
ambiguity). Compute shaders are a bit more interesting though, and I don't yet
know what we'll do there.

For the lazy, here is a link to the relevant part of the spec:
https://www.opengl.org/registry/specs/ARB/pipeline_statistics_query.txt

Running the piglit tests
http://lists.freedesktop.org/archives/piglit/2014-November/013321.html
(http://cgit.freedesktop.org/~bwidawsk/piglit/log/?h=pipe_stats)
yield the following results:

> piglit-run.py -t stats tests/all.py output/pipeline_stats
> [5/5] pass: 5 Running Test(s): 5

v2:
- Don't allow pipeline_stats to be per stream (Ilia). This may (not sure) be
  needed for AMD_transform_feedback4, which we do not support.
   > If AMD_transform_feedback4 is supported then GEOMETRY_SHADER_PRIMITIVES_-
   > EMITTED_ARB counts primitives emitted to any of the vertex streams for
   > which STREAM_RASTERIZATION_AMD is enabled.
- Remove comment from GL3.txt because it is only used for extensions that are
  part of required versions (Ilia)
- Move the new tokens to a new XML doc instead of using the main GL4x.xml (Ilia)
- Add a fallthrough comment (Ilia)
- Only divide PS invocations by 4 on HSW+ (Ben)

v3:
- Add ARB_pipeline_statistics_query to relnotes.html
- Add ARB_pipeline_statistics_query.xml to the Makefile.am, and master XML (Ilia)
- Correct extension number (Ilia)
- Add link to xml in the main GL API xml (Ilia)
- remove special GS case from gen6_end_query (Ian)
- Make lookup table static so gcc doesn't initialize it on every call (Ian)
- Use if (_mesa_has_geometry_shaders(ctx)) instead of explicit checks (Ian)
- Core mesa parts moved into a prep patch (Ilia)

v4:
- Change to 10.6 relnotes since we missed 10.5 window
- Moved compute shader stuff into the switch statement (Jordan)
- Jordan: Add compute shader support

v5:
- Fixed relnote style (Ilia)

v6:
- Rebased on master which beat me to adding the first relnotes - essentially
  this undoes v5 (which had a typo anyway)
- Some code style fixes (Ken)
- Remove some excess comments (Ken)
- Unify tessellation failure style - unreachable (Ken)
- Fix workaround comment for PS invocations (Ken)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 23:01:12 -08:00
Ben Widawsky
86ffc36d3c mesa: Add support for the ARB_pipeline_statistics_query extension
This was originally part of a single patch which added the extension, and
implemented it for i965 classic. For information about the evolution of the
patch, please see the subsequent commit.

One difference here as compared to the original mega patch is this does build
support for the compute shader query. Since it cannot be tested on any platform,
it will always return NULL for now. Jordan has already written a patch to
address this, and when that patch lands, this logic can be modified.

v2: Fix typo in subject (Brian Paul)
Add checks for desktop gl (Ilia)
Fail for any callers for now (Ilia)
Update QueryCounterBits for new tokens (Ilia)
Jordan: Use _mesa_has_compute_shaders

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

v3: Rebased on patch which adds the proper information to unstub tessellation
shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 23:01:11 -08:00
Jordan Justen
2cd2831500 mesa: Add _mesa_has_compute_shaders
v2 (Ben): Change GLboolean to bool as requested by Ian

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2015-02-17 23:00:15 -08:00
Fabian Bieler
599cbe5508 mesa: Add ARB_tessellation_shader to extension table.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 22:06:19 -08:00
Kenneth Graunke
d523fefa75 i965: Prefer Meta over the BLT for BlitFramebuffer.
There's some debate about whether we should use Meta or BLORP,
but either should run circles around the BLT engine.

In particular, this means that Gen8+ will use the 3D engine for blits,
like we do on Gen6-7.

Improves performance in "copypixrate -blit -back" (from Mesa demos)
by 232.037% +/- 3.15795% (n=10) on Broadwell GT3e.

v2: Rebase on Laura's changes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
2015-02-17 22:06:06 -08:00
Matt Turner
bb33a31c38 i965/fs: Add algebraic optimizations for MAD.
total instructions in shared programs: 5764176 -> 5763808 (-0.01%)
instructions in affected programs:     25121 -> 24753 (-1.46%)
helped:                                164
HURT:                                  2

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner
8cfd1e2ac6 i965/fs: Emit MAD instructions when possible.
Previously we didn't emit MAD instructions since they cannot take
immediate arguments, but with the opt_combine_constants() pass we can
handle this properly.

total instructions in shared programs: 5920017 -> 5733278 (-3.15%)
instructions in affected programs:     3625153 -> 3438414 (-5.15%)
helped:                                22017
HURT:                                  870
GAINED:                                91
LOST:                                  49

Without constant pooling, this patch is a complete loss:

total instructions in shared programs: 5912589 -> 5987888 (1.27%)
instructions in affected programs:     3190050 -> 3265349 (2.36%)
helped:                                1564
HURT:                                  17827
GAINED:                                27
LOST:                                  101

And since the constant pooling patch by itself hurt a bunch of things,
from before constant pooling to this patch the results are:

total instructions in shared programs: 5895414 -> 5747946 (-2.50%)
instructions in affected programs:     3617993 -> 3470525 (-4.08%)
helped:                                20478
HURT:                                  4469
GAINED:                                54
LOST:                                  146

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner
36bc5f06dd i965/fs: Allow immediates in MAD and LRP instructions.
And then the opt_combine_constants() pass will pull them out into
registers. This will allow us to do some algebraic optimizations on MAD
and LRP.
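
As a rough illustration of the kind of algebraic optimization that immediate
operands make possible, here is a minimal sketch. It is a toy model, not the
i965 backend: the opcode enum, the operand struct and simplify_mad() are
hypothetical names, and the operand order (addend, then the two multiplicands)
is an assumption made only for this example.

```c
#include <stdbool.h>

/* Hypothetical opcodes; a MAD computes addend + mul_a * mul_b. */
enum opcode { OP_MAD, OP_MUL, OP_ADD, OP_MOV };

/* Hypothetical operand: either an immediate or "some register". */
struct operand {
   bool is_imm;
   float imm;
};

static bool is_imm_value(struct operand o, float v)
{
   return o.is_imm && o.imm == v;
}

/* Pick the cheapest opcode that computes the same value.
 * Note: folding x * 0 to 0 ignores NaN and infinity in x. */
static enum opcode simplify_mad(struct operand addend,
                                struct operand mul_a,
                                struct operand mul_b)
{
   if (is_imm_value(mul_a, 0.0f) || is_imm_value(mul_b, 0.0f))
      return OP_MOV;   /* addend + 0 */
   if (is_imm_value(addend, 0.0f))
      return OP_MUL;   /* 0 + a * b */
   if (is_imm_value(mul_a, 1.0f) || is_imm_value(mul_b, 1.0f))
      return OP_ADD;   /* addend + other multiplicand */
   return OP_MAD;
}
```

None of these cases can be recognized if the instruction is only ever built
from registers, which is why immediates have to be allowed first.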

total instructions in shared programs: 5946656 -> 5931320 (-0.26%)
instructions in affected programs:     778247 -> 762911 (-1.97%)
helped:                                3780
HURT:                                  6
GAINED:                                12
LOST:                                  12

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner
2dad1e3abd i965/fs: Add pass to combine immediates.
total instructions in shared programs: 5885407 -> 5940958 (0.94%)
instructions in affected programs:     3617311 -> 3672862 (1.54%)
helped:                                3
HURT:                                  23556
GAINED:                                31
LOST:                                  165

... but will allow us to always emit MAD instructions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner
0d8f27eab7 i965/fs: Remove force_writemask_all assertion for execsize < 8.
This doesn't seem to be necessary.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner
662c645318 i965/cfg: Add function to generate a dot file of the dominator tree.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner
b06eef05d0 i965/cfg: Add function to generate a dot file of the CFG.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner
0e3dbc0248 i965/cfg: Calculate the immediate dominators.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner
08f304bb3b i965/cfg: Allow cfg::dump to be called without a visitor.
The fs_visitor's dump_instruction() implementation calls cfg_t()
indirectly through calculate_live_intervals, so if you have an infinite
loop in the CFG code, you can't call cfg::dump(fs_visitor *) to debug
it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner
1af5c4a526 i965: Allow exec_list sentinels as arguments to insert functions.
To insert an instruction at the end of a basic block, we typically do
something like

   inst = block->last_non_control_flow_inst();
   inst->insert_after(block, new_inst);

But blocks can consist of a single control flow instruction, so inst
will actually be the exec_list's head sentinel. We shouldn't use it as
if it were a regular instruction, but it is safe to insert something after
it.

This patch avoids assert-failing because an exec_list sentinel wasn't in
the basic block's instruction list.
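
The situation can be sketched with a simplified list. This is not Mesa's
exec_list: struct node, struct list, struct inst and the helper functions
below are hypothetical stand-ins, and the block is assumed to end in at most
one control flow instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified list with explicit sentinel nodes. */
struct node {
   struct node *prev;
   struct node *next;
};

struct list {
   struct node head; /* sentinel before the first real node */
   struct node tail; /* sentinel after the last real node */
};

struct inst {
   struct node link; /* must stay the first member */
   bool is_control_flow;
};

static void list_init(struct list *l)
{
   l->head.prev = NULL;
   l->head.next = &l->tail;
   l->tail.prev = &l->head;
   l->tail.next = NULL;
}

static bool is_head_sentinel(const struct node *n) { return n->prev == NULL; }
static bool is_tail_sentinel(const struct node *n) { return n->next == NULL; }

/* Inserting after the head sentinel is fine; only the tail sentinel
 * has nothing after it. */
static void insert_after(struct node *pos, struct node *n)
{
   assert(!is_tail_sentinel(pos));
   n->prev = pos;
   n->next = pos->next;
   pos->next->prev = n;
   pos->next = n;
}

/* May return the head sentinel when the block holds only control flow. */
static struct node *last_non_control_flow(struct list *l)
{
   struct node *n = l->tail.prev;
   if (!is_head_sentinel(n) && ((struct inst *)n)->is_control_flow)
      n = n->prev;
   return n;
}
```

With a block that contains a single jump, last_non_control_flow() returns the
head sentinel, and insert_after() on it places the new instruction in front
of the jump.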

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Alan Coopersmith
b7ce7c00e3 Make _mesa_swizzle_and_convert argument types in .c match those in .h
Caused Solaris Studio compilers to fail to build with errors about
incompatible function redefinitions.

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-17 18:16:33 -08:00
Alan Coopersmith
4671dca0ee Use __typeof instead of typeof with Solaris Studio compilers
While the C compiler accepts typeof, C++ requires __typeof.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86944
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-17 18:16:33 -08:00
Alan Coopersmith
d602fbd861 Avoid fighting with Solaris headers over isnormal()
When compiling in C99 or C++11 modes, Solaris defines isnormal() as
a macro via <math.h>, which causes the function definition to become
too mangled to compile.

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-17 18:16:33 -08:00
Alan Coopersmith
815b3bd096 Remove extraneous ; after DECL_TYPE usage
The macro is defined to provide a trailing ; so this caused the expansion
to end in ";;" which made the Solaris Studio compilers issue warnings for
every line of:
  "builtin_type_macros.h", line 113: Warning: extra ";" ignored.
for every file that included the header, filling build logs with thousands
of useless warnings.
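
A minimal illustration of the pattern, using a hypothetical macro rather than
the real DECL_TYPE: when the macro body already ends in ';', the use site must
not add another one.

```c
/* Hypothetical macro standing in for one that already ends in ';'. */
#define DECLARE_COUNTER(name) static int name##_count = 0;

DECLARE_COUNTER(hits)     /* expands to: static int hits_count = 0;   */
DECLARE_COUNTER(misses)   /* no trailing ';' here, so no stray ";;"   */

/* Sum of both counters, just to show the declarations are usable. */
static int total(void)
{
   return hits_count + misses_count;
}
```

Writing DECLARE_COUNTER(hits); instead would leave an empty declaration at
file scope, which is what strict compilers warn about.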

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-17 18:16:33 -08:00
Alan Coopersmith
60ad5103b9 Bracket arguments to tr so they work with Solaris tr
https://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/autoconf-2.69/html_node/Limitations-of-Usual-Tools.html#index-g_t_0040command_007btr_007d-1842

Without this fix, egl fails to build on Solaris, with the error:

<command-line>:0:22: error: '_EGL_PLATFORM_x11' undeclared (first use in this function)
egldisplay.c:207:31: note: in expansion of macro '_EGL_NATIVE_PLATFORM'
             native_platform = _EGL_NATIVE_PLATFORM;
                               ^

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
2015-02-17 18:16:32 -08:00
Kenneth Graunke
76960a55e6 glsl: Reduce memory consumption of copy propagation passes.
opt_copy_propagation and opt_copy_propagation_elements create new ACP
and Kill sets each time they enter a new control flow block.  For if
blocks, they also copy the entire existing ACP set contents into the
new set.

When we exit the control flow block, we discard the new sets.  However,
we weren't freeing them - so they lived on until the pass finished.
This can waste a lot of memory (57MB on one pessimal shader).

This patch makes the pass allocate ACP entries using this->acp as the
memory context, and Kill entries out of this->kill.  It also steals
kill entries when moving them from the inner kill list to the parent.

It then frees the lists, including their contents.
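
The ownership idea can be sketched with a miniature context allocator. This
is not ralloc's API: struct mem_ctx, ctx_alloc(), ctx_steal() and
ctx_free_all() are hypothetical names that only model "allocate out of a
context, move an entry to the parent, then free the context in one go".

```c
#include <stdlib.h>

/* Hypothetical header placed in front of every allocation. */
struct block {
   struct block *next;
};

struct mem_ctx {
   struct block *blocks; /* everything allocated out of this context */
   unsigned live;        /* number of blocks currently owned */
};

static void *ctx_alloc(struct mem_ctx *ctx, size_t size)
{
   struct block *b = malloc(sizeof(*b) + size);
   if (!b)
      return NULL;
   b->next = ctx->blocks;
   ctx->blocks = b;
   ctx->live++;
   return b + 1; /* payload starts right after the header */
}

/* Move one allocation from the inner context to the parent. */
static void ctx_steal(struct mem_ctx *to, struct mem_ctx *from, void *ptr)
{
   struct block *b = (struct block *)ptr - 1;
   for (struct block **p = &from->blocks; *p; p = &(*p)->next) {
      if (*p == b) {
         *p = b->next;
         from->live--;
         b->next = to->blocks;
         to->blocks = b;
         to->live++;
         return;
      }
   }
}

/* Free the context's contents; stolen blocks are unaffected. */
static void ctx_free_all(struct mem_ctx *ctx)
{
   while (ctx->blocks) {
      struct block *b = ctx->blocks;
      ctx->blocks = b->next;
      free(b);
   }
   ctx->live = 0;
}
```

Freeing the inner context on block exit releases everything that was not
stolen, so nothing lingers until the end of the pass.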

v2: Move ralloc_free(this->acp) just before this->acp = orig_acp
    (suggested by Eric Anholt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.5 10.4" <mesa-stable@lists.freedesktop.org>
2015-02-17 17:33:27 -08:00
Chris Forbes
eda3dd0076 i965: Add device limits for tess threads & URB entries
This should cover all platforms prior to Skylake.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
2015-02-17 17:33:27 -08:00
Dave Airlie
e8e4437ed0 r600g/sb: treat undefined values like constants
When we schedule an instruction with an undefined value, we
eventually will use 0, which is a constant. However, sb wasn't
taking this into account and was creating ops with illegal scalar
swizzles.

This replaces my fix for op3 in t slots.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-02-18 11:13:06 +10:00
Kenneth Graunke
598d144cef i915c: Use the actual MIN instruction.
Matt Turner noticed that the hardware has always had a MIN
instruction, but the driver always used MAX+MOV for no
apparent reason.

This should cut an instruction, and a temporary, allowing
more programs to run in hardware.
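
The message does not say how MAX+MOV produced a minimum. One common way,
assumed here, is the identity min(a, b) = -max(-a, -b), with the negations
applied as source modifiers. A scalar sketch with hypothetical helper names:

```c
/* Stand-in for a hardware MAX instruction. */
static float max_f(float a, float b)
{
   return a > b ? a : b;
}

/* Two steps: MAX on negated sources into a temporary, then a negating MOV. */
static float min_via_max(float a, float b)
{
   float tmp = max_f(-a, -b); /* MAX tmp, -a, -b */
   return -tmp;               /* MOV dst, -tmp   */
}

/* One step: what a native MIN instruction computes. */
static float min_direct(float a, float b)
{
   return a < b ? a : b;
}
```

Both give the same result for ordinary values; the direct form needs neither
the second instruction nor the temporary.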

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-17 15:24:15 -08:00
Kenneth Graunke
7bf774034a i915g: Use the actual MIN instruction.
Matt Turner noticed that the hardware has always had a MIN
instruction, but the driver always used MAX+MOV for no
apparent reason.

This should cut an instruction, and a temporary, allowing
more programs to run in hardware.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-17 15:24:15 -08:00
Kenneth Graunke
27b6ef7eca i965: Add a function to disassemble an instruction from the 4 dwords.
I used this a while back when debugging GPU hangs, and it seems like it
could be useful, so I figured I'd add it so people can use it in the
debugger.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-17 15:24:15 -08:00
Kenneth Graunke
0b499abb51 i965: Do Sandybridge workaround flushes before each primitive.
Sandybridge requires the post-sync non-zero workaround in a ton of
places, and if you ever miss one, the GPU usually hangs.

Currently, we try to track exactly when a workaround flush is
necessary (via the brw->batch.need_workaround_flush flag).  This is
tricky to get right, and we've botched it several times in the past.

This patch unconditionally performs the post-sync non-zero flush at the
start of each primitive's state upload (including BLORP).  We drop the
needs_workaround_flush flag, and drop all the other callers, as the
flush has already been performed.

We have no data to indicate that simply flushing all the time will
hurt performance, and it has the potential to help stability.

v2: Add post-sync workaround to initial GPU state upload to be extra
    cautious (suggested by Chad Versace).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2015-02-17 15:24:14 -08:00
Laura Ekstrand
92163482bd main: Fixed _mesa_GetCompressedTexImage_sw to copy slices correctly.
Previously array textures were not working with GetCompressedTextureImage,
leading to failures in the test
arb_direct_state_access/getcompressedtextureimage.c.

Tested-by: Laura Ekstrand <laura@jlekstrand.net>
Reviewed-by: Brian Paul <brianp@vmware.com>

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
2015-02-17 13:45:48 -08:00
Ian Romanick
4470bf1f49 i965/vec4: Silence unused parameter warnings
brw_vec4_copy_propagation.cpp:243:59: warning: unused parameter 'reg' [-Wunused-parameter]
                    int arg, struct copy_entry *entry, int reg)
                                                           ^

brw_vec4_generator.cpp:869:57: warning: unused parameter 'inst' [-Wunused-parameter]
 vec4_generator::generate_unpack_flags(vec4_instruction *inst,
                                                         ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-17 12:29:58 -08:00
Ian Romanick
2524f9b80d mesa/main: Silence unused parameter warning
Just remove the _mesa_free_lighting_data function.  The body has been
empty since the shine table was moved into the tnl module (commit
ba1d921).

main/light.c:1216:46: warning: unused parameter 'ctx' [-Wunused-parameter]
 _mesa_free_lighting_data( struct gl_context *ctx )
                                              ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-17 12:29:58 -08:00
Ian Romanick
1424bbfb57 util/hash: Silence comparison between signed and unsigned integer warnings in tests
delete_management.c:56:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < size; i++) {
                  ^
delete_management.c:69:27: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (i = size - 100; i < size; i++) {
                           ^
delete_management.c:79:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
       assert(key_value(entry->key) >= size - 100 &&
                               ^
delete_management.c:79:70: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
       assert(key_value(entry->key) >= size - 100 &&
                                                                      ^
insert_many.c:56:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < size; i++) {
                  ^
insert_many.c:62:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < size; i++) {
                  ^
insert_many.c:67:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    assert(ht->entries == size);
                  ^
random_entry.c:62:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < size; i++) {
                  ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-17 12:29:58 -08:00
Ian Romanick
3d8f9570cd util/hash: Silence unused parameter warnings in tests
delete_and_lookup.c:37:21: warning: unused parameter ‘key’ [-Wunused-parameter]
 badhash(const void *key)
                     ^
delete_and_lookup.c:43:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
delete_and_lookup.c:43:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^
collision.c:34:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
collision.c:34:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^
destroy_callback.c:50:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
destroy_callback.c:50:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^
insert_many.c:46:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
insert_many.c:46:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^
insert_and_lookup.c:34:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
insert_and_lookup.c:34:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^
null_destroy.c:32:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
null_destroy.c:32:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^
random_entry.c:52:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
random_entry.c:52:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^
remove_null.c:34:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
remove_null.c:34:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^
replacement.c:34:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
replacement.c:34:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-17 12:29:58 -08:00
Ian Romanick
147afac80c glcpp: Silence GCC warning
glcpp/glcpp.c:124:1: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration]
 const static struct option
 ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-17 12:29:58 -08:00
Marek Olšák
2ead74888a radeonsi: fix a crash if a stencil ref state is set before a DSA state
+ minor indentation fixes

Discovered by Axel Davy.

This can't be reproduced with any app, because all state trackers set a DSA
state first.

Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2015-02-17 17:41:00 +01:00
Marek Olšák
7713d594e4 r600g,radeonsi: implement GL_AMD_pinned_memory
v2: update release notes

Reviewed-by: Christian König <christian.koenig@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák
c688988b0d winsys/radeon: test the userptr ioctl to see if it's present
There is no other way to check for support.

Reviewed-by: Christian König <christian.koenig@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák
064847122a winsys/radeon: allow unaligned size for user-memory buffers
This is not required, but being user-friendly doesn't hurt.

Reviewed-by: Christian König <christian.koenig@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák
e8d727a2b6 winsys/radeon: allow mapping a user buffer
OpenGL requires this.

Reviewed-by: Christian König <christian.koenig@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák
8b587ee701 gallium: add interface and state tracker support for GL_AMD_pinned_memory
v2: add alignment restrictions to docs, fix indentation in headers

Reviewed-by: Christian König <christian.koenig@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák
11ebb03c26 mesa: implement GL_AMD_pinned_memory
It's not possible to query the current buffer binding, because the extension
doesn't define GL_..._BUFFER__BINDING_AMD.

Drivers should check the target parameter of Drivers.BufferData. If it's
equal to GL_EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD, the memory should be pinned.
That's all there is to it.

A piglit test is on the piglit mailing list.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-02-17 17:31:48 +01:00
Christian König
4fa61b1a23 winsys/radeon: add user pointer support
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák
e8625a29fe mesa: fix AtomicBuffer typo in _mesa_DeleteBuffers
Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-17 17:31:48 +01:00
Marek Olšák
218b15715e radeonsi: initialize TC_L2_dirty to false after buffer allocation
I forgot to do this, though "true" should have no effect on correctness.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák
a27b74819a radeonsi: small fix in SPI state
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák
5f1cef76f9 r600g,radeonsi: use fences to implement PIPE_QUERY_GPU_FINISHED
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89014

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák
f1103f6a1e r600g,radeonsi: demote TIMESTAMP_DISJOINT query to be a software query
The query result is always constant.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-17 17:31:48 +01:00
Dave Airlie
59292b38eb st/glsl_to_tgsi: fix whitespace
Every time I open this file in emacs with show trailing whitespace,
or git add from it, my screen flares with red.

Just do a general cleanup; it makes working on fp64 support not as
jarring.

I'm not saying this is perfect, it's just better than before.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-02-17 14:49:19 +10:00
Ilia Mirkin
b53fbec01d glsl/tests: add IMAGE type.
This fixes a warning when running make check.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-02-17 11:26:06 +10:00
Chia-I Wu
faaf13f6bf ilo: always set up BLEND_STATE on Gen8
There is now a DW0 that seems to be always referenced.
2015-02-17 04:59:33 +08:00
Chia-I Wu
6d4475d7bf ilo: fix alpha test on Gen8
Should use GEN8_BLEND_DW0_ALPHA_TEST_ENABLE instead of
GEN6_RT_DW1_ALPHA_TEST_ENABLE (and others).
2015-02-17 04:59:33 +08:00
Ben Widawsky
d9cd982d55 i965/simd8vs: Fix SIMD8 atomics
The short version: we need to set bits in R0.7 which provide a mask to be used
for PS kill samples/pixels. Since the VS has no such concept, we just need to
set all 1.

The longer version...
Execution for SIMD8 atomics is defined as follows:
SIMD8: The low 8 bits of the execution mask are ANDed with 8 bits of the
Pixel/Sample Mask from the message header. For the typed messages, the Slot
Group in the message descriptor selects either the low or high 8 bits. For the
untyped messages, the low 8 bits are always selected. The resulting mask is used
to determine which slots are read into the destination GRF register (for read),
or which slots are written to the surface (for write). If the header is not
present, only the low 8 bits of the execution mask are used.

The message header for untyped messages is defined in R0.7 "This field contains
the 16-bit pixel/sample mask to be used for SIMD16 and SIMD8 messages. All 16
bits are used for SIMD16 messages.  For typed SIMD8 messages, Slot Group selects
which 8 bits of this field are used. For untyped SIMD8 messages, the low 8 bits
of this field are used." Furthermore, "The message header for the untyped
messages only needs to be delivered for pixel shader threads, where the
execution mask may indicate pixels/samples that are enabled only due to
derivative (LOD) calculations, but the corresponding slot on the surface must
not be accessed." We're not using a pixel shader here, but AFAICT, this mask is
used for all stages.

This leaves two options: remove the header, or make the VS code emit the correct
thing for the header. I believe one of the goals of using SIMD8 VS was to get as
much code reuse as possible, and so I chose the latter. Since the VS has no such
thing as kill instructions, the mask is simply all 1's.
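
The masking rule quoted above can be modeled in a few lines. This is only a
reading of the quoted text, not driver or hardware code; the function and
parameter names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Which of the 8 slots of a SIMD8 message actually access the surface. */
static uint8_t simd8_enabled_slots(uint8_t exec_mask_low8,
                                   bool header_present,
                                   uint16_t header_pixel_mask, /* R0.7 */
                                   bool typed,
                                   bool slot_group_high)
{
   /* Without a header only the execution mask applies. */
   if (!header_present)
      return exec_mask_low8;

   /* Typed messages pick the low or high byte via Slot Group;
    * untyped messages always use the low byte. */
   uint8_t sample_mask = (typed && slot_group_high)
      ? (uint8_t)(header_pixel_mask >> 8)
      : (uint8_t)(header_pixel_mask & 0xff);

   return (uint8_t)(exec_mask_low8 & sample_mask);
}
```

With a header whose mask field is zero, no slot is enabled regardless of the
execution mask; with the field set to all 1's, the execution mask alone
decides.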

v2:
Add a comment to the code (stolen from Curro on the mailing list)
Change the control flow style (Curro + Jason)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87258
Cc: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-16 12:22:44 -08:00
Brian Paul
9ac3700146 mesa: move assertion after declarations in texstore.c
To fix MSVC build.
2015-02-16 08:39:25 -07:00
Brian Paul
4d2cee4d5e mesa: silence uninitialized var warning in get_tex_rgba_uncompressed()
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-16 08:33:28 -07:00
Neil Roberts
bb77745681 meta: Fix saving the results of the current occlusion query
When restoring the current state in _mesa_meta_end it was previously trying to
copy the on-going sample count of the current occlusion query into the new
query after restarting it so that the driver will continue adding to the
previous value. This wouldn't work for two reasons. Firstly, the query might
not be ready yet so the Result member will usually be zero. Secondly the saved
query is stored as a pointer to the query object, not a copy of the struct, so
it is actually restarting the exact same object. Copying the result value is
just copying between identical addresses with no effect. The call to
_mesa_BeginQuery will have always reset it back to zero.

This patch fixes it by making it actually wait for the query object to be
ready before grabbing the previous result. The downside of doing this is that
it could introduce a stall but I think this situation is unlikely so it might
not matter too much. A better solution might be to introduce a real
suspend/resume mechanism to the driver interface. This could be implemented in
the i965 driver by saving the depth count multiple times like it does in the
i945 driver.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88248
Reviewed-by: Carl Worth <cworth@cworth.org>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
2015-02-16 12:09:17 +00:00
Francisco Jerez
946e29847b i965/vec4: Override destination register writemask in sampler message send.
This line was removed by accident in commit
16b9112574 causing a regression in the
ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_vert Khronos conformance
test.  It's necessary because the swizzle_result() code below expects
all four components of the vector to be valid.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89094
Tested-by: Lu Hua <huax.lu@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-16 13:51:08 +02:00
Iago Toral Quiroga
0a811e1d1e i965: Fix a crash in the texture gradient lowering pass with cube samplers
We need to swizzle the rhs to match the number of components in the writemask,
otherwise we'll hit an assertion in ir_assignment.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-16 10:53:48 +01:00
Iago Toral Quiroga
ba426522dd mesa: Fix element count for byte-swaps in texstore, readpix and texgetimage
Some old format conversion code in pack.c implemented byte-swapping like this:

GLint comps = _mesa_components_in_format(dstFormat);
GLint swapSize = _mesa_sizeof_packed_type(dstType);
if (swapSize == 2)
   _mesa_swap2((GLushort *) dstAddr, n * comps);
else if (swapSize == 4)
   _mesa_swap4((GLuint *) dstAddr, n * comps);

where n is the pixel count. But this is incorrect for packed formats,
where _mesa_sizeof_packed_type is already returning the size of a pixel
instead of the size of a single component, so multiplying this by the
number of components in the format results in a larger element count
for _mesa_swap than we want.

Unfortunately, we followed the same implementation for byte-swapping
in the rewrite of the format conversion code for texstore, readpixels
and texgetimage.

This patch computes the correct element counts for _mesa_swap calls
by computing the bytes per pixel in the image and dividing that by the
swap size to obtain the number of swaps required per pixel. It then multiplies
that by the number of pixels in the image to obtain the swap count that
we need to use.
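
A minimal sketch of the element-count arithmetic described above, not the
actual patch. The helper name swap_element_count and its parameters are
hypothetical; only the calculation (bytes per pixel divided by swap size,
times the pixel count including depth) follows the description.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only: returns the element count
 * to hand to a 2- or 4-byte swap routine for a whole image.
 */
static size_t
swap_element_count(size_t bytes_per_pixel, size_t swap_size,
                   size_t width, size_t height, size_t depth)
{
   /* Only 2- and 4-byte swaps are meaningful here. */
   assert(swap_size == 2 || swap_size == 4);
   assert(bytes_per_pixel % swap_size == 0);

   size_t swaps_per_pixel = bytes_per_pixel / swap_size;

   /* Depth is part of the pixel count, so 3D images are fully covered. */
   return swaps_per_pixel * width * height * depth;
}
```

For a packed type that stores a whole 4-byte pixel in one element, this
gives one swap per pixel, where n * comps would have given four.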

Also, when handling byte-swapping in texstore_rgba, we were ignoring
the image's depth. This patch fixes this too.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
2015-02-16 10:51:18 +01:00
Iago Toral Quiroga
4b249d2eed mesa: Handle transferOps in texstore_rgba
In the recent rewrite of the format conversion code we did not handle this.
This patch adds the missing support.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89068
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
2015-02-16 10:49:41 +01:00
Matt Turner
a2299bfbbd i965/fs: Handle U/UW-type immediates in the generator. 2015-02-15 14:29:08 -08:00
Matt Turner
7a83f7d481 i965/fs: Handle W/UW-type immediates in dump_instructions(). 2015-02-15 14:29:08 -08:00
Matt Turner
74ef90acd7 i965: Let dump_instructions() work before calculate_cfg().
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-02-15 12:24:11 -08:00
Matt Turner
fa124a337c i965/fs: Call calculate_cfg() before optimize().
The CFG is fundamental to the FS IR, not merely a piece of optimization.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-02-15 12:24:11 -08:00
Matt Turner
eb47d0efd3 i965: Optimize multiplication by -1 into a negated MOV.
instructions in affected programs:     968 -> 942 (-2.69%)
helped:                                4

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-15 12:24:10 -08:00
Matt Turner
e8a6f2ad65 i965: Add an is_negative_one() method.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-15 12:24:10 -08:00
Matt Turner
72b9f8db2a i965/vec4/vp: Use vec4_visitor::CMP.
... instead of emit(BRW_OPCODE_CMP, ...). In commit 6b3a301f I changed
vec4_visitor::CMP to set the destination's type to that of src0. In the
following commit (2335153f) I removed an apparently now unnecessary
workaround for Gen8 that did the same thing.

But there was a single place that emitted a CMP instruction without
using the vec4_visitor::CMP function. Use it there.

And change dst_null_d to dst_null_f for good measure, since ARB vp
doesn't have integers.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89032
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-15 12:24:10 -08:00
Chia-I Wu
69b1693ef3 ilo: fix some state pointer commands on Gen8
3DSTATE_CC_STATE_POINTERS seems to be ignored when bit 0 of DW1 is not set.
Follow i965 and set the bit for 3DSTATE_CC_STATE_POINTERS and
3DSTATE_BLEND_STATE_POINTERS.  Add gen checks for all state pointer commands.
2015-02-15 13:32:41 +08:00
Ilia Mirkin
854eb06bee nvc0: allow holes in xfb target lists
Tested with a modified xfb-streams test which outputs to streams 0, 2,
and 3.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-02-14 17:15:54 -05:00
Ilia Mirkin
80d373ed5b st/mesa: treat resource-less xfb buffers as if they weren't there
If a transform feedback buffer's size is 0, st_bufferobj_data doesn't
end up creating a buffer for it. There's no point in trying to write to
such a buffer, so just pretend as if it's not really there.

This fixes arb_gpu_shader5-xfb-streams-without-invocations on nvc0.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-02-14 17:15:54 -05:00
Ilia Mirkin
68e4f3f572 nvc0: bail out of 2d blits with non-A8_UNORM alpha formats
This fixes the teximage-colors uploads with GL_ALPHA format and
non-GL_UNSIGNED_BYTE type.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-02-14 17:15:54 -05:00
Jason Ekstrand
3c57a59527 i965/nir: Don't support gl_FrontFacing as an input variable
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-14 13:47:16 -08:00
Jason Ekstrand
dd110cdfd8 nir: Make gl_FrontFacing a system_value
GLSL IR labels gl_FrontFacing as an input variable and not a system value.
This commit makes NIR silently translate gl_FrontFacing to a system value
so that it properly gets translated into a load_system_value intrinsic.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-14 13:47:16 -08:00
Jason Ekstrand
785b22caee i965/nir: Add support for nir_intrinsic_load_front_face
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-14 13:47:16 -08:00
Jason Ekstrand
929f43851e nir/lower_phis_to_scalar: Fix some logic in is_phi_scalarizable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-14 13:46:59 -08:00
Shawn Starr
7df256add2 clover: Use Legacy PassManager for LLVM trunk (3.7)
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Shawn Starr <shawn.starr@rogers.com>
2015-02-14 01:31:57 +00:00
Chia-I Wu
8323796840 ilo: fix JIP/UIP on Gen8
UIP is in DW2 and JIP is in DW3 on Gen8.  Also, the units are in bytes.
2015-02-14 06:52:36 +08:00
Chia-I Wu
c62507f42c ilo: do not set GEN6_THREADCTRL_SWITCH
It is not needed on Gen6+, and it appears to be broken on Gen8.
2015-02-14 06:52:36 +08:00
Chia-I Wu
7504b357d4 ilo: correct ISA UIP/JIP decoding for Gen8
JIP is int32_t and UIP is in DW2 on Gen8.
2015-02-14 06:52:36 +08:00
Chia-I Wu
f8126fed95 ilo: prepare for 64-bit immediates decoding
Replace imm32 by imm64.  Add more ways (UD, D, etc) to access the immediate.
2015-02-14 06:52:36 +08:00
Chia-I Wu
9ed376a76c ilo: cleanup ISA DW1 decoding
Decode the higher and lower 16 bits separately.
2015-02-14 06:52:36 +08:00
Chia-I Wu
db362983d1 ilo: cleanup ISA DW0 decoding
Add disasm_inst_decode_dw0_opcode_gen6() to decode the opcode.  Simplify
branch_ctrl/acc_wr_ctrl decoding.
2015-02-14 06:52:36 +08:00
Chia-I Wu
5fc0dd8953 ilo: update some outdated gen checks
Update gen checks for 3DSTATE_POLY_STIPPLE_OFFSET,
3DSTATE_POLY_STIPPLE_PATTERN, 3DSTATE_LINE_STIPPLE, and
3DSTATE_AA_LINE_PARAMETERS.
2015-02-14 06:52:36 +08:00
Chia-I Wu
8b9446dbeb ilo: fix rectlist length on Gen8
5 PIPE_CONTROLs, 2 3DSTATE_WM_HZ_OP, and depth buffer setup require 65 DWords.
2015-02-14 06:52:36 +08:00
Chia-I Wu
baba8b2745 ilo: fix 3DSTATE_VF_TOPOLOGY
The pipe primitive type was wrongly translated twice.
2015-02-14 06:52:36 +08:00
Jose Fonseca
c944b91190 os,llvmpipe: Set rasterizer thread names on Linux.
To help identify llvmpipe rasterizer threads -- especially when there
can be so many.

We can eventually generalize this to other OSes, but for that we must
restrict the function to be called from the current thread.  See also
http://stackoverflow.com/a/7989973

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-02-13 19:42:21 +00:00
Jose Fonseca
b09f25428f util/u_atomic: Don't test p_atomic_add with booleans.
Add another class of tests.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=89112

I failed to spot this in my previous change, because bool was a typedef
for char on the system I tested.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-02-13 19:39:27 +00:00
Tapani Pälli
e333035c47 mesa: fix OES_texture_float texture render target behavior
The current implementation allowed textures with the unsized types GL_FLOAT
and GL_HALF_FLOAT to be used as render targets, as this was 'expected
behavior' for WEBGL_oes_texture_float and is also allowed by the
oes-texture-float WebGL test. However, this broke some ES3 conformance tests
that do not accept such behavior. This patch marks such an FBO incomplete, as
expected by the ES3 conformance tests. Textures with sized types like RGBA32F
will continue to work as render targets.

v2: code style cleanups (Ian Romanick, Matt Turner)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88905
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
2015-02-13 07:51:13 +02:00
Eric Anholt
3f1e1287fd vc4: Make SF be a flag on the QIR instructions.
Right now the places that used to emit a mov.sf just put the SF on the
previous instruction when it generated the source of the SF value.  Even
without optimization to push the SF up further (and thus potentially
kill more MOVs), this gets us:

total uniforms in shared programs: 13455 -> 13457 (0.01%)
uniforms in affected programs:     3 -> 5 (66.67%)
total instructions in shared programs: 40296 -> 40198 (-0.24%)
instructions in affected programs:     12595 -> 12497 (-0.78%)
2015-02-12 16:33:16 -08:00
Eric Anholt
4413861dd8 r200: Drop unused variable.
Quiets compiler warning since e7f2f2dea5.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-02-12 16:33:16 -08:00
Eric Anholt
55de910f90 i965: Quiet another compiler warning about uninitialized values.
The compiler can't tell that we're always going to hit the first if block
on the first time through the loop.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-12 16:33:16 -08:00
Eric Anholt
f65e26478b i965: Move some asserts to unreachable.
If execution was supposed to be supported in this case, we'd run into
trouble from completely uninitialized sat_imm values.

v2: Drop the '!' before the string.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-12 16:32:10 -08:00
Eric Anholt
6489cb1ae6 i965: Shut up a compiler warning about uninitialized var.
We always pass this argument, even if it won't be used by the particular
texture op.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-12 16:29:53 -08:00
Carl Worth
55a57834bf Revert use of Mesa IR optimizer for ARB_fragment_programs
Commit f82f2fb3dc added use of the Mesa
IR optimizer for both ARB_fragment_program and ARB_vertex_program, but
only justified the vertex-program portions with measured performance
improvements.

Meanwhile, the optimizer was seen to generate hundreds of unused
immediates without discarding them, causing failures.

Discard the use of the optimizer for now to fix the regression. (In
the future, we anticipate things moving from Mesa IR to NIR for better
optimization anyway.)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82477

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

CC: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-02-12 13:33:12 -08:00
Jose Fonseca
1ba9f9e62c util/u_atomic: Use lower-case variables in _Interlocked* helpers. 2015-02-12 19:32:21 +00:00
Jose Fonseca
531d47baa8 util/u_atomic: Add _InterlockedExchangeAdd8/16 for older MSVC.
We need to build certain parts of Mesa (namely gallium, llvmpipe, and
therefore util) with Windows SDK 7.0.7600, which includes MSVC 2008.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-02-12 19:32:21 +00:00
Jose Fonseca
d2438f5920 util/u_atomic: Test p_atomic_add() for 8bit integers.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-02-12 19:32:21 +00:00
Ilia Mirkin
b1e70f2423 docs: add ARB_draw_indirect to ES 3.1 list
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-12 11:12:29 -05:00
Axel Davy
63986f9580 egl: Soften several HAVE_DRM_PLATFORM to HAVE_LIBDRM
To fix the build when libdrm is not found,
commit a594cec7e3 put several
parts of the egl code under #ifdef HAVE_DRM_PLATFORM.

HAVE_DRM_PLATFORM means the egl drm platform is being built.
What should have been used instead is HAVE_LIBDRM.

In a few locations, the HAVE_DRM_PLATFORM checks that were introduced
have already been replaced by HAVE_LIBDRM; this patch
replaces the remaining occurrences.

This patch makes for example EGL_EXT_image_dma_buf_import
be advertised by egl under x11 when the drm egl platform
is not built, whereas previously it required the drm egl
platform to be built.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-12 13:20:22 +00:00
Emil Velikov
c39dbfdd0f auxiliary/vl: bring back the VL code for the dri targets
With commit c642e87d9f4 (auxiliary/vl: rework the build of the VL code)
we split out the VL code into a separate static library that was meant
to be used by the VL targets alone - va, vdpau, xvmc.

The commit failed to consider the way we handle vdpau-gl interop and
broke it. Bring back the functionality by keeping the vl <> vl_stub
separation as requested by Christian.

v2: Update the omx target as well. Update mesa-stable email address.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86837
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Andy Furniss <adf.lists@gmail.com>
2015-02-12 13:19:26 +00:00
Emil Velikov
153539bd9d configure: rework wayland_scanner handling(fix make distcheck)
Currently having the wayland-scanner is optional, which causes problems
when autotools parses through the makefiles, and tries to generate all
the BUILT_SOURCES.

As the config option --with-egl-platform=wayland is not the default, we
won't end up setting the WAYLAND_SCANNER variable, which in turn will
cause some files to not get generated.

There has been a wayland-scanner package as of wayland 1.2 which
provides a variable for the scanner binary, so let's use that one and
fall back to manually searching via AC_PATH_PROG when needed.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-12 13:19:20 +00:00
Emil Velikov
72e602905d nir: add missing header to the sources list
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-12 13:19:13 +00:00
Emil Velikov
556fc4b84d nir: resolve nir.h dependency list (fix make distcheck)
Use nir/nir_opcodes.h as is (w/o the absolute path), as it is the target
name used to generate the actual file. Otherwise the target is missing,
the file won't get generated and the build will fail.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-12 13:18:52 +00:00
Martin Peres
9f7efa78a8 docs: update GL3.txt to state my current work on the dsa extension
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2015-02-12 11:24:37 +02:00
Ben Widawsky
e93566a15c i965/vs/skl: Use vec4 datatypes for message header
We're using a SIMD4x2 sampler message, which has execsize 4, and so the
register width must be <= 4.  Use <4,4,1> regioning instead of <8,8,1>
regioning to access the same data but avoid tripping the assert.

Fixes the following piglit tests:
spec/glsl-1.20/compiler/structure-and-array-operations/array-selection.vert
spec/glsl-es-3.00/compiler/uniform_block/interface-name-basic.vert
spec/glsl-es-3.00/compiler/uniform_block/interface-name-field-clashes-with-struct.vert
spec/glsl-es-3.00/compiler/uniform_block/interface-name-field-clashes-with-function.vert
spec/glsl-es-3.00/compiler/uniform_block/interface-name-array.vert
glslparsertest/glsl2/condition-07.vert
spec/glsl-es-3.00/compiler/uniform_block/interface-name-field-clashes-with-variable.vert

v2: Better commit message courtesy of Ken.
I had a discussion with Ken, and we both question how we end up with a mov and
execsize 4. For now though, this fixes the piglit tests, so we can worry about
it later.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-11 21:41:58 -08:00
Chia-I Wu
cba6a4a129 ilo: update screen init for Gen8
This is very preliminary and is only tested with glxgears.  All information
about Gen8 is derived from i965 and beignet.
2015-02-12 08:05:07 +08:00
Chia-I Wu
cb1cdecf64 ilo: update outdated render command emissions for Gen8 2015-02-12 07:56:13 +08:00
Chia-I Wu
9ab4fc4e63 ilo: update rectlist command emission for Gen8 2015-02-12 07:56:13 +08:00
Chia-I Wu
4caf8d9761 ilo: update draw command emission for Gen8 2015-02-12 07:56:13 +08:00
Chia-I Wu
d8927ab02f ilo: update surface state emission for Gen8 2015-02-12 07:56:13 +08:00
Chia-I Wu
7832a3013b ilo: update dynamic state emission for Gen8 2015-02-12 07:56:13 +08:00
Chia-I Wu
8682cbab3e ilo: update outdated gen assertions for Gen8 2015-02-12 07:56:12 +08:00
Chia-I Wu
c173a5288f ilo: add new WM related helpers for Gen8 2015-02-12 07:56:12 +08:00
Chia-I Wu
8c2cbc8955 ilo: update VS related functions for Gen8 2015-02-12 07:56:12 +08:00
Chia-I Wu
0e3381154c ilo: update VF related functions for Gen8 2015-02-12 07:56:12 +08:00
Chia-I Wu
a57805cb75 ilo: update SAMPLER_STATE for Gen8 2015-02-12 07:56:12 +08:00
Chia-I Wu
7e7e45db65 ilo: update SAMPLER_BORDER_COLOR_STATE for Gen8 2015-02-12 07:56:12 +08:00
Chia-I Wu
8976a190b2 ilo: update depth clear value for Gen8 2015-02-12 07:56:12 +08:00
Chia-I Wu
0b7fdce4f5 ilo: update ilo_zs_surface for Gen8 2015-02-12 07:56:12 +08:00
Chia-I Wu
aa7109f059 ilo: update ilo_view_surface for Gen8 2015-02-12 07:56:12 +08:00
Chia-I Wu
7922982d4f ilo: update texture layout for Gen8 2015-02-12 07:56:12 +08:00
Chia-I Wu
47dc2ae6e2 ilo: update ilo_blend_state and related functions for Gen8 2015-02-12 07:56:12 +08:00
Chia-I Wu
e8455128aa ilo: update ilo_dsa_state and related functions for Gen8 2015-02-12 07:56:12 +08:00
Chia-I Wu
9aeee99e4d ilo: update multisample related states for Gen8 2015-02-12 07:56:12 +08:00
Chia-I Wu
6366fbc1a8 ilo: update WM and PS related functions for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu
584d3369b6 ilo: update SBE related functions for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu
4cb592ec17 ilo: update SF related functions for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu
05e2eb57cd ilo: update CLIP related functions for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu
9ab0165375 ilo: update SF_CLIP_VIEWPORT for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu
b64aeebbcc ilo: update streamout related functions for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu
6f77bd3bdc ilo: update 3DSTATE_{DS,HS,GS} for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu
3be0504399 ilo: update 3DSTATE_CONSTANT_x for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu
49306afe7b ilo: update 3DSTATE_URB_x for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu
d43ae05d76 ilo: update 3DSTATE_PUSH_CONSTANT_ALLOC_x for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu
f43332ca2f ilo: update render engine common helpers for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu
8d9f69bef2 ilo: update BLT helpers for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu
574f8d0229 ilo: update MI helpers for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu
bfc8a72609 ilo: add functions for Gen8 relocs
Extend ilo_builder_writer_reloc() for Gen8 memory addressing.  Add new
wrappers, ilo_builder_surface_reloc64() and ilo_builder_batch_reloc64().
2015-02-12 07:56:11 +08:00
Chia-I Wu
a7911620f6 ilo: update the toy compiler for Gen8
Based on what we know from the classic driver.
2015-02-12 07:56:11 +08:00
Chia-I Wu
0066c22c40 ilo: update genhw headers
Accumulated changes for various renames and additions, including Gen8
definitions.  Some of the dynamic state __SIZE no longer means the size of an
element, but the size of an array of elements.  The changes can be seen in
ilo_render_dynamic.c.
2015-02-12 07:56:10 +08:00
Chia-I Wu
5933d84ad6 ilo: clean up ilo_gpe_init_dsa()
Add dsa_get_stencil_enable_gen6(), dsa_get_depth_enable_gen6(), and
dsa_get_alpha_enable_gen6() to be called from ilo_gpe_init_dsa().
2015-02-12 07:56:10 +08:00
Chia-I Wu
aa354b92d2 ilo: clean up ilo_gpe_init_blend()
Make ilo_blend_state more space efficient and forward-looking.
2015-02-12 07:56:10 +08:00
Chia-I Wu
1d07055b50 ilo: clean up sample patterns
Use signed int for sample positions and add helpers to access them.  Call them
patterns instead of positions.
2015-02-12 07:56:10 +08:00
Matt Turner
69ad5fd4ce glsl: Optimize (f2i(trunc x)) into (f2i x).
total instructions in shared programs: 5950326 -> 5949286 (-0.02%)
instructions in affected programs:     88264 -> 87224 (-1.18%)
helped:                                692
2015-02-11 13:50:19 -08:00
Matt Turner
c262b2b582 glsl: Optimize round-half-up pattern.
Hurts some Psychonauts shaders, but after the next patch (which this
enables) they're fewer instructions than before this patch.
2015-02-11 13:50:19 -08:00
Matt Turner
a5455ab1ca glsl: Add trunc() to ir_builder. 2015-02-11 13:50:19 -08:00
Matt Turner
d91390634f i965: Add LINTERP/CINTERP to can_do_cmod().
LINTERP is implemented as a PLN instruction or a LINE+MAC. PLN and MAC
can do conditional mod. CINTERP is just a MOV.

total instructions in shared programs: 5952103 -> 5950284 (-0.03%)
instructions in affected programs:     324573 -> 322754 (-0.56%)
helped:                                1819

We lose the SIMD16 in one Unigine Heaven shader which appears six times
in shader-db.
2015-02-11 13:50:19 -08:00
Matt Turner
245c7848fc program: Remove _mesa_nop_vertex_program/_mesa_nop_fragment_program.
Dead since

   commit 284ce20901
   Author: Eric Anholt <eric@anholt.net>
   Date:   Fri Aug 20 10:52:14 2010 -0700

       Remove remnants of the old glsl compiler.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-11 13:50:19 -08:00
Matt Turner
4c42e1116b nir: Recognize open-coded fmin/fmax.
And unfortunately other shaders do the same thing but with >=/<= which
we can't apply this optimization to because of NaNs.
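
To illustrate why the comparison operator matters, here is a small sketch
(plain C, not NIR code) of two open-coded minimum patterns. The function
names are made up for this example. The two agree for ordinary numbers but
select different operands when an input is NaN, because every ordered
comparison against NaN is false.

```c
#include <assert.h>

/* Illustrative only: open-coded minimum using '<'. */
static float
min_lt(float a, float b)
{
   return (a < b) ? a : b;
}

/* Illustrative only: the "same" minimum written with '>='. */
static float
min_ge(float a, float b)
{
   return (a >= b) ? b : a;
}

/* NaN is the only value that compares unequal to itself. */
static int
is_nan(float x)
{
   return x != x;
}
```

With a = NaN and b = 2.0, min_lt() selects b while min_ge() selects a, so
the two patterns cannot be rewritten to the same operation without changing
results for NaN inputs.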

instructions in affected programs:     23309 -> 22938 (-1.59%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-11 13:50:19 -08:00
Eric Anholt
56e21647e2 nir: Add algebraic opt for int comparisons with identical operands.
No change on shader-db on i965.

v2: Reword the comment due to feedback from Erik Faye-Lund

Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v1)
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> (v1)
2015-02-11 11:52:38 -08:00
Eric Anholt
2919bdf466 nir: Fix load_const comparisons for CSE.
We want the size of a float per component, not the size of a whole vec4.

NIR instructions on i965:
total instructions in shared programs: 1261937 -> 1261929 (-0.00%)
instructions in affected programs:     114 -> 106 (-7.02%)

Looking at one of these examples (tesseract), it's from vec4 load_consts
for a MRT solid fill, which do get CSEed now that we don't memcmp off the
end of the const value and into the SSA def.  For the 1-component loads
that are common in i965, we were only memcmping off into the rest of the
usually zero-filled const_value.
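
A rough sketch of the size calculation, assuming a simplified stand-in
structure; struct const_load and const_loads_equal are hypothetical names
and do not match the real NIR types. The point is only that the compared
length is one float per component.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a constant-load instruction; this is not the
 * real NIR structure.
 */
struct const_load {
   unsigned num_components;   /* 1 to 4 */
   float value[4];            /* only the first num_components are used */
};

/* Returns nonzero if the two loads produce the same constant. */
static int
const_loads_equal(const struct const_load *a, const struct const_load *b)
{
   assert(a->num_components >= 1 && a->num_components <= 4);

   if (a->num_components != b->num_components)
      return 0;

   /* Compare one float per component.  Using the size of a whole vec4 per
    * component would read past the end of value[].
    */
   return memcmp(a->value, b->value,
                 a->num_components * sizeof(float)) == 0;
}
```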

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-11 11:52:38 -08:00
Matt Turner
09d6ea9ae3 i965/fs: Remove conditional mod when optimizing a SEL into a MOV.
Missed in commit ca675b73, but got right in the companion commit 3c28b2c0.
2015-02-11 10:26:49 -08:00
Jeremy Huddleston Sequoia
e68b67b53f darwin: build fix
xfont.c:237:14: error: implicit declaration of function 'GetGLXDRIDrawable' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
   glxdraw = GetGLXDRIDrawable(CC->currentDpy, CC->currentDrawable);
             ^
Fixes regression from 291be28476

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2015-02-10 22:22:33 -08:00
Jeremy Huddleston Sequoia
1c67a5687a darwin: build fix
../../../src/mesa/main/compiler.h:47:10: fatal error: 'util/macros.h' file not found

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2015-02-10 20:35:10 -08:00
Matt Turner
ea0f0eb6c0 glsl: Optimize 1/exp(x) into exp(-x).
Lots of shaders divide by exp2(...) which we turn into a multiplication
by the reciprocal. We can avoid the reciprocal by simply negating exp2's
argument.

total instructions in shared programs: 5947154 -> 5946695 (-0.01%)
instructions in affected programs:     118661 -> 118202 (-0.39%)
helped:                                380

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-10 17:48:44 -08:00
Matt Turner
a9065cef48 nir: Remove casts from void*.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-10 17:48:42 -08:00
Matt Turner
bb1e007157 nir: Replace assert(0) with unreachable().
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-10 17:48:31 -08:00
Matt Turner
942b56ad05 nir: Remove unused has_indirect variable.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-10 17:48:16 -08:00
Matt Turner
fff0b2eab5 i965/vec4: Emit MADs from (x + abs(y * z)).
Same as commit 3654b6d4 to the fs backend.

total instructions in shared programs: 5945788 -> 5945787 (-0.00%)
instructions in affected programs:     36 -> 35 (-2.78%)
helped:                                1

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-10 17:48:15 -08:00
Matt Turner
3d581f9996 i965/vec4: Emit MADs from (x + -(y * z)).
Same as commit c4fab711 to the fs backend.

total instructions in shared programs: 5945998 -> 5945788 (-0.00%)
instructions in affected programs:     74665 -> 74455 (-0.28%)
helped:                                399
HURT:                                  180

It hurts some programs because we make no attempts in the vec4 backend
to avoid MADs if they have constant (or vector uniform) arguments.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-10 17:47:37 -08:00
Neil Roberts
5b29b2922a i965/skl: Implement WaDisable1DDepthStencil
Skylake+ doesn't support setting a depth buffer to a 1D surface but it
does allow pretending it's a 2D texture with a height of 1 instead.

This fixes the GL_DEPTH_COMPONENT_* tests of the copyteximage piglit
test (and also seems to avoid a subsequent GPU hang).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89037
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-10 18:00:21 +00:00
Francisco Jerez
1b224290fb i965/gen7-8: Implement glMemoryBarrier().
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-10 19:09:25 +02:00
Francisco Jerez
46b03d5400 i965: Generalize the update_null_renderbuffer_surface vtbl hook to non-renderbuffers.
Null surfaces are going to be useful to have something to point
unbound image units to, as the ARB_shader_image_load_store extension
requires us to behave deterministically in cases where some shader
tries to access an unbound image unit: Invalid stores and atomics are
supposed to be discarded and invalid loads are supposed to return
zero, which is precisely what the null surface does.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-10 19:09:25 +02:00
Francisco Jerez
342b7ce7d4 i965: Allocate binding table space for shader images.
v2: Bump the number of supported image uniforms to 32 (Ken).

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-10 19:09:25 +02:00
Francisco Jerez
36a17f0f99 i965: Don't tile 1D miptrees.
It doesn't really improve locality of texture fetches; quite the
opposite, it's a waste of memory bandwidth and space due to tile
alignment.

v2: Check mt->logical_height0 instead of mt->target (Ken).  Add short
    comment explaining why they shouldn't be tiled.

Reviewed-by: Neil Roberts <neil@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-10 19:09:25 +02:00
Francisco Jerez
b40bcd24e0 i965/vec4: Don't set any dependency control bits for F32TO16 on Gen8.
It's expanded to several instructions.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:25 +02:00
Francisco Jerez
aef83957e1 i965: Handle negated unsigned immediate values in constant propagation.
Negation of UD/UW sources behaves the same as for D/W sources, taking
the two's complement of the source, except for bitwise logical
operations on Gen8 and up which take the one's complement.  Fixes
crash in a GLSL shader with subtraction of two unsigned values.
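
The rule above could be sketched as follows. This is an illustration under
assumptions, not the driver's code: negate_ud_immediate and its
logical_op_gen8 flag are hypothetical, with the flag standing for "the
instruction is a bitwise logical operation and the hardware is Gen8 or
later".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: fold a source negate modifier into a 32-bit
 * unsigned immediate.
 */
static uint32_t
negate_ud_immediate(uint32_t imm, int logical_op_gen8)
{
   if (logical_op_gen8)
      return ~imm;          /* one's complement */

   return 0u - imm;         /* two's complement, well defined for unsigned */
}
```

For an immediate of 1, the first case yields 0xfffffffe and the second
0xffffffff.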

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:25 +02:00
Francisco Jerez
64fde7b31c i965/vec4: Take into account non-zero reg_offset during register allocation.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:25 +02:00
Francisco Jerez
78e9043475 i965/vec4: Add register classes up to MAX_VGRF_SIZE.
In preparation for some send from GRF instructions that will require
larger payloads.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:25 +02:00
Francisco Jerez
530445330b i965/vec4: Init mlen for several send from GRF instructions.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:25 +02:00
Francisco Jerez
5f878d1b47 i965/vec4: Don't infer MRF dependencies for send from GRF instructions.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:24 +02:00
Francisco Jerez
de666fc102 i965/vec4: Fix the scheduler to take into account reads and writes of multiple registers.
v2: Avoid nested ternary operators in vec4_instruction::regs_read(). (Matt)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:24 +02:00
Francisco Jerez
8ad486077e i965/vec4: Make vec4_visitor::implied_mrf_writes() return zero for sends from GRF.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:24 +02:00
Francisco Jerez
16b9112574 i965/vec4: Pass dst register to the vec4_instruction constructor.
So regs_written gets initialized with a sensible value.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:24 +02:00
Francisco Jerez
0c902a8f78 i965/vec4: Initialize vec4_instruction::predicate and ::predicate_inverse.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:24 +02:00
Francisco Jerez
388b136e67 i965/vec4: Implement equals() method for dst_reg too.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:24 +02:00
Francisco Jerez
3df2cb2f86 i965/fs: Fix fs_inst::regs_written calculation for instructions with scalar dst.
Scalar registers are required to have zero stride, fix the
regs_written calculation not to assume that the instruction writes
zero registers in that case.

v2: Rename CEILING() to DIV_ROUND_UP(). (Matt, Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 16:05:51 +02:00
Francisco Jerez
f2668f9f21 i965/fs: Fix stack allocation of fs_inst and stop stealing src array provided on construction.
Using 'ralloc*(this, ...)' is wrong if the object has automatic
storage or was allocated through any other means.  Use normal dynamic
memory instead.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 16:05:51 +02:00
Francisco Jerez
c472793a2a i965/fs: Remove duplicate include of brw_shader.h
The second one was inside an extern "C" block, luckily it was being
discarded by the preprocessor.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 16:05:51 +02:00
Francisco Jerez
dfe957c02b i965: Move up fs_inst::flag_subreg to backend_instruction.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 16:05:51 +02:00
Francisco Jerez
639696aa05 i965: Move up fs_inst::regs_written to backend_instruction.
It will also be useful in the VEC4 back-end.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 16:05:51 +02:00
Francisco Jerez
4ed52e8bc4 i965/vec4: Remove dependency of vec4_instruction on the visitor class.
The only reason why you need a vec4_visitor to construct a
vec4_instruction is to initialize vec4_instruction::ir and
::annotation.  Instead set them from vec4_visitor::emit() just like
fs_visitor does.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 16:05:50 +02:00
Francisco Jerez
a3ee6c7d19 i965/fs: Remove dependency of fs_inst on the visitor class.
The fs_visitor argument of fs_inst::regs_read() wasn't used at all.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 16:05:50 +02:00
Francisco Jerez
bfbb0e84e1 i965: Move IR object definitions to separate header files.
One should be able to manipulate i965 IR without pulling the whole
FS/VEC4 visitor classes -- Optimization passes and other
transformations would ideally be visitor-agnostic.  Among other issues
this avoids a circular dependency between the header file where such
visitor-agnostic code will be defined and the main FS/VEC4 header
where both IR (layer below) and visitor (layer above) happen to be
defined.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 16:05:50 +02:00
Francisco Jerez
447879eb88 i965: Factor out virtual GRF allocation to a separate object.
Right now virtual GRF book-keeping and allocation is performed in each
visitor class separately (among other hundred different things),
leading to duplicated logic in each visitor and preventing layering as
it forces any code that manipulates i965 IR and needs to allocate
virtual registers to depend on the specific visitor that happens to be
used to translate from GLSL IR.

v2: Use realloc()/free() to allocate VGRF book-keeping arrays (Connor).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 16:05:47 +02:00
Francisco Jerez
e6146e6f14 glsl: Forbid calling the constructor of any opaque type.
The spec doesn't define any opaque type constructors.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-10 15:49:43 +02:00
Francisco Jerez
c4111dfa0a glsl: Return correct number of coordinate components for cubemap array images.
Cubemap array images are unlike cubemap array samplers in that they don't need
an additional coordinate to index individual cubemaps in the array, instead
they behave like a 2D array of 6n layers, with n the number of cubemaps in the
array.  Take this exception into account.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-10 15:49:43 +02:00
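The layout described in c4111dfa0a can be sketched as below. Both helpers are hypothetical and are not the GLSL compiler's API; they only illustrate the "2D array of 6n layers" view and the resulting coordinate counts.

```c
/* Hypothetical helpers for illustration only. */
enum { SKETCH_FACES_PER_CUBE = 6 };

/* Layer of the 2D array that holds face `face` (0..5) of cubemap `cube`. */
static unsigned
sketch_cube_array_layer(unsigned cube, unsigned face)
{
   return cube * SKETCH_FACES_PER_CUBE + face;
}

/* A cubemap array sampler takes a 3-component direction plus an array index;
 * a cubemap array image takes x, y and a layer. */
static unsigned
sketch_cube_array_coord_components(int is_image)
{
   return is_image ? 3 : 4;
}
```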
Francisco Jerez
fcc2fd53df mesa: Bump MAX_IMAGE_UNIFORMS to 32.
So the i965 driver can expose 32 image uniforms per shader stage.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-10 15:37:56 +02:00
Francisco Jerez
818585b9f9 mesa: Rename the CEILING() macro to DIV_ROUND_UP().
Some people have complained that code using the CEILING() macro is
difficult to understand because it's not immediately obvious what it
is supposed to do until you go and look up its definition.  Use a more
descriptive name that matches the similar utility macro in the Linux
kernel.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 15:37:47 +02:00
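For readers unfamiliar with the macro renamed in 818585b9f9, a minimal sketch follows. The definition is assumed to mirror the usual kernel-style one, and `sketch_units_needed` is a hypothetical helper added for illustration.

```c
/* Integer division that rounds up instead of truncating.
 * Assumes a >= 0 and b > 0. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical use: how many fixed-size units are needed to hold `bytes`. */
static unsigned
sketch_units_needed(unsigned bytes, unsigned unit_size)
{
   return DIV_ROUND_UP(bytes, unit_size);
}
```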
Tiziano Bacocco
1e02f2badf nv50,nvc0: Mark PIPE_QUERY_TIMESTAMP_DISJOINT as ready immediately
Without this, when an application issues that query, it tries to
wait for the result from the GPU, and since no query has actually
been issued, it waits forever.

Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-10 08:02:17 -05:00
Roy Spliet
09ee907266 nv50/ir: Fold IMM into MAD
Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is
a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be
done post-RA because it requires that SDST == SSRC2.

V2: improve readability and add comments to clarify decisions
V3: Remove redundant code... compiler already attempts to put the IMM in
SSRC1

Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-10 08:02:13 -05:00
Roy Spliet
3dc39d0bca nv50/ir: Add emit support for MAD IMM format
But don't enable generation of it in the opProperties, because we can't
guarantee the SDST==SRC2 constraint until after register assignment. We'll
add a post-RA folding pass to utilise this.

Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-10 08:02:02 -05:00
Roy Spliet
fb63df2215 nv50/ir: Add support for MAD 4-byte opcode
Add emission rules for negative and saturate flags for MAD 4-byte opcodes,
and get rid of some of the constraints. Obviously tested with a wide variety
of shaders.

V2: Document MAD as supported short form
V3: Split up IMM from short-form modifiers

Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-10 08:01:46 -05:00
Ilia Mirkin
354206f407 nv50/ir: change the way float face is returned
The old way made it impossible for the optimizer to reason about what
was going on. The new way is the same number of instructions (the neg
gets folded into the cvt) but enables the optimizer to be cleverer if
comparing to a constant (most common case). [The optimizer is presently
not sufficiently clever to work this out, but it could relatively easily
be made to be. The old way would have required significant complexity to
work out.]

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-10 08:01:46 -05:00
Kenneth Graunke
480ee1f0b4 nir: Mark nir_print_instr's instr pointer as const.
Printing instructions doesn't modify them, so we can mark the parameter
const.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-10 03:37:55 -08:00
Kenneth Graunke
08a06b6b89 i965: Fix integer border color on Haswell.
+82 Piglits - 100% of border color tests now pass on Haswell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
2015-02-09 13:18:58 -08:00
Kenneth Graunke
e1e73443c5 i965: Use a gl_color_union for sampler border color.
This should have no effect, but will make it easier to implement other
bug fixes.

v2: Eliminate "unsigned one" local; just use the value where necessary.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2015-02-09 13:18:58 -08:00
Kenneth Graunke
8cb18760cc i965: Override swizzles for integer luminance formats.
The hardware's integer luminance formats are completely unusable;
currently we fall back to RGBA.  This means we need to override
the texture swizzle to obtain the XXX1 values expected for luminance
formats.

Fixes spec/EXT_texture_integer/texwrap formats bordercolor [swizzled]
on Broadwell - 100% of border color tests now pass on Broadwell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
2015-02-09 13:18:54 -08:00
Carl Worth
b16de0b713 util/u_atomic: Add new macro p_atomic_add
This provides for atomic addition, which will be used by an upcoming
shader-cache patch. A simple test is added to "make check" as well.

Note: The various O/S functions differ on whether they return the
original value or the value after the addition, so I did not provide
an add_return() macro which would be sensitive to that difference.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-02-09 10:47:44 -08:00
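The design choice noted in b16de0b713 (add only, no returned value) can be illustrated with C11 atomics. This is a sketch using `<stdatomic.h>`, not the actual p_atomic_add macro, and the function name is hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical wrapper: atomically add `i` to `*v` and deliberately discard
 * the result, so callers cannot depend on whether the underlying primitive
 * returns the old value or the new one. */
static void
sketch_atomic_add(atomic_int *v, int i)
{
   (void)atomic_fetch_add(v, i);
}
```

`atomic_fetch_add` returns the value held before the addition; other primitives return the value after it, which is the difference the commit message avoids exposing.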
Jason Ekstrand
345e8cc849 util/hash_table: Try to hit a double-insertion bug in the collision test
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-07 17:01:05 -08:00
Jason Ekstrand
623c3a858d util/set: Do a full search when adding new items
Previously, the set_insert function would bail early if it found a deleted
slot that it could re-use.  However, this is a problem if the key being
inserted is already in the set but further down the list.  If this happens,
the element ends up getting inserted in the set twice.  This commit makes
it so that we walk over all of the possible entries for the given key and
then, if we don't find the key, place it in the available free entry we
found.

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-07 17:01:05 -08:00
Jason Ekstrand
c9287e797b util/hash_table: Do a full search when adding new items
Previously, the hash_table_insert function would bail early if it found a
deleted slot that it could re-use.  However, this is a problem if the key
being inserted is already in the hash table but further down the list.  If
this happens, the element ends up getting inserted in the hash table twice.
This commit makes it so that we walk over all of the possible entries for
the given key and then, if we don't find the key, place it in the available
free entry we found.

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-07 17:01:05 -08:00
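The fix described in 623c3a858d and c9287e797b can be sketched with a toy open-addressing set. This sketch assumes linear probing, positive integer keys and a fixed table size, all simplifications; every name in it is hypothetical and none of it is the util/ code.

```c
#include <stddef.h>

#define SKETCH_SIZE 8
#define SKETCH_FREE 0        /* slot never used; keys must be > 0 */
#define SKETCH_DELETED (-1)  /* tombstone left behind by a removal */

struct sketch_set {
   int slots[SKETCH_SIZE];
};

static int *
sketch_slot(struct sketch_set *s, int key, size_t probe)
{
   return &s->slots[((size_t)key + probe) % SKETCH_SIZE];
}

/* Returns 1 if the key was added, 0 if it was already present (or the
 * table is full). */
static int
sketch_insert(struct sketch_set *s, int key)
{
   int *available = NULL;

   for (size_t i = 0; i < SKETCH_SIZE; i++) {
      int *slot = sketch_slot(s, key, i);

      if (*slot == key)
         return 0;              /* already present further down the sequence */

      if (*slot == SKETCH_DELETED || *slot == SKETCH_FREE) {
         if (available == NULL)
            available = slot;   /* remember it, but keep searching */
         if (*slot == SKETCH_FREE)
            break;              /* a never-used slot ends the probe sequence */
      }
   }

   if (available == NULL)
      return 0;

   *available = key;
   return 1;
}

static void
sketch_remove(struct sketch_set *s, int key)
{
   for (size_t i = 0; i < SKETCH_SIZE; i++) {
      int *slot = sketch_slot(s, key, i);

      if (*slot == SKETCH_FREE)
         return;
      if (*slot == key) {
         *slot = SKETCH_DELETED;
         return;
      }
   }
}

/* Number of slots holding `key`; more than 1 means a double insertion. */
static int
sketch_count(const struct sketch_set *s, int key)
{
   int n = 0;

   for (size_t i = 0; i < SKETCH_SIZE; i++)
      n += s->slots[i] == key;
   return n;
}
```

An insert that returned as soon as it saw the tombstone would store a colliding key a second time; remembering the tombstone and continuing the walk avoids that.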
James Legg
1581e12aba mesa: Make renderbuffer FBO attachments not layered
For framebuffer completeness checks, consider renderbuffers as not
layered. Previously, they would have counted as layered if a layered
texture had previously been bound to the same attachment point. This
could cause framebuffer completeness checks to incorrectly fail with
GL_FRAMEBUFFER_INCOMPLETE_LAYER_TARGETS, even if no layered attachments
were present.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89026
2015-02-08 13:54:15 +13:00
Emil Velikov
49299ef6fa Post-branch version bump to 10.6.0-devel, add release notes template
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-07 19:12:20 +00:00
Brian Paul
d1e21325cf gallium/hud: also try R8_UNORM format for font texture
Convert the code to try formats from an array rather than a bunch
of if/else cases.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-02-07 11:03:37 -07:00
Brian Paul
6447e9dbfa gallium/hud: flush stdout in print_help(), for Windows
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-02-07 11:03:37 -07:00
Ben Widawsky
7ea1e37497 i965: Add more stringent blitter assertions
Blits to or from a y-tiled surface must always be a multiple of the tile size.
From page 16 of the HSW PRM
(https://01.org/linuxgraphics/sites/default/files/documentation/intel-gfx-prm-osrc-hsw-memory-views.pdf#16)
"The pitch of a tiled enclosing region must be an integral number of tile
widths"

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-07 08:08:59 -08:00
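The constraint quoted in 7ea1e37497 amounts to a simple divisibility check. The sketch below is hypothetical and is not the assertion added to intel_blit; the tile widths (512 bytes for X tiles, 128 bytes for Y tiles) are stated here as assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed tile widths in bytes, for illustration. */
enum {
   SKETCH_XTILE_WIDTH = 512,
   SKETCH_YTILE_WIDTH = 128,
};

/* True if `pitch` (in bytes) is an integral number of tile widths. */
static bool
sketch_pitch_is_tile_aligned(uint32_t pitch, uint32_t tile_width)
{
   return tile_width != 0 && pitch % tile_width == 0;
}
```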
Ben Widawsky
efde74c89d i965: Consolidate some of the intel_blit logic
An upcoming patch is going to introduce some code here, and having this code
organized as the patch does makes it a bit easier to read later.

There should be no functional change here.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-07 08:07:56 -08:00
Park, Jeongmin
0467a52dc3 st/dri: Make depth buffer optional for postprocessing
Since only pp_jimenezmlaa uses depth buffer, we can make it optional.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-02-07 12:12:00 +01:00
Park, Jeongmin
2e6ba6afdb postprocess: Check for depth buffer in pp_jimenezmlaa
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88962
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-02-07 12:12:00 +01:00
Ben Widawsky
8030e269e9 i965/vec4: Correct MUL destination hazard
As it turns out, we were over-thinking the cause of the hang on
Cherryview. It's simply errata for Cherryview.

commit 88fea85f09
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Fri Nov 21 10:47:41 2014 -0800

    i965/vec4/gen8: Handle the MUL dest hazard exception

This explains why we never saw the hang on BDW.

NOTE: The problem the original patch was trying to fix does still exist. It will
have to be fixed at some point.

v2: Modify commit message, s/CHV/BDW

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84212
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-06 17:54:17 -08:00
Emil Velikov
e660f0dd80 docs: add news item and link release notes for mesa 10.4.4
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-07 00:51:08 +00:00
Emil Velikov
d8278be310 docs: Add sha256 sums for the 10.4.4 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 54da987bae)
2015-02-07 00:48:04 +00:00
Emil Velikov
7d796a59de Add release notes for the 10.4.4 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 62eb27ac8b)
2015-02-07 00:48:02 +00:00
Eric Anholt
bff4cbdafa nir: Fix broken fsat recognizer.
We've probably never seen this ridiculous pattern in the wild, so it
didn't matter.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-06 15:57:55 -08:00
Eric Anholt
6706537dd4 nir: Slightly simplify algebraic code generation by reusing a struct.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-06 15:57:55 -08:00
Eric Anholt
9e35af08af tgsi/ureg: Add some missing opcodes to opcode_tmp.h
I wanted all of these for NIR-to-TGSI.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-02-06 15:50:07 -08:00
Eric Anholt
f3dbf3689a tgsi/ureg: Move ureg_dst_register() to the header.
I wanted to use it for nir-to-tgsi.  The equivalent ureg_src_register() is
also located here.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-02-06 15:50:07 -08:00
Marek Olšák
40fa7d44ab gallium/u_tests: test a NULL buffer sampler view
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-02-06 22:27:07 +01:00
Marek Olšák
56e709bffb gallium/u_tests: test a NULL constant buffer
This expects (0,0,0,0), though it can be changed to something else or allow
more than one set of values to be considered correct.

This is currently the radeonsi behavior.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-02-06 22:27:07 +01:00
Marek Olšák
9e8a6d8486 gallium/u_tests: test a NULL texture sampler view
v2: allow one of the two values
2015-02-06 22:27:06 +01:00
Marek Olšák
63e51baedc gallium/u_tests: restructure the only test, refactor out reusable code
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-02-06 22:27:06 +01:00
Marek Olšák
dcf996c31e gallium: run gallium tests if GALLIUM_TESTS=1 is set
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-02-06 22:27:06 +01:00
Marek Olšák
0271ac72d1 gallium/postprocessing: fix crash at context destruction
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-06 20:03:06 +01:00
Xavier Bouchoux
2fd21c4098 r600g/sb: fix a bug in constants folding optimisation pass
    ADD     R6.y.1,    R5.w.1, ~1|3f800000
    ADD     R6.y.2,    |R6.y.1|, -0.0001|b8d1b717

was wrongly being converted to

    ADD     R6.y.1,    R5.w.1, ~1|3f800000
    ADD     R6.y.2,    R5.w.1, -1.0001|bf800347

because abs() modifier was ignored.

Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-02-06 20:03:06 +01:00
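Why the abs() modifier in 2fd21c4098 blocks the fold can be shown with plain arithmetic. The sketch below uses hypothetical helper names and example values unrelated to the shader in the commit message; it only demonstrates that |x + a| + b is not x + (a + b) in general.

```c
/* Local absolute value, to keep the sketch free of libm. */
static float
sketch_abs(float v)
{
   return v < 0.0f ? -v : v;
}

/* The two-instruction sequence: t = x + a; y = |t| + b. */
static float
sketch_unfolded(float x, float a, float b)
{
   return sketch_abs(x + a) + b;
}

/* The incorrect fold that ignores the abs() modifier: y = x + (a + b). */
static float
sketch_folded_ignoring_abs(float x, float a, float b)
{
   return x + (a + b);
}
```

The two agree whenever x + a is non-negative, which is one way such a bug can go unnoticed.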
Xavier Bouchoux
acef65503e r600g: fix abs() support on ALU 3 source operands instructions
Since alu does not support abs() modifier on source operands, spill
and apply the modifiers to a temp register when needed.

Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-02-06 20:03:06 +01:00
David Heidelberg
bae23a1756 r300g: small code cleanup (v2)
v2: incorporated changes from Marek Olšák

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
2015-02-06 18:27:30 +01:00
Iago Toral Quiroga
71a36e0a2c glsl: GLSL ES identifiers cannot exceed 1024 characters
v2 (Ian Romanick)
- Move the check to the lexer before rallocing a copy of the large string.

Fixes the following 2 dEQP tests:
dEQP-GLES3.functional.shaders.keywords.invalid_identifiers.max_length_vertex
dEQP-GLES3.functional.shaders.keywords.invalid_identifiers.max_length_fragment

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-06 12:21:42 +01:00
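The length rule from 71a36e0a2c reduces to a bounds check on the token. The sketch below is hypothetical and is not the lexer code; it only shows the comparison, with 1024 characters treated as the largest accepted length.

```c
#include <stdbool.h>
#include <string.h>

enum { SKETCH_MAX_IDENTIFIER_LENGTH = 1024 };

/* True if `name` is short enough to be a valid GLSL ES identifier. */
static bool
sketch_identifier_length_ok(const char *name)
{
   return strlen(name) <= SKETCH_MAX_IDENTIFIER_LENGTH;
}
```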
Kenneth Graunke
d4a461caaf i965: Fix INTEL_DEBUG=shader_time for SIMD8 VS (and GS).
We were incorrectly attributing VS time to FS8 on Gen8+, which now use
fs_visitor for vertex shaders.

We don't hit this for geometry shaders yet, but we may as well add
support now - the fix is obvious, and we'll just forget later.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-02-05 20:01:03 -08:00
Kenneth Graunke
32f1d4e286 i965/fs: Use inst->eot rather than opcodes in register allocation.
Previously, we special cased FB writes and URB writes in the register
allocation code.  What we really wanted was to handle any message with
EOT set.

This saves us from extending the list with new opcodes in the future.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-02-05 20:01:02 -08:00
Kenneth Graunke
10d8a1a88e i965/fs: Delete is_last_send(); just check inst->eot.
This helper function basically just checks inst->eot, but also asserts
that only opcodes we expect to terminate threads have EOT set.  As far
as I'm aware, we've never had such a bug.

Removing it means that we don't have to extend the list for new opcodes.
Cherryview and Skylake introduce an optimization where sampler messages
can have EOT set; scalar GS/HS/DS will likely introduce new opcodes as
well.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-02-05 20:00:42 -08:00
Michel Dänzer
a338dc0186 st/mesa: Don't use PIPE_USAGE_STREAM for GL_PIXEL_UNPACK_BUFFER_ARB
The latter currently implies CPU read access, so only PIPE_USAGE_STAGING
can be expected to be fast.

Mesa demos src/tests/streaming_rect on Kaveri (radeonsi):

Unpatched:  42 frames in  1.023 seconds = 41.056 FPS
Patched:   615 frames in  1.000 seconds = 615.000 FPS

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88658
Cc: "10.3 10.4" <mesa-stable@lists.freedestkop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-02-06 10:55:53 +09:00
Tiziano Bacocco
17abefa12b st/nine: Implement dummy vbo behaviour when vs is missing inputs
Use a dummy vertex buffer object when vs inputs have no corresponding
entries in the vertex declaration. This dummy buffer will give to the
shader float4(0,0,0,0).

This fixes several artifacts on some games.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>
2015-02-06 00:07:20 +01:00
Axel Davy
90585cbc9a gallium/targets/d3dadapter9: Free card device
The drm fd wasn't released, causing a crash
for wine tests on nouveau, which seems to have
a bug when a lot of device descriptors are open.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:20 +01:00
Axel Davy
8b3a9d5c9f gallium/targets/d3dadapter9: Release the pipe_screen at destruction.
We weren't releasing hal and ref, causing some issues (threads not released, etc)

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:19 +01:00
Axel Davy
8f50614910 gallium/targets/d3dadapter9: Fix device detection for render-nodes
When on a render node the unique ioctl doesn't work.

This patch drops the code to detect the device, which relied
on an ioctl, and replaces it by the mesa loader function.
The mesa loader function is more complete and won't fail for render-nodes.

Alternatively we could also have used the pipe cap to
determine the vendor and device id from the driver.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:19 +01:00
Axel Davy
2c54d154e8 st/nine: Dummy sampler should have a=1
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:19 +01:00
Axel Davy
9ac74e604b st/nine: Fix update_framebuffer binding cbufs the pixel shader wouldn't render to
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:19 +01:00
Axel Davy
ee606b4780 st/nine: Clear: better behave if rt_mask is different to the one of the framebuffer bound
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:19 +01:00
Axel Davy
d8d48f6f71 st/nine: Fix multisampling support detection
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:19 +01:00
Tiziano Bacocco
a1d369e804 st/nine: Fix enabled lights in stateblocks
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>
2015-02-06 00:07:19 +01:00
Axel Davy
1543defc5e st/nine: Fix depth stencil formats bindings flags.
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:19 +01:00
Axel Davy
49214a3dfc st/nine: Fix gpu memory leak in swapchain
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:19 +01:00
Axel Davy
d538007734 st/nine: SetResourceResize should track nr_samples too
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:19 +01:00
Tiziano Bacocco
1c1d26cd97 st/nine: D3DRS_FILLMODE set to 0 is D3DFILL_SOLID
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>
2015-02-06 00:07:19 +01:00
Tiziano Bacocco
50f0e011da st/nine: Setting D3DRS_ALPHAFUNC to 0 means D3DCMP_NEVER
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>
2015-02-06 00:07:19 +01:00
Axel Davy
dfe5e84e74 st/nine: Implement fallback behaviour when rts and ds don't match
This seems to be the behaviour on Win. Previous behaviour led
to different issues depending on the driver.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:19 +01:00
Axel Davy
8b901e3011 st/nine: Fix present_buffers allocation
If has_present_buffers was false at first,
but after a device reset, it turns true (for
example if we begin to render to a multisampled
back buffer), there was a crash due to present_buffers
being uninitialised.
This patch fixes it.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:19 +01:00
Patrick Rudolph
792af626d4 st/nine: Check for aligned offset in each vertex element
Fixes wine test test_vertex_declaration_alignment()

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
2015-02-06 00:07:19 +01:00
Patrick Rudolph
63221c6f09 st/nine: Fix buffer overflow in {Get|Set}PixelShaderConstantF
Previous code wasn't checking against the correct limit: 224
for sm3 hardware, but 256.

Fixes wine test test_pixel_shader_constant()

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
2015-02-06 00:07:19 +01:00
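The range check implied by 63221c6f09 can be sketched as follows. The constant and function names are hypothetical, and the sketch is not the st/nine implementation; it only shows a check against 224 that cannot overflow.

```c
#include <stdbool.h>

/* Limit taken from the commit message: 224 float constants for ps on sm3. */
enum { SKETCH_MAX_PS_CONST_F = 224 };

/* True if registers [start, start + count) all fit below the limit. */
static bool
sketch_ps_const_range_ok(unsigned start, unsigned count)
{
   return start <= SKETCH_MAX_PS_CONST_F &&
          count <= SKETCH_MAX_PS_CONST_F - start;
}
```

A request such as start 250, count 1 would pass a check against 256 but is rejected here.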
Patrick Rudolph
2dcad120a0 st/nine: Set [out] argument to NULL for some functions
Wine tests, and probably some apps, check for errors by checking for NULL
instead of error codes.
Fixes wine test test_surface_blocks()

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
2015-02-06 00:07:19 +01:00
Patrick Rudolph
9aa3ebd0e7 st/nine: Remove duplicated debug message
Likely a rebase error

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
2015-02-06 00:07:19 +01:00
Patrick Rudolph
33617ef296 st/nine: Return E_FAIL for unused vertexdeclaration type
Add returncode E_FAIL.
Return E_FAIL for any vertexdeclaration element with type unused.

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
2015-02-06 00:07:19 +01:00
Patrick Rudolph
faf94f6eea st/nine: Missing sanity check for CALLOC return E_OUTOFMEMORY if allocation of usage_map fails
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
2015-02-06 00:07:19 +01:00
Axel Davy
75676886e4 st/nine: Implement ATOC hack
ATOC is a hack for alpha to coverage
that is supported by NV and Intel.

You need to check the support for it
with CheckDeviceFormat.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:19 +01:00
Axel Davy
0a4aaf1d41 st/nine: Implement AMD alpha to coverage
This D3D hack is supposed to be supported
by all AMD SM2+ cards. Apps use it without
checking if they are on AMD.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:19 +01:00
Axel Davy
bf0adf248f st/nine: Add D3DFMT_DF16 support
This depth buffer format, like D3DFMT_INTZ, can be used to read
the depth buffer values when bound to a shader.

Some apps may use this format to get better performance when
they don't need the precision of INTZ (24 bits for depth, 8 for
stencil, whereas DF16 is just 16 bits for depth)

We don't add support for DF24 yet, because it implies support
for FETCH4, which we don't support for now.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:19 +01:00
Axel Davy
34292754d2 st/nine: Change the value of some advertised caps
These values are taken from wine.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:19 +01:00
Axel Davy
25f1e5584c st/nine: NineDevice9_SetClipPlane: pPlane must be non-NULL
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:19 +01:00
Axel Davy
02a89dc163 st/nine: Implement fallback for D3DFMT_D24S8, D3DFMT_D24X8 and D3DFMT_INTZ
Some drivers support PIPE_FORMAT_S8_UINT_Z24_UNORM,
some others PIPE_FORMAT_Z24_UNORM_S8_UINT, some both.

It doesn't matter which one we use, since the d3d formats
they map to aren't lockable (app can read it directly).

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
27e438e356 st/nine: Refactor format d3d9 to pipe conversion
Move the checks of whether the format is supported
into a common place.
The advantage is that this handles the case where a d3d9
format can be mapped to several pipe formats and
cards don't support all of them.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
f8713b1bfd st/nine: Refactor nine_d3d9_to_pipe_format_map
The order of the format is changed to have
an increasing ordering of the d3d9 format values.

Some missing formats are added and matched to PIPE_FORMAT_NONE

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
4cf5701160 st/nine: Improve CheckDeviceFormat debug output
Because the debug output of this function was cut in two parts,
sometimes the second part wasn't printed when we would return earlier,
whereas we would like to get it.

The reason of the separation was that it's only at the end of the function
we can print what we map to the d3d9 arguments, but we can always retrieve
that info by hand.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
42ac71a4e2 st/nine: Implement RESZ hack
This D3D hack allows resolving a multisampled
depth buffer into a single sampled one.

Note that the implementation is slightly incorrect.
When querying the content of D3DRS_POINTSIZE,
it should return the resz code if it has been set.
This behaviour will be implemented when state changes
will be reworked. For now the current behaviour is ok,
since apps use the D3DCREATE_PUREDEVICE flag when creating
the device, which means they won't read states and in exchange
get better performance.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
5c61f6344a st/nine: fix early basetexture destruction
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Patrick Rudolph
dfeca90419 st/nine: Do not leak private data in volume9.
This->data was allocated by nine, but not freed.

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
2015-02-06 00:07:18 +01:00
Patrick Rudolph
b3afcc0968 st/nine: Check block alignment for compressed textures in NineSurface9_CopySurface
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
2015-02-06 00:07:18 +01:00
Axel Davy
65ce2b2848 st/nine: Commit sampler views again if srgb state changed.
This fixes a wine test and some minor visual issues on some games.

The patch is not optimal, there is probably a more efficient way to
fix this issue, but the code there already has some inefficiencies.
There are plans to rewrite that part of the code to make it more
efficient.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
2d2286d17c st/nine: Fix use of D3DSP_NOSWIZZLE
D3DSP_NOSWIZZLE already contains the shift.
Detected with Clang.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
1f3b7d4039 st/nine: Check for the correct number of constants.
This removes an unneeded hack for Anno 1404.
This app does not check the number of supported
constants, and relies on shader compilation failing
if it uses too many constants.

This patch also checks for the correct number of constants for ps.

Note that we don't check the official limitations for old vs and ps
versions. The restrictions were fixed, unlike for the number of vertex
shader constants for later versions. Apps likely use the correct number,
and it's not a problem for us if they want to use more.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
d0aeb4422b st/nine: Introduce failure handling for shader parsing.
Instead of crashing on buggy shaders, we should return an error.
This patch introduces this behaviour in the case of invalid constant
access

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
6fcc2c8872 st/nine: Print warnings for r500 when shader is likely to go wrong
r500 hasn't enough float constants for vs to fill all needs.
Overlapping issues can happen with complex shaders.
The fix would be to recompile shaders to include the integer
and boolean constants, instead of reserving slots for them.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
70a523818f st/nine: Declare constants only up to the maximum needed.
Previously 276 constants were declared every time.

This patch makes shaders declare constants up to the maximum
constant needed and moves the moment we print the TGSI
shader after the moment we declare the constants.

This is needed for r500, since when indirect addressing is used,
it cannot reduce the number of constants needed, and it is
restricted to 256 constant slots.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
a249c7a161 st/nine: Refactor how user constbufs sizes are calculated
Count explicitly the slots for float, int and bool constants,
and deduce the constbuf size in nine_shader.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
65ca8e4b3d st/nine: Explicit nine requirements
This patch raises nine requirements and disables nine for old
hw that doesn't match them.

Currently for these cards only games that don't have tight requirements
would work well with nine. However nine is missing several checks
regarding these limitations.
To make code and future patches less heavy, dropping support for these old
cards seems a good solution.

That makes r500 the only dx9 generation cards supported by nine. It seems to be the one
with the fewest limitations for nine. Still, not everything is ok, and we'll have,
for example, to implement shader recompilation for these cards to include
integer and boolean constants in the shader.
Eventually, when this is done, we can reintroduce support for older cards.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
eb1c12d20d gallium: Add MULTISAMPLE_Z_RESOLVE cap
Resolving a multisampled depth texture into
a single sampled texture is supported on >= SM4.1
hw. It is possible some previous hw support it.

The ability was tested on radeonsi and nvc0.
Apparently it is also supported for radeon >= r700.

This patch adds the MULTISAMPLE_Z_RESOLVE cap and
adds it to the drivers. It is advertised only for drivers
that are known to support the ability.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Laura Ekstrand
77cc799853 GL: Update glext.h to Revision 29735 (20150202).
Khronos modified glext.h to get rid of GL_TEXTURE_BINDING, a special enum
added for ARB_direct_state_access.  This enum was ruled unimplementable.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Laura Ekstrand <laura@jlekstrand.net>
2015-02-05 11:41:26 -08:00
Jose Fonseca
08efcc0960 llvmpipe: Trivially advertise PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT.
Nothing special needs to be done.

Even though llvmpipe copies constant (ie uniform) buffers internally, the
application is supposed to flush and sync, so all should work.

All bufferstorage piglit tests pass.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-02-05 16:16:47 +00:00
Matt Turner
2335153ff2 i965: Remove now unnecessary Gen8 CMP destination type override.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-04 12:14:35 -08:00
Matt Turner
6b3a301f61 i965: Set CMP's destination type to src0's type.
Allows CMP instructions with float sources to be compacted and coissued.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-04 12:14:34 -08:00
Matt Turner
7e60794392 i965/fs: Implement the WaCMPInstFlagDepClearedEarly work-around.
Prevents piglit regressions from the next patch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-04 12:14:34 -08:00
Jose Fonseca
661c8bb220 gallium/util: Don't implement u_bit_scan64 on MSVC.
ffsll doesn't exist in MSVC yet, and u_bit_scan64 is only used by
radeonsi, which is never built with MSVC.

This is just a stop-gap fix to unbreak MSVC build until we refactor these
mathematical portability wrappers into src/util.

Trivial.
2015-02-04 15:22:59 +00:00
Jose Fonseca
46f1033067 gallium/util: Define ffsll on MinGW.
Trivial.

(Fixing MSVC will be far less so, as _BitScanForward64 is only supported on x64.)
2015-02-04 14:58:20 +00:00
Marek Olšák
6c5af1dc4e radeonsi: implement polygon stippling
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
6895dfb184 radeonsi: add polygon stipple texture slot
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
1fe7ba8c69 radeonsi: deduce rasterizer primitive type at the beginning of draw_vbo
I will need this for polygon stippling.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
8f65e6eae8 radeonsi: allow 64 descriptors per array
We need a slot for the stipple texture and the pixel shader already uses
32 textures (16 API slots + 16 FMASK slots).

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
9af943c32e radeonsi: add support for sampler views where resource = NULL
The hardware obeys swizzles even if the resource is NULL.
This will be used by set_polygon_stipple.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
70e4243f07 radeonsi: add support for NULL texture sampler views that return (0,0,0,1)
This used to hang.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
82f64a68a4 radeonsi: fix a crash when binding a NULL sampler view list
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
b142dd2f24 radeonsi: move the buffer descriptor to the end of the image descriptor
This will allow supporting NULL textures.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
afe1e6acdd radeonsi: don't use tgsi_parse_context to get processor type
Also remove unused "tokens".

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
50908a8918 radeonsi: fix instanced arrays with non-zero start instance
Fixes piglit ARB_base_instance/arb_base_instance-drawarrays.

Cc: 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
658f1d4cfe r600g,radeonsi: don't append to streamout buffers that haven't been used yet
The FILLED_SIZE counter is uninitialized at the beginning, so we can't use it.
Instead, use offset = 0, which is what we always do when not appending.

This unexpectedly fixes spec/ARB_texture_multisample/sample-position/*.
Yes, the test does use transform feedback.

Cc: 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
b616429ca8 gallium: set PIPE_MAX_SAMPLERS to 18
For drivers that use higher slots not to crash in tgsi_shader_info.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
8fc542aa89 gallium/u_pstipple: add ability to specify a fixed texture unit
E.g. r600g can use slot 17, which is outside of the API range.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
50433ea526 gallium/util: add u_bit_scan64
Same as u_bit_scan, but for uint64_t.
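
A minimal sketch of what such a helper does, not the code from this commit: it returns the index of the lowest set bit and clears it in the mask. The name bit_scan64 is hypothetical, and the sketch assumes a GCC or Clang compiler for __builtin_ctzll.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the index of the lowest set bit in *mask and clears that bit.
 * Assumes GCC/Clang for __builtin_ctzll; *mask must be non-zero. */
static inline int
bit_scan64(uint64_t *mask)
{
   assert(*mask != 0);
   const int i = __builtin_ctzll(*mask);
   *mask &= *mask - 1;   /* clear the lowest set bit */
   return i;
}
```

Typical use is a loop of the form "while (mask) { int i = bit_scan64(&mask); ... }", which visits each set bit once.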

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
f2328ffdc8 tgsi: add tgsi_get_processor_type helper from radeon
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-04 14:34:13 +01:00
Kenneth Graunke
ccbe15f332 i965/fs: Fix saturate on MAD and LRP with the NIR backend.
Fixes misrendering in "Witcher 2" with INTEL_USE_NIR=1, and probably
many other programs.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-04 00:34:57 -08:00
Iago Toral Quiroga
1b029f8a4a mesa: Fix _mesa_format_convert fallback path when src is not an array format
When a rebase swizzle is provided and we call _mesa_swizzle_and_convert
after unpacking the source format we were always passing normalized=false.
We should pass true or false depending on the formats involved in the
conversion for the byte and float paths (the integer path cannot ever be
normalized).

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2015-02-04 08:08:34 +01:00
Park, Jeongmin
6fd4a61ad6 st/osmesa: Fix osbuffer->textures indexing
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88930
Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-03 15:46:56 -07:00
Connor Abbott
ab24e12706 i965/nir: use redundant phi optimization
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Tested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-03 16:00:13 -05:00
Connor Abbott
a135f34080 nir: add an optimization to remove useless phi nodes
This removes phi nodes whose sources all point to the same thing.
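
A minimal sketch of the check described above, not the actual NIR code: the phi_node struct and phi_common_source helper are hypothetical, and SSA values are modeled as plain integer ids. A phi whose sources all name the same value can be replaced by that value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a phi node: one SSA value id per predecessor. */
struct phi_node {
   const int *srcs;
   size_t num_srcs;
};

/* Returns the common source id if every source is identical, else -1. */
static int
phi_common_source(const struct phi_node *phi)
{
   assert(phi->num_srcs > 0);

   int first = phi->srcs[0];
   for (size_t i = 1; i < phi->num_srcs; i++) {
      if (phi->srcs[i] != first)
         return -1;   /* sources differ: the phi is still needed */
   }
   return first;      /* every use of the phi can be rewritten to this id */
}
```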

Shader-db results:

total NIR instructions in shared programs: 2045293 -> 2041209 (-0.20%)
NIR instructions in affected programs:     126564 -> 122480 (-3.23%)
helped:                                615
HURT:                                  0

total FS instructions in shared programs: 4321840 -> 4320392 (-0.03%)
FS instructions in affected programs:     24622 -> 23174 (-5.88%)
helped:                                138
HURT:                                  0

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Tested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-03 16:00:13 -05:00
Jason Ekstrand
572d1f6e41 nir/validate: Ensure that phi sources are SSA-only
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-03 12:52:42 -08:00
Jason Ekstrand
5420774510 nir/validate: Validate that only float ALU outputs are saturated
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-03 12:46:55 -08:00
Jason Ekstrand
c0df85cca4 nir/lower_source_mods: Don't lower saturate for non-float outputs
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-03 12:46:38 -08:00
Jason Ekstrand
8776b1b14b i965/fs_nir: Get rid of get_alu_src
Originally, get_alu_src was supposed to handle resolving swizzles and
things like that.  However, now that basically every instruction we have
only takes scalar sources, we don't really need it anymore.  The only case
where it's still marginally useful is for the mov and vecN operations that
are left over from SSA form.  We can handle those cases as a special case
easily enough.  As a side-effect, we don't need the vec_to_movs pass
anymore.

v2 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Rework the way we detect if we need an extra copy for swizzling.  The
   old code involved a pile of confusing switch fall-throughs; we now use a
   loop.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-03 12:33:11 -08:00
Jason Ekstrand
112d738b91 i965/fs: Use NIR's scalarizing abilities and stop handling vectors
Now that we can scalarize with NIR, there's no need for all this code
anymore.  Let's get rid of it and just do scalar operations.

v2: run copy prop before lowering phi nodes

v3: Get rid of the "emit(...)->saturate = foo" pattern

v4: Run alu_to_scalar as an optimization pass

total instructions in shared programs: 5998321 -> 5974070 (-0.40%)
instructions in affected programs:     732075 -> 707824 (-3.31%)
helped:                                3137
HURT:                                  191
GAINED:                                18
LOST:                                  0

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-03 12:33:11 -08:00
Jason Ekstrand
f2adcd36cb nir: Add a pass to lower vector phi nodes to scalar phi nodes
v2 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Add better comments
 - Use nir_ssa_dest_init and nir_src_for_ssa more places
 - Fix some void * casts

v3 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Rework the way we determine whether or not to scalarize a phi node to
   make the recursion non-bogus
 - Treat load_const instructions as scalarizable

v4 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Allow uniform and input loads to be scalarizable

v5 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Also consider loads of inputs (varying, uniform, or ubo) to be
   scalarizable.  We were already doing this for load_var on uniforms and
   inputs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-03 12:33:11 -08:00
Matt Turner
e87928a494 i965/fs: Add support for constant propagating into sources with modifiers.
All but 16 of the programs helped were ARB fp programs.

total instructions in shared programs: 5949286 -> 5945470 (-0.06%)
instructions in affected programs:     275162 -> 271346 (-1.39%)
helped:                                1197
GAINED:                                1

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-02-03 12:25:14 -08:00
Matt Turner
cfa2165642 i965/vec4: Use abs/negate functions in const propagation.
No changes in shader-db.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-02-03 12:25:14 -08:00
Matt Turner
dbd4c22a37 i965: Add function to take the abs of immediates.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-02-03 12:25:14 -08:00
Matt Turner
638beee24a i965: Add function to negate immediates.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-02-03 12:25:14 -08:00
Matt Turner
1f4bdad316 i965: Mark UB/B immediates as unreachable.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-02-03 12:25:14 -08:00
Matt Turner
32e98e8ef0 gallium/util: Don't use __builtin_clrsb in util_last_bit().
Unclear circumstances lead to undefined symbols on x86.

Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=536916
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-03 12:25:14 -08:00
Matt Turner
d8be1b9aba glsl/list: Note that exec_lists may not be realloc'd.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-03 12:25:14 -08:00
Nils Wallménius
cfb5b1c59e st/mesa: mark constant array of swizzles as static const
This saves about 0.5k in the text section for a gallium driver
on amd64.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-02-04 09:07:13 +13:00
Eduardo Lima Mitev
0ed3bffc08 mesa: Returns a GL_INVALID_VALUE error on several APIs when buffer size is negative
Section 2.3.1 (Errors) of the OpenGL 4.5 spec says:

    "If a negative number is provided where an argument of type sizei or
    sizeiptr is specified, an INVALID_VALUE error is generated."

This patch adds checks for negative buffer size values passed to different APIs.
It also moves up the check on other APIs that already had it, making it the first
error check performed in the function, for consistency.
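
A minimal sketch of the check ordering described above, not code from this patch: query_with_size_check is a hypothetical entry point, and the typedefs and enum values are redefined locally so the block stands alone. The negative-size test runs before any other validation.

```c
#include <assert.h>
#include <stddef.h>

#define GL_NO_ERROR      0
#define GL_INVALID_VALUE 0x0501

typedef int GLsizei;
typedef unsigned int GLenum;

/* Hypothetical query entry point: the negative-size check comes first,
 * before any other validation of the remaining arguments. */
static GLenum
query_with_size_check(GLsizei bufSize, char *out)
{
   if (bufSize < 0)
      return GL_INVALID_VALUE;

   /* ... other error checks and the actual query would follow here ... */
   if (bufSize > 0) {
      assert(out != NULL);
      out[0] = '\0';
   }
   return GL_NO_ERROR;
}
```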

While there may be other APIs throughout the code lacking this check (or at least
not at the beginning of the function), this patch focuses on the cases that break
the dEQP tests listed below. It could be a good exercise for the future to check
all other cases, and improve consistency in the order of the checks throughout the
whole Mesa code base.

This fixes 5 dEQP tests:
* dEQP-GLES3.functional.negative_api.state.get_attached_shaders
* dEQP-GLES3.functional.negative_api.state.get_shader_source
* dEQP-GLES3.functional.negative_api.state.get_active_uniform
* dEQP-GLES3.functional.negative_api.state.get_active_attrib
* dEQP-GLES3.functional.negative_api.shader.program_binary

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-03 13:19:36 +01:00
Samuel Iglesias Gonsalvez
284bd1ecdf mesa: fix error value in GetFramebufferAttachmentParameteriv for OpenGL ES 3.0
Section 6.1.13 "Framebuffer Object Queries" of OpenGL ES 3.0 spec:

 "If the default framebuffer is bound to target, then attachment must be
  BACK, identifying the color buffer; DEPTH, identifying the depth buffer; or
  STENCIL, identifying the stencil buffer."

OpenGL ES 3.0, section 2.5 (GL Errors):

 "If a command that requires an enumerated value is passed a
  symbolic constant that is not one of those specified as allowable
  for that command, an INVALID_ENUM error is generated."

Then change the returned error to INVALID_ENUM.

Fixes:

dEQP-GLES3.functional.fbo.api.attachment_query_default_fbo

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-03 13:19:36 +01:00
Iago Toral Quiroga
5dfb085ff3 glsl: Improve precision of mod(x,y)
Currently, Mesa uses the lowering pass MOD_TO_FRACT to implement
mod(x,y) as y * fract(x/y). This implementation has a down side though:
it introduces precision errors due to the fract() operation. Even worse,
since the result of fract() is multiplied by y, the larger y gets the
larger the precision error we produce, so for large enough numbers the
precision loss is significant. Some examples on i965:

Operation                           Precision error
-----------------------------------------------------
mod(-1.951171875, 1.9980468750)      0.0000000447
mod(121.57, 13.29)                   0.0000023842
mod(3769.12, 321.99)                 0.0000762939
mod(3769.12, 1321.99)                0.0001220703
mod(-987654.125, 123456.984375)      0.0160663128
mod( 987654.125, 123456.984375)      0.0312500000

This patch replaces the current lowering pass with a different one
(MOD_TO_FLOOR) that follows the recommended implementation in the GLSL
man pages:

mod(x,y) = x - y * floor(x/y)

This implementation eliminates the precision errors at the expense of
an additional add instruction on some systems. On systems that can do
negate with multiply-add in a single operation this new implementation
would come at no additional cost.
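
The two lowerings can be compared with a small C sketch. This is an illustration under assumptions, not the GLSL IR lowering itself: mod_fract, mod_floor and floor_f are hypothetical names, and floor_f is a local stand-in for floorf() that is only valid while the value fits in a long.

```c
#include <assert.h>

/* Local floor() so the sketch links without libm; floorf() from <math.h>
 * is the usual choice. Only valid while v fits in a long. */
static float
floor_f(float v)
{
   float t = (float)(long)v;          /* truncates toward zero */
   return (t > v) ? t - 1.0f : t;     /* step down for negative fractions */
}

/* Old lowering (MOD_TO_FRACT): y * fract(x / y). */
static float
mod_fract(float x, float y)
{
   assert(y != 0.0f);
   float q = x / y;
   return y * (q - floor_f(q));
}

/* New lowering (MOD_TO_FLOOR): x - y * floor(x / y). */
static float
mod_floor(float x, float y)
{
   assert(y != 0.0f);
   return x - y * floor_f(x / y);
}
```

In mod_fract the rounding error of the fractional part is scaled by y, which matches the growth of the error with y in the table above; mod_floor subtracts a multiple of y from x directly.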

v2 (Ian Romanick)
- Do not clone operands because when they are expressions we would be
duplicating them and that can lead to suboptimal code.

Fixes the following 16 dEQP tests:
dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.mediump_*
dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.highp_*

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-03 13:19:36 +01:00
Eduardo Lima Mitev
c27d23f0c8 mesa: Allow querying for GL_PRIMITIVE_RESTART_FIXED_INDEX under GLES 3
GLES 3.0.0 spec introduces context state PRIMITIVE_RESTART_FIXED_INDEX
(2.8.1 Transferring Array Elements, page 26) which is not currently
possible to query using glGet*() funcs.

Fixes 4 dEQP tests:
* dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getboolean
* dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getinteger
* dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getinteger64
* dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getfloat

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-03 13:19:36 +01:00
Iago Toral Quiroga
ec7dcaf578 glsl: can't have 'const' qualifier used with struct or interface block members
Fixes the following 2 dEQP tests:
dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_const_vertex
dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_const_fragment

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-03 13:19:36 +01:00
Iago Toral Quiroga
5d655a43e6 glsl: interface blocks must be declared at global scope
Fixes the following 2 dEQP tests:
dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_in_main_vertex
dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_in_main_fragment

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-03 13:19:36 +01:00
Iago Toral Quiroga
6dd346c232 i965: Fix negate with unsigned integers
For code such as:

uint tmp1 = uint(in0);
uint tmp2 = -tmp1;
float out0 = float(tmp2);

We produce code like:
mov(8)    g5<1>.xF    -g9<4,4,1>.xUD

which does not produce correct results. This code produces the
results we would expect if tmp1 and tmp2 were signed integers
instead.

It seems that a similar problem was detected and addressed when
using negations with unsigned integers as part of conditionals, but
it looks like the problem has a wider impact than that.

This patch fixes the problem by preventing copy-propagation of
negated UD registers in all scenarios, not only in conditionals.
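
The difference between the two behaviours can be sketched in C. This is an illustration, not driver code: negate_as_unsigned and negate_as_signed are hypothetical names that model the expected result and the signed-style result described above.

```c
#include <assert.h>
#include <stdint.h>

/* Expected result for -tmp1 on a uint: negation modulo 2^32. */
static float
negate_as_unsigned(uint32_t v)
{
   uint32_t neg = 0u - v;    /* wraps: 1 becomes 0xFFFFFFFF */
   return (float)neg;
}

/* Result obtained when the value is treated as a signed integer. */
static float
negate_as_signed(uint32_t v)
{
   assert(v <= INT32_MAX);
   return (float)(-(int32_t)v);
}
```

For an input of 1 the unsigned path yields a float near 2^32, while the signed path yields -1.0.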

Fixes the following 24 dEQP tests:

dEQP-GLES3.functional.shaders.operator.unary_operator.minus.*_uint_*
dEQP-GLES3.functional.shaders.operator.unary_operator.minus.*_uvec2_*
dEQP-GLES3.functional.shaders.operator.unary_operator.minus.*_uvec3_*
dEQP-GLES3.functional.shaders.operator.unary_operator.minus.*_uvec4_*

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-03 13:19:36 +01:00
Jose Fonseca
5b941ce857 scons: Fix Windows builds with LLVM 3.5.
LLVMBitReader dependency was introduced, as pointed out by Rob Conde.
2015-02-03 10:18:51 +00:00
Ilia Mirkin
bc321db75b st/mesa: add EXT_polygon_offset_clamp support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-02-02 20:44:22 -05:00
Ilia Mirkin
7c211a12aa gallium: add a cap to determine whether the driver supports offset_clamp
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-02-02 20:44:02 -05:00
Ilia Mirkin
2ce29ce5af i965/gen6+: enable EXT_polygon_offset_clamp
Replace the hard-coded 0's with the context clamp value.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-02 20:35:36 -05:00
Ilia Mirkin
81998dda63 mesa: add support for GL_EXT_polygon_offset_clamp
Nothing enables the extension yet, but the values are now available.
The spec calls for it to only be exposed for GL 3.3+, which is core-only
in mesa. Instead we allow any driver to enable it, including in a compat
context for any GL version.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-02-02 20:35:36 -05:00
Ilia Mirkin
83321009de glapi: add GL_EXT_polygon_offset_clamp
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-02-02 20:35:36 -05:00
Kenneth Graunke
0f06f12c11 glsl: Pick ast_conditional branch regardless of op1/2 being constant.
If the ?: operator's condition is a constant value, and both branches
were pure expressions, we can just make the resulting value one or the
other.

Previously, we only did this if op[1] and op[2] were also constant
values - but there's no actual reason for that restriction.

No changes in shader-db, probably because we usually optimize this later
anyway.  But it does make us generate less stupid code up front.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-02 17:14:55 -08:00
Kenneth Graunke
534f07ee85 i965: Add a better PRM citation for the IMS dimension mangling.
Paul originally had to reverse engineer these formulas based on the
description about how the sampler works.  The description here is not
the easiest to follow - especially given that it's from the Sandybridge
era, when the hardware only did 4x multisampling.

Jordan and I recently found another part of the documentation where they
simply state that IMS dimensions must be adjusted by a set of formulas.
Quoting this section provides an easy to follow explanation for the
code, including 2x/4x/8x/16x.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-02-02 17:14:38 -08:00
Laura Ekstrand
e9b86cb5d6 swrast: Whitespace fixes.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-02 13:22:26 -08:00
Laura Ekstrand
e187c2f543 DD: Refactor BlitFramebuffer.
In preparation for glBlitNamedFramebuffer, the DD table function
BlitFramebuffer needs to accept two arbitrary framebuffer objects rather
than assuming ctx->ReadBuffer and ctx->DrawBuffer.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-02 13:21:20 -08:00
Laura Ekstrand
ad2c64abbd GL: Update glext.h to Khronos Revision 29537.
Khronos Revision 29537 fixes ARB_direct_state_access function prototypes that
had GLsizei where they should have had GLsizeiptr. This mainly affects
functions related to buffer objects.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-02 10:39:55 -08:00
Jason Ekstrand
2cebaac479 i965: Don't use tiled_memcpy to download from RGBX or BGRX surfaces
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88841

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-02-02 10:18:42 -08:00
Neil Roberts
af8fd694d4 dir-locals.el: Don't set variables for non-programming modes
This limits the style changes to modes inherited from prog-mode. The
main reason to do this is to avoid setting fill-column for people
using Emacs to edit commit messages because 78 characters is too many
to make it wrap properly in git log. Note that makefile-mode also
inherits from prog-mode so the fill column should continue to apply
there.

v2: Apply to all the .dir-locals.el files, not just the one in the
    root directory.

Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-02 12:02:55 +00:00
Iago Toral Quiroga
68155e5a36 i965: Fix intel_miptree_copy_teximage for GL_TEXTURE_1D_ARRAY
For GL_TEXTURE_1D_ARRAY targets we store the depth of the array
in the Height field and leave Depth=1 in the underlying texture
object. When we call intel_miptree_copy_teximage in the process
of re-creating a miptree (possibly because the number of miplevels
has changed) we didn't account for this, so we were only copying
texture images for the first slice.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-02 09:29:18 +01:00
Eric Anholt
753c327151 vc4: Kill a bunch of color write calculation when colormask is all off.
I could have done this in the bit that generates the ANDs and ORs, but
it's probably generally useful.  Sadly, I still need this even if I move
to NIR, because I can't yet express my read of the destination color in
NIR, which I would need to move my blend/logicop/colormask handling into
NIR.

total uniforms in shared programs: 13497 -> 13455 (-0.31%)
uniforms in affected programs:     101 -> 59 (-41.58%)
total instructions in shared programs: 40797 -> 40296 (-1.23%)
instructions in affected programs:     1639 -> 1138 (-30.57%)
2015-02-01 16:07:24 -08:00
Fredrik Höglund
0508032413 docs: Update ARB_direct_state_access
Mark vertex array objects as started.
2015-02-01 23:00:42 +01:00
Martin Peres
9272022353 doc: break down ARB_direct_state_access in GL3.txt
A student was wondering what was going on + I started working on it too.

CC: Laura Ekstrand <laura@jlekstrand.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-02-01 22:50:35 +01:00
Eric Anholt
12ebd7e20e vc4: Dump the VPM read index in QIR disasm.
Since the VPM reads have to be in order, it's useful to see their indices
in the dump.
2015-02-01 12:53:08 -08:00
Jason Ekstrand
6094619c02 i965/pixel_read: Don't try to do a tiled_memcpy from a multisampled buffer
The GL spec guarantees that glGetTexImage will never get a multisampled
texture, but this is not true for glReadPixels.  If we get a multisampled
buffer, we have to do a multisample resolve on it before we can pull the
data down for the user.  Since this isn't practical to handle in
tiled_memcpy, we just fall back to the other paths that can handle this.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-31 08:54:32 -08:00
Francisco Jerez
11f5d8a5d4 i965: Enable L3 caching of buffer surfaces.
And remove the mocs argument of the emit_buffer_surface_state vtbl hook.  Its
semantics vary greatly from one generation to another, so it kind of
encourages the caller to pass 0 which is the only valid setting across
generations.  After this commit the hardware-specific code decides what the
best cacheability settings are for buffer surfaces, just like we do for
textures.

This together with some additional changes coming is expected to improve
performance of pull constants, buffer textures, atomic counters and image
objects on Gen7 and up.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-31 17:01:49 +02:00
José Fonseca
11a955aef4 egl: Pass the correct X visual depth to xcb_put_image().
The dri2_x11_add_configs_for_visuals() function happily matches a 32-bit
EGLconfig with a 24-bit X visual.  However, it was passing a 32-bit
depth to xcb_put_image(), making the X server unhappy:

  https://github.com/apitrace/apitrace/issues/313#issuecomment-70571911

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-31 09:14:36 +00:00
Jason Ekstrand
5c31184cf5 intel/pixel_read: Properly flip the results for window system buffers
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88841

Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-01-30 18:56:56 -08:00
Jason Ekstrand
837a4c42a6 i965/tiled_memcpy: Support a signed linear pitch
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-01-30 18:56:56 -08:00
Jason Ekstrand
7cc3bb2318 main: Add STENCIL_INDEX formats to base_tex_format
This fixes a bug on BDW when our meta-based stencil blit path assert-fails
due to an invalid internal format even though we do support the
ARB_stencil_texturing extension.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 15:49:45 -08:00
Jason Ekstrand
16875bc5cd teximage: Don't indent switch cases
No functional change.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 15:49:45 -08:00
Brian Paul
b930ef1ce8 mesa: remove some dead display list code
The size of a Node is always four bytes so no need for the old code
that was used when sizeof(Node)==8.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 13:27:18 -07:00
Brian Paul
20bc72b791 mesa: remove stale comment in dlist.c code
sizeof(Node) is always 4 bytes.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 13:27:18 -07:00
Brian Paul
613974b774 mesa: s/union gl_dlist_node/Node/ in dlist.c code
Just minor clean-up.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 13:27:17 -07:00
Brian Paul
53b01938ed mesa: fix display list 8-byte alignment issue
The _mesa_dlist_alloc() function is only guaranteed to return a pointer
with 4-byte alignment.  On 64-bit systems which don't support unaligned
loads (e.g. SPARC or MIPS) this could lead to a bus error in the VBO code.

The solution is to add a new  _mesa_dlist_alloc_aligned() function which
will return a pointer to an 8-byte aligned address on 64-bit systems.
This is accomplished by inserting a 4-byte NOP instruction in the display
list when needed.

The only place this actually matters is the VBO code where we need to
allocate a 'struct vbo_save_vertex_list' which needs to be 8-byte
aligned (just as if it were malloc'd).

The gears demo and others hit this bug.
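
Illustration only, not code from this patch: a minimal C sketch of the
padding idea, assuming a display list made of 4-byte words whose base
address is 8-byte aligned.  The struct, opcode values and dlist_alloc()
signature below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OP_NOP = 1, OP_DATA = 2 };   /* hypothetical opcode values */
#define LIST_WORDS 64

struct dlist {
    _Alignas(8) uint32_t words[LIST_WORDS]; /* base is 8-byte aligned */
    size_t used;                            /* words consumed so far */
};

/* Reserve one opcode word followed by payload_words words and return a
 * pointer to the payload.  When align8 is set, a one-word NOP is emitted
 * first if the payload would otherwise land on an odd word index, i.e.
 * on an address that is only 4-byte aligned. */
static void *dlist_alloc(struct dlist *dl, uint32_t opcode,
                         size_t payload_words, int align8)
{
    if (align8 && ((dl->used + 1) % 2) != 0) {
        assert(dl->used < LIST_WORDS);
        dl->words[dl->used++] = OP_NOP;
    }
    assert(dl->used + 1 + payload_words <= LIST_WORDS);
    dl->words[dl->used++] = opcode;
    void *payload = &dl->words[dl->used];
    dl->used += payload_words;
    return payload;
}
```

With an empty list, an unaligned allocation puts its payload at word 1
(4-byte aligned only); a following aligned allocation then emits a NOP at
word 2 so that its payload starts at word 4.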

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88662
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-01-30 08:48:19 -07:00
José Fonseca
fbc3e030e6 util/u_atomic: Provide a _InterlockedCompareExchange8 for older MSVC.
Fixes build with Windows SDK 7.0.7600.

Tested with u_atomic_test, both on x86 and x86_64.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-01-30 15:24:34 +00:00
José Fonseca
d7f2dfb67e util/u_atomic: Use _Interlocked* intrinsics for non 64bits.
The intrinsics are universally available, whereas older Windows SDKs (e.g.
7.0.7600) don't have the non-intrinsic entrypoint.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-01-30 15:24:33 +00:00
Neil Roberts
a7eec6d620 i965/skl: Force a BINDING_TABLE_POINTER_* after push constant command
According to the SKL bspec the 3DSTATE_CONSTANT_* commands only take
effect on the next corresponding 3DSTATE_BINDING_TABLE_POINTER_*
command. This patch just makes it set the BRW_NEW_SURFACES state when
uploading the push constants to ensure the binding tables will be
updated.

This fixes the fbo-blending-formats Piglit test and possibly others.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-30 12:25:13 +00:00
Topi Pohjolainen
083fb215e1 meta: Don't write depth when decompressing tex-images
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 09:59:13 +02:00
Topi Pohjolainen
c49c750579 meta: Don't write depth when generating miptrees
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 09:59:04 +02:00
Topi Pohjolainen
941aced635 meta/blit: Compile programs with and without depth
When color buffers alone are concerned the depth is not needed.

No regression on BDW where meta blit is used instead of blorp. I
also disabled blorp temporarily for fbo-blits on IVB and saw no
regressions there either.
I also compared several graphics benchmarks on BDW and saw neither
regressions nor improvements.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 09:58:32 +02:00
Topi Pohjolainen
97caf5fa04 meta/blit: Write depth only when asked for
Implementing an idea from Ken, on i965 the shader program for 2D
blits becomes significantly simpler.

Before:

pln(8)   g6<1>F    g4<0,1,0>F    g2<8,8,1>F  { align1 1Q compacted };
pln(8)   g7<1>F    g4.4<0,1,0>F  g2<8,8,1>F  { align1 1Q compacted };
send(8)  g2<1>UW   g6<8,8,1>F
         sampler (1, 0, 0, 1) mlen 2 rlen 4  { align1 1Q };
mov(8)   g123<1>F  g2<8,8,1>F                { align1 1Q compacted };
mov(8)   g124<1>F  g3<8,8,1>F                { align1 1Q compacted };
mov(8)   g125<1>F  g4<8,8,1>F                { align1 1Q compacted };
mov(8)   g126<1>F  g5<8,8,1>F                { align1 1Q compacted };
mov(8)   g127<1>F  g2<8,8,1>F                { align1 1Q compacted };
nop                                                             ;
sendc(8) null        g123<8,8,1>F
    render RT write SIMD8 LastRT Surface = 0 mlen 5 rlen 0 { align1 1Q EOT };

After:

pln(8)   g6<1>F     g4<0,1,0>F    g2<8,8,1>F   { align1 1Q compacted };
pln(8)   g7<1>F     g4.4<0,1,0>F  g2<8,8,1>F   { align1 1Q compacted };
send(8)  g124<1>UW  g6<8,8,1>F
         sampler (1, 0, 0, 1) mlen 2 rlen 4    { align1 1Q };
sendc(8) null        g124<8,8,1>F
   render RT write SIMD8 LastRT Surface = 0 mlen 4 rlen 0 { align1 1Q EOT };

v2 (Matt): Removed unintended white-space change

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 09:57:51 +02:00
Topi Pohjolainen
4c157d34c0 meta/blit: Add plumbing for shaders without depth
Currently all blit programs are unconditionally compiled with
gl_FragDepth.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 09:54:53 +02:00
Jason Ekstrand
604ae33c8b nir/opt_algebraic: Add some constant bcsel reductions
total instructions in shared programs: 5998190 -> 5997603 (-0.01%)
instructions in affected programs:     54276 -> 53689 (-1.08%)
helped:                                293

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-29 17:11:13 -08:00
Jason Ekstrand
7f19cd5a56 nir/opt_algebraic: Add some boolean simplifications
total instructions in shared programs: 5998321 -> 5998287 (-0.00%)
instructions in affected programs:     4520 -> 4486 (-0.75%)
helped:                                8

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-29 17:11:10 -08:00
Jason Ekstrand
70273c5cd5 nir/algebraic: Support specifying variable as constant or by type
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-29 17:07:45 -08:00
Jason Ekstrand
81f77e4f3a nir/algebraic: Fail to compile if a variable is used in a replace but not the search
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-29 17:07:45 -08:00
Jason Ekstrand
026b5cc792 nir/search: Allow for matching variables based on types
This allows you to match on an unknown value but only if it is of a given
type.  90% of the uses of this are for matching only booleans, but adding
the generality of arbitrary types is no more complex.

nir_algebraic.py doesn't handle this yet but that's ok because the C
language will ensure that the default type on all variables is void.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-29 17:07:45 -08:00
Jason Ekstrand
d8999bcdce nir/search: Add support for matching unknown constants
There are some algebraic transformations that we want to do but only if
certain things are constants.  For instance, we may want to replace
a * (b + c) with (a * b) + (a * c) as long as a and either b or c is constant.
While this generates more instructions, some of it will get constant
folded.

nir_algebraic.py doesn't handle this yet, but that's ok because the C
language will make sure that false is the default for now.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-29 17:07:45 -08:00
Jason Ekstrand
5ab1489ae6 nir: Add an invalid type
This allows us to indicate a concept of an invalid type.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-29 17:07:45 -08:00
Roland Scheidegger
f01e8d3ba5 gallium/docs: fix docs wrt ARL/ARR/FLR
since the address reg holds integer values, ARL/ARR do an implicit float-to-int
conversion, so clarify that. Thus it is also incorrect to say that FLR really
does the same as ARL.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-01-29 22:08:12 +01:00
Eric Anholt
fc884eadf1 nir: Add variants of some of the comparison simplifications.
We end up with these from TGSI-to-NIR because the pass generating the
comparisons doesn't know if the arg is actually a bool input or not.  vc4
results:

total instructions in shared programs: 41801 -> 41508 (-0.70%)
instructions in affected programs:     4253 -> 3960 (-6.89%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-29 11:44:06 -08:00
Eric Anholt
2b9c3bace7 vc4: Fix point size handling when it's the first output. 2015-01-29 11:43:33 -08:00
Eric Anholt
9a3a60cb13 nir: Don't try to to-SSA ALU instructions that are already SSA.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-29 11:43:33 -08:00
Eric Anholt
68d476167c nir: Fix a bit of broken indentation.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-29 11:42:08 -08:00
Eric Anholt
36c604c824 nir: Add a couple of helpers for glsl types.
This will be used by tgsi_to_nir, which needs to get vec4 types for
declaring shader input/output variables.

v2: Add a missing space.

Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-29 11:41:17 -08:00
Emil Velikov
765cfe9a90 docs: fix mesa 10.4.3 release date
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-29 14:02:48 +00:00
Kalyan Kondapally
e638841b87 Mesa: Advertise GL_OES_texture_*float* extensions support with i965.
This patch advertises support for GL_OES_texture_*float* extensions
when using i965 drivers.

Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-01-29 08:22:12 +02:00
Kalyan Kondapally
2c2a92d5b8 Mesa: Add support for HALF_FLOAT_OES type.
This patch adds needed support for accepting HALF_FLOAT_OES as valid type
for TexImage*D and TexSubImage*D when Texture Float extensions are supported.

Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-01-29 08:21:41 +02:00
Kalyan Kondapally
a63c8a524b Mesa: Add support for GL_OES_texture_*float* extensions.
This patch series adds support for following GLES2 Texture Float extensions:
1)GL_OES_texture_float,
2)GL_OES_texture_half_float,
3)GL_OES_texture_float_linear,
4)GL_OES_texture_half_float_linear.

This patch adds basic infrastructure and needed boolean flags to advertise
support for these extensions, by default the support is disabled. Next patch
in the series introduces support for HALF_FLOAT_OES token.

v4: take assert away and make valid_filter_for_float conditional (Tapani),
    fix the alphabetical order (Emil)

Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-01-29 08:16:47 +02:00
Eric Anholt
dd4d9a4e62 nir: Make vec-to-movs handle src/dest aliasing.
It now emits vector MOVs instead of a series of individual MOVs, which
should be useful to any vector backends.  This pushes the problem of
src/dest aliasing of channels on a scalar chip to the backend, but if
there are any vector operations in your shader then you needed to be
handling this already.

Fixes fs-swap-problem with my scalarizing patches.

v2: Rename to insert_mov(), and add a comment about what it does.
v3: Rewrite the comment.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v3)
2015-01-28 16:33:34 -08:00
Eric Anholt
d70eb38517 gallium: Replace u_simple_list.h with util/simple_list.h
The code was exactly the same, except util/ has c++ guards and a struct
simple_node declaration.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-01-28 16:33:34 -08:00
Eric Anholt
7c99187c6a mesa: Port a variant of 68afbe89c7 to util/
The idea is that after a remove_from_list(), you might want to be able to
do a remove_from_list() on it again or an is_empty_list().  This is
apparently relied on by r300g.
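
Illustration only: one way to get that behaviour is to leave a removed
node linked to itself.  The sketch below uses plain functions and
hypothetical names rather than the simple_list.h macros, and is not taken
from either commit.

```c
#include <assert.h>
#include <stdbool.h>

struct node { struct node *prev, *next; };   /* hypothetical list node */

static void list_init(struct node *head)
{
    head->prev = head->next = head;
}

static void list_insert_head(struct node *head, struct node *n)
{
    n->prev = head;
    n->next = head->next;
    head->next->prev = n;
    head->next = n;
}

/* Unlink n, then make it point at itself so that a second removal or an
 * emptiness check on n is harmless. */
static void list_remove(struct node *n)
{
    assert(n->prev && n->next);
    n->next->prev = n->prev;
    n->prev->next = n->next;
    n->prev = n->next = n;
}

static bool list_is_empty(const struct node *head)
{
    return head->next == head;
}
```

Because a removed node is its own neighbour, a second list_remove() on it
only rewrites its own pointers and leaves the original list untouched.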

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-01-28 16:33:34 -08:00
Eric Anholt
8ab6759cef mesa: Move simple_list.h to src/util.
We have two copies of it in the tree, I'm going to delete one.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-01-28 16:33:34 -08:00
Tom Stellard
2397a72129 radeonsi: Enable VGPR spilling for all shader types v5
v2:
  - Only emit write SPI_TMPRING_SIZE once per packet.
  - Use context global scratch buffer.

v3:
  - Patch shaders using WRITE_DATA packet instead of map/unmap.
  - Emit ICACHE_FLUSH, CS_PARTIAL_FLUSH, PS_PARTIAL_FLUSH, and
    VS_PARTIAL_FLUSH when patching shaders.

v4:
  - Code cleanups.
  - Remove unnecessary multiplies.

v5:
  - Patch shaders in system memory and re-upload to vram.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-28 21:03:47 +00:00
Tom Stellard
5dcd97f25c radeonsi/compute: Allocate the scratch buffer during state creation
This moves scratch buffer allocation from si_launch_grid() to
si_create_compute_state().  This helps to reduce the overhead of
launching a kernel and also fixes a bug in the code that would cause
the scratch buffer to be too small if a kernel with smaller scratch size
was launched before a kernel with a larger scratch size.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-28 21:03:46 +00:00
Tom Stellard
32206c5e56 radeonsi: Add radeon_shader_binary member to struct si_shader
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-28 21:03:46 +00:00
Tom Stellard
37559f8dfc radeonsi/compute: Rename si_compute::program to si_compute::shader
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-28 21:03:46 +00:00
Marek Olšák
5935edd47c radeonsi: Avoid leaking memory when rebuilding shader states
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-28 21:03:46 +00:00
Jason Ekstrand
bb26ebac13 nir/opcodes: Use a return type of tfloat for ldexp
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-28 13:21:40 -08:00
Jason Ekstrand
7ac79eea1a Revert "util: Move the alternate fpclassify implementation to util"
This reverts commits d6eb572905 and
58e8468d11.

This is no longer necessary as we aren't using it in NIR anymore.  Also, it
broke the build on some strange systems so let's put it back in querymatrix
where it came from.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88852

Acked-by: Matt Turner <mattst88@gmail.com>
2015-01-28 13:20:26 -08:00
Jason Ekstrand
f0340ff625 Revert "nir/opcodes: Use fpclassify() instead of isnormal() for ldexp"
This reverts commit d7d340fb2f.

We have an isnormal() implementation available, the only problem was that
we had the wrong return type (fixed in a later patch).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88806

Acked-by: Matt Turner <mattst88@gmail.com>
2015-01-28 13:19:47 -08:00
Jason Ekstrand
58e8468d11 util: Predicate the fpclassify fallback on !defined(__cplusplus)
The problem is that the fallbacks we have at the moment don't work in C++.
While we could theoretically fix the fallbacks it would also raise the
issue of correctly detecting the fpclassify function.  So, for now, we'll
just disable it until we actually have a C++ user.

Reported-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: EdB <edb+mesa@sigluy.net>
2015-01-28 11:47:56 -08:00
Sven Arvidsson
3b7747c022 drirc: set allow_glsl_extension_directive_midshader for Dead Island.
Signed-off-by: Sven Arvidsson <sa@whiz.se>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87076
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-01-28 14:50:28 +01:00
Jason Ekstrand
d7d340fb2f nir/opcodes: Use fpclassify() instead of isnormal() for ldexp
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88806
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-28 03:42:41 -08:00
Jason Ekstrand
d6eb572905 util: Move the alternate fpclassify implementation to util
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-28 03:42:41 -08:00
Jason Ekstrand
5e8468e6da i965/tex: Don't create read-write textures with non-renderable formats
I haven't actually seen this bug in the wild, but it's possible that
someone could ask to do a S3TC PBO download or something.  This protects us
from accidentally creating a render target with a compressed or otherwise
non-renderable format.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-28 01:28:32 -08:00
Jason Ekstrand
34723c0861 i965/gen8: Include the buffer offset when emitting renderbuffer relocs
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88792
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-28 01:28:31 -08:00
Tapani Pälli
291d7ef84d mesa: improve error messaging for format CSV parser
This patch adds two error messages that point the user directly to a
misspelled or impossible swizzle field for a format.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-28 10:40:15 +02:00
EdB
6ee5effac1 clover/llvm: Dump the OpenCL C code earlier.
[ Francisco Jerez: As discussed on the mailing list, this is intended
  to produce more useful debug output in cases where the compilation
  terminates unexpectedly. ]

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-01-28 02:27:41 +02:00
EdB
13d23a9a17 clover/llvm: Move CLOVER_DEBUG stuff into anonymous namespace.
[ Francisco Jerez: As we're at it make debug_options[] local to its
  only user and remove temporary. ]

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-01-28 02:27:41 +02:00
Dave Airlie
349df23eb0 r600g: add support for primitive id without geom shader (v2)
GLSL 1.50 specifies a fragment shader may have a primitive id
input without a geometry shader present.

On r600 hw there is a special GS scenario for this, you have
to enable GS_SCENARIO_A and pass the primitive id through
the vertex shader which operates in GS_A mode.

This is a first pass attempt at this, and passes the piglit
tests that test for this.

v1.1: clean up debug print + no need to assign
key value to setup output.
v2: add r600 support

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-01-28 09:51:21 +10:00
Dave Airlie
cc2fc095bf r600g: move selecting the pixel shader earlier.
In order to detect that a pixel shader has a prim id
input when we have no geometry shader we need to reorder
the shader selection so the pixel shader is selected
first, then the vertex shader key can take into account
the primitive id input requirement and lack of geom shader.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-01-28 09:51:02 +10:00
Michel Dänzer
5c83a0d2ce st/clover: Pass target instead of target.begin() to std::string()
Fixes reading beyond allocated memory:

==1936== Invalid read of size 1
==1936==    at 0x4C2C1B4: strlen (vg_replace_strmem.c:412)
==1936==    by 0x9E00C30: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.20)
==1936==    by 0x5B44FAE: clover::compile_program_llvm(clover::compat::string const&, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&, pipe_shader_ir, clover::compat::string const&, clover::compat::string const&, clover::compat::string&) (invocation.cpp:698)
==1936==    by 0x5B39A20: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63)
==1936==    by 0x5B20152: clBuildProgram (program.cpp:182)
==1936==    by 0x400F41: main (hello_world.c:109)
==1936==  Address 0x56fee1f is 0 bytes after a block of size 15 alloc'd
==1936==    at 0x4C28C20: malloc (vg_replace_malloc.c:296)
==1936==    by 0x5B398F0: alloc (compat.hpp:59)
==1936==    by 0x5B398F0: vector<std::basic_string<char> > (compat.hpp:98)
==1936==    by 0x5B398F0: string<std::basic_string<char> > (compat.hpp:327)
==1936==    by 0x5B398F0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63)
==1936==    by 0x5B20152: clBuildProgram (program.cpp:182)
==1936==    by 0x400F41: main (hello_world.c:109)

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-01-27 16:55:29 +09:00
Michel Dänzer
ee31c8d706 r600g,radeonsi: Fix calculation of IR target cap string buffer size
Fixes writing beyond the allocated buffer:

==31855== Invalid write of size 1
==31855==    at 0x50AB2A9: vsprintf (iovsprintf.c:43)
==31855==    by 0x508F6F6: sprintf (sprintf.c:32)
==31855==    by 0xB59C7EC: r600_get_compute_param (r600_pipe_common.c:526)
==31855==    by 0x5B2B7DE: get_compute_param<char> (device.cpp:37)
==31855==    by 0x5B2B7DE: clover::device::ir_target() const (device.cpp:201)
==31855==    by 0x5B398E0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63)
==31855==    by 0x5B20152: clBuildProgram (program.cpp:182)
==31855==    by 0x400F41: main (hello_world.c:109)
==31855==  Address 0x56fed5f is 0 bytes after a block of size 15 alloc'd
==31855==    at 0x4C29180: operator new(unsigned long) (vg_replace_malloc.c:324)
==31855==    by 0x5B2B7C2: allocate (new_allocator.h:104)
==31855==    by 0x5B2B7C2: allocate (alloc_traits.h:357)
==31855==    by 0x5B2B7C2: _M_allocate (stl_vector.h:170)
==31855==    by 0x5B2B7C2: _M_create_storage (stl_vector.h:185)
==31855==    by 0x5B2B7C2: _Vector_base (stl_vector.h:136)
==31855==    by 0x5B2B7C2: vector (stl_vector.h:278)
==31855==    by 0x5B2B7C2: get_compute_param<char> (device.cpp:35)
==31855==    by 0x5B2B7C2: clover::device::ir_target() const (device.cpp:201)
==31855==    by 0x5B398E0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63)
==31855==    by 0x5B20152: clBuildProgram (program.cpp:182)
==31855==    by 0x400F41: main (hello_world.c:109)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-01-27 16:54:38 +09:00
Connor Abbott
f1a9252def nir: fix a bug with constant folding non-per-component instructions
Before, we were only copying the first N channels, where N is the size
of the SSA destination, which is fine for per-component instructions,
but non-per-component instructions like fdot3 can have more source
components than destination components. Fix this using the helper
function introduced in the last patch.

v2: use new helper name

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-26 21:26:36 -05:00
Connor Abbott
816f0515a2 nir: add a helper function for getting the number of source components
Unlike with non-SSA ALU instructions, where if they're per-component
you have to look at the writemask to know which source channels are
being used, SSA ALU instructions always have all the possible channels
enabled so we can just look at the number of components in the SSA
definition for per-component instructions to say how many source
components are being used.
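
Illustration only: the rule above as a small C sketch.  The structs are
simplified, hypothetical stand-ins for the real NIR types; an input size
of 0 is assumed here to mean "per-component".

```c
#include <assert.h>

/* Hypothetical, simplified opcode description. */
struct op_info {
    unsigned num_inputs;
    unsigned input_sizes[4];   /* 0 = per-component, else fixed width */
};

struct alu_instr {
    const struct op_info *info;
    unsigned dest_components;  /* components in the SSA definition */
};

/* Number of channels read from source `src` of an SSA ALU instruction. */
static unsigned src_components(const struct alu_instr *instr, unsigned src)
{
    assert(src < instr->info->num_inputs);
    if (instr->info->input_sizes[src] > 0)
        return instr->info->input_sizes[src];   /* e.g. fdot3 reads 3 */
    return instr->dest_components;              /* per-component case */
}
```

For a per-component add writing a vec4 this gives 4; for a dot product of
two vec3 sources writing a single component it gives 3 rather than 1.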

v2: use new name nir_ssa_alu_instr_src_components()

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-26 21:26:36 -05:00
Sisinty Sasmita Patra
90bd943f2a i965: Implement a tiled fast-path for glReadPixels and glGetTexImage
Added intel_readpixels_tiled_memcpy and intel_gettexsubimage_tiled_memcpy
functions. These are the fast paths for glReadPixels and glGetTexImage.

On Chrome, using the RoboHornet 2D Canvas toDataURL test, this patch cuts
the amount of time spent in glReadPixels by more than half and reduces the time
of the entire test by 10%.

v2: Jason Ekstrand <jason.ekstrand@intel.com>
   - Refactor to make the functions look more like the old
     intel_tex_subimage_tiled_memcpy
   - Don't export the readpixels_tiled_memcpy function
   - Fix some pointer arithmetic bugs in partial image downloads (using
     ReadPixels with a non-zero x or y offset)
   - Fix a bug when ReadPixels is performed on an FBO wrapping a texture
     miplevel other than zero.

v3: Jason Ekstrand <jason.ekstrand@intel.com>
   - Better documentation for the *_tiled_memcpy functions
   - Add target restrictions for renderbuffers wrapping textures

v4: Jason Ekstrand <jason.ekstrand@intel.com>
   - Only check the return value of brw_bo_map for error and not bo->virtual

v5: Jason Ekstrand <jason.ekstrand@intel.com>
   - Don't unnecessarily repeat a comment

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-01-26 17:29:35 -08:00
Sisinty Sasmita Patra
b52959c602 i965/tiled_memcpy: Add tiled-to-linear paths
This commit adds tiled copy functions for copying from tiled memory to
linear memory.  These are very similar to the existing linear-to-tiled
paths.

v2: Jason Ekstrand <jason.ekstrand@intel.com>
   - New commit message
   - Various whitespace fixes
   - Added ptrdiff_t casts as done in commit 225a09790

v3: Jason Ekstrand <jason.ekstrand@intel.com>
   - Fixed a comment

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-01-26 17:29:34 -08:00
Sisinty Sasmita Patra
009be40b7d i965: Refactor tiled memcpy functions and move them into their own file
This commit refactors the tiled_memcpy code in intel_tex_subimage.c and
moves it into its own intel_tiled_memcpy files.  Also, xtile_copy and
ytile_copy are renamed to linear_to_xtiled and linear_to_ytiled
respectively.  The *_faster functions are similarly renamed.

There was also a bit of logic to select between the libc provided
memcpy function and our custom memcpy that does an RGBA -> BGRA swizzle.
This was moved into an intel_get_memcpy function so that rgba8_copy can
live (and be inlined) in intel_tiled_memcpy.c.
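
Illustration only: a minimal sketch of such a selection, assuming 4-byte
pixels.  The loop body and the get_copy_fn() helper are hypothetical and
are not the code moved by this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void *(*mem_copy_fn)(void *dst, const void *src, size_t bytes);

/* Copy 4-byte pixels while swapping the first and third channel of each
 * pixel, which turns RGBA into BGRA (and back). */
static void *rgba8_copy(void *dst, const void *src, size_t bytes)
{
    uint8_t *d = dst;
    const uint8_t *s = src;

    assert(bytes % 4 == 0);
    for (size_t i = 0; i < bytes; i += 4) {
        d[i + 0] = s[i + 2];
        d[i + 1] = s[i + 1];
        d[i + 2] = s[i + 0];
        d[i + 3] = s[i + 3];
    }
    return dst;
}

/* Hypothetical selector: plain memcpy when the layouts already match,
 * the swizzling copy otherwise. */
static mem_copy_fn get_copy_fn(int needs_swizzle)
{
    return needs_swizzle ? rgba8_copy : memcpy;
}
```

Returning a function pointer keeps the per-row copy loop identical for
both cases; only the selected copy routine differs.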

v2: Jason Ekstrand <jason.ekstrand@intel.com>
   - Better commit message
   - Fix up the copyright on the intel_tiled_memcpy files
   - Various whitespace fixes
   - Moved a bunch of stuff that did not need to be exposed from
     intel_tiled_memcpy.h to intel_tiled_memcpy.c
   - Added proper documentation for intel_get_memcpy
   - Incorporated the ptrdiff_t tweaks from commit 225a09790

v3: Jason Ekstrand <jason.ekstrand@intel.com>
   - Fixed a comment
   - Move the tile size constants into the .c file

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-01-26 17:29:34 -08:00
Jason Ekstrand
f883aac06e i965/tex_subimage: Use the fast tiled path for rectangle textures
There's no reason why we should be doing this for 2D textures and not
rectangles.  Just a matter of adding another hunk to the condition.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-01-26 17:29:34 -08:00
Dave Airlie
ea9ae5d51a mesa/autoconf: attempt to use gnu99 on older gcc compilers
anonymous structs/union don't work with c99 but do work with gnu99
on gcc 4.4.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-01-27 10:27:56 +10:00
Felix Janda
2e2087a9eb mesa: simplify detection of fpclassify
Fixes compilation with musl libc.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-26 14:07:57 -08:00
Jason Ekstrand
dd74369a0a nir/opcodes: Don't go through doubles when constant-folding iabs
Previously, we called the abs() function in math.h.  However, this involves
unnecessarily going through double.  This commit changes it to use integers
directly with a ternary.
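
Illustration only: the integer-only form as a standalone C function.
The name fold_iabs() is hypothetical, and the sketch simply excludes
INT32_MIN, whose negation is not representable in int32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Absolute value computed purely on integers, with no round trip
 * through a floating-point type. */
static int32_t fold_iabs(int32_t src0)
{
    assert(src0 != INT32_MIN);   /* -INT32_MIN would overflow */
    return (src0 < 0) ? -src0 : src0;
}
```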

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-26 11:25:02 -08:00
Jason Ekstrand
9bd28fe3a3 nir/opcodes: Simplify and fix the unpack_half_*_split_* constant expressions
Previously, these functions were explicitly writing to dst.x and dst.y.
However they both return only one component so writing to dst.y is invalid.
Also, since they only return one component, we don't need the explicit
assignment in the expression and can simplify it use an implicit
assignment.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-26 11:25:02 -08:00
Jason Ekstrand
27c6e3e4ca nir: Use pointers for nir_src_copy and nir_dest_copy
This avoids the overhead of copying structures and better matches the newly
added nir_alu_src_copy and nir_alu_dest_copy.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-26 11:24:58 -08:00
Kenneth Graunke
9f5fee8804 i965: Handle CMP.nz ... 0 and MOV.nz similarly in cmod propagation.
"MOV.nz null src" and "CMP.nz null src 0" are equivalent instructions.

Previously, we deleted MOV.nz instructions when the instruction
generating the MOV's source also wrote the flag register (as the flag
register already contains the desired value).  However, we wouldn't
delete CMP.nz instructions that served the same purpose.

We also didn't attempt true cmod propagation on MOV.nz instructions,
while we would for the equivalent CMP.nz form.

This patch fixes both limitations, treating both forms equally.
CMP.nz instructions will now be deleted (helping the NIR backend),
and MOV.nz instructions will have their .nz propagated.

No changes in shader-db without NIR.  With NIR,

total instructions in shared programs: 6006153 -> 5969364 (-0.61%)
instructions in affected programs:     2087139 -> 2050350 (-1.76%)
helped:                                10704
HURT:                                  0
GAINED:                                2
LOST:                                  2

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-26 10:13:18 -08:00
Jan Vesely
9cbb9165b9 clover: Fix build with llvm after r226981
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88783
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
2015-01-26 09:46:41 -05:00
Niels Ole Salscheider
4b94c3fc31 configure: Link against all LLVM targets when building clover
Since 8e7df519bd, we initialise all targets in
clover. This fixes bug 85380.

v2: Mention correct bug in commit message

Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-25 18:11:03 +02:00
Connor Abbott
0aa31bf9c3 nir/constant_folding: use the new constant folding infrastructure
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-24 21:35:35 -08:00
Jason Ekstrand
89285e4d47 nir: add new constant folding infrastructure
Add a required field to the Opcode class, const_expr, that contains an
expression or statement that computes the result of the opcode given known
constant inputs. Then take those const_expr's and expand them into a function
that takes an opcode and an array of constant inputs and spits out the constant
result. This means that when adding opcodes, there's one less place to update,
and almost all the opcodes are self-documenting since the information on how to
compute the result is right next to the definition.
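
Illustration only: roughly the shape of what such generated code could
look like, written by hand for three float opcodes.  The enum, function
names and signatures are hypothetical; the real output is produced from
nir_opcodes.py and differs in detail.

```c
#include <assert.h>

enum opcode { OP_FADD, OP_FMUL, OP_FNEG };   /* hypothetical subset */

/* One small evaluator per opcode, each corresponding to one const_expr. */
static float eval_fadd(const float *src) { return src[0] + src[1]; }
static float eval_fmul(const float *src) { return src[0] * src[1]; }
static float eval_fneg(const float *src) { return -src[0]; }

/* Dispatcher: takes an opcode and an array of constant inputs and
 * returns the constant result. */
static float eval_const(enum opcode op, const float *src)
{
    switch (op) {
    case OP_FADD: return eval_fadd(src);
    case OP_FMUL: return eval_fmul(src);
    case OP_FNEG: return eval_fneg(src);
    }
    assert(!"unknown opcode");
    return 0.0f;
}
```

Adding an opcode then means adding one expression and one case, both of
which a generator can emit from the same table entry.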

The helper functions in nir_constant_expressions.c were taken from
ir_constant_expressions.cpp.

v3 Jason Ekstrand <jason.ekstrand@iastate.edu>
 - Use mako to generate one function per opcode instead of doing piles of
   string splicing

v4 Jason Ekstrand <jason.ekstrand@iastate.edu>
 - More comments and better indentation in the mako
 - Add a description of the constant expression language in nir_opcodes.py
 - Added nir_constant_expressions.py to EXTRA_DIST in Makefile.am

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-24 21:35:35 -08:00
Connor Abbott
fa4bc6c130 nir: use Python to autogenerate opcode information
Before, we used a system where a file, nir_opcodes.h, defined some macros that
were included to generate the enum values and the nir_op_infos structure. This
worked pretty well, but for development the error messages were never very
useful, Python tools couldn't understand the opcode list, and it was difficult
to use nir_opcodes.h to do other things like autogenerate a builder API. Now, we
store opcode information in nir_opcodes.py, and we have nir_opcodes_c.py to
generate the old nir_opcodes.c and nir_opcodes_h.py to generate nir_opcodes.h,
which contains all the enum names and gets included into nir.h like before.  In
addition to solving the above problems, using Python and Mako to generate
everything means that it's much easier to keep information centralized as we
add new things like constant propagation that require per-opcode information.

v2:
 - make Opcode derive from object (Dylan)
 - don't use assert like it's a function (Dylan)
 - style fixes for fnoise, use xrange (Dylan)
 - use iterkeys() in nir_opcodes_h.py (Dylan)
 - use pydoc-style comments (Jason)
 - don't make fmin/fmax commutative and associative yet (Jason)

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

v3 Jason Ekstrand <jason.ekstrand@intel.com>
 - Alphabetize source file lists
 - Generate nir_opcodes.h in the builddir instead of the source dir
 - Include $(builddir)/src/glsl/nir in the i965 build
 - Rework nir_opcodes.h generation so it generates a complete header file
   instead of one that has to be embedded inside an enum declaration
2015-01-24 21:33:56 -08:00
Emil Velikov
d2811c29da docs: add news item and link release notes for mesa 10.4.3
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-24 13:18:10 +00:00
Emil Velikov
48818a0fc7 docs: Add sha256 sums for the 10.4.3 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 49a5bce780)
2015-01-24 13:14:56 +00:00
Emil Velikov
9f35423270 Add release notes for the 10.4.3 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit e92bfa3f95)
2015-01-24 13:14:54 +00:00
Matt Turner
94e7b59a75 i965: Convert CMP.GE -(abs)reg 0 -> CMP.Z reg 0.
total instructions in shared programs: 5952059 -> 5951603 (-0.01%)
instructions in affected programs:     138812 -> 138356 (-0.33%)
GAINED:                                1
LOST:                                  0

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-23 17:57:40 -08:00
Matt Turner
40ae302a3c i965/fs: Add support for removing MOV.NZ instructions.
For some reason, we occasionally write the flag register with a MOV.NZ
instruction:

   add(8)          g25<1>F         -g6<0,1,0>F     g15<8,8,1>F
   cmp.l.f0(8)     g26<1>D         g25<8,8,1>F     0F
   mov.nz.f0(8)    null            g26<8,8,1>D

A MOV.NZ instruction on the result of a CMP is like comparing for
equality with true in C. It's useless. Removing it allows us to
generate:

   add.l.f0(8)     null            -g6<0,1,0>F     g15<8,8,1>F
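
The "comparing for equality with true" remark can be checked with a small C sketch. It assumes the CMP writes all ones for true and zero for false, and models the flag as a bool; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CMP.L modelled as writing all ones (true) or zero (false) to its destination. */
static uint32_t sketch_cmp_l(float a, float b)
{
   return (a < b) ? 0xFFFFFFFFu : 0u;
}

/* Flag value after CMP.L followed by MOV.NZ on the CMP result. */
bool sketch_flag_cmp_then_mov_nz(float a, float b)
{
   uint32_t d = sketch_cmp_l(a, b);
   return d != 0u; /* MOV.NZ: flag is set when the moved value is non-zero */
}

/* Flag value when the conditional mod is evaluated on the comparison itself. */
bool sketch_flag_cmp_only(float a, float b)
{
   return a < b;
}
```

Both functions return the same value for every pair of inputs, which is why the extra MOV.NZ carries no information.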

total instructions in shared programs: 5955701 -> 5951657 (-0.07%)
instructions in affected programs:     302910 -> 298866 (-1.34%)
GAINED:                                1
LOST:                                  0

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-23 17:57:40 -08:00
Matt Turner
9a3a294224 i965/fs: Allow flipping cond mod for negated arguments.
This allows us to apply the optimization in cases where the CMP's
argument is negated, by flipping the conditional mod. For example, it
allows us to optimize this:

   add(8)       temp   a      b
   cmp.l.f0(8)  null   -temp  0.0

into

   add.g.f0(8)  temp   a      b
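
A minimal C sketch of why the conditional mod flips: testing the negated value with "less than zero" gives the same answer as testing the original value with "greater than zero". The function names are hypothetical, and only the .l/.g pair from the example above is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Original form: cmp.l on the negated value, i.e. (-temp < 0.0). */
bool sketch_cmp_l_negated(float temp)
{
   return -temp < 0.0f;
}

/* Flipped form: .g evaluated on the un-negated value, i.e. (temp > 0.0). */
bool sketch_cmp_g(float temp)
{
   return temp > 0.0f;
}
```

The two agree for positive, negative, and zero inputs, so the negation can be dropped as long as the condition is flipped with it.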

total instructions in shared programs: 5958360 -> 5955701 (-0.04%)
instructions in affected programs:     466880 -> 464221 (-0.57%)
GAINED:                                0
LOST:                                  1

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-23 17:57:40 -08:00
Matt Turner
d6317beb46 i965/fs: Propagate cmod across flag read if it contains the same value.
total instructions in shared programs: 5959463 -> 5958900 (-0.01%)
instructions in affected programs:     70031 -> 69468 (-0.80%)

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-23 17:57:40 -08:00
Matt Turner
3fb5b2bc47 i965/fs: Add unit tests for cmod propagation pass.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-23 17:57:40 -08:00
Matt Turner
19f9cb72c8 i965/fs: Add pass to propagate conditional modifiers.
total instructions in shared programs: 5974160 -> 5959463 (-0.25%)
instructions in affected programs:     1743737 -> 1729040 (-0.84%)
GAINED:                                0
LOST:                                  12

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-23 17:57:40 -08:00
Matt Turner
3759a89ad3 i965/fs: Eliminate null-dst instructions without side-effects.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-23 17:57:40 -08:00
Matt Turner
7452f18b22 i965/fs: Apply conditional mod specially to split MAD/LRP.
Otherwise we'll apply the conditional mod to only one of SIMD8
instructions and trigger an assertion.

NoDDClr/NoDDChk have the same problem but we never apply those to these
instructions, so I'm leaving them for a later time.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-23 17:57:40 -08:00
Matt Turner
eed7223243 i965/fs: Add a pass to fixup 3-src instructions that have a null dest.
3-src instructions can only have GRF/MRF destinations. It's really
difficult to deal with that restriction in dead code elimination (that
wants to give instructions null destinations to show that their result
isn't used) while allowing 3-src instructions to have conditional mod,
so don't, and just give them a destination before register allocation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-23 17:57:39 -08:00
Matt Turner
215b081c2a i965: Add is_3src() to backend_instruction.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-23 17:57:39 -08:00
Matt Turner
0654ca7d7e i965: Add backend_instruction::can_do_cmod().
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-23 17:57:39 -08:00
Matt Turner
71486e9f2d i965/cfg: Add a foreach_block_reverse macro.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-23 17:57:39 -08:00
Matt Turner
65dd4a255a i965/cfg: Add a foreach_inst_in_block_reverse_safe macro.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-23 17:57:39 -08:00
Matt Turner
579157e6c1 glsl: Add a foreach_in_list_reverse_safe macro.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-23 17:57:39 -08:00
Matt Turner
c638ea3d19 i965: Don't make instructions with a null dest a barrier to scheduling.
Now that we properly track accumulator dependencies, the scheduler is
able to schedule instructions between the mach and mov in the common
integer multiplication pattern:

   mul  acc0, x, y
   mach null, x, y
   mov  dest, acc0

Since a null destination implies no dependency on the destination, we
can also safely schedule instructions (that don't write the accumulator)
between the mul and mach.

GAINED:                                103
LOST:                                  43

Causes one program to spill (643 -> 1076 instructions).

I committed this patch last year (commit 42a26cb5) but reverted it
(commit 0d3f83f4) after inexplicable artifacts in Kerbal Space Program
(bug 78648). Tapani reapplied this patch and could not reproduce the bug
with current Mesa.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-23 17:57:39 -08:00
Ian Romanick
f02f1af9f7 i965/fs: Allow SIMD16 on pre-SNB when try_replace_with_sel is successful
If try_replace_with_sel is able to replace the flow control with a SEL
instruction, then there is no flow control... failing SIMD16 because
of nonexistent flow control is wrong.

No piglit regressions on any i965 platform in Jenkins.

total instructions in shared programs: 4382707 -> 4382707 (0.00%)
instructions in affected programs:     0 -> 0
helped:                                0
HURT:                                  0
GAINED:                                2089
LOST:                                  0

No other platforms affected in shader-db.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-23 17:34:47 -08:00
Eric Anholt
0680d170d1 nir: Expose nir_print_instr() for debug prints
It's nice to have this present in your default cases so you can see what
instruction is triggering an abort.

v2: Just pass a NULL state, now that it won't crash when you do.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-23 17:30:11 -08:00
Eric Anholt
6445a40520 nir: When asked to print with a NULL state, just use bare variable names.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-23 17:30:01 -08:00
Eric Anholt
447ddfc137 nir: Add nir_lower_alu_to_scalar.
This is the equivalent of brw_fs_channel_expressions.cpp, which I wanted
for vc4.

v2: Use the nir_src_for_ssa() helper, and another instance of
    nir_alu_src_copy().
v3: Drop the non-SSA support.  All intended callers will have SSA-only ALU
    ops.
v4: Use insert_before, drop stale bcsel/fcsel comment, drop now-unused
    unsupported() function, drop lower_context struct.
v5: Completely rename the pass to nir_lower_alu_to_scalar(), add an assert
    about weird input_sizes[].

Reviewed-by: Jason Ekstrand <jason.ekstrand@iastate.edu>
2015-01-23 16:37:23 -08:00
Eric Anholt
b200127816 nir: Make some helpers for copying ALU src/dests.
There aren't many users yet, but I wanted to do this from my scalarizing
pass.

v2: Constify the src arguments.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-23 16:37:16 -08:00
Kenneth Graunke
15063d2ad0 nir: Add algebraic optimizations for division and reciprocal.
These also exist in opt_algebraic.cpp.

total NIR instructions in shared programs: 2011430 -> 2011211 (-0.01%)
NIR instructions in affected programs:     42221 -> 42002 (-0.52%)
helped:                                    198

total i965 instructions in shared programs: 6020553 -> 6020116 (-0.01%)
i965 instructions in affected programs:     84322 -> 83885 (-0.52%)
helped:                                     394
HURT:                                       1 (by 1 instruction)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-23 14:53:26 -08:00
Kenneth Graunke
bbd60f6d79 nir: Add algebraic optimizations for exponential/logarithmic functions.
Most of these exist in the GLSL IR algebraic pass already.  However,
SSA allows us to find more instances of the patterns.

total NIR instructions in shared programs: 2015593 -> 2011430 (-0.21%)
NIR instructions in affected programs:     124189 -> 120026 (-3.35%)
helped:                                    604

total i965 instructions in shared programs: 6025505 -> 6018717 (-0.11%)
i965 instructions in affected programs:     261295 -> 254507 (-2.60%)
helped:                                     1295
HURT:                                       3

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-23 14:53:26 -08:00
Kenneth Graunke
391fb32bbe nir: Add algebraic optimizations for simplifying comparisons.
The first batch removes bonus fnot/inot operations, possibly allowing
other optimizations to better recognize patterns.

The next batch replaces a fadd and constant 0.0 with an fneg - negation
is usually free on GPUs, while addition is not.

total NIR instructions in shared programs: 2020814 -> 2015593 (-0.26%)
NIR instructions in affected programs:     411143 -> 405922 (-1.27%)
helped:                                    2233
HURT:                                      214

A few shaders are hurt by a few instructions due to moving neg such
that it has a constant operand, which is then folded, resulting in two
distinct load_consts for x and -x.  We can always clean that up later.

total i965 instructions in shared programs: 6035392 -> 6025505 (-0.16%)
i965 instructions in affected programs:     784980 -> 775093 (-1.26%)
helped:                                     4508
HURT:                                       2

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-23 14:53:26 -08:00
Kenneth Graunke
551a752a59 nir: Add algebraic optimizations for pointless shifts.
The GLSL IR optimization pass contained these; we may as well include
them too.

v2: Fix a >> 0 and a << 0 optimizations (caught by Matt).

No change in the number of NIR instructions on a shader-db run.

total i965 instructions in shared programs: 6035397 -> 6035392 (-0.00%)
i965 instructions in affected programs:     542 -> 537 (-0.92%)
helped:                                     2 (in glamor)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-23 14:53:26 -08:00
Kenneth Graunke
3e56572c49 nir: Add a bunch of algebraic optimizations on logic/bit operations.
Matt and I noticed a bunch of "val <- ior a a" operations in a shader,
so we decided to add an algebraic optimization for that.  While there,
I decided to add a bunch more of them.
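
As a hedged sketch of the "ior a a" case, the rule can be written as a match on the opcode and on both sources being the same SSA value. The enum and function below are hypothetical; the real rules live in the nir algebraic optimization tables, not in hand-written C like this.

```c
#include <assert.h>

/* Hypothetical opcodes for this sketch only. */
enum sketch_op { SKETCH_IOR, SKETCH_IADD };

/* Returns the SSA id that can replace "op a b", or -1 if no rule applies. */
int sketch_simplify(enum sketch_op op, int a, int b)
{
   assert(a >= 0 && b >= 0); /* SSA ids are non-negative in this sketch */
   if (op == SKETCH_IOR && a == b)
      return a;              /* ior a a  ->  a */
   return -1;
}
```

The rewrite is safe because x | x equals x for every bit pattern.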

v2: Delete bogus fand/for optimizations (caught by Jason).

total NIR instructions in shared programs: 2023511 -> 2020814 (-0.13%)
NIR instructions in affected programs:     149634 -> 146937 (-1.80%)
helped:                                    1032

total i965 instructions in shared programs: 6035392 -> 6035397 (0.00%)
i965 instructions in affected programs:     537 -> 542 (0.93%)
HURT:                                       2

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-23 14:53:26 -08:00
Kenneth Graunke
978b0a9cda nir: Implement CSE on intrinsics that can be eliminated and reordered.
Matt and I noticed that one of the shaders hurt by INTEL_USE_NIR=1 had
load_input and load_uniform intrinsics repeated several times, with the
same parameters, but each one generating a distinct SSA value.  This
made ALU operations on those values appear distinct as well.

Generating distinct SSA values is silly - these are read only variables.
CSE'ing them makes everything use a single SSA value, which then allows
other operations to be CSE'd away as well.

Generalizing a bit, it seems like we should be able to safely CSE any
intrinsics that can be eliminated and reordered.  I didn't implement
support for variables for the time being.
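
A minimal sketch of the idea, assuming a load is identified only by its intrinsic kind and a constant index: the first load gets a fresh SSA id, and every later load with the same key reuses it. The struct and function names are hypothetical, and the linear search stands in for whatever lookup the real pass uses.

```c
#include <assert.h>

#define SKETCH_MAX_SEEN 64

struct sketch_load { int intrinsic; int index; int ssa_id; };

struct sketch_cse {
   struct sketch_load seen[SKETCH_MAX_SEEN];
   int count;     /* number of distinct loads recorded so far */
   int next_ssa;  /* next unused SSA id */
};

/* Returns the SSA id for a read-only load, reusing an earlier identical one. */
int sketch_cse_load(struct sketch_cse *t, int intrinsic, int index)
{
   for (int i = 0; i < t->count; i++) {
      if (t->seen[i].intrinsic == intrinsic && t->seen[i].index == index)
         return t->seen[i].ssa_id;
   }
   assert(t->count < SKETCH_MAX_SEEN);
   int id = t->next_ssa++;
   t->seen[t->count].intrinsic = intrinsic;
   t->seen[t->count].index = index;
   t->seen[t->count].ssa_id = id;
   t->count++;
   return id;
}
```

Once repeated loads share one SSA id, ALU operations on them have identical operands and become CSE candidates themselves.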

v2: Assert that info->num_variables == 0 (requested by Jason).

total NIR instructions in shared programs: 2435936 -> 2023511 (-16.93%)
NIR instructions in affected programs:     2413496 -> 2001071 (-17.09%)
helped:                                    16872

total i965 instructions in shared programs: 6028987 -> 6008427 (-0.34%)
i965 instructions in affected programs:     640654 -> 620094 (-3.21%)
helped:                                     2071
HURT:                                       585
GAINED:                                     14
LOST:                                       25

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-23 14:53:26 -08:00
Kenneth Graunke
cbdd623f13 nir: Pull nir_instr_can_cse()'s SSA checks out of the switch.
This should not be a change in behavior, as all current cases that
potentially answer "yes" require SSA.

The next patch will introduce another case that requires SSA.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-23 14:53:26 -08:00
Kenneth Graunke
d7743bb1c2 i965/nir: Report NIR instruction counts (in SSA form) via KHR_debug.
This allows us to count NIR instructions via shader-db.

Use "run" as normal.  The results file will contain both NIR and
assembly.

Then, to generate a NIR report:
./report.py <(grep    NIR results/foo) <(grep    NIR results/bar)

Or, to generate an i965 report:
./report.py <(grep -v NIR results/foo) <(grep -v NIR results/bar)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-23 14:53:26 -08:00
Kenneth Graunke
f3e06fcc6a i965/nir: Print NIR on INTEL_DEBUG=fs.
This is useful for debugging and looking for optimization opportunities.

It will need to be expanded when we add support for other scalar stages.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-23 14:53:26 -08:00
Kenneth Graunke
faa38e16aa i965/nir: Do optimizations again just before lowering source mods.
We want to run CSE and algebraic optimizations again after lowering IO.
Some of the passes in the optimization loop don't handle saturates and
other modifiers, so run it before lowering to source modifiers.

total instructions in shared programs: 6046190 -> 6045768 (-0.01%)
instructions in affected programs:     22406 -> 21984 (-1.88%)
helped:                                47
HURT:                                  0
GAINED:                                0
LOST:                                  0

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-23 14:53:25 -08:00
Matt Turner
9b5efac461 loader: Remove NEED_OPENGL_COMMON check.
HAVE_DRICOMMON is sufficient since OpenGL must be enabled for DRI.
2015-01-23 14:28:44 -08:00
Matt Turner
2e7b62cbb9 gitignore: Ignore .tar.xz files. 2015-01-23 14:28:44 -08:00
Matt Turner
dd6f641303 mesa: Build with subdir-objects. 2015-01-23 14:28:44 -08:00
Matt Turner
145919b2ab glsl: Build a libglsl_util library.
Rather than sourcing files with ../dir/file.c which leads to distclean
wiping out ../dir's .deps directory.
2015-01-23 14:28:44 -08:00
Matt Turner
a37ae2ab92 mapi: Build with subdir-objects. 2015-01-23 14:28:44 -08:00
Matt Turner
961def1074 mapi: Remove vgapi from SUBDIRS.
OpenVG is disabled via autotools.
2015-01-23 14:28:44 -08:00
Matt Turner
ce98519266 mesa: Drop inclusion of glapi_gen.mk.
Some glapi headers used to be generated from this Makefile.am, but no
longer.
2015-01-23 14:28:43 -08:00
Matt Turner
618c3b35f1 glsl: Build with subdir-objects.
Apparently $(top_srcdir) is not expanded in a source list when using
subdir-objects, so remove that. It's not clear to me why we were going
to such lengths to prefix each source file anyway.
2015-01-23 14:28:42 -08:00
Matt Turner
a8b880bd63 nir: Add headers to distribution. 2015-01-23 14:27:39 -08:00
Matt Turner
ae494281a4 nir: Add nir_{opt_,}algebraic.py to distribution. 2015-01-23 14:26:53 -08:00
Matt Turner
4db329ddff mesa: Add format_{un,}pack.py to distribution. 2015-01-23 14:26:53 -08:00
Matt Turner
195488e945 mesa: Remove pack_tmp.h from sources.
Missed in commit 3a4de321.
2015-01-23 13:35:25 -08:00
Connor Abbott
68a9d0b36f nir: add generated file to .gitignore
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-23 10:20:46 -08:00
Ville Syrjälä
f4b31d29d7 i965: Fix min_vs_entries for CHV
According to BSpec the correct number for min_vs_entries is 34 for CHV.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2015-01-23 12:09:41 +02:00
Ville Syrjälä
99754446ab i965: Fix max_wm_threads for CHV
Change max_wm_threads to match the spec on CHV. The max number of
threads in 3DSTATE_PS is always programmed to 64 and the hardware
internally scales that depending on the GT SKU. So this doesn't
change the max number of threads actually used, but it does affect
the scratch space calculation.

On CHV the old value was too small, so the amount of scratch space
allocated wasn't sufficient to satisfy the actual max number of
threads used.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2015-01-23 12:09:35 +02:00
Connor Abbott
c8761c8559 glsl: fix stale comment
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-23 00:23:51 -05:00
Jason Ekstrand
6be2434031 i965/emit: Assert that src1 is not an MRF after doing the MRF->GRF conversion
When emitting texturing from indirect texture units, we need to be able to
scratch around in the header message.  Since we only do this for >= HSW,
this is ok since there are no MRFs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj phogat <anuj.phogat@gmail.com>
2015-01-22 16:00:34 -08:00
Jason Ekstrand
7de8a3e13e i965/emit: Do the sampler index adjustment directly in header.0.3
Prior to this commit, the adjust_sampler_state_pointer function took an
extra register that it could use as scratch space.  The usual candidate was
the destination of the sampler instruction.  However, if that register ever
aliased anything important such as the sampler index, this would scratch
all over important data.  Fortunately, the calculation is such that we can
just do it in place and we don't need the scratch space at all.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-01-22 15:19:13 -08:00
Axel Davy
8751734613 st/nine: Correctly handle when ff vs should have no texture coord input/output
Previous code semantic was:

. if ff ps will not run a ff stage, then do not output texture coords for this stage
for vs
. if XYZRHW is used (position_t), use only the mode where input coordinates are copied
to the outputs.

The problem is when apps don't give texture inputs. When apps specify PASSTHRU, it means
copy texture coord input to texture coord output if there is such input. The case
where there is no texture coord input wasn't handled correctly.

Drivers like r300 dislike when vs has inputs that are not fed.

Moreover if the app uses ff vs with a programmable ps, we shouldn't look at
what are the parameters of the ff ps to decide to output or not texture
coordinates.

The new code semantic is:

. if XYZRHW is used, restrict to PASSTHRU
. if PASSTHRU is used and no texture input is declared, then do not output
texture coords for this stage

The case where ff ps needs a texture coord input and ff vs doesn't output
it is not handled, and should probably be a runtime error.

This fixes 3Dmark05, which uses ff vs with programmable ps.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-01-22 22:16:24 +00:00
Axel Davy
77fcff37cf st/nine: Change comment relating to vertex shader inputs not matching declaration
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-01-22 22:16:24 +00:00
Axel Davy
f8a74410f1 st/nine: Allocate vs constbuf buffer for indirect addressing once.
When the shader does indirect addressing on the constants,
we allocate a temporary constant buffer to which we copy
the constants from the app given user constants and
the constants filled in the shader.

This patch makes this buffer be allocated once.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:24 +00:00
Axel Davy
e0f75044c8 st/nine: Allocate the correct size for the user constant buffer
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:24 +00:00
Axel Davy
b9cbea9dbc st/nine: Add variables containing the size of the constant buffers
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:24 +00:00
Axel Davy
a721987077 st/nine: Fix sm3 relative addressing for non-debug build
Relative addressing needs the constant buffer to get all
the correct constants, even those defined by the shader.

The code to copy the shader constants to the constant buffer
was enabled only for debug build. Enable it always.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-01-22 22:16:23 +00:00
Axel Davy
4b7a9cfddb st/nine: Remove unused code for ps
Since constant indirect addressing is not allowed for ps,
we can remove our code to handle that.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:23 +00:00
Axel Davy
9690bf33d7 st/nine: Correct rules for relative addressing and constants.
Relative addressing for constants is possible only for vs float
constants.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:23 +00:00
Axel Davy
bce94ce831 st/nine: Implement TEXREG2AR, TEXREG2GB and TEXREG2RGB
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:23 +00:00
Axel Davy
9e23b64c15 st/nine: Implement TEXDP3TEX
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:23 +00:00
Axel Davy
09eb1e901f st/nine: Implement TEXDP3
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:23 +00:00
Axel Davy
f19e699368 st/nine: Implement TEXDEPTH
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:23 +00:00
Axel Davy
3676ab02fb st/nine: Implement TEXM3x3SPEC
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:22 +00:00
Axel Davy
2b9f079ae3 st/nine: Implement TEXM3x2TEX
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:22 +00:00
Axel Davy
fdff111dc8 st/nine: implement TEXM3x2DEPTH
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:22 +00:00
Axel Davy
7865210670 st/nine: Fix TEXM3x3 and implement TEXM3x3VSPEC
The fix is that this line:
"src[s] = tx->regs.vT[s];" is wrong if s doesn't start from 0.
Instead access tx->regs.vT directly when needed.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:22 +00:00
Axel Davy
b1259544e3 st/nine: Fill missing dst and src number for some instructions.
Not filling them correctly results in bad padding and later crash.

Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:22 +00:00
Axel Davy
5399119fb1 st/nine: Implement TEXCOORD special behaviours
texcoord for ps < 1_4 should clamp the values between 0 and 1.

texcrd (texcoord ps 1_4) does not clamp and can be used with
two modifiers, _dw and _dz, which mean the channels are divided
by w or z.
Implement those in shared code, since the same modifiers can be used
for texld ps 1_4.

v2: replace DIV by RCP + MUL
v3: Remove a useless MOV

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:22 +00:00
Axel Davy
30704bbc6e st/nine: Fix CALLNZ implementation
Nothing seems to indicate the negation modifier would be stored in the
instruction flags instead of the source modifier. tx_src_param has
already handled it if it is in the source modifier.

In addition,
when the card supports native integers, booleans
are stored as 32-bit ints and are equal to
0 or 0xFFFFFFFF.

Given 0xFFFFFFFF is NaN if it was a float, better use
UIF than IF.
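
The NaN claim can be checked with a short C sketch, assuming float is IEEE-754 binary32. The helper names are hypothetical; the integer test only stands in for the spirit of UIF and is not TGSI code.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterprets a 32-bit pattern as a float (assumes IEEE-754 binary32). */
float sketch_bits_as_float(uint32_t bits)
{
   float f;
   assert(sizeof f == sizeof bits);
   memcpy(&f, &bits, sizeof f);
   return f;
}

/* Integer test, in the spirit of UIF: true for any non-zero bit pattern. */
int sketch_uint_true(uint32_t bits)
{
   return bits != 0u;
}
```

Here isnan(sketch_bits_as_float(0xFFFFFFFFu)) holds, while the integer test treats 0xFFFFFFFF as plain true without going through float comparison rules.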

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-01-22 22:16:22 +00:00
Axel Davy
6378d74937 st/nine: Fix some fixed function pipeline operation
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:21 +00:00
Axel Davy
018407b5d8 st/nine: Clamp ps 1.X constants
This is wine (and windows) behaviour.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:21 +00:00
Axel Davy
8bbc5e2781 st/nine: Remove duplicated code for ps texcoord input declaration
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-01-22 22:16:21 +00:00
Axel Davy
3ca67f8810 st/nine: Fix CND implementation
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:21 +00:00
Axel Davy
dd055176cc st/nine: Match REP implementation to LOOP
The previous implementation behaved fine, but this improves it by:
. Improving the documentation
. Decreasing the counter (comparing to 0 is likely to be faster than comparing to a constant)
. Moving the counter update to the end, for better performance in shaders that
break out of the loop before the count is done.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-01-22 22:16:21 +00:00
Axel Davy
6a8e5e48be st/nine: Rewrite LOOP implementation, and a0 aL handling
Previous implementation didn't work well with nested loops.

Instead of using several address registers, put a0 and aL
into normal registers, and copy them to one address register when
we need to use them.

Wine tests loop_index_test() and nested_loop_test() now pass correctly.

Fixes r600g crash while loading Bioshock -
bug https://bugs.freedesktop.org/show_bug.cgi?id=85696

Tested-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:21 +00:00
Axel Davy
c9aa9a0add st/nine: Correct LOG on negative values
We should take the absolute value of the input.

Also return -FLT_MAX instead of -Inf for an input of 0.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:20 +00:00
Axel Davy
f5e8e3fb80 st/nine: Handle NRM with input of null norm
When the input's xyz are 0.0, the output
should be 0.0. This is due to the fact that
Inf * 0 = 0 for dx9. To handle this case,
cap the result of RSQ to FLT_MAX. We have
FLT_MAX * 0 = 0.
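
A small C sketch of why the cap matters, under the assumption that the RSQ result arrives as an ordinary float; `cap_rsq`, `nrm_component` and `nrm_component_uncapped` are hypothetical names used only for this illustration. Under IEEE arithmetic Inf * 0 is NaN, while FLT_MAX * 0 is 0.

```c
#include <float.h>
#include <math.h>   /* INFINITY, isnan */

/* Cap an RSQ result to FLT_MAX, as the commit message describes. */
static float cap_rsq(float rsq)
{
    return rsq > FLT_MAX ? FLT_MAX : rsq;
}

/* Scale one vector component by the capped reciprocal square root. */
static float nrm_component(float component, float rsq)
{
    return component * cap_rsq(rsq);
}

/* What plain IEEE arithmetic gives without the cap. */
static float nrm_component_uncapped(float component, float rsq)
{
    return component * rsq;
}
```

For a null vector the RSQ result is infinite and every component is 0, so the capped version yields 0 where the uncapped one yields NaN.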

Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:20 +00:00
Axel Davy
2487f73574 st/nine: Handle RSQ special cases
We should use the absolute value of the input as input to ureg_RSQ.

Moreover, an input of 0.0 should return FLT_MAX.

Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:20 +00:00
Axel Davy
c12f8c2088 st/nine: Fix POW implementation
POW doesn't match directly TGSI, since we should
take the absolute value of src0.

Fixes black textures in some games

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:20 +00:00
Axel Davy
e0dd9ca985 st/nine: Fix typo for M4x4
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-01-22 22:16:20 +00:00
Axel Davy
53dc992f20 st/nine: Correctly declare NineTranslateInstruction_Mkxn inputs
Let's say we have c1 and c2 declared in the shader and c0 given by the app

Then here we would have read c0, c1 and c2 given by the app, instead
of the correct c0, c1, c2.

This correction fixes several issues in some games.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:20 +00:00
Axel Davy
9fb58a74a0 st/nine: Saturate oFog and oPts vs outputs
According to docs and Wine, these two vs outputs have
to be saturated.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:20 +00:00
Axel Davy
a214838181 st/nine: Remove some shader unused code
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:19 +00:00
Axel Davy
d08c7b0b88 st/nine: Convert integer constants to floats before storing them when cards don't support integers
The shader code is already behaving as if they are floats when the card doesn't support integers

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:19 +00:00
Axel Davy
d9d18fe39f st/nine: Rework of boolean constants
Convert them to shader booleans at earlier stage.
Previous code is fine, but later patch will make
integers being converted at earlier stage, so do
the same for booleans

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:19 +00:00
Axel Davy
77f0ecf9ce st/nine: Add ATI1 and ATI2 support
Adds ATI1 and ATI2 support to nine.

They map to PIPE_FORMAT_RGTC1_UNORM and PIPE_FORMAT_RGTC2_UNORM,
but need special handling.

Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:19 +00:00
Axel Davy
b0b5430322 st/nine: Check if srgb format is supported before trying to use it.
According to MSDN, we must act as if the user didn't ask for srgb if we don't
support it.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:19 +00:00
Stanislaw Halik
82810d3b66 st/nine: Hack to generate resource if it doesn't exist when getting view
Buffers in the MANAGED pool are supposed to have the content in a ram buffer,
a copy in VRAM if there is enough memory (driver manages memory and decide when
to delete the buffer in VRAM).

This is not implemented properly in nine, and a VRAM copy is going to be created
when the RAM memory is filled, and the VRAM copy will get synced with the RAM
memory updates.

Due to some issues (in the implementation or in app logic), it can happen
we try to create a sampler view of the resource while we haven't created the
VRAM resource. This hack creates the resource when we hit this case, which prevents
crashing, but doesn't help with the resource content.

This fixes several games crashing at launch.

Acked-by: Axel Davy <axel.davy@ens.fr>
Acked-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Stanislaw Halik <sthalik@misaki.pl>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:18 +00:00
Axel Davy
47280d777d st/nine: NineBaseTexture9: update sampler view creation
While previous code was having the correct behaviour in general,
this new code is more readable (without checking all gallium formats
manually) and has a more defined behaviour for depth stencil resources.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:18 +00:00
Axel Davy
0abfb80dac st/nine: Return D3DERR_INVALIDCALL when trying to create a texture of bad format
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-01-22 22:16:18 +00:00
Axel Davy
0d2c22e648 st/nine: Fix crash when deleting non-implicit swapchain
The implicit swapchains are destroyed when the device instance is
destroyed. However for non-implicit swapchains, it is not the case,
and the application can have kept a reference on the swapchain
buffers to reuse them.

Fixes problems with battle.net launcher.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-01-22 22:16:18 +00:00
Axel Davy
9232161178 st/nine: CubeTexture: fix GetLevelDesc
This->surfaces contains the surfaces associated to the levels
and faces. This->surfaces[6*Level] is what we want here,
since it gives us a face descriptor for the level 'Level'.
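
The indexing can be sketched as follows, assuming the surfaces are stored in a flat array with the six faces of each level next to each other; the helper names are hypothetical.

```c
#include <stddef.h>

enum { CUBE_FACE_COUNT = 6 };

/* Index into a flat array that stores six faces per mip level. */
static size_t cube_surface_index(unsigned level, unsigned face)
{
    return (size_t)CUBE_FACE_COUNT * level + face;
}

/* All faces of one level share a size, so the first face is enough for a
 * level description. */
static size_t cube_level_desc_index(unsigned level)
{
    return cube_surface_index(level, 0);
}
```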

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:18 +00:00
Axel Davy
18c7e70226 st/nine: NineBaseTexture9: fix setting of last_layer
Use settings similar to u_sampler_view_default_template

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:18 +00:00
Axel Davy
05e20e1045 st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS
The cap means D3DFVF_XYZRHW vertices will see clipping.
This is not the case when
PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION is supported, since
it'll disable clipping.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:18 +00:00
Xavier Bouchoux
dc88989189 st/nine: Fix D3DRS_POINTSPRITE support
It's done by testing the existence of the point sprite output register *after* parsing the vertex shader.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:17 +00:00
Axel Davy
d2f2a550cf st/nine: Add new texture format strings
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:17 +00:00
Xavier Bouchoux
072e2ba8e1 st/nine: Add missing c++ declaration for IDirect3DVolumeTexture9
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:17 +00:00
Xavier Bouchoux
8bb550b958 st/nine: Additional defines to d3dtypes.h
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-22 22:16:17 +00:00
Axel Davy
3bc75fcf22 st/nine: Fix clip state logic
The clip state was reset every time, incurring an overhead.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-01-22 22:16:17 +00:00
David Heidelberger
23fae79735 st/nine: query: remove unused variable (trivial)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: David Heidelberg <david@ixit.cz>
2015-01-22 22:16:16 +00:00
Eric Anholt
fc6938d23e nir: Fix setup of constant bool initializers.
brw_fs_nir has only seen scalar bools so far, thanks to vector splitting,
and the ralloc in glsl_to_nir.cpp will *usually* get you a 0-filled
chunk of memory, so reading too large of a value will usually get you the
right bool value.  But once we start doing vector bools in a few commits,
we end up getting bad values.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-22 13:52:19 -08:00
Eric Anholt
534a4ec82f nir: Make an easier helper for setting up SSA defs.
Almost all instructions we nir_ssa_def_init() for are nir_dests, and you
have to keep from forgetting to set is_ssa when you do.  Just provide the
simpler helper, instead.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-22 13:52:19 -08:00
Jonathan Gray
c5be9c126d glsl: Link glsl_test with pthreads library.
Otherwise pthread_mutex_lock will be an undefined reference
on OpenBSD.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88219
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
2015-01-22 21:29:43 +00:00
Vinson Lee
9db7b12cb2 scons: Add X11 include path if X11 is available.
Mac OS X XQuartz places X11 headers at /opt/X11/include.

This patch fixes this Mac OS X SCons build error.

  Compiling src/gallium/state_trackers/glx/xlib/glx_api.c ...
In file included from src/gallium/state_trackers/glx/xlib/glx_api.c:34:
include/GL/glx.h:30:10: fatal error: 'X11/Xlib.h' file not found
         ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-22 21:29:43 +00:00
José Fonseca
fea35bbf6d meta: Move loop declaration to top of block.
Fixes MSVC build.

Trivial.
2015-01-22 20:06:17 +00:00
Jason Ekstrand
d5d4ba9139 i965/tex_subimage: use meta instead of the blitter for PBO TexSubImage
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-01-22 10:37:13 -08:00
Jason Ekstrand
779923194c i965/tex_image: Use meta instead of the blitter for PBO TexImage and GetTexImage
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-01-22 10:37:09 -08:00
Jason Ekstrand
ef0499af25 i965/pixel_read: Use meta_pbo_GetTexSubImage for PBO ReadPixels
Since the meta path can do strictly more than the blitter path, we just
remove the blitter path entirely.

Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-01-22 10:36:25 -08:00
Jason Ekstrand
8546fe900c meta: Add an implementation of GetTexSubImage for PBOs
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-01-22 10:36:24 -08:00
Jason Ekstrand
7f396189f0 meta: Add a BlitFramebuffers-based implementation of TexSubImage
This meta path, designed for use with PBO's, creates a temporary texture
out of the PBO and uses BlitFramebuffers to do the actual texture upload.

v2 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Add support for handling simple packing options

v3 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Refactor to split out the texture-from-pbo code
 - Rename to _mesa_meta_pbo_TexSubImage

Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-01-22 10:36:24 -08:00
Jason Ekstrand
e24d17e08c formats: Use a hash table for _mesa_format_from_array_format
Going through the for loop every time has noticeable overhead.  This fixes
things up so we only do that once ever and then just do a hash table lookup
which should be much cheaper.

v2 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Use once_flag and call_once from c11/threads.h instead of pthreads
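
The general idea (scan the format list once, then answer later queries from a hash table) can be sketched in C as below. Everything here is hypothetical: the pair values are made up, the table is a tiny open-addressing one, and a plain `bool` stands in for the once_flag/call_once guard the commit mentions, so this sketch is not thread-safe.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct format_pair {
    uint32_t array_format;
    int format;                 /* 0 is reserved for "no format" / empty slot */
};

/* Made-up pairs standing in for the real format list. */
static const struct format_pair known_formats[] = {
    { 0x1001u, 1 }, { 0x1002u, 2 }, { 0x2044u, 3 }, { 0x30a8u, 4 },
};

#define TABLE_SIZE 16u          /* power of two, larger than the pair count */

static struct format_pair table[TABLE_SIZE];
static bool table_built;        /* stand-in for a once_flag */

static unsigned first_slot(uint32_t key)
{
    /* Multiplicative hash; the top 4 bits select one of 16 slots. */
    return (unsigned)((key * UINT32_C(2654435761)) >> 28);
}

/* The only full pass over the list: fill the table with linear probing. */
static void build_table(void)
{
    for (size_t i = 0; i < sizeof(known_formats) / sizeof(known_formats[0]); i++) {
        unsigned slot = first_slot(known_formats[i].array_format);
        while (table[slot].format != 0)
            slot = (slot + 1u) & (TABLE_SIZE - 1u);
        table[slot] = known_formats[i];
    }
    table_built = true;
}

/* Later lookups probe a few slots instead of walking the whole list. */
static int format_from_array_format(uint32_t array_format)
{
    if (!table_built)
        build_table();

    for (unsigned slot = first_slot(array_format); table[slot].format != 0;
         slot = (slot + 1u) & (TABLE_SIZE - 1u)) {
        if (table[slot].array_format == array_format)
            return table[slot].format;
    }
    return 0;
}
```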

Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-01-22 10:35:43 -08:00
Jason Ekstrand
333226522c i965: Implement SetTextureStorageForBufferObject
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-01-22 10:21:07 -08:00
Jason Ekstrand
117a1d69de i965: Apply the miptree offset to surface state for renderbuffers
Previously, we were completely ignoring the mt->offset field for
renderbuffers.  While it does have some alignment constraints, it is valid
to use it.  This patch adds the code to each of the 4 surface state setup
functions to handle it.

Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-01-22 10:21:07 -08:00
Jason Ekstrand
404660e3c7 i965/mipmap_tree: Add a depth parameter to create_for_bo
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-01-22 10:21:07 -08:00
Jason Ekstrand
3298b1235a mesa/dd: Add a function for creating a texture from a buffer object
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2015-01-22 10:21:07 -08:00
Tapani Pälli
adc8cdfa35 glsl: do not allow interface block to have name already taken
Fixes currently failing Piglit case
   interface-blocks-name-reused-globally.vert

v2: combine var declaration with assignment (Ian)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-22 07:54:19 +02:00
Matt Turner
28b7c6b285 nir: Replace assert(0) with unreachable().
Fixes a couple of warnings in the process.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-21 21:06:37 -08:00
Matt Turner
6de077f01d i965/vec4: Fix fprintf argument ordering.
Introduced in commit 3167a80b.
2015-01-21 20:17:26 -08:00
Jason Ekstrand
f88c6a4997 nir: Stop using designated initializers
Designated initializers with anonymous unions don't work in MSVC or
GCC < 4.6.  With a couple of constructor methods, we don't need them any
more and the code is actually cleaner.
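
A sketch of the constructor-function pattern, using a hypothetical `sketch_src` struct rather than the real NIR types. The functions assign members one by one, so no designated initializer has to reach into the anonymous union.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical source operand with an anonymous union, loosely modelled on
 * the situation the commit describes. */
struct sketch_src {
    union {
        struct { int index; } reg;
        struct { const void *def; } ssa;
    };
    bool is_ssa;
};

/* Constructor for a register source. */
static struct sketch_src sketch_src_for_reg(int index)
{
    struct sketch_src src;

    memset(&src, 0, sizeof(src));
    src.is_ssa = false;
    src.reg.index = index;
    return src;
}

/* Constructor for an SSA source. */
static struct sketch_src sketch_src_for_ssa(const void *def)
{
    struct sketch_src src;

    memset(&src, 0, sizeof(src));
    src.is_ssa = true;
    src.ssa.def = def;
    return src;
}
```

Callers then write `sketch_src_for_reg(3)` where they would otherwise have needed an initializer of the form `{ .reg.index = 3 }`.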

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88467
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-21 19:55:02 -08:00
Tobias Klausmann
76086d7120 mesa: change assert to unreachable in two format functions
This fixes two problems reported by osc:
I: Program returns random data in a function
E: Mesa no-return-in-nonvoid-function ../../src/mesa/main/format_utils.c:180
E: Mesa no-return-in-nonvoid-function ../../src/mesa/main/glformats.c:2714
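
The kind of function that triggers this report, and the shape of the fix, can be sketched like this. `SKETCH_UNREACHABLE` and `channel_bytes` are hypothetical names for illustration; the sketch assumes a GCC-compatible compiler for `__builtin_unreachable()` and falls back to `abort()` elsewhere.

```c
#include <assert.h>
#include <stdlib.h>

/* Assert in debug builds, and tell the compiler that control never gets here. */
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNREACHABLE(msg) \
    do { assert(!msg); __builtin_unreachable(); } while (0)
#else
#define SKETCH_UNREACHABLE(msg) \
    do { assert(!msg); abort(); } while (0)
#endif

enum channel_type { TYPE_UBYTE, TYPE_USHORT, TYPE_FLOAT };

static int channel_bytes(enum channel_type type)
{
    switch (type) {
    case TYPE_UBYTE:  return 1;
    case TYPE_USHORT: return 2;
    case TYPE_FLOAT:  return 4;
    }

    /* A bare assert(0) compiles to nothing with NDEBUG, which leaves a path
     * that falls off the end of a non-void function. */
    SKETCH_UNREACHABLE("invalid channel type");
}
```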

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
2015-01-21 13:17:27 -08:00
Jason Ekstrand
7da60eca4f nir: Add src and dest constructors
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-21 12:21:10 -08:00
Jan Vesely
3c3e60e050 mesa: Add assert to check number of vector elements
The below code crashes when vector_elements <= 0
Fixes Warray-bounds warnings

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-01-21 14:06:02 +00:00
Jan Vesely
3cb10cce37 mesa: Fix some signed-unsigned comparison warnings
v2: s/unsigned int/unsigned/ in prog_optimize.c

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-01-21 14:05:52 +00:00
Jan Vesely
da1f92779d mesa: remove comparisons that are always true
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-01-21 14:05:04 +00:00
Jason Ekstrand
194f6235b3 nir: Add a nir_foreach_phi_src helper macro
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-20 16:53:29 -08:00
Ben Widawsky
169d7e5cb1 i965: Extract scalar region checking logic
There are currently 2 users of this functionality. I have 2 more users coming
up, and having a simple function makes the results much cleaner. The existing
interface semantics was proposed by Matt.

v2 (Ken): Rename to region_matches()/has_scalar_region().

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-20 15:24:40 -08:00
Ben Widawsky
9394f58383 i965: Add QWORD sizes to type_sz macro
GEN8 added the QWORD as a valid type for certain operations on the EU.
In order to calculate the number of registers used one must have the type
size as part of the equation. Quoting the formula in the code:

   regs_written = (dst.width * dst.stride * type_sz(dst.type) + 31) / 32;
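
To see why the 8-byte size matters, here is a worked version of that formula in C. The enum and helper names are hypothetical; the divisor of 32 is taken straight from the formula above and rounds the byte count up to whole registers.

```c
/* Hypothetical type tags; only their sizes in bytes matter here. */
enum sketch_type { SKETCH_BYTE, SKETCH_WORD, SKETCH_DWORD, SKETCH_QWORD };

static unsigned sketch_type_sz(enum sketch_type type)
{
    switch (type) {
    case SKETCH_BYTE:  return 1;
    case SKETCH_WORD:  return 2;
    case SKETCH_DWORD: return 4;
    case SKETCH_QWORD: return 8;
    }
    return 0;
}

/* Same arithmetic as the quoted formula: bytes written, rounded up to 32. */
static unsigned sketch_regs_written(unsigned width, unsigned stride,
                                    enum sketch_type type)
{
    return (width * stride * sketch_type_sz(type) + 31) / 32;
}
```

With width 8 and stride 1, a DWORD destination covers 32 bytes (one register) while a QWORD destination covers 64 bytes (two registers).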

Adding this separately for bisection since there is no simple way to add
an assert in the type_sz function.

NOTE: As a side note, I was confused for a while because it's impossible
to calculate the region, ie. registers needed, without vstride.  However,
at this point these are all part of the IR, and so no vstride must exist.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-20 15:24:40 -08:00
Eric Anholt
b368c91f26 vc4: Fix build since 8ed5305d28 2015-01-20 14:19:29 -08:00
Rob Clark
fd6e18d651 freedreno/a4xx: sysmem bypass
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-01-20 13:27:28 -05:00
Rob Clark
5da3bec44b freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-01-20 13:27:19 -05:00
Tom Stellard
17a2f11a06 radeonsi: Re-enable LLVM IR dumps
This was inadvertently disabled by
761e36b4ca.
2015-01-20 09:55:44 -05:00
Tom Stellard
73bc0fdb6f radeonsi/compute: Use relocs for scratch pointer rather than user sgprs v2
Instead of passing a pointer to the scratch buffer via user sgprs, we
now patch the shader with the buffer address using reloc information
from the LLVM generated ELF.

v2:
  - Make sure not to break older LLVM.
2015-01-20 09:55:44 -05:00
Tom Stellard
dfdaf3eb7e radeon: Teach radeon_elf_read() how to parse reloc information v3
v2:
  - Use strdup for copying reloc names.
  - Free reloc memory.

v3:
  - Add free_relocs parameter to radeon_shader_binary_free_members()
2015-01-20 09:55:43 -05:00
Tom Stellard
5667aa58c4 radeon: Add a helper function for freeing members of radeon_shader_binary 2015-01-20 09:55:43 -05:00
Kenneth Graunke
c4fd0c9052 i965: Work around mysterious Gen4 GPU hangs with minimal state changes.
Gen4 hardware appears to GPU hang frequently when using Chromium, and
also when running 'glmark2 -b ideas'.  Most of the error states contain
3DPRIMITIVE commands in quick succession, with very few state packets
between them - usually VERTEX_BUFFERS/ELEMENTS and CONSTANT_BUFFER.

I trimmed an apitrace of the glmark2 hang down to two draw calls with a
glUniformMatrix4fv call between the two.  Either draw by itself works
fine, but together, they hang the GPU.  Removing the glUniform call
makes the hangs disappear.  In the hardware state, this translates to
removing the CONSTANT_BUFFER packet between the two 3DPRIMITIVE packets.

Flushing before emitting CONSTANT_BUFFER packets also appears to make
the hangs disappear.  I observed a slowdown in glxgears by doing it all
the time, so I've chosen to only do it when BRW_NEW_BATCH and
BRW_NEW_PSP are unset (i.e. we haven't done a CS_URB_STATE change or
already flushed the whole pipeline).

I'd much rather understand the problem, but at this point, I don't see
how we'd ever be able to track it down further.  We have no real tools,
and the hardware people moved on years ago.  I've analyzed 20+ error
states and read every scrap of documentation I could find.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80568
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85367
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
2015-01-19 13:13:51 -08:00
Kenneth Graunke
a5ca86a983 i965/nir: Enable SIMD16 support in the NIR FS backend.
With the previous commits in place, it just works.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-19 13:13:50 -08:00
Kenneth Graunke
45123ee818 i965/nir: Use offset() instead of altering reg_offset directly.
offset() properly handles reg_width, so it'll work for SIMD16.

While we're in the area, simplify a few cases, and use retype() to cut a
few more lines of code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-19 13:13:48 -08:00
Kenneth Graunke
3f263ffbb3 i965/nir: Replace fs_reg(GRF, virtual_grf_alloc(...)) with vgrf(...).
brw_fs_nir.cpp creates almost all of its registers via:

   fs_reg reg = fs_reg(GRF, virtual_grf_alloc(num_components));

When we add SIMD16 support, we'll need to set reg->width = 16 and
double the VGRF size...on pretty much every VGRF it allocates.

This patch replaces that pattern with a new "vgrf" helper method:

   fs_reg reg = vgrf(num_components);

The new function correctly takes reg_width into account.  For now,
reg_width is always 1, so this should have no functional change.

v2: Just make vgrf() account for reg_width right away, rather than
    changing the behavior in the next patch.

v3: Replace one last virtual_grf_alloc I missed.  It's used in code
    that only runs for dispatch_width == 8, so it doesn't matter,
    but consistency is nice.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-19 13:13:46 -08:00
Kenneth Graunke
d1533d87cc i965: Replace fs_reg(fs_visitor, type) with fs_visitor::vgrf(type).
I dislike how fs_reg has a constructor that knows about fs_visitor.
Apart from that, it stands alone, with no need to interact with the
rest of the compiler.  Which is sensible - a class that represents
a register should do just that.  Allocating virtual register numbers
should be left up to the compiler (fs_visitor).

This patch replaces the constructor with a new fs_visitor::vgrf method,
eliminating fs_reg's dependency on fs_visitor.  It ends up being no
more code.

v2: Rebase from May 2014 -> January 2015.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-19 13:13:34 -08:00
Marek Olšák
5b01512df3 st/mesa: don't set vs.key.clamp_color if a shader doesn't write any colors
And update some comments.
2015-01-19 20:15:27 +01:00
Marek Olšák
ccc5b60b06 winsys/radeon: increase the size of buffer cache
This should fix this performance regression:
https://bugs.freedesktop.org/show_bug.cgi?id=88227

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-19 20:15:27 +01:00
Carl Worth
3b8ccca8a3 Rename sha1.c and sha1.h to mesa-sha1.c and mesa-sha1.h
The filename of sha1.h was conflicting with the system-provided
sha1.h, (and in some configurations, our sha1.c was unsuccessfully
attempting to include "sha1.h" and <sha1.h> as two different files).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88523
2015-01-19 10:53:07 -08:00
Martin Peres
7a182d2335 mesa: fix a trivial spelling mistake
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-19 01:23:07 -08:00
Tapani Pälli
d74a817b86 mesa: support GL_RGB for GL_EXT_texture_type_2_10_10_10_REV
Commit 8ec6534 changed texture upload path and the way how texture
format is being checked, this commit adds support for GL_RGB with
GL_UNSIGNED_INT_2_10_10_10_REV as specified by the extension
EXT_texture_type_2_10_10_10_REV specification.

This fixes regression in ES3 conformance test
   ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels

v2: add MESA_FORMAT_R10G10B10X2_UNORM format (Iago Toral)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88385
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-01-19 08:11:45 +02:00
Micah Fedke
d36fa60191 mesa: Add ARB_shader_precision infrastructure
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-01-19 16:33:21 +13:00
Kenneth Graunke
461103ef64 i965/fs: Fix the dummy fragment shader.
We hit an assertion that the destination of the FB write should not be
an immediate.  (I don't know what we were thinking.)  Use ARF null.

Trying to substitute real shaders with the dummy shader would crash
when trying to upload non-existent uniforms.  Say there are none.

It also wouldn't generate any code because we didn't compute the CFG,
and code generation now requires it.  Compute it.

Gen4-5 also require a message header to be present.

On Gen6+, there were assertion failures in SF/SBE state because
urb_setup was memset to 0 instead of -1, causing it to think there were
attributes when nothing was set up right.  Set to no attributes.
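
The difference between the two fill values can be shown with a small C sketch. It assumes, for illustration only, that the table is an array of ints in which a negative entry means "no attribute"; the names and the array length are hypothetical.

```c
#include <string.h>

enum { SKETCH_VARYING_SLOTS = 8 };

/* Count entries that look like real attributes (slot index >= 0). */
static int count_attributes(const int urb_setup[SKETCH_VARYING_SLOTS])
{
    int count = 0;

    for (int i = 0; i < SKETCH_VARYING_SLOTS; i++) {
        if (urb_setup[i] >= 0)
            count++;
    }
    return count;
}

/* Fill the table bytewise, as memset does, and report what it looks like. */
static int attributes_after_fill(int fill_byte)
{
    int urb_setup[SKETCH_VARYING_SLOTS];

    memset(urb_setup, fill_byte, sizeof(urb_setup));
    return count_attributes(urb_setup);
}
```

Filling with 0 makes every entry look like an attribute in slot 0, while filling with -1 sets every byte to 0xFF, which reads back as -1 in each int on two's-complement targets.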

Finally, you have to ensure "Setup URB Entry Read Length" is non-zero
or you get GPU hangs, at least on Crestline.

It now works on at least Crestline and Haswell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-17 14:20:41 -08:00
Kristian Høgsberg
8c6018e9bc gbm: Define _DEFAULT_SOURCE to avoid warning
glibc 2.19 introduced _DEFAULT_SOURCE as a replacement for _BSD_SOURCE,
and deprecates _BSD_SOURCE with an annoying warning.  Defining both is
how you're supposed to transition so let's do that.  It gets rid of the
warning and we can figure out when/if we can drop _BSD_SOURCE later.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2015-01-16 21:54:54 -08:00
Vinson Lee
9075823c17 sha1: Fix gcry_md_hd_t typo.
Fix build error.

  CC       libmesautil_la-sha1.lo
sha1.c: In function '_mesa_sha1_final':
sha1.c:210:22: error: 'grcy_md_hd_t' undeclared (first use in this function)
    gcry_md_hd_t h = (grcy_md_hd_t) ctx;
                      ^

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88519
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2015-01-16 16:25:39 -08:00
Vinson Lee
10a4f1e77a nir: s/malloc.h/stdlib.h/
Fix build error on Mac OS X.

  CC       nir_to_ssa.lo
nir_to_ssa.c:29:10: fatal error: 'malloc.h' file not found
         ^

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88478
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2015-01-16 16:14:51 -08:00
Kristian Høgsberg
a9f657ded1 i965: Fix up too-wide comment
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2015-01-16 14:42:27 -08:00
Kristian Høgsberg
9bf2c7166a gbm/dri: Fix const confusion
The driver name is no longer const, it's always allocated dynamically
one way or another.  Drop const from dri_screen_create_dri2
driver_name argument to avoid warning.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2015-01-16 14:29:40 -08:00
Carl Worth
59216f53ec configure: Add machinery for --enable-shader-cache (and --disable-shader-cache)
We don't actually have the code for the shader cache just yet, but
this configure machinery puts everything in place so that the shader
cache can be optionally compiled in.

Specifically, if the user passes no option (neither
--disable-shader-cache, nor --enable-shader-cache), then this feature
will be automatically detected based on the presence of a usable SHA-1
library. If no suitable library can be found, then the shader cache
will be automatically disabled, (and reported in the final output from
configure).

The user can force the shader-cache feature to not be compiled, (even
if a SHA-1 library is detected), by passing
--disable-shader-cache. This will prevent the compiled Mesa libraries
from depending on any library for SHA-1 implementation.

Finally, the user can also force the shader cache on with
--enable-shader-cache. This will cause configure to trigger a fatal
error if no suitable SHA-1 implementation can be found for the
shader-cache feature.
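
The three cases reduce to a small decision table. The real logic lives in configure.ac; the C function below is only a model of it, with hypothetical names.

```c
#include <stdbool.h>

enum cache_request { CACHE_AUTO, CACHE_ENABLE, CACHE_DISABLE };
enum cache_result  { CACHE_OFF, CACHE_ON, CACHE_CONFIGURE_ERROR };

/* Model of the option handling: what the user asked for, combined with
 * whether a usable SHA-1 library was detected. */
static enum cache_result resolve_shader_cache(enum cache_request request,
                                              bool have_sha1)
{
    switch (request) {
    case CACHE_DISABLE:
        return CACHE_OFF;                       /* never depend on SHA-1 */
    case CACHE_ENABLE:
        return have_sha1 ? CACHE_ON : CACHE_CONFIGURE_ERROR;
    case CACHE_AUTO:
        return have_sha1 ? CACHE_ON : CACHE_OFF;
    }
    return CACHE_CONFIGURE_ERROR;
}
```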

Bug fix by José Fonseca <jfonseca@vmware.com>: Fix to put conditional
assignment in Makefile.am, not Makefile.sources to avoid breaking
scons build.

Note: As recommended by José, with this commit the scons build will
not compile any of the SHA-1-using code. This is waiting for someone
to write SConstruct detection of the available SHA-1 libraries, (and
set the appropriate HAVE_SHA1_* variables).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-16 13:47:40 -08:00
Carl Worth
a24bdce46f mesa: Add mesa SHA-1 functions
The upcoming shader cache uses the SHA-1 algorithm for cryptographic
naming. These new mesa_sha1 functions are implemented with any one of
several different cryptographic libraries.

This code was copied from the xserver repository, (where it has
apparently been functioning well on a variety of operating systems),
and comes licensed with a license identical to that of Mesa.

Bug fixes by José Fonseca <jfonseca@vmware.com>: Fix to put
conditional assignment in Makefile.am, not Makefile.sources to avoid
breaking scons build. Fix include file for CryptoAPI section. Fix
missing cast in openssl section.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-16 13:47:40 -08:00
Carl Worth
670826b431 configure: Add copyright and license block to configure.ac
Prior to copying in code from the xserver configure.ac file, it makes
sense to have the license of this file clearly marked, (to show that
it's licensed identically to the configure.ac file from the xserver
repository).

And since the text of the license refers to "the above copyright
notice" it also makes sense to have an actual copyright attribution in
place.

I generated this list of names by looking at the output of:

	git shortlog -n --format=%aD -- configure.ac

(and arbitrarily stopping for contributors with fewer than 15
commits). Then for each name, I looked for existing Copyright
attributions in the mesa source tree with the same name, (and using
"Intel Corporation" as the copyright holder where I knew that was
appropriate).
2015-01-16 13:47:40 -08:00
Carl Worth
977ddecb69 glsl: Add unit tests for blob.c
In addition to exercising all of the functions in blob.h, this
includes a stress test that forces some reallocing, and also tests to
verify the alignment and overrun-detection code in blob.c.
2015-01-16 13:47:40 -08:00
Tapani Pälli
ffcad3a548 glsl: Add blob_overwrite_bytes and blob_overwrite_uint32
These functions are useful when serializing an unknown number of items
to a blob. The caller can first save the current offset, write a
placeholder uint32, write out (and count) the items, then use
blob_overwrite_uint32 with the saved offset to replace the placeholder
value.

Then, when deserializing, the reader will first read the count and
know how many subsequent items to expect.
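
The placeholder technique described above can be sketched roughly as
follows. This is only an illustration, not the code in blob.c: the
sketch_buf type and all sketch_* helpers are hypothetical stand-ins, and
a fixed-size buffer is assumed in place of a growable blob.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size buffer standing in for a blob. */
struct sketch_buf {
   uint8_t data[256];
   size_t size;            /* number of bytes written so far */
};

/* Append a uint32 at the current end of the buffer. */
static void
sketch_write_uint32(struct sketch_buf *b, uint32_t v)
{
   assert(b->size + sizeof(v) <= sizeof(b->data));
   memcpy(b->data + b->size, &v, sizeof(v));
   b->size += sizeof(v);
}

/* Overwrite a previously written uint32 at a saved offset. */
static void
sketch_overwrite_uint32(struct sketch_buf *b, size_t offset, uint32_t v)
{
   assert(offset + sizeof(v) <= b->size);
   memcpy(b->data + offset, &v, sizeof(v));
}

/* Read back a uint32 from a given offset. */
static uint32_t
sketch_read_uint32(const struct sketch_buf *b, size_t offset)
{
   uint32_t v;
   assert(offset + sizeof(v) <= b->size);
   memcpy(&v, b->data + offset, sizeof(v));
   return v;
}

/* Serialize only the nonzero items, preceded by their count.  The count
 * is not known until the loop has finished, hence the placeholder. */
static void
sketch_write_nonzero(struct sketch_buf *b, const uint32_t *items, size_t n)
{
   size_t count_offset = b->size;   /* save the current offset */
   uint32_t count = 0;

   sketch_write_uint32(b, 0);       /* placeholder for the count */
   for (size_t i = 0; i < n; i++) {
      if (items[i] != 0) {
         sketch_write_uint32(b, items[i]);
         count++;
      }
   }
   sketch_overwrite_uint32(b, count_offset, count);
}
```

A reader of this buffer would first read the uint32 at the saved offset
and then expect exactly that many items to follow.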

(I wrote this code after reading a very similar patch written by
Tapani when he wrote serialization code for IR. Since I re-used the
idea of his code so directly, I've credited him as the author of this
code. --Carl)

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-16 13:47:40 -08:00
Carl Worth
1c9877327e glsl: Add blob.c---a simple interface for serializing data
This new interface allows for writing a series of objects to a chunk
of memory (a "blob"). The allocated memory is maintained within the
blob itself, (and re-allocated by doubling when necessary).

There are also functions for reading objects from a blob. If
code attempts to read beyond the available memory, the read functions
return 0 values (or its moral equivalent) without reading past the
allocated memory. Once the caller is done with the reads, it can check
blob->overrun to determine whether any invalid values were previously
returned due to attempts to read too far.
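
The read side can be sketched along these lines. The sketch_reader
type and sketch_read_uint32() are hypothetical names chosen for
illustration, and only the overrun behaviour described above is
modelled; the real blob reader may differ in its details.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reader state over a block of serialized bytes. */
struct sketch_reader {
   const uint8_t *data;
   size_t size;
   size_t offset;
   bool overrun;           /* set once any read has gone too far */
};

/* Read the next uint32, or return 0 and flag an overrun if fewer than
 * four bytes remain.  Memory past the end is never touched. */
static uint32_t
sketch_read_uint32(struct sketch_reader *r)
{
   uint32_t v = 0;

   assert(r->offset <= r->size);
   if (r->overrun || r->size - r->offset < sizeof(v)) {
      r->overrun = true;
      return 0;
   }
   memcpy(&v, r->data + r->offset, sizeof(v));
   r->offset += sizeof(v);
   return v;
}
```

The caller can issue a whole series of reads without checking each one
and test the overrun flag once at the end.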

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-16 13:47:40 -08:00
Tapani Pälli
165575d0a8 mesa: Add iterate method for string_to_uint_map
The upcoming shader cache needs this to be able to cache hash data
from the gl_shader_program structure.

Edited-by: Carl Worth <cworth@cworth.org>:

There is an internal implementation detail that the hash table
underlying the struct string_to_uint_map stores each value internally
as (value+1). The user needn't be very concerned with this (other than
knowing that a value of UINT_MAX cannot be stored) since put() adds 1
and get() subtracts 1.

So in this commit, rather than call the user's function directly with
hash_table_call_foreach, we call through a wrapper that fixes up the
off-by-one values before the caller's callback sees them.

And with this wrapper in place, we also give a better signature to the
callback function being passed to iterate(), so that this callback
function can actually expect a char* and an unsigned argument, (rather
than a couple of void* ).
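
A rough sketch of the wrapper idea, under stated assumptions: the
sketch_entry array below is a hypothetical stand-in for the underlying
hash table, and none of the sketch_* names are the real
string_to_uint_map interface. Only the (value+1) fix-up and the typed
callback are modelled.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Callback type the user sees: a typed key and a typed value. */
typedef void (*sketch_iter_fn)(const char *key, unsigned value,
                               void *closure);

/* Hypothetical raw entry: the value is stored as (value + 1). */
struct sketch_entry {
   const char *key;
   void *data;
};

/* Bundles the user's callback with the user's closure. */
struct sketch_wrapper {
   sketch_iter_fn callback;
   void *closure;
};

/* Encode a value the way put() is described to: add 1. */
static void *
sketch_encode(unsigned value)
{
   assert(value != UINT_MAX);       /* UINT_MAX cannot be stored */
   return (void *)(uintptr_t)(value + 1u);
}

/* Raw void*-based callback: undo the off-by-one, then call the user. */
static void
sketch_subtract_one(const void *key, void *data, void *closure)
{
   struct sketch_wrapper *w = closure;
   unsigned value = (unsigned)(uintptr_t)data - 1u;

   w->callback((const char *)key, value, w->closure);
}

/* Iterate over all entries, calling through the wrapper. */
static void
sketch_iterate(const struct sketch_entry *entries, unsigned n,
               sketch_iter_fn callback, void *closure)
{
   struct sketch_wrapper w = { callback, closure };

   for (unsigned i = 0; i < n; i++)
      sketch_subtract_one(entries[i].key, entries[i].data, &w);
}

/* Example user callback: sum all values into *closure. */
static void
sketch_sum_values(const char *key, unsigned value, void *closure)
{
   (void)key;
   *(unsigned *)closure += value;
}
```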

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-01-16 13:47:40 -08:00
Carl Worth
62d5b4b03a util: Make unreachable at least be an assert
Previously, if __builtin_unreachable() was unavailable, the
unreachable macro was defined to do nothing. We do better here, by at
least still making it an assert.
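
The shape of such a macro might look like the sketch below. The name
SKETCH_UNREACHABLE and the compiler test used to detect the builtin are
assumptions made for illustration; the real macro and its availability
check may be spelled differently.

```c
#include <assert.h>

/* With the builtin: assert in debug builds, then tell the compiler the
 * point cannot be reached.  Without it: still assert, rather than
 * expanding to nothing. */
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNREACHABLE(str)  \
do {                             \
   assert(!str);                 \
   __builtin_unreachable();      \
} while (0)
#else
#define SKETCH_UNREACHABLE(str) assert(!str)
#endif

/* Example use: every valid axis returns from inside the switch. */
static const char *
sketch_axis_name(int axis)
{
   switch (axis) {
   case 0: return "x";
   case 1: return "y";
   case 2: return "z";
   }
   SKETCH_UNREACHABLE("invalid axis");
   return "?";   /* only reached if the macro is a compiled-out assert */
}
```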

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-16 13:47:40 -08:00
Carl Worth
f87ffd5cc3 glsl: Add convenience function get_sampler_instance
This is similar to the existing functions get_instance,
get_array_instance, etc. for getting a type singleton. The new
get_sampler_instance() function will be used by the upcoming shader
cache.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-16 13:47:40 -08:00
Kenneth Graunke
127c972492 i965: Fix some oddities in FB_WRITE register width and execution size.
Previously, we generated this for FB writes in SIMD16 mode:

load_payload(16) vgrf5@8+0.0:F, vgrf1:F, vgrf2:F, vgrf3:F, vgrf4:F
fb_write(8) (null):UD, vgrf5@8+0.0:F 1sthalf

The LOAD_PAYLOAD's destination had its register width set to 8, and the
FB_WRITE had its execution size set to 8.  This seems wrong, and while
it probably doesn't affect anything, we should fix it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-16 12:39:35 -08:00
Kenneth Graunke
faaca23734 i965/fs: Make lower_load_payload etc. appear in INTEL_DEBUG=optimizer.
In order to support calling lower_load_payload() inside a condition,
this patch makes OPT() a statement expression:

https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html

We recently did the equivalent change in the vec4 backend (commit
9b8bd67768).
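
For readers unfamiliar with statement expressions, here is a minimal
sketch of the idea. It relies on the GNU C extension linked above, so it
assumes GCC or Clang. SKETCH_OPT and the toy passes are hypothetical and
are not the i965 OPT() macro itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Run a pass, fold its result into the enclosing "progress" variable,
 * and yield that result as the value of the whole expression. */
#define SKETCH_OPT(pass, arg) ({               \
   bool this_progress = pass(arg);             \
   progress = progress || this_progress;       \
   this_progress;                              \
})

/* Toy pass: replace a negative value with its absolute value. */
static bool
sketch_negate_if_negative(int *v)
{
   assert(v != 0);
   if (*v < 0) {
      *v = -*v;
      return true;
   }
   return false;
}

/* Toy pass: halve a nonzero even value. */
static bool
sketch_halve_if_even(int *v)
{
   assert(v != 0);
   if (*v != 0 && *v % 2 == 0) {
      *v /= 2;
      return true;
   }
   return false;
}

static bool
sketch_optimize(int *v)
{
   bool progress = false;

   SKETCH_OPT(sketch_negate_if_negative, v);
   /* Because the macro is an expression, it can sit in a condition. */
   if (SKETCH_OPT(sketch_halve_if_even, v))
      SKETCH_OPT(sketch_halve_if_even, v);

   return progress;
}
```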

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-16 12:38:26 -08:00
Neil Roberts
a4ab08bf45 format_utils: Use a more precise conversion when decreasing bits
When converting to a format that has fewer bits the previous code was just
shifting off the bits. This doesn't provide very accurate results. For example
when converting from 8 bits to 5 bits it is equivalent to doing this:

x * 32 / 256

This works as if it's taking a value from a range where 256 represents 1.0 and
scaling it down to a range where 32 represents 1.0. However this is not
correct because it is actually 255 and 31 that represent 1.0.

We can do better with a formula like this:

(x * 31 + 127) / 255

The +127 is to make it round correctly.

The new code has a special case to use uint64_t when the result of the
multiplication would overflow an unsigned int. This function is inline and
only ever called with constant values so hopefully the if statements will be
folded.
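
The two conversions can be compared with a small sketch. The function
names are hypothetical, and for simplicity this version always computes
in uint64_t instead of special-casing the overflow as the patch does.

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour: drop the low bits, i.e. x * 2^dst / 2^src. */
static unsigned
sketch_unorm_shift(unsigned x, unsigned src_bits, unsigned dst_bits)
{
   assert(dst_bits <= src_bits);
   return x >> (src_bits - dst_bits);
}

/* New behaviour: scale by max values and round to nearest, e.g.
 * (x * 31 + 127) / 255 when going from 8 bits to 5 bits. */
static unsigned
sketch_unorm_round(unsigned x, unsigned src_bits, unsigned dst_bits)
{
   assert(dst_bits >= 1 && dst_bits <= src_bits && src_bits <= 32);

   const uint64_t src_max = (UINT64_C(1) << src_bits) - 1;
   const uint64_t dst_max = (UINT64_C(1) << dst_bits) - 1;

   return (unsigned)((x * dst_max + src_max / 2) / src_max);
}
```

For example, the 8-bit value 5 is about 0.0196 of full scale, which is
closer to 1/31 than to 0: shifting gives 0 while the rounded formula
gives 1. Likewise 250 shifts to 31 (full intensity) but rounds to 30.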

The main incentive to do this is to make the CPU conversion path pick the same
values as the hardware would if it did the conversion. This fixes failures
with the ‘texsubimage pbo’ test when using the patches from here:

http://lists.freedesktop.org/archives/mesa-dev/2015-January/074312.html

v2: Use 64-bit arithmetic when src_bits+dst_bits > 32

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-16 13:53:15 +00:00
Iago Toral Quiroga
6367ca8b41 i965/gen6: Fix crash with VS+TF after rendering with GS
Rendering with a GS and then using transform feedback with a program that does
not have a GS can crash in gen6. The reason for this is that
brw_begin_transform_feedback checks brw->geometry_program to decide if there
is a GS program, but this is not correct: brw->geometry_program is updated when
issuing drawing commands, so after rendering with a GS it will be non-NULL
until we draw again with a program that does not have a GS. If the next
program uses TF, we will call glBeginTransformFeedback before issuing
the drawing command and hence brw->geometry_program will be non-NULL if
the previous rendering used a GS. The right thing to do here is to check
ctx->_Shader->CurrentProgram[MESA_SHADER_GEOMETRY] instead. This is what the
gen7 code path does too.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=87694

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-01-16 14:16:59 +01:00
Jason Ekstrand
bc6e57e019 nir/live_variables: Use a worklist
This is a rework of the liveness algorithm using a worklist as suggested by
Connor.  Doing so reduces the number of times we walk over the instructions
because we don't have to do an entire pointless walk over the instructions
just to figure out it's time to stop.  Also, the stuff after the last loop
in the function will only ever get visited once.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 16:54:21 -08:00
Jason Ekstrand
4839d1aed1 nir: Add a worklist helper structure
A worklist is a common concept in optimizations.  This adds a structure
that we can reuse for many different types of optimizations.
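
As an illustration of the concept only, a worklist of block indices can
be as small as the sketch below. The sketch_worklist type, its fixed
capacity, and the FIFO order are assumptions for this example and do not
describe the actual nir_worklist layout.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_BLOCKS 32

/* Ring buffer of block indices plus a "present" flag per block, so a
 * block is never queued twice at the same time. */
struct sketch_worklist {
   unsigned items[SKETCH_MAX_BLOCKS];
   bool present[SKETCH_MAX_BLOCKS];
   unsigned head;
   unsigned count;
};

/* Queue a block unless it is already waiting to be processed. */
static void
sketch_worklist_push(struct sketch_worklist *w, unsigned block)
{
   assert(block < SKETCH_MAX_BLOCKS);
   if (w->present[block])
      return;

   w->items[(w->head + w->count) % SKETCH_MAX_BLOCKS] = block;
   w->present[block] = true;
   w->count++;
}

/* Take the oldest queued block. */
static unsigned
sketch_worklist_pop(struct sketch_worklist *w)
{
   assert(w->count > 0);

   unsigned block = w->items[w->head];
   w->head = (w->head + 1) % SKETCH_MAX_BLOCKS;
   w->count--;
   w->present[block] = false;
   return block;
}
```

A dataflow pass would typically pop a block, recompute its information,
and push only those neighbours whose inputs changed, stopping when the
worklist is empty.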

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 16:54:21 -08:00
Brian Paul
0aaaa13ec9 nir: fix incorrect argument passed to validate_src() in validate_tex_instr()
Silences a compiler warning.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 17:41:42 -07:00
Brian Paul
aa479a69d6 nir: silence compiler warning from visit_src() call
v2: use proper argument

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 17:09:02 -07:00
Brian Paul
337eca4ac8 mesa: move GET_CURRENT_CONTEXT() to top of _mesa_init_renderbuffer()
To fix MSVC build.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-15 16:15:34 -07:00
Mike Mason
e407fb1af4 mesa: Fix render buffer initial internal format in GLES 3
Changes the initial internal format of a render buffer
to GL_RGBA4 in GLES 3. This fixes a failure in the following
DrawElements test:

  dEQP-GLES3.functional.state_query.rbo.renderbuffer_internal_format

Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-01-15 13:29:48 -08:00
Jason Ekstrand
153b8b3525 util/hash_set: Rework the API to know about hashing
Previously, the set API required the user to do all of the hashing of keys
as it passed them in.  Since the hashing function is intrinsically tied to
the comparison function, it makes sense for the hash set to know about
it.  Also, it makes for a somewhat clumsy API as the user is constantly
calling hashing functions many of which have long names.  This is
especially bad when the standard call looks something like

_mesa_set_add(ht, _mesa_pointer_hash(key), key);

In the above case, there is no reason why the hash set shouldn't do the
hashing for you.  We leave the option for you to do your own hashing if
it's more efficient, but it's no longer needed.  Also, if you do do your
own hashing, the hash set will assert that your hash matches what it
expects out of the hashing function.  This should make it harder to mess up
your hashing.
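
The API shape being described can be sketched like this. The
sketch_set type is a hypothetical toy (fixed size, linear probing, no
removal) and is not the util/hash_set implementation; it only shows the
set owning the hash function and checking pre-computed hashes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SET_SIZE 64

/* The set knows its own hash and comparison functions. */
struct sketch_set {
   uint32_t (*hash)(const void *key);
   bool (*equals)(const void *a, const void *b);
   const void *slots[SKETCH_SET_SIZE];
   unsigned entries;
};

static uint32_t
sketch_pointer_hash(const void *key)
{
   return (uint32_t)((uintptr_t)key >> 2);
}

static bool
sketch_pointer_equal(const void *a, const void *b)
{
   return a == b;
}

/* For callers that already have the hash.  The assert catches a hash
 * that does not match the set's own hashing function.  Returns true if
 * the key was newly added. */
static bool
sketch_set_add_pre_hashed(struct sketch_set *s, uint32_t hash,
                          const void *key)
{
   assert(key != NULL);
   assert(hash == s->hash(key));
   assert(s->entries < SKETCH_SET_SIZE - 1);

   for (unsigned i = hash % SKETCH_SET_SIZE; ; i = (i + 1) % SKETCH_SET_SIZE) {
      if (s->slots[i] == NULL) {
         s->slots[i] = key;
         s->entries++;
         return true;
      }
      if (s->equals(s->slots[i], key))
         return false;
   }
}

/* The common case: the set does the hashing for the caller. */
static bool
sketch_set_add(struct sketch_set *s, const void *key)
{
   return sketch_set_add_pre_hashed(s, s->hash(key), key);
}
```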

This is analogous to 94303a0750, where we did this for hash_table.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 13:21:27 -08:00
Jason Ekstrand
4c99e3ae78 util: Move main/set to util/hash_set
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 13:21:27 -08:00
Jason Ekstrand
8ed5305d28 hash_table: Rename insert_with_hash to insert_pre_hashed
We already have search_pre_hashed.  This makes the APIs match better.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 13:21:27 -08:00
Matt Turner
f0aec4ee1e i965: Don't consider null dst instructions as matching non-null dst.
When performing common subexpression elimination on instructions with
non-null destinations we emit a MOV to copy the result to a new
register that must have no other uses. In the case of:

   cmp.g.f0.0(8) null:D, vgrf43:F, 0.500000f
   ...
   cmp.g.f0.0(8) vgrf113:D, vgrf43:F, 0.500000f

we put the first instruction in the AEB and decided that we could reuse
its result when we found the second. Unfortunately, that meant that we'd
emit a MOV from the first's destination, which is null.

Don't do anything if the entry's destination is null and the
instruction's destination is non-null.

Tested-by: Tapani Pälli <tapani.palli@intel.com>
2015-01-15 10:11:42 -08:00
Matt Turner
41d9f232b6 i965/vec4: Make sure that imm writes are to registers in the same file.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87887
2015-01-15 10:11:42 -08:00
Matt Turner
3654b6d43c i965/fs: Emit MADs from (x + abs(y * z)).
Just use the abs source modifier on both of the multiplicand
arguments.

instructions in affected programs:     300 -> 296 (-1.33%)

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-01-15 10:10:44 -08:00
Matt Turner
c4fab711ed i965/fs: Emit MADs from (x + -(y * z)).
Just use the negation source modifier on one of the multiplicand
arguments.

total instructions in shared programs: 5889529 -> 5880016 (-0.16%)
instructions in affected programs:     600846 -> 591333 (-1.58%)

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-01-15 10:10:44 -08:00
Jason Ekstrand
0d05d1226e nir/algebraic: Only replace an instruction once
Without the break, it was possible that an instruction would match multiple
expressions.  If this happened, you could end up trying to replace it
multiple times and get a segfault.  This makes it so that, after a
successful replacement, it moves on to the next instruction.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand
c56adc68e2 i965/nir: Do a final copy lowering pass before lowering locals to regs
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand
0f85310975 nir/vars_to_ssa: Use the copy lowering from lower_var_copies
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand
d3636da902 nir: Add a pass for lowering copy instructions
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand
700ba5daaf nir/vars_to_ssa: Refactor get_deref_node
This refactor allows you to more easily get the deref node associated with
a given variable.  We then use that new functionality in the
deref_may_be_aliased function instead of creating a 1-element deref chain.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand
55b5058e69 nir: Rename lower_variables to lower_vars_to_ssa
The original name wasn't particularly descriptive.  This one indicates that
it actually gives you SSA values as opposed to the old pass which lowered
variables to registers.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand
4aa6162f6e nir/tex_instr: Add a nir_tex_src struct and dynamically allocate the src array
This solves a number of problems.  First is the ability to change the
number of sources that a texture instruction has.  Second, it solves the
dilemma that may occur if a texture instruction has more than 4 sources.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand
dcb1acdea0 nir/validate: Only build in debug mode
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand
347ab2bf24 nir/lower_variables: Improve documentation
Additional description was added to a variety of places.  Also, we no
longer use the term "leaf" to describe fully-qualified direct derefs.
Instead, we simply use the term "direct" or spell it out completely.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand
8016fa39e1 nir/lower_variables: Use a for loop for get_deref_node
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand
0c0ca8b6ae nir: Use the actual FNV-1a hash for hashing derefs
We also switch to using loops rather than recursion.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand
a3b73ccf6d util/hash_table: Pull the details of the FNV-1a into helpers
This way the basics of the FNV-1a hash can be reused to easily create other
hashing functions.
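
To show why such helpers compose well, here is a small sketch of 32-bit
FNV-1a with an accumulate step. The sketch_* names are hypothetical; the
offset basis and prime are the standard 32-bit FNV constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard 32-bit FNV parameters. */
static const uint32_t sketch_fnv32_offset_bias = 2166136261u;
static const uint32_t sketch_fnv32_prime = 16777619u;

/* Fold a block of bytes into a running FNV-1a hash: XOR each byte in,
 * then multiply by the prime. */
static uint32_t
sketch_fnv32_1a_accumulate_block(uint32_t hash, const void *data,
                                 size_t size)
{
   const uint8_t *bytes = data;

   assert(data != NULL || size == 0);
   while (size-- > 0) {
      hash ^= *bytes++;
      hash *= sketch_fnv32_prime;
   }
   return hash;
}

/* A new hashing function built from the helper: hash two fields of a
 * record one after the other. */
static uint32_t
sketch_hash_pair(uint32_t a, uint32_t b)
{
   uint32_t hash = sketch_fnv32_offset_bias;

   hash = sketch_fnv32_1a_accumulate_block(hash, &a, sizeof(a));
   hash = sketch_fnv32_1a_accumulate_block(hash, &b, sizeof(b));
   return hash;
}
```

Because the running hash is passed in and returned, accumulating "foo"
and then "bar" gives the same result as hashing "foobar" in one call.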

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 07:20:23 -08:00
Jason Ekstrand
e4115ca9d8 nir: Make intrinsic flags into an enum
This should be much better for debugging as GDB will pick up on the fact
that it's an enum and actually tell you what you're looking at instead of
giving you some arbitrary hex value you have to go look up.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand
ed13f4e716 nir: Use static inlines instead of macros for list getters
This should make debugging a lot easier as GDB handles static inlines much
better than macros.  Also, static inlines are typesafe.

Reviewed-By: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand
b95fae034f nir/variable: Remove the constant_value field
This was a left-over relic of GLSL IR that we aren't using for anything.
If we ever want that value again, we can add it back, but NIR constant
folding should be just as good as GLSL IR's if not better pretty soon, so
I'm not worried about it.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand
8599b30c67 nir: Add some documentation
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand
ad9d0a9ea6 nir/lower_variables: Follow the Cytron paper more closely
Previously, our variable renaming algorithm, while similar to the one in
the Cytron paper, was not the same.  While I'm pretty sure it was correct,
it will be easier for readers of the code in the variable renaming pass if
it follows the paper more closely.  This commit removes the automatic stack popping
we were doing and replaces it with explicit popping like Cytron does.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand
b1d114a48c nir/print: Various cleanups recommended by Eric
Cc: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand
e2763339fe nir/lower_variables: Add a bunch of comments and re-arrange a few things
This commit seeks to make the lower_variables pass much more clear by
adding a pile of comments and re-arranging a few things.  There are no
functional or algorithmic changes.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand
40ca129ed5 nir: Rename parallel_copy_copy to parallel_copy_entry and add a foreach macro
parallel_copy_copy was a silly name.  Also, things were getting long and
annoying, so I added a foreach macro.  For historical reasons, several of
the original iterations over parallel copy entries in from_ssa used the
_safe variants of the loop.  However, all of these no longer ever remove an
entry so it's ok to make them all use the normal iterator.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand
1b720c6ed8 nir/from_ssa: Clean up parallel copy handling and document it better
Previously, we were doing a lazy creation of the parallel copy
instructions.  This is confusing, hard to get right, and involves some
extra state tracking of the copies.  This commit adds an extra walk over
the basic blocks to add the block-end parallel copies up front.  This
should be much less confusing and, consequently, easier to get right.  This
commit also adds more comments about parallel copies to help explain what
all is going on.

As a consequence of these changes, we can now remove the at_end parameter
from nir_parallel_copy_instr.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand
de73d1e173 nir: Rename nir_block_following_if to nir_block_get_following_if
The new name is a little longer but less confusing.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand
cb53aacaa1 i965/fs_nir: Handle sample ID, position, and mask better
Before, we were emitting the full pile of setup instructions for sample_id
and sample_pos every time they were used.  With this commit, we emit them
in their own pass once at the beginning of the shader and simply emit uses
later on.  When it comes time for setting up VS, we can put setup for its
special values in the same pass.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand
813316d150 nir/opcodes: Remove the per_component info field
Originally, this field was intended for determining if the given
instruction acted per-component or if it had mismatching source and
destination sizes that would have to be interpreted specially.  However, we
can easily derive this from output_size == 0, so it's not really that
useful.  Also, the values we were setting in nir_opcodes.h for this field
were completely bogus and it was never used.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand
e2a8f9e5cc nir/search: Use nir_op_infos to determine if an operation is commutative
Prior to this commit, we had a big switch statement for this.  Now it's
baked into the opcode metadata so we can just use that.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand
46f3e1ab50 nir/opcodes: Add algebraic properties metadata
This commit adds some algebraic properties to the metadata of each opcode
in NIR.  In particular, you now know, just from the metadata, if a given
opcode is commutative or associative.  This will be useful for algebraic
transformation passes that want to be able to match a + b as well as b + a
in one go.

v2: Make algebraic properties all caps.  This was more consistent with the
    intrinsics flags and seems better for flags in general.

    Also, the enums are now declared with (1 << n) rather than hex values.

v3: fmin and fmax technically aren't commutative or associative.  Things
    get funny when one of the arguments is a NaN.
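
A minimal sketch of how such metadata can be consumed follows. The
opcode table, flag names, and sketch_match() are hypothetical and far
smaller than the real opcode info; sources are modelled as plain integer
ids.

```c
#include <assert.h>
#include <stdbool.h>

/* Flags in all caps, declared with (1 << n). */
typedef enum {
   SKETCH_OP_IS_COMMUTATIVE = (1 << 0),
   SKETCH_OP_IS_ASSOCIATIVE = (1 << 1),
} sketch_op_algebraic_property;

enum sketch_op {
   SKETCH_OP_IADD,
   SKETCH_OP_ISUB,
   SKETCH_OP_FMIN,
   SKETCH_NUM_OPS
};

struct sketch_op_info {
   const char *name;
   unsigned algebraic_properties;
};

static const struct sketch_op_info sketch_op_infos[SKETCH_NUM_OPS] = {
   [SKETCH_OP_IADD] = { "iadd", SKETCH_OP_IS_COMMUTATIVE |
                                SKETCH_OP_IS_ASSOCIATIVE },
   [SKETCH_OP_ISUB] = { "isub", 0 },
   /* Left unflagged, following the v3 note about NaN. */
   [SKETCH_OP_FMIN] = { "fmin", 0 },
};

/* Does op(src0, src1) match the pattern op(want0, want1)?  Commutative
 * opcodes also match with the sources swapped. */
static bool
sketch_match(enum sketch_op op, int src0, int src1, int want0, int want1)
{
   assert(op < SKETCH_NUM_OPS);

   if (src0 == want0 && src1 == want1)
      return true;

   if (sketch_op_infos[op].algebraic_properties & SKETCH_OP_IS_COMMUTATIVE)
      return src0 == want1 && src1 == want0;

   return false;
}
```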

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand
2c7da78805 nir: Make load_const SSA-only
As it was, we weren't ever using load_const in a non-SSA way.  This allows
us to substantially simplify the load_const instruction.  If we ever need a
non-SSA constant load, we can do a load_const and an imov.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand
675ffdef30 nir: Make nir_ssa_undef_instr_create initialize the destination
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand
951a7f23a0 i965/nir: Move the other lowering passes to before out-of-SSA
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand
5c16be1c52 nir/lower_system_values: Handle SSA destinations
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand
821e75a160 nir/lower_atomics: Use/support SSA
Previously, lower_atomics was non-SSA only.  We assert-failed if the
destination of an atomic operation intrinsic was an SSA def and we used
temporary registers for computing offsets.  This commit changes both of
these behaviors.  We now use SSA values for computing offsets (so we can
optimize them) and we handle SSA destinations.  We also move the pass to
run before we go out of SSA on i965 as it now generates SSA values.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand
8ddb03d56d nir/live_variables: Use the new ssa_def iterator
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand
28a3e164e2 nir: Use nir_foreach_ssa_def for setting up ssa destinations
Before, we were using foreach_dest and switching on whether the destination
was an SSA value.  This works, except not all destinations are SSA values
so we have to special-case ssa_undef instructions.  Now that we have a
foreach_ssa_def function, we can iterate over all of the register
destinations in one pass and iterate over the SSA destinations in a second.
This way, if we add other ssa-only instructions, we won't have to worry
about adding them to the special case we have for ssa_undef.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand
193fea9eb6 nir: Add a foreach_ssa_def function
There are some instructions whose destinations are SSA-only and so aren't a
nir_dest.  This provides a function that is capable of iterating over the
SSA definitions defined by those instructions.  If you want registers, you
should use the old iterator.

v2: Kenneth Graunke <kenneth@whitecape.org>:
 - Fix nir_foreach_ssa_def's return value.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand
bc0735857f nir/lower_variables: Use a real dominance DFS for variable renaming
Previously, we were just iterating over the program "in order" which
kind-of approximates a DFS, but not really.  In particular, we got the
following case wrong:

loop {
   a = 3;
   if (foo) {
      a = 5;
   } else {
      break;
   }
   use(a);
}

where use(a) would get 3 instead of 5 because of premature popping of the
SSA def stack.  Now, since we do an actual DFS, we should evaluate use(a)
immediately after a = 5 and we should be ok.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand
dfb3abbaec nir: Remove predication
We stopped generating predicates in glsl_to_nir some time ago.  Right now,
it's all dead untested code that I'm not convinced always worked in the
first place.  If we decide we want them back, we can revert this patch.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:21 -08:00
Jason Ekstrand
b3fd098e7d nir: Make bcsel a fully vector operation
Previously, the condition was a scalar that applied to all components
simultaneously.  As of this commit, the condition is a vector and each
component is switched separately.
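
In plain C terms, the per-component semantics amount to something like
the sketch below. sketch_bcsel() is a hypothetical helper that models
vectors as small arrays; it is not NIR code.

```c
#include <assert.h>
#include <stdbool.h>

/* Per-component select: each component of the condition picks between
 * the matching components of a and b. */
static void
sketch_bcsel(float *dst, const bool *cond, const float *a,
             const float *b, unsigned num_components)
{
   assert(num_components >= 1 && num_components <= 4);

   for (unsigned i = 0; i < num_components; i++)
      dst[i] = cond[i] ? a[i] : b[i];
}
```

With a scalar condition, all components would come from the same
source; here a condition of (true, false, true, false) interleaves them.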

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:21 -08:00
Jason Ekstrand
295faf9462 nir: Call nir_metadata_preserve more places
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:21 -08:00
Jason Ekstrand
b6c81b3ff4 nir/metadata: Rename metadata_dirty to metadata_preserve
nir_metadata_dirty was a terrible name because the parameter it takes is
the metadata to be preserved.  This is really confusing because it looks
like it's doing the opposite of what it is actually doing.  Now it's named
sensibly.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:21 -08:00
Jason Ekstrand
3c2c0a164c i965/fs_nir: Add support for indirect texture arrays
v2 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Use the nir_tex_src_sampler_offset source type instead of the
   sampler_indirect thing that I cooked up before.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-01-15 07:20:21 -08:00
Jason Ekstrand
60ec60a600 nir: Rework the way samplers are lowered
v2 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Use the nir_tex_src_sampler_offset source type instead of the
   sampler_indirect thing that I cooked up before.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-01-15 07:20:21 -08:00
Jason Ekstrand
4cdabcc0fa nir/tex_instr_create: Initialize all 4 sources
This helps a lot with things like lowering passes that may need to add
sources.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:21 -08:00
Jason Ekstrand
62ac0ee804 nir/tex_instr: Rename the indirect source type and add an array size
In particular, we rename nir_tex_src_sampler_index to _sampler_offset and
add a sampler_array_size field to nir_tex_instr.  This way we can pass the
size of sampler arrays through to backends even after removing the variable
information and, with it, the type.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:21 -08:00
Jason Ekstrand
534d145e5e nir: Use a source for uniform buffer indices instead of an index
In GLSL-to-NIR we were just setting the base index to 0 whenever there was
an indirect so having it expressed as a sum makes no sense.  Also, while a
base offset may make sense for the memory location (first element in the
array, etc.) it makes less sense for the actual uniform buffer index.  This
may change later, but it seems to make more sense for now.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:21 -08:00
Jason Ekstrand
6a5604ca6a nir: Constant fold array indirects
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:21 -08:00
Jason Ekstrand
cd4b995254 nir: Make texture instruction names more consistent
This commit renames nir_instr_as_texture to nir_instr_as_tex and renames
nir_instr_type_texture to nir_instr_type_tex to be consistent with
nir_tex_instr.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:21 -08:00
Jason Ekstrand
d6fe35a418 nir: Remove the ffma peephole
This is no longer needed because it's now part of the algebraic
optimization pass

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:21 -08:00
Jason Ekstrand
f77f4c00ce nir: Add a basic constant folding pass
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:20 -08:00
Jason Ekstrand
d5410bd8f6 nir: Add an algebraic optimization pass
This pass uses the previously built algebraic transformations framework and
should act as an example for anyone else wanting to make an algebraic
transformation pass for NIR.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:20 -08:00
Jason Ekstrand
0e145a951e nir: Add infrastructure for generating algebraic transformation passes
This commit builds on the nir_search.h infrastructure by adding a bit of
python code that makes it stupid easy to write an algebraic transformation
pass.  The nir_algebraic.py file contains four python classes that
correspond directly to the datastructures in nir_search.c and allow you to
easily generate the C code to represent them.  Given a list of
search-and-replace operations, it can then generate a function that applies
those transformations to a shader.

The transformations can be specified manually, or they can be specified
using nested tuples.  The nested tuples make a neat little language for
specifying expression trees and search-and-replace operations in a very
readable and easy-to-edit fashion.

The generated code is also fairly efficient.  Instead of blindly calling
nir_replace_instr with every single transformation and on every single
instruction, it uses a switch statement on the instruction opcode to do a
first-order culling and only calls nir_replace_instr if the opcode is known
to match the first opcode in the search expression.
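
The shape of the generated code might resemble the hand-written sketch
below. Everything here is hypothetical: the toy instruction (an
operation on one variable and one immediate), the transform tables, and
sketch_replace_instr() merely stand in for the real search expressions
and nir_replace_instr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_op { SKETCH_OP_MOV, SKETCH_OP_ADD, SKETCH_OP_MUL };

/* Toy instruction: ADD/MUL mean "x op imm"; MOV means "x", or the
 * constant imm when is_const is set. */
struct sketch_instr {
   enum sketch_op op;
   int imm;
   bool is_const;
};

/* One search-and-replace rule: "x <search_op> search_imm" becomes a MOV
 * of either x or the constant replace_imm. */
struct sketch_transform {
   enum sketch_op search_op;
   int search_imm;
   bool replace_is_const;
   int replace_imm;
};

static const struct sketch_transform sketch_add_xforms[] = {
   { SKETCH_OP_ADD, 0, false, 0 },   /* x + 0 -> x */
};

static const struct sketch_transform sketch_mul_xforms[] = {
   { SKETCH_OP_MUL, 1, false, 0 },   /* x * 1 -> x */
   { SKETCH_OP_MUL, 0, true, 0 },    /* x * 0 -> 0 */
};

/* Full match-and-replace for a single rule. */
static bool
sketch_replace_instr(struct sketch_instr *instr,
                     const struct sketch_transform *xform)
{
   if (instr->op != xform->search_op || instr->imm != xform->search_imm)
      return false;

   instr->op = SKETCH_OP_MOV;
   instr->is_const = xform->replace_is_const;
   instr->imm = xform->replace_imm;
   return true;
}

/* Try each rule in turn, stopping after the first replacement. */
static bool
sketch_try_all(struct sketch_instr *instr,
               const struct sketch_transform *xforms, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      if (sketch_replace_instr(instr, &xforms[i]))
         return true;
   }
   return false;
}

/* The opcode switch culls rules that cannot possibly match. */
static bool
sketch_opt_instr(struct sketch_instr *instr)
{
   assert(instr != NULL);

   switch (instr->op) {
   case SKETCH_OP_ADD:
      return sketch_try_all(instr, sketch_add_xforms, 1);
   case SKETCH_OP_MUL:
      return sketch_try_all(instr, sketch_mul_xforms, 2);
   default:
      return false;
   }
}
```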

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:20 -08:00
Jason Ekstrand
0057dfd673 nir: Add an expression matching framework
This framework provides a simple way to do simple search-and-replace
operations on NIR code.  The nir_search.h header provides four simple data
structures for representing expressions:  nir_value and its three subtypes:
nir_variable, nir_constant, and nir_expression.  An expression tree can
then be represented by nesting these data structures as needed.  The
nir_replace_instr function takes an instruction, an expression, and a
value; if the instruction matches the expression, it is replaced with a new
chain of instructions to generate the given replacement value.  The
framework keeps track of swizzles on sources and automatically generates
the correct swizzles for the replacement value.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:20 -08:00
Jason Ekstrand
a94d1c2481 nir/glsl: Emit abs, neg, and sat operations instead of source modifiers
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:20 -08:00
Jason Ekstrand
8edcd1de14 nir: Make the type casting operations static inline functions
Previously, the casting operations were macros.  While this is usually
fine, the casting macro used the input parameter twice leading to strange
behavior when you passed the result of another function into it.  Since we
know the source and destination types explicitly, we don't lose anything
by making it a function.

Also, this gives us a nice little macro for creating cast functions that
will hopefully prevent mistyping.
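
A sketch of what a cast-generating macro of this kind can look like is
given below. SKETCH_DEFINE_CAST and the two toy structs are hypothetical
and are written with offsetof for self-containment; the real macro may
be built on different helpers.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_instr_type { SKETCH_INSTR_ALU, SKETCH_INSTR_TEX };

/* Base "class", embedded in each specific instruction type. */
struct sketch_instr {
   enum sketch_instr_type type;
};

struct sketch_alu_instr {
   int op;
   struct sketch_instr instr;   /* deliberately not the first member */
};

/* Generates a static inline downcast from the embedded base struct to
 * its container.  Being a function, it evaluates its argument once. */
#define SKETCH_DEFINE_CAST(name, in_type, out_type, field)            \
static inline out_type *                                              \
name(in_type *parent)                                                 \
{                                                                     \
   assert(parent != NULL);                                            \
   return (out_type *)((char *)parent - offsetof(out_type, field));   \
}

SKETCH_DEFINE_CAST(sketch_instr_as_alu, struct sketch_instr,
                   struct sketch_alu_instr, instr)

/* A function with a side effect, to show the single evaluation. */
static int sketch_lookup_calls;

static struct sketch_instr *
sketch_lookup(struct sketch_instr *instr)
{
   sketch_lookup_calls++;
   return instr;
}
```

Passing sketch_lookup(...) straight into sketch_instr_as_alu() calls the
lookup exactly once, which a macro that mentions its parameter twice
would not guarantee.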

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:20 -08:00
Jason Ekstrand
919426631b nir: Add a lowering pass for adding source modifiers where possible
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:20 -08:00
Jason Ekstrand
1d83a8eb7a nir: Add neg, abs, and sat opcodes
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:20 -08:00
Jason Ekstrand
a1c259d666 i965/fs_nir: Implement the ARB_gpu_shader5 interpolation intrinsics
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-01-15 07:19:41 -08:00
Jason Ekstrand
e257a51124 i965/fs_nir: Add a has_indirect flag and clean up some of the input/output code
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:03 -08:00
Jason Ekstrand
a3ad7fdf33 nir: Add a helper for getting a constant value from an SSA source
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:03 -08:00
Jason Ekstrand
940ccc45ad nir/glsl: Add support for gpu_shader5 interpolation intrinsics
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:03 -08:00
Jason Ekstrand
45bdcc257e nir: Add gpu_shader5 interpolation intrinsics
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:03 -08:00
Jason Ekstrand
e3fa49c9e6 nir/validate: Validate intrinsic source/destination sizes
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:03 -08:00
Jason Ekstrand
27663dbe8e nir: Vectorize intrinsics
We used to have the number of components built into the intrinsic.  This
meant that all of our load/store intrinsics had vec1, vec2, vec3, and vec4
variants.  This led to piles of switch statements to generate the correct
intrinsic names, and introspection to figure out the number of components.
We can make things much nicer by allowing "vectorized" intrinsics.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:03 -08:00
Jason Ekstrand
d1d12efb36 nir: Remove the old variable lowering code
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:03 -08:00
Jason Ekstrand
faad82b4e7 nir/validate: Ensure that outputs are write-only and inputs are read-only
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:02 -08:00
Jason Ekstrand
26865f858d i965/fs_nir: Use the new variable lowering code
This commit switches us over to the new variable lowering code which is
capable of properly handling lowering indirects as we go.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:02 -08:00
Jason Ekstrand
29e607e5cf nir/glsl: Generate SSA NIR
With this commit, the GLSL IR -> NIR pass generates NIR in more-or-less SSA
form.  It's SSA in the sense that it doesn't have any registers, but it
isn't really useful SSA because it still has a pile of load/store
intrinsics that we will need to get rid of.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:02 -08:00
Jason Ekstrand
6962c332e5 nir: Add a pass to lower global variables to local variables
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:02 -08:00
Jason Ekstrand
619b2e2499 nir: Add a pass for lowering input/output loads/stores
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:02 -08:00
Jason Ekstrand
aff431293b nir: Add a pass to lower local variables to registers
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:02 -08:00
Jason Ekstrand
d477beab07 nir: Add a pass to lower local variable accesses to SSA values
This pass analyzes all of the load/store operations and, when a variable is
never aliased (potentially used by an indirect operation), it is lowered
directly to an SSA value.  This pass translates to SSA directly and does
not require any fixup by the original to-SSA pass.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:02 -08:00
Jason Ekstrand
615ba5ad04 nir: Add a copy splitting pass
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:02 -08:00
Jason Ekstrand
68778d52cd nir: Automatically update SSA if uses
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:02 -08:00
Jason Ekstrand
7c5284d0e5 i965/fs_nir: Don't dump the shader.
This is killing piglit.  I'll leave the logging local.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:02 -08:00
Jason Ekstrand
9318ce8c5a nir/glsl: Don't allocate a state_slots array for 0 state slots
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:02 -08:00
Jason Ekstrand
9d62df3800 nir: Validate that the sources of a phi have the same size as the destination
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:02 -08:00
Jason Ekstrand
24249599b1 nir/copy_propagate: Don't cause size mismatches on phi node sources
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:02 -08:00
Jason Ekstrand
6a52d2af2f nir: Don't require a function in ssa_def_init
Instead, we give SSA definitions a temporary index of 0xFFFFFFFF if the
instruction does not have a block and a proper index when it actually gets
added to the list.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:02 -08:00
Jason Ekstrand
829aa98320 nir: Use an integer index for specifying structure fields
Previously, we used a string name.  It was nice for translating out of GLSL
IR (which also does that) but cumbersome the rest of the time.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:02 -08:00
Jason Ekstrand
4f8230e247 nir: Add a concept of a wildcard array dereference
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:02 -08:00
Jason Ekstrand
b5143edaee nir: Make array deref direct vs. indirect an enum
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:02 -08:00
Jason Ekstrand
8219ff1796 nir: Clean up nir_deref helper functions
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
895eee505c nir/lower_samplers: Use the nir_instr_rewrite_src function
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
cd01de0812 nir: Add a helper for rewriting an instruction source
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
04fb073344 i965/fs_nir: Properly saturate multiplies
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
5690c2b54c nir/from_ssa: Don't lower constant SSA values to registers
Backends want to be able to do special things with constant values such as
put them into immediates or make decisions based on whether or not a value
is constant.  Before, constants always got lowered to a load_const into a
register and then a register use.  Now we leave constants as SSA values so
backends can special-case them if they want.  Since handling constant SSA
values is trivial, this shouldn't be a problem for backends.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
c2abfc0b86 i965/fs_nir: Handle SSA constants
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
e0aa4c6272 i965/fs_nir: Use an array rather than a hash table for register lookup
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
20adc516e2 i965/fs_nir: Add the CSE pass and actually run in a loop
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
6bdce55c44 nir: Add a basic CSE pass
This pass is still fairly basic.  It only handles ALU operations, constant
loads, and phi nodes.  No texture ops or intrinsics yet.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
20a5812606 nir: Add a fused multiply-add peephole 2015-01-15 07:19:01 -08:00
Jason Ekstrand
02ee1d22a1 nir: Validate that the SSA def and register indices are unique
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
c937bdb3c2 i965/fs_nir: Turn on the peephole select optimization
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
13ec15bdbf nir: Add a peephole select optimization
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
ef7ebb908e nir/nir: Patch up phi predecessors in move_successors
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
02eef48343 nir/nir: Use safe iterators when iterating over the CFG
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
c6582e884d glsl/list: Add a foreach_list_typed_safe_reverse macro
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
dc4e660dfa nir/nir: Fix a bug in move_successors
The unlink_blocks function moves successors around to make sure that, if
there is a remaining successor, it is in the first successors slot and not
the second.  To fix this, we simply get both successors up front.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
2bd5a24a5e i965/fs_nir: Validate optimization passes
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
10adf8fc85 nir: Differentiate between signed and unsigned versions of find_msb
We also make the return types match GLSL.  The GLSL spec specifies that
findMSB and findLSB return a signed integer.  Previously, nir had them
return unsigned.  This updates nir's behavior to match what GLSL expects.

We also update the nir-to-fs generator to take the new instructions.  While
we're at it, we fix the case where the input to findMSB is zero.
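
As a rough sketch of the semantics described above (not the NIR or i965 implementation), the two variants can be written in plain C as follows. The function names are illustrative, and the negative-input rule follows the GLSL definition of findMSB for signed integers.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned variant: index of the highest set bit, or -1 when the input
 * is zero.  The result is signed so that -1 can be represented. */
static inline int32_t ufind_msb(uint32_t v)
{
   int32_t bit = -1;
   while (v) {
      v >>= 1;
      bit++;
   }
   return bit;
}

/* Signed variant: for non-negative inputs this matches the unsigned
 * variant; for negative inputs it finds the highest 0 bit.  Both 0 and
 * -1 yield -1. */
static inline int32_t ifind_msb(int32_t v)
{
   uint32_t u = (uint32_t)v;
   return ufind_msb(v < 0 ? ~u : u);
}

void find_msb_selfcheck(void)
{
   assert(ufind_msb(0) == -1);
   assert(ufind_msb(0x80000000u) == 31);
   assert(ifind_msb(-1) == -1);
   assert(ifind_msb(-2) == 0);
}
```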

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
a76ccbfacf nir/print: Don't reindex things
These indices should now be reasonably stable/consistent.  Redoing the
indices in the print functions makes it harder to debug problems.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:01 -08:00
Jason Ekstrand
73522ec83f nir: Validate all lists in the validator
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
8b3dfdce76 glsl/list: Fix the exec_list_validate function
Some time while refactoring things to make it look nicer before pushing to
master, I completely broke the function.  This fixes it to be correct.
Just goes to show you why you shouldn't push code that has no users yet...

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
4285aaecdc i965/fs_nir: Do retyping for ALU sources in get_nir_alu_src
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
943ddb9458 nir: Add a better out-of-SSA pass
This commit rewrites the out-of-SSA pass to not be nearly as naive.  It's
based on "Revisiting Out-of-SSA Translation for Correctness, Code Quality,
and Efficiency" by Boissinot et al.  It should be fairly close to
state of the art.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
4f44120ff5 nir: Add a function for comparing two sources
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
366181d826 nir: Add a parallel copy instruction type
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
7de6b7fc3e nir: Add a function for rewriting all the uses of a SSA def
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
946012f10f nir: Automatically handle SSA uses when an instruction is inserted
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
fbc443ad56 nir: Add an initialization function for SSA definitions
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
f86902e75d nir: Add an SSA-based liveness analysis pass.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
c9a21c725d nir: set reg_alloc and ssa_alloc when indexing registers and SSA values
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
d7e482d32c nir: Add a function to detect if a block is immediately followed by an if
Since we don't actually have an "if" instruction, this is a very common
pattern when iterating over instructions.  This adds a helper function for
it to make things a little less painful.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
dfdf0c4673 nir: Add a foreach_block_reverse function
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
07556442a7 nir/foreach_block: Return false if the callback on the last block fails
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
49911cf4db nir: Add a basic metadata management system
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
ea1eefe13f nir/lower_variables_scalar: Silence a compiler warning
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
63eb32950e i965/fs_nir: Convert the shader to/from SSA
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
9d986d19d0 nir: Add a lower_vec_to_movs pass
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:19:00 -08:00
Jason Ekstrand
2943522d80 nir: Add a naive from-SSA pass
This pass is kind of stupidly implemented but it should be enough to get us
up and going.  We probably want something better that doesn't generate all
of the redundant moves eventually.  However, the i965 backend should be
able to handle the movs, so I'm not too worried about it in the short term.
2015-01-15 07:18:59 -08:00
Jason Ekstrand
ff0a9fcf33 i965/fs_nir: Don't duplicate emit_general_interpolation
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
b1fe8604c6 i965/fs: Don't take an ir_variable for emit_general_interpolation
Previously, emit_general_interpolation took an ir_variable and pulled the
information it needed from that.  This meant that in fs_fp, we were
constructing a dummy ir_variable just to pass into it.  This commit makes
emit_general_interpolation take only the information it needs and gets rid
of the fs_fp cruft.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
b600f1a381 nir: Add intrinsics to do alternate interpolation on inputs
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
4b4f90dbff nir: Add NIR_TRUE and NIR_FALSE constants and use them for boolean immediates
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
744b4e9348 i965/fs_nir: Add atomic counters support
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
6e46c98ec1 nir/lower_atomics: Multiply array offsets by ATOMIC_COUNTER_SIZE
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
95fbd6e1ee i965/fs_nir: Handle coarse/fine derivatives
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
d40b5ca5c5 nir/glsl: Add support for coarse and fine derivatives
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
8c75a7ce59 nir: Add fine and coarse derivative opcodes
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
458a6ce500 nir/glsl: Add support for saturate
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
4582341ea7 i965/fs_nir: Add support for sample_pos and sample_id 2015-01-15 07:18:59 -08:00
Jason Ekstrand
7cd1537aae Fix up varying pull constants
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
4bb81f6d02 Fix what I think are a few NIR typos
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
b092bc9805 i965/fs_nir: Use the correct texture offset immediate
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
c181ff268e i965/fs_nir: Use the correct types for texture inputs
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
c2ded36bb6 i965/fs_nir: Make the sampler register always unsigned
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
ae2880d131 i965/fs: Only use nir for 8-wide non-fast-clear shaders.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Connor Abbott
2faf7f87d6 i965/fs: add a NIR frontend
This is similar to the GLSL IR frontend, except consuming NIR. This lets
us test NIR as part of an actual compiler.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   Make brw_fs_nir build again
   Only use NIR if INTEL_USE_NIR is set
   whitespace fixes
2015-01-15 07:18:59 -08:00
Connor Abbott
9afc566e2d i965/fs: Don't pass through the coordinate type
All we really need is the number of components.
2015-01-15 07:18:58 -08:00
Connor Abbott
616a48ebc6 i965/fs: make emit_fragcoord_interpolation() not take an ir_variable 2015-01-15 07:18:58 -08:00
Connor Abbott
7602385ac5 nir: add an SSA-based dead code elimination pass
v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   whitespace fixes
2015-01-15 07:18:58 -08:00
Connor Abbott
8b7cb7674c nir: add an SSA-based copy propagation pass 2015-01-15 07:18:58 -08:00
Connor Abbott
4553887d4a nir: add a pass to convert to SSA
v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   whitespace fixes
2015-01-15 07:18:58 -08:00
Connor Abbott
b559ee709b nir: calculate dominance information 2015-01-15 07:18:58 -08:00
Connor Abbott
cff1deff72 nir: add an optimization to turn global registers into local registers
After linking and inlining, this allows us to convert these registers
into SSA values and optimise more code.
2015-01-15 07:18:58 -08:00
Connor Abbott
613bf6818a nir: add a pass to lower atomics
v2: Jason Ekstrand <jason.ekstrand@intel.com>
   whitespace fixes
2015-01-15 07:18:58 -08:00
Connor Abbott
8692c6a023 nir: add a pass to lower system value reads
v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   whitespace fixes
2015-01-15 07:18:58 -08:00
Connor Abbott
8cdcfce5ce nir: add a pass to lower sampler instructions 2015-01-15 07:18:58 -08:00
Connor Abbott
370e875b32 nir: add a pass to remove unused variables
After we lower variables, we want to delete them in order to free up
some memory.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
    whitespace fixes
2015-01-15 07:18:58 -08:00
Connor Abbott
494790b2a9 nir: keep track of the number of input, output, and uniform slots 2015-01-15 07:18:58 -08:00
Connor Abbott
c2f36cf125 nir: add a pass to lower variables for scalar backends 2015-01-15 07:18:58 -08:00
Connor Abbott
7f0daaa5e7 nir: add a glsl-to-nir pass
v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   Make glsl_to_nir build again
   fix whitespace
2015-01-15 07:18:58 -08:00
Connor Abbott
dbb76421da nir: add a validation pass
This is similar to ir_validate.cpp.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   whitespace fixes
2015-01-15 07:18:58 -08:00
Connor Abbott
98fa28bff7 nir: add a printer
This is similar to ir_print_visitor.cpp.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   whitespace fixes
2015-01-15 07:18:58 -08:00
Jason Ekstrand
9b1139649d SQUASH: Fix comments from eric
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 07:18:58 -08:00
Jason Ekstrand
8b4c860580 SQUASH: Add an assert 2015-01-15 07:18:58 -08:00
Connor Abbott
2812e5de93 nir: add core helper functions
These include functions for adding and removing various bits of IR and
helpers for iterating over all the sources and destinations of an
instruction. This is similar to ir.cpp.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   whitespace and automake fixes
2015-01-15 07:18:58 -08:00
Jason Ekstrand
f521a3c543 SQUASH: Use the enum for the variable mode 2015-01-15 07:18:57 -08:00
Connor Abbott
30c4678f64 nir: add the core datastructures
This includes all the instructions, ifs, loops, functions, etc. This is
similar to the information in ir.h.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   Include ralloc and hash_table from the util directory
   whitespace fixes

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: glenn.kennard <glenn.kennard@gmail.com>
2015-01-15 07:18:57 -08:00
Connor Abbott
b5ca34a211 nir: add a simple C wrapper around glsl_types.h
v2: Jason Ekstrand <jason.ekstrand@intel.com>:
    whitespace and automake fixes

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 07:18:57 -08:00
Connor Abbott
77e7a00267 nir: add initial README
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 07:18:57 -08:00
Connor Abbott
ab2ae63854 exec_list: add a list_foreach_typed_reverse() macro
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 07:18:57 -08:00
Eric Anholt
84ef2d4156 vc4: Add some dumping for STORE_TILE_BUFFER_GENERAL. 2015-01-15 22:21:29 +13:00
Eric Anholt
1b241c59e8 vc4: Add dumping for the TILE_RENDERING_MODE_CONFIG packet.
I wanted to read it, so I wrote parsing.
2015-01-15 22:19:25 +13:00
Eric Anholt
d0d6d24723 vc4: Fix CL dumping trying to dump too far.
Execution will end at the cl->next, because that's what ct0ea/ct1ea get
programmed to.
2015-01-15 22:19:25 +13:00
Eric Anholt
0471f72755 vc4: Fix texture type masking.
Everything from ETC1 to RGBA64 was getting its top bit dropped, but we
didn't use any of those formats.
2015-01-15 22:19:25 +13:00
Eric Anholt
6313a2c8f0 vc4: Colormask should apply after all other fragment ops (like logic op).
Theoretically it should apply after dithering as well, but dithering for
565 happens in fixed function in the TLB store.
2015-01-15 22:19:25 +13:00
Eric Anholt
0289a26201 vc4: No turning unpack arguments into small immediates.
Since unpack only happens on things read from the A register file, we have
to leave them as something that can be allocated to A (temp or uniform).
2015-01-15 22:19:25 +13:00
Eric Anholt
772c47aefe vc4: Move the tests for src needing to be an A register to vc4_qir.c.
I want it from another location.
2015-01-15 22:19:25 +13:00
Eric Anholt
8f2fb68026 vc4: Don't swap the raddr on instructions doing unpacks.
It would mean different unpacking behavior, since only the A file does
unpack (with PM==0).
2015-01-15 22:19:25 +13:00
Eric Anholt
5d5707707f vc4: Don't let pairing happen with badly mismatched unpack flags.
No difference on shader-db, but prevents definite regressions in the
blending changes.
2015-01-15 22:19:25 +13:00
Eric Anholt
3820866e40 vc4: Don't let pairing happen with badly mismatched pack flags.
No difference on shader-db, but will become more important as I introduce
more use of pack flags with the blending changes.
2015-01-15 22:19:25 +13:00
Eric Anholt
d1f2fc834d vc4: Fix early Z behavior on hardware.
It turns out the simulator was not treating this bit the same as the RPi,
and I'd forgotten to remove it when turning on early Z.  The result was
that you'd get big chunks of your rendering missing.
2015-01-15 22:19:25 +13:00
Michel Dänzer
82b7ee62fc Revert "radeonsi: only set BC_OPTIMIZE_DISABLE when necessary"
This reverts commit 0543630d0b.

It caused flickering artifacts in Steam games such as Team Fortress 2 or
Left 4 Dead 2.

We could probably only enable this optimization by also making sure the
shader code only uses either SI_PARAM_LINEAR_CENTROID or
SI_PARAM_LINEAR_CENTER, not both. This would probably require a shader
variant.

Sorry I didn't remember this when reviewing the reverted change.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-01-15 15:09:48 +09:00
Michel Dänzer
a6a75f1286 st/clover: Adapt to TargetLibraryInfo.h move in LLVM SVN r226078
Trivial.
2015-01-15 12:57:05 +09:00
Ian Romanick
0a0d2c9443 mesa: Micro-optimize _mesa_is_valid_prim_mode
You would not believe the mess GCC 4.8.3 generated for the old
switch-statement.

On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:

32-bit: Difference at 95.0% confidence -0.37374% +/- 0.184057% (n=40)
64-bit: Difference at 95.0% confidence 0.966722% +/- 0.338442% (n=40)

The regression on 32-bit is odd.  Callgrind says the caller,
_mesa_is_valid_prim_mode, is faster.  Before it says 2,293,760
cycles, and after it says 917,504.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-14 17:09:50 -08:00
Ian Romanick
ead200d156 mesa: Check for vertex program the same way in desktop GL and ES
On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Multithread:

32-bit: Difference at 95.0% confidence 0.416027% +/- 0.163529% (n=40)
64-bit: Difference at 95.0% confidence 0.494771% +/- 0.259985% (n=40)

Gl32Batch7 had no difference proven at 95.0% confidence (n=120) on
32-bit or 64-bit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-14 17:09:50 -08:00
Ian Romanick
d5f936367f mesa: Drop index buffer bounds check
The previous check was insufficient (as it did not take 'indices' into
consideration), and DX10 hardware does not need this check anyway.

Since index_bytes is no longer used, remove it.

On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:

32-bit: Difference at 95.0% confidence 1.66929% +/- 0.230107% (n=40)
64-bit: Difference at 95.0% confidence -1.40848% +/- 0.288038% (n=40)

The regression on 64-bit is odd.  Callgrind says the caller,
validate_DrawElements_common, is faster.  Before it says 10,321,920
cycles, and after it says 8,945,664.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-14 17:09:50 -08:00
Ian Romanick
a4aeb534ea mesa: Only check for a current vertex shader in core profile
This doesn't affect performance, but it feels more correct.

On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:

32-bit: No difference proven at 95.0% confidence (n=120)
64-bit: No difference proven at 95.0% confidence (n=120)

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-14 17:09:50 -08:00
Ian Romanick
d6c6b186cf mesa: Only validate shaders that can exist in the context
On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:

32-bit: Difference at 95.0% confidence 0.495267% +/- 0.202063% (n=40)
64-bit: Difference at 95.0% confidence 3.57576% +/- 0.288175% (n=40)

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-14 17:09:50 -08:00
Ian Romanick
14aadbe827 i965: Store the atoms directly in the context
Instead of having an extra pointer indirection in one of the hottest
loops in the driver.

On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:

32-bit: Difference at 95.0% confidence 1.98515% +/- 0.20814% (n=40)
64-bit: Difference at 95.0% confidence 1.5163% +/- 0.811016% (n=60)

v2 (Ken): Cut size of array from 64 to 57 to save memory.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-14 17:01:27 -08:00
Ian Romanick
6ed53c27ef i965: Micro-optimize brw_get_index_type
With the switch-statement, GCC 4.8.3 produces a small pile of code with
a branch.

00000000 <brw_get_index_type>:
  000000:       8b 54 24 04             mov    0x4(%esp),%edx
  000004:       b8 01 00 00 00          mov    $0x1,%eax
  000009:       81 fa 03 14 00 00       cmp    $0x1403,%edx
  00000f:       74 0d                   je     00001e <brw_get_index_type+0x1e>
  000011:       31 c0                   xor    %eax,%eax
  000013:       81 fa 05 14 00 00       cmp    $0x1405,%edx
  000019:       0f 94 c0                sete   %al
  00001c:       01 c0                   add    %eax,%eax
  00001e:       c3                      ret

However, this could be two instructions.

00000000 <brw_get_index_type>:
  000000:       2d 01 14 00 00          sub    $0x1401,%eax
  000005:       d1 e8                   shr    %eax
  000007:       90                      nop
  000008:       90                      nop
  000009:       90                      nop
  00000a:       90                      nop
  00000b:       c3                      ret

The function was also moved to the header so that it could be inlined at
the two call sites.  Without this, 32-bit also needs to pull the
parameter from the stack.  This means there is a push, a call, a move,
and a ret added to a two instruction function.  The above code shows the
function with __attribute__((regparm(1))), but even this adds several
extra instructions.  There is also an extra instruction on 64-bit to
move the parameter to %eax for the subtract.
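
For readers who want the C-level shape of the two listings above, here is a hedged sketch rather than the driver's actual source. The enum values are the standard GL constants for GL_UNSIGNED_BYTE, GL_UNSIGNED_SHORT and GL_UNSIGNED_INT; the function names and the 0/1/2 result encoding are assumptions read off the disassembly.

```c
#include <assert.h>

/* Numeric values of the GL index type enums. */
enum {
   TYPE_UNSIGNED_BYTE  = 0x1401,  /* GL_UNSIGNED_BYTE  */
   TYPE_UNSIGNED_SHORT = 0x1403,  /* GL_UNSIGNED_SHORT */
   TYPE_UNSIGNED_INT   = 0x1405   /* GL_UNSIGNED_INT   */
};

/* Switch-based version, matching the compare-and-branch listing:
 * short maps to 1, int maps to 2, anything else maps to 0. */
static inline unsigned index_type_switch(unsigned type)
{
   switch (type) {
   case TYPE_UNSIGNED_SHORT: return 1;
   case TYPE_UNSIGNED_INT:   return 2;
   default:                  return 0;
   }
}

/* Arithmetic version, matching the sub/shr listing.  The three valid
 * enums are spaced two apart, so subtracting the base and halving
 * gives 0, 1, 2 with no branch. */
static inline unsigned index_type_arith(unsigned type)
{
   assert(type == TYPE_UNSIGNED_BYTE ||
          type == TYPE_UNSIGNED_SHORT ||
          type == TYPE_UNSIGNED_INT);
   return (type - TYPE_UNSIGNED_BYTE) >> 1;
}
```

The arithmetic form is only valid for the three enums checked by the assert; the switch form is the one that tolerates other inputs.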

On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:

32-bit: Difference at 95.0% confidence 0.818589% +/- 0.234661% (n=40)
64-bit: Difference at 95.0% confidence 0.54554% +/- 0.354092% (n=40)

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-14 16:56:47 -08:00
Ian Romanick
3f1f1d0df4 meta: Put _mesa_meta_in_progress in the header file
...so that it can be inlined in the two places that call it.

On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:

32-bit: No difference proven at 95.0% confidence (n=120)
64-bit: Difference at 95.0% confidence 1.24042% +/- 0.382277% (n=40)

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-14 16:55:53 -08:00
Kenneth Graunke
3167a80bb1 i965: Fix "vertex" vs. "geometry" and "VS" vs. "GS" in debug output.
We were happily printing "Native code for unnamed vertex shader" and
"VS vec4" program for geometry shaders in our INTEL_DEBUG=gs output,
as well as the KHR_debug output used by shader-db.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-14 16:55:43 -08:00
Kenneth Graunke
68ed14d6ad i965: Pass a shader stage abbreviation to fs_generator().
A lot of messages hardcoded the string "FS", which is confusing on
Broadwell, where we use this code for VS support as well.

shader-db particularly got confused, as it reported two "FS SIMD8"
shaders, and no vertex shaders at all.  Craziness ensued.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-14 16:55:38 -08:00
Samuel Iglesias Gonsalvez
efef6c8280 configure: add check for GNU indent
Only GNU indent is supported when indenting autogenerated format_pack.c
and format_unpack.c files. Some non-GNU indent (Mac OS X and FreeBSD)
add extra whitespaces that break the build of those files.

Fallback to 'cat' if a non-GNU indent is found.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=88335

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-14 12:52:22 +01:00
Samuel Iglesias Gonsalvez
6d43a4c338 configure: change required Python Mako version to 0.3.4
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-01-14 12:52:22 +01:00
Iago Toral Quiroga
c6a2628950 mesa: rename RGBA8888_* format constants to something appropriate.
The 8888 suggests 8-bit components which is not correct, so
replace that with the actual size of the components in each
format.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-14 07:57:31 +01:00
Jason Ekstrand
ae417957e0 i965/miptree_map_blit: Don't do the initial copy if INVALIDATE_RANGE is set
Before we were always copying from the buffer being mapped into the
temporary buffer.  However, if INVALIDATE_RANGE is set, then we know that
the data is going to be junk after we unmap so there's no point in doing
the blit.  This is important because doing the blit will cause a stall 3
lines later when we map the buffer.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-13 22:06:51 -08:00
Tapani Pälli
f52fe39d31 mesa/glsl/glapi: enable GL_EXT_draw_buffers extension
Patch enables ES2 extension that utilizes existing ES3 functionality.

Changes make all the subtests to run and pass in WebGL conformance
test 'webgl-draw-buffers' when running Chrome on OpenGL ES, also
Piglit test 'draw_buffers_gles2' passes.

v2: remove unused boolean (Ilia Mirkin)
v3: proper error checking for invalid values (Chad Versace)
v4: run error check explicitly for ES2 and ES3 (Kenneth Graunke)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-01-14 07:48:51 +02:00
Jason Ekstrand
3a5c7e47fd i965/fs: Allow constant propagation between different types
This will be needed for NIR because it is typeless and treats all constants
as uint32 values and reinterprets them when they are used later.  This
commit allows those values to be properly propagated.

Also, this helps some synmark shaders because it allows us to copy
propagate a 0x00000000UD into a 0.0F in a load_payload, which then lets us
combine 4 load_payloads.
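
As a rough illustration of why a typeless constant can be propagated across
types, the sketch below reinterprets a 32-bit unsigned bit pattern as a float.
It is a Python stand-in, not the i965 backend; `uint32_bits_as_float` is a
hypothetical helper name.

```python
import struct


def uint32_bits_as_float(bits: int) -> float:
    """Reinterpret a 32-bit unsigned integer's bit pattern as an IEEE-754 float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

The all-zero pattern reads back as 0.0, which is why an unsigned zero constant
can stand in for a float zero without any conversion.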

instructions in affected programs:     2288 -> 2144 (-6.29%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-13 13:24:52 -08:00
Chad Versace
610c7486c2 egl/wayland: Fix unused variable warnings
Remove ctx variables unused as of 70e8ccc459.
2015-01-13 11:33:23 -08:00
Mike Mason
90d2a85193 mesa: Enable GL_RGB/GL_RGBA in GLES3 glGetInternalformativ
Reverts the changes from commit 7894278 and moves the fix to
_mesa_GetInternalformativ(). The original commit enabled the GL_RGB and
GL_RGBA unsized internal formats as valid for render buffers in GLES3, but
this is incorrect. They should have only been enabled for
GetInternalformativ().

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88079
Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-13 11:23:46 -08:00
Rob Clark
876550ff97 freedreno/ir3: handle "holes" in inputs
If, for example, only the x/y/w components of in.xyzw are actually used,
we still need to have a group of four registers and assign all four
components.  The hardware can't write in.xy and in.w to discontiguous
registers.  To handle this, pad with a dummy NOP instruction, to keep
the neighbor chain contiguous.
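
The padding idea can be sketched like this. It is a simplified Python model
under the assumption that each input is a group of four slots named x, y, z
and w; `pad_input_group` and the "nop" placeholder string are hypothetical and
do not reflect the ir3 data structures.

```python
def pad_input_group(used: set) -> list:
    """Return the four register slots for an xyzw input, in order.

    Components that are never read get a "nop" placeholder so that the
    group still occupies four contiguous slots.
    """
    return [component if component in used else "nop" for component in "xyzw"]
```

With only x, y and w in use, the z slot is filled by the placeholder, so w
still lands in the fourth slot.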

This fixes a problem noticed with firefox OMTC.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-01-13 08:17:18 -05:00
Iago Toral Quiroga
b6819cd554 mesa: Fix error reporting for some cases of incomplete FBO attachments
According to the OpenGL and OpenGL ES specs (sections
"FRAMEBUFFER COMPLETENESS" and "Whole Framebuffer Completeness"),
the image for color, depth or stencil attachments must be renderable,
otherwise the attachment is considered incomplete and we should report
GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT. Currently, we detect this
situation properly but report a different error.

This fixes the following 3 dEQP tests:
dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb_unsigned_int_2_10_10_10_rev
dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgba_unsigned_int_2_10_10_10_rev
dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb16f

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-13 12:19:32 +01:00
Eduardo Lima Mitev
038894c7cb mesa: Returns a GL_INVALID_VALUE error if num of texs in glDeleteTextures is negative
Per GLES3 manual for glDeleteTextures
<https://www.khronos.org/opengles/sdk/docs/man3/html/glDeleteTextures.xhtml>,
GL_INVALID_VALUE is generated if n is negative.
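
A minimal sketch of the check, assuming a hypothetical `check_delete_count`
helper that returns the error to record. The error code values are the ones
from the GL headers; everything else is illustrative.

```python
# Error codes as defined in the OpenGL headers.
GL_NO_ERROR = 0
GL_INVALID_VALUE = 0x0501


def check_delete_count(n: int) -> int:
    """Hypothetical validation step: return the GL error that a delete call
    should record for its count argument n."""
    return GL_INVALID_VALUE if n < 0 else GL_NO_ERROR
```

A count of zero is valid and simply deletes nothing.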

Fixes 1 dEQP test:
* dEQP-GLES3.functional.negative_api.texture.deletetextures

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-13 12:19:32 +01:00
Eduardo Lima Mitev
2012f62d4a mesa: Returns a GL_INVALID_VALUE error if num of fbos in glDeleteRenderbuffers is negative
Per GLES3 manual for glDeleteRenderbuffers
<https://www.khronos.org/opengles/sdk/docs/man3/html/glDeleteRenderbuffers.xhtml>,
GL_INVALID_VALUE is generated if n is negative.

Fixes 1 dEQP test:
* dEQP-GLES3.functional.negative_api.buffer.delete_renderbuffers

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-13 12:19:32 +01:00
Eduardo Lima Mitev
f77a473497 mesa: Returns a GL_INVALID_VALUE error if num of fbos in glDeleteFramebuffers is negative
Per GLES3 manual for glDeleteFramebuffers
<https://www.khronos.org/opengles/sdk/docs/man3/html/glDeleteFramebuffers.xhtml>,
GL_INVALID_VALUE is generated if n is negative.

Fixes 1 dEQP test:
* dEQP-GLES3.functional.negative_api.buffer.delete_framebuffers

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-13 12:19:32 +01:00
Eduardo Lima Mitev
f408c333e2 mesa: Allows querying GL_SAMPLER_BINDING on GLES3 profile
From the GLES3 specification (page 123): "The currently bound sampler may be
queried by calling GetIntegerv with pname set to SAMPLER_BINDING".

Fixes 4 dEQP tests:
* dEQP-GLES3.functional.state_query.integers.sampler_binding_getboolean
* dEQP-GLES3.functional.state_query.integers.sampler_binding_getinteger
* dEQP-GLES3.functional.state_query.integers.sampler_binding_getinteger64
* dEQP-GLES3.functional.state_query.integers.sampler_binding_getfloat

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-13 12:19:32 +01:00
Samuel Iglesias Gonsalvez
719e3f016e main: round floating-point value to nearest integer in glGetSamplerParameteriv()
Previously, a cast was done to convert from float to int but there
were rounding errors.

The spec specifies in the Data Conversion chapter that floating-point values
are rounded to the nearest integer.
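
The difference between the two conversions can be sketched as follows. This
is a Python approximation of the C behaviour; both function names are
hypothetical, and rounding halves away from zero is an assumption about the
rounding helper, not something the commit message states.

```python
import math


def cast_to_int(value: float) -> int:
    """Mimic a C-style (int) cast, which truncates toward zero."""
    return math.trunc(value)


def round_to_nearest(value: float) -> int:
    """Round to the nearest integer, with halves rounded away from zero."""
    if value >= 0.0:
        return math.floor(value + 0.5)
    return math.ceil(value - 0.5)
```

For a value such as 0.7 the cast yields 0 while rounding yields 1, which is
the kind of mismatch the state query tests detect.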

This patch fixes the following 2 dEQP tests:

dEQP-GLES3.functional.state_query.sampler.sampler_texture_min_lod_getsamplerparameteri
dEQP-GLES3.functional.state_query.sampler.sampler_texture_max_lod_getsamplerparameteri

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-13 12:19:32 +01:00
Samuel Iglesias Gonsalvez
d8d59202af main: round floating-point value to nearest integer in glGetTexParameteriv()
Previously, a cast was done to convert from float to int but there
were rounding errors.

The spec specifies in the Data Conversion chapter that floating-point values
are rounded to the nearest integer.

This patch fixes the following 8 dEQP tests:

dEQP-GLES3.functional.state_query.texture.texture_2d_texture_min_lod_gettexparameteri
dEQP-GLES3.functional.state_query.texture.texture_2d_texture_max_lod_gettexparameteri
dEQP-GLES3.functional.state_query.texture.texture_3d_texture_min_lod_gettexparameteri
dEQP-GLES3.functional.state_query.texture.texture_3d_texture_max_lod_gettexparameteri
dEQP-GLES3.functional.state_query.texture.texture_2d_array_texture_min_lod_gettexparameteri
dEQP-GLES3.functional.state_query.texture.texture_2d_array_texture_max_lod_gettexparameteri
dEQP-GLES3.functional.state_query.texture.texture_cube_map_texture_min_lod_gettexparameteri
dEQP-GLES3.functional.state_query.texture.texture_cube_map_texture_max_lod_gettexparameteri

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-13 12:19:32 +01:00
Samuel Iglesias Gonsalvez
8e49a3e028 main: fix return GL_FRAMEBUFFER_ATTACHMENT_TEXTURE_LEVEL value
Return the proper value for two-dimensional array textures and three-dimensional
textures.

From OpenGL ES 3.0 spec, chapter 6.1.13 "Framebuffer Object Queries",
page 234:

"If pname is FRAMEBUFFER_ATTACHMENT_TEXTURE_LAYER and the texture
object named FRAMEBUFFER_ATTACHMENT_OBJECT_NAME is a layer of a
three-dimensional texture or a two-dimensional array texture, then params
will contain the number of the texture layer which contains the attached
image. Otherwise params will contain the value zero."

Furthermore, FRAMEBUFFER_ATTACHMENT_TEXTURE_LAYER is an alias of
FRAMEBUFFER_ATTACHMENT_TEXTURE_3D_ZOFFSET_EXT.

This patch fixes dEQP test:

dEQP-GLES3.functional.state_query.fbo.framebuffer_attachment_texture_layer

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-13 12:19:32 +01:00
Iago Toral Quiroga
c260d61e76 i965: Fix bitcast operations with negate (ceil)
Commit 0ae9ca12a8 put source modifiers out of the bitcast operations
by adding a MOV operation that would handle them separately. It missed
the case of ceil though: the implementation negates both its source and
destination operands. The source operand will be used for RNDD, which
we can handle normally, but we need to fix the modifier for the
negated result.
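
The identity behind this lowering is ceil(x) = -floor(-x): negate the source,
round down, then negate the result. A small Python check of that identity,
with `ceil_via_floor` as a hypothetical name:

```python
import math


def ceil_via_floor(x: float) -> int:
    """Compute ceil(x) as -floor(-x), the negate/round-down/negate sequence."""
    return -math.floor(-x)
```

The outer negation is the modifier on the result that needs separate handling.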

v2:
  - RNDD can handle the source modifier so no need to put that one
    in a separate MOV.

Fixes the following 42 dEQP tests:
dEQP-GLES3.functional.shaders.builtin_functions.common.ceil.*_vertex
dEQP-GLES3.functional.shaders.builtin_functions.common.ceil.*_fragment
dEQP-GLES3.functional.shaders.builtin_functions.precision.ceil._*vertex.*
dEQP-GLES3.functional.shaders.builtin_functions.precision.ceil._*fragment.*

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-13 12:19:32 +01:00
Iago Toral Quiroga
d42e090386 mesa: Depth and stencil attachments must be the same in OpenGL ES3
"9.4. FRAMEBUFFER COMPLETENESS
 ...
 Depth and stencil attachments, if present, are the same image."

Notice that this restriction is not included in the OpenGL ES2 spec.

Fixes 18 dEQP tests in:
dEQP-GLES3.functional.fbo.completeness.attachment_combinations.*

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-13 12:19:32 +01:00
Eduardo Lima Mitev
b8b1d83c71 mesa: Initializes the stencil value masks to 0xFF instead of ~0u
'4.1.4 Stencil Test' section of the GL-ES 3.0 specification says:

    "In the initial state, [...] the front and back stencil mask are both set
    to the value 2^s − 1, where s is greater than or equal to the number of
    bits in the deepest stencil buffer* supported by the GL implementation."

Since the maximum supported precision for stencil buffers is 8 bits, mask
values should be initialized to 2^8 - 1 = 0xFF.

Currently, these masks are initialized to max unsigned integer (~0u), because
in OpenGL 3.0 and before, the initial mask values were:

    "In the initial state, stenciling is disabled, the front and back
    stencil reference value are both zero, the front and back stencil
    comparison functions are both ALWAYS, and the front and back
    stencil mask are both all ones."

The problem is that it causes the mask values to overflow to -1 when converted
to signed integer by glGet* APIs.
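
The arithmetic can be checked with a short sketch. It assumes a 32-bit
unsigned int for ~0u and a 32-bit signed result from the query; the function
and constant names are hypothetical.

```python
ALL_ONES = 0xFFFFFFFF  # what ~0u evaluates to for a 32-bit unsigned int


def to_signed32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def initial_stencil_mask(stencil_bits: int) -> int:
    """Initial mask 2^s - 1 for a stencil buffer with s bits."""
    return (1 << stencil_bits) - 1
```

With an 8-bit stencil buffer the mask is 0xFF, which stays 255 after the
signed conversion, whereas the all-ones value becomes -1.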

Fixes 6 dEQP failing tests:
* dEQP-GLES3.functional.state_query.integers.stencil_value_mask_getfloat
* dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_getfloat
* dEQP-GLES3.functional.state_query.integers.stencil_value_mask_separate_getfloat
* dEQP-GLES3.functional.state_query.integers.stencil_value_mask_separate_both_getfloat
* dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_separate_getfloat
* dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_separate_both_getfloat

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-13 12:19:32 +01:00
Eduardo Lima Mitev
aa727c1dd9 i965: Sets missing vertex shader constant values for HighInt format
The range's min and max, and the precision value are not set correctly for the
vertex shader constants.

Fixes 1 dEQP test: dEQP-GLES3.functional.state_query.shader.precision_vertex_highp_int

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-13 12:19:32 +01:00
Marek Olšák
bed6f20f28 r600g: fix build failure when building the driver without LLVM
2015-01-12 23:20:26 +01:00
Laura Ekstrand
0e6f0eea1a main: Remove comparison unsigned int >= 0.
Fixes "macro compares unsigned to 0 (NO_EFFECT)" found by Coverity Scan.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-12 10:23:17 -08:00
Juha-Pekka Heikkila
c503ce1044 mesa/main: In _mesa_CompressedTextureSubImage3D() check found texObj
Check returned texObj is not null. If texObj is null there is already
GL_INVALID_OPERATION error set.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
2015-01-12 09:56:43 -08:00
José Fonseca
457d40e9e8 mesa: Move declarations to top of block.
To fix MSVC build.

Trivial.
2015-01-12 12:40:01 +00:00
Samuel Iglesias Gonsalvez
c471b09bf4 mesa: restrict use of GL_ABGR_EXT format to allowed data types
The GL_UNSIGNED_SHORT_5_5_5_1, GL_UNSIGNED_SHORT_1_5_5_5_REV,
GL_UNSIGNED_INT_10_10_10_2 and GL_UNSIGNED_INT_2_10_10_10_REV data types
are not explicitly allowed to work with the GL_ABGR_EXT format in either
the GL or the GL_EXT_abgr specs.

Removed the corresponding mesa formats as there are no other functions
using them inside Mesa anymore.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:30 +01:00
Iago Toral Quiroga
769de5165c mesa: Remove _mesa_rebase_rgba_uint and _mesa_rebase_rgba_float
These are no longer used anywhere now that we have _mesa_format_convert.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:30 +01:00
Samuel Iglesias Gonsalvez
8993b9818c mesa: Remove _mesa_pack_int_rgba_row() and auxiliary functions
These are no longer used.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:30 +01:00
Iago Toral Quiroga
d28d9376e2 mesa: Remove _mesa_(un)pack_index_span
These are not used anywhere.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Iago Toral Quiroga
3a4de32144 mesa: Remove _mesa_pack_rgba_span_float and tmp_pack.h
_mesa_pack_rgba_span_float was the last of the color span functions
and we have replaced all calls to it with calls to _mesa_format_convert,
so we can remove it together with tmp_pack.h which was used to
generate the pack functions for multiple types that were used from
the various color span functions that have been removed.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Iago Toral Quiroga
873437e209 mesa: Remove _mesa_unpack_color_span_float
And various helper functions that went unused after removing it.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Iago Toral Quiroga
3ba92bac76 mesa: Remove (signed) integer pack and span functions.
These are no longer used now that we moved to _mesa_format_convert.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Iago Toral Quiroga
2280fdeb61 mesa: Remove _mesa_unpack_color_span_ubyte
This is no longer used anywhere after moving to _mesa_format_convert.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Iago Toral Quiroga
c540800aa5 mesa: Remove _mesa_make_temp_float_image
Now that we have _mesa_format_convert we don't need this.

This was only used to create temporary RGBA float images in the process
of storing some compressed formats. These can call _mesa_texstore
with a RGBA/float dst to achieve the same goal.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Iago Toral Quiroga
4468386a3c mesa: Remove _mesa_make_temp_ubyte_image
Now that we have _mesa_format_convert we don't need this.

texstore_rgba will use the GL_COLOR_INDEX to RGBA conversion
helpers instead and compressed formats that used
_mesa_make_temp_ubyte_image to create an ubyte RGBA temporary
image can call _mesa_texstore with a RGBA/ubyte dst to
achieve the same goal.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Iago Toral Quiroga
43a76a9e44 mesa: Remove _mesa_unpack_color_span_uint
This is no longer used.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Eduardo Lima Mitev
87c595c17b mesa: Replace _mesa_unpack_bitmap with _mesa_unpack_image()
_mesa_unpack_bitmap() was introduced by commit 02b801c to handle the case
when data is stored in PBO by display lists, in the context of this bug:

Incorrect pixels read back if draw bitmap texture through Display list
https://bugs.freedesktop.org/show_bug.cgi?id=10370

Since _mesa_unpack_image() already handles the case of GL_BITMAP, this patch
removes _mesa_unpack_bitmap() and makes affected calls go through
_mesa_unpack_image() instead.

The sample test attached to the original bug report passes with this change
and there are no piglit regressions.

Signed-off-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Iago Toral Quiroga
ea79ab3e8c mesa: Let _mesa_swizzle_and_convert take array format types instead of GL types
In the future we would like to have a format conversion library that is
independent of GL so we can share it with Gallium. This is a step in that
direction.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Iago Toral Quiroga
a55f67fcb0 st/mesa: Use _mesa_format_convert to implement st_GetTexImage.
Instead of using _mesa_pack_rgba_span_float. This should allow us to remove
that function in a later patch.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Iago Toral Quiroga
84eb402c01 swrast: Use _mesa_format_convert to implement draw_rgba_pixels.
This is the only place that uses _mesa_unpack_color_span_float so after
this we should be able to remove that function.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Iago Toral Quiroga
a629f0612d mesa: Use _mesa_format_convert to implement get_tex_rgba_compressed.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Iago Toral Quiroga
77bd2b288f mesa: use _mesa_format_convert to implement get_tex_rgba_uncompressed.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Iago Toral Quiroga
5038d839b8 mesa: use _mesa_format_convert to implement glReadPixels.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Iago Toral Quiroga
8ec6534b26 mesa: Use _mesa_format_convert to implement texstore_rgba.
Notice that _mesa_format_convert does not handle byte-swapping scenarios,
GL_COLOR_INDEX or MESA_FORMAT_YCBCR(_REV), so these must be handled
separately.

Also, remove all the code that goes unused after using _mesa_format_convert.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Iago Toral Quiroga
2ec8718dae mesa: Add helpers to extract GL_COLOR_INDEX to RGBA float/ubyte
We only use _mesa_make_temp_ubyte_image in texstore.c to convert
GL_COLOR_INDEX to RGBA, but this helper does more stuff than this.
All uses of this helper can be replaced with calls to
_mesa_format_convert except for this GL_COLOR_INDEX conversion.

This patch extracts the GL_COLOR_INDEX to RGBA logic to a separate
helper so we can use that instead from texstore.c.

In future patches we will replace all remaining calls to
_mesa_make_temp_ubyte_image in the repository (related to compressed
formats) with calls to _mesa_format_convert so we can remove
_mesa_make_temp_ubyte_image and related functions.

v2:
- Remove ‘for’ loop initial declarations. They are only allowed in C99 or C11
mode.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Iago Toral Quiroga
d71a1adff2 mesa: Add RGBA to Luminance conversion helpers
For glReadPixels with a Luminance destination format we compute luminance
values from RGBA as L=R+G+B. This, however, requires ad-hoc implementation,
since pack/unpack functions or _mesa_swizzle_and_convert won't do this
(and thus, neither will _mesa_format_convert). This patch adds helpers
to do this computation so they can be used to support conversion to luminance
formats.
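
The computation itself is small. The sketch below shows only the L=R+G+B sum
stated above; `rgba_to_luminance` is a hypothetical name, and any clamping or
type conversion that the real helpers perform is deliberately left out.

```python
def rgba_to_luminance(r: float, g: float, b: float, a: float = 1.0) -> float:
    """Luminance as the plain sum of the color components; alpha is ignored."""
    return r + g + b
```

A generic swizzle cannot express this, because each output component would
have to depend on three input components at once.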

The current implementation of glReadPixels does this computation as part
of the span functions in pack.c (see _mesa_pack_rgba_span_float), that do
this together with other things like type conversion, etc. We do not want
to use these functions but use _mesa_format_convert instead (later patches
will remove the color span functions), so we need to extract this functionality
as helpers.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Iago Toral Quiroga
a177b30f1f mesa: Add _mesa_swap2_copy and _mesa_swap4_copy
We have _mesa_swap{2,4} but these do in-place byte-swapping only. The new
functions receive an extra parameter so we can swap bytes on a source
input array and store the results in a (possibly different) destination
array.

This is useful to implement byte-swapping in pixel uploads, since in this
case we need to swap bytes on the src data which is owned by the
application so we can't do an in-place byte swap.
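
A sketch of the copying variants, written in Python rather than C. The
parameter order (dst, src, count) is an assumption for illustration and may
not match the signatures added by this commit.

```python
def swap2_copy(dst: bytearray, src: bytes, count: int) -> None:
    """Byte-swap count 16-bit values from src into dst; src is not modified
    when dst is a different buffer."""
    for i in range(count):
        dst[2 * i], dst[2 * i + 1] = src[2 * i + 1], src[2 * i]


def swap4_copy(dst: bytearray, src: bytes, count: int) -> None:
    """Byte-swap count 32-bit values from src into dst."""
    for i in range(count):
        dst[4 * i:4 * i + 4] = src[4 * i:4 * i + 4][::-1]
```

Passing the same bytearray as both dst and src gives the in-place behaviour of
the existing functions, while a read-only source can be swapped into a
separate destination.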

v2:
  - Include compiler.h in image.h, which is necessary to build in MSVC as
    indicated by Brian Paul.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:29 +01:00
Samuel Iglesias Gonsalvez
dcef50b9b5 mesa/pack: use _mesa_format_from_format_and_type in _mesa_pack_rgba_span_from_*
We had previously added the needed mesa formats, so we can simplify
the code further.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:28 +01:00
Iago Toral Quiroga
559a1072da mesa: Add helper to convert a GL format and type to a mesa (array) format.
v2 after review by Jason Ekstrand:
- Move _mesa_format_from_format_and_type to glformats
- Return a mesa_format for GL_UNSIGNED_INT_8_8_8_8(_REV)

v3:
- Adapted to the new implementation of mesa_array_format as a plain uint32_t
  bitfield.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:28 +01:00
Iago Toral Quiroga
b1f0229140 mesa: Add a helper _mesa_compute_rgba2base2rgba_component_mapping
This will come in handy when callers of _mesa_format_convert need
to compute the rebase swizzle parameter to use.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:28 +01:00
Iago Toral Quiroga
3171a09c25 mesa: Add a rebase_swizzle parameter to _mesa_format_convert
The new parameter allows callers to provide a rebase swizzle that
the function needs to use to match the requirements of the base
internal format involved. This is necessary when the source or
destination internal formats (depending on whether we are doing
the conversion for a pixel download or a pixel upload respectively)
do not match the base formats of the source or destination
formats of the conversion. This can happen when the driver does not
support the internal formats and uses a different format to store
pixel data internally.

For example, a texture upload from RGB to Luminance in a driver
that does not support textures with a Luminance format may decide
to store the Luminance data as RGBA. In this case we want to store
the RGBA values as (R,R,R,1). Following the same example, when we
download from that texture to RGBA we want to read (R,0,0,1). The
rebase_swizzle parameter allows these transforms to happen.
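
The Luminance example can be sketched with a generic swizzle helper. The
selector constants and `apply_swizzle` are hypothetical stand-ins written in
Python; they only model the (R,R,R,1) and (R,0,0,1) transforms described
above, not the actual _mesa_format_convert interface.

```python
# Hypothetical swizzle selectors: four source components plus constants 0 and 1.
X, Y, Z, W, ZERO, ONE = range(6)

UPLOAD_LUMINANCE_AS_RGBA = (X, X, X, ONE)           # store (R, R, R, 1)
DOWNLOAD_LUMINANCE_TO_RGBA = (X, ZERO, ZERO, ONE)   # read back (R, 0, 0, 1)


def apply_swizzle(pixel: tuple, swizzle: tuple) -> tuple:
    """Build a new RGBA pixel by picking each output component from the
    source pixel or from the constants 0.0 and 1.0."""
    source = tuple(pixel) + (0.0, 1.0)
    return tuple(source[selector] for selector in swizzle)
```

Uploading the RGB pixel (0.25, 0.5, 0.75) with alpha 1.0 stores
(0.25, 0.25, 0.25, 1.0), and downloading that stored pixel returns
(0.25, 0.0, 0.0, 1.0).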

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:28 +01:00
Iago Toral Quiroga
1aaed75330 mesa: Expose compute_component_mapping as _mesa_compute_component_mapping
This is necessary to handle conversions between array types where
the driver does not support the dst format requested by the client and
chooses a different format instead.

We will need this in _mesa_format_convert, so move it to format_utils.c,
prefix it with '_mesa_' and make it available to other files.

v2:
  - Move _mesa_compute_component_mapping to glformats

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:28 +01:00
Jason Ekstrand
deca11c0dc mesa: Add an implementation of a master convert function.
v2 by Iago Toral <itoral@igalia.com>:

- When testing if we can directly pack we should use the src format to check
  if we are packing from an RGBA format. The original code used the dst format
  for the ubyte case by mistake.
- Fixed incorrect number of bits for dst, it was computed using the src format
  instead of the dst format.
- If the dst format is an array format, check if it is signed. We were only
  checking this for the case where it was not an array format, but we need
  to know this in both scenarios.
- Fixed incorrect swizzle transform for the cases where we convert between
  array formats.
- Compute is_signed and bits only once and for the dst format. We were
  computing these for the src format too but they were overwritten by the
  dst values immediately after.
- Be more careful when selecting the integer path. Specifically, check that
  both src and dst are integer types. Checking only one of them should suffice
  since OpenGL does not allow conversions between normalized and integer types,
  but putting extra care here makes sense and also makes the actual requirements
  for this path more clear.
- The format argument for pack functions is the destination format we are
  packing to, not the source format (which has to be RGBA).
- Expose RGBA8888_* to other files. These will come in handy when we need to
  test if a given array format is RGBA or to pass RGBA formats to
  _mesa_format_convert.

v3 by Samuel Iglesias <siglesias@igalia.com>:

- Add an RGBA8888_INT definition.

v4 by Iago Toral <itoral@igalia.com> after review by Jason Ekstrand:

- Added documentation for _mesa_format_convert.
- Added additional explanatory comments for integer conversions.
- Ensure that we use _mesa_swizzle_and_convert for all signed source formats.
- Squashed: do not directly (un)pack to RGBA UINT if the source is not unsigned.

v5 by Iago Toral <itoral@igalia.com>:

- Adapted to the new implementation of mesa_array_format as a plain uint32_t
  bitfield.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:28 +01:00
Samuel Iglesias Gonsalvez
ba5418c60d mesa/pack: refactor _mesa_pack_rgba_span_float()
Use autogenerated format pack functions and take advantage of some
macros to reduce source code, facilitating its maintenance.

Unfortunately, dstType == GL_UNSIGNED_SHORT cannot be simplified like
the others, so keep it as it is.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:28 +01:00
Samuel Iglesias Gonsalvez
41a785b09c mesa/main/pack_tmp.h: Add float conversion support
We will use this in a later patch to refactor _mesa_pack_rgba_span_float.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:28 +01:00
Samuel Iglesias Gonsalvez
1a5ec9624a mesa/pack: use autogenerated format_pack functions
Take advantage of new mesa formats and new format_pack functions to
reduce source code in _mesa_pack_rgba_span_from_ints() and
_mesa_pack_rgba_span_from_uints().

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:28 +01:00
Samuel Iglesias Gonsalvez
8c82b22a16 mesa: use format conversion functions in swrast
This commit adds a macro to facilitate the task of using
format conversions functions but keeps the same API.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:28 +01:00
Samuel Iglesias Gonsalvez
c5a5c9a7db mesa/formats: add new mesa formats and their pack/unpack functions.
This will be used to refactor code in pack.c and support conversion
to/from these types in a master convert function that will be added
later.

v2:
- Fix autogeneration of MESA_FORMAT_A2R10G10B10_UNORM pack/unpack
  functions

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:28 +01:00
Samuel Iglesias Gonsalvez
f8d160fc96 mesa/format_pack: Add _mesa_pack_int_rgba_row()
This will be used to unify code in pack.c.

v2:
- Modify pack_int_*() function generator to use c.datatype() and
  f.datatype()

v3:
- Only autogenerate pack_int_*() functions for non-normalized integer
  formats.

v4:
- Use _mesa_unsigned_to_unsigned() in pack_int_*() because, in order
  to be able to pack both signed and unsigned formats, we need to
  sign-extend.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:28 +01:00
Samuel Iglesias Gonsalvez
9567e1048b mesa: Add _mesa_pack_uint_rgba_row() format conversion function
We will use this later on to handle uint conversion scenarios in a master
convert function.

v2:
- Modify pack_uint_*() function generation to use c.datatype() and
  f.datatype().
- Remove UINT_TO_FLOAT() macro usage from pack_uint*()
- Remove "if not f.is_normalized()" conditional as pack_uint*()
  functions are only autogenerated for non normalized formats.

v3:
- Add clamping for non-normalized integer formats in pack_uint*()

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:28 +01:00
Jason Ekstrand
e1fdcddafe mesa: Autogenerate format_unpack.c
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>

v2 by Samuel Iglesias <siglesias@igalia.com>:
- Add usage of INDENT_FLAGS in Makefile.am

v3 by Samuel Iglesias <siglesias@igalia.com>:
- Modify unpack_float_*() and unpack_ubyte_*() function generation
to use c.datatype() and f.datatype()
- Fix out-of-tree build

v4 by Samuel Iglesias <siglesias@igalia.com>:
- format_unpack.c.mako is now format_unpack.py, with the template code
  inlined. It now auto-generates format_unpack.c
- Add format_unpack.c to gitignore.
- Simplify Makefile.am change
- Modify SConscript to build format_unpack.c with scons

v5 by Samuel Iglesias <siglesias@igalia.com>:
- Don't allow float to non-normalized integer format conversions.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:28 +01:00
Jason Ekstrand
e0439f7505 mesa: Autogenerate most of format_pack.c
We were auto-generating it before.  The problem was that the autogeneration
tool we were using was called "copy, paste, and edit".  Let's use a more
sensible solution.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>

v2 by Samuel Iglesias <siglesias@igalia.com>
- Remove format_pack.c as it is now autogenerated
- Add usage of INDENT_FLAGS in Makefile.am
- Remove trailing blank line

v3 by Samuel Iglesias <siglesias@igalia.com>
- Merge format_convert.py into format_parser.py
   - Adapt pack_*_* function generations
- Fix out-of-tree build

v4 by Samuel Iglesias <siglesias@igalia.com>
- _get_datatype() is now a helper function

v5 by Samuel Iglesias <siglesias@igalia.com>
- format_pack.c.mako is now format_pack.py, with the template code
  inlined. It now auto-generates format_pack.c
- Simplify Makefile.am change.
- Modify SConscript to build format_pack.c with scons.
- Remove run_mako.py
- Add format_pack.c to gitignore

v6 by Samuel Iglesias <siglesias@igalia.com>:
- Don't allow float to non-normalized integer format conversions.
- Add non-normalized formats support for ubyte packing functions. Merge
the previously separated patch.
- Add clamping for non-normalized integer formats in pack_ubyte*()

v7 by Samuel Iglesias <siglesias@igalia.com>:
- Add assert to check that sRGB formats are 8-bit size.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:28 +01:00
Samuel Iglesias Gonsalvez
2b37bea010 configure: require python mako module
It is now a hard dependency because of the autogeneration of
format pack and unpack functions.

Update the documentation to reflect this change.

v2:
- Inline python script in m4 file and use PYTHON2

v3:

- Remove semicolons and quotes and change coding style
- Add Ilia Mirkin suggestion to use Python's split functionality.
- Use AX_CHECK_PYTHON_MAKO_MODULE name.
- Change to MIT license

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:28 +01:00
Jason Ekstrand
f89793946a mesa: Add a _mesa_is_format_color_format helper
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-12 11:20:28 +01:00
Iago Toral Quiroga
3c19251f28 mesa: Let _mesa_get_format_base_format also handle mesa_array_format.
If we need the base format for a mesa_array_format we have to find the
matching mesa_format first. This is expensive because it requires
to loop through all existing mesa formats until we find the right match.

We can resolve the base format of an array format directly by looking
at its swizzle information. Also, we can have _mesa_get_format_base_format
accept an uint32_t which can pack either a mesa_format or a mesa_array_format
and resolve the base format for either type. This way clients do not need to
check if they have a mesa_format or a mesa_array_format and call different
functions depending on the case.

Another reason to resolve the base format for array formats directly is that
we don't have matching mesa_format enums for every possible array format, so
for some GL format/type combinations we can produce array formats that don't
have a corresponding mesa format, in which case we would not be able to
find the base format. Example format=GL_RGB, type=GL_UNSIGNED_SHORT. This type
would map to something like MESA_FORMAT_RGB_UNORM16, but we don't have that.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:28 +01:00
Jason Ekstrand
3da735cc4c main: Add a concept of an array format
An array format is a 32-bit integer format identifier that can represent
any format that can be represented as an array of standard GL datatypes.
While the MESA_FORMAT enums provide several of these, they don't account for
all of them.

v2 by Iago Toral Quiroga <itoral@igalia.com>:
 - Implement mesa_array_format as a plain bitfield uint32_t type instead of
   using a struct inside a union to access the various components packed in
   it. This is necessary to support bigendian properly, as pointed out by
   Ian.
 - Squashed: Make float types normalized

v3 by Iago Toral Quiroga <itoral@igalia.com>:
  - Include compiler.h in formats.h, which is necessary to build in MSVC as
    indicated by Brian Paul.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-01-12 11:20:28 +01:00
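A minimal sketch of the "plain uint32_t with shifts and masks" approach the v2 note above describes. The field layout, macro names, and function names here are hypothetical and chosen only for illustration; they are not the layout used in formats.h.

```c
#include <stdint.h>

/* Hypothetical field layout, for illustration only. */
#define SKETCH_TYPE_SHIFT      0
#define SKETCH_TYPE_MASK       0x7u
#define SKETCH_NORM_SHIFT      3
#define SKETCH_NORM_MASK       0x1u
#define SKETCH_CHANNELS_SHIFT  4
#define SKETCH_CHANNELS_MASK   0x7u

typedef uint32_t sketch_array_format;

/* Pack the fields into one 32-bit value using shifts and masks. */
static inline sketch_array_format
sketch_array_format_pack(unsigned type, unsigned normalized, unsigned channels)
{
   return ((type & SKETCH_TYPE_MASK) << SKETCH_TYPE_SHIFT) |
          ((normalized & SKETCH_NORM_MASK) << SKETCH_NORM_SHIFT) |
          ((channels & SKETCH_CHANNELS_MASK) << SKETCH_CHANNELS_SHIFT);
}

/* Accessors extract each field the same way on any host. */
static inline unsigned
sketch_array_format_type(sketch_array_format f)
{
   return (f >> SKETCH_TYPE_SHIFT) & SKETCH_TYPE_MASK;
}

static inline unsigned
sketch_array_format_normalized(sketch_array_format f)
{
   return (f >> SKETCH_NORM_SHIFT) & SKETCH_NORM_MASK;
}

static inline unsigned
sketch_array_format_channels(sketch_array_format f)
{
   return (f >> SKETCH_CHANNELS_SHIFT) & SKETCH_CHANNELS_MASK;
}
```

Shifts and masks operate on the integer's value, so the packed number is the same on little-endian and big-endian hosts, whereas the placement of bitfields inside a struct is implementation-defined.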
Iago Toral Quiroga
382d097e54 swrast: Remove unused variable.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:27 +01:00
Samuel Iglesias Gonsalvez
fea1be8d0b mesa: Fix _mesa_swizzle_and_convert integer conversions to clamp properly
Fix various conversion paths that involved integer data types of different
sizes (uint16_t to uint8_t, int16_t to uint8_t, etc) that were not
being clamped properly.

Also, one of the paths was incorrectly assigning the value 12, instead of 1,
to the constant "one".

v2:
- Create auxiliary clamping functions and use them in all paths that
  required clamp because of different source and destination sizes
  and signed-unsigned conversions.

v3:
- Create MIN_INT macro and use it.

v4:
- Add _mesa_float_to_[un]signed() and _mesa_half_to_[un]signed() auxiliary
  functions.
- Add clamp for float-to-integer conversions in _mesa_swizzle_and_convert()

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:27 +01:00
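The kind of clamping the commit above describes can be sketched as follows. The helper names are hypothetical and are not the ones added to Mesa; the point is only that narrowing conversions should saturate rather than truncate.

```c
#include <stdint.h>

/* Narrow an unsigned 16-bit value to 8 bits, saturating at 255
 * instead of silently dropping the high byte. */
static inline uint8_t
sketch_clamp_u16_to_u8(uint16_t x)
{
   return x > UINT8_MAX ? UINT8_MAX : (uint8_t)x;
}

/* Convert a signed 16-bit value to unsigned 8 bits, saturating at
 * both ends: negatives become 0, large values become 255. */
static inline uint8_t
sketch_clamp_s16_to_u8(int16_t x)
{
   if (x < 0)
      return 0;
   return x > UINT8_MAX ? UINT8_MAX : (uint8_t)x;
}
```

Without the clamp, a plain cast such as (uint8_t)300 would yield 44 rather than 255.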
Jason Ekstrand
483b043488 mesa/format_utils: Prefix and expose the conversion helper functions
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>

v2 by Samuel Iglesias <siglesias@igalia.com>:
- Fix compilation errors

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:27 +01:00
Iago Toral Quiroga
3473a84fb2 mesa: Fix incorrect assertion in init_teximage_fields_ms
_BaseFormat is a GLenum (unsigned int) so testing if its value is
greater than 0 to detect the cases where _mesa_base_tex_format
returns -1 doesn't work.

Fixing the assertion breaks the arb_texture_view-lifetime-format
piglit test on nouveau, since that test calls
_mesa_base_tex_format with GL_R16F with a context that does not
have ARB_texture_float, so it returns -1 for the BaseFormat, which
was not being caught properly by the ASSERT in init_teximage_fields_ms
until now.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:27 +01:00
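A small sketch of why the old check could not catch the -1 case. The typedef and function names are hypothetical; the "not equal to -1" form is one plausible shape for the fix, not necessarily the exact assertion used in the commit.

```c
#include <stdbool.h>

/* Stand-in for GLenum, which is an unsigned int. */
typedef unsigned int sketch_GLenum;

/* The old style of check: -1 converted to unsigned is a very large
 * positive value, so this returns true for it. */
static inline bool
sketch_check_greater_than_zero(sketch_GLenum base_format)
{
   return base_format > 0;
}

/* A check that does reject the -1 sentinel. */
static inline bool
sketch_check_not_minus_one(sketch_GLenum base_format)
{
   return base_format != (sketch_GLenum)-1;
}
```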
Samuel Iglesias Gonsalvez
b2b39ce257 mesa: Fix get_texbuffer_format().
We were returning incorrect mesa formats for GL_LUMINANCE_ALPHA16I_EXT
and GL_LUMINANCE_ALPHA32I_EXT.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:27 +01:00
Jason Ekstrand
96fe6191cb mesa: Fix A1R5G5B5 packing/unpacking
As with B5G6R5, these have been left broken with comments saying they are.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-12 11:20:27 +01:00
Jason Ekstrand
3e4669a8f3 mesa/colormac: Remove an unused macro
The PACK_565_REV macro is no longer used.  It was also extremely confusing
because it's actually a byteswapped 565 not reversed 565.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-12 11:20:27 +01:00
Jason Ekstrand
ec0bfba496 mesa: Fix packing/unpacking of MESA_FORMAT_R5G6B5_UNORM
Apparently, the packing/unpacking functions for these formats have differed
from the format description in formats.h.  Instead of fixing this, people
simply left a comment saying it was broken.  Let's actually fix it for
real.

v2 by Samuel Iglesias <siglesias@igalia.com>:
- Fix comment in formats.h

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-12 11:20:27 +01:00
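A sketch of packing and unpacking a 16-bit R5G6B5 value, assuming the convention that components in a packed format name are listed from the least significant bits upward (so R occupies the low five bits). That convention and the function names are assumptions for illustration; formats.h is the authority on the real layout.

```c
#include <stdint.h>

/* Pack already-quantized channels: r and b are 5-bit, g is 6-bit. */
static inline uint16_t
sketch_pack_r5g6b5(uint8_t r5, uint8_t g6, uint8_t b5)
{
   return (uint16_t)((r5 & 0x1f) |
                     ((g6 & 0x3f) << 5) |
                     ((b5 & 0x1f) << 11));
}

/* Inverse of sketch_pack_r5g6b5(). */
static inline void
sketch_unpack_r5g6b5(uint16_t p, uint8_t *r5, uint8_t *g6, uint8_t *b5)
{
   *r5 = p & 0x1f;
   *g6 = (p >> 5) & 0x3f;
   *b5 = (p >> 11) & 0x1f;
}
```

Keeping pack and unpack as exact inverses of each other, and both consistent with the documented layout, is the property the commit restores.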
Jason Ekstrand
7d1b08ac44 mesa: Fix clamping to -1.0 in snorm_to_float
This patch fixes the wrong return value when x is lower than
-MAX_INT(src_bits): the result would fall outside [-1.0, 1.0].

v2 by Samuel Iglesias <siglesias@igalia.com>:
    - Modify snorm_to_float() to avoid doing the division when
      x == -MAX_INT(src_bits)

Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-12 11:20:27 +01:00
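A minimal sketch of the clamping described above, including the v2 behaviour of skipping the division when x is at or below -MAX_INT(src_bits). The macro and function names are hypothetical stand-ins, and src_bits is assumed to be between 2 and 31.

```c
/* Largest positive value of a signed integer with the given bit width,
 * e.g. 127 for 8 bits. Assumes 2 <= bits <= 31. */
#define SKETCH_MAX_INT(bits) ((int)((1u << ((bits) - 1)) - 1))

/* Convert a signed normalized integer to a float in [-1.0, 1.0]. */
static inline float
sketch_snorm_to_float(int x, unsigned src_bits)
{
   const int max = SKETCH_MAX_INT(src_bits);

   /* Both -max and the extra two's complement value -(max + 1)
    * map to -1.0, without performing the division. */
   if (x <= -max)
      return -1.0f;

   return (float)x / (float)max;
}
```

For 8-bit input, -128 divided by 127 would give a value slightly below -1.0, which is the out-of-range result the early return avoids.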
Emil Velikov
3b5f206475 docs: add news item and link release notes for mesa 10.3.7/10.4.2
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-12 10:46:38 +00:00
Emil Velikov
8e34db76e1 docs: Add sha256 sums for the 10.4.2 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 14f1659b43)
2015-01-12 10:46:38 +00:00
Emil Velikov
1631f74a1c Add release notes for the 10.4.2 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 02f2e97c3e)
2015-01-12 10:46:38 +00:00
Emil Velikov
134593f0c0 docs: Add sha256 sums for the 10.3.7 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 20e0546cc2)
2015-01-12 10:46:38 +00:00
Emil Velikov
4a8105e5cc Add release notes for the 10.3.7 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 6b00e5585a)
2015-01-12 10:46:38 +00:00
Kenneth Graunke
f95733ddb7 i965: Respect the no_8 flag on Gen6, not just Gen7+.
When doing repclears, we only want to use the SIMD16 program, not the
SIMD8 one.  Kristian added this to the Gen7+ code, but apparently we
missed it in the Gen6 code.  This patch copies that code over.

Approximately doubles the performance in a clear microbenchmark from
mesa-demos (clearspd -width 500 -height 500 +color) on Sandybridge.

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
References: https://code.google.com/p/chrome-os-partner/issues/detail?id=34681
2015-01-12 00:41:07 -08:00
Ian Romanick
f591712efe mesa: Always generate GL_INVALID_OPERATION in _mesa_GetProgramBinary
There are no binary formats supported, so what are you doing?  At least
this gives the application developer some feedback about what's going
on.  The spec gives no guidance about what to do in this scenario.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Leight Bade <leith@mapbox.com>
2015-01-12 12:01:09 +13:00
Ian Romanick
4fd8b30123 mesa: Ensure that length is set to zero in _mesa_GetProgramBinary
v2: Fix assignment of length.  Noticed by Julien Cristau.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Leight Bade <leith@mapbox.com>
2015-01-12 12:01:06 +13:00
Ian Romanick
201b9c1818 mesa: Add missing error checks in _mesa_ProgramBinary
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Leight Bade <leith@mapbox.com>
2015-01-12 12:00:45 +13:00
Eric Anholt
ff1948a1be vc4: Clamp the inputs to the blend equation to [0, 1].
Fixes the remaining ARB_color_buffer_float rendering tests.
2015-01-11 17:17:20 +13:00
Eric Anholt
1519a1928a vc4: Add a little helper for clamping to [0,1]. 2015-01-11 17:17:20 +13:00
Eric Anholt
1a328120d3 vc4: Fix up statechange management for uncompiled/compiled FS/VS.
No need to recheck the FS compile when the VS source has changed, but
there *is* a need to recheck the VS compile when the compiled VS has
changed (since the live inputs may change).

Fixes es3conform's blend test.
2015-01-11 17:17:20 +13:00
Eric Anholt
c122662984 vc4: Fix clear color setup for RGB565.
The util_pack_color() thing only sets up the low bits of the union, so
only return them, too.  Fixes intermittent failure on
fbo-alphatest-formats and es3conform's framebuffer-objects test under
simulation.
2015-01-11 17:17:19 +13:00
Eric Anholt
355156d2f7 vc4: Avoid the save/restore of r3 for raddr conflicts, just use ra31.
Turns out this was harmful in code quality:

total instructions in shared programs: 39487 -> 38845 (-1.63%)
instructions in affected programs:     22522 -> 21880 (-2.85%)

This costs us yet another register (which is painful since it means more
programs might fail to compile).  However, the alternative was causing us
trouble where we'd save/restore r3 while it contained a MIN-ed direct
texture offset, causing the kernel to fail to validate our shaders (such
as in GLB2.7).
2015-01-11 08:57:24 +13:00
Eric Anholt
a8e14c293b vc4: Allow dead code elimination of VPM reads.
This gets a bunch of dead reads out of the CSes, which don't read most
attributes generally.

total instructions in shared programs: 39753 -> 39487 (-0.67%)
instructions in affected programs:     4721 -> 4455 (-5.63%)
2015-01-10 20:55:37 +13:00
Eric Anholt
b920ecf793 vc4: Cook up the draw-time VPM setup info during shader compile.
This will give the compiler the chance to dead-code eliminate unused VPM
reads.  This is particularly a big deal in the CS where a bunch of vattrs
are just not going to be used.
2015-01-10 15:24:56 +13:00
Eric Anholt
c772c92153 vc4: Split two notions of instructions having side effects.
Some ops can't be DCEd, while some of the ops that are just important due
to the args they have can be.
2015-01-10 15:24:46 +13:00
Eric Anholt
a58ae83882 vc4: Redo VPM reads as a read file.
This will let us do copy propagation of the VPM reads.
2015-01-10 14:35:24 +13:00
Eric Anholt
06b6a72a3e vc4: Fix miscalculation of the VPM space.
We pass in a byte offset, not dword.  I'm rather scared that this actually
managed to pass piglit, but it does fix gears.
2015-01-10 14:35:06 +13:00
Eric Anholt
92a0b0bd70 vc4: Pack VPM attr contents according to just the size of the attribute.
total instructions in shared programs: 40960 -> 39753 (-2.95%)
instructions in affected programs:     20871 -> 19664 (-5.78%)
2015-01-10 13:54:12 +13:00
Eric Anholt
72cb6619cb vc4: Restructure color packing as a series of channel replacements.
I'm using this in some WIP commits for doing blending in 8888 instead of
vec4.  But it also gives us these results immediately, thanks to allowing
more uniforms/immediates in the arguments:

total instructions in shared programs: 41027 -> 40960 (-0.16%)
instructions in affected programs:     4381 -> 4314 (-1.53%)
2015-01-10 13:54:12 +13:00
Eric Anholt
3093bfacf0 vc4: Fix the no-copy-propagating-from-TLB_COLOR_READ check.
Our MOV's dst obviously won't be the TLB_COLOR_READ's def, because we're
ssa.
2015-01-10 13:54:12 +13:00
Eric Anholt
1d04432677 vc4: Move global seqno short-circuiting to vc4_wait_seqno().
Any other caller would want it, too.
2015-01-10 13:54:12 +13:00
Eric Anholt
24d9487432 state_tracker: Fix assertion failures in conditional block movs.
If you had a conditional assignment of an array or struct (say, from the
if-lowering pass), we'd try doing swizzle_for_size() on the aggregate
type, and it would assertion fail due to vector_elements==0.  Instead,
extend emit_block_mov() to handle emitting the conditional operations,
which also means we'll have appropriate writemasks/swizzles on the CMPs
within a struct containing various-sized members.

Fixes 20 testcases in es3conform on vc4.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-01-10 13:54:12 +13:00
Matt Turner
3d8188d4f8 i965: Consider SEL.{GE,L} to be commutative operations.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-08 15:38:16 -08:00
Matt Turner
7f813bf53d i965/cfg: Fix end_ip of last basic block.
start_ip and end_ip are inclusive.

Increases instruction counts in 64 shaders in shader-db, likely
indicative of them previously being misoptimized.
2015-01-08 15:38:16 -08:00
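A tiny sketch of what "start_ip and end_ip are inclusive" implies for code that walks or sizes a block. The struct and function names are hypothetical, not the i965 cfg types.

```c
/* A basic block covering instructions start_ip..end_ip, both inclusive. */
struct sketch_block {
   int start_ip;
   int end_ip;
};

/* With inclusive bounds the instruction count needs the "+ 1". */
static inline int
sketch_block_num_instructions(const struct sketch_block *b)
{
   return b->end_ip - b->start_ip + 1;
}

/* An instruction at 'ip' belongs to the block if it lies within
 * the inclusive range. */
static inline int
sketch_block_contains(const struct sketch_block *b, int ip)
{
   return ip >= b->start_ip && ip <= b->end_ip;
}
```

If the last block's end_ip were set to one past its final instruction, the following block's first instruction (or a nonexistent one) would be treated as part of it.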
Brian Paul
df461ac952 mesa: compute row stride outside of loop and fix MSVC compilation error
Can't do void pointer arithmetic with MSVC.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-01-08 14:35:16 -07:00
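Arithmetic on void pointers is a GNU extension rather than standard C, which is why MSVC rejects it. A portable sketch, with a hypothetical helper name, casts to a byte pointer first:

```c
#include <stddef.h>
#include <stdint.h>

/* Return the address of a given row in an image, given the row stride
 * in bytes. The cast to uint8_t * makes the addition well defined in
 * standard C, unlike arithmetic directly on a void pointer. */
static inline void *
sketch_row_address(void *base, size_t row_stride, size_t row)
{
   return (uint8_t *)base + row * row_stride;
}
```

Computing row_stride once before a loop and passing it in, as the commit subject suggests, also avoids recomputing it per row.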
Brian Paul
e2bf5b183b mesa: fix MSVC compilation errors
Move assertions after declarations and don't use void pointer arithmetic.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-01-08 14:35:07 -07:00
Laura Ekstrand
8d2542fc9d main: Checking for cube completeness in TextureSubImage.
This is part of a potential solution to a spec bug.  Cube completeness
is a concept from glGenerateMipmap, but it seems reasonable to check for it in
TextureSubImage when target=GL_TEXTURE_CUBE_MAP.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:30 -08:00
Laura Ekstrand
efbc1c86a6 main: Checking for cube completeness in GetTextureImage.
This is part of a potential solution to a spec bug.  Cube completeness
is a concept from glGenerateMipmap, but it seems reasonable to check for it in
GetTextureImage when the target is GL_TEXTURE_CUBE_MAP.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:30 -08:00
Laura Ekstrand
b66dd38a37 main: Added _mesa_cube_level_complete to check for the completeness of an arbitrary cube map level.
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-01-08 11:37:30 -08:00
Laura Ekstrand
2546d901be main: glDeleteTextures now throws GL_INVALID_VALUE if n is negative.
This is in conformance with the OpenGL spec.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:30 -08:00
Laura Ekstrand
50d679381d main: Refactor in teximage.c to handle NULL from _mesa_get_current_tex_object.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:30 -08:00
Laura Ekstrand
98e64e538a main: Added entry point for glTextureBuffer.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:30 -08:00
Laura Ekstrand
499004e56a main: Fix texObj->Immutable flag update in _mesa_texture_image_multisample.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
a7d69516b8 main: Added entry points for glTextureStorage[23]DMultisample.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
91089d6d65 main: Added entry point for glGenerateTextureMipmap.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
239e3fb876 main: Added entry points for glCompressedTextureSubImage*D.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
8b5482ec03 main: Added entry point for glGetCompressedTextureImage.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
a739bdeb1d main: Added entry point for glGetTextureImage.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
f51f6805f5 main: Nameless texture creation and deletion. Does not affect normal creation and deletion paths.
In implementing ARB_DIRECT_STATE_ACCESS functions, it is often necessary to
abstract the functionality of a traditional GL API function into a backend
that both the traditional and dsa API functions can share.  For instance,
glTexParameteri and glTextureParameteri both call _mesa_texture_parameteri,
which takes a context object and a texture object as arguments.

The existence of such backend functions provides the opportunity for
driver internals (such as meta) to pass around the actual texture object
rather than its ID or target, saving on texture object storage and look-up
overhead.

This patch provides nameless texture creation and deletion for meta.  This
will be used in an upcoming refactor of meta.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
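The shared-backend pattern described above can be sketched as follows. All types and function names are heavily simplified, hypothetical stand-ins; they are not Mesa's gl_context, gl_texture_object, or the real entry points.

```c
#include <stddef.h>

/* Hypothetical, simplified texture object and context. */
struct sketch_texture {
   unsigned name;
   int min_filter;
};

struct sketch_context {
   struct sketch_texture *bound;     /* texture bound to the active unit */
   struct sketch_texture *textures;  /* all known texture objects */
   size_t num_textures;
};

/* Shared backend: takes the texture object directly, so internal
 * callers can skip any lookup by ID or target. */
static void
sketch_texture_parameteri(struct sketch_context *ctx,
                          struct sketch_texture *tex, int value)
{
   (void)ctx;
   tex->min_filter = value;
}

/* Traditional-style entry point: resolves the currently bound texture. */
static void
sketch_TexParameteri(struct sketch_context *ctx, int value)
{
   if (ctx->bound)
      sketch_texture_parameteri(ctx, ctx->bound, value);
}

/* DSA-style entry point: resolves the texture by name. */
static void
sketch_TextureParameteri(struct sketch_context *ctx, unsigned name, int value)
{
   for (size_t i = 0; i < ctx->num_textures; i++) {
      if (ctx->textures[i].name == name) {
         sketch_texture_parameteri(ctx, &ctx->textures[i], value);
         return;
      }
   }
   /* A real implementation would record a GL error here. */
}
```

Both entry points reduce to the same backend call, which is what lets driver internals hold on to a texture object and call the backend without going through a name or target.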
Laura Ekstrand
d6b7c40cec main: Added entry points for CopyTextureSubImage*D.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
bad39f6c1e main: Fixed some comments in texparam.c
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
c2c5077864 main: Added entry points for glGetTextureParameteriv, Iiv, and Iuiv.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
89912d04a1 main: Added entry point for glGetTextureParameterfv.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
86bb3be319 main: Added entry points for glGetTextureLevelParameteriv, fv.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
bf5c588cde main: legal_get_tex_level_parameter_target now handles GL_TEXTURE_CUBE_MAP.
ARB_DIRECT_STATE_ACCESS functions allow an effective target of
GL_TEXTURE_CUBE_MAP.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
d954f6023b main: Added entry points for glTextureParameteriv, Iiv, Iuiv.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
354d789f3b main: Added entry point for glTextureParameteri.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
2ce5db3930 main: Added entry point for glTextureParameterfv.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
abc688e33a main: Added entry point for glTextureParameterf.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
5ad5393f3b main: Added get_texobj_by_name in texparam.c.
This is a convenience function for *Texture*Parameter functions.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
795ba44754 main: set_tex_parameterf now handles errors according to the OpenGL 4.5 Specification.
Beginning in the OpenGL 4.3 core specification, certain error handling has
changed.  One example shown here is that INVALID_ENUM is thrown instead of
INVALID_OPERATION when a user attempts to set sampler parameters for a
multisample target.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:29 -08:00
Laura Ekstrand
f4dce7a6a6 main: set_tex_parameteri now handles errors according to the OpenGL 4.5 Specification.
Beginning in the OpenGL 4.3 core specification, some error handling has
changed (see OpenGL 4.5 core spec, 30.10.2014, Section 8.10 Texture
Parameters, pages 228-29). As an example, changing sampler states with a
multisample target throws INVALID_ENUM rather than INVALID_OPERATION.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:28 -08:00
Laura Ekstrand
77aabd8be2 main: Added entry point for BindTextureUnit.
The following preparations were made in texstate.c and texstate.h to
better facilitate the BindTextureUnit function:

Dylan Noblesmith:
mesa: add _mesa_get_tex_unit()
mesa: factor out _mesa_max_tex_unit()
This is about to appear in a lot more places, so
reduce boilerplate copy paste.
add _mesa_get_tex_unit_err() checking getter function
Reduce boilerplate across files.

Laura Ekstrand:
Made note of why BindTextureUnit should throw GL_INVALID_OPERATION if the unit is out of range.
Added assert(unit > 0) to _mesa_get_tex_unit.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:28 -08:00
Laura Ekstrand
4b381e84db main: Corrected comment on _mesa_is_zero_size_texture.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:28 -08:00
Laura Ekstrand
b8939fd3d1 main: Added entry points for glTextureSubImage*D.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:28 -08:00
Laura Ekstrand
5a5fe9f308 main: Added entry points for glTextureStorage*D.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:28 -08:00
Laura Ekstrand
97c838cf85 main: Added entry point for glCreateTextures.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:28 -08:00
Laura Ekstrand
15ddc2d94b main: Removed trailing whitespaces in texture code.
main: Removed trailing whitespace in texstate.c.
main: Deleted trailing whitespaces in texobj.c.
main: Fixed whitespace errors in teximage.h and teximage.c.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:28 -08:00
Laura Ekstrand
ea1fb258ba main: Renamed _mesa_get_compressed_teximage to _mesa_GetCompressedTexImage_sw.
This reflects the new naming convention for software fallbacks.  To avoid
confusion with ARB_DIRECT_STATE_ACCESS backend functions, software fallbacks
now have the form _mesa_[Driver function name]_sw.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:28 -08:00
Laura Ekstrand
460365cde3 main: Renamed _mesa_get_teximage to _mesa_GetTexImage_sw.
This reflects the new naming convention for software fallbacks.  To avoid
confusion with ARB_DIRECT_STATE_ACCESS backend functions, software fallbacks
now have the form _mesa_[Driver function name]_sw.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:28 -08:00
Laura Ekstrand
16f6d9cf5f main: Changed _mesa_alloc_texture_storage to _mesa_AllocTextureStorage_sw.
In order to implement ARB_DIRECT_STATE_ACCESS, many GL API functions must now
rely on a backend that both traditional and DSA functions can use. For
instance, _mesa_TexStorage2D and _mesa_TextureStorage2D both call a backend
function _mesa_texture_storage that takes a context and a texture object as
arguments.  The backend is named _mesa_texture_storage so that Meta can call
it and avoid looking up the context and the texture object.  However, backend
names often look very close to the names of software fallbacks (e.g.
_mesa_alloc_texture_storage).  For this reason, software fallbacks have been
renamed for clarity to have the form _mesa_[Driver function name]_sw.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:28 -08:00
Laura Ekstrand
35371d6560 main: Moved _mesa_get_current_tex_object from teximage.c to texobj.c.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:28 -08:00
Laura Ekstrand
d7528fce5a main: Moved _mesa_lock_texture and _mesa_unlock_texture to texobj.h from teximage.h.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:28 -08:00
Laura Ekstrand
838ef5b781 i965: blit_texture_to_pbo() now accepts TEXTURE_CUBE_MAP.
ARB_DIRECT_STATE_ACCESS permits the user to use TEXTURE_CUBE_MAP as a target.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:28 -08:00
Laura Ekstrand
60e3bfddaf main: Added utility function _mesa_lookup_texture_err().
Most ARB_DIRECT_STATE_ACCESS functions take an object's ID and use it to look
up the object in its hash table.  If the user passes a fake object ID (i.e. a
non-generated name), the implementation should throw INVALID_OPERATION.
This is a convenience function for texture objects.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:28 -08:00
Laura Ekstrand
56875181c7 glapi: Added ARB_direct_state_access.xml file.
main: Added ARB_direct_state_access to extensions.c as dummy_false.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-01-08 11:37:28 -08:00
José Fonseca
6c9b695a9c st/wgl: Ignore ulVersion in DrvValidateVersion.
We never used ulVersion for proper version checks.

Most 3rd party drivers use version 1, but recently NVIDIA OpenGL driver
started using a different version number, so the handy trick of renaming
Mesa's ICDs as nvoglv32.dll on Windows machines with NVIDIA hardware for
quick testing of Mesa software renderers stopped working.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-01-08 18:57:04 +00:00
José Fonseca
0dba2af2fb mesa: Address assignment makes integer from pointer without a cast gcc warning.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-08 18:57:04 +00:00
Kristian Høgsberg
0ac4c27275 i965/skl: Always use a header for SIMD4x2 sampler messages
SKL+ overloads the SIMD4x2 SIMD mode to mean either SIMD8D or SIMD4x2
depending on bit 22 in the message header.  If the bit is 0 or there is
no header we get SIMD8D.  We always want SIMD4x2 in vec4 and for fs pull
constants, so use a message header in those cases and set bit 22 there.

Based on an initial patch from Ken.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2015-01-08 10:13:32 -08:00
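The header bit mentioned above can be sketched as a simple mask operation. Only the bit number (22) comes from the commit message; the macro name, the function names, and the idea of treating the relevant header dword as a plain uint32_t are assumptions for illustration.

```c
#include <stdint.h>

/* Bit 22 of the message header selects SIMD4x2 over SIMD8D on SKL+,
 * per the commit message. The macro name is hypothetical. */
#define SKETCH_SAMPLER_SIMD4X2_BIT (1u << 22)

/* Return the header dword with the SIMD4x2 selection bit set. */
static inline uint32_t
sketch_header_select_simd4x2(uint32_t header_dword)
{
   return header_dword | SKETCH_SAMPLER_SIMD4X2_BIT;
}

/* Report whether a header dword requests SIMD4x2. */
static inline int
sketch_header_is_simd4x2(uint32_t header_dword)
{
   return (header_dword & SKETCH_SAMPLER_SIMD4X2_BIT) != 0;
}
```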
Kristian Høgsberg
cec8eff28e i965/skl: Report more accurate number of samples for format
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-07 21:51:35 -08:00
Rob Clark
e7026ac486 freedreno/ir3: fix pos_regid > max_reg
We can't (or don't know how to) turn this off.  But it can end up being
stored to a higher reg # than what the shader uses, leading to
corruption.

Also we currently aren't clever enough to turn off frag_coord/frag_face
if the input is dead-code, so just fixup max_reg/max_half_reg.  Re-org
this a bit so both vp and fp reg footprint fixup are called by a common
fxn used also by ir3_cmdline.  Also add a few more output lines for
ir3_cmdline to make it easier to see what is going on.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-01-07 19:37:28 -05:00
Rob Clark
1e5c207dba freedreno/ir3: start on indirect gpr reads
Handle TEMP[ADDR[]] src registers by generating a fanin to group array
elements, similarly to how texture fetch instructions work.

NOTE:
For all the scalar instructions generated for a single tgsi vector
operation which uses an array src (or possibly even uses the same array
as multiple srcs), re-use the same fanin node.  Since a vector operation
operates on all components at the same time, it should never see more
than one version of the same array.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-01-07 19:37:28 -05:00
Rob Clark
63e5b72da8 freedreno/ir3: make reg array dynamic
To use fanin's to group registers in an array, we can potentially have a
much larger array of registers.  Rather than continuing to bump up the
array size, just make it dynamically allocated when the instruction is
created.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-01-07 19:37:28 -05:00
Rob Clark
9a9f2a893b freedreno/ir3: simplify RA
Group inputs/outputs, in addition to fanin/fanout, as they must also
exist in sequential scalar registers.  This lets us simplify RA by
working in terms of neighbor groups.

NOTE: has the slight problem that it can't optimize out mov's for things
like:

  MOV OUT[n], IN[m]

To avoid this, instead of trying to figure out what mov's we can
eliminate, we first remove all mov's prior to grouping, and then
re-insert mov's as needed while grouping inputs/outputs/fanins.
Eventually we'd prefer the frontend to not insert extra mov's in the
first place (so we don't have to bother removing them).  This is the
plan for an eventual NIR based frontend, so separate out the instr
grouping (which will still be needed for NIR frontend) from the mov
elimination (which won't).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-01-07 19:37:28 -05:00
Rob Clark
dddfe6c21e freedreno/ir3: regmask support for relative addr
For temp arrays, a 32bit mask won't be sufficient.. but otoh we don't
need to support an arbitrary mask.  So for this case use a simple size
field rather than a bitmask.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-01-07 19:37:28 -05:00
Rob Clark
9bb865b3cf freedreno/ir3: split up ssa_src
Slight bit of refactoring that will be needed for indirect gpr
addressing (TEMP[ADDR[]]).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-01-07 19:37:28 -05:00
Rob Clark
d15db9e7c0 freedreno/ir3: drop instr_clone() stuff
Unnecessary and overly complicated.  And gets in the way for temp arrays
(TEMP[ADDR[]]).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-01-07 19:37:28 -05:00
Rob Clark
212b909643 freedreno/ir3: runtime enable RA debug for DEBUG builds
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-01-07 19:37:28 -05:00
Rob Clark
8c3952051e freedreno/ir3: handle relative addr in ir3_dump
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-01-07 19:37:28 -05:00
Rob Clark
56370b9feb freedreno/ir3: legalize vs unused sam dst components
We probably could be more clever elsewhere and mask out components that
are not used.  But either way, legalize should realize that there is
also a write-after-write hazard with texture sample instructions.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-01-07 19:37:28 -05:00
Rob Clark
063e2ef76a freedreno/ir3: hack for old compiler
Old compiler doesn't have ir3_block's.. so we need a special path.  This
hack can be dropped when ir3_compiler_old is retired.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-01-07 19:37:28 -05:00
Rob Clark
18899d1b80 tgsi: track max array per file
NOTE IN[] and OUT[] don't need (have?) ArrayID's.. and TEMP[] can
optionally have them.  So we implicitly assume that ArrayID==0 always
exists for each file.  This is why array_max[file] is never less than
zero.

You can tell from indirect_files(_read/written) if the legacy array-
id zero was actually used.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-01-07 19:37:28 -05:00
Rob Clark
49b4a6331f tgsi: keep track of read vs written indirects
At least temporarily, I need to fallback to old compiler still for
relative dest (for freedreno), but I can do relative src temp.  Only
a temporary situation, but seems easy/reasonable for tgsi-scan to
track this.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-01-07 19:37:28 -05:00
Marek Olšák
d7cd9bfc7f Revert "radeonsi: reduce the size of si_pm4_state"
This reverts commit 9141d88555.

It broke OpenCL.
2015-01-08 00:10:36 +01:00
Tom Stellard
e28f9d0e60 radeonsi: Fix crash when destroying si_screen
We were invalidating si_screen::tm by calling
r600_destroy_common_screen() which frees the si_screen object.  This
caused the driver to crash in LLVMDisposeTargetMachine() since we
were passing it an invalid pointer.

https://bugs.freedesktop.org/show_bug.cgi?id=88170
2015-01-07 16:28:40 -05:00
José Fonseca
2b7fd5b11d mesa: Don't use _mesa_generic_nop on Windows.
It doesn't work on Windows because of the STDCALL calling convention: it's
the callee's responsibility to pop the arguments, and the number of
arguments varies with the prototype, so the stack pointer ends up getting
corrupted.

This is just a non-invasive stop-gap fix.  A proper fix would be more
elaborate, and require either:
- a variation of __glapi_noop_table which sets GL_INVALID_OPERATION
  error
- stop using APIENTRY on all internal _mesa_* functions.

Tested with piglit gl-1.0-beginend-coverage (it now fails instead of
crashing).

VMware PR1350505

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-01-07 19:35:35 +00:00
José Fonseca
fd1f79f7dd glapi: Force frame pointer elimination on Windows.
To catch mismatches in cdecl vs stdcall calling convention.  See code
comment for more detailed explanation.

Tested with piglit gl-1.0-beginend-coverage (it now also crashes on
debug builds.)

VMware PR1350505.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-01-07 19:35:34 +00:00
Marek Olšák
1829f9c928 radeonsi: enable LLVM optimizations that assume no NaNs for non-compute shaders
v2: complete rewrite

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-01-07 18:27:54 +01:00
Marek Olšák
d8185aa9a8 radeonsi: emit SURFACE_SYNC last
This fixes a case where a transform feedback buffer is fed back as an index
buffer, because SURFACE_SYNC must be after VS_PARTIAL_FLUSH.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
7c9ec6ca7e radeonsi: flush all CB/DB caches unconditionally when changing the framebuffer
This is easier to read and will work better with shader image stores.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
a1bbccf521 radeonsi: change TC cache flushing strategy for textures
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
ca9c5b2be5 radeonsi: improve and fix streamout flushing
- we don't usually need to flush TC L2
- we should flush KCACHE
  (not really an issue now since we always flush KCACHE when updating
   descriptors, but it could be a problem if we used CE, which doesn't
   require flushing KCACHE)
- add an explicit VS_PARTIAL_FLUSH flag

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
18a30c9778 radeonsi: use TC L2 for CP DMA operations with shader resources on CIK
So that TC L2 doesn't need to be flushed.

The only problem is with index buffers, which don't use TC.
A simple solution is added that flushes TC L2 before a draw call (TC_L2_dirty).

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
11b76369f5 radeonsi: use TC L2 for updating descriptors on CIK
This allows not flushing TC L2 on CIK later.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
02ba7334d3 radeonsi: don't use TC L2 for updating descriptors on SI
It's causing problems, because we mix uncached CP DMA with cached WRITE_DATA
when updating the same memory.

The solution for SI is to use uncached access here, because CP DMA doesn't
support cached access.

CIK will be handled in the next patch.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
edf18da85d radeonsi: only flush the right set of caches for CP DMA operations
That's either framebuffer caches or caches for shader resources.
The motivation is that framebuffer caches need to be flushed very rarely
here.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
73c2b0d18c radeonsi: implement separate ICACHE and KCACHE flush for SI
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
0aecf9e2d1 radeonsi: add a combined flag for flushing a framebuffer
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
2bfe9d4538 radeonsi: rename flush flags, split the TC flag into L1 and L2
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
d217819e78 r600g,radeonsi: separate cache flush flags
I will rename them for radeonsi.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
d14f2ab4ad r600g: move r6xx-specific streamout flush flagging into r600g
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
0543630d0b radeonsi: only set BC_OPTIMIZE_DISABLE when necessary
SPI_PS_IN_CONTROL is moved into the SPI mapping state.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
5d8e838dae radeonsi: do not define FACE as an ordinary PS input
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
15a7fff69a radeonsi: remove flatshade from the shader key
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
13de9475fc radeonsi: remove special handling of TGSI_INTERPOLATE_COLOR in shader codegen
It doesn't do anything useful. And colors are floating-point, so we can use
fs.interp, remove "flatshade" from the shader key, and rely on the FLAT_SHADE
state only (in the next patch).

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
e3d4bdd6a8 radeonsi: implement VERTEXID_NOBASE and BASEVERTEX system values
Only done for completeness. Not used by anything yet.

Tested by advertising PIPE_CAP_VERTEXID_NOBASE.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
d7c6f397f4 radeonsi: fix VertexID for OpenGL
This fixes all failing piglit VertexID tests.

Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
368b0a7340 radeonsi: clarify a hw bug in shader exports
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
d1d2af2398 radeonsi: use ordered compares for SSG and face selection
Ordered compares are what you have in C. Unordered compares are the result
of negating ordered compares (they return true if either argument is NaN).

That special NaN behavior is completely useless here, and unordered
compares produce horrible code with all stable LLVM versions.
(I think that has been fixed in LLVM git)
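
The distinction can be sketched in plain C.  This is only an illustration
of the semantics described above, not the LLVM IR the driver emits; the
helper names are made up for the example.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Ordered less-than: what the C `<` operator gives you.
 * The result is false whenever either operand is NaN. */
static inline bool ordered_lt(float a, float b)
{
    return a < b;
}

/* Unordered less-than: the negation of the ordered `>=`.
 * The result is true whenever either operand is NaN. */
static inline bool unordered_lt(float a, float b)
{
    return !(a >= b);
}
```

For ordinary numbers the two agree; they only differ when a NaN is
involved, which is the behavior the message calls useless here.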

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
a38e8de643 radeonsi: remove unused and not useful variables
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
638fa8016a radeonsi: remove init config from states
It really doesn't do anything there.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
9141d88555 radeonsi: reduce the size of si_pm4_state
- the relocs array is unused, remove it
- ndw is at most 115 (init), set 140 as the maximum
- compute needs 4 buffers per state, graphics only needs 1; set 4 as the maximum

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák
1b82eb677d tgsi: add uses_centroid into tgsi_shader_info 2015-01-07 12:06:43 +01:00
Marek Olšák
eaae92a349 st/mesa: fix GL_PRIMITIVE_RESTART_FIXED_INDEX
Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-07 12:06:43 +01:00
Marek Olšák
8f5d309521 vbo: ignore primitive restart if FixedIndex is enabled in DrawArrays
From GL 4.4 Core profile:

  If both PRIMITIVE_RESTART and PRIMITIVE_RESTART_FIXED_INDEX are
  enabled, the index value determined by PRIMITIVE_RESTART_FIXED_INDEX is
  used. If PRIMITIVE_RESTART_FIXED_INDEX is enabled, primitive restart is not
  performed for array elements transferred by any drawing command not taking a
  type parameter, including all of the *Draw* commands other than
  *DrawElements*.
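
A minimal sketch of the rule quoted above, assuming the fixed index is
2^N - 1 for an N-bit index type (as the GL spec defines it).  The function
names and parameters are hypothetical and are not the vbo code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fixed restart index for an index type of index_size bytes (1, 2 or 4):
 * all bits set, i.e. 2^N - 1 for an N-bit type. */
static inline uint32_t fixed_restart_index(unsigned index_size)
{
    assert(index_size == 1 || index_size == 2 || index_size == 4);
    return 0xffffffffu >> (32 - 8 * index_size);
}

/* Returns true and stores the restart index in *out if primitive restart
 * applies to this draw.  `indexed` is false for draws that take no type
 * parameter (DrawArrays and friends). */
static inline bool
effective_restart_index(bool restart, bool fixed, bool indexed,
                        unsigned index_size, uint32_t user_index,
                        uint32_t *out)
{
    if (fixed) {
        /* FIXED_INDEX wins over the user index, and non-indexed draws
         * get no restart at all. */
        if (!indexed)
            return false;
        *out = fixed_restart_index(index_size);
        return true;
    }
    if (restart) {
        *out = user_index;
        return true;
    }
    return false;
}
```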

Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-07 12:06:42 +01:00
Eric Anholt
426fd535d9 vc4: Fix scaling W projection of the Z coordinate when there's a Z offset.
Fixes piglit glsl-fs-fragcoord-zw-perspective, es3conform
gl_FragCoord_z_frag, and the rest of the piglit glsl 1.10 interpolation
tests.
2015-01-06 17:22:13 -08:00
Eric Anholt
49b5c901e8 vc4: Fix deletion from the program cache.
The key is, oddly enough, in the key field, not in the data field (which
is the vc4_compiled_shader *).  Fixes regular failures in fp-long-alu.
2015-01-06 15:41:36 -08:00
Eric Anholt
b295403971 vc4: Skip storing the Z/S contents when it's invalidated.
Improves framerate of 5 seconds of es2gears by 1.57473% +/- 0.669409%
(n=67).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-01-06 15:40:41 -08:00
Eric Anholt
239db93888 gallium: Plumb the swap INVALIDATE_ANCILLARY flag through more layers.
v2: Instead of telling the driver that the window system ancillaries have
    been invalidated (when the driver doesn't know which of its buffers
    are the window system's!), introduce a method for invalidating
    specific surfaces.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-01-06 15:40:41 -08:00
Eric Anholt
70e8ccc459 egl: Inform the client API when ancillary buffers may become undefined.
This is part of the EGL spec, and is useful for a tiled renderer to avoid
the memory bandwidth cost of storing the depth/stencil buffers.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-01-06 15:40:40 -08:00
Vinson Lee
5ae1305124 ax_prog_flex.m4: Merge upstream OpenBSD fixes.
Merge the following upstream autoconf-archive patches.

ax_prog_flex: change grep syntax to accept e.g. "flex.real" in case a wrapper or symlink is used.
AX_PROG_FLEX: avoid use of grep empty string escape extension (fix for OpenBSD)
AX_PROG_FLEX: Also accept gflex.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jonathan Gray <jsg@openbsd.org>
2015-01-06 15:06:54 -08:00
Tom Stellard
a8ef880a1b radeon/llvm: Use amdgcn triple for SI+ on LLVM >= 3.6 2015-01-06 12:53:21 -08:00
Tom Stellard
761e36b4ca radeonsi: Cache LLVMTargetMachine object in si_screen
Rather than building a new one every compile.  This should reduce some
of the overhead of compiling shaders.

One consequence of this change is that we lose the MachineInstrs dumps
when dumping the shaders via R600_DEBUG.  The LLVM IR and assembly are
still dumped, and if you still want to see the MachineInstr dump, you
can run the dumped LLVM IR through llc.
2015-01-06 12:53:21 -08:00
Brian Paul
934e41c0b3 mesa: create, use new _mesa_texture_base_format() function
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-05 13:50:55 -07:00
Brian Paul
f262ed6e3d mesa: remove unused ctx parameter for _mesa_select_tex_image()
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-05 13:50:55 -07:00
Brian Paul
05279fa563 swrast: use new _mesa_base_tex_image() helper
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-05 13:50:55 -07:00
Brian Paul
58e8dd6b9d st/mesa: use new _mesa_base_tex_image() helper
This involved adding a new st_texture_image_const() helper also.

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-05 13:50:55 -07:00
Brian Paul
3a400cbb66 mesa: add _mesa_base_tex_image() helper function
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-05 13:50:54 -07:00
Brian Paul
d0fa559e49 mesa: simplify a conditional in detach_shader()
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-05 13:50:54 -07:00
Brian Paul
c0a445037b mesa: minor whitespace fixes in shaderapi.c
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-05 13:50:54 -07:00
Brian Paul
6d9aed19f3 mesa: make _mesa_reference_shader_program() an inline function
which wraps _mesa_reference_shader_program_(), similar to what we do
for other reference-counted objects.

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-05 13:50:54 -07:00
Brian Paul
3f687e995f mesa: update comment on delete_shader_program()
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-05 13:50:54 -07:00
Brian Paul
5b7e7cfb2b mesa: rearrange error handling in glProgramParameteri()
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-05 13:50:54 -07:00
Brian Paul
41dc2fee4e mesa: fix error strings in shaderapi.c
The _mesa_-prefixed function names should not appear in GL error
messages.

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-05 13:50:54 -07:00
Brian Paul
a6822e3135 glsl: use the is_gl_identifier() helper in a couple more places
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-05 13:50:54 -07:00
Brian Paul
83b344021b meta: init var to silence uninitialized variable warning 2015-01-05 13:50:54 -07:00
Brian Paul
d294365d06 draw: silence uninitialized variable warning
v2: move initialization of llvm_gs to declaration.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-01-05 13:50:54 -07:00
Brian Paul
04e35cc4aa gallivm: silence a couple compiler warnings
Silence warnings about possibly uninitialized variables when making a
release build.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-01-05 13:50:54 -07:00
Leonid Shatz
5fea39ace3 gallium/util: make sure cache line size is not zero
The "normal" detection (querying clflush size) already made sure it is
non-zero, however another method did not. This led to crashes if this
value happened to be zero (apparently can happen in virtualized environments
at least).
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=87913
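
A sketch of why a zero value is dangerous and how a guard avoids it.  The
fallback of 64 bytes and both helper names are assumptions made for this
example; the real detection code may pick a different default.

```c
#include <assert.h>

/* Assumed fallback for this sketch only. */
#define FALLBACK_CACHELINE 64u

/* Never hand out a zero cache line size, whatever detection reported. */
static inline unsigned sanitize_cacheline(unsigned detected)
{
    return detected ? detected : FALLBACK_CACHELINE;
}

/* Typical consumer: rounds a size up to a multiple of the cache line.
 * With cacheline == 0 this would divide by zero. */
static inline unsigned align_up(unsigned size, unsigned cacheline)
{
    assert(cacheline != 0);
    return ((size + cacheline - 1) / cacheline) * cacheline;
}
```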

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-05 17:58:39 +01:00
Roland Scheidegger
b59c7ed0ab gallium/util: fix crash with daz detection on x86
The code used PIPE_ALIGN_VAR for the variable used by fxsave, however this
does not work if the stack isn't aligned. Hence use PIPE_ALIGN_STACK function
decoration to fix the segfault which can happen if stack alignment is only
4 bytes.
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=87658.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-05 17:58:38 +01:00
Ilia Mirkin
21a280f87c nvc0: add name to magic number
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-05 00:34:33 -05:00
Ilia Mirkin
7228302009 nvc0: regenerate rnndb headers
The headers hadn't been regenerated in a long time and had seen a number
of manual modifications. A few changes:
 - remove nvc0_2d entirely, use the nv50 header which has the nvc0
   values too
 - remove 3ddefs, it's identical to the nv50 file
 - move macros out into a separate file

Also the upstream rnndb changed the overall chip naming convention; this
was fixed up manually in the generated files until a better solution is
determined.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-05 00:34:33 -05:00
Ilia Mirkin
7ed02b111a nv50: regenerate rnndb headers
The headers hadn't been regenerated in a long time, and there were a few
minor divergences. Among other things, rnndb has changed naming to
G80/etc, for now I've not tackled switching that over and manually
replaced the nvidia codenames back to the chip ids. However no other
modifications of the headergen'd headers were made.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-05 00:34:33 -05:00
Tobias Klausmann
1f8c0be27e nv50: enable texture compression
Compression seems to be supported for only some formats. Enable it for
those. Previously this was disabled for everything despite the code
looking like it was actually enabled.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-05 00:34:33 -05:00
Ilia Mirkin
e452cfb149 nv50/ir: enable sat modifier for OP_SUB
SUB is handled the same as ADD, so no reason not to allow a saturate
modifier on it.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-05 00:34:33 -05:00
Roy Spliet
44673512a8 nv50/ir: Add sat modifier for mul
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-05 00:34:33 -05:00
Ilia Mirkin
ec3e1e6194 nv50,nvc0: avoid doing work inside of an assert
assert is compiled out in release builds - don't put logic into it. Note
that this particular instance is only used for vp debugging and is
normally compiled out.
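
The pitfall can be shown with a small sketch.  The function names are
hypothetical; only the behavior of assert() under NDEBUG is standard C.

```c
#include <assert.h>
#include <stdbool.h>

static int calls;

/* Stand-in for any function with a side effect we depend on. */
static bool do_work(void)
{
    calls++;
    return true;
}

/* Fragile: when NDEBUG is defined, assert() expands to nothing, so
 * do_work() is never called in a release build. */
static inline void fragile(void)
{
    assert(do_work());
}

/* Robust: the call always happens; only the check is compiled out. */
static inline void robust(void)
{
    bool ok = do_work();
    assert(ok);
    (void)ok; /* avoid an unused-variable warning under NDEBUG */
}
```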

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-05 00:34:33 -05:00
Ilia Mirkin
fb1afd1ea5 nv50/ir: fix texture offsets in release builds
asserts get compiled out in release builds, so they can't be relied
upon to perform logic.

Reported-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Roy Spliet <rspliet@eclipso.eu>
Cc: "10.2 10.3 10.4" <mesa-stable@lists.freedesktop.org>
2015-01-05 00:34:33 -05:00
Kenneth Graunke
5464257263 i965: Micro-optimize swizzle_to_scs() and make it inlinable.
brw_swizzle_to_scs has been showing up in my CPU profiling, which is
rather silly - it's a tiny amount of code.  It really should be inlined,
and can easily be implemented with fewer instructions.

The enum translation is as follows:

SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_W, SWIZZLE_ZERO, SWIZZLE_ONE
        0          1          2          3             4            5
        4          5          6          7             0            1
  SCS_RED, SCS_GREEN,  SCS_BLUE, SCS_ALPHA,     SCS_ZERO,     SCS_ONE

which is simply (swizzle + 4) & 7.

Haswell needs extra textureGather workarounds to remap GREEN to BLUE,
but Broadwell and later do not.

This patch replicates swizzle_to_scs in gen7_wm_surface_state.c and
gen8_surface_state.c, since the Gen8+ code can be simplified to a mere
two instructions.  Both copies can be marked static for easy inlining.
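
As a sketch, the translation from the table above could look like this.
The enum values are copied from the table; the enum names follow the
message rather than the exact identifiers in the driver headers, and the
Haswell GREEN to BLUE workaround is left out.

```c
#include <assert.h>

enum {
    SWIZZLE_X = 0, SWIZZLE_Y = 1, SWIZZLE_Z = 2, SWIZZLE_W = 3,
    SWIZZLE_ZERO = 4, SWIZZLE_ONE = 5
};

enum {
    SCS_ZERO = 0, SCS_ONE = 1,
    SCS_RED = 4, SCS_GREEN = 5, SCS_BLUE = 6, SCS_ALPHA = 7
};

/* Adding 4 and wrapping at 8 maps X..W to 4..7 and ZERO/ONE to 0/1. */
static inline unsigned swizzle_to_scs(unsigned swizzle)
{
    assert(swizzle <= (unsigned)SWIZZLE_ONE);
    return (swizzle + 4) & 7;
}
```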

v2: Put the commit message in the code as comments (requested by
    Jason Ekstrand).  Also fix a typo.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-04 21:31:40 -08:00
Kenneth Graunke
f3ad1804eb i965: Support MESA_FORMAT_R8G8B8X8_SRGB.
Valve games use GL_SRGB8 textures.  Instead of supporting that properly,
we fell back to MESA_FORMAT_R8G8B8A8_SRGB (with an alpha channel), which
meant that we had to use texture swizzling to override the alpha to 1.0
when sampling.  This meant shader recompiles on Gen < 7.5 platforms.

By supporting MESA_FORMAT_R8G8B8X8_SRGB, the hardware just returns 1.0
for us, so we can just use SWIZZLE_XYZW, and avoid any recompiles.  All
generations of hardware have supported the format for sampling and
filtering; we can easily support rendering by using the R8G8B8A8_SRGB
format and writing garbage to the X channel.  (We do this already for
the non-SRGB version of this format.)

This removes all remaining shader recompiles in a time demo of "Counter
Strike: Global Offensive" (32 -> 0) on Sandybridge.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87886
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-01-04 21:31:40 -08:00
Kenneth Graunke
51b9382da8 i965: Fix BLORP sRGB MSAA overrides to cope with X vs. A formats.
The logic in brw_blorp_surface_info::set uses brw_format_for_mesa_format
for source surfaces, and brw->render_target_format[] for destination
surfaces.  We should do the same in the sRGB MSAA overrides.

Currently, this isn't a problem, since SRGB MSAA buffers are all RGBA.
The next commit will introduce RGBX SRGB MSAA buffers, at which point
we need to get the RGBX -> RGBA format overrides for rendering right.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-01-04 21:31:40 -08:00
Kenneth Graunke
1f1102c834 i965: Copy shader->shadow_samplers to prog->ShadowSamplers.
ir_to_mesa does this - apparently we just forgot or something.

Without this, we'll guess the wrong texture swizzle (XYZW for color
instead of XXX1 for depth) when doing precompiles.

This cuts 26 shader recompiles in a time demo of "Counter Strike:
Global Offensive" (58 -> 32) on Sandybridge.  Haswell still has 0
recompiles.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87886
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-01-04 21:31:40 -08:00
Kenneth Graunke
0b98b2bf53 i965: Make the precompile ignore DEPTH_TEXTURE_MODE on Gen7.5+.
Gen7.5+ platforms that support the "Shader Channel Select" feature leave
key->tex.swizzles[i] as SWIZZLE_NOOP except when GL_DEPTH_TEXTURE_MODE
is GL_ALPHA (which is really uncommon).  So, the precompile should leave
them as SWIZZLE_NOOP (aka SWIZZLE_XYZW) as well.

We didn't notice this because prog->ShadowSamplers is not set correctly.
The next patch will fix that problem.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87886
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-01-04 21:31:40 -08:00
Kenneth Graunke
d41cf9fb60 i965: Implement WaCsStallAtEveryFourthPipecontrol on IVB/BYT.
According to the documentation, we need to do a CS stall on every fourth
PIPE_CONTROL command to avoid GPU hangs.  The kernel does a CS stall
between batches, so we only need to count the PIPE_CONTROLs in our batches.

v2: Get the generation check right (caught by Chris Wilson),
    combine the ++ with the check (suggested by Daniel Vetter).
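
A rough sketch of the counting scheme, with the increment folded into the
check as v2 describes.  The struct and function names are hypothetical and
the real driver code may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>

/* Per-batch state; the kernel stalls between batches, so the counter
 * only has to cover PIPE_CONTROLs within one batch. */
struct batch_state {
    unsigned pipe_controls_since_stall;
};

/* Call once per PIPE_CONTROL.  Returns true when this one must carry a
 * CS stall, which is every fourth call. */
static inline bool pipe_control_needs_cs_stall(struct batch_state *b)
{
    if (++b->pipe_controls_since_stall == 4) {
        b->pipe_controls_since_stall = 0;
        return true;
    }
    return false;
}

/* Call when a new batch starts. */
static inline void batch_reset(struct batch_state *b)
{
    b->pipe_controls_since_stall = 0;
}
```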

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-04 17:21:33 -08:00
Marek Olšák
3793a1b421 r300g: handle vertex format PIPE_FORMAT_NONE 2015-01-04 23:54:47 +01:00
Marek Olšák
48094d0e65 glsl_to_tgsi: fix a bug in copy propagation
This fixes the new piglit test: arb_uniform_buffer_object/2-buffers-bug

Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-01-03 13:25:30 +01:00
Kenneth Graunke
916516b251 i965: Make INTEL_DEBUG=state ignore state flags with a count of 1.
There are too many state flags to fit in one terminal screen, even with
a very tall terminal.  Everything is flagged once, so a value of 1 means
that it hasn't ever happened again, and thus isn't terribly interesting.

Skipping those makes it easier to see the interesting values.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-03 01:45:15 -08:00
Kenneth Graunke
408e298942 i965: Fix INTEL_DEBUG=optimizer with VF types.
Hardcoding stderr is wrong; INTEL_DEBUG=optimizer uses other files.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-03 01:45:15 -08:00
Kenneth Graunke
9b8bd67768 i965: Show opt_vector_float() and later passes in INTEL_DEBUG=optimizer.
In order to support calling opt_vector_float() inside a condition, this
patch makes OPT() a statement expression:

https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html

We've used that elsewhere already.
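
For readers unfamiliar with the extension, here is a minimal sketch of a
statement-expression macro in the same spirit.  It needs GCC or Clang;
the counters and the example pass are made up and this is not the actual
OPT() definition.

```c
#include <assert.h>
#include <stdbool.h>

static int pass_num;        /* how many passes have run */
static int progress_count;  /* how many of them changed something */

/* The value of the ({ ... }) block is its last expression, so the macro
 * can do bookkeeping and still be used inside an `if` condition. */
#define OPT(pass, ...) ({                      \
    bool this_progress = pass(__VA_ARGS__);    \
    pass_num++;                                \
    if (this_progress)                         \
        progress_count++;                      \
    this_progress;                             \
})

/* Toy "optimization pass": halves an even, non-zero value. */
static inline bool halve_if_even(int *v)
{
    if (*v != 0 && *v % 2 == 0) {
        *v /= 2;
        return true;
    }
    return false;
}
```

With this shape, something like `if (OPT(halve_if_even, &v)) { ... }`
works, which a plain do/while(0) macro would not allow.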

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-03 01:45:15 -08:00
Jeremy Huddleston Sequoia
61711316f5 swrast: Fix -Wduplicate-decl-specifier warning
swrast.c:67:12: warning: duplicate 'const' declaration specifier [-Wduplicate-decl-specifier]
const char const *swrast_vendor_string = "Mesa Project";
           ^
swrast.c:68:12: warning: duplicate 'const' declaration specifier [-Wduplicate-decl-specifier]
const char const *swrast_renderer_string = "Software Rasterizer";
           ^
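
For reference, a sketch of the two usual ways to write such a declaration
without the duplicate specifier.  Which one the fix actually chose is not
stated here, and the variable names below are placeholders.

```c
#include <assert.h>

/* Warns: both `const` apply to `char`, so the second one is redundant.
 *   const char const *name = "...";
 */

/* Pointer to const char: the characters are read-only, the pointer
 * itself may be reassigned. */
static const char *vendor_string = "Mesa Project";

/* Const pointer to const char: neither the pointer nor the characters
 * may change. */
static const char *const renderer_string = "Software Rasterizer";
```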

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2015-01-01 19:55:43 -08:00
Roy Spliet
c3260f8d98 nv50/ir: Fold sat into mad
The mad instruction emitter already supported the saturate modifier,
but the ModifierFolding pass never tried folding cvt sat operations
in for NV50.

Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-01 21:40:35 -05:00
Ilia Mirkin
9e94b87b60 nv50/ir: fold MAD when one of the multiplicands is const
Fold MAD dst, src0, immed, src2 (or src0/immed swapped) when
 - immed = 0 -> MOV dst, src2
 - immed = +/- 1 -> ADD dst, src0, src2
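
The folding rules can be sketched as a small decision function.  The
types and names are hypothetical, not the nv50/ir classes; the -1 case
is modeled as an ADD with a negate modifier on src0.

```c
#include <assert.h>
#include <stdbool.h>

enum op { OP_MOV, OP_ADD, OP_MAD };

struct folded_mad {
    enum op op;      /* opcode after folding */
    bool neg_src0;   /* negate src0 (only meaningful for OP_ADD) */
};

/* Decide how MAD dst, src0, immed, src2 can be simplified. */
static inline struct folded_mad fold_mad_imm(float immed)
{
    struct folded_mad r = { OP_MAD, false };

    if (immed == 0.0f) {
        r.op = OP_MOV;          /* src0 * 0 + src2 -> src2 */
    } else if (immed == 1.0f) {
        r.op = OP_ADD;          /* src0 * 1 + src2 -> src0 + src2 */
    } else if (immed == -1.0f) {
        r.op = OP_ADD;          /* src0 * -1 + src2 -> -src0 + src2 */
        r.neg_src0 = true;
    }
    return r;
}
```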

These types of MAD patterns were observed in some st/nine shaders.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-01-01 21:40:35 -05:00
Alexander von Gluck IV
290553b6d6 gallium/state_tracker: Rewrite Haiku's state tracker
* More gallium-like
* Leverage stamps properly and don't call mesa functions
2015-01-01 21:33:36 -05:00
Marek Olšák
b77eaafcdc radeonsi: fix warnings 2015-01-01 14:42:32 +01:00
Kenneth Graunke
c633528cba i965: Fix start/base_vertex_location for >1 prims but !BRW_NEW_VERTICES.
This is a partial revert of c89306983c.
It split the {start,base}_vertex_location handling into several steps:

1. Set brw->draw.start_vertex_location = prim[i].start
   and brw->draw.base_vertex_location = prim[i].basevertex.
   (This happened once per _mesa_prim, in the main drawing loop.)
2. Add brw->vb.start_vertex_bias and brw->ib.start_vertex_offset
   appropriately.  (This happened in brw_prepare_shader_draw_parameters,
   which was called just after brw_prepare_vertices, as part of state
   upload, and only happened when BRW_NEW_VERTICES was flagged.)
3. Use those values when emitting 3DPRIMITIVE (once per _mesa_prim).

If we drew multiple _mesa_prims, but didn't flag BRW_NEW_VERTICES on
the second (or later) primitives, we would do step #1, but not #2.
The first _mesa_prim would get correct values, but subsequent ones
would only get the first half of the summation.

The reason I originally did this was because I needed the value of
gl_BaseVertexARB to exist in a buffer object prior to uploading
3DSTATE_VERTEX_BUFFERS.  I believed I wanted to upload the value
of 3DPRIMITIVE's "Base Vertex Location" field, which was computed
as: (prims[i].indexed ? prims[i].start : prims[i].basevertex) +
brw->vb.start_vertex_bias.  The latter value wasn't available until
after brw_prepare_vertices, and the former weren't available in the
state upload code at all.  Hence the awkward split.

However, I believe that including brw->vb.start_vertex_bias was a
mistake.  It's an extra bias we apply when uploading vertex data into
VBOs, to move [min_index, max_index] to [0, max_index - min_index].

From the GL_ARB_shader_draw_parameters specification:
"<gl_BaseVertexARB> holds the integer value passed to the <baseVertex>
 parameter to the command that resulted in the current shader
 invocation.  In the case where the command has no <baseVertex>
 parameter, the value of <gl_BaseVertexARB> is zero."

I conclude that gl_BaseVertexARB should only include the baseVertex
parameter from glDraw*Elements*, not any internal biases we add for
optimization purposes.

With that in mind, gl_BaseVertexARB only needs prim[i].start or
prim[i].basevertex.  We can simply store that, and go back to computing
start_vertex_location and base_vertex_location in brw_emit_prim(), like
we used to.  This is much simpler, and should actually fix two bugs.

Fixes missing geometry in Unvanquished.

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85529
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-12-31 17:10:47 -08:00
Kenneth Graunke
faa615a798 i965: Use WARN_ONCE for the single-primitive-exceeded-aperture message.
This makes it show up via ARB_debug_output and is also less code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-12-31 17:06:51 -08:00
Eric Anholt
a6f6d6188c u_primconvert: Fix leak of the upload BO on context destroy.
v2: Conditionalize it on having done any uploads (Turns out
    u_upload_destroy() isn't safe with a NULL arg).

Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
2014-12-31 13:50:17 -08:00
Eric Anholt
37478c638a vc4: Fix memory leak as of 0404e7fe0a.
Can't reset the CL before looking at how much we had put in it.
2014-12-31 11:34:28 -08:00
Ilia Mirkin
be0311c962 nv50,nvc0: set vertex id base to index_bias
Fixes the piglits which check that gl_VertexID includes the base vertex
offset:
  arb_draw_indirect-vertexid elements
  gl-3.2-basevertex-vertexid

Note that this leaves out the original G80, for which this will continue
to fail. It could be fixed by passing a driver constbuf value in, but
that's beyond the scope of this change.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
2014-12-30 23:30:23 -05:00
Tiziano Bacocco
609c3e51f5 nv50,nvc0: implement half_pixel_center
LAST_LINE_PIXEL has actually been renamed to PIXEL_CENTER_INTEGER in
rnndb; use that method to implement the rasterizer setting, used for
st/nine.

Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2014-12-30 20:11:55 -05:00
Eric Anholt
3ba57bae47 vc4: Only render tiles where the scissor ever intersected them.
This gives a 2.7x improvement in x11perf -rect100, since we only end up
load/storing the x11perf window, not the whole screen.
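
A sketch of the idea: turn each scissor rectangle into a range of tiles
and keep the union over all draws.  The names are hypothetical, the tile
size is passed in rather than taken from the hardware, and the real
driver bookkeeping may look different.

```c
#include <assert.h>

/* Inclusive range of tile coordinates. */
struct tile_range {
    unsigned min_x, min_y, max_x, max_y;
};

/* Tiles touched by the pixel rectangle [x0, x1) x [y0, y1). */
static inline struct tile_range
tiles_for_rect(unsigned x0, unsigned y0, unsigned x1, unsigned y1,
               unsigned tile_w, unsigned tile_h)
{
    struct tile_range r;

    assert(x1 > x0 && y1 > y0 && tile_w != 0 && tile_h != 0);
    r.min_x = x0 / tile_w;
    r.min_y = y0 / tile_h;
    r.max_x = (x1 - 1) / tile_w;
    r.max_y = (y1 - 1) / tile_h;
    return r;
}

/* Grow a range so it also covers another one (one call per draw). */
static inline struct tile_range
tile_range_union(struct tile_range a, struct tile_range b)
{
    struct tile_range r;

    r.min_x = a.min_x < b.min_x ? a.min_x : b.min_x;
    r.min_y = a.min_y < b.min_y ? a.min_y : b.min_y;
    r.max_x = a.max_x > b.max_x ? a.max_x : b.max_x;
    r.max_y = a.max_y > b.max_y ? a.max_y : b.max_y;
    return r;
}
```

Only tiles inside the final range would then need to be loaded and
stored at flush time.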
2014-12-30 14:33:52 -08:00
Eric Anholt
0404e7fe0a vc4: Move draw call reset handling to a helper function.
This will be more important in the next commit, when there's more state to
reset to nonzero values, and I want an early exit from the submit
function.
2014-12-30 14:30:59 -08:00
Eric Anholt
effb39e899 vc4: Drop the content of vc4_flush_resource().
The callers all follow it with a flush of the context, and the flush of
the context gives us more information about how things are being flushed.
2014-12-30 14:30:59 -08:00
Emil Velikov
64dcb2bb0a docs: add news item and link release notes for mesa 10.3.6/10.4.1
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-30 02:50:43 +00:00
Emil Velikov
4fa6024b5f docs: Add sha256 sums for the 10.4.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-30 02:45:36 +00:00
Emil Velikov
73ec4e2265 Add release notes for the 10.4.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-30 02:45:34 +00:00
Emil Velikov
dd0f2f3695 docs: Add sha256 sums for the 10.3.6 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-30 02:45:30 +00:00
Emil Velikov
184246b6d9 Add release notes for the 10.3.6 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-30 02:45:29 +00:00
Matt Turner
6c18279b9f mesa: Remove __SSE4_1__ guards from sse_minmax.c.
See commit e07c9a288.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-12-29 12:17:06 -08:00
Matt Turner
798c094e62 i965/vec4: Do separate copy followed by constant propagation after opt_vector_float().
total instructions in shared programs: 5877012 -> 5876617 (-0.01%)
instructions in affected programs:     33140 -> 32745 (-1.19%)

From before the commit that allows VF constant propagation (which hurt
some programs) to here, the results are:

total instructions in shared programs: 5877951 -> 5876617 (-0.02%)
instructions in affected programs:     123444 -> 122110 (-1.08%)

with no programs hurt.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-29 10:08:18 -08:00
Matt Turner
d61c519822 i965/vec4: Allow constant propagation of VF immediates.
total instructions in shared programs: 5877951 -> 5877012 (-0.02%)
instructions in affected programs:     155923 -> 154984 (-0.60%)

Helps 1233, hurts 156 shaders. The hurt shaders are addressed in the
next commit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-29 10:08:18 -08:00
Matt Turner
c855f49c99 i965/vec4: Add parameter to skip doing constant propagation.
After CSEing some MOV ..., VF instructions we have code like

   mov tmp, [1F, 2F, 3F, 4F]VF
   mov r10, tmp
   mov r11, tmp
   ...
   use r10
   use r11

We want to copy propagate tmp into the uses of r10 and r11, but *not*
constant propagate the VF immediate into the uses of tmp.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-29 10:08:18 -08:00
Matt Turner
bbdd3198a5 i965/vec4: Do CSE, copy propagation, and DCE after opt_vector_float().
total instructions in shared programs: 5869005 -> 5868220 (-0.01%)
instructions in affected programs:     70208 -> 69423 (-1.12%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-29 10:08:18 -08:00
Matt Turner
7463e6d61b i965/vec4: Perform CSE on MOV ..., VF instructions.
Port of commit a28ad9d4 from the fs backend.

No shader-db changes since we don't emit MOV ..., VF instructions yet.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-29 10:08:18 -08:00
Matt Turner
44573458bd i965/vec4: Add pass to gather constants into a vector-float MOV.
Currently only handles consecutive instructions with the same
destination that collectively write all channels.

total instructions in shared programs: 5879798 -> 5869011 (-0.18%)
instructions in affected programs:     465236 -> 454449 (-2.32%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-29 10:08:18 -08:00
Matt Turner
7bc6e455e2 i965: Add support for saturating immediates.
I don't feel great about assert(!"unimplemented: ...") but these
cases do only seem possible under some currently impossible circumstances.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-29 10:08:18 -08:00
Matt Turner
3978585bcc i965: Add fs_reg/src_reg constructors that take vf[4].
Sometimes it's easier to generate 4x values into an array, and the
memcpy is 1 instruction, rather than 11 to piece 4 arguments together.

I'd forgotten to remove the prototype from fs_reg from a previous patch,
so it's already there for us here.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-29 10:05:03 -08:00
Alexander von Gluck IV
0c7f895995 gallium/target: Drop no longer needed Haiku viewport override
* Drop no longer needed mesa headers
* Haiku LLVM pipe working with LLVM 3.5.0 on x86_64
2014-12-27 06:12:54 +00:00
Alexander von Gluck IV
2b3a570920 gallium/st: Clean up Haiku depth mapping, fix colorspace errors 2014-12-27 05:55:29 +00:00
Eric Anholt
cb5a37249c vc4: Handle unaligned accesses in CL emits.
As of 229bf4475f we started getting SIGBUS
from unaligned accesses on the hardware, for reasons I haven't figured
out.  However, we should be avoiding unaligned accesses anyway, and our CL
setup certainly would have produced them.
2014-12-25 15:47:39 -10:00
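A minimal sketch of the general technique for avoiding unaligned stores when packing values into a byte stream. The helper names (`put_unaligned_32`, `get_unaligned_32`) are hypothetical and this is not the vc4 code itself; it only assumes that copying through `memcpy` is acceptable for the stream in question.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers, not the actual vc4 CL emit functions.
 * Store a 32-bit value at an arbitrarily aligned address by copying
 * its bytes instead of dereferencing a uint32_t pointer. */
static inline void put_unaligned_32(void *dst, uint32_t value)
{
    assert(dst != NULL);
    memcpy(dst, &value, sizeof(value));
}

/* Load a 32-bit value back from an arbitrarily aligned address. */
static inline uint32_t get_unaligned_32(const void *src)
{
    uint32_t value;

    assert(src != NULL);
    memcpy(&value, src, sizeof(value));
    return value;
}
```

Compilers typically turn a fixed-size `memcpy` like this into whatever load/store sequence is safe on the target, so the cost compared with a plain pointer store is usually small.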
Eric Anholt
db6e054eb0 vc4: Don't bother zero-initializing the shader reloc indices.
They should all be set to real values by the time they're read, and
ideally if you used valgrind you'd see uninitialized value uses.
2014-12-25 12:25:41 -10:00
Eric Anholt
0b607b54ce vc4: Fix the argument type for cl_u16().
It doesn't matter, since it just got truncated to 16 inside, anyway.
2014-12-25 12:25:41 -10:00
Alexander von Gluck IV
890ef622d6 egl: Fix non-dri SCons builds re #87657
* Revert change to egl main producing Shared Libraries
* Check for dri before including dri code
2014-12-25 10:34:49 -05:00
Michel Dänzer
b3057f8097 radeonsi: Don't modify PA_SC_RASTER_CONFIG register value if rb_mask == 0
E.g. this could happen on older kernels which don't support the
RADEON_INFO_SI_BACKEND_ENABLED_MASK query yet. The code in
si_write_harvested_raster_configs() doesn't deal with this correctly and
would probably mangle the value badly.

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-12-25 12:06:22 +09:00
Eric Anholt
229bf4475f vc4: Optimize CL emits by doing size checks up front.
The optimizer obviously doesn't have the ability to rewrite these to skip
the size checks per call, so we have to do it manually.

Improves a norast benchmark on simulation by 0.779706% +/- 0.405838%
(n=6087).
2014-12-24 10:28:26 -10:00
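A rough sketch of the "check the size once, then emit unchecked" idea described above. The `struct stream` type and the `stream_reserve`/`emit_*` names are assumptions made for illustration and do not mirror the real vc4 CL API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte stream, not the actual vc4 CL struct. */
struct stream {
    uint8_t *base;
    size_t used;
    size_t capacity;
};

/* Grow the stream once so that `bytes` more can be written unchecked.
 * Returns 0 on success, -1 if the allocation fails. */
static int stream_reserve(struct stream *s, size_t bytes)
{
    if (s->used + bytes <= s->capacity)
        return 0;

    size_t new_cap = s->capacity ? s->capacity : 64;
    while (new_cap < s->used + bytes)
        new_cap *= 2;

    uint8_t *p = realloc(s->base, new_cap);
    if (!p)
        return -1;
    s->base = p;
    s->capacity = new_cap;
    return 0;
}

/* Unchecked emits: the caller must have reserved enough space already.
 * The asserts document that contract and vanish in NDEBUG builds. */
static inline void emit_u8(struct stream *s, uint8_t v)
{
    assert(s->used + 1 <= s->capacity);
    s->base[s->used++] = v;
}

static inline void emit_u32(struct stream *s, uint32_t v)
{
    assert(s->used + 4 <= s->capacity);
    memcpy(s->base + s->used, &v, 4);
    s->used += 4;
}
```

A caller that knows a packet is 9 bytes long would call `stream_reserve(&s, 9)` once and then issue one `emit_u8` and two `emit_u32` calls without any further growth checks.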
Eric Anholt
20e3a2430e vc4: Avoid repeated hindex lookups in the loop over tiles.
Improves norast performance of a microbenchmark by 11.1865% +/- 2.37673%
(n=20).
2014-12-24 08:28:33 -10:00
Kenneth Graunke
4616b2ef85 i965: Add missing BRW_NEW_*_PROG_DATA to texture/renderbuffer atoms.
This was probably missed when moving from a fixed binding table layout
to a dynamic one that changes based on the shader.

Fixes newly proposed Piglit test fbo-mrt-new-bind.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87619
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Mike Stroyan <mike@LunarG.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
2014-12-24 00:15:40 -08:00
Kenneth Graunke
b7f14e03e3 i965: Cache register write capability checks.
Our ability to perform register writes depends on the hardware and
kernel version.  It shouldn't ever change on a per-context basis,
so we only need to check once.

Checking introduces a synchronization point between the CPU and GPU:
even though we submit very few GPU commands, the GPU might be busy doing
other work, which could cause us to stall for a while.

On an idle i7 4750HQ, this improves performance in OglDrvCtx (a context
creation microbenchmark) by 6.14748% +/- 1.6837% (n=20).  With Unigine
Valley running in the background (to keep the GPU busy), it improves
performance in OglDrvCtx by 2290.92% +/- 29.5274% (n=5).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2014-12-24 00:15:40 -08:00
Rob Clark
f332cf92b6 freedreno/ir3: split out legalize pass
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-23 19:53:01 -05:00
Rob Clark
4097ef6ee8 freedreno/ir3: ra debug
Some compile time RA debug

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-23 19:53:01 -05:00
Alexander von Gluck IV
402c808372 egl/haiku: Clean up SConscript whitespace 2014-12-23 09:07:58 -05:00
Alexander von Gluck IV
49ce07878d egl/dri2: Fix build of dri2 egl driver with SCons
* egl/dri2 was missing a SConscript
* Problem caught by Adrián Arroyo Calle
2014-12-23 09:07:58 -05:00
Alexander von Gluck IV
e7ac21202d egl: Clean up Haiku visual creation
* Only create one struct
* 'final' also is a language conflict
* Some style cleanup
2014-12-23 09:07:58 -05:00
Alexander von Gluck IV
400b833592 egl: Add Haiku code and support
* This is the cleaned up work of the Haiku GCI student
  Adrián Arroyo Calle adrian.arroyocalle@gmail.com
* Several patches were consolidated to prevent
  unnecessary touching of non-related code
2014-12-23 09:07:57 -05:00
Timothy Arceri
da4fb3e7a1 glsl: check if implicitly sized arrays match explicitly sized arrays across the same stage
V2: Improve error message.

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-12-23 19:32:56 +11:00
Chad Versace
414be86c96 i965: Use safer pointer arithmetic in gather_oa_results()
This patch reduces the likelihood of pointer arithmetic overflow bugs in
gather_oa_results(), like the one fixed by b69c7c5dac.

I haven't yet encountered any overflow bugs in the wild along this
patch's codepath. But I get nervous when I see code patterns like this:

   (void*) + (int) * (int)

I smell 32-bit overflow all over this code.

This patch retypes 'snapshot_size' to 'ptrdiff_t', which should fix any
potential overflow.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2014-12-22 15:47:14 -06:00
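A small illustration of why the `(void*) + (int) * (int)` pattern is fragile and how widening to `ptrdiff_t` helps. The function names are hypothetical; only the `snapshot_size` name comes from the commit message, and the sketch assumes a platform where `ptrdiff_t` is wider than `int`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: address of the snapshot at `index` in a buffer of
 * fixed-size snapshots. Because `snapshot_size` is a ptrdiff_t, the
 * multiplication is done in pointer-difference width, not in int. */
static inline void *snapshot_at(void *base, int index, ptrdiff_t snapshot_size)
{
    assert(index >= 0 && snapshot_size > 0);
    return (uint8_t *)base + (ptrdiff_t)index * snapshot_size;
}

/* Byte offset only. Both operands are widened before multiplying, so the
 * product is never computed in 32-bit int arithmetic. */
static inline ptrdiff_t snapshot_offset(int index, int snapshot_size)
{
    return (ptrdiff_t)index * (ptrdiff_t)snapshot_size;
}
```

With two plain `int` operands the product is evaluated as an `int` before it is added to the pointer, so large index and stride values can overflow even though the final address would fit.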
Chad Versace
225a09790d i965: Use safer pointer arithmetic in intel_texsubimage_tiled_memcpy()
This patch reduces the likelihood of pointer arithmetic overflow bugs in
intel_texsubimage_tiled_memcpy(), like the one fixed by b69c7c5dac.

I haven't yet encountered any overflow bugs in the wild along this
patch's codepath. But I recently solved, in commit b69c7c5dac, an overflow
bug in a line of code that looks very similar to pointer arithmetic in
this function.

This patch conceptually applies the same fix as in b69c7c5dac. Instead
of retyping the variables, though, this patch adds some casts. (I tried
to retype the variables as ptrdiff_t, but it quickly got very messy. The
casts are cleaner).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2014-12-22 15:47:11 -06:00
Chad Versace
aebcf26d82 i965: Fix intel_miptree_map() signature to be more 64-bit safe
This patch should diminish the likelihood of pointer arithmetic overflow
bugs, like the one fixed by b69c7c5dac.

Change the type of parameter 'out_stride' from int to ptrdiff_t. The
logic is that if you call intel_miptree_map() and use the value of
'out_stride', then you must be doing pointer arithmetic on 'out_ptr'.
Using ptrdiff_t instead of int should make it a little bit harder to hit
overflow bugs.

As a side-effect, some function-scope variables needed to be retyped to
avoid compilation errors.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2014-12-22 15:47:07 -06:00
Chad Versace
d11bc9fe8d i965: Remove spurious casts in copy_image_with_memcpy()
If a pointer points to raw, untyped memory and is never dereferenced,
then declare it as 'void*' instead of casting it to 'void*'.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-22 15:46:54 -06:00
Marek Olšák
2150db4d5d radeonsi: force NaNs to 0
This fixes incorrect rendering in Unreal Engine demos.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83510

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-21 20:34:38 +01:00
David Heidelberg
4fb1d00f4e st/nine: fix DBG typo (trivial)
Signed-off-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-12-21 20:34:19 +01:00
David Heidelberg
fbfe2918f4 r300g: implement ARR opcode
Same as ARL, just has extra rounding.
Useful for st/nine.

Tested-by: Pavel Ondračka <pavel.ondracka@email.cz>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-12-21 20:34:19 +01:00
Rob Clark
aa6415b485 freedreno/a4xx: blend-color
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-20 12:08:37 -05:00
Rob Clark
10d81a03b3 freedreno/a4xx: alpha-test
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-20 12:08:37 -05:00
Rob Clark
097d760aac freedreno: update generated headers 2014-12-20 12:08:37 -05:00
Rob Clark
f20a0acd43 freedreno/ir3: trans_kill cleanup
trans_kill() only handles the single opcode.  Drop the remnant of a time
when both KILL and KILL_IF were handled by the same fxn.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-20 12:08:37 -05:00
Rob Clark
4ee545646d freedreno/ir3: hack for standalone compiler
Standalone compiler doesn't have screen or context.  We need to come up
with a better way to control the target arch (ie. something that we can
control from cmdline w/ standalone compiler) but for now this hack keeps
it from segfault'ing.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-20 12:08:37 -05:00
Matt Turner
a5481d6fbb i965/fs: Add missing const qualifier. 2014-12-19 12:55:13 -08:00
Eric Anholt
e06b0778f5 vc4: Coalesce MOVs into VPM with the instructions generating the values.
total instructions in shared programs: 41168 -> 40976 (-0.47%)
instructions in affected programs:     18156 -> 17964 (-1.06%)
2014-12-18 15:00:56 -08:00
Eric Anholt
a871eff16c vc4: Redefine VPM writes as a (destination) QIR register file.
This will let me coalesce the VPM writes into the instructions generating
the values.
2014-12-17 22:35:08 -08:00
Timothy Arceri
a9e77896a7 docs: note change in minimum GCC version to 4.2.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-12-18 16:08:27 +11:00
Timothy Arceri
743a684512 gallium: remove support for GCC older than 4.2.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-18 16:08:19 +11:00
Timothy Arceri
6852dce591 mesa: bump required GCC version to 4.2.0
It turns out Mesa hasn't compiled on less than 4.2 for a while
so update conf to reflect this.

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-18 16:08:11 +11:00
Eric Anholt
e473fbe469 vc4: Add support for turning constant uniforms into small immediates.
Small immediates have the downside of taking over the raddr B field, so
you might have less chance to pack instructions together thanks to raddr B
conflicts.  However, it also reduces some register pressure since it lets
you load 2 "uniform" values in one instruction (avoiding a previous load
of the constant value to a register), and increases some pairing for the
same reason.

total uniforms in shared programs: 16231 -> 13374 (-17.60%)
uniforms in affected programs:     10280 -> 7423 (-27.79%)
total instructions in shared programs: 40795 -> 41168 (0.91%)
instructions in affected programs:     25551 -> 25924 (1.46%)

In a previous version of this patch I had a reduction in instruction count
by forcing the other args alongside a SMALL_IMM to be in the A file or
accumulators, but that increases register pressure and had a bug in
handling FRAG_Z.  In this patch I just use raddr conflict resolution,
which is more expensive.  I think I'd rather tweak allocation to have some
way to slightly prefer good choices for files in general, rather than risk
failing to register allocate by forcing things into register classes.
2014-12-17 19:35:13 -08:00
Eric Anholt
ff266483fb vc4: Move follow_movs() to common QIR code.
I want this from other passes.
2014-12-17 19:05:52 -08:00
Eric Anholt
8d22e8907f vc4: Fix missing newline for load immediate instruction disasm. 2014-12-17 19:05:52 -08:00
Matt Turner
18ebf9e251 mesa: Remove unnecessary -f from $(RM).
$(RM) includes -f.
2014-12-17 17:54:33 -08:00
Matt Turner
b2b6cf2437 mesa: Remove tarballs/checksum rules. 2014-12-17 17:54:33 -08:00
Matt Turner
4cc8d66f74 gallium: Add egl and gbm to distribution. 2014-12-17 17:54:33 -08:00
Matt Turner
baedd68ca9 mesa: Set DISTCHECK_CONFIGURE_FLAGS.
Enable some non-default options that distros are likely to use.
2014-12-17 17:54:33 -08:00
Matt Turner
ce48ce425a targets/xvmc: Add uninstall hooks to handle megadriver hardlinks. 2014-12-17 17:54:33 -08:00
Matt Turner
ed1ac1d574 targets/vdpau: Add uninstall hooks to handle megadriver hardlinks. 2014-12-17 17:54:33 -08:00
Matt Turner
adc2922f9c targets/vdpau: Add clean-local rule to remove .lib links. 2014-12-17 17:54:33 -08:00
Eric Anholt
06890c444a vc4: Add a userspace BO cache.
Since our kernel BOs require CMA allocation, and the use of them requires
new mmaps, it's pretty expensive and we should avoid it if possible.
Copying my original design for Intel, make a userspace cache that reuses
BOs that haven't been shared to other processes but frees BOs that have
sat in the cache for over a second.

Improves glxgears framerate on RPi by around 30%.
2014-12-17 16:07:01 -08:00
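The caching policy described above (reuse private buffers, never cache shared ones, drop entries idle for more than a second) can be sketched roughly as follows. Everything here is an assumption for illustration: the struct layout, the fixed slot count, the exact-size match, and the use of `malloc` in place of real kernel buffer objects. Time is passed in explicitly as milliseconds to keep the sketch deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_SLOTS 8
#define MAX_IDLE_MS 1000 /* entries idle longer than this are freed */

/* Hypothetical buffer object; `mem` stands in for a kernel allocation. */
struct buf {
    size_t size;
    bool shared;            /* exported to another process: never cache */
    uint64_t idle_since_ms; /* when the buffer entered the cache */
    void *mem;
};

struct buf_cache {
    struct buf *slots[CACHE_SLOTS];
};

static void buf_destroy(struct buf *b)
{
    free(b->mem);
    free(b);
}

/* Free every cached buffer that has been idle for more than MAX_IDLE_MS. */
static void cache_prune(struct buf_cache *c, uint64_t now_ms)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        struct buf *b = c->slots[i];
        if (b && now_ms - b->idle_since_ms > MAX_IDLE_MS) {
            buf_destroy(b);
            c->slots[i] = NULL;
        }
    }
}

/* Reuse a cached buffer of exactly the requested size, else allocate. */
static struct buf *buf_alloc(struct buf_cache *c, size_t size, uint64_t now_ms)
{
    assert(c && size > 0);
    cache_prune(c, now_ms);
    for (int i = 0; i < CACHE_SLOTS; i++) {
        struct buf *b = c->slots[i];
        if (b && b->size == size) {
            c->slots[i] = NULL;
            return b;
        }
    }

    struct buf *b = calloc(1, sizeof(*b));
    if (!b)
        return NULL;
    b->mem = malloc(size);
    if (!b->mem) {
        free(b);
        return NULL;
    }
    b->size = size;
    return b;
}

/* Return a buffer: cache it if private and a slot is free, else destroy. */
static void buf_release(struct buf_cache *c, struct buf *b, uint64_t now_ms)
{
    assert(c && b);
    cache_prune(c, now_ms);
    if (!b->shared) {
        for (int i = 0; i < CACHE_SLOTS; i++) {
            if (!c->slots[i]) {
                b->idle_since_ms = now_ms;
                c->slots[i] = b;
                return;
            }
        }
    }
    buf_destroy(b);
}
```

Pruning on every allocate and release keeps the sketch free of timers; a real driver would more likely bucket buffers by size and keep per-bucket lists ordered by age.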
Eric Anholt
39bc936011 vc4: Add dmabuf support.
This gets DRI3 working on modesetting with glamor.  It's not enabled under
simulation, because it looks like handing our dumb-allocated buffers off
to the server doesn't actually work for the server's rendering.
2014-12-17 16:07:01 -08:00
Eric Anholt
113044e1b9 vc4: Drop a weird argument in the BOs-from-handles API. 2014-12-17 16:06:17 -08:00
Roland Scheidegger
f97b731c82 draw: revert using correct order for prim decomposition.
This reverts db3dfcfe90.
The commit was correct but we've got some precision problems later in
llvmpipe (or possibly in draw clip) due to the vertices coming in in a
different order, causing some internal test failures. So revert for now.
(Will only affect drivers which actually support constant-interpolated
attributes and not just flatshading.)
2014-12-17 20:17:42 +01:00
Jan Vesely
bc18b48924 util: Silence signed-unsigned comparison warnings
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-17 17:15:36 +00:00
Cody Northrop
83e8bb5b1a i965: Require pixel alignment for GPU copy blit
The blitter will start at a pixel's natural alignment. For PBOs, if the
provided offset is not aligned, bits will get dropped.

This change adds an offset alignment check for src and dst, kicking back if
the requirements are not met.

The change is based on following verbiage from BSPEC:
 Color pixel sizes supported are 8, 16, and 32 bits per pixel (bpp).
 All pixels are naturally aligned.

Found in the following locations:
page 35 of intel-gfx-prm-osrc-hsw-blitter.pdf
page 29 of ivb_ihd_os_vol1_part4.pdf
page 29 of snb_ihd_os_vol1_part5.pdf

This behavior was observed with Steam Big Picture rendering incorrect
icon colors.  The fix has been tested on Ubuntu and SteamOS on Haswell.

Signed-off-by: Cody Northrop <cody@lunarg.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83908
Reviewed-by: Neil Roberts <neil@linux.intel.com>
2014-12-16 16:04:14 -08:00
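The alignment requirement above amounts to a simple modulo test on both offsets. The following is a minimal sketch under that reading; the function names and the `cpp` (bytes per pixel) parameter are hypothetical and are not taken from the i965 source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: true if `offset` (in bytes) sits on a pixel boundary
 * for a format with `cpp` bytes per pixel (1, 2 or 4 for 8/16/32 bpp). */
static inline bool offset_is_pixel_aligned(uint32_t offset, uint32_t cpp)
{
    assert(cpp == 1 || cpp == 2 || cpp == 4);
    return (offset % cpp) == 0;
}

/* Both ends of the copy must be aligned; otherwise the caller should
 * fall back to a different copy path. */
static inline bool can_use_blitter(uint32_t src_offset, uint32_t dst_offset,
                                   uint32_t cpp)
{
    return offset_is_pixel_aligned(src_offset, cpp) &&
           offset_is_pixel_aligned(dst_offset, cpp);
}
```

For example, a 32 bpp copy from byte offset 2 fails the check, while the same offset is fine for a 16 bpp format.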
Mark Janes
fc016bc0f3 i965: remove includes of sampler.h from extern "C" blocks
C linkage was removed from functions in program/sampler.cpp.  However,
some cpp files include program/sampler.h within extern "C" blocks,
causing link errors for test_vec4_copy_propagation.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-16 15:39:55 -08:00
Kenneth Graunke
3eb6258db7 i965/query: Cache whether the batch references the query BO.
Chris Wilson noted that repeated calls to CheckQuery() would call
drm_intel_bo_references(brw->batch.bo, query->bo) on each invocation,
which is expensive.  Once we've flushed, we know that future batches
won't reference query->bo, so there's no point in asking more than once.

This patch adds a brw_query_object::flushed flag, which is a
conservative estimate of whether the batch has been flushed.

On the first call to CheckQuery() or WaitQuery(), we check if the
batch references query->bo.  If not, it must have been flushed for
some reason (such as being full).  We record that it was flushed.
If it does reference query->bo, we explicitly flush, and record that
we did so.

Any subsequent checks will simply see that query->flushed is set,
and skip the drm_intel_bo_references() call.

Inspired by a patch from Chris Wilson.

According to Eero, this does not affect the performance of Witcher 2
on Haswell, but approximately halves the userspace CPU usage.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86969
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-16 15:39:54 -08:00
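The flushed-flag logic described above can be sketched as below. All types and functions are hypothetical stand-ins (the real code uses `brw_query_object::flushed` and `drm_intel_bo_references()`); the call counter exists only to make the saving visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pending batch buffer. */
struct batch {
    int references_calls;   /* counts the expensive lookups, for illustration */
    bool references_query;  /* does the pending batch use the query buffer? */
};

/* Hypothetical stand-in for the query object. */
struct query {
    bool flushed;           /* conservative: true once we know it was flushed */
};

/* Stand-in for the expensive "does the batch reference this buffer?" call. */
static bool batch_references(struct batch *b)
{
    b->references_calls++;
    return b->references_query;
}

static void batch_flush(struct batch *b)
{
    b->references_query = false;
}

/* Make sure the query's commands were submitted, asking at most once. */
static void ensure_flushed(struct batch *b, struct query *q)
{
    assert(b && q);
    if (q->flushed)
        return;             /* later checks skip the lookup entirely */
    if (batch_references(b))
        batch_flush(b);
    q->flushed = true;      /* it was already flushed, or we just did it */
}
```

Calling `ensure_flushed` repeatedly on the same query performs the lookup only on the first call; the flag has to be cleared again whenever the query is restarted.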
Kenneth Graunke
cb5cfb8361 i965/query: Use brw_bo_map to handle stall warnings.
This is less code and also measures the duration of the stall for us.

Our old code predates the existence of brw_bo_map().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-16 15:39:54 -08:00
Kenneth Graunke
9c47653d32 i965/query: Remove redundant drm_intel_bo_references call in CheckQuery.
CheckQuery calls drm_intel_bo_references to see if the batch references
the query BO, and if so, flushes.  It then checks if the query BO is
busy, and if not, calls gen6_queryobj_get_results().

Stupidly, gen6_queryobj_get_results() immediately did a second redundant
drm_intel_bo_references check, even though we know the buffer is not
referenced and in fact idle.

This patch moves the batch-flush check out of gen6_queryobj_get_results
and into WaitQuery() (the other caller).  That way, both callers do a
single batch-flush check.

This should only be a minor improvement, since it would only affect
the first CheckQuery call where the result is actually available.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86969
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-16 15:39:53 -08:00
Kenneth Graunke
12c16f4f27 i965/query: Add query->bo == NULL early return in CheckQuery hook.
If query->bo == NULL, this is a redundant CheckQuery call, and we
should simply return.  We didn't do anything anyway - we skipped the
batch flushing block, and although we called get_results(), it has an
early return and does nothing.  Why bother?

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-16 15:39:53 -08:00
Kenneth Graunke
ed8edd7175 i965/query: Set Ready flag in gen6_queryobj_get_results().
q->Ready means that the results are in, and core Mesa is free to return
them to the application.  gen6_queryobj_get_results() is a natural place
to set that flag; doing so means callers don't have to.

The older non-hardware-context aware code couldn't do this, because we
had to call brw_queryobj_get_results() to gather intermediate results
when we ran out of space for snapshots in the query buffer.  We only
gather complete results in the Gen6+ code, however.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-16 15:39:50 -08:00
Eric Anholt
1f0e106050 vc4: Add support for turning add-based MOVs to muls for pairing.
total instructions in shared programs: 43053 -> 40795 (-5.24%)
instructions in affected programs:     37996 -> 35738 (-5.94%)
2014-12-16 13:45:41 -08:00
Eric Anholt
f96bd9673e vc4: Add a helper for changing a field in an instruction. 2014-12-16 13:45:41 -08:00
Eric Anholt
8e18adea61 vc4: Fix the name of qpu_waddr_ignores_ws().
We're deciding about the WS bit, not PM.
2014-12-16 13:45:41 -08:00
Timothy Arceri
54cc3be436 docs: note change in minimum GCC version to 4.1.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-17 08:37:55 +11:00
Timothy Arceri
e801fbb813 util: remove support for GCC older than 4.1.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-17 08:37:42 +11:00
Timothy Arceri
0936d42d52 mesa: remove support for GCC older than 4.1.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-17 08:37:35 +11:00
Timothy Arceri
bf37433f8c gbm: remove support for GCC older than 4.1.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-17 08:37:29 +11:00
Timothy Arceri
13675a4907 gallium: remove support for GCC older than 4.1.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-17 08:37:23 +11:00
Timothy Arceri
8d0c641603 egl: remove support for GCC older than 4.1.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-17 08:37:17 +11:00
Timothy Arceri
78e1246bec mesa: bump required GCC version to 4.1.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-17 08:37:11 +11:00
Timothy Arceri
5eec7c8ab8 mesa: remove support for GCC older than 3.3.0
GCC >=3.3 has been required since 9aa3aa7138

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-17 08:37:05 +11:00
Matt Turner
2308b3bef2 i965/fs: Add a comment explaining what saturate propagation does. 2014-12-16 11:30:44 -08:00
Eric Anholt
3f6b008168 vc4: Add support for enabling early Z discards.
This is the same basic logic from the original Broadcom driver.
2014-12-16 10:37:34 -08:00
Brian Paul
c6e8d2c659 st/mesa: remove extern "C" around #includes in st_glsl_to_tgsi.cpp
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-16 07:52:41 -07:00
Brian Paul
6dac455e6a program: remove extern "C" usage in sampler.cpp
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-16 07:52:41 -07:00
Brian Paul
6d2f59fd94 program: remove extern "C" around #includes
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-16 07:52:41 -07:00
Brian Paul
241c599cb1 glsl: remove extern "C" around #includes
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-16 07:52:41 -07:00
Brian Paul
44c8957cfe st/mesa: add extern "C" to st_context.h
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-16 07:52:41 -07:00
Brian Paul
d260348130 st/mesa: add extern "C" to st_program.h
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-16 07:52:41 -07:00
Brian Paul
de42431a9d main: remove extern C around #includes in ff_fragment_shader.cpp
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-16 07:52:41 -07:00
Brian Paul
7b0aefaf74 mesa: move #include of mtypes.h outside __cplusplus check
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-16 07:52:41 -07:00
Brian Paul
04addcc6a3 program: add #ifndef SAMPLER_H wrapper
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-16 07:52:41 -07:00
Brian Paul
641314eff3 mesa: put extern "C" in src/mesa/program/*h header files
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-16 07:52:41 -07:00
Brian Paul
3ebc135b4e mesa: put extern "C" in header files
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-16 07:52:41 -07:00
Juha-Pekka Heikkila
4b342fbbb7 mapi: add glapi-test and shared-glapi-test to .gitignore
While at it, remove src/mapi/shared-glapi/tests/.gitignore
and src/mapi/glapi/tests/.gitignore as they are useless.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-16 13:51:09 +02:00
Juha-Pekka Heikkila
ebbf0a250a util: add u_atomic_test to .gitignore
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-16 13:50:59 +02:00
Juha-Pekka Heikkila
5d431ffd61 glx: remove __glXstrdup()
I didn't find this being used anywhere

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-16 13:50:53 +02:00
Juha-Pekka Heikkila
096b48b3e1 i965: add test_vf_float_conversions to .gitignore
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-16 13:50:45 +02:00
Juha-Pekka Heikkila
430fbd8ad8 i965: Make validate_reg tables constant
Declare local tables constant.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2014-12-16 13:50:38 +02:00
Timothy Arceri
873d7351c5 glsl: remove commented out code
MaxGeometryOutputComponents is used as the value
for gl_MaxGeometryVaryingComponents

Acked-by: Matt Turner <mattst88@gmail.com>
2014-12-16 15:57:30 +11:00
Timothy Arceri
965cfbc85e i965: remove commented out code
Acked-by: Matt Turner <mattst88@gmail.com>
2014-12-16 15:57:25 +11:00
Ilia Mirkin
1402f689f1 nvc0: add missed PIPE_CAP_VERTEXID_NOBASE
Commit ade8b26bf missed adding this cap to nvc0.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-12-15 23:18:07 -05:00
Roland Scheidegger
fef58979e1 st/mesa: use vertex id lowering according to pipe cap bit.
Tested with llvmpipe by setting the cap bit temporarily, seems to work,
though no driver requests it for now.
2014-12-16 04:23:00 +01:00
Roland Scheidegger
97dc3d826e draw: implement support for the VERTEXID_NOBASE and BASEVERTEX semantics.
This fixes 4 vertexid related piglit tests with llvmpipe due to switching
behavior of vertexid to the one gl expects.
(Won't fix non-llvm draw path since we don't get the basevertex currently.)
2014-12-16 04:23:00 +01:00
Roland Scheidegger
ade8b26bf5 gallium: add TGSI_SEMANTIC_VERTEXID_NOBASE and TGSI_SEMANTIC_BASEVERTEX
Plus a new PIPE_CAP_VERTEXID_NOBASE query. The idea is that drivers not
supporting vertex ids with base vertex offset applied (so, only support
d3d10-style vertex ids) will get such a d3d10-style vertex id instead -
with the caveat they'll also need to handle the basevertex system value
too (this follows what core mesa already does).
Additionally, this is also useful for other state trackers (for instance
llvmpipe / draw right now implement the d3d10 behavior on purpose, but
with different semantics it can just do both).
Doesn't do anything yet.
And fix up the docs wrt similar values.

v2: incorporate feedback from Brian and others, better names, better docs.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-12-16 04:23:00 +01:00
Dave Airlie
3c8ef3a74b r600g/sb: implement r600 gpr index workaround. (v3.1)
r600, rv610 and rv630 all have a bug in their GPR indexing
and how the hw inserts access to PV.

If the base index for the src is the same as the dst gpr
in a previous group, then it will use PV instead of using
the indexed gpr correctly.

The workaround is to insert a NOP when you detect this.

v2: add second part of fix detecting DST rel writes followed
by same src base index reads.

v3: forget adding stuff to structs, just iterate over the
previous node group again, makes it more obvious.
v3.1: drop local_nop.

Fixes ~200 piglit regressions on rv635 since SB was introduced.

Reviewed-By: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-12-16 12:44:45 +10:00
Vadim Girlin
de0fd375f6 r600g/sb: fix issues with loops created for switch
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-12-16 12:43:31 +10:00
Dave Airlie
34e512d9ea Revert "r600g/sb: fix issues cause by GLSL switching to loops for switch"
This reverts commit 7b0067d23a.

Vadim's patch fixes this a lot better.
2014-12-16 12:43:23 +10:00
Eric Anholt
1b486b52ac vc4: Add support for 32-bit signed norm/scaled vertex attrs.
32-bit unsigned would require some adjustments to handle values >=
0x80000000.
2014-12-15 14:33:05 -08:00
Eric Anholt
48a2154520 vc4: Add support for 16-bit signed/unsigned norm/scaled vertex attrs. 2014-12-15 14:33:01 -08:00
Eric Anholt
9ca32d6c19 vc4: Rename the 16-bit unpack #define.
It's only an f16 conversion if you're doing a float operation, otherwise
it's 16 bit signed to 32-bit signed.
2014-12-15 14:33:01 -08:00
Eric Anholt
2142fd1f6f vc4: Add support for 8-bit unnormalized vertex attrs. 2014-12-15 14:33:00 -08:00
Eric Anholt
214a169b32 vc4: Refactor vertex attribute conversions a bit.
There was just way too much indentation.
2014-12-15 14:28:23 -08:00
Eric Anholt
1fa1ee56a0 vc4: Fix use of r3 as a temp in 8-bit unpacking.
We're actually allocating out of r3 now, and I missed it because I'd typed
this one as qpu_rn(3) instead of qpu_r3().
2014-12-15 14:28:23 -08:00
Eric Anholt
8e678de761 vc4: Rename UNPACK_8* to UNPACK_8*_F.
There is an equivalent unpack function without conversion to float if you
use an integer operation instead.
2014-12-15 14:28:23 -08:00
Eric Anholt
ade7704685 vc4: Add support for UMAD. 2014-12-15 14:28:23 -08:00
Eric Anholt
440075fb50 vc4: 0-initialize the screen again.
I typoed this when rebasing the memory leak fixes.
2014-12-15 14:28:22 -08:00
Maxence Le Doré
19e05d6898 glsl: Add gl_MaxViewports to available builtin constants
It seems to have been forgotten when viewport arrays were implemented.

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-15 12:20:00 -08:00
Andres Gomez
8517e665bc i965/brw_reg: struct constructor now needs explicit negate and abs values.
We were assuming, when constructing a new brw_reg struct, that the
negate and abs register modifiers would not be present by default in
the new register.

Now, we force explicitly setting these values when constructing a new
register.

This will avoid problems like forgetting to properly set them when we
are using a previous register to generate this new register, as it was
happening in the dFdx and dFdy generation functions.
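
The hazard described above can be sketched in a few lines of C. This is a toy
model, not the i965 code: struct toy_reg, toy_make_reg() and toy_derive_reg()
are hypothetical names, and the real brw_reg constructor takes more
parameters. The point is only that a constructor without defaults forces the
caller to say what happens to negate and abs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy register description; the fields are a small, hypothetical subset. */
struct toy_reg {
   unsigned nr;      /* register number */
   bool negate;      /* source modifier: -x */
   bool abs;         /* source modifier: |x| */
};

/* No defaults: every caller has to decide on both modifiers. */
static struct toy_reg
toy_make_reg(unsigned nr, bool negate, bool abs)
{
   struct toy_reg reg = { nr, negate, abs };
   return reg;
}

/* Build a neighbouring register from an existing one.  Because the
 * constructor demands explicit values, carrying the source's modifiers
 * over is a visible decision rather than something that can be forgotten. */
static struct toy_reg
toy_derive_reg(struct toy_reg src, unsigned offset)
{
   return toy_make_reg(src.nr + offset, src.negate, src.abs);
}

static void
toy_reg_example(void)
{
   struct toy_reg src = toy_make_reg(4, true, false);   /* -r4 */
   struct toy_reg next = toy_derive_reg(src, 1);        /* -r5 */

   assert(next.nr == 5 && next.negate && !next.abs);
}
```

In this sketch a derived register keeps the sign of its source, which is the
property a derivative of a negated value depends on.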

Fixes piglit test shaders/glsl-deriv-varyings

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82991
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-15 11:40:22 -08:00
Eric Anholt
e108442bb1 vc4: Fix leaks of the compiled shaders' keys. 2014-12-14 23:12:11 -08:00
Eric Anholt
667719fcb2 vc4: Fix leaks of the CL contents. 2014-12-14 23:12:11 -08:00
Eric Anholt
1f1ca8b2ea vc4: Fix leak of vc4_bos stashed in the context. 2014-12-14 23:12:11 -08:00
Eric Anholt
80ed075e60 vc4: Fix leak of the compiled shader programs in the cache. 2014-12-14 23:12:11 -08:00
Eric Anholt
4da9e3d805 vc4: Fix leak of a copy of the scheduled QPU instructions.
They're copied into a vc4_bo after compiling is done.
2014-12-14 23:12:11 -08:00
Eric Anholt
5c9b8eace2 vc4: Switch to using the util/ hash table.
No performance difference on a microbenchmark with norast that should hit it
enough to have mattered, n=220.
2014-12-14 23:12:11 -08:00
Eric Anholt
c84306fdc2 vc4: Fix leak of simulator memory on screen cleanup. 2014-12-14 23:11:59 -08:00
Eric Anholt
f519c3bff1 vc4: Fix a leak of the simulator's exec BO's actual vc4_bo. 2014-12-14 23:10:35 -08:00
Eric Anholt
6c3115af85 hash_table: Fix compiler warnings from the renaming.
Not sure how we both missed this.  None of the callers were using the
return value, though.
2014-12-14 20:22:07 -08:00
Jason Ekstrand
94303a0750 util/hash_table: Rework the API to know about hashing
Previously, the hash_table API required the user to do all of the hashing
of keys as it passed them in.  Since the hashing function is intrinsically
tied to the comparison function, it makes sense for the hash table to know
about it.  Also, it makes for a somewhat clumsy API as the user is
constantly calling hashing functions many of which have long names.  This
is especially bad when the standard call looks something like

_mesa_hash_table_insert(ht, _mesa_pointer_hash(key), key, data);

In the above case, there is no reason why the hash table shouldn't do the
hashing for you.  We leave the option for you to do your own hashing if
it's more efficient, but it's no longer needed.  Also, if you do do your
own hashing, the hash table will assert that your hash matches what it
expects out of the hashing function.  This should make it harder to mess up
your hashing.

v2: change to call the old entrypoint "pre_hashed" rather than
    "with_hash", like cworth's equivalent change upstream (change by
    anholt, acked-in-general by Jason).
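
A minimal sketch of the idea, assuming a fixed-size open-addressing table.
Every toy_* name below is hypothetical and the layout is not Mesa's; only the
split between a plain insert and a "pre_hashed" insert that asserts on the
supplied hash mirrors the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CAPACITY 64

typedef uint32_t (*toy_hash_fn)(const void *key);
typedef bool (*toy_equal_fn)(const void *a, const void *b);

struct toy_entry {
   uint32_t hash;
   const void *key;
   void *data;
   bool used;
};

struct toy_table {
   toy_hash_fn key_hash;     /* stored once, next to the comparison function */
   toy_equal_fn key_equal;
   struct toy_entry entries[TOY_CAPACITY];
};

static void
toy_table_init(struct toy_table *ht, toy_hash_fn hash, toy_equal_fn equal)
{
   ht->key_hash = hash;
   ht->key_equal = equal;
   for (size_t i = 0; i < TOY_CAPACITY; i++)
      ht->entries[i].used = false;
}

/* Caller supplies the hash; the table checks it against its own function. */
static struct toy_entry *
toy_table_insert_pre_hashed(struct toy_table *ht, uint32_t hash,
                            const void *key, void *data)
{
   assert(hash == ht->key_hash(key));

   for (size_t probe = 0; probe < TOY_CAPACITY; probe++) {
      struct toy_entry *e = &ht->entries[(hash + probe) % TOY_CAPACITY];
      if (!e->used || (e->hash == hash && ht->key_equal(e->key, key))) {
         e->used = true;
         e->hash = hash;
         e->key = key;
         e->data = data;
         return e;
      }
   }
   return NULL;   /* table full */
}

/* Common case: the table does the hashing for the caller. */
static struct toy_entry *
toy_table_insert(struct toy_table *ht, const void *key, void *data)
{
   return toy_table_insert_pre_hashed(ht, ht->key_hash(key), key, data);
}

static struct toy_entry *
toy_table_search(struct toy_table *ht, const void *key)
{
   uint32_t hash = ht->key_hash(key);

   for (size_t probe = 0; probe < TOY_CAPACITY; probe++) {
      struct toy_entry *e = &ht->entries[(hash + probe) % TOY_CAPACITY];
      if (!e->used)
         return NULL;
      if (e->hash == hash && ht->key_equal(e->key, key))
         return e;
   }
   return NULL;
}

static uint32_t
toy_pointer_hash(const void *key)
{
   return (uint32_t)((uintptr_t)key >> 2);
}

static bool
toy_pointer_equal(const void *a, const void *b)
{
   return a == b;
}

static void
toy_table_example(void)
{
   static struct toy_table ht;
   int a = 1, b = 2, c = 3;

   toy_table_init(&ht, toy_pointer_hash, toy_pointer_equal);
   toy_table_insert(&ht, &a, &a);                          /* table hashes */
   toy_table_insert_pre_hashed(&ht, toy_pointer_hash(&b),  /* caller hashes */
                               &b, &b);
   assert(toy_table_search(&ht, &a)->data == &a);
   assert(toy_table_search(&ht, &b)->data == &b);
   assert(toy_table_search(&ht, &c) == NULL);
}
```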

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
2014-12-14 19:32:53 -08:00
Mario Kleiner
0d7f4c8658 glx/dri3: Don't fail on glXSwapBuffersMscOML(dpy, window, 0, 0, 0) (v2)
glXSwapBuffersMscOML() with target_msc=divisor=remainder=0 gets
translated into target_msc=divisor=0 but remainder=1 by the mesa
api. This is done for server DRI2 where there needs to be a way
to tell the server-side DRI2ScheduleSwap implementation if a call
to glXSwapBuffers() or glXSwapBuffersMscOML(dpy,window,0,0,0) was
done. remainder = 1 was (ab)used as a flag to tell the server to
select proper semantic. The DRI3/Present backend ignored this
signalling, treated any target_msc=0 as glXSwapBuffers() request,
and called xcb_present_pixmap with invalid divisor=0, remainder=1
combo. The present extension responded kindly to this with a
BadValue error and dropped the request, but mesa's DRI3/Present
backend doesn't check for error codes. From there on stuff went
downhill quickly for the calling OpenGL client...

This patch fixes the problem.

v2: Change comments to be more clear, with reference to
relevant spec, as suggested by Eric Anholt.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Eric Anholt <eric@anholt.net>
2014-12-14 15:09:49 +00:00
Mario Kleiner
455d3036fa glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)
Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>

v2: Add Frank Binns' Signed-off-by for his original earlier
patch from April 2014, which is identical to this one, and
Chris Wilson's Reviewed-by tag from May 2014 for that patch,
ergo also for this one.

v3: Incorporate comment about triple buffering as suggested
by Axel Davy, and reference to relevant spec provided by
Eric Anholt.

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Eric Anholt <eric@anholt.net>
2014-12-14 15:09:49 +00:00
Mario Kleiner
ad8b0e8bf6 glx/dri3: Track separate (ust, msc) for PresentPixmap vs. PresentNotifyMsc (v2)
Prevent calls to glXGetSyncValuesOML() and glXWaitForMscOML()
from overwriting the (ust,msc) values of the last successful
swapbuffers call (PresentPixmapCompleteNotify event), as
glXWaitForSbcOML() relies on those values corresponding to
the most recent completed swap, not to whatever was last
returned from the server.

Problematic call sequence without this patch would have been, e.g.,

glXSwapBuffers()
... wait ...
swap completes -> PresentPixmapComplete event -> (ust,msc)
updated to reflect swap completion time and count.
... wait for at least 1 video refresh cycle/vblank increment.

glXGetSyncValuesOML()
-> PresentNotifyMsc event overwrites (ust,msc) of swap
completion with (ust,msc) of most recent vblank

glXWaitForSbcOML()
-> Returns sbc of last completed swap but (ust,msc) of last
completed vblank, not of last completed swap.
-> Client is confused.

Do this by tracking a separate set of (ust, msc) for the
dri3_wait_for_msc() call than for the dri3_wait_for_sbc()
call.
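
The separation can be sketched roughly as follows. The struct and function
names are hypothetical; only the notify_ust/notify_msc field names follow the
v2 note of this patch, and the event handling is reduced to two setters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-drawable state; only the fields the sketch needs. */
struct toy_drawable {
   uint64_t recv_sbc;                 /* count of the last completed swap */
   uint64_t ust, msc;                 /* time and vblank count of that swap */
   uint64_t notify_ust, notify_msc;   /* last reply to a vblank query */
};

/* Swap completion: the only place that touches (ust, msc). */
static void
toy_on_pixmap_complete(struct toy_drawable *d, uint64_t sbc,
                       uint64_t ust, uint64_t msc)
{
   d->recv_sbc = sbc;
   d->ust = ust;
   d->msc = msc;
}

/* Vblank query reply: stored separately so the swap data survives. */
static void
toy_on_notify_msc(struct toy_drawable *d, uint64_t ust, uint64_t msc)
{
   d->notify_ust = ust;
   d->notify_msc = msc;
}

static void
toy_msc_example(void)
{
   struct toy_drawable d = { 0 };

   toy_on_pixmap_complete(&d, 1, 1000, 60);   /* swap completes */
   toy_on_notify_msc(&d, 1016, 61);           /* query one vblank later */

   /* A wait-for-sbc style caller still sees the swap's own values. */
   assert(d.recv_sbc == 1 && d.ust == 1000 && d.msc == 60);
   /* A wait-for-msc style caller sees the most recent vblank. */
   assert(d.notify_ust == 1016 && d.notify_msc == 61);
}
```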

This makes the glXWaitForSbcOML() call robust again and restores
consistent behaviour with the DRI2 implementation.

Fixes applications originally written and tested against
DRI2 which also rely on this not regressing under DRI3/Present,
e.g., Neuro-Science software like Psychtoolbox-3.

This patch fixes the problem.

v2: Rename vblank_msc/ust to notify_msc/ust as suggested by
Axel Davy for better clarity.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2014-12-14 15:09:49 +00:00
Mario Kleiner
8cab54de16 glx/dri3: Fix glXWaitForSbcOML() to handle targetSBC==0 correctly. (v2)
targetSBC == 0 is a special case, which asks the function
to block until all pending OpenGL bufferswap requests have
completed.

Currently the function just falls through for targetSBC == 0,
returning bogus results.
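
One way to express the special case, as a sketch only: treat a target of zero
as "the most recently requested swap" before comparing against the completed
count. The names below are hypothetical, and a real implementation would block
on events instead of returning a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counters: swaps requested so far and swaps completed so far. */
struct toy_swap_state {
   uint64_t send_sbc;
   uint64_t recv_sbc;
};

/* A target of 0 means "everything queued so far", i.e. the latest request. */
static uint64_t
toy_effective_target(const struct toy_swap_state *s, uint64_t target_sbc)
{
   return target_sbc == 0 ? s->send_sbc : target_sbc;
}

/* True once the wait may return. */
static bool
toy_wait_satisfied(const struct toy_swap_state *s, uint64_t target_sbc)
{
   return s->recv_sbc >= toy_effective_target(s, target_sbc);
}

static void
toy_sbc_example(void)
{
   struct toy_swap_state s = { 3, 1 };   /* 3 swaps queued, 1 completed */

   assert(toy_wait_satisfied(&s, 1));    /* explicit target already reached */
   assert(!toy_wait_satisfied(&s, 0));   /* 0: two swaps are still pending */

   s.recv_sbc = 3;
   assert(toy_wait_satisfied(&s, 0));    /* all pending swaps completed */
}
```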

This breaks applications originally written and tested against
DRI2 which also rely on this not regressing under DRI3/Present,
e.g., Neuro-Science software like Psychtoolbox-3.

This patch fixes the problem.

v2: Simplify as suggested by Axel Davy. Add comments proposed
by Eric Anholt.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Eric Anholt <eric@anholt.net>
2014-12-14 15:09:49 +00:00
Emil Velikov
ac0940224b docs: Add 10.4 sha256 sums, news item and link release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit af0c82099b)

Conflicts:
	docs/index.html
	docs/relnotes.html
2014-12-14 14:10:34 +00:00
Emil Velikov
1faac11778 docs: Update 10.4.0 release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 5fe79b0b12)
2014-12-14 14:10:34 +00:00
Rob Clark
0ebd623f60 freedreno/a4xx: mipmaps
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-13 15:09:37 -05:00
Rob Clark
cf80694df5 freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-13 15:09:37 -05:00
Rob Clark
f24e910da4 freedreno: add is_a3xx()/is_a4xx() helpers
A bunch of open-coded 'gpu_id > 300's seems like it will eventually
cause problems with future generations.  There were already a few minor
problems with caps for features that still need additional work on a4xx.
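
A sketch of what such helpers could look like, assuming the generation is
encoded in the hundreds digit of gpu_id. struct toy_screen and the toy_*
functions are stand-ins for illustration, not the driver's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's screen struct. */
struct toy_screen {
   uint32_t gpu_id;   /* e.g. 320 for an a3xx part, 420 for an a4xx part */
};

static inline bool
toy_is_a3xx(const struct toy_screen *screen)
{
   return screen->gpu_id >= 300 && screen->gpu_id < 400;
}

static inline bool
toy_is_a4xx(const struct toy_screen *screen)
{
   return screen->gpu_id >= 400 && screen->gpu_id < 500;
}

/* The open-coded style of check: it silently matches every later generation. */
static inline bool
toy_open_coded_check(const struct toy_screen *screen)
{
   return screen->gpu_id > 300;
}

static void
toy_gen_example(void)
{
   struct toy_screen a320 = { 320 };
   struct toy_screen a420 = { 420 };

   assert(toy_is_a3xx(&a320) && !toy_is_a4xx(&a320));
   assert(toy_is_a4xx(&a420) && !toy_is_a3xx(&a420));
   assert(toy_open_coded_check(&a420));   /* a4xx is treated like a3xx here */
}
```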

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-13 15:09:37 -05:00
Rob Clark
7474de2235 freedreno: helper to calc layer/level offset
Rather than duplicating this everywhere.  Especially as on a4xx the
layout of layers and levels differs based on texture type.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-13 15:09:37 -05:00
Kenneth Graunke
23caba862a i965/vec4: Drop writemasks on scratch reads.
This code is complete nonsense and has apparently existed since I first
implemented register spilling in the VS two years ago.

Scratch reads are SEND messages, which ignore the destination writemask.

The comment about "data that may not have been written to scratch" is
also confusing - we always spill whole 4x2 registers, so such data
simply does not exist.  We can safely ignore the writemask.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-12 23:21:27 -08:00
Timothy Arceri
a3218e65d1 mesa: remove long dead 3Dnow optimisation
This code has been turned off for the last
decade. Considering 3Dnow is obsolete it
seems the bug will never be fixed so just
remove it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-13 12:15:25 +11:00
Brian Paul
64bd1ac2b1 ir_to_mesa: remove unused 'target' variable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-12 16:45:33 -07:00
Brian Paul
7dccc1a57a util: add missing closing brace for __cplusplus 2014-12-12 16:45:33 -07:00
Brian Paul
0dcc7de205 mesa: remove obsolete comment on _mesa_ClearColor() 2014-12-12 16:45:33 -07:00
Brian Paul
caa13c59ef mesa: whitespace fixes, 80-column wrapping in texobj.c 2014-12-12 16:45:33 -07:00
Brian Paul
e725dc0a74 mesa: whitespace, line wrap fixes in clear.c 2014-12-12 16:45:33 -07:00
Matt Turner
3f3aeb5333 mapi: Move rules for generating glapi_mapi_tmp.h out of the conditional.
Allows distcheck to succeed, regardless of how Mesa has been configured.
2014-12-12 12:11:50 -08:00
Matt Turner
5ea4b25fba glsl: Add dist-hook to delete glcpp test *.out files. 2014-12-12 12:11:50 -08:00
Matt Turner
a29ae0b3dd glcpp: Make tests write .out files to builddir. 2014-12-12 12:11:50 -08:00
Matt Turner
75c7a7114f gallium: Remove Android files from distribution.
Android builds Mesa from git, so these don't need to be in the tarball.
2014-12-12 12:11:50 -08:00
Matt Turner
00eadb77e6 osmesa: Add osmesa.def to distribution. 2014-12-12 12:11:50 -08:00
Matt Turner
92f89f0c0c x86-64: Remove calling_convention.txt.
It just details the x86-64 calling convention. No need for this in Mesa.
2014-12-12 12:11:50 -08:00
Matt Turner
9e191e8829 drivers/x11: Add headers to distribution. 2014-12-12 12:11:50 -08:00
Matt Turner
dd6a43f07c drivers/windows: Add to distribution. 2014-12-12 12:11:50 -08:00
Matt Turner
d51150a98a mesa: Add autogen.sh to distribution. 2014-12-12 12:11:50 -08:00
Matt Turner
4401e2b219 mapi: Add ABI-check tests to distribution. 2014-12-12 12:11:50 -08:00
Matt Turner
43ac31dff0 mesa: Add notes/readme files to distribution. 2014-12-12 12:11:50 -08:00
Matt Turner
a208e9b520 util: Wire up u_atomic_test. 2014-12-12 12:11:50 -08:00
Matt Turner
952b324b23 mesa: Add scons files to distribution. 2014-12-12 12:11:50 -08:00
Matt Turner
f6502aaa58 haiku: Add files to distribution. 2014-12-12 12:11:50 -08:00
Matt Turner
fe2c72e6ec egl: Add files to distribution. 2014-12-12 12:11:49 -08:00
Matt Turner
feb741dc7c egl+gbm: Add symbols-check tests to distribution. 2014-12-12 12:11:49 -08:00
Matt Turner
0ac98e7296 docs: Add to distribution. 2014-12-12 12:11:49 -08:00
Matt Turner
55983a1eaa glapi/gen: Add gl_and_glX_API.xml to distribution. 2014-12-12 12:11:49 -08:00
Matt Turner
7a26c82489 glx/apple: Add headers to distribution. 2014-12-12 12:11:49 -08:00
Matt Turner
a267212a4d mesa: Add a dist hook to remove .gitignore files from distribution. 2014-12-12 12:11:49 -08:00
Matt Turner
b662d5282f mesa: Add clean-local rule to remove .lib links. 2014-12-12 12:11:49 -08:00
Matt Turner
8e2577f2a9 glsl: Add clean-local rule to delete glcpp test output. 2014-12-12 12:11:49 -08:00
Matt Turner
e643fd3b4a util: List hash_table tests as check_PROGRAMS.
EXTRA_PROGRAMS is not what you want for binaries listed in TEST.
2014-12-12 12:11:49 -08:00
Matt Turner
216248730a xmlpool: Add $(MOS) and options.h to CLEANFILES. 2014-12-12 12:11:49 -08:00
Matt Turner
3b7bcb5d04 dri: Add uninstall hooks to handle megadriver hardlinks. 2014-12-12 12:11:49 -08:00
Matt Turner
65155c208d targets/dri: Remove unnecessary variables in install-data-hook. 2014-12-12 12:11:49 -08:00
Matt Turner
d27379d016 glx/tests: Add headers to distribution. 2014-12-12 12:11:49 -08:00
Matt Turner
3d357d030f gallium/targets: Add *.sym files to distribution.
And add d3dadapter9's extra dependency.
2014-12-12 12:11:49 -08:00
Matt Turner
00ab151ad1 egl/dri2: Add headers to distribution. 2014-12-12 12:11:49 -08:00
Matt Turner
7a08a1e61b egl: Drop unnecessary Makefile.am. 2014-12-12 12:11:48 -08:00
Matt Turner
d1c1d6d9b6 glx: Add headers to distribution. 2014-12-12 12:11:48 -08:00
Matt Turner
82b7da3de7 glx: Alphabetize source lists.
And remove absurd tab-space-space indentation.
2014-12-12 12:11:48 -08:00
Matt Turner
4f90f341a7 swrast: Add headers to distribution. 2014-12-12 12:11:48 -08:00
Matt Turner
c9b5c4d407 r200: Add headers to distribution. 2014-12-12 12:11:48 -08:00
Matt Turner
7162219450 r200: Alphabetize source list. 2014-12-12 12:11:48 -08:00
Matt Turner
5fd472507b radeon: Add headers to distribution. 2014-12-12 12:11:48 -08:00
Matt Turner
b53fbe2552 radeon: Alphabetize source list. 2014-12-12 12:11:48 -08:00
Matt Turner
10259d8614 nouveau: Add headers to distribution. 2014-12-12 12:11:48 -08:00
Matt Turner
6b0207552f nouveau: Alphabetize source list. 2014-12-12 12:11:48 -08:00
Matt Turner
e81ec49b56 i965: Add headers to distribution. 2014-12-12 12:11:48 -08:00
Matt Turner
976b3f4cfa i965: Alphabetize source list. 2014-12-12 12:11:48 -08:00
Matt Turner
d8e28537e3 i915: Add headers to distribution. 2014-12-12 12:11:48 -08:00
Matt Turner
0698f5de4a i915: Alphabetize source list. 2014-12-12 12:11:48 -08:00
Matt Turner
9f565f5f8a loader: Add headers to distribution. 2014-12-12 12:11:47 -08:00
Matt Turner
929bcfb756 program: Add lex and yacc sources to distribution.
Since we have manual build rules and list the .c/.cpp files in SOURCES,
we need to explicitly list these for distribution.
2014-12-12 12:11:47 -08:00
Matt Turner
e3ea939988 glsl: Add parser headers to distribution. 2014-12-12 12:11:47 -08:00
Matt Turner
4af1905e73 drivers/common: Add headers to distribution. 2014-12-12 12:11:47 -08:00
Matt Turner
942e646941 vbo: Add headers to distribution. 2014-12-12 12:11:47 -08:00
Matt Turner
b8205d4db7 vbo: Alphabetize VBO_FILES. 2014-12-12 12:11:47 -08:00
Matt Turner
009bf242d3 tnl: Add headers to distribution. 2014-12-12 12:11:47 -08:00
Matt Turner
e15cd6dd9f tnl: Alphabetize TNL_FILES. 2014-12-12 12:11:47 -08:00
Matt Turner
d1127e29dd tnl_dd: Add headers to distribution. 2014-12-12 12:11:47 -08:00
Matt Turner
d36113e000 tnl_dd: Remove dead t_dd_vb.c.
Dead since e4344161 ("dri: Remove all DRI1 drivers").
2014-12-12 12:11:47 -08:00
Matt Turner
e88ed739f0 swrast: Add headers to distribution. 2014-12-12 12:11:47 -08:00
Matt Turner
58a3ec427f state_trackers: Add headers to distribution. 2014-12-12 12:11:47 -08:00
Matt Turner
4194f9c1ad x86: Add headers to distribution. 2014-12-12 12:11:47 -08:00
Matt Turner
0557d54847 x86-64: Add headers to distribution. 2014-12-12 12:11:47 -08:00
Matt Turner
d5fba58f85 sparc: Add headers to distribution. 2014-12-12 12:11:47 -08:00
Matt Turner
1abf4e2f45 math: Add headers to distribution. 2014-12-12 12:11:47 -08:00
Matt Turner
152e967063 program: Add headers to distribution. 2014-12-12 12:11:46 -08:00
Matt Turner
e475ad70c8 program: Alphabetize PROGRAM_FILES. 2014-12-12 12:11:46 -08:00
Matt Turner
67abb4910a mesa: Remove moved texcompress_rgtc_tmp.h from source list.
Missed in commit ebcb2ee9.
2014-12-12 12:11:46 -08:00
Matt Turner
9a742eef53 mesa: Add headers to distribution. 2014-12-12 12:11:46 -08:00
Matt Turner
19999c3114 mesa: Alphabetize MAIN_FILES. 2014-12-12 12:11:46 -08:00
Matt Turner
3125cd1f6b glsl: Add lex and yacc sources to distribution.
Since we have manual build rules and list the .c/.cpp files in SOURCES,
we need to explicitly list these for distribution.
2014-12-12 12:11:46 -08:00
Matt Turner
55afbcc661 include: Add remaining headers to distribution. 2014-12-12 12:11:46 -08:00
Matt Turner
2a5b012171 configure.ac: Ship .xz compressed tarballs, in addition to .gz.
11 MiB -> 6.5 MiB.
2014-12-12 12:11:46 -08:00
Matt Turner
dd439e494e configure.ac: Use tar-ustar archive format.
The default tar-v7 archive format doesn't support filenames longer than
99 characters, of which we have a few (in src/glsl/tests/lower_jumps/).
2014-12-12 12:11:46 -08:00
Matt Turner
8280358cf1 gtest: Add headers to distribution. 2014-12-12 12:11:46 -08:00
Matt Turner
838ac978f4 glsl: Add headers to distribution. 2014-12-12 12:11:46 -08:00
Matt Turner
69386ddfa6 glsl: Distribute tests/, TODO, and README 2014-12-12 12:11:46 -08:00
Matt Turner
b245009173 mesa: Add python scripts to distribution. 2014-12-12 12:11:46 -08:00
Matt Turner
cceeea0c4c dri/common: Add files to distribution. 2014-12-12 12:11:46 -08:00
Matt Turner
748d0b04a0 vgapi: Add vgapi.csv to distribution. 2014-12-12 12:11:46 -08:00
Matt Turner
72cf4baeb3 mapi: Add mapi_abi.py to EXTRA_DIST 2014-12-12 12:11:45 -08:00
Matt Turner
f6357a993b dri/common: Drop unused mmio.h.
Unused since commit 7550a24f.
2014-12-12 12:11:45 -08:00
Matt Turner
547faf1dec glapi/gen: Add KHR_context_flush_control.xml to distribution. 2014-12-12 12:11:45 -08:00
Matt Turner
2de8da637e configure.ac: Drop generating egl-static and gbm Makefiles. 2014-12-12 12:11:45 -08:00
Matt Turner
1cd2b9177e util: Add headers and python scripts for distribution. 2014-12-12 12:11:45 -08:00
Matt Turner
7808344271 glapi: Make mapi/glapi/gen before mapi to avoid distcheck problem. 2014-12-12 12:11:45 -08:00
Matt Turner
2eef9c0b16 r200: Avoid out of bounds array access.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-12 12:11:31 -08:00
Eric Anholt
e5eaf8ec60 vc4: Fix referencing of sync objects.
While the pipe_reference_* helpers set the pointer, a bare pipe_reference
doesn't.   Fixes 5 ARB_sync tests.
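
The difference between the two kinds of helper can be sketched with a toy
reference counter. The names below are hypothetical and simplified; they only
mirror the shape described above, where the bare function adjusts counts and
the typed wrapper also assigns the pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_ref {
   int count;
};

struct toy_fence {
   struct toy_ref reference;
};

/* Bare helper: adjusts the counts and reports whether the old object died.
 * It never writes to the caller's pointer. */
static bool
toy_reference(struct toy_ref *old_ref, struct toy_ref *new_ref)
{
   if (old_ref == new_ref)
      return false;
   if (new_ref)
      new_ref->count++;
   return old_ref && --old_ref->count == 0;
}

/* Typed wrapper: does the counting, frees if needed, then sets *ptr. */
static void
toy_fence_reference(struct toy_fence **ptr, struct toy_fence *fence)
{
   struct toy_fence *old = *ptr;

   if (toy_reference(old ? &old->reference : NULL,
                     fence ? &fence->reference : NULL))
      free(old);
   *ptr = fence;
}

static void
toy_fence_example(void)
{
   struct toy_fence *f = calloc(1, sizeof(*f));
   struct toy_fence *held = NULL;

   assert(f != NULL);
   f->reference.count = 1;                /* creator's reference */

   toy_reference(NULL, &f->reference);    /* count is 2, held is still NULL */
   assert(f->reference.count == 2 && held == NULL);
   held = f;                              /* the assignment the caller owes */

   toy_fence_reference(&held, NULL);      /* drops one reference, clears held */
   assert(f->reference.count == 1 && held == NULL);

   toy_fence_reference(&f, NULL);         /* last reference: frees the fence */
   assert(f == NULL);
}
```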
2014-12-12 09:30:35 -08:00
José Fonseca
e75e677d28 util: Unbreak usage of assert()/debug_assert() inside expressions.
f0ba7d897d made debug_assert()/assert() unsafe for expressions, but this
only became an issue now that u_atomic.h has started to rely on them for
Windows.

This fixes non-debug builds with MSVC.
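
To illustrate why the form of the macro matters, here is a sketch with
made-up TOY_* macros (they are not the definitions from the util headers). A
do/while body is a statement and cannot appear inside an expression, whereas a
void-typed expression can sit in front of a comma operator.

```c
#include <assert.h>
#include <stdio.h>

static int toy_assert_failures;   /* counts failed checks instead of aborting */

static void
toy_assert_fail(const char *expr, const char *file, int line)
{
   fprintf(stderr, "%s:%d: check failed: %s\n", file, line, expr);
   toy_assert_failures++;
}

/* Statement form: fine on its own line, a syntax error inside an expression. */
#define TOY_ASSERT_STMT(expr) \
   do { if (!(expr)) toy_assert_fail(#expr, __FILE__, __LINE__); } while (0)

/* Expression form: both arms have type void, so it can feed a comma operator. */
#define TOY_ASSERT_EXPR(expr) \
   ((expr) ? (void)0 : toy_assert_fail(#expr, __FILE__, __LINE__))

/* Disabled form that is still an expression and never evaluates expr. */
#define TOY_ASSERT_OFF(expr) ((void)(0 && (expr)))

/* A value-producing macro that checks its argument first. */
#define TOY_CHECKED_INC(p) (TOY_ASSERT_EXPR((p) != NULL), ++*(p))

static void
toy_assert_example(void)
{
   int v = 1;
   int *p = &v;
   int r = TOY_CHECKED_INC(p);           /* check used inside an expression */
   int side_effects = 0;

   assert(r == 2 && v == 2);

   TOY_ASSERT_OFF(++side_effects > 0);   /* compiled, but never executed */
   assert(side_effects == 0);

   TOY_ASSERT_STMT(v == 3);              /* fails: logged and counted */
   assert(toy_assert_failures == 1);
}
```

Replacing TOY_ASSERT_EXPR with TOY_ASSERT_STMT inside TOY_CHECKED_INC would
not compile, which is the kind of breakage the commit message describes.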
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-12-12 14:19:53 +00:00
Eric Anholt
92b85fba89 vc4: Consider FS backface color loads as color inputs as well.
This fixes flatshading of backface color in 4 of the piglit interpolation
tests.
2014-12-11 23:52:34 -08:00
Eric Anholt
5b3c0d999c vc4: Drop redundant index size setting.
This is already done at set_index_buffer() time.
2014-12-11 23:52:34 -08:00
Eric Anholt
d78eb57528 vc4: Don't throw out the index offset in the shadow index buffer path.
When we upload shadow indices at draw time, we need the source offset.
Fixes the piglit draw-elements test.
2014-12-11 23:52:25 -08:00
Eric Anholt
0ae5e002e0 vc4: Fix triangle-guardband-viewport piglit test.
The original Broadcom driver also did this with the viewport.
2014-12-11 21:31:27 -08:00
Eric Anholt
87db578268 vc4: Fix a memory leak in setting up QPU instructions for scheduling. 2014-12-11 21:31:27 -08:00
Ben Widawsky
5069e4bd40 i965/gen8+: Remove false perf debug message about MOCS
We support MOCS on both gen8 and gen9, so the message seems meaningless. Remove
it to avoid confusion.

Trivial.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-11 18:59:38 -08:00
Ben Widawsky
9cd4f90242 i965/gen8: Check correct number of blitter dwords
The odds of having this patch make a difference on Gen8+ are probably very low.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-but-not-tested-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-12-11 18:59:36 -08:00
Alexander von Gluck IV
ad2ffd3bc6 mesa/drivers: Add missing mesautil lib to Haiku swrast
* Resolves missing util_format_linear_to_srgb_8unorm_table symbol.
2014-12-11 03:34:15 +00:00
Roland Scheidegger
ff96537759 draw: simplify prim id insertion in prim assembler
Because all topologies are reduced to basic primitives (i.e. no strips, fans)
and the vertices involved are all copied, there's no need for any elaborate
decisions where to insert the prim id. The logic employed was correct for
first provoking vertex, but didn't account at all for the last provoking
vertex case. And since we now will get the right constant value even if the
primitive type is later changed (for unfilled etc.) this is no longer
required to pass certain tests (which were checking for prim_id == some
const interpolated value so passing because both were wrong in the end).
This is a bit overkill (3x4 values assigned in total even though it's really
one scalar per prim...) but the code is now much easier and I don't need to
add more cases for last provoking vertex.

This fixes piglit primitive-id-no-gs-strip test.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-10 22:11:16 +01:00
Roland Scheidegger
db3dfcfe90 draw: fix another decompose bug affecting constant interpolated attributes
Previously the first provoking vertex convention would only be used if
flatshading were enabled. No matter how I look at it that cannot be possibly
correct. Maybe the code getting used was somewhat simpler that way at a time
where there weren't constant interpolated attributes, only flatshading...
(Note that all other places including the decomposition macros already do
the same.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-10 22:11:16 +01:00
Roland Scheidegger
2b23149206 draw: fix flatshade stage for constant interpolated values
This stage only worked for traditional old-school flatshading: it ignored
constant interpolated values and only handled colors. The code probably
predates the use of constant interpolated values in gallium. So fix this - the
clip stage apparently did this a long time ago already.
Unfortunately this also means the stage needs to be invoked when flatshading
isn't enabled but some other prim changing stages are - for instance with
fill mode line each of the 3 lines in a tri should get the same attribute
value from the leading vertex in the original tri if interpolation is constant,
which did not happen before.
Due to that, the stage is now run in more cases, even unnecessary ones. Could
in theory skip it completely if there aren't any constant interpolated
attributes (and rast->flatshade isn't set), but not sure it's worth bothering,
as it looks kinda complicated getting this information in advance.

No piglit change (doesn't really cover this directly).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-10 22:11:16 +01:00
Roland Scheidegger
fb61f75bf6 draw: copy over prim id header in flatshade stage when emitting lines
Just like we do for tris (det shouldn't matter at this point, but the
header can carry flags for things like line stipple reset).

No piglit change, it would fail line stippling tests if the flatshade
stage were run, which will happen with the next commit.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-10 22:11:16 +01:00
Roland Scheidegger
fe7e6b248f gallium/docs: clarify fragment shader position input w component.
The previous language was a bit misleading, since it sounded like
w was interpolated then the reciprocal calculated which isn't what
should be happening.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-10 22:11:16 +01:00
Marek Olšák
ac319d94d3 docs/relnotes: document the removal of GALLIUM_MSAA
Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
2014-12-10 21:59:37 +01:00
Marek Olšák
15186607bb radeonsi: take into account NULL colorbuffers when computing CB_TARGET_MASK
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
3291eedfe6 radeonsi: only emit line stippling and provoking vertex state when it changes
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
acda2e113a radeonsi: fix SPI state dependency on sprite_coord_enable
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
7991d602f3 radeonsi: fix line stippling and provoking vertex state for GS primitives
I'm not sure if GS hw outputs line lists or line strips.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
834bee42ed radeonsi: emit DRAW_PREAMBLE only if it changes
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
c466093512 radeonsi: remove setting of VGT_DISPATCH_DRAW_INDEX
It's used only if VGT_SHADER_STAGES_EN.DISPATCH_DRAW_EN is 1, which we don't
set.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
6fde194910 radeonsi: emit GS_OUT_PRIM_TYPE only if it changes
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
34350131de radeonsi: emit primitive restart only if it changes
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
3382036946 radeonsi: emit base vertex and start instance only if they change
v2: added a helper function for invalidation of the sh constants
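
The "emit only if it changes" pattern used throughout this series can be
sketched as a cached value plus an invalidation helper. Everything here is
hypothetical: the struct, the sentinel and the packet counter merely stand in
for driver state and command-stream writes.

```c
#include <assert.h>
#include <stdint.h>

/* Sentinel outside the 32-bit range: the cached value must not be trusted. */
#define TOY_UNKNOWN INT64_MIN

struct toy_ctx {
   int64_t last_base_vertex;
   unsigned packets_emitted;   /* stands in for writes to the command stream */
};

/* Call whenever the hardware state may have been lost, e.g. after a flush. */
static void
toy_invalidate(struct toy_ctx *ctx)
{
   ctx->last_base_vertex = TOY_UNKNOWN;
}

/* Emit only when the value differs from what was last written. */
static void
toy_emit_base_vertex(struct toy_ctx *ctx, int32_t base_vertex)
{
   if (ctx->last_base_vertex == base_vertex)
      return;

   ctx->packets_emitted++;
   ctx->last_base_vertex = base_vertex;
}

static void
toy_emit_example(void)
{
   struct toy_ctx ctx = { 0, 0 };

   toy_invalidate(&ctx);
   toy_emit_base_vertex(&ctx, 5);   /* first draw: emitted */
   toy_emit_base_vertex(&ctx, 5);   /* unchanged: skipped */
   toy_emit_base_vertex(&ctx, 7);   /* changed: emitted */
   assert(ctx.packets_emitted == 2);

   toy_invalidate(&ctx);
   toy_emit_base_vertex(&ctx, 7);   /* same value, but state was lost */
   assert(ctx.packets_emitted == 3);
}
```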

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
b472709090 radeonsi: emit clip registers only if VS, GS, or rasterizer is changed
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
161534737c radeonsi: get info about VS outputs from tgsi_shader_info
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
20e570d115 radeonsi: move all shader-related functions to a new file si_state_shaders.c
This huge amount of code deserves its own file.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
ca7f1cf8b5 radeonsi: generate derived and draw-related registers directly in the CS
The big function is split into 3 smaller functions.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
508c1ca6af radeonsi: si_conv_pipe_prim shouldn't fail
An assertion should suffice.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
c6546cfb03 radeonsi: remove useless variable si_context::pm4_dirty_cdwords
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
e90bae4376 radeonsi: remove unused draw packet functions
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
384213cb51 radeonsi: emit draw packets directly into the CS
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
feedd8f700 radeonsi: add emit util functions for SH registers
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
2b76bb3ba7 tgsi: add tgsi_shader_info::writes_clipvertex
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-12-10 21:59:37 +01:00
Marek Olšák
8115797801 tgsi: add clip and cull distance writemasks into tgsi_shader_info
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-12-10 21:59:36 +01:00
Marek Olšák
946eb08e6a tgsi: add tgsi_shader_info::writes_psize
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-12-10 21:59:36 +01:00
Marek Olšák
0a60ebe30c cso: put cso_release_all into cso_destroy_context
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-12-10 21:59:36 +01:00
Kristian Høgsberg
ee5fb8d1ba i965: Generate vs code using scalar backend for BDW+
With everything in place, we can now use the scalar backend compiler for
vertex shaders on BDW+.  We make scalar vertex shaders the default on
BDW+ but add a new vec4vs debug option to force the vec4 backend.

No piglit regressions.

Performance impact is minimal, I see a ~1.5% improvement on the T-Rex
GLBenchmark case, but in general it's in the noise.  Some of our
internal synthetic, VS-bound benchmarks show great improvement, 20%-40%
in some cases, but real-world cases are mostly unaffected.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-10 12:29:27 -08:00
Kristian Høgsberg
7ff457b930 i965: Clean up fs_visitor::run and rename to run_fs
Now that fs_visitor::run is back to being only fragment
shader compilation, we can clean up a few stage == MESA_SHADER_FRAGMENT
conditions and rename it to run_fs.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-10 12:29:23 -08:00
Kristian Høgsberg
8b6a797d74 i965: Add fs_visitor::run_vs() to generate scalar vertex shader code
This patch uses the previous refactoring to add a new run_vs() method
that generates vertex shader code using the scalar visitor and
optimizer.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-10 12:29:19 -08:00
Kristian Høgsberg
bf23079379 i965: Rename brw_vec4_prog_data/key to brw_vue_prog_data/key
These structs aren't vec4 specific, they are shared by shader stages
operating on Vertex URB Entries (VUEs).  VUEs are the data structures in
the URB that hold vertex data between the pipeline geometry stages.
Using vue in the name instead of vec4 makes a lot more sense, especially
when we add scalar vertex shader support.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-10 12:29:16 -08:00
Kristian Høgsberg
3d10f0a98c i965: Prepare for using the ATTR register file in the fs backend
The scalar vertex shader will use the ATTR register file for vertex
attributes.  This patch adds support for the ATTR file to fs_visitor.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-10 12:29:11 -08:00
Kristian Høgsberg
df0966fb1a i965: Consolidate code to get struct brw_sampler_prog_key_data
This chunk of code is repeated in a few places, and we're going to add
a MESA_SHADER_VERTEX case to it soon.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-10 12:29:08 -08:00
Kristian Høgsberg
c5b3878714 i965: Add new SIMD8 VS prog data flag
This flag signals that we have a SIMD8 VS shader so we can set up the
corresponding state accordingly.  This boils down to setting
the BDW+ SIMD8 enable bit in 3DSTATE_VS and making UBO and pull
constant buffers use dword pitch.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-10 12:29:04 -08:00
Kristian Høgsberg
d9e29f5d88 i965: Add SIMD8 URB write low-level IR instruction
This is all we need from the generator for SIMD8 vertex shaders.  This
opcode is just the send instruction, all the hard work will happen
in the visitor using LOAD_PAYLOAD.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-10 12:29:00 -08:00
Kristian Høgsberg
686ef091a4 i965: Remove shader program argument and member from fs_generator
Now that the caller passes in the shader debug name, we don't need this
anymore.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-10 12:28:55 -08:00
Kristian Høgsberg
9a1af7b318 i965: Set shader name for generator from call site
fs_generator no longer knows what stage it's generating code for, so
we have to set the debug name of the shader from the call site.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-10 12:28:51 -08:00
Kristian Høgsberg
7bb9d33b8d i965: Generalize fs_generator further
This removes all stage specific data from the generator, and lets us
create a generator for any stage.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-10 12:28:48 -08:00
Kristian Høgsberg
840e8fc920 i965: Don't copy propagate constants from sources with saturate
We don't propagate the saturate bit and some instructions can't
saturate at all.  If the source has saturate set, just skip propagation.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-10 12:28:32 -08:00
Matt Turner
47aaabda47 i965: Replace 'noann' debug flag with 'ann'.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-12-10 10:19:16 -08:00
Matt Turner
1a2de7dce8 i965: Disable unlit-centroid workaround on Gen < 6.
Ever since the original commit (8313f444) that added the workaround, we have
been enabling it on gens <= 7, even though gens <= 5 can't do multisampling.

I cannot find documentation that says that Sandybridge needs this
workaround but in practice disabling it causes these piglit tests to
fail:

EXT_framebuffer_multisample/interpolation {2,4} centroid-deriv{,-disabled}

On Ironlake:

total instructions in shared programs: 4358478 -> 4349671 (-0.20%)
instructions in affected programs:     117680 -> 108873 (-7.48%)

A bunch of shaders in TF2, Portal 2, and L4D2 are cut by 25~30%.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-12-10 10:18:39 -08:00
Adrien Destugues
13e42fc025 hgl: traverse add-on entries
* Allow using symlinks to add-ons when developing.
2014-12-10 14:01:01 +00:00
Alexander von Gluck IV
03e237e9f2 gallium/target: Haiku softpipe
* Use print macro to fix warning on 64-bit systems
2014-12-10 14:01:01 +00:00
Alexander von Gluck IV
63d3f621e3 gallium/aux: Avoid redefining MAX
* Can be redefined on some platforms through u_debug.h
2014-12-10 14:01:00 +00:00
Jan Vesely
3a18fc6058 clover: Use switch when creating kernel arguments.
This way we get a warning if an enum value is not handled.

v2: codestyle

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-12-10 15:48:20 +02:00
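The idea in the clover commit above can be sketched in C. This is only an illustration, not clover's code: the enum and function names are hypothetical. The point is that a switch over an enum with no default case lets GCC and Clang warn (-Wswitch, enabled by -Wall) when a new enumerator is added without a matching case.

```c
/* Hypothetical argument kinds; these names are for illustration only. */
enum arg_kind { ARG_SCALAR, ARG_GLOBAL, ARG_LOCAL, ARG_SAMPLER };

/* No default case on purpose: with -Wswitch, adding a new enumerator
 * without a matching case produces a compile-time warning here. */
static const char *arg_kind_name(enum arg_kind kind)
{
    switch (kind) {
    case ARG_SCALAR:  return "scalar";
    case ARG_GLOBAL:  return "global";
    case ARG_LOCAL:   return "local";
    case ARG_SAMPLER: return "sampler";
    }
    return "unknown"; /* reached only for out-of-range values */
}
```

An if/else chain over the same values would compile silently when a case is missing, which is the problem the commit message describes.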
Dave Airlie
7f21cf7198 r600g: only init GS_VERT_ITEMSIZE on r600
On evergreen there are 4 regs, on r600/700 there is only one.

Don't initialise regs and trash someone else's state.

Not sure this fixes anything, but hey one less stupid.

Reviewed-By: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.3 10.4" mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-12-10 16:34:40 +10:00
Eric Anholt
8812dc503e vc4: Do QPU scheduling across uniform loads.
This means another pass of reordering the uniform data store, but it lets
us pair up a lot more instructions.

total instructions in shared programs: 44639 -> 43176 (-3.28%)
instructions in affected programs:     36938 -> 35475 (-3.96%)
2014-12-09 21:19:11 -08:00
Eric Anholt
c5b544403f vc4: Populate the delay field better, and schedule high delay first.
This is a standard scheduling heuristic, and clearly helps.

total instructions in shared programs: 46418 -> 44467 (-4.20%)
instructions in affected programs:     42531 -> 40580 (-4.59%)
2014-12-09 18:32:36 -08:00
Eric Anholt
45a8923771 vc4: Skip raddr dependencies for 32-bit immediate loads.
These don't have raddr fields.
2014-12-09 18:32:36 -08:00
Eric Anholt
f431b4f110 vc4: Mark VPM read setup as impacting VPM reads, not writes.
Fixes assertion failures if we adjust scheduling priorities to emphasize
VPM reads more.
2014-12-09 18:32:36 -08:00
Eric Anholt
cff8c96a0d vc4: Refuse to merge instructions involving 32-bit immediate loads.
An immediate load overwrites the mul and add operations, so you can't
merge with them.
2014-12-09 18:32:36 -08:00
Aaron Watry
25db8729dc clover: Fix build after llvm r223802
Signed-off-by: Aaron Watry <awatry at gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-12-09 19:28:50 -06:00
Rob Clark
69d23809d0 freedreno/a4xx: frag-coord / face fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-09 18:03:55 -05:00
Rob Clark
3dbcd25022 freedreno/a4xx: fix rendering to layer != 0
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-09 18:03:40 -05:00
Rob Clark
6a5ba23fa6 freedreno/a4xx: temp hack for FLAT varyings
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-09 18:03:09 -05:00
Rob Clark
eb6fd3b8eb freedreno/ir3: lower TXP as needed
On a3xx, lower TXP for 3D textures, on a4xx lower all TXP.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-09 18:03:01 -05:00
Rob Clark
5b38a1740b freedreno/a4xx: XA gpu hang at startup
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-09 18:02:45 -05:00
Rob Clark
1e3a732603 freedreno/a4xx: texture fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-09 18:01:49 -05:00
Rob Clark
5d7c9c9160 freedreno: cleanup slice alignment/setup
Collapse things back into a setup_slices() which takes the desired
alignment as a param.  This gets things ready for a4xx which has some
slightly different requirements.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-09 18:01:21 -05:00
Rob Clark
8ecbcbf0aa freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-09 18:01:10 -05:00
Rob Clark
219440ddeb tgsi/lowering: add support to lower TXP (v2)
v2: actually do perspective divide for RECT/SHADOWRECT

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-12-09 17:47:44 -05:00
Timothy Arceri
f1b5f2b157 mesa: use build flag to ensure stack is realigned on x86
Nowadays GCC assumes the stack pointer is 16-byte aligned even on 32-bit x86, but that is an assumption OpenGL drivers (or any dynamic library, for that matter) can't afford to make, as there are many closed- and open-source application binaries out there that only assume 4-byte stack alignment.

V4: fix comment and indentation

V3: move all sse4.1 build flag config to the same location
 and add comment as to why we need to do the realign

V2: use $target_cpu rather than $host_cpu
  and setup build flags in config rather than makefile

https://bugs.freedesktop.org/show_bug.cgi?id=86788
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Matt Turner <mattst88@gmail.com>
CC: "10.4" <mesa-stable@lists.freedesktop.org>
2014-12-10 07:35:38 +11:00
Marek Olšák
65ef78e861 draw: implement TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION
Required by Nine. Tested with util_run_tests.
It's added to softpipe, llvmpipe, and r300g/swtcl.

Tested-by: David Heidelberg <david@ixit.cz>
2014-12-09 12:27:10 +01:00
Samuel Iglesias Gonsalvez
6cc7251185 main: return two minor digits for ES shading language version
For OpenGL ES 3.0 spec, the minor number for SHADING_LANGUAGE_VERSION is always
two digits, matching the OpenGL ES Shading Language Specification release
number. For example, this query might return the string "3.00".

This patch fixes the following dEQP test:

   dEQP-GLES3.functional.state_query.string.shading_language_version

No piglit regression observed.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-09 11:40:00 +01:00
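A minimal sketch of the two-digit minor formatting described in the commit above. The helper name and signature are hypothetical and this is not Mesa's actual code; it only shows how "%02d" yields "3.00" rather than "3.0".

```c
#include <stdio.h>

/* Hypothetical helper: format a shading language version so that the
 * minor number always has two digits, e.g. major 3, minor 0 -> "3.00". */
static int format_sl_version(char *buf, size_t size, int major, int minor)
{
    return snprintf(buf, size, "%d.%02d", major, minor);
}
```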
Samuel Iglesias Gonsalvez
426a50e208 glsl: invariant qualifier is not valid for shader inputs in GLSL ES 3.00
GLSL ES 3.00 spec, chapter 4.6.1 "The Invariant Qualifier",

    Only variables output from a shader can be candidates for invariance. This
    includes user-defined output variables and the built-in output variables.
    As only outputs can be declared as invariant, an invariant output from one
    shader stage will still match an input of a subsequent stage without the
    input being declared as invariant.

This patch fixes the following dEQP tests:

dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_interp_storage_precision
dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_interp_storage
dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_storage_precision
dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_storage
dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_interp_storage_precision_invariant_input
dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_interp_storage_invariant_input
dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_storage_precision_invariant_input
dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_storage_invariant_input

No piglit regressions observed.

v2:
- Add spec content in the code

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-09 11:40:00 +01:00
Iago Toral Quiroga
e1ed4f2532 mesa: Recompute LegalTypesMask if the GL API has changed
The current code computes ctx->Array.LegalTypesMask just once,
however, computing this needs to consider ctx->API so we need
to make sure that the API for that context has not changed if
we intend to reuse the result.

The context API can change, at least, if we go through
_mesa_meta_begin, since that will always force
API_OPENGL_COMPAT until we call _mesa_meta_end. If any
operation in between these two calls triggers a call to
update_array_format, then we might be caching a value for
LegalTypesMask that will not be right once we have called
_mesa_meta_end and restored the context API.

Fixes the following 179 dEQP tests in i965:
dEQP-GLES3.functional.vertex_arrays.single_attribute.strides.fixed.*
dEQP-GLES3.functional.vertex_arrays.single_attribute.normalize.fixed.*
dEQP-GLES3.functional.vertex_arrays.single_attribute.output_types.fixed.*
dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.static_draw.*fixed*
dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.stream_draw.*fixed*
dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.dynamic_draw.*fixed*
dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.static_copy.*fixed*
dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.stream_copy.*fixed*
dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.dynamic_copy.*fixed*
dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.static_read.*fixed*
dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.stream_read.*fixed*
dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.dynamic_read.*fixed*
dEQP-GLES3.functional.vertex_arrays.multiple_attributes.input_types.3_*fixed2*
dEQP-GLES3.functional.draw.random.{2,18,28,68,83,106,109,156,181,191}

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-12-09 11:40:00 +01:00
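The caching problem described in the LegalTypesMask commit above can be sketched as follows. All types, names, and mask values here are hypothetical stand-ins rather than Mesa's real ones; the sketch only shows a cached value being tagged with the API it was computed for, so that it is recomputed when the API differs.

```c
/* Hypothetical stand-ins for the context API and array state. */
enum api { API_COMPAT, API_ES3 };

struct array_state {
    unsigned legal_types_mask;      /* cached result, 0 means "not computed" */
    enum api legal_types_mask_api;  /* API the cached mask was computed for */
};

static unsigned compute_count;      /* counts recomputations, for the demo */

/* Placeholder for the real computation, whose result depends on the API. */
static unsigned compute_legal_types(enum api api)
{
    compute_count++;
    return api == API_ES3 ? 0x0fu : 0xffu;
}

/* Reuse the cached mask only if it was computed for the current API. */
static unsigned get_legal_types(struct array_state *s, enum api current)
{
    if (s->legal_types_mask == 0 || s->legal_types_mask_api != current) {
        s->legal_types_mask = compute_legal_types(current);
        s->legal_types_mask_api = current;
    }
    return s->legal_types_mask;
}
```

Without the API tag, a mask computed while the API was temporarily forced to a different value would keep being returned after the original API is restored.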
Eduardo Lima Mitev
09cb149ba7 mesa: Returns zero samples when querying GL_NUM_SAMPLE_COUNTS when internal format is integer
From GL ES 3.0 specification, section 6.1.15 Internal Format Queries (page 236),
multisampling is not supported for signed and unsigned integer internal formats.

Fixes 19 dEQP tests under 'dEQP-GLES3.functional.state_query.internal_format.*'.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-09 11:40:00 +01:00
Eduardo Lima Mitev
7894278717 mesa: Enables GL_RGB and GL_RGBA unsized internal formats for OpenGL ES 3.0
GL_RGB and GL_RGBA are valid internal formats on a GLES3 profile. See
"Table 1. Unsized Internal Formats" at
https://www.khronos.org/opengles/sdk/docs/man3/html/glTexImage2D.xhtml.

Fixes 2 dEQP tests:
- dEQP-GLES3.functional.state_query.internal_format.rgb_samples
- dEQP-GLES3.functional.state_query.internal_format.rgba_samples

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-12-09 11:40:00 +01:00
Eduardo Lima Mitev
242ad32655 mesa: Considers GL_DEPTH_STENCIL_ATTACHMENT a valid argument for FBO invalidation under GLES3
In OpenGL and OpenGL-ES 3+, GL_DEPTH_STENCIL_ATTACHMENT is a valid attachment point for the family of functions
that invalidate a framebuffer object (e.g., glInvalidateFramebuffer, glInvalidateSubFramebuffer, etc).
Currently, a GL_INVALID_ENUM error is emitted for this attachment point.

Fixes 21 dEQP test failures under 'dEQP-GLES3.functional.fbo.invalidate.*'.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-09 11:40:00 +01:00
Eric Anholt
8420a95692 vc4: Reserve rb31 instead of r3 for raddr conflict spills.
This increases the cost of a raddr b conflict spill (save r3 to rb31, move
src1 to r3, move rb31 back to r3 when done, instead of just move src1 to
r3), but on average thanks to instruction pairing it's more worthwhile to
have another accumulator.

total instructions in shared programs: 46428 -> 46171 (-0.55%)
instructions in affected programs:     38030 -> 37773 (-0.68%)
2014-12-09 01:04:46 -08:00
Eric Anholt
ab1b1fa6fb vc4: Prioritize allocating accumulators to short-lived values.
The register allocator walks from the end of the nodes array looking for
trivially-allocatable things to put on the stack, meaning (assuming
everything is trivially colorable and gets put on the stack in a single
pass) the low node numbers get allocated first.  The things allocated
first happen to get the lower-numbered registers, which is to say the fast
accumulators that can be paired more easily.

When we previously made the nodes match the temporary register numbers,
we'd end up putting the shader inputs (VS or FS) in the accumulators,
which are often long-lived values.  By prioritizing the shortest-lived
values for allocation, we can get a lot more instructions that involve
accumulators, and thus fewer conflicts for raddr and WS.

total instructions in shared programs: 52870 -> 46428 (-12.18%)
instructions in affected programs:     52260 -> 45818 (-12.33%)
2014-12-09 00:55:14 -08:00
Dave Airlie
0d4272cd8e r600g: fix regression since UCMP change
Since d8da6decea, where the state tracker started using UCMP, a number of
tests regressed on cayman.

This seems to be because r600g is doing CNDGE_INT for UCMP, which tests >= 0;
we should be doing CNDE_INT with reversed arguments.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-12-09 11:54:46 +10:00
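The mismatch described in the r600g commit above can be modelled in plain C. The functions below are illustrative scalar models written from my understanding of the opcode semantics (UCMP selects on src0 != 0, CNDGE_INT on src0 >= 0, CNDE_INT on src0 == 0); they are not code from the driver.

```c
#include <stdint.h>

/* Model of TGSI UCMP: dst = (src0 != 0) ? src1 : src2 */
static int32_t tgsi_ucmp(int32_t c, int32_t a, int32_t b)
{
    return c != 0 ? a : b;
}

/* Model of CNDGE_INT: dst = (src0 >= 0) ? src1 : src2 */
static int32_t cndge_int(int32_t c, int32_t a, int32_t b)
{
    return c >= 0 ? a : b;
}

/* Model of CNDE_INT: dst = (src0 == 0) ? src1 : src2 */
static int32_t cnde_int(int32_t c, int32_t a, int32_t b)
{
    return c == 0 ? a : b;
}

/* UCMP expressed with CNDE_INT: swap the two value operands. */
static int32_t ucmp_via_cnde(int32_t c, int32_t a, int32_t b)
{
    return cnde_int(c, b, a);
}
```

Under these models, CNDGE_INT disagrees with UCMP for a zero or negative condition, while CNDE_INT with the operands swapped agrees for every condition value.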
Matt Turner
2a0bef91ca program: Delete dead _mesa_realloc_instructions.
Dead since 2010 (commit 284ce209).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-08 17:02:19 -08:00
Matt Turner
811a1836c8 swrast: Remove 'inline' from tex filter functions.
Reduces .text size of mesa_dri_drivers.so (i965-only) by 62k, or 1.4%.

Note that we don't remove inline from lerp_2d(), which has a comment
above it saying it definitely should be inlined. Though, removing the
inline keyword from it doesn't actually change the compiled code for me.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-12-08 17:02:19 -08:00
Matt Turner
8af4aaf351 Don't cast the return value of malloc/realloc
See commit 2b7a972e for the Coccinelle script.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-08 17:02:19 -08:00
Matt Turner
f0a8bcd84e Use calloc instead of malloc/memset-0
See commit 6bda027e for the Coccinelle script.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-08 17:02:19 -08:00
Matt Turner
9019e5e195 Remove useless checks for NULL before freeing
See commits 5067506e and b6109de3 for the Coccinelle script.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-08 17:02:19 -08:00
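The three cleanups above (no cast on malloc/realloc, calloc instead of malloc plus memset, no NULL check before free) can be illustrated with a small sketch. The struct and function names are hypothetical and are not taken from the Mesa tree or from the Coccinelle scripts.

```c
#include <stdlib.h>

struct buffer {
    size_t size;
    unsigned char *data;
};

/* Before the cleanups, code of this shape was common:
 *
 *    buf = (struct buffer *) malloc(sizeof(struct buffer));
 *    memset(buf, 0, sizeof(struct buffer));
 *    ...
 *    if (buf->data)
 *       free(buf->data);
 */

static struct buffer *buffer_create(size_t size)
{
    /* calloc returns zeroed memory; in C no cast of the result is needed. */
    struct buffer *buf = calloc(1, sizeof(*buf));
    if (!buf)
        return NULL;
    buf->data = calloc(size, 1);
    buf->size = buf->data ? size : 0;
    return buf;
}

static void buffer_destroy(struct buffer *buf)
{
    if (buf) {            /* guards the dereference below, not free() */
        free(buf->data);  /* free(NULL) is defined to do nothing */
        free(buf);
    }
}
```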
Kristian Høgsberg
cae7a2a031 i965/skl: Add Skylake PCI IDs
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2014-12-08 16:33:59 -08:00
Damien Lespiau
5bad948fa8 i965/skl: Emit depth stall workaround for gen9 as well
The docs say that we shouldn't need this workaround for gen8+, but just
removing it causes GPU hangs.  We'll revisit this, but for now, just
extend the workaround to gen9.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-12-08 16:33:59 -08:00
Ben Widawsky
9404494b9b i965/skl: Fix GS thread count location
SKL moves the GS threadcount to dw8 from dw7, and no longer does the
divide by 2 thing.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Tested-by: Kristian Høgsberg <krh@bitplanet.net>
2014-12-08 16:33:59 -08:00
Vinson Lee
d20235f79a i965: Fix union usage for G++ <= 4.6.
This patch fixes this build error with G++ <= 4.6.

  CXX    test_vf_float_conversions.o
test_vf_float_conversions.cpp: In function ‘unsigned int f2u(float)’:
test_vf_float_conversions.cpp:63:20: error: expected primary-expression before ‘.’ token

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86939
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-08 16:25:16 -08:00
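The commit above does not show the fix itself. One portable way to write a float-to-bits helper such as f2u() is with a named union variable, sketched here in C as an assumption about the general approach, not as the actual patch. The union name is hypothetical, and the result assumes IEEE 754 single precision and a 32-bit unsigned int.

```c
/* Hypothetical union for reinterpreting a float's bits. */
union fu {
    float f;
    unsigned u;
};

/* Return the bit pattern of f using a named union variable, which
 * avoids member access on a temporary or compound literal. */
static unsigned f2u(float f)
{
    union fu x;
    x.f = f;
    return x.u;
}
```

Reading a union member other than the one last written is defined behavior in C; in C++ it relies on compiler support, which GCC documents as an extension.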
Eric Anholt
70dd3df344 vc4: Interleave register allocation from regfile A and B.
The register allocator prefers low-index registers from vc4_regs[] in the
configuration we're using, which is good because it means we prioritize
allocating the accumulators (which are faster).  On the other hand, it was
causing raddr conflicts because everything beyond r0-r2 ended up in
regfile A until you got massive register pressure.  By interleaving, we
end up getting more instruction pairing from getting non-conflicting
raddrs and QPU_WSes.

total instructions in shared programs: 55957 -> 52719 (-5.79%)
instructions in affected programs:     46855 -> 43617 (-6.91%)
2014-12-08 16:08:13 -08:00
Eric Anholt
46741c1b87 vc4: Fix decision for whether the MIN operation writes to the B regfile. 2014-12-08 16:08:13 -08:00
Eric Anholt
24c5ab7bbb vc4: Drop dependency on r3 for color packing.
We can avoid it by carefully ordering the packing.  This is important as a
step in giving r3 to the register allocator.

total instructions in shared programs: 56087 -> 55957 (-0.23%)
instructions in affected programs:     18368 -> 18238 (-0.71%)
2014-12-08 16:08:13 -08:00
Eric Anholt
dfbf58c439 vc4: Add support for GL 1.0 logic ops. 2014-12-08 16:08:13 -08:00
Eric Anholt
5045d8ca42 vc4: Add support for TGSI_OPCODE_UCMP.
This is being emitted now from st_glsl_to_tgsi.cpp.
2014-12-08 16:08:13 -08:00
Tom Stellard
c16436149c radeonsi/compute: Clamp COMPUTE_TMPRING_SIZE.WAVES to: num_cu * 32
This is the maximum value allowed for this field.
2014-12-08 17:20:50 -05:00
Tom Stellard
0e1c085f17 winsys/radeon: Always report at least 1 compute unit
All uses of this require that the value be at least one, so it's
easier to report at least one than having to wrap all uses
in MAX2(max_compute_units, 1).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-12-08 17:20:50 -05:00
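The two radeon commits above both come down to clamping a value once at its source. A minimal sketch follows; the helper names are hypothetical and the macros are defined locally for the example. Only the "at least one" rule and the num_cu * 32 limit are taken from the commit messages.

```c
#define MAX2(a, b) ((a) > (b) ? (a) : (b))
#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical: report at least one compute unit so that callers
 * don't each need to wrap the value in MAX2(max_compute_units, 1). */
static unsigned reported_compute_units(unsigned queried)
{
    return MAX2(queried, 1u);
}

/* Hypothetical: clamp a wave count to the stated maximum, num_cu * 32. */
static unsigned clamp_waves(unsigned waves, unsigned num_cu)
{
    return MIN2(waves, num_cu * 32u);
}
```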
Tom Stellard
67dcbcd92c radeonsi: Program RASTER_CONFIG for harvested GPUs v5
Harvested GPUs have some of their render backends disabled, so
in order to prevent the hardware from trying to render things
with these disabled backends we need to correctly program
the PA_SC_RASTER_CONFIG register.

v2:
  - Write RASTER_CONFIG for all SEs.

v3:
  - Set GRBM_GFX_INDEX.INSTANCE_BROADCAST_WRITES bit.
  - Set GRBM_GFX_INDEX.SH_BROADCAST_WRITES bit when done setting
    PA_SC_RASTER_CONFIG.
  - Get num_se and num_sh_per_se from kernel.

v4:
  - Get correct value for num_se
  - Remove loop for setting PA_SC_RASTER_CONFIG
  - Only compute raster config when a backend has been disabled.

v5: Michel Dänzer
  - Fix computation for chips with multiple SEs

https://bugs.freedesktop.org/show_bug.cgi?id=60879

CC: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
2014-12-08 17:20:50 -05:00
Roland Scheidegger
fea5c2640b draw: (trivial): remove double semicolon 2014-12-09 00:10:41 +01:00
Abdiel Janulgue
49e0431211 st/mesa: For vertex shaders, don't emit saturate when SM 3.0 is unsupported
There is a bug in the current lowering pass implementation where we lower saturate
to clamp only for vertex shaders on drivers supporting SM 3.0. The correct behavior
is to lower to clamp only when we don't support saturate, which happens
on drivers that don't support SM 3.0.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-12-08 20:14:26 +02:00
Abdiel Janulgue
4ea8c8d56c glsl: Don't optimize min/max into saturate when EmitNoSat is set
v3: Fix multi-line comment format (Ian)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-12-08 20:14:17 +02:00
Abdiel Janulgue
39f7b72428 ir_to_mesa: Remove sat to clamp lowering pass
Fixes an infinite loop in swrast where the lowering pass unpacks saturate into
clamp but the opt_algebraic pass tries to do the opposite.

v3 (Ian):
This is a revert of commit cfa8c1cb "ir_to_mesa: lower ir_unop_saturate" on
the ir_to_mesa.cpp portion. prog_execute.c can handle saturates in vertex
shaders, so classic swrast shouldn't need this lowering pass.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83463
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-12-08 20:14:10 +02:00
Michael Forney
5d64da401c loader: Add missing EXPAT_CFLAGS to libloader.la CPPFLAGS
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-08 08:50:27 -08:00
Matt Turner
f65200ccc9 i965: Remove default from brw_instruction_name switch to catch missing names.
The case-range extension is available in clang and gcc at least back to
3.4.0.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2014-12-08 08:50:26 -08:00
Matt Turner
b6a71cbb64 i965: Add missing opcode names.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2014-12-08 08:50:26 -08:00
Matt Turner
6383e206c0 i965: Add opcode names for set_omask and set_sample_id.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-12-08 08:50:26 -08:00
Chad Versace
7e8ba77c49 egl: Expose EGL_KHR_get_all_proc_addresses and its client extension
Mesa already implements the behavior of EGL_KHR_get_all_proc_addresses
and EGL_KHR_client_get_all_proc_addresses. This patch just exposes the
extension strings.

See: https://www.khronos.org/registry/egl/extensions/KHR/EGL_KHR_get_all_proc_addresses.txt
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2014-12-07 20:58:25 -08:00
Emil Velikov
0b6e0aa5ae docs: add news item and link release notes for mesa 10.3.5
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-07 19:22:11 +00:00
Emil Velikov
7409ad5147 docs: Add sha256 sums for the 10.3.5 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 1ba2029184)
2014-12-07 19:22:11 +00:00
Emil Velikov
8d235e0c70 Add release notes for the 10.3.5 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit c90b0db1ae)
2014-12-07 19:22:11 +00:00
Ilia Mirkin
043b79461f freedreno/a2xx: silence warning about missing DEPTH32X
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-12-06 18:18:53 -05:00
Ilia Mirkin
c416f49ebe freedreno/a3xx: handle index_bias (i.e. base_vertex)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-12-06 18:18:50 -05:00
Ilia Mirkin
b38b40d7bb freedreno/a3xx: add bgr565 texturing and rendering
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-12-06 18:18:47 -05:00
Ilia Mirkin
e02ed16cb5 freedreno/a3xx: add support for SRGB render targets
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-12-06 18:18:43 -05:00
Ilia Mirkin
39a7c049d3 freedreno/a3xx: output RGBA16_FLOAT from fs for certain outputs
Fixes R11G11B10F rendering, and is required for SRGB format support.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-12-06 18:18:40 -05:00
Ilia Mirkin
3674c76edf freedreno/a3xx: re-enable rgb10_a2 render targets
There were previously regressions regarding border colors, which the
updated swizzle logic resolves.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-12-06 18:18:37 -05:00
Ilia Mirkin
fc94b2c2a0 freedreno/a3xx: fix border color swizzle to match texture format desc
This is a hack since it uses the texture information together with the
sampler, but I don't see a better way to do it. In OpenGL, there is a
1:1 correspondence.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-12-06 18:18:33 -05:00
Ilia Mirkin
97fef2db5c freedreno/a3xx: fix alpha-blending on RGBX formats
Expert debugging assistance provided by Chris Forbes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-12-06 18:18:20 -05:00
Chris Forbes
6b01969345 glcpp: Fix can not to cannot in error message
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2014-12-07 11:49:28 +13:00
Chris Forbes
b49a069bd3 glcpp: Disallow undefining GL_* builtin macros.
Fixes the piglit test: spec/glsl-es-3.00/compiler/undef-GL_ES.vert

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-07 11:47:45 +13:00
Chris Forbes
ed56c16820 i965/Gen6-7: Fix point sprites with PolygonMode(GL_POINT)
This was an oversight in the original patch. When PolygonMode is
used, then front faces, back faces, or both may be rendered as
points and are affected by point sprite state.

Note that SNB/IVB can't actually be fully conformant here, for
a legacy context -- we don't have separate sets of pointsprite
enables for front and back faces. Haswell ignores pointsprite
state correctly in hardware for non-point rasterization, so can
do this correctly, but it doesn't seem worth it.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86764
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-07 11:46:42 +13:00
Chris Forbes
092c73a7c3 i965: Fix regs read for FS_OPCODE_INTERP_PER_SLOT_OFFSET
Dead code elimination was eating the Y offset.

Fixes the piglit test:
spec/ARB_gpu_shader5/arb_gpu_shader5-interpolateAtOffset-nonconst

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-12-07 10:29:26 +13:00
Chris Forbes
680f72d6f2 i965: Add opcode names for FS interpolation opcodes
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-12-07 10:29:20 +13:00
Roland Scheidegger
d8da6decea mesa/st: don't use CMP / I2F for conditional assignments with native integers
The original idea was to optimize away the condition by integrating it directly
into the CMP instruction. However, with native integers this requires an extra
I2F instruction. It is also fishy because the negation used didn't really honor
ieee754 float comparison rules, not to mention the CMP instruction itself
(being pretty much a legacy instruction) doesn't really have defined special
float value behavior in any case.
So, use UCMP and adjust the code trying to optimize the condition away
accordingly (I have absolutely no idea if such conditions are actually hit
or would be translated away somewhere else already).

v2: cosmetic changes

No piglit regressions on llvmpipe.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-12-06 18:03:25 +01:00
Roland Scheidegger
6f2cf5f3d0 llvmpipe: decrease MAX_SCENES from 2 to 1
Multiple scenes per context are meant to be used so a new scene can be built
while another one is processed in rasterization. However, quite surprisingly,
this does not actually work (and according to git log, possibly never did,
though maybe it did at some point further back (5 years+) but was buggy)
because we always wait immediately on the rasterizer to finish the scene when
a context (and hence its setup/scene) is flushed. This means when we try to get
an empty scene later, any old one is already empty again.
Thus using multiple scenes is just a waste of memory (not too bad, since the
additional scenes are guaranteed to be empty, which means their size ought to
be one data block (64kB) plus the size of some structs), without actually
really doing anything. (There is also quite some code for the whole concept of
multiple scenes which doesn't really do much in practice, but keep it hoping
the wait-on-scene-flush can be fixed some day.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-06 18:03:18 +01:00
Roland Scheidegger
1b6db3593e draw: use the prim type from prim_info not emit in passthrough emit
The prim assembler may change the prim type when injecting prim ids now,
which isn't reflected by what's stored in emit.
This looks brittle and potentially dangerous (it is not obvious if such prim
type changes are really supported by pt emit, the prim type is actually also
set in prepare which would then be different).

This fixes piglit primitive-id-no-gs-first-vertex.shader_test.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-06 18:03:11 +01:00
Roland Scheidegger
fe86415beb draw: use correct output prim for non-adjacent topologies in prim assembler.
The decomposition done in the prim assembler will turn tri fans into tris,
but this wasn't reflected in the output prim type. Meaning with a tri fan
with 6 verts input, the output was a tri fan with 12 vertices instead of a
tri list with 12 vertices (not as bad as it sounds, since the additional tris
created would all be degenerate since they'd all have two times vertex zero
but still bogus).
This is because the prim assembler is used if either the input topology is
something with adjacency, or if prim id needs to be injected, and for the
latter case topologies without adjacency can be converted to basic ones.
Unfortunately decomposition here for inserting prim ids is necessary, at
least for the indexed case where we can't just insert the prim id at the
right place depending on provoking vertex.
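
The vertex counts quoted above can be checked with a small sketch. This is not the draw module's prim assembler; `fan_to_tri_list` is a hypothetical helper that only illustrates why a 6-vertex tri fan decomposes into a 12-vertex tri list, and why labelling that output as a fan would be wrong.

```c
#include <stddef.h>

/* Hypothetical helper, not Mesa's prim assembler: decompose a triangle fan
 * of n_verts vertices into an independent triangle list.
 * out must have room for 3 * (n_verts - 2) indices.
 * Returns the number of indices written. */
static size_t
fan_to_tri_list(size_t n_verts, unsigned *out)
{
   size_t n = 0;

   if (n_verts < 3)
      return 0;

   for (size_t i = 1; i + 1 < n_verts; i++) {
      out[n++] = 0;                 /* every fan triangle shares vertex 0 */
      out[n++] = (unsigned)i;
      out[n++] = (unsigned)(i + 1);
   }
   return n;
}
```

Calling it with 6 input vertices yields 4 triangles, i.e. 12 indices, all of which must be drawn as a list rather than reinterpreted as a fan.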

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-06 18:03:05 +01:00
Roland Scheidegger
3fdbad1142 draw: kill off unneeded prim assembler code for handling adjacency verts
The default macros when the adjacency macros aren't defined will already
exactly do that (that is, drop the adjacent vertices and call the non-adjacent
macro).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-06 18:02:59 +01:00
Roland Scheidegger
ec30c66b46 gallium/docs: (trivial) remove STR opcode description.
The opcode was removed alongside SFL by commit
ecfe9e2ad2.
2014-12-06 17:56:46 +01:00
Matt Turner
a28ad9d4c0 i965/fs: Perform CSE on MOV ..., VF instructions.
Safe from causing optimization loops, since we don't constant propagate
VF arguments.

(for this and the previous patch):
total instructions in shared programs: 4289075 -> 4271932 (-0.40%)
instructions in affected programs:     1616779 -> 1599636 (-1.06%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-05 16:43:31 -08:00
Matt Turner
963a3c7f90 i965/fs: Try to emit LINE instructions on Gen <= 5.
The LINE instruction performs a multiply-add instruction (a * b + c)
where b and c are scalar arguments. It reads b and c from offsets in
src0 such that you can load them (if they're representable) as a
vector-float immediate with a single instruction.

Hurts some programs, but that'll all get better once we CSE the
vector-float MOVs in the next patch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77544
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-05 16:43:31 -08:00
Matt Turner
6be863af0e i965/fs: Add support for generating the LINE instruction.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-05 16:43:31 -08:00
Matt Turner
92346db057 i965: Set the region of LINE's src0 to <0,1,0>.
The PRMs say that

   <src0> region must be a replicated scalar
   (with HorzStride = VertStride = 0).

but apparently that doesn't actually apply to all generations. I did
notice when implementing the optimization later in this series that G45
and ILK needed this regioning.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-05 16:43:31 -08:00
Matt Turner
9ed8d00ab5 i965: Give compile stats through KHR_debug.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-05 16:43:31 -08:00
Matt Turner
5b1e51bfbe mesa: Add a source parameter to _mesa_gl_debug.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-05 16:43:31 -08:00
Eric Anholt
befdff8142 vc4: Try swapping the regfile A to B to pair instructions.
total instructions in shared programs: 56995 -> 56087 (-1.59%)
instructions in affected programs:     40503 -> 39595 (-2.24%)
2014-12-05 16:27:58 -08:00
Eric Anholt
7d8b79f398 vc4: Allow pairing of some instructions that disagree about the WS bit.
No difference on shader-db because we tend to have a lot of other
conflicts going on as well (like RADDR_A disagreements)
2014-12-05 16:27:06 -08:00
Matt Turner
e36c6513ce configure.ac: Replace contraction to fix syntax highlighting. 2014-12-05 13:22:56 -08:00
Ben Widawsky
f13870db09 i965/gs: Avoid DW * DW mul
The GS has an interesting use for mul. Because the GS can emit multiple
vertices per input vertex, and it also has a unique count at the top of the URB
payload, the GS unit needs to be able to dynamically specify URB write offsets
(relative to the global offset). The documentation in the function has a very
good explanation from Paul on the mechanics.

This fixes around 2000 piglit tests on BSW.

v2:
Reworded commit message (Ben) no mention of CHV (Matt)
Change SHRT_MAX to USHRT_MAX (Ken, and Matt)
Update comment in code to reflect the use of UW (Ben)
Add Gen7+ assertion for the relevant GS code, since it won't work on Gen6- (Ken)
Drop the bogus hunk in emit_control_data_bits() (Ken)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84777 (with many dupes)
Cc: "10.4 10.3 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-05 12:12:46 -08:00
Eric Anholt
6f32deb538 vc4: Add separate write-after-read dependency tracking for pairing.
If an operation is the last one to read a register, the instruction
containing it can also include the op that has the next write to that
register.

total instructions in shared programs: 57486 -> 56995 (-0.85%)
instructions in affected programs:     43004 -> 42513 (-1.14%)
2014-12-05 10:53:53 -08:00
Eric Anholt
042962df2d vc4: Fix inverted priority of instructions for QPU scheduling.
We were scheduling TLB operations as early as possible, and texture setup
as late as possible.  When I introduced prioritization, I visually
inspected that an independent operation got moved above texture results
collection, which tricked me into thinking it was working (but it was just
because texture setup was being pushed late).

total instructions in shared programs: 57651 -> 57486 (-0.29%)
instructions in affected programs:     18532 -> 18367 (-0.89%)
2014-12-05 10:43:14 -08:00
Eric Anholt
bd4057a5d7 vc4: Refuse to merge two ops that both access shared functions.
Avoids assertion failures in vc4_qpu_validate.c if we happen to find the
right set of operations available.
2014-12-05 10:43:14 -08:00
Eric Anholt
dadc32ac80 vc4: Allow dead code elimination of color reads.
This might happen if the blending functions are set up to not actually use
the destination color/alpha, for example.
2014-12-05 10:43:14 -08:00
Eric Anholt
34cf86bdc4 vc4: Add a debug flag for waiting for sync on submit.
This is nice when you're tracking down which command list is hanging the
GPU.
2014-12-05 10:43:14 -08:00
Matt Turner
c0e26c5d27 i965/fs: Move brw_file_from_reg() higher in the file.
This was supposed to be part of the previous commit.
2014-12-05 09:53:35 -08:00
Matt Turner
db186f2a38 i965/fs: Make brw_reg_from_fs_reg static and remove prototype.
And move it above its first use in brw_fs_generator.cpp.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-12-05 09:49:42 -08:00
Matt Turner
2881b123d0 i965: Use ~0 to represent true on all generations.
Jason realized that we could fix the result of the CMP instruction on
Gen <= 5 by doing -(result & 1). Also do the resolves in the vec4
backend before use, rather than when the bool was created. The FS does
this and it saves some unnecessary resolves.
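
The `-(result & 1)` trick can be illustrated in plain C. This is a sketch, assuming (as the message implies) that only the low bit of the CMP result is meaningful on Gen <= 5; `resolve_bool` is a hypothetical name, not a function in the driver.

```c
#include <stdint.h>

/* Assumption for illustration: only bit 0 of cmp_result is meaningful,
 * the upper bits may hold garbage. */
static int32_t
resolve_bool(int32_t cmp_result)
{
   /* Mask to the low bit, then negate: 1 becomes ~0 (all ones), 0 stays 0. */
   return -(cmp_result & 1);
}
```

The result is a canonical 0 / ~0 boolean regardless of what the upper bits contained.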

On Ironlake:

total instructions in shared programs: 4289762 -> 4287277 (-0.06%)
instructions in affected programs:     619430 -> 616945 (-0.40%)

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-12-05 09:49:42 -08:00
Matt Turner
05e2578cac i965: Change the type of booleans to D.
This is a revert of commit 4656c14e ("i965/fs: Change the type of
booleans to UD and emit correct immediates") plus some small additional
fixes, like casting ctx->Const.UniformBooleanTrue to int and changing UD
to D in the ir_unop_b2f cases. Note that it's safe to leave 0x3f800000
as UD and as a literal it's more recognizable than 1065353216.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-12-05 09:49:42 -08:00
Matt Turner
66cc8de042 i965/fs: Add a negate() function.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-12-05 09:49:42 -08:00
Matt Turner
15f6118b77 i965/vec4: Don't DCE flag-writing insts because dest was unused.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-12-05 09:49:42 -08:00
Matt Turner
0d3cc01b0b i965/vec4: Allow CSE on uniform-vec4 expansion MOVs.
Three source instructions cannot directly source a packed vec4 (<0,4,1>
regioning) like vec4 uniforms, so we emit a MOV that expands the vec4 to
both halves of a register.

If these uniform values are used by multiple three-source instructions,
we'll emit multiple expansion moves, which we cannot combine in CSE
(because CSE emits moves itself).

So emit a virtual instruction that we can CSE.

Sometimes we demote a uniform to a pull constant after emitting an
expansion move for it. In that case, recognize in opt_algebraic that if
the .file of the new instruction is GRF then it's just a real move that
we can copy propagate and such.

total instructions in shared programs: 5822418 -> 5812335 (-0.17%)
instructions in affected programs:     351841 -> 341758 (-2.87%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-05 09:49:42 -08:00
Matt Turner
be80f69ecd glsl: Optimize scalar all_equal/any_nequal into equal/nequal.
Cuts an instruction from two shaders in Tesseract, by allowing the
(x+y) cmp 0 -> x cmp -y optimization to take place.

instructions in affected programs:     1198 -> 1194 (-0.33%)

Reviewed-by: Eric Anholt <eric@anholt.net>
2014-12-05 09:49:42 -08:00
José Fonseca
a1fc6a91e5 mesa: Ensure stack is realigned on x86.
Nowadays GCC assumes stack pointer is 16-byte aligned even on 32-bits,
but that is an assumption OpenGL drivers (or any dynamic library for
that matter) can't afford to make as there are many closed- and open-
source application binaries out there that only assume 4-byte stack
alignment.

This fix uses force_align_arg_pointer GCC attribute, and is only a
stop-gap measure.

The right fix would be to pass -mstackrealign or
-mincoming-stack-boundary=2 to all source files that use any -msse*
option, as there is no way to guarantee if/when GCC will decide to spill
SSE registers to the stack.
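
The stop-gap could look roughly like the sketch below. `force_align_arg_pointer` is a real GCC function attribute on x86; the macro name `REALIGN_STACK` and the entry point are hypothetical and are not taken from the patch.

```c
/* Sketch only: REALIGN_STACK is a hypothetical macro name. */
#if defined(__GNUC__) && defined(__i386__)
#define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define REALIGN_STACK
#endif

/* An exported entry point that a caller may enter with only 4-byte stack
 * alignment. The attribute makes the prologue realign the stack, so SSE
 * spills inside this function (or code inlined into it) are safe. */
REALIGN_STACK int
public_entry_point(const float *in, float *out, int n)
{
   for (int i = 0; i < n; i++)
      out[i] = in[i] * 2.0f;
   return n;
}
```

The attribute only protects the functions it is applied to, which is why the message calls it a stop-gap compared with a compiler flag applied to whole source files.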

https://bugs.freedesktop.org/show_bug.cgi?id=86788

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-12-05 15:17:37 +00:00
José Fonseca
f9098f0972 util/primconvert: Avoid pointer arithmetic; apply offset on all cases.
Matches what u_vbuf_get_minmax_index() does.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-12-05 14:44:16 +00:00
Ilia Mirkin
c3bed13604 util/primconvert: take ib offset into account
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-12-05 07:23:48 -05:00
Ilia Mirkin
fb434e675f util/primconvert: support instanced rendering
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-12-05 07:23:48 -05:00
Ilia Mirkin
1dfa039168 util/primconvert: pass index bias through
The index_bias (aka base_vertex) applies to the downstream draw just as
much, since the actual index values are never modified.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-12-05 07:23:48 -05:00
Kenneth Graunke
ae45a5a28d i965: Compute VS attribute WA bits earlier and check if they changed.
BRW_NEW_VERTICES is flagged every time we draw a primitive.  Having
the brw_vs_prog atom depend on BRW_NEW_VERTICES meant that we had to
compute the VS program key and do a program cache lookup for every
single primitive.  This is painfully expensive.

The workaround bit computation is almost entirely based on the vertex
attribute arrays (brw->vb.inputs[i]), which are set by brw_merge_inputs.
The only thing it uses the VS program for is to see which VS inputs are
actually read.  brw_merge_inputs() happens once per primitive, and can
safely look at the currently bound vertex program, as it doesn't change
in the middle of a draw.

This patch moves the workaround bit computation to brw_merge_inputs(),
right after assigning brw->vb.inputs[i], and stores the previous WA bit
values in the context.  If they've actually changed from the last draw
(which is uncommon), we signal that we need a new vertex program,
causing brw_vs_prog to compute a new key.
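
The change-detection idea can be sketched as follows. All names here (`ctx_sketch`, `update_wa_bits`, `DIRTY_VS_ATTRIB_WA`, `MAX_ATTRIBS`) are hypothetical stand-ins rather than the driver's actual structures; the point is only that the expensive path is flagged when the stored bits differ from the new ones.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_ATTRIBS 16                /* hypothetical limit */
#define DIRTY_VS_ATTRIB_WA (1u << 0)  /* stand-in for a dirty bit */

struct ctx_sketch {
   uint8_t wa_bits[MAX_ATTRIBS];  /* workaround bits from the previous draw */
   uint32_t dirty;
};

/* Store the newly computed workaround bits. The dirty bit is raised only
 * when they differ from the previous draw, so the common case skips the
 * program key computation and cache lookup. Returns true on change. */
static bool
update_wa_bits(struct ctx_sketch *ctx, const uint8_t new_bits[MAX_ATTRIBS])
{
   if (memcmp(ctx->wa_bits, new_bits, sizeof(ctx->wa_bits)) == 0)
      return false;

   memcpy(ctx->wa_bits, new_bits, sizeof(ctx->wa_bits));
   ctx->dirty |= DIRTY_VS_ATTRIB_WA;
   return true;
}
```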

Improves performance in Gl32Batch7 by 13.6123% +/- 0.739652% (n=166)
on Haswell GT3e.  I'm told Baytrail shows similar gains.

v2: Introduce a new BRW_NEW_VS_ATTRIB_WORKAROUNDS dirty bit, rather
    than reusing BRW_NEW_VERTEX_PROGRAM (suggested by Chris Forbes).
    This prevents unnecessary re-emission of surface/sampler related
    atoms (and an SOL atom on Sandybridge).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-12-04 17:50:52 -08:00
Matt Turner
0b4a688691 egl/dri2: Log a warning if no platforms are enabled.
If you hit this, you didn't compile with --with-egl-platforms=...

Recompile with something like --with-egl-platforms=x11,drm and make
clean and make again.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-12-04 15:13:51 -08:00
Kenneth Graunke
ca19e89d6e i965: Drop BRW_NEW_VERTEX_PROGRAM and _NEW_TRANSFORM from Gen4 VS state.
These stopped being necessary in commit ab973403e4.

v2: Update commit message with a better explanation (thanks to Eric
    Anholt for doing the git archaeology).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-04 15:04:35 -08:00
Kenneth Graunke
a2dd8ea59a i965: Drop BRW_NEW_VERTEX_PROGRAM from Gen7+ 3DSTATE_VS atoms.
We don't access brw->vertex_program or ctx->_Shader since the previous
commit, so we don't need this dirty bit.

I think it's still necessary on Gen6 because it still conflates
constant uploading with unit state uploading.  We can fix that later.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-04 15:04:35 -08:00
Kenneth Graunke
7b6620faf5 i965: Store floating point mode choice in brw_stage_prog_data.
We use IEEE mode for GLSL programs, but need to use ALT mode for ARB
programs so that 0^0 == 1.  The choice is based entirely on the shader
source language.

Previously, our code to determine which mode we wanted was duplicated
in 8 different places (VS and FS for Gen4-5, Gen6, Gen7, and Gen8).
The ctx->_Shader->CurrentProgram[stage] == NULL check was confusing
as well - we use CurrentProgram (non-derived state), but _Shader
(derived state).  It also relies on knowing that ARB programs don't
use gl_shader_program structures today.  The compiler already makes
this assumption in a few places, but I'd rather keep that assumption
out of the state upload code.

With this patch, we select the mode at compile time, and store that
choice in prog_data.  The state upload code simply uses that decision.
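
A minimal sketch of that split, with hypothetical names (`prog_data_sketch`, `compile_sketch`, `upload_fp_mode_bit`) standing in for the real structures and functions:

```c
#include <stdbool.h>

/* Hypothetical: which language the shader source was written in. */
enum shader_source { SOURCE_GLSL, SOURCE_ARB_PROGRAM };

struct prog_data_sketch {
   bool use_alt_mode;   /* decided once, at compile time */
};

/* Compile time: the choice depends only on the source language. */
static void
compile_sketch(struct prog_data_sketch *pd, enum shader_source src)
{
   pd->use_alt_mode = (src == SOURCE_ARB_PROGRAM);
}

/* State upload: no inspection of the current program, just read the
 * stored decision. Returns the value for a hypothetical mode bit. */
static unsigned
upload_fp_mode_bit(const struct prog_data_sketch *pd)
{
   return pd->use_alt_mode ? 1u : 0u;
}
```

Because the upload path reads only `prog_data`, it no longer needs to be re-run when the bound program object changes but the compiled data does not.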

This eliminates a BRW_NEW_*_PROGRAM dependency in the state upload code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-04 15:04:35 -08:00
Kenneth Graunke
d300e58db0 i965: Make Gen4-5 and Gen8+ ALT checks use ctx->_Shader too.
Commit c0347705 changed the Gen6-7 code to use ctx->_Shader rather than
ctx->Shader, but neglected to change the Gen4-5 or Gen8+ code.

This might fix SSO related bugs, but ALT mode is only used for ARB
programs, so if there's an actual problem, it's likely no one would
run into it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-04 15:04:35 -08:00
Kenneth Graunke
8daf3c53c7 i965: Move PSCDEPTH calculations from draw time to compile time.
The "Pixel Shader Computed Depth Mode" value is entirely based on the
shader program, so we can easily do it at compile time.  This avoids the
if+switch on every 3DSTATE_WM (Gen7)/3DSTATE_PS_EXTRA (Gen8+) upload,
and shares a bit more code.

This also simplifies the PMA stall code, making it match the formula
more closely, and drops a BRW_NEW_FRAGMENT_PROGRAM dependency.  (Note
that the previous comment was wrong - the code and the documentation
have != PSCDEPTH_OFF, not ==.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-04 15:04:35 -08:00
Rob Clark
4265148ac6 freedreno/a4xx: unify vertex/texture formats into a single table
Similar to the scheme that Ilia put in place for a3xx.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-04 16:01:37 -05:00
Rob Clark
e9589a8fcf freedreno/a4xx: fd4_util -> fd4_format
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-04 16:01:37 -05:00
Rob Clark
8bf69a29bb freedreno: update generated headers / a4xx fmt rename
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-04 16:01:37 -05:00
Kenneth Graunke
bcc7eb115e i965: Add var->location != -1 assertions.
We shouldn't receive variables with invalid locations set - adding these
assertions should help catch problems before they cause crashes later.

Inspired by similar code in st_glsl_to_tgsi.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-03 17:45:16 -08:00
Matt Turner
b5b18e4687 i965/fs: Don't offset uniform registers in half().
Half gives you the second half of a SIMD16 register, but if the register
is a uniform it would incorrectly give you the next register.
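
A simplified sketch of the fix, assuming a SIMD16 value occupies two consecutive registers while a uniform is a single scalar. The types and `half_sketch` are hypothetical and far simpler than the real `fs_reg` handling.

```c
/* Hypothetical, simplified register description. */
enum reg_file_sketch { FILE_GRF, FILE_UNIFORM };

struct reg_sketch {
   enum reg_file_sketch file;
   unsigned reg_offset;   /* in whole registers */
};

/* Return the idx-th (0 or 1) SIMD8 half of a SIMD16 register. */
static struct reg_sketch
half_sketch(struct reg_sketch reg, unsigned idx)
{
   /* A uniform is a scalar: both halves read the same value, so
    * offsetting it would select a different uniform entirely. */
   if (reg.file == FILE_UNIFORM)
      return reg;

   reg.reg_offset += idx;
   return reg;
}
```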

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-12-03 16:47:45 -08:00
Rob Clark
c74f2db0a5 freedreno/a4xx: frag-depth fixes
Also seems to fix kill/discard.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-03 16:38:26 -05:00
Ian Romanick
a909b995d9 linker: Assign varying locations to geometry shader inputs for SSO
Previously only geometry shader outputs would be assigned locations if
the geometry shader was the only stage in the linked program.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: pavol@klacansky.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82585
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-12-03 11:33:49 -08:00
Ian Romanick
5eca78a00a linker: Wrap access of producer_var with a NULL check
producer_var could be NULL if consumer_var is not NULL and
consumer_is_fs is false.  This will occur when the producer is NULL and
the consumer is the geometry shader for a program that contains only a
geometry shader.  This will occur starting with the next patch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: pavol@klacansky.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82585
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-12-03 11:33:49 -08:00
Jan Vesely
a2f2eebfdf st/xvmc: Fix compiler warnings
Mostly signed/unsigned comparison

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Christian König <christian.koenig@amd.com>
2014-12-03 17:07:08 +01:00
Axel Davy
712a4c5438 st/nine: Fix vertex declarations for non-standard (usage/index)
Nine code to match vertex declaration to vs inputs was limiting
the number of possible combinations.

Some sm3 games have issues with that, because arbitrary (usage/index)
can be used.

This patch does the following changes to fix the problem:
. Change the numbers given to (usage/index) combinations to uint16
. Do not put limits on the indices when it doesn't make sense
. change the conversion rule (usage/index) -> number to fit all combinations
. Instead of having a table usage_map mapping a (usage/index) number to
an input index, usage_map maps input indices to their (usage/index)

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: Yaroslav Andrusyak <pontostroy@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-12-03 16:39:24 +01:00
Axel Davy
5d6d260833 st/nine: sm1_declusage_to_tgsi, do not restrict indices with TGSI_SEMANTIC_GENERIC
With sm3, you can declare an input/output with a usage and a usage index.

Nine code hardcodes the translation of usage/index to a corresponding TGSI code.
The translation was limited to a few usage/index combinations that covered
most of the needs of games, but some games did not work.

This patch rewrites that Nine code to map all possible usage/index combinations
to TGSI code. The index associated with TGSI_SEMANTIC_GENERIC doesn't need to be low
for good performance, as the old code was supposing, and is not particularly bounded
(it's UINT16). Given the index is BYTE, we can map all combinations.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: Yaroslav Andrusyak <pontostroy@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-12-03 16:39:24 +01:00
Axel Davy
3e1f731d3e st/nine: Queries: Always return D3D_OK when issuing with D3DISSUE_BEGIN
This is the behaviour that Wine tests.

Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-12-03 16:39:24 +01:00
Axel Davy
2f78259c11 st/nine: Queries: always succeed for D3DQUERYTYPE_TIMESTAMP when flushing
This is the behaviour that Wine tests

Tested-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-12-03 16:39:24 +01:00
Axel Davy
225d7f8e0e st/nine: Queries: allow app to call GetData without Issuing first
Nine was allowing that behaviour, but was not filling the result.

Tested-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-12-03 16:39:24 +01:00
Axel Davy
eac0b9b68a st/nine: Queries: Fix D3DISSUE_END behaviour.
Issuing D3DISSUE_END should:
. reset previous queries if possible
. end the query

Previous behaviour wasn't calling end_query for
queries not needing D3DISSUE_BEGIN, nor resetting
previous queries.

This fixes several applications not launching properly.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-12-03 16:39:24 +01:00
Axel Davy
ca0588d1a1 st/nine: Queries: return S_FALSE instead of INVALIDCALL when in building query state
This matches Wine's behaviour.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-12-03 16:39:24 +01:00
Axel Davy
b0302a95ec st/nine: Queries: Use gallium caps to get if queries are supported. (v2)
Some queries need the driver to advertise a cap to be supported.
For example r300 doesn't support them.

v2 (David): check also for PIPE_CAP_QUERY_PIPELINE_STATISTICS, fix wine
            tests on r300g

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-12-03 16:39:24 +01:00
Axel Davy
6b35662e30 st/nine: Queries: Remove flush logic
get_query_result flushes automatically, we don't need to flush.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-12-03 16:39:24 +01:00
Axel Davy
3e48791aea st/nine: Queries: remove dummy queries
Applications are supposed to call CreateQuery with a NULL
ppQuery to know if the query is supported. We supported that.

However, when ppQuery was not NULL, we still accepted the request and
created a dummy query even when the query was not supported.

Wine has different behaviour. This patch drops the dummy queries
support and matches wine behaviour.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-12-03 16:39:23 +01:00
Ilia Mirkin
79f9a106b9 freedreno/a3xx: implement anisotropic filtering
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-12-03 09:23:46 -05:00
Rob Clark
b491d1ca6e freedreno/a4xx: rect textures
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-03 09:22:05 -05:00
Rob Clark
fbba633f2f freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-03 09:22:05 -05:00
Rob Clark
4cfe905a9b freedreno: fix signed vs unsigned lols
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-03 09:22:05 -05:00
José Fonseca
ef7e0b39a2 gallivm: Update for RTDyldMemoryManager becoming a unique_ptr.
Trivial.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=86958
2014-12-03 07:49:47 +00:00
Tapani Pälli
636db35c35 glsl: throw error when using invariant(all) in a fragment shader
Note that some of the GLSL specifications explicitly state this as
compile error, some simply state that 'it is an error'.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-12-03 08:56:19 +02:00
Ben Widawsky
c914247dcb i965/skl: Fix SBE state upload code.
The state upload code was incorrectly shifting the attribute swizzles. The
effect of this is we're likely to get the default swizzle values, which disables
the component.

This doesn't technically fix any bugs since Skylake support is still disabled by
default (no PCI IDs).

While here, since VARYING_SLOT_MAX can be greater than the number of attributes
we have available, add a warning to the code to make sure we never do the wrong
thing (and hopefully prevent further static analysis from finding this).
Admittedly I am a bit confused. It seems to me like the moment a user has
greater than 8 varyings we will hit this condition. CC Ken to clarify.

v2: Forgot to git add the warning message in v1

v3: Change the > 31 varyings to an assertion (Ken)

Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> (via Coverity)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-02 22:11:09 -08:00
Jan Vesely
02cc9e9f9e r600, llvm: Don't leak global symbol offsets
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-12-02 22:32:05 -05:00
Matt Turner
bc3ca485ae i965: Avoid union literal, for old gcc compatibility.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86939
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-02 17:20:16 -08:00
Matt Turner
f0fa6a5e86 i965: Remove tabs from instruction scheduler.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2014-12-02 17:20:16 -08:00
Kenneth Graunke
51f7f613f9 i965/vs: Set brw_vs_prog_key::clamp_vertex_color to 0 when irrelevant.
Vertex color clamping is only relevant if the shader writes to
the built-in gl_[Secondary]{Front,Back}Color varyings.  Otherwise,
brw_vs_prog_key::clamp_vertex_color is never used, so we can simply
leave it set to 0.

This enables us to correctly predict the clamp_vertex_color key value
in the precompile for shaders which don't use those varyings.

Eliminates virtually all VS recompiles in Serious Sam 3's intro.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-12-02 17:00:26 -08:00
Kenneth Graunke
afd605f346 i965: Make vertex color clamp handling code VS specific.
Vertex color clamping only applies to gl_[Secondary]{Front,Back}Color,
which are compatibility-only built-in varyings.  We only support GS in
core profile, so they can't exist in geometry shaders.

We can drop several dirty bits from the GS program key - they're
unnecessary for a core profile implementation.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-12-02 17:00:26 -08:00
Kenneth Graunke
169b6c1955 i965/vs: Handle vertex color clamping in emit_urb_slot().
Vertex color clamping only applies to a few specific built-ins: COL0/1
and BFC0/1 (aka gl_[Secondary]{Front,Back}Color).  It seems weird to
handle special cases in a function called emit_generic_urb_slot().

emit_urb_slot() is all about handling special cases, so it makes more
sense to handle this there.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-12-02 17:00:26 -08:00
Kenneth Graunke
793ac67d3d i965: Use the enum type for gen6_gather_wa sampler key field.
Requested by Matt Turner.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-02 17:00:26 -08:00
Kenneth Graunke
e5e466c954 i965: Drop use of GL types in program keys.
This is really far removed from the API; we should just use C types.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-12-02 17:00:26 -08:00
Kenneth Graunke
a64f3ba3d1 i965: Move program key structures to brw_program.h.
With fs_visitor/fs_generator being reused for SIMD8 VS/GS programs,
we're running into weird #include patterns, where scalar code #includes
brw_vec4.h and such.

Program keys aren't really related to SIMD4X2/SIMD8 execution - they
mostly capture NOS for a particular shader stage.  Consolidating them
all in one place that's vec4/scalar neutral should help avoid problems.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-12-02 17:00:26 -08:00
Kenneth Graunke
5f34a18f96 i965: Delete brw_state_flags::cache and related code.
It's been merged into brw_state_flags::brw for simplicity and
efficiency.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-02 17:00:26 -08:00
Kenneth Graunke
4f24c168c8 i965: Move BRW_NEW_*_PROG_DATA flags to .brw (not .cache).
I put the BRW_NEW_*_PROG_DATA flags at the beginning so that
brw_state_cache.c can still continue using 1 << brw_cache_id.

I also added a comment explaining the difference between
BRW_NEW_*_PROG_DATA and BRW_NEW_*_PROGRAM, as it took me a long time
to remember it.

Non-mechanical changes:
- brw_state_cache.c and brw_ff_gs.c now signal .brw, not .cache.
- brw_state_upload.c - INTEL_DEBUG=state changes.
- brw_context.h - bit definition merging.

v2: Correct the explanation of BRW_NEW_*_PROG_DATA to mention
    state-based recompiles, and nix the "proper subset" claim,
    as it's false. (Caught by Kristian Høgsberg).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-02 17:00:26 -08:00
Kenneth Graunke
ce44b2061c i965: Rename CACHE_NEW_*_PROG to BRW_NEW_*_PROG_DATA.
Now that we've moved a bunch of CACHE_NEW_* bits to BRW_NEW_*, the only
ones that are left are legitimately related to the program cache.  Yet,
it seems a bit wasteful to have an entire bitfield for only 7 bits.

State upload is one of the hottest paths in the driver.  For each atom
in the list, we call check_state() to see if it needs to be emitted.
Currently, this involves comparing three separate bitfields (mesa, brw,
and cache).  Consolidating the brw and cache bitfields would save a
small amount of CPU overhead per atom.  Broadwell, for example, has
57 state atoms, so this small savings can add up.
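
The per-atom test after consolidation might look like the sketch below. The struct layout, the field widths and the name `check_state_sketch` are assumptions made for illustration; they are not copied from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout after merging the cache bits into brw. */
struct state_flags_sketch {
   uint32_t mesa;
   uint64_t brw;    /* also carries the former cache bits */
};

/* An atom must be re-emitted if any bit it depends on is dirty.
 * With the cache field merged away this is two AND operations per atom
 * instead of three. */
static bool
check_state_sketch(const struct state_flags_sketch *dirty,
                   const struct state_flags_sketch *atom)
{
   return (dirty->mesa & atom->mesa) != 0 ||
          (dirty->brw & atom->brw) != 0;
}
```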

CACHE_NEW_*_PROG covers the brw_*_prog_data structures, as well as the
offset into the program cache BO (prog_offset).  Since most uses refer
to brw_*_prog_data, I decided to use BRW_NEW_*_PROG_DATA as the name.

Removing "cache" completely is a bit painful, so I decided to do it in
several patches for easier review, and to separate mechanical changes
from manual ones.  This one simply renames things, and was made via:

$ for file in *.[ch]; do
      sed -i -e 's/CACHE_NEW_\([A-Z_\*]*\)_PROG/BRW_NEW_\1_PROG_DATA/g' \
             -e 's/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/g' $file
  done

Note that BRW_NEW_*_PROG_DATA is still in .cache, not .brw!
The next patch will remedy this flaw.  It will also fix the
alphabetization issues.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-12-02 17:00:26 -08:00
Kenneth Graunke
2a4f5728ad i965: Remove "disable_derivative_optimization" driconf option.
This was added in September 2013 when we first implemented the fast
(but lower quality) derivatives.  A quick Google search didn't turn
up anyone using or recommending the option, so I suspect no one does.

Applications that want to control the quality of their derivatives can
use the new GL_ARB_derivative_control extension, or use the glHint
mechanism.  The driconf option seems superfluous.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-02 17:00:26 -08:00
Ian Romanick
0391d1bbea i965: Just return void from brw_try_draw_prims
Note from Ken:

    "We used to use the return value to indicate whether software
    fallbacks were necessary, but we haven't in years."

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-02 12:16:28 -08:00
Ian Romanick
9fd398215d mesa: Use current Mesa coding style in check_valid_to_render
This makes some other patches (still in my local tree) a bit cleaner.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-02 12:16:28 -08:00
Ian Romanick
331b0120d1 mesa: Use unreachable instead of assert in check_valid_to_render
This is generally the preferred style these days.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-02 12:16:28 -08:00
Ian Romanick
304c466bd8 mesa: Silence unused parameter warnings in _mesa_validate_Draw functions
../../src/mesa/main/api_validate.c: In function '_mesa_validate_DrawElements':
../../src/mesa/main/api_validate.c:376:37: warning: unused parameter 'basevertex' [-Wunused-parameter]
../../src/mesa/main/api_validate.c: In function '_mesa_validate_MultiDrawElements':
../../src/mesa/main/api_validate.c:394:65: warning: unused parameter 'basevertex' [-Wunused-parameter]
../../src/mesa/main/api_validate.c: In function '_mesa_validate_DrawRangeElements':
../../src/mesa/main/api_validate.c:452:35: warning: unused parameter 'basevertex' [-Wunused-parameter]
../../src/mesa/main/api_validate.c: In function '_mesa_validate_DrawArrays':
../../src/mesa/main/api_validate.c:473:25: warning: unused parameter 'start' [-Wunused-parameter]
../../src/mesa/main/api_validate.c: In function '_mesa_validate_DrawElementsInstanced':
../../src/mesa/main/api_validate.c:590:44: warning: unused parameter 'basevertex' [-Wunused-parameter]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-02 12:16:28 -08:00
Ian Romanick
5e72886db0 mesa: Refactor common validation code to validate_DrawElements_common
Most of the code in _mesa_validate_DrawElements,
_mesa_validate_DrawRangeElements, and
_mesa_validate_DrawElementsInstanced was the same.  Refactor this out to
common code.

As a side-effect, a bug in _mesa_validate_DrawElementsInstanced was
fixed.  Previously this function would not generate an error when
check_valid_to_render failed if numInstances was 0.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-02 12:16:28 -08:00
Ian Romanick
b93dcb0e71 mesa: Generate GL_INVALID_OPERATION when drawing w/o a VAO in core profile
GL 3-ish versions of the spec are less clear that an error should be
generated here, so Ken (and I during review) just missed it in 1afe335.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-02 12:16:28 -08:00
Brian Paul
4e6244e80f mesa: fix height error check for 1D array textures
height=0 is legal for 1D array textures (as depth=0 is legal for
2D arrays).  Fixes new piglit ext_texture_array-errors test.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2014-12-02 10:00:03 -07:00
Jan Vesely
ca0616f17e r600, llvm: Fix mem leak
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-12-02 11:30:13 -05:00
EdB
745b1f5503 clover: clCompileProgram CL_INVALID_COMPILER_OPTIONS
clCompileProgram should return CL_INVALID_COMPILER_OPTIONS
instead of CL_INVALID_BUILD_OPTIONS

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-12-02 11:05:03 -05:00
Eric Anholt
29c7cf2b2b vc4: Pair up QPU instructions when scheduling.
We've got two mostly-independent operations in each QPU instruction, so
try to pack two operations together.  This is fairly naive (doesn't track
read and write separately in instructions, doesn't convert ADD-based MOVs
into MUL-based movs, doesn't reorder across uniform loads), but does show
a decent improvement on shader-db-2.

total instructions in shared programs: 59583 -> 57651 (-3.24%)
instructions in affected programs:     47361 -> 45429 (-4.08%)
2014-12-01 22:29:42 -08:00
Dave Airlie
7b0067d23a r600g/sb: fix issues caused by GLSL switching to loops for switch
Since 73dd50acf6
glsl: implement switch flow control using a loop

The SB backend was falling over in an assert or crashing.

Tracked this down to the loops having no repeats, but requiring
a working break.  Initial code just called the loop handler for
all non-if statements, but this caused a regression in
tests/shaders/dead-code-break-interaction.shader_test.
So I had to add further code to detect if all the departure
nodes are empty and avoid generating an empty loop for that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86089
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-By: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-12-02 13:57:27 +10:00
Rob Clark
036f434ac2 freedreno/a4xx: alpha blend fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-01 20:31:23 -05:00
Rob Clark
a7d91c33c2 freedreno/a4xx: fix DRAW initiator encoding of index size
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-01 20:31:23 -05:00
Rob Clark
81194ac767 freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-01 20:31:23 -05:00
Matt Turner
5df88c2096 i965/vec4: Rewrite dead code elimination to use live in/out.
Improves 359 shaders by >=10%
         114 shaders by >=20%
          91 shaders by >=30%
          82 shaders by >=40%
          22 shaders by >=50%
           4 shaders by >=60%
           2 shaders by >=80%

total instructions in shared programs: 5845346 -> 5822422 (-0.39%)
instructions in affected programs:     364979 -> 342055 (-6.28%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-01 16:42:13 -08:00
Matt Turner
7a5cc789de i965/vec4: Track liveness of the flag register.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-01 16:42:13 -08:00
Matt Turner
b449366587 i965/fs: Remove opt_drop_redundant_mov_to_flags().
Dead code elimination now handles this.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-01 16:42:13 -08:00
Matt Turner
b37273b924 i965/fs: Use const fs_reg & rather than a copy or pointer.
Also while we're touching var_from_reg, just make it an inline function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-01 16:42:13 -08:00
Matt Turner
60d507c3c5 i965/fs: Dead code eliminate instructions writing the flag.
Most prominently helps Natural Selection 2, which has a surprising
number of shaders that do very complicated things before drawing black.

instructions in affected programs:     21052 -> 16978 (-19.35%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-01 16:42:13 -08:00
Matt Turner
bf8deb5514 i965/fs: Track liveness of the flag register.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-01 16:42:13 -08:00
Matt Turner
13f6601585 i965: Use local pointer to block_data in live intervals.
The next patch will be simplified because of this, and makes reading the
code a lot easier.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-01 16:42:13 -08:00
Matt Turner
a50915984f i965/vec4: Make live_intervals part of the vec4_visitor class.
Like in fs_visitor.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-01 16:42:13 -08:00
Matt Turner
e4d0299089 i965/fs: Treat the FB_WRITE as predicated if we're discarding.
Pre-Haswell hardware couldn't actually predicate it, but it's easier to
pretend as if it's predicated in the visitor since it will generate a
MOV from f0.1.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-01 16:42:13 -08:00
Matt Turner
f1e5418f40 i965: Don't treat IF or WHILE with cmod as writing the flag.
Sandybridge's IF and WHILE instructions can do an embedded comparison
with conditional mod.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-01 16:42:12 -08:00
Matt Turner
937ddb419d i965/disasm: Disassemble tdr and tm registers properly.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-01 16:42:12 -08:00
Jordan Justen
cd1b0f04be main, glsl: Bump max known desktop glsl version to 4.50
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-01 16:20:21 -08:00
Jordan Justen
307d22abb0 glsl/cs: Change gl_WorkGroupSize from ivec3 to uvec3
As documented in:

https://www.opengl.org/registry/specs/ARB/compute_shader.txt

  const uvec3 gl_WorkGroupSize;

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-01 16:20:21 -08:00
Jonathan Gray
31a46fb7a5 i965: avoid anonymous struct in float <-> VF conversions
Anonymous structures are only supported with newer versions of
GCC.  They will not work with GCC 4.2.1 used by OpenBSD or
GCC 4.4.7 shipped with RHEL6 going by a commit to fix a similar
problem in radeonsi earlier in the year
(74388dd24b).
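
A minimal sketch of the portable pattern, assuming a float/bit-field
union: the inner struct gets a member name ("s" here, chosen for
illustration) so no anonymous-struct support is needed.  The bit-field
order shown assumes a little-endian GCC-style layout and is not
guaranteed by the C standard.

```c
#include <assert.h>

/* Illustrative union; the inner struct is a *named* member. */
union float_bits {
    float f;
    struct {
        unsigned mantissa : 23;
        unsigned exponent : 8;
        unsigned sign     : 1;
    } s;
};

/* Return the biased IEEE-754 exponent of a float via the named member. */
static inline unsigned float_exponent(float value)
{
    union float_bits fb;

    assert(sizeof(fb) == sizeof(float));
    fb.f = value;
    return fb.s.exponent;   /* fb.exponent would need an anonymous struct */
}
```

With an anonymous struct the access would read fb.exponent; naming the
member costs one extra ".s" per access but compiles on older compilers.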

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
2014-12-01 16:13:08 -08:00
Brian Paul
991d5cf8ce mesa: fix arithmetic error in _mesa_compute_compressed_pixelstore()
We need parentheses around the expression which computes the number of
blocks per row.
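
To show why the grouping matters, here is one plausible shape of such a
precedence bug.  The helper names and the exact expression are
assumptions for illustration; the real code in
_mesa_compute_compressed_pixelstore() may differ.

```c
#include <assert.h>

/* Hypothetical helper: bytes in one row of a block-compressed image. */
static inline int row_bytes_grouped(int width, int block_w, int block_bytes)
{
    assert(block_w > 0);
    /* Round up to whole blocks first, then scale by the block size. */
    return block_bytes * ((width + block_w - 1) / block_w);
}

/* Same terms without the inner parentheses: the multiply happens
 * before the divide, so the round-up no longer yields whole blocks. */
static inline int row_bytes_ungrouped(int width, int block_w, int block_bytes)
{
    assert(block_w > 0);
    return block_bytes * (width + block_w - 1) / block_w;
}
```

For width 10, 4-wide blocks and 8 bytes per block, the grouped form
gives 8 * 3 = 24, while the ungrouped form gives 104 / 4 = 26.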

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
2014-12-01 16:30:55 -07:00
Brian Paul
691170b9c7 vbo: also print buffer object pointer in vbo_print_vertex_list()
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-01 16:30:39 -07:00
Brian Paul
1e14aaa8f9 mesa: some improvements for print_list()
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-01 16:30:17 -07:00
Brian Paul
c407c6d588 mesa: inline/remove _mesa_polygon_stipple()
Was not called from any other place.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-01 16:30:12 -07:00
Brian Paul
f54162857c svga: fix comment typo 2014-12-01 16:30:12 -07:00
Brian Paul
953847e5a8 mesa: remove unused functions in prog_execute.c
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-12-01 16:29:55 -07:00
Brian Paul
cd8a7258b8 mesa: update glext.h to version 20141118 2014-12-01 15:22:20 -07:00
Brian Paul
ded14afa42 gallium: add include path to fix building of pipe-loader code
The pipe-loader code wasn't finding util/u_atomic.h

Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-01 15:22:08 -07:00
José Fonseca
0806bf8815 graw: Avoid 'near'/'far' variables.
They are defined by windows.h, which got included slightly more
frequently than before with u_atomic.h
2014-12-01 20:24:51 +00:00
Matt Turner
120426b13d i965/fs: Clean up some whitespace in reg_allocate.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-12-01 11:32:56 -08:00
Matt Turner
2e007fd621 ra: Don't use regs as the ralloc context.
The i965 backends pass something out of 'screen', which is allocated
per-process, making using this as a ralloc context not thread-safe.

All callers of ra_alloc_interference_graph() already ralloc_free() its
return value.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-12-01 11:32:54 -08:00
Matt Turner
933c678776 i965: Initialize INTEL_DEBUG once per process.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-12-01 11:32:52 -08:00
Matt Turner
82811ff176 i965: Initialize compaction tables once per process.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-12-01 11:32:51 -08:00
Matt Turner
9db278d0e2 glsl: Initialize static temporaries_allocate_names once per process.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-12-01 11:32:48 -08:00
José Fonseca
a5299e9e1c util/u_atomic: Fix the unlocked implementation.
It was totally broken:

- p_atomic_dec_zero() was returning the negation of the expected value

- p_atomic_inc_return()/p_atomic_dec_return() was
  post-incrementing/decrementing, hence returning the old value instead
  of the new

- p_atomic_cmpxchg() was returning the new value on success, instead of
  the old

It is clear this was never used in the past. I wonder if it wouldn't be better to
yank it altogether.
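
The return-value semantics listed above can be sketched with plain,
non-atomic stand-ins.  The function names below are hypothetical and
this is not the u_atomic.h code itself; it only spells out what each
call is expected to return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decrement and report whether the counter reached zero. */
static inline bool unlocked_dec_zero(int32_t *v)
{
    assert(v);
    return --(*v) == 0;
}

/* Pre-increment/decrement, so the NEW value is returned. */
static inline int32_t unlocked_inc_return(int32_t *v) { return ++(*v); }
static inline int32_t unlocked_dec_return(int32_t *v) { return --(*v); }

/* Store new_val only if *v == old; always return the OLD value. */
static inline int32_t unlocked_cmpxchg(int32_t *v, int32_t old,
                                       int32_t new_val)
{
    int32_t prev = *v;

    if (prev == old)
        *v = new_val;
    return prev;
}
```

A caller can tell that a compare-and-exchange succeeded by checking that
the returned value equals the "old" argument it passed in.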

Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-01 11:28:45 -08:00
José Fonseca
ff80b92a58 util/u_atomic: Add a simple test.
It was much easier for me to verify things build and run as expected
with this simple test, than building and testing whole Mesa.

With scons the test can be built and run merely by doing:

  scons u_atomic_test

Building the test with autotools is left as a future exercise.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-12-01 11:28:45 -08:00
Matt Turner
6df72e970c util: Make u_atomic.h typeless.
like how C11's stdatomic.h provides generic functions. GCC's __sync_*
builtins already take a variety of types, so that's simple.

MSVC and Sun Studio don't, but we can implement it with something that
looks a little crazy but is actually quite readable.
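
For the GCC side, a minimal sketch of what "typeless" can look like: the
same wrapper macro works on 32-bit and 64-bit operands because the
__sync_* builtins are overloaded on the pointee type.  The macro and
function names are made up for this example and are not the ones in
u_atomic.h; the MSVC and Sun Studio paths are not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrappers over the GCC builtins. */
#define my_atomic_inc_return(v) __sync_add_and_fetch((v), 1)
#define my_atomic_cmpxchg(v, old, new_val) \
    __sync_val_compare_and_swap((v), (old), (new_val))

/* Use the same macro on two differently sized counters. */
static inline int64_t bump_both(int32_t *a, int64_t *b)
{
    assert(a && b);
    return (int64_t)my_atomic_inc_return(a) + my_atomic_inc_return(b);
}
```

This assumes a GCC-compatible compiler and a target with native 64-bit
atomics; on other targets the 64-bit case may need library support.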

Thanks to Jose for some MSVC fixes!

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-01 11:28:45 -08:00
Matt Turner
41b5858a2f util: Use stdbool.h's bool rather than "boolean".
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-01 11:28:45 -08:00
Matt Turner
2879a77a37 util: Remove u_atomic.h's GCC inline assembly.
GCC >= 4.1 support the __sync_* intrinsics. That seems like a
sufficiently old baseline.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-01 11:28:45 -08:00
Matt Turner
972f8458f1 util: Remove u_atomic.h's MSVC inline assembly.
There was already an intrinsics path that implemented all of the same
functions, plus more.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-01 11:28:45 -08:00
Matt Turner
504062be2a util: Remove u_atomic.h's Gallium dependence.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-01 11:28:45 -08:00
Matt Turner
4abd20e261 util: s/INLINE/inline/ in u_atomic.h.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-01 11:28:44 -08:00
Matt Turner
ccad3829e3 util: Move u_atomic.h to src/util.
To be shared outside of Gallium.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-12-01 11:28:44 -08:00
Eric Anholt
3fe4d8e1e3 vc4: Introduce scheduling of QPU instructions.
This doesn't reschedule much currently, just tries to fit things into the
regfile A/B write-versus-read slots (the cause of the improvements in
shader-db), and hide texture fetch latency by scheduling setup early and
results collection late (haven't performance tested it).  This
infrastructure will be important for doing instruction pairing, though.

shader-db2 results:
total instructions in shared programs: 61874 -> 59583 (-3.70%)
instructions in affected programs:     50677 -> 48386 (-4.52%)
2014-12-01 11:00:23 -08:00
Eric Anholt
6958c404ca vc4: Drop the explicit scoreboard wait.
This is actually implicitly handled by the TLB operations.
2014-12-01 11:00:23 -08:00
Eric Anholt
334036fb64 vc4: Also deal with VPM reads at thread end.
Prevents a regression with QPU scheduling, which happens to put the no-op
reads for unused VPM contents at the end of the program.
2014-12-01 11:00:23 -08:00
Eric Anholt
a7b1a93137 vc4: Fix assertion about SFU versus texturing.
We're supposed to be checking that nothing else writes r4, which is done
by the TMU result collection signal, not the coordinate setup.

Avoids a regression when QPU instruction scheduling is introduced.
2014-12-01 11:00:23 -08:00
Eric Anholt
2d5784c825 vc4: Add another check for invalid TLB scoreboard handling.
This was caught by an assertion in the simulator.
2014-12-01 11:00:23 -08:00
Rob Clark
bb19f2c3c4 freedreno/a4xx: invalidate cache when vbo's change
Otherwise vertex shader can see stale cache data.  This in particular
happens when the same vbo is updated and reused.  Not sure yet if vbo's
at differing addresses but bound to same vertex buffer slot could have
issues, but seems safest to flush whenever new vertex buffers are bound.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-01 12:02:25 -05:00
Ilia Mirkin
ebbd34a468 st/mesa: avoid exposing EXT_texture_integer for pre-GLSL 1.30
For drivers building up to GL(ES)3, only expose the actual extension if
the API will let it be used (e.g. via overrides/debug flags that enable
higher versions).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-11-30 13:04:29 -05:00
Ilia Mirkin
4907c31385 freedreno/a3xx: add missing integer formats and enable rendering
The mesa state tracker doesn't fall back on similar integer formats, so
they must all be provided. Remove the restriction against integer color
rendering.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-11-30 13:04:28 -05:00
Ilia Mirkin
82104c19f3 freedreno/a3xx: enable sampling from integer textures
We need to produce a u32 destination type on integer sampling
instructions, so keep that in a shader key set based on the
currently-bound textures.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-11-30 13:04:28 -05:00
Ilia Mirkin
8e336ef55b freedreno: allow each generation to hook into sampler view setting
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-11-30 13:04:28 -05:00
Ilia Mirkin
618ff11457 freedreno/a3xx: don't use half precision shaders for int/float32
Integer outputs end up getting mangled due to cov.f32f16, and float32
loses precision. Use full precision shaders in both of those cases.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-11-30 13:04:28 -05:00
Ilia Mirkin
f866446e8c freedreno/a3xx: disable blending for integer formats
Also add support for the BLENDABLE bind flag, similarly predicated on
non-int formats.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-11-30 13:04:28 -05:00
Ilia Mirkin
8e147e9ec8 freedreno/a3xx: remove blend clamp enables from gmem/clears
Just pass the data through unmolested. This probably has no effect since
blending isn't actually enabled.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-11-30 13:00:41 -05:00
Ilia Mirkin
d63afe3b58 freedreno/a3xx: add format to emit info, use to set sint/uint flags
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-11-30 13:00:41 -05:00
Ilia Mirkin
5d95e99622 freedreno/a3xx: add 16-bit unorm/snorm texture formats
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-11-30 13:00:41 -05:00
Ilia Mirkin
547182977f freedreno/ir3: remove unused arg parameter
Leaving it around in the struct in case we want to use it later.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-11-30 13:00:22 -05:00
Ilia Mirkin
de83ef677f freedreno/ir3: fix UMAD
Looks like none of the mad variants do u16 * u16 + u32, so just add in
the extra value "by hand".

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
2014-11-30 13:00:22 -05:00
Rob Clark
66f694b16c freedreno/a4xx: stencil fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-11-30 10:44:09 -05:00
Rob Clark
5b46670487 freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-11-30 10:44:03 -05:00
Rob Clark
3e698ebf44 freedreno/a4xx: add render target format to fd4_emit
This lets us move emitting SP_FS_MRT_REG back to fd4_program_emit.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-11-30 10:43:50 -05:00
Ilia Mirkin
4aec928ca4 freedreno/a3xx: unify vertex/texture formats into a single table
The table contains all the relevant information about each format. The
helper functions now just do lookups in the table.

Note that this adds support for a lot of formats that were previously
unsupported. Additionally it adds disabled support for integer render
buffers, which will require more work to actually enable.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-11-29 12:15:43 -05:00
Ilia Mirkin
20fbf99595 freedreno/a3xx: rename vertex/texture format enums to be more consistent
Switch both of them from independently inconsistent conventions to having
UINT/SINT/UNORM/SNORM/FLOAT/FIXED suffixes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-11-29 12:15:43 -05:00
Ilia Mirkin
3338bfcf49 freedreno/a3xx: fd3_util -> fd3_format
All the "util" helpers are actually format-related

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-11-29 12:15:43 -05:00
Ilia Mirkin
3de9fa8ff4 freedreno/a3xx: only enable blend clamp for non-float formats
This fixes arb_color_buffer_float-render GL_RGBA16F.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-11-29 12:15:43 -05:00
Kenneth Graunke
67c498086d i965: Add _CACHE_ in brw_cache_id enum names.
BRW_CACHE_VS_PROG is more easily associated with program caches than
plain BRW_VS_PROG.

While we're at it, rename BRW_WM_PROG to BRW_CACHE_FS_PROG, to move away
from the outdated Windowizer/Masker name.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-29 02:18:47 -08:00
Kenneth Graunke
e563c33d57 i965: Move CACHE_NEW_SAMPLER to BRW_NEW_SAMPLER_STATE_TABLE.
This flag signifies that we've emitted a new SAMPLER_STATE table.
Given that we haven't cached those in years, CACHE_NEW_SAMPLER isn't
a great name.  Putting it in the BRW_NEW_* hierarchy would make more
sense; BRW_NEW_SAMPLER_STATE_TABLE better reflects its actual purpose.

When this flag is raised, the pointer to the SAMPLER_STATE table has
changed, so we need to re-issue any packets which point to it (unit
state on Gen4-5, 3DSTATE_SAMPLER_STATE_POINTERS on Gen6, and the
per-stage variants on Gen7+).

Saves 2 * sizeof(void *) bytes per context, as we remove useless
aux_compare/aux_free function pointers.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-29 02:18:46 -08:00
Kenneth Graunke
324368b500 i965: Move some /* CACHE_NEW_SAMPLER */ comments.
Marking brw_stage_state::sampler_count as CACHE_NEW_SAMPLER is wrong.

The number of samplers used by each program is actually computed at
draw time (brw_try_draw_prims), based purely on the currently bound
shader programs (gl_program::SamplersUsed).

CACHE_NEW_SAMPLER means that we've emitted a new SAMPLER_STATE table.
Although this could indicate that the number of samplers has changed,
it could also simply mean that the contents of the table have changed
(i.e. we've bound different textures).

The real reason these atoms depend on CACHE_NEW_SAMPLER is because they
include a pointer to the SAMPLER_STATE table.  This was not commented.

So, move the comments to the appropriate place.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-29 02:18:44 -08:00
Kenneth Graunke
66ebfad3cd i965: Move CACHE_NEW_*_VP flags to BRW_NEW_*_VP.
We've been streaming these out for ages, so they basically have nothing
to do with brw_state_cache.c.

Saves 6 * sizeof(void *) bytes per context, as we won't have useless
aux_compare/aux_free functions for them.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-29 02:18:42 -08:00
Kenneth Graunke
4d67b6ab9a i965: Fold the gen7_cc_viewport_state_pointer atom into brw_cc_vp.
These always happen together; the extra atom just means another item to
iterate through, flags to check, and a call through a function pointer.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-29 02:18:40 -08:00
Kenneth Graunke
f421db70ba i965: Combine CACHE_NEW_*_UNIT into BRW_NEW_GEN4_UNIT_STATE.
On Gen4-5, unit state is specified as indirect state, rather than
commands.  If any unit state changes, we upload it via brw_state_batch
and arrange for 3DSTATE_PIPELINED_POINTERS to be re-emitted, which
updates pointers to all unit state at once.

Since there's only one command and state atom (brw_psp_urb_cs) that
needs to know about this, there's no benefit to having six separate
flags.  We can combine CACHE_NEW_*_UNIT into a single flag.

We also haven't cached these in a long time, so it doesn't make sense
to use the "CACHE_NEW_" prefix.  Instead, use the "BRW_NEW_" prefix.

This also saves 12 * sizeof(void *) bytes of memory per context, as
we remove useless aux_compare/aux_free functions for each CACHE bit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-29 02:18:38 -08:00
Kenneth Graunke
bea9b8e306 i965: Alphabetize brw_tracked_state flags and use a consistent style.
Most of the dirty flags were listed in some arbitrary order.  Some used
bonus parentheses.  Some put multiple flags on one line, others put one
per line.  Some used tabs instead of spaces...but only on some lines.

This patch settles on one flag per line, in alphabetical order, using
spaces instead of tabs, and sheds the unnecessary parentheses.

Sorting was mostly done with vim's visual block feature and !sort,
although I alphabetized short lists by hand; it was pretty manual.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-29 02:18:36 -08:00
Christoph Bumiller
f3b4b263c2 nv50/ir/tgsi: handle TGSI_OPCODE_ARR
This instruction is used by st/nine.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2014-11-28 19:17:52 -05:00
Kenneth Graunke
133280120b i965: Set prog_data->uses_kill if simulating alpha test via discards.
When using MRT on Gen4-5, we have to simulate GL's alpha test feature
by emitting discards in the fragment shader.  In this case, it makes
sense to set prog_data->uses_kill, which means the fragment shader may
kill pixels via the discard mechanism.

This saves us from having to look at an extra key value in a couple of
places, including in the generator.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-27 20:25:24 -08:00
Kenneth Graunke
06372c3fa9 i965: Use brw_wm_prog_data::uses_kill, not gl_fragment_program::UsesKill
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-27 20:25:23 -08:00
Kenneth Graunke
a0f8b363c0 i965/fs: Pass key->render_to_fbo via src1 of FS_OPCODE_DDY_*.
This means the generator doesn't have to look at the key, which is a
little nicer - we're pretty close to no key dependencies at all.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-27 20:25:19 -08:00
Kenneth Graunke
cea37f0911 i965/fs: Handle derivative quality decisions in the front-end.
Kristian noted that there's very little use of brw_wm_prog_key in the
generator, and that it basically just generates what it's told, without
caring about what stage it's handling.

One exception to this is derivative handling.  When handling dFdxCoarse
and dFdxFine, we packed an enum value in a second source register,
explicitly telling the generator what to do.  For dFdx, we specified an
enum value of "please use the hint", then checked the program key in the
generator level code.

A natural method is to define separate FS_OPCODE_DD[XY]_{COARSE,FINE}
opcodes, and have the front-end (which already decides what IR to
generate based on the program key) decide which dPdx/dPdy should
correspond to.  This consolidates the decision making in one place.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-27 20:25:14 -08:00
Kenneth Graunke
2315ae6653 i965: Create prog_data temporary variables in PS state upload code.
prog_data->foo is a bit more readable than brw->wm.prog_data->foo.
The local variable definition is also a great location to put the
obligatory /* CACHE_NEW_WM_PROG */ comment.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-11-27 20:24:24 -08:00
Kenneth Graunke
6a1c1fd503 i965: Fix missing CACHE_NEW_WM_PROG in 3DSTATE_PS_EXTRA.
brw->wm.prog_data is covered by CACHE_NEW_WM_PROG, not
BRW_NEW_FRAGMENT_PROGRAM.  So, we should listen to it.

However, I believe that BRW_NEW_FRAGMENT_PROGRAM is sufficient to cover
all the necessary cases - CACHE_NEW_WM_PROG happens in a subset of
cases.  So, the code being wrong shouldn't have triggered bugs.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-11-27 20:24:15 -08:00
Ilia Mirkin
e928b1e65b nv50: remove ancient map of rt formats
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-11-27 16:51:31 -05:00
Ilia Mirkin
37fe347542 freedreno/ir3: don't pass consts to madsh.m16 in MOD logic
madsh.m16 can't handle a const in src1, make sure to unconst it

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
2014-11-27 14:25:36 -05:00
Romain Failliot
b340469f33 docs: Set llvmpipe and softpipe note only for MSAA.
Right now, in mesamatrix.net, the footnote is set so that it seems to be
for all the features, while actually it only applies to MSAA.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-11-27 18:03:26 +01:00
Neil Roberts
c97cbd7e3d glsl: Use | action in the lexer source to avoid duplicating the float action
Flex and lex have a special action ‘|’ which means to use the same action as
the next rule. We can use this to reduce a bit of code duplication in the
rules for the various float literal formats.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-27 11:43:59 +00:00
Neil Roberts
9d8aa88693 glsl: Disallow float literals with the 'f' suffix but no point or exponent
According to the GLSL spec float literals like ‘1f’ shouldn't be allowed
without adding a decimal point or an exponent. Apparently the AMD driver also
disallows this so it seems unlikely that anything would be relying on it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-27 11:43:17 +00:00
Dave Airlie
91a827624c r600g: make llvm code compile this time
Actually compiling the code helps make it compile.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-27 14:11:23 +10:00
Dave Airlie
b10ddf962f r600g: fix fallout from last patch
I accidentally rebased from the wrong machine and missed some
fixes that were on my r600 box.

doh.

this fixes a bunch of geom shader textureSize tests on rv635
from gpu reset to pass.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86760
Reported-by: wolput@onsneteindhoven.nl
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-27 13:12:41 +10:00
Dave Airlie
07ae69753c r600g: merge the TXQ and BUFFER constant buffers (v1.1)
We are using 1 more buffer than we have, although in the future the
driver should just end up using one buffer in total probably, this
is a good first step, it merges the txq cube array and buffer info
constants on r600 and evergreen.

This should in theory fix geom shader tests on r600.

v1.1: fix comments from Glenn.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-27 10:31:38 +10:00
Matt Turner
bc5f5424e3 glapi: Remove dead mesadef.py.
Dead since commit 4e120c97, in which apiparser (which mesadef.py imports)
was removed.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2014-11-26 20:31:15 +00:00
José Fonseca
37b2a29d3b mesa/gdi: Don't pretend mesa.def is auto generated.
Just use the same entrypoints we use for st/wgl's opengl32.dll.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-11-26 20:31:14 +00:00
José Fonseca
cb009bdd44 st/wgl: Don't export wglGetExtensionsStringARB.
It's not exported by the official opengl32.dll either.  Applications are
supposed to get it via wglGetProcAddress(), not GetProcAddress().

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-11-26 20:31:11 +00:00
José Fonseca
5fdb6d6839 mapi/glapi: Fix dll linkage of GLES1 symbols.
This fixes several MSVC warnings like:

  warning C4273: 'glClearColorx' : inconsistent dll linkage

In fact, we should avoid using `__declspec(dllexport)` altogether, and use
exclusively the .DEF instead, which gives more precise control of which
symbols must be exported, but all the public GL/GLES headers practically
force us to pick between `__declspec(dllexport)` or
`__declspec(dllimport)`.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-11-26 20:31:07 +00:00
José Fonseca
4b6e93650c util/u_snprintf: Don't redefine HAVE_STDINT_H as 0.
We now always guarantee availability of stdint.h on MSVC -- if MSVC
doesn't supply one we use our own.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-11-26 20:30:58 +00:00
José Fonseca
29557a1fa8 gallivm: Removed unused variable.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-11-26 20:25:12 +00:00
José Fonseca
a0ddc54777 draw,gallivm,llvmpipe: Avoid implicit casts of 32-bit shifts to 64-bits.
Addresses MSVC warnings "result of 32-bit shift implicitly converted to
64 bits (was 64-bit shift intended?)", which can often be symptom of
bugs, but in these cases were all benign.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-11-26 20:25:12 +00:00
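The MSVC warning addressed by a0ddc54777 is easy to reproduce in isolation. The sketch below is not taken from the draw, gallivm, or llvmpipe sources; both function names are hypothetical and only illustrate why widening before the shift can differ from widening after it.

```c
#include <stdint.h>

/* The shift is evaluated in 32 bits and only then widened: any bits
 * shifted past bit 31 are already gone when the result becomes 64-bit. */
static uint64_t shift_then_widen(uint32_t value)
{
    return value << 8;
}

/* Widen first, so the shift itself is performed in 64 bits. */
static uint64_t widen_then_shift(uint32_t value)
{
    return (uint64_t)value << 8;
}
```

For small inputs the two agree, which is presumably why the flagged cases in the commit were benign; they only diverge once the high bits of the 32-bit operand are set.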
José Fonseca
aef3a01d57 scons: Generate SSE2 floating-point arithmetic.
- SSE2 is available on all x86 processors we care about.

- It's recommended by Intel:

  https://software.intel.com/en-us/blogs/2012/09/26/gcc-x86-performance-hints

- And has been the default since MSVC 2012:

  http://msdn.microsoft.com/en-us/library/7t5yh4fd(v=vs.110).aspx

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-11-26 20:25:12 +00:00
José Fonseca
0473577f91 scons: Remove dead code/comments.
- Remove no-op if-clause.

- -mstackrealign has been enabled again on MinGW for quite some time and
  appears to work alright nowadays.

- Drop -mmmx option as it is implied by -msse, and we don't use MMX
  intrinsics anyway.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-11-26 20:25:12 +00:00
Axel Davy
a10bf5c10c st/nine: fix formatting in query9 (cosmetic)
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-11-26 20:09:12 +00:00
Axel Davy
d52328fc39 st/nine: Fix setting of the shift modifier in nine_shader
It is an sint_4, but it was stored in a uint_8...
The code using it was acting as if it was signed.

Problem found thanks to Coverity

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-11-26 20:09:12 +00:00
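As a rough illustration of the signedness problem fixed in d52328fc39: a 4-bit two's-complement field read through an unsigned byte never yields a negative value unless it is sign-extended. The helper below is an assumption made for illustration and is not code from nine_shader.

```c
#include <stdint.h>

/* Interpret the low 4 bits of a byte as a two's-complement value in the
 * range -8..7. Reading the byte as unsigned would give 0..15 instead. */
static int sign_extend_4(uint8_t raw)
{
    int v = raw & 0xF;
    return (v & 0x8) ? v - 16 : v;
}
```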
David Heidelberg
90fea6b3e0 st/nine: remove unused pipe_viewport_state::translate[3] and scale[3]
2efabd9f5a removed them as unused.

This caused random memory overwrites (reported by Coverity).

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: David Heidelberg <david@ixit.cz>
2014-11-26 20:09:12 +00:00
Axel Davy
614d9387c7 st/nine: fix wrong variable reset
Error detected by Coverity (COPY_PASTE_ERROR)

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: David Heidelberg <david@ixit.cz>
2014-11-26 20:09:12 +00:00
David Heidelberg
a99f31bced st/nine: return GetAvailableTextureMem in bytes as expected (v2)
PIPE_CAP_VIDEO_MEMORY returns the amount of video memory in megabytes,
so it needs to be converted to bytes.

Fixed Warframe memory detection.

v2: also prepare for cards with more than 4GB memory

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: Yaroslav Andrusyak <pontostroy@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: David Heidelberg <david@ixit.cz>
2014-11-26 20:09:11 +00:00
Axel Davy
4eea2496bc st/nine: Add pool check to SetTexture (v2)
D3DPOOL_SCRATCH is disallowed according to spec.
D3DPOOL_SYSTEMMEM should be allowed but we don't handle it right for now.

v2: Fixes segfault in SetTexture when unsetting the texture

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-11-26 20:09:11 +00:00
Axel Davy
890f963d64 st/nine: properly declare constants (v2)
Fixes "Error : CONST[20]: Undeclared source register" when running
dx9_alpha_blending_material. Also artifacts on ilo.

v2: also remove unused MISC_CONST

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-11-26 20:09:11 +00:00
Stanislaw Halik
7f74b9d479 st/nine: call DBG() at more external entry points
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Stanislaw Halik <sthalik@misaki.pl>
2014-11-26 20:09:11 +00:00
Axel Davy
6aeae7442d st/nine: rework the way D3DPOOL_SYSTEMMEM is handled
This patch moves the data field from Resource9 to Surface9 and cleans
D3DPOOL_SYSTEMMEM handling in Texture9. This fixes HL2 lost coast.

It also removes some code in Texture9 that was written to support importing
and exporting non-D3DPOOL_SYSTEMMEM shared buffers. That code lacked
the design required to support the feature and wasn't used.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-11-26 20:09:10 +00:00
Axel Davy
133b2087c5 st/nine: Rework Basetexture9 and Resource9.
Instead of having parts of the structures initialised by the parents,
have them initialised by the children.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-11-26 20:09:10 +00:00
Axel Davy
104b5a8193 st/nine: clean device9ex.
Pass ex specific parameters as arguments to device9 ctor instead
of passing them by filling the structure.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-11-26 20:09:10 +00:00
Emil Velikov
9b7037a369 nine: the .pc file should not follow mesa version
The version provided by it should be the same as the one
provided/handled by the module. Add the missing tiny version.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: David Heidelberg <david@ixit.cz>
2014-11-26 20:09:10 +00:00
Emil Velikov
c642e87d9f auxiliary/vl: rework the build of the VL code
Rather than shoving all the VL code for non-VL targets, increasing
their size, just split it out and use it when needed. This gives us
the side effect of building vl_winsys_dri.c once, dropping a few
automake warnings, and reducing the size of the dri modules as below

   text    data     bss     dec     hex filename
5850573  187549 1977928 8016050  7a50b2 before/nouveau_dri.so
5508486  187100  391240 6086826  5ce0aa after/nouveau_dri.so

The above data is for a nouveau + swrast + kms_swrast 'megadriver'.

v2: Do not include the vl sources in the auxiliary library.
v3: Rebase. Add nine.

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-26 20:09:09 +00:00
Emil Velikov
86a51eb861 auxiliary/vl: split the vl sources list into VL_SOURCES
With follow up commit we'll split vl static lib from the auxiliary one,
and choose the appropriate vl (galliumvl or galliumvl_stub) for the
respective targets to link against.

v2: Rebase.

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-26 20:09:09 +00:00
Emil Velikov
f093c1c8ec auxiliary/vl: add galliumvl_stub.la
Will be used by the non-VL targets, to stub out the functions called
by the drivers. The entry point to those are within the VL
state-trackers, yet the compiler cannot determine that at link time.
Thus we'll need to stub them out to prevent unresolved symbols in the
dri, egl, gbm and pipe-loader targets.

v2: Rebase.

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-26 20:09:09 +00:00
Emil Velikov
2dbaedaf10 automake: rework VL dependency tracking
Set a single VL_{CFLAG,LIBS} for xcb and friends, and let each target
check for its relevant library alone. Required as with follow up
commits we'll build aux/vl into a separate module, which needs VL_CFLAGS

As a cleanup, add a couple of explicit LIBDRM_LIBS links, as aux/vl itself
requires libdrm, even though LIBDRM_{RADEON,NOUVEAU...} may provide it
as well.

v2: Rebase. Make sure st/xvmc programs work.

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-26 20:08:40 +00:00
Emil Velikov
303bc3609a configure: check the package version when auto-detecting the VL targets
Otherwise we might automatically enable the build, only to error
out a couple of lines after that.

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-26 20:08:39 +00:00
Siavash Eliasi
8dc8c496e1 mesa: Permanently enable features supported by target CPU at compile time.
This will remove the need for unnecessary runtime checks for CPU features if
already supported by target CPU, resulting in smaller and less branchy code.

V2:
- Removed the SSSE3 related part for the not yet merged patch.
- Avoiding redefinition of macros.

Tested-by: David Heidelberg <david@ixit.cz>
2014-11-26 20:08:38 +00:00
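The idea behind 8dc8c496e1 can be sketched as follows. `__SSE2__` is the macro GCC and Clang predefine when the target is built with SSE2 enabled; `TARGET_HAS_SSE2`, `probe_sse2_at_runtime()`, and `cpu_has_sse2()` are hypothetical names, and the probe is a stub rather than a real CPUID query.

```c
#include <stdbool.h>

/* Known at compile time when the build targets an SSE2-capable CPU. */
#if defined(__SSE2__)
#define TARGET_HAS_SSE2 1
#else
#define TARGET_HAS_SSE2 0
#endif

/* Placeholder for a runtime CPUID-style probe (assumption: always "no"). */
static bool probe_sse2_at_runtime(void)
{
    return false;
}

/* When TARGET_HAS_SSE2 is 1 the condition is a constant, so the compiler
 * can drop the runtime probe and the branch that depends on it. */
static bool cpu_has_sse2(void)
{
    return TARGET_HAS_SSE2 || probe_sse2_at_runtime();
}
```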
Emil Velikov
752c2e9690 docs: add relnotes template for 10.5.0
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-26 18:00:17 +00:00
Timothy Arceri
b3721cd230 util: update hash type comments
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Eric Anholt <eric@anholt.net>
2014-11-26 20:04:13 +11:00
Matt Turner
531feec9dc i965/vec4: Handle destination writemasks in VEC4_OPCODE_PACK_BYTES.
Since pack_bytes expands to two mov(4) align1 instructions, we can't use
swizzles directly. For an instruction like

   pack_bytes m4.y:UD, vgrf13.xyzw:UD

we can write into the .y component by setting the offset based on the
swizzle.

Also while we're doing this, we can set the dependency control hints
properly, so that a series of pack_bytes writing into separate
components of a register can issue without blocking.
2014-11-25 17:29:02 -08:00
Matt Turner
70fcd56538 i965/vec4: Optimize packSnorm4x8().
Reduces the number of instructions needed to implement packSnorm4x8()
from 13 -> 7.
2014-11-25 17:29:02 -08:00
Matt Turner
3532be7680 i965/vec4: Optimize packUnorm4x8().
Reduces the number of instructions needed to implement packUnorm4x8()
from 11 -> 6.
2014-11-25 17:29:02 -08:00
Matt Turner
e14c7c7faf i965/vec4: Add VEC4_OPCODE_PACK_4_BYTES.
Will be used by emit_pack_{s,u}norm_4x8().
2014-11-25 17:29:02 -08:00
Matt Turner
94a30bbd4f i965/vec4: Optimize unpackSnorm4x8().
Reduces the number of instructions needed to implement unpackSnorm4x8()
from 16 -> 6.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-25 17:29:02 -08:00
Matt Turner
bf686b2785 i965/vec4: Optimize unpackUnorm4x8().
Reduces the number of instructions needed to implement unpackUnorm4x8()
from 11 -> 4.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-25 17:29:02 -08:00
Matt Turner
cb0ba848d4 i965/vec4: Add vector float immediate infrastructure.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-25 17:29:02 -08:00
Matt Turner
5d23721c1d i965/fs: Add vector float immediate infrastructure.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-25 17:29:02 -08:00
Matt Turner
276075f864 i965: Disassemble vector float immediates properly.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-11-25 17:29:02 -08:00
Matt Turner
b2abf033e0 i965: Add unit test for float <-> VF conversions.
Using Eric's original VF -> float conversion code to initialize the
table.
2014-11-25 17:29:02 -08:00
Matt Turner
c37d798e78 i965: Add functions to convert float <-> VF.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-25 17:29:01 -08:00
Chris Forbes
0008d0e59e i965/Gen6-7: Do not replace texcoords with point coord if not drawing points
Fixes broken rendering in Windows-based QtQuick2 apps run through Wine.
This library sets all texture units' GL_COORD_REPLACE, leaves point
sprite mode enabled, and then draws a triangle fan.

Will need a slightly different fix for Gen4-5, but I don't have my old
machines in a usable state currently.

V2: - Simplify patch -- the real changes are no longer duplicated across
      the Gen6 and Gen7 atoms.
    - Also don't clobber attr overrides -- which matters on Haswell too,
      and fixes the other half of the problem
    - Fix newly-introduced warnings
V3: - Use BRW_NEW_GEOMETRY_PROGRAM and brw->geometry_program rather than
      core flag and state; keep the state flags in order.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84651
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-25 22:38:32 +13:00
Kenneth Graunke
60f011af1a glsl: Make lower_constant_arrays_to_uniforms require dereferences.
Ilia noticed that my lowering pass was converting the constant array
used by textureGatherOffsets' offsets parameter to a uniform.  This
broke textureGather for Nouveau, and is generally a horrible plan,
since it violates the GLSL constraint that offsets must be an
immediate constant.

When I wrote this pass, I neglected to consider whole array assignment.
I figured opt_array_splitting would handle constant indexing, so this
pass was really about fixing variable indexing.

textureGatherOffsets is an example of whole array access that we really
don't want to touch.  Whole array copies don't appear to benefit from
this either - they're most likely initializers for temporary arrays
which are going to be mutated anyway.  Since you're copying, you may
as well copy from immediates, not uniforms.

This patch makes the pass look for ir_dereference_arrays of
ir_constants, rather than looking for any ir_constant directly.
This way, it ignores whole array assignment.

No shader-db changes or Piglit regressions on Haswell.  Some Piglit
tests generate different code (fixing textureGatherOffsets on Nouveau).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2014-11-24 15:30:09 -08:00
Kenneth Graunke
f0c91f32c0 i965: Precompile ARB programs.
We already precompile GLSL programs; it seems logical to precompile ARB
programs as well.  We just never hooked it up.

This also makes the programs compile even if no drawing occurs, which is
useful for shader-db.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-24 15:30:09 -08:00
Kenneth Graunke
b55777f39d i965: Make precompile functions accessible from C.
Previously, the prototypes for brw_vs/gs/fs_precompile were scattered
between brw_vs.h (C), brw_gs.h (C), and brw_fs.h (C++ only).  Also,
brw_fs_precompile had C++ linkage, while the others were C.

This patch moves all the prototypes to a central location (brw_shader.h)
and makes brw_fs_precompile have C linkage.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-24 15:30:09 -08:00
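The usual way to give a function C linkage from a header shared by C and C++ code, as b55777f39d describes for brw_fs_precompile, looks roughly like this. The function name and body are hypothetical; the real prototypes in brw_shader.h are not reproduced here.

```c
/* Hypothetical shared header contents: usable from both C and C++. */
#ifdef __cplusplus
extern "C" {
#endif

int example_fs_precompile(int program_id);

#ifdef __cplusplus
}
#endif

/* Definition. In a C++ source file, including the header above first
 * makes this definition pick up the same C linkage. */
int example_fs_precompile(int program_id)
{
    return program_id >= 0; /* placeholder result: 1 for a valid id */
}
```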
Kenneth Graunke
62b425448c i965: Pass gl_program pointers into precompile functions.
We'd like to do precompiling for ARB vertex and fragment programs,
which only have gl_program structures - gl_shader_program is NULL.

This patch makes the various precompile functions take a gl_program
parameter directly, rather than accessing it via gl_shader_program.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-24 15:30:09 -08:00
Kenneth Graunke
d54925df9c i965: Move brw->precompile checks out a level.
brw_shader_precompile should just do a precompile; it makes more sense
for the caller to decide whether we should do one.  Simpler.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-24 15:30:09 -08:00
Roland Scheidegger
880424b8ad llvmpipe: (trivial) remove redundant util_cpu_detect() call in lp_test_main
Already called earlier.
2014-11-25 00:29:29 +01:00
Roland Scheidegger
8148a06b8f llvmpipe: fix lp_test_arit denorm handling
llvmpipe disables denorms on purpose (on x86/sse only), because denorms are
generally neither required nor desired for graphic apis (and in case of d3d10,
they are forbidden).
However, this caused some arithmetic tests using denorms to fail on some
systems, because the reference did not generate the same results anymore.
(It did not fail on all systems - behavior of these math functions is sort
of undefined when called with non-standard floating point mode, hence the
result differing depending on implementation and in particular the sse
capabilities.)
So, for the reference, simply flush all (input/output) denorms manually
to zero in this case.

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=67672.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-25 00:29:29 +01:00
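Flushing denormals to zero by hand, as 8148a06b8f does for the reference values, can be sketched like this. The function name is an assumption and the code is not copied from lp_test_arit; it only shows the technique, using `FLT_MIN` (the smallest positive normal float) as the threshold.

```c
#include <float.h>

/* Return 0 for any value whose magnitude is below the smallest normal
 * float; normal values, infinities, and NaN pass through unchanged.
 * Note: this simple form does not preserve the sign of zero. */
static float flush_denorm_to_zero(float x)
{
    if (x > -FLT_MIN && x < FLT_MIN)
        return 0.0f;
    return x;
}
```

Applying the same helper to both the inputs and the expected outputs keeps the reference in line with an implementation that runs with denormals disabled.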
Eric Anholt
93d30ff5d6 nouveau: Fix build after STR/BRA opcode dropping.
I missed these while git grepping for users of the dead opcodes.  Sigh,
macros.
2014-11-24 15:22:25 -08:00
Eric Anholt
a3688d686f mesa: Drop unused NV_fragment_program opcodes.
The extension itself was deleted 2 years ago.  There are still some
prog_instruction opcodes from NV_fp that exist because they're used by
ir_to_mesa.cpp, though.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-11-24 14:56:22 -08:00
Eric Anholt
868f95f1da mesa: Drop unused SFL/STR opcodes.
They're part of NV_vertex_program2, which I'm pretty sure we're never
going to support.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-11-24 14:56:22 -08:00
Eric Anholt
365a4a3f9a gallium: Drop the unused CND opcode.
Nothing in the tree generates it.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-24 14:56:22 -08:00
Eric Anholt
00f7002c5c gallium: Drop unused BRA opcode.
Never generated, and implemented in only nvfx vertprog.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-24 14:56:22 -08:00
Eric Anholt
ecfe9e2ad2 gallium: Drop the unused SFL/STR opcodes.
Nothing generated them.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-24 14:56:22 -08:00
Eric Anholt
dc00b382b5 gallium: Drop the unused RFL opcode.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-24 14:56:22 -08:00
Eric Anholt
8c822b1e91 gallium: Drop unused X2D opcode.
Nothing in the tree generates it.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-24 14:56:22 -08:00
Eric Anholt
ff886c4955 gallium: Drop the unused ARA opcode.
Nothing in the tree generated it.

v2: Only drop ARA, not ARR as well.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com> (v2)
2014-11-24 14:56:22 -08:00
Eric Anholt
de2f8d75db gallium: Drop the unused RCC opcode.
Nothing in the tree generated it.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-24 14:56:22 -08:00
Eric Anholt
d4864cdf15 gallium: Drop the NRM and NRM4 opcodes.
They weren't generated in tree, and as far as I know all hardware had to
lower it to a DP, RSQ, MUL.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-24 14:56:22 -08:00
Eric Anholt
7361d5ba63 ilo: Drop the explicit initialization of gaps in TGSI opcodes.
The nice thing about the good way of initializing arrays like this is that
you don't need to initialize everything in order, or even everything at
all.  Taking advantage of that only needs a tiny fixup to deal with the
default NULL value of the pointers.

I haven't dropped the initialization of opcodes that exist and are unsupported.
2014-11-24 14:56:22 -08:00
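The array-initialization style that 7361d5ba63 relies on is C99 designated initializers. A minimal sketch, with a hypothetical opcode numbering and handler table rather than ilo's real one:

```c
#include <stddef.h>

/* Hypothetical opcode numbering with unused gaps at 2 and 3. */
enum { OP_MOV = 0, OP_ADD = 1, OP_MUL = 4, OP_COUNT = 5 };

typedef int (*op_handler)(int a, int b);

static int do_mov(int a, int b) { (void)b; return a; }
static int do_add(int a, int b) { return a + b; }
static int do_mul(int a, int b) { return a * b; }

/* Entries may appear in any order, and any slot not named here (the
 * gaps at 2 and 3) is implicitly initialized to NULL. */
static const op_handler handlers[OP_COUNT] = {
    [OP_MUL] = do_mul,
    [OP_MOV] = do_mov,
    [OP_ADD] = do_add,
};

/* The one fixup needed: treat a NULL slot as "unsupported".
 * Returns 1 and stores the result on success, 0 otherwise. */
static int run_op(unsigned op, int a, int b, int *result)
{
    if (op >= OP_COUNT || handlers[op] == NULL)
        return 0;
    *result = handlers[op](a, b);
    return 1;
}
```

With this layout, removing an opcode only leaves another implicit NULL slot; nothing else in the table has to be renumbered or padded.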
Eric Anholt
386c3fcb14 r300: Drop the "/* gap */" notes.
This switch statement's code structure isn't dependent on the numbers of
the opcodes at all.
2014-11-24 14:56:22 -08:00
Eric Anholt
2f01cc8417 r600: Drop the "/* gap */" notes.
These are obviously the gaps already, due to the bare numbers with
unsupported implementations.

This makes inserting new gaps less irritating.
2014-11-24 14:56:22 -08:00
Jose Fonseca
925cb75f89 nine: Drop use of TGSI_OPCODE_CND.
This was the only state tracker emitting it, and hardware was just having
to lower it anyway (or failing to lower it at all).

v2: Extracted from a larger patch by Jose (which also dropped DP2A), fixed
    to actually not reference TGSI_OPCODE_CND.  Change by anholt.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
2014-11-24 14:56:22 -08:00
Jose Fonseca
56fd7c6361 nine: Don't reference the dead TGSI_OPCODE_NRM.
The translation is lowering it to not using TGSI_OPCODE_NRM, anyway.

v2: Extracted from a larger patch by Jose that also dropped DP2A usage.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
2014-11-24 14:56:22 -08:00
Eric Anholt
7c0acd8535 nine: Don't use the otherwise-dead SFL opcode in an unreachable path.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
2014-11-24 14:56:21 -08:00
Matt Turner
057e6e5251 i965/gen6/gs: Don't declare a src_reg with struct.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-24 14:09:23 -08:00
Matt Turner
ff966aff99 i965/disasm: Fix all32h/any32h predicate disassembly.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-11-24 14:09:23 -08:00
Matt Turner
b754e52532 glsl: Fix tautological comparison.
Caught by clang.

warning: comparison of constant -1 with expression of type
         'ir_texture_opcode' is always false
      [-Wtautological-constant-out-of-range-compare]
      if (op == -1)
          ~~ ^  ~~

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-24 14:09:23 -08:00
Matt Turner
024db256d4 util: Prefer atomic intrinsics to inline assembly.
Cuts a little more than 1k of .text size from i915g.

This was previously done in commit 5f66b340 and subsequently reverted in
commit 3661f757 after bug 30514 was filed. I believe the cause of bug
30514 wasn't anything related to cross compiling, but rather that the
toolchain used defaulted to -march=i386, and i386 doesn't have the
CMPXCHG or XADD instructions used to implement the intrinsics.

So we reverted a patch that improved things so that we didn't break
compilation for a platform that never could have worked anyway.
2014-11-24 14:09:23 -08:00
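For context on 024db256d4, the compiler intrinsics in question are the GCC-style `__sync_*` builtins, which Clang also provides. The wrapper names below are hypothetical and are not claimed to match Mesa's u_atomic.h; the sketch assumes a GCC or Clang toolchain targeting a CPU that has the underlying instructions (per the commit message, plain i386 does not).

```c
#include <stdint.h>

/* Atomically increment *v and return the new value. */
static int32_t atomic_inc_return(int32_t *v)
{
    return __sync_add_and_fetch(v, 1);
}

/* Atomically decrement *v and return the new value. */
static int32_t atomic_dec_return(int32_t *v)
{
    return __sync_sub_and_fetch(v, 1);
}

/* If *v equals expected, store desired. Always returns the value that
 * was in *v before the operation. */
static int32_t atomic_cmpxchg(int32_t *v, int32_t expected, int32_t desired)
{
    return __sync_val_compare_and_swap(v, expected, desired);
}
```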
Matt Turner
99cebffda9 util: Implement assume() for clang.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-11-24 14:09:23 -08:00
Matt Turner
56ac25918a i965: Don't overwrite the math function with conditional mod.
Ben was asking about the undocumented restriction that the math
instruction cannot use the dependency control hints. I went to reconfirm
and disabled the is_math() check in opt_set_dependency_control() and saw
that the disassembled math instructions with dependency hints had a
bogus math function. We were mistakenly overwriting it by setting an
empty conditional mod.

Unfortunately, this wasn't the cause of the aforementioned problem (I
reproduced it). This bug is benign, since we don't set dependency hints
on math instructions -- but maybe some day.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-24 14:07:32 -08:00
Matt Turner
f5bef2d2e5 i965: Assert that math instructions don't have conditional mod.
The math function field is at the same location as conditional mod.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-24 14:06:39 -08:00
Matt Turner
803a744507 glsl: Remove unused ast copy constructors.
These were added in commits a760c738 and 43757135 to be used in
implementing C-style aggregate initializers (commit 1b0d6aef). Paul
rewrote that code in commit 0da1a2cc to use GLSL types, rather than
AST types, leaving these copy constructors unused.

Tested by making them private and providing no definition.
2014-11-24 14:06:39 -08:00
Matt Turner
baff470823 glapi: Remove dead gl_offsets.py.
Dead since commit 07b85457.
2014-11-24 14:02:54 -08:00
Matt Turner
76ef547be7 glapi: Remove dead extension_helper.py.
Dead since commit 3d16088f.
2014-11-24 14:02:54 -08:00
Eric Anholt
52a7cb2ec4 vc4: Fix some inconsistent indentation. 2014-11-24 12:37:33 -08:00
Eric Anholt
6f4adb7483 vc4: Don't forget to actually connect the fence code.
I thought I'd tested this.
2014-11-24 12:37:33 -08:00
Eric Anholt
fa74ec7e98 vc4: Add a note about a piece of errata I've learned about.
Right now in my environment I've only got a small CMA area, so this
constraint ends up holding.
2014-11-24 12:37:33 -08:00
Chris Forbes
2b4fe85f0e mesa: Fix Get(GL_TRANSPOSE_CURRENT_MATRIX_ARB) to transpose
This was just returning the same value as GL_CURRENT_MATRIX_ARB.
Spotted while investigating something else in apitrace.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-24 21:55:47 +13:00
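What 2b4fe85f0e restores is simply the transposed view of the same 16 floats. A minimal sketch of a 4x4 transpose, with a hypothetical helper name that is not taken from Mesa's get.c:

```c
/* Write the transpose of a 4x4 matrix stored as 16 consecutive floats.
 * Element (row, col) of the input becomes element (col, row) of the
 * output. The in and out arrays must not overlap. */
static void transpose4x4(const float in[16], float out[16])
{
    for (int row = 0; row < 4; row++) {
        for (int col = 0; col < 4; col++)
            out[col * 4 + row] = in[row * 4 + col];
    }
}
```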
Chris Forbes
129178893b glsl: Generate unique names for each const array lowered to uniforms
Uniform names (even for hidden uniforms) are required to be unique; some
parts of the compiler assume they can be looked up by name.

Fixes the piglit test: tests/spec/glsl-1.20/linker/array-initializers-1

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-24 21:07:56 +13:00
Chris Forbes
adefccd12a i965: Handle nested uniform array indexing
When converting a uniform array reference to a pull constant load, the
`reladdr` expression itself may have its own `reladdr`, arbitrarily
deeply. This arises from expressions like:

   a[b[x]]     where a, b are uniform arrays (or lowered const arrays),
               and x is not a constant.

Just iterate the lowering to pull constants until we stop seeing these
nested. For most shaders, there will be only one pass through this loop.

Fixes the piglit test:
tests/spec/glsl-1.20/linker/double-indirect-1.shader_test

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-24 21:07:29 +13:00
Dave Airlie
c88385603a r600g: do all CUBE ALU operations before gradient texture operations (v2.1)
This moves all the CUBE section above the gradients section,
so that the gradient emission happens on one block which
is what sb/hardware expect.

v2: avoid changes to bytecode by using spare temps
v2.1: shame gcc, oh the shame. (uninit var warnings)

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-24 13:33:07 +10:00
Dave Airlie
38ec184419 r600: fix texture gradients instruction emission (v2)
The piglit tests were failing, and it appeared to be SB
optimising out things, but Glenn pointed out the gradients
are meant to be clause local, so we should emit the texture
instructions in the same clause. This moves things around
to always copy to a temp and then emit the texture clauses
for H/V.

v2: Glenn pointed out we could get another ALU fetch in
the wrong place, so load the src gpr earlier as well.

Fixes at least:
./bin/tex-miplevel-selection textureGrad 2D

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-24 10:41:30 +10:00
Ilia Mirkin
fecae4625c nv50,nvc0: buffer resources can be bound as other things down the line
res->bind is not an indicator of how the resource is currently bound.
buffers can be rebound across different binding points without changing
underlying storage.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
2014-11-23 15:43:28 -05:00
Ilia Mirkin
e80a0a7d9a nv50,nvc0: actually check constbufs for invalidation
The number of vertex buffers has nothing to do with the number of bound
constbufs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
2014-11-23 15:43:27 -05:00
Ilia Mirkin
7d07083cfd nv50/ir: set neg modifiers on min/max args
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=86618
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
2014-11-23 15:43:27 -05:00
Chris Forbes
89b9ef937c mesa: Fix function name in GetActiveUniformName error
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2014-11-23 15:04:15 +13:00
Stéphane Marchesin
3d9c1a9dd6 i915g: Fallback copy_render for ZS formats
These don't work out of the box, need more work, maybe with a proxy
format?

Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
2014-11-22 00:13:41 -08:00
Stéphane Marchesin
90207340c7 i915g: Add back 4444 and 5551 formats
Now that we have the transfers working, we can re-add those formats.

Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
2014-11-22 00:13:40 -08:00
Stéphane Marchesin
1e47510df7 i915g: Don't limit blitter to POT textures
Now that we have NPOT support for u_blitter, there is no reason to
limit this any longer.

Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
2014-11-22 00:13:40 -08:00
Stéphane Marchesin
e30c799da9 i915g: Align all texture dimensions to the next POT
This creates a usable layout for all NPOT textures. Of course these
still have lots of limitations, but at least we can render to a
level.

Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
2014-11-22 00:13:40 -08:00
Stéphane Marchesin
675019584c i915g: Fix typos
Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
2014-11-22 00:13:40 -08:00
Stéphane Marchesin
2ed24b2c31 i915g: Fix maxlod computation.
Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
2014-11-22 00:13:39 -08:00
Stéphane Marchesin
0220a428d7 i915g: Fix offset for level != 0
For NPOT texture layouts, we want to be able to access texture levels
other than 0 directly. Since the hw doesn't support that, we do it by
adding the offset directly.

Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
2014-11-22 00:13:39 -08:00
Stéphane Marchesin
a9b0787076 i915g: Don't write constants past I915_MAX_CONSTANT
This happens with glsl-convolution-1, where we have 64 constants. This
doesn't make the test pass (we don't have 64 constants anyway, only
32) but this prevents it from crashing.

Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
2014-11-22 00:13:39 -08:00
Stéphane Marchesin
5f61744adb i915g: Don't hardcode array size for phase count
This is an array of temp registers, so use I915_MAX_TEMPORARY for the size.

Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
2014-11-22 00:13:39 -08:00
David Heidelberg
25b00f4617 draw: allow LLVM use on non-SSE2 X86 cpus
This patch removes the workaround for an LLVM < 3.2 bug.

Original bug has been closed as fixed in 2011.
At this moment gallium requires LLVM 3.3 (2013).

LLVM has been tested without SSE2 support in commit
ca70de9bd2 and removed after requiring
LLVM 3.3 in commit 013ff2fae1

Original LLVM bug: http://llvm.org/bugs/show_bug.cgi?id=6960

Signed-off-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-11-22 04:29:00 +00:00
Emil Velikov
7d854c9771 docs: add news item and link release notes for mesa 10.3.4
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-22 04:26:06 +00:00
Emil Velikov
34616bc922 docs: Add sha256 sums for the 10.3.4 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 72c27d7a3a)
2014-11-22 04:24:32 +00:00
Emil Velikov
9e168ad903 Add release notes for the 10.3.4 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 26c8ecd85d)
2014-11-22 04:24:29 +00:00
Kenneth Graunke
a746be259d i965: Make Gen4-5 push constants call _mesa_load_state_parameters too.
In commit 5e37a2a4a8, I made the pull constant code stop calling
_mesa_load_state_parameters() when there were no pull parameters.

This worked fine on Gen6+ because the push constant code also called
it if there were any push constants.  However, the Gen4-5 push constant
code wasn't doing this.  This patch makes it do so, like the Gen6+ code.

A better long term solution would be to make core Mesa just handle this
for us when necessary.

Fixes around 8766 Piglit tests on Ironlake, and probably Gen4 as well.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2014-11-21 16:25:17 -08:00
Ben Widawsky
88fea85f09 i965/vec4/gen8: Handle the MUL dest hazard exception
Fix one of the few cases where we can't reliably touch the destination hazard
bits. I am explicitly doing this patch individually so it is easy to backport. I
was tempted to do this patch before the previous patch which reorganized the
code, but I believe even doing that first, this is still easy to backport.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84212
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-21 12:08:46 -08:00
Ben Widawsky
156f565f9e i965/vec4: Extract depctrl hazards
Move this to a separate function so that we can begin to add other little
caveats without making too big a mess.

NOTE: There is some desire to improve this function eventually, but we need to
fix a bug first.

v2:
Use const for the inst for the hazard check (Matt)
Invert safe logic to get rid of the double negative (Matt)
Add PRM reference for predicates (Matt)
Add note about empirical evidence for math (Matt)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-21 12:08:46 -08:00
Matt Turner
40c0d79d29 i965/fs: Remove is_valid_3src().
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-11-21 10:26:44 -08:00
Matt Turner
0777775274 i965/fs: Remove is_valid_3src() checks from emit_lrp.
The visitor emits MOVs to temporary registers for immediates, so these
never trigger. For further proof, check case ir_triop_fma.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-11-21 10:26:44 -08:00
Matt Turner
1fdc75fde4 i965/fs: Remove unused apply_stride().
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-11-21 10:26:44 -08:00
Matt Turner
279c1c80b6 i965/fs: Move ip_record class to its one use.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-11-21 10:26:44 -08:00
Matt Turner
d9432af45a i965: Move common fields into backend_instruction.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-11-21 10:26:42 -08:00
Matt Turner
bd50213929 i965: Combine offset/texture_offset fields.
texture_offset was only used by some texturing operations, and offset
was only used by spill/unspill and some URB operations. These fields are
never used at the same time.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-11-21 10:26:38 -08:00
Marek Olšák
645b471d61 radeonsi: use minnum and maxnum LLVM intrinsics for MIN and MAX opcodes
So far it has been compiled into pretty ugly code (8 instructions or so
for either opcode).

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-11-21 12:15:58 +01:00
Eric Anholt
21577571b3 vc4: Update for new kernel ABI with async execution and waits.
Our submits now return immediately and you have to manually wait for
things to complete if you want to (like a normal driver).
2014-11-20 13:07:07 -08:00
Ville Syrjälä
390799c496 i915: Only use TEXCOORDTYPE_VECTOR with cube maps on gen2
Check that the target is GL_TEXTURE_CUBE_MAP before emitting
TEXCOORDTYPE_VECTOR texture coordinates.

I'm not sure if the hardware would like CARTESIAN coordinates
with cube maps, and as I'm too lazy to find out just emit the
VECTOR coordinates for cube maps always. For other targets use
CARTESIAN or HOMOGENOUS depending on the number of texture
coordinates provided.

Fixes rendering of the "electric" background texture in chromium-bsu
main menu. We appear to be provided with three texture coordinates
there (I'm guessing due to the funky texture matrix rotation it does).
So the code would decide to use TEXCOORDTYPE_VECTOR instead of
TEXCOORDTYPE_CARTESIAN even though we're dealing with a 2D texture.
The results weren't what one might expect.

demos/cubemap still works, which hopefully indicates that this doesn't
break things.

Also tested with:
 bin/glean -o -v -v -v -t +texCube --quick
 bin/cubemap -auto
from piglit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2014-11-20 21:58:57 +02:00
Ben Widawsky
ca39c46c3b i965/disasm: Properly decode branch_ctrl (gen8+)
Add support for decoding the new branch control bit. I saw two things wrong with
the existing code.

1. It didn't bother trying to decode the bit.
-  While we do not *intentionally* emit this bit today, I think it's interesting
   to see if we somehow ended up with the bit set. It may also be useful in the
   future.

2. It seemed to be the wrong bit.
-  The docs are pretty poor wrt which bit this actually occupies. To me, it
   /looks/ like it should be bit 28. I am not sure where Ken got 30 from. I
   verified it should be 28 by looking at the simulator code.

I also added the most basic support for GOTO simply so we don't need to remember
to change the function in the future.

v2:
Move the branch_ctrl check out of the if gen >= 6 check to make it more
readable. (Matt)
ENDIF doesn't have branch_ctrl (Matt + Ken)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-20 09:45:23 -08:00
José Fonseca
56bf948e11 rtasm,translate: Re-enable SSE on Mingw64.
This reverts f4dd099171.

The src/gallium/tests/unit/translate_test.c gives the same results on
MinGW 64-bits as on Linux 64-bits.  And since MinGW is often used for
development/testing due to its convenience, it's better not to have this
sort of differences relative to MSVC.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-11-20 14:11:36 +00:00
Kenneth Graunke
5e37a2a4a8 i965: Skip _mesa_load_state_parameters when there are zero parameters.
Saves a tiny bit of CPU overhead.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
2014-11-20 01:56:54 -08:00
Marek Olšák
6f7371619c radeonsi: remove unused variable si_state_dsa::db_render_control 2014-11-19 21:42:14 +01:00
Roland Scheidegger
763fc526c7 llvmpipe: enable PIPE_CAP_TGSI_VS_LAYER_VIEWPORT
No changes required in the driver itself, all handled by draw.

piglit results in a quick run:
skip->pass 7
skip->fail 2
(The new failures in the ARB_fragment_layer_viewport group are expected,
we fail the same if gs doesn't write these outputs regardless of the vs.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-19 18:35:30 +01:00
Roland Scheidegger
4b6d6642d2 draw: fixes for vertex shaders outputting layer or viewport index
Mostly add a couple cases so we don't just check gs for this.
There's only one gotcha, the built-in vp transform in the llvm vs can't
handle it (this would be fixable though non-trivial due to vp index being
non-constant for the SoA outputs, but we don't use it if there's a gs
either - the whole clip/vp transform integration there is suboptimal).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-19 18:35:30 +01:00
Michael Varga
9460cd39e8 st/va: surface: render subpicture
Signed-off-by: Michael Varga <Michael.Varga@amd.com>
2014-11-19 09:29:11 -05:00
Michael Varga
7523db174e st/va: subpicture implementation
added BGRA format
create/destroy
set image
associate/deassociate

Signed-off-by: Michael Varga <Michael.Varga@amd.com>
2014-11-19 09:29:11 -05:00
Michael Varga
05e225b558 st/va: added internal storage for VAImage and BGRA format
When calling vaCreateImage() an internal copy of VAImage is maintained
since the allocation of "image" may not be guaranteed to live long enough.

Signed-off-by: Michael Varga <Michael.Varga@amd.com>
2014-11-19 09:29:11 -05:00
Michael Varga
7b4f233c1f st/va: added some calls to handle_table_remove()
In a few locations handles were being added but not removed.

Signed-off-by: Michael Varga <Michael.Varga@amd.com>
2014-11-19 09:29:10 -05:00
Chad Versace
b69c7c5dac i965: Fix segfault in WebGL Conformance on Ivybridge
Fixes regression of WebGL Conformance test texture-size-limit [1] on
Ivybridge Mobile GT2 0x0166 with Google Chrome R38.

Regression introduced by

    commit 6c04423153
    Author: Kenneth Graunke <kenneth@whitecape.org>
    Date:   Sun Feb 2 02:58:42 2014 -0800

        i965: Bump GL_MAX_CUBE_MAP_TEXTURE_SIZE to 8192.

The test regressed because the pointer offset arithmetic in
intel_miptree_map_gtt() overflows for large textures. The pointer
arithmetic is not 64-bit safe.

[1] 52f0dc240f/sdk/tests/conformance/textures/texture-size-limit.html

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=78770
Fixes: Intel CHRMOS-1377
Reported-by: Lu Hua <huax.lu@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2014-11-18 19:16:45 -08:00
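The overflow described in commit b69c7c5dac above can be illustrated with a minimal sketch. This is not the code from intel_miptree_map_gtt(); the helper names and the stride/row values are hypothetical, chosen only so that the byte offset exceeds 32 bits.

```c
#include <stdint.h>

/* Hypothetical helper: the byte offset is computed entirely in 32 bits,
 * so it wraps once y * stride exceeds UINT32_MAX. */
static uint32_t offset_32bit(uint32_t x, uint32_t y,
                             uint32_t stride, uint32_t cpp)
{
    return y * stride + x * cpp;
}

/* Hypothetical helper: operands are widened before multiplying, so the
 * offset stays correct for large textures. */
static uint64_t offset_64bit(uint32_t x, uint32_t y,
                             uint32_t stride, uint32_t cpp)
{
    return (uint64_t)y * stride + (uint64_t)x * cpp;
}
```

With an assumed stride of 131072 bytes and row 49151, the widened version yields 6442319872 bytes, while the 32-bit version wraps to a much smaller value and would point at the wrong place in the mapping.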
Siavash Eliasi
80bffde0a2 mesa/main: Fix tmp_row memory leak in texstore_rgba_integer.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-11-18 14:55:39 -08:00
Jason Ekstrand
d76be6bd60 docs/GL3: Mark GL_ARB_direct_state_access as being started by Laura 2014-11-18 14:54:12 -08:00
Dave Airlie
1830138cc0 r600g: limit texture offset application to specific types (v2)
For 1D and 2D arrays we don't want the other coordinates being
offset and affecting where we sample. I wrote this patch 6 months
ago but lost it.

Fixes:
./bin/tex-miplevel-selection textureLodOffset 1DArray
./bin/tex-miplevel-selection textureLodOffset 2DArray
./bin/tex-miplevel-selection textureOffset 1DArray
./bin/tex-miplevel-selection textureOffset 1DArrayShadow
./bin/tex-miplevel-selection textureOffset 2DArray
./bin/tex-miplevel-selection textureOffset(bias) 1DArray
./bin/tex-miplevel-selection textureOffset(bias) 2DArray

v2: rewrite to handle more cases and be consistent with code
above.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-19 08:22:13 +10:00
Dave Airlie
d4c342f67e r600g: geom shaders: always load texture src regs from inputs
Otherwise we seem to lose the split_gs_inputs and try and
pull from an uninitialised register.

fixes 9 texelFetch geom shader tests.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-19 08:21:40 +10:00
Eric Anholt
82e919d33b vc4: Emit semaphore instructions for new kernel ABI.
Previously, the kernel would dispatch thread 0, wait, then dispatch thread
1.  By insisting that the thread contents use semaphores in the right
place, the kernel can sleep for longer by dispatching both threads at
once.
2014-11-18 12:46:55 -08:00
Eric Anholt
05f165b62d vc4: Mark a big array as const.
Drops 1kb of code from this inner loop, in exchange for 2.5k of data.
2014-11-18 12:42:52 -08:00
Andres Gomez
1398ed724a glsl_compiler: Add binding hash tables to avoid SIGSEGVs on linking stage
When using the stand alone compiler, if we try to link a shader with vertex
attributes it will segfault on linking as the binding hash tables are not
included in the shader program. Obviously, we cannot make the linking stage
succeed without the bound attributes but we can prevent the crash and just
let the linker spit its own error.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-11-18 08:47:04 -07:00
Andres Gomez
f9fc3ae89b linker: Add carriage returns on several linker errors
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-11-18 08:47:04 -07:00
Andres Gomez
2d5af04bae draw: Fixed inline comments
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-11-18 08:47:03 -07:00
Roland Scheidegger
74f505fa73 gallivm: fix alignment issue for vertex data fetch
We cannot guarantee that vertex buffers have the necessary alignment for
fetching all AoS members at once (for instance 4x32bit XYZW data). We can
however guarantee that for textures. This did not cause errors for older
llvm versions but it now matters and will cause segfaults if the data
happens to not be aligned. Thus we need to set alignment manually.
(Note that we can't actually really guarantee data to be even element aligned
due to offsets in vertex buffers being bytes and OpenGL allowing this, but
it does not matter for x86 as alignment is only required for sse vectors -
not sure what happens on other archs, however.)

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=85467.
2014-11-18 15:26:59 +01:00
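As a rough C analogue of the alignment issue in commit 74f505fa73 above (the actual fix sets the alignment on LLVM loads in gallivm, which is not shown here), the sketch below fetches four floats from an arbitrary byte offset without assuming 16-byte alignment. The struct and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 4x32-bit XYZW element. */
struct vec4_sketch {
    float v[4];
};

/* Copy the element out byte-wise. Casting buf + byte_offset to a vector
 * pointer and dereferencing it would assume an alignment that vertex
 * buffers cannot guarantee. */
static struct vec4_sketch fetch_vec4_unaligned(const uint8_t *buf,
                                               size_t byte_offset)
{
    struct vec4_sketch out;
    memcpy(&out, buf + byte_offset, sizeof(out));
    return out;
}
```

Compilers typically turn the memcpy into an unaligned load, which is the behaviour the commit asks LLVM for by lowering the declared alignment.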
Marek Olšák
3958378abb radeonsi: support gl_FragCoord at integer pixel center
No known benefit for OpenGL, but it doesn't hurt.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-11-18 14:27:54 +01:00
Marek Olšák
da2dea3843 radeonsi: support per-sample gl_FragCoord
Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-11-18 14:27:54 +01:00
Ilia Mirkin
68db29c434 st/mesa: add a fallback for clear_with_quad when no vs_layer
Not all drivers can set gl_Layer from VS. Add a fallback that passes the
instance id from VS to GS, and then uses the GS to set the layer.

Tested by adding

  quad_buffers |= clear_buffers;
  clear_buffers = 0;

to the st_Clear logic, and forcing set_vertex_shader_layered in all
cases. No piglit regressions (on piglits with 'clear' in the name).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
2014-11-17 22:17:49 -05:00
Vinson Lee
7b8e04b3f0 mesa: Bump version to 10.5.0-devel.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2014-11-18 02:02:54 +00:00
Axel Davy
7f565845a1 nine: Implement threadpool
DRI_PRIME setups have various issues due to the lack of dma-buf fence
support in the drivers. For DRI3 DRI_PRIME, a race can appear, making
tearing visible or, worse, showing older content than expected. Until
dma-buf fences are well supported (and by all drivers), an alternative
is to send the buffers to the server only when rendering has finished.
Since waiting for rendering to finish in the main thread has a
performance impact, this patch uses an additional thread to offload the
wait and the sending of the buffers to the server.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-11-18 02:02:54 +00:00
Axel Davy
948e6c5228 nine: Add drirc options (v2)
Implements vblank_mode and throttling, which allow us to change the default
ratio between framerate and input lag.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-11-18 02:02:54 +00:00
Joakim Sindholt
fdd96578ef nine: Add state tracker nine for Direct3D9 (v3)
Work of Joakim Sindholt (zhasha) and Christoph Bumiller (chrisbmr).
DRI3 port done by Axel Davy (mannerov).

v2: - nine_debug.c: klass extended from 32 chars to 96 (for sure) by glennk
    - Nine improvements by Axel Davy (which also fixed some wine tests)
    - by Emil Velikov:
     - convert to static/shared drivers
     - Sort and cleanup the includes
     - Use AM_CPPFLAGS for the defines
     - Add the linker garbage collector
     - Restrict the exported symbols (think llvm)

v3: - small nine fixes
    - build system improvements by Emil Velikov

v4: [Emil Velikov]
   - Do not link against libudev. No longer needed.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: David Heidelberg <david@ixit.cz>
2014-11-18 02:02:54 +00:00
Christoph Bumiller
7d2573b537 gallium/auxiliary: add contained and rect checks (v6)
v3: thanks to Brian, improved coding style, also glennk helped spot a few
things (unsigned -> int, two constify)
v4: thanks to Ilia, improved function, dropped u_box_clip_3d
v5: incorporated the rest of Gregor's proposed changes, cleanups
v6: u_box_clip_2d simplification proposed by Ilia Mirkin

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
2014-11-18 02:02:54 +00:00
Christoph Bumiller
cb49132166 gallium/auxiliary: add inc and dec alternative with return (v4)
At this moment we use only zero or positive values.

v2: Implement it also for Solaris, MSVC assembly
    and enable for other combinations.

v3: Replace MSVC assembly by assert + warning during compilation

v4: remove inc and dec with return for MSVC assembly

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: David Heidelberg <david@ixit.cz>
2014-11-18 02:02:53 +00:00
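The "inc and dec with return" variants from commit cb49132166 above can be sketched with C11 atomics. This is an illustration only, not the gallium implementation: the function names are hypothetical, and it assumes the variants return the updated counter value.

```c
#include <stdatomic.h>

/* Hypothetical: atomically increment and return the new value. */
static int inc_return_sketch(atomic_int *v)
{
    /* atomic_fetch_add returns the previous value, so add 1. */
    return atomic_fetch_add(v, 1) + 1;
}

/* Hypothetical: atomically decrement and return the new value. */
static int dec_return_sketch(atomic_int *v)
{
    /* atomic_fetch_sub returns the previous value, so subtract 1. */
    return atomic_fetch_sub(v, 1) - 1;
}
```

Returning the new value lets a caller act on the result of its own update (for example, release an object when a reference count reaches zero) without a separate, racy read.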
Christoph Bumiller
e23d63cffd gallium/auxiliary: implement sw_probe_wrapped (v2)
Implement pipe_loader_sw_probe_wrapped, which allows using the wrapped
software renderer backend when using the pipe loader.

v2: - remove unneeded ifdef
    - use GALLIUM_PIPE_LOADER_WINSYS_LIBS
    - check for CALLOC_STRUCT
    thanks to Emil Velikov

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
2014-11-18 02:02:53 +00:00
Christoph Bumiller
8314315dff winsys/sw/wrapper: implement is_displaytarget_format_supported for swrast
Acked-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
2014-11-18 02:02:53 +00:00
Christoph Bumiller
259ec77db9 tgsi/ureg: add ureg_UARL shortcut (v2)
v2: moved it into the same order as in p_shader_tokens (thanks Brian)

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
2014-11-18 02:02:53 +00:00
Dave Airlie
4e520101e6 r600g/cayman: handle empty vertex shaders
Some of the geom shader tests produce an empty vertex shader,
on cayman we'd crash in the finaliser because last_cf was NULL.

cayman doesn't need the NOP workaround, so if the code arrives
here with no last_cf, just emit an END.

fixes crashes in a bunch of piglit geom shader tests.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-18 11:59:47 +10:00
Dave Airlie
27e1e0e710 r600g/cayman: fix texture gather tests
It appears on cayman the TG4 outputs were reordered.

This fixes a lot of piglit tests.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-18 11:59:30 +10:00
Dave Airlie
70dac5fa44 r600g: cayman umad assigns dst pointlessly
There is no need to assign dst here, just use the chan from j

Pointed out by glennk.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-18 11:59:30 +10:00
Dave Airlie
4a128d5a16 r600g/cayman: fix integer multiplication output overwrite (v2)
This fixes tests/spec/glsl-1.10/execution/fs-op-assign-mult-ivec2-ivec2-overwrite.shader_test.

hopeful fix for fd.o bug 85376

Reported-by: ghallberg
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-18 11:58:16 +10:00
Brian Paul
11abd7b2bc st/mesa: copy sampler_array_size field when copying instructions
The sampler_array_size field was added by "mesa/st: add support for
dynamic sampler offsets".  But the field wasn't getting copied in
the get_pixel_transfer_visitor() or get_bitmap_visitor() functions.

The count_resources() function then didn't properly compute the
glsl_to_tgsi_visitor::samplers_used bitmask.  Then, we didn't declare
all the sampler registers in st_translate_program().  Finally, we
asserted when we tried to emit a tgsi ureg src register with File =
TGSI_FILE_UNDEFINED.

Add the missing assignments and some new assertions to catch the
invalid register sooner.

Cc: "10.3, 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-11-17 15:07:54 -07:00
Brian Paul
920f875132 gallium/tests: add missing arg to util_make_vertex_passthrough_shader()
Fix oversights from the "add a window_space option to the passthrough
vertex shader" patch.

Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
2014-11-17 10:20:24 -07:00
Michel Dänzer
ae4536b4f7 radeonsi: Disable asynchronous DMA except for PIPE_BUFFER
Using the asynchronous DMA engine for multi-dimensional operations seems
to cause random GPU lockups for various people. While the root cause for
this might need to be fixed in the kernel, let's disable it for now.

Before re-enabling this, please make sure you can hit all newly enabled
paths in your testing, preferably with both piglit and real world apps,
and get in touch with people on the bug reports below for stability
testing.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85647
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83500
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
2014-11-17 16:17:52 +09:00
Vinson Lee
876c53375e scons: Require glproto >= 1.4.13 for X11.
GLXBadProfileARB and X_GLXCreateContextAtrribsARB require glproto >=
1.4.13. These symbols were added in commit
d5d41112cb "st/xlib: Generate errors as
specified."

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2014-11-16 13:26:26 -08:00
José Fonseca
aafbebe8ab draw: Make it more clear that *_jit_context points to pipe_viewport_state structures.
No change in behavior.
2014-11-16 11:33:21 +00:00
José Fonseca
2a3e140ff4 draw: Fix breakage due to removal of pipe_viewport_state::translate[3] and scale[3].
Unfortunately no LLVM type was generated for pipe_viewport_state -- it
was being treated as a single floating point array --, so llvmpipe (and
any driver that relies on draw/llvm) got totally busted.
2014-11-16 11:31:23 +00:00
José Fonseca
d2dbeed006 gallium/auxiliary: Fix build without LLVM.
Trivial.
2014-11-16 10:22:46 +00:00
José Fonseca
4784623b3e gallium/auxiliary: Remove GALLIVM_CPP_SOURCES
Redundant.

Should fix https://bugs.freedesktop.org/show_bug.cgi?id=86330
2014-11-16 10:16:47 +00:00
Emil Velikov
45e2ba1b8c freedreno: add missing headers in Makefile.sources
... or autotools will fail to pick them up for the distribution tarball.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-16 01:16:30 +00:00
Emil Velikov
c3bb38c4cb targets: bundle all files in the tarball
We were missing a few files
 - The version scripts
 - Android & scons build scripts
 - A few headers.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-16 01:16:30 +00:00
Emil Velikov
d936ef3fb7 auxiliary: ship all files in the distribution tarball
- Add all headers into Makefile.sources
 - Don't forget the target-helpers
 - Add the python scripts & the formats table/list (csv)
 - Temporarily add vl/vl_winsys_dri.c to EXTRA_DIST until we rework the
way VL is built.
 - Add the following to EXTRA_DIST - they are included via the
generated u_indices_gen.c thus we should not add them to *SOURCES.
  indices/u_indices.c
  indices/u_unfilled_indices.c

XXX: Should we nuke gallivm/f.cpp ? It seems that no-one is using it.

v2: Rebase

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-16 01:07:32 +00:00
Emil Velikov
ded56e4674 gallium: ship the gallium API headers
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-16 01:03:42 +00:00
Emil Velikov
dfa61dc37e pipe-loader: consolidate sources into Makefile.sources
Drop the unneeded subdir-objects.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-16 01:03:42 +00:00
Thierry Reding
631090e155 dri/kms: Always zero out struct drm_mode_create_dumb
The DRM_IOCTL_MODE_CREATE_DUMB (and others) IOCTL isn't very rigorously
specified, which has the effect that some kernel drivers do not consider
the .pitch and .size fields of struct drm_mode_create_dumb outputs only.
Instead they will use these as lower bounds and overwrite them only if
the values that they compute are larger than what userspace provided.

This works if and only if userspace initializes the fields explicitly to
either 0 or some meaningful value. However, if userspace just leaves the
values uninitialized and the struct drm_mode_create_dumb is allocated on
the stack for example, the driver may try to overallocate buffers.

Fortunately most userspace does zero out the structure before passing it
to the IOCTL, but there are rare exceptions. Mesa is one of them. In an
attempt to rectify this situation, kernel drivers are being updated to
not use the .pitch and .size fields as inputs. However in order to fix
the issue with older kernels, make sure that Mesa always zeros out the
structure as well.

Future IOCTLs should be more rigorously defined so that structures can
be validated and IOCTLs rejected if output fields aren't set to zero.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-16 01:03:40 +00:00
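The failure mode described in commit 631090e155 above can be sketched without a real DRM device. The struct below is a stand-in that only mirrors the fields the message mentions (the real struct drm_mode_create_dumb comes from the kernel's DRM headers), and fake_driver_create_dumb() is a hypothetical model of a driver that treats .pitch and .size as lower bounds.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for struct drm_mode_create_dumb; illustration only. */
struct create_dumb_sketch {
    uint32_t height, width, bpp, flags; /* inputs */
    uint32_t handle;                    /* output */
    uint32_t pitch;                     /* output */
    uint64_t size;                      /* output */
};

/* Hypothetical driver: overwrites pitch/size only if its own values
 * are larger than what userspace passed in. */
static void fake_driver_create_dumb(struct create_dumb_sketch *arg)
{
    uint32_t min_pitch = arg->width * (arg->bpp / 8);
    uint64_t min_size = (uint64_t)min_pitch * arg->height;

    if (arg->pitch < min_pitch)
        arg->pitch = min_pitch;
    if (arg->size < min_size)
        arg->size = min_size;
    arg->handle = 1;
}

/* Userspace side: zero the whole struct before filling in the inputs,
 * so stale stack contents never reach the driver. */
static void request_dumb_buffer(struct create_dumb_sketch *arg,
                                uint32_t width, uint32_t height,
                                uint32_t bpp)
{
    memset(arg, 0, sizeof(*arg));
    arg->width = width;
    arg->height = height;
    arg->bpp = bpp;
    fake_driver_create_dumb(arg);
}
```

In this model a zeroed request for a 64x32 buffer at 32 bpp gets a pitch of 256 bytes and a size of 8192 bytes, whereas a struct left full of garbage keeps its bogus pitch and size, which is the overallocation the commit guards against.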
Marek Olšák
2efabd9f5a gallium: remove unused pipe_viewport_state::translate[3] and scale[3]
Almost all drivers ignore them.
2014-11-16 01:28:28 +01:00
Marek Olšák
ff8042270f radeonsi: implement TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION
Required by Nine.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
2014-11-16 01:28:28 +01:00
Marek Olšák
48f1409c3b tgsi/ureg: simplify code for declaring properties
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
2014-11-16 01:28:26 +01:00
Marek Olšák
e6a2d3f7b6 gallium/util: add a test for TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION
Not testable by OpenGL. Required by Nine.

This is an example of how to implement a piglit-like test using gallium only.
2014-11-16 01:28:26 +01:00
Marek Olšák
717f2dd69f gallium/util: add a window_space option to the passthrough vertex shader
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
2014-11-16 01:28:24 +01:00
Marek Olšák
ad54b01896 tgsi: fixup the string of VS_WINDOW_SPACE_POSITION
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
2014-11-16 01:28:09 +01:00
Rob Clark
7c5707bd4a freedreno/a4xx: implement mem->gmem (restore)
Support to restore gmem (tile buffer) (in case it wasn't glClear'd).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-11-15 18:16:36 -05:00
Rob Clark
0c6275300e freedreno/a4xx: move where SP_FS_MRT_REGn is emitted
Addition of color fmt bitfield to this register (compared to a3xx) means
we need to re-emit if either prog or framebuffer state is dirty.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-11-15 18:16:36 -05:00
Emil Velikov
e07c9a288c Revert "mesa: Wrap SSE4.1 code in #ifdef __SSE4_1__."
This reverts commit 8d3f739383.

In the last commit we've updated our check to determine if the actual
code is buildable, rather than if the compiler acknowledges the option.
I.e. did anyone provide -mno-sse4.1 vs is my compiler too old.

Now there will never be an attempt to build this code, in either case.

Confirmed by building mesa with
export CFLAGS='-march=native -mno-sse4.1'
./configure && make

Tested-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-15 20:34:36 +00:00
Emil Velikov
1a6ae84041 configure.ac: roll up a program for the sse4.1 check
So when checking/building sse code we have three possibilities:
 1 Old compiler, throws an error when using -msse*
 2 New compiler, user disables sse* (-mno-sse*)
 3 New compiler, user doesn't disable sse

The original code added handling for #1 but not #2. Later on we patched
around the lack of handling #2 by wrapping the code in __SSE4_1__.
Yet it led to a missing/undefined symbol in case of #1 or #2, which
might cause an issue for #2 when using the i965 driver.

A bit later we "fixed" the undefined symbol by using #1, rather than
updating it to handle #2. With this commit we set things straight :)

To top it all off, conventions state that in case of conflicting
(-enable-foo -disable-foo) options, the latter one takes precedence.
Thus we need to make sure to prepend -msse4.1 to CFLAGS in our test.

v2: Clean the #includes. Suggested by Ilia, Matt & Siavash.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: David Heidelberg <david@ixit.cz>
Tested-by: Siavash Eliasi <siavashserver@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-15 20:34:34 +00:00
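The precedence argument in the commit message above can be modelled in a few lines of shell. This is only a sketch: `effective_sse41` is a hypothetical helper, not part of Mesa or of any compiler, and it merely models the "last conflicting option wins" convention that the message cites.

```shell
# Hypothetical helper: report which of -msse4.1 / -mno-sse4.1 wins,
# assuming the last conflicting option on the command line takes precedence.
effective_sse41() {
    result=unset
    for flag in "$@"; do
        case "$flag" in
            -msse4.1)    result=enabled ;;
            -mno-sse4.1) result=disabled ;;
        esac
    done
    echo "$result"
}

# Example user flags, as in the "Confirmed by building mesa with" commit above.
USER_CFLAGS="-march=native -mno-sse4.1"

# Prepending -msse4.1 leaves the user's choice in the winning (last) position.
effective_sse41 -msse4.1 $USER_CFLAGS    # prints "disabled"

# Appending it would silently override the user's choice.
effective_sse41 $USER_CFLAGS -msse4.1    # prints "enabled"
```

Under that assumption, a configure test that prepends the flag reports SSE4.1 as unavailable whenever the user passed -mno-sse4.1, which is the behaviour the commit aims for in case #2.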
Ilia Mirkin
3bc42a09e2 nv50,nvc0: use clip_halfz setting when creating rasterizer state
This enables the ARB_clip_control extension.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2014-11-15 14:14:51 -05:00
Rob Clark
61c68b69d7 freedreno: add adreno 420 support
Very initial support.  Basic stuff working (es2gears, es2tri, and maybe
about half of glmark2).  Expect broken stuff.  Still missing: mem->gmem
(restore), queries, mipmaps (blob segfaults!), hw binning, etc.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-11-15 08:30:31 -05:00
Rob Clark
4b1dfcb2c1 freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-11-15 08:30:31 -05:00
2381 changed files with 243071 additions and 125810 deletions

View File

@@ -1,4 +1,4 @@
((nil
((prog-mode
(indent-tabs-mode . nil)
(tab-width . 8)
(c-basic-offset . 3)

.gitignore
View File

@@ -18,6 +18,7 @@
*.tar
*.tar.bz2
*.tar.gz
*.tar.xz
*.trs
*.zip
*~
@@ -44,3 +45,4 @@ manifest.txt
.libs/
Makefile
Makefile.in
.install-mesa-links

View File

@@ -24,16 +24,17 @@
# use c99 compiler by default
ifeq ($(LOCAL_CC),)
ifeq ($(LOCAL_IS_HOST_MODULE),true)
LOCAL_CC := $(HOST_CC) -std=c99
LOCAL_CC := $(HOST_CC) -std=c99 -D_GNU_SOURCE
else
LOCAL_CC := $(TARGET_CC) -std=c99
endif
endif
LOCAL_C_INCLUDES += \
$(MESA_TOP)/src \
$(MESA_TOP)/include
MESA_VERSION=$(shell cat $(MESA_TOP)/VERSION)
MESA_VERSION := $(shell cat $(MESA_TOP)/VERSION)
# define ANDROID_VERSION (e.g., 4.0.x => 0x0400)
LOCAL_CFLAGS += \
-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \
@@ -41,6 +42,19 @@ LOCAL_CFLAGS += \
-DANDROID_VERSION=0x0$(MESA_ANDROID_MAJOR_VERSION)0$(MESA_ANDROID_MINOR_VERSION)
LOCAL_CFLAGS += \
-DHAVE___BUILTIN_EXPECT \
-DHAVE___BUILTIN_FFS \
-DHAVE___BUILTIN_FFSLL \
-DHAVE_FUNC_ATTRIBUTE_FLATTEN \
-DHAVE_FUNC_ATTRIBUTE_UNUSED \
-DHAVE_FUNC_ATTRIBUTE_FORMAT \
-DHAVE_FUNC_ATTRIBUTE_PACKED \
-DHAVE___BUILTIN_CTZ \
-DHAVE___BUILTIN_POPCOUNT \
-DHAVE___BUILTIN_POPCOUNTLL \
-DHAVE___BUILTIN_CLZ \
-DHAVE___BUILTIN_CLZLL \
-DHAVE___BUILTIN_UNREACHABLE \
-DHAVE_PTHREAD=1 \
-fvisibility=hidden \
-Wno-sign-compare
@@ -54,7 +68,16 @@ LOCAL_CFLAGS += \
endif
endif
ifeq ($(MESA_ENABLE_LLVM),true)
LOCAL_CFLAGS += \
-DHAVE_LLVM=0x0305 -DLLVM_VERSION_PATCH=2 \
-D__STDC_CONSTANT_MACROS \
-D__STDC_FORMAT_MACROS \
-D__STDC_LIMIT_MACROS
endif
LOCAL_CPPFLAGS += \
$(if $(filter true,$(MESA_LOLLIPOP_BUILD)),-D_USING_LIBCXX) \
-Wno-error=non-virtual-dtor \
-Wno-non-virtual-dtor

View File

@@ -24,7 +24,7 @@
# BOARD_GPU_DRIVERS should be defined. The valid values are
#
# classic drivers: i915 i965
# gallium drivers: swrast freedreno i915g ilo nouveau r300g r600g radeonsi vmwgfx
# gallium drivers: swrast freedreno i915g ilo nouveau r300g r600g radeonsi vc4 vmwgfx
#
# The main target is libGLES_mesa. For each classic driver enabled, a DRI
# module will also be built. DRI modules will be loaded by libGLES_mesa.
@@ -34,6 +34,13 @@ MESA_TOP := $(call my-dir)
MESA_ANDROID_MAJOR_VERSION := $(word 1, $(subst ., , $(PLATFORM_VERSION)))
MESA_ANDROID_MINOR_VERSION := $(word 2, $(subst ., , $(PLATFORM_VERSION)))
MESA_ANDROID_VERSION := $(MESA_ANDROID_MAJOR_VERSION).$(MESA_ANDROID_MINOR_VERSION)
ifeq ($(filter 1 2 3 4,$(MESA_ANDROID_MAJOR_VERSION)),)
MESA_LOLLIPOP_BUILD := true
else
define local-generated-sources-dir
$(call local-intermediates-dir)
endef
endif
MESA_COMMON_MK := $(MESA_TOP)/Android.common.mk
MESA_PYTHON2 := python
@@ -41,7 +48,7 @@ MESA_PYTHON2 := python
DRM_GRALLOC_TOP := hardware/drm_gralloc
classic_drivers := i915 i965
gallium_drivers := swrast freedreno i915g ilo nouveau r300g r600g radeonsi vmwgfx
gallium_drivers := swrast freedreno i915g ilo nouveau r300g r600g radeonsi vmwgfx vc4
MESA_GPU_DRIVERS := $(strip $(BOARD_GPU_DRIVERS))
@@ -73,6 +80,8 @@ else
MESA_BUILD_GALLIUM := false
endif
MESA_ENABLE_LLVM := $(if $(filter radeonsi,$(MESA_GPU_DRIVERS)),true,false)
# add subdirectories
ifneq ($(strip $(MESA_GPU_DRIVERS)),)
@@ -82,19 +91,14 @@ SUBDIRS := \
src/glsl \
src/mesa \
src/util \
src/egl/main
ifeq ($(strip $(MESA_BUILD_CLASSIC)),true)
SUBDIRS += \
src/egl/main \
src/egl/drivers/dri2 \
src/mesa/drivers/dri
endif
ifeq ($(strip $(MESA_BUILD_GALLIUM)),true)
SUBDIRS += src/gallium
endif
mkfiles := $(patsubst %,$(MESA_TOP)/%/Android.mk,$(SUBDIRS))
include $(mkfiles)
include $(call all-named-subdir-makefiles,$(SUBDIRS))
endif
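The version handling in the Android makefile fragments above (splitting PLATFORM_VERSION, the `filter 1 2 3 4` test, and the `0x0<major>0<minor>` define) can be sketched in plain shell. All function names here are hypothetical and exist only for illustration; the make code remains the authoritative logic.

```shell
# Hypothetical helpers; the names are illustrative, not from the Android build system.
android_major() { echo "${1%%.*}"; }
android_minor() { rest=${1#*.}; echo "${rest%%.*}"; }

# Mirrors: ifeq ($(filter 1 2 3 4,$(MESA_ANDROID_MAJOR_VERSION)),)
# i.e. any major version other than 1-4 is treated as a Lollipop-style build.
is_lollipop_build() {
    case "$(android_major "$1")" in
        1|2|3|4) echo false ;;
        *)       echo true ;;
    esac
}

# Mirrors: -DANDROID_VERSION=0x0$(MAJOR)0$(MINOR)
android_version_define() {
    printf '%s\n' "-DANDROID_VERSION=0x0$(android_major "$1")0$(android_minor "$1")"
}

is_lollipop_build 4.4.2         # prints "false"
is_lollipop_build 5.0           # prints "true"
android_version_define 4.0.4    # prints "-DANDROID_VERSION=0x0400"
```

Note that the encoding pattern assumes single-digit major and minor components, as in the `4.0.x => 0x0400` example in the makefile comment.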

View File

@@ -5,3 +5,12 @@ $(call add-clean-step, rm -rf $(PRODUCT_OUT)/obj/SHARED_LIBRARIES/libGLES_mesa_i
$(call add-clean-step, rm -rf $(OUT_DIR)/host/$(HOST_OS)-$(HOST_ARCH)/obj/EXECUTABLES/mesa_*_intermediates)
$(call add-clean-step, rm -rf $(OUT_DIR)/host/$(HOST_OS)-$(HOST_ARCH)/obj/EXECUTABLES/glsl_compiler_intermediates)
$(call add-clean-step, rm -rf $(OUT_DIR)/host/$(HOST_OS)-$(HOST_ARCH)/obj/STATIC_LIBRARIES/libmesa_glsl_utils_intermediates)
$(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/STATIC_LIBRARIES/libmesa_*_intermediates)
$(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/SHARED_LIBRARIES/i9?5_dri_intermediates)
$(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/SHARED_LIBRARIES/libglapi_intermediates)
$(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/SHARED_LIBRARIES/libGLES_mesa_intermediates)
$(call add-clean-step, rm -rf $(HOST_OUT_release)/*/EXECUTABLES/mesa_*_intermediates)
$(call add-clean-step, rm -rf $(HOST_OUT_release)/*/EXECUTABLES/glsl_compiler_intermediates)
$(call add-clean-step, rm -rf $(HOST_OUT_release)/*/STATIC_LIBRARIES/libmesa_*_intermediates)
$(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/SHARED_LIBRARIES/*_dri_intermediates)

View File

@@ -21,85 +21,41 @@
SUBDIRS = src
AM_DISTCHECK_CONFIGURE_FLAGS = \
--enable-dri3 \
--enable-gallium-tests \
--enable-gbm \
--enable-gles1 \
--enable-gles2 \
--enable-glx-tls \
--enable-va \
--enable-vdpau \
--enable-xa \
--enable-xvmc \
--with-egl-platforms=x11,wayland,drm
ACLOCAL_AMFLAGS = -I m4
doxygen:
cd doxygen && $(MAKE)
EXTRA_DIST = \
autogen.sh \
common.py \
docs \
doxygen \
scons \
SConstruct
.PHONY: doxygen
noinst_HEADERS = \
include/c99_alloca.h \
include/c99_compat.h \
include/c99_math.h \
include/c99 \
include/c11 \
include/D3D9 \
include/HaikuGL \
include/no_extern_c.h \
include/pci_ids
# Rules for making release tarballs
PACKAGE_DIR = Mesa-$(PACKAGE_VERSION)
PACKAGE_NAME = MesaLib-$(PACKAGE_VERSION)
EXTRA_FILES = \
aclocal.m4 \
configure \
bin/ar-lib \
bin/compile \
bin/config.sub \
bin/config.guess \
bin/depcomp \
bin/install-sh \
bin/ltmain.sh \
bin/missing \
bin/ylwrap \
bin/test-driver \
src/glsl/glsl_parser.cpp \
src/glsl/glsl_parser.h \
src/glsl/glsl_lexer.cpp \
src/glsl/glcpp/glcpp-lex.c \
src/glsl/glcpp/glcpp-parse.c \
src/glsl/glcpp/glcpp-parse.h \
src/mesa/program/lex.yy.c \
src/mesa/program/program_parse.tab.c \
src/mesa/program/program_parse.tab.h \
`git ls-files | grep "Makefile.am" | sed -e "s/Makefile.am/Makefile.in/"`
IGNORE_FILES = \
-x autogen.sh
parsers: configure
$(MAKE) -C src/glsl glsl_parser.cpp glsl_parser.h glsl_lexer.cpp glcpp/glcpp-lex.c glcpp/glcpp-parse.c glcpp/glcpp-parse.h
# Everything for a new Mesa release:
ARCHIVES = $(PACKAGE_NAME).tar.gz \
$(PACKAGE_NAME).tar.bz2 \
$(PACKAGE_NAME).zip
tarballs: checksums
rm -f ../$(PACKAGE_DIR) $(PACKAGE_NAME).tar
manifest.txt: .git
( \
ls -1 $(EXTRA_FILES) ; \
git ls-files $(IGNORE_FILES) \
) | sed -e '/^\(.*\/\)\?\./d' -e "s@^@$(PACKAGE_DIR)/@" > $@
../$(PACKAGE_DIR):
ln -s $(PWD) $@
$(PACKAGE_NAME).tar: parsers ../$(PACKAGE_DIR) manifest.txt
cd .. ; tar -cf $(PACKAGE_DIR)/$(PACKAGE_NAME).tar -T $(PACKAGE_DIR)/manifest.txt
$(PACKAGE_NAME).tar.gz: $(PACKAGE_NAME).tar ../$(PACKAGE_DIR)
gzip --stdout --best $(PACKAGE_NAME).tar > $(PACKAGE_NAME).tar.gz
$(PACKAGE_NAME).tar.bz2: $(PACKAGE_NAME).tar
bzip2 --stdout --best $(PACKAGE_NAME).tar > $(PACKAGE_NAME).tar.bz2
$(PACKAGE_NAME).zip: parsers ../$(PACKAGE_DIR) manifest.txt
rm -f $(PACKAGE_NAME).zip ; \
cd .. ; \
zip -q -@ $(PACKAGE_NAME).zip < $(PACKAGE_DIR)/manifest.txt ; \
mv $(PACKAGE_NAME).zip $(PACKAGE_DIR)
checksums: $(ARCHIVES)
@-sha256sum $(PACKAGE_NAME).tar.gz
@-sha256sum $(PACKAGE_NAME).tar.bz2
@-sha256sum $(PACKAGE_NAME).zip
.PHONY: tarballs checksums
# We list some directories in EXTRA_DIST, but don't actually want to include
# the .gitignore files in the tarball.
dist-hook:
find $(distdir) -name .gitignore -exec $(RM) {} +
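The effect of the dist-hook rule can be tried outside of make. The following is a minimal sketch that assumes a throwaway directory stands in for `$(distdir)` and that `$(RM)` expands to `rm -f` (GNU make's default); the file names are made up for the demonstration.

```shell
# Stand-in for $(distdir): a scratch tree with two .gitignore files and one real file.
distdir=$(mktemp -d)
mkdir -p "$distdir/docs/sub"
touch "$distdir/docs/.gitignore" "$distdir/docs/sub/.gitignore" "$distdir/docs/index.html"

# Same shape as the dist-hook rule: delete every .gitignore under the tree.
find "$distdir" -name .gitignore -exec rm -f {} +

# Count what is left and check that ordinary files survived.
leftover=$(find "$distdir" -name .gitignore | wc -l | tr -d ' ')
if [ -f "$distdir/docs/index.html" ]; then kept=yes; else kept=no; fi
echo "$leftover $kept"          # prints "0 yes"

rm -rf "$distdir"
```

The `{} +` form batches many paths into one `rm` invocation instead of spawning one process per file.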

View File

@@ -1 +1 @@
10.4.0-devel
10.7.0-devel

View File

@@ -6,8 +6,8 @@ test -z "$srcdir" && srcdir=.
ORIGDIR=`pwd`
cd "$srcdir"
autoreconf -v --install || exit 1
cd $ORIGDIR || exit $?
autoreconf --force --verbose --install || exit 1
cd "$ORIGDIR" || exit $?
if test -z "$NOCONFIGURE"; then
"$srcdir"/configure "$@"

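The switch from `cd $ORIGDIR` to `cd "$ORIGDIR"` matters when the path contains whitespace. A small sketch of the underlying word splitting, assuming a POSIX shell with the default IFS; the path and the `count_words` helper are hypothetical:

```shell
# Hypothetical helper: print how many arguments it received.
count_words() { echo $#; }

ORIGDIR="/tmp/my build dir"     # made-up path containing spaces

count_words $ORIGDIR            # unquoted: split into 3 words, prints "3"
count_words "$ORIGDIR"          # quoted: passed as a single word, prints "1"
```

With the unquoted form, `cd` would receive several arguments instead of one directory name.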
View File

@@ -26,28 +26,28 @@ else:
target_platform = host_platform
_machine_map = {
'x86': 'x86',
'i386': 'x86',
'i486': 'x86',
'i586': 'x86',
'i686': 'x86',
'BePC': 'x86',
'Intel': 'x86',
'ppc' : 'ppc',
'BeBox': 'ppc',
'BeMac': 'ppc',
'AMD64': 'x86_64',
'x86_64': 'x86_64',
'sparc': 'sparc',
'sun4u': 'sparc',
'x86': 'x86',
'i386': 'x86',
'i486': 'x86',
'i586': 'x86',
'i686': 'x86',
'BePC': 'x86',
'Intel': 'x86',
'ppc': 'ppc',
'BeBox': 'ppc',
'BeMac': 'ppc',
'AMD64': 'x86_64',
'x86_64': 'x86_64',
'sparc': 'sparc',
'sun4u': 'sparc',
}
# find host_machine value
if 'PROCESSOR_ARCHITECTURE' in os.environ:
host_machine = os.environ['PROCESSOR_ARCHITECTURE']
host_machine = os.environ['PROCESSOR_ARCHITECTURE']
else:
host_machine = _platform.machine()
host_machine = _platform.machine()
host_machine = _machine_map.get(host_machine, 'generic')
default_machine = host_machine
@@ -65,7 +65,8 @@ else:
default_llvm = 'no'
try:
if target_platform != 'windows' and \
subprocess.call(['llvm-config', '--version'], stdout=subprocess.PIPE) == 0:
subprocess.call(['llvm-config', '--version'],
stdout=subprocess.PIPE) == 0:
default_llvm = 'yes'
except:
pass
@@ -75,30 +76,38 @@ else:
# Common options
def AddOptions(opts):
try:
from SCons.Variables.BoolVariable import BoolVariable as BoolOption
except ImportError:
from SCons.Options.BoolOption import BoolOption
try:
from SCons.Variables.EnumVariable import EnumVariable as EnumOption
except ImportError:
from SCons.Options.EnumOption import EnumOption
opts.Add(EnumOption('build', 'build type', 'debug',
allowed_values=('debug', 'checked', 'profile', 'release')))
opts.Add(BoolOption('verbose', 'verbose output', 'no'))
opts.Add(EnumOption('machine', 'use machine-specific assembly code', default_machine,
allowed_values=('generic', 'ppc', 'x86', 'x86_64')))
opts.Add(EnumOption('platform', 'target platform', host_platform,
allowed_values=('cygwin', 'darwin', 'freebsd', 'haiku', 'linux', 'sunos', 'windows')))
opts.Add(BoolOption('embedded', 'embedded build', 'no'))
opts.Add(BoolOption('analyze', 'enable static code analysis where available', 'no'))
opts.Add('toolchain', 'compiler toolchain', default_toolchain)
opts.Add(BoolOption('gles', 'EXPERIMENTAL: enable OpenGL ES support', 'no'))
opts.Add(BoolOption('llvm', 'use LLVM', default_llvm))
opts.Add(BoolOption('openmp', 'EXPERIMENTAL: compile with openmp (swrast)', 'no'))
opts.Add(BoolOption('debug', 'DEPRECATED: debug build', 'yes'))
opts.Add(BoolOption('profile', 'DEPRECATED: profile build', 'no'))
opts.Add(BoolOption('quiet', 'DEPRECATED: profile build', 'yes'))
opts.Add(BoolOption('texture_float', 'enable floating-point textures and renderbuffers', 'no'))
if host_platform == 'windows':
opts.Add('MSVC_VERSION', 'Microsoft Visual C/C++ version')
try:
from SCons.Variables.BoolVariable import BoolVariable as BoolOption
except ImportError:
from SCons.Options.BoolOption import BoolOption
try:
from SCons.Variables.EnumVariable import EnumVariable as EnumOption
except ImportError:
from SCons.Options.EnumOption import EnumOption
opts.Add(EnumOption('build', 'build type', 'debug',
allowed_values=('debug', 'checked', 'profile',
'release')))
opts.Add(BoolOption('verbose', 'verbose output', 'no'))
opts.Add(EnumOption('machine', 'use machine-specific assembly code',
default_machine,
allowed_values=('generic', 'ppc', 'x86', 'x86_64')))
opts.Add(EnumOption('platform', 'target platform', host_platform,
allowed_values=('cygwin', 'darwin', 'freebsd', 'haiku',
'linux', 'sunos', 'windows')))
opts.Add(BoolOption('embedded', 'embedded build', 'no'))
opts.Add(BoolOption('analyze',
'enable static code analysis where available', 'no'))
opts.Add('toolchain', 'compiler toolchain', default_toolchain)
opts.Add(BoolOption('gles', 'EXPERIMENTAL: enable OpenGL ES support',
'no'))
opts.Add(BoolOption('llvm', 'use LLVM', default_llvm))
opts.Add(BoolOption('openmp', 'EXPERIMENTAL: compile with openmp (swrast)',
'no'))
opts.Add(BoolOption('debug', 'DEPRECATED: debug build', 'yes'))
opts.Add(BoolOption('profile', 'DEPRECATED: profile build', 'no'))
opts.Add(BoolOption('quiet', 'DEPRECATED: profile build', 'yes'))
opts.Add(BoolOption('texture_float',
'enable floating-point textures and renderbuffers',
'no'))
if host_platform == 'windows':
opts.Add('MSVC_VERSION', 'Microsoft Visual C/C++ version')

View File

@@ -1,3 +1,35 @@
dnl Copyright © 2011-2014 Intel Corporation
dnl Copyright © 2011-2014 Emil Velikov <emil.l.velikov@gmail.com>
dnl Copyright © 2007-2010 Dan Nicholson
dnl Copyright © 2010-2014 Marek Olšák <maraeo@gmail.com>
dnl Copyright © 2010-2014 Christian König
dnl Copyright © 2012-2014 Tom Stellard <tstellar@gmail.com>
dnl Copyright © 2009-2012 Jakob Bornecrantz
dnl Copyright © 2009-2014 Jon TURNEY
dnl Copyright © 2011-2012 Benjamin Franzke
dnl Copyright © 2008-2014 David Airlie
dnl Copyright © 2009-2013 Brian Paul
dnl Copyright © 2003-2007 Keith Packard, Daniel Stone
dnl
dnl Permission is hereby granted, free of charge, to any person obtaining a
dnl copy of this software and associated documentation files (the "Software"),
dnl to deal in the Software without restriction, including without limitation
dnl the rights to use, copy, modify, merge, publish, distribute, sublicense,
dnl and/or sell copies of the Software, and to permit persons to whom the
dnl Software is furnished to do so, subject to the following conditions:
dnl
dnl The above copyright notice and this permission notice (including the next
dnl paragraph) shall be included in all copies or substantial portions of the
dnl Software.
dnl
dnl THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
dnl IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
dnl FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
dnl THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
dnl LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
dnl FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
dnl DEALINGS IN THE SOFTWARE.
dnl
dnl Process this file with autoconf to create configure.
AC_PREREQ([2.60])
@@ -12,7 +44,14 @@ AC_INIT([Mesa], [MESA_VERSION],
AC_CONFIG_AUX_DIR([bin])
AC_CONFIG_MACRO_DIR([m4])
AC_CANONICAL_SYSTEM
AM_INIT_AUTOMAKE([foreign])
AM_INIT_AUTOMAKE([foreign tar-ustar dist-xz])
dnl We only support native Windows builds (MinGW/MSVC) through SCons.
case "$host_os" in
mingw*)
AC_MSG_ERROR([MinGW build not supported through autoconf/automake, use SCons instead])
;;
esac
# Support silent build rules, requires at least automake-1.11. Disable
# by either passing --disable-silent-rules to configure or passing V=1
@@ -29,7 +68,7 @@ AC_SUBST([OSMESA_VERSION])
dnl Versions for external dependencies
LIBDRM_REQUIRED=2.4.38
LIBDRM_RADEON_REQUIRED=2.4.56
LIBDRM_INTEL_REQUIRED=2.4.52
LIBDRM_INTEL_REQUIRED=2.4.60
LIBDRM_NVVIEUX_REQUIRED=2.4.33
LIBDRM_NOUVEAU_REQUIRED="2.4.33 libdrm >= 2.4.41"
LIBDRM_FREEDRENO_REQUIRED=2.4.57
@@ -39,6 +78,7 @@ PRESENTPROTO_REQUIRED=1.0
LIBUDEV_REQUIRED=151
GLPROTO_REQUIRED=1.4.14
LIBOMXIL_BELLAGIO_REQUIRED=0.0
LIBVA_REQUIRED=0.35.0
VDPAU_REQUIRED=0.4.1
WAYLAND_REQUIRED=1.2.0
XCB_REQUIRED=1.9.3
@@ -46,6 +86,7 @@ XCBDRI2_REQUIRED=1.8
XCBGLX_REQUIRED=1.8.1
XSHMFENCE_REQUIRED=1.1
XVMC_REQUIRED=1.0.6
PYTHON_MAKO_REQUIRED=0.3.4
dnl Check for progs
AC_PROG_CPP
@@ -72,7 +113,27 @@ AX_PROG_FLEX([],
AC_CHECK_PROG(INDENT, indent, indent, cat)
if test "x$INDENT" != "xcat"; then
AC_SUBST(INDENT_FLAGS, '-i4 -nut -br -brs -npcs -ce -TGLubyte -TGLbyte -TBool')
# Only GNU indent is supported
INDENT_VERSION=`indent --version | grep GNU`
if test $? -eq 0; then
AC_SUBST(INDENT_FLAGS, '-i4 -nut -br -brs -npcs -ce -TGLubyte -TGLbyte -TBool')
else
INDENT="cat"
fi
fi
AX_CHECK_PYTHON_MAKO_MODULE($PYTHON_MAKO_REQUIRED)
if test -z "$PYTHON2"; then
if test ! -f "$srcdir/src/util/format_srgb.c"; then
AC_MSG_ERROR([Python not found - unable to generate sources])
fi
else
if test "x$acv_mako_found" = xno; then
if test ! -f "$srcdir/src/mesa/main/format_unpack.c"; then
AC_MSG_ERROR([Python mako module v$PYTHON_MAKO_REQUIRED or higher not found])
fi
fi
fi
AC_PROG_INSTALL
@@ -101,9 +162,10 @@ AC_COMPILE_IFELSE(
AC_MSG_RESULT([$acv_mesa_CLANG])
dnl If we're using GCC, make sure that it is at least version 3.3.0. Older
dnl If we're using GCC, make sure that it is at least version 4.2.0. Older
dnl versions are explicitly not supported.
GEN_ASM_OFFSETS=no
USE_GNU99=no
if test "x$GCC" = xyes -a "x$acv_mesa_CLANG" = xno; then
AC_MSG_CHECKING([whether gcc version is sufficient])
major=0
@@ -115,13 +177,16 @@ if test "x$GCC" = xyes -a "x$acv_mesa_CLANG" = xno; then
GCC_VERSION_MINOR=`echo $GCC_VERSION | cut -d. -f2`
fi
if test $GCC_VERSION_MAJOR -lt 3 -o $GCC_VERSION_MAJOR -eq 3 -a $GCC_VERSION_MINOR -lt 3 ; then
if test $GCC_VERSION_MAJOR -lt 4 -o $GCC_VERSION_MAJOR -eq 4 -a $GCC_VERSION_MINOR -lt 2 ; then
AC_MSG_RESULT([no])
AC_MSG_ERROR([If using GCC, version 3.3.0 or later is required.])
AC_MSG_ERROR([If using GCC, version 4.2.0 or later is required.])
else
AC_MSG_RESULT([yes])
fi
if test $GCC_VERSION_MAJOR -lt 4 -o $GCC_VERSION_MAJOR -eq 4 -a $GCC_VERSION_MINOR -lt 6 ; then
USE_GNU99=yes
fi
if test "x$cross_compiling" = xyes; then
GEN_ASM_OFFSETS=yes
fi
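The two version comparisons in this hunk (reject GCC older than 4.2, select gnu99 for GCC older than 4.6) follow the same major/minor pattern. A sketch in plain shell, with hypothetical helper names and a made-up "too-old" label standing in for the AC_MSG_ERROR path:

```shell
# Hypothetical helper: succeed when MAJOR.MINOR ($1.$2) is older than $3.$4.
version_lt() {
    [ "$1" -lt "$3" ] || { [ "$1" -eq "$3" ] && [ "$2" -lt "$4" ]; }
}

# Hypothetical helper: classify a GCC version the way the checks above do.
classify_gcc() {
    if version_lt "$1" "$2" 4 2; then
        printf '%s\n' "too-old"        # configure would stop with an error here
    elif version_lt "$1" "$2" 4 6; then
        printf '%s\n' "-std=gnu99"     # USE_GNU99=yes
    else
        printf '%s\n' "-std=c99"
    fi
}

classify_gcc 4 1    # prints "too-old"
classify_gcc 4 4    # prints "-std=gnu99"
classify_gcc 4 9    # prints "-std=c99"
```

In configure.ac these checks only run when the compiler is GCC and not Clang; the sketch ignores that condition.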
@@ -144,6 +209,7 @@ AX_GCC_FUNC_ATTRIBUTE([flatten])
AX_GCC_FUNC_ATTRIBUTE([format])
AX_GCC_FUNC_ATTRIBUTE([malloc])
AX_GCC_FUNC_ATTRIBUTE([packed])
AX_GCC_FUNC_ATTRIBUTE([unused])
AM_CONDITIONAL([GEN_ASM_OFFSETS], test "x$GEN_ASM_OFFSETS" = xyes)
@@ -164,7 +230,7 @@ _SAVE_LDFLAGS="$LDFLAGS"
_SAVE_CPPFLAGS="$CPPFLAGS"
dnl Compiler macros
DEFINES="-DUSE_EXTERNAL_DXTN_LIB=1"
DEFINES=""
AC_SUBST([DEFINES])
case "$host_os" in
linux*|*-gnu*|gnu*)
@@ -180,7 +246,13 @@ esac
dnl Add flags for gcc and g++
if test "x$GCC" = xyes; then
CFLAGS="$CFLAGS -Wall -std=c99"
CFLAGS="$CFLAGS -Wall"
if test "x$USE_GNU99" = xyes; then
CFLAGS="$CFLAGS -std=gnu99"
else
CFLAGS="$CFLAGS -std=c99"
fi
# Enable -Werror=implicit-function-declaration and
# -Werror=missing-prototypes, if available, or otherwise, just
@@ -212,6 +284,30 @@ if test "x$GCC" = xyes; then
# gcc's builtin memcmp is slower than glibc's
# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
CFLAGS="$CFLAGS -fno-builtin-memcmp"
# Flags to help ensure that certain portions of the code -- and only those
# portions -- can be built with MSVC:
# - src/util, src/gallium/auxiliary, and src/gallium/drivers/llvmpipe need
# to build with Windows SDK 7.0.7600, which bundles MSVC 2008
# - non-Linux/Posix OpenGL portions need to build on MSVC 2013 (which
# supports most of C99)
# - the rest has no compiler restrictions
MSVC2013_COMPAT_CFLAGS="-Werror=pointer-arith"
MSVC2013_COMPAT_CXXFLAGS="-Werror=pointer-arith"
# Enable -Werror=vla if compiler supports it
save_CFLAGS="$CFLAGS"
AC_MSG_CHECKING([whether $CC supports -Werror=vla])
CFLAGS="$CFLAGS -Werror=vla"
AC_LINK_IFELSE([AC_LANG_PROGRAM()],
[MSVC2013_COMPAT_CFLAGS="$MSVC2013_COMPAT_CFLAGS -Werror=vla";
MSVC2013_COMPAT_CXXFLAGS="$MSVC2013_COMPAT_CXXFLAGS -Werror=vla";
AC_MSG_RESULT([yes])],
AC_MSG_RESULT([no]));
CFLAGS="$save_CFLAGS"
MSVC2008_COMPAT_CFLAGS="$MSVC2013_COMPAT_CFLAGS -Werror=declaration-after-statement"
MSVC2008_COMPAT_CXXFLAGS="$MSVC2013_COMPAT_CXXFLAGS"
fi
if test "x$GXX" = xyes; then
CXXFLAGS="$CXXFLAGS -Wall"
@@ -237,6 +333,11 @@ if test "x$GXX" = xyes; then
CXXFLAGS="$CXXFLAGS -fno-builtin-memcmp"
fi
AC_SUBST([MSVC2013_COMPAT_CFLAGS])
AC_SUBST([MSVC2013_COMPAT_CXXFLAGS])
AC_SUBST([MSVC2008_COMPAT_CFLAGS])
AC_SUBST([MSVC2008_COMPAT_CXXFLAGS])
dnl even if the compiler appears to support it, using visibility attributes isn't
dnl going to do anything useful currently on cygwin apart from emit lots of warnings
case "$host_os" in
@@ -252,11 +353,29 @@ AC_SUBST([VISIBILITY_CXXFLAGS])
dnl
dnl Optional flags, check for compiler support
dnl
AX_CHECK_COMPILE_FLAG([-msse4.1], [SSE41_SUPPORTED=1], [SSE41_SUPPORTED=0])
SSE41_CFLAGS="-msse4.1"
dnl Code compiled by GCC with -msse* assumes a 16 byte aligned
dnl stack, but on x86-32 such alignment is not guaranteed.
case "$target_cpu" in
i?86)
SSE41_CFLAGS="$SSE41_CFLAGS -mstackrealign"
;;
esac
save_CFLAGS="$CFLAGS"
CFLAGS="$SSE41_CFLAGS $CFLAGS"
AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
#include <smmintrin.h>
int main () {
__m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
c = _mm_max_epu32(a, b);
return 0;
}]])], SSE41_SUPPORTED=1)
CFLAGS="$save_CFLAGS"
if test "x$SSE41_SUPPORTED" = x1; then
DEFINES="$DEFINES -DUSE_SSE41"
fi
AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS)
dnl Can't have static and shared libraries, default to static if user
dnl explicitly requested. If both disabled, set to static since shared
@@ -301,6 +420,8 @@ if test "x$enable_debug" = xyes; then
CXXFLAGS="$CXXFLAGS -O0"
fi
fi
else
DEFINES="$DEFINES -DNDEBUG"
fi
dnl
@@ -528,6 +649,7 @@ if test "x$enable_asm" = xyes; then
fi
AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
dnl Check to see if dlopen is in default libraries (like Solaris, which
@@ -544,7 +666,7 @@ AC_CHECK_FUNCS([dladdr])
LIBS="$save_LIBS"
case "$host_os" in
darwin*|mingw*)
darwin*)
;;
*)
AC_CHECK_FUNCS([clock_gettime], [CLOCK_LIB=],
@@ -558,13 +680,10 @@ dnl See if posix_memalign is available
AC_CHECK_FUNC([posix_memalign], [DEFINES="$DEFINES -DHAVE_POSIX_MEMALIGN"])
dnl Check for pthreads
case "$host_os" in
mingw*)
;;
*)
AX_PTHREAD
;;
esac
AX_PTHREAD
if test "x$ax_pthread_ok" = xno; then
AC_MSG_ERROR([Building mesa on this platform requires pthreads])
fi
dnl AX_PTHREADS leaves PTHREAD_LIBS empty for gcc and sets PTHREAD_CFLAGS
dnl to -pthread, which causes problems if we need -lpthread to appear in
dnl pkgconfig files.
@@ -595,20 +714,15 @@ AC_ARG_ENABLE([opengl],
[enable_opengl="$enableval"],
[enable_opengl=yes])
AC_ARG_ENABLE([gles1],
[AS_HELP_STRING([--enable-gles1],
[enable support for OpenGL ES 1.x API @<:@default=disabled@:>@])],
[AS_HELP_STRING([--disable-gles1],
[disable support for OpenGL ES 1.x API @<:@default=enabled@:>@])],
[enable_gles1="$enableval"],
[enable_gles1=no])
[enable_gles1=yes])
AC_ARG_ENABLE([gles2],
[AS_HELP_STRING([--enable-gles2],
[enable support for OpenGL ES 2.x API @<:@default=disabled@:>@])],
[AS_HELP_STRING([--disable-gles2],
[disable support for OpenGL ES 2.x API @<:@default=enabled@:>@])],
[enable_gles2="$enableval"],
[enable_gles2=no])
AC_ARG_ENABLE([openvg],
[AS_HELP_STRING([--enable-openvg],
[enable support for OpenVG API @<:@default=disabled@:>@])],
[enable_openvg="$enableval"],
[enable_openvg=no])
[enable_gles2=yes])
AC_ARG_ENABLE([dri],
[AS_HELP_STRING([--enable-dri],
@@ -660,6 +774,11 @@ AC_ARG_ENABLE([gbm],
[enable gbm library @<:@default=auto@:>@])],
[enable_gbm="$enableval"],
[enable_gbm=auto])
AC_ARG_ENABLE([nine],
[AS_HELP_STRING([--enable-nine],
[enable build of the nine Direct3D9 API @<:@default=no@:>@])],
[enable_nine="$enableval"],
[enable_nine=no])
AC_ARG_ENABLE([xvmc],
[AS_HELP_STRING([--enable-xvmc],
@@ -733,7 +852,7 @@ esac
if test "x$enable_opengl" = xno -a \
"x$enable_gles1" = xno -a \
"x$enable_gles2" = xno -a \
"x$enable_openvg" = xno -a \
"x$enable_nine" = xno -a \
"x$enable_xa" = xno -a \
"x$enable_xvmc" = xno -a \
"x$enable_vdpau" = xno -a \
@@ -795,7 +914,7 @@ AM_CONDITIONAL(HAVE_DRI_GLX, test "x$enable_glx" = xyes -a \
case "$host_os" in
darwin*)
dri_platform='apple' ;;
gnu*|mingw*|cygwin*)
gnu*|cygwin*)
dri_platform='none' ;;
*)
dri_platform='drm' ;;
@@ -822,12 +941,6 @@ x*yes*yes*)
;;
esac
# Building Xlib-GLX requires shared glapi to be disabled.
if test "x$enable_xlib_glx" = xyes; then
AC_MSG_NOTICE([Shared GLAPI should not used with Xlib-GLX, disabling])
enable_shared_glapi=no
fi
AM_CONDITIONAL(HAVE_SHARED_GLAPI, test "x$enable_shared_glapi" = xyes)
# Build the pipe-drivers as separate libraries/modules.
@@ -860,6 +973,144 @@ fi
AC_SUBST([MESA_LLVM])
# SHA1 hashing
AC_ARG_WITH([sha1],
[AS_HELP_STRING([--with-sha1=libc|libmd|libnettle|libgcrypt|libcrypto|libsha1|CommonCrypto|CryptoAPI],
[choose SHA1 implementation])])
case "x$with_sha1" in
x | xlibc | xlibmd | xlibnettle | xlibgcrypt | xlibcrypto | xlibsha1 | xCommonCrypto | xCryptoAPI)
;;
*)
AC_MSG_ERROR([Illegal value for --with-sha1: $with_sha1])
esac
AC_CHECK_FUNC([SHA1Init], [HAVE_SHA1_IN_LIBC=yes])
if test "x$with_sha1" = x && test "x$HAVE_SHA1_IN_LIBC" = xyes; then
with_sha1=libc
fi
if test "x$with_sha1" = xlibc && test "x$HAVE_SHA1_IN_LIBC" != xyes; then
AC_MSG_ERROR([sha1 in libc requested but not found])
fi
if test "x$with_sha1" = xlibc; then
AC_DEFINE([HAVE_SHA1_IN_LIBC], [1],
[Use libc SHA1 functions])
SHA1_LIBS=""
fi
AC_CHECK_FUNC([CC_SHA1_Init], [HAVE_SHA1_IN_COMMONCRYPTO=yes])
if test "x$with_sha1" = x && test "x$HAVE_SHA1_IN_COMMONCRYPTO" = xyes; then
with_sha1=CommonCrypto
fi
if test "x$with_sha1" = xCommonCrypto && test "x$HAVE_SHA1_IN_COMMONCRYPTO" != xyes; then
AC_MSG_ERROR([CommonCrypto requested but not found])
fi
if test "x$with_sha1" = xCommonCrypto; then
AC_DEFINE([HAVE_SHA1_IN_COMMONCRYPTO], [1],
[Use CommonCrypto SHA1 functions])
SHA1_LIBS=""
fi
dnl stdcall functions cannot be tested with AC_CHECK_LIB
AC_CHECK_HEADER([wincrypt.h], [HAVE_SHA1_IN_CRYPTOAPI=yes], [], [#include <windows.h>])
if test "x$with_sha1" = x && test "x$HAVE_SHA1_IN_CRYPTOAPI" = xyes; then
with_sha1=CryptoAPI
fi
if test "x$with_sha1" = xCryptoAPI && test "x$HAVE_SHA1_IN_CRYPTOAPI" != xyes; then
AC_MSG_ERROR([CryptoAPI requested but not found])
fi
if test "x$with_sha1" = xCryptoAPI; then
AC_DEFINE([HAVE_SHA1_IN_CRYPTOAPI], [1],
[Use CryptoAPI SHA1 functions])
SHA1_LIBS=""
fi
AC_CHECK_LIB([md], [SHA1Init], [HAVE_LIBMD=yes])
if test "x$with_sha1" = x && test "x$HAVE_LIBMD" = xyes; then
with_sha1=libmd
fi
if test "x$with_sha1" = xlibmd && test "x$HAVE_LIBMD" != xyes; then
AC_MSG_ERROR([libmd requested but not found])
fi
if test "x$with_sha1" = xlibmd; then
AC_DEFINE([HAVE_SHA1_IN_LIBMD], [1],
[Use libmd SHA1 functions])
SHA1_LIBS=-lmd
fi
PKG_CHECK_MODULES([LIBSHA1], [libsha1], [HAVE_LIBSHA1=yes], [HAVE_LIBSHA1=no])
if test "x$with_sha1" = x && test "x$HAVE_LIBSHA1" = xyes; then
with_sha1=libsha1
fi
if test "x$with_sha1" = xlibsha1 && test "x$HAVE_LIBSHA1" != xyes; then
AC_MSG_ERROR([libsha1 requested but not found])
fi
if test "x$with_sha1" = xlibsha1; then
AC_DEFINE([HAVE_SHA1_IN_LIBSHA1], [1],
[Use libsha1 for SHA1])
SHA1_LIBS=-lsha1
fi
AC_CHECK_LIB([nettle], [nettle_sha1_init], [HAVE_LIBNETTLE=yes])
if test "x$with_sha1" = x && test "x$HAVE_LIBNETTLE" = xyes; then
with_sha1=libnettle
fi
if test "x$with_sha1" = xlibnettle && test "x$HAVE_LIBNETTLE" != xyes; then
AC_MSG_ERROR([libnettle requested but not found])
fi
if test "x$with_sha1" = xlibnettle; then
AC_DEFINE([HAVE_SHA1_IN_LIBNETTLE], [1],
[Use libnettle SHA1 functions])
SHA1_LIBS=-lnettle
fi
AC_CHECK_LIB([gcrypt], [gcry_md_open], [HAVE_LIBGCRYPT=yes])
if test "x$with_sha1" = x && test "x$HAVE_LIBGCRYPT" = xyes; then
with_sha1=libgcrypt
fi
if test "x$with_sha1" = xlibgcrypt && test "x$HAVE_LIBGCRYPT" != xyes; then
AC_MSG_ERROR([libgcrypt requested but not found])
fi
if test "x$with_sha1" = xlibgcrypt; then
AC_DEFINE([HAVE_SHA1_IN_LIBGCRYPT], [1],
[Use libgcrypt SHA1 functions])
SHA1_LIBS=-lgcrypt
fi
# We don't need all of the OpenSSL libraries, just libcrypto
AC_CHECK_LIB([crypto], [SHA1_Init], [HAVE_LIBCRYPTO=yes])
PKG_CHECK_MODULES([OPENSSL], [openssl], [HAVE_OPENSSL_PKC=yes],
[HAVE_OPENSSL_PKC=no])
if test "x$HAVE_LIBCRYPTO" = xyes || test "x$HAVE_OPENSSL_PKC" = xyes; then
if test "x$with_sha1" = x; then
with_sha1=libcrypto
fi
else
if test "x$with_sha1" = xlibcrypto; then
AC_MSG_ERROR([OpenSSL libcrypto requested but not found])
fi
fi
if test "x$with_sha1" = xlibcrypto; then
if test "x$HAVE_LIBCRYPTO" = xyes; then
SHA1_LIBS=-lcrypto
else
SHA1_LIBS="$OPENSSL_LIBS"
SHA1_CFLAGS="$OPENSSL_CFLAGS"
fi
fi
AC_MSG_CHECKING([for SHA1 implementation])
AC_MSG_RESULT([$with_sha1])
AC_SUBST(SHA1_LIBS)
AC_SUBST(SHA1_CFLAGS)
# Allow user to configure out the shader-cache feature
AC_ARG_ENABLE([shader-cache],
AS_HELP_STRING([--disable-shader-cache], [Disable binary shader cache]),
[enable_shader_cache="$enableval"],
[if test "x$with_sha1" != "x"; then
enable_shader_cache=yes
else
enable_shader_cache=no
fi])
if test "x$with_sha1" = "x"; then
if test "x$enable_shader_cache" = "xyes"; then
AC_MSG_ERROR([Cannot enable shader cache (no SHA-1 implementation found)])
fi
fi
AM_CONDITIONAL([ENABLE_SHADER_CACHE], [test x$enable_shader_cache = xyes])
# Check for libdrm
PKG_CHECK_MODULES([LIBDRM], [libdrm >= $LIBDRM_REQUIRED],
[have_libdrm=yes], [have_libdrm=no])
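The SHA1 selection earlier in this hunk boils down to: honour an explicit --with-sha1 if that implementation was found, otherwise take the first implementation found in probe order, and enable the shader cache by default only when something was selected. A simplified sketch follows; `pick_sha1` and `shader_cache_default` are hypothetical names, the list of available implementations is passed in by hand instead of being probed, and the "Illegal value" check is left out.

```shell
# Hypothetical helper. $1 = value of --with-sha1 (may be empty),
# remaining arguments = implementations that were detected.
pick_sha1() {
    requested=$1; shift
    available=" $* "
    if [ -n "$requested" ]; then
        case "$available" in
            *" $requested "*) echo "$requested"; return 0 ;;
            *) echo "error: $requested requested but not found" >&2; return 1 ;;
        esac
    fi
    # Same order as the checks in the hunk above.
    for impl in libc CommonCrypto CryptoAPI libmd libsha1 libnettle libgcrypt libcrypto; do
        case "$available" in
            *" $impl "*) echo "$impl"; return 0 ;;
        esac
    done
    echo ""                         # nothing found: with_sha1 stays empty
}

# Hypothetical helper: default for --enable-shader-cache given the selection.
shader_cache_default() {
    if [ -n "$1" ]; then echo yes; else echo no; fi
}

pick_sha1 "" libcrypto libnettle              # prints "libnettle"
pick_sha1 libcrypto libnettle libcrypto       # prints "libcrypto"
shader_cache_default "$(pick_sha1 "")"        # prints "no"
```

The first call shows that probe order, not the order in which libraries happen to be listed, decides the default.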
@@ -902,6 +1153,10 @@ AC_ARG_ENABLE([driglx-direct],
[driglx_direct="$enableval"],
[driglx_direct="yes"])
# Check for libcaca
PKG_CHECK_EXISTS([caca], [have_libcaca=yes], [have_libcaca=no])
AM_CONDITIONAL([HAVE_LIBCACA], [test x$have_libcaca = xyes])
dnl
dnl libGL configuration per driver
dnl
@@ -1256,7 +1511,6 @@ if test "x$enable_gbm" = xyes; then
fi
if test "x$enable_dri" = xyes; then
GBM_BACKEND_DIRS="$GBM_BACKEND_DIRS dri"
if test "x$enable_shared_glapi" = xno; then
AC_MSG_ERROR([gbm_dri requires --enable-shared-glapi])
fi
@@ -1279,6 +1533,8 @@ GBM_PC_LIB_PRIV="$DLOPEN_LIBS"
AC_SUBST([GBM_PC_REQ_PRIV])
AC_SUBST([GBM_PC_LIB_PRIV])
AM_CONDITIONAL(HAVE_VULKAN, true)
dnl
dnl EGL configuration
dnl
@@ -1291,8 +1547,15 @@ if test "x$enable_egl" = xyes; then
if test "$enable_static" != yes; then
if test "x$enable_dri" = xyes; then
HAVE_EGL_DRIVER_DRI2=1
fi
HAVE_EGL_DRIVER_DRI2=1
if test "x$enable_shared_glapi" = xno; then
AC_MSG_ERROR([egl_dri2 requires --enable-shared-glapi])
fi
else
# Avoid building an "empty" libEGL. Drop/update this
# when other backends (haiku?) come along.
AC_MSG_ERROR([egl requires --enable-dri])
fi
fi
fi
@@ -1315,93 +1578,91 @@ if test "x$enable_xa" = xyes; then
fi
AM_CONDITIONAL(HAVE_ST_XA, test "x$enable_xa" = xyes)
dnl
dnl OpenVG configuration
dnl
VG_LIB_DEPS=""
if test "x$enable_openvg" = xyes; then
if test "x$enable_egl" = xno; then
AC_MSG_ERROR([cannot enable OpenVG without EGL])
fi
if test -z "$with_gallium_drivers"; then
AC_MSG_ERROR([cannot enable OpenVG without Gallium])
fi
AC_MSG_ERROR([Cannot enable OpenVG, because egl_gallium has been removed and
OpenVG hasn't been integrated into standard libEGL yet])
EGL_CLIENT_APIS="$EGL_CLIENT_APIS "'$(VG_LIB)'
VG_LIB_DEPS="$VG_LIB_DEPS $SELINUX_LIBS $PTHREAD_LIBS"
VG_PC_LIB_PRIV="-lm $CLOCK_LIB $PTHREAD_LIBS $DLOPEN_LIBS"
AC_SUBST([VG_PC_LIB_PRIV])
fi
AM_CONDITIONAL(HAVE_OPENVG, test "x$enable_openvg" = xyes)
dnl
dnl Gallium G3DVL configuration
dnl
if test -n "$with_gallium_drivers" -a "x$with_gallium_drivers" != xswrast; then
if test "x$enable_xvmc" = xauto; then
PKG_CHECK_EXISTS([xvmc], [enable_xvmc=yes], [enable_xvmc=no])
PKG_CHECK_EXISTS([xvmc >= $XVMC_REQUIRED], [enable_xvmc=yes], [enable_xvmc=no])
fi
if test "x$enable_vdpau" = xauto; then
PKG_CHECK_EXISTS([vdpau], [enable_vdpau=yes], [enable_vdpau=no])
PKG_CHECK_EXISTS([vdpau >= $VDPAU_REQUIRED], [enable_vdpau=yes], [enable_vdpau=no])
fi
if test "x$enable_omx" = xauto; then
PKG_CHECK_EXISTS([libomxil-bellagio], [enable_omx=yes], [enable_omx=no])
PKG_CHECK_EXISTS([libomxil-bellagio >= $LIBOMXIL_BELLAGIO_REQUIRED], [enable_omx=yes], [enable_omx=no])
fi
if test "x$enable_va" = xauto; then
PKG_CHECK_EXISTS([libva], [enable_va=yes], [enable_va=no])
PKG_CHECK_EXISTS([libva >= $LIBVA_REQUIRED], [enable_va=yes], [enable_va=no])
fi
fi
if test "x$enable_dri" = xyes -o \
"x$enable_xvmc" = xyes -o \
"x$enable_vdpau" = xyes -o \
"x$enable_omx" = xyes -o \
"x$enable_va" = xyes; then
need_gallium_vl=yes
fi
AM_CONDITIONAL(NEED_GALLIUM_VL, test "x$need_gallium_vl" = xyes)
if test "x$enable_xvmc" = xyes -o \
"x$enable_vdpau" = xyes -o \
"x$enable_omx" = xyes -o \
"x$enable_va" = xyes; then
PKG_CHECK_MODULES([VL], [x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
need_gallium_vl_winsys=yes
fi
AM_CONDITIONAL(NEED_GALLIUM_VL_WINSYS, test "x$need_gallium_vl_winsys" = xyes)
if test "x$enable_xvmc" = xyes; then
PKG_CHECK_MODULES([XVMC], [xvmc >= $XVMC_REQUIRED x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
PKG_CHECK_MODULES([XVMC], [xvmc >= $XVMC_REQUIRED])
enable_gallium_loader=$enable_shared_pipe_drivers
fi
AM_CONDITIONAL(HAVE_ST_XVMC, test "x$enable_xvmc" = xyes)
if test "x$enable_vdpau" = xyes; then
PKG_CHECK_MODULES([VDPAU], [vdpau >= $VDPAU_REQUIRED x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED],
[VDPAU_LIBS="`$PKG_CONFIG --libs x11-xcb xcb xcb-dri2`"])
PKG_CHECK_MODULES([VDPAU], [vdpau >= $VDPAU_REQUIRED])
enable_gallium_loader=$enable_shared_pipe_drivers
fi
AM_CONDITIONAL(HAVE_ST_VDPAU, test "x$enable_vdpau" = xyes)
if test "x$enable_omx" = xyes; then
PKG_CHECK_MODULES([OMX], [libomxil-bellagio >= $LIBOMXIL_BELLAGIO_REQUIRED x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
PKG_CHECK_MODULES([OMX], [libomxil-bellagio >= $LIBOMXIL_BELLAGIO_REQUIRED])
enable_gallium_loader=$enable_shared_pipe_drivers
fi
AM_CONDITIONAL(HAVE_ST_OMX, test "x$enable_omx" = xyes)
if test "x$enable_va" = xyes; then
PKG_CHECK_MODULES([VA], [libva >= 0.35.0 x11-xcb xcb-dri2 >= $XCBDRI2_REQUIRED],
[VA_LIBS="`$PKG_CONFIG --libs x11-xcb xcb-dri2`"])
PKG_CHECK_MODULES([VA], [libva >= $LIBVA_REQUIRED])
enable_gallium_loader=$enable_shared_pipe_drivers
fi
AM_CONDITIONAL(HAVE_ST_VA, test "x$enable_va" = xyes)
dnl
dnl Nine Direct3D9 configuration
dnl
if test "x$enable_nine" = xyes; then
if ! echo "$with_gallium_drivers" | grep -q 'swrast'; then
AC_MSG_ERROR([nine requires the gallium swrast driver])
fi
if test "x$with_gallium_drivers" = xswrast; then
AC_MSG_ERROR([nine requires at least one non-swrast gallium driver])
fi
if test "x$enable_dri3" = xno; then
AC_MSG_WARN([using nine together with wine requires DRI3 enabled system])
fi
enable_gallium_loader=$enable_shared_pipe_drivers
fi
AM_CONDITIONAL(HAVE_ST_NINE, test "x$enable_nine" = xyes)
dnl
dnl OpenCL configuration
dnl
AC_ARG_WITH([libclc-path],
[AS_HELP_STRING([--with-libclc-path],
[DEPRECATED: See http://dri.freedesktop.org/wiki/GalliumCompute#How_to_Install])],
[LIBCLC_PATH="$withval"],
[LIBCLC_PATH=''])
if test -n "$LIBCLC_PATH"; then
AC_MSG_ERROR([The --with-libclc-path option has been deprecated.
Please review the updated build instructions for clover:
http://dri.freedesktop.org/wiki/GalliumCompute])
fi
AC_ARG_WITH([clang-libdir],
[AS_HELP_STRING([--with-clang-libdir],
[Path to Clang libraries @<:@default=llvm-config --libdir@:>@])],
@@ -1492,6 +1753,13 @@ if test "x$with_egl_platforms" != "x" -a "x$enable_egl" != xyes; then
AC_MSG_ERROR([cannot build egl state tracker without EGL library])
fi
PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland_scanner],
WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland_scanner`,
WAYLAND_SCANNER='')
if test "x$WAYLAND_SCANNER" = x; then
AC_PATH_PROG([WAYLAND_SCANNER], [wayland-scanner])
fi
# Do per-EGL platform setups and checks
egl_platforms=`IFS=', '; echo $with_egl_platforms`
for plat in $egl_platforms; do
@@ -1499,9 +1767,9 @@ for plat in $egl_platforms; do
wayland)
PKG_CHECK_MODULES([WAYLAND], [wayland-client >= $WAYLAND_REQUIRED wayland-server >= $WAYLAND_REQUIRED])
WAYLAND_PREFIX=`$PKG_CONFIG --variable=prefix wayland-client`
AC_PATH_PROG([WAYLAND_SCANNER], [wayland-scanner],,
[${WAYLAND_PREFIX}/bin$PATH_SEPARATOR$PATH])
if test "x$WAYLAND_SCANNER" = x; then
AC_MSG_ERROR([wayland-scanner is needed to compile the wayland egl platform])
fi
;;
x11)
@@ -1515,7 +1783,12 @@ for plat in $egl_platforms; do
AC_MSG_ERROR([EGL platform drm requires libdrm >= $LIBDRM_REQUIRED])
;;
android|fbdev|gdi|null)
surfaceless)
test "x$have_libdrm" != xyes &&
AC_MSG_ERROR([EGL platform surfaceless requires libdrm >= $LIBDRM_REQUIRED])
;;
android|gdi|null)
;;
*)
@@ -1532,7 +1805,7 @@ done
# libEGL wants to default to the first platform specified in
# ./configure. parse that here.
if test "x$egl_platforms" != "x"; then
FIRST_PLATFORM_CAPS=`echo $egl_platforms | sed 's| .*||' | tr 'a-z' 'A-Z'`
FIRST_PLATFORM_CAPS=`echo $egl_platforms | sed 's| .*||' | tr '[[a-z]]' '[[A-Z]]'`
EGL_NATIVE_PLATFORM="_EGL_PLATFORM_$FIRST_PLATFORM_CAPS"
else
EGL_NATIVE_PLATFORM="_EGL_INVALID_PLATFORM"
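Note on the `tr` change in this hunk: in configure.ac the square brackets are m4 quote characters, so `[[a-z]]` reaches the generated configure script as `[a-z]`. A minimal sketch of what the resulting shell does, assuming a space-separated platform list; the function name `first_platform_macro` is made up for illustration and is not part of the commit:

```shell
#!/bin/sh
# Hypothetical helper mirroring the FIRST_PLATFORM_CAPS logic in the hunk above.
first_platform_macro() {
    # $1 = space-separated platform list, e.g. "x11 wayland drm"
    if test "x$1" != "x"; then
        # Keep only the first word, then upper-case it.
        caps=`echo $1 | sed 's| .*||' | tr '[a-z]' '[A-Z]'`
        echo "_EGL_PLATFORM_$caps"
    else
        echo "_EGL_INVALID_PLATFORM"
    fi
}

first_platform_macro "x11 wayland drm"   # prints _EGL_PLATFORM_X11
first_platform_macro ""                  # prints _EGL_INVALID_PLATFORM
```

The bracketed form maps `[` to `[` and `]` to `]`, so it yields the same result as the unbracketed ranges on a modern `tr`.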
@@ -1544,7 +1817,7 @@ fi
AM_CONDITIONAL(HAVE_EGL_PLATFORM_X11, echo "$egl_platforms" | grep -q 'x11')
AM_CONDITIONAL(HAVE_EGL_PLATFORM_WAYLAND, echo "$egl_platforms" | grep -q 'wayland')
AM_CONDITIONAL(HAVE_EGL_PLATFORM_DRM, echo "$egl_platforms" | grep -q 'drm')
AM_CONDITIONAL(HAVE_EGL_PLATFORM_FBDEV, echo "$egl_platforms" | grep -q 'fbdev')
AM_CONDITIONAL(HAVE_EGL_PLATFORM_SURFACELESS, echo "$egl_platforms" | grep -q 'surfaceless')
AM_CONDITIONAL(HAVE_EGL_PLATFORM_NULL, echo "$egl_platforms" | grep -q 'null')
AM_CONDITIONAL(HAVE_EGL_DRIVER_DRI2, test "x$HAVE_EGL_DRIVER_DRI2" != "x")
@@ -1559,21 +1832,6 @@ if ! echo "$egl_platforms" | grep -q 'x11'; then
GL_PC_CFLAGS="$GL_PC_CFLAGS -DMESA_EGL_NO_X11_HEADERS"
fi
AC_ARG_WITH([max-width],
[AS_HELP_STRING([--with-max-width=N],
[Maximum framebuffer width (4096)])],
[DEFINES="${DEFINES} -DMAX_WIDTH=${withval}";
AS_IF([test "${withval}" -gt "4096"],
[AC_MSG_WARN([Large framebuffer: see s_tritemp.h comments.])])]
)
AC_ARG_WITH([max-height],
[AS_HELP_STRING([--with-max-height=N],
[Maximum framebuffer height (4096)])],
[DEFINES="${DEFINES} -DMAX_HEIGHT=${withval}";
AS_IF([test "${withval}" -gt "4096"],
[AC_MSG_WARN([Large framebuffer: see s_tritemp.h comments.])])]
)
dnl
dnl Gallium LLVM
dnl
@@ -1620,6 +1878,13 @@ strip_unwanted_llvm_flags() {
-e 's/-fstack-protector-strong\>//g'
}
llvm_check_version_for() {
if test "${LLVM_VERSION_INT}${LLVM_VERSION_PATCH}" -lt "${1}0${2}${3}"; then
AC_MSG_ERROR([LLVM $1.$2.$3 or newer is required for $4])
fi
}
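Note on `llvm_check_version_for`: the comparison builds integers by string concatenation, so 3.4.2 becomes `3042` and 3.5.0 becomes `3050`. The sketch below reproduces that arithmetic outside of configure, assuming `LLVM_VERSION_INT` has the form MAJOR, `0`, MINOR as the hunk implies; `version_at_least` is a hypothetical name, and it returns a status instead of calling `AC_MSG_ERROR`:

```shell
#!/bin/sh
# Hypothetical stand-in for llvm_check_version_for.
version_at_least() {
    # $1 = found version "MAJOR.MINOR.PATCH"; $2 $3 $4 = required major, minor, patch
    found_major=${1%%.*}
    rest=${1#*.}
    found_minor=${rest%%.*}
    found_patch=${rest#*.}
    # Same concatenation scheme as the hunk: MAJOR "0" MINOR PATCH.
    found_int="${found_major}0${found_minor}${found_patch}"
    required_int="${2}0${3}${4}"
    test "$found_int" -ge "$required_int"
}

version_at_least 3.5.0 3 4 2 && echo "3.5.0 satisfies 3.4.2"
version_at_least 3.3.1 3 4 2 || echo "3.3.1 is too old for 3.4.2"
```

Because the digits are concatenated and not weighted, the scheme only orders versions correctly while minor and patch numbers stay single-digit.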
if test -z "$with_gallium_drivers"; then
enable_gallium_llvm=no
@@ -1668,30 +1933,15 @@ if test "x$enable_gallium_llvm" = xyes; then
AC_MSG_ERROR([LLVM $LLVM_REQUIRED_VERSION_MAJOR.$LLVM_REQUIRED_VERSION_MINOR or newer is required])
fi
LLVM_COMPONENTS="engine bitwriter"
if $LLVM_CONFIG --components | grep -qw 'mcjit'; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} mcjit"
fi
LLVM_COMPONENTS="engine bitwriter mcjit mcdisassembler"
if test "x$enable_opencl" = xyes; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation"
# LLVM 3.3 >= 177971 requires IRReader
if $LLVM_CONFIG --components | grep -qw 'irreader'; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} irreader"
fi
# LLVM 3.4 requires Option
if $LLVM_CONFIG --components | grep -qw 'option'; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} option"
fi
# Current OpenCL/Clover and LLVM 3.5 require ObjCARCOpts and ProfileData
if $LLVM_CONFIG --components | grep -qw 'objcarcopts'; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} objcarcopts"
fi
if $LLVM_CONFIG --components | grep -qw 'profiledata'; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} profiledata"
fi
llvm_check_version_for "3" "5" "0" "opencl"
LLVM_COMPONENTS="${LLVM_COMPONENTS} all-targets ipo linker instrumentation"
LLVM_COMPONENTS="${LLVM_COMPONENTS} irreader option objcarcopts profiledata"
fi
DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT -DLLVM_VERSION_PATCH=$LLVM_VERSION_PATCH"
DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT -DMESA_LLVM_VERSION_PATCH=$LLVM_VERSION_PATCH"
MESA_LLVM=1
dnl Check for Clang internal headers
@@ -1759,6 +2009,13 @@ AC_ARG_WITH([va-libdir],
[VA_LIB_INSTALL_DIR="${libdir}/dri"])
AC_SUBST([VA_LIB_INSTALL_DIR])
AC_ARG_WITH([d3d-libdir],
[AS_HELP_STRING([--with-d3d-libdir=DIR],
[directory for the D3D modules @<:@${libdir}/d3d@:>@])],
[D3D_DRIVER_INSTALL_DIR="$withval"],
[D3D_DRIVER_INSTALL_DIR="${libdir}/d3d"])
AC_SUBST([D3D_DRIVER_INSTALL_DIR])
dnl
dnl Gallium helper functions
dnl
@@ -1803,21 +2060,19 @@ require_egl_drm() {
}
radeon_llvm_check() {
if test ${LLVM_VERSION_INT} -lt 307; then
amdgpu_llvm_target_name='r600'
else
amdgpu_llvm_target_name='amdgpu'
fi
if test "x$enable_gallium_llvm" != "xyes"; then
AC_MSG_ERROR([--enable-gallium-llvm is required when building $1])
fi
LLVM_REQUIRED_VERSION_MAJOR="3"
LLVM_REQUIRED_VERSION_MINOR="4"
LLVM_REQUIRED_VERSION_PATCH="2"
if test "${LLVM_VERSION_INT}${LLVM_VERSION_PATCH}" -lt "${LLVM_REQUIRED_VERSION_MAJOR}0${LLVM_REQUIRED_VERSION_MINOR}${LLVM_REQUIRED_VERSION_PATCH}"; then
AC_MSG_ERROR([LLVM $LLVM_REQUIRED_VERSION_MAJOR.$LLVM_REQUIRED_VERSION_MINOR.$LLVM_REQUIRED_VERSION_PATCH or newer is required for $1])
llvm_check_version_for "3" "4" "2" $1
if test true && $LLVM_CONFIG --targets-built | grep -iqvw $amdgpu_llvm_target_name ; then
AC_MSG_ERROR([LLVM $amdgpu_llvm_target_name not enabled in your LLVM build.])
fi
if test true && $LLVM_CONFIG --targets-built | grep -qvw 'R600' ; then
AC_MSG_ERROR([LLVM R600 Target not enabled. You can enable it when building the LLVM
sources with the --enable-experimental-targets=R600
configure flag])
fi
LLVM_COMPONENTS="${LLVM_COMPONENTS} r600 bitreader ipo"
LLVM_COMPONENTS="${LLVM_COMPONENTS} $amdgpu_llvm_target_name bitreader ipo"
NEED_RADEON_LLVM=yes
if test "x$have_libelf" != xyes; then
AC_MSG_ERROR([$1 requires libelf when using llvm])
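Note on the target-name selection in `radeon_llvm_check`: the hunk picks `r600` for LLVM older than 3.7 and `amdgpu` otherwise, then looks for that name in the output of `llvm-config --targets-built`. A minimal sketch of the two steps, with the target list passed in as a plain string so it runs without LLVM installed; both function names are hypothetical:

```shell
#!/bin/sh
# Hypothetical helpers illustrating the logic of the hunk above.
pick_amdgpu_target() {
    # $1 = LLVM version as an integer, e.g. 306 for LLVM 3.6
    if test "$1" -lt 307; then
        echo r600
    else
        echo amdgpu
    fi
}

target_built() {
    # $1 = target name; $2 = space-separated list, standing in for
    # what `llvm-config --targets-built` would print
    echo "$2" | grep -iqw "$1"
}

name=`pick_amdgpu_target 307`
if target_built "$name" "AArch64 AMDGPU X86"; then
    echo "LLVM target $name is available"
else
    echo "LLVM target $name not enabled in this LLVM build"
fi
```

The commit phrases the check as `grep -iqvw`, which succeeds when the single output line does not contain the word; the sketch uses the positive form and inverts the branch.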
@@ -2043,6 +2298,11 @@ AM_CONDITIONAL(HAVE_X86_ASM, test "x$asm_arch" = xx86 -o "x$asm_arch" = xx86_64)
AM_CONDITIONAL(HAVE_X86_64_ASM, test "x$asm_arch" = xx86_64)
AM_CONDITIONAL(HAVE_SPARC_ASM, test "x$asm_arch" = xsparc)
AC_SUBST([NINE_MAJOR], 1)
AC_SUBST([NINE_MINOR], 0)
AC_SUBST([NINE_TINY], 0)
AC_SUBST([NINE_VERSION], "$NINE_MAJOR.$NINE_MINOR.$NINE_TINY")
AC_SUBST([VDPAU_MAJOR], 1)
AC_SUBST([VDPAU_MINOR], 0)
@@ -2064,6 +2324,13 @@ AC_SUBST([XA_MINOR], $XA_MINOR)
AC_SUBST([XA_TINY], $XA_TINY)
AC_SUBST([XA_VERSION], "$XA_MAJOR.$XA_MINOR.$XA_TINY")
PKG_CHECK_MODULES(VALGRIND, [valgrind],
[have_valgrind=yes], [have_valgrind=no])
if test "x$have_valgrind" = "xyes"; then
AC_DEFINE([HAVE_VALGRIND], 1,
[Use valgrind intrinsics to suppress false warnings])
fi
dnl Restore LDFLAGS and CPPFLAGS
LDFLAGS="$_SAVE_LDFLAGS"
CPPFLAGS="$_SAVE_CPPFLAGS"
@@ -2084,7 +2351,6 @@ AC_CONFIG_FILES([Makefile
src/egl/drivers/dri2/Makefile
src/egl/main/Makefile
src/egl/main/egl.pc
src/egl/wayland/Makefile
src/egl/wayland/wayland-drm/Makefile
src/egl/wayland/wayland-egl/Makefile
src/egl/wayland/wayland-egl/wayland-egl.pc
@@ -2092,9 +2358,7 @@ AC_CONFIG_FILES([Makefile
src/gallium/auxiliary/Makefile
src/gallium/auxiliary/pipe-loader/Makefile
src/gallium/drivers/freedreno/Makefile
src/gallium/drivers/galahad/Makefile
src/gallium/drivers/i915/Makefile
src/gallium/drivers/identity/Makefile
src/gallium/drivers/ilo/Makefile
src/gallium/drivers/llvmpipe/Makefile
src/gallium/drivers/noop/Makefile
@@ -2108,20 +2372,19 @@ AC_CONFIG_FILES([Makefile
src/gallium/drivers/svga/Makefile
src/gallium/drivers/trace/Makefile
src/gallium/drivers/vc4/Makefile
src/gallium/drivers/vc4/kernel/Makefile
src/gallium/state_trackers/clover/Makefile
src/gallium/state_trackers/dri/Makefile
src/gallium/state_trackers/glx/xlib/Makefile
src/gallium/state_trackers/nine/Makefile
src/gallium/state_trackers/omx/Makefile
src/gallium/state_trackers/osmesa/Makefile
src/gallium/state_trackers/va/Makefile
src/gallium/state_trackers/vdpau/Makefile
src/gallium/state_trackers/vega/Makefile
src/gallium/state_trackers/xa/Makefile
src/gallium/state_trackers/xvmc/Makefile
src/gallium/targets/d3dadapter9/Makefile
src/gallium/targets/d3dadapter9/d3d.pc
src/gallium/targets/dri/Makefile
src/gallium/targets/egl-static/Makefile
src/gallium/targets/gbm/Makefile
src/gallium/targets/libgl-xlib/Makefile
src/gallium/targets/omx/Makefile
src/gallium/targets/opencl/Makefile
@@ -2142,10 +2405,8 @@ AC_CONFIG_FILES([Makefile
src/gallium/winsys/radeon/drm/Makefile
src/gallium/winsys/svga/drm/Makefile
src/gallium/winsys/sw/dri/Makefile
src/gallium/winsys/sw/fbdev/Makefile
src/gallium/winsys/sw/kms-dri/Makefile
src/gallium/winsys/sw/null/Makefile
src/gallium/winsys/sw/wayland/Makefile
src/gallium/winsys/sw/wrapper/Makefile
src/gallium/winsys/sw/xlib/Makefile
src/gallium/winsys/vc4/drm/Makefile
@@ -2161,8 +2422,6 @@ AC_CONFIG_FILES([Makefile
src/mapi/es1api/glesv1_cm.pc
src/mapi/es2api/glesv2.pc
src/mapi/glapi/gen/Makefile
src/mapi/vgapi/Makefile
src/mapi/vgapi/vg.pc
src/mesa/Makefile
src/mesa/gl.pc
src/mesa/drivers/dri/dri.pc
@@ -2179,6 +2438,7 @@ AC_CONFIG_FILES([Makefile
src/mesa/drivers/osmesa/osmesa.pc
src/mesa/drivers/x11/Makefile
src/mesa/main/tests/Makefile
src/vulkan/Makefile
src/util/Makefile
src/util/tests/hash_table/Makefile])
@@ -2196,7 +2456,6 @@ echo " includedir: $includedir"
dnl API info
echo ""
echo " OpenGL: $enable_opengl (ES1: $enable_gles1 ES2: $enable_gles2)"
echo " OpenVG: $enable_openvg"
dnl Driver info
echo ""
@@ -2265,6 +2524,12 @@ else
echo " Gallium: no"
fi
dnl Shader cache
echo ""
echo " Shader cache: $enable_shader_cache"
if test "x$enable_shader_cache" = "xyes"; then
echo " With SHA1 from: $with_sha1"
fi
dnl Libraries
echo ""
@@ -2288,6 +2553,7 @@ if test "x$MESA_LLVM" = x1; then
echo " LLVM_CFLAGS: $LLVM_CFLAGS"
echo " LLVM_CXXFLAGS: $LLVM_CXXFLAGS"
echo " LLVM_CPPFLAGS: $LLVM_CPPFLAGS"
echo " LLVM_LDFLAGS: $LLVM_LDFLAGS"
echo ""
fi
echo " PYTHON2: $PYTHON2"

View File

@@ -18,26 +18,26 @@ are exposed in the 3.0 context as extensions.
Feature Status
----------------------------------------------------- ------------------------
GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe (*), softpipe (*)
GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
glBindFragDataLocation, glGetFragDataLocation DONE
Conditional rendering (GL_NV_conditional_render) DONE (r300, swrast)
Map buffer subranges (GL_ARB_map_buffer_range) DONE (r300, swrast)
Clamping controls (GL_ARB_color_buffer_float) DONE (r300)
Float textures, renderbuffers (GL_ARB_texture_float) DONE (r300)
Conditional rendering (GL_NV_conditional_render) DONE ()
Map buffer subranges (GL_ARB_map_buffer_range) DONE ()
Clamping controls (GL_ARB_color_buffer_float) DONE ()
Float textures, renderbuffers (GL_ARB_texture_float) DONE ()
GL_EXT_packed_float DONE ()
GL_EXT_texture_shared_exponent DONE (swrast)
GL_EXT_texture_shared_exponent DONE ()
Float depth buffers (GL_ARB_depth_buffer_float) DONE ()
Framebuffer objects (GL_ARB_framebuffer_object) DONE (r300, swrast)
Framebuffer objects (GL_ARB_framebuffer_object) DONE ()
GL_ARB_half_float_pixel DONE (all drivers)
GL_ARB_half_float_vertex DONE (r300, swrast)
GL_ARB_half_float_vertex DONE ()
GL_EXT_texture_integer DONE ()
GL_EXT_texture_array DONE ()
Per-buffer blend and masks (GL_EXT_draw_buffers2) DONE (swrast)
GL_EXT_texture_compression_rgtc DONE (r300, swrast)
GL_ARB_texture_rg DONE (r300, swrast)
Per-buffer blend and masks (GL_EXT_draw_buffers2) DONE ()
GL_EXT_texture_compression_rgtc DONE ()
GL_ARB_texture_rg DONE ()
Transform feedback (GL_EXT_transform_feedback) DONE ()
Vertex array objects (GL_ARB_vertex_array_object) DONE (all drivers)
Vertex array objects (GL_ARB_vertex_array_object) DONE ()
sRGB framebuffer format (GL_EXT_framebuffer_sRGB) DONE ()
glClearBuffer commands DONE
glGetStringi command DONE
@@ -45,7 +45,7 @@ GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe (*),
glVertexAttribI commands DONE
Depth format cube textures DONE ()
GLX_ARB_create_context (GLX 1.4 is required) DONE
Multisample anti-aliasing DONE (r300)
Multisample anti-aliasing DONE (llvmpipe (*), softpipe (*))
(*) llvmpipe and softpipe have fake Multisample anti-aliasing support
@@ -53,28 +53,28 @@ GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe (*),
GL 3.1, GLSL 1.40 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
Forward compatible context support/deprecations DONE ()
Instanced drawing (GL_ARB_draw_instanced) DONE (swrast)
Buffer copying (GL_ARB_copy_buffer) DONE (r300, swrast)
Primitive restart (GL_NV_primitive_restart) DONE (r300)
Instanced drawing (GL_ARB_draw_instanced) DONE ()
Buffer copying (GL_ARB_copy_buffer) DONE ()
Primitive restart (GL_NV_primitive_restart) DONE ()
16 vertex texture image units DONE ()
Texture buffer objs (GL_ARB_texture_buffer_object) DONE for OpenGL 3.1 contexts ()
Rectangular textures (GL_ARB_texture_rectangle) DONE (r300, swrast)
Uniform buffer objs (GL_ARB_uniform_buffer_object) DONE (swrast)
Signed normalized textures (GL_EXT_texture_snorm) DONE (r300)
Rectangular textures (GL_ARB_texture_rectangle) DONE ()
Uniform buffer objs (GL_ARB_uniform_buffer_object) DONE ()
Signed normalized textures (GL_EXT_texture_snorm) DONE ()
GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
Core/compatibility profiles DONE
Geometry shaders DONE ()
BGRA vertex order (GL_ARB_vertex_array_bgra) DONE (r300, swrast)
Base vertex offset(GL_ARB_draw_elements_base_vertex) DONE (r300, swrast)
Frag shader coord (GL_ARB_fragment_coord_conventions) DONE (r300, swrast)
Provoking vertex (GL_ARB_provoking_vertex) DONE (r300, swrast)
BGRA vertex order (GL_ARB_vertex_array_bgra) DONE ()
Base vertex offset(GL_ARB_draw_elements_base_vertex) DONE ()
Frag shader coord (GL_ARB_fragment_coord_conventions) DONE ()
Provoking vertex (GL_ARB_provoking_vertex) DONE ()
Seamless cubemaps (GL_ARB_seamless_cube_map) DONE ()
Multisample textures (GL_ARB_texture_multisample) DONE ()
Frag depth clamp (GL_ARB_depth_clamp) DONE (swrast)
Fence objects (GL_ARB_sync) DONE (r300, swrast)
Frag depth clamp (GL_ARB_depth_clamp) DONE ()
Fence objects (GL_ARB_sync) DONE ()
GLX_ARB_create_context_profile DONE
@@ -82,52 +82,52 @@ GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
GL_ARB_blend_func_extended DONE ()
GL_ARB_explicit_attrib_location DONE (all drivers that support GLSL)
GL_ARB_occlusion_query2 DONE (r300, swrast)
GL_ARB_occlusion_query2 DONE ()
GL_ARB_sampler_objects DONE (all drivers)
GL_ARB_shader_bit_encoding DONE ()
GL_ARB_texture_rgb10_a2ui DONE ()
GL_ARB_texture_swizzle DONE (r300, swrast)
GL_ARB_texture_swizzle DONE ()
GL_ARB_timer_query DONE ()
GL_ARB_instanced_arrays DONE (r300)
GL_ARB_instanced_arrays DONE ()
GL_ARB_vertex_type_2_10_10_10_rev DONE ()
GL 4.0, GLSL 4.00:
GL_ARB_draw_buffers_blend DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_draw_indirect DONE (i965, nvc0, radeonsi, llvmpipe, softpipe)
GL_ARB_draw_indirect DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_gpu_shader5 DONE (i965, nvc0)
- 'precise' qualifier DONE
- Dynamically uniform sampler array indices DONE (r600)
- Dynamically uniform sampler array indices DONE (r600, softpipe)
- Dynamically uniform UBO array indices DONE (r600)
- Implicit signed -> unsigned conversions DONE
- Fused multiply-add DONE ()
- Packing/bitfield/conversion functions DONE (r600)
- Enhanced textureGather DONE (r600, radeonsi)
- Geometry shader instancing DONE (r600)
- Packing/bitfield/conversion functions DONE (r600, radeonsi, softpipe)
- Enhanced textureGather DONE (r600, radeonsi, softpipe)
- Geometry shader instancing DONE (r600, llvmpipe, softpipe)
- Geometry shader multiple streams DONE ()
- Enhanced per-sample shading DONE (r600)
- Enhanced per-sample shading DONE (r600, radeonsi)
- Interpolation functions DONE (r600)
- New overload resolution rules DONE
GL_ARB_gpu_shader_fp64 started (Dave)
GL_ARB_gpu_shader_fp64 DONE (nvc0, softpipe)
GL_ARB_sample_shading DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_shader_subroutine not started
GL_ARB_shader_subroutine started (Dave)
GL_ARB_tessellation_shader started (Chris, Ilia)
GL_ARB_texture_buffer_object_rgb32 DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_texture_cube_map_array DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_texture_gather DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_texture_gather DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_texture_query_lod DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_transform_feedback2 DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_transform_feedback3 DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_transform_feedback2 DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_transform_feedback3 DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL 4.1, GLSL 4.10:
GL_ARB_ES2_compatibility DONE (i965, nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_ES2_compatibility DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_get_program_binary DONE (0 binary formats)
GL_ARB_separate_shader_objects DONE (all drivers)
GL_ARB_shader_precision started (Micah)
GL_ARB_vertex_attrib_64bit started (Dave)
GL_ARB_vertex_attrib_64bit DONE (nvc0, softpipe)
GL_ARB_viewport_array DONE (i965, nv50, nvc0, r600, llvmpipe)
@@ -137,12 +137,13 @@ GL 4.2, GLSL 4.20:
GL_ARB_compressed_texture_pixel_storage DONE (all drivers)
GL_ARB_shader_atomic_counters DONE (i965)
GL_ARB_texture_storage DONE (all drivers)
GL_ARB_transform_feedback_instanced DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_transform_feedback_instanced DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_base_instance DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_shader_image_load_store in progress (curro)
GL_ARB_conservative_depth DONE (all drivers that support GLSL 1.30)
GL_ARB_shading_language_420pack DONE (all drivers that support GLSL 1.30)
GL_ARB_internalformat_query DONE (i965, nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_shading_language_packing DONE (all drivers)
GL_ARB_internalformat_query DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_map_buffer_alignment DONE (all drivers)
@@ -152,70 +153,89 @@ GL 4.3, GLSL 4.30:
GL_ARB_ES3_compatibility DONE (all drivers that support GLSL 3.30)
GL_ARB_clear_buffer_object DONE (all drivers)
GL_ARB_compute_shader in progress (jljusten)
GL_ARB_copy_image DONE (i965)
GL_ARB_copy_image DONE (i965) (gallium - in progress, VMware)
GL_KHR_debug DONE (all drivers)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_fragment_layer_viewport DONE (nv50, nvc0, r600, llvmpipe)
GL_ARB_framebuffer_no_attachments not started
GL_ARB_framebuffer_no_attachments DONE (i965)
GL_ARB_internalformat_query2 not started
GL_ARB_invalidate_subdata DONE (all drivers)
GL_ARB_multi_draw_indirect DONE (i965, nvc0, radeonsi, llvmpipe, softpipe)
GL_ARB_program_interface_query not started
GL_ARB_multi_draw_indirect DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_program_interface_query DONE (all drivers)
GL_ARB_robust_buffer_access_behavior not started
GL_ARB_shader_image_size not started
GL_ARB_shader_storage_buffer_object not started
GL_ARB_shader_image_size in progress (Martin Peres)
GL_ARB_shader_storage_buffer_object in progress (Iago Toral, Samuel Iglesias)
GL_ARB_stencil_texturing DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_texture_buffer_range DONE (nv50, nvc0, i965, r600, radeonsi, llvmpipe)
GL_ARB_texture_query_levels DONE (all drivers that support GLSL 1.30)
GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample)
GL_ARB_texture_view DONE (i965, nv50, nvc0)
GL_ARB_texture_view DONE (i965, nv50, nvc0, llvmpipe, softpipe)
GL_ARB_vertex_attrib_binding DONE (all drivers)
GL 4.4, GLSL 4.40:
GL_MAX_VERTEX_ATTRIB_STRIDE DONE (all drivers)
GL_ARB_buffer_storage DONE (i965, nv30, nv50, nvc0, r300, r600, radeonsi)
GL_ARB_clear_texture DONE (i965)
GL_ARB_buffer_storage DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_clear_texture DONE (i965) (gallium - in progress, VMware)
GL_ARB_enhanced_layouts not started
GL_ARB_multi_bind DONE (all drivers)
GL_ARB_query_buffer_object not started
GL_ARB_texture_mirror_clamp_to_edge DONE (i965, nv30, nv50, nvc0, r300, r600, radeonsi, swrast, llvmpipe, softpipe)
GL_ARB_texture_stencil8 not started
GL_ARB_texture_mirror_clamp_to_edge DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_texture_stencil8 DONE (nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL 4.5, GLSL 4.50:
GL_ARB_ES3_1_compatibility not started
GL_ARB_clip_control DONE (llvmpipe, softpipe, r300, r600, radeonsi)
GL_ARB_clip_control DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_conditional_render_inverted DONE (i965, nv50, nvc0, llvmpipe, softpipe)
GL_ARB_cull_distance not started
GL_ARB_cull_distance in progress (Tobias)
GL_ARB_derivative_control DONE (i965, nv50, nvc0, r600)
GL_ARB_direct_state_access not started
GL_ARB_direct_state_access DONE (all drivers)
- Transform Feedback object DONE
- Buffer object DONE
- Framebuffer object DONE
- Renderbuffer object DONE
- Texture object DONE
- Vertex array object DONE
- Sampler object DONE
- Program Pipeline object DONE
- Query object DONE (will require changes when GL_ARB_query_buffer_object lands)
GL_ARB_get_texture_sub_image started (Brian Paul)
GL_ARB_shader_texture_image_samples not started
GL_ARB_texture_barrier DONE (nv50, nvc0, r300, r600, radeonsi)
GL_ARB_texture_barrier DONE (nv50, nvc0, r600, radeonsi)
GL_KHR_context_flush_control DONE (all - but needs GLX/EXT extension to be useful)
GL_KHR_robust_buffer_access_behavior not started
GL_KHR_robustness 90% done (the ARB variant)
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)
These are the extensions cherry-picked to make GLES 3.1
GLES3.1, GLSL ES 3.1
GL_ARB_arrays_of_arrays started (Timothy)
GL_ARB_compute_shader in progress (jljusten)
GL_ARB_draw_indirect DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_framebuffer_no_attachments not started
GL_ARB_program_interface_query not started
GL_ARB_framebuffer_no_attachments DONE (i965)
GL_ARB_program_interface_query DONE (all drivers)
GL_ARB_shader_atomic_counters DONE (i965)
GL_ARB_shader_image_load_store in progress (curro)
GL_ARB_shader_storage_buffer_object not started
GL_ARB_shader_image_size in progress (Martin Peres)
GL_ARB_shader_storage_buffer_object in progress (Iago Toral, Samuel Iglesias)
GL_ARB_shading_language_packing DONE (all drivers)
GL_ARB_separate_shader_objects DONE (all drivers)
GL_ARB_stencil_texturing DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
Multisample textures (GL_ARB_texture_multisample) DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample)
GL_ARB_vertex_attrib_binding DONE (all drivers)
GS5 Enhanced textureGather DONE (i965, nvc0, r600, radeonsi)
GS5 Packing/bitfield/conversion functions DONE (i965, nvc0, r600)
GS5 Packing/bitfield/conversion functions DONE (i965, nvc0, r600, radeonsi)
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)
Additional functions not covered above:
glMemoryBarrierByRegion
glGetTexLevelParameter[fi]v - needs updates to restrict to GLES enums
glGetBooleani_v - needs updates to restrict to GLES enums
More info about these features and the work involved can be found at
http://dri.freedesktop.org/wiki/MissingFunctionality

View File

@@ -11,10 +11,6 @@ no longer shipped or supported.
Run
scons osmesa mesagdi
to build classic mesa Windows GDI drivers; or
scons libgl-gdi
to build gallium based GDI driver.

View File

@@ -103,7 +103,7 @@ Mesa Version History
- Stencil-related functions now work in display lists
Changes:
- renamed aux.h as glaux.h (MS-DOS names can't start with aux)
- most filenames are in 8.3 format to accomodate MS-DOS
- most filenames are in 8.3 format to accommodate MS-DOS
- use GLubytes to store arrays of colors instead of GLints
1.2.2 August 2, 1995
@@ -1007,7 +1007,7 @@ Mesa Version History
- glGetTexImage was using pixel unpacking instead of packing params
- auto-mipmap generation for cube maps was incorrect
Changes:
- max texture units reduced to six to accomodate texture rectangles
- max texture units reduced to six to accommodate texture rectangles
- removed unfinished GL_MESA_sprite_point extension code

View File

@@ -61,7 +61,6 @@
<li><a href="shading.html" target="_parent">Shading Language</a>
<li><a href="egl.html" target="_parent">EGL</a>
<li><a href="opengles.html" target="_parent">OpenGL ES</a>
<li><a href="openvg.html" target="_parent">OpenVG / Vega</a>
<li><a href="envvars.html" target="_parent">Environment Variables</a>
<li><a href="osmesa.html" target="_parent">Off-Screen Rendering</a>
<li><a href="debugging.html" target="_parent">Debugging Tips</a>

View File

@@ -17,158 +17,240 @@
<h1>Development Notes</h1>
<h2>Adding Extensions</h2>
<p>
To add a new GL extension to Mesa you have to do at least the following.
<ul>
<li>
If glext.h doesn't define the extension, edit include/GL/gl.h and add
code like this:
<pre>
#ifndef GL_EXT_the_extension_name
#define GL_EXT_the_extension_name 1
/* declare the new enum tokens */
/* prototype the new functions */
/* TYPEDEFS for the new functions */
#endif
</pre>
</li>
<li>
In the src/mapi/glapi/gen/ directory, add the new extension functions and
enums to the gl_API.xml file.
Then, a bunch of source files must be regenerated by executing the
corresponding Python scripts.
</li>
<li>
Add a new entry to the <code>gl_extensions</code> struct in mtypes.h
</li>
<li>
Update the <code>extensions.c</code> file.
</li>
<li>
From this point, the best way to proceed is to find another extension,
similar to the new one, that's already implemented in Mesa and use it
as an example.
</li>
<li>
If the new extension adds new GL state, the functions in get.c, enable.c
and attrib.c will most likely require new code.
</li>
<li>
The dispatch tests check_table.cpp and dispatch_sanity.cpp
should be updated with details about the new extensions functions. These
tests are run using 'make check'
</li>
<li><a href="#style">Coding Style</a>
<li><a href="#submitting">Submitting Patches</a>
<li><a href="#release">Making a New Mesa Release</a>
<li><a href="#extensions">Adding Extensions</a>
</ul>
<h2>Coding Style</h2>
<h2 id="style">Coding Style</h2>
<p>
Mesa's code style has changed over the years. Here's the latest.
Mesa is over 20 years old and the coding style has evolved over time.
Some old parts use a style that's a bit out of date.
If the guidelines below don't cover something, try following the format of
existing, neighboring code.
</p>
<p>
Comment your code! It's extremely important that open-source code be
well documented. Also, strive to write clean, easily understandable code.
Basic formatting guidelines
</p>
<p>
3-space indentation
</p>
<p>
If you use tabs, set them to 8 columns
</p>
<p>
Line width: the preferred width to fill comments and code in Mesa is 78
columns. Exceptions are sometimes made for clarity (e.g. tabular data is
sometimes filled to a much larger width so that extraneous carriage returns
don't obscure the table).
</p>
<p>
Brace example:
</p>
<ul>
<li>3-space indentation, no tabs.
<li>Limit lines to 78 or fewer characters. The idea is to prevent line
wrapping in 80-column editors and terminals. There are exceptions, such
as if you're defining a large, static table of information.
<li>Opening braces go on the same line as the if/for/while statement.
For example:
<pre>
if (condition) {
foo;
}
else {
bar;
}
switch (condition) {
case 0:
foo();
break;
case 1: {
...
break;
}
default:
...
break;
}
if (condition) {
foo;
} else {
bar;
}
</pre>
<p>
Here's the GNU indent command which will best approximate my preferred style:
(Note that it won't format switch statements in the preferred way)
</p>
<li>Put a space before/after operators. For example, <tt>a = b + c;</tt>
and not <tt>a=b+c;</tt>
<li>This GNU indent command generally does the right thing for formatting:
<pre>
indent -br -i3 -npcs --no-tabs infile.c -o outfile.c
indent -br -i3 -npcs --no-tabs infile.c -o outfile.c
</pre>
<p>
Local variable name example: localVarName (no underscores)
</p>
<p>
Constants and macros are ALL_UPPERCASE, with _ between words
</p>
<p>
Global variables are not allowed.
</p>
<p>
Function name examples:
</p>
<li>Use comments wherever you think it would be helpful for other developers.
Several specific cases and style examples follow. Note that we roughly
follow <a href="http://www.stack.nl/~dimitri/doxygen/">Doxygen</a> conventions.
<br>
<br>
Single-line comments:
<pre>
glFooBar() - a public GL entry point (in glapi_dispatch.c)
_mesa_FooBar() - the internal immediate mode function
save_FooBar() - retained mode (display list) function in dlist.c
foo_bar() - a static (private) function
_mesa_foo_bar() - an internal non-static Mesa function
/* null-out pointer to prevent dangling reference below */
bufferObj = NULL;
</pre>
Or,
<pre>
bufferObj = NULL; /* prevent dangling reference below */
</pre>
Multi-line comment:
<pre>
/* If this is a new buffer object id, or one which was generated but
* never used before, allocate a buffer object now.
*/
</pre>
We try to quote the OpenGL specification where prudent:
<pre>
/* Page 38 of the PDF of the OpenGL ES 3.0 spec says:
*
* "An INVALID_OPERATION error is generated for any of the following
* conditions:
*
* &lt;length&gt; is zero."
*
* Additionally, page 94 of the PDF of the OpenGL 4.5 core spec
* (30.10.2014) also says this, so it's no longer allowed for desktop GL,
* either.
*/
</pre>
Function comment example:
<pre>
/**
* Create and initialize a new buffer object. Called via the
* ctx->Driver.CreateObject() driver callback function.
* \param name integer name of the object
* \param type one of GL_FOO, GL_BAR, etc.
* \return pointer to new object or NULL if error
*/
struct gl_object *
_mesa_create_object(GLuint name, GLenum type)
{
/* function body */
}
</pre>
<p>
Places that are not directly visible to the GL API should prefer the use
of <tt>bool</tt>, <tt>true</tt>, and
<li>Put the function return type and qualifiers on one line and the function
name and parameters on the next, as seen above. This makes it easy to use
<code>grep ^function_name dir/*</code> to find function definitions. Also,
the opening brace goes on the next line by itself (see above.)
<li>Function names follow various conventions depending on the type of function:
<pre>
glFooBar() - a public GL entry point (in glapi_dispatch.c)
_mesa_FooBar() - the internal immediate mode function
save_FooBar() - retained mode (display list) function in dlist.c
foo_bar() - a static (private) function
_mesa_foo_bar() - an internal non-static Mesa function
</pre>
<li>Constants, macros and enumerant names are ALL_UPPERCASE, with _ between
words.
<li>Mesa usually uses camel case for local variables (Ex: "localVarname")
while gallium typically uses underscores (Ex: "local_var_name").
<li>Global variables are almost never used because Mesa should be thread-safe.
<li>Booleans. Places that are not directly visible to the GL API
should prefer the use of <tt>bool</tt>, <tt>true</tt>, and
<tt>false</tt> over <tt>GLboolean</tt>, <tt>GL_TRUE</tt>, and
<tt>GL_FALSE</tt>. In C code, this may mean that
<tt>#include &lt;stdbool.h&gt;</tt> needs to be added. The
<tt>try_emit_</tt>* methods in src/mesa/program/ir_to_mesa.cpp and
src/mesa/state_tracker/st_glsl_to_tgsi.cpp can serve as examples.
</p>
<h2>Submitting patches</h2>
</ul>
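<p>
The style rules above can be seen together in one small sketch. Everything
named here (<tt>example_object</tt>, <tt>find_free_slot</tt>,
<tt>claim_slot</tt>, <tt>MAX_EXAMPLE_OBJECTS</tt>) is hypothetical and exists
only for illustration; none of it is taken from the Mesa tree.
</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Constants and macros are ALL_UPPERCASE, with _ between words. */
#define MAX_EXAMPLE_OBJECTS 16

/* Hypothetical object type, used only for this illustration. */
struct example_object {
   unsigned name;
   bool inUse;
};

/**
 * Find the first unused slot in an object table.
 *
 * \param table  array of objects to search
 * \param count  number of entries in \p table
 * \return index of the first free slot, or -1 if the table is full
 */
static int
find_free_slot(const struct example_object *table, size_t count)
{
   assert(table != NULL);
   assert(count <= MAX_EXAMPLE_OBJECTS);

   /* 3-space indentation; opening brace on the same line as for/if. */
   for (size_t i = 0; i < count; i++) {
      if (!table[i].inUse) {
         return (int) i;
      }
   }

   return -1;
}

/**
 * Mark the first free slot as used and give it a name.
 *
 * \return true on success, false if no slot was free
 */
static bool
claim_slot(struct example_object *table, size_t count, unsigned name)
{
   int freeSlot = find_free_slot(table, count);   /* camel-case local */

   if (freeSlot < 0) {
      return false;
   } else {
      table[freeSlot].name = name;
      table[freeSlot].inUse = true;
      return true;
   }
}
```

<p>
Because the return type sits on its own line, <code>grep ^claim_slot</code>
finds the definition directly. The sketch uses <tt>bool</tt> rather than
<tt>GLboolean</tt> because nothing in it is visible to the GL API.
</p>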
<h2 id="submitting">Submitting patches</h2>
<p>
You should always run the Mesa Testsuite before submitting patches.
The Testsuite can be run using the 'make check' command. All tests
The basic guidelines for submitting patches are:
</p>
<ul>
<li>Patches should be sufficiently tested before submitting.
<li>Code patches should follow Mesa coding conventions.
<li>Whenever possible, patches should only affect individual Mesa/Gallium
components.
<li>Patches should never introduce build breaks and should be bisectable (see
<code>git bisect</code>.)
<li>Patches should be properly formatted (see below).
<li>Patches should be submitted to mesa-dev for review using
<code>git send-email</code>.
<li>Patches should not mix code changes with code formatting changes (except,
perhaps, in very trivial cases.)
</ul>
<h3>Patch formatting</h3>
<p>
The basic rules for patch formatting are:
</p>
<ul>
<li>Lines should be limited to 75 characters or fewer so that git logs
displayed in 80-column terminals avoid line wrapping. Note that git
log uses 4 spaces of indentation (4 + 75 &lt; 80).
<li>The first line should be a short, concise summary of the change prefixed
with a module name. Examples:
<pre>
mesa: Add support for querying GL_VERTEX_ATTRIB_ARRAY_LONG
gallium: add PIPE_CAP_DEVICE_RESET_STATUS_QUERY
i965: Fix missing type in local variable declaration.
</pre>
<li>Subsequent patch comments should describe the change in more detail,
if needed. For example:
<pre>
i965: Remove end-of-thread SEND alignment code.
This was present in Eric's initial implementation of the compaction code
for Sandybridge (commit 077d01b6). There is no documentation saying this
is necessary, and removing it causes no regressions in piglit on any
platform.
</pre>
<li>A "Signed-off-by:" line is not required, but not discouraged either.
<li>If a patch addresses a Bugzilla issue, that should be noted in the
patch comment. For example:
<pre>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89689
</pre>
<li>If there have been several revisions to a patch during the review
process, they should be noted such as in this example:
<pre>
st/mesa: add ARB_texture_stencil8 support (v4)
if we support stencil texturing, enable texture_stencil8
there is no requirement to support native S8 for this,
the texture can be converted to x24s8 fine.
v2: fold fixes from Marek in:
a) put S8 last in the list
b) fix renderable to always test for d/s renderable
fixup the texture case to use a stencil only format
for picking the format for the texture view.
v3: hit fallback for getteximage
v4: put s8 back in front, it shouldn't get picked now (Ilia)
</pre>
<li>If someone tested your patch, document it with a line like this:
<pre>
Tested-by: Joe Hacker &lt;jhacker@foo.com&gt;
</pre>
<li>If the patch was reviewed (usually the case) or acked by someone,
that should be documented with:
<pre>
Reviewed-by: Joe Hacker &lt;jhacker@foo.com&gt;
Acked-by: Joe Hacker &lt;jhacker@foo.com&gt;
</pre>
</ul>
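<p>
The two mechanical rules above (the 75-character line limit and the module
prefix on the summary line) can be checked automatically. The following is a
minimal sketch, not an existing Mesa tool: the function name
<tt>commit_message_is_well_formed</tt> is hypothetical, and treating "text
before the first colon" as the module name is an assumption based on the
examples shown.
</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Limit taken from the patch formatting rules above. */
#define MAX_COMMIT_LINE_LENGTH 75

/**
 * Check a commit message against the mechanical formatting rules.
 *
 * \param message  NUL-terminated commit message, lines separated by '\n'
 * \return true if the summary has a "module:" prefix and no line is
 *         longer than MAX_COMMIT_LINE_LENGTH characters
 */
static bool
commit_message_is_well_formed(const char *message)
{
   const char *line = message;
   size_t summaryLength;
   const char *colon;

   assert(message != NULL);

   /* The summary is everything up to the first newline (or the end). */
   summaryLength = strcspn(message, "\n");
   colon = memchr(message, ':', summaryLength);

   /* Assumption: a non-empty module name precedes the first colon. */
   if (colon == NULL || colon == message) {
      return false;
   }

   /* Walk the message one line at a time and check each length. */
   while (*line != '\0') {
      size_t length = strcspn(line, "\n");

      if (length > MAX_COMMIT_LINE_LENGTH) {
         return false;
      }

      line += length;
      if (*line == '\n') {
         line++;
      }
   }

   return true;
}
```

<p>
A check like this only covers layout. Whether the summary is concise and the
description is adequate still has to be judged by a reviewer.
</p>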
<h3>Testing Patches</h3>
<p>
It should go without saying that patches must be tested. In general,
do whatever testing is prudent.
</p>
<p>
You should always run the Mesa test suite before submitting patches.
The test suite can be run using the 'make check' command. All tests
must pass before patches will be accepted; this may mean you have
to update the tests themselves.
</p>
<p>
Whenever possible and applicable, test the patch with
<a href="http://piglit.freedesktop.org">Piglit</a> to
check for regressions.
</p>
<h3>Mailing Patches</h3>
<p>
Patches should be sent to the Mesa mailing list for review.
When submitting a patch make sure to use git send-email rather than attaching
@@ -184,7 +266,38 @@ re-sending the whole series). Using --in-reply-to makes
it harder for reviewers to accidentally review old patches.
</p>
<h2>Marking a commit as a candidate for a stable branch</h2>
<p>
When submitting follow-up patches you should also log in to
<a href="https://patchwork.freedesktop.org">patchwork</a> and change the
state of your old patches to Superseded.
</p>
<h3>Reviewing Patches</h3>
<p>
When you've reviewed a patch on the mailing list, please be unambiguous
about your review. That is, state either
<pre>
Reviewed-by: Joe Hacker &lt;jhacker@foo.com&gt;
</pre>
or
<pre>
Acked-by: Joe Hacker &lt;jhacker@foo.com&gt;
</pre>
rather than just saying "LGTM" or "Seems OK".
</p>
<p>
If small changes are suggested, it's OK to say something like:
<pre>
With the above fixes, Reviewed-by: Joe Hacker &lt;jhacker@foo.com&gt;
</pre>
which tells the patch author that the patch can be committed, as long
as the issues are resolved first.
</p>
<h3>Marking a commit as a candidate for a stable branch</h3>
<p>
If you want a commit to be applied to a stable branch,
@@ -221,7 +334,7 @@ the upcoming stable release can always be seen on the
<a href="http://cworth.org/~cworth/mesa-stable-queue/">Mesa Stable Queue</a>
page.
<h2>Criteria for accepting patches to the stable branch</h2>
<h3>Criteria for accepting patches to the stable branch</h3>
Mesa has a designated release manager for each stable branch, and the release
manager is the only developer that should be pushing changes to these
@@ -306,7 +419,8 @@ be rejected:
regression that is unacceptable for the stable branch.</li>
</ul>
<h2>Making a New Mesa Release</h2>
<h2 id="release">Making a New Mesa Release</h2>
<p>
These are the instructions for making a new Mesa release.
@@ -456,7 +570,7 @@ Edit docs/relnotes/X.Y.Z.html to add the sha256sums printed as part of "make
tarballs" in the previous step. Commit this change.
</p>
<h3>Push all commits and the tag creates above</h3>
<h3>Push all commits and the tag created above</h3>
<p>
This is the first step that cannot easily be undone. The release is going
@@ -483,7 +597,7 @@ signatures to the freedesktop.org server:
mv ~/MesaLib-X.Y.Z* .
</pre>
<h3>Back on mesa master, andd the new release notes into the tree</h3>
<h3>Back on mesa master, add the new release notes into the tree</h3>
<p>
Something like the following steps will do the trick:
@@ -543,6 +657,56 @@ release announcement:
</pre>
</p>
<h2 id="extensions">Adding Extensions</h2>
<p>
To add a new GL extension to Mesa you have to do at least the following.
<ul>
<li>
If glext.h doesn't define the extension, edit include/GL/gl.h and add
code like this:
<pre>
#ifndef GL_EXT_the_extension_name
#define GL_EXT_the_extension_name 1
/* declare the new enum tokens */
/* prototype the new functions */
/* TYPEDEFS for the new functions */
#endif
</pre>
</li>
<li>
In the src/mapi/glapi/gen/ directory, add the new extension functions and
enums to the gl_API.xml file.
Then, a bunch of source files must be regenerated by executing the
corresponding Python scripts.
</li>
<li>
Add a new entry to the <code>gl_extensions</code> struct in mtypes.h
</li>
<li>
Update the <code>extensions.c</code> file.
</li>
<li>
From this point, the best way to proceed is to find another extension,
similar to the new one, that's already implemented in Mesa and use it
as an example.
</li>
<li>
If the new extension adds new GL state, the functions in get.c, enable.c
and attrib.c will most likely require new code.
</li>
<li>
The dispatch tests check_table.cpp and dispatch_sanity.cpp
should be updated with details about the new extension's functions. These
tests are run using 'make check'.
</li>
</ul>
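<p>
As an illustration of the first step, here is what the header template might
look like once filled in. The extension <tt>GL_EXT_example_feature</tt>, the
token <tt>GL_EXAMPLE_LIMIT_EXT</tt> with its value, and the function
<tt>glGetExampleLimitEXT</tt> are all made up for this sketch; a real
extension takes its names and enum values from its specification. The GL base
types are redeclared locally so the sketch compiles without
<tt>GL/gl.h</tt>, and the linkage macros used in real headers are left out.
</p>

```c
#include <assert.h>

/* Stand-ins for the GL base types so this sketch builds on its own.
 * Real code gets these from GL/gl.h.
 */
typedef unsigned int GLenum;
typedef int GLint;

#ifndef GL_EXT_example_feature
#define GL_EXT_example_feature 1

/* declare the new enum tokens (placeholder value, not a registered one) */
#define GL_EXAMPLE_LIMIT_EXT 0xFFF0

/* prototype the new functions */
GLint glGetExampleLimitEXT(GLenum pname);

/* TYPEDEFS for the new functions */
typedef GLint (*PFNGLGETEXAMPLELIMITEXTPROC)(GLenum pname);

#endif

/* Stub body so the sketch links.  It only knows the one made-up token;
 * a real implementation would live inside Mesa and be reached through
 * the dispatch table.
 */
GLint
glGetExampleLimitEXT(GLenum pname)
{
   assert(pname == GL_EXAMPLE_LIMIT_EXT);
   return 4;
}
```

<p>
The <tt>#ifndef</tt> guard keeps the block from clashing with glext.h if that
header later gains the same extension.
</p>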
</div>
</body>
</html>

@@ -204,9 +204,8 @@ terribly relevant.</p>
few preprocessor defines.</p>
<ul>
<li>If <tt>GLX_USE_TLS</tt> is defined, method #4 is used.</li>
<li>If <tt>HAVE_PTHREAD</tt> is defined, method #3 is used.</li>
<li>If <tt>WIN32_THREADS</tt> is defined, method #2 is used.</li>
<li>If <tt>GLX_USE_TLS</tt> is defined, method #3 is used.</li>
<li>If <tt>HAVE_PTHREAD</tt> is defined, method #2 is used.</li>
<li>If none of the preceding are defined, method #1 is used.</li>
</ul>
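<p>
The selection described by the updated list (<tt>GLX_USE_TLS</tt> gives
method #3, <tt>HAVE_PTHREAD</tt> gives method #2, otherwise method #1) can be
sketched as a preprocessor chain. <tt>EXAMPLE_DISPATCH_METHOD</tt> and
<tt>example_dispatch_method</tt> are hypothetical names used only here, and
checking <tt>GLX_USE_TLS</tt> first is an assumption based on the order of
the list; this is not the code Mesa itself uses.
</p>

```c
#include <assert.h>

/* Pick the dispatch method number from the preprocessor defines. */
#if defined(GLX_USE_TLS)
#  define EXAMPLE_DISPATCH_METHOD 3
#elif defined(HAVE_PTHREAD)
#  define EXAMPLE_DISPATCH_METHOD 2
#else
#  define EXAMPLE_DISPATCH_METHOD 1
#endif

/**
 * Report which dispatch method number was selected at compile time.
 *
 * \return 1, 2 or 3, matching the method numbers in the list above
 */
static int
example_dispatch_method(void)
{
   /* Sanity check: the chain above can only yield 1, 2 or 3. */
   assert(EXAMPLE_DISPATCH_METHOD >= 1 && EXAMPLE_DISPATCH_METHOD <= 3);
   return EXAMPLE_DISPATCH_METHOD;
}
```

<p>
Built without either define, the function reports method #1; adding
<tt>-DHAVE_PTHREAD</tt> or <tt>-DGLX_USE_TLS</tt> to the compiler flags
changes the result accordingly.
</p>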

@@ -88,8 +88,11 @@ types such as <code>EGLNativeDisplayType</code> or
<code>EGLNativeWindowType</code> defined for.</p>
<p>The available platforms are <code>x11</code>, <code>drm</code>,
<code>fbdev</code>, and <code>gdi</code>. The <code>gdi</code> platform can
only be built with SCons. Unless for special needs, the build system should
<code>wayland</code>, <code>null</code>, <code>android</code>,
<code>haiku</code>, and <code>gdi</code>. The <code>android</code> platform
can only be built as a system component, part of AOSP, while the
<code>haiku</code> and <code>gdi</code> platforms can only be built with SCons.
Unless there are special needs, the build system should
select the right platforms automatically.</p>
</dd>
@@ -112,13 +115,6 @@ is required if applications mix OpenGL and OpenGL ES.</p>
</dd>
<dt><code>--enable-openvg</code></dt>
<dd>
<p>OpenVG must be explicitly enabled by this option.</p>
</dd>
</dl>
<h2>Use EGL</h2>
@@ -187,14 +183,6 @@ probably required only for some of the demos found in mesa/demo repository.</p>
values are: <code>debug</code>, <code>info</code>, <code>warning</code>, and
<code>fatal</code>.</p>
</dd>
<dt><code>EGL_SOFTWARE</code></dt>
<dd>
<p>For drivers that support both hardware and software rendering, setting this
variable to true forces the use of software rendering.</p>
</dd>
</dl>
@@ -212,38 +200,15 @@ the X server directly using (XCB-)DRI2 protocol.</p>
</dd>
<dt><code>egl_gallium</code></dt>
<dd>
<p>This driver is based on Gallium3D. It supports all rendering APIs and
hardware supported by Gallium3D. It is the only driver that supports OpenVG.
The supported platforms are X11, DRM, FBDEV, and GDI.</p>
<p>This driver comes with its own hardware drivers
(<code>pipe_&lt;hw&gt;</code>) and client API modules
(<code>st_&lt;api&gt;</code>).</p>
</dd>
<h2>Packaging</h2>
<p>The ABI between the main library and its drivers is not stable. Nor is
there a plan to stabilize it at the moment. Of the EGL drivers,
<code>egl_gallium</code> has its own hardware drivers and client API modules.
They are considered internal to <code>egl_gallium</code> and there is also no
stable ABI between them. These should be kept in mind when packaging for
distribution.</p>
<p>Generally, <code>egl_dri2</code> is preferred over <code>egl_gallium</code>
when the system already has DRI drivers. As <code>egl_gallium</code> is loaded
before <code>egl_dri2</code> when both are available, <code>egl_gallium</code>
is disabled by default.</p>
there a plan to stabilize it at the moment.</p>
<h2>Developers</h2>
<p>The sources of the main library and the classic drivers can be found at
<code>src/egl/</code>. The sources of the <code>egl</code> state tracker can
be found at <code>src/gallium/state_trackers/egl/</code>.</p>
<p>The sources of the main library and drivers can be found at
<code>src/egl/</code>.</p>
<h3>Lifetime of Display Resources</h3>

@@ -34,6 +34,7 @@ sometimes be useful for debugging end-user issues.
<li>LIBGL_NO_DRAWARRAYS - if set do not use DrawArrays GLX protocol (for debugging)
<li>LIBGL_SHOW_FPS - print framerate to stdout based on the number of glXSwapBuffers
calls per second.
<li>LIBGL_DRI3_DISABLE - disable DRI3 if set (the value does not matter)
</ul>

@@ -327,19 +327,6 @@ Basically, applying a translation of (0.375, 0.375, 0.0) to your coordinates
will fix the problem.
</p>
<h2>3.6 How can I change the maximum framebuffer size in Mesa's
<tt>swrast</tt> backend?</h2>
<p>
These can be overridden by using the <tt>--with-max-width</tt> and
<tt>--with-max-height</tt> options. The two need not be equal.
</p><p>
Do note that Mesa uses these values to size some internal buffers,
so increasing these sizes will cause Mesa to require additional
memory. Furthermore, increasing these limits beyond <tt>4096</tt>
may introduce rasterization artifacts; see the leading comments in
<tt>src/mesa/swrast/s_tritemp.h</tt>.
</p>
<br>
<br>

@@ -16,6 +16,137 @@
<h1>News</h1>
<h2>June 20, 2015</h2>
<p>
<a href="relnotes/10.5.8.html">Mesa 10.5.8</a> is released.
This is a bug-fix release.
</p>
<h2>June 14, 2015</h2>
<p>
<a href="relnotes/10.6.0.html">Mesa 10.6.0</a> is released. This is a new
development release. See the release notes for more information about
the release.
</p>
<h2>June 07, 2015</h2>
<p>
<a href="relnotes/10.5.7.html">Mesa 10.5.7</a> is released.
This is a bug-fix release.
</p>
<h2>May 23, 2015</h2>
<p>
<a href="relnotes/10.5.6.html">Mesa 10.5.6</a> is released.
This is a bug-fix release.
</p>
<h2>May 11, 2015</h2>
<p>
<a href="relnotes/10.5.5.html">Mesa 10.5.5</a> is released.
This is a bug-fix release.
</p>
<h2>April 24, 2015</h2>
<p>
<a href="relnotes/10.5.4.html">Mesa 10.5.4</a> is released.
This is a bug-fix release.
</p>
<h2>April 12, 2015</h2>
<p>
<a href="relnotes/10.5.3.html">Mesa 10.5.3</a> is released.
This is a bug-fix release.
</p>
<h2>March 28, 2015</h2>
<p>
<a href="relnotes/10.5.2.html">Mesa 10.5.2</a> is released.
This is a bug-fix release.
</p>
<h2>March 20, 2015</h2>
<p>
<a href="relnotes/10.4.7.html">Mesa 10.4.7</a> is released.
This is a bug-fix release.
</p>
<h2>March 13, 2015</h2>
<p>
<a href="relnotes/10.5.1.html">Mesa 10.5.1</a> is released.
This is a bug-fix release.
</p>
<h2>March 06, 2015</h2>
<p>
<a href="relnotes/10.5.0.html">Mesa 10.5.0</a> is released. This is a new
development release. See the release notes for more information about
the release.
</p>
<h2>March 06, 2015</h2>
<p>
<a href="relnotes/10.4.6.html">Mesa 10.4.6</a> is released.
This is a bug-fix release.
</p>
<h2>February 21, 2015</h2>
<p>
<a href="relnotes/10.4.5.html">Mesa 10.4.5</a> is released.
This is a bug-fix release.
</p>
<h2>February 06, 2015</h2>
<p>
<a href="relnotes/10.4.4.html">Mesa 10.4.4</a> is released.
This is a bug-fix release.
</p>
<h2>January 24, 2015</h2>
<p>
<a href="relnotes/10.4.3.html">Mesa 10.4.3</a> is released.
This is a bug-fix release.
</p>
<h2>January 12, 2015</h2>
<p>
<a href="relnotes/10.3.7.html">Mesa 10.3.7</a>
and <a href="relnotes/10.4.2.html">Mesa 10.4.2</a> are released.
These are bug-fix releases from the 10.3 and 10.4 branches, respectively.
<br>
NOTE: It is anticipated that 10.3.7 will be the final release in the 10.3
series. Users of 10.3 are encouraged to migrate to the 10.4 series in order
to obtain future fixes.
</p>
<h2>December 29, 2014</h2>
<p>
<a href="relnotes/10.3.6.html">Mesa 10.3.6</a>
and <a href="relnotes/10.4.1.html">Mesa 10.4.1</a> are released.
These are bug-fix releases from the 10.3 and 10.4 branches, respectively.
</p>
<h2>December 14, 2014</h2>
<p>
<a href="relnotes/10.4.html">Mesa 10.4</a> is released. This is a new
development release. See the release notes for more information about
the release.
</p>
<h2>December 5, 2014</h2>
<p>
<a href="relnotes/10.3.5.html">Mesa 10.3.5</a> is released.
This is a bug-fix release.
</p>
<h2>November 21, 2014</h2>
<p>
<a href="relnotes/10.3.4.html">Mesa 10.3.4</a> is released.
This is a bug-fix release.
</p>
<h2>November 8, 2014</h2>
<p>
<a href="relnotes/10.3.3.html">Mesa 10.3.3</a> is released.
@@ -1266,7 +1397,7 @@ The <a href="faq.html">Mesa FAQ</a> has been rewritten.
- glGetTexImage was using pixel unpacking instead of packing params
- auto-mipmap generation for cube maps was incorrect
Changes:
- max texture units reduced to six to accomodate texture rectangles
- max texture units reduced to six to accommodate texture rectangles
- removed unfinished GL_MESA_sprite_point extension code
</pre>

@@ -38,6 +38,10 @@
Version 2.6.4 or later should work.
</li>
<br>
<li><a href="http://www.makotemplates.org/">Python Mako module</a> -
Python Mako module is required. Version 0.7.3 or later should work.
</li>
<br>
<li><a href="http://www.scons.org/">SCons</a> is required for building on
Windows and optional for Linux (it's an alternative to autoconf/automake.)
</li>
@@ -51,8 +55,8 @@ Versions 2.5.35 and 2.4.1, respectively, (or later) should work.
<br>
On Windows with MinGW, install flex and bison with:
<pre>mingw-get install msys-flex msys-bison</pre>
For MSVC on Windows, you can find flex/bison programs on the
<a href="ftp://ftp.freedesktop.org/pub/mesa/windows-utils/">Mesa ftp site</a>.
For MSVC on Windows, install
<a href="http://winflexbison.sourceforge.net/">Win flex-bison</a>.
</li>
</ul>
@@ -78,7 +82,7 @@ the needed dependencies:
<pre>
sudo yum install flex bison imake libtool xorg-x11-proto-devel libdrm-devel \
gcc-c++ xorg-x11-server-devel libXi-devel libXmu-devel libXdamage-devel git \
expat-devel llvm-devel
expat-devel llvm-devel python-mako
</pre>
@@ -123,14 +127,13 @@ by -debug for debug builds.
To build Mesa with SCons for Windows on Linux using the MinGW cross-compiler toolchain, run:
</p>
<pre>
scons platform=windows toolchain=crossmingw machine=x86 mesagdi libgl-gdi
scons platform=windows toolchain=crossmingw machine=x86 libgl-gdi
</pre>
<p>
This will create:
</p>
<ul>
<li>build/windows-x86-debug/mesa/drivers/windows/gdi/opengl32.dll &mdash; Mesa + swrast, binary compatible with Windows's opengl32.dll
<li>build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll &mdash; Mesa + Gallium + softpipe, binary compatible with Windows's opengl32.dll
<li>build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll &mdash; Mesa + Gallium + softpipe (or llvmpipe), binary compatible with Windows's opengl32.dll
</ul>
<p>
Put them all in the same directory to test them.

@@ -49,7 +49,7 @@ stderr if the LIBGL_DEBUG environment variable is defined.
libGL.so is thread safe. The overhead of thread safety for common,
single-thread clients is negligible. However, the overhead of thread
safety for multi-threaded clients is significant. Each GL API call
requires two calls to pthread_get_specific() which can noticably
requires two calls to pthread_get_specific() which can noticeably
impact performance. Warning: libGL.so is thread safe but individual
DRI drivers may not be. Please consult the documentation for a driver
to learn if it is thread safe.

@@ -58,15 +58,37 @@ It's the fastest software rasterizer for Mesa.
</pre>
<p>
For Windows you will need to build LLVM from source with MSVC or MINGW
(either natively or through cross compilers) and CMake, and set the LLVM
environment variable to the directory you installed it to.
For Windows you will need to build LLVM from source with MSVC or MINGW
(either natively or through cross compilers) and CMake, and set the LLVM
environment variable to the directory you installed it to.
LLVM will be statically linked, so when building on MSVC it needs to be
built with a matching CRT as Mesa, and you'll need to pass
-DLLVM_USE_CRT_RELEASE=MTd for debug and checked builds,
-DLLVM_USE_CRT_RELEASE=MTd for profile and release builds.
<code>-DLLVM_USE_CRT_xxx=yyy</code> as described below.
</p>
<table border="1">
<tr>
<th rowspan="2">LLVM build-type</th>
<th colspan="2" align="center">Mesa build-type</th>
</tr>
<tr>
<th>debug,checked</th>
<th>release,profile</th>
</tr>
<tr>
<th>Debug</th>
<td><code>-DLLVM_USE_CRT_DEBUG=MTd</code></td>
<td><code>-DLLVM_USE_CRT_DEBUG=MT</code></td>
</tr>
<tr>
<th>Release</th>
<td><code>-DLLVM_USE_CRT_RELEASE=MTd</code></td>
<td><code>-DLLVM_USE_CRT_RELEASE=MT</code></td>
</tr>
</table>
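<p>
The table can be read as a simple lookup: the LLVM build type chooses the
option name and the Mesa build type chooses between <tt>MTd</tt> and
<tt>MT</tt>. The helper below is a sketch of that lookup only;
<tt>llvm_crt_flag</tt> is a hypothetical name and is not part of Mesa or
LLVM.
</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/**
 * Look up the CMake option from the table above.
 *
 * \param llvmDebugBuild  true for an LLVM Debug build, false for Release
 * \param mesaBuildType   "debug", "checked", "release" or "profile"
 * \return the matching -DLLVM_USE_CRT_xxx=yyy option string
 */
static const char *
llvm_crt_flag(bool llvmDebugBuild, const char *mesaBuildType)
{
   /* Mesa debug and checked builds need the debug CRT (MTd). */
   bool debugCrt = strcmp(mesaBuildType, "debug") == 0 ||
                   strcmp(mesaBuildType, "checked") == 0;

   /* Only the four Mesa build types named in the table are handled. */
   assert(debugCrt ||
          strcmp(mesaBuildType, "release") == 0 ||
          strcmp(mesaBuildType, "profile") == 0);

   if (llvmDebugBuild) {
      return debugCrt ? "-DLLVM_USE_CRT_DEBUG=MTd"
                      : "-DLLVM_USE_CRT_DEBUG=MT";
   }

   return debugCrt ? "-DLLVM_USE_CRT_RELEASE=MTd"
                   : "-DLLVM_USE_CRT_RELEASE=MT";
}
```

<p>
For example, a Release build of LLVM that will be linked into a Mesa
<tt>checked</tt> build maps to <code>-DLLVM_USE_CRT_RELEASE=MTd</code>.
</p>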
<p>
You can build only the x86 target by passing -DLLVM_TARGETS_TO_BUILD=X86
to cmake.
</p>

@@ -1,59 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>OpenVG State Tracker</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>OpenVG State Tracker</h1>
<p>
The current version of the OpenVG state tracker implements OpenVG 1.1.
</p>
<p>
More information about OpenVG can be found at
<a href="http://www.khronos.org/openvg/">
http://www.khronos.org/openvg/</a> .
</p>
<p>
The OpenVG state tracker depends on the Gallium architecture and a working EGL implementation.
Please refer to <a href="egl.html">Mesa EGL</a> for more information about EGL.
</p>
<h2>Building the library</h2>
<ol>
<li>Run <code>configure</code> with <code>--enable-openvg</code> and
<code>--enable-gallium-egl</code>. If you do not need OpenGL, you can add
<code>--disable-opengl</code> to save the compilation time.</li>
<li>Build and install Mesa as usual.</li>
</ol>
<h3>Sample build</h3>
A sample build looks as follows:
<pre>
$ ./configure --disable-opengl --enable-openvg --enable-gallium-egl
$ make
$ make install
</pre>
<p>It will install <code>libOpenVG.so</code>, <code>libEGL.so</code>, and one
or more EGL drivers.</p>
<h2>OpenVG Demos</h2>
<p>OpenVG demos can be found in mesa/demos repository.</p>
</div>
</body>
</html>

@@ -21,6 +21,28 @@ The release notes summarize what's new or changed in each Mesa release.
</p>
<ul>
<li><a href="relnotes/10.5.8.html">10.5.8 release notes</a>
<li><a href="relnotes/10.6.0.html">10.6.0 release notes</a>
<li><a href="relnotes/10.5.7.html">10.5.7 release notes</a>
<li><a href="relnotes/10.5.6.html">10.5.6 release notes</a>
<li><a href="relnotes/10.5.5.html">10.5.5 release notes</a>
<li><a href="relnotes/10.5.4.html">10.5.4 release notes</a>
<li><a href="relnotes/10.5.3.html">10.5.3 release notes</a>
<li><a href="relnotes/10.5.2.html">10.5.2 release notes</a>
<li><a href="relnotes/10.4.7.html">10.4.7 release notes</a>
<li><a href="relnotes/10.5.1.html">10.5.1 release notes</a>
<li><a href="relnotes/10.5.0.html">10.5.0 release notes</a>
<li><a href="relnotes/10.4.6.html">10.4.6 release notes</a>
<li><a href="relnotes/10.4.5.html">10.4.5 release notes</a>
<li><a href="relnotes/10.4.4.html">10.4.4 release notes</a>
<li><a href="relnotes/10.4.3.html">10.4.3 release notes</a>
<li><a href="relnotes/10.4.2.html">10.4.2 release notes</a>
<li><a href="relnotes/10.3.7.html">10.3.7 release notes</a>
<li><a href="relnotes/10.4.1.html">10.4.1 release notes</a>
<li><a href="relnotes/10.3.6.html">10.3.6 release notes</a>
<li><a href="relnotes/10.4.html">10.4 release notes</a>
<li><a href="relnotes/10.3.5.html">10.3.5 release notes</a>
<li><a href="relnotes/10.3.4.html">10.3.4 release notes</a>
<li><a href="relnotes/10.3.3.html">10.3.3 release notes</a>
<li><a href="relnotes/10.3.2.html">10.3.2 release notes</a>
<li><a href="relnotes/10.3.1.html">10.3.1 release notes</a>

@@ -104,7 +104,7 @@ a07b4b6b9eb449b88a6cb5061e51c331 MesaLib-10.0.3.zip
<li>Add md5sums for 10.0.2. release.</li>
<li>cherry-ignore: Ignore several patches not yet ready for the stable branch</li>
<li>Drop another couple of patches.</li>
<li>cherry-ignore: Ignore 4 patches at teh request of the author, (Anuj).</li>
<li>cherry-ignore: Ignore 4 patches at the request of the author, (Anuj).</li>
<li>Update version to 10.0.3</li>
</ul>

@@ -88,6 +88,8 @@ following options during configure, if you would like support for svga driver
Note: The files are installed in $(libdir)/gallium-pipe/ and the interface
between them and libxatracker.so is <strong>not</strong> stable.
</p>
<li>The environment variable GALLIUM_MSAA that forced a multisample GLX visual was removed.</li>
</ul>
</div>

106
docs/relnotes/10.3.4.html Normal file
@@ -0,0 +1,106 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.3.4 Release Notes / November 21, 2014</h1>
<p>
Mesa 10.3.4 is a bug fix release which fixes bugs found since the 10.3.3 release.
</p>
<p>
Mesa 10.3.4 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
26482495ef6177f889dbd87c7edcccfedd995598785bbbd7e3e066352574c8e0 MesaLib-10.3.4.tar.gz
e6373913142338d10515daf619d659433bfd2989988198930c13b0945a15e98a MesaLib-10.3.4.tar.bz2
8c3ebbb6535daf3414305860ebca6ac67dbb6e3d35058c7a6ce18b84b5945b7f MesaLib-10.3.4.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76252">Bug 76252</a> - Dynamic loading/unloading of opengl32.dll results in a deadlock</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78770">Bug 78770</a> - [SNB bisected]Webglc conformance/textures/texture-size-limit.html fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83500">Bug 83500</a> - si_dma_copy_tile causes GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85647">Bug 85647</a> - Random radeonsi crashes with mesa 10.3.x</li>
</ul>
<h2>Changes</h2>
<p>Brian Paul (1):</p>
<ul>
<li>st/mesa: copy sampler_array_size field when copying instructions</li>
</ul>
<p>Chad Versace (1):</p>
<ul>
<li>i965: Fix segfault in WebGL Conformance on Ivybridge</li>
</ul>
<p>Dave Airlie (5):</p>
<ul>
<li>r600g/cayman: fix integer multiplication output overwrite (v2)</li>
<li>r600g/cayman: fix texture gather tests</li>
<li>r600g/cayman: handle empty vertex shaders</li>
<li>r600g: geom shaders: always load texture src regs from inputs</li>
<li>r600g: limit texture offset application to specific types (v2)</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: Add sha256 sums for the 10.3.3 release</li>
<li>configure.ac: roll up a program for the sse4.1 check</li>
<li>get-pick-list.sh: Require explicit "10.3" for nominating stable patches</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>st/mesa: add a fallback for clear_with_quad when no vs_layer</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>llvmpipe: Avoid deadlock when unloading opengl32.dll</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i915g: we also have more than 0 viewports!</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>radeonsi: Disable asynchronous DMA except for PIPE_BUFFER</li>
</ul>
</div>
</body>
</html>

88
docs/relnotes/10.3.5.html Normal file
@@ -0,0 +1,88 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.3.5 Release Notes / December 5, 2014</h1>
<p>
Mesa 10.3.5 is a bug fix release which fixes bugs found since the 10.3.4 release.
</p>
<p>
Mesa 10.3.5 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
7ea71c3cce89114df3dc050376afa1c6f6bf235d77a68f9703273603d6a90621 MesaLib-10.3.5.tar.gz
eb75d2790f1606d59d50a6acaa637b6c75f2155b3e0eca3d5099165c0d9556ae MesaLib-10.3.5.tar.bz2
164bc64ba63fb07ff255ff8de6ed3c95ff545dfe8f864c44c33abe94788da910 MesaLib-10.3.5.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86618">Bug 86618</a> - [NV96] neg modifiers not working in MIN and MAX operations</li>
</ul>
<h2>Changes</h2>
<p>Brian Paul (2):</p>
<ul>
<li>mesa: fix arithmetic error in _mesa_compute_compressed_pixelstore()</li>
<li>mesa: fix height error check for 1D array textures</li>
</ul>
<p>Chris Forbes (2):</p>
<ul>
<li>i965: Handle nested uniform array indexing</li>
<li>mesa: Fix Get(GL_TRANSPOSE_CURRENT_MATRIX_ARB) to transpose</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.3.5 release</li>
<li>Update version to 10.3.5</li>
</ul>
<p>Ilia Mirkin (6):</p>
<ul>
<li>nv50/ir: set neg modifiers on min/max args</li>
<li>nv50,nvc0: actually check constbufs for invalidation</li>
<li>nv50,nvc0: buffer resources can be bound as other things down the line</li>
<li>freedreno/ir3: don't pass consts to madsh.m16 in MOD logic</li>
<li>freedreno/a3xx: only enable blend clamp for non-float formats</li>
<li>freedreno/ir3: fix UMAD</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>configure.ac: bump libdrm_freedreno requirement</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.3.6.html Normal file

@@ -0,0 +1,124 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.3.6 Release Notes / December 29, 2014</h1>
<p>
Mesa 10.3.6 is a bug fix release which fixes bugs found since the 10.3.5 release.
</p>
<p>
Mesa 10.3.6 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
c4d053d6bc6604cb5c93c99e0ef2e815c539f26dc5a03737eb3809bc1767d12f MesaLib-10.3.6.tar.gz
8d43673c6788fbf85f9c36c3a95c61ccf46f8835fc9c0d85d34474490d80572b MesaLib-10.3.6.tar.bz2
6b5b1e9a13949cfdb76fe51e8dcc3ea71e464a5ca73d11fdc29c20c4ba3f411a MesaLib-10.3.6.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60879">Bug 60879</a> - [radeonsi] X11 can't start with acceleration enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82585">Bug 82585</a> - geometry shader with optional out variable segfaults</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82991">Bug 82991</a> - Inverted bumpmap in webgl applications</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84777">Bug 84777</a> - [BSW]Piglit spec_glsl-1.50_execution_geometry-basic fails</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (1):</p>
<ul>
<li>i965/brw_reg: struct constructor now needs explicit negate and abs values.</li>
</ul>
<p>Ben Widawsky (1):</p>
<ul>
<li>i965/gs: Avoid DW * DW mul</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>r600g: only init GS_VERT_ITEMSIZE on r600</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: Add sha256 sums for the 10.3.5 release</li>
<li>Revert "glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)"</li>
<li>Update version to 10.3.6</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>linker: Wrap access of producer_var with a NULL check</li>
<li>linker: Assign varying locations geometry shader inputs for SSO</li>
</ul>
<p>Ilia Mirkin (3):</p>
<ul>
<li>util/primconvert: pass index bias through</li>
<li>util/primconvert: support instanced rendering</li>
<li>util/primconvert: take ib offset into account</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>util/primconvert: Avoid point arithmetic; apply offset on all cases.</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>docs/relnotes: document the removal of GALLIUM_MSAA</li>
</ul>
<p>Mario Kleiner (4):</p>
<ul>
<li>glx/dri3: Fix glXWaitForSbcOML() to handle targetSBC==0 correctly. (v2)</li>
<li>glx/dri3: Track separate (ust, msc) for PresentPixmap vs. PresentNotifyMsc (v2)</li>
<li>glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)</li>
<li>glx/dri3: Don't fail on glXSwapBuffersMscOML(dpy, window, 0, 0, 0) (v2)</li>
</ul>
<p>Maxence Le Doré (1):</p>
<ul>
<li>glsl: Add gl_MaxViewports to available builtin constants</li>
</ul>
<p>Tom Stellard (1):</p>
<ul>
<li>radeonsi: Program RASTER_CONFIG for harvested GPUs v5</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.3.7.html Normal file

@@ -0,0 +1,93 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.3.7 Release Notes / January 12, 2015</h1>
<p>
Mesa 10.3.7 is a bug fix release which fixes bugs found since the 10.3.6 release.
</p>
<p>
Mesa 10.3.7 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
bc13f33c19bc9f44a0565fdd51a8f9d1c0153a3365c429ceaf4ef43b7022b052 MesaLib-10.3.7.tar.gz
43c6ced15e237cbb21b3082d7c0b42777c50c1f731d0d4b5efb5231063fb6a5b MesaLib-10.3.7.tar.bz2
d821fd46baf804fecfcf403e901800a4b996c7dd1c83f20a354b46566a49026f MesaLib-10.3.7.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85529">Bug 85529</a> - Surfaces not drawn in Unvanquished</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87619">Bug 87619</a> - Changes to state such as render targets change fragment shader without marking it dirty.</li>
</ul>
<h2>Changes</h2>
<p>Chad Versace (2):</p>
<ul>
<li>i965: Use safer pointer arithmetic in intel_texsubimage_tiled_memcpy()</li>
<li>i965: Use safer pointer arithmetic in gather_oa_results()</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.3.6 release</li>
<li>Update version to 10.3.7</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>nv50,nvc0: set vertex id base to index_bias</li>
<li>nv50/ir: fix texture offsets in release builds</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Add missing BRW_NEW_*_PROG_DATA to texture/renderbuffer atoms.</li>
<li>i965: Fix start/base_vertex_location for &gt;1 prims but !BRW_NEW_VERTICES.</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>glsl_to_tgsi: fix a bug in copy propagation</li>
<li>vbo: ignore primitive restart if FixedIndex is enabled in DrawArrays</li>
<li>st/mesa: fix GL_PRIMITIVE_RESTART_FIXED_INDEX</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>radeonsi: Don't modify PA_SC_RASTER_CONFIG register value if rb_mask == 0</li>
</ul>
</div>
</body>
</html>


@@ -327,6 +327,7 @@ DRM drivers that don't have a full-fledged GEM (such as qxl or simpledrm)</li>
<li>Removed support for the GL_ATI_envmap_bumpmap extension</li>
<li>The hacky --enable-32/64-bit is no longer available in configure. To build
32/64 bit mesa refer to the default method recommended by your distribution</li>
<li>The environment variable GALLIUM_MSAA that forced a multisample GLX visual was removed.</li>
</ul>
</div>

docs/relnotes/10.4.1.html Normal file

@@ -0,0 +1,97 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.1 Release Notes / December 29, 2014</h1>
<p>
Mesa 10.4.1 is a bug fix release which fixes bugs found since the 10.4.0 release.
</p>
<p>
Mesa 10.4.1 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
5311285e791a6bfaa468ad002bd1e1164acb3eaa040b5a1bf958bdb7c27e0a9d MesaLib-10.4.1.tar.gz
91e8b71c8aff4cb92022a09a872b1c5d1ae5bfec8c6c84dbc4221333da5bf1ca MesaLib-10.4.1.tar.bz2
e09c8135f5a86ecb21182c6f8959aafd39ae2f98858fdf7c0e25df65b5abcdb8 MesaLib-10.4.1.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82585">Bug 82585</a> - geometry shader with optional out variable segfaults</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82991">Bug 82991</a> - Inverted bumpmap in webgl applications</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83908">Bug 83908</a> - [i965] Incorrect icon colors in Steam Big Picture</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (1):</p>
<ul>
<li>i965/brw_reg: struct constructor now needs explicit negate and abs values.</li>
</ul>
<p>Cody Northrop (1):</p>
<ul>
<li>i965: Require pixel alignment for GPU copy blit</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: Add 10.4 sha256 sums, news item and link release notes</li>
<li>Revert "glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)"</li>
<li>Update version to 10.4.1</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>linker: Wrap access of producer_var with a NULL check</li>
<li>linker: Assign varying locations geometry shader inputs for SSO</li>
</ul>
<p>Mario Kleiner (4):</p>
<ul>
<li>glx/dri3: Fix glXWaitForSbcOML() to handle targetSBC==0 correctly. (v2)</li>
<li>glx/dri3: Track separate (ust, msc) for PresentPixmap vs. PresentNotifyMsc (v2)</li>
<li>glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)</li>
<li>glx/dri3: Don't fail on glXSwapBuffersMscOML(dpy, window, 0, 0, 0) (v2)</li>
</ul>
<p>Maxence Le Doré (1):</p>
<ul>
<li>glsl: Add gl_MaxViewports to available builtin constants</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.4.2.html Normal file

@@ -0,0 +1,127 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.2 Release Notes / January 12, 2015</h1>
<p>
Mesa 10.4.2 is a bug fix release which fixes bugs found since the 10.4.1 release.
</p>
<p>
Mesa 10.4.2 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
e303e77dd774df0d051b2870b165f98c97084a55980f884731df89c1b56a6146 MesaLib-10.4.2.tar.gz
08a119937d9f2aa2f66dd5de97baffc2a6e675f549e40e699a31f5485d15327f MesaLib-10.4.2.tar.bz2
c2c2921a80a3395824f02bee4572a6a17d6a12a928a3e497618eeea04fb06490 MesaLib-10.4.2.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85529">Bug 85529</a> - Surfaces not drawn in Unvanquished</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87619">Bug 87619</a> - Changes to state such as render targets change fragment shader without marking it dirty.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87658">Bug 87658</a> - [llvmpipe] SEGV in sse2_has_daz on ancient Pentium4-M</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87913">Bug 87913</a> - CPU cacheline size of 0 can be returned by CPUID leaf 0x80000006 in some virtual machines</li>
</ul>
<h2>Changes</h2>
<p>Chad Versace (2):</p>
<ul>
<li>i965: Use safer pointer arithmetic in intel_texsubimage_tiled_memcpy()</li>
<li>i965: Use safer pointer arithmetic in gather_oa_results()</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>Revert "r600g/sb: fix issues cause by GLSL switching to loops for switch"</li>
<li>r600g: fix regression since UCMP change</li>
<li>r600g/sb: implement r600 gpr index workaround. (v3.1)</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.1 release</li>
<li>Update version to 10.4.2</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>nv50,nvc0: set vertex id base to index_bias</li>
<li>nv50/ir: fix texture offsets in release builds</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Add missing BRW_NEW_*_PROG_DATA to texture/renderbuffer atoms.</li>
<li>i965: Fix start/base_vertex_location for &gt;1 prims but !BRW_NEW_VERTICES.</li>
</ul>
<p>Leonid Shatz (1):</p>
<ul>
<li>gallium/util: make sure cache line size is not zero</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>glsl_to_tgsi: fix a bug in copy propagation</li>
<li>vbo: ignore primitive restart if FixedIndex is enabled in DrawArrays</li>
<li>st/mesa: fix GL_PRIMITIVE_RESTART_FIXED_INDEX</li>
<li>radeonsi: fix VertexID for OpenGL</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>radeonsi: Don't modify PA_SC_RASTER_CONFIG register value if rb_mask == 0</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>gallium/util: fix crash with daz detection on x86</li>
</ul>
<p>Tiziano Bacocco (1):</p>
<ul>
<li>nv50,nvc0: implement half_pixel_center</li>
</ul>
<p>Vadim Girlin (1):</p>
<ul>
<li>r600g/sb: fix issues with loops created for switch</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.4.3.html Normal file

@@ -0,0 +1,145 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.3 Release Notes / January 24, 2015</h1>
<p>
Mesa 10.4.3 is a bug fix release which fixes bugs found since the 10.4.2 release.
</p>
<p>
Mesa 10.4.3 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
c53eaafc83d9c6315f63e0904d9954d929b841b0b2be7a328eeb6e14f1376129 MesaLib-10.4.3.tar.gz
ef6ecc9c2f36c9f78d1662382a69ae961f38f03af3a0c3268e53f351aa1978ad MesaLib-10.4.3.tar.bz2
179325fc8ec66529d3b0d0c43ef61a33a44d91daa126c3bbdd1efdfd25a7db1d MesaLib-10.4.3.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80568">Bug 80568</a> - [gen4] GPU Crash During Google Chrome Operation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85367">Bug 85367</a> - [gen4] GPU hang in glmark-es2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85696">Bug 85696</a> - r600g+nine: Bioshock shader failure after 7b1c0cbc90d456384b0950ad21faa3c61a6b43ff</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88219">Bug 88219</a> - include/c11/threads_posix.h:197: undefined reference to `pthread_mutex_lock'</li>
</ul>
<h2>Changes</h2>
<p>Axel Davy (39):</p>
<ul>
<li>st/nine: Add new texture format strings</li>
<li>st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS</li>
<li>st/nine: NineBaseTexture9: fix setting of last_layer</li>
<li>st/nine: CubeTexture: fix GetLevelDesc</li>
<li>st/nine: Fix crash when deleting non-implicit swapchain</li>
<li>st/nine: Return D3DERR_INVALIDCALL when trying to create a texture of bad format</li>
<li>st/nine: NineBaseTexture9: update sampler view creation</li>
<li>st/nine: Check if srgb format is supported before trying to use it.</li>
<li>st/nine: Add ATI1 and ATI2 support</li>
<li>st/nine: Rework of boolean constants</li>
<li>st/nine: Convert integer constants to floats before storing them when cards don't support integers</li>
<li>st/nine: Remove some shader unused code</li>
<li>st/nine: Saturate oFog and oPts vs outputs</li>
<li>st/nine: Correctly declare NineTranslateInstruction_Mkxn inputs</li>
<li>st/nine: Fix typo for M4x4</li>
<li>st/nine: Fix POW implementation</li>
<li>st/nine: Handle RSQ special cases</li>
<li>st/nine: Handle NRM with input of null norm</li>
<li>st/nine: Correct LOG on negative values</li>
<li>st/nine: Rewrite LOOP implementation, and a0 aL handling</li>
<li>st/nine: Fix CND implementation</li>
<li>st/nine: Clamp ps 1.X constants</li>
<li>st/nine: Fix some fixed function pipeline operation</li>
<li>st/nine: Implement TEXCOORD special behaviours</li>
<li>st/nine: Fill missing dst and src number for some instructions.</li>
<li>st/nine: Fix TEXM3x3 and implement TEXM3x3VSPEC</li>
<li>st/nine: implement TEXM3x2DEPTH</li>
<li>st/nine: Implement TEXM3x2TEX</li>
<li>st/nine: Implement TEXM3x3SPEC</li>
<li>st/nine: Implement TEXDEPTH</li>
<li>st/nine: Implement TEXDP3</li>
<li>st/nine: Implement TEXDP3TEX</li>
<li>st/nine: Implement TEXREG2AR, TEXREG2GB and TEXREG2RGB</li>
<li>st/nine: Correct rules for relative adressing and constants.</li>
<li>st/nine: Remove unused code for ps</li>
<li>st/nine: Fix sm3 relative addressing for non-debug build</li>
<li>st/nine: Add variables containing the size of the constant buffers</li>
<li>st/nine: Allocate the correct size for the user constant buffer</li>
<li>st/nine: Allocate vs constbuf buffer for indirect addressing once.</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.2 release</li>
<li>Update version to 10.4.3</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>mesa: Fix clamping to -1.0 in snorm_to_float</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>glsl: Link glsl_test with pthreads library.</li>
</ul>
<p>Jose Fonseca (1):</p>
<ul>
<li>nine: Drop use of TGSI_OPCODE_CND.</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Respect the no_8 flag on Gen6, not just Gen7+.</li>
<li>i965: Work around mysterious Gen4 GPU hangs with minimal state changes.</li>
</ul>
<p>Stanislaw Halik (1):</p>
<ul>
<li>st/nine: Hack to generate resource if it doesn't exist when getting view</li>
</ul>
<p>Xavier Bouchoux (3):</p>
<ul>
<li>st/nine: Additional defines to d3dtypes.h</li>
<li>st/nine: Add missing c++ declaration for IDirect3DVolumeTexture9</li>
<li>st/nine: Fix D3DRS_POINTSPRITE support</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.4.4.html Normal file

@@ -0,0 +1,100 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.4 Release Notes / February 06, 2015</h1>
<p>
Mesa 10.4.4 is a bug fix release which fixes bugs found since the 10.4.3 release.
</p>
<p>
Mesa 10.4.4 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
5cb427eaf980cb8555953e9928f5797979ed783e277745d5f8cbae8bc5364086 MesaLib-10.4.4.tar.gz
f18a967e9c4d80e054b2fdff8c130ce6e6d1f8eecfc42c9f354f8628d8b4df1c MesaLib-10.4.4.tar.bz2
86baad73b77920c80fe58402a905e7dd17e3ea10ead6ea7d3afdc0a56c860bd7 MesaLib-10.4.4.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88662">Bug 88662</a> - unaligned access to gl_dlist_node</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88930">Bug 88930</a> - [osmesa] osbuffer-&gt;textures should be indexed by attachment type</li>
</ul>
<h2>Changes</h2>
<p>Brian Paul (1):</p>
<ul>
<li>mesa: fix display list 8-byte alignment issue</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.3 release</li>
<li>Update version to 10.4.4</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>egl: Pass the correct X visual depth to xcb_put_image().</li>
</ul>
<p>Mario Kleiner (1):</p>
<ul>
<li>glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>gallium/util: Don't use __builtin_clrsb in util_last_bit().</li>
</ul>
<p>Niels Ole Salscheider (1):</p>
<ul>
<li>configure: Link against all LLVM targets when building clover</li>
</ul>
<p>Park, Jeongmin (1):</p>
<ul>
<li>st/osmesa: Fix osbuffer-&gt;textures indexing</li>
</ul>
<p>Ville Syrjälä (1):</p>
<ul>
<li>i965: Fix max_wm_threads for CHV</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.4.5.html Normal file

@@ -0,0 +1,114 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.5 Release Notes / February 21, 2015</h1>
<p>
Mesa 10.4.5 is a bug fix release which fixes bugs found since the 10.4.4 release.
</p>
<p>
Mesa 10.4.5 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
e12bbdaee9a758617e8ebd0bb0e987f72addd11db2e4da25ba695e386cd63843 MesaLib-10.4.5.tar.gz
bf60000700a9d58e3aca2bfeee7e781053b0d839e61a95b1883e05a2dee247a0 MesaLib-10.4.5.tar.bz2
3b926de8eee500bb67cf85332c51292f826cc539b8636382aadbb8e70c76527a MesaLib-10.4.5.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82477">Bug 82477</a> - [softpipe] piglit fp-long-alu regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88658">Bug 88658</a> - (bisected) Slow video playback on Kabini</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89069">Bug 89069</a> - Lack of grass in The Talos Principle on radeonsi (native\wine\nine)</li>
</ul>
<h2>Changes</h2>
<p>Carl Worth (1):</p>
<ul>
<li>Revert use of Mesa IR optimizer for ARB_fragment_programs</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.4 release</li>
<li>get-pick-list.sh: Require explicit "10.4" for nominating stable patches</li>
<li>Update version to 10.4.5</li>
</ul>
<p>Ilia Mirkin (3):</p>
<ul>
<li>nvc0: bail out of 2d blits with non-A8_UNORM alpha formats</li>
<li>st/mesa: treat resource-less xfb buffers as if they weren't there</li>
<li>nvc0: allow holes in xfb target lists</li>
</ul>
<p>Jeremy Huddleston Sequoia (2):</p>
<ul>
<li>darwin: build fix</li>
<li>darwin: build fix</li>
</ul>
<p>Kenneth Graunke (4):</p>
<ul>
<li>i965: Override swizzles for integer luminance formats.</li>
<li>i965: Use a gl_color_union for sampler border color.</li>
<li>i965: Fix integer border color on Haswell.</li>
<li>glsl: Reduce memory consumption of copy propagation passes.</li>
</ul>
<p>Laura Ekstrand (1):</p>
<ul>
<li>main: Fixed _mesa_GetCompressedTexImage_sw to copy slices correctly.</li>
</ul>
<p>Marek Olšák (5):</p>
<ul>
<li>r600g,radeonsi: don't append to streamout buffers that haven't been used yet</li>
<li>radeonsi: fix instanced arrays with non-zero start instance</li>
<li>radeonsi: small fix in SPI state</li>
<li>mesa: fix AtomicBuffer typo in _mesa_DeleteBuffers</li>
<li>radeonsi: fix a crash if a stencil ref state is set before a DSA state</li>
</ul>
<p>Michel Dänzer (2):</p>
<ul>
<li>st/mesa: Don't use PIPE_USAGE_STREAM for GL_PIXEL_UNPACK_BUFFER_ARB</li>
<li>Revert "radeon/llvm: enable unsafe math for graphics shaders"</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.4.6.html Normal file

@@ -0,0 +1,143 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.6 Release Notes / March 06, 2015</h1>
<p>
Mesa 10.4.6 is a bug fix release which fixes bugs found since the 10.4.5 release.
</p>
<p>
Mesa 10.4.6 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
46c9082142e811c01e49a2c332a9ac0a1eb98f2908985fb9df216539d7eaeaf4 MesaLib-10.4.6.tar.gz
d8baedd20e79ccd98a5a7b05e23d59a30892e68de1fcc057ca6873dafca02735 MesaLib-10.4.6.tar.bz2
6aded6eac7f0d4d55117b8b581d8424710bbb4c768fc90f7b881f29311a751aa MesaLib-10.4.6.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=45348">Bug 45348</a> - [swrast] piglit fbo-drawbuffers-arbfp regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84613">Bug 84613</a> - [G965, bisected] piglit regressions : glslparsertest.glsl2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87516">Bug 87516</a> - glProgramBinary violates spec</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88885">Bug 88885</a> - Transform feedback uses incorrect interleaving if a previous draw did not write gl_Position</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89180">Bug 89180</a> - [IVB regression] Rendering issues in Mass Effect through VMware Workstation</li>
</ul>
<h2>Changes</h2>
<p>Abdiel Janulgue (2):</p>
<ul>
<li>glsl: Don't optimize min/max into saturate when EmitNoSat is set</li>
<li>st/mesa: For vertex shaders, don't emit saturate when SM 3.0 is unsupported</li>
</ul>
<p>Andreas Boll (1):</p>
<ul>
<li>glx: Fix returned values of GLX_RENDERER_PREFERRED_PROFILE_MESA</li>
</ul>
<p>Brian Paul (2):</p>
<ul>
<li>swrast: fix multiple color buffer writing</li>
<li>st/mesa: fix sampler view reference counting bug in glDraw/CopyPixels</li>
</ul>
<p>Chris Forbes (1):</p>
<ul>
<li>i965/gs: Check newly-generated GS-out VUE map against correct stage</li>
</ul>
<p>Eduardo Lima Mitev (1):</p>
<ul>
<li>mesa: Fix error validating args for TexSubImage3D</li>
</ul>
<p>Emil Velikov (6):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.5 release</li>
<li>install-lib-links: remove the .install-lib-links file</li>
<li>Revert "mesa: Correct backwards NULL check."</li>
<li>mesa: cherry-pick the second half of commit 2aa71e9485a</li>
<li>Revert "gallivm: Update for RTDyldMemoryManager becoming an unique_ptr."</li>
<li>Update version to 10.4.6</li>
</ul>
<p>Ian Romanick (3):</p>
<ul>
<li>mesa: Add missing error checks in _mesa_ProgramBinary</li>
<li>mesa: Ensure that length is set to zero in _mesa_GetProgramBinary</li>
<li>mesa: Always generate GL_INVALID_OPERATION in _mesa_GetProgramBinary</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>auxilary/os: correct sysctl use in os_get_total_physical_memory()</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>gallivm: Update for RTDyldMemoryManager becoming an unique_ptr.</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>st/omx/dec/h264: fix picture out-of-order with poc type 0 v2</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>install-lib-links: don't depend on .libs directory</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>vbo: fix an unitialized-variable warning</li>
<li>radeonsi: fix point sprites</li>
</ul>
<p>Matt Turner (4):</p>
<ul>
<li>glsl: Rewrite and fix min/max to saturate optimization.</li>
<li>mesa: Correct backwards NULL check.</li>
<li>i965/fs: Don't use backend_visitor::instructions after creating the CFG.</li>
<li>mesa: Correct backwards NULL check.</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.4.7.html Normal file

@@ -0,0 +1,134 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.7 Release Notes / March 20, 2015</h1>
<p>
Mesa 10.4.7 is a bug fix release which fixes bugs found since the 10.4.6 release.
</p>
<p>
Mesa 10.4.7 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
9e7b59267199658808f8b33e0410b86fbafbdcd52378658b9df65fac9d24947f MesaLib-10.4.7.tar.gz
2c351c98671f9a7ab3fd9c601bb7a255801b1580f5dd0992639f99152801b0d2 MesaLib-10.4.7.tar.bz2
d14ac578b5ce16560757b53fbd1cb4d6b34652f8e110e4b10a019adc82e67ffd MesaLib-10.4.7.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79202">Bug 79202</a> - valgrind errors in glsl-fs-uniform-array-loop-unroll.shader_test; random code generation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89156">Bug 89156</a> - r300g: GL_COMPRESSED_RED_RGTC1 / ATI1N support broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89224">Bug 89224</a> - Incorrect rendering of Unigine Valley running in VM on VMware Workstation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89530">Bug 89530</a> - FTBFS in loader: missing fstat</li>
</ul>
<h2>Changes</h2>
<p>Andrey Sudnik (1):</p>
<ul>
<li>i965/vec4: Don't lose the saturate modifier in copy propagation.</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>egl: Take alpha bits into account when selecting GBM formats</li>
</ul>
<p>Emil Velikov (6):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.6 release</li>
<li>cherry-ignore: add not applicable/rejected commits</li>
<li>mesa: rename format_info.c to format_info.h</li>
<li>loader: include &lt;sys/stat.h&gt; for non-sysfs builds</li>
<li>auxiliary/os: fix the android build - s/drm_munmap/os_munmap/</li>
<li>Update version to 10.4.7</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>i965: Fix out-of-bounds accesses into pull_constant_loc array</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>freedreno: move fb state copy after checking for size change</li>
<li>freedreno/ir3: fix array count returned by TXQ</li>
<li>freedreno/ir3: get the # of miplevels from getinfo</li>
<li>freedreno: fix slice pitch calculations</li>
</ul>
<p>Marc-Andre Lureau (1):</p>
<ul>
<li>gallium/auxiliary/indices: fix start param</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>r300g: fix RGTC1 and LATC1 SNORM formats</li>
<li>r300g: fix a crash when resolving into an sRGB texture</li>
<li>r300g: fix sRGB-&gt;sRGB blits</li>
<li>radeonsi: increase coords array size for radeon_llvm_emit_prepare_cube_coords</li>
</ul>
<p>Mario Kleiner (1):</p>
<ul>
<li>glx: Handle out-of-sequence swap completion events correctly. (v2)</li>
</ul>
<p>Matt Turner (2):</p>
<ul>
<li>r300g: Use PATH_MAX instead of limiting ourselves to 100 chars.</li>
<li>r300g: Check return value of snprintf().</li>
</ul>
<p>Rob Clark (2):</p>
<ul>
<li>freedreno/ir3: fix silly typo for binning pass shaders</li>
<li>freedreno: update generated headers</li>
</ul>
<p>Samuel Iglesias Gonsalvez (1):</p>
<ul>
<li>glsl: optimize (0 cmp x + y) into (-x cmp y).</li>
</ul>
<p>Stefan Dösinger (1):</p>
<ul>
<li>r300g: Fix the ATI1N swizzle (RGTC1 and LATC1)</li>
</ul>
</div>
</body>
</html>
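
The SHA256 checksums listed in the 10.4.7 notes above can be checked against a downloaded archive before building. A minimal sketch in Python, assuming the archive is already on disk; the helper names (`sha256_of_file`, `parse_checksum_lines`, `verify`) are illustrative and not part of Mesa:

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksum_lines(text):
    """Map file name -> expected digest from '<hex digest> <file name>' lines.

    Lines that do not consist of a 64-character digest followed by a
    single name are skipped.
    """
    expected = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and len(parts[0]) == 64:
            expected[parts[1]] = parts[0].lower()
    return expected


def verify(path, expected_hex):
    """Return True if the file's SHA256 digest equals expected_hex."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())
```

With `notes_text` holding the contents of the `<pre>` block, `verify("MesaLib-10.4.7.tar.gz", parse_checksum_lines(notes_text)["MesaLib-10.4.7.tar.gz"])` should return `True` for an intact download. The `sha256sum` command-line tool does the same job where it is available.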


@@ -14,7 +14,7 @@
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4 Release Notes / TBD</h1>
<h1>Mesa 10.4 Release Notes / December 14, 2014</h1>
<p>
Mesa 10.4 is a new development release.
@@ -31,9 +31,11 @@ because compatibility contexts are not supported.
</p>
<h2>MD5 checksums</h2>
<h2>SHA256 checksums</h2>
<pre>
TBD.
abfbfd2d91ce81491c5bb6923ae649212ad5f82d0bee277de8704cc948dc221e MesaLib-10.4.0.tar.gz
98a7dff3a1a6708c79789de8b9a05d8042e867067f70e8f30387c15026233219 MesaLib-10.4.0.tar.bz2
443a6d46d0691b5ac811d8d30091b1716c365689b16d49c57cf273c2b76086fe MesaLib-10.4.0.zip
</pre>
@@ -47,18 +49,209 @@ Note: some of the new features are only available with certain drivers.
<li>GL_ARB_conditional_render_inverted on nv50</li>
<li>GL_ARB_sample_shading on r600</li>
<li>GL_ARB_texture_view on nv50, nvc0</li>
<li>GL_ARB_clip_control on llvmpipe, softpipe, r300, r600, radeonsi</li>
<li>GL_ARB_clip_control on nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe</li>
<li>GL_KHR_context_flush_control on all drivers</li>
</ul>
<h2>Bug fixes</h2>
TBD.
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79963">Bug 79963</a> - [ILK Bisected]some piglit and ogles2conform cases fail </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=29661">Bug 29661</a> - MSVC built u_format_test fails on Windows</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=38873">Bug 38873</a> - [855gm] gnome-shell misrendered</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54372">Bug 54372</a> - GLX_INTEL_swap_event crashes driver when swapping window buffers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60879">Bug 60879</a> - [radeonsi] X11 can't start with acceleration enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61415">Bug 61415</a> - Clover ignores --with-opencl-libdir path</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64471">Bug 64471</a> - Radeon HD6570 lockup in Brütal Legend with HyperZ</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66184">Bug 66184</a> - src/mesa/state_tracker/st_glsl_to_tgsi.cpp:3216:simplify_cmp: Assertion `inst-&gt;dst.index &lt; 4096' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=67672">Bug 67672</a> - [llvmpipe] lp_test_arit fails on old CPUs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69200">Bug 69200</a> - [Bisected]Piglit glx/glx-multithread-shader-compile aborted</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70410">Bug 70410</a> - egl-static/Makefile: linking fails with llvm &gt;= 3.4</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72685">Bug 72685</a> - [radeonsi hyperz] Artifacts in Unigine Sanctuary</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72819">Bug 72819</a> - [855GM] Incorrect drop shadow color on windows and strange white rectangle when showing/hiding GLX-dock...</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74563">Bug 74563</a> - Surfaceless contexts are not properly released by DRI drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74863">Bug 74863</a> - [r600g] HyperZ broken on RV770 and CYPRESS (Left 4 Dead 2 trees corruption) bisected!</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75011">Bug 75011</a> - [hyperz] Performance drop since git-01e6371 (disable hyperz by default) with radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75112">Bug 75112</a> - Meta Bug for HyperZ issues on r600g and radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76252">Bug 76252</a> - Dynamic loading/unloading of opengl32.dll results in a deadlock</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76861">Bug 76861</a> - mid3 generates slow code for constant arguments</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77957">Bug 77957</a> - Variably-indexed constant arrays result in terrible shader code</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78468">Bug 78468</a> - Compiling of shader gets stuck in infinite loop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78770">Bug 78770</a> - [SNB bisected]Webglc conformance/textures/texture-size-limit.html fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79155">Bug 79155</a> - [Tesseract Game] Global Illumination: Medium Causes Color Distortion</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79462">Bug 79462</a> - [NVC0/Codegen] Shader compilation falis in spill logic</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80011">Bug 80011</a> - [softpipe] tgsi/tgsi_exec.c:2023:exec_txf: Assertion `0' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80012">Bug 80012</a> - [softpipe] draw/draw_gs.c:113:tgsi_fetch_gs_outputs: Assertion `!util_is_inf_or_nan(output[slot][0])' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80050">Bug 80050</a> - [855GM] Incorrect drop shadow color under windows in Cinnamon persists with MESA 10.1.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80247">Bug 80247</a> - Khronos conformance test ES3-CTS.gtf.GL3Tests.transform_feedback.transform_feedback_vertex_id fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80561">Bug 80561</a> - Incorrect implementation of some VDPAU APIs.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80615">Bug 80615</a> - Files in bellagio directory [omx tracker] don't respect installation folder</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80848">Bug 80848</a> - [dri3] Building mesa fails with dri3 enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81680">Bug 81680</a> - [r600g] Firefox crashes with hardware acceleration turned on</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82255">Bug 82255</a> - [VP2] Chroma planes are vertically stretched during VDPAU playback</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82472">Bug 82472</a> - piglit 16385-consecutive-chars regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82537">Bug 82537</a> - Stunt Rally GLSL compiler assertion failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82538">Bug 82538</a> - Super Maryo Chronicles fails with st/mesa assertion failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82539">Bug 82539</a> - vmw_screen_dri.lo In file included from vmw_screen_dri.c:41: vmwgfx_drm.h:32:17: error: drm.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82796">Bug 82796</a> - [IVB/BYT-M/HSW/BDW Bisected]Synmark2_v6.0_OglTerrainFlyInst/OglTerrainPanInst cannot run as image validation failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82804">Bug 82804</a> - unreal engine 4 rendering errors</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82828">Bug 82828</a> - Regression: Crash in 3Dmark2001</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82846">Bug 82846</a> - [BDW Bisected] Gpu hang when running Lightsmark v2008/Warsow v1.0/Xonotic v0.7/unigine-demos</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82881">Bug 82881</a> - test_vec4_register_coalesce regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82882">Bug 82882</a> - [swrast] piglit glsl-fs-uniform-bool-1 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82921">Bug 82921</a> - layout(location=0) emits error &gt;= MAX_UNIFORM_LOCATIONS due to integer underflow</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82929">Bug 82929</a> - [BDW Bisected]glxgears causes X hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82932">Bug 82932</a> - [SNB+ Bisected]Ogles3conform ES3-CTS.shaders.indexing.vector_subscript.vec3_static_loop_subscript_write_direct_read_vertex fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83079">Bug 83079</a> - [NVC0] Dota 2 (Linux native and Wine) crash with Nouveau Drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83080">Bug 83080</a> - [SNB+ Bisected]ES3-CTS.shaders.loops.do_while_constant_iterations.mixed_break_continue_fragment fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83081">Bug 83081</a> - [BDW Bisected]Piglit spec_ARB_sample_shading_builtin-gl-sample-mask_2 is core dumped</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83127">Bug 83127</a> - [ILK Bisected]Piglit glean_texCombine fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83148">Bug 83148</a> - Unity invisible under Ubuntu 14.04 and 14.10</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83355">Bug 83355</a> - FTBFS: src/mesa/program/program_lexer.l:122:64: error: unknown type name 'YYSTYPE'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83380">Bug 83380</a> - Linking fails when not writing gl_Position.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83418">Bug 83418</a> - EU IV is incorrectly rendered after git1409011930.d571f2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83432">Bug 83432</a> - r600_query.c:269:r600_emit_query_end: Assertion `ctx-&gt;num_pipelinestat_queries &gt; 0' failed [Gallium HUD]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83463">Bug 83463</a> - [swrast] piglit glsl-vs-clamp-1 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83468">Bug 83468</a> - [UBO] Using bool from UBO as if-statement condition asserts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83500">Bug 83500</a> - si_dma_copy_tile causes GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83506">Bug 83506</a> - [UBO] row_major layout ignored inside structures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83533">Bug 83533</a> - [UBO] nested structures don't get appropriate padding</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83573">Bug 83573</a> - [swrast] piglit fs-op-not-bool-using-if regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83574">Bug 83574</a> - [llvmpipe] [softpipe] piglit arb_explicit_uniform_location-use-of-unused-loc regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83741">Bug 83741</a> - [UBO] row_major layout partially ignored for arrays of structures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83777">Bug 83777</a> - [regression] ilo fails to build</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83934">Bug 83934</a> - Structures must have same name to be considered same type.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84140">Bug 84140</a> - mplayer crashes playing some files using vdpau output</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84145">Bug 84145</a> - UE4: Realistic Rendering Demo render blue</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84178">Bug 84178</a> - Big glamor regression in Xorg server 1.6.99.1 GIT: x11perf 1.5 Test: PutImage XY 500x500 Square</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84355">Bug 84355</a> - texture2DProjLod and textureCubeLod are not supported when using GLES.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84529">Bug 84529</a> - [IVB bisected] glean fragProg1 CMP test failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84538">Bug 84538</a> - lp_test_format.c:226:4: error: too few arguments to function gallivm_create</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84539">Bug 84539</a> - brw_fs_register_coalesce.cpp:183: bool fs_visitor::register_coalesce(): Assertion `src_size &lt;= 11' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84557">Bug 84557</a> - [HSW] &quot;Emit ELSE/ENDIF JIP with type D on Gen 7&quot; causes Atomic Afterlife and GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84651">Bug 84651</a> - Distorted graphics or black window when running Battle.net app on Intel hardware via wine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84662">Bug 84662</a> - Long pauses with Unreal demo Elemental on R9270X since : Always flush the HDP cache before submitting a CS to the GPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84777">Bug 84777</a> - [BSW]Piglit spec_glsl-1.50_execution_geometry-basic fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84807">Bug 84807</a> - Build issue starting between bf4aecfb2acc8d0dc815105d2f36eccbc97c284b and a3e9582f09249ad27716ba82c7dfcee685b65d51</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85189">Bug 85189</a> - llvm/invocation.cpp: In function 'void {anonymous}::optimize(llvm::Module*, unsigned int, const std::vector&lt;llvm::Function*&gt;&amp;)': llvm/invocation.cpp:324:18: error: expected type-specifier</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85267">Bug 85267</a> - vlc crashes with vdpau (Radeon 3850HD) [r600]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85377">Bug 85377</a> - lp_test_format failure with llvm-3.6</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85425">Bug 85425</a> - [bisected] Compiler error in clip control operations in meta</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85429">Bug 85429</a> - indirect.c:296: multiple definition of `__indirect_glNewList'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85454">Bug 85454</a> - Unigine Sanctuary with Wine crashes on Mesa Git</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85647">Bug 85647</a> - Random radeonsi crashes with mesa 10.3.x</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85683">Bug 85683</a> - [i965 Bisected]Piglit shaders_glsl-vs-raytrace-bug26691 segfault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85691">Bug 85691</a> - 'glsl: Drop constant 0.0 components from dot products.' broke piglit shaders/glsl-gnome-shell-dim-window and a few others with Gallium</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86025">Bug 86025</a> - src\glsl\list.h(535) : error C2143: syntax error : missing ';' before 'type'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86089">Bug 86089</a> - [r600g][mesa 10.4.0-dev] shader failure - r600_sb::bc_finalizer::cf_peephole() when starting Second Life</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86145">Bug 86145</a> - Pipeline statistic counter values for VF always 0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86618">Bug 86618</a> - [NV96] neg modifiers not working in MIN and MAX operations</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86760">Bug 86760</a> - mesa doesn't build: recipe for target 'r600_llvm.lo' failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86764">Bug 86764</a> - [SNB+ Bisected]Piglit glean/pointSprite fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86788">Bug 86788</a> - (bisected) 32bit UrbanTerror 4.1 timedemo sse4.1 segfault...</li>
</ul>
<h2>Changes</h2>
<ul>
<li>The environment variable GALLIUM_MSAA that forced a multisample GLX visual was removed.</li>
</ul>
</div>

212
docs/relnotes/10.5.0.html Normal file

@@ -0,0 +1,212 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.5.0 Release Notes / March 06, 2015</h1>
<p>
Mesa 10.5.0 is a new development release.
People who are concerned with stability and reliability should stick
with a previous release or wait for Mesa 10.5.1.
</p>
<p>
Mesa 10.5.0 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
2bb6e2e982ee4d8264d52d638c2a4e3f8a164190336d72d4e34ae1304d87ed91 mesa-10.5.0.tar.gz
d7ca9f9044bbdd674377e3eebceef1fae339c8817b9aa435c2053e4fea44e5d3 mesa-10.5.0.tar.xz
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>GL_ARB_framebuffer_sRGB on freedreno</li>
<li>GL_ARB_texture_rg on freedreno</li>
<li>GL_EXT_packed_float on freedreno</li>
<li>GL_EXT_polygon_offset_clamp on i965, nv50, nvc0, r600, radeonsi, llvmpipe</li>
<li>GL_EXT_texture_shared_exponent on freedreno</li>
<li>GL_EXT_texture_snorm on freedreno</li>
</ul>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=10370">Bug 10370</a> - Incorrect pixels read back if draw bitmap texture through Display list</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=45348">Bug 45348</a> - [swrast] piglit fbo-drawbuffers-arbfp regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60879">Bug 60879</a> - [radeonsi] X11 can't start with acceleration enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=67672">Bug 67672</a> - [llvmpipe] lp_test_arit fails on old CPUs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77544">Bug 77544</a> - i965: Try to use LINE instructions to perform MAD with immediate arguments</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78770">Bug 78770</a> - [SNB bisected]Webglc conformance/textures/texture-size-limit.html fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80568">Bug 80568</a> - [gen4] GPU Crash During Google Chrome Operation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82477">Bug 82477</a> - [softpipe] piglit fp-long-alu regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82585">Bug 82585</a> - geometry shader with optional out variable segfaults</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82991">Bug 82991</a> - Inverted bumpmap in webgl applications</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83463">Bug 83463</a> - [swrast] piglit glsl-vs-clamp-1 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83500">Bug 83500</a> - si_dma_copy_tile causes GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83510">Bug 83510</a> - Graphical glitches in Unreal Engine 4</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83908">Bug 83908</a> - [i965] Incorrect icon colors in Steam Big Picture</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84212">Bug 84212</a> - [BSW]ES3-CTS.shaders.loops.do_while_dynamic_iterations.vector_counter_vertex fails and causes GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84651">Bug 84651</a> - Distorted graphics or black window when running Battle.net app on Intel hardware via wine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84777">Bug 84777</a> - [BSW]Piglit spec_glsl-1.50_execution_geometry-basic fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85367">Bug 85367</a> - [gen4] GPU hang in glmark-es2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85467">Bug 85467</a> - [llvmpipe] piglit gl-1.0-dlist-beginend failure with llvm-3.6.0svn</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85529">Bug 85529</a> - Surfaces not drawn in Unvanquished</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85647">Bug 85647</a> - Random radeonsi crashes with mesa 10.3.x</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85696">Bug 85696</a> - r600g+nine: Bioshock shader failure after 7b1c0cbc90d456384b0950ad21faa3c61a6b43ff</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86089">Bug 86089</a> - [r600g][mesa 10.4.0-dev] shader failure - r600_sb::bc_finalizer::cf_peephole() when starting Second Life</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86618">Bug 86618</a> - [NV96] neg modifiers not working in MIN and MAX operations</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86760">Bug 86760</a> - mesa doesn't build: recipe for target 'r600_llvm.lo' failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86764">Bug 86764</a> - [SNB+ Bisected]Piglit glean/pointSprite fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86788">Bug 86788</a> - (bisected) 32bit UrbanTerror 4.1 timedemo sse4.1 segfault...</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86811">Bug 86811</a> - [BDW/BSW Bisected]Piglit spec_arb_shading_language_packing_execution_built-in-functions_vs-unpackSnorm4x8 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86837">Bug 86837</a> - kodi segfault since auxiliary/vl: rework the build of the VL code</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86939">Bug 86939</a> - test_vf_float_conversions.cpp:63:12: error: expected primary-expression before union</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86944">Bug 86944</a> - glsl_parser_extras.cpp&quot;, line 1455: Error: Badly formed expression. (Oracle Studio)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86958">Bug 86958</a> - lp_bld_misc.cpp:503:40: error: no matching function for call to llvm::EngineBuilder::setMCJITMemoryManager(ShaderMemoryManager*&amp;)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86969">Bug 86969</a> - _drm_intel_gem_bo_references() function takes half the CPU with Witcher2 game</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87076">Bug 87076</a> - Dead Island needs allow_glsl_extension_directive_midshader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87516">Bug 87516</a> - glProgramBinary violates spec</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87619">Bug 87619</a> - Changes to state such as render targets change fragment shader without marking it dirty.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87658">Bug 87658</a> - [llvmpipe] SEGV in sse2_has_daz on ancient Pentium4-M</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87694">Bug 87694</a> - [SNB] Crash in brw_begin_transform_feedback</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87886">Bug 87886</a> - constant fps drops with Intel and Radeon</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87887">Bug 87887</a> - [i965 Bisected]ES2-CTS.gtf.GL.cos.cos_float_vert_xvary fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87913">Bug 87913</a> - CPU cacheline size of 0 can be returned by CPUID leaf 0x80000006 in some virtual machines</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88079">Bug 88079</a> - dEQP-GLES3.functional.fbo.completeness.renderable.renderbuffer.color0 tests fail due to enabling of GL_RGB and GL_RGBA</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88170">Bug 88170</a> - 32 bits opengl apps crash with latest llvm 3.6 git / mesa git / radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88219">Bug 88219</a> - include/c11/threads_posix.h:197: undefined reference to `pthread_mutex_lock'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88227">Bug 88227</a> - Radeonsi: High GTT usage in Prison Architect large map</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88248">Bug 88248</a> - Calling glClear while there is an occlusion query in progress messes up the results</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88335">Bug 88335</a> - format_pack.c:9567:22: error: expected ')'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88385">Bug 88385</a> - [SNB+ Bisected]Ogles3conform ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels core dumped</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88467">Bug 88467</a> - nir.c:140: error: nir_src has no member named ssa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88478">Bug 88478</a> - #error &quot;&lt;malloc.h&gt; has been replaced by &lt;stdlib.h&gt;&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88519">Bug 88519</a> - sha1.c:210:22: error: 'grcy_md_hd_t' undeclared (first use in this function)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88523">Bug 88523</a> - sha1.c:37: error: 'SHA1_CTX' undeclared (first use in this function)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88561">Bug 88561</a> - [radeonsi][regression,bisected] Depth test/buffer issues in Portal</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88658">Bug 88658</a> - (bisected) Slow video playback on Kabini</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88662">Bug 88662</a> - unaligned access to gl_dlist_node</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88783">Bug 88783</a> - FTBFS: Clover: src/gallium/state_trackers/clover/llvm/invocation.cpp:335:49: error: no matching function for call to 'llvm::TargetLibraryInfo::TargetLibraryInfo(llvm::Triple)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88792">Bug 88792</a> - [BDW/BSW Bisected]Piglit spec_ARB_pixel_buffer_object_pbo-read-argb8888 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88806">Bug 88806</a> - nir/nir_constant_expressions.c:2754:15: error: controlling expression type 'unsigned int' not compatible with any generic association type</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88841">Bug 88841</a> - [SNB/IVB/HSW/BDW Bisected]Piglit spec_EGL_NOK_texture_from_pixmap_basic fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88852">Bug 88852</a> - macros.h(181) : error C2143: syntax error : missing '{' before 'enum [tag]'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88905">Bug 88905</a> - [SNB+ Bisected]Ogles3conform ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88930">Bug 88930</a> - [osmesa] osbuffer-&gt;textures should be indexed by attachment type</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88962">Bug 88962</a> - [osmesa] Crash on postprocessing if z buffer is NULL</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89032">Bug 89032</a> - [BDW/BSW/SKL Bisected]Piglit spec_OpenGL_1.1_infinite-spot-light fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89037">Bug 89037</a> - [SKL]Piglit spec_EXT_texture_array_copyteximage_1D_ARRAY_samples=2 sporadically causes GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89068">Bug 89068</a> - glTexImage2D regression by texstore_rgba switch to _mesa_format_convert</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89069">Bug 89069</a> - Lack of grass in The Talos Principle on radeonsi (native\wine\nine)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89180">Bug 89180</a> - [IVB regression] Rendering issues in Mass Effect through VMware Workstation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86330">Bug 86330</a> - lp_bld_debug.cpp:112: multiple definition of `raw_debug_ostream::write_impl(char const*, unsigned long)'</li>
</ul>
<h2>Changes</h2>
<ul>
<li>Removed support for GCC versions earlier than 4.2.0.</li>
</ul>
</div>
</body>
</html>
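
The notes above say that the version reported by glGetString(GL_VERSION) depends on the driver, so an application that needs OpenGL 3.3 may want to check the reported string instead of assuming it. A rough sketch of that check in Python, operating on the string only (obtaining it requires a current GL context, which is not shown); the function names and sample strings are assumptions for illustration:

```python
import re

# GL_VERSION starts with "major.minor"; OpenGL ES strings carry an
# "OpenGL ES " (or "OpenGL ES-CM ") prefix before the number.
_VERSION_RE = re.compile(r"^(?:OpenGL ES(?:-\w+)? )?(\d+)\.(\d+)")


def parse_gl_version(version_string):
    """Extract (major, minor) from a GL_VERSION string."""
    match = _VERSION_RE.match(version_string)
    if match is None:
        raise ValueError("unrecognized GL_VERSION string: %r" % version_string)
    return int(match.group(1)), int(match.group(2))


def supports(version_string, required=(3, 3)):
    """Return True if the reported version is at least `required`."""
    return parse_gl_version(version_string) >= required


if __name__ == "__main__":
    # Hypothetical strings; real values depend on the driver in use.
    for sample in ("3.3 (Core Profile) Mesa 10.5.0", "3.0 Mesa 10.5.0"):
        print(sample, "->", supports(sample))
```

Comparing `(major, minor)` tuples avoids the pitfalls of comparing version strings lexically or as floating-point numbers.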

217
docs/relnotes/10.5.1.html Normal file

@@ -0,0 +1,217 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.5.1 Release Notes / March 13, 2015</h1>
<p>
Mesa 10.5.1 is a bug fix release which fixes bugs found since the 10.5.0 release.
</p>
<p>
Mesa 10.5.1 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
b5b6256a6d46023e16a675257fd11a0f94d7b3e60a76cf112952da3d0fef8e9b mesa-10.5.1.tar.gz
ffc51943d15c6812ee7611d053d8980a683fbd6a4986cff567b12cc66637d679 mesa-10.5.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79202">Bug 79202</a> - valgrind errors in glsl-fs-uniform-array-loop-unroll.shader_test; random code generation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84613">Bug 84613</a> - [G965, bisected] piglit regressions : glslparsertest.glsl2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86747">Bug 86747</a> - Noise in Football Manager 2014 textures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86974">Bug 86974</a> - INTEL_DEBUG=shader_time always asserts in fs_generator::generate_code() when Mesa is built with --enable-debug (= with asserts)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88246">Bug 88246</a> - Commit 2881b12 causes 43 DrawElements test regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88793">Bug 88793</a> - [BDW/BSW Bisected]Piglit/shaders_glsl-max-varyings fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88883">Bug 88883</a> - ir-a2xx.c: variable changed in assert statement</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88885">Bug 88885</a> - Transform feedback uses incorrect interleaving if a previous draw did not write gl_Position</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89095">Bug 89095</a> - [SNB/IVB/BYT Bisected]Webglc conformance/glsl/functions/glsl-function-mix-float.html fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89156">Bug 89156</a> - r300g: GL_COMPRESSED_RED_RGTC1 / ATI1N support broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89224">Bug 89224</a> - Incorrect rendering of Unigine Valley running in VM on VMware Workstation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89292">Bug 89292</a> - [regression,bisected] incomplete screenshots in some cases</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89311">Bug 89311</a> - [regression, bisected] dEQP: Added entry points for glCompressedTextureSubImage*D.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89312">Bug 89312</a> - [regression, bisected] main: Added entry points for CopyTextureSubImage*D. (d6b7c40cecfe01)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89315">Bug 89315</a> - [HSW, regression, bisected] i965/fs: Emit MAD instructions when possible.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89317">Bug 89317</a> - [HSW, regression, bisected] i965: Add LINTERP/CINTERP to can_do_cmod() (d91390634)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89416">Bug 89416</a> - UE4Editor crash after load project</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89430">Bug 89430</a> - [g965][bisected] arb_copy_image-targets gl_texture* tests fail</li>
</ul>
<h2>Changes</h2>
<p>Andrey Sudnik (1):</p>
<ul>
<li>i965/vec4: Don't lose the saturate modifier in copy propagation.</li>
</ul>
<p>Chris Forbes (1):</p>
<ul>
<li>i965/gs: Check newly-generated GS-out VUE map against correct stage</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>egl: Take alpha bits into account when selecting GBM formats</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>docs: Add sha256 sums for the 10.5.0 release</li>
<li>egl/main: no longer export internal function</li>
<li>cherry-ignore: ignore a few more commits picked without -x</li>
<li>mapi: fix commit 90411b56f6bc817e229d8801ac0adad6d4e3fb7a</li>
<li>Update version to 10.5.1</li>
</ul>
<p>Frank Henigman (1):</p>
<ul>
<li>intel: fix EGLImage renderbuffer _BaseFormat</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>i965: Fix out-of-bounds accesses into pull_constant_loc array</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>i965/fs/nir: Use emit_math for nir_op_fpow</li>
</ul>
<p>Ilia Mirkin (3):</p>
<ul>
<li>freedreno: move fb state copy after checking for size change</li>
<li>freedreno/ir3: fix array count returned by TXQ</li>
<li>freedreno/ir3: get the # of miplevels from getinfo</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>meta/TexSubImage: Stash everything other than PIXEL_TRANSFER/store in meta_begin</li>
<li>main/base_tex_format: Properly handle STENCIL_INDEX1/4/16</li>
</ul>
<p>Kenneth Graunke (8):</p>
<ul>
<li>i965: Split Gen4-5 BlitFramebuffer code; prefer BLT over Meta.</li>
<li>glsl: Mark array access when copying to a temporary for the ?: operator.</li>
<li>i965/fs: Set force_writemask_all on shader_time instructions.</li>
<li>i965/fs: Set smear on shader_time diff register.</li>
<li>i965/fs: Make emit_shader_time_write return rather than emit.</li>
<li>i965/fs: Make get_timestamp() pass back the MOV rather than emitting it.</li>
<li>i965/fs: Make emit_shader_time_end() insert before EOT.</li>
<li>i965/fs: Don't issue FB writes for bound but unwritten color targets.</li>
</ul>
<p>Laura Ekstrand (2):</p>
<ul>
<li>main: Fix target checking for CompressedTexSubImage*D.</li>
<li>main: Fix target checking for CopyTexSubImage*D.</li>
</ul>
<p>Marc-Andre Lureau (1):</p>
<ul>
<li>gallium/auxiliary/indices: fix start param</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>r300g: fix RGTC1 and LATC1 SNORM formats</li>
<li>r300g: fix a crash when resolving into an sRGB texture</li>
<li>r300g: fix sRGB-&gt;sRGB blits</li>
</ul>
<p>Matt Turner (12):</p>
<ul>
<li>i965/vec4: Fix implementation of i2b.</li>
<li>mesa: Indent break statements and add a missing one.</li>
<li>mesa: Free memory allocated for luminance in readpixels.</li>
<li>mesa: Correct backwards NULL check.</li>
<li>i965: Consider scratch writes to have side effects.</li>
<li>i965/fs: Don't use backend_visitor::instructions after creating the CFG.</li>
<li>r300g: Use PATH_MAX instead of limiting ourselves to 100 chars.</li>
<li>r300g: Check return value of snprintf().</li>
<li>i965/fs: Don't propagate cmod to inst with different type.</li>
<li>i965: Tell intel_get_memcpy() which direction the memcpy() is going.</li>
<li>Revert SHA1 additions.</li>
<li>i965: Avoid applying negate to wrong MAD source.</li>
</ul>
<p>Neil Roberts (4):</p>
<ul>
<li>meta: In pbo_{Get,}TexSubImage don't repeatedly rebind the source tex</li>
<li>Revert "common: Fix PBOs for 1D_ARRAY."</li>
<li>meta: Allow GL_UN/PACK_IMAGE_HEIGHT in _mesa_meta_pbo_Get/TexSubImage</li>
<li>meta: Fix the y offset for 1D_ARRAY in _mesa_meta_pbo_TexSubImage</li>
</ul>
<p>Rob Clark (11):</p>
<ul>
<li>freedreno/ir3: fix silly typo for binning pass shaders</li>
<li>freedreno/a2xx: fix increment in assert</li>
<li>freedreno/a4xx: bit of cleanup</li>
<li>freedreno: update generated headers</li>
<li>freedreno/a4xx: set PC_PRIM_VTX_CNTL.VAROUT properly</li>
<li>freedreno: update generated headers</li>
<li>freedreno/a4xx: aniso filtering</li>
<li>freedreno/ir3: fix up cat6 instruction encodings</li>
<li>freedreno/ir3: add support for memory (cat6) instructions</li>
<li>freedreno/ir3: handle flat bypass for a4xx</li>
<li>freedreno/ir3: fix failed assert in grouping</li>
</ul>
<p>Stefan Dösinger (1):</p>
<ul>
<li>r300g: Fix the ATI1N swizzle (RGTC1 and LATC1)</li>
</ul>
</div>
</body>
</html>

130
docs/relnotes/10.5.2.html Normal file

@@ -0,0 +1,130 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.5.2 Release Notes / March 28, 2015</h1>
<p>
Mesa 10.5.2 is a bug fix release which fixes bugs found since the 10.5.1 release.
</p>
<p>
Mesa 10.5.2 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
755220e160a9f22fda0dffd47746f997b6e196d03f8edc390df7793aecaaa541 mesa-10.5.2.tar.gz
2f4b6fb77c3e7d6f861558d0884a3073f575e1e673dad8d1b0624e78e9c4dd44 mesa-10.5.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88534">Bug 88534</a> - include/c11/threads_posix.h PTHREAD_MUTEX_RECURSIVE_NP not defined</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89328">Bug 89328</a> - python required to build Mesa release tarballs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89530">Bug 89530</a> - FTBFS in loader: missing fstat</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89590">Bug 89590</a> - Crash in glLinkProgram with shaders with multiple constant arrays</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89680">Bug 89680</a> - Hard link exist in Mesa 10.5.1 sources</li>
</ul>
<h2>Changes</h2>
<p>Anuj Phogat (1):</p>
<ul>
<li>glsl: Generate link error for non-matching gl_FragCoord redeclarations</li>
</ul>
<p>Emil Velikov (7):</p>
<ul>
<li>docs: Add sha256 sums for the 10.5.1 release</li>
<li>automake: add missing egl files to the tarball</li>
<li>st/egl: don't ship the dri2.c link at the tarball</li>
<li>loader: include &lt;sys/stat.h&gt; for non-sysfs builds</li>
<li>auxiliary/os: fix the android build - s/drm_munmap/os_munmap/</li>
<li>cherry-ignore: add commit non applicable for 10.5</li>
<li>Update version to 10.5.2</li>
</ul>
<p>Felix Janda (1):</p>
<ul>
<li>c11/threads: Use PTHREAD_MUTEX_RECURSIVE by default</li>
</ul>
<p>Francisco Jerez (1):</p>
<ul>
<li>i965: Set nr_params to the number of uniform components in the VS/GS path.</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>freedreno/a3xx: use the same layer size for all slices</li>
<li>freedreno: fix slice pitch calculations</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi: increase coords array size for radeon_llvm_emit_prepare_cube_coords</li>
</ul>
<p>Mario Kleiner (2):</p>
<ul>
<li>glx: Handle out-of-sequence swap completion events correctly. (v2)</li>
<li>mapi: Make private copies of name strings provided by client.</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>freedreno: update generated headers</li>
</ul>
<p>Samuel Iglesias Gonsalvez (2):</p>
<ul>
<li>glsl: optimize (0 cmp x + y) into (-x cmp y).</li>
<li>configure: Introduce new output variable to ax_check_python_mako_module.m4</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>glsl: fix names in lower_constant_arrays_to_uniforms</li>
</ul>
<p>Tom Stellard (1):</p>
<ul>
<li>clover: Return 0 as storage size for local kernel args that are not set v2</li>
</ul>
</div>
</body>
</html>

125
docs/relnotes/10.5.3.html Normal file

@@ -0,0 +1,125 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.5.3 Release Notes / April 12, 2015</h1>
<p>
Mesa 10.5.3 is a bug fix release which fixes bugs found since the 10.5.2 release.
</p>
<p>
Mesa 10.5.3 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
2371b8e210ccd19f61dd94b6664d612e5a479ba7d431a074512d87633bd6aeb4 mesa-10.5.3.tar.gz
8701ee1be4f5c03238f5e63c1a9bd4cc03a2f6c0155ed42a1ae7d58f18912ba2 mesa-10.5.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83962">Bug 83962</a> - [HSW/BYT]Piglit spec_ARB_gpu_shader5_arb_gpu_shader5-emitstreamvertex_nodraw fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89679">Bug 89679</a> - [NV50] Portal/Half-Life 2 will not start (native Steam)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89746">Bug 89746</a> - Mesa and LLVM 3.6+ break opengl for genymotion</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89754">Bug 89754</a> - vertexAttrib fails WebGL Conformance test with mesa drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89758">Bug 89758</a> - pow WebGL Conformance test with mesa drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89759">Bug 89759</a> - WebGL OGL ES GLSL conformance test with mesa drivers fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89905">Bug 89905</a> - scons build broken on 10.5.2 due to activated vega st</li>
</ul>
<h2>Changes</h2>
<p>Dave Airlie (1):</p>
<ul>
<li>st_glsl_to_tgsi: only do mov copy propagation on temps (v2)</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>docs: Add sha256 sums for the 10.5.2 release</li>
<li>xmlpool: don't forget to ship the MOS</li>
<li>configure.ac: error out if python/mako is not found when required</li>
<li>dist: add the VG depedencies into the tarball</li>
<li>Update version to 10.5.3</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>i965: Do not render primitives in non-zero streams then TF is disabled</li>
</ul>
<p>Ilia Mirkin (7):</p>
<ul>
<li>st/mesa: update arrays when the current attrib has been updated</li>
<li>nv50/ir: take postFactor into account when doing peephole optimizations</li>
<li>nv50/ir/gk110: fix offset flag position for TXD opcode</li>
<li>freedreno/a3xx: fix 3d texture layout</li>
<li>freedreno/a3xx: point size should not be divided by 2</li>
<li>nv50: allocate more offset space for occlusion queries</li>
<li>nv50,nvc0: limit the y-tiling of 3d textures to the first level's tiling</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Fix instanced geometry shaders on Gen8+.</li>
<li>i965: Add forgotten multi-stream code to Gen8 SOL state.</li>
</ul>
<p>Marcin Ślusarz (1):</p>
<ul>
<li>nouveau: synchronize "scratch runout" destruction with the command stream</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>radeonsi: Cache LLVMTargetMachineRef in context instead of in screen</li>
</ul>
<p>Tom Stellard (1):</p>
<ul>
<li>clover: Return CL_BUILD_ERROR for CL_PROGRAM_BUILD_STATUS when compilation fails v2</li>
</ul>
<p>Ville Syrjälä (1):</p>
<ul>
<li>i965: Fix URB size for CHV</li>
</ul>
</div>
</body>
</html>

125
docs/relnotes/10.5.4.html Normal file

@@ -0,0 +1,125 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.5.4 Release Notes / April 24, 2015</h1>
<p>
Mesa 10.5.4 is a bug fix release which fixes bugs found since the 10.5.3 release.
</p>
<p>
Mesa 10.5.4 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
e1089567fc7bf8d9b2d8badcc9f2fc3b758701c8c0ccfe7af1805549fea53f11 mesa-10.5.4.tar.gz
b51e723f3a20d842c88a92d809435b229fc4744ca0dbec0317d9d4a3ac4c6803 mesa-10.5.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69226">Bug 69226</a> - Cannot enable basic shaders with Second Life aborts attempt</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71591">Bug 71591</a> - Second Life shaders fail to compile (extension declared in middle of shader)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81025">Bug 81025</a> - [IVB/BYT Bisected]Piglit spec_ARB_draw_indirect_arb_draw_indirect-draw-elements-prim-restart-ugly fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89457">Bug 89457</a> - [BSW Bisected]ogles3conform ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_vert fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89957">Bug 89957</a> - vm protection faults in piglit lest: texsubimage cube_map_array pbo</li>
</ul>
<h2>Changes</h2>
<p>Brian Paul (1):</p>
<ul>
<li>glsl: rewrite glsl_type::record_key_hash() to avoid buffer overflow</li>
</ul>
<p>Dave Airlie (2):</p>
<ul>
<li>st/mesa: convert sub image for cube map arrays to 2d arrays for upload</li>
<li>st/mesa: align cube map arrays layers</li>
</ul>
<p>Emil Velikov (11):</p>
<ul>
<li>docs: Add 256 sums for the 10.5.3 release</li>
<li>radeonsi: remove unused si_dump_key()</li>
<li>android: use LOCAL_SHARED_LIBRARIES over TARGET_OUT_HEADERS</li>
<li>android: add $(mesa_top)/src include to the whole of mesa</li>
<li>android: egl: add libsync_cflags to the build</li>
<li>android: dri/common: conditionally include drm_cflags/set __NOT_HAVE_DRM_H</li>
<li>android: add HAVE__BUILTIN_* and HAVE_FUNC_ATTRIBUTE_* defines</li>
<li>android: add $(mesa_top)/src/mesa/main to the includes list</li>
<li>android: dri: link against libmesa_util</li>
<li>android: mesa: fix the path of the SSE4_1 optimisations</li>
<li>Update version to 10.5.4</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>nir: Fix typo in "ushr by 0" algebraic replacement</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Fix software primitive restart with indirect draws.</li>
<li>drirc: Add "Second Life" quirk (allow_glsl_extension_directive_midshader).</li>
</ul>
<p>Kristian Høgsberg (1):</p>
<ul>
<li>i965: Rewrite ir_tex to ir_txl with lod 0 for vertex shaders</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>glsl_to_tgsi: fix out-of-bounds constant access and crash for uniforms</li>
<li>glsl_to_tgsi: don't use a potentially-undefined immediate for ir_query_levels</li>
</ul>
<p>Mathias Froehlich (1):</p>
<ul>
<li>i965: Flush batchbuffer containing the query on glQueryCounter.</li>
</ul>
<p>Mauro Rossi (2):</p>
<ul>
<li>android: mesa: generate the format_{un,}pack.[ch] sources</li>
<li>android: add inital NIR build</li>
</ul>
</div>
</body>
</html>

95
docs/relnotes/10.5.5.html Normal file

@@ -0,0 +1,95 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.5.5 Release Notes / May 11, 2015</h1>
<p>
Mesa 10.5.5 is a bug fix release which fixes bugs found since the 10.5.4 release.
</p>
<p>
Mesa 10.5.5 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
c10f00fd792b8290dd51ebcc48a9016c4cafab19ec205423c6fcadfd7f3a59f2 mesa-10.5.5.tar.gz
4ac4e4ea3414f1cadb1467f2f173f9e56170d31e8674f7953a46f0549d319f28 mesa-10.5.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88521">Bug 88521</a> - GLBenchmark 2.7 TRex renders with artifacts on Gen8 with !UXA</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89455">Bug 89455</a> - [NVC0/Gallium] Unigine Heaven black and white boxes</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89689">Bug 89689</a> - [Regression] Weston on DRM backend won't start with new version of mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90130">Bug 90130</a> - gl_PrimitiveId seems to reset at 340</li>
</ul>
<h2>Changes</h2>
<p>Boyan Ding (1):</p>
<ul>
<li>i965: Add XRGB8888 format to intel_screen_make_configs</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: Add sha256 sums for the 10.5.4 release</li>
<li>r300: do not link against libdrm_intel</li>
<li>Update version to 10.5.5</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>nvc0/ir: flush denorms to zero in non-compute shaders</li>
<li>gk110/ir: fix set with a register dest to not auto-set the abs flag</li>
<li>nvc0/ir: fix predicated PFETCH emission</li>
<li>nv50/ir: fix asFlow() const helper for OP_JOIN</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Make intel_emit_linear_blit handle Gen8+ alignment restrictions.</li>
<li>i965: Disallow linear blits that are not cacheline aligned.</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>draw: fix prim ids when there's no gs</li>
</ul>
</div>
</body>
</html>

147
docs/relnotes/10.5.6.html Normal file

@@ -0,0 +1,147 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.5.6 Release Notes / May 23, 2015</h1>
<p>
Mesa 10.5.6 is a bug fix release which fixes bugs found since the 10.5.5 release.
</p>
<p>
Mesa 10.5.6 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
89ff9cb08d0f6e3f34154864c3071253057cd21020759457c8ae27e0f70985d3 mesa-10.5.6.tar.gz
66017853bde5f7a6647db3eede30512a091a3491daa1708e0ad8027c328ba595 mesa-10.5.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86792">Bug 86792</a> - [NVC0] Portal 2 Crashes in Wine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90147">Bug 90147</a> - swrast: build error undeclared _SC_PHYS_PAGES on osx</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90350">Bug 90350</a> - [G96] Portal's portal are incorrectly rendered</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90363">Bug 90363</a> - [nv50] HW state is not reset correctly when using a new GL context</li>
</ul>
<h2>Changes</h2>
<p>Alex Deucher (1):</p>
<ul>
<li>radeonsi: add new bonaire pci id</li>
</ul>
<p>Axel Davy (2):</p>
<ul>
<li>egl/wayland: properly destroy wayland objects</li>
<li>glx/dri3: Add additional check for gpu offloading case</li>
</ul>
<p>Emil Velikov (4):</p>
<ul>
<li>docs: Add sha256 sums for the 10.5.5 release</li>
<li>egl/main: fix EGL_KHR_get_all_proc_addresses</li>
<li>targets/osmesa: drop the -module tag from LDFLAGS</li>
<li>Update version to 10.5.6</li>
</ul>
<p>Francisco Jerez (4):</p>
<ul>
<li>clover: Refactor event::trigger and ::abort to prevent deadlock and reentrancy issues.</li>
<li>clover: Wrap event::_status in a method to prevent unlocked access.</li>
<li>clover: Implement locking of the wait_count, _chain and _status members of event.</li>
<li>i965: Fix PBO cache coherency issue after _mesa_meta_pbo_GetTexSubImage().</li>
</ul>
<p>Fredrik Höglund (2):</p>
<ul>
<li>main: Require that the texture exists in framebuffer_texture</li>
<li>mesa: Generate GL_INVALID_VALUE in framebuffer_texture when layer &lt; 0</li>
</ul>
<p>Ilia Mirkin (7):</p>
<ul>
<li>nv50/ir: only propagate saturate up if some actual folding took place</li>
<li>nv50: keep track of PGRAPH state in nv50_screen</li>
<li>nvc0: keep track of PGRAPH state in nvc0_screen</li>
<li>nvc0: reset the instanced elements state when doing blit using 3d engine</li>
<li>nv50/ir: only enable mul saturate on G200+</li>
<li>st/mesa: make sure to create a "clean" bool when doing i2b</li>
<li>nvc0: switch mechanism for shader eviction to be a while loop</li>
</ul>
<p>Jeremy Huddleston Sequoia (2):</p>
<ul>
<li>swrast: Build fix for darwin</li>
<li>darwin: Fix install name of libOSMesa</li>
</ul>
<p>Laura Ekstrand (2):</p>
<ul>
<li>main: Fix an error generated by FramebufferTexture</li>
<li>main: Complete error conditions for glInvalidate*Framebuffer.</li>
</ul>
<p>Marta Lofstedt (1):</p>
<ul>
<li>main: glGetIntegeri_v fails for GL_VERTEX_BINDING_STRIDE</li>
</ul>
<p>Rob Clark (2):</p>
<ul>
<li>freedreno: enable a306</li>
<li>freedreno: fix bug in tile/slot calculation</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>draw: (trivial) fix out-of-bounds vector initialization</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>mesa: fix shininess check for ffvertex_prog v2</li>
</ul>
<p>Tom Stellard (2):</p>
<ul>
<li>clover: Add a mutex to guard queue::queued_events</li>
<li>clover: Fix a bug with multi-threaded events v2</li>
</ul>
</div>
</body>
</html>

103
docs/relnotes/10.5.7.html Normal file

@@ -0,0 +1,103 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.5.7 Release Notes / June 07, 2015</h1>
<p>
Mesa 10.5.7 is a bug fix release which fixes bugs found since the 10.5.6 release.
</p>
<p>
Mesa 10.5.7 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
8f865ce497435fdf25d4e35f3b5551b2bcd5f9bc6570561183be82af20d18b82 mesa-10.5.7.tar.gz
04d06890cd69af8089d6ca76f40e46dcf9cacfe4a9788b32be620574d4638818 mesa-10.5.7.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89131">Bug 89131</a> - [Bisected] Graphical corruption in Weston, shows old framebuffer pieces</li>
</ul>
<h2>Changes</h2>
<p>Ben Widawsky (1):</p>
<ul>
<li>i965: Emit 3DSTATE_MULTISAMPLE before WM_HZ_OP (gen8+)</li>
</ul>
<p>Emil Velikov (4):</p>
<ul>
<li>docs: Add sha256sums for the 10.5.6 release</li>
<li>get-pick-list.sh: Require explicit "10.5" for nominating stable patches</li>
<li>cherry-ignore: add clover build fix not applicable for 10.5</li>
<li>Update version to 10.5.7</li>
</ul>
<p>Ilia Mirkin (18):</p>
<ul>
<li>nvc0/ir: set ftz when sources are floats, not just destinations</li>
<li>nv50/ir: guess that the constant offset is the starting slot of array</li>
<li>nvc0/ir: LOAD's can't be used for shader inputs</li>
<li>nvc0: a geometry shader can have up to 1024 vertices output</li>
<li>nv50/ir: avoid messing up arg1 of PFETCH</li>
<li>nv30: don't leak fragprog consts</li>
<li>nv30: avoid leaking render state and draw shaders</li>
<li>nv30: fix clip plane uploads and enable changes</li>
<li>nv30/draw: avoid leaving stale pointers in draw state</li>
<li>nv30/draw: draw expects constbuf size in bytes, not vec4 units</li>
<li>st/mesa: don't leak glsl_to_tgsi object on link failure</li>
<li>glsl: avoid leaking linked gl_shader when there's a late linker error</li>
<li>nv30/draw: fix indexed draws with swtnl path and a resource index buffer</li>
<li>nv30/draw: only use the DMA1 object (GART) if the bo is not in VRAM</li>
<li>nv30/draw: allocate vertex buffers in gart</li>
<li>nv30/draw: switch varying hookup logic to know about texcoords</li>
<li>nv30: falling back to draw path for edgeflag does no good</li>
<li>nv30: avoid doing extra work on clear and hitting unexpected states</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965/fs: Fix implied_mrf_writes for scratch writes</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>st/dri: fix postprocessing crash when there's no depth buffer</li>
</ul>
</div>
</body>
</html>

112
docs/relnotes/10.5.8.html Normal file

@@ -0,0 +1,112 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.5.8 Release Notes / June 20, 2015</h1>
<p>
Mesa 10.5.8 is a bug fix release which fixes bugs found since the 10.5.7 release.
</p>
<p>
Mesa 10.5.8 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
611ddcfa3c1bf13f7e6ccac785c8749c3b74c9a78452bac70f8372cf6b209aa0 mesa-10.5.8.tar.gz
2866b855c5299a4aed066338c77ff6467c389b2c30ada7647be8758663da2b54 mesa-10.5.8.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90310">Bug 90310</a> - Fails to build gallium_dri.so at linking stage with clang because of multiple redefinitions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90347">Bug 90347</a> - [NVE0+] Failure to insert texbar under some circumstances (causing bad colors in Terasology)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90520">Bug 90520</a> - Register spilling clobbers registers used elsewhere in the shader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90905">Bug 90905</a> - mesa: Finish subdir-objects transition</li>
</ul>
<h2>Changes</h2>
<p>Ben Widawsky (1):</p>
<ul>
<li>i965: Disable compaction for EOT send messages</li>
</ul>
<p>Boyan Ding (1):</p>
<ul>
<li>egl/x11: Set version of swrastLoader to 2</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256sums for the 10.5.7 release</li>
<li>Update version to 10.5.8</li>
</ul>
<p>Erik Faye-Lund (1):</p>
<ul>
<li>mesa: build xmlconfig to a separate static library</li>
</ul>
<p>Francisco Jerez (1):</p>
<ul>
<li>i965: Don't compact instructions with unmapped bits.</li>
</ul>
<p>Ilia Mirkin (3):</p>
<ul>
<li>nvc0/ir: fix collection of first uses for texture barrier insertion</li>
<li>nv50,nvc0: clamp uniform size to 64k</li>
<li>nvc0/ir: can't have a join on a load with an indirect source</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965/fs: Don't let the EOT send message interfere with the MRF hack</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>egl: fix setting context flags</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>draw: (trivial) fix NULL pointer dereference</li>
</ul>
</div>
</body>
</html>

331
docs/relnotes/10.6.0.html Normal file

@@ -0,0 +1,331 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.6.0 Release Notes / June 14, 2015</h1>
<p>
Mesa 10.6.0 is a new development release.
People who are concerned with stability and reliability should stick
with a previous release or wait for Mesa 10.6.1.
</p>
<p>
Mesa 10.6.0 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
9bc659abdba26202509304f259723aaa4343dba6aac4bd87d5baea11d23c8c63 mesa-10.6.0.tar.gz
f37e2633978deed02ff0522abc36c709586e2b555fd439a82ab71dce2c866c76 mesa-10.6.0.tar.xz
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>GL_AMD_pinned_memory on r600, radeonsi</li>
<li>GL_ARB_clip_control on i965</li>
<li>GL_ARB_depth_buffer_float on freedreno</li>
<li>GL_ARB_depth_clamp on freedreno</li>
<li>GL_ARB_direct_state_access on all drivers that support GL 2.0+</li>
<li>GL_ARB_draw_indirect, GL_ARB_multi_draw_indirect on r600</li>
<li>GL_ARB_draw_instanced on freedreno</li>
<li>GL_ARB_gpu_shader_fp64 on nvc0, softpipe</li>
<li>GL_ARB_gpu_shader5 on i965/gen8+</li>
<li>GL_ARB_instanced_arrays on freedreno</li>
<li>GL_ARB_pipeline_statistics_query on i965, nv50, nvc0, r600, radeonsi, softpipe</li>
<li>GL_ARB_program_interface_query (all drivers)</li>
<li>GL_ARB_texture_stencil8 on nv50, nvc0, r600, radeonsi, softpipe</li>
<li>GL_ARB_texture_view on llvmpipe, softpipe</li>
<li>GL_ARB_uniform_buffer_object on freedreno</li>
<li>GL_ARB_vertex_attrib_64bit on nvc0, softpipe</li>
<li>GL_ARB_viewport_array, GL_AMD_vertex_shader_viewport_index on i965/gen6</li>
<li>GL_EXT_draw_buffers2 on freedreno</li>
<li>GL_OES_EGL_sync on all drivers</li>
<li>EGL_KHR_fence_sync on i965, freedreno, nv50, nvc0, r600, radeonsi</li>
<li>EGL_KHR_wait_sync on i965, freedreno, nv50, nvc0, r600, radeonsi</li>
<li>EGL_KHR_cl_event2 on freedreno, nv50, nvc0, r600, radeonsi</li>
<li>GL_AMD_performance_monitor on nvc0</li>
</ul>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=15006">Bug 15006</a> - translate &amp; rotate the line cause Aliasing</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=27007">Bug 27007</a> - Lines disappear with GL_LINE_SMOOTH</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=28832">Bug 28832</a> - piglit/general/line-aa-width fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=45348">Bug 45348</a> - [swrast] piglit fbo-drawbuffers-arbfp regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60797">Bug 60797</a> - 1px lines in octave plot aliased to 0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=67564">Bug 67564</a> - HiZ buffers are much larger than necessary</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69226">Bug 69226</a> - Cannot enable basic shaders with Second Life aborts attempt</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71591">Bug 71591</a> - Second Life shaders fail to compile (extension declared in middle of shader)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79202">Bug 79202</a> - valgrind errors in glsl-fs-uniform-array-loop-unroll.shader_test; random code generation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81025">Bug 81025</a> - [IVB/BYT Bisected]Piglit spec_ARB_draw_indirect_arb_draw_indirect-draw-elements-prim-restart-ugly fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82477">Bug 82477</a> - [softpipe] piglit fp-long-alu regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82668">Bug 82668</a> - Can't set int attributes to certain values on 32-bit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82831">Bug 82831</a> - i965: Support GL_ARB_blend_func_extended in SIMD16</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83962">Bug 83962</a> - [HSW/BYT]Piglit spec_ARB_gpu_shader5_arb_gpu_shader5-emitstreamvertex_nodraw fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84613">Bug 84613</a> - [G965, bisected] piglit regressions : glslparsertest.glsl2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86747">Bug 86747</a> - Noise in Football Manager 2014 textures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86792">Bug 86792</a> - [NVC0] Portal 2 Crashes in Wine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86811">Bug 86811</a> - [BDW/BSW Bisected]Piglit spec_arb_shading_language_packing_execution_built-in-functions_vs-unpackSnorm4x8 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86837">Bug 86837</a> - kodi segfault since auxiliary/vl: rework the build of the VL code</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86944">Bug 86944</a> - glsl_parser_extras.cpp&quot;, line 1455: Error: Badly formed expression. (Oracle Studio)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86974">Bug 86974</a> - INTEL_DEBUG=shader_time always asserts in fs_generator::generate_code() when Mesa is built with --enable-debug (= with asserts)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86980">Bug 86980</a> - [swrast] piglit fp-rfl regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87258">Bug 87258</a> - [BDW/BSW Bisected]Piglit spec_ARB_shader_atomic_counters_array-indexing fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88246">Bug 88246</a> - Commit 2881b12 causes 43 DrawElements test regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88248">Bug 88248</a> - Calling glClear while there is an occlusion query in progress messes up the results</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88521">Bug 88521</a> - GLBenchmark 2.7 TRex renders with artifacts on Gen8 with !UXA</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88534">Bug 88534</a> - include/c11/threads_posix.h PTHREAD_MUTEX_RECURSIVE_NP not defined</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88561">Bug 88561</a> - [radeonsi][regression,bisected] Depth test/buffer issues in Portal</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88793">Bug 88793</a> - [BDW/BSW Bisected]Piglit/shaders_glsl-max-varyings fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88815">Bug 88815</a> - Incorrect handling of GLSL #line directive</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88883">Bug 88883</a> - ir-a2xx.c: variable changed in assert statement</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88885">Bug 88885</a> - Transform feedback uses incorrect interleaving if a previous draw did not write gl_Position</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88905">Bug 88905</a> - [SNB+ Bisected]Ogles3conform ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88999">Bug 88999</a> - [SKL] Compiz crashes after opening unity dash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89014">Bug 89014</a> - PIPE_QUERY_GPU_FINISHED is not acting as expected on SI</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89026">Bug 89026</a> - Renderbuffer layered state used for framebuffer completeness test</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89032">Bug 89032</a> - [BDW/BSW/SKL Bisected]Piglit spec_OpenGL_1.1_infinite-spot-light fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89037">Bug 89037</a> - [SKL]Piglit spec_EXT_texture_array_copyteximage_1D_ARRAY_samples=2 sporadically causes GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89039">Bug 89039</a> - [SKL]etqw system hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89058">Bug 89058</a> - [SKL]Render error in some games (etqw-demo, nexuiz, portal)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89068">Bug 89068</a> - glTexImage2D regression by texstore_rgba switch to _mesa_format_convert</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89069">Bug 89069</a> - Lack of grass in The Talos Principle on radeonsi (native\wine\nine)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89094">Bug 89094</a> - [SNB/IVB/HSW/BYT Bisected]Ogles3conform ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_vert fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89095">Bug 89095</a> - [SNB/IVB/BYT Bisected]Webglc conformance/glsl/functions/glsl-function-mix-float.html fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89112">Bug 89112</a> - u_atomic_test: u_atomic_test.c:124: test_atomic_8bits_bool: Assertion `r == 65 &amp;&amp; &quot;p_atomic_add&quot;' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89118">Bug 89118</a> - [SKL Bisected]many Ogles3conform cases core dumped</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89131">Bug 89131</a> - [Bisected] Graphical corruption in Weston, shows old framebuffer pieces</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89156">Bug 89156</a> - r300g: GL_COMPRESSED_RED_RGTC1 / ATI1N support broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89180">Bug 89180</a> - [IVB regression] Rendering issues in Mass Effect through VMware Workstation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89210">Bug 89210</a> - GS statistics fail on SNB</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89218">Bug 89218</a> - lower_instructions.cpp:648:48: error: invalid suffix 'd' on floating constant</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89224">Bug 89224</a> - Incorrect rendering of Unigine Valley running in VM on VMware Workstation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89260">Bug 89260</a> - macros.h:34:25: fatal error: util/u_math.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89292">Bug 89292</a> - [regression,bisected] incomplete screenshots in some cases</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89311">Bug 89311</a> - [regression, bisected] dEQP: Added entry points for glCompressedTextureSubImage*D.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89312">Bug 89312</a> - [regression, bisected] main: Added entry points for CopyTextureSubImage*D. (d6b7c40cecfe01)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89315">Bug 89315</a> - [HSW, regression, bisected] i965/fs: Emit MAD instructions when possible.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89317">Bug 89317</a> - [HSW, regression, bisected] i965: Add LINTERP/CINTERP to can_do_cmod() (d91390634)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89328">Bug 89328</a> - python required to build Mesa release tarballs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89342">Bug 89342</a> - main/light.c:159:62: error: 'M_PI' undeclared (first use in this function)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89343">Bug 89343</a> - compiler/tests/radeon_compiler_optimize_tests.c:43:3: error: implicit declaration of function fprintf [-Werror=implicit-function-declaration]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89345">Bug 89345</a> - imports.h:452:58: error: expected declaration specifiers or '...' before 'va_list'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89364">Bug 89364</a> - c99_alloca.h:40:22: fatal error: alloca.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89372">Bug 89372</a> - [softpipe] piglit glsl-1.50 generate-zero-primitives regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89387">Bug 89387</a> - Double delete in lp_bld_misc.cpp</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89416">Bug 89416</a> - UE4Editor crash after load project</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89430">Bug 89430</a> - [g965][bisected] arb_copy_image-targets gl_texture* tests fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89433">Bug 89433</a> - GCC 4.2 does not support -Wvla</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89455">Bug 89455</a> - [NVC0/Gallium] Unigine Heaven black and white boxes</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89457">Bug 89457</a> - [BSW Bisected]ogles3conform ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_vert fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89477">Bug 89477</a> - include/no_extern_c.h:47:1: error: template with C linkage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89508">Bug 89508</a> - Bad int(floatBitsToInt(vec4))</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89530">Bug 89530</a> - FTBFS in loader: missing fstat</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89569">Bug 89569</a> - Papo &amp; Yo crash on startup [HSW]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89590">Bug 89590</a> - Crash in glLinkProgram with shaders with multiple constant arrays</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89662">Bug 89662</a> - context.c:943: undefined reference to `_glapi_new_nop_table'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89670">Bug 89670</a> - cmod_propagation_test.andnz_one regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89679">Bug 89679</a> - [NV50] Portal/Half-Life 2 will not start (native Steam)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89689">Bug 89689</a> - [Regression] Weston on DRM backend won't start with new version of mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89722">Bug 89722</a> - [ILK Bisected]Ogles2conform/ES2-CTS.gtf.GL.equal.equal_vec2_frag fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89726">Bug 89726</a> - [Bisected] dEQP-GLES3: uniform linking logic in the presence of structs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89746">Bug 89746</a> - Mesa and LLVM 3.6+ break opengl for genymotion</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89754">Bug 89754</a> - vertexAttrib fails WebGL Conformance test with mesa drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89758">Bug 89758</a> - pow WebGL Conformance test with mesa drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89759">Bug 89759</a> - WebGL OGL ES GLSL conformance test with mesa drivers fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89831">Bug 89831</a> - [r600] r600_asm.c:310:assign_alu_units: Assertion `0' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89899">Bug 89899</a> - nir/nir_lower_tex_projector.c:112: error: unknown field ssa specified in initializer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89957">Bug 89957</a> - vm protection faults in piglit lest: texsubimage cube_map_array pbo</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89960">Bug 89960</a> - [softpipe] piglit copy-pixels regreession</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89961">Bug 89961</a> - [BDW/BSW Bisected]Synmark2_v6 OglDrvRes/OglDrvShComp/OglDrvState/OglPSPom Image Validation fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89963">Bug 89963</a> - lp_bld_debug.cpp:100:31: error: no matching function for call to llvm::raw_ostream::raw_ostream()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90000">Bug 90000</a> - [i965 Bisected NIR] Piglit/gglean_fragprog1-z-write_test fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90109">Bug 90109</a> - [SNB+ Bisected]Ogles3conform ES3-CTS.shaders.uniform_block.random.basic_arrays.3 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90114">Bug 90114</a> - [SNB+ Bisected]Ogles3conform ES3-CTS.shaders.struct.uniform.sampler_array_fragment fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90130">Bug 90130</a> - gl_PrimitiveId seems to reset at 340</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90147">Bug 90147</a> - swrast: build error undeclared _SC_PHYS_PAGES on osx</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90149">Bug 90149</a> - [SNB+ Bisected]ES3-CTS.gtf.GL3Tests.uniform_buffer_object.uniform_buffer_object_getactiveuniformsiv_for_nonexistent_uniform_indices fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90153">Bug 90153</a> - [SKL Bisected]ES3-CTS.gtf.GL3Tests.uniform_buffer_object.uniform_buffer_object_all_valid_basic_types fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90167">Bug 90167</a> - [softpipe] piglit depthstencil-default_fb-drawpixels-32f_24_8_rev regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90207">Bug 90207</a> - [r600g, bisected] regression: NI/Turks crash on WebGL Water (most WebGL stuff)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90213">Bug 90213</a> - glDrawPixels with GL_COLOR_INDEX never returns.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90243">Bug 90243</a> - [bisected] regression: spec.!opengl 3_2.get-active-attrib-returns-all-inputs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90258">Bug 90258</a> - [IVB] spec.glsl-1_10.execution.fs-dfdy-accuracy fails intermittently</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90310">Bug 90310</a> - Fails to build gallium_dri.so at linking stage with clang because of multiple redefinitions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90350">Bug 90350</a> - [G96] Portal's portal are incorrectly rendered</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90363">Bug 90363</a> - [nv50] HW state is not reset correctly when using a new GL context</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90397">Bug 90397</a> - ARB_program_interface_query: glGetProgramResourceiv() returns wrong value for GL_REFERENCED_BY_*_SHADER prop for GL_UNIFORM for members of an interface block with an instance name</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90466">Bug 90466</a> - arm: linker error ndefined reference to `nir_metadata_preserve'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90520">Bug 90520</a> - Register spilling clobbers registers used elsewhere in the shader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90547">Bug 90547</a> - [BDW/BSW/SKL Bisected]Piglit/glean&#64;vertprog1-rsq_test_2_(reciprocal_square_root_of_negative_value) fais</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90580">Bug 90580</a> - [HSW bisected] integer multiplication bug</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90629">Bug 90629</a> - [i965] SIMD16 dual_source_blend assertion `src[i].file != GRF || src[i].width == dst.width' failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90749">Bug 90749</a> - [BDW Bisected]dEQP-GLES3.functional.rasterization.fbo.rbo_multisample_max.primitives.lines_wide fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90830">Bug 90830</a> - [bsw bisected regression] GPU hang for spec.arb_gpu_shader5.execution.sampler_array_indexing.vs-nonzero-base</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90839">Bug 90839</a> - [10.5.5/10.6 regression, bisected] PBO glDrawPixels no longer using blit fastpath</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90905">Bug 90905</a> - mesa: Finish subdir-objects transition</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=9951">Bug 9951</a> - GL_LINE_SMOOTH and GL_POLYGON_SMOOTH with i965 driver</li>
</ul>
<h2>Changes</h2>
<ul>
<li>Removed classic Windows software rasterizer.</li>
<li>Removed egl_gallium EGL driver.</li>
<li>Removed gbm_gallium GBM driver.</li>
<li>Removed OpenVG support.</li>
<li>Removed the galahad gallium driver.</li>
<li>Removed the identity gallium driver.</li>
<li>Removed the EGL loader from the Windows SCons build.</li>
<li>Removed the classic osmesa from the Windows SCons build.</li>
</ul>
</div>
</body>
</html>

61
docs/relnotes/10.7.0.html Normal file

@@ -0,0 +1,61 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.7.0 Release Notes / TBD</h1>
<p>
Mesa 10.7.0 is a new development release.
People who are concerned with stability and reliability should stick
with a previous release or wait for Mesa 10.7.1.
</p>
<p>
Mesa 10.7.0 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>GL_ARB_framebuffer_no_attachments on i965</li>
<li>GL_ARB_shader_stencil_export on llvmpipe</li>
</ul>
<h2>Bug fixes</h2>
TBD.
<h2>Changes</h2>
TBD.
</div>
</body>
</html>


@@ -1693,7 +1693,7 @@ bc644be551ed585fc4f66c16b64a91c9 MesaGLUT-7.10.tar.gz
<li>llvmpipe: Special case complementary and identify blend factors in SoA.</li>
<li>llvmpipe: Make rgb/alpha bland func/factors match, when there is no alpha.</li>
<li>draw: Prevent clipped vertices overflow.</li>
<li>draw: Fullfil the new min_lod/max_lod/lod_bias/border_color dynamic state</li>
<li>draw: Fulfil the new min_lod/max_lod/lod_bias/border_color dynamic state</li>
<li>gallivm: Fetch the lod from the dynamic state when min_lod == max_lod.</li>
<li>gallivm: Remove dead experimental code.</li>
<li>llvmpipe: Decouple sampler view and sampler state updates.</li>


@@ -48,7 +48,7 @@ c49c19c2bbef4f3b7f1389974dff25f4 MesaGLUT-7.6.zip
<h2>New features</h2>
<ul>
<li><a href="../openvg.html">OpenVG</a> front-end (state tracker for Gallium).
<li>OpenVG front-end (state tracker for Gallium).
This was written by Zack Rusin at Tungsten Graphics.
<li>GL_ARB_vertex_array_object and GL_APPLE_vertex_array_object extensions
(supported in Gallium drivers, Intel DRI drivers, and software drivers)</li>


@@ -133,10 +133,8 @@ each directory.
<ul>
<li><b>clover</b> - OpenCL state tracker
<li><b>dri</b> - Meta state tracker for DRI drivers
<li><b>egl</b> - Meta state tracker for EGL drivers
<li><b>glx</b> - Meta state tracker for GLX
<li><b>vdpau</b> - VDPAU state tracker
<li><b>vega</b> - OpenVG 1.x state tracker
<li><b>wgl</b> -
<li><b>xorg</b> - Meta state tracker for Xorg video drivers
<li><b>xvmc</b> - XvMC state tracker


@@ -0,0 +1,147 @@
Name
MESA_image_dma_buf_export
Name Strings
EGL_MESA_image_dma_buf_export
Contributors
Dave Airlie
Contact
Dave Airlie (airlied 'at' redhat 'dot' com)
Status
Complete, shipping.
Version
Version 3, May 5, 2015
Number
EGL Extension #87
Dependencies
Requires EGL 1.4 or later. This extension is written against the
wording of the EGL 1.4 specification.
EGL_KHR_base_image is required.
The EGL implementation must be running on a Linux kernel supporting the
dma_buf buffer sharing mechanism.
Overview
This extension provides entry points for integrating EGLImage with the
dma-buf infrastructure. The extension allows creating a Linux dma_buf
file descriptor, or multiple file descriptors in the case of a multi-plane
YUV image, from an EGLImage.
It is designed to provide the complementary functionality to
EGL_EXT_image_dma_buf_import.
IP Status
Open-source; freely implementable.
New Types
This extension uses the 64-bit unsigned integer type EGLuint64KHR
first introduced by the EGL_KHR_stream extension, but does not
depend on that extension. The typedef may be reproduced separately
for this extension, if not already present in eglext.h.
typedef khronos_uint64_t EGLuint64KHR;
New Procedures and Functions
EGLBoolean eglExportDMABUFImageQueryMESA(EGLDisplay dpy,
EGLImageKHR image,
int *fourcc,
int *num_planes,
EGLuint64KHR *modifiers);
EGLBoolean eglExportDMABUFImageMESA(EGLDisplay dpy,
EGLImageKHR image,
int *fds,
EGLint *strides,
EGLint *offsets);
New Tokens
None
Additions to the EGL 1.4 Specification:
To mirror the import extension, this extension attempts to return
enough information to enable an exported dma-buf to be imported
via eglCreateImageKHR and EGL_LINUX_DMA_BUF_EXT token.
Retrieving the information is a two step process, so two APIs
are required.
The first entrypoint
EGLBoolean eglExportDMABUFImageQueryMESA(EGLDisplay dpy,
EGLImageKHR image,
int *fourcc,
int *num_planes,
EGLuint64KHR *modifiers);
is used to retrieve the pixel format of the buffer, as specified by
drm_fourcc.h, the number of planes in the image and the Linux
drm modifiers. <fourcc>, <num_planes> and <modifiers> may be NULL,
in which case no value is retrieved.
The second entrypoint retrieves the dma_buf file descriptors,
strides and offsets for the image. The caller should pass
arrays sized according to the num_planes values retrieved previously.
Passing arrays of the wrong size will have undefined results.
If the number of fds is less than the number of planes, then
subsequent fd slots should contain -1.
EGLBoolean eglExportDMABUFImageMESA(EGLDisplay dpy,
EGLImageKHR image,
int *fds,
EGLint *strides,
EGLint *offsets);
<fds>, <strides>, <offsets> can be NULL if the information isn't
required by the caller.
Issues
1. Should the API look more like an attribute getting API?
ANSWER: No, from a user interface pov, having to iterate across calling
the API up to 12 times using attribs seems like the wrong solution.
2. Should the API take a plane and just get the fd/stride/offset for that
plane?
ANSWER: UNKNOWN, this might be just as valid an API.
3. Does ownership of the file descriptor remain with the app?
ANSWER: Yes, the app is responsible for closing any fds retrieved.
4. If number of planes and number of fds differ what should we do?
ANSWER: Return -1 for the secondary slots, as this avoids having
to dup the fd extra times to make the interface sane.
Revision History
Version 3, May, 2015
Just use the KHR 64-bit type.
Version 2, March, 2015
Add a query interface (Dave Airlie)
Version 1, June 3, 2014
Initial draft (Dave Airlie)
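
The two-step flow described in the specification above (query the plane count first, then export into arrays sized from it) might be sketched as follows. This is a minimal illustration, not the reference usage: the typedefs are stand-ins so the sketch compiles without EGL headers, `struct exported_image`, `export_image()` and the two `fake_*` functions are hypothetical names, and only the two function signatures come from the spec text.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in typedefs so the sketch compiles without EGL headers.
 * Real code would include <EGL/egl.h> and <EGL/eglext.h> instead. */
typedef unsigned int EGLBoolean;
typedef int32_t EGLint;
typedef void *EGLDisplay;
typedef void *EGLImageKHR;
typedef uint64_t EGLuint64KHR;

/* Pointer types matching the two entry points listed in the spec. */
typedef EGLBoolean (*export_query_fn)(EGLDisplay, EGLImageKHR, int *fourcc,
                                      int *num_planes, EGLuint64KHR *modifiers);
typedef EGLBoolean (*export_fn)(EGLDisplay, EGLImageKHR, int *fds,
                                EGLint *strides, EGLint *offsets);

/* Hypothetical container for the exported description. */
struct exported_image {
    int fourcc;
    int num_planes;
    int *fds;        /* num_planes entries; unused slots hold -1 */
    EGLint *strides;
    EGLint *offsets;
};

/* Step 1: query the plane count. Step 2: export into arrays of that size.
 * Returns 0 on success, -1 on failure (all pointers are NULL on failure). */
static int export_image(export_query_fn query, export_fn do_export,
                        EGLDisplay dpy, EGLImageKHR image,
                        struct exported_image *out)
{
    out->fds = NULL;
    out->strides = NULL;
    out->offsets = NULL;

    /* Modifiers are not needed here; the spec allows NULL for them. */
    if (!query(dpy, image, &out->fourcc, &out->num_planes, NULL) ||
        out->num_planes <= 0)
        return -1;

    size_t n = (size_t)out->num_planes;
    out->fds = malloc(n * sizeof *out->fds);
    out->strides = calloc(n, sizeof *out->strides);
    out->offsets = calloc(n, sizeof *out->offsets);
    if (out->fds && out->strides && out->offsets &&
        do_export(dpy, image, out->fds, out->strides, out->offsets))
        return 0;

    free(out->fds);
    free(out->strides);
    free(out->offsets);
    out->fds = NULL;
    out->strides = NULL;
    out->offsets = NULL;
    return -1;
}

/* Fakes used only to exercise the helper: a two-plane image whose planes
 * share one dma-buf, so the second fd slot is -1 as the spec describes. */
static EGLBoolean fake_query(EGLDisplay dpy, EGLImageKHR image, int *fourcc,
                             int *num_planes, EGLuint64KHR *modifiers)
{
    (void)dpy; (void)image; (void)modifiers;
    if (fourcc) *fourcc = 0x3231564e;   /* 'NV12' in drm_fourcc.h encoding */
    if (num_planes) *num_planes = 2;
    return 1;
}

static EGLBoolean fake_export(EGLDisplay dpy, EGLImageKHR image, int *fds,
                              EGLint *strides, EGLint *offsets)
{
    (void)dpy; (void)image;
    if (fds) { fds[0] = 42; fds[1] = -1; }
    if (strides) { strides[0] = 1920; strides[1] = 1920; }
    if (offsets) { offsets[0] = 0; offsets[1] = 1920 * 1080; }
    return 1;
}
```

In a real program the two function pointers would be obtained from the EGL implementation rather than from fakes, and, per issue 3 above, the application would close every returned fd that is not -1 once it is done with it.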


@@ -150,7 +150,7 @@ New features:
Changes:
<ul>
<li>renamed aux.h as glaux.h (MS-DOS names can't start with aux)
<li>most filenames are in 8.3 format to accomodate MS-DOS
<li>most filenames are in 8.3 format to accommodate MS-DOS
<li>use GLubytes to store arrays of colors instead of GLints
</ul>
@@ -1224,7 +1224,7 @@ Bug fixes:
</ul>
Changes:
<ul>
<li>max texture units reduced to six to accomodate texture rectangles
<li>max texture units reduced to six to accommodate texture rectangles
<li>removed unfinished GL_MESA_sprite_point extension code
</ul>


@@ -19,6 +19,7 @@
<p>
This page lists known issues with
<a href="http://www.spec.org/gwpg/gpc.static/vp11info.html" target="_main">SPEC Viewperf 11</a>
and <a href="https://www.spec.org/gwpg/gpc.static/vp12info.html" target="_main">SPEC Viewperf 12</a>
when running on Mesa-based drivers.
</p>
@@ -40,13 +41,15 @@ These issues have been reported to the SPEC organization in the hope that
they'll be fixed in the future.
</p>
<h2><u>Viewperf 11</u></h2>
<p>
Some of the Viewperf tests use a lot of memory.
Some of the Viewperf 11 tests use a lot of memory.
At least 2GB of RAM is recommended.
</p>
<h2>Catia-03 test 2</h2>
<h3>Catia-03 test 2</h3>
<p>
This test creates over 38000 vertex buffer objects. On some systems
@@ -59,7 +62,7 @@ either in Viewperf or the Mesa driver.
<h2>Catia-03 tests 3, 4, 8</h2>
<h3>Catia-03 tests 3, 4, 8</h3>
<p>
These tests use features of the
@@ -79,7 +82,7 @@ Subsequent drawing calls become no-ops and the rendering is incorrect.
<h2>sw-02 tests 1, 2, 4, 6</h2>
<h3>sw-02 tests 1, 2, 4, 6</h3>
<p>
These tests depend on the
@@ -99,7 +102,7 @@ color. This is probably due to some uninitialized state somewhere.
<h2>sw-02 test 6</h2>
<h3>sw-02 test 6</h3>
<p>
The lines drawn in this test appear in a random color.
@@ -111,7 +114,7 @@ situation, we get a random color.
<h2>Lightwave-01 test 3</h2>
<h3>Lightwave-01 test 3</h3>
<p>
This test uses a number of mipmapped textures, but the textures are
@@ -172,7 +175,7 @@ However, we have no plans to implement this work-around in Mesa.
</p>
<h2>Maya-03 test 2</h2>
<h3>Maya-03 test 2</h3>
<p>
This test makes some unusual calls to glRotate. For example:
@@ -204,7 +207,7 @@ and with a semi-random color (between white and black) since GL_FOG is enabled.
</p>
<h2>Proe-05 test 1</h2>
<h3>Proe-05 test 1</h3>
<p>
This uses depth testing but there's two problems:
@@ -232,7 +235,7 @@ glClear is called so clearing the depth buffer would be a no-op anyway.
</p>
<h2>Proe-05 test 6</h2>
<h3>Proe-05 test 6</h3>
<p>
This test draws an engine model with a two-pass algorithm.
@@ -261,6 +264,86 @@ blending with appropriate patterns/modes to ensure the same fragments
are produced in both passes.
</p>
<h2><u>Viewperf 12</u></h2>
<p>
Note that Viewperf 12 only runs on 64-bit Windows 7 or later.
</p>
<h3>catia-04</h3>
<p>
One of the catia tests calls wglGetProcAddress() to get some
GL_EXT_direct_state_access functions (such as glBindMultiTextureEXT) and some
GL_NV_half_float functions (such as glMultiTexCoord3hNV).
If the extension/function is not supported, wglGetProcAddress() can return NULL.
Unfortunately, Viewperf doesn't check for null pointers and crashes when it
later tries to use the pointer.
</p>
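
The missing check described above could look roughly like the sketch below. It is an illustration under stated assumptions: `lookup_proc()` is a hypothetical stand-in for wglGetProcAddress() that knows a single function, and the stub does nothing but count calls. The point is only that the returned pointer is tested before use.

```c
#include <stddef.h>
#include <string.h>

typedef void (*generic_proc)(void);

static int calls = 0;

/* Stub standing in for a supported extension function. */
static void bind_multi_texture_stub(void)
{
    calls++;
}

/* Hypothetical stand-in for wglGetProcAddress(): it knows one function
 * and returns NULL for everything else, as an unsupported lookup can. */
static generic_proc lookup_proc(const char *name)
{
    if (strcmp(name, "glBindMultiTextureEXT") == 0)
        return bind_multi_texture_stub;
    return NULL;
}

/* Calls the named entry point only if the lookup succeeded.
 * Returns 1 if the function was called, 0 if it is unavailable. */
static int call_if_supported(const char *name)
{
    generic_proc fn = lookup_proc(name);
    if (fn == NULL)
        return 0;   /* caller can fall back instead of crashing */
    fn();
    return 1;
}
```

With this pattern, a lookup for an unsupported function such as glMultiTexCoord3hNV yields a return value the caller can act on rather than a crash at the first call.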
<p>
Another catia test uses OpenGL 3.1's primitive restart feature.
But when Viewperf creates an OpenGL context, it doesn't request version 3.1
If the driver returns version 3.0 or earlier all the calls related to primitive
restart generate an OpenGL error.
Some of the rendering is then incorrect.
</p>
<h3>energy-01</h3>
<p>
This test creates a 3D luminance texture of size 1K x 1K x 1K.
If the OpenGL driver/device doesn't support a texture of this size
the glTexImage3D() call will fail with GL_INVALID_VALUE or GL_OUT_OF_MEMORY
and all that's rendered is plain white polygons.
Ideally, the test would use a proxy texture to determine the max 3D
texture size. But it does not do that.
</p>
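
To give a sense of why such a texture may be rejected, the storage it needs can be estimated with a short calculation. This is a rough sketch that assumes 1K means 1024 texels per side and one byte per texel (8-bit luminance); neither figure is stated above, and real drivers may add padding or use a different internal format. `texture_3d_bytes()` is a hypothetical helper.

```c
#include <stdint.h>

/* Uncompressed storage for a single-level 3D texture, ignoring any
 * driver padding or alignment. */
static uint64_t texture_3d_bytes(uint64_t width, uint64_t height,
                                 uint64_t depth, uint64_t bytes_per_texel)
{
    return width * height * depth * bytes_per_texel;
}
```

Under those assumptions, `texture_3d_bytes(1024, 1024, 1024, 1)` is 1,073,741,824 bytes, or 1 GiB, for the base level alone.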
<h3>maya-04</h3>
<p>
This test generates many GL_INVALID_OPERATION errors in its calls to
glUniform().
Causes include:
</p>
<ul>
<li> Trying to set float uniforms with glUniformi()
<li> Trying to set float uniforms with glUniform3f()
<li> Trying to set matrix uniforms with glUniform() instead of glUniformMatrix().
</ul>
<p>
Apparently, the indexes returned by glGetUniformLocation() were hard-coded
into the application trace when it was created.
Since different implementations of glGetUniformLocation() may return different
values for any given uniform name, subsequent calls to glUniform() will be
invalid since they refer to the wrong uniform variables.
This causes many OpenGL errors and leads to incorrect rendering.
</p>
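<p>
The problem with hard-coded locations can be shown with a toy model.
Nothing below is real driver code: <code>driver_a</code> and
<code>driver_b</code> are hypothetical tables standing in for two
implementations that number the same uniforms differently, and
<code>get_location()</code> mimics glGetUniformLocation() by returning -1 for
an unknown name.
</p>

```c
#include <stddef.h>
#include <string.h>

struct uniform {
    const char *name;
    int location;
};

/* Two hypothetical drivers assign different locations to the same names. */
static const struct uniform driver_a[] = { { "color", 0 }, { "mvp", 1 } };
static const struct uniform driver_b[] = { { "mvp", 0 }, { "color", 4 } };

/* Look a uniform up by name; returns -1 if the name is not found. */
static int get_location(const struct uniform *table, size_t count,
                        const char *name)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0)
            return table[i].location;
    }
    return -1;
}
```

<p>
A trace recorded against <code>driver_a</code> would store location 1 for
"mvp"; replayed against <code>driver_b</code>, location 1 does not refer to
any uniform at all. Querying by name at run time avoids the mismatch.
</p>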
<h3>medical-01</h3>
<p>
This test uses a single GLSL fragment shader which contains a GLSL 1.20
array initializer statement, but it neglects to specify
<code>#version 120</code> at the top of the shader code.
So, the shader does not compile and all that's rendered is plain white polygons.
</p>
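<p>
For illustration, here is roughly what a shader with the directive in place
looks like, embedded as a C string. The shader body is a made-up example, not
the one Viewperf uses, and <code>has_version_directive()</code> is a
hypothetical helper that ignores comments for simplicity.
</p>

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical fragment shader: the array constructor on the "weights" line
 * is GLSL 1.20 syntax, so the #version directive has to come first. */
static const char *fragment_source =
    "#version 120\n"
    "void main()\n"
    "{\n"
    "    float weights[3] = float[3](0.25, 0.5, 0.25);\n"
    "    gl_FragColor = vec4(weights[0] + weights[1] + weights[2]);\n"
    "}\n";

/* Returns 1 if the source begins with a #version directive.
 * Leading whitespace is skipped; leading comments are not handled. */
static int has_version_directive(const char *source)
{
    while (*source != '\0' && isspace((unsigned char)*source))
        source++;
    return strncmp(source, "#version", 8) == 0;
}
```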
<p>
Also, the test tries to create a very large 3D texture that may exceed
the device driver's limit.
When this happens, the glTexImage3D call fails and all that's rendered is
a white box.
</p>
<h3>showcase-01</h3>
<p>
This is actually a DX11 test based on Autodesk's Showcase product.
As such, it won't run with Mesa.
</p>
</div>
</body>

1868
include/D3D9/d3d9.h Normal file

File diff suppressed because it is too large

387
include/D3D9/d3d9caps.h Normal file

@@ -0,0 +1,387 @@
/*
* Copyright 2011 Joakim Sindholt <opensource@zhasha.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE. */
#ifndef _D3D9CAPS_H_
#define _D3D9CAPS_H_
#include "d3d9types.h"
/* Caps flags */
#define D3DCAPS2_FULLSCREENGAMMA 0x00020000
#define D3DCAPS2_CANCALIBRATEGAMMA 0x00100000
#define D3DCAPS2_RESERVED 0x02000000
#define D3DCAPS2_CANMANAGERESOURCE 0x10000000
#define D3DCAPS2_DYNAMICTEXTURES 0x20000000
#define D3DCAPS2_CANAUTOGENMIPMAP 0x40000000
#define D3DCAPS2_CANSHARERESOURCE 0x80000000
#define D3DCAPS3_ALPHA_FULLSCREEN_FLIP_OR_DISCARD 0x00000020
#define D3DCAPS3_LINEAR_TO_SRGB_PRESENTATION 0x00000080
#define D3DCAPS3_COPY_TO_VIDMEM 0x00000100
#define D3DCAPS3_COPY_TO_SYSTEMMEM 0x00000200
#define D3DCAPS3_DXVAHD 0x00000400
#define D3DCAPS3_RESERVED 0x8000001F
#define D3DPRESENT_INTERVAL_DEFAULT 0x00000000
#define D3DPRESENT_INTERVAL_ONE 0x00000001
#define D3DPRESENT_INTERVAL_TWO 0x00000002
#define D3DPRESENT_INTERVAL_THREE 0x00000004
#define D3DPRESENT_INTERVAL_FOUR 0x00000008
#define D3DPRESENT_INTERVAL_IMMEDIATE 0x80000000
#define D3DCURSORCAPS_COLOR 0x00000001
#define D3DCURSORCAPS_LOWRES 0x00000002
#define D3DDEVCAPS_EXECUTESYSTEMMEMORY 0x00000010
#define D3DDEVCAPS_EXECUTEVIDEOMEMORY 0x00000020
#define D3DDEVCAPS_TLVERTEXSYSTEMMEMORY 0x00000040
#define D3DDEVCAPS_TLVERTEXVIDEOMEMORY 0x00000080
#define D3DDEVCAPS_TEXTURESYSTEMMEMORY 0x00000100
#define D3DDEVCAPS_TEXTUREVIDEOMEMORY 0x00000200
#define D3DDEVCAPS_DRAWPRIMTLVERTEX 0x00000400
#define D3DDEVCAPS_CANRENDERAFTERFLIP 0x00000800
#define D3DDEVCAPS_TEXTURENONLOCALVIDMEM 0x00001000
#define D3DDEVCAPS_DRAWPRIMITIVES2 0x00002000
#define D3DDEVCAPS_SEPARATETEXTUREMEMORIES 0x00004000
#define D3DDEVCAPS_DRAWPRIMITIVES2EX 0x00008000
#define D3DDEVCAPS_HWTRANSFORMANDLIGHT 0x00010000
#define D3DDEVCAPS_CANBLTSYSTONONLOCAL 0x00020000
#define D3DDEVCAPS_HWRASTERIZATION 0x00080000
#define D3DDEVCAPS_PUREDEVICE 0x00100000
#define D3DDEVCAPS_QUINTICRTPATCHES 0x00200000
#define D3DDEVCAPS_RTPATCHES 0x00400000
#define D3DDEVCAPS_RTPATCHHANDLEZERO 0x00800000
#define D3DDEVCAPS_NPATCHES 0x01000000
#define D3DPMISCCAPS_MASKZ 0x00000002
#define D3DPMISCCAPS_CULLNONE 0x00000010
#define D3DPMISCCAPS_CULLCW 0x00000020
#define D3DPMISCCAPS_CULLCCW 0x00000040
#define D3DPMISCCAPS_COLORWRITEENABLE 0x00000080
#define D3DPMISCCAPS_CLIPPLANESCALEDPOINTS 0x00000100
#define D3DPMISCCAPS_CLIPTLVERTS 0x00000200
#define D3DPMISCCAPS_TSSARGTEMP 0x00000400
#define D3DPMISCCAPS_BLENDOP 0x00000800
#define D3DPMISCCAPS_NULLREFERENCE 0x00001000
#define D3DPMISCCAPS_INDEPENDENTWRITEMASKS 0x00004000
#define D3DPMISCCAPS_PERSTAGECONSTANT 0x00008000
#define D3DPMISCCAPS_FOGANDSPECULARALPHA 0x00010000
#define D3DPMISCCAPS_SEPARATEALPHABLEND 0x00020000
#define D3DPMISCCAPS_MRTINDEPENDENTBITDEPTHS 0x00040000
#define D3DPMISCCAPS_MRTPOSTPIXELSHADERBLENDING 0x00080000
#define D3DPMISCCAPS_FOGVERTEXCLAMPED 0x00100000
#define D3DPMISCCAPS_POSTBLENDSRGBCONVERT 0x00200000
#define D3DPRASTERCAPS_DITHER 0x00000001
#define D3DPRASTERCAPS_ZTEST 0x00000010
#define D3DPRASTERCAPS_FOGVERTEX 0x00000080
#define D3DPRASTERCAPS_FOGTABLE 0x00000100
#define D3DPRASTERCAPS_MIPMAPLODBIAS 0x00002000
#define D3DPRASTERCAPS_ZBUFFERLESSHSR 0x00008000
#define D3DPRASTERCAPS_FOGRANGE 0x00010000
#define D3DPRASTERCAPS_ANISOTROPY 0x00020000
#define D3DPRASTERCAPS_WBUFFER 0x00040000
#define D3DPRASTERCAPS_WFOG 0x00100000
#define D3DPRASTERCAPS_ZFOG 0x00200000
#define D3DPRASTERCAPS_COLORPERSPECTIVE 0x00400000
#define D3DPRASTERCAPS_SCISSORTEST 0x01000000
#define D3DPRASTERCAPS_SLOPESCALEDEPTHBIAS 0x02000000
#define D3DPRASTERCAPS_DEPTHBIAS 0x04000000
#define D3DPRASTERCAPS_MULTISAMPLE_TOGGLE 0x08000000
#define D3DPCMPCAPS_NEVER 0x00000001
#define D3DPCMPCAPS_LESS 0x00000002
#define D3DPCMPCAPS_EQUAL 0x00000004
#define D3DPCMPCAPS_LESSEQUAL 0x00000008
#define D3DPCMPCAPS_GREATER 0x00000010
#define D3DPCMPCAPS_NOTEQUAL 0x00000020
#define D3DPCMPCAPS_GREATEREQUAL 0x00000040
#define D3DPCMPCAPS_ALWAYS 0x00000080
#define D3DPBLENDCAPS_ZERO 0x00000001
#define D3DPBLENDCAPS_ONE 0x00000002
#define D3DPBLENDCAPS_SRCCOLOR 0x00000004
#define D3DPBLENDCAPS_INVSRCCOLOR 0x00000008
#define D3DPBLENDCAPS_SRCALPHA 0x00000010
#define D3DPBLENDCAPS_INVSRCALPHA 0x00000020
#define D3DPBLENDCAPS_DESTALPHA 0x00000040
#define D3DPBLENDCAPS_INVDESTALPHA 0x00000080
#define D3DPBLENDCAPS_DESTCOLOR 0x00000100
#define D3DPBLENDCAPS_INVDESTCOLOR 0x00000200
#define D3DPBLENDCAPS_SRCALPHASAT 0x00000400
#define D3DPBLENDCAPS_BOTHSRCALPHA 0x00000800
#define D3DPBLENDCAPS_BOTHINVSRCALPHA 0x00001000
#define D3DPBLENDCAPS_BLENDFACTOR 0x00002000
#ifndef D3D_DISABLE_9EX
# define D3DPBLENDCAPS_SRCCOLOR2 0x00004000
# define D3DPBLENDCAPS_INVSRCCOLOR2 0x00008000
#endif
#define D3DPSHADECAPS_COLORGOURAUDRGB 0x00000008
#define D3DPSHADECAPS_SPECULARGOURAUDRGB 0x00000200
#define D3DPSHADECAPS_ALPHAGOURAUDBLEND 0x00004000
#define D3DPSHADECAPS_FOGGOURAUD 0x00080000
#define D3DPTEXTURECAPS_PERSPECTIVE 0x00000001
#define D3DPTEXTURECAPS_POW2 0x00000002
#define D3DPTEXTURECAPS_ALPHA 0x00000004
#define D3DPTEXTURECAPS_SQUAREONLY 0x00000020
#define D3DPTEXTURECAPS_TEXREPEATNOTSCALEDBYSIZE 0x00000040
#define D3DPTEXTURECAPS_ALPHAPALETTE 0x00000080
#define D3DPTEXTURECAPS_NONPOW2CONDITIONAL 0x00000100
#define D3DPTEXTURECAPS_PROJECTED 0x00000400
#define D3DPTEXTURECAPS_CUBEMAP 0x00000800
#define D3DPTEXTURECAPS_VOLUMEMAP 0x00002000
#define D3DPTEXTURECAPS_MIPMAP 0x00004000
#define D3DPTEXTURECAPS_MIPVOLUMEMAP 0x00008000
#define D3DPTEXTURECAPS_MIPCUBEMAP 0x00010000
#define D3DPTEXTURECAPS_CUBEMAP_POW2 0x00020000
#define D3DPTEXTURECAPS_VOLUMEMAP_POW2 0x00040000
#define D3DPTEXTURECAPS_NOPROJECTEDBUMPENV 0x00200000
#define D3DPTFILTERCAPS_MINFPOINT 0x00000100
#define D3DPTFILTERCAPS_MINFLINEAR 0x00000200
#define D3DPTFILTERCAPS_MINFANISOTROPIC 0x00000400
#define D3DPTFILTERCAPS_MINFPYRAMIDALQUAD 0x00000800
#define D3DPTFILTERCAPS_MINFGAUSSIANQUAD 0x00001000
#define D3DPTFILTERCAPS_MIPFPOINT 0x00010000
#define D3DPTFILTERCAPS_MIPFLINEAR 0x00020000
#define D3DPTFILTERCAPS_MAGFPOINT 0x01000000
#define D3DPTFILTERCAPS_MAGFLINEAR 0x02000000
#define D3DPTFILTERCAPS_MAGFANISOTROPIC 0x04000000
#define D3DPTFILTERCAPS_MAGFPYRAMIDALQUAD 0x08000000
#define D3DPTFILTERCAPS_MAGFGAUSSIANQUAD 0x10000000
#define D3DPTADDRESSCAPS_WRAP 0x00000001
#define D3DPTADDRESSCAPS_MIRROR 0x00000002
#define D3DPTADDRESSCAPS_CLAMP 0x00000004
#define D3DPTADDRESSCAPS_BORDER 0x00000008
#define D3DPTADDRESSCAPS_INDEPENDENTUV 0x00000010
#define D3DPTADDRESSCAPS_MIRRORONCE 0x00000020
#define D3DLINECAPS_TEXTURE 0x00000001
#define D3DLINECAPS_ZTEST 0x00000002
#define D3DLINECAPS_BLEND 0x00000004
#define D3DLINECAPS_ALPHACMP 0x00000008
#define D3DLINECAPS_FOG 0x00000010
#define D3DLINECAPS_ANTIALIAS 0x00000020
#define D3DSTENCILCAPS_KEEP 0x00000001
#define D3DSTENCILCAPS_ZERO 0x00000002
#define D3DSTENCILCAPS_REPLACE 0x00000004
#define D3DSTENCILCAPS_INCRSAT 0x00000008
#define D3DSTENCILCAPS_DECRSAT 0x00000010
#define D3DSTENCILCAPS_INVERT 0x00000020
#define D3DSTENCILCAPS_INCR 0x00000040
#define D3DSTENCILCAPS_DECR 0x00000080
#define D3DSTENCILCAPS_TWOSIDED 0x00000100
#define D3DFVFCAPS_TEXCOORDCOUNTMASK 0x0000FFFF
#define D3DFVFCAPS_DONOTSTRIPELEMENTS 0x00080000
#define D3DFVFCAPS_PSIZE 0x00100000
#define D3DTEXOPCAPS_DISABLE 0x00000001
#define D3DTEXOPCAPS_SELECTARG1 0x00000002
#define D3DTEXOPCAPS_SELECTARG2 0x00000004
#define D3DTEXOPCAPS_MODULATE 0x00000008
#define D3DTEXOPCAPS_MODULATE2X 0x00000010
#define D3DTEXOPCAPS_MODULATE4X 0x00000020
#define D3DTEXOPCAPS_ADD 0x00000040
#define D3DTEXOPCAPS_ADDSIGNED 0x00000080
#define D3DTEXOPCAPS_ADDSIGNED2X 0x00000100
#define D3DTEXOPCAPS_SUBTRACT 0x00000200
#define D3DTEXOPCAPS_ADDSMOOTH 0x00000400
#define D3DTEXOPCAPS_BLENDDIFFUSEALPHA 0x00000800
#define D3DTEXOPCAPS_BLENDTEXTUREALPHA 0x00001000
#define D3DTEXOPCAPS_BLENDFACTORALPHA 0x00002000
#define D3DTEXOPCAPS_BLENDTEXTUREALPHAPM 0x00004000
#define D3DTEXOPCAPS_BLENDCURRENTALPHA 0x00008000
#define D3DTEXOPCAPS_PREMODULATE 0x00010000
#define D3DTEXOPCAPS_MODULATEALPHA_ADDCOLOR 0x00020000
#define D3DTEXOPCAPS_MODULATECOLOR_ADDALPHA 0x00040000
#define D3DTEXOPCAPS_MODULATEINVALPHA_ADDCOLOR 0x00080000
#define D3DTEXOPCAPS_MODULATEINVCOLOR_ADDALPHA 0x00100000
#define D3DTEXOPCAPS_BUMPENVMAP 0x00200000
#define D3DTEXOPCAPS_BUMPENVMAPLUMINANCE 0x00400000
#define D3DTEXOPCAPS_DOTPRODUCT3 0x00800000
#define D3DTEXOPCAPS_MULTIPLYADD 0x01000000
#define D3DTEXOPCAPS_LERP 0x02000000
#define D3DVTXPCAPS_TEXGEN 0x00000001
#define D3DVTXPCAPS_MATERIALSOURCE7 0x00000002
#define D3DVTXPCAPS_DIRECTIONALLIGHTS 0x00000008
#define D3DVTXPCAPS_POSITIONALLIGHTS 0x00000010
#define D3DVTXPCAPS_LOCALVIEWER 0x00000020
#define D3DVTXPCAPS_TWEENING 0x00000040
#define D3DVTXPCAPS_TEXGEN_SPHEREMAP 0x00000100
#define D3DVTXPCAPS_NO_TEXGEN_NONLOCALVIEWER 0x00000200
#define D3DDEVCAPS2_STREAMOFFSET 0x00000001
#define D3DDEVCAPS2_DMAPNPATCH 0x00000002
#define D3DDEVCAPS2_ADAPTIVETESSRTPATCH 0x00000004
#define D3DDEVCAPS2_ADAPTIVETESSNPATCH 0x00000008
#define D3DDEVCAPS2_CAN_STRETCHRECT_FROM_TEXTURES 0x00000010
#define D3DDEVCAPS2_PRESAMPLEDDMAPNPATCH 0x00000020
#define D3DDEVCAPS2_VERTEXELEMENTSCANSHARESTREAMOFFSET 0x00000040
#define D3DDTCAPS_UBYTE4 0x00000001
#define D3DDTCAPS_UBYTE4N 0x00000002
#define D3DDTCAPS_SHORT2N 0x00000004
#define D3DDTCAPS_SHORT4N 0x00000008
#define D3DDTCAPS_USHORT2N 0x00000010
#define D3DDTCAPS_USHORT4N 0x00000020
#define D3DDTCAPS_UDEC3 0x00000040
#define D3DDTCAPS_DEC3N 0x00000080
#define D3DDTCAPS_FLOAT16_2 0x00000100
#define D3DDTCAPS_FLOAT16_4 0x00000200
#define D3DVS20_MAX_DYNAMICFLOWCONTROLDEPTH 24
#define D3DVS20_MIN_DYNAMICFLOWCONTROLDEPTH 0
#define D3DVS20_MAX_NUMTEMPS 32
#define D3DVS20_MIN_NUMTEMPS 12
#define D3DVS20_MAX_STATICFLOWCONTROLDEPTH 4
#define D3DVS20_MIN_STATICFLOWCONTROLDEPTH 1
#define D3DVS20CAPS_PREDICATION (1 << 0)
#define D3DPS20CAPS_ARBITRARYSWIZZLE (1 << 0)
#define D3DPS20CAPS_GRADIENTINSTRUCTIONS (1 << 1)
#define D3DPS20CAPS_PREDICATION (1 << 2)
#define D3DPS20CAPS_NODEPENDENTREADLIMIT (1 << 3)
#define D3DPS20CAPS_NOTEXINSTRUCTIONLIMIT (1 << 4)
#define D3DPS20_MAX_DYNAMICFLOWCONTROLDEPTH 24
#define D3DPS20_MIN_DYNAMICFLOWCONTROLDEPTH 0
#define D3DPS20_MAX_NUMTEMPS 32
#define D3DPS20_MIN_NUMTEMPS 12
#define D3DPS20_MAX_STATICFLOWCONTROLDEPTH 4
#define D3DPS20_MIN_STATICFLOWCONTROLDEPTH 0
#define D3DPS20_MAX_NUMINSTRUCTIONSLOTS 512
#define D3DPS20_MIN_NUMINSTRUCTIONSLOTS 96
#define D3DMIN30SHADERINSTRUCTIONS 512
#define D3DMAX30SHADERINSTRUCTIONS 32768
/* Structs */
typedef struct _D3DVSHADERCAPS2_0 {
DWORD Caps;
INT DynamicFlowControlDepth;
INT NumTemps;
INT StaticFlowControlDepth;
} D3DVSHADERCAPS2_0, *PD3DVSHADERCAPS2_0, *LPD3DVSHADERCAPS2_0;
typedef struct _D3DPSHADERCAPS2_0 {
DWORD Caps;
INT DynamicFlowControlDepth;
INT NumTemps;
INT StaticFlowControlDepth;
INT NumInstructionSlots;
} D3DPSHADERCAPS2_0, *PD3DPSHADERCAPS2_0, *LPD3DPSHADERCAPS2_0;
typedef struct _D3DCAPS9 {
D3DDEVTYPE DeviceType;
UINT AdapterOrdinal;
DWORD Caps;
DWORD Caps2;
DWORD Caps3;
DWORD PresentationIntervals;
DWORD CursorCaps;
DWORD DevCaps;
DWORD PrimitiveMiscCaps;
DWORD RasterCaps;
DWORD ZCmpCaps;
DWORD SrcBlendCaps;
DWORD DestBlendCaps;
DWORD AlphaCmpCaps;
DWORD ShadeCaps;
DWORD TextureCaps;
DWORD TextureFilterCaps;
DWORD CubeTextureFilterCaps;
DWORD VolumeTextureFilterCaps;
DWORD TextureAddressCaps;
DWORD VolumeTextureAddressCaps;
DWORD LineCaps;
DWORD MaxTextureWidth;
DWORD MaxTextureHeight;
DWORD MaxVolumeExtent;
DWORD MaxTextureRepeat;
DWORD MaxTextureAspectRatio;
DWORD MaxAnisotropy;
float MaxVertexW;
float GuardBandLeft;
float GuardBandTop;
float GuardBandRight;
float GuardBandBottom;
float ExtentsAdjust;
DWORD StencilCaps;
DWORD FVFCaps;
DWORD TextureOpCaps;
DWORD MaxTextureBlendStages;
DWORD MaxSimultaneousTextures;
DWORD VertexProcessingCaps;
DWORD MaxActiveLights;
DWORD MaxUserClipPlanes;
DWORD MaxVertexBlendMatrices;
DWORD MaxVertexBlendMatrixIndex;
float MaxPointSize;
DWORD MaxPrimitiveCount;
DWORD MaxVertexIndex;
DWORD MaxStreams;
DWORD MaxStreamStride;
DWORD VertexShaderVersion;
DWORD MaxVertexShaderConst;
DWORD PixelShaderVersion;
float PixelShader1xMaxValue;
DWORD DevCaps2;
float MaxNpatchTessellationLevel;
DWORD Reserved5;
UINT MasterAdapterOrdinal;
UINT AdapterOrdinalInGroup;
UINT NumberOfAdaptersInGroup;
DWORD DeclTypes;
DWORD NumSimultaneousRTs;
DWORD StretchRectFilterCaps;
D3DVSHADERCAPS2_0 VS20Caps;
D3DPSHADERCAPS2_0 PS20Caps;
DWORD VertexTextureFilterCaps;
DWORD MaxVShaderInstructionsExecuted;
DWORD MaxPShaderInstructionsExecuted;
DWORD MaxVertexShader30InstructionSlots;
DWORD MaxPixelShader30InstructionSlots;
} D3DCAPS9, *PD3DCAPS9, *LPD3DCAPS9;
typedef struct _D3DCONTENTPROTECTIONCAPS {
DWORD Caps;
GUID KeyExchangeType;
UINT BufferAlignmentStart;
UINT BlockAlignmentSize;
ULONGLONG ProtectedMemorySize;
} D3DCONTENTPROTECTIONCAPS, *PD3DCONTENTPROTECTIONCAPS, *LPD3DCONTENTPROTECTIONCAPS;
typedef struct _D3DOVERLAYCAPS {
UINT Caps;
UINT MaxOverlayDisplayWidth;
UINT MaxOverlayDisplayHeight;
} D3DOVERLAYCAPS, *PD3DOVERLAYCAPS, *LPD3DOVERLAYCAPS;
#endif /* _D3D9CAPS_H_ */

1815
include/D3D9/d3d9types.h Normal file

File diff suppressed because it is too large


@@ -1,11 +1,12 @@
/* -*- mode: c; tab-width: 8; -*- */
/* vi: set sw=4 ts=8: */
/* Reference version of egl.h for EGL 1.4.
* $Revision: 9356 $ on $Date: 2009-10-21 02:52:25 -0700 (Wed, 21 Oct 2009) $
*/
#ifndef __egl_h_
#define __egl_h_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2007-2009 The Khronos Group Inc.
** Copyright (c) 2013-2014 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
@@ -26,304 +27,277 @@
** TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
** MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
*/
/*
** This header is generated from the Khronos OpenGL / OpenGL ES XML
** API Registry. The current version of the Registry, generator scripts
** used to make the header, and the header can be found at
** http://www.opengl.org/registry/
**
** Khronos $Revision: 31039 $ on $Date: 2015-05-04 17:01:57 -0700 (Mon, 04 May 2015) $
*/
#ifndef __egl_h_
#define __egl_h_
/* All platform-dependent types and macro boilerplate (such as EGLAPI
* and EGLAPIENTRY) should go in eglplatform.h.
*/
#include <EGL/eglplatform.h>
#ifdef __cplusplus
extern "C" {
#endif
/* Generated on date 20150504 */
/* EGL Types */
/* EGLint is defined in eglplatform.h */
/* Generated C header for:
* API: egl
* Versions considered: .*
* Versions emitted: .*
* Default extensions included: None
* Additional extensions included: _nomatch_^
* Extensions removed: _nomatch_^
*/
#ifndef EGL_VERSION_1_0
#define EGL_VERSION_1_0 1
typedef unsigned int EGLBoolean;
typedef unsigned int EGLenum;
typedef void *EGLConfig;
typedef void *EGLContext;
typedef void *EGLDisplay;
#include <KHR/khrplatform.h>
#include <EGL/eglplatform.h>
typedef void *EGLConfig;
typedef void *EGLSurface;
typedef void *EGLClientBuffer;
/* EGL Versioning */
#define EGL_VERSION_1_0 1
#define EGL_VERSION_1_1 1
#define EGL_VERSION_1_2 1
#define EGL_VERSION_1_3 1
#define EGL_VERSION_1_4 1
/* EGL Enumerants. Bitmasks and other exceptional cases aside, most
* enums are assigned unique values starting at 0x3000.
*/
/* EGL aliases */
#define EGL_FALSE 0
#define EGL_TRUE 1
/* Out-of-band handle values */
#define EGL_DEFAULT_DISPLAY ((EGLNativeDisplayType)0)
#define EGL_NO_CONTEXT ((EGLContext)0)
#define EGL_NO_DISPLAY ((EGLDisplay)0)
#define EGL_NO_SURFACE ((EGLSurface)0)
/* Out-of-band attribute value */
#define EGL_DONT_CARE ((EGLint)-1)
/* Errors / GetError return values */
#define EGL_SUCCESS 0x3000
#define EGL_NOT_INITIALIZED 0x3001
#define EGL_BAD_ACCESS 0x3002
#define EGL_BAD_ALLOC 0x3003
#define EGL_BAD_ATTRIBUTE 0x3004
#define EGL_BAD_CONFIG 0x3005
#define EGL_BAD_CONTEXT 0x3006
#define EGL_BAD_CURRENT_SURFACE 0x3007
#define EGL_BAD_DISPLAY 0x3008
#define EGL_BAD_MATCH 0x3009
#define EGL_BAD_NATIVE_PIXMAP 0x300A
#define EGL_BAD_NATIVE_WINDOW 0x300B
#define EGL_BAD_PARAMETER 0x300C
#define EGL_BAD_SURFACE 0x300D
#define EGL_CONTEXT_LOST 0x300E /* EGL 1.1 - IMG_power_management */
/* Reserved 0x300F-0x301F for additional errors */
/* Config attributes */
#define EGL_BUFFER_SIZE 0x3020
#define EGL_ALPHA_SIZE 0x3021
#define EGL_BLUE_SIZE 0x3022
#define EGL_GREEN_SIZE 0x3023
#define EGL_RED_SIZE 0x3024
#define EGL_DEPTH_SIZE 0x3025
#define EGL_STENCIL_SIZE 0x3026
#define EGL_CONFIG_CAVEAT 0x3027
#define EGL_CONFIG_ID 0x3028
#define EGL_LEVEL 0x3029
#define EGL_MAX_PBUFFER_HEIGHT 0x302A
#define EGL_MAX_PBUFFER_PIXELS 0x302B
#define EGL_MAX_PBUFFER_WIDTH 0x302C
#define EGL_NATIVE_RENDERABLE 0x302D
#define EGL_NATIVE_VISUAL_ID 0x302E
#define EGL_NATIVE_VISUAL_TYPE 0x302F
#define EGL_SAMPLES 0x3031
#define EGL_SAMPLE_BUFFERS 0x3032
#define EGL_SURFACE_TYPE 0x3033
#define EGL_TRANSPARENT_TYPE 0x3034
#define EGL_TRANSPARENT_BLUE_VALUE 0x3035
#define EGL_TRANSPARENT_GREEN_VALUE 0x3036
#define EGL_TRANSPARENT_RED_VALUE 0x3037
#define EGL_NONE 0x3038 /* Attrib list terminator */
#define EGL_BIND_TO_TEXTURE_RGB 0x3039
#define EGL_BIND_TO_TEXTURE_RGBA 0x303A
#define EGL_MIN_SWAP_INTERVAL 0x303B
#define EGL_MAX_SWAP_INTERVAL 0x303C
#define EGL_LUMINANCE_SIZE 0x303D
#define EGL_ALPHA_MASK_SIZE 0x303E
#define EGL_COLOR_BUFFER_TYPE 0x303F
#define EGL_RENDERABLE_TYPE 0x3040
#define EGL_MATCH_NATIVE_PIXMAP 0x3041 /* Pseudo-attribute (not queryable) */
#define EGL_CONFORMANT 0x3042
/* Reserved 0x3041-0x304F for additional config attributes */
/* Config attribute values */
#define EGL_SLOW_CONFIG 0x3050 /* EGL_CONFIG_CAVEAT value */
#define EGL_NON_CONFORMANT_CONFIG 0x3051 /* EGL_CONFIG_CAVEAT value */
#define EGL_TRANSPARENT_RGB 0x3052 /* EGL_TRANSPARENT_TYPE value */
#define EGL_RGB_BUFFER 0x308E /* EGL_COLOR_BUFFER_TYPE value */
#define EGL_LUMINANCE_BUFFER 0x308F /* EGL_COLOR_BUFFER_TYPE value */
/* More config attribute values, for EGL_TEXTURE_FORMAT */
#define EGL_NO_TEXTURE 0x305C
#define EGL_TEXTURE_RGB 0x305D
#define EGL_TEXTURE_RGBA 0x305E
#define EGL_TEXTURE_2D 0x305F
/* Config attribute mask bits */
#define EGL_PBUFFER_BIT 0x0001 /* EGL_SURFACE_TYPE mask bits */
#define EGL_PIXMAP_BIT 0x0002 /* EGL_SURFACE_TYPE mask bits */
#define EGL_WINDOW_BIT 0x0004 /* EGL_SURFACE_TYPE mask bits */
#define EGL_VG_COLORSPACE_LINEAR_BIT 0x0020 /* EGL_SURFACE_TYPE mask bits */
#define EGL_VG_ALPHA_FORMAT_PRE_BIT 0x0040 /* EGL_SURFACE_TYPE mask bits */
#define EGL_MULTISAMPLE_RESOLVE_BOX_BIT 0x0200 /* EGL_SURFACE_TYPE mask bits */
#define EGL_SWAP_BEHAVIOR_PRESERVED_BIT 0x0400 /* EGL_SURFACE_TYPE mask bits */
#define EGL_OPENGL_ES_BIT 0x0001 /* EGL_RENDERABLE_TYPE mask bits */
#define EGL_OPENVG_BIT 0x0002 /* EGL_RENDERABLE_TYPE mask bits */
#define EGL_OPENGL_ES2_BIT 0x0004 /* EGL_RENDERABLE_TYPE mask bits */
#define EGL_OPENGL_BIT 0x0008 /* EGL_RENDERABLE_TYPE mask bits */
/* QueryString targets */
#define EGL_VENDOR 0x3053
#define EGL_VERSION 0x3054
#define EGL_EXTENSIONS 0x3055
#define EGL_CLIENT_APIS 0x308D
/* QuerySurface / SurfaceAttrib / CreatePbufferSurface targets */
#define EGL_HEIGHT 0x3056
#define EGL_WIDTH 0x3057
#define EGL_LARGEST_PBUFFER 0x3058
#define EGL_TEXTURE_FORMAT 0x3080
#define EGL_TEXTURE_TARGET 0x3081
#define EGL_MIPMAP_TEXTURE 0x3082
#define EGL_MIPMAP_LEVEL 0x3083
#define EGL_RENDER_BUFFER 0x3086
#define EGL_VG_COLORSPACE 0x3087
#define EGL_VG_ALPHA_FORMAT 0x3088
#define EGL_HORIZONTAL_RESOLUTION 0x3090
#define EGL_VERTICAL_RESOLUTION 0x3091
#define EGL_PIXEL_ASPECT_RATIO 0x3092
#define EGL_SWAP_BEHAVIOR 0x3093
#define EGL_MULTISAMPLE_RESOLVE 0x3099
/* EGL_RENDER_BUFFER values / BindTexImage / ReleaseTexImage buffer targets */
#define EGL_BACK_BUFFER 0x3084
#define EGL_SINGLE_BUFFER 0x3085
/* OpenVG color spaces */
#define EGL_VG_COLORSPACE_sRGB 0x3089 /* EGL_VG_COLORSPACE value */
#define EGL_VG_COLORSPACE_LINEAR 0x308A /* EGL_VG_COLORSPACE value */
/* OpenVG alpha formats */
#define EGL_VG_ALPHA_FORMAT_NONPRE 0x308B /* EGL_ALPHA_FORMAT value */
#define EGL_VG_ALPHA_FORMAT_PRE 0x308C /* EGL_ALPHA_FORMAT value */
/* Constant scale factor by which fractional display resolutions &
* aspect ratio are scaled when queried as integer values.
*/
#define EGL_DISPLAY_SCALING 10000
/* Unknown display resolution/aspect ratio */
#define EGL_UNKNOWN ((EGLint)-1)
/* Back buffer swap behaviors */
#define EGL_BUFFER_PRESERVED 0x3094 /* EGL_SWAP_BEHAVIOR value */
#define EGL_BUFFER_DESTROYED 0x3095 /* EGL_SWAP_BEHAVIOR value */
/* CreatePbufferFromClientBuffer buffer types */
#define EGL_OPENVG_IMAGE 0x3096
/* QueryContext targets */
#define EGL_CONTEXT_CLIENT_TYPE 0x3097
/* CreateContext attributes */
#define EGL_CONTEXT_CLIENT_VERSION 0x3098
/* Multisample resolution behaviors */
#define EGL_MULTISAMPLE_RESOLVE_DEFAULT 0x309A /* EGL_MULTISAMPLE_RESOLVE value */
#define EGL_MULTISAMPLE_RESOLVE_BOX 0x309B /* EGL_MULTISAMPLE_RESOLVE value */
/* BindAPI/QueryAPI targets */
#define EGL_OPENGL_ES_API 0x30A0
#define EGL_OPENVG_API 0x30A1
#define EGL_OPENGL_API 0x30A2
/* GetCurrentSurface targets */
#define EGL_DRAW 0x3059
#define EGL_READ 0x305A
/* WaitNative engines */
#define EGL_CORE_NATIVE_ENGINE 0x305B
/* EGL 1.2 tokens renamed for consistency in EGL 1.3 */
#define EGL_COLORSPACE EGL_VG_COLORSPACE
#define EGL_ALPHA_FORMAT EGL_VG_ALPHA_FORMAT
#define EGL_COLORSPACE_sRGB EGL_VG_COLORSPACE_sRGB
#define EGL_COLORSPACE_LINEAR EGL_VG_COLORSPACE_LINEAR
#define EGL_ALPHA_FORMAT_NONPRE EGL_VG_ALPHA_FORMAT_NONPRE
#define EGL_ALPHA_FORMAT_PRE EGL_VG_ALPHA_FORMAT_PRE
/* EGL extensions must request enum blocks from the Khronos
* API Registrar, who maintains the enumerant registry. Submit
* a bug in Khronos Bugzilla against task "Registry".
*/
/* EGL Functions */
EGLAPI EGLint EGLAPIENTRY eglGetError(void);
EGLAPI EGLDisplay EGLAPIENTRY eglGetDisplay(EGLNativeDisplayType display_id);
EGLAPI EGLBoolean EGLAPIENTRY eglInitialize(EGLDisplay dpy, EGLint *major, EGLint *minor);
EGLAPI EGLBoolean EGLAPIENTRY eglTerminate(EGLDisplay dpy);
EGLAPI const char * EGLAPIENTRY eglQueryString(EGLDisplay dpy, EGLint name);
EGLAPI EGLBoolean EGLAPIENTRY eglGetConfigs(EGLDisplay dpy, EGLConfig *configs,
EGLint config_size, EGLint *num_config);
EGLAPI EGLBoolean EGLAPIENTRY eglChooseConfig(EGLDisplay dpy, const EGLint *attrib_list,
EGLConfig *configs, EGLint config_size,
EGLint *num_config);
EGLAPI EGLBoolean EGLAPIENTRY eglGetConfigAttrib(EGLDisplay dpy, EGLConfig config,
EGLint attribute, EGLint *value);
EGLAPI EGLSurface EGLAPIENTRY eglCreateWindowSurface(EGLDisplay dpy, EGLConfig config,
EGLNativeWindowType win,
const EGLint *attrib_list);
EGLAPI EGLSurface EGLAPIENTRY eglCreatePbufferSurface(EGLDisplay dpy, EGLConfig config,
const EGLint *attrib_list);
EGLAPI EGLSurface EGLAPIENTRY eglCreatePixmapSurface(EGLDisplay dpy, EGLConfig config,
EGLNativePixmapType pixmap,
const EGLint *attrib_list);
EGLAPI EGLBoolean EGLAPIENTRY eglDestroySurface(EGLDisplay dpy, EGLSurface surface);
EGLAPI EGLBoolean EGLAPIENTRY eglQuerySurface(EGLDisplay dpy, EGLSurface surface,
EGLint attribute, EGLint *value);
EGLAPI EGLBoolean EGLAPIENTRY eglBindAPI(EGLenum api);
EGLAPI EGLenum EGLAPIENTRY eglQueryAPI(void);
EGLAPI EGLBoolean EGLAPIENTRY eglWaitClient(void);
EGLAPI EGLBoolean EGLAPIENTRY eglReleaseThread(void);
EGLAPI EGLSurface EGLAPIENTRY eglCreatePbufferFromClientBuffer(
EGLDisplay dpy, EGLenum buftype, EGLClientBuffer buffer,
EGLConfig config, const EGLint *attrib_list);
EGLAPI EGLBoolean EGLAPIENTRY eglSurfaceAttrib(EGLDisplay dpy, EGLSurface surface,
EGLint attribute, EGLint value);
EGLAPI EGLBoolean EGLAPIENTRY eglBindTexImage(EGLDisplay dpy, EGLSurface surface, EGLint buffer);
EGLAPI EGLBoolean EGLAPIENTRY eglReleaseTexImage(EGLDisplay dpy, EGLSurface surface, EGLint buffer);
EGLAPI EGLBoolean EGLAPIENTRY eglSwapInterval(EGLDisplay dpy, EGLint interval);
EGLAPI EGLContext EGLAPIENTRY eglCreateContext(EGLDisplay dpy, EGLConfig config,
EGLContext share_context,
const EGLint *attrib_list);
EGLAPI EGLBoolean EGLAPIENTRY eglDestroyContext(EGLDisplay dpy, EGLContext ctx);
EGLAPI EGLBoolean EGLAPIENTRY eglMakeCurrent(EGLDisplay dpy, EGLSurface draw,
EGLSurface read, EGLContext ctx);
EGLAPI EGLContext EGLAPIENTRY eglGetCurrentContext(void);
EGLAPI EGLSurface EGLAPIENTRY eglGetCurrentSurface(EGLint readdraw);
EGLAPI EGLDisplay EGLAPIENTRY eglGetCurrentDisplay(void);
EGLAPI EGLBoolean EGLAPIENTRY eglQueryContext(EGLDisplay dpy, EGLContext ctx,
EGLint attribute, EGLint *value);
EGLAPI EGLBoolean EGLAPIENTRY eglWaitGL(void);
EGLAPI EGLBoolean EGLAPIENTRY eglWaitNative(EGLint engine);
EGLAPI EGLBoolean EGLAPIENTRY eglSwapBuffers(EGLDisplay dpy, EGLSurface surface);
EGLAPI EGLBoolean EGLAPIENTRY eglCopyBuffers(EGLDisplay dpy, EGLSurface surface,
EGLNativePixmapType target);
/* This is a generic function pointer type, whose name indicates it must
* be cast to the proper type *and calling convention* before use.
*/
typedef void *EGLContext;
typedef void (*__eglMustCastToProperFunctionPointerType)(void);
#define EGL_ALPHA_SIZE 0x3021
#define EGL_BAD_ACCESS 0x3002
#define EGL_BAD_ALLOC 0x3003
#define EGL_BAD_ATTRIBUTE 0x3004
#define EGL_BAD_CONFIG 0x3005
#define EGL_BAD_CONTEXT 0x3006
#define EGL_BAD_CURRENT_SURFACE 0x3007
#define EGL_BAD_DISPLAY 0x3008
#define EGL_BAD_MATCH 0x3009
#define EGL_BAD_NATIVE_PIXMAP 0x300A
#define EGL_BAD_NATIVE_WINDOW 0x300B
#define EGL_BAD_PARAMETER 0x300C
#define EGL_BAD_SURFACE 0x300D
#define EGL_BLUE_SIZE 0x3022
#define EGL_BUFFER_SIZE 0x3020
#define EGL_CONFIG_CAVEAT 0x3027
#define EGL_CONFIG_ID 0x3028
#define EGL_CORE_NATIVE_ENGINE 0x305B
#define EGL_DEPTH_SIZE 0x3025
#define EGL_DONT_CARE ((EGLint)-1)
#define EGL_DRAW 0x3059
#define EGL_EXTENSIONS 0x3055
#define EGL_FALSE 0
#define EGL_GREEN_SIZE 0x3023
#define EGL_HEIGHT 0x3056
#define EGL_LARGEST_PBUFFER 0x3058
#define EGL_LEVEL 0x3029
#define EGL_MAX_PBUFFER_HEIGHT 0x302A
#define EGL_MAX_PBUFFER_PIXELS 0x302B
#define EGL_MAX_PBUFFER_WIDTH 0x302C
#define EGL_NATIVE_RENDERABLE 0x302D
#define EGL_NATIVE_VISUAL_ID 0x302E
#define EGL_NATIVE_VISUAL_TYPE 0x302F
#define EGL_NONE 0x3038
#define EGL_NON_CONFORMANT_CONFIG 0x3051
#define EGL_NOT_INITIALIZED 0x3001
#define EGL_NO_CONTEXT ((EGLContext)0)
#define EGL_NO_DISPLAY ((EGLDisplay)0)
#define EGL_NO_SURFACE ((EGLSurface)0)
#define EGL_PBUFFER_BIT 0x0001
#define EGL_PIXMAP_BIT 0x0002
#define EGL_READ 0x305A
#define EGL_RED_SIZE 0x3024
#define EGL_SAMPLES 0x3031
#define EGL_SAMPLE_BUFFERS 0x3032
#define EGL_SLOW_CONFIG 0x3050
#define EGL_STENCIL_SIZE 0x3026
#define EGL_SUCCESS 0x3000
#define EGL_SURFACE_TYPE 0x3033
#define EGL_TRANSPARENT_BLUE_VALUE 0x3035
#define EGL_TRANSPARENT_GREEN_VALUE 0x3036
#define EGL_TRANSPARENT_RED_VALUE 0x3037
#define EGL_TRANSPARENT_RGB 0x3052
#define EGL_TRANSPARENT_TYPE 0x3034
#define EGL_TRUE 1
#define EGL_VENDOR 0x3053
#define EGL_VERSION 0x3054
#define EGL_WIDTH 0x3057
#define EGL_WINDOW_BIT 0x0004
EGLAPI EGLBoolean EGLAPIENTRY eglChooseConfig (EGLDisplay dpy, const EGLint *attrib_list, EGLConfig *configs, EGLint config_size, EGLint *num_config);
EGLAPI EGLBoolean EGLAPIENTRY eglCopyBuffers (EGLDisplay dpy, EGLSurface surface, EGLNativePixmapType target);
EGLAPI EGLContext EGLAPIENTRY eglCreateContext (EGLDisplay dpy, EGLConfig config, EGLContext share_context, const EGLint *attrib_list);
EGLAPI EGLSurface EGLAPIENTRY eglCreatePbufferSurface (EGLDisplay dpy, EGLConfig config, const EGLint *attrib_list);
EGLAPI EGLSurface EGLAPIENTRY eglCreatePixmapSurface (EGLDisplay dpy, EGLConfig config, EGLNativePixmapType pixmap, const EGLint *attrib_list);
EGLAPI EGLSurface EGLAPIENTRY eglCreateWindowSurface (EGLDisplay dpy, EGLConfig config, EGLNativeWindowType win, const EGLint *attrib_list);
EGLAPI EGLBoolean EGLAPIENTRY eglDestroyContext (EGLDisplay dpy, EGLContext ctx);
EGLAPI EGLBoolean EGLAPIENTRY eglDestroySurface (EGLDisplay dpy, EGLSurface surface);
EGLAPI EGLBoolean EGLAPIENTRY eglGetConfigAttrib (EGLDisplay dpy, EGLConfig config, EGLint attribute, EGLint *value);
EGLAPI EGLBoolean EGLAPIENTRY eglGetConfigs (EGLDisplay dpy, EGLConfig *configs, EGLint config_size, EGLint *num_config);
EGLAPI EGLDisplay EGLAPIENTRY eglGetCurrentDisplay (void);
EGLAPI EGLSurface EGLAPIENTRY eglGetCurrentSurface (EGLint readdraw);
EGLAPI EGLDisplay EGLAPIENTRY eglGetDisplay (EGLNativeDisplayType display_id);
EGLAPI EGLint EGLAPIENTRY eglGetError (void);
EGLAPI __eglMustCastToProperFunctionPointerType EGLAPIENTRY eglGetProcAddress (const char *procname);
EGLAPI EGLBoolean EGLAPIENTRY eglInitialize (EGLDisplay dpy, EGLint *major, EGLint *minor);
EGLAPI EGLBoolean EGLAPIENTRY eglMakeCurrent (EGLDisplay dpy, EGLSurface draw, EGLSurface read, EGLContext ctx);
EGLAPI EGLBoolean EGLAPIENTRY eglQueryContext (EGLDisplay dpy, EGLContext ctx, EGLint attribute, EGLint *value);
EGLAPI const char *EGLAPIENTRY eglQueryString (EGLDisplay dpy, EGLint name);
EGLAPI EGLBoolean EGLAPIENTRY eglQuerySurface (EGLDisplay dpy, EGLSurface surface, EGLint attribute, EGLint *value);
EGLAPI EGLBoolean EGLAPIENTRY eglSwapBuffers (EGLDisplay dpy, EGLSurface surface);
EGLAPI EGLBoolean EGLAPIENTRY eglTerminate (EGLDisplay dpy);
EGLAPI EGLBoolean EGLAPIENTRY eglWaitGL (void);
EGLAPI EGLBoolean EGLAPIENTRY eglWaitNative (EGLint engine);
#endif /* EGL_VERSION_1_0 */
/* Now, define eglGetProcAddress using the generic function ptr. type */
EGLAPI __eglMustCastToProperFunctionPointerType EGLAPIENTRY
eglGetProcAddress(const char *procname);
#ifndef EGL_VERSION_1_1
#define EGL_VERSION_1_1 1
#define EGL_BACK_BUFFER 0x3084
#define EGL_BIND_TO_TEXTURE_RGB 0x3039
#define EGL_BIND_TO_TEXTURE_RGBA 0x303A
#define EGL_CONTEXT_LOST 0x300E
#define EGL_MIN_SWAP_INTERVAL 0x303B
#define EGL_MAX_SWAP_INTERVAL 0x303C
#define EGL_MIPMAP_TEXTURE 0x3082
#define EGL_MIPMAP_LEVEL 0x3083
#define EGL_NO_TEXTURE 0x305C
#define EGL_TEXTURE_2D 0x305F
#define EGL_TEXTURE_FORMAT 0x3080
#define EGL_TEXTURE_RGB 0x305D
#define EGL_TEXTURE_RGBA 0x305E
#define EGL_TEXTURE_TARGET 0x3081
EGLAPI EGLBoolean EGLAPIENTRY eglBindTexImage (EGLDisplay dpy, EGLSurface surface, EGLint buffer);
EGLAPI EGLBoolean EGLAPIENTRY eglReleaseTexImage (EGLDisplay dpy, EGLSurface surface, EGLint buffer);
EGLAPI EGLBoolean EGLAPIENTRY eglSurfaceAttrib (EGLDisplay dpy, EGLSurface surface, EGLint attribute, EGLint value);
EGLAPI EGLBoolean EGLAPIENTRY eglSwapInterval (EGLDisplay dpy, EGLint interval);
#endif /* EGL_VERSION_1_1 */
#ifndef EGL_VERSION_1_2
#define EGL_VERSION_1_2 1
typedef unsigned int EGLenum;
typedef void *EGLClientBuffer;
#define EGL_ALPHA_FORMAT 0x3088
#define EGL_ALPHA_FORMAT_NONPRE 0x308B
#define EGL_ALPHA_FORMAT_PRE 0x308C
#define EGL_ALPHA_MASK_SIZE 0x303E
#define EGL_BUFFER_PRESERVED 0x3094
#define EGL_BUFFER_DESTROYED 0x3095
#define EGL_CLIENT_APIS 0x308D
#define EGL_COLORSPACE 0x3087
#define EGL_COLORSPACE_sRGB 0x3089
#define EGL_COLORSPACE_LINEAR 0x308A
#define EGL_COLOR_BUFFER_TYPE 0x303F
#define EGL_CONTEXT_CLIENT_TYPE 0x3097
#define EGL_DISPLAY_SCALING 10000
#define EGL_HORIZONTAL_RESOLUTION 0x3090
#define EGL_LUMINANCE_BUFFER 0x308F
#define EGL_LUMINANCE_SIZE 0x303D
#define EGL_OPENGL_ES_BIT 0x0001
#define EGL_OPENVG_BIT 0x0002
#define EGL_OPENGL_ES_API 0x30A0
#define EGL_OPENVG_API 0x30A1
#define EGL_OPENVG_IMAGE 0x3096
#define EGL_PIXEL_ASPECT_RATIO 0x3092
#define EGL_RENDERABLE_TYPE 0x3040
#define EGL_RENDER_BUFFER 0x3086
#define EGL_RGB_BUFFER 0x308E
#define EGL_SINGLE_BUFFER 0x3085
#define EGL_SWAP_BEHAVIOR 0x3093
#define EGL_UNKNOWN ((EGLint)-1)
#define EGL_VERTICAL_RESOLUTION 0x3091
EGLAPI EGLBoolean EGLAPIENTRY eglBindAPI (EGLenum api);
EGLAPI EGLenum EGLAPIENTRY eglQueryAPI (void);
EGLAPI EGLSurface EGLAPIENTRY eglCreatePbufferFromClientBuffer (EGLDisplay dpy, EGLenum buftype, EGLClientBuffer buffer, EGLConfig config, const EGLint *attrib_list);
EGLAPI EGLBoolean EGLAPIENTRY eglReleaseThread (void);
EGLAPI EGLBoolean EGLAPIENTRY eglWaitClient (void);
#endif /* EGL_VERSION_1_2 */
#ifndef EGL_VERSION_1_3
#define EGL_VERSION_1_3 1
#define EGL_CONFORMANT 0x3042
#define EGL_CONTEXT_CLIENT_VERSION 0x3098
#define EGL_MATCH_NATIVE_PIXMAP 0x3041
#define EGL_OPENGL_ES2_BIT 0x0004
#define EGL_VG_ALPHA_FORMAT 0x3088
#define EGL_VG_ALPHA_FORMAT_NONPRE 0x308B
#define EGL_VG_ALPHA_FORMAT_PRE 0x308C
#define EGL_VG_ALPHA_FORMAT_PRE_BIT 0x0040
#define EGL_VG_COLORSPACE 0x3087
#define EGL_VG_COLORSPACE_sRGB 0x3089
#define EGL_VG_COLORSPACE_LINEAR 0x308A
#define EGL_VG_COLORSPACE_LINEAR_BIT 0x0020
#endif /* EGL_VERSION_1_3 */
#ifndef EGL_VERSION_1_4
#define EGL_VERSION_1_4 1
#define EGL_DEFAULT_DISPLAY ((EGLNativeDisplayType)0)
#define EGL_MULTISAMPLE_RESOLVE_BOX_BIT 0x0200
#define EGL_MULTISAMPLE_RESOLVE 0x3099
#define EGL_MULTISAMPLE_RESOLVE_DEFAULT 0x309A
#define EGL_MULTISAMPLE_RESOLVE_BOX 0x309B
#define EGL_OPENGL_API 0x30A2
#define EGL_OPENGL_BIT 0x0008
#define EGL_SWAP_BEHAVIOR_PRESERVED_BIT 0x0400
EGLAPI EGLContext EGLAPIENTRY eglGetCurrentContext (void);
#endif /* EGL_VERSION_1_4 */
#ifndef EGL_VERSION_1_5
#define EGL_VERSION_1_5 1
typedef void *EGLSync;
typedef intptr_t EGLAttrib;
typedef khronos_utime_nanoseconds_t EGLTime;
typedef void *EGLImage;
#define EGL_CONTEXT_MAJOR_VERSION 0x3098
#define EGL_CONTEXT_MINOR_VERSION 0x30FB
#define EGL_CONTEXT_OPENGL_PROFILE_MASK 0x30FD
#define EGL_CONTEXT_OPENGL_RESET_NOTIFICATION_STRATEGY 0x31BD
#define EGL_NO_RESET_NOTIFICATION 0x31BE
#define EGL_LOSE_CONTEXT_ON_RESET 0x31BF
#define EGL_CONTEXT_OPENGL_CORE_PROFILE_BIT 0x00000001
#define EGL_CONTEXT_OPENGL_COMPATIBILITY_PROFILE_BIT 0x00000002
#define EGL_CONTEXT_OPENGL_DEBUG 0x31B0
#define EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE 0x31B1
#define EGL_CONTEXT_OPENGL_ROBUST_ACCESS 0x31B2
#define EGL_OPENGL_ES3_BIT 0x00000040
#define EGL_CL_EVENT_HANDLE 0x309C
#define EGL_SYNC_CL_EVENT 0x30FE
#define EGL_SYNC_CL_EVENT_COMPLETE 0x30FF
#define EGL_SYNC_PRIOR_COMMANDS_COMPLETE 0x30F0
#define EGL_SYNC_TYPE 0x30F7
#define EGL_SYNC_STATUS 0x30F1
#define EGL_SYNC_CONDITION 0x30F8
#define EGL_SIGNALED 0x30F2
#define EGL_UNSIGNALED 0x30F3
#define EGL_SYNC_FLUSH_COMMANDS_BIT 0x0001
#define EGL_FOREVER 0xFFFFFFFFFFFFFFFFull
#define EGL_TIMEOUT_EXPIRED 0x30F5
#define EGL_CONDITION_SATISFIED 0x30F6
#define EGL_NO_SYNC ((EGLSync)0)
#define EGL_SYNC_FENCE 0x30F9
#define EGL_GL_COLORSPACE 0x309D
#define EGL_GL_COLORSPACE_SRGB 0x3089
#define EGL_GL_COLORSPACE_LINEAR 0x308A
#define EGL_GL_RENDERBUFFER 0x30B9
#define EGL_GL_TEXTURE_2D 0x30B1
#define EGL_GL_TEXTURE_LEVEL 0x30BC
#define EGL_GL_TEXTURE_3D 0x30B2
#define EGL_GL_TEXTURE_ZOFFSET 0x30BD
#define EGL_GL_TEXTURE_CUBE_MAP_POSITIVE_X 0x30B3
#define EGL_GL_TEXTURE_CUBE_MAP_NEGATIVE_X 0x30B4
#define EGL_GL_TEXTURE_CUBE_MAP_POSITIVE_Y 0x30B5
#define EGL_GL_TEXTURE_CUBE_MAP_NEGATIVE_Y 0x30B6
#define EGL_GL_TEXTURE_CUBE_MAP_POSITIVE_Z 0x30B7
#define EGL_GL_TEXTURE_CUBE_MAP_NEGATIVE_Z 0x30B8
#define EGL_IMAGE_PRESERVED 0x30D2
#define EGL_NO_IMAGE ((EGLImage)0)
EGLAPI EGLSync EGLAPIENTRY eglCreateSync (EGLDisplay dpy, EGLenum type, const EGLAttrib *attrib_list);
EGLAPI EGLBoolean EGLAPIENTRY eglDestroySync (EGLDisplay dpy, EGLSync sync);
EGLAPI EGLint EGLAPIENTRY eglClientWaitSync (EGLDisplay dpy, EGLSync sync, EGLint flags, EGLTime timeout);
EGLAPI EGLBoolean EGLAPIENTRY eglGetSyncAttrib (EGLDisplay dpy, EGLSync sync, EGLint attribute, EGLAttrib *value);
EGLAPI EGLImage EGLAPIENTRY eglCreateImage (EGLDisplay dpy, EGLContext ctx, EGLenum target, EGLClientBuffer buffer, const EGLAttrib *attrib_list);
EGLAPI EGLBoolean EGLAPIENTRY eglDestroyImage (EGLDisplay dpy, EGLImage image);
EGLAPI EGLDisplay EGLAPIENTRY eglGetPlatformDisplay (EGLenum platform, void *native_display, const EGLAttrib *attrib_list);
EGLAPI EGLSurface EGLAPIENTRY eglCreatePlatformWindowSurface (EGLDisplay dpy, EGLConfig config, void *native_window, const EGLAttrib *attrib_list);
EGLAPI EGLSurface EGLAPIENTRY eglCreatePlatformPixmapSurface (EGLDisplay dpy, EGLConfig config, void *native_pixmap, const EGLAttrib *attrib_list);
EGLAPI EGLBoolean EGLAPIENTRY eglWaitSync (EGLDisplay dpy, EGLSync sync, EGLint flags);
#endif /* EGL_VERSION_1_5 */
#ifdef __cplusplus
}
#endif
#endif /* __egl_h_ */
#endif


@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2013 The Khronos Group Inc.
** Copyright (c) 2013-2014 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
@@ -33,12 +33,12 @@ extern "C" {
** used to make the header, and the header can be found at
** http://www.opengl.org/registry/
**
** Khronos $Revision: 24567 $ on $Date: 2013-12-18 09:50:17 -0800 (Wed, 18 Dec 2013) $
** Khronos $Revision$ on $Date$
*/
#include <EGL/eglplatform.h>
#define EGL_EGLEXT_VERSION 20131218
#define EGL_EGLEXT_VERSION 20150508
/* Generated C header for:
* API: egl
@@ -94,12 +94,28 @@ EGLAPI EGLSyncKHR EGLAPIENTRY eglCreateSync64KHR (EGLDisplay dpy, EGLenum type,
#define EGL_OPENGL_ES3_BIT_KHR 0x00000040
#endif /* EGL_KHR_create_context */
#ifndef EGL_KHR_create_context_no_error
#define EGL_KHR_create_context_no_error 1
#define EGL_CONTEXT_OPENGL_NO_ERROR_KHR 0x31B3
#endif /* EGL_KHR_create_context_no_error */
#ifndef EGL_KHR_fence_sync
#define EGL_KHR_fence_sync 1
typedef khronos_utime_nanoseconds_t EGLTimeKHR;
#ifdef KHRONOS_SUPPORT_INT64
#define EGL_SYNC_PRIOR_COMMANDS_COMPLETE_KHR 0x30F0
#define EGL_SYNC_CONDITION_KHR 0x30F8
#define EGL_SYNC_FENCE_KHR 0x30F9
typedef EGLSyncKHR (EGLAPIENTRYP PFNEGLCREATESYNCKHRPROC) (EGLDisplay dpy, EGLenum type, const EGLint *attrib_list);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLDESTROYSYNCKHRPROC) (EGLDisplay dpy, EGLSyncKHR sync);
typedef EGLint (EGLAPIENTRYP PFNEGLCLIENTWAITSYNCKHRPROC) (EGLDisplay dpy, EGLSyncKHR sync, EGLint flags, EGLTimeKHR timeout);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLGETSYNCATTRIBKHRPROC) (EGLDisplay dpy, EGLSyncKHR sync, EGLint attribute, EGLint *value);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLSyncKHR EGLAPIENTRY eglCreateSyncKHR (EGLDisplay dpy, EGLenum type, const EGLint *attrib_list);
EGLAPI EGLBoolean EGLAPIENTRY eglDestroySyncKHR (EGLDisplay dpy, EGLSyncKHR sync);
EGLAPI EGLint EGLAPIENTRY eglClientWaitSyncKHR (EGLDisplay dpy, EGLSyncKHR sync, EGLint flags, EGLTimeKHR timeout);
EGLAPI EGLBoolean EGLAPIENTRY eglGetSyncAttribKHR (EGLDisplay dpy, EGLSyncKHR sync, EGLint attribute, EGLint *value);
#endif
#endif /* KHRONOS_SUPPORT_INT64 */
#endif /* EGL_KHR_fence_sync */
@@ -207,9 +223,38 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQuerySurface64KHR (EGLDisplay dpy, EGLSurface s
#endif
#endif /* EGL_KHR_lock_surface3 */
#ifndef EGL_KHR_partial_update
#define EGL_KHR_partial_update 1
#define EGL_BUFFER_AGE_KHR 0x313D
typedef EGLBoolean (EGLAPIENTRYP PFNEGLSETDAMAGEREGIONKHRPROC) (EGLDisplay dpy, EGLSurface surface, EGLint *rects, EGLint n_rects);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglSetDamageRegionKHR (EGLDisplay dpy, EGLSurface surface, EGLint *rects, EGLint n_rects);
#endif
#endif /* EGL_KHR_partial_update */
#ifndef EGL_KHR_platform_android
#define EGL_KHR_platform_android 1
#define EGL_PLATFORM_ANDROID_KHR 0x3141
#endif /* EGL_KHR_platform_android */
#ifndef EGL_KHR_platform_gbm
#define EGL_KHR_platform_gbm 1
#define EGL_PLATFORM_GBM_KHR 0x31D7
#endif /* EGL_KHR_platform_gbm */
#ifndef EGL_KHR_platform_wayland
#define EGL_KHR_platform_wayland 1
#define EGL_PLATFORM_WAYLAND_KHR 0x31D8
#endif /* EGL_KHR_platform_wayland */
#ifndef EGL_KHR_platform_x11
#define EGL_KHR_platform_x11 1
#define EGL_PLATFORM_X11_KHR 0x31D5
#define EGL_PLATFORM_X11_SCREEN_KHR 0x31D6
#endif /* EGL_KHR_platform_x11 */
#ifndef EGL_KHR_reusable_sync
#define EGL_KHR_reusable_sync 1
typedef khronos_utime_nanoseconds_t EGLTimeKHR;
#ifdef KHRONOS_SUPPORT_INT64
#define EGL_SYNC_STATUS_KHR 0x30F1
#define EGL_SIGNALED_KHR 0x30F2
@@ -221,17 +266,9 @@ typedef khronos_utime_nanoseconds_t EGLTimeKHR;
#define EGL_SYNC_FLUSH_COMMANDS_BIT_KHR 0x0001
#define EGL_FOREVER_KHR 0xFFFFFFFFFFFFFFFFull
#define EGL_NO_SYNC_KHR ((EGLSyncKHR)0)
typedef EGLSyncKHR (EGLAPIENTRYP PFNEGLCREATESYNCKHRPROC) (EGLDisplay dpy, EGLenum type, const EGLint *attrib_list);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLDESTROYSYNCKHRPROC) (EGLDisplay dpy, EGLSyncKHR sync);
typedef EGLint (EGLAPIENTRYP PFNEGLCLIENTWAITSYNCKHRPROC) (EGLDisplay dpy, EGLSyncKHR sync, EGLint flags, EGLTimeKHR timeout);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLSIGNALSYNCKHRPROC) (EGLDisplay dpy, EGLSyncKHR sync, EGLenum mode);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLGETSYNCATTRIBKHRPROC) (EGLDisplay dpy, EGLSyncKHR sync, EGLint attribute, EGLint *value);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLSyncKHR EGLAPIENTRY eglCreateSyncKHR (EGLDisplay dpy, EGLenum type, const EGLint *attrib_list);
EGLAPI EGLBoolean EGLAPIENTRY eglDestroySyncKHR (EGLDisplay dpy, EGLSyncKHR sync);
EGLAPI EGLint EGLAPIENTRY eglClientWaitSyncKHR (EGLDisplay dpy, EGLSyncKHR sync, EGLint flags, EGLTimeKHR timeout);
EGLAPI EGLBoolean EGLAPIENTRY eglSignalSyncKHR (EGLDisplay dpy, EGLSyncKHR sync, EGLenum mode);
EGLAPI EGLBoolean EGLAPIENTRY eglGetSyncAttribKHR (EGLDisplay dpy, EGLSyncKHR sync, EGLint attribute, EGLint *value);
#endif
#endif /* KHRONOS_SUPPORT_INT64 */
#endif /* EGL_KHR_reusable_sync */
@@ -333,6 +370,14 @@ EGLAPI EGLSurface EGLAPIENTRY eglCreateStreamProducerSurfaceKHR (EGLDisplay dpy,
#define EGL_KHR_surfaceless_context 1
#endif /* EGL_KHR_surfaceless_context */
#ifndef EGL_KHR_swap_buffers_with_damage
#define EGL_KHR_swap_buffers_with_damage 1
typedef EGLBoolean (EGLAPIENTRYP PFNEGLSWAPBUFFERSWITHDAMAGEKHRPROC) (EGLDisplay dpy, EGLSurface surface, EGLint *rects, EGLint n_rects);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglSwapBuffersWithDamageKHR (EGLDisplay dpy, EGLSurface surface, EGLint *rects, EGLint n_rects);
#endif
#endif /* EGL_KHR_swap_buffers_with_damage */
#ifndef EGL_KHR_vg_parent_image
#define EGL_KHR_vg_parent_image 1
#define EGL_VG_PARENT_IMAGE_KHR 0x30BA
@@ -389,6 +434,12 @@ EGLAPI EGLint EGLAPIENTRY eglDupNativeFenceFDANDROID (EGLDisplay dpy, EGLSyncKHR
#define EGL_D3D_TEXTURE_2D_SHARE_HANDLE_ANGLE 0x3200
#endif /* EGL_ANGLE_d3d_share_handle_client_buffer */
#ifndef EGL_ANGLE_device_d3d
#define EGL_ANGLE_device_d3d 1
#define EGL_D3D9_DEVICE_ANGLE 0x33A0
#define EGL_D3D11_DEVICE_ANGLE 0x33A1
#endif /* EGL_ANGLE_device_d3d */
#ifndef EGL_ANGLE_query_surface_pointer
#define EGL_ANGLE_query_surface_pointer 1
typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYSURFACEPOINTERANGLEPROC) (EGLDisplay dpy, EGLSurface surface, EGLint attribute, void **value);
@@ -401,6 +452,11 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQuerySurfacePointerANGLE (EGLDisplay dpy, EGLSu
#define EGL_ANGLE_surface_d3d_texture_2d_share_handle 1
#endif /* EGL_ANGLE_surface_d3d_texture_2d_share_handle */
#ifndef EGL_ANGLE_window_fixed_size
#define EGL_ANGLE_window_fixed_size 1
#define EGL_FIXED_SIZE_ANGLE 0x3201
#endif /* EGL_ANGLE_window_fixed_size */
#ifndef EGL_ARM_pixmap_multisample_discard
#define EGL_ARM_pixmap_multisample_discard 1
#define EGL_DISCARD_SAMPLES_ARM 0x3286
@@ -423,6 +479,42 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQuerySurfacePointerANGLE (EGLDisplay dpy, EGLSu
#define EGL_LOSE_CONTEXT_ON_RESET_EXT 0x31BF
#endif /* EGL_EXT_create_context_robustness */
#ifndef EGL_EXT_device_base
#define EGL_EXT_device_base 1
typedef void *EGLDeviceEXT;
#define EGL_NO_DEVICE_EXT ((EGLDeviceEXT)(0))
#define EGL_BAD_DEVICE_EXT 0x322B
#define EGL_DEVICE_EXT 0x322C
typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYDEVICEATTRIBEXTPROC) (EGLDeviceEXT device, EGLint attribute, EGLAttrib *value);
typedef const char *(EGLAPIENTRYP PFNEGLQUERYDEVICESTRINGEXTPROC) (EGLDeviceEXT device, EGLint name);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYDEVICESEXTPROC) (EGLint max_devices, EGLDeviceEXT *devices, EGLint *num_devices);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYDISPLAYATTRIBEXTPROC) (EGLDisplay dpy, EGLint attribute, EGLAttrib *value);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglQueryDeviceAttribEXT (EGLDeviceEXT device, EGLint attribute, EGLAttrib *value);
EGLAPI const char *EGLAPIENTRY eglQueryDeviceStringEXT (EGLDeviceEXT device, EGLint name);
EGLAPI EGLBoolean EGLAPIENTRY eglQueryDevicesEXT (EGLint max_devices, EGLDeviceEXT *devices, EGLint *num_devices);
EGLAPI EGLBoolean EGLAPIENTRY eglQueryDisplayAttribEXT (EGLDisplay dpy, EGLint attribute, EGLAttrib *value);
#endif
#endif /* EGL_EXT_device_base */
#ifndef EGL_EXT_device_drm
#define EGL_EXT_device_drm 1
#define EGL_DRM_DEVICE_FILE_EXT 0x3233
#endif /* EGL_EXT_device_drm */
#ifndef EGL_EXT_device_enumeration
#define EGL_EXT_device_enumeration 1
#endif /* EGL_EXT_device_enumeration */
#ifndef EGL_EXT_device_openwf
#define EGL_EXT_device_openwf 1
#define EGL_OPENWF_DEVICE_ID_EXT 0x3237
#endif /* EGL_EXT_device_openwf */
#ifndef EGL_EXT_device_query
#define EGL_EXT_device_query 1
#endif /* EGL_EXT_device_query */
#ifndef EGL_EXT_image_dma_buf_import
#define EGL_EXT_image_dma_buf_import 1
#define EGL_LINUX_DMA_BUF_EXT 0x3270
@@ -454,6 +546,48 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQuerySurfacePointerANGLE (EGLDisplay dpy, EGLSu
#define EGL_MULTIVIEW_VIEW_COUNT_EXT 0x3134
#endif /* EGL_EXT_multiview_window */
#ifndef EGL_EXT_output_base
#define EGL_EXT_output_base 1
typedef void *EGLOutputLayerEXT;
typedef void *EGLOutputPortEXT;
#define EGL_NO_OUTPUT_LAYER_EXT ((EGLOutputLayerEXT)0)
#define EGL_NO_OUTPUT_PORT_EXT ((EGLOutputPortEXT)0)
#define EGL_BAD_OUTPUT_LAYER_EXT 0x322D
#define EGL_BAD_OUTPUT_PORT_EXT 0x322E
#define EGL_SWAP_INTERVAL_EXT 0x322F
typedef EGLBoolean (EGLAPIENTRYP PFNEGLGETOUTPUTLAYERSEXTPROC) (EGLDisplay dpy, const EGLAttrib *attrib_list, EGLOutputLayerEXT *layers, EGLint max_layers, EGLint *num_layers);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLGETOUTPUTPORTSEXTPROC) (EGLDisplay dpy, const EGLAttrib *attrib_list, EGLOutputPortEXT *ports, EGLint max_ports, EGLint *num_ports);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLOUTPUTLAYERATTRIBEXTPROC) (EGLDisplay dpy, EGLOutputLayerEXT layer, EGLint attribute, EGLAttrib value);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYOUTPUTLAYERATTRIBEXTPROC) (EGLDisplay dpy, EGLOutputLayerEXT layer, EGLint attribute, EGLAttrib *value);
typedef const char *(EGLAPIENTRYP PFNEGLQUERYOUTPUTLAYERSTRINGEXTPROC) (EGLDisplay dpy, EGLOutputLayerEXT layer, EGLint name);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLOUTPUTPORTATTRIBEXTPROC) (EGLDisplay dpy, EGLOutputPortEXT port, EGLint attribute, EGLAttrib value);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYOUTPUTPORTATTRIBEXTPROC) (EGLDisplay dpy, EGLOutputPortEXT port, EGLint attribute, EGLAttrib *value);
typedef const char *(EGLAPIENTRYP PFNEGLQUERYOUTPUTPORTSTRINGEXTPROC) (EGLDisplay dpy, EGLOutputPortEXT port, EGLint name);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglGetOutputLayersEXT (EGLDisplay dpy, const EGLAttrib *attrib_list, EGLOutputLayerEXT *layers, EGLint max_layers, EGLint *num_layers);
EGLAPI EGLBoolean EGLAPIENTRY eglGetOutputPortsEXT (EGLDisplay dpy, const EGLAttrib *attrib_list, EGLOutputPortEXT *ports, EGLint max_ports, EGLint *num_ports);
EGLAPI EGLBoolean EGLAPIENTRY eglOutputLayerAttribEXT (EGLDisplay dpy, EGLOutputLayerEXT layer, EGLint attribute, EGLAttrib value);
EGLAPI EGLBoolean EGLAPIENTRY eglQueryOutputLayerAttribEXT (EGLDisplay dpy, EGLOutputLayerEXT layer, EGLint attribute, EGLAttrib *value);
EGLAPI const char *EGLAPIENTRY eglQueryOutputLayerStringEXT (EGLDisplay dpy, EGLOutputLayerEXT layer, EGLint name);
EGLAPI EGLBoolean EGLAPIENTRY eglOutputPortAttribEXT (EGLDisplay dpy, EGLOutputPortEXT port, EGLint attribute, EGLAttrib value);
EGLAPI EGLBoolean EGLAPIENTRY eglQueryOutputPortAttribEXT (EGLDisplay dpy, EGLOutputPortEXT port, EGLint attribute, EGLAttrib *value);
EGLAPI const char *EGLAPIENTRY eglQueryOutputPortStringEXT (EGLDisplay dpy, EGLOutputPortEXT port, EGLint name);
#endif
#endif /* EGL_EXT_output_base */
#ifndef EGL_EXT_output_drm
#define EGL_EXT_output_drm 1
#define EGL_DRM_CRTC_EXT 0x3234
#define EGL_DRM_PLANE_EXT 0x3235
#define EGL_DRM_CONNECTOR_EXT 0x3236
#endif /* EGL_EXT_output_drm */
#ifndef EGL_EXT_output_openwf
#define EGL_EXT_output_openwf 1
#define EGL_OPENWF_PIPELINE_ID_EXT 0x3238
#define EGL_OPENWF_PORT_ID_EXT 0x3239
#endif /* EGL_EXT_output_openwf */
#ifndef EGL_EXT_platform_base
#define EGL_EXT_platform_base 1
typedef EGLDisplay (EGLAPIENTRYP PFNEGLGETPLATFORMDISPLAYEXTPROC) (EGLenum platform, void *native_display, const EGLint *attrib_list);
@@ -466,6 +600,11 @@ EGLAPI EGLSurface EGLAPIENTRY eglCreatePlatformPixmapSurfaceEXT (EGLDisplay dpy,
#endif
#endif /* EGL_EXT_platform_base */
#ifndef EGL_EXT_platform_device
#define EGL_EXT_platform_device 1
#define EGL_PLATFORM_DEVICE_EXT 0x313F
#endif /* EGL_EXT_platform_device */
#ifndef EGL_EXT_platform_wayland
#define EGL_EXT_platform_wayland 1
#define EGL_PLATFORM_WAYLAND_EXT 0x31D8
@@ -477,6 +616,19 @@ EGLAPI EGLSurface EGLAPIENTRY eglCreatePlatformPixmapSurfaceEXT (EGLDisplay dpy,
#define EGL_PLATFORM_X11_SCREEN_EXT 0x31D6
#endif /* EGL_EXT_platform_x11 */
#ifndef EGL_EXT_protected_surface
#define EGL_EXT_protected_surface 1
#define EGL_PROTECTED_CONTENT_EXT 0x32C0
#endif /* EGL_EXT_protected_surface */
#ifndef EGL_EXT_stream_consumer_egloutput
#define EGL_EXT_stream_consumer_egloutput 1
typedef EGLBoolean (EGLAPIENTRYP PFNEGLSTREAMCONSUMEROUTPUTEXTPROC) (EGLDisplay dpy, EGLStreamKHR stream, EGLOutputLayerEXT layer);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglStreamConsumerOutputEXT (EGLDisplay dpy, EGLStreamKHR stream, EGLOutputLayerEXT layer);
#endif
#endif /* EGL_EXT_stream_consumer_egloutput */
#ifndef EGL_EXT_swap_buffers_with_damage
#define EGL_EXT_swap_buffers_with_damage 1
typedef EGLBoolean (EGLAPIENTRYP PFNEGLSWAPBUFFERSWITHDAMAGEEXTPROC) (EGLDisplay dpy, EGLSurface surface, EGLint *rects, EGLint n_rects);
@@ -485,6 +637,35 @@ EGLAPI EGLBoolean EGLAPIENTRY eglSwapBuffersWithDamageEXT (EGLDisplay dpy, EGLSu
#endif
#endif /* EGL_EXT_swap_buffers_with_damage */
#ifndef EGL_EXT_yuv_surface
#define EGL_EXT_yuv_surface 1
#define EGL_YUV_ORDER_EXT 0x3301
#define EGL_YUV_NUMBER_OF_PLANES_EXT 0x3311
#define EGL_YUV_SUBSAMPLE_EXT 0x3312
#define EGL_YUV_DEPTH_RANGE_EXT 0x3317
#define EGL_YUV_CSC_STANDARD_EXT 0x330A
#define EGL_YUV_PLANE_BPP_EXT 0x331A
#define EGL_YUV_BUFFER_EXT 0x3300
#define EGL_YUV_ORDER_YUV_EXT 0x3302
#define EGL_YUV_ORDER_YVU_EXT 0x3303
#define EGL_YUV_ORDER_YUYV_EXT 0x3304
#define EGL_YUV_ORDER_UYVY_EXT 0x3305
#define EGL_YUV_ORDER_YVYU_EXT 0x3306
#define EGL_YUV_ORDER_VYUY_EXT 0x3307
#define EGL_YUV_ORDER_AYUV_EXT 0x3308
#define EGL_YUV_SUBSAMPLE_4_2_0_EXT 0x3313
#define EGL_YUV_SUBSAMPLE_4_2_2_EXT 0x3314
#define EGL_YUV_SUBSAMPLE_4_4_4_EXT 0x3315
#define EGL_YUV_DEPTH_RANGE_LIMITED_EXT 0x3318
#define EGL_YUV_DEPTH_RANGE_FULL_EXT 0x3319
#define EGL_YUV_CSC_STANDARD_601_EXT 0x330B
#define EGL_YUV_CSC_STANDARD_709_EXT 0x330C
#define EGL_YUV_CSC_STANDARD_2020_EXT 0x330D
#define EGL_YUV_PLANE_BPP_0_EXT 0x331B
#define EGL_YUV_PLANE_BPP_8_EXT 0x331C
#define EGL_YUV_PLANE_BPP_10_EXT 0x331D
#endif /* EGL_EXT_yuv_surface */
#ifndef EGL_HI_clientpixmap
#define EGL_HI_clientpixmap 1
struct EGLClientPixmapHI {
@@ -533,11 +714,42 @@ EGLAPI EGLBoolean EGLAPIENTRY eglExportDRMImageMESA (EGLDisplay dpy, EGLImageKHR
#endif
#endif /* EGL_MESA_drm_image */
#ifndef EGL_MESA_image_dma_buf_export
#define EGL_MESA_image_dma_buf_export 1
typedef EGLBoolean (EGLAPIENTRYP PFNEGLEXPORTDMABUFIMAGEQUERYMESAPROC) (EGLDisplay dpy, EGLImageKHR image, int *fourcc, int *num_planes, EGLuint64KHR *modifiers);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLEXPORTDMABUFIMAGEMESAPROC) (EGLDisplay dpy, EGLImageKHR image, int *fds, EGLint *strides, EGLint *offsets);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglExportDMABUFImageQueryMESA (EGLDisplay dpy, EGLImageKHR image, int *fourcc, int *num_planes, EGLuint64KHR *modifiers);
EGLAPI EGLBoolean EGLAPIENTRY eglExportDMABUFImageMESA (EGLDisplay dpy, EGLImageKHR image, int *fds, EGLint *strides, EGLint *offsets);
#endif
#endif /* EGL_MESA_image_dma_buf_export */
#ifndef EGL_MESA_platform_gbm
#define EGL_MESA_platform_gbm 1
#define EGL_PLATFORM_GBM_MESA 0x31D7
#endif /* EGL_MESA_platform_gbm */
#ifndef EGL_NOK_swap_region
#define EGL_NOK_swap_region 1
typedef EGLBoolean (EGLAPIENTRYP PFNEGLSWAPBUFFERSREGIONNOKPROC) (EGLDisplay dpy, EGLSurface surface, EGLint numRects, const EGLint *rects);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglSwapBuffersRegionNOK (EGLDisplay dpy, EGLSurface surface, EGLint numRects, const EGLint *rects);
#endif
#endif /* EGL_NOK_swap_region */
#ifndef EGL_NOK_swap_region2
#define EGL_NOK_swap_region2 1
typedef EGLBoolean (EGLAPIENTRYP PFNEGLSWAPBUFFERSREGION2NOKPROC) (EGLDisplay dpy, EGLSurface surface, EGLint numRects, const EGLint *rects);
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglSwapBuffersRegion2NOK (EGLDisplay dpy, EGLSurface surface, EGLint numRects, const EGLint *rects);
#endif
#endif /* EGL_NOK_swap_region2 */
#ifndef EGL_NOK_texture_from_pixmap
#define EGL_NOK_texture_from_pixmap 1
#define EGL_Y_INVERTED_NOK 0x307F
#endif /* EGL_NOK_texture_from_pixmap */
#ifndef EGL_NV_3dvision_surface
#define EGL_NV_3dvision_surface 1
#define EGL_AUTO_STEREO_NV 0x3136
@@ -556,6 +768,13 @@ EGLAPI EGLBoolean EGLAPIENTRY eglExportDRMImageMESA (EGLDisplay dpy, EGLImageKHR
#define EGL_COVERAGE_SAMPLE_RESOLVE_NONE_NV 0x3133
#endif /* EGL_NV_coverage_sample_resolve */
#ifndef EGL_NV_cuda_event
#define EGL_NV_cuda_event 1
#define EGL_CUDA_EVENT_HANDLE_NV 0x323B
#define EGL_SYNC_CUDA_EVENT_NV 0x323C
#define EGL_SYNC_CUDA_EVENT_COMPLETE_NV 0x323D
#endif /* EGL_NV_cuda_event */
#ifndef EGL_NV_depth_nonlinear
#define EGL_NV_depth_nonlinear 1
#define EGL_DEPTH_ENCODING_NV 0x30E2
@@ -563,6 +782,11 @@ EGLAPI EGLBoolean EGLAPIENTRY eglExportDRMImageMESA (EGLDisplay dpy, EGLImageKHR
#define EGL_DEPTH_ENCODING_NONLINEAR_NV 0x30E3
#endif /* EGL_NV_depth_nonlinear */
#ifndef EGL_NV_device_cuda
#define EGL_NV_device_cuda 1
#define EGL_CUDA_DEVICE_NV 0x323A
#endif /* EGL_NV_device_cuda */
#ifndef EGL_NV_native_query
#define EGL_NV_native_query 1
typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYNATIVEDISPLAYNVPROC) (EGLDisplay dpy, EGLNativeDisplayType *display_id);
@@ -645,6 +869,16 @@ EGLAPI EGLuint64NV EGLAPIENTRY eglGetSystemTimeNV (void);
#endif /* KHRONOS_SUPPORT_INT64 */
#endif /* EGL_NV_system_time */
#ifndef EGL_TIZEN_image_native_buffer
#define EGL_TIZEN_image_native_buffer 1
#define EGL_NATIVE_BUFFER_TIZEN 0x32A0
#endif /* EGL_TIZEN_image_native_buffer */
#ifndef EGL_TIZEN_image_native_surface
#define EGL_TIZEN_image_native_surface 1
#define EGL_NATIVE_SURFACE_TIZEN 0x32A1
#endif /* EGL_TIZEN_image_native_surface */
#include <EGL/eglmesaext.h>
#include <EGL/eglextchromium.h>


@@ -34,63 +34,6 @@ extern "C" {
#include <EGL/eglplatform.h>
/* EGL_MESA_screen extension >>> PRELIMINARY <<< */
#ifndef EGL_MESA_screen_surface
#define EGL_MESA_screen_surface 1
#define EGL_BAD_SCREEN_MESA 0x4000
#define EGL_BAD_MODE_MESA 0x4001
#define EGL_SCREEN_COUNT_MESA 0x4002
#define EGL_SCREEN_POSITION_MESA 0x4003
#define EGL_SCREEN_POSITION_GRANULARITY_MESA 0x4004
#define EGL_MODE_ID_MESA 0x4005
#define EGL_REFRESH_RATE_MESA 0x4006
#define EGL_OPTIMAL_MESA 0x4007
#define EGL_INTERLACED_MESA 0x4008
#define EGL_SCREEN_BIT_MESA 0x08
typedef khronos_uint32_t EGLScreenMESA;
typedef khronos_uint32_t EGLModeMESA;
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglChooseModeMESA(EGLDisplay dpy, EGLScreenMESA screen, const EGLint *attrib_list, EGLModeMESA *modes, EGLint modes_size, EGLint *num_modes);
EGLAPI EGLBoolean EGLAPIENTRY eglGetModesMESA(EGLDisplay dpy, EGLScreenMESA screen, EGLModeMESA *modes, EGLint modes_size, EGLint *num_modes);
EGLAPI EGLBoolean EGLAPIENTRY eglGetModeAttribMESA(EGLDisplay dpy, EGLModeMESA mode, EGLint attribute, EGLint *value);
EGLAPI EGLBoolean EGLAPIENTRY eglGetScreensMESA(EGLDisplay dpy, EGLScreenMESA *screens, EGLint max_screens, EGLint *num_screens);
EGLAPI EGLSurface EGLAPIENTRY eglCreateScreenSurfaceMESA(EGLDisplay dpy, EGLConfig config, const EGLint *attrib_list);
EGLAPI EGLBoolean EGLAPIENTRY eglShowScreenSurfaceMESA(EGLDisplay dpy, EGLint screen, EGLSurface surface, EGLModeMESA mode);
EGLAPI EGLBoolean EGLAPIENTRY eglScreenPositionMESA(EGLDisplay dpy, EGLScreenMESA screen, EGLint x, EGLint y);
EGLAPI EGLBoolean EGLAPIENTRY eglQueryScreenMESA(EGLDisplay dpy, EGLScreenMESA screen, EGLint attribute, EGLint *value);
EGLAPI EGLBoolean EGLAPIENTRY eglQueryScreenSurfaceMESA(EGLDisplay dpy, EGLScreenMESA screen, EGLSurface *surface);
EGLAPI EGLBoolean EGLAPIENTRY eglQueryScreenModeMESA(EGLDisplay dpy, EGLScreenMESA screen, EGLModeMESA *mode);
EGLAPI const char * EGLAPIENTRY eglQueryModeStringMESA(EGLDisplay dpy, EGLModeMESA mode);
#endif /* EGL_EGLEXT_PROTOTYPES */
typedef EGLBoolean (EGLAPIENTRYP PFNEGLCHOOSEMODEMESA) (EGLDisplay dpy, EGLScreenMESA screen, const EGLint *attrib_list, EGLModeMESA *modes, EGLint modes_size, EGLint *num_modes);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLGETMODESMESA) (EGLDisplay dpy, EGLScreenMESA screen, EGLModeMESA *modes, EGLint modes_size, EGLint *num_modes);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLGetModeATTRIBMESA) (EGLDisplay dpy, EGLModeMESA mode, EGLint attribute, EGLint *value);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLGETSCRREENSMESA) (EGLDisplay dpy, EGLScreenMESA *screens, EGLint max_screens, EGLint *num_screens);
typedef EGLSurface (EGLAPIENTRYP PFNEGLCREATESCREENSURFACEMESA) (EGLDisplay dpy, EGLConfig config, const EGLint *attrib_list);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLSHOWSCREENSURFACEMESA) (EGLDisplay dpy, EGLint screen, EGLSurface surface, EGLModeMESA mode);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLSCREENPOSIITONMESA) (EGLDisplay dpy, EGLScreenMESA screen, EGLint x, EGLint y);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYSCREENMESA) (EGLDisplay dpy, EGLScreenMESA screen, EGLint attribute, EGLint *value);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYSCREENSURFACEMESA) (EGLDisplay dpy, EGLScreenMESA screen, EGLSurface *surface);
typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYSCREENMODEMESA) (EGLDisplay dpy, EGLScreenMESA screen, EGLModeMESA *mode);
typedef const char * (EGLAPIENTRYP PFNEGLQUERYMODESTRINGMESA) (EGLDisplay dpy, EGLModeMESA mode);
#endif /* EGL_MESA_screen_surface */
#ifndef EGL_MESA_copy_context
#define EGL_MESA_copy_context 1
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglCopyContextMESA(EGLDisplay dpy, EGLContext source, EGLContext dest, EGLint mask);
#endif /* EGL_EGLEXT_PROTOTYPES */
typedef EGLBoolean (EGLAPIENTRYP PFNEGLCOPYCONTEXTMESA) (EGLDisplay dpy, EGLContext source, EGLContext dest, EGLint mask);
#endif /* EGL_MESA_copy_context */
#ifndef EGL_MESA_drm_display
#define EGL_MESA_drm_display 1
@@ -144,26 +87,8 @@ typedef struct wl_buffer * (EGLAPIENTRYP PFNEGLCREATEWAYLANDBUFFERFROMIMAGEWL) (
#endif
#ifndef EGL_NOK_swap_region
#define EGL_NOK_swap_region 1
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLBoolean EGLAPIENTRY eglSwapBuffersRegionNOK(EGLDisplay dpy, EGLSurface surface, EGLint numRects, const EGLint* rects);
#endif
/* remnant of EGL_NOK_swap_region kept for compatibility because of a non-standard type name */
typedef EGLBoolean (EGLAPIENTRYP PFNEGLSWAPBUFFERSREGIONNOK) (EGLDisplay dpy, EGLSurface surface, EGLint numRects, const EGLint* rects);
#endif
#ifndef EGL_NOK_texture_from_pixmap
#define EGL_NOK_texture_from_pixmap 1
#define EGL_Y_INVERTED_NOK 0x307F
#endif /* EGL_NOK_texture_from_pixmap */
#ifndef EGL_ANDROID_image_native_buffer
#define EGL_ANDROID_image_native_buffer 1
#define EGL_NATIVE_BUFFER_ANDROID 0x3140 /* eglCreateImageKHR target */
#endif
#ifndef EGL_MESA_configless_context
#define EGL_MESA_configless_context 1


@@ -2,7 +2,7 @@
#define __eglplatform_h_
/*
** Copyright (c) 2007-2009 The Khronos Group Inc.
** Copyright (c) 2007-2013 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
@@ -25,7 +25,7 @@
*/
/* Platform-specific types and definitions for egl.h
* $Revision: 12306 $ on $Date: 2010-08-25 09:51:28 -0700 (Wed, 25 Aug 2010) $
* $Revision: 30994 $ on $Date: 2015-04-30 13:36:48 -0700 (Thu, 30 Apr 2015) $
*
* Adopters may modify khrplatform.h and this file to suit their platform.
* You are encouraged to submit all modifications to the Khronos group so that
@@ -77,7 +77,7 @@ typedef HDC EGLNativeDisplayType;
typedef HBITMAP EGLNativePixmapType;
typedef HWND EGLNativeWindowType;
#elif defined(__WINSCW__) || defined(__SYMBIAN32__) /* Symbian */
#elif defined(__APPLE__) || defined(__WINSCW__) || defined(__SYMBIAN32__) /* Symbian */
typedef int EGLNativeDisplayType;
typedef void *EGLNativeWindowType;
@@ -95,18 +95,19 @@ typedef struct gbm_device *EGLNativeDisplayType;
typedef struct gbm_bo *EGLNativePixmapType;
typedef void *EGLNativeWindowType;
#elif defined(ANDROID) /* Android */
#elif defined(__ANDROID__) || defined(ANDROID)
#include <android/native_window.h>
struct ANativeWindow;
struct egl_native_pixmap_t;
typedef struct ANativeWindow *EGLNativeWindowType;
typedef struct egl_native_pixmap_t *EGLNativePixmapType;
typedef void *EGLNativeDisplayType;
typedef struct ANativeWindow* EGLNativeWindowType;
typedef struct egl_native_pixmap_t* EGLNativePixmapType;
typedef void* EGLNativeDisplayType;
#elif defined(__unix__)
#ifdef MESA_EGL_NO_X11_HEADERS
#if defined(MESA_EGL_NO_X11_HEADERS)
typedef void *EGLNativeDisplayType;
typedef khronos_uintptr_t EGLNativePixmapType;
@@ -124,6 +125,12 @@ typedef Window EGLNativeWindowType;
#endif /* MESA_EGL_NO_X11_HEADERS */
#elif __HAIKU__
#include <kernel/image.h>
typedef void *EGLNativeDisplayType;
typedef khronos_uintptr_t EGLNativePixmapType;
typedef khronos_uintptr_t EGLNativeWindowType;
#else
#error "Platform not recognized"
#endif

@@ -33,7 +33,7 @@ extern "C" {
** used to make the header, and the header can be found at
** http://www.opengl.org/registry/
**
** Khronos $Revision: 27684 $ on $Date: 2014-08-11 01:21:35 -0700 (Mon, 11 Aug 2014) $
** Khronos $Revision: 29735 $ on $Date: 2015-02-02 19:00:01 -0800 (Mon, 02 Feb 2015) $
*/
#if defined(_WIN32) && !defined(APIENTRY) && !defined(__CYGWIN__) && !defined(__SCITECH_SNAP__)
@@ -53,7 +53,7 @@ extern "C" {
#define GLAPI extern
#endif
#define GL_GLEXT_VERSION 20140810
#define GL_GLEXT_VERSION 20150202
/* Generated C header for:
* API: gl
@@ -2044,6 +2044,10 @@ GLAPI void APIENTRY glGetDoublei_v (GLenum target, GLuint index, GLdouble *data)
#ifndef GL_VERSION_4_2
#define GL_VERSION_4_2 1
#define GL_COPY_READ_BUFFER_BINDING 0x8F36
#define GL_COPY_WRITE_BUFFER_BINDING 0x8F37
#define GL_TRANSFORM_FEEDBACK_ACTIVE 0x8E24
#define GL_TRANSFORM_FEEDBACK_PAUSED 0x8E23
#define GL_UNPACK_COMPRESSED_BLOCK_WIDTH 0x9127
#define GL_UNPACK_COMPRESSED_BLOCK_HEIGHT 0x9128
#define GL_UNPACK_COMPRESSED_BLOCK_DEPTH 0x9129
@@ -2590,7 +2594,6 @@ GLAPI void APIENTRY glBindVertexBuffers (GLuint first, GLsizei count, const GLui
#define GL_MAX_COMBINED_CLIP_AND_CULL_DISTANCES 0x82FA
#define GL_TEXTURE_TARGET 0x1006
#define GL_QUERY_TARGET 0x82EA
#define GL_TEXTURE_BINDING 0x82EB
#define GL_GUILTY_CONTEXT_RESET 0x8253
#define GL_INNOCENT_CONTEXT_RESET 0x8254
#define GL_UNKNOWN_CONTEXT_RESET 0x8255
@@ -2603,25 +2606,25 @@ GLAPI void APIENTRY glBindVertexBuffers (GLuint first, GLsizei count, const GLui
typedef void (APIENTRYP PFNGLCLIPCONTROLPROC) (GLenum origin, GLenum depth);
typedef void (APIENTRYP PFNGLCREATETRANSFORMFEEDBACKSPROC) (GLsizei n, GLuint *ids);
typedef void (APIENTRYP PFNGLTRANSFORMFEEDBACKBUFFERBASEPROC) (GLuint xfb, GLuint index, GLuint buffer);
typedef void (APIENTRYP PFNGLTRANSFORMFEEDBACKBUFFERRANGEPROC) (GLuint xfb, GLuint index, GLuint buffer, GLintptr offset, GLsizei size);
typedef void (APIENTRYP PFNGLTRANSFORMFEEDBACKBUFFERRANGEPROC) (GLuint xfb, GLuint index, GLuint buffer, GLintptr offset, GLsizeiptr size);
typedef void (APIENTRYP PFNGLGETTRANSFORMFEEDBACKIVPROC) (GLuint xfb, GLenum pname, GLint *param);
typedef void (APIENTRYP PFNGLGETTRANSFORMFEEDBACKI_VPROC) (GLuint xfb, GLenum pname, GLuint index, GLint *param);
typedef void (APIENTRYP PFNGLGETTRANSFORMFEEDBACKI64_VPROC) (GLuint xfb, GLenum pname, GLuint index, GLint64 *param);
typedef void (APIENTRYP PFNGLCREATEBUFFERSPROC) (GLsizei n, GLuint *buffers);
typedef void (APIENTRYP PFNGLNAMEDBUFFERSTORAGEPROC) (GLuint buffer, GLsizei size, const void *data, GLbitfield flags);
typedef void (APIENTRYP PFNGLNAMEDBUFFERDATAPROC) (GLuint buffer, GLsizei size, const void *data, GLenum usage);
typedef void (APIENTRYP PFNGLNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLintptr offset, GLsizei size, const void *data);
typedef void (APIENTRYP PFNGLCOPYNAMEDBUFFERSUBDATAPROC) (GLuint readBuffer, GLuint writeBuffer, GLintptr readOffset, GLintptr writeOffset, GLsizei size);
typedef void (APIENTRYP PFNGLNAMEDBUFFERSTORAGEPROC) (GLuint buffer, GLsizeiptr size, const void *data, GLbitfield flags);
typedef void (APIENTRYP PFNGLNAMEDBUFFERDATAPROC) (GLuint buffer, GLsizeiptr size, const void *data, GLenum usage);
typedef void (APIENTRYP PFNGLNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLintptr offset, GLsizeiptr size, const void *data);
typedef void (APIENTRYP PFNGLCOPYNAMEDBUFFERSUBDATAPROC) (GLuint readBuffer, GLuint writeBuffer, GLintptr readOffset, GLintptr writeOffset, GLsizeiptr size);
typedef void (APIENTRYP PFNGLCLEARNAMEDBUFFERDATAPROC) (GLuint buffer, GLenum internalformat, GLenum format, GLenum type, const void *data);
typedef void (APIENTRYP PFNGLCLEARNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLenum internalformat, GLintptr offset, GLsizei size, GLenum format, GLenum type, const void *data);
typedef void (APIENTRYP PFNGLCLEARNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLenum internalformat, GLintptr offset, GLsizeiptr size, GLenum format, GLenum type, const void *data);
typedef void *(APIENTRYP PFNGLMAPNAMEDBUFFERPROC) (GLuint buffer, GLenum access);
typedef void *(APIENTRYP PFNGLMAPNAMEDBUFFERRANGEPROC) (GLuint buffer, GLintptr offset, GLsizei length, GLbitfield access);
typedef void *(APIENTRYP PFNGLMAPNAMEDBUFFERRANGEPROC) (GLuint buffer, GLintptr offset, GLsizeiptr length, GLbitfield access);
typedef GLboolean (APIENTRYP PFNGLUNMAPNAMEDBUFFERPROC) (GLuint buffer);
typedef void (APIENTRYP PFNGLFLUSHMAPPEDNAMEDBUFFERRANGEPROC) (GLuint buffer, GLintptr offset, GLsizei length);
typedef void (APIENTRYP PFNGLFLUSHMAPPEDNAMEDBUFFERRANGEPROC) (GLuint buffer, GLintptr offset, GLsizeiptr length);
typedef void (APIENTRYP PFNGLGETNAMEDBUFFERPARAMETERIVPROC) (GLuint buffer, GLenum pname, GLint *params);
typedef void (APIENTRYP PFNGLGETNAMEDBUFFERPARAMETERI64VPROC) (GLuint buffer, GLenum pname, GLint64 *params);
typedef void (APIENTRYP PFNGLGETNAMEDBUFFERPOINTERVPROC) (GLuint buffer, GLenum pname, void **params);
typedef void (APIENTRYP PFNGLGETNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLintptr offset, GLsizei size, void *data);
typedef void (APIENTRYP PFNGLGETNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLintptr offset, GLsizeiptr size, void *data);
typedef void (APIENTRYP PFNGLCREATEFRAMEBUFFERSPROC) (GLsizei n, GLuint *framebuffers);
typedef void (APIENTRYP PFNGLNAMEDFRAMEBUFFERRENDERBUFFERPROC) (GLuint framebuffer, GLenum attachment, GLenum renderbuffertarget, GLuint renderbuffer);
typedef void (APIENTRYP PFNGLNAMEDFRAMEBUFFERPARAMETERIPROC) (GLuint framebuffer, GLenum pname, GLint param);
@@ -2646,7 +2649,7 @@ typedef void (APIENTRYP PFNGLNAMEDRENDERBUFFERSTORAGEMULTISAMPLEPROC) (GLuint re
typedef void (APIENTRYP PFNGLGETNAMEDRENDERBUFFERPARAMETERIVPROC) (GLuint renderbuffer, GLenum pname, GLint *params);
typedef void (APIENTRYP PFNGLCREATETEXTURESPROC) (GLenum target, GLsizei n, GLuint *textures);
typedef void (APIENTRYP PFNGLTEXTUREBUFFERPROC) (GLuint texture, GLenum internalformat, GLuint buffer);
typedef void (APIENTRYP PFNGLTEXTUREBUFFERRANGEPROC) (GLuint texture, GLenum internalformat, GLuint buffer, GLintptr offset, GLsizei size);
typedef void (APIENTRYP PFNGLTEXTUREBUFFERRANGEPROC) (GLuint texture, GLenum internalformat, GLuint buffer, GLintptr offset, GLsizeiptr size);
typedef void (APIENTRYP PFNGLTEXTURESTORAGE1DPROC) (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width);
typedef void (APIENTRYP PFNGLTEXTURESTORAGE2DPROC) (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height);
typedef void (APIENTRYP PFNGLTEXTURESTORAGE3DPROC) (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height, GLsizei depth);
@@ -2694,6 +2697,10 @@ typedef void (APIENTRYP PFNGLGETVERTEXARRAYINDEXED64IVPROC) (GLuint vaobj, GLuin
typedef void (APIENTRYP PFNGLCREATESAMPLERSPROC) (GLsizei n, GLuint *samplers);
typedef void (APIENTRYP PFNGLCREATEPROGRAMPIPELINESPROC) (GLsizei n, GLuint *pipelines);
typedef void (APIENTRYP PFNGLCREATEQUERIESPROC) (GLenum target, GLsizei n, GLuint *ids);
typedef void (APIENTRYP PFNGLGETQUERYBUFFEROBJECTI64VPROC) (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);
typedef void (APIENTRYP PFNGLGETQUERYBUFFEROBJECTIVPROC) (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);
typedef void (APIENTRYP PFNGLGETQUERYBUFFEROBJECTUI64VPROC) (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);
typedef void (APIENTRYP PFNGLGETQUERYBUFFEROBJECTUIVPROC) (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);
typedef void (APIENTRYP PFNGLMEMORYBARRIERBYREGIONPROC) (GLbitfield barriers);
typedef void (APIENTRYP PFNGLGETTEXTURESUBIMAGEPROC) (GLuint texture, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLenum type, GLsizei bufSize, void *pixels);
typedef void (APIENTRYP PFNGLGETCOMPRESSEDTEXTURESUBIMAGEPROC) (GLuint texture, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLsizei bufSize, void *pixels);
@@ -2722,25 +2729,25 @@ typedef void (APIENTRYP PFNGLTEXTUREBARRIERPROC) (void);
GLAPI void APIENTRY glClipControl (GLenum origin, GLenum depth);
GLAPI void APIENTRY glCreateTransformFeedbacks (GLsizei n, GLuint *ids);
GLAPI void APIENTRY glTransformFeedbackBufferBase (GLuint xfb, GLuint index, GLuint buffer);
GLAPI void APIENTRY glTransformFeedbackBufferRange (GLuint xfb, GLuint index, GLuint buffer, GLintptr offset, GLsizei size);
GLAPI void APIENTRY glTransformFeedbackBufferRange (GLuint xfb, GLuint index, GLuint buffer, GLintptr offset, GLsizeiptr size);
GLAPI void APIENTRY glGetTransformFeedbackiv (GLuint xfb, GLenum pname, GLint *param);
GLAPI void APIENTRY glGetTransformFeedbacki_v (GLuint xfb, GLenum pname, GLuint index, GLint *param);
GLAPI void APIENTRY glGetTransformFeedbacki64_v (GLuint xfb, GLenum pname, GLuint index, GLint64 *param);
GLAPI void APIENTRY glCreateBuffers (GLsizei n, GLuint *buffers);
GLAPI void APIENTRY glNamedBufferStorage (GLuint buffer, GLsizei size, const void *data, GLbitfield flags);
GLAPI void APIENTRY glNamedBufferData (GLuint buffer, GLsizei size, const void *data, GLenum usage);
GLAPI void APIENTRY glNamedBufferSubData (GLuint buffer, GLintptr offset, GLsizei size, const void *data);
GLAPI void APIENTRY glCopyNamedBufferSubData (GLuint readBuffer, GLuint writeBuffer, GLintptr readOffset, GLintptr writeOffset, GLsizei size);
GLAPI void APIENTRY glNamedBufferStorage (GLuint buffer, GLsizeiptr size, const void *data, GLbitfield flags);
GLAPI void APIENTRY glNamedBufferData (GLuint buffer, GLsizeiptr size, const void *data, GLenum usage);
GLAPI void APIENTRY glNamedBufferSubData (GLuint buffer, GLintptr offset, GLsizeiptr size, const void *data);
GLAPI void APIENTRY glCopyNamedBufferSubData (GLuint readBuffer, GLuint writeBuffer, GLintptr readOffset, GLintptr writeOffset, GLsizeiptr size);
GLAPI void APIENTRY glClearNamedBufferData (GLuint buffer, GLenum internalformat, GLenum format, GLenum type, const void *data);
GLAPI void APIENTRY glClearNamedBufferSubData (GLuint buffer, GLenum internalformat, GLintptr offset, GLsizei size, GLenum format, GLenum type, const void *data);
GLAPI void APIENTRY glClearNamedBufferSubData (GLuint buffer, GLenum internalformat, GLintptr offset, GLsizeiptr size, GLenum format, GLenum type, const void *data);
GLAPI void *APIENTRY glMapNamedBuffer (GLuint buffer, GLenum access);
GLAPI void *APIENTRY glMapNamedBufferRange (GLuint buffer, GLintptr offset, GLsizei length, GLbitfield access);
GLAPI void *APIENTRY glMapNamedBufferRange (GLuint buffer, GLintptr offset, GLsizeiptr length, GLbitfield access);
GLAPI GLboolean APIENTRY glUnmapNamedBuffer (GLuint buffer);
GLAPI void APIENTRY glFlushMappedNamedBufferRange (GLuint buffer, GLintptr offset, GLsizei length);
GLAPI void APIENTRY glFlushMappedNamedBufferRange (GLuint buffer, GLintptr offset, GLsizeiptr length);
GLAPI void APIENTRY glGetNamedBufferParameteriv (GLuint buffer, GLenum pname, GLint *params);
GLAPI void APIENTRY glGetNamedBufferParameteri64v (GLuint buffer, GLenum pname, GLint64 *params);
GLAPI void APIENTRY glGetNamedBufferPointerv (GLuint buffer, GLenum pname, void **params);
GLAPI void APIENTRY glGetNamedBufferSubData (GLuint buffer, GLintptr offset, GLsizei size, void *data);
GLAPI void APIENTRY glGetNamedBufferSubData (GLuint buffer, GLintptr offset, GLsizeiptr size, void *data);
GLAPI void APIENTRY glCreateFramebuffers (GLsizei n, GLuint *framebuffers);
GLAPI void APIENTRY glNamedFramebufferRenderbuffer (GLuint framebuffer, GLenum attachment, GLenum renderbuffertarget, GLuint renderbuffer);
GLAPI void APIENTRY glNamedFramebufferParameteri (GLuint framebuffer, GLenum pname, GLint param);
@@ -2765,7 +2772,7 @@ GLAPI void APIENTRY glNamedRenderbufferStorageMultisample (GLuint renderbuffer,
GLAPI void APIENTRY glGetNamedRenderbufferParameteriv (GLuint renderbuffer, GLenum pname, GLint *params);
GLAPI void APIENTRY glCreateTextures (GLenum target, GLsizei n, GLuint *textures);
GLAPI void APIENTRY glTextureBuffer (GLuint texture, GLenum internalformat, GLuint buffer);
GLAPI void APIENTRY glTextureBufferRange (GLuint texture, GLenum internalformat, GLuint buffer, GLintptr offset, GLsizei size);
GLAPI void APIENTRY glTextureBufferRange (GLuint texture, GLenum internalformat, GLuint buffer, GLintptr offset, GLsizeiptr size);
GLAPI void APIENTRY glTextureStorage1D (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width);
GLAPI void APIENTRY glTextureStorage2D (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height);
GLAPI void APIENTRY glTextureStorage3D (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height, GLsizei depth);
@@ -2813,6 +2820,10 @@ GLAPI void APIENTRY glGetVertexArrayIndexed64iv (GLuint vaobj, GLuint index, GLe
GLAPI void APIENTRY glCreateSamplers (GLsizei n, GLuint *samplers);
GLAPI void APIENTRY glCreateProgramPipelines (GLsizei n, GLuint *pipelines);
GLAPI void APIENTRY glCreateQueries (GLenum target, GLsizei n, GLuint *ids);
GLAPI void APIENTRY glGetQueryBufferObjecti64v (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);
GLAPI void APIENTRY glGetQueryBufferObjectiv (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);
GLAPI void APIENTRY glGetQueryBufferObjectui64v (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);
GLAPI void APIENTRY glGetQueryBufferObjectuiv (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);
GLAPI void APIENTRY glMemoryBarrierByRegion (GLbitfield barriers);
GLAPI void APIENTRY glGetTextureSubImage (GLuint texture, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLenum type, GLsizei bufSize, void *pixels);
GLAPI void APIENTRY glGetCompressedTextureSubImage (GLuint texture, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLsizei bufSize, void *pixels);
@@ -2979,8 +2990,6 @@ GLAPI void APIENTRY glDispatchComputeGroupSizeARB (GLuint num_groups_x, GLuint n
#ifndef GL_ARB_copy_buffer
#define GL_ARB_copy_buffer 1
#define GL_COPY_READ_BUFFER_BINDING 0x8F36
#define GL_COPY_WRITE_BUFFER_BINDING 0x8F37
#endif /* GL_ARB_copy_buffer */
#ifndef GL_ARB_copy_image
@@ -4065,13 +4074,13 @@ GLAPI void APIENTRY glGetNamedStringivARB (GLint namelen, const GLchar *name, GL
#define GL_ARB_sparse_buffer 1
#define GL_SPARSE_STORAGE_BIT_ARB 0x0400
#define GL_SPARSE_BUFFER_PAGE_SIZE_ARB 0x82F8
typedef void (APIENTRYP PFNGLBUFFERPAGECOMMITMENTARBPROC) (GLenum target, GLintptr offset, GLsizei size, GLboolean commit);
typedef void (APIENTRYP PFNGLNAMEDBUFFERPAGECOMMITMENTEXTPROC) (GLuint buffer, GLintptr offset, GLsizei size, GLboolean commit);
typedef void (APIENTRYP PFNGLNAMEDBUFFERPAGECOMMITMENTARBPROC) (GLuint buffer, GLintptr offset, GLsizei size, GLboolean commit);
typedef void (APIENTRYP PFNGLBUFFERPAGECOMMITMENTARBPROC) (GLenum target, GLintptr offset, GLsizeiptr size, GLboolean commit);
typedef void (APIENTRYP PFNGLNAMEDBUFFERPAGECOMMITMENTEXTPROC) (GLuint buffer, GLintptr offset, GLsizeiptr size, GLboolean commit);
typedef void (APIENTRYP PFNGLNAMEDBUFFERPAGECOMMITMENTARBPROC) (GLuint buffer, GLintptr offset, GLsizeiptr size, GLboolean commit);
#ifdef GL_GLEXT_PROTOTYPES
GLAPI void APIENTRY glBufferPageCommitmentARB (GLenum target, GLintptr offset, GLsizei size, GLboolean commit);
GLAPI void APIENTRY glNamedBufferPageCommitmentEXT (GLuint buffer, GLintptr offset, GLsizei size, GLboolean commit);
GLAPI void APIENTRY glNamedBufferPageCommitmentARB (GLuint buffer, GLintptr offset, GLsizei size, GLboolean commit);
GLAPI void APIENTRY glBufferPageCommitmentARB (GLenum target, GLintptr offset, GLsizeiptr size, GLboolean commit);
GLAPI void APIENTRY glNamedBufferPageCommitmentEXT (GLuint buffer, GLintptr offset, GLsizeiptr size, GLboolean commit);
GLAPI void APIENTRY glNamedBufferPageCommitmentARB (GLuint buffer, GLintptr offset, GLsizeiptr size, GLboolean commit);
#endif
#endif /* GL_ARB_sparse_buffer */
@@ -4079,7 +4088,7 @@ GLAPI void APIENTRY glNamedBufferPageCommitmentARB (GLuint buffer, GLintptr offs
#define GL_ARB_sparse_texture 1
#define GL_TEXTURE_SPARSE_ARB 0x91A6
#define GL_VIRTUAL_PAGE_SIZE_INDEX_ARB 0x91A7
#define GL_MIN_SPARSE_LEVEL_ARB 0x919B
#define GL_NUM_SPARSE_LEVELS_ARB 0x91AA
#define GL_NUM_VIRTUAL_PAGE_SIZES_ARB 0x91A8
#define GL_VIRTUAL_PAGE_SIZE_X_ARB 0x9195
#define GL_VIRTUAL_PAGE_SIZE_Y_ARB 0x9196
@@ -4344,8 +4353,6 @@ GLAPI void APIENTRY glGetCompressedTexImageARB (GLenum target, GLint level, void
#ifndef GL_ARB_transform_feedback2
#define GL_ARB_transform_feedback2 1
#define GL_TRANSFORM_FEEDBACK_PAUSED 0x8E23
#define GL_TRANSFORM_FEEDBACK_ACTIVE 0x8E24
#endif /* GL_ARB_transform_feedback2 */
#ifndef GL_ARB_transform_feedback3
@@ -7485,6 +7492,19 @@ GLAPI void APIENTRY glPolygonOffsetEXT (GLfloat factor, GLfloat bias);
#endif
#endif /* GL_EXT_polygon_offset */
#ifndef GL_EXT_polygon_offset_clamp
#define GL_EXT_polygon_offset_clamp 1
#define GL_POLYGON_OFFSET_CLAMP_EXT 0x8E1B
typedef void (APIENTRYP PFNGLPOLYGONOFFSETCLAMPEXTPROC) (GLfloat factor, GLfloat units, GLfloat clamp);
#ifdef GL_GLEXT_PROTOTYPES
GLAPI void APIENTRY glPolygonOffsetClampEXT (GLfloat factor, GLfloat units, GLfloat clamp);
#endif
#endif /* GL_EXT_polygon_offset_clamp */
#ifndef GL_EXT_post_depth_coverage
#define GL_EXT_post_depth_coverage 1
#endif /* GL_EXT_post_depth_coverage */
#ifndef GL_EXT_provoking_vertex
#define GL_EXT_provoking_vertex 1
#define GL_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION_EXT 0x8E4C
@@ -7497,6 +7517,20 @@ GLAPI void APIENTRY glProvokingVertexEXT (GLenum mode);
#endif
#endif /* GL_EXT_provoking_vertex */
#ifndef GL_EXT_raster_multisample
#define GL_EXT_raster_multisample 1
#define GL_RASTER_MULTISAMPLE_EXT 0x9327
#define GL_RASTER_SAMPLES_EXT 0x9328
#define GL_MAX_RASTER_SAMPLES_EXT 0x9329
#define GL_RASTER_FIXED_SAMPLE_LOCATIONS_EXT 0x932A
#define GL_MULTISAMPLE_RASTERIZATION_ALLOWED_EXT 0x932B
#define GL_EFFECTIVE_RASTER_SAMPLES_EXT 0x932C
typedef void (APIENTRYP PFNGLRASTERSAMPLESEXTPROC) (GLuint samples, GLboolean fixedsamplelocations);
#ifdef GL_GLEXT_PROTOTYPES
GLAPI void APIENTRY glRasterSamplesEXT (GLuint samples, GLboolean fixedsamplelocations);
#endif
#endif /* GL_EXT_raster_multisample */
#ifndef GL_EXT_rescale_normal
#define GL_EXT_rescale_normal 1
#define GL_RESCALE_NORMAL_EXT 0x803A
@@ -7651,6 +7685,10 @@ GLAPI void APIENTRY glMemoryBarrierEXT (GLbitfield barriers);
#define GL_SHARED_TEXTURE_PALETTE_EXT 0x81FB
#endif /* GL_EXT_shared_texture_palette */
#ifndef GL_EXT_sparse_texture2
#define GL_EXT_sparse_texture2 1
#endif /* GL_EXT_sparse_texture2 */
#ifndef GL_EXT_stencil_clear_tag
#define GL_EXT_stencil_clear_tag 1
#define GL_STENCIL_TAG_BITS_EXT 0x88F2
@@ -7863,6 +7901,10 @@ GLAPI void APIENTRY glTexBufferEXT (GLenum target, GLenum internalformat, GLuint
#define GL_MAX_TEXTURE_MAX_ANISOTROPY_EXT 0x84FF
#endif /* GL_EXT_texture_filter_anisotropic */
#ifndef GL_EXT_texture_filter_minmax
#define GL_EXT_texture_filter_minmax 1
#endif /* GL_EXT_texture_filter_minmax */
#ifndef GL_EXT_texture_integer
#define GL_EXT_texture_integer 1
#define GL_RGBA32UI_EXT 0x8D70
@@ -8912,6 +8954,18 @@ GLAPI void APIENTRY glEndConditionalRenderNV (void);
#endif
#endif /* GL_NV_conditional_render */
#ifndef GL_NV_conservative_raster
#define GL_NV_conservative_raster 1
#define GL_CONSERVATIVE_RASTERIZATION_NV 0x9346
#define GL_SUBPIXEL_PRECISION_BIAS_X_BITS_NV 0x9347
#define GL_SUBPIXEL_PRECISION_BIAS_Y_BITS_NV 0x9348
#define GL_MAX_SUBPIXEL_PRECISION_BIAS_BITS_NV 0x9349
typedef void (APIENTRYP PFNGLSUBPIXELPRECISIONBIASNVPROC) (GLuint xbits, GLuint ybits);
#ifdef GL_GLEXT_PROTOTYPES
GLAPI void APIENTRY glSubpixelPrecisionBiasNV (GLuint xbits, GLuint ybits);
#endif
#endif /* GL_NV_conservative_raster */
#ifndef GL_NV_copy_depth_to_color
#define GL_NV_copy_depth_to_color 1
#define GL_DEPTH_STENCIL_TO_RGBA_NV 0x886E
@@ -9054,6 +9108,11 @@ GLAPI void APIENTRY glSetFenceNV (GLuint fence, GLenum condition);
#endif
#endif /* GL_NV_fence */
#ifndef GL_NV_fill_rectangle
#define GL_NV_fill_rectangle 1
#define GL_FILL_RECTANGLE_NV 0x933C
#endif /* GL_NV_fill_rectangle */
#ifndef GL_NV_float_buffer
#define GL_NV_float_buffer 1
#define GL_FLOAT_R_NV 0x8880
@@ -9080,6 +9139,16 @@ GLAPI void APIENTRY glSetFenceNV (GLuint fence, GLenum condition);
#define GL_EYE_PLANE_ABSOLUTE_NV 0x855C
#endif /* GL_NV_fog_distance */
#ifndef GL_NV_fragment_coverage_to_color
#define GL_NV_fragment_coverage_to_color 1
#define GL_FRAGMENT_COVERAGE_TO_COLOR_NV 0x92DD
#define GL_FRAGMENT_COVERAGE_COLOR_NV 0x92DE
typedef void (APIENTRYP PFNGLFRAGMENTCOVERAGECOLORNVPROC) (GLuint color);
#ifdef GL_GLEXT_PROTOTYPES
GLAPI void APIENTRY glFragmentCoverageColorNV (GLuint color);
#endif
#endif /* GL_NV_fragment_coverage_to_color */
#ifndef GL_NV_fragment_program
#define GL_NV_fragment_program 1
#define GL_MAX_FRAGMENT_PROGRAM_LOCAL_PARAMETERS_NV 0x8868
@@ -9121,6 +9190,30 @@ GLAPI void APIENTRY glGetProgramNamedParameterdvNV (GLuint id, GLsizei len, cons
#define GL_NV_fragment_program_option 1
#endif /* GL_NV_fragment_program_option */
#ifndef GL_NV_fragment_shader_interlock
#define GL_NV_fragment_shader_interlock 1
#endif /* GL_NV_fragment_shader_interlock */
#ifndef GL_NV_framebuffer_mixed_samples
#define GL_NV_framebuffer_mixed_samples 1
#define GL_COVERAGE_MODULATION_TABLE_NV 0x9331
#define GL_COLOR_SAMPLES_NV 0x8E20
#define GL_DEPTH_SAMPLES_NV 0x932D
#define GL_STENCIL_SAMPLES_NV 0x932E
#define GL_MIXED_DEPTH_SAMPLES_SUPPORTED_NV 0x932F
#define GL_MIXED_STENCIL_SAMPLES_SUPPORTED_NV 0x9330
#define GL_COVERAGE_MODULATION_NV 0x9332
#define GL_COVERAGE_MODULATION_TABLE_SIZE_NV 0x9333
typedef void (APIENTRYP PFNGLCOVERAGEMODULATIONTABLENVPROC) (GLsizei n, const GLfloat *v);
typedef void (APIENTRYP PFNGLGETCOVERAGEMODULATIONTABLENVPROC) (GLsizei bufsize, GLfloat *v);
typedef void (APIENTRYP PFNGLCOVERAGEMODULATIONNVPROC) (GLenum components);
#ifdef GL_GLEXT_PROTOTYPES
GLAPI void APIENTRY glCoverageModulationTableNV (GLsizei n, const GLfloat *v);
GLAPI void APIENTRY glGetCoverageModulationTableNV (GLsizei bufsize, GLfloat *v);
GLAPI void APIENTRY glCoverageModulationNV (GLenum components);
#endif
#endif /* GL_NV_framebuffer_mixed_samples */
#ifndef GL_NV_framebuffer_multisample_coverage
#define GL_NV_framebuffer_multisample_coverage 1
#define GL_RENDERBUFFER_COVERAGE_SAMPLES_NV 0x8CAB
@@ -9152,6 +9245,10 @@ GLAPI void APIENTRY glFramebufferTextureFaceEXT (GLenum target, GLenum attachmen
#define GL_NV_geometry_shader4 1
#endif /* GL_NV_geometry_shader4 */
#ifndef GL_NV_geometry_shader_passthrough
#define GL_NV_geometry_shader_passthrough 1
#endif /* GL_NV_geometry_shader_passthrough */
#ifndef GL_NV_gpu_program4
#define GL_NV_gpu_program4 1
#define GL_MIN_PROGRAM_TEXEL_OFFSET_NV 0x8904
@@ -9324,6 +9421,18 @@ GLAPI void APIENTRY glVertexAttribs4hvNV (GLuint index, GLsizei n, const GLhalfN
#endif
#endif /* GL_NV_half_float */
#ifndef GL_NV_internalformat_sample_query
#define GL_NV_internalformat_sample_query 1
#define GL_MULTISAMPLES_NV 0x9371
#define GL_SUPERSAMPLE_SCALE_X_NV 0x9372
#define GL_SUPERSAMPLE_SCALE_Y_NV 0x9373
#define GL_CONFORMANT_NV 0x9374
typedef void (APIENTRYP PFNGLGETINTERNALFORMATSAMPLEIVNVPROC) (GLenum target, GLenum internalformat, GLsizei samples, GLenum pname, GLsizei bufSize, GLint *params);
#ifdef GL_GLEXT_PROTOTYPES
GLAPI void APIENTRY glGetInternalformatSampleivNV (GLenum target, GLenum internalformat, GLsizei samples, GLenum pname, GLsizei bufSize, GLint *params);
#endif
#endif /* GL_NV_internalformat_sample_query */
#ifndef GL_NV_light_max_exponent
#define GL_NV_light_max_exponent 1
#define GL_MAX_SHININESS_NV 0x8504
@@ -9332,7 +9441,6 @@ GLAPI void APIENTRY glVertexAttribs4hvNV (GLuint index, GLsizei n, const GLhalfN
#ifndef GL_NV_multisample_coverage
#define GL_NV_multisample_coverage 1
#define GL_COLOR_SAMPLES_NV 0x8E20
#endif /* GL_NV_multisample_coverage */
#ifndef GL_NV_multisample_filter_hint
@@ -9445,13 +9553,11 @@ GLAPI void APIENTRY glProgramBufferParametersIuivNV (GLenum target, GLuint bindi
#define GL_SKIP_MISSING_GLYPH_NV 0x90A9
#define GL_USE_MISSING_GLYPH_NV 0x90AA
#define GL_PATH_ERROR_POSITION_NV 0x90AB
#define GL_PATH_FOG_GEN_MODE_NV 0x90AC
#define GL_ACCUM_ADJACENT_PAIRS_NV 0x90AD
#define GL_ADJACENT_PAIRS_NV 0x90AE
#define GL_FIRST_TO_REST_NV 0x90AF
#define GL_PATH_GEN_MODE_NV 0x90B0
#define GL_PATH_GEN_COEFF_NV 0x90B1
#define GL_PATH_GEN_COLOR_FORMAT_NV 0x90B2
#define GL_PATH_GEN_COMPONENTS_NV 0x90B3
#define GL_PATH_STENCIL_FUNC_NV 0x90B7
#define GL_PATH_STENCIL_REF_NV 0x90B8
@@ -9520,8 +9626,6 @@ GLAPI void APIENTRY glProgramBufferParametersIuivNV (GLenum target, GLuint bindi
#define GL_FONT_UNDERLINE_POSITION_BIT_NV 0x04000000
#define GL_FONT_UNDERLINE_THICKNESS_BIT_NV 0x08000000
#define GL_FONT_HAS_KERNING_BIT_NV 0x10000000
#define GL_PRIMARY_COLOR_NV 0x852C
#define GL_SECONDARY_COLOR_NV 0x852D
#define GL_ROUNDED_RECT_NV 0xE8
#define GL_RELATIVE_ROUNDED_RECT_NV 0xE9
#define GL_ROUNDED_RECT2_NV 0xEA
@@ -9545,6 +9649,10 @@ GLAPI void APIENTRY glProgramBufferParametersIuivNV (GLenum target, GLuint bindi
#define GL_EYE_LINEAR_NV 0x2400
#define GL_OBJECT_LINEAR_NV 0x2401
#define GL_CONSTANT_NV 0x8576
#define GL_PATH_FOG_GEN_MODE_NV 0x90AC
#define GL_PRIMARY_COLOR_NV 0x852C
#define GL_SECONDARY_COLOR_NV 0x852D
#define GL_PATH_GEN_COLOR_FORMAT_NV 0x90B2
#define GL_PATH_PROJECTION_NV 0x1701
#define GL_PATH_MODELVIEW_NV 0x1700
#define GL_PATH_MODELVIEW_STACK_DEPTH_NV 0x0BA3
@@ -9582,9 +9690,6 @@ typedef void (APIENTRYP PFNGLSTENCILSTROKEPATHNVPROC) (GLuint path, GLint refere
typedef void (APIENTRYP PFNGLSTENCILFILLPATHINSTANCEDNVPROC) (GLsizei numPaths, GLenum pathNameType, const void *paths, GLuint pathBase, GLenum fillMode, GLuint mask, GLenum transformType, const GLfloat *transformValues);
typedef void (APIENTRYP PFNGLSTENCILSTROKEPATHINSTANCEDNVPROC) (GLsizei numPaths, GLenum pathNameType, const void *paths, GLuint pathBase, GLint reference, GLuint mask, GLenum transformType, const GLfloat *transformValues);
typedef void (APIENTRYP PFNGLPATHCOVERDEPTHFUNCNVPROC) (GLenum func);
typedef void (APIENTRYP PFNGLPATHCOLORGENNVPROC) (GLenum color, GLenum genMode, GLenum colorFormat, const GLfloat *coeffs);
typedef void (APIENTRYP PFNGLPATHTEXGENNVPROC) (GLenum texCoordSet, GLenum genMode, GLint components, const GLfloat *coeffs);
typedef void (APIENTRYP PFNGLPATHFOGGENNVPROC) (GLenum genMode);
typedef void (APIENTRYP PFNGLCOVERFILLPATHNVPROC) (GLuint path, GLenum coverMode);
typedef void (APIENTRYP PFNGLCOVERSTROKEPATHNVPROC) (GLuint path, GLenum coverMode);
typedef void (APIENTRYP PFNGLCOVERFILLPATHINSTANCEDNVPROC) (GLsizei numPaths, GLenum pathNameType, const void *paths, GLuint pathBase, GLenum coverMode, GLenum transformType, const GLfloat *transformValues);
@@ -9597,10 +9702,6 @@ typedef void (APIENTRYP PFNGLGETPATHDASHARRAYNVPROC) (GLuint path, GLfloat *dash
typedef void (APIENTRYP PFNGLGETPATHMETRICSNVPROC) (GLbitfield metricQueryMask, GLsizei numPaths, GLenum pathNameType, const void *paths, GLuint pathBase, GLsizei stride, GLfloat *metrics);
typedef void (APIENTRYP PFNGLGETPATHMETRICRANGENVPROC) (GLbitfield metricQueryMask, GLuint firstPathName, GLsizei numPaths, GLsizei stride, GLfloat *metrics);
typedef void (APIENTRYP PFNGLGETPATHSPACINGNVPROC) (GLenum pathListMode, GLsizei numPaths, GLenum pathNameType, const void *paths, GLuint pathBase, GLfloat advanceScale, GLfloat kerningScale, GLenum transformType, GLfloat *returnedSpacing);
typedef void (APIENTRYP PFNGLGETPATHCOLORGENIVNVPROC) (GLenum color, GLenum pname, GLint *value);
typedef void (APIENTRYP PFNGLGETPATHCOLORGENFVNVPROC) (GLenum color, GLenum pname, GLfloat *value);
typedef void (APIENTRYP PFNGLGETPATHTEXGENIVNVPROC) (GLenum texCoordSet, GLenum pname, GLint *value);
typedef void (APIENTRYP PFNGLGETPATHTEXGENFVNVPROC) (GLenum texCoordSet, GLenum pname, GLfloat *value);
typedef GLboolean (APIENTRYP PFNGLISPOINTINFILLPATHNVPROC) (GLuint path, GLuint mask, GLfloat x, GLfloat y);
typedef GLboolean (APIENTRYP PFNGLISPOINTINSTROKEPATHNVPROC) (GLuint path, GLfloat x, GLfloat y);
typedef GLfloat (APIENTRYP PFNGLGETPATHLENGTHNVPROC) (GLuint path, GLsizei startSegment, GLsizei numSegments);
@@ -9620,6 +9721,13 @@ typedef GLenum (APIENTRYP PFNGLPATHGLYPHINDEXARRAYNVPROC) (GLuint firstPathName,
typedef GLenum (APIENTRYP PFNGLPATHMEMORYGLYPHINDEXARRAYNVPROC) (GLuint firstPathName, GLenum fontTarget, GLsizeiptr fontSize, const void *fontData, GLsizei faceIndex, GLuint firstGlyphIndex, GLsizei numGlyphs, GLuint pathParameterTemplate, GLfloat emScale);
typedef void (APIENTRYP PFNGLPROGRAMPATHFRAGMENTINPUTGENNVPROC) (GLuint program, GLint location, GLenum genMode, GLint components, const GLfloat *coeffs);
typedef void (APIENTRYP PFNGLGETPROGRAMRESOURCEFVNVPROC) (GLuint program, GLenum programInterface, GLuint index, GLsizei propCount, const GLenum *props, GLsizei bufSize, GLsizei *length, GLfloat *params);
typedef void (APIENTRYP PFNGLPATHCOLORGENNVPROC) (GLenum color, GLenum genMode, GLenum colorFormat, const GLfloat *coeffs);
typedef void (APIENTRYP PFNGLPATHTEXGENNVPROC) (GLenum texCoordSet, GLenum genMode, GLint components, const GLfloat *coeffs);
typedef void (APIENTRYP PFNGLPATHFOGGENNVPROC) (GLenum genMode);
typedef void (APIENTRYP PFNGLGETPATHCOLORGENIVNVPROC) (GLenum color, GLenum pname, GLint *value);
typedef void (APIENTRYP PFNGLGETPATHCOLORGENFVNVPROC) (GLenum color, GLenum pname, GLfloat *value);
typedef void (APIENTRYP PFNGLGETPATHTEXGENIVNVPROC) (GLenum texCoordSet, GLenum pname, GLint *value);
typedef void (APIENTRYP PFNGLGETPATHTEXGENFVNVPROC) (GLenum texCoordSet, GLenum pname, GLfloat *value);
#ifdef GL_GLEXT_PROTOTYPES
GLAPI GLuint APIENTRY glGenPathsNV (GLsizei range);
GLAPI void APIENTRY glDeletePathsNV (GLuint path, GLsizei range);
@@ -9647,9 +9755,6 @@ GLAPI void APIENTRY glStencilStrokePathNV (GLuint path, GLint reference, GLuint
GLAPI void APIENTRY glStencilFillPathInstancedNV (GLsizei numPaths, GLenum pathNameType, const void *paths, GLuint pathBase, GLenum fillMode, GLuint mask, GLenum transformType, const GLfloat *transformValues);
GLAPI void APIENTRY glStencilStrokePathInstancedNV (GLsizei numPaths, GLenum pathNameType, const void *paths, GLuint pathBase, GLint reference, GLuint mask, GLenum transformType, const GLfloat *transformValues);
GLAPI void APIENTRY glPathCoverDepthFuncNV (GLenum func);
GLAPI void APIENTRY glPathColorGenNV (GLenum color, GLenum genMode, GLenum colorFormat, const GLfloat *coeffs);
GLAPI void APIENTRY glPathTexGenNV (GLenum texCoordSet, GLenum genMode, GLint components, const GLfloat *coeffs);
GLAPI void APIENTRY glPathFogGenNV (GLenum genMode);
GLAPI void APIENTRY glCoverFillPathNV (GLuint path, GLenum coverMode);
GLAPI void APIENTRY glCoverStrokePathNV (GLuint path, GLenum coverMode);
GLAPI void APIENTRY glCoverFillPathInstancedNV (GLsizei numPaths, GLenum pathNameType, const void *paths, GLuint pathBase, GLenum coverMode, GLenum transformType, const GLfloat *transformValues);
@@ -9662,10 +9767,6 @@ GLAPI void APIENTRY glGetPathDashArrayNV (GLuint path, GLfloat *dashArray);
GLAPI void APIENTRY glGetPathMetricsNV (GLbitfield metricQueryMask, GLsizei numPaths, GLenum pathNameType, const void *paths, GLuint pathBase, GLsizei stride, GLfloat *metrics);
GLAPI void APIENTRY glGetPathMetricRangeNV (GLbitfield metricQueryMask, GLuint firstPathName, GLsizei numPaths, GLsizei stride, GLfloat *metrics);
GLAPI void APIENTRY glGetPathSpacingNV (GLenum pathListMode, GLsizei numPaths, GLenum pathNameType, const void *paths, GLuint pathBase, GLfloat advanceScale, GLfloat kerningScale, GLenum transformType, GLfloat *returnedSpacing);
GLAPI void APIENTRY glGetPathColorGenivNV (GLenum color, GLenum pname, GLint *value);
GLAPI void APIENTRY glGetPathColorGenfvNV (GLenum color, GLenum pname, GLfloat *value);
GLAPI void APIENTRY glGetPathTexGenivNV (GLenum texCoordSet, GLenum pname, GLint *value);
GLAPI void APIENTRY glGetPathTexGenfvNV (GLenum texCoordSet, GLenum pname, GLfloat *value);
GLAPI GLboolean APIENTRY glIsPointInFillPathNV (GLuint path, GLuint mask, GLfloat x, GLfloat y);
GLAPI GLboolean APIENTRY glIsPointInStrokePathNV (GLuint path, GLfloat x, GLfloat y);
GLAPI GLfloat APIENTRY glGetPathLengthNV (GLuint path, GLsizei startSegment, GLsizei numSegments);
@@ -9685,9 +9786,21 @@ GLAPI GLenum APIENTRY glPathGlyphIndexArrayNV (GLuint firstPathName, GLenum font
GLAPI GLenum APIENTRY glPathMemoryGlyphIndexArrayNV (GLuint firstPathName, GLenum fontTarget, GLsizeiptr fontSize, const void *fontData, GLsizei faceIndex, GLuint firstGlyphIndex, GLsizei numGlyphs, GLuint pathParameterTemplate, GLfloat emScale);
GLAPI void APIENTRY glProgramPathFragmentInputGenNV (GLuint program, GLint location, GLenum genMode, GLint components, const GLfloat *coeffs);
GLAPI void APIENTRY glGetProgramResourcefvNV (GLuint program, GLenum programInterface, GLuint index, GLsizei propCount, const GLenum *props, GLsizei bufSize, GLsizei *length, GLfloat *params);
GLAPI void APIENTRY glPathColorGenNV (GLenum color, GLenum genMode, GLenum colorFormat, const GLfloat *coeffs);
GLAPI void APIENTRY glPathTexGenNV (GLenum texCoordSet, GLenum genMode, GLint components, const GLfloat *coeffs);
GLAPI void APIENTRY glPathFogGenNV (GLenum genMode);
GLAPI void APIENTRY glGetPathColorGenivNV (GLenum color, GLenum pname, GLint *value);
GLAPI void APIENTRY glGetPathColorGenfvNV (GLenum color, GLenum pname, GLfloat *value);
GLAPI void APIENTRY glGetPathTexGenivNV (GLenum texCoordSet, GLenum pname, GLint *value);
GLAPI void APIENTRY glGetPathTexGenfvNV (GLenum texCoordSet, GLenum pname, GLfloat *value);
#endif
#endif /* GL_NV_path_rendering */
#ifndef GL_NV_path_rendering_shared_edge
#define GL_NV_path_rendering_shared_edge 1
#define GL_SHARED_EDGE_NV 0xC0
#endif /* GL_NV_path_rendering_shared_edge */
#ifndef GL_NV_pixel_data_range
#define GL_NV_pixel_data_range 1
#define GL_WRITE_PIXEL_DATA_RANGE_NV 0x8878
@@ -9845,6 +9958,30 @@ GLAPI void APIENTRY glGetCombinerStageParameterfvNV (GLenum stage, GLenum pname,
#endif
#endif /* GL_NV_register_combiners2 */
#ifndef GL_NV_sample_locations
#define GL_NV_sample_locations 1
#define GL_SAMPLE_LOCATION_SUBPIXEL_BITS_NV 0x933D
#define GL_SAMPLE_LOCATION_PIXEL_GRID_WIDTH_NV 0x933E
#define GL_SAMPLE_LOCATION_PIXEL_GRID_HEIGHT_NV 0x933F
#define GL_PROGRAMMABLE_SAMPLE_LOCATION_TABLE_SIZE_NV 0x9340
#define GL_SAMPLE_LOCATION_NV 0x8E50
#define GL_PROGRAMMABLE_SAMPLE_LOCATION_NV 0x9341
#define GL_FRAMEBUFFER_PROGRAMMABLE_SAMPLE_LOCATIONS_NV 0x9342
#define GL_FRAMEBUFFER_SAMPLE_LOCATION_PIXEL_GRID_NV 0x9343
typedef void (APIENTRYP PFNGLFRAMEBUFFERSAMPLELOCATIONSFVNVPROC) (GLenum target, GLuint start, GLsizei count, const GLfloat *v);
typedef void (APIENTRYP PFNGLNAMEDFRAMEBUFFERSAMPLELOCATIONSFVNVPROC) (GLuint framebuffer, GLuint start, GLsizei count, const GLfloat *v);
typedef void (APIENTRYP PFNGLRESOLVEDEPTHVALUESNVPROC) (void);
#ifdef GL_GLEXT_PROTOTYPES
GLAPI void APIENTRY glFramebufferSampleLocationsfvNV (GLenum target, GLuint start, GLsizei count, const GLfloat *v);
GLAPI void APIENTRY glNamedFramebufferSampleLocationsfvNV (GLuint framebuffer, GLuint start, GLsizei count, const GLfloat *v);
GLAPI void APIENTRY glResolveDepthValuesNV (void);
#endif
#endif /* GL_NV_sample_locations */
#ifndef GL_NV_sample_mask_override_coverage
#define GL_NV_sample_mask_override_coverage 1
#endif /* GL_NV_sample_mask_override_coverage */
#ifndef GL_NV_shader_atomic_counters
#define GL_NV_shader_atomic_counters 1
#endif /* GL_NV_shader_atomic_counters */
@@ -9853,6 +9990,10 @@ GLAPI void APIENTRY glGetCombinerStageParameterfvNV (GLenum stage, GLenum pname,
#define GL_NV_shader_atomic_float 1
#endif /* GL_NV_shader_atomic_float */
#ifndef GL_NV_shader_atomic_fp16_vector
#define GL_NV_shader_atomic_fp16_vector 1
#endif /* GL_NV_shader_atomic_fp16_vector */
#ifndef GL_NV_shader_atomic_int64
#define GL_NV_shader_atomic_int64 1
#endif /* GL_NV_shader_atomic_int64 */
@@ -10176,6 +10317,13 @@ GLAPI void APIENTRY glDrawTransformFeedbackNV (GLenum mode, GLuint id);
#endif
#endif /* GL_NV_transform_feedback2 */
#ifndef GL_NV_uniform_buffer_unified_memory
#define GL_NV_uniform_buffer_unified_memory 1
#define GL_UNIFORM_BUFFER_UNIFIED_NV 0x936E
#define GL_UNIFORM_BUFFER_ADDRESS_NV 0x936F
#define GL_UNIFORM_BUFFER_LENGTH_NV 0x9370
#endif /* GL_NV_uniform_buffer_unified_memory */
#ifndef GL_NV_vdpau_interop
#define GL_NV_vdpau_interop 1
typedef GLintptr GLvdpauSurfaceNV;
@@ -10671,6 +10819,10 @@ GLAPI void APIENTRY glVideoCaptureStreamParameterdvNV (GLuint video_capture_slot
#endif
#endif /* GL_NV_video_capture */
#ifndef GL_NV_viewport_array2
#define GL_NV_viewport_array2 1
#endif /* GL_NV_viewport_array2 */
#ifndef GL_OML_interlace
#define GL_OML_interlace 1
#define GL_INTERLACE_OML 0x8980
@@ -11249,10 +11401,10 @@ GLAPI void APIENTRY glReferencePlaneSGIX (const GLdouble *equation);
#ifndef GL_SGIX_resample
#define GL_SGIX_resample 1
#define GL_PACK_RESAMPLE_SGIX 0x842C
#define GL_UNPACK_RESAMPLE_SGIX 0x842D
#define GL_RESAMPLE_REPLICATE_SGIX 0x842E
#define GL_RESAMPLE_ZERO_FILL_SGIX 0x842F
#define GL_PACK_RESAMPLE_SGIX 0x842E
#define GL_UNPACK_RESAMPLE_SGIX 0x842F
#define GL_RESAMPLE_REPLICATE_SGIX 0x8433
#define GL_RESAMPLE_ZERO_FILL_SGIX 0x8434
#define GL_RESAMPLE_DECIMATE_SGIX 0x8430
#endif /* GL_SGIX_resample */


@@ -85,6 +85,7 @@ typedef struct __DRIdri2ExtensionRec __DRIdri2Extension;
typedef struct __DRIdri2LoaderExtensionRec __DRIdri2LoaderExtension;
typedef struct __DRI2flushExtensionRec __DRI2flushExtension;
typedef struct __DRI2throttleExtensionRec __DRI2throttleExtension;
typedef struct __DRI2fenceExtensionRec __DRI2fenceExtension;
typedef struct __DRIimageLoaderExtensionRec __DRIimageLoaderExtension;
@@ -279,6 +280,7 @@ struct __DRItexBufferExtensionRec {
#define __DRI2_FLUSH_DRAWABLE (1 << 0) /* the drawable should be flushed. */
#define __DRI2_FLUSH_CONTEXT (1 << 1) /* glFlush should be called */
#define __DRI2_FLUSH_INVALIDATE_ANCILLARY (1 << 2)
enum __DRI2throttleReason {
__DRI2_THROTTLE_SWAPBUFFER,
@@ -338,6 +340,65 @@ struct __DRI2throttleExtensionRec {
enum __DRI2throttleReason reason);
};
/**
* Extension for fences / synchronization objects.
*/
#define __DRI2_FENCE "DRI2_Fence"
#define __DRI2_FENCE_VERSION 1
#define __DRI2_FENCE_TIMEOUT_INFINITE 0xffffffffffffffffllu
#define __DRI2_FENCE_FLAG_FLUSH_COMMANDS (1 << 0)
struct __DRI2fenceExtensionRec {
__DRIextension base;
/**
* Create and insert a fence into the command stream of the context.
*/
void *(*create_fence)(__DRIcontext *ctx);
/**
* Get a fence associated with the OpenCL event object.
* This can be NULL, meaning that OpenCL interoperability is not supported.
*/
void *(*get_fence_from_cl_event)(__DRIscreen *screen, intptr_t cl_event);
/**
* Destroy a fence.
*/
void (*destroy_fence)(__DRIscreen *screen, void *fence);
/**
* This function waits and doesn't return until the fence is signalled
* or the timeout expires. It returns true if the fence has been signaled.
*
* \param ctx the context where commands are flushed
* \param fence the fence
* \param flags a combination of __DRI2_FENCE_FLAG_xxx flags
* \param timeout the timeout in ns or __DRI2_FENCE_TIMEOUT_INFINITE
*/
GLboolean (*client_wait_sync)(__DRIcontext *ctx, void *fence,
unsigned flags, uint64_t timeout);
/**
* This function enqueues a wait command into the command stream of
* the context and then returns. When the execution reaches the wait
* command, no further execution will be done in the context until
* the fence is signaled. This is a no-op if the device doesn't support
* parallel execution of contexts.
*
* \param ctx the context where the waiting is done
* \param fence the fence
* \param flags a combination of __DRI2_FENCE_FLAG_xxx flags that make
* sense with this function (right now there are none)
*/
void (*server_wait_sync)(__DRIcontext *ctx, void *fence, unsigned flags);
};
/*@}*/
/**
@@ -1005,7 +1066,7 @@ struct __DRIdri2ExtensionRec {
* extensions.
*/
#define __DRI_IMAGE "DRI_IMAGE"
#define __DRI_IMAGE_VERSION 10
#define __DRI_IMAGE_VERSION 11
/**
* These formats correspond to the similarly named MESA_FORMAT_*
@@ -1096,6 +1157,8 @@ struct __DRIdri2ExtensionRec {
#define __DRI_IMAGE_ATTRIB_FD 0x2007 /* available in versions
* 7+. Each query will return a
* new fd. */
#define __DRI_IMAGE_ATTRIB_FOURCC 0x2008 /* available in versions 11 */
#define __DRI_IMAGE_ATTRIB_NUM_PLANES 0x2009 /* available in versions 11 */
enum __DRIYUVColorSpace {
__DRI_YUV_COLOR_SPACE_UNDEFINED = 0,
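
Reviewer note on the `__DRI2fenceExtensionRec` hunk above: the comments describe a create / wait-with-timeout / destroy lifecycle. The sketch below is only an illustration of that lifecycle under stated assumptions. Every `demo_` and `DEMO_` name is hypothetical, the fence is simulated by a countdown of nanoseconds, and the function signatures are simplified (the real entry points take `__DRIcontext` and `__DRIscreen` arguments, which are not modelled here).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins mirroring the values added in the hunk. */
#define DEMO_FENCE_TIMEOUT_INFINITE 0xffffffffffffffffull
#define DEMO_FENCE_FLAG_FLUSH_COMMANDS (1u << 0)

/* Simulated fence: "GPU work" is just a number of nanoseconds left. */
struct demo_fence {
    uint64_t ns_until_signaled;
    int flushed;
};

/* Simplified function table in the spirit of the extension struct. */
struct demo_fence_extension {
    void *(*create_fence)(uint64_t pending_ns);
    void (*destroy_fence)(void *fence);
    int (*client_wait_sync)(void *fence, unsigned flags, uint64_t timeout);
};

static void *demo_create_fence(uint64_t pending_ns)
{
    struct demo_fence *f = malloc(sizeof *f);
    if (f) {
        f->ns_until_signaled = pending_ns;
        f->flushed = 0;
    }
    return f;
}

static void demo_destroy_fence(void *fence)
{
    free(fence);
}

/* Returns 1 if the fence signals within the timeout, 0 if the timeout expires. */
static int demo_client_wait_sync(void *fence, unsigned flags, uint64_t timeout)
{
    struct demo_fence *f = fence;
    if (flags & DEMO_FENCE_FLAG_FLUSH_COMMANDS)
        f->flushed = 1; /* a real driver would flush the context here */
    if (timeout == DEMO_FENCE_TIMEOUT_INFINITE || timeout >= f->ns_until_signaled) {
        f->ns_until_signaled = 0;
        return 1;
    }
    f->ns_until_signaled -= timeout;
    return 0;
}

static const struct demo_fence_extension demo_fence_ext = {
    demo_create_fence, demo_destroy_fence, demo_client_wait_sync
};

/* Full lifecycle: 1 = signaled, 0 = timed out, -1 = fence creation failed. */
int demo_wait_for_work(uint64_t pending_ns, uint64_t timeout_ns)
{
    void *fence = demo_fence_ext.create_fence(pending_ns);
    if (!fence)
        return -1;
    int signaled = demo_fence_ext.client_wait_sync(
        fence, DEMO_FENCE_FLAG_FLUSH_COMMANDS, timeout_ns);
    demo_fence_ext.destroy_fence(fence);
    return signaled;
}
```

A caller that must not block forever would pass a finite timeout and retry or give up when the result is 0; passing the infinite sentinel always waits until the fence signals.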


@@ -41,10 +41,8 @@
* OSMesaGetIntegerv - return OSMesa state parameters
*
*
* The limits on the width and height of an image buffer are MAX_WIDTH and
* MAX_HEIGHT as defined in Mesa/src/config.h. Defaults are 1280 and 1024.
* You can increase them as needed but beware that many temporary arrays in
* Mesa are dimensioned by MAX_WIDTH or MAX_HEIGHT.
* The limits on the width and height of an image buffer can be retrieved
* via OSMesaGetIntegerv(OSMESA_MAX_WIDTH/OSMESA_MAX_HEIGHT).
*/


@@ -1,140 +0,0 @@
/*
* Mesa 3-D graphics library
* Copyright (C) 1995-1998 Brian Paul
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Library General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Library General Public License for more details.
*
* You should have received a copy of the GNU Library General Public
* License along with this library; if not, write to the Free
* Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*
*/
/*
* Windows driver by: Mark E. Peterson (markp@ic.mankato.mn.us)
* Updated by Li Wei (liwei@aiar.xjtu.edu.cn)
*
*
***************************************************************
* WMesa *
* version 2.3 *
* *
* By *
* Li Wei *
* Institute of Artificial Intelligence & Robotics *
* Xi'an Jiaotong University *
* Email: liwei@aiar.xjtu.edu.cn *
* Web page: http://sun.aiar.xjtu.edu.cn *
* *
* July 7th, 1997 *
***************************************************************
*/
#ifndef WMESA_H
#define WMESA_H
#ifdef __cplusplus
extern "C" {
#endif
#include "GL/gl.h"
#if defined(_MSV_VER) && !defined(__GNUC__)
# pragma warning (disable:4273)
# pragma warning( disable : 4244 ) /* '=' : conversion from 'const double ' to 'float ', possible loss of data */
# pragma warning( disable : 4018 ) /* '<' : signed/unsigned mismatch */
# pragma warning( disable : 4305 ) /* '=' : truncation from 'const double ' to 'float ' */
# pragma warning( disable : 4013 ) /* 'function' undefined; assuming extern returning int */
# pragma warning( disable : 4761 ) /* integral size mismatch in argument; conversion supplied */
# pragma warning( disable : 4273 ) /* 'identifier' : inconsistent DLL linkage. dllexport assumed */
# if (MESA_WARNQUIET>1)
# pragma warning( disable : 4146 ) /* unary minus operator applied to unsigned type, result still unsigned */
# endif
#endif
/*
* This is the WMesa context 'handle':
*/
typedef struct wmesa_context *WMesaContext;
/*
* Create a new WMesaContext for rendering into a window. You must
* have already created the window of correct visual type and with an
* appropriate colormap.
*
* Input:
* hDC - Windows device or memory context
* Pal - Palette to use
* rgb_flag - GL_TRUE = RGB mode,
* GL_FALSE = color index mode
* db_flag - GL_TRUE = double-buffered,
* GL_FALSE = single buffered
* alpha_flag - GL_TRUE = create software alpha buffer,
* GL_FALSE = no software alpha buffer
*
* Note: Indexed mode requires double buffering under Windows.
*
* Return: a WMesa_context or NULL if error.
*/
extern WMesaContext WMesaCreateContext(HDC hDC,HPALETTE* pPal,
GLboolean rgb_flag,
GLboolean db_flag,
GLboolean alpha_flag);
/*
* Destroy a rendering context as returned by WMesaCreateContext()
*/
extern void WMesaDestroyContext( WMesaContext ctx );
/*
* Make the specified context the current one.
*/
extern void WMesaMakeCurrent( WMesaContext ctx, HDC hdc );
/*
* Return a handle to the current context.
*/
extern WMesaContext WMesaGetCurrentContext( void );
/*
* Swap the front and back buffers for the current context. No action
* taken if the context is not double buffered.
*/
extern void WMesaSwapBuffers(HDC hdc);
/*
* In indexed color mode we need to know when the palette changes.
*/
extern void WMesaPaletteChange(HPALETTE Pal);
extern void WMesaMove(void);
void WMesaShareLists(WMesaContext ctx_to_share, WMesaContext ctx);
#ifdef __cplusplus
}
#endif
#endif


@@ -26,7 +26,7 @@
/* Khronos platform-specific types and definitions.
*
* $Revision: 9356 $ on $Date: 2009-10-21 02:52:25 -0700 (Wed, 21 Oct 2009) $
* $Revision: 23298 $ on $Date: 2013-09-30 17:07:13 -0700 (Mon, 30 Sep 2013) $
*
* Adopters may modify this file to suit their platform. Adopters are
* encouraged to submit platform specific modifications to the Khronos
@@ -106,9 +106,9 @@
#elif defined (__SYMBIAN32__)
# define KHRONOS_APICALL IMPORT_C
#elif (defined(__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__) >= 303) \
|| (defined(__SUNPRO_C) && (__SUNPRO_C >= 0x590))
|| (defined(__SUNPRO_C) && (__SUNPRO_C >= 0x590))
/* KHRONOS_APIATTRIBUTES is not used by the client API headers yet */
# define KHRONOS_APICALL __attribute__((visibility("default")))
# define KHRONOS_APICALL __attribute__((visibility("default")))
#else
# define KHRONOS_APICALL
#endif
@@ -229,10 +229,23 @@ typedef signed char khronos_int8_t;
typedef unsigned char khronos_uint8_t;
typedef signed short int khronos_int16_t;
typedef unsigned short int khronos_uint16_t;
/*
* Types that differ between LLP64 and LP64 architectures - in LLP64,
* pointers are 64 bits, but 'long' is still 32 bits. Win64 appears
* to be the only LLP64 architecture in current use.
*/
#ifdef _WIN64
typedef signed long long int khronos_intptr_t;
typedef unsigned long long int khronos_uintptr_t;
typedef signed long long int khronos_ssize_t;
typedef unsigned long long int khronos_usize_t;
#else
typedef signed long int khronos_intptr_t;
typedef unsigned long int khronos_uintptr_t;
typedef signed long int khronos_ssize_t;
typedef unsigned long int khronos_usize_t;
#endif
#if KHRONOS_SUPPORT_FLOAT
/*
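
Reviewer note on the LLP64 hunk above: the new comment explains that on Win64 `long` stays 32 bits while pointers are 64 bits, so the pointer-sized typedefs must switch to `long long`. A minimal sketch of the property this is meant to guarantee follows. The `demo_` names are hypothetical copies of the typedef selection, not the Khronos names, and the checks are only expected to hold on the usual ILP32, LP64 and LLP64 data models.

```c
#include <assert.h>
#include <stddef.h>

/* Same selection logic as the hunk, under hypothetical demo_ names. */
#if defined(_WIN64)
typedef signed long long int demo_intptr_t;
typedef unsigned long long int demo_uintptr_t;
#else
typedef signed long int demo_intptr_t;
typedef unsigned long int demo_uintptr_t;
#endif

/* Returns 1 when both integer typedefs are exactly as wide as a pointer. */
int demo_pointer_types_fit(void)
{
    return sizeof(demo_intptr_t) == sizeof(void *) &&
           sizeof(demo_uintptr_t) == sizeof(void *);
}

/* Returns 1 when a pointer survives a round trip through the integer type. */
int demo_round_trip_ok(void)
{
    int object = 42;
    demo_uintptr_t bits = (demo_uintptr_t)&object;
    int *back = (int *)bits;
    return back == &object && *back == 42;
}
```

Without the `_WIN64` branch, the `long`-based typedefs would be 4 bytes wide against 8-byte pointers on Win64, and the first check would fail there.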


@@ -1,746 +0,0 @@
/* $Revision: 9203 $ on $Date:: 2009-10-07 02:21:52 -0700 #$ */
/*------------------------------------------------------------------------
*
* OpenVG 1.1 Reference Implementation
* -------------------------------------
*
* Copyright (c) 2008 The Khronos Group Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and /or associated documentation files
* (the "Materials "), to deal in the Materials without restriction,
* including without limitation the rights to use, copy, modify, merge,
* publish, distribute, sublicense, and/or sell copies of the Materials,
* and to permit persons to whom the Materials are furnished to do so,
* subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Materials.
*
* THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE MATERIALS OR
* THE USE OR OTHER DEALINGS IN THE MATERIALS.
*
*//**
* \file
* \brief OpenVG 1.1 API.
*//*-------------------------------------------------------------------*/
#ifndef _OPENVG_H
#define _OPENVG_H
#include <VG/vgplatform.h>
#ifdef __cplusplus
extern "C" {
#endif
#define OPENVG_VERSION_1_0 1
#define OPENVG_VERSION_1_0_1 1
#define OPENVG_VERSION_1_1 2
#ifndef VG_MAXSHORT
#define VG_MAXSHORT 0x7FFF
#endif
#ifndef VG_MAXINT
#define VG_MAXINT 0x7FFFFFFF
#endif
#ifndef VG_MAX_ENUM
#define VG_MAX_ENUM 0x7FFFFFFF
#endif
typedef VGuint VGHandle;
typedef VGHandle VGPath;
typedef VGHandle VGImage;
typedef VGHandle VGMaskLayer;
typedef VGHandle VGFont;
typedef VGHandle VGPaint;
#define VG_INVALID_HANDLE ((VGHandle)0)
typedef enum {
VG_FALSE = 0,
VG_TRUE = 1,
VG_BOOLEAN_FORCE_SIZE = VG_MAX_ENUM
} VGboolean;
typedef enum {
VG_NO_ERROR = 0,
VG_BAD_HANDLE_ERROR = 0x1000,
VG_ILLEGAL_ARGUMENT_ERROR = 0x1001,
VG_OUT_OF_MEMORY_ERROR = 0x1002,
VG_PATH_CAPABILITY_ERROR = 0x1003,
VG_UNSUPPORTED_IMAGE_FORMAT_ERROR = 0x1004,
VG_UNSUPPORTED_PATH_FORMAT_ERROR = 0x1005,
VG_IMAGE_IN_USE_ERROR = 0x1006,
VG_NO_CONTEXT_ERROR = 0x1007,
VG_ERROR_CODE_FORCE_SIZE = VG_MAX_ENUM
} VGErrorCode;
typedef enum {
/* Mode settings */
VG_MATRIX_MODE = 0x1100,
VG_FILL_RULE = 0x1101,
VG_IMAGE_QUALITY = 0x1102,
VG_RENDERING_QUALITY = 0x1103,
VG_BLEND_MODE = 0x1104,
VG_IMAGE_MODE = 0x1105,
/* Scissoring rectangles */
VG_SCISSOR_RECTS = 0x1106,
/* Color Transformation */
VG_COLOR_TRANSFORM = 0x1170,
VG_COLOR_TRANSFORM_VALUES = 0x1171,
/* Stroke parameters */
VG_STROKE_LINE_WIDTH = 0x1110,
VG_STROKE_CAP_STYLE = 0x1111,
VG_STROKE_JOIN_STYLE = 0x1112,
VG_STROKE_MITER_LIMIT = 0x1113,
VG_STROKE_DASH_PATTERN = 0x1114,
VG_STROKE_DASH_PHASE = 0x1115,
VG_STROKE_DASH_PHASE_RESET = 0x1116,
/* Edge fill color for VG_TILE_FILL tiling mode */
VG_TILE_FILL_COLOR = 0x1120,
/* Color for vgClear */
VG_CLEAR_COLOR = 0x1121,
/* Glyph origin */
VG_GLYPH_ORIGIN = 0x1122,
/* Enable/disable alpha masking and scissoring */
VG_MASKING = 0x1130,
VG_SCISSORING = 0x1131,
/* Pixel layout information */
VG_PIXEL_LAYOUT = 0x1140,
VG_SCREEN_LAYOUT = 0x1141,
/* Source format selection for image filters */
VG_FILTER_FORMAT_LINEAR = 0x1150,
VG_FILTER_FORMAT_PREMULTIPLIED = 0x1151,
/* Destination write enable mask for image filters */
VG_FILTER_CHANNEL_MASK = 0x1152,
/* Implementation limits (read-only) */
VG_MAX_SCISSOR_RECTS = 0x1160,
VG_MAX_DASH_COUNT = 0x1161,
VG_MAX_KERNEL_SIZE = 0x1162,
VG_MAX_SEPARABLE_KERNEL_SIZE = 0x1163,
VG_MAX_COLOR_RAMP_STOPS = 0x1164,
VG_MAX_IMAGE_WIDTH = 0x1165,
VG_MAX_IMAGE_HEIGHT = 0x1166,
VG_MAX_IMAGE_PIXELS = 0x1167,
VG_MAX_IMAGE_BYTES = 0x1168,
VG_MAX_FLOAT = 0x1169,
VG_MAX_GAUSSIAN_STD_DEVIATION = 0x116A,
VG_PARAM_TYPE_FORCE_SIZE = VG_MAX_ENUM
} VGParamType;
typedef enum {
VG_RENDERING_QUALITY_NONANTIALIASED = 0x1200,
VG_RENDERING_QUALITY_FASTER = 0x1201,
VG_RENDERING_QUALITY_BETTER = 0x1202, /* Default */
VG_RENDERING_QUALITY_FORCE_SIZE = VG_MAX_ENUM
} VGRenderingQuality;
typedef enum {
VG_PIXEL_LAYOUT_UNKNOWN = 0x1300,
VG_PIXEL_LAYOUT_RGB_VERTICAL = 0x1301,
VG_PIXEL_LAYOUT_BGR_VERTICAL = 0x1302,
VG_PIXEL_LAYOUT_RGB_HORIZONTAL = 0x1303,
VG_PIXEL_LAYOUT_BGR_HORIZONTAL = 0x1304,
VG_PIXEL_LAYOUT_FORCE_SIZE = VG_MAX_ENUM
} VGPixelLayout;
typedef enum {
VG_MATRIX_PATH_USER_TO_SURFACE = 0x1400,
VG_MATRIX_IMAGE_USER_TO_SURFACE = 0x1401,
VG_MATRIX_FILL_PAINT_TO_USER = 0x1402,
VG_MATRIX_STROKE_PAINT_TO_USER = 0x1403,
VG_MATRIX_GLYPH_USER_TO_SURFACE = 0x1404,
VG_MATRIX_MODE_FORCE_SIZE = VG_MAX_ENUM
} VGMatrixMode;
typedef enum {
VG_CLEAR_MASK = 0x1500,
VG_FILL_MASK = 0x1501,
VG_SET_MASK = 0x1502,
VG_UNION_MASK = 0x1503,
VG_INTERSECT_MASK = 0x1504,
VG_SUBTRACT_MASK = 0x1505,
VG_MASK_OPERATION_FORCE_SIZE = VG_MAX_ENUM
} VGMaskOperation;
#define VG_PATH_FORMAT_STANDARD 0
typedef enum {
VG_PATH_DATATYPE_S_8 = 0,
VG_PATH_DATATYPE_S_16 = 1,
VG_PATH_DATATYPE_S_32 = 2,
VG_PATH_DATATYPE_F = 3,
VG_PATH_DATATYPE_FORCE_SIZE = VG_MAX_ENUM
} VGPathDatatype;
typedef enum {
VG_ABSOLUTE = 0,
VG_RELATIVE = 1,
VG_PATH_ABS_REL_FORCE_SIZE = VG_MAX_ENUM
} VGPathAbsRel;
typedef enum {
VG_CLOSE_PATH = ( 0 << 1),
VG_MOVE_TO = ( 1 << 1),
VG_LINE_TO = ( 2 << 1),
VG_HLINE_TO = ( 3 << 1),
VG_VLINE_TO = ( 4 << 1),
VG_QUAD_TO = ( 5 << 1),
VG_CUBIC_TO = ( 6 << 1),
VG_SQUAD_TO = ( 7 << 1),
VG_SCUBIC_TO = ( 8 << 1),
VG_SCCWARC_TO = ( 9 << 1),
VG_SCWARC_TO = (10 << 1),
VG_LCCWARC_TO = (11 << 1),
VG_LCWARC_TO = (12 << 1),
VG_PATH_SEGMENT_FORCE_SIZE = VG_MAX_ENUM
} VGPathSegment;
typedef enum {
VG_MOVE_TO_ABS = VG_MOVE_TO | VG_ABSOLUTE,
VG_MOVE_TO_REL = VG_MOVE_TO | VG_RELATIVE,
VG_LINE_TO_ABS = VG_LINE_TO | VG_ABSOLUTE,
VG_LINE_TO_REL = VG_LINE_TO | VG_RELATIVE,
VG_HLINE_TO_ABS = VG_HLINE_TO | VG_ABSOLUTE,
VG_HLINE_TO_REL = VG_HLINE_TO | VG_RELATIVE,
VG_VLINE_TO_ABS = VG_VLINE_TO | VG_ABSOLUTE,
VG_VLINE_TO_REL = VG_VLINE_TO | VG_RELATIVE,
VG_QUAD_TO_ABS = VG_QUAD_TO | VG_ABSOLUTE,
VG_QUAD_TO_REL = VG_QUAD_TO | VG_RELATIVE,
VG_CUBIC_TO_ABS = VG_CUBIC_TO | VG_ABSOLUTE,
VG_CUBIC_TO_REL = VG_CUBIC_TO | VG_RELATIVE,
VG_SQUAD_TO_ABS = VG_SQUAD_TO | VG_ABSOLUTE,
VG_SQUAD_TO_REL = VG_SQUAD_TO | VG_RELATIVE,
VG_SCUBIC_TO_ABS = VG_SCUBIC_TO | VG_ABSOLUTE,
VG_SCUBIC_TO_REL = VG_SCUBIC_TO | VG_RELATIVE,
VG_SCCWARC_TO_ABS = VG_SCCWARC_TO | VG_ABSOLUTE,
VG_SCCWARC_TO_REL = VG_SCCWARC_TO | VG_RELATIVE,
VG_SCWARC_TO_ABS = VG_SCWARC_TO | VG_ABSOLUTE,
VG_SCWARC_TO_REL = VG_SCWARC_TO | VG_RELATIVE,
VG_LCCWARC_TO_ABS = VG_LCCWARC_TO | VG_ABSOLUTE,
VG_LCCWARC_TO_REL = VG_LCCWARC_TO | VG_RELATIVE,
VG_LCWARC_TO_ABS = VG_LCWARC_TO | VG_ABSOLUTE,
VG_LCWARC_TO_REL = VG_LCWARC_TO | VG_RELATIVE,
VG_PATH_COMMAND_FORCE_SIZE = VG_MAX_ENUM
} VGPathCommand;
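
Reviewer note on the `VGPathSegment` / `VGPathCommand` enums above: each command is a segment type shifted left by one bit, OR-ed with `VG_ABSOLUTE` (0) or `VG_RELATIVE` (1) in bit 0. The sketch below shows packing and unpacking under that reading. The `DEMO_` constants are hypothetical copies of a few enum values, and the helper functions are illustrative only, not part of the OpenVG API.

```c
#include <assert.h>

/* Hypothetical copies of a few values from the enums above. */
#define DEMO_ABSOLUTE  0u
#define DEMO_RELATIVE  1u
#define DEMO_MOVE_TO   (1u << 1)
#define DEMO_LINE_TO   (2u << 1)
#define DEMO_LCWARC_TO (12u << 1)

/* Combine a segment type with the absolute/relative flag. */
unsigned demo_make_command(unsigned segment, unsigned abs_rel)
{
    return segment | abs_rel;
}

/* Recover the segment type by clearing bit 0. */
unsigned demo_segment_of(unsigned command)
{
    return command & ~1u;
}

/* Bit 0 set means the coordinates are relative. */
int demo_is_relative(unsigned command)
{
    return (command & 1u) != 0;
}
```

For example, `demo_make_command(DEMO_LINE_TO, DEMO_RELATIVE)` yields 5, matching `VG_LINE_TO_REL` as defined by `VG_LINE_TO | VG_RELATIVE`.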
typedef enum {
VG_PATH_CAPABILITY_APPEND_FROM = (1 << 0),
VG_PATH_CAPABILITY_APPEND_TO = (1 << 1),
VG_PATH_CAPABILITY_MODIFY = (1 << 2),
VG_PATH_CAPABILITY_TRANSFORM_FROM = (1 << 3),
VG_PATH_CAPABILITY_TRANSFORM_TO = (1 << 4),
VG_PATH_CAPABILITY_INTERPOLATE_FROM = (1 << 5),
VG_PATH_CAPABILITY_INTERPOLATE_TO = (1 << 6),
VG_PATH_CAPABILITY_PATH_LENGTH = (1 << 7),
VG_PATH_CAPABILITY_POINT_ALONG_PATH = (1 << 8),
VG_PATH_CAPABILITY_TANGENT_ALONG_PATH = (1 << 9),
VG_PATH_CAPABILITY_PATH_BOUNDS = (1 << 10),
VG_PATH_CAPABILITY_PATH_TRANSFORMED_BOUNDS = (1 << 11),
VG_PATH_CAPABILITY_ALL = (1 << 12) - 1,
VG_PATH_CAPABILITIES_FORCE_SIZE = VG_MAX_ENUM
} VGPathCapabilities;
typedef enum {
VG_PATH_FORMAT = 0x1600,
VG_PATH_DATATYPE = 0x1601,
VG_PATH_SCALE = 0x1602,
VG_PATH_BIAS = 0x1603,
VG_PATH_NUM_SEGMENTS = 0x1604,
VG_PATH_NUM_COORDS = 0x1605,
VG_PATH_PARAM_TYPE_FORCE_SIZE = VG_MAX_ENUM
} VGPathParamType;
typedef enum {
VG_CAP_BUTT = 0x1700,
VG_CAP_ROUND = 0x1701,
VG_CAP_SQUARE = 0x1702,
VG_CAP_STYLE_FORCE_SIZE = VG_MAX_ENUM
} VGCapStyle;
typedef enum {
VG_JOIN_MITER = 0x1800,
VG_JOIN_ROUND = 0x1801,
VG_JOIN_BEVEL = 0x1802,
VG_JOIN_STYLE_FORCE_SIZE = VG_MAX_ENUM
} VGJoinStyle;
typedef enum {
VG_EVEN_ODD = 0x1900,
VG_NON_ZERO = 0x1901,
VG_FILL_RULE_FORCE_SIZE = VG_MAX_ENUM
} VGFillRule;
typedef enum {
VG_STROKE_PATH = (1 << 0),
VG_FILL_PATH = (1 << 1),
VG_PAINT_MODE_FORCE_SIZE = VG_MAX_ENUM
} VGPaintMode;
typedef enum {
/* Color paint parameters */
VG_PAINT_TYPE = 0x1A00,
VG_PAINT_COLOR = 0x1A01,
VG_PAINT_COLOR_RAMP_SPREAD_MODE = 0x1A02,
VG_PAINT_COLOR_RAMP_PREMULTIPLIED = 0x1A07,
VG_PAINT_COLOR_RAMP_STOPS = 0x1A03,
/* Linear gradient paint parameters */
VG_PAINT_LINEAR_GRADIENT = 0x1A04,
/* Radial gradient paint parameters */
VG_PAINT_RADIAL_GRADIENT = 0x1A05,
/* Pattern paint parameters */
VG_PAINT_PATTERN_TILING_MODE = 0x1A06,
VG_PAINT_PARAM_TYPE_FORCE_SIZE = VG_MAX_ENUM
} VGPaintParamType;
typedef enum {
VG_PAINT_TYPE_COLOR = 0x1B00,
VG_PAINT_TYPE_LINEAR_GRADIENT = 0x1B01,
VG_PAINT_TYPE_RADIAL_GRADIENT = 0x1B02,
VG_PAINT_TYPE_PATTERN = 0x1B03,
VG_PAINT_TYPE_FORCE_SIZE = VG_MAX_ENUM
} VGPaintType;
typedef enum {
VG_COLOR_RAMP_SPREAD_PAD = 0x1C00,
VG_COLOR_RAMP_SPREAD_REPEAT = 0x1C01,
VG_COLOR_RAMP_SPREAD_REFLECT = 0x1C02,
VG_COLOR_RAMP_SPREAD_MODE_FORCE_SIZE = VG_MAX_ENUM
} VGColorRampSpreadMode;
typedef enum {
VG_TILE_FILL = 0x1D00,
VG_TILE_PAD = 0x1D01,
VG_TILE_REPEAT = 0x1D02,
VG_TILE_REFLECT = 0x1D03,
VG_TILING_MODE_FORCE_SIZE = VG_MAX_ENUM
} VGTilingMode;
typedef enum {
/* RGB{A,X} channel ordering */
VG_sRGBX_8888 = 0,
VG_sRGBA_8888 = 1,
VG_sRGBA_8888_PRE = 2,
VG_sRGB_565 = 3,
VG_sRGBA_5551 = 4,
VG_sRGBA_4444 = 5,
VG_sL_8 = 6,
VG_lRGBX_8888 = 7,
VG_lRGBA_8888 = 8,
VG_lRGBA_8888_PRE = 9,
VG_lL_8 = 10,
VG_A_8 = 11,
VG_BW_1 = 12,
VG_A_1 = 13,
VG_A_4 = 14,
/* {A,X}RGB channel ordering */
VG_sXRGB_8888 = 0 | (1 << 6),
VG_sARGB_8888 = 1 | (1 << 6),
VG_sARGB_8888_PRE = 2 | (1 << 6),
VG_sARGB_1555 = 4 | (1 << 6),
VG_sARGB_4444 = 5 | (1 << 6),
VG_lXRGB_8888 = 7 | (1 << 6),
VG_lARGB_8888 = 8 | (1 << 6),
VG_lARGB_8888_PRE = 9 | (1 << 6),
/* BGR{A,X} channel ordering */
VG_sBGRX_8888 = 0 | (1 << 7),
VG_sBGRA_8888 = 1 | (1 << 7),
VG_sBGRA_8888_PRE = 2 | (1 << 7),
VG_sBGR_565 = 3 | (1 << 7),
VG_sBGRA_5551 = 4 | (1 << 7),
VG_sBGRA_4444 = 5 | (1 << 7),
VG_lBGRX_8888 = 7 | (1 << 7),
VG_lBGRA_8888 = 8 | (1 << 7),
VG_lBGRA_8888_PRE = 9 | (1 << 7),
/* {A,X}BGR channel ordering */
VG_sXBGR_8888 = 0 | (1 << 6) | (1 << 7),
VG_sABGR_8888 = 1 | (1 << 6) | (1 << 7),
VG_sABGR_8888_PRE = 2 | (1 << 6) | (1 << 7),
VG_sABGR_1555 = 4 | (1 << 6) | (1 << 7),
VG_sABGR_4444 = 5 | (1 << 6) | (1 << 7),
VG_lXBGR_8888 = 7 | (1 << 6) | (1 << 7),
VG_lABGR_8888 = 8 | (1 << 6) | (1 << 7),
VG_lABGR_8888_PRE = 9 | (1 << 6) | (1 << 7),
VG_IMAGE_FORMAT_FORCE_SIZE = VG_MAX_ENUM
} VGImageFormat;
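
Reviewer note on the `VGImageFormat` enum above: the values suggest that the low bits select a base format while bits 6 and 7 select the channel ordering (neither bit for RGB{A,X}, bit 6 for {A,X}RGB, bit 7 for BGR{A,X}, both for {A,X}BGR). The sketch below decodes a format value on that assumption. The bit names and the `demo_channel_order` type are hypothetical labels introduced here, since the header does not name these bits.

```c
#include <assert.h>

/* Hypothetical names for the two ordering bits used in the enum above. */
#define DEMO_ORDER_ALPHA_FIRST (1u << 6)
#define DEMO_ORDER_BGR         (1u << 7)

typedef enum {
    DEMO_RGBA, /* RGB{A,X} */
    DEMO_ARGB, /* {A,X}RGB */
    DEMO_BGRA, /* BGR{A,X} */
    DEMO_ABGR  /* {A,X}BGR */
} demo_channel_order;

/* Strip the ordering bits to get the base format index (0..14 in the enum). */
unsigned demo_base_format(unsigned format)
{
    return format & ~(DEMO_ORDER_ALPHA_FIRST | DEMO_ORDER_BGR);
}

/* Classify the channel ordering from bits 6 and 7. */
demo_channel_order demo_order_of(unsigned format)
{
    int alpha_first = (format & DEMO_ORDER_ALPHA_FIRST) != 0;
    int bgr = (format & DEMO_ORDER_BGR) != 0;
    if (bgr)
        return alpha_first ? DEMO_ABGR : DEMO_BGRA;
    return alpha_first ? DEMO_ARGB : DEMO_RGBA;
}
```

With these helpers, the value `1 | (1 << 6)` (listed above as `VG_sARGB_8888`) decodes to base format 1 with `DEMO_ARGB` ordering.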
typedef enum {
VG_IMAGE_QUALITY_NONANTIALIASED = (1 << 0),
VG_IMAGE_QUALITY_FASTER = (1 << 1),
VG_IMAGE_QUALITY_BETTER = (1 << 2),
VG_IMAGE_QUALITY_FORCE_SIZE = VG_MAX_ENUM
} VGImageQuality;
typedef enum {
VG_IMAGE_FORMAT = 0x1E00,
VG_IMAGE_WIDTH = 0x1E01,
VG_IMAGE_HEIGHT = 0x1E02,
VG_IMAGE_PARAM_TYPE_FORCE_SIZE = VG_MAX_ENUM
} VGImageParamType;
typedef enum {
VG_DRAW_IMAGE_NORMAL = 0x1F00,
VG_DRAW_IMAGE_MULTIPLY = 0x1F01,
VG_DRAW_IMAGE_STENCIL = 0x1F02,
VG_IMAGE_MODE_FORCE_SIZE = VG_MAX_ENUM
} VGImageMode;
typedef enum {
VG_RED = (1 << 3),
VG_GREEN = (1 << 2),
VG_BLUE = (1 << 1),
VG_ALPHA = (1 << 0),
VG_IMAGE_CHANNEL_FORCE_SIZE = VG_MAX_ENUM
} VGImageChannel;
typedef enum {
VG_BLEND_SRC = 0x2000,
VG_BLEND_SRC_OVER = 0x2001,
VG_BLEND_DST_OVER = 0x2002,
VG_BLEND_SRC_IN = 0x2003,
VG_BLEND_DST_IN = 0x2004,
VG_BLEND_MULTIPLY = 0x2005,
VG_BLEND_SCREEN = 0x2006,
VG_BLEND_DARKEN = 0x2007,
VG_BLEND_LIGHTEN = 0x2008,
VG_BLEND_ADDITIVE = 0x2009,
VG_BLEND_MODE_FORCE_SIZE = VG_MAX_ENUM
} VGBlendMode;
typedef enum {
VG_FONT_NUM_GLYPHS = 0x2F00,
VG_FONT_PARAM_TYPE_FORCE_SIZE = VG_MAX_ENUM
} VGFontParamType;
typedef enum {
VG_IMAGE_FORMAT_QUERY = 0x2100,
VG_PATH_DATATYPE_QUERY = 0x2101,
VG_HARDWARE_QUERY_TYPE_FORCE_SIZE = VG_MAX_ENUM
} VGHardwareQueryType;
typedef enum {
VG_HARDWARE_ACCELERATED = 0x2200,
VG_HARDWARE_UNACCELERATED = 0x2201,
VG_HARDWARE_QUERY_RESULT_FORCE_SIZE = VG_MAX_ENUM
} VGHardwareQueryResult;
typedef enum {
VG_VENDOR = 0x2300,
VG_RENDERER = 0x2301,
VG_VERSION = 0x2302,
VG_EXTENSIONS = 0x2303,
VG_STRING_ID_FORCE_SIZE = VG_MAX_ENUM
} VGStringID;
/* Function Prototypes */
#ifndef VG_API_CALL
# error VG_API_CALL must be defined
#endif
#ifndef VG_API_ENTRY
# error VG_API_ENTRY must be defined
#endif
#ifndef VG_API_EXIT
# error VG_API_EXIT must be defined
#endif
VG_API_CALL VGErrorCode VG_API_ENTRY vgGetError(void) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgFlush(void) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgFinish(void) VG_API_EXIT;
/* Getters and Setters */
VG_API_CALL void VG_API_ENTRY vgSetf (VGParamType type, VGfloat value) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgSeti (VGParamType type, VGint value) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgSetfv(VGParamType type, VGint count,
const VGfloat * values) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgSetiv(VGParamType type, VGint count,
const VGint * values) VG_API_EXIT;
VG_API_CALL VGfloat VG_API_ENTRY vgGetf(VGParamType type) VG_API_EXIT;
VG_API_CALL VGint VG_API_ENTRY vgGeti(VGParamType type) VG_API_EXIT;
VG_API_CALL VGint VG_API_ENTRY vgGetVectorSize(VGParamType type) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgGetfv(VGParamType type, VGint count, VGfloat * values) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgGetiv(VGParamType type, VGint count, VGint * values) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgSetParameterf(VGHandle object,
VGint paramType,
VGfloat value) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgSetParameteri(VGHandle object,
VGint paramType,
VGint value) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgSetParameterfv(VGHandle object,
VGint paramType,
VGint count, const VGfloat * values) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgSetParameteriv(VGHandle object,
VGint paramType,
VGint count, const VGint * values) VG_API_EXIT;
VG_API_CALL VGfloat VG_API_ENTRY vgGetParameterf(VGHandle object,
VGint paramType) VG_API_EXIT;
VG_API_CALL VGint VG_API_ENTRY vgGetParameteri(VGHandle object,
VGint paramType);
VG_API_CALL VGint VG_API_ENTRY vgGetParameterVectorSize(VGHandle object,
VGint paramType) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgGetParameterfv(VGHandle object,
VGint paramType,
VGint count, VGfloat * values) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgGetParameteriv(VGHandle object,
VGint paramType,
VGint count, VGint * values) VG_API_EXIT;
/* Matrix Manipulation */
VG_API_CALL void VG_API_ENTRY vgLoadIdentity(void) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgLoadMatrix(const VGfloat * m) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgGetMatrix(VGfloat * m) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgMultMatrix(const VGfloat * m) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgTranslate(VGfloat tx, VGfloat ty) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgScale(VGfloat sx, VGfloat sy) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgShear(VGfloat shx, VGfloat shy) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgRotate(VGfloat angle) VG_API_EXIT;
/* Masking and Clearing */
VG_API_CALL void VG_API_ENTRY vgMask(VGHandle mask, VGMaskOperation operation,
VGint x, VGint y,
VGint width, VGint height) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgRenderToMask(VGPath path,
VGbitfield paintModes,
VGMaskOperation operation) VG_API_EXIT;
VG_API_CALL VGMaskLayer VG_API_ENTRY vgCreateMaskLayer(VGint width, VGint height) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgDestroyMaskLayer(VGMaskLayer maskLayer) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgFillMaskLayer(VGMaskLayer maskLayer,
VGint x, VGint y,
VGint width, VGint height,
VGfloat value) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgCopyMask(VGMaskLayer maskLayer,
VGint dx, VGint dy,
VGint sx, VGint sy,
VGint width, VGint height) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgClear(VGint x, VGint y, VGint width, VGint height) VG_API_EXIT;
/* Paths */
VG_API_CALL VGPath VG_API_ENTRY vgCreatePath(VGint pathFormat,
VGPathDatatype datatype,
VGfloat scale, VGfloat bias,
VGint segmentCapacityHint,
VGint coordCapacityHint,
VGbitfield capabilities) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgClearPath(VGPath path, VGbitfield capabilities) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgDestroyPath(VGPath path) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgRemovePathCapabilities(VGPath path,
VGbitfield capabilities) VG_API_EXIT;
VG_API_CALL VGbitfield VG_API_ENTRY vgGetPathCapabilities(VGPath path) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgAppendPath(VGPath dstPath, VGPath srcPath) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgAppendPathData(VGPath dstPath,
VGint numSegments,
const VGubyte * pathSegments,
const void * pathData) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgModifyPathCoords(VGPath dstPath, VGint startIndex,
VGint numSegments,
const void * pathData) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgTransformPath(VGPath dstPath, VGPath srcPath) VG_API_EXIT;
VG_API_CALL VGboolean VG_API_ENTRY vgInterpolatePath(VGPath dstPath,
VGPath startPath,
VGPath endPath,
VGfloat amount) VG_API_EXIT;
VG_API_CALL VGfloat VG_API_ENTRY vgPathLength(VGPath path,
VGint startSegment, VGint numSegments) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgPointAlongPath(VGPath path,
VGint startSegment, VGint numSegments,
VGfloat distance,
VGfloat * x, VGfloat * y,
VGfloat * tangentX, VGfloat * tangentY) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgPathBounds(VGPath path,
VGfloat * minX, VGfloat * minY,
VGfloat * width, VGfloat * height) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgPathTransformedBounds(VGPath path,
VGfloat * minX, VGfloat * minY,
VGfloat * width, VGfloat * height) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgDrawPath(VGPath path, VGbitfield paintModes) VG_API_EXIT;
/* Paint */
VG_API_CALL VGPaint VG_API_ENTRY vgCreatePaint(void) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgDestroyPaint(VGPaint paint) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgSetPaint(VGPaint paint, VGbitfield paintModes) VG_API_EXIT;
VG_API_CALL VGPaint VG_API_ENTRY vgGetPaint(VGPaintMode paintMode) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgSetColor(VGPaint paint, VGuint rgba) VG_API_EXIT;
VG_API_CALL VGuint VG_API_ENTRY vgGetColor(VGPaint paint) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgPaintPattern(VGPaint paint, VGImage pattern) VG_API_EXIT;
/* Images */
VG_API_CALL VGImage VG_API_ENTRY vgCreateImage(VGImageFormat format,
VGint width, VGint height,
VGbitfield allowedQuality) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgDestroyImage(VGImage image) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgClearImage(VGImage image,
VGint x, VGint y, VGint width, VGint height) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgImageSubData(VGImage image,
const void * data, VGint dataStride,
VGImageFormat dataFormat,
VGint x, VGint y, VGint width, VGint height) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgGetImageSubData(VGImage image,
void * data, VGint dataStride,
VGImageFormat dataFormat,
VGint x, VGint y,
VGint width, VGint height) VG_API_EXIT;
VG_API_CALL VGImage VG_API_ENTRY vgChildImage(VGImage parent,
VGint x, VGint y, VGint width, VGint height) VG_API_EXIT;
VG_API_CALL VGImage VG_API_ENTRY vgGetParent(VGImage image) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgCopyImage(VGImage dst, VGint dx, VGint dy,
VGImage src, VGint sx, VGint sy,
VGint width, VGint height,
VGboolean dither) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgDrawImage(VGImage image) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgSetPixels(VGint dx, VGint dy,
VGImage src, VGint sx, VGint sy,
VGint width, VGint height) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgWritePixels(const void * data, VGint dataStride,
VGImageFormat dataFormat,
VGint dx, VGint dy,
VGint width, VGint height) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgGetPixels(VGImage dst, VGint dx, VGint dy,
VGint sx, VGint sy,
VGint width, VGint height) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgReadPixels(void * data, VGint dataStride,
VGImageFormat dataFormat,
VGint sx, VGint sy,
VGint width, VGint height) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgCopyPixels(VGint dx, VGint dy,
VGint sx, VGint sy,
VGint width, VGint height) VG_API_EXIT;
/* Text */
VG_API_CALL VGFont VG_API_ENTRY vgCreateFont(VGint glyphCapacityHint) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgDestroyFont(VGFont font) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgSetGlyphToPath(VGFont font,
VGuint glyphIndex,
VGPath path,
VGboolean isHinted,
const VGfloat glyphOrigin [2],
const VGfloat escapement[2]) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgSetGlyphToImage(VGFont font,
VGuint glyphIndex,
VGImage image,
const VGfloat glyphOrigin [2],
const VGfloat escapement[2]) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgClearGlyph(VGFont font,VGuint glyphIndex) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgDrawGlyph(VGFont font,
VGuint glyphIndex,
VGbitfield paintModes,
VGboolean allowAutoHinting) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgDrawGlyphs(VGFont font,
VGint glyphCount,
const VGuint *glyphIndices,
const VGfloat *adjustments_x,
const VGfloat *adjustments_y,
VGbitfield paintModes,
VGboolean allowAutoHinting) VG_API_EXIT;
/* Image Filters */
VG_API_CALL void VG_API_ENTRY vgColorMatrix(VGImage dst, VGImage src,
const VGfloat * matrix) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgConvolve(VGImage dst, VGImage src,
VGint kernelWidth, VGint kernelHeight,
VGint shiftX, VGint shiftY,
const VGshort * kernel,
VGfloat scale,
VGfloat bias,
VGTilingMode tilingMode) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgSeparableConvolve(VGImage dst, VGImage src,
VGint kernelWidth,
VGint kernelHeight,
VGint shiftX, VGint shiftY,
const VGshort * kernelX,
const VGshort * kernelY,
VGfloat scale,
VGfloat bias,
VGTilingMode tilingMode) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgGaussianBlur(VGImage dst, VGImage src,
VGfloat stdDeviationX,
VGfloat stdDeviationY,
VGTilingMode tilingMode) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgLookup(VGImage dst, VGImage src,
const VGubyte * redLUT,
const VGubyte * greenLUT,
const VGubyte * blueLUT,
const VGubyte * alphaLUT,
VGboolean outputLinear,
VGboolean outputPremultiplied) VG_API_EXIT;
VG_API_CALL void VG_API_ENTRY vgLookupSingle(VGImage dst, VGImage src,
const VGuint * lookupTable,
VGImageChannel sourceChannel,
VGboolean outputLinear,
VGboolean outputPremultiplied) VG_API_EXIT;
/* Hardware Queries */
VG_API_CALL VGHardwareQueryResult VG_API_ENTRY vgHardwareQuery(VGHardwareQueryType key,
VGint setting) VG_API_EXIT;
/* Renderer and Extension Information */
VG_API_CALL const VGubyte * VG_API_ENTRY vgGetString(VGStringID name) VG_API_EXIT;
#ifdef __cplusplus
} /* extern "C" */
#endif
#endif /* _OPENVG_H */

@@ -1,233 +0,0 @@
/* $Revision: 6810 $ on $Date:: 2008-10-29 07:31:37 -0700 #$ */
/*------------------------------------------------------------------------
*
* VG extensions Reference Implementation
* -------------------------------------
*
* Copyright (c) 2008 The Khronos Group Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and /or associated documentation files
* (the "Materials "), to deal in the Materials without restriction,
* including without limitation the rights to use, copy, modify, merge,
* publish, distribute, sublicense, and/or sell copies of the Materials,
* and to permit persons to whom the Materials are furnished to do so,
* subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Materials.
*
* THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE MATERIALS OR
* THE USE OR OTHER DEALINGS IN THE MATERIALS.
*
*//**
* \file
* \brief VG extensions
*//*-------------------------------------------------------------------*/
#ifndef _VGEXT_H
#define _VGEXT_H
#ifdef __cplusplus
extern "C" {
#endif
#include <VG/openvg.h>
#include <VG/vgu.h>
#ifndef VG_API_ENTRYP
# define VG_API_ENTRYP VG_API_ENTRY*
#endif
#ifndef VGU_API_ENTRYP
# define VGU_API_ENTRYP VGU_API_ENTRY*
#endif
/*-------------------------------------------------------------------------------
* KHR extensions
*------------------------------------------------------------------------------*/
typedef enum {
#ifndef VG_KHR_iterative_average_blur
VG_MAX_AVERAGE_BLUR_DIMENSION_KHR = 0x116B,
VG_AVERAGE_BLUR_DIMENSION_RESOLUTION_KHR = 0x116C,
VG_MAX_AVERAGE_BLUR_ITERATIONS_KHR = 0x116D,
#endif
VG_PARAM_TYPE_KHR_FORCE_SIZE = VG_MAX_ENUM
} VGParamTypeKHR;
#ifndef VG_KHR_EGL_image
#define VG_KHR_EGL_image 1
/* VGeglImageKHR is an opaque handle to an EGLImage */
typedef void* VGeglImageKHR;
#ifdef VG_VGEXT_PROTOTYPES
VG_API_CALL VGImage VG_API_ENTRY vgCreateEGLImageTargetKHR(VGeglImageKHR image);
#endif
typedef VGImage (VG_API_ENTRYP PFNVGCREATEEGLIMAGETARGETKHRPROC) (VGeglImageKHR image);
#endif
#ifndef VG_KHR_iterative_average_blur
#define VG_KHR_iterative_average_blur 1
#ifdef VG_VGEXT_PROTOTYPES
VG_API_CALL void vgIterativeAverageBlurKHR(VGImage dst,VGImage src,VGfloat dimX,VGfloat dimY,VGuint iterative,VGTilingMode tilingMode);
#endif
typedef void (VG_API_ENTRYP PFNVGITERATIVEAVERAGEBLURKHRPROC) (VGImage dst,VGImage src,VGfloat dimX,VGfloat dimY,VGuint iterative,VGTilingMode tilingMode);
#endif
#ifndef VG_KHR_advanced_blending
#define VG_KHR_advanced_blending 1
typedef enum {
VG_BLEND_OVERLAY_KHR = 0x2010,
VG_BLEND_HARDLIGHT_KHR = 0x2011,
VG_BLEND_SOFTLIGHT_SVG_KHR = 0x2012,
VG_BLEND_SOFTLIGHT_KHR = 0x2013,
VG_BLEND_COLORDODGE_KHR = 0x2014,
VG_BLEND_COLORBURN_KHR = 0x2015,
VG_BLEND_DIFFERENCE_KHR = 0x2016,
VG_BLEND_SUBTRACT_KHR = 0x2017,
VG_BLEND_INVERT_KHR = 0x2018,
VG_BLEND_EXCLUSION_KHR = 0x2019,
VG_BLEND_LINEARDODGE_KHR = 0x201a,
VG_BLEND_LINEARBURN_KHR = 0x201b,
VG_BLEND_VIVIDLIGHT_KHR = 0x201c,
VG_BLEND_LINEARLIGHT_KHR = 0x201d,
VG_BLEND_PINLIGHT_KHR = 0x201e,
VG_BLEND_HARDMIX_KHR = 0x201f,
VG_BLEND_CLEAR_KHR = 0x2020,
VG_BLEND_DST_KHR = 0x2021,
VG_BLEND_SRC_OUT_KHR = 0x2022,
VG_BLEND_DST_OUT_KHR = 0x2023,
VG_BLEND_SRC_ATOP_KHR = 0x2024,
VG_BLEND_DST_ATOP_KHR = 0x2025,
VG_BLEND_XOR_KHR = 0x2026,
VG_BLEND_MODE_KHR_FORCE_SIZE= VG_MAX_ENUM
} VGBlendModeKHR;
#endif
#ifndef VG_KHR_parametric_filter
#define VG_KHR_parametric_filter 1
typedef enum {
VG_PF_OBJECT_VISIBLE_FLAG_KHR = (1 << 0),
VG_PF_KNOCKOUT_FLAG_KHR = (1 << 1),
VG_PF_OUTER_FLAG_KHR = (1 << 2),
VG_PF_INNER_FLAG_KHR = (1 << 3),
VG_PF_TYPE_KHR_FORCE_SIZE = VG_MAX_ENUM
} VGPfTypeKHR;
typedef enum {
VGU_IMAGE_IN_USE_ERROR = 0xF010,
VGU_ERROR_CODE_KHR_FORCE_SIZE = VG_MAX_ENUM
} VGUErrorCodeKHR;
#ifdef VG_VGEXT_PROTOTYPES
VG_API_CALL void VG_API_ENTRY vgParametricFilterKHR(VGImage dst,VGImage src,VGImage blur,VGfloat strength,VGfloat offsetX,VGfloat offsetY,VGbitfield filterFlags,VGPaint highlightPaint,VGPaint shadowPaint);
VGU_API_CALL VGUErrorCode VGU_API_ENTRY vguDropShadowKHR(VGImage dst,VGImage src,VGfloat dimX,VGfloat dimY,VGuint iterative,VGfloat strength,VGfloat distance,VGfloat angle,VGbitfield filterFlags,VGbitfield allowedQuality,VGuint shadowColorRGBA);
VGU_API_CALL VGUErrorCode VGU_API_ENTRY vguGlowKHR(VGImage dst,VGImage src,VGfloat dimX,VGfloat dimY,VGuint iterative,VGfloat strength,VGbitfield filterFlags,VGbitfield allowedQuality,VGuint glowColorRGBA) ;
VGU_API_CALL VGUErrorCode VGU_API_ENTRY vguBevelKHR(VGImage dst,VGImage src,VGfloat dimX,VGfloat dimY,VGuint iterative,VGfloat strength,VGfloat distance,VGfloat angle,VGbitfield filterFlags,VGbitfield allowedQuality,VGuint highlightColorRGBA,VGuint shadowColorRGBA);
VGU_API_CALL VGUErrorCode VGU_API_ENTRY vguGradientGlowKHR(VGImage dst,VGImage src,VGfloat dimX,VGfloat dimY,VGuint iterative,VGfloat strength,VGfloat distance,VGfloat angle,VGbitfield filterFlags,VGbitfield allowedQuality,VGuint stopsCount,const VGfloat* glowColorRampStops);
VGU_API_CALL VGUErrorCode VGU_API_ENTRY vguGradientBevelKHR(VGImage dst,VGImage src,VGfloat dimX,VGfloat dimY,VGuint iterative,VGfloat strength,VGfloat distance,VGfloat angle,VGbitfield filterFlags,VGbitfield allowedQuality,VGuint stopsCount,const VGfloat* bevelColorRampStops);
#endif
typedef void (VG_API_ENTRYP PFNVGPARAMETRICFILTERKHRPROC) (VGImage dst,VGImage src,VGImage blur,VGfloat strength,VGfloat offsetX,VGfloat offsetY,VGbitfield filterFlags,VGPaint highlightPaint,VGPaint shadowPaint);
typedef VGUErrorCode (VGU_API_ENTRYP PFNVGUDROPSHADOWKHRPROC) (VGImage dst,VGImage src,VGfloat dimX,VGfloat dimY,VGuint iterative,VGfloat strength,VGfloat distance,VGfloat angle,VGbitfield filterFlags,VGbitfield allowedQuality,VGuint shadowColorRGBA);
typedef VGUErrorCode (VGU_API_ENTRYP PFNVGUGLOWKHRPROC) (VGImage dst,VGImage src,VGfloat dimX,VGfloat dimY,VGuint iterative,VGfloat strength,VGbitfield filterFlags,VGbitfield allowedQuality,VGuint glowColorRGBA);
typedef VGUErrorCode (VGU_API_ENTRYP PFNVGUBEVELKHRPROC) (VGImage dst,VGImage src,VGfloat dimX,VGfloat dimY,VGuint iterative,VGfloat strength,VGfloat distance,VGfloat angle,VGbitfield filterFlags,VGbitfield allowedQuality,VGuint highlightColorRGBA,VGuint shadowColorRGBA);
typedef VGUErrorCode (VGU_API_ENTRYP PFNVGUGRADIENTGLOWKHRPROC) (VGImage dst,VGImage src,VGfloat dimX,VGfloat dimY,VGuint iterative,VGfloat strength,VGfloat distance,VGfloat angle,VGbitfield filterFlags,VGbitfield allowedQuality,VGuint stopsCount,const VGfloat* glowColorRampStops);
typedef VGUErrorCode (VGU_API_ENTRYP PFNVGUGRADIENTBEVELKHRPROC) (VGImage dst,VGImage src,VGfloat dimX,VGfloat dimY,VGuint iterative,VGfloat strength,VGfloat distance,VGfloat angle,VGbitfield filterFlags,VGbitfield allowedQuality,VGuint stopsCount,const VGfloat* bevelColorRampStops);
#endif
/*-------------------------------------------------------------------------------
* NDS extensions
*------------------------------------------------------------------------------*/
#ifndef VG_NDS_paint_generation
#define VG_NDS_paint_generation 1
typedef enum {
VG_PAINT_COLOR_RAMP_LINEAR_NDS = 0x1A10,
VG_COLOR_MATRIX_NDS = 0x1A11,
VG_PAINT_COLOR_TRANSFORM_LINEAR_NDS = 0x1A12,
VG_PAINT_PARAM_TYPE_NDS_FORCE_SIZE = VG_MAX_ENUM
} VGPaintParamTypeNds;
typedef enum {
VG_DRAW_IMAGE_COLOR_MATRIX_NDS = 0x1F10,
VG_IMAGE_MODE_NDS_FORCE_SIZE = VG_MAX_ENUM
} VGImageModeNds;
#endif
#ifndef VG_NDS_projective_geometry
#define VG_NDS_projective_geometry 1
typedef enum {
VG_CLIP_MODE_NDS = 0x1180,
VG_CLIP_LINES_NDS = 0x1181,
VG_MAX_CLIP_LINES_NDS = 0x1182,
VG_PARAM_TYPE_NDS_FORCE_SIZE = VG_MAX_ENUM
} VGParamTypeNds;
typedef enum {
VG_CLIPMODE_NONE_NDS = 0x3000,
VG_CLIPMODE_CLIP_CLOSED_NDS = 0x3001,
VG_CLIPMODE_CLIP_OPEN_NDS = 0x3002,
VG_CLIPMODE_CULL_NDS = 0x3003,
VG_CLIPMODE_NDS_FORCE_SIZE = VG_MAX_ENUM
} VGClipModeNds;
typedef enum {
VG_RQUAD_TO_NDS = ( 13 << 1 ),
VG_RCUBIC_TO_NDS = ( 14 << 1 ),
VG_PATH_SEGMENT_NDS_FORCE_SIZE = VG_MAX_ENUM
} VGPathSegmentNds;
typedef enum {
VG_RQUAD_TO_ABS_NDS = (VG_RQUAD_TO_NDS | VG_ABSOLUTE),
VG_RQUAD_TO_REL_NDS = (VG_RQUAD_TO_NDS | VG_RELATIVE),
VG_RCUBIC_TO_ABS_NDS = (VG_RCUBIC_TO_NDS | VG_ABSOLUTE),
VG_RCUBIC_TO_REL_NDS = (VG_RCUBIC_TO_NDS | VG_RELATIVE),
VG_PATH_COMMAND_NDS_FORCE_SIZE = VG_MAX_ENUM
} VGPathCommandNds;
#ifdef VG_VGEXT_PROTOTYPES
VG_API_CALL void VG_API_ENTRY vgProjectiveMatrixNDS(VGboolean enable) ;
VGU_API_CALL VGUErrorCode VGU_API_ENTRY vguTransformClipLineNDS(const VGfloat Ain,const VGfloat Bin,const VGfloat Cin,const VGfloat* matrix,const VGboolean inverse,VGfloat* Aout,VGfloat* Bout,VGfloat* Cout);
#endif
typedef void (VG_API_ENTRYP PFNVGPROJECTIVEMATRIXNDSPROC) (VGboolean enable) ;
typedef VGUErrorCode (VGU_API_ENTRYP PFNVGUTRANSFORMCLIPLINENDSPROC) (const VGfloat Ain,const VGfloat Bin,const VGfloat Cin,const VGfloat* matrix,const VGboolean inverse,VGfloat* Aout,VGfloat* Bout,VGfloat* Cout);
#endif
#ifdef __cplusplus
} /* extern "C" */
#endif
#endif /* _VGEXT_H */

@@ -1,92 +0,0 @@
/* $Revision: 6810 $ on $Date:: 2008-10-29 07:31:37 -0700 #$ */
/*------------------------------------------------------------------------
*
* VG platform specific header Reference Implementation
* ----------------------------------------------------
*
* Copyright (c) 2008 The Khronos Group Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and /or associated documentation files
* (the "Materials "), to deal in the Materials without restriction,
* including without limitation the rights to use, copy, modify, merge,
* publish, distribute, sublicense, and/or sell copies of the Materials,
* and to permit persons to whom the Materials are furnished to do so,
* subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Materials.
*
* THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE MATERIALS OR
* THE USE OR OTHER DEALINGS IN THE MATERIALS.
*
*//**
* \file
* \brief VG platform specific header
*//*-------------------------------------------------------------------*/
#ifndef _VGPLATFORM_H
#define _VGPLATFORM_H
#include <KHR/khrplatform.h>
#ifdef __cplusplus
extern "C" {
#endif
#ifndef VG_API_CALL
#if defined(OPENVG_STATIC_LIBRARY)
# define VG_API_CALL
#else
# define VG_API_CALL KHRONOS_APICALL
#endif /* defined OPENVG_STATIC_LIBRARY */
#endif /* ifndef VG_API_CALL */
#ifndef VGU_API_CALL
#if defined(OPENVG_STATIC_LIBRARY)
# define VGU_API_CALL
#else
# define VGU_API_CALL KHRONOS_APICALL
#endif /* defined OPENVG_STATIC_LIBRARY */
#endif /* ifndef VGU_API_CALL */
#ifndef VG_API_ENTRY
#define VG_API_ENTRY
#endif
#ifndef VG_API_EXIT
#define VG_API_EXIT
#endif
#ifndef VGU_API_ENTRY
#define VGU_API_ENTRY
#endif
#ifndef VGU_API_EXIT
#define VGU_API_EXIT
#endif
typedef float VGfloat;
typedef signed char VGbyte;
typedef unsigned char VGubyte;
typedef signed short VGshort;
typedef signed int VGint;
typedef unsigned int VGuint;
typedef unsigned int VGbitfield;
#ifndef VG_VGEXT_PROTOTYPES
#define VG_VGEXT_PROTOTYPES
#endif
#ifdef __cplusplus
} /* extern "C" */
#endif
#endif /* _VGPLATFORM_H */

@@ -1,131 +0,0 @@
/* $Revision: 6810 $ on $Date:: 2008-10-29 07:31:37 -0700 #$ */
/*------------------------------------------------------------------------
*
* VGU 1.1 Reference Implementation
* -------------------------------------
*
* Copyright (c) 2008 The Khronos Group Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and /or associated documentation files
* (the "Materials "), to deal in the Materials without restriction,
* including without limitation the rights to use, copy, modify, merge,
* publish, distribute, sublicense, and/or sell copies of the Materials,
* and to permit persons to whom the Materials are furnished to do so,
* subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Materials.
*
* THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE MATERIALS OR
* THE USE OR OTHER DEALINGS IN THE MATERIALS.
*
*//**
* \file
* \brief VGU 1.1 API.
*//*-------------------------------------------------------------------*/
#ifndef _VGU_H
#define _VGU_H
#ifdef __cplusplus
extern "C" {
#endif
#include <VG/openvg.h>
#define VGU_VERSION_1_0 1
#define VGU_VERSION_1_1 2
#ifndef VGU_API_CALL
# error VGU_API_CALL must be defined
#endif
#ifndef VGU_API_ENTRY
# error VGU_API_ENTRY must be defined
#endif
#ifndef VGU_API_EXIT
# error VGU_API_EXIT must be defined
#endif
typedef enum {
VGU_NO_ERROR = 0,
VGU_BAD_HANDLE_ERROR = 0xF000,
VGU_ILLEGAL_ARGUMENT_ERROR = 0xF001,
VGU_OUT_OF_MEMORY_ERROR = 0xF002,
VGU_PATH_CAPABILITY_ERROR = 0xF003,
VGU_BAD_WARP_ERROR = 0xF004,
VGU_ERROR_CODE_FORCE_SIZE = VG_MAX_ENUM
} VGUErrorCode;
typedef enum {
VGU_ARC_OPEN = 0xF100,
VGU_ARC_CHORD = 0xF101,
VGU_ARC_PIE = 0xF102,
VGU_ARC_TYPE_FORCE_SIZE = VG_MAX_ENUM
} VGUArcType;
VGU_API_CALL VGUErrorCode VGU_API_ENTRY vguLine(VGPath path,
VGfloat x0, VGfloat y0,
VGfloat x1, VGfloat y1) VGU_API_EXIT;
VGU_API_CALL VGUErrorCode VGU_API_ENTRY vguPolygon(VGPath path,
const VGfloat * points, VGint count,
VGboolean closed) VGU_API_EXIT;
VGU_API_CALL VGUErrorCode VGU_API_ENTRY vguRect(VGPath path,
VGfloat x, VGfloat y,
VGfloat width, VGfloat height) VGU_API_EXIT;
VGU_API_CALL VGUErrorCode VGU_API_ENTRY vguRoundRect(VGPath path,
VGfloat x, VGfloat y,
VGfloat width, VGfloat height,
VGfloat arcWidth, VGfloat arcHeight) VGU_API_EXIT;
VGU_API_CALL VGUErrorCode VGU_API_ENTRY vguEllipse(VGPath path,
VGfloat cx, VGfloat cy,
VGfloat width, VGfloat height) VGU_API_EXIT;
VGU_API_CALL VGUErrorCode VGU_API_ENTRY vguArc(VGPath path,
VGfloat x, VGfloat y,
VGfloat width, VGfloat height,
VGfloat startAngle, VGfloat angleExtent,
VGUArcType arcType) VGU_API_EXIT;
VGU_API_CALL VGUErrorCode VGU_API_ENTRY vguComputeWarpQuadToSquare(VGfloat sx0, VGfloat sy0,
VGfloat sx1, VGfloat sy1,
VGfloat sx2, VGfloat sy2,
VGfloat sx3, VGfloat sy3,
VGfloat * matrix) VGU_API_EXIT;
VGU_API_CALL VGUErrorCode VGU_API_ENTRY vguComputeWarpSquareToQuad(VGfloat dx0, VGfloat dy0,
VGfloat dx1, VGfloat dy1,
VGfloat dx2, VGfloat dy2,
VGfloat dx3, VGfloat dy3,
VGfloat * matrix) VGU_API_EXIT;
VGU_API_CALL VGUErrorCode VGU_API_ENTRY vguComputeWarpQuadToQuad(VGfloat dx0, VGfloat dy0,
VGfloat dx1, VGfloat dy1,
VGfloat dx2, VGfloat dy2,
VGfloat dx3, VGfloat dy3,
VGfloat sx0, VGfloat sy0,
VGfloat sx1, VGfloat sy1,
VGfloat sx2, VGfloat sy2,
VGfloat sx3, VGfloat sy3,
VGfloat * matrix) VGU_API_EXIT;
#ifdef __cplusplus
} /* extern "C" */
#endif
#endif /* #ifndef _VGU_H */

@@ -177,13 +177,8 @@ mtx_init(mtx_t *mtx, int type)
&& type != (mtx_try|mtx_recursive))
return thrd_error;
pthread_mutexattr_init(&attr);
if ((type & mtx_recursive) != 0) {
#if defined(__linux__) || defined(__linux)
pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE_NP);
#else
if ((type & mtx_recursive) != 0)
pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
#endif
}
pthread_mutex_init(mtx, &attr);
pthread_mutexattr_destroy(&attr);
return thrd_success;
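
For readers unfamiliar with the attribute dance in this hunk, here is a minimal sketch of creating a recursive mutex with the portable PTHREAD_MUTEX_RECURSIVE constant, which is the spelling the hunk appears to settle on. The helper names init_recursive_mutex and lock_twice_and_release are hypothetical and are not part of the threads emulation code; the _XOPEN_SOURCE define is an assumption made so the constant is visible under strict compiler modes.

```c
#ifndef _XOPEN_SOURCE
# define _XOPEN_SOURCE 700   /* assumption: exposes PTHREAD_MUTEX_RECURSIVE */
#endif
#include <assert.h>
#include <pthread.h>

/* Hypothetical helper: initialise *mtx as a recursive mutex.
 * Returns 0 on success or a pthread error number. */
static int init_recursive_mutex(pthread_mutex_t *mtx)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0)
        rc = pthread_mutex_init(mtx, &attr);
    /* The attribute object is no longer needed once the mutex exists. */
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Hypothetical check: the same thread locks twice, which only succeeds
 * without deadlock on a recursive mutex. Returns 0 if both locks succeed. */
static int lock_twice_and_release(void)
{
    pthread_mutex_t mtx;
    int first, second;

    if (init_recursive_mutex(&mtx) != 0)
        return -1;
    first = pthread_mutex_lock(&mtx);
    second = pthread_mutex_lock(&mtx);
    if (second == 0)
        pthread_mutex_unlock(&mtx);
    if (first == 0)
        pthread_mutex_unlock(&mtx);
    pthread_mutex_destroy(&mtx);
    return (first == 0 && second == 0) ? 0 : -1;
}
```

A recursive mutex must be unlocked once per successful lock, which is why the sketch releases it twice before destroying it.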

@@ -35,8 +35,7 @@
#define bool _Bool
/* For compilers that don't have the builtin _Bool type. */
#if (defined(_MSC_VER) && _MSC_VER < 1800) || \
(defined __GNUC__&& __STDC_VERSION__ < 199901L && __GNUC__ < 3)
#if (defined(_MSC_VER) && _MSC_VER < 1800)
typedef unsigned char _Bool;
#endif

@@ -1,6 +1,6 @@
/**************************************************************************
*
* Copyright 2009 VMware, Inc.
* Copyright 2015 VMware, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
@@ -25,13 +25,25 @@
*
**************************************************************************/
#ifndef ID_PUBLIC_H
#define ID_PUBLIC_H
#ifndef _C99_ALLOCA_H_
#define _C99_ALLOCA_H_
struct pipe_screen;
struct pipe_context;
struct pipe_screen *
identity_screen_create(struct pipe_screen *screen);
#if defined(_MSC_VER)
#endif /* ID_PUBLIC_H */
# include <malloc.h>
# define alloca _alloca
#elif defined(__sun) || defined(__CYGWIN__)
# include <alloca.h>
#else /* !defined(_MSC_VER) */
# include <stdlib.h>
#endif /* !defined(_MSC_VER) */
#endif
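
The added lines in this hunk pick whichever header declares alloca on the current platform. The sketch below shows how a caller might rely on that selection. The is_palindrome helper is hypothetical, and the extra __linux__ test is an assumption added here so the example also builds under strict C modes on glibc; it is not in the hunk.

```c
#include <assert.h>
#include <string.h>

#if defined(_MSC_VER)
# include <malloc.h>
# define alloca _alloca
#elif defined(__sun) || defined(__CYGWIN__) || defined(__linux__)
# include <alloca.h>   /* the __linux__ case is an assumption of this sketch */
#else
# include <stdlib.h>
#endif

/* Hypothetical helper: builds a reversed copy of `s` in stack memory and
 * compares it with the original. Returns 1 for palindromes, 0 otherwise. */
static int is_palindrome(const char *s)
{
    size_t n = strlen(s);
    char *rev = (char *)alloca(n + 1);   /* released when the function returns */
    size_t i;

    for (i = 0; i < n; i++)
        rev[i] = s[n - 1 - i];
    rev[n] = '\0';
    return memcmp(s, rev, n) == 0;
}
```

Stack allocations like this are only reasonable for small, bounded sizes, since alloca has no way to report failure.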

@@ -25,6 +25,8 @@
*
**************************************************************************/
#include "no_extern_c.h"
#ifndef _C99_COMPAT_H_
#define _C99_COMPAT_H_
@@ -33,6 +35,11 @@
* MSVC hacks.
*/
#if defined(_MSC_VER)
# if _MSC_VER < 1500
# error "Microsoft Visual Studio 2008 or higher required"
# endif
/*
* Visual Studio 2012 will complain if we define the `inline` keyword, but
* actually it only supports the keyword on C++.
@@ -114,17 +121,9 @@
# elif defined(__SUNPRO_C) && defined(__C99FEATURES__)
/* C99 */
# elif defined(__GNUC__)
# if __GNUC__ >= 2
# define __func__ __FUNCTION__
# else
# define __func__ "<unknown>"
# endif
# define __func__ __FUNCTION__
# elif defined(_MSC_VER)
# if _MSC_VER >= 1300
# define __func__ __FUNCTION__
# else
# define __func__ "<unknown>"
# endif
# define __func__ __FUNCTION__
# else
# define __func__ "<unknown>"
# endif
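
As a rough illustration of what the __func__ fallback in this hunk buys callers, the sketch below maps __func__ to __FUNCTION__ only when the compiler does not claim C99 support, then returns the name of the enclosing function. The condition is simplified compared with the header, and current_function_name is a hypothetical helper.

```c
#include <assert.h>
#include <string.h>

/* Simplified fallback, assuming only the GCC and MSVC cases matter here. */
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 199901L
# if defined(__GNUC__) || defined(_MSC_VER)
#  define __func__ __FUNCTION__
# else
#  define __func__ "<unknown>"
# endif
#endif

/* Hypothetical helper: returns the name of this function as seen by the
 * compiler, which is what logging macros typically print. */
static const char *current_function_name(void)
{
    return __func__;
}
```

On a compiler that supports either spelling, the returned string is the function's own identifier.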

include/c99_math.h

@@ -0,0 +1,215 @@
/**************************************************************************
*
* Copyright 2007-2015 VMware, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sub license, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial portions
* of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
* IN NO EVENT SHALL VMWARE AND/OR ITS SUPPLIERS BE LIABLE FOR
* ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
* SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
**************************************************************************/
/**
* Wrapper for math.h which makes sure we have definitions of all the c99
* functions.
*/
#ifndef _C99_MATH_H_
#define _C99_MATH_H_
#include <math.h>
#include "c99_compat.h"
#if defined(_MSC_VER)
/* This is to ensure that we get M_PI, etc. definitions */
#if !defined(_USE_MATH_DEFINES)
#error _USE_MATH_DEFINES define required when building with MSVC
#endif
#if _MSC_VER < 1800
#define isfinite(x) _finite((double)(x))
#define isnan(x) _isnan((double)(x))
#endif /* _MSC_VER < 1800 */
#if _MSC_VER < 1800
static inline double log2( double x )
{
const double invln2 = 1.442695041;
return log( x ) * invln2;
}
static inline double
round(double x)
{
return x >= 0.0 ? floor(x + 0.5) : ceil(x - 0.5);
}
static inline float
roundf(float x)
{
return x >= 0.0f ? floorf(x + 0.5f) : ceilf(x - 0.5f);
}
#endif
#ifndef INFINITY
#include <float.h> // DBL_MAX
#define INFINITY (DBL_MAX + DBL_MAX)
#endif
#ifndef NAN
#define NAN (INFINITY - INFINITY)
#endif
#endif /* _MSC_VER */
#if (defined(_MSC_VER) && _MSC_VER < 1800) || \
(!defined(_MSC_VER) && \
__STDC_VERSION__ < 199901L && \
(!defined(_XOPEN_SOURCE) || _XOPEN_SOURCE < 600) && \
!defined(__cplusplus))
static inline long int
lrint(double d)
{
long int rounded = (long int)(d + 0.5);
if (d - floor(d) == 0.5) {
if (rounded % 2 != 0)
rounded += (d > 0) ? -1 : 1;
}
return rounded;
}
static inline long int
lrintf(float f)
{
long int rounded = (long int)(f + 0.5f);
if (f - floorf(f) == 0.5f) {
if (rounded % 2 != 0)
rounded += (f > 0) ? -1 : 1;
}
return rounded;
}
static inline long long int
llrint(double d)
{
long long int rounded = (long long int)(d + 0.5);
if (d - floor(d) == 0.5) {
if (rounded % 2 != 0)
rounded += (d > 0) ? -1 : 1;
}
return rounded;
}
static inline long long int
llrintf(float f)
{
long long int rounded = (long long int)(f + 0.5f);
if (f - floorf(f) == 0.5f) {
if (rounded % 2 != 0)
rounded += (f > 0) ? -1 : 1;
}
return rounded;
}
#endif /* C99 */
/*
* signbit() is a macro on Linux. Not available on Windows.
*/
#ifndef signbit
#define signbit(x) ((x) < 0.0f)
#endif
#ifndef M_PI
#define M_PI (3.14159265358979323846)
#endif
#ifndef M_E
#define M_E (2.7182818284590452354)
#endif
#ifndef M_LOG2E
#define M_LOG2E (1.4426950408889634074)
#endif
#ifndef FLT_MAX_EXP
#define FLT_MAX_EXP 128
#endif
#if defined(fpclassify)
/* ISO C99 says that fpclassify is a macro. Assume that any implementation
* of fpclassify, whether it's in a C99 compiler or not, will be a macro.
*/
#elif defined(__cplusplus)
/* For C++, fpclassify() should be defined in <cmath> */
#elif defined(_MSC_VER)
/* Not required on VS2013 and above. Oddly, the fpclassify() function
* doesn't exist in such a form on MSVC. This is an implementation using
* slightly different lower-level Windows functions.
*/
#include <float.h>
static inline enum {FP_NAN, FP_INFINITE, FP_ZERO, FP_SUBNORMAL, FP_NORMAL}
fpclassify(double x)
{
switch(_fpclass(x)) {
case _FPCLASS_SNAN: /* signaling NaN */
case _FPCLASS_QNAN: /* quiet NaN */
return FP_NAN;
case _FPCLASS_NINF: /* negative infinity */
case _FPCLASS_PINF: /* positive infinity */
return FP_INFINITE;
case _FPCLASS_NN: /* negative normal */
case _FPCLASS_PN: /* positive normal */
return FP_NORMAL;
case _FPCLASS_ND: /* negative denormalized */
case _FPCLASS_PD: /* positive denormalized */
return FP_SUBNORMAL;
case _FPCLASS_NZ: /* negative zero */
case _FPCLASS_PZ: /* positive zero */
return FP_ZERO;
default:
/* Should never get here; but if we do, this will guarantee
* that the pattern is not treated like a number.
*/
return FP_NAN;
}
}
#else
#error "Need to include or define an fpclassify function"
#endif
#endif /* #define _C99_MATH_H_ */
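
The header above supplies two different rounding rules: round() breaks ties away from zero, while lrint() under the default rounding mode breaks ties toward the even neighbour. The sketch below contrasts the two. It is not the header's code: all three function names are hypothetical, and it takes a floor before inspecting the fraction because a plain integer cast truncates toward zero, which matters for negative inputs. The floor helper is cast-based to keep the example free of libm and is only valid for values that fit in a long.

```c
#include <assert.h>

/* Floor for values that fit in a long (assumption of this sketch). */
static long floor_to_long(double d)
{
    long t = (long)d;                 /* truncates toward zero */
    return ((double)t > d) ? t - 1 : t;
}

/* Round to nearest, ties to even: the behaviour of lrint() in the
 * default rounding mode. */
static long lrint_sketch(double d)
{
    long lo = floor_to_long(d);
    double frac = d - (double)lo;     /* always in [0, 1) */

    if (frac > 0.5)
        return lo + 1;
    if (frac < 0.5)
        return lo;
    return (lo % 2 == 0) ? lo : lo + 1;   /* exact tie: pick the even one */
}

/* Round to nearest, ties away from zero: the behaviour of round(). */
static long round_sketch(double d)
{
    return d >= 0.0 ? floor_to_long(d + 0.5) : -floor_to_long(-d + 0.5);
}
```

The two rules agree everywhere except on exact halves such as 2.5 or -2.5, which is where callers converting floats to integers most often see surprising results.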

@@ -0,0 +1,101 @@
/*
* Copyright 2011 Joakim Sindholt <opensource@zhasha.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE. */
#ifndef _D3DADAPTER9_H_
#define _D3DADAPTER9_H_
#include "present.h"
#ifndef __cplusplus
/* Representation of an adapter group, although since this is implemented by
* the driver, it knows nothing about the windowing system it's on */
typedef struct ID3DAdapter9Vtbl
{
/* IUnknown */
HRESULT (WINAPI *QueryInterface)(ID3DAdapter9 *This, REFIID riid, void **ppvObject);
ULONG (WINAPI *AddRef)(ID3DAdapter9 *This);
ULONG (WINAPI *Release)(ID3DAdapter9 *This);
/* ID3DAdapter9 */
HRESULT (WINAPI *GetAdapterIdentifier)(ID3DAdapter9 *This, DWORD Flags, D3DADAPTER_IDENTIFIER9 *pIdentifier);
HRESULT (WINAPI *CheckDeviceType)(ID3DAdapter9 *This, D3DDEVTYPE DevType, D3DFORMAT AdapterFormat, D3DFORMAT BackBufferFormat, BOOL bWindowed);
HRESULT (WINAPI *CheckDeviceFormat)(ID3DAdapter9 *This, D3DDEVTYPE DeviceType, D3DFORMAT AdapterFormat, DWORD Usage, D3DRESOURCETYPE RType, D3DFORMAT CheckFormat);
HRESULT (WINAPI *CheckDeviceMultiSampleType)(ID3DAdapter9 *This, D3DDEVTYPE DeviceType, D3DFORMAT SurfaceFormat, BOOL Windowed, D3DMULTISAMPLE_TYPE MultiSampleType, DWORD *pQualityLevels);
HRESULT (WINAPI *CheckDepthStencilMatch)(ID3DAdapter9 *This, D3DDEVTYPE DeviceType, D3DFORMAT AdapterFormat, D3DFORMAT RenderTargetFormat, D3DFORMAT DepthStencilFormat);
HRESULT (WINAPI *CheckDeviceFormatConversion)(ID3DAdapter9 *This, D3DDEVTYPE DeviceType, D3DFORMAT SourceFormat, D3DFORMAT TargetFormat);
HRESULT (WINAPI *GetDeviceCaps)(ID3DAdapter9 *This, D3DDEVTYPE DeviceType, D3DCAPS9 *pCaps);
HRESULT (WINAPI *CreateDevice)(ID3DAdapter9 *This, UINT RealAdapter, D3DDEVTYPE DeviceType, HWND hFocusWindow, DWORD BehaviorFlags, D3DPRESENT_PARAMETERS *pPresentationParameters, IDirect3D9 *pD3D9, ID3DPresentGroup *pPresentationFactory, IDirect3DDevice9 **ppReturnedDeviceInterface);
HRESULT (WINAPI *CreateDeviceEx)(ID3DAdapter9 *This, UINT RealAdapter, D3DDEVTYPE DeviceType, HWND hFocusWindow, DWORD BehaviorFlags, D3DPRESENT_PARAMETERS *pPresentationParameters, D3DDISPLAYMODEEX *pFullscreenDisplayMode, IDirect3D9Ex *pD3D9Ex, ID3DPresentGroup *pPresentationFactory, IDirect3DDevice9Ex **ppReturnedDeviceInterface);
} ID3DAdapter9Vtbl;
struct ID3DAdapter9
{
ID3DAdapter9Vtbl *lpVtbl;
};
/* IUnknown macros */
#define ID3DAdapter9_QueryInterface(p,a,b) (p)->lpVtbl->QueryInterface(p,a,b)
#define ID3DAdapter9_AddRef(p) (p)->lpVtbl->AddRef(p)
#define ID3DAdapter9_Release(p) (p)->lpVtbl->Release(p)
/* ID3DAdapter9 macros */
#define ID3DAdapter9_GetAdapterIdentifier(p,a,b) (p)->lpVtbl->GetAdapterIdentifier(p,a,b)
#define ID3DAdapter9_CheckDeviceType(p,a,b,c,d) (p)->lpVtbl->CheckDeviceType(p,a,b,c,d)
#define ID3DAdapter9_CheckDeviceFormat(p,a,b,c,d,e) (p)->lpVtbl->CheckDeviceFormat(p,a,b,c,d,e)
#define ID3DAdapter9_CheckDeviceMultiSampleType(p,a,b,c,d,e) (p)->lpVtbl->CheckDeviceMultiSampleType(p,a,b,c,d,e)
#define ID3DAdapter9_CheckDepthStencilMatch(p,a,b,c,d) (p)->lpVtbl->CheckDepthStencilMatch(p,a,b,c,d)
#define ID3DAdapter9_CheckDeviceFormatConversion(p,a,b,c) (p)->lpVtbl->CheckDeviceFormatConversion(p,a,b,c)
#define ID3DAdapter9_GetDeviceCaps(p,a,b) (p)->lpVtbl->GetDeviceCaps(p,a,b)
#define ID3DAdapter9_CreateDevice(p,a,b,c,d,e,f,g,h) (p)->lpVtbl->CreateDevice(p,a,b,c,d,e,f,g,h)
#define ID3DAdapter9_CreateDeviceEx(p,a,b,c,d,e,f,g,h,i) (p)->lpVtbl->CreateDeviceEx(p,a,b,c,d,e,f,g,h,i)
#else /* __cplusplus */
struct ID3DAdapter9 : public IUnknown
{
HRESULT WINAPI GetAdapterIdentifier(DWORD Flags, D3DADAPTER_IDENTIFIER9 *pIdentifier);
HRESULT WINAPI CheckDeviceType(D3DDEVTYPE DevType, D3DFORMAT AdapterFormat, D3DFORMAT BackBufferFormat, BOOL bWindowed);
HRESULT WINAPI CheckDeviceFormat(D3DDEVTYPE DeviceType, D3DFORMAT AdapterFormat, DWORD Usage, D3DRESOURCETYPE RType, D3DFORMAT CheckFormat);
HRESULT WINAPI CheckDeviceMultiSampleType(D3DDEVTYPE DeviceType, D3DFORMAT SurfaceFormat, BOOL Windowed, D3DMULTISAMPLE_TYPE MultiSampleType, DWORD *pQualityLevels);
HRESULT WINAPI CheckDepthStencilMatch(D3DDEVTYPE DeviceType, D3DFORMAT AdapterFormat, D3DFORMAT RenderTargetFormat, D3DFORMAT DepthStencilFormat);
HRESULT WINAPI CheckDeviceFormatConversion(D3DDEVTYPE DeviceType, D3DFORMAT SourceFormat, D3DFORMAT TargetFormat);
HRESULT WINAPI GetDeviceCaps(D3DDEVTYPE DeviceType, D3DCAPS9 *pCaps);
HRESULT WINAPI CreateDevice(UINT RealAdapter, D3DDEVTYPE DeviceType, HWND hFocusWindow, DWORD BehaviorFlags, D3DPRESENT_PARAMETERS *pPresentationParameters, IDirect3D9 *pD3D9, ID3DPresentGroup *pPresentationFactory, IDirect3DDevice9 **ppReturnedDeviceInterface);
HRESULT WINAPI CreateDeviceEx(UINT RealAdapter, D3DDEVTYPE DeviceType, HWND hFocusWindow, DWORD BehaviorFlags, D3DPRESENT_PARAMETERS *pPresentationParameters, D3DDISPLAYMODEEX *pFullscreenDisplayMode, IDirect3D9Ex *pD3D9Ex, ID3DPresentGroup *pPresentationFactory, IDirect3DDevice9Ex **ppReturnedDeviceInterface);
};
#endif /* __cplusplus */
#ifdef __cplusplus
extern "C" {
#endif /* __cplusplus */
/* acquire a const struct D3DAdapter9* structure describing the interface
* queried. See */
const void * WINAPI
D3DAdapter9GetProc( const char *name );
#ifdef __cplusplus
}
#endif /* __cplusplus */
#endif /* _D3DADAPTER9_H_ */

44
include/d3dadapter/drm.h Normal file
@@ -0,0 +1,44 @@
/*
* Copyright 2011 Joakim Sindholt <opensource@zhasha.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE. */
#ifndef _D3DADAPTER9_DRM_H_
#define _D3DADAPTER9_DRM_H_
#include "d3dadapter9.h"
/* query driver support name */
#define D3DADAPTER9DRM_NAME "drm"
/* current version */
#define D3DADAPTER9DRM_MAJOR 0
#define D3DADAPTER9DRM_MINOR 0
struct D3DAdapter9DRM
{
unsigned major_version; /* ABI break */
unsigned minor_version; /* backwards compatible feature additions */
/* NOTE: upon passing an fd to this function, it's now owned by this
function. If this function fails, the fd will be closed here as well */
HRESULT (WINAPI *create_adapter)(int fd, ID3DAdapter9 **ppAdapter);
};
#endif /* _D3DADAPTER9_DRM_H_ */
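
Note on the version fields above: the comments describe `major_version` as an ABI break and `minor_version` as backwards compatible additions. A loader could check a returned table along the lines below. This is only a sketch under assumptions: the typedefs are stand-ins so the snippet compiles without d3dadapter9.h, and `stub_create_adapter` and `drm_backend_usable` are hypothetical names that do not appear in this diff. The "exact major, minor at least as new" policy is one reading of the comments, not something the header enforces.

```c
#include <stddef.h>

/* Stand-ins so the sketch compiles on its own; the real definitions come
 * from d3dadapter9.h and <d3d9.h>. */
typedef long HRESULT;
#define WINAPI
typedef struct ID3DAdapter9 ID3DAdapter9;

struct D3DAdapter9DRM
{
    unsigned major_version; /* ABI break */
    unsigned minor_version; /* backwards compatible feature additions */
    HRESULT (WINAPI *create_adapter)(int fd, ID3DAdapter9 **ppAdapter);
};

/* Hypothetical stub backend entry point: it never creates an adapter. */
static HRESULT WINAPI
stub_create_adapter(int fd, ID3DAdapter9 **ppAdapter)
{
    (void)fd;
    *ppAdapter = NULL;
    return -1;
}

/* Hypothetical helper: returns nonzero when the table can be used by a
 * loader built against want_major.want_minor. */
static int
drm_backend_usable(const struct D3DAdapter9DRM *drm,
                   unsigned want_major, unsigned want_minor)
{
    if (drm == NULL || drm->create_adapter == NULL)
        return 0;
    /* A major bump is an ABI break, so it has to match exactly. */
    if (drm->major_version != want_major)
        return 0;
    /* A minor bump only adds features, so a newer backend is acceptable. */
    return drm->minor_version >= want_minor;
}
```

In the real tree the table would presumably be obtained via `D3DAdapter9GetProc(D3DADAPTER9DRM_NAME)` and checked against `D3DADAPTER9DRM_MAJOR` / `D3DADAPTER9DRM_MINOR` before `create_adapter` is called.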

@@ -0,0 +1,136 @@
/*
* Copyright 2011 Joakim Sindholt <opensource@zhasha.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE. */
#ifndef _D3DADAPTER_PRESENT_H_
#define _D3DADAPTER_PRESENT_H_
#include <d3d9.h>
#ifndef D3DOK_WINDOW_OCCLUDED
#define D3DOK_WINDOW_OCCLUDED MAKE_D3DSTATUS(2531)
#endif /* D3DOK_WINDOW_OCCLUDED */
#ifndef __cplusplus
typedef struct ID3DPresent ID3DPresent;
typedef struct ID3DPresentGroup ID3DPresentGroup;
typedef struct ID3DAdapter9 ID3DAdapter9;
typedef struct D3DWindowBuffer D3DWindowBuffer;
/* Presentation backend for drivers to display their brilliant work */
typedef struct ID3DPresentVtbl
{
/* IUnknown */
HRESULT (WINAPI *QueryInterface)(ID3DPresent *This, REFIID riid, void **ppvObject);
ULONG (WINAPI *AddRef)(ID3DPresent *This);
ULONG (WINAPI *Release)(ID3DPresent *This);
/* ID3DPresent */
/* This function initializes the screen and window provided at creation.
* Hence why this should always be called as one of the first things a new
* swap chain does */
HRESULT (WINAPI *SetPresentParameters)(ID3DPresent *This, D3DPRESENT_PARAMETERS *pPresentationParameters, D3DDISPLAYMODEEX *pFullscreenDisplayMode);
/* Make a buffer visible to the window system via dma-buf fd.
* For better compatibility, it must be 32bpp and format ARGB/XRGB */
HRESULT (WINAPI *NewD3DWindowBufferFromDmaBuf)(ID3DPresent *This, int dmaBufFd, int width, int height, int stride, int depth, int bpp, D3DWindowBuffer **out);
HRESULT (WINAPI *DestroyD3DWindowBuffer)(ID3DPresent *This, D3DWindowBuffer *buffer);
/* After presenting a buffer to the window system, the buffer
* may be used as is (no copy of the content) by the window system.
* You must not use a non-released buffer, else the user may see undefined content. */
HRESULT (WINAPI *WaitBufferReleased)(ID3DPresent *This, D3DWindowBuffer *buffer);
HRESULT (WINAPI *FrontBufferCopy)(ID3DPresent *This, D3DWindowBuffer *buffer);
/* It is possible to do partial copy, but impossible to do resizing, which must
* be done by the client after checking the front buffer size */
HRESULT (WINAPI *PresentBuffer)(ID3DPresent *This, D3DWindowBuffer *buffer, HWND hWndOverride, const RECT *pSourceRect, const RECT *pDestRect, const RGNDATA *pDirtyRegion, DWORD Flags);
HRESULT (WINAPI *GetRasterStatus)(ID3DPresent *This, D3DRASTER_STATUS *pRasterStatus);
HRESULT (WINAPI *GetDisplayMode)(ID3DPresent *This, D3DDISPLAYMODEEX *pMode, D3DDISPLAYROTATION *pRotation);
HRESULT (WINAPI *GetPresentStats)(ID3DPresent *This, D3DPRESENTSTATS *pStats);
HRESULT (WINAPI *GetCursorPos)(ID3DPresent *This, POINT *pPoint);
HRESULT (WINAPI *SetCursorPos)(ID3DPresent *This, POINT *pPoint);
/* Cursor size is always 32x32. pBitmap and pHotspot can be NULL. */
HRESULT (WINAPI *SetCursor)(ID3DPresent *This, void *pBitmap, POINT *pHotspot, BOOL bShow);
HRESULT (WINAPI *SetGammaRamp)(ID3DPresent *This, const D3DGAMMARAMP *pRamp, HWND hWndOverride);
HRESULT (WINAPI *GetWindowInfo)(ID3DPresent *This, HWND hWnd, int *width, int *height, int *depth);
} ID3DPresentVtbl;
struct ID3DPresent
{
ID3DPresentVtbl *lpVtbl;
};
/* IUnknown macros */
#define ID3DPresent_QueryInterface(p,a,b) (p)->lpVtbl->QueryInterface(p,a,b)
#define ID3DPresent_AddRef(p) (p)->lpVtbl->AddRef(p)
#define ID3DPresent_Release(p) (p)->lpVtbl->Release(p)
/* ID3DPresent macros */
#define ID3DPresent_GetPresentParameters(p,a) (p)->lpVtbl->GetPresentParameters(p,a)
#define ID3DPresent_SetPresentParameters(p,a,b) (p)->lpVtbl->SetPresentParameters(p,a,b)
#define ID3DPresent_NewD3DWindowBufferFromDmaBuf(p,a,b,c,d,e,f,g) (p)->lpVtbl->NewD3DWindowBufferFromDmaBuf(p,a,b,c,d,e,f,g)
#define ID3DPresent_DestroyD3DWindowBuffer(p,a) (p)->lpVtbl->DestroyD3DWindowBuffer(p,a)
#define ID3DPresent_WaitBufferReleased(p,a) (p)->lpVtbl->WaitBufferReleased(p,a)
#define ID3DPresent_FrontBufferCopy(p,a) (p)->lpVtbl->FrontBufferCopy(p,a)
#define ID3DPresent_PresentBuffer(p,a,b,c,d,e,f) (p)->lpVtbl->PresentBuffer(p,a,b,c,d,e,f)
#define ID3DPresent_GetRasterStatus(p,a) (p)->lpVtbl->GetRasterStatus(p,a)
#define ID3DPresent_GetDisplayMode(p,a,b) (p)->lpVtbl->GetDisplayMode(p,a,b)
#define ID3DPresent_GetPresentStats(p,a) (p)->lpVtbl->GetPresentStats(p,a)
#define ID3DPresent_GetCursorPos(p,a) (p)->lpVtbl->GetCursorPos(p,a)
#define ID3DPresent_SetCursorPos(p,a) (p)->lpVtbl->SetCursorPos(p,a)
#define ID3DPresent_SetCursor(p,a,b,c) (p)->lpVtbl->SetCursor(p,a,b,c)
#define ID3DPresent_SetGammaRamp(p,a,b) (p)->lpVtbl->SetGammaRamp(p,a,b)
#define ID3DPresent_GetWindowInfo(p,a,b,c,d) (p)->lpVtbl->GetWindowInfo(p,a,b,c,d)
typedef struct ID3DPresentGroupVtbl
{
/* IUnknown */
HRESULT (WINAPI *QueryInterface)(ID3DPresentGroup *This, REFIID riid, void **ppvObject);
ULONG (WINAPI *AddRef)(ID3DPresentGroup *This);
ULONG (WINAPI *Release)(ID3DPresentGroup *This);
/* ID3DPresentGroup */
/* When creating a device, it's relevant for the driver to know how many
* implicit swap chains to create. It has to create one per monitor in a
* multi-monitor setup */
UINT (WINAPI *GetMultiheadCount)(ID3DPresentGroup *This);
/* returns only the implicit present interfaces */
HRESULT (WINAPI *GetPresent)(ID3DPresentGroup *This, UINT Index, ID3DPresent **ppPresent);
/* used to create additional presentation interfaces along the way */
HRESULT (WINAPI *CreateAdditionalPresent)(ID3DPresentGroup *This, D3DPRESENT_PARAMETERS *pPresentationParameters, ID3DPresent **ppPresent);
void (WINAPI *GetVersion) (ID3DPresentGroup *This, int *major, int *minor);
} ID3DPresentGroupVtbl;
struct ID3DPresentGroup
{
ID3DPresentGroupVtbl *lpVtbl;
};
/* IUnknown macros */
#define ID3DPresentGroup_QueryInterface(p,a,b) (p)->lpVtbl->QueryInterface(p,a,b)
#define ID3DPresentGroup_AddRef(p) (p)->lpVtbl->AddRef(p)
#define ID3DPresentGroup_Release(p) (p)->lpVtbl->Release(p)
/* ID3DPresentGroup */
#define ID3DPresentGroup_GetMultiheadCount(p) (p)->lpVtbl->GetMultiheadCount(p)
#define ID3DPresentGroup_GetPresent(p,a,b) (p)->lpVtbl->GetPresent(p,a,b)
#define ID3DPresentGroup_CreateAdditionalPresent(p,a,b) (p)->lpVtbl->CreateAdditionalPresent(p,a,b)
#define ID3DPresentGroup_GetVersion(p,a,b) (p)->lpVtbl->GetVersion(p,a,b)
#endif /* __cplusplus */
#endif /* _D3DADAPTER_PRESENT_H_ */
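
Note on the C-side macros above: each `ID3DPresent_*` / `ID3DPresentGroup_*` macro just forwards to the function pointer of the same name in the object's vtable, passing the object as `This`. The self-contained sketch below shows the same pattern on a tiny made-up interface. `ICounter`, `counter_impl` and `counter_init` are hypothetical and exist only for illustration; they are not part of the header.

```c
/* Hypothetical two-method interface laid out like the COM-style C
 * interfaces in the header: a struct holding only a vtable pointer. */
typedef struct ICounter ICounter;

typedef struct ICounterVtbl
{
    unsigned long (*AddRef)(ICounter *This);
    unsigned long (*Release)(ICounter *This);
} ICounterVtbl;

struct ICounter
{
    ICounterVtbl *lpVtbl;
};

/* Dispatch macros: the member name after lpVtbl-> must match the vtable. */
#define ICounter_AddRef(p) (p)->lpVtbl->AddRef(p)
#define ICounter_Release(p) (p)->lpVtbl->Release(p)

/* Implementation object: the interface is the first member, so a pointer
 * to the interface can be cast back to the implementation. */
struct counter_impl
{
    ICounter base;
    unsigned long refs;
};

static unsigned long
impl_AddRef(ICounter *This)
{
    return ++((struct counter_impl *)This)->refs;
}

static unsigned long
impl_Release(ICounter *This)
{
    return --((struct counter_impl *)This)->refs;
}

static ICounterVtbl counter_vtbl = { impl_AddRef, impl_Release };

static void
counter_init(struct counter_impl *c)
{
    c->base.lpVtbl = &counter_vtbl;
    c->refs = 1;
}
```

Because the macros expand textually, a macro that names a member missing from the vtable only fails to compile at the point where it is first used.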

48
include/no_extern_c.h Normal file
@@ -0,0 +1,48 @@
/**************************************************************************
*
* Copyright 2014 VMware, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*
**************************************************************************/
/*
* Including system's headers inside `extern "C" { ... }` is not safe, as system
* headers may have C++ code in them, and C++ code inside extern "C"
* leads to syntactically incorrect code.
*
* This is because putting code inside extern "C" won't make the __cplusplus
* define go away; that is, the system header being included thinks it is free
* to use C++ as it sees fit.
*
* Including non-system headers inside extern "C" is not safe either, because
* non-system headers end up including system headers, hence fall in the above
* case too.
*
* Conclusion: includes inside extern "C" are simply not portable.
*
*
* This header helps surface these issues.
*/
#ifdef __cplusplus
template<class T> class _IncludeInsideExternCNotPortable;
#endif

@@ -11,5 +11,5 @@ CHIPSET(0x27AE, I945_GME, "Intel(R) 945GME")
CHIPSET(0x29B2, Q35_G, "Intel(R) Q35")
CHIPSET(0x29C2, G33_G, "Intel(R) G33")
CHIPSET(0x29D2, Q33_G, "Intel(R) Q33")
CHIPSET(0xA011, IGD_GM, "Intel(R) IGD")
CHIPSET(0xA001, IGD_G, "Intel(R) IGD")
CHIPSET(0xA011, PNV_GM, "Intel(R) Pineview M")
CHIPSET(0xA001, PNV_G, "Intel(R) Pineview")

@@ -109,7 +109,22 @@ CHIPSET(0x162A, bdw_gt3, "Intel(R) Iris Pro P6300 (Broadwell GT3e)")
CHIPSET(0x162B, bdw_gt3, "Intel(R) Iris 6100 (Broadwell GT3)")
CHIPSET(0x162D, bdw_gt3, "Intel(R) Broadwell GT3")
CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell GT3")
CHIPSET(0x22B0, chv, "Intel(R) Cherryview")
CHIPSET(0x22B1, chv, "Intel(R) Cherryview")
CHIPSET(0x22B2, chv, "Intel(R) Cherryview")
CHIPSET(0x22B3, chv, "Intel(R) Cherryview")
CHIPSET(0x1902, skl_gt1, "Intel(R) Skylake DT GT1")
CHIPSET(0x1906, skl_gt1, "Intel(R) Skylake ULT GT1")
CHIPSET(0x190A, skl_gt1, "Intel(R) Skylake SRV GT1")
CHIPSET(0x190B, skl_gt1, "Intel(R) Skylake Halo GT1")
CHIPSET(0x190E, skl_gt1, "Intel(R) Skylake ULX GT1")
CHIPSET(0x1912, skl_gt2, "Intel(R) Skylake DT GT2")
CHIPSET(0x1916, skl_gt2, "Intel(R) Skylake ULT GT2")
CHIPSET(0x191A, skl_gt2, "Intel(R) Skylake SRV GT2")
CHIPSET(0x191B, skl_gt2, "Intel(R) Skylake Halo GT2")
CHIPSET(0x191D, skl_gt2, "Intel(R) Skylake WKS GT2")
CHIPSET(0x191E, skl_gt2, "Intel(R) Skylake ULX GT2")
CHIPSET(0x1921, skl_gt2, "Intel(R) Skylake ULT GT2F")
CHIPSET(0x1926, skl_gt3, "Intel(R) Skylake ULT GT3")
CHIPSET(0x192A, skl_gt3, "Intel(R) Skylake SRV GT3")
CHIPSET(0x192B, skl_gt3, "Intel(R) Skylake Halo GT3")
CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x22B1, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)")

@@ -85,6 +85,7 @@ CHIPSET(0x6651, BONAIRE_6651, BONAIRE)
CHIPSET(0x6658, BONAIRE_6658, BONAIRE)
CHIPSET(0x665C, BONAIRE_665C, BONAIRE)
CHIPSET(0x665D, BONAIRE_665D, BONAIRE)
CHIPSET(0x665F, BONAIRE_665F, BONAIRE)
CHIPSET(0x9830, KABINI_9830, KABINI)
CHIPSET(0x9831, KABINI_9831, KABINI)

@@ -0,0 +1,90 @@
//
// File: vk_platform.h
//
/*
** Copyright (c) 2014-2015 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
** "Materials"), to deal in the Materials without restriction, including
** without limitation the rights to use, copy, modify, merge, publish,
** distribute, sublicense, and/or sell copies of the Materials, and to
** permit persons to whom the Materials are furnished to do so, subject to
** the following conditions:
**
** The above copyright notice and this permission notice shall be included
** in all copies or substantial portions of the Materials.
**
** THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
** EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
** MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
** IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
** CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
** TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
** MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
*/
#ifndef __VK_PLATFORM_H__
#define __VK_PLATFORM_H__
#ifdef __cplusplus
extern "C"
{
#endif // __cplusplus
/*
***************************************************************************************************
* Platform-specific directives and type declarations
***************************************************************************************************
*/
#if defined(_WIN32)
// On Windows, VKAPI should equate to the __stdcall convention
#define VKAPI __stdcall
#elif defined(__GNUC__)
// On other platforms using GCC, VKAPI is defined as empty (default calling convention)
#define VKAPI
#else
// Unsupported Platform!
#error "Unsupported OS Platform detected!"
#endif
#include <stddef.h>
#if !defined(VK_NO_STDINT_H)
#if defined(_MSC_VER) && (_MSC_VER < 1600)
typedef signed __int8 int8_t;
typedef unsigned __int8 uint8_t;
typedef signed __int16 int16_t;
typedef unsigned __int16 uint16_t;
typedef signed __int32 int32_t;
typedef unsigned __int32 uint32_t;
typedef signed __int64 int64_t;
typedef unsigned __int64 uint64_t;
#else
#include <stdint.h>
#endif
#endif // !defined(VK_NO_STDINT_H)
typedef uint64_t VkDeviceSize;
typedef uint32_t bool32_t;
typedef uint32_t VkSampleMask;
typedef uint32_t VkFlags;
#if (UINTPTR_MAX >= UINT64_MAX)
#define VK_UINTPTRLEAST64_MAX UINTPTR_MAX
typedef uintptr_t VkUintPtrLeast64;
#else
#define VK_UINTPTRLEAST64_MAX UINT64_MAX
typedef uint64_t VkUintPtrLeast64;
#endif
#ifdef __cplusplus
} // extern "C"
#endif // __cplusplus
#endif // __VK_PLATFORM_H__
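
Note on `VkUintPtrLeast64`: the `#if` above picks `uintptr_t` when it is at least 64 bits wide and falls back to `uint64_t` otherwise, so the type can always hold either a pointer or a 64-bit value. A minimal sketch of what that guarantees follows; `handle_from_ptr` and `ptr_from_handle` are hypothetical helpers and are not part of the header.

```c
#include <stddef.h>
#include <stdint.h>

/* Same selection logic as in the header. */
#if (UINTPTR_MAX >= UINT64_MAX)
typedef uintptr_t VkUintPtrLeast64;
#else
typedef uint64_t VkUintPtrLeast64;
#endif

/* Hypothetical helper: store a pointer in the handle-sized integer. */
static VkUintPtrLeast64
handle_from_ptr(void *p)
{
    return (VkUintPtrLeast64)(uintptr_t)p;
}

/* Hypothetical helper: recover the pointer stored by handle_from_ptr(). */
static void *
ptr_from_handle(VkUintPtrLeast64 h)
{
    return (void *)(uintptr_t)h;
}
```

Going through `uintptr_t` in both directions keeps the round trip well defined on 32-bit targets, where the handle type is wider than a pointer.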

@@ -0,0 +1,216 @@
//
// File: vk_wsi_display.h
//
/*
** Copyright (c) 2014 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
** "Materials"), to deal in the Materials without restriction, including
** without limitation the rights to use, copy, modify, merge, publish,
** distribute, sublicense, and/or sell copies of the Materials, and to
** permit persons to whom the Materials are furnished to do so, subject to
** the following conditions:
**
** The above copyright notice and this permission notice shall be included
** in all copies or substantial portions of the Materials.
**
** THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
** EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
** MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
** IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
** CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
** TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
** MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
*/
#ifndef __VK_WSI_LUNARG_H__
#define __VK_WSI_LUNARG_H__
#include "vulkan.h"
#define VK_WSI_LUNARG_REVISION 3
#define VK_WSI_LUNARG_EXTENSION_NUMBER 1
#ifdef __cplusplus
extern "C"
{
#endif // __cplusplus
// This macro defines INT_MAX in enumerations to force compilers to use 32 bits
// to represent them. This may or may not be necessary on some compilers. The
// option to compile it out may allow compilers that warn about missing enumerants
// in switch statements to be silenced.
// Using this macro is not needed for flag bit enums because those aren't used
// as storage type anywhere.
#define VK_MAX_ENUM(Prefix) VK_##Prefix##_MAX_ENUM = 0x7FFFFFFF
// This macro defines the BEGIN_RANGE, END_RANGE, NUM, and MAX_ENUM constants for
// the enumerations.
#define VK_ENUM_RANGE(Prefix, First, Last) \
VK_##Prefix##_BEGIN_RANGE = VK_##Prefix##_##First, \
VK_##Prefix##_END_RANGE = VK_##Prefix##_##Last, \
VK_NUM_##Prefix = (VK_##Prefix##_END_RANGE - VK_##Prefix##_BEGIN_RANGE + 1), \
VK_MAX_ENUM(Prefix)
// This is a helper macro to define the value of flag bit enum values.
#define VK_BIT(bit) (1 << (bit))
// ------------------------------------------------------------------------------------------------
// Objects
VK_DEFINE_DISP_SUBCLASS_HANDLE(VkDisplayWSI, VkObject)
VK_DEFINE_DISP_SUBCLASS_HANDLE(VkSwapChainWSI, VkObject)
// ------------------------------------------------------------------------------------------------
// Enumeration constants
#define VK_WSI_LUNARG_ENUM(type,id) ((type)(VK_WSI_LUNARG_EXTENSION_NUMBER * -1000 + (id)))
// Extend VkPhysicalDeviceInfoType enum with extension specific constants
#define VK_PHYSICAL_DEVICE_INFO_TYPE_DISPLAY_PROPERTIES_WSI VK_WSI_LUNARG_ENUM(VkPhysicalDeviceInfoType, 0)
#define VK_PHYSICAL_DEVICE_INFO_TYPE_QUEUE_PRESENT_PROPERTIES_WSI VK_WSI_LUNARG_ENUM(VkPhysicalDeviceInfoType, 1)
// Extend VkStructureType enum with extension specific constants
#define VK_STRUCTURE_TYPE_SWAP_CHAIN_CREATE_INFO_WSI VK_WSI_LUNARG_ENUM(VkStructureType, 0)
#define VK_STRUCTURE_TYPE_PRESENT_INFO_WSI VK_WSI_LUNARG_ENUM(VkStructureType, 1)
// Extend VkImageLayout enum with extension specific constants
#define VK_IMAGE_LAYOUT_PRESENT_SOURCE_WSI VK_WSI_LUNARG_ENUM(VkImageLayout, 0)
// Extend VkObjectType enum for new objects
#define VK_OBJECT_TYPE_DISPLAY_WSI VK_WSI_LUNARG_ENUM(VkObjectType, 0)
#define VK_OBJECT_TYPE_SWAP_CHAIN_WSI VK_WSI_LUNARG_ENUM(VkObjectType, 1)
// ------------------------------------------------------------------------------------------------
// Enumerations
typedef enum VkDisplayInfoTypeWSI_
{
// Info type for vkGetDisplayInfo()
VK_DISPLAY_INFO_TYPE_FORMAT_PROPERTIES_WSI = 0x00000003, // Return the VkFormat(s) supported for swap chains with the display
VK_ENUM_RANGE(DISPLAY_INFO_TYPE, FORMAT_PROPERTIES_WSI, FORMAT_PROPERTIES_WSI)
} VkDisplayInfoTypeWSI;
typedef enum VkSwapChainInfoTypeWSI_
{
// Info type for vkGetSwapChainInfo()
VK_SWAP_CHAIN_INFO_TYPE_PERSISTENT_IMAGES_WSI = 0x00000000, // Return information about the persistent images of the swapchain
VK_ENUM_RANGE(SWAP_CHAIN_INFO_TYPE, PERSISTENT_IMAGES_WSI, PERSISTENT_IMAGES_WSI)
} VkSwapChainInfoTypeWSI;
// ------------------------------------------------------------------------------------------------
// Flags
typedef VkFlags VkSwapModeFlagsWSI;
typedef enum VkSwapModeFlagBitsWSI_
{
VK_SWAP_MODE_FLIP_BIT_WSI = VK_BIT(0),
VK_SWAP_MODE_BLIT_BIT_WSI = VK_BIT(1),
} VkSwapModeFlagBitsWSI;
// ------------------------------------------------------------------------------------------------
// Structures
typedef struct VkDisplayPropertiesWSI_
{
VkDisplayWSI display; // Handle of the display object
VkExtent2D physicalResolution; // Max resolution for CRT?
} VkDisplayPropertiesWSI;
typedef struct VkDisplayFormatPropertiesWSI_
{
VkFormat swapChainFormat; // Format of the images of the swap chain
} VkDisplayFormatPropertiesWSI;
typedef struct VkSwapChainCreateInfoWSI_
{
VkStructureType sType; // Must be VK_STRUCTURE_TYPE_SWAP_CHAIN_CREATE_INFO_WSI
const void* pNext; // Pointer to next structure
// TBD: It is not yet clear what the use will be for the following two
// values. It seems to be needed for more-global window-system handles
// (e.g. X11 display). If not needed for the SDK, we will drop it from
// this extension, and from a future version of this header.
const void* pNativeWindowSystemHandle; // Pointer to native window system handle
const void* pNativeWindowHandle; // Pointer to native window handle
uint32_t displayCount; // Number of displays the swap chain is created for
const VkDisplayWSI* pDisplays; // displayCount number of display objects the swap chain is created for
uint32_t imageCount; // Number of images in the swap chain
VkFormat imageFormat; // Format of the images of the swap chain
VkExtent2D imageExtent; // Width and height of the images of the swap chain
uint32_t imageArraySize; // Number of layers of the images of the swap chain (needed for multi-view rendering)
VkFlags imageUsageFlags; // Usage flags for the images of the swap chain (see VkImageUsageFlags)
VkFlags swapModeFlags; // Allowed swap modes (see VkSwapModeFlagsWSI)
} VkSwapChainCreateInfoWSI;
typedef struct VkSwapChainImageInfoWSI_
{
VkImage image; // Persistent swap chain image handle
VkDeviceMemory memory; // Persistent swap chain image's memory handle
} VkSwapChainImageInfoWSI;
typedef struct VkPhysicalDeviceQueuePresentPropertiesWSI_
{
bool32_t supportsPresent; // Tells whether the queue supports presenting
} VkPhysicalDeviceQueuePresentPropertiesWSI;
typedef struct VkPresentInfoWSI_
{
VkStructureType sType; // Must be VK_STRUCTURE_TYPE_PRESENT_INFO_WSI
const void* pNext; // Pointer to next structure
VkImage image; // Image to present
uint32_t flipInterval; // Flip interval
} VkPresentInfoWSI;
// ------------------------------------------------------------------------------------------------
// Function types
typedef VkResult (VKAPI *PFN_vkGetDisplayInfoWSI)(VkDisplayWSI display, VkDisplayInfoTypeWSI infoType, size_t* pDataSize, void* pData);
typedef VkResult (VKAPI *PFN_vkCreateSwapChainWSI)(VkDevice device, const VkSwapChainCreateInfoWSI* pCreateInfo, VkSwapChainWSI* pSwapChain);
typedef VkResult (VKAPI *PFN_vkDestroySwapChainWSI)(VkSwapChainWSI swapChain);
typedef VkResult (VKAPI *PFN_vkGetSwapChainInfoWSI)(VkSwapChainWSI swapChain, VkSwapChainInfoTypeWSI infoType, size_t* pDataSize, void* pData);
typedef VkResult (VKAPI *PFN_vkQueuePresentWSI)(VkQueue queue, const VkPresentInfoWSI* pPresentInfo);
// ------------------------------------------------------------------------------------------------
// Function prototypes
#ifdef VK_PROTOTYPES
VkResult VKAPI vkGetDisplayInfoWSI(
VkDisplayWSI display,
VkDisplayInfoTypeWSI infoType,
size_t* pDataSize,
void* pData);
VkResult VKAPI vkCreateSwapChainWSI(
VkDevice device,
const VkSwapChainCreateInfoWSI* pCreateInfo,
VkSwapChainWSI* pSwapChain);
VkResult VKAPI vkDestroySwapChainWSI(
VkSwapChainWSI swapChain);
VkResult VKAPI vkGetSwapChainInfoWSI(
VkSwapChainWSI swapChain,
VkSwapChainInfoTypeWSI infoType,
size_t* pDataSize,
void* pData);
VkResult VKAPI vkQueuePresentWSI(
VkQueue queue,
const VkPresentInfoWSI* pPresentInfo);
#endif // VK_PROTOTYPES
#ifdef __cplusplus
} // extern "C"
#endif // __cplusplus
#endif // __VK_WSI_LUNARG_H__
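
Note on `VK_ENUM_RANGE`: the macro appends four bookkeeping constants to an enum by token pasting. The sketch below copies the two macros from this header and applies them to a made-up enum so the expansion can be inspected in isolation. `VkExampleMode` and `example_mode_in_range` are hypothetical and are not part of the extension.

```c
/* Copied from the header above. */
#define VK_MAX_ENUM(Prefix) VK_##Prefix##_MAX_ENUM = 0x7FFFFFFF
#define VK_ENUM_RANGE(Prefix, First, Last) \
    VK_##Prefix##_BEGIN_RANGE = VK_##Prefix##_##First, \
    VK_##Prefix##_END_RANGE = VK_##Prefix##_##Last, \
    VK_NUM_##Prefix = (VK_##Prefix##_END_RANGE - VK_##Prefix##_BEGIN_RANGE + 1), \
    VK_MAX_ENUM(Prefix)

/* Hypothetical enum, used only to show what the macro generates. */
typedef enum VkExampleMode_
{
    VK_EXAMPLE_MODE_OFF = 0,
    VK_EXAMPLE_MODE_LOW = 1,
    VK_EXAMPLE_MODE_HIGH = 2,
    /* Generates VK_EXAMPLE_MODE_BEGIN_RANGE, VK_EXAMPLE_MODE_END_RANGE,
     * VK_NUM_EXAMPLE_MODE and VK_EXAMPLE_MODE_MAX_ENUM. */
    VK_ENUM_RANGE(EXAMPLE_MODE, OFF, HIGH)
} VkExampleMode;

/* Hypothetical validity check built on the generated constants. */
static int
example_mode_in_range(int value)
{
    return value >= VK_EXAMPLE_MODE_BEGIN_RANGE &&
           value <= VK_EXAMPLE_MODE_END_RANGE;
}
```

For the single-valued enums in this header, `First` and `Last` name the same enumerant, so the generated `VK_NUM_*` constant is 1.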

2753
include/vulkan/vulkan-130.h Normal file

File diff suppressed because it is too large

2894
include/vulkan/vulkan-90.h Normal file

File diff suppressed because it is too large

2753
include/vulkan/vulkan.h Normal file

File diff suppressed because it is too large

@@ -0,0 +1,61 @@
/*
* Copyright © 2015 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#ifndef __VULKAN_INTEL_H__
#define __VULKAN_INTEL_H__
#include "vulkan.h"
#ifdef __cplusplus
extern "C"
{
#endif // __cplusplus
#define VK_STRUCTURE_TYPE_DMA_BUF_IMAGE_CREATE_INFO_INTEL 1024
typedef struct VkDmaBufImageCreateInfo_
{
VkStructureType sType; // Must be VK_STRUCTURE_TYPE_DMA_BUF_IMAGE_CREATE_INFO_INTEL
const void* pNext; // Pointer to next structure.
int fd;
VkFormat format;
VkExtent3D extent; // Depth must be 1
uint32_t strideInBytes;
} VkDmaBufImageCreateInfo;
typedef VkResult (VKAPI *PFN_vkCreateDmaBufImageINTEL)(VkDevice device, const VkDmaBufImageCreateInfo* pCreateInfo, VkDeviceMemory* pMem, VkImage* pImage);
#ifdef VK_PROTOTYPES
VkResult VKAPI vkCreateDmaBufImageINTEL(
VkDevice _device,
const VkDmaBufImageCreateInfo* pCreateInfo,
VkDeviceMemory* pMem,
VkImage* pImage);
#endif
#ifdef __cplusplus
} // extern "C"
#endif // __cplusplus
#endif // __VULKAN_INTEL_H__

@@ -3,9 +3,9 @@
if BUILD_SHARED
if HAVE_COMPAT_SYMLINKS
all-local : .libs/install-mesa-links
all-local : .install-mesa-links
.libs/install-mesa-links : $(lib_LTLIBRARIES)
.install-mesa-links : $(lib_LTLIBRARIES)
$(AM_V_GEN)$(MKDIR_P) $(top_builddir)/$(LIB_DIR); \
for f in $(join $(addsuffix .libs/,$(dir $(lib_LTLIBRARIES))),$(notdir $(lib_LTLIBRARIES:%.la=%.$(LIB_EXT)*))); do \
if test -h .libs/$$f; then \
@@ -14,5 +14,12 @@ all-local : .libs/install-mesa-links
ln -f $$f $(top_builddir)/$(LIB_DIR); \
fi; \
done && touch $@
clean-local:
for f in $(notdir $(lib_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)); do \
$(RM) $(top_builddir)/$(LIB_DIR)/$$f; \
done;
$(RM) .install-mesa-links
endif
endif

@@ -0,0 +1,63 @@
# ===========================================================================
#
# SYNOPSIS
#
# AX_CHECK_PYTHON_MAKO_MODULE(MIN_VERSION_NUMBER)
#
# DESCRIPTION
#
# Check whether Python mako module is installed and its version higher than
# minimum requested.
#
# Example of its use:
#
# For example, the minimum mako version would be 0.7.3. Then configure.ac
# would contain:
#
# AX_CHECK_PYTHON_MAKO_MODULE(0.7.3)
#
# LICENSE
#
# Copyright (c) 2014 Intel Corporation.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to
# deal in the Software without restriction, including without limitation the
# rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
# sell copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
dnl macro that checks for mako module in python
AC_DEFUN([AX_CHECK_PYTHON_MAKO_MODULE],
[AC_MSG_CHECKING(if module mako in python is installed)
echo "
try:
    import sys
    import mako
except ImportError as err:
    sys.exit(err)
else:
    ver_req = map(int, '$1'.split('.'))
    ver_act = map(int, mako.__version__.split('.'))
    sys.exit(int(ver_req > ver_act))
" | $PYTHON2 -
if test $? -ne 0 ; then
AC_MSG_RESULT(no)
AC_SUBST(acv_mako_found, 'no')
else
AC_MSG_RESULT(yes)
AC_SUBST(acv_mako_found, 'yes')
fi
])
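
Reviewer note: the script embedded in this macro compares versions as lists of integers and exits non-zero when the required version is greater than the installed one. It relies on Python 2, where `map` returns a list. The sketch below shows the same comparison in a form that also runs on Python 3. The names `parse_version` and `meets_minimum` are hypothetical and are not part of the macro. Like the macro, it assumes purely numeric dotted versions.

```python
def parse_version(text):
    """Turn a dotted version such as '0.7.3' into a tuple of ints.

    Assumes every component is numeric, as the macro above does.
    """
    return tuple(int(part) for part in text.split('.'))


def meets_minimum(actual, required):
    """Return True when `actual` is at least `required`."""
    return parse_version(actual) >= parse_version(required)


if __name__ == '__main__':
    # Integer tuples order 0.10.0 after 0.9.9, which plain string
    # comparison would get wrong.
    print(meets_minimum('0.10.0', '0.9.9'))
    print(meets_minimum('0.7.2', '0.7.3'))
```

An installed version equal to the minimum passes the check, which matches the macro's `ver_req > ver_act` exit condition.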


@@ -42,7 +42,7 @@
# modified version of the Autoconf Macro, you may extend this special
# exception to the GPL to apply to your modified version as well.
#serial 9
#serial 12
# mattst88:
# Replaced m4_ifnblank(...) with m4_ifval(m4_normalize(...), ...)
@@ -53,7 +53,7 @@ AC_DEFUN([AX_PROG_FLEX], [
AC_REQUIRE([AC_PROG_EGREP])
AC_CACHE_CHECK([if flex is the lexer generator],[ax_cv_prog_flex],[
AS_IF([$LEX --version 2>/dev/null | $EGREP -q '^\<flex\>'],
AS_IF([$LEX --version 2>/dev/null | $EGREP -qw '^g?flex'],
[ax_cv_prog_flex=yes], [ax_cv_prog_flex=no])
])
AS_IF([test "$ax_cv_prog_flex" = "yes"],
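
Reviewer note: the new pattern `-qw '^g?flex'` accepts a banner that starts with either `flex` or `gflex` as a whole word. A rough Python counterpart is sketched below to make the accepted and rejected inputs concrete. `looks_like_flex` is a hypothetical helper, and `\b` only approximates grep's `-w` option.

```python
import re

# Approximation of: $EGREP -qw '^g?flex'
# MULTILINE lets '^' match at the start of any line of the output.
FLEX_BANNER = re.compile(r'^g?flex\b', re.MULTILINE)


def looks_like_flex(version_output):
    """Return True if some line starts with the word 'flex' or 'gflex'."""
    return FLEX_BANNER.search(version_output) is not None


if __name__ == '__main__':
    for banner in ('flex 2.5.39', 'gflex 2.5.4', 'lex 1.0', 'flexible 1.0'):
        print(banner, looks_like_flex(banner))
```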


@@ -35,7 +35,7 @@ import os
import os.path
import re
import subprocess
import platform as _platform
import platform as host_platform
import sys
import tempfile
@@ -87,6 +87,25 @@ def createInstallMethods(env):
env.AddMethod(install_shared_library, 'InstallSharedLibrary')
def msvc2013_compat(env):
if env['gcc']:
env.Append(CCFLAGS = [
'-Werror=vla',
'-Werror=pointer-arith',
])
def msvc2008_compat(env):
msvc2013_compat(env)
if env['gcc']:
env.Append(CFLAGS = [
'-Werror=declaration-after-statement',
])
def createMSVCCompatMethods(env):
env.AddMethod(msvc2013_compat, 'MSVC2013Compat')
env.AddMethod(msvc2008_compat, 'MSVC2008Compat')
def num_jobs():
try:
return int(os.environ['NUMBER_OF_PROCESSORS'])
@@ -128,6 +147,17 @@ def check_cc(env, cc, expr, cpp_opt = '-E'):
return result
def check_prog(env, prog):
"""Check whether this program exists."""
sys.stdout.write('Checking for %s ... ' % prog)
result = env.Detect(prog)
sys.stdout.write(' %s\n' % ['no', 'yes'][int(bool(result))])
return result
def generate(env):
"""Common environment generation code"""
@@ -167,7 +197,7 @@ def generate(env):
env['gcc'] = 0
env['clang'] = 0
env['msvc'] = 0
if _platform.system() == 'Windows':
if host_platform.system() == 'Windows':
env['msvc'] = check_cc(env, 'MSVC', 'defined(_MSC_VER)', '/E')
if not env['msvc']:
env['gcc'] = check_cc(env, 'GCC', 'defined(__GNUC__) && !defined(__clang__)')
@@ -191,10 +221,10 @@ def generate(env):
# Determine whether we are cross compiling; in particular, whether we need
# to compile code generators with a different compiler as the target code.
host_platform = _platform.system().lower()
if host_platform.startswith('cygwin'):
host_platform = 'cygwin'
host_machine = os.environ.get('PROCESSOR_ARCHITEW6432', os.environ.get('PROCESSOR_ARCHITECTURE', _platform.machine()))
hosthost_platform = host_platform.system().lower()
if hosthost_platform.startswith('cygwin'):
hosthost_platform = 'cygwin'
host_machine = os.environ.get('PROCESSOR_ARCHITEW6432', os.environ.get('PROCESSOR_ARCHITECTURE', host_platform.machine()))
host_machine = {
'x86': 'x86',
'i386': 'x86',
@@ -205,7 +235,7 @@ def generate(env):
'AMD64': 'x86_64',
'x86_64': 'x86_64',
}.get(host_machine, 'generic')
env['crosscompile'] = platform != host_platform
env['crosscompile'] = platform != hosthost_platform
if machine == 'x86_64' and host_machine != 'x86_64':
env['crosscompile'] = True
env['hostonly'] = False
@@ -283,6 +313,7 @@ def generate(env):
'_SVID_SOURCE',
'_BSD_SOURCE',
'_GNU_SOURCE',
'_DEFAULT_SOURCE',
'HAVE_PTHREAD',
'HAVE_POSIX_MEMALIGN',
]
@@ -331,6 +362,7 @@ def generate(env):
'_SCL_SECURE_NO_WARNINGS',
'_SCL_SECURE_NO_DEPRECATE',
'_ALLOW_KEYWORD_MACROS',
'_HAS_EXCEPTIONS=0', # Tell C++ STL to not use exceptions
]
if env['build'] in ('debug', 'checked'):
cppdefines += ['_DEBUG']
@@ -342,6 +374,26 @@ def generate(env):
print 'warning: Floating-point textures enabled.'
print 'warning: Please consult docs/patents.txt with your lawyer before building Mesa.'
cppdefines += ['TEXTURE_FLOAT_ENABLED']
if gcc_compat:
ccversion = env['CCVERSION']
cppdefines += [
'HAVE___BUILTIN_EXPECT',
'HAVE___BUILTIN_FFS',
'HAVE___BUILTIN_FFSLL',
'HAVE_FUNC_ATTRIBUTE_FLATTEN',
'HAVE_FUNC_ATTRIBUTE_UNUSED',
# GCC 3.0
'HAVE_FUNC_ATTRIBUTE_FORMAT',
'HAVE_FUNC_ATTRIBUTE_PACKED',
# GCC 3.4
'HAVE___BUILTIN_CTZ',
'HAVE___BUILTIN_POPCOUNT',
'HAVE___BUILTIN_POPCOUNTLL',
'HAVE___BUILTIN_CLZ',
'HAVE___BUILTIN_CLZLL',
]
if distutils.version.LooseVersion(ccversion) >= distutils.version.LooseVersion('4.5'):
cppdefines += ['HAVE___BUILTIN_UNREACHABLE']
env.Append(CPPDEFINES = cppdefines)
# C compiler options
@@ -377,23 +429,19 @@ def generate(env):
'-m32',
#'-march=pentium4',
]
if distutils.version.LooseVersion(ccversion) >= distutils.version.LooseVersion('4.2') \
and (platform != 'windows' or env['build'] == 'debug' or True) \
and platform != 'haiku':
if platform != 'haiku':
# NOTE: We need to ensure stack is realigned given that we
# produce shared objects, and have no control over the stack
# alignment policy of the application. Therefore we need
# -mstackrealign or -mincoming-stack-boundary=2.
#
# XXX: -O and -mstackrealign causes stack corruption on MinGW
#
# XXX: We could have SSE without -mstackrealign if we always used
# __attribute__((force_align_arg_pointer)), but that's not
# always the case.
ccflags += [
'-mstackrealign', # ensure stack is aligned
'-mmmx', '-msse', '-msse2', # enable SIMD intrinsics
#'-mfpmath=sse',
'-msse', '-msse2', # enable SIMD intrinsics
'-mfpmath=sse', # generate SSE floating-point arithmetic
]
if platform in ['windows', 'darwin']:
# Workaround http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37216
@@ -422,13 +470,6 @@ def generate(env):
'-Wmissing-prototypes',
'-std=gnu99',
]
if distutils.version.LooseVersion(ccversion) >= distutils.version.LooseVersion('4.2'):
ccflags += [
'-Wpointer-arith',
]
cflags += [
'-Wdeclaration-after-statement',
]
if icc:
cflags += [
'-std=gnu99',
@@ -465,14 +506,19 @@ def generate(env):
]
ccflags += [
'/W3', # warning level
'/wd4018', # signed/unsigned mismatch
'/wd4056', # overflow in floating-point constant arithmetic
'/wd4244', # conversion from 'type1' to 'type2', possible loss of data
'/wd4267', # 'var' : conversion from 'size_t' to 'type', possible loss of data
'/wd4305', # truncation from 'type1' to 'type2'
'/wd4351', # new behavior: elements of array 'array' will be default initialized
'/wd4756', # overflow in constant arithmetic
'/wd4800', # forcing value to bool 'true' or 'false' (performance warning)
'/wd4996', # disable deprecated POSIX name warnings
]
if env['machine'] == 'x86':
ccflags += [
#'/arch:SSE2', # use the SSE2 instructions
'/arch:SSE2', # use the SSE2 instructions (default since MSVC 2012)
]
if platform == 'windows':
ccflags += [
@@ -503,6 +549,7 @@ def generate(env):
env.Append(CCFLAGS = [
'/analyze',
#'/analyze:log', '${TARGET.base}.xml',
'/wd28251', # Inconsistent annotation for function
])
if env['clang']:
# scan-build will produce more comprehensive output
@@ -587,46 +634,56 @@ def generate(env):
env.Append(CCFLAGS = ['-fopenmp'])
env.Append(LIBS = ['gomp'])
if gcc_compat:
ccversion = env['CCVERSION']
cppdefines += [
'HAVE___BUILTIN_EXPECT',
'HAVE___BUILTIN_FFS',
'HAVE___BUILTIN_FFSLL',
'HAVE_FUNC_ATTRIBUTE_FLATTEN',
]
if distutils.version.LooseVersion(ccversion) >= distutils.version.LooseVersion('3'):
cppdefines += [
'HAVE_FUNC_ATTRIBUTE_FORMAT',
'HAVE_FUNC_ATTRIBUTE_PACKED',
]
if distutils.version.LooseVersion(ccversion) >= distutils.version.LooseVersion('3.4'):
cppdefines += [
'HAVE___BUILTIN_CTZ',
'HAVE___BUILTIN_POPCOUNT',
'HAVE___BUILTIN_POPCOUNTLL',
'HAVE___BUILTIN_CLZ',
'HAVE___BUILTIN_CLZLL',
]
if distutils.version.LooseVersion(ccversion) >= distutils.version.LooseVersion('4.5'):
cppdefines += ['HAVE___BUILTIN_UNREACHABLE']
# Load tools
env.Tool('lex')
if env['msvc']:
env.Append(LEXFLAGS = [
# Force flex to use const keyword in prototypes, as relies on
# __cplusplus or __STDC__ macro to determine whether it's safe to
# use const keyword, but MSVC never defines __STDC__ unless we
# disable all MSVC extensions.
'-DYY_USE_CONST=',
])
# Flex relies on __STDC_VERSION__>=199901L to decide when to include
# C99 inttypes.h. We always have inttypes.h available with MSVC
# (either the one bundled with MSVC 2013, or the one we bundle
# ourselves), but we can't just define __STDC_VERSION__ without
# breaking stuff, as MSVC doesn't fully support C99. There's also no
# way to preemptively include stdint.
env.Append(CCFLAGS = ['-FIinttypes.h'])
if host_platform.system() == 'Windows':
# Prefer winflexbison binaries, as not only they are easier to install
# (no additional dependencies), but also better Windows support.
if check_prog(env, 'win_flex'):
env["LEX"] = 'win_flex'
env.Append(LEXFLAGS = [
# windows compatibility (uses <io.h> instead of <unistd.h> and
# _isatty, _fileno functions)
'--wincompat'
])
env.Tool('yacc')
if host_platform.system() == 'Windows':
if check_prog(env, 'win_bison'):
env["YACC"] = 'win_bison'
if env['llvm']:
env.Tool('llvm')
# Custom builders and methods
env.Tool('custom')
createInstallMethods(env)
createMSVCCompatMethods(env)
env.PkgCheckModules('X11', ['x11', 'xext', 'xdamage', 'xfixes'])
env.PkgCheckModules('X11', ['x11', 'xext', 'xdamage', 'xfixes', 'glproto >= 1.4.13'])
env.PkgCheckModules('XCB', ['x11-xcb', 'xcb-glx >= 1.8.1', 'xcb-dri2 >= 1.8'])
env.PkgCheckModules('XF86VIDMODE', ['xxf86vm'])
env.PkgCheckModules('DRM', ['libdrm >= 2.4.38'])
env.PkgCheckModules('UDEV', ['libudev >= 151'])
if env['x11']:
env.Append(CPPPATH = env['X11_CPPPATH'])
env['dri'] = env['x11'] and env['drm']
# for debugging
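
Reviewer note: the cross-compile detection in this file normalizes the host machine name and then compares host and target. A standalone sketch of that logic follows. The function names are hypothetical. The alias table lists only the entries visible in the hunks above, since the middle of the original table is elided in this diff.

```python
import os
import platform

# Only the aliases visible in the diff; the full table is longer.
MACHINE_ALIASES = {
    'x86': 'x86',
    'i386': 'x86',
    'AMD64': 'x86_64',
    'x86_64': 'x86_64',
}


def normalize_machine(raw):
    """Map a raw machine string to a canonical name, else 'generic'."""
    return MACHINE_ALIASES.get(raw, 'generic')


def detect_host():
    """Return (platform, machine) for the machine running the build."""
    system = platform.system().lower()
    if system.startswith('cygwin'):
        system = 'cygwin'
    # On Windows, PROCESSOR_ARCHITEW6432 reports the real architecture
    # when a 32-bit process runs on a 64-bit OS.
    raw = os.environ.get(
        'PROCESSOR_ARCHITEW6432',
        os.environ.get('PROCESSOR_ARCHITECTURE', platform.machine()))
    return system, normalize_machine(raw)


def is_cross_compile(target_platform, target_machine,
                     host_platform, host_machine):
    """Mirror the two conditions that set env['crosscompile']."""
    if target_platform != host_platform:
        return True
    return target_machine == 'x86_64' and host_machine != 'x86_64'
```

The second condition is asymmetric: building x86 code on an x86_64 host is not treated as cross compilation, while the reverse is.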


@@ -72,18 +72,25 @@ def generate(env):
return
# Try to determine the LLVM version from llvm/Config/config.h
llvm_config = os.path.join(llvm_dir, 'include/llvm/Config/config.h')
llvm_config = os.path.join(llvm_dir, 'include/llvm/Config/llvm-config.h')
if not os.path.exists(llvm_config):
print 'scons: could not find %s' % llvm_config
return
llvm_version_re = re.compile(r'^#define PACKAGE_VERSION "([^"]*)"')
llvm_version_major_re = re.compile(r'^#define LLVM_VERSION_MAJOR ([0-9]+)')
llvm_version_minor_re = re.compile(r'^#define LLVM_VERSION_MINOR ([0-9]+)')
llvm_version = None
llvm_version_major = None
llvm_version_minor = None
for line in open(llvm_config, 'rt'):
mo = llvm_version_re.match(line)
mo = llvm_version_major_re.match(line)
if mo:
llvm_version = mo.group(1)
llvm_version = distutils.version.LooseVersion(llvm_version)
break
llvm_version_major = mo.group(1)
mo = llvm_version_minor_re.match(line)
if mo:
llvm_version_minor = mo.group(1)
if llvm_version_major is not None and llvm_version_minor is not None:
llvm_version = distutils.version.LooseVersion('%s.%s' % (llvm_version_major, llvm_version_minor))
if llvm_version is None:
print 'scons: could not determine the LLVM version from %s' % llvm_config
return
@@ -98,9 +105,35 @@ def generate(env):
'HAVE_STDINT_H',
])
env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')])
if True:
# 3.2
# LIBS should match the output of `llvm-config --libs engine mcjit bitwriter x86asmprinter`
if llvm_version >= distutils.version.LooseVersion('3.6'):
env.Prepend(LIBS = [
'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',
'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
'LLVMCodeGen', 'LLVMScalarOpts', 'LLVMProfileData',
'LLVMInstCombine', 'LLVMTransformUtils', 'LLVMipa',
'LLVMAnalysis', 'LLVMX86Desc', 'LLVMMCDisassembler',
'LLVMX86Info', 'LLVMX86AsmPrinter', 'LLVMX86Utils',
'LLVMMCJIT', 'LLVMTarget', 'LLVMExecutionEngine',
'LLVMRuntimeDyld', 'LLVMObject', 'LLVMMCParser',
'LLVMBitReader', 'LLVMMC', 'LLVMCore', 'LLVMSupport'
])
elif llvm_version >= distutils.version.LooseVersion('3.5'):
env.Prepend(LIBS = [
'LLVMMCDisassembler',
'LLVMBitWriter', 'LLVMMCJIT', 'LLVMRuntimeDyld',
'LLVMX86Disassembler', 'LLVMX86AsmParser', 'LLVMX86CodeGen',
'LLVMSelectionDAG', 'LLVMAsmPrinter', 'LLVMX86Desc',
'LLVMObject', 'LLVMMCParser', 'LLVMBitReader', 'LLVMX86Info',
'LLVMX86AsmPrinter', 'LLVMX86Utils', 'LLVMJIT',
'LLVMExecutionEngine', 'LLVMCodeGen', 'LLVMScalarOpts',
'LLVMInstCombine', 'LLVMTransformUtils', 'LLVMipa',
'LLVMAnalysis', 'LLVMTarget', 'LLVMMC', 'LLVMCore',
'LLVMSupport'
])
else:
env.Prepend(LIBS = [
'LLVMMCDisassembler',
'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',
'LLVMX86CodeGen', 'LLVMX86Desc', 'LLVMSelectionDAG',
'LLVMAsmPrinter', 'LLVMMCParser', 'LLVMX86AsmPrinter',
@@ -120,6 +153,11 @@ def generate(env):
# Some of the LLVM C headers use the inline keyword without
# defining it.
env.Append(CPPDEFINES = [('inline', '__inline')])
# Match some of the warning options from llvm/cmake/modules/HandleLLVMOptions.cmake
env.AppendUnique(CXXFLAGS = [
'/wd4355', # 'this' : used in base member initializer list
'/wd4624', # 'derived class' : destructor could not be generated because a base class destructor is inaccessible
])
if env['build'] in ('debug', 'checked'):
# LLVM libraries are static, build with /MT, and they
# automatically link against LIBCMT. When we're doing a
@@ -153,7 +191,7 @@ def generate(env):
if '-fno-rtti' in cxxflags:
env.Append(CXXFLAGS = ['-fno-rtti'])
components = ['engine', 'mcjit', 'bitwriter', 'x86asmprinter']
components = ['engine', 'mcjit', 'bitwriter', 'x86asmprinter', 'mcdisassembler']
env.ParseConfig('llvm-config --libs ' + ' '.join(components))
env.ParseConfig('llvm-config --ldflags')
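
Reviewer note: the first hunk of this file switches from a single `PACKAGE_VERSION` string to separate `LLVM_VERSION_MAJOR` and `LLVM_VERSION_MINOR` defines read from `llvm-config.h`. The sketch below shows that parsing step on its own, working on header text instead of a file. `parse_llvm_version` is a hypothetical name. It returns an integer tuple where the patch builds a `distutils.version.LooseVersion`, which is a substitution made here because `distutils` is no longer shipped with recent Python releases.

```python
import re

MAJOR_RE = re.compile(r'^#define LLVM_VERSION_MAJOR ([0-9]+)')
MINOR_RE = re.compile(r'^#define LLVM_VERSION_MINOR ([0-9]+)')


def parse_llvm_version(header_text):
    """Return (major, minor) from llvm-config.h text, or None if missing."""
    major = None
    minor = None
    for line in header_text.splitlines():
        mo = MAJOR_RE.match(line)
        if mo:
            major = int(mo.group(1))
        mo = MINOR_RE.match(line)
        if mo:
            minor = int(mo.group(1))
    if major is None or minor is None:
        return None
    return (major, minor)


if __name__ == '__main__':
    sample = ('#define LLVM_VERSION_MAJOR 3\n'
              '#define LLVM_VERSION_MINOR 6\n')
    version = parse_llvm_version(sample)
    # Tuples compare component by component, so the library-list
    # selection above could be written as version >= (3, 6).
    print(version, version >= (3, 6))
```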


@@ -19,7 +19,9 @@
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
SUBDIRS = gtest util mapi
AUTOMAKE_OPTIONS = subdir-objects
SUBDIRS = . gtest util mapi/glapi/gen mapi
if NEED_OPENGL_COMMON
SUBDIRS += glsl mesa
@@ -32,7 +34,7 @@ SUBDIRS += glx
endif
if HAVE_EGL_PLATFORM_WAYLAND
SUBDIRS += egl/wayland
SUBDIRS += egl/wayland/wayland-egl egl/wayland/wayland-drm
endif
if HAVE_EGL_DRIVER_DRI2
@@ -51,4 +53,28 @@ if HAVE_GALLIUM
SUBDIRS += gallium
endif
EXTRA_DIST = getopt
EXTRA_DIST = \
egl/drivers/haiku \
egl/docs \
getopt hgl SConscript
AM_CFLAGS = $(VISIBILITY_CFLAGS)
AM_CXXFLAGS = $(VISIBILITY_CXXFLAGS)
if HAVE_VULKAN
SUBDIRS += vulkan
endif
AM_CPPFLAGS = \
-I$(top_srcdir)/include/ \
-I$(top_srcdir)/src/mapi/ \
-I$(top_srcdir)/src/mesa/ \
$(DEFINES)
noinst_LTLIBRARIES = libglsl_util.la
libglsl_util_la_SOURCES = \
mesa/main/imports.c \
mesa/program/prog_hash_table.c \
mesa/program/symbol_table.c \
mesa/program/dummy_errors.c


@@ -12,7 +12,8 @@ if env['hostonly']:
# compilation
Return()
SConscript('loader/SConscript')
if env['platform'] != 'windows':
SConscript('loader/SConscript')
# When env['gles'] is set, the targets defined in mapi/glapi/SConscript are not
# used. libgl-xlib and libgl-gdi adapt themselves to use the targets defined
@@ -27,12 +28,15 @@ if env['platform'] in ['haiku']:
SConscript('mesa/SConscript')
SConscript('mapi/vgapi/SConscript')
if not env['embedded']:
if env['platform'] not in ('cygwin', 'darwin', 'freebsd', 'haiku', 'windows'):
SConscript('glx/SConscript')
if env['platform'] not in ['darwin', 'haiku', 'sunos']:
if env['platform'] not in ['darwin', 'haiku', 'sunos', 'windows']:
if env['dri']:
SConscript('egl/drivers/dri2/SConscript')
SConscript('egl/main/SConscript')
if env['platform'] == 'haiku':
SConscript('egl/drivers/haiku/SConscript')
SConscript('egl/main/SConscript')
if env['gles']:


@@ -32,20 +32,32 @@ LOCAL_SRC_FILES := \
platform_android.c
LOCAL_CFLAGS := \
-DDEFAULT_DRIVER_DIR=\"/system/lib/dri\" \
-DHAVE_SHARED_GLAPI \
-DHAVE_ANDROID_PLATFORM
ifeq ($(MESA_LOLLIPOP_BUILD),true)
LOCAL_CFLAGS_arm := -DDEFAULT_DRIVER_DIR=\"/system/lib/dri\"
LOCAL_CFLAGS_x86 := -DDEFAULT_DRIVER_DIR=\"/system/lib/dri\"
LOCAL_CFLAGS_x86_64 := -DDEFAULT_DRIVER_DIR=\"/system/lib64/dri\"
else
LOCAL_CFLAGS += -DDEFAULT_DRIVER_DIR=\"/system/lib/dri\"
endif
LOCAL_C_INCLUDES := \
$(MESA_TOP)/src/mapi \
$(MESA_TOP)/src/egl/main \
$(MESA_TOP)/src/loader \
$(TARGET_OUT_HEADERS)/libdrm \
$(DRM_GRALLOC_TOP)
LOCAL_STATIC_LIBRARIES := \
libmesa_loader
LOCAL_SHARED_LIBRARIES := libdrm
ifeq ($(shell echo "$(MESA_ANDROID_VERSION) >= 4.2" | bc),1)
LOCAL_SHARED_LIBRARIES += \
libsync
endif
LOCAL_MODULE := libmesa_egl_dri2
include $(MESA_COMMON_MK)


@@ -36,8 +36,9 @@ AM_CFLAGS = \
noinst_LTLIBRARIES = libegl_dri2.la
libegl_dri2_la_SOURCES = \
egl_dri2.c \
egl_dri2.h \
egl_dri2.c
egl_dri2_fallbacks.h
libegl_dri2_la_LIBADD = \
$(top_builddir)/src/loader/libloader.la \
@@ -63,3 +64,10 @@ if HAVE_EGL_PLATFORM_DRM
libegl_dri2_la_SOURCES += platform_drm.c
AM_CFLAGS += -DHAVE_DRM_PLATFORM
endif
if HAVE_EGL_PLATFORM_SURFACELESS
libegl_dri2_la_SOURCES += platform_surfaceless.c
AM_CFLAGS += -DHAVE_SURFACELESS_PLATFORM
endif
EXTRA_DIST = SConscript

Some files were not shown because too many files have changed in this diff