Compare commits

...

1077 Commits

Author SHA1 Message Date
Jason Ekstrand
e9dff5bb99 vk: Add an ICD declaration file 2015-09-24 14:45:58 -07:00
Jason Ekstrand
39cd3783a4 anv: Add support for the ICD loader 2015-09-24 14:45:58 -07:00
Jason Ekstrand
a95f51c1d7 anv: Add a global dispatch table for use in meta operations 2015-09-24 14:45:58 -07:00
Jason Ekstrand
00d18a661f anv/entrypoints: Expose the anv_resolve_entrypoint function 2015-09-24 14:45:58 -07:00
Jason Ekstrand
f5e72695e0 anv/entrypoints: Rename anv_layer to anv_dispatch_table 2015-09-24 14:45:58 -07:00
Jason Ekstrand
913a9b76f7 anv/batch_chain: Remove the current_surface_bo helper
It's no longer used outside anv_batch_chain, so we certainly don't need to
be exporting it.  Inside anv_batch_chain, it's only used twice and it can be
replaced by a single line, so there's really no point.
2015-09-24 08:46:41 -07:00
Jason Ekstrand
bc17f9c9d7 anv/cmd_buffer: Add a helper for getting the surface state base address 2015-09-24 08:42:38 -07:00
Jason Ekstrand
e1a7c721d3 anv/allocator: Don't ever call mremap
This has always been a bit sketchy and neither Kristian nor I have ever
really liked it.
2015-09-24 08:42:14 -07:00
Jason Ekstrand
99e62f5ce8 anv/allocator: Delete the unused center_fd_offset from anv_block_pool 2015-09-24 08:41:56 -07:00
Jason Ekstrand
429665823d anv/allocator: Do a better job of centering bi-directional block pools 2015-09-24 08:41:47 -07:00
Jason Ekstrand
76be58efce anv/batch_chain: Clean up the reloc list swapping code 2015-09-24 08:41:38 -07:00
Jason Ekstrand
041f5ea089 anv/meta: Add location specifiers to meta shaders 2015-09-21 16:21:56 -07:00
Jason Ekstrand
f406b708a5 Merge branch 'nir-spirv' into vulkan 2015-09-17 20:03:40 -07:00
Jason Ekstrand
616db92b01 nir/spirv: Add better location handling
Previously, our location handling was focused on either no location
(usually implicit 0) or a builtin.  Unfortunately, if you gave it a
location, it would blow it away and just not care.  This worked fine with
crucible and our meta shaders but didn't work with the CTS.  The new code
uses the "data.explicit_location" field to denote that it has a "final"
location (usually from a builtin) and, otherwise, the location is
considered to be relative to the base for that shader stage.
2015-09-17 20:02:46 -07:00
Jason Ekstrand
a788e7c659 anv/device: Move mutex initialization to before block pools 2015-09-17 18:23:21 -07:00
Jason Ekstrand
595e6cacf1 meta: Initial support for packing parameters
Probably incomplete but it should do for now
2015-09-17 18:21:05 -07:00
Jason Ekstrand
d616493953 anv/meta: Pass the depth through the clear vertex shader
It shouldn't matter since we shut off the VS but it's at least clearer.
2015-09-17 18:09:21 -07:00
Jason Ekstrand
3b8aa26b8e anv/formats: Properly report depth-stencil formats 2015-09-17 17:44:20 -07:00
Jason Ekstrand
b5f6889648 vk/device: Don't allow device or instance creation with invalid extensions 2015-09-17 17:44:20 -07:00
Jason Ekstrand
dcf424c98c anv/tests: Add some asserts for data integrity in block_pool_no_free 2015-09-17 17:44:20 -07:00
Jason Ekstrand
5f57ff7e18 anv/allocator: Make the block pool double-ended
This allows us to allocate from either side of the block pool in a
consistent way.  If you use the previous block_pool_alloc function, you
will get offsets from the start of the pool as normal.  If you use the new
block_pool_alloc_back function, you will get a negative index that
corresponds to something in the "back" of the pool.
2015-09-17 17:44:20 -07:00
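A minimal sketch of the two-ended allocation scheme described above, assuming a simple bump allocator; the struct and function names are illustrative, not the actual anv_block_pool code:

    #include <stdint.h>

    /* Illustrative double-ended bump allocator: the front side hands out
     * non-negative offsets as before, while the back side hands out
     * negative offsets for blocks in the "back" of the pool. */
    struct demo_block_pool {
       int32_t front;            /* next free offset going forward (>= 0)  */
       int32_t back;             /* next free offset going backward (<= 0) */
       uint32_t block_size;
    };

    static int32_t
    demo_block_pool_alloc(struct demo_block_pool *pool)
    {
       int32_t offset = pool->front;
       pool->front += pool->block_size;
       return offset;
    }

    static int32_t
    demo_block_pool_alloc_back(struct demo_block_pool *pool)
    {
       pool->back -= pool->block_size;
       return pool->back;
    }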
Jason Ekstrand
15624fcf55 anv/tests: Refactor the block_pool_no_free test
This simply breaks the monotonicity check out into its own function
2015-09-17 17:44:20 -07:00
Jason Ekstrand
55daed947d vk/allocator: Split block_pool_alloc into two functions 2015-09-17 17:44:20 -07:00
Jason Ekstrand
c55fa89251 anv/allocator: Use a signed 32-bit offset for the free list
This has the unfortunate side-effect of making it so that we can't have a
block pool bigger than 1GB.  However, that's unlikely to happen and, for
the sake of bi-directional block pools, we need negative offsets.
2015-09-17 17:44:20 -07:00
Jason Ekstrand
8c6bc1e85d anv/allocator: Create 2GB memfd up-front for the block pool 2015-09-17 17:44:20 -07:00
Jason Ekstrand
74bf7aa07c anv/allocator: Take the device mutex when growing a block pool
We don't have any locking issues yet because we use the pool size itself as
a mutex in block_pool_alloc to guarantee that only one thread is resizing
at a time.  However, we are about to add support for growing the block pool
at both ends.  This introduces two potential races:

 1) You could have two block_pool_alloc() calls that both try to grow the
    block pool, one from each end.

 2) The relocation handling code will now have to think about not only the
    bo that we use for the block pool but also the offset from the start of
    that bo to the center of the block pool.  It's possible that the block
    pool growing code could race with the relocation handling code and get
    a bo and offset out of sync.

Grabbing the device mutex solves both of these problems.  Thanks to (2), we
can't really do anything more granular.
2015-09-17 17:44:20 -07:00
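A hedged sketch of the locking pattern described above, continuing the demo_block_pool sketch from earlier; the grow logic is elided and the names are illustrative:

    #include <pthread.h>
    #include <stdbool.h>

    /* Illustrative: take the device mutex around growth so that (1) the two
     * ends cannot grow concurrently and (2) the bo and center offset are
     * always updated together, which the relocation handling relies on. */
    static int32_t
    demo_block_pool_grow_and_alloc(struct demo_block_pool *pool,
                                   pthread_mutex_t *device_mutex,
                                   bool from_back)
    {
       int32_t offset;

       pthread_mutex_lock(device_mutex);
       /* ...resize the underlying memfd/bo and update the center offset
        * here, while no other thread can observe a partial update... */
       offset = from_back ? demo_block_pool_alloc_back(pool)
                          : demo_block_pool_alloc(pool);
       pthread_mutex_unlock(device_mutex);

       return offset;
    }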
Jason Ekstrand
222ddac810 anv: Document the index and offset parameters of anv_bo 2015-09-17 17:44:20 -07:00
Chad Versace
85520aa070 vk/image: Remove stale FINISHME for non-2D image views
gen8_image_view_init() now supports 1D, 2D, and 3D image views.
2015-09-14 15:16:57 -07:00
Chad Versace
622a317e4c vk/image: Teach vkCreateImage about layout of 1D surfaces
Calling vkCreateImage() with VK_IMAGE_TYPE_1D now succeeds and computes
the surface layout correctly.
2015-09-14 15:15:12 -07:00
Chad Versace
6221593ff8 vk/meta: Partially implement vkCmdCopy*, vkCmdBlit* for 3D images
Partially implement the below functions for 3D images:
    vkCmdCopyBufferToImage
    vkCmdCopyImageToBuffer
    vkCmdCopyImage
    vkCmdBlitImage

Not all features work, and there is much room for performance improvement.
Beware that vkCmdCopyImage and vkCmdBlitImage are untested.  Crucible
proves that vkCmdCopyBufferToImage and vkCmdCopyImageToBuffer work,
though.

Supported:
    - copy regions with z offset

Unsupported:
    - copy regions with extent.depth > 1

Crucible test results on master@d452d2b are:
    pass: func.miptree.r8g8b8a8-unorm.*.view-3d.*
    pass: func.miptree.d32-sfloat.*.view-3d.*
    fail: func.miptree.s8-uint.*.view-3d.*
2015-09-14 14:27:34 -07:00
Chad Versace
0ecafe0285 vk/meta: Rename meta_emit_blit() params
Rename src -> src_view and dest -> dest_view. This reduces noise in the next
patch's diff, which adds new params to the function.
2015-09-14 12:29:51 -07:00
Chad Versace
b659a066e9 vk/gen8: Set RENDER_SURFACE_STATE::RenderTargetViewExtent 2015-09-14 12:29:49 -07:00
Chad Versace
ffa61e1572 vk/gen8: Refactor setting of SURFACE_STATE::Depth
The field's meaning depends on SURFACE_STATE::SurfaceType.
Make that correlation explicit by switching on VkImageType.
For good measure, add some PRM quotes too.
2015-09-14 12:27:05 -07:00
Chad Versace
eed74e3a02 vk: Teach vkCreateImage about layout of 3D surfaces
Calling vkCreateImage() with VK_IMAGE_TYPE_3D now succeeds and computes
the surface layout correctly. However, 3D images do not yet work for
many other Vulkan entrypoints.
2015-09-14 11:04:08 -07:00
Chad Versace
e01d5a0471 vk: Refactor anv_image_make_surface()
Move the code that calculates the layout of 2D surfaces into a switch case.
2015-09-14 11:00:18 -07:00
Jason Ekstrand
8c8ad6dddf vk: Use push constants for dynamic buffers 2015-09-11 15:56:19 -07:00
Jason Ekstrand
2b4a2eb592 vk/compiler: Rework create_params_array 2015-09-11 15:55:54 -07:00
Jason Ekstrand
c3086c54a8 vk/compiler: Add a NIR pass for pushing dynamic buffer offset
This commit just adds the NIR pass but does none of the uniform setup
2015-09-11 15:53:56 -07:00
Jason Ekstrand
7487371056 vk/pipeline_layout: Add dynamic_offset_start and has_dynamic_offsets fields 2015-09-11 15:52:43 -07:00
Jason Ekstrand
de5220c7ce vk/pipeline_layout: Move surface/sampler start from SoA to AoS
This makes more sense to me and it's more consistent with
anv_descriptor_set_layout.
2015-09-11 10:43:55 -07:00
Jason Ekstrand
b908c67816 vk: Rework the push constants data structure
Previously, we simply had a big blob of stuff for "driver constants".  Now,
we have a very specific data structure that contains the driver constants
that we care about.
2015-09-11 10:25:23 -07:00
Jason Ekstrand
fd21f0681a Add the wayland protocol files to .gitignore 2015-09-11 09:29:40 -07:00
Jason Ekstrand
8040dc4ca5 vk/error: Handle ERROR_OUT_OF_DATE_WSI 2015-09-08 12:13:07 -07:00
Jason Ekstrand
060720f0c9 vk/wsi/x11: Actually block on X so we don't re-use busy buffers 2015-09-08 11:51:47 -07:00
Jason Ekstrand
1bee19e023 vk: Add the WSI header files 2015-09-08 10:33:46 -07:00
Jason Ekstrand
2f3de6260d Merge branch 'nir-spirv' into vulkan 2015-09-05 14:12:59 -07:00
Jason Ekstrand
4d73ca3c58 nir/spirv.h: Remove some cruft missed while merging
There were merge conflicts in spirv.h that got missed because they were in
a comment and so it still compiled.  This gets rid of them and we should be
on-par with upstream spirv->nir.
2015-09-05 14:11:40 -07:00
Jason Ekstrand
612b13aeae nir/spirv: Add support for most of the rest of texturing
Assuming this all works, about the only thing left should be some
corner-cases for tg4
2015-09-05 14:10:05 -07:00
Jason Ekstrand
fe786ff67d Merge branch 'nir-spirv' into vulkan 2015-09-05 13:17:53 -07:00
Jason Ekstrand
35fcd37fcf nir/spirv: Handle decorations after assigning variable locations 2015-09-05 13:17:21 -07:00
Jason Ekstrand
87d02f515b Merge branch 'nir-spirv' into vulkan 2015-09-05 09:48:33 -07:00
Jason Ekstrand
9be43ef99c nir/spirv: Handle the MatrixStride member decoration 2015-09-05 09:47:45 -07:00
Jason Ekstrand
01924a03d4 vk: Actually link in wayland libraries
Turns out this was why I had accidentally broken the universe.  Oops...
2015-09-04 20:02:38 -07:00
Jason Ekstrand
2c4ae00db6 vk: Conditionally compile Wayland support
Pulling in libwayland causes undefined symbols in applications that are
linked against vulkan alone.  Ideally, we would like to dlopen a platform
support library or something like that.  For now, this works and should get
crucible running again.
2015-09-04 19:18:52 -07:00
Jason Ekstrand
b3c037f329 vk: Fix size return value handling in a couple of places 2015-09-04 19:05:51 -07:00
Jason Ekstrand
9a95d08ed6 Merge branch 'nir-spirv' into vulkan 2015-09-04 18:54:15 -07:00
Jason Ekstrand
6d5dafd779 nir/spirv/glsl450: Use the correct write mask 2015-09-04 18:50:14 -07:00
Jason Ekstrand
7174d155e9 nir: Add a lower_fdiv option and use it in i965 2015-09-04 18:50:14 -07:00
Jason Ekstrand
f32d16a9f0 nir/spirv: Use the actual GLSL 450 extension header from Khronos 2015-09-04 18:50:14 -07:00
Jason Ekstrand
9e2c13350e nir/spirv: Add support for SpvDecorationColMajor 2015-09-04 18:50:14 -07:00
Jason Ekstrand
f3bdb93a8e nir/types: Allow single-column matrices
This can sometimes be a convenient way to build vectors.
2015-09-04 18:50:14 -07:00
Jason Ekstrand
48e87c0163 vk/wsi: Add Wayland WSI support 2015-09-04 17:55:42 -07:00
Jason Ekstrand
348cb29a20 vk/wsi: Move to a callback system for the entire WSI implementation
We do this for two reasons.  First, it allows us to simplify WSI: compiling
support for a particular platform in or out is as simple as calling or not
calling the platform-specific init function.  Second, the implementation
gives each chunk of the WSI a place to stash stuff in the instance.
2015-09-04 17:55:42 -07:00
Jason Ekstrand
06d8fd5881 vk/instance: Expose anv_instance_alloc/free 2015-09-04 17:55:42 -07:00
Jason Ekstrand
c0b97577e8 vk/WSI: Use a callback mechanism instead of explicit switching 2015-09-04 17:55:42 -07:00
Jason Ekstrand
ca3cfbf6f1 vk: Add an initial implementation of the actual Khronos WSI extension
Unfortunately, this is a very large commit and removes the old LunarG WSI
extension.  This is because there are a couple of entrypoints that have the
same name between the two extensions so implementing them both is
impractical.

Support is still incomplete, but this is enough to get vkcube up and going
again.
2015-09-04 17:55:42 -07:00
Jason Ekstrand
3d9fbb6575 vk: Add initial support for VK_WSI_swapchain 2015-09-04 17:55:42 -07:00
Jason Ekstrand
beb466ff5b vk: Move anv_x11.c to anv_wsi_x11.c 2015-09-04 17:55:42 -07:00
Jason Ekstrand
9a7600c9b5 vk/device: Use an array for device extensions 2015-09-04 17:55:42 -07:00
Kristian Høgsberg Kristensen
8af3624651 vk: Further reduce diff to master
Now that we don't compile GLSL, we can roll back a few more hacks and
unexport some things from the backend compiler.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-04 16:17:01 -07:00
Kristian Høgsberg Kristensen
7c1d20dc48 vk: Drop GLSL code from anv_compiler.cpp
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 14:02:11 -07:00
Kristian Høgsberg Kristensen
316c8ac53b vk: Assert that the SPIR-V module has the magic number
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 12:27:28 -07:00
Kristian Høgsberg Kristensen
6e35a1f166 vk: Remove various hacks/scaffolding code
Since we switched away from calling brwCreateContext() there's a bit of
hacky support we can now delete.  This reduces our diff to upstream master.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 12:17:13 -07:00
Kristian Høgsberg Kristensen
1d787781ff vk: Fall back to previous gens in entry point resolver
We used to always just do a one-level fallback from genX_* to anv_*
entry points. That worked for gen7 and gen8 where all entry points were
either different or could be made anv_* entry points (eg
anv_CreateDynamicViewportState). We're about to add gen9 and now need to
be able to fall back to gen8 entry points for most things.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 11:53:09 -07:00
Kristian Høgsberg Kristensen
c4dbff58d8 vk: Drop redundant gen7_CreateGraphicsPipelines
This is handled by anv_CreateGraphicsPipelines().

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 11:53:09 -07:00
Kristian Høgsberg Kristensen
b5e90f3f48 vk: Use vk* entrypoints in meta, not driver_layer pointers
We'll change the dispatch mechanism again in a later commit. Stop using
the driver_layer function pointers and just use the public entry points.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 11:53:09 -07:00
Kristian Høgsberg Kristensen
82396a5514 vk: Drop check for I915_PARAM_HAS_EXEC_CONSTANTS
We don't use this kernel feature.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 11:53:08 -07:00
Kristian Høgsberg Kristensen
c4b30e7885 vk: Add new vk_errorf that takes a format string
This allows us to annotate error cases in debug builds.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 11:53:08 -07:00
Kristian Høgsberg Kristensen
2e346c882d vk: Make vk_error a little more helpful
Print out file and line number and translate the error code to the
symbolic name.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 11:53:08 -07:00
Chad Versace
0cb26523d3 vk/image: Add PRM reference for QPitch equation
Suggested-by: Nanley Chery <nanley.g.chery@intel.com>
2015-09-03 11:04:38 -07:00
Chad Versace
28503191f1 vk/meta: Partially fix vkCmdCopyBufferToImage for S8_UINT
Create R8_UINT VkAttachmentView and VkImageView for the stencil data.

This fixes a crash, but the pixels in the destination image are still
incorrect. They are not properly tiled.

Fixes crashes in Crucible tests func.miptree.s8-uint.aspect-stencil.* as
of crucible-7471449. Test results improve 'lost' -> 'fail'.
2015-09-02 11:08:36 -07:00
Jason Ekstrand
be0a4da6a5 vk/meta: Use SPIR-V for shaders
We are also now using glslc for compiling the Vulkan driver like we do in
Crucible.
2015-09-01 15:16:06 -07:00
Jason Ekstrand
362ab2d788 vk/compiler: Handle interpolation qualifiers for SPIR-V shaders 2015-09-01 15:15:04 -07:00
Jason Ekstrand
126ade0023 vk/extensions: count needs to be <= number of extensions 2015-09-01 12:28:50 -07:00
Jason Ekstrand
0c2d476935 vk/compiler: Properly reference/delete programs when using SPIR-V 2015-09-01 12:28:50 -07:00
Jason Ekstrand
16ebe883a4 vk/meta: Add a helper for making an image from a buffer 2015-08-31 21:54:38 -07:00
Jason Ekstrand
86c3476668 nir/spirv: Use VERTEX_ID_ZERO_BASE for VertexId
In Vulkan, VertexId and InstanceId will be zero-based and new intrinsics,
VertexIndex and InstanceIndex, will be added for the non-zero-based variants.  See also
Khronos bug #14255
2015-08-31 17:16:49 -07:00
Jason Ekstrand
6350c97412 Merge remote-tracking branch 'fdo-personal/nir-spirv' into vulkan
From now on, the majority of SPIR-V improvements should happen on the spirv
branch which will also be public.  It will be frequently merged into the
vulkan driver.
2015-08-31 17:14:47 -07:00
Jason Ekstrand
22fdb2f855 nir/spirv: Update to the latest revision 2015-08-31 17:05:23 -07:00
Jason Ekstrand
ce70cae756 nir/builder: Use nir_after_instr to advance the cursor
This *should* ensure that the cursor gets properly advanced in all cases.
We had a problem before where, if the cursor was created using
nir_after_cf_node on a non-block cf_node, that would call nir_before_block
on the block following the cf node.  Instructions would then get inserted
in backwards order at the top of the block which is not at all what you
would expect from nir_after_cf_node.  By just resetting to after_instr, we
avoid all these problems.
2015-08-31 17:05:23 -07:00
Jason Ekstrand
24b0c53231 nir/intrinsics: Move to a two-dimensional binding model for UBO's 2015-08-31 17:05:23 -07:00
Jason Ekstrand
f4608bc530 nir/nir_variable: Add a descriptor set field
We need this for SPIR-V
2015-08-31 17:05:23 -07:00
Jason Ekstrand
85cf2385c5 mesa: Move gl_vert_attrib from mtypes.h to shader_enums.h
It is a shader enum after all...
2015-08-31 17:05:23 -07:00
Jason Ekstrand
de4f379a70 nir/cursor: Add a helper for getting the current block 2015-08-31 17:05:23 -07:00
Connor Abbott
024c49e95e nir/builder: add a nir_fdot() convenience function 2015-08-31 17:05:23 -07:00
Jason Ekstrand
f6a0eff1ba nir: Add a pass to lower outputs to temporary variables
This pass can be used as a helper for NIR producers so they don't have to
worry about creating the temporaries themselves.
2015-08-31 17:05:23 -07:00
Jason Ekstrand
4956bbaa33 nir/cursor: Add a constructor for the end of a block but before the jump 2015-08-31 16:58:20 -07:00
Connor Abbott
c62be38286 nir/types: add more nir_type_is_xxx() wrappers 2015-08-31 16:58:20 -07:00
Connor Abbott
a1e136711b nir/types: add a helper to transpose a matrix type 2015-08-31 16:58:20 -07:00
Jason Ekstrand
756b00389c nir/spirv: Don't assert that the current block is empty
It's possible that someone will give us SPIR-V code that needlessly
branches to new blocks.  We should handle that OK now.
2015-08-31 16:58:20 -07:00
Jason Ekstrand
fe220ebd37 nir/spirv: Add initial support for samplers 2015-08-31 16:58:20 -07:00
Jason Ekstrand
a992909aae nir/spirv: Move Exp and Log to the list of currently unhandled ALU ops
NIR doesn't have the native opcodes for them anymore
2015-08-31 16:58:20 -07:00
Jason Ekstrand
45963c9c64 nir/types: Add support for sampler types 2015-08-31 16:58:20 -07:00
Jason Ekstrand
2887e68f36 nir/spirv: Make the global constants in spirv.h static
I've been promised in a bug that this will be fixed in a future version of
the header.  However, in the interest of my branch building, I'm adding
these changes in myself for the moment.
2015-08-31 16:58:20 -07:00
Jason Ekstrand
62b094a81c nir/spirv: Handle jump-to-loop in a more general way 2015-08-31 16:58:20 -07:00
Jason Ekstrand
ca51d926fd nir/spirv: Handle boolean uniforms correctly 2015-08-31 16:58:20 -07:00
Jason Ekstrand
b6562bbc30 nir/spirv: Handle control-flow with loops 2015-08-31 16:58:20 -07:00
Jason Ekstrand
4a63761e1d nir/spirv: Set a name on temporary variables 2015-08-31 16:58:20 -07:00
Jason Ekstrand
6fc7911d15 nir/spirv: Use the correct length for copying string literals 2015-08-31 16:58:20 -07:00
Jason Ekstrand
9da6d808be nir/spirv: Make vtn_ssa_value handle constants as well as ssa values 2015-08-31 16:58:20 -07:00
Jason Ekstrand
1feeee9cf4 nir/spirv: Add initial support for GLSL 4.50 builtins 2015-08-31 16:58:20 -07:00
Jason Ekstrand
577c09fdad nir/spirv: Split the core datastructures into a header file 2015-08-31 16:58:20 -07:00
Jason Ekstrand
66fc7f252f nir/spirv: Use the builder for all instructions
We don't actually use it to create all the instructions, but we do always
use it for insertion.  This should make things far more consistent for
implementing extended instructions.
2015-08-31 16:58:20 -07:00
Jason Ekstrand
9e03b6724c nir/spirv: Add support for a bunch of ALU operations 2015-08-31 16:58:20 -07:00
Jason Ekstrand
91b3b46d8b nir/spirv: Add support for indirect array accesses 2015-08-31 16:58:20 -07:00
Jason Ekstrand
9197e3b9fc nir/spirv: Explicitly type constants and SSA values 2015-08-31 16:58:20 -07:00
Jason Ekstrand
b7904b8281 nir/spirv: Handle OpBranchConditional
We do control-flow handling as a two-step process.  The first step is to
walk the instructions list and record various information about blocks and
functions.  This is where the actual nir_function_overload objects get
created.  We also record the start/stop instruction for each block.  Then
a second pass walks over each of the functions and over the blocks in each
function in a way that's NIR-friendly and actually parses the instructions.
2015-08-31 16:58:20 -07:00
Jason Ekstrand
d216dcee94 nir/spirv: Add a helper for getting a value as an SSA value 2015-08-31 16:58:20 -07:00
Jason Ekstrand
f36fabb736 nir/spirv: Split instruction handling into preamble and body sections 2015-08-31 16:58:20 -07:00
Jason Ekstrand
7bf4b53f1c nir/spirv: Implement load/store instructions 2015-08-31 16:58:20 -07:00
Jason Ekstrand
7d64741a5e nir: Add a helper for getting the tail of a deref chain 2015-08-31 16:58:20 -07:00
Jason Ekstrand
112c607216 nir/spirv: Actually add variables to the function or shader 2015-08-31 16:58:20 -07:00
Jason Ekstrand
4fa1366392 nir/spirv: Add a vtn_untyped_value helper 2015-08-31 16:58:20 -07:00
Jason Ekstrand
e709a4ebb8 nir/spirv: Use vtn_value in the types code and fix an off-by-one error 2015-08-31 16:58:20 -07:00
Jason Ekstrand
67af6c59f2 nir/types: Add an is_vector_or_scalar helper 2015-08-31 16:58:20 -07:00
Jason Ekstrand
5e6c5e3c8e nir/spirv: Add support for deref chains 2015-08-31 16:58:20 -07:00
Jason Ekstrand
366366c7f7 nir/types: Add a scalar type constructor 2015-08-31 16:58:20 -07:00
Jason Ekstrand
befecb3c55 nir/spirv: Add support for OpLabel 2015-08-31 16:58:20 -07:00
Jason Ekstrand
399e962d25 nir/spirv: Add support for declaring functions 2015-08-31 16:58:20 -07:00
Jason Ekstrand
ac4d459aa2 nir/types: Add accessors for function parameter/return types 2015-08-31 16:58:20 -07:00
Jason Ekstrand
3a266a18ae nir/spirv: Add support for declaring variables
Deref chains and variable load/store operations are still missing.
2015-08-31 16:58:20 -07:00
Jason Ekstrand
2494055631 nir/spirv: Add support for constants 2015-08-31 16:58:20 -07:00
Jason Ekstrand
2a023f30a6 nir/spirv: Add basic support for types 2015-08-31 16:58:20 -07:00
Jason Ekstrand
5bb94c9b12 nir/types: Add more helpers for creating types 2015-08-31 16:58:20 -07:00
Jason Ekstrand
53bff3e445 glsl/types: Expose the function_param and struct_field structs to C
Previously, they were hidden behind a #ifdef __cplusplus so C wouldn't find
them.  This commit simply moves the #ifdef and adds #ifdef's around
constructors.
2015-08-31 16:58:20 -07:00
Jason Ekstrand
0db3e4dd72 glsl/types: Add support for function types 2015-08-31 16:58:20 -07:00
Jason Ekstrand
1169fcdb05 glsl: Add GLSL_TYPE_FUNCTION to the base types enums 2015-08-31 16:58:20 -07:00
Jason Ekstrand
b79916dacc nir/spirv: Rework the way values are added
Instead of having functions to add values and set various things, we just
have a function that does a few asserts and then returns the value.  The
caller is then responsible for setting the various fields.
2015-08-31 16:58:20 -07:00
Jason Ekstrand
ac60aba351 nir/spirv: Add stub support for extension instructions 2015-08-31 16:58:20 -07:00
Jason Ekstrand
78eabc6153 REVERT: Add a simple helper program for testing SPIR-V -> NIR translation 2015-08-31 16:58:20 -07:00
Jason Ekstrand
2c585a722d glsl/compiler: Move the error_no_memory stub to standalone_scaffolding.cpp 2015-08-31 16:58:20 -07:00
Jason Ekstrand
b20d9f5643 nir: Add the start of a SPIR-V to NIR translator
At the moment, it can handle the very basics of strings and can ignore
debug instructions.  It also has basic support for decorations.
2015-08-31 16:58:20 -07:00
Jason Ekstrand
9d92b4fd0e nir: Import the revision 30 SPIR-V header from Khronos 2015-08-31 16:58:20 -07:00
Jason Ekstrand
0af4bf4d4b Merge remote-tracking branch 'mesa-public/master' into vulkan 2015-08-31 16:30:07 -07:00
Nanley Chery
76f17266ec mesa/texformat: use format conversion function in _mesa_choose_tex_format
This function's cases for non-generic compressed formats duplicate
the GL to MESA translation in _mesa_glenum_to_compressed_format().
This patch replaces the switch cases with a call to the translation
function. This change teaches this function about ASTC, thus enabling
ASTC for glTex*Storage*() calls.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-31 15:03:21 -07:00
Nanley Chery
01024ded1e mesa/texcompress: correct mapping of S3TC formats in conversion function
MESA_FORMAT_RGBA_DXT5 should actually be reserved for GL_RGBA[4]_DXT5_S3TC.
Also, Gallium and other dri drivers (radeon and nouveau) follow this mapping
scheme.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-31 15:03:08 -07:00
Dave Airlie
3063913f77 r600/sb: update last_cf for finalize if.
As Glenn did for finalize_loop we need to update_cf when we
add a POP at the end of a shader.

I think this fixes one of the earlier shader going off end
of memory problems we've stopped.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.6" "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-09-01 07:39:24 +10:00
Matt Turner
a4ba41638d i965/fs: Use greater-equal cmod to implement maximum.
The docs specifically call out SEL with .l and .ge as the
implementations of MIN and MAX respectively. Among other things,
SEL with these conditional mods are commutative.

See commit 3b7f683f.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-08-31 11:51:59 -07:00
Ben Widawsky
d2e3638ef9 i965/chv|skl: Apply sampler bypass w/a
Certain compressed formats require this setting. The docs don't go into much
detail as to why it's needed exactly.

This patch introduces no piglit regressions on gen9 (bsw is untested). Note that
the SKL "regressions" are fixed tests, and the egl_khr_gl_colorspace tests are
WTF. The patch also fixes nothing I can find.
http://otc-mesa-ci.jf.intel.com/job/Leeroy/127820/

v2:
Reworded commit message (Matt); Added piglit results link.
Restructured condition (Matt)
Moved check out to function (Nanley). I left the setting of the bit in the
  surface state open coded because it seems to go better with the existing code.

v3:
Use an inline function only in gen8_emit_texture_surface_state() (Matt).

Cc: Matt Turner <mattst88@gmail.com>
Cc: Nanley Chery <nanleychery@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-08-31 10:08:43 -07:00
Dave Airlie
78027c965a st/mesa: move to renumbering registers in a group
This can be done with a single pass for the instruction base,
and takes renumber_registers out of its spot on the profile.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-31 11:27:33 +01:00
Dave Airlie
aee73f2942 st/mesa: reduce time spent in calculating temp read/writes
The glsl->tgsi convertor does some temporary register reduction; however,
in profiling shader-db this shows up quite highly, so optimise things to
reduce the number of loops we do through all the instructions.  This drops
merge_registers
from 4-5% on the profile to 1%. I think this can be reduced
further by possibly optimising the renumber pass.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-31 11:27:18 +01:00
Dave Airlie
46968c1140 st/mesa: cache tgsi opcode info in the instruction
Instead of looking this up lots, lets just cache it in the instruction
translation up front. I just noticed this function what high in a profile
of shader-db on radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-31 11:26:23 +01:00
Dave Airlie
03b7ec8778 r600: move prim convert from geom shader to function.
This should avoid C++ fail including this header.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-31 19:45:13 +10:00
Timothy Arceri
c8bc8d7235 glsl: remove special case subroutine type counting
Unlike samplers we can get the correct value for subroutines from
component_slots()

Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-08-31 13:10:44 +10:00
Edward O'Callaghan
0d19dc302f r600g: Use TGSI parse results instead of manually exfiltrating
This makes better use of the work that the TGSI API has done for
us.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-08-30 11:41:14 +02:00
Edward O'Callaghan
3eed81a97b r600g: Set geometry properties in r600_create_shader_state()
The selector is shared by all shader variants, so the
individual shaders shouldn't change it. Use tgsi_shader_scan()
results to set geometry properties within a
r600_create_shader_state() call and treat said properties in
the selector as read-only within r600_shader_from_tgsi().

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-08-30 11:41:00 +02:00
Edward O'Callaghan
b4dee1b636 r600g: Move geometry properties state from shader to selector
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-08-30 11:40:44 +02:00
Edward O'Callaghan
7b6369eb69 r600g: Remove dead assignment to 'gs_input_prim' in shader state
Note that 'geometry shader properties' should be carried in the
selector state over the shader state in any case.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-08-30 11:40:26 +02:00
Marek Olšák
7dc8a3497f radeonsi: don't use the Qt 'emit' keyword in si_init_atom
It confuses my editor.
2015-08-29 23:18:23 +02:00
Marek Olšák
379e3382e8 radeonsi: remove no-op 32-bit masking
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-29 23:03:21 +02:00
Marek Olšák
437cb1e3f4 gallium/radeon: fix the ADDRESS_HI mask for EVENT_WRITE CIK packets
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-29 23:03:08 +02:00
Marek Olšák
e321596e9f winsys/radeon: handle non-zero finite timeout when waiting for buffers
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-29 23:03:06 +02:00
Ilia Mirkin
a5a96118ed freedreno/a3xx: implement half-z clipping
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-29 16:18:04 -04:00
Ilia Mirkin
58e24b4761 freedreno/a3xx: add basic clip plane support
The hardware is capable of dealing with GL1-style user clip planes.
No clip vertex, no clip distances. Fixes a number of ucp tests, as well
as neverball.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
2015-08-29 16:18:04 -04:00
Samuel Pitoiset
c8a61ea4fb nvc0: change prefix of MP performance counters to HW_SM
According to NVIDIA, local performance counters (MP) are prefixed
with SM, while global performance counters (PCOUNTER) are called PM.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2015-08-29 11:04:00 +02:00
Samuel Pitoiset
21bdb4d8f3 nvc0: sort performance counter queries by name
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2015-08-29 10:24:50 +02:00
Samuel Pitoiset
ebca85423c nvc0: make names of performance counter queries consistent
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2015-08-29 10:24:44 +02:00
Samuel Pitoiset
981f46aa95 nvc0: use enumerations for driver queries
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2015-08-29 10:24:40 +02:00
Samuel Pitoiset
0eac599001 nvc0: remove commented out code related to PCOUNTER queries
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2015-08-29 10:24:35 +02:00
Jason Ekstrand
9f9628e9dd vk/SPIR-V: Pull num_uniform_components out of the NIR shader 2015-08-28 22:31:03 -07:00
Jason Ekstrand
44e6ea74b0 spirv: lower outputs to temporaries 2015-08-28 17:38:41 -07:00
Jason Ekstrand
9cebdd78d8 nir: Add a pass to lower outputs to temporary variables
This pass can be used as a helper for NIR producers so they don't have to
worry about creating the temporaries themselves.
2015-08-28 17:38:41 -07:00
Jason Ekstrand
5e7c7b2a4e spirv: Only do a block load if you're actually loading a uniform 2015-08-28 16:17:45 -07:00
Jason Ekstrand
98abed2441 spirv: Use VERTEX_ID_ZERO_BASE for vertex id 2015-08-28 16:08:29 -07:00
Dave Airlie
6941883175 r600: port si_conv_prim_to_gs_out from radeonsi
This code was broken by the tess merge, and I totally missed it
until now. I'm not sure this fixes anything but it stops the assert.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-29 09:06:04 +10:00
Dave Airlie
c149d84d45 r600g: use PRIi64 for some compute debug printfs
Otherwise this will crash on 32-bit, and it gets rid of
warnings building on 32-bit.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-29 09:06:04 +10:00
Dave Airlie
8d6d0cc17d gallium/util: fix debug_get_flags_option on 32-bit
On 32-bit we need to use PRIu64 flags for printfs, otherwise this
segfaults in R600_DEBUG=help.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-29 09:06:04 +10:00
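For reference, a standalone illustration of the portable format-macro usage the fix relies on (plain C; nothing here is gallium-specific):

    #include <inttypes.h>
    #include <stdio.h>

    int main(void)
    {
       uint64_t flags = UINT64_C(0x100000001);

       /* A plain "%lu" only matches uint64_t on LP64 targets; on 32-bit it
        * reads the wrong number of bytes from the varargs and can crash.
        * PRIu64/PRIx64 expand to the correct length modifier everywhere. */
       printf("flags = %" PRIu64 " (0x%" PRIx64 ")\n", flags, flags);
       return 0;
    }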
Ilia Mirkin
275c5810ca glsl: provide the option of using BFE for unpack builtin lowering
This greatly improves generated code, especially for the snorm variants,
since it is able to get rid of the lshift/rshift for sext, as well as
replacing each shift + mask with a single op.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-08-28 18:28:04 -04:00
Ilia Mirkin
889a946a45 glsl: use bitfield_insert instead of and + shift + or for packing
It is fairly tricky to detect the proper conditions for using bitfield
insert, but easy to just use it up front. This removes a lot of
instructions on nvc0 when invoking the packing builtins.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-08-28 18:28:04 -04:00
Jason Ekstrand
dbc3eb5bb4 vk/compiler: Pass the correct is_scalar value to brw_process_nir 2015-08-28 12:13:17 -07:00
Jason Ekstrand
ea56d0cb1d glsl/types: Fix up function type hash table insertion 2015-08-28 12:00:25 -07:00
Matt Turner
c676c432f3 i965/fs: Remove fs_visitor::try_replace_with_sel().
No shader-db changes on g4x, snb, hsw, or bdw.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-28 11:30:47 -07:00
Matt Turner
64e312d7fa i965/fs: Replace awful variable names.
start_to      -> dst_start
   end_to        -> dst_end
   start_from    -> src_start
   end_from      -> src_end
   var_to        -> dst_var
   var_from      -> src_var
   reg_to        -> dst_reg
   reg_to_offset -> dst_reg_offset
   reg_from      -> src_reg

Not sure how these made sense to me before.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-28 11:30:47 -07:00
Matt Turner
a2ff1e95a4 i965/fs: Skip blocks in register coalescing interference check.
No need to walk through instructions in blocks we know don't contain our
registers' live ranges.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-28 11:30:47 -07:00
Matt Turner
f2f8c43af9 i965/fs: Improve register coalescing interference check.
I always thought that the is_control_flow() -> return false check was a
bad hack, and some previous attempts to remove it have failed and have
been reverted.

The previous two patches fix some problems that caused register
coalescing to not notice some interference between registers, which the
is_control_flow() check apparently works around.

With that fixed, we can calculate interference more accurately.

total instructions in shared programs: 6261319 -> 6257917 (-0.05%)
instructions in affected programs:     346282 -> 342880 (-0.98%)
helped:                                1552

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-28 11:30:47 -07:00
Matt Turner
f3d0a894af i965/fs: Use overwrites_reg() instead of dst.equals().
equals() returns false for registers with different types, so using it
isn't appropriate for determining whether an instruction is overwriting a register.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-28 11:30:47 -07:00
Matt Turner
8765f1d7dd i965: Only consider fixed_hw_reg in equals() if file is HW_REG/IMM.
Noticed when debugging things that led to the next patch.

On G45 (and presumably ILK) this helps register coalescing:

total instructions in shared programs: 4077373 -> 4077340 (-0.00%)
instructions in affected programs:     43751 -> 43718 (-0.08%)
helped:                                52
HURT:                                  2

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-28 11:30:47 -07:00
Marta Lofstedt
2581fe931a i965/fs: Do not set the size for zero-size uniforms
Zero-sized uniforms can exist in the list, but they don't get any space
allocated in prog_data->params or in the param_size array, so the size
should not be set for them.  This was previously fixed in:

commit: 781dc7c0e1.

However,

commit: 259f7291de

removed the fix.

Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-08-28 09:52:59 -07:00
Daniel Scharrer
0516159613 mesa: return old name for deleted samplers for SAMPLER_BINDING queries
If the sampler object has been deleted in the same context the binding
will have been cleared. If it has been deleted in another context, the
spec does not say what should returned. None of the other binding point
queries check for deletion in another context.

Also, as names of deleted objects are free for reuse, the current code
didn't even work reliably.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-08-28 18:08:39 +02:00
Daniel Scharrer
5aaaaebf22 mesa: add missing queries for ARB_direct_state_access
This adds index queries (glGet*i_v) for GL_TEXTURE_BINDING_* and
GL_SAMPLER_BINDING, as well as texture queries
(glGetTex{,ture}Parameter*) for GL_TEXTURE_TARGET.

CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-08-28 18:08:26 +02:00
Chad Versace
a2d15ee698 vk/meta: Support stencil in vkCmdCopyImageToBuffer
As of Crucible commit 12e64a4, this fixes the
func.depthstencil.stencil-triangles.* tests on Broadwell.
2015-08-28 08:41:21 -07:00
Chad Versace
84cfc08c10 vk/pipeline: Fix crash when the pipeline has no attributes
If there are no attributes, don't emit 3DSTATE_VERTEX_ELEMENTS.
That packet does not allow 0 attributes.
2015-08-28 08:07:15 -07:00
Chad Versace
053d32d2a5 vk/image: Linear stencil buffers are illegal
The hardware requires that stencil buffer memory be W-tiled.

From the Sandybridge PRM:
   This buffer is supported only in Tile W memory.
2015-08-28 08:04:59 -07:00
Chad Versace
14e1d58fb7 vk: Fix stride of stencil buffers
Stencil buffers have strange pitch. The PRM says:

  The pitch must be set to 2x the value computed based on width,
  as the stencil buffer is stored with two rows interleaved.
2015-08-28 08:03:46 -07:00
Chad Versace
31af126229 vk: Program stencil ops in 3DSTATE_WM_DEPTH_STENCIL
The driver ignored the Vulkan stencil ops, always programming the hardware
stencil op to 0 (STENCILOP_KEEP).
2015-08-28 08:00:56 -07:00
Chad Versace
bff2879abe vk/image: Don't abort when creating stencil image views
When creating a stencil image view, log a FINISHME but don't abort.
We're sooooo close to having this working.
2015-08-28 07:59:59 -07:00
Chad Versace
4f852c76dc vk/meta: Save/restore VkDynamicDepthStencilState 2015-08-28 07:59:29 -07:00
Chad Versace
104c4e5ddf vk/meta: Don't skip clearing when clearing only depth attachment
anv_cmd_buffer_clear_attachments() skipped the clear renderpass if no
color attachments needed to be cleared, even if a depth attachment
needed to be cleared.
2015-08-28 07:58:51 -07:00
Chad Versace
aacb7bb9b6 vk: Add func anv_cmd_buffer_get_depth_stencil_view()
This function removes some duplicated code from
genN_cmd_buffer_emit_depth_stencil().
2015-08-28 07:57:34 -07:00
Chad Versace
641c25dd55 vk: Declare some local variables as const
In anv_cmd_buffer_emit_depth_stencil(), declare 'subpass' and 'fb' as
const.
2015-08-28 07:53:24 -07:00
Chad Versace
c6f19b4248 vk: Don't duplicate anv_depth_stencil_view's surface data
In anv_depth_stencil_view, replace the members
    bo
    depth_offset
    depth_stride
    depth_format
    depth_qpitch
    stencil_offset
    stencil_stride
    stencil_qpitch
with the single member
    const struct anv_image *image

The removed members duplicated data in anv_image::depth_surface and
anv_image::stencil_surface.
2015-08-28 07:52:19 -07:00
Chad Versace
35b0262a2d vk/gen7: Add func gen7_cmd_buffer_emit_depth_stencil()
This patch moves all the GEN7_3DSTATE_DEPTH_BUFFER code from
gen7_cmd_buffer_begin_subpass() into a new function
gen7_cmd_buffer_emit_depth_stencil().
2015-08-28 07:46:16 -07:00
Chad Versace
b2ee317e24 vk: Fix format of anv_depth_stencil_view
The format of the view itself and of the view's image may differ.
Moreover, if the view's format has no depth aspect but the image's
format does, we must not program the depth buffer. Ditto for stencil.
2015-08-28 07:44:32 -07:00
Chad Versace
798acb2464 vk/gen7: Fix gen of emitted packet in gen7_batch_lri()
Emit GEN7_MI_LOAD_REGISTER_IMM, not the GEN8 version.
2015-08-28 07:36:35 -07:00
Chad Versace
4461392343 vk: Remove dummy anv_depth_stencil_view 2015-08-28 07:35:39 -07:00
Chad Versace
941b48e992 vk/image: Let anv_image have one anv_surface per aspect
Split anv_image::primary_surface into two: anv_image::color_surface and
depth_surface.
2015-08-28 07:17:54 -07:00
Neil Roberts
2dbc6a0ad9 docs: Fix a typo in GL3.txt concerning GL_KHR_context_flush_control 2015-08-28 14:29:22 +01:00
Ilia Mirkin
b319fd7c14 mesa: fix dispatch sanity with GL_OES_texture_storage_multisample_2d_array
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91785
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-08-28 03:12:05 -04:00
Vinson Lee
2ef5a4f830 ABI-check: Use more portable bash invocation.
Fixes 'make check' on FreeBSD.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-08-27 23:48:43 -07:00
Boyan Ding
86c57ebe0e i965/nir: Make use of nir_opt_undef
Shader-db result on Ivy Bridge:
total instructions in shared programs: 145484 -> 145445 (-0.03%)
instructions in affected programs:     225 -> 186 (-17.33%)
helped:                                5
HURT:                                  0

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
2015-08-27 23:33:49 -07:00
Matt Turner
559b8842fa glapi: Remove _x86_64_get_get_dispatch symbol from x86-64 assembly.
Never used.

Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2015-08-27 22:28:49 -07:00
Ilia Mirkin
4a6a47ed05 glsl: clean up textureSize prototype
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
2015-08-27 23:49:13 -04:00
Glenn Kennard
608c7b4a63 r600g/sb: Don't crash on empty if jump target
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-28 12:32:36 +10:00
Glenn Kennard
a830225adb r600g/sb: Don't read junk after EOP
If a shader contains instruction data after an instruction with EOP, SB could
end up parsing that data as an instruction, leading to various crashes and
asserts, as it gets very confused if it sees, for instance, a loop start
instruction jumping off to some random point.

Add a couple of asserts, and print EOP bit if set in old asm printer.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-28 12:32:32 +10:00
Glenn Kennard
36f1999a87 r600g/sb: Handle undef in read port tracker
Commit e8e443 missed adding a check for undef values in the unreserve
function as well, leading to an assert triggering.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-28 12:32:14 +10:00
Jason Ekstrand
c313a989b4 spirv: Bump to the public revision 31 2015-08-27 15:24:04 -07:00
Brian Paul
52f7487923 mesa: rename rowStride to imageStride in texturesubimage()
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-27 15:22:01 -06:00
Ilia Mirkin
2259b11100 mesa: only copy the requested teximage faces
Cube maps are special in that they have separate teximages for each
face. We handled that by copying the data to them separately, but in
case zoffset != 0 or depth != 6 we would read off the end of the client
array or modify the wrong images.

zoffset/depth have already been verified by the time the code gets to
this stage, so no need to double-check.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
2015-08-27 17:18:43 -04:00
Kenneth Graunke
0a913a9d85 nir: Convert the builder to use the new NIR cursor API.
The NIR cursor API is exactly what we want for the builder's insertion
point.  This simplifies the API, the implementation, and is actually
more flexible as well.

This required a bit of reworking of TGSI->NIR's if/loop stack handling;
we now store cursors instead of cf_node_lists, for better or worse.

v2: Actually move the cursor in the after_instr case.
v3: Take advantage of nir_instr_insert (suggested by Connor).
v4: vc4 build fixes (thanks to Eric).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v4]
Acked-by: Connor Abbott <cwabbott0@gmail.com> [v4]
2015-08-27 13:36:57 -07:00
Kenneth Graunke
3e3cb77901 nir: Convert the NIR instruction insertion API to use cursors.
This patch implements a general nir_instr_insert() function that takes a
nir_cursor for the insertion point.  It then reworks the existing API to
simply be a wrapper around that for compatibility.

This largely involves moving the existing code into a new function.

Suggested by Connor Abbott.

v2: Make the legacy functions static inline in nir.h (requested by
    Connor Abbott).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Connor Abbott <cwabbott0@gmail.com>
2015-08-27 13:36:57 -07:00
Kenneth Graunke
f90c6b1ce0 nir: Move nir_cursor to nir.h.
We want to use this for normal instruction insertion too, not just
control flow.  Generally these functions are going to be extremely
useful when working with NIR, so I want them to be widely available
without having to include a separate file.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Connor Abbott <cwabbott0@gmail.com>
2015-08-27 13:36:57 -07:00
Kenneth Graunke
c44d507752 nir: Strengthen "no jumps" assertions in instruction insertion API.
Jumps must be the last instruction in a block, so inserting another
instruction after a jump is illegal.

Previously, we only checked this when the new instruction being inserted
was a jump.  This is a red herring - inserting *any* kind of instruction
after a jump is illegal.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Connor Abbott <cwabbott0@gmail.com>
2015-08-27 13:36:57 -07:00
Brian Paul
bcae4640c8 st/mesa: use PROGRAM_ARRAY for storing structs containing arrays
Previously, we used PROGRAM_ARRAY only for variables which were
arrays or matrices.  But if the variable is a structure containing
an array or matrix, we need to use PROGRAM_ARRAY for that too.

Before, we failed an assertion:
  state_tracker/st_glsl_to_tgsi.cpp:4900:
  Assertion `src_reg->file != PROGRAM_TEMPORARY' failed.
when running the piglit test
glsl-1.20/execution/fs-const-array-of-struct-of-array.shader_test

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-08-27 13:11:26 -06:00
Brian Paul
42c7be5877 glsl: fix comment typo: s/filed/field/ 2015-08-27 13:11:26 -06:00
Brian Paul
3c256f572b gallium/util: fix code formatting in u_blitter.h
Trivial.
2015-08-27 13:11:26 -06:00
Jason Ekstrand
fee0c5af11 i965/fs: Split VGRFs after lowering pull constants
The split_virtual_grfs code doesn't properly rewrite reladdr so we need to
make sure that any uniform indirects are lowered away first.

This fixes the glsl-fs-uniform-indexed-by-swizzled-vec4.shader_test in piglit

Cc: "10.6" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-27 12:09:36 -07:00
Jason Ekstrand
f2e667172a i965/fs: Refactor assign_constant_locations 2015-08-27 12:09:24 -07:00
Now that all constant locations are assigned in a single function, we can
refactor it a bit to unify things.  In particular, we now handle
pull_constant_loc and push_constant_loc more similarly and we only modify
stage_prog_data->params[] in one place at the end of the function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-27 12:09:24 -07:00
Jason Ekstrand
2a8d1ac958 vk: Update to API version 0.138.2 2015-08-27 11:41:04 -07:00
Kenneth Graunke
885a9b058c i965: Rename INTEL_DEBUG=vec4vs to INTEL_DEBUG=vec4.
driParseDebugString() doesn't have actual code to parse comma separated
lists (or any other supported options?); instead it dumbly uses strstr().

This means that INTEL_DEBUG="vec4vs" will trigger both DEBUG_VEC4VS and
DEBUG_VS, as "vs" is also a substring.

We should probably improve the driconf parsing, but for now, just rename
the option so it's usable in the meantime.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2015-08-27 11:38:50 -07:00
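A small standalone illustration of the strstr() substring problem described above (this is not the actual driParseDebugString() code):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
       const char *env = "vec4vs";

       /* Plain substring matching: both tests succeed for "vec4vs", so the
        * old option name would switch on DEBUG_VS as well. */
       if (strstr(env, "vec4vs"))
          printf("DEBUG_VEC4VS enabled\n");
       if (strstr(env, "vs"))
          printf("DEBUG_VS enabled too (unintended)\n");
       return 0;
    }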
Jason Ekstrand
4e3ee043c0 vk/gen8: Add support for push constants 2015-08-27 10:25:58 -07:00
Jason Ekstrand
375a65d5de vk/private.h: Handle a NULL bo but valid offset in __gen_combine_address 2015-08-27 10:25:58 -07:00
Jason Ekstrand
c8365c55f5 vk/cmd_buffer: Set the CONSTANTS_REL_GENERAL flag on execbuf
This tells the kernel that the push constant buffers are relative to the
dynamic state base address.
2015-08-27 10:25:58 -07:00
Jason Ekstrand
efc2cce01f HACK: Don't call nir_setup_uniforms
We're doing our own uniform setup and we don't need to call into the entire
GL stack to mess with things.
2015-08-27 10:25:58 -07:00
Jason Ekstrand
33cabeab01 vk/compiler: Add a helper for setting up prog_data->param
This new helper sets it up the way we'll want for handling push constants.
2015-08-27 10:25:16 -07:00
Tapani Pälli
16ad1d2a8d mesa: enable enums for OES_texture_storage_multisample_2d_array
v2: use _mesa_is_gles31(ctx) for verifying we are on ES 3.1,
    remove _es31 usage from get_hash_params.py

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-27 10:58:10 +03:00
Tapani Pälli
c2c64fd269 glsl: add support for OES_texture_storage_multisample_2d_array
v2: use ARB_texture_multisample enable bit

Patch adds extension enable bit and enables required keywords
and builtin functions for the extension.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-27 10:54:41 +03:00
Tapani Pälli
b9101b1443 mesa: Add extension enable for OES_texture_storage_multisample_2d_array
v2: use ARB_texture_multisample bit to enable extension

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-27 10:53:15 +03:00
Tapani Pälli
f4280b740d glapi: add GL_OES_texture_storage_multisample_2d_array extension
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-27 10:52:46 +03:00
Jason Ekstrand
5446bf352e vk: Add initial API support for setting push constants
This doesn't add support for actually uploading them, it just ensures that
we have and update the shadow copy.
2015-08-26 17:59:15 -07:00
Nanley Chery
9a759a6ee0 swrast: add a new macro, FETCH_COMPRESSED
This patch creates a new macro, FETCH_COMPRESSED - similar in nature
to the other FETCH_* macros. This reduces repetition in the code that
deals with compressed textures.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-26 14:36:43 -07:00
Nanley Chery
42ee16176d mesa: return bool instead of GLboolean in compressedteximage_only_format()
In agreement with the coding style, functions that aren't directly visible
to the GL API should prefer the use of bool over GLboolean.

Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-26 14:36:43 -07:00
Nanley Chery
43d5b4db96 i965: refactor miptree alignment calculation code
Remove redundant checks and comments by grouping our calculations for
align_w and align_h wherever possible.

v2: reintroduce brw.
    don't include functional changes.
    don't adjust function parameters or create a new function.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-26 14:36:43 -07:00
Nanley Chery
a687734135 i965: change the meaning of cpp for compressed textures
An ASTC block takes up 16 bytes for all block width and height configurations.
This size is not integrally divisible by all ASTC block widths. Therefore cpp
is changed to mean bytes per block if the texture is compressed.

Because the original definition was bytes per block divided by block width, all
references to the mipmap width must be divided by the block width. This keeps the
address calculation formulas consistent. For example, the units for miptree_level
x_offset and miptree total_width have changed from pixels to blocks.

v2: reuse preexisting ALIGN_NPOT macro located in an i965 driver file.
v3: move ALIGN_NPOT into a separate commit.
    simplify cpp assignment in copy_image_with_blitter().
    update miptree width and offset variables in: intel_miptree_copy_slice(),
        intel_miptree_map_gtt(), and brw_miptree_layout_texture_3d().

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-26 14:36:43 -07:00
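A hedged sketch of the unit change described above; the function and variable names are illustrative, not the i965 miptree code:

    #include <stdint.h>

    /* Illustrative: with cpp meaning "bytes per block" for compressed
     * formats, widths used in address math are first converted from pixels
     * to blocks, so the row-size formula stays the same for compressed and
     * uncompressed textures (uncompressed formats have block_width == 1). */
    static uint32_t
    demo_row_size_bytes(uint32_t width_px, uint32_t block_width_px, uint32_t cpp)
    {
       uint32_t width_in_blocks =
          (width_px + block_width_px - 1) / block_width_px;

       return width_in_blocks * cpp;  /* e.g. ASTC: cpp == 16 for every block size */
    }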
Nanley Chery
1a9ceed4ba i965: correct mt->align_h for 2D textures on Skylake
In agreement with commit 4ab8d59a23, vertical alignment values are equal to
four times the block height on Gen9+.

v2: add newlines to separate declarations, statements, and comments.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-26 14:36:43 -07:00
Nanley Chery
10ff64fd3d i965: use ALIGN_NPOT for setting ASTC mipmap layouts
ALIGN is changed to ALIGN_NPOT because alignment values are sometimes not
powers of two when working with ASTC.

v2: handle texture arrays and LDR-only systems.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-26 14:36:43 -07:00
Nanley Chery
54d2aa4258 mesa/macros: move ALIGN_NPOT to macros.h
Aligning with a non-power-of-two number is a general task that can be used in
various places. This commit is required for the next one.

v2: add greater than 0 assertion (Anuj).
    convert the macro to a static inline function.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-26 14:36:43 -07:00
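
A rough sketch of the helper as the notes above describe it (a static inline function with a greater-than-zero assertion); the exact name and types in macros.h may differ.

    #include <assert.h>
    #include <stdint.h>

    static inline uintptr_t
    align_npot(uintptr_t value, int32_t alignment)
    {
       assert(alignment > 0);
       /* Works for any positive alignment, power of two or not. */
       return (value + alignment - 1) / alignment * alignment;
    }
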
Nanley Chery
97f4efd573 mesa/macros: add power-of-two assertions for alignment macros
ALIGN and ROUND_DOWN_TO both require that the alignment value passed
into the macro be a power of two in the comments. Using software assertions
verifies this to be the case.

v2: use static inline functions instead of gcc-specific statement expressions (Brian).
v3: fix indendation (Brian).
v4: add greater than zero requirement (Anuj).

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-26 14:36:43 -07:00
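
A sketch of what the added check looks like for a power-of-two-only helper such as ALIGN; names and types are illustrative, not the exact macros.h code.

    #include <assert.h>
    #include <stdint.h>

    static inline uintptr_t
    align_pot(uintptr_t value, int32_t alignment)
    {
       /* Alignment must be a positive power of two for the mask trick below. */
       assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
       return (value + alignment - 1) & ~((uintptr_t)alignment - 1);
    }
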
Nanley Chery
8b1f008e9a i965/surface_formats: add support for 2D ASTC surface formats
Define two-thirds of the 2D Intel ASTC surface formats (LDR-only). This allows
a 1-to-1 mapping from the mesa format to the Intel format.

ASTC textures will default to being processed in LDR mode. If there is
hardware support for HDR/Full mode and the texture is not sRGB, add the
format bit necessary to process it in HDR/Full mode.

v2: remove extra newlines.
v3: follow existing coding style in translate_tex_format().
v4: expound on the GEN9_SURFACE_ASTC_HDR_FORMAT_BIT comment.
    update SF table - ASTC is actually supported in Gen8.
v5: conform the ASTC MESA_FORMAT enums to the existing naming convention.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-26 14:36:43 -07:00
Nanley Chery
cd49b97a8a mesa/teximage: return the base internal format of the ASTC formats
This is necessary to initialize the gl_texture_image struct.

From the KHR_texture_compression_astc_ldr spec:
  "Added to Section 3.8.6, Compressed Texture Images

   Add the tokens specified above to Table 3.16, Compressed Internal Formats.
   In all cases, the base internal format will be RGBA. The encoding allows
   images to be encoded with fewer channels, but this is always presented as
   RGBA to the sampler."

v2. use _mesa_is_astc_format().

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-26 14:36:43 -07:00
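
The rule quoted above boils down to a single check; this sketch uses the _mesa_is_astc_format() helper named in the v2 note, while the surrounding control flow is illustrative rather than the exact Mesa source.

    /* All ASTC internal formats report RGBA as their base internal format. */
    if (_mesa_is_astc_format(internalFormat))
       return GL_RGBA;
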
Nanley Chery
12b519b457 mesa/teximage: accept ASTC formats for 3D texture specification
The ASTC spec was revised as follows:

   Revision 2, April 28, 2015 - added CompressedTex{Sub,}Image3D to
   commands accepting ASTC format tokens in the New Tokens section [...].

Support only exists in the HDR submode:

   Add a second new column "3D Tex." which is empty for all non-ASTC
   formats. If only the LDR profile is supported by the implementation,
   this column is also empty for all ASTC formats. If both the LDR and HDR
   profiles are supported only, this column is checked for all ASTC
   formats.

LDR-only systems should generate an INVALID_OPERATION error when
attempting to call CompressedTexImage3D with the TEXTURE_3D target.

v2. return the proper error for LDR-only systems.
v3. update is_astc_format().
v4. use _mesa_is_astc_format().
v5. place logic in _mesa_target_can_be_compressed.
v6. fix issues handling ASTC formats.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-26 14:36:43 -07:00
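
A sketch of the LDR-only restriction described above; the extension-flag field name is an assumption, and the exact placement inside _mesa_target_can_be_compressed differs in the real code.

    /* ASTC data may not be specified for 3D targets on LDR-only hardware. */
    if (target == GL_TEXTURE_3D &&
        _mesa_is_astc_format(intFormat) &&
        !ctx->Extensions.KHR_texture_compression_astc_hdr) {  /* assumed field name */
       *error = GL_INVALID_OPERATION;
       return false;
    }
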
Nanley Chery
23c9cd5a96 mesa/texcompress: enable translation between MESA and GL ASTC formats
v3. conform the ASTC MESA_FORMAT enums to the existing naming convention.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-26 14:36:43 -07:00
Nanley Chery
692578ed13 mesa/glformats: recognize ASTC formats as compressed
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-26 14:36:42 -07:00
Nanley Chery
4143511b15 mesa: add ASTC extensions to the extensions table
v2: alphabetize the extensions.
    remove OES ASTC extension.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-26 14:36:42 -07:00
Nanley Chery
582ce1ea97 mesa: don't enable online compression for ASTC formats
In agreement with the ASTC spec, this makes calls to TexImage*D unsuccessful.
As implied by the spec, Generate[Texture]Mipmap and [Copy]Tex[Sub]Image*D calls
must be unsuccessful as well.

v2. actually force attempts to compress online to fail.
v3. indentation (Matt).
v4. update copytexture_error_check to account for CopyTexImage*D (Chad).

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-26 14:36:42 -07:00
Nanley Chery
e9fd8e154f glapi: add support for KHR_texture_compression_astc_ldr
v2: correct the spelling of the sRGB variants.
    remove spaces around "=" when setting the enum value.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-26 14:36:42 -07:00
Nanley Chery
8ae37365f3 mesa/formats: define the 2D ASTC formats
Define the mesa formats and make changes necessary for compilation
without errors. Also add support for _mesa_get_srgb_format_linear().

v2. conform the ASTC MESA_FORMAT enums to the existing naming convention.
v3. remove ASTC cases for _mesa_get_uncompressed_format(). This function is
    only used for generating mipmaps - something ASTC formats do not support
    due to lack of online compression.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-26 14:36:42 -07:00
Ilia Mirkin
c4cbaca327 nouveau: avoid build failures since 0fc21ecf
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-26 14:04:41 -04:00
Jason Ekstrand
36134e1050 Merge remote-tracking branch 'mesa-public/master' into vulkan 2015-08-26 11:04:30 -07:00
Marek Olšák
6924ecac77 gallium/radeon: read_registers should return bool meaning success or failure
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:20 +02:00
Marek Olšák
16e5d8ad38 radeonsi: add IB parser support for CP DMA packets
If the packet encoding is defined in the same format as register definitions,
the python script can process them automatically and the parser support
becomes trivial.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:19 +02:00
Marek Olšák
2c14a6d3b1 radeonsi: add IB tracing support for debug contexts
This adds trace points to all IBs and the parser prints them and also
prints which trace points were reached (executed) by the CP.
This can help pinpoint a problematic packet, draw call, etc.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:19 +02:00
Marek Olšák
189953ee13 radeonsi: remove old CS tracing code
Some of it is left there and it will be re-used in the next commit.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:19 +02:00
Marek Olšák
df6a5666b6 radeonsi: parse and dump status registers on GPU hang
GPU hang detection must be enabled by setting: GALLIUM_DDEBUG=[timeout in ms]

This may print too much information that we might not understand yet,
but some of the bits are very useful.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:19 +02:00
Marek Olšák
61df4f0cd3 radeonsi: add an IB parser
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:19 +02:00
Marek Olšák
be6dc87776 radeonsi: save the contents of indirect buffers for debug contexts
This will be used by the IB parser.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:19 +02:00
Marek Olšák
a6a6c68955 radeonsi: generate register and packet tables for an IB parser from sid.h
This makes writing a good IB parser a lot easier.

It generates 2 tables:
- packet3 table
- register table with all registers, fields, and named values

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:19 +02:00
Marek Olšák
d15b71b4bd radeonsi: remove duplicated register definitions and instruction definitions
Instruction encoding isn't needed in Mesa.

The border color address registers were duplicated.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:19 +02:00
Marek Olšák
c59ad265df r600g,radeonsi: remove unused ill-formed register field definitions
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:18 +02:00
Marek Olšák
110873ed11 radeonsi: add an initial dump_debug_state implementation dumping shaders
This is usually called after a draw call.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:18 +02:00
Marek Olšák
93d97db349 radeonsi: allow si_dump_key to write to a file
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:18 +02:00
Marek Olšák
525921ed51 gallium/ddebug: new pipe for hang detection and driver state dumping (v2)
v2: lots of improvements

This is like identity or trace, but simpler. It doesn't wrap most states.

Run with:
  GALLIUM_DDEBUG=1000 [executable]
where "executable" is the app and "1000" is in miliseconds, meaning that
the context will be considered hung if a fence fails to signal in 1000 ms.

If that happens, all shaders, context states, bound resources, draw
parameters, and driver debug information (if any) will be dumped into:
  /home/$username/dd_dumps/$processname_$pid_$index.

Note that the context is flushed after every draw/clear/copy/blit operation
and then waited for to find the exact call that hangs.

You can also do:
  GALLIUM_DDEBUG=always
to do the dumping after every draw/clear/copy/blit operation without
flushing and waiting.

Examples of driver states that can be dumped are:
- Hardware status registers saying which hw block is busy (hung).
- Disassembled shaders in a human-readable form.
- The last submitted command buffer in a human-readable form.

v2: drop pipe-loader changes, drop SConscript
    rename dd.h -> dd_pipe.h

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:18 +02:00
Marek Olšák
0fc21ecfc0 gallium: add flags parameter to pipe_screen::context_create
This allows creating compute-only and debug contexts.

Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:18 +02:00
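
A hedged illustration of how a state tracker would use the new parameter; the PIPE_CONTEXT_DEBUG / PIPE_CONTEXT_COMPUTE_ONLY flag names are assumptions about the gallium defines rather than a definitive reference.

    /* Ask the screen for a debug context; pass 0 for a plain context. */
    struct pipe_context *ctx =
       screen->context_create(screen, /*priv*/ NULL, PIPE_CONTEXT_DEBUG);
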
Marek Olšák
7b5c92391f gallium: add an interface for dumping debug driver state
Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:18 +02:00
Ilia Mirkin
a3b617a258 mesa: remove pointless es31 checks, fix indirect to only be in es31
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-08-26 12:37:38 -04:00
Ilia Mirkin
332fb341dd mesa: uncomment checks in es31 computation, add texture_ms
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-08-26 12:37:17 -04:00
Marek Olšák
f432ae899f mesa: create multisample fallback textures like normal textures
This works if drivers upsample on upload (like all radeon ones do).
The alternative is an unexpected GL error from anything calling
_mesa_update_state and possibly other issues.

Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-08-26 15:42:26 +02:00
Grazvydas Ignotas
f8b01ae47c radeonsi: mark unreachable paths to avoid warnings
Otherwise we get:
warning: 'num_user_sgprs' may be used uninitialized in this function
...

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-08-26 15:42:26 +02:00
Tapani Pälli
e0c2ea0337 mesa: GetTexLevelParameter{if}v changes for OpenGL ES 3.1
The patch refactors the existing parameter checks to first check the enums
common to desktop GL and GLES 3.1, and modifies get_tex_level_parameter_image
to be compatible with the enums specified in 3.1.

v2: remove extra is_gles31() checks (suggested by Ilia)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (v1)
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> (v1)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-26 08:38:25 +03:00
Marta Lofstedt
ae8d0e7abe mesa/es3.1: Allow GL_COMPUTE_WORK_GROUP_SIZE for OpenGL ES 3.1
According to the OpenGL ES specification, section 7.12,
GL_COMPUTE_WORK_GROUP_SIZE is supported by the
glGetProgramiv function.

Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-08-26 08:25:07 +03:00
Marta Lofstedt
c2a766880d mesa/es3.1: Enable getting MAX_COMPUTE_WORK_GROUP_ values for OpenGL ES 3.1
According to the OpenGL ES 3.1 specification, chapter 17,
MAX_COMPUTE_WORK_GROUP_COUNT and MAX_COMPUTE_WORK_GROUP_SIZE
are available for glGetIntegeri_v.

Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-08-26 08:25:07 +03:00
Dave Airlie
73e5adc4b2 mesa/formats: pass correct parameter to _mesa_is_format_compressed
commit 26c549e69d
Author: Nanley Chery <nanley.g.chery@intel.com>
Date:   Fri Jul 31 10:26:36 2015 -0700

    mesa/formats: remove compressed formats from matching function

caused a regression in my CTS testing; this looks like a clear
thinko.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-26 14:13:27 +10:00
Jason Ekstrand
74e076bba8 vk/meta: Destroy vertex shaders when setting up clearing 2015-08-25 18:51:26 -07:00
Jason Ekstrand
4bb9915755 vk/gen8: Don't duplicate generic pipeline setup
gen8_graphics_pipeline_create had a bunch of stuff in it that's already set
up by anv_pipeline_init.  The duplication was causing double-initialization
of a state stream and made valgrind very angry.
2015-08-25 18:41:25 -07:00
Jason Ekstrand
9b387b5d3f Merge remote-tracking branch 'mesa-public/master' into vulkan 2015-08-25 18:41:21 -07:00
Roland Scheidegger
48e6404c04 gallium/auxiliary: optimize rgb9e5 helper some more
I used this as some testing ground for investigating some compiler
bits initially (e.g. lrint calls etc.), figured I could do much better
in the end just for fun...
This is mathematically equivalent, but uses some tricks to avoid
doubles and also replaces some float math with ints. Good for another
performance doubling or so. As a side note, some quick tests show that
llvm's loop vectorizer would be able to properly vectorize this version
(which it failed to do earlier due to doubles, producing a mess), giving
another 3 times performance increase with sse2 (more with sse4.1), but this
may not apply to mesa.
No piglit change.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2015-08-26 02:57:38 +02:00
Roland Scheidegger
941346a803 gallium/auxiliary: optimize rgb9e5 helper a bit
This code (lifted straight from the extension) was doing things in the most
inefficient way you could think of.
This drops some of the more expensive float operations, in particular
- int-cast floors (pointless, values always positive)
- 2 raised to (signed) integers (replace with simple exponent manipulation),
  getting rid of a misguided comment in the process (implement with table...)
- float division (replace with mul of reverse of those exponents)
This is like 3 times faster (measured for float3_to_rgb9e5), though it depends
(e.g. llvm is clever enough to replace exp2 with ldexp whereas gcc is not,
division is not too bad on cpus with early-exit divs).
Note that the double math is kept for now (float x + 0.5), as the results may
otherwise differ.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2015-08-26 02:57:37 +02:00
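
The "simple exponent manipulation" mentioned above amounts to building a power of two directly in the IEEE-754 bit pattern instead of calling exp2()/pow(); a self-contained illustration (not the actual helper's code) follows.

    #include <stdint.h>
    #include <string.h>

    /* 2^e as a float, valid for the normal range -126 <= e <= 127; multiplying
     * by pow2_float(-e) then replaces a division by 2^e. */
    static float
    pow2_float(int e)
    {
       uint32_t bits = (uint32_t)(e + 127) << 23;
       float f;
       memcpy(&f, &bits, sizeof f);  /* avoids strict-aliasing issues */
       return f;
    }
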
Dave Airlie
c1452983b4 mesa/texgetimage: fix missing stencil check
GetTexImage can read to stencil8 but only from
a stencil or depthstencil textures.

This fixes a bunch of failures in CTS
GL33-CTS.gtf32.GL3Tests.packed_pixels

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-26 10:22:09 +10:00
Kristian Høgsberg Kristensen
5360edcb30 vk/vec4: Use the right constant for offset into a UBO
We were using constant 0, which is the descriptor set, not the offset.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-25 16:14:59 -07:00
Kristian Høgsberg Kristensen
647a60226d vk: Use true/false for RenderCacheReadWriteMode
This field in surface state is a bool, WriteOnlyCache is an enum from
GEN8.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-25 15:58:21 -07:00
Kristian Høgsberg Kristensen
7e5afa75b5 vk: Support descriptor sets and bindings in vec4 ubo loads
Still incomplete, but at least we get the simplest case working.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-25 15:57:12 -07:00
Kristian Høgsberg Kristensen
00e7799c69 vk/gen7: Enable L3 caching for GEN7 MOCS
Do what GL does here.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-25 15:55:56 -07:00
Nanley Chery
1d2a844e7d mesa/teximage: Add GL error parameter to _mesa_target_can_be_compressed
Enables _mesa_target_can_be_compressed to return the appropriate GL error
depending on its inputs. Use the parameter to return the appropriate GL error
for ETC2 formats on GLES3.

Suggested-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-25 15:53:46 -07:00
Nanley Chery
26c549e69d mesa/formats: remove compressed formats from matching function
All compressed formats return GL_FALSE and there isn't any evidence to
support that this behaviour would change. Remove all switch cases for
compressed formats.

v2. Since the exhaustive switch is removed, add a gtest to ensure
    all formats are handled.
v3. Ensure that GL_NO_ERROR is set before returning.
v4. Fix an arg to _mesa_uncompressed_format_to_type_and_comps();
    fix formatting and misc improvements (Chad).

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-25 15:45:17 -07:00
Nanley Chery
8e581747d2 mesa/formats: make format testing a gtest
We currently check that our format info table is sane during context
initialization in debug builds. Perform this check during
`make check` instead. This enables format testing in release builds
and removes the requirement of an exhaustive switch for
_mesa_uncompressed_format_to_type_and_comps().

v2. indentation and conditional inclusion fixes (Chad).
    allow tests to continue running if any format fails
    and display the failing format name.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-25 15:45:13 -07:00
Kenneth Graunke
1bec29d04d gallium/ttn: Use nir_builder_insert() rather than poking at cf_list.
I intend to remove nir_builder::cf_node_list, so I can't have this code
poking at it directly.  The proper way is to set the insertion point and
then simply insert things there.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-08-25 11:12:35 -07:00
Kenneth Graunke
78856194c1 prog_to_nir: Use nir_builder_insert() rather than poking at cf_list.
I intend to remove nir_builder::cf_node_list, so I can't have this code
poking at it directly.  The proper way is to set the insertion point and
then simply insert things there.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-08-25 11:12:35 -07:00
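
The pattern both of these commits move to looks roughly like the following, written against the cursor-style nir_builder API; the exact helper names available at the time may have differed slightly.

    #include "nir_builder.h"

    static void
    append_instr(nir_function_impl *impl, nir_instr *instr)
    {
       nir_builder b;
       nir_builder_init(&b, impl);
       b.cursor = nir_after_cf_list(&impl->body);  /* set the insertion point once */
       nir_builder_instr_insert(&b, instr);        /* then simply insert */
    }
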
Kenneth Graunke
5f14c417c8 nir: Use nir_shader::stage rather than passing it around.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-08-25 11:12:35 -07:00
Kenneth Graunke
d4d5b430a5 nir: Store gl_shader_stage in nir_shader.
This makes it easy for NIR passes to inspect what kind of shader they're
operating on.

Thanks to Michel Dänzer for helping me figure out where TGSI stores the
shader stage information.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-08-25 11:12:35 -07:00
Kristian Høgsberg Kristensen
6a1098b2c2 vk/gen7: Use TILEWALK_XMAJOR for linear surfaces
You wouldn't think the TileWalk mode matters when TiledSurface is
false. However, it has to be TILEWALK_XMAJOR. Make it so.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-25 10:54:13 -07:00
Jason Ekstrand
dfacae3a56 i965/fs: Combine assign_constant_locations and move_uniform_array_access_to_pull_constants
The comment above move_uniform_array_access_to_pull_constants was
completely bogus because it has nothing to do with lowering instructions.
Instead, it's assigning locations to pull constants.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
Jason Ekstrand
c999a58f50 nir/lower_io: Remove assign_var_locations_direct_first
This is no longer used so we might as well get rid of it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
Jason Ekstrand
259f7291de i965/fs: Rework uniform handling
Previously, we treated the entire UNIFORM file as if it had two elements:
One for direct things and one for indirect.  This is substantially
different from how the old visitor code handled it where each element was
effectively its own uniform.  This commit makes the NIR path more like the
old ir_visitor path where each uniform is separate.  This should allow us
to more easily make decisions about what to push.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
Jason Ekstrand
cfa056c6a5 i965/vec4_nir: Get rid of the uniform_driver_location tracking
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
Jason Ekstrand
ce5e9139aa nir/lower_io: Separate driver_location and base offset for uniforms
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
Jason Ekstrand
0db8e87b4a nir/intrinsics: Add a second const index to load_uniform
In the i965 backend, we want to be able to "pull apart" the uniforms and
push some of them into the shader through a different path.  In order to do
this effectively, we need to know which variable is actually being referred
to by a given uniform load.  Previously, it was completely flattened by
nir_lower_io which made things difficult.  This adds more information to
the intrinsic to make this easier for us.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
Kenneth Graunke
6c33d6bbf9 nir: Pass a type_size() function pointer into nir_lower_io().
Previously, there were four type_size() functions in play - the i965
compiler backend defined scalar and vec4 type_size() functions, and
nir_lower_io contained its own similar functions.

In fact, the i965 driver used nir_lower_io() and then looped over the
components using its own type_size - meaning both were in play.  The
two are /basically/ the same, but not exactly in obscure cases like
subroutines and images.

This patch removes nir_lower_io's functions, and instead makes the
driver supply a function pointer.  This gives the driver ultimate
flexibility in deciding how it wants to count things, reduces code
duplication, and improves consistency.

v2 (Jason Ekstrand):
 - One side-effect of passing in a function pointer is that nir_lower_io is
   now aware of and properly allocates space for image uniforms, allowing
   us to drop hacks in the backend

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
v2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
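
A sketch of the shape of the driver-supplied callback; the callback body and the lowering call shown here are assumptions based on the commit text, not copies of the i965 or nir_lower_io code.

    /* Count one slot per vec4 - how a driver counts is entirely up to it. */
    static int
    vec4_type_size(const struct glsl_type *type)
    {
       return glsl_count_attribute_slots(type, false);
    }

    /* The driver then hands the pointer to the pass, roughly:
     *    nir_lower_io(shader, mode, vec4_type_size);
     */
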
Kenneth Graunke
a23f82053d prog_to_nir: Don't allocate nir_variable with type vec4[0] for uniforms.
If there are no parameters, we don't need to create a nir_variable to
hold them...and allocating an array of length 0 is pretty bogus.

Should avoid i965 backend assertions in future patches Jason and I are
working on.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-08-25 10:18:27 -07:00
Kenneth Graunke
640c472fd0 i965: Move type_size() methods out of visitor classes.
I want to use C function pointers to these, and they don't use anything
in the visitor classes anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-08-25 10:18:27 -07:00
Jason Ekstrand
c56899f41a i965: Make setup_vec4_uniform_value and _image_uniform_values take an offset
This way they don't implicitly increment the uniforms variable and don't
have to be called in-sequence during uniform setup.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
Jason Ekstrand
8d8b8f5854 i965: Rename setup_vector_uniform_values to setup_vec4_uniform_value
The new name more accurately represents what it does: Set up a single vec4
uniform value.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
Rob Clark
0ab29751b6 freedreno/ir3: fix compile break after splitting out nir_control_flow.h
The commit:

  commit b49371b8ed
  Author:     Connor Abbott <cwabbott0@gmail.com>
  AuthorDate: Tue Jul 21 19:54:18 2015 -0700

      nir: move control flow modification to its own file

split out some control flow related APIs into a separate header, but did
not update drivers.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-08-25 08:17:30 -04:00
Rob Clark
8b2d0bb844 freedreno/ir3: fix compile break after fxn->start_block removal
The commit:

  commit 8e0d4ef341
  Author:     Kenneth Graunke <kenneth@whitecape.org>
  AuthorDate: Thu Aug 6 18:18:40 2015 -0700

      nir: Delete the nir_function_impl::start_block field.

removed the start_block field without fixing up drivers..

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-08-25 08:13:04 -04:00
Dave Airlie
529acab22a mesa: enable texture stencil8 for multisample
This fixes GL45-CTS.gtf44.GL31Tests.texture_stencil8.texture_stencil8_gl44
from the ogl conform suite.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-25 11:06:58 +10:00
Brian Paul
e089ca26e1 mesa: make _mesa_bind_texture_unit() static
It's only called from the file it's defined in.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
2015-08-24 18:23:19 -06:00
Nanley Chery
8f378d1083 mesa/formats: store whether or not a format is sRGB in gl_format_info
v2: remove extra newline.
v3: use bool instead of GLboolean.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-24 16:08:01 -07:00
Kenneth Graunke
4f2cdd8497 nir: Use !block_ends_in_jump() in a few places rather than open-coding.
Connor introduced this helper recently; we should use it here too.

I had to move the function earlier in the file for it to be available.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-08-24 15:10:55 -07:00
Kristian Høgsberg Kristensen
f1455ffac7 vk: Add gen7 support
With all the previous commits in place, we can now drop in support for
multiple platforms. First up is gen7 (Ivybridge).

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen
891995e55b vk: Move 3DSTATE_SBE setup to just before 3DSTATE_PS
This is a more logical place for it, between geometry front end state
and pixel backend state.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen
9c752b5b38 vk: Move generic pipeline init to anv_pipeline.c
This logic will be shared between multiple gens.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen
3800573fb5 vk: Move gen8 specific state into gen8 sub-structs
This commit moves all occurrences of gen8 specific state into a gen8
substruct. This clearly identifies the state as gen8 specific and
prepares for adding gen7 state structs. In the process we also rename
the field names to exactly match the command or state packet name,
without the 3DSTATE prefix, eg:

  3DSTATE_VF -> gen8.vf
  3DSTATE_WM_DEPTH_STENCIL -> gen8.wm_depth_stencil

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen
615da3795a vk: Always use a placeholder vertex shader in meta
The clear pipeline didn't have a vertex shader and relied on the clear
shader being hardcoded by the compiler to accept one attribute. This
necessitated a few special cases in the 3DSTATE_VS setup. Instead,
always provide a vertex shader, even if we disable VS dispatch.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen
ac738ada7a vk: Trim out irrelevant 0-initialized surface state fields
Many of these fields aren't used for buffer surfaces, so leave them
out for brevity.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen
963a1e35e7 vk: Update generated headers
This adds VALIGN_2 and VALIGN_4 defines for IVB and HSW
RENDER_SURFACE_STATE.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen
f5275f7eb3 vk: Move anv_color_attachment_view_init() to gen8_state.c
I'd prefer to move anv_CreateAttachmentView() as well, but it's a little
too much generic code to just duplicate for each gen.  For now, we'll
add a anv_color_attachment_view_init() to dispatch to the gen specific
implementation, which we then call from anv_CreateAttachmentView().

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
988341a73c vk: Move anv_CreateImageView to gen8_state.c
We'll probably want to move some code back into a shared init function,
but this gets one GEN8 surface state initialization out of anv_image.c.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
bc568ee992 vk: Make anv_cmd_buffer_begin_subpass() switch on gen
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
8fe74ec45c vk: Add generic wrapper for filling out buffer surface state
We need this for generating surface state on the fly for dynamic buffer
views.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
a2b822185e vk: Add helper for adding surface state reloc
We're going to have to do this differently for earlier gens, so lets do
it in place only.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
e43fc871be vk: Make batch chain code gen-agnostic
Since the extra dword in MI_BATCH_BUFFER_START added in gen8 is at the
end of the struct, we can emit the gen8 packet on all gens as long as we
set the instruction length correctly.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
25ab43ee8c vk: Move vkCmdPipelineBarrier to gen8_cmd_buffer.c
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
b4ef2302a9 vk: Use helper function for emitting MI_BATCH_BUFFER_START
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
97360ffc6c vk: Use anv_batch_emit() for chaining back to primary batch
We used to use a manual GEN8_MI_BATCH_BUFFER_START_pack() call, but this
refactors the code to use anv_batch_emit().

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
cff717c649 vk: Downgrade state packet to gen7 where they're common
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
64045eebfb vk: Reorder gen8 specific code into three new files
We'll organize gen specific code in three files per gen: pipeline,
cmd_buffer and state, eg:

    gen8_cmd_buffer.c
    gen8_pipeline.c
    gen8_state.c

where gen8_cmd_buffer.c holds all vkCmd* entry points, gen8_pipeline.c holds
all gen specific code related to pipeline building, and the remaining state
code (sampler, surface state, dynamic state) goes in gen8_state.c.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
9f0bb5977b vk: Move gen8_CmdBindIndexBuffer() to anv_gen8.c
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
a7649b2869 vk: Move gen8_cmd_buffer_emit_state_base_address() to anv_gen8.c
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
130db30771 vk: Move gen8 specific parts of queries to anv_gen8.c
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
98126c021f vk: Move dynamic depth stencil to anv_gen8.c 2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
0bcf85d79f vk: Move pipeline creation to anv_gen8.c 2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
ef0ab62486 vk: Move anv_CreateSampler to anv_gen8.c
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
fb428727e0 vk: Move anv_CreateBufferView to anv_gen8.c
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
74556b076a vk: Add new anv_gen8.c and move CreateDynamicRasterState there
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
ee9788973f vk: Implement multi-gen dispatch mechanism 2015-08-24 13:45:39 -07:00
Connor Abbott
d7971b41ce nir/cf: reimplement nir_cf_node_remove() using the new API
This gives us some testing of it. Also, the old nir_cf_node_remove()
wasn't handling phi nodes correctly and was calling cleanup_cf_node()
too late.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:42 -07:00
Connor Abbott
fc7f2d2364 nir/cf: add new control modification API's
These will help us do a number of things, including:

- Early return elimination.
- Dead control flow elimination.
- Various optimizations, such as replacing:

if (foo) {
    ...
}
if (!foo) {
    ...
}

with:

if (foo) {
    ...
} else {
    ...
}

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:42 -07:00
Connor Abbott
476eb5e4a1 nir/cf: use a cursor for inserting control flow
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:42 -07:00
Connor Abbott
d356f84d4c nir/cf: add split_block_cursor()
This is a helper that will be shared between the new control flow
insertion and modification code.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:42 -07:00
Connor Abbott
58a360c6b8 nir/cf: add split_block_before_instr()
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:42 -07:00
Connor Abbott
6e47a34b29 nir/cf: add a cursor structure
For now, it allows us to refactor the control flow insertion API's so
that there's a single entrypoint (with some wrappers). More importantly,
it will allow us to reduce the combinatorial explosion in the extract
function. There, we need to specify two points to extract, which may be
at the beginning of a block, the end of a block, or in the middle of a
block. And then there are various wrappers based off of that (before a
control flow node, before a control flow list, etc.). Rather than having
9 different functions, we can have one function and push the actual
logic of determining which variant to use down to the split function,
which will be shared with nir_cf_node_insert().

In the future, we may want to make the instruction insertion API's as
well as the builder use this, but that's a future cleanup.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:42 -07:00
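
Conceptually, the cursor is one small struct that can name any of the insertion spots described above; the declarations below are a hypothetical sketch, not the actual NIR code.

    #include "nir.h"   /* for nir_block / nir_instr */

    typedef enum {
       CURSOR_BEFORE_BLOCK,
       CURSOR_AFTER_BLOCK,
       CURSOR_BEFORE_INSTR,
       CURSOR_AFTER_INSTR,
    } cf_cursor_option;

    typedef struct {
       cf_cursor_option option;
       union {
          nir_block *block;   /* for the block-relative variants */
          nir_instr *instr;   /* for the instruction-relative variants */
       };
    } cf_cursor;
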
Connor Abbott
6f5c81f86f nir/cf: fix link_blocks() when there are no successors
When we insert a single basic block A into another basic block B, we
will split B into C and D, insert A in the middle, and then splice
together C, A, and D. When we splice together C and A, we need to move
the successors of A into C -- except A has no successors, since it
hasn't been inserted yet. So in move_successors(), we need to handle the
case where the block whose successors are to be moved doesn't have any
successors. Fixing link_blocks() here prevents a segfault and makes it
work correctly.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:42 -07:00
Connor Abbott
6d028749ac nir/cf: clean up jumps when cleaning up CF nodes
We may delete a control flow node which contains structured jumps to
other parts of the program. We need to remove the jump as a predecessor,
as well as remove any phi node sources which reference it. Right now,
the same problem exists for blocks that don't end in a jump instruction,
but with the new API it shouldn't be an issue, since blocks that don't
end in a jump must either point to another block in the same extracted
CF list or not point to anything at all.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:42 -07:00
Connor Abbott
211c79515d nir/cf: remove uses of SSA definitions that are being deleted
Unlike calling nir_instr_remove(), calling nir_cf_node_remove() (and
later in the series, the nir_cf_list_delete()) implies that you're
removing instructions that may still have uses, except those
instructions are never executed so any uses will be undefined. When
cleaning up a CF node for deletion, we must clean up any uses of the
deleted instructions by making them point to undef instructions instead.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:42 -07:00
Connor Abbott
633cbbc068 nir/cf: handle jumps better in stitch_blocks()
In particular, handle the case where the earlier block ends in a jump
and the later block is empty. In that case, we want to preserve the jump
and remove any traces of the later block. Before, we would only hit this
case when removing a control flow node after a jump, which wasn't a
common occurrence, but we'll need it to handle inserting a control flow
list which ends in a jump, which should be more common/useful.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:42 -07:00
Connor Abbott
940873bf22 nir/cf: handle jumps in split_block_end()
Before, we would only split a block with a jump at the end if we were
inserting something after a block with a jump, which never happened in
practice. But now, we want to use this to extract control flow lists
which may end in a jump, in which case we really need to do the correct
patching up. As a side effect, when removing jumps we now correctly
insert undef phi sources in some corner cases, which can't hurt.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:42 -07:00
Connor Abbott
f596e4021c nir/cf: add block_ends_in_jump()
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:42 -07:00
Connor Abbott
788d45cb47 nir/cf: handle phi nodes better in split_block_beginning()
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:42 -07:00
Connor Abbott
747ddc3cdd nir/cf: split up and improve nir_handle_remove_jumps()
Before, the process of removing a jump and wiring up the remaining block
correctly was atomic, but with the new control flow modification it's
split into two parts: first, we extract the jump, which creates a new
block with re-wired successors as well as a free-floating jump, and then
we delete the control flow containing the jump, which removes the entry
in the predecessors and any phi node sources. Split up
nir_handle_remove_jumps() to accommodate this, and add the missing
support for removing phi node sources.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:42 -07:00
Connor Abbott
13482111d0 nir/cf: add remove_phi_src() helper
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:41 -07:00
Connor Abbott
f41e108d8b nir: add nir_foreach_phi_src_safe()
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:41 -07:00
Connor Abbott
762ae436ea nir/cf: add insert_phi_undef() helper
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:41 -07:00
Connor Abbott
b49371b8ed nir: move control flow modification to its own file
We want to start reworking and expanding this code, but it'll be a lot
easier to do once we disentangle it from the rest of the stuff in nir.c.
Unfortunately, there are a few unavoidable dependencies in nir.c on
methods we'd rather not expose publicly, since if not used in very
specific situations they can cause Bad Things (tm) to happen. Namely, we
need to do some magical control flow munging when adding/removing jumps.
In the future, we may disallow adding/removing jumps in
nir_instr_insert_*() and nir_instr_remove(), and use separate functions
that are part of the control flow modification code, but for now we
expose them and put them in a separate, private header.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:41 -07:00
Connor Abbott
1c53f89696 nir: make cleanup_cf_node() not use remove_defs_uses()
cleanup_cf_node() is part of the control flow modification code, which
we're going to split into its own file, but remove_defs_uses() is an
internal function used by nir_instr_remove(). Break the dependency by
making cleanup_cf_node() use nir_instr_remove() instead, which simply
calls remove_defs_uses() and then removes the instruction from the list.
nir_instr_remove() does do extra things for jumps, though, so we avoid
calling it on jumps, which matches the previous behavior (this will be
fixed later in the series).

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:41 -07:00
Connor Abbott
9d5944053c nir: inline block_add_pred() a few places
It was being used to initialize function impls and loops, even though
it's really a control flow modification helper. It's pretty trivial, so
just inline it to avoid the dependency.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:41 -07:00
Connor Abbott
c7df141c71 nir/validate: check successors/predecessors more carefully
We should be checking almost everything now.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:41 -07:00
Kenneth Graunke
8e0d4ef341 nir: Delete the nir_function_impl::start_block field.
It's simply the first nir_cf_node in the nir_function_impl::body list,
which is easy enough to access - we don't need to store a pointer to it
explicitly.  Removing it means we don't need to maintain the pointer
when, say, splitting the start block when modifying control flow.

Thanks to Connor Abbott for suggesting this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-08-24 13:31:41 -07:00
Nanley Chery
9f00af672b mesa/formats: only do type and component lookup for uncompressed formats
Only uncompressed formats have a non-void type and actual
components per pixel. Rename _mesa_format_to_type_and_comps
to _mesa_uncompressed_format_to_type_and_comps and require
callers to check if the format is not compressed.

v2. include compressed format cases to avoid gcc warnings (Chad).

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-24 11:27:46 -07:00
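
The caller-side pattern the rename enforces, sketched with an illustrative variable context around the two functions named in the message.

    GLenum datatype;
    GLuint comps;

    /* Callers must now check for compression before asking for type/comps. */
    if (!_mesa_is_format_compressed(format))
       _mesa_uncompressed_format_to_type_and_comps(format, &datatype, &comps);
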
Rob Clark
000e225360 freedreno/a4xx: formats update
Fixes glamor, which wants to use R8 integer textures.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-08-24 13:16:27 -04:00
Rob Clark
afb6c24a20 freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-08-24 13:15:57 -04:00
Chris Wilson
4e5752e2b7 i965: Always re-emit the pipeline select during invariant state emission
On the older platforms where we don't have logical contexts preserving
state across batches, we emit the invariant state setup on every batch
using the brw_invariant_state atom. This includes the pipeline selection
which is cached with the introduction of

commit 0e0e23ef53
Author: Jordan Justen <jordan.l.justen@intel.com>
Date:   Wed Apr 22 11:43:50 2015 -0700

    i965/state: Emit pipeline select when changing pipelines

However, we do not reset the cache between batches on context-less
platforms, resulting in us not setting the pipeline selection, which can
cause GPU hangs if a media pipeline was loaded in the meantime (e.g.
mixing mplayer/gstreamer using libva and gnome-shell). A simple solution
is to just forcibly re-emit the pipeline select along with the invariant
state and reset the cache at that point.

Reported-and-tested-by: Tomasz C. <tomaszc@o2.pl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91254
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
2015-08-24 08:57:55 +01:00
Marek Olšák
a83c36b5c0 Revert "radeon/winsys: increase the IB size for VM"
This reverts commit 567394112d.

It regressed performance. It looks like smaller IBs are better, because
the GPU goes idle quicker and there is less waiting for buffers and fences.

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
2015-08-23 19:01:15 +02:00
Ilia Mirkin
e18c29b031 nv50: fix 2d engine blits for 64- and 128-bit formats
This fixes bin/ext_framebuffer_multisample-formats all_samples

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
2015-08-23 03:12:07 -04:00
Ilia Mirkin
a6ad49cbbd nv50: account for the int RT0 rule for alpha-to-one/cov
Same as commit 1af0641db but for nvc0. If an integer texture is
bound to RT0, don't do alpha-to-one or alpha-to-coverage.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
2015-08-23 02:58:58 -04:00
Dave Airlie
45971fd0df mesa/arb_gpu_shader_fp64: add support for glGetUniformdv
This was missed when I did fp64, I've sent a piglit test to cover
the case as well.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-23 15:56:35 +10:00
Ilia Mirkin
abbf05cfc2 nv50,nvc0: disable depth bounds test on blit
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
2015-08-23 01:39:29 -04:00
Neil Roberts
3a1ab23480 i965/bdw: Fix 3DSTATE_VF_INSTANCING when the edge flag is used
When the edge flag element is enabled then the elements are slightly
reordered so that the edge flag is always the last one. This was
confusing the code to upload the 3DSTATE_VF_INSTANCING state because
that is uploaded with a separate loop which has an instruction for
each element. The indices used in these instructions weren't taking
into account the reordering so the state would be incorrect.

v2: Use nr_elements instead of brw->vb.nr_enabled so that it will cope
    when gl_VertexID is used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91292
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2015-08-22 22:25:39 -07:00
Neil Roberts
fb02b4ec48 i965: Swap the order of the vertex ID and edge flag attributes
The edge flag data on Gen6+ is passed through the fixed function hardware as
an extra attribute. According to the PRM it must be the last valid
VERTEX_ELEMENT structure. However if the vertex ID is also used then another
extra element is added to source the VID. This made it so the vertex ID is in
the wrong register in the vertex shader and the edge attribute is no longer in
the last element.

v2: Also implement for BDW+

v3 [by Ben]: Remove 10.5 tag. Too late.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84677
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2015-08-22 22:20:33 -07:00
Glenn Kennard
50932268aa r600g: Fix assert in tgsi_cmp
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=91726

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@gmail.com>
2015-08-23 09:31:12 +10:00
Alexander von Gluck IV
5abbd1cacc egl: scons: fix the haiku build, do not build the dri2 backend
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-08-22 10:13:31 -05:00
Emil Velikov
a8c5c62359 docs: add 11.1.0-devel release notes template, bump version
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-08-22 13:28:16 +01:00
Chad Versace
c4e7ed9163 vk/meta: Implement depth clears
Fixes Crucible test
func.depthstencil.basic-depth.clear-1.0.op-greater.
2015-08-20 10:25:05 -07:00
Chad Versace
0db3d67a14 vk: Cache each render pass's number of clear ops
During vkCreateRenderPass, count the number of clear ops and store them
in new members of anv_render_pass:

    uint32_t num_color_clear_attachments
    bool has_depth_clear_attachment
    bool has_stencil_clear_attachment

Caching these 8 bytes (including padding) reduces the number of times
that anv_cmd_buffer_clear_attachments needs to loop over the pass's
attachments.
2015-08-20 10:25:04 -07:00
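
The cached fields, exactly as listed above, sketched in context (other members elided, layout illustrative).

    struct anv_render_pass {
       /* ... existing members ... */
       uint32_t num_color_clear_attachments;
       bool     has_depth_clear_attachment;
       bool     has_stencil_clear_attachment;
    };
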
Chad Versace
2387219101 vk: Use temp var in vkCreateRenderPass's attachment loop
Store the attachment in a temporary variable and
s/pass->attachments[i]/att/ .
2015-08-20 10:25:04 -07:00
Chad Versace
1c24a191cd vk: Improve memory locality of anv_render_pass
Allocate the pass's array of attachments, anv_render_pass::attachments,
in the same allocation as the pass itself.
2015-08-20 09:31:58 -07:00
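
A minimal sketch of the single-allocation layout; the allocator call and pointer bookkeeping are illustrative, not the exact driver code.

    size_t size = sizeof(struct anv_render_pass) +
                  attachment_count * sizeof(struct anv_render_pass_attachment);
    struct anv_render_pass *pass = malloc(size);
    /* The attachment array lives immediately after the pass itself. */
    pass->attachments = (struct anv_render_pass_attachment *)(pass + 1);
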
Chad Versace
4eaf90effb vk: Unhardcode an argument to sizeof
s/struct anv_subpass/pass->subpasses[0])/
2015-08-20 09:31:58 -07:00
Chad Versace
44ef4484c8 vk/meta: Add Z coord to clear vertices
For now, the Z coordinate is always 0.0. Will later be used for depth
clears.
2015-08-20 09:31:12 -07:00
Chad Versace
4aef5c62cd vk/meta: Restore all saved state in anv_cmd_buffer_restore()
anv_cmd_buffer_restore() did not restore the old
VkDynamicColorBlendState.
2015-08-20 09:30:34 -07:00
Chad Versace
9f908fcbde vk/meta: Use consistent names and types in anv_saved_state
In struct anv_saved_state, each member's type was a pointer to an Anvil
struct and each member's name was prefixed with "old" except cb_state,
which was a Vulkan handle whose name lacked "old".
2015-08-20 09:29:41 -07:00
Neil Roberts
49d9e89d00 Add mesa.icd to the .gitignore
Since 4d7e0fa8c7 this file is generated by the configure script.
Reviewed-by: Tapani Palli <tapani.palli@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

(cherry picked from commit 885762e182)
2015-08-19 14:12:31 -07:00
Chad Versace
bd0aab9a58 vk/meta: Fix dest format of vkCmdCopyImage
The source image's format was incorrectly used for both the source view
and destination view. For vkCmdCopyImage to correctly translate formats,
the destination view's format must be that of the destination image's.
2015-08-18 12:44:06 -07:00
Chad Versace
b0875aa911 vk: Assert that swap chain format is a color format 2015-08-18 12:43:57 -07:00
Chad Versace
d52822541e vk/image: Don't set anv_surface_view::offset twice
It was set twice a few lines apart, and the second setting always
overrode the first.
2015-08-18 11:48:50 -07:00
Chad Versace
e7d3a5df5a vk/meta: Use anv_format_is_color()
That is, replace !anv_format_is_depth_or_stencil() with
anv_format_is_color(). That conveys the meaning better.
2015-08-18 11:48:48 -07:00
Chad Versace
50f7bf70da vk: Add anv_format_is_color() 2015-08-18 11:48:46 -07:00
Chad Versace
6ff95bba8a vk: Add anv_format reference to anv_render_pass_attachment
Change type of anv_render_pass_attachment::format from VkFormat to const
struct anv_format*.  This eliminates the repetitive lookups into the
VkFormat -> anv_format table when looping over attachments during
anv_cmd_buffer_clear_attachments().
2015-08-17 14:08:55 -07:00
Chad Versace
5a6b2e6df0 vk/image: Simplify stencil case for anv_image_create()
Stop creating a temporary VkImageCreateInfo with overridden
format=VK_FORMAT_S8_UINT. Instead, just pass the format override
directly to anv_image_make_surface().
2015-08-17 14:08:55 -07:00
Chad Versace
a9c36daa83 vk/formats: Add global pointer to anv_format for S8_UINT
Stencil formats are often a special case. To reduce the number of lookups
into the VkFormat-to-anv_format translation table when working with
stencil, expose the table's entry for VK_FORMAT_S8_UINT as global
variable anv_format_s8_uint.
2015-08-17 14:08:55 -07:00
Chad Versace
60c4ac57f2 vk: Add anv_format reference to anv_surface_view
Change type of anv_surface_view::format from VkFormat to const struct
anv_format*. This reduces the number of lookups in the VkFormat ->
anv_format table.
2015-08-17 14:08:55 -07:00
Chad Versace
c11094ec9a vk: Pass anv_format to anv_fill_buffer_surface_state()
This moves the translation of VkFormat to anv_format from
anv_fill_buffer_surface_state() to its caller.

A prep commit to reduce more VkFormat -> anv_format translations.
2015-08-17 14:08:55 -07:00
Chad Versace
ded736f16a vk: Add anv_format reference to anv_image
Change type of anv_image::format from VkFormat to const struct
anv_format*. This reduces the number of lookups in the VkFormat ->
anv_format table.
2015-08-17 14:08:55 -07:00
Chad Versace
4ae42c83ec vk: Store the original VkFormat in anv_format
Store the original VkFormat as anv_format::vk_format. This will be used
to reduce format indirection, such as lookups into the VkFormat ->
anv_format translation table.
2015-08-17 14:07:44 -07:00
Jason Ekstrand
e39e1f4d24 vk: Update .gitignore for the autogenerated spirv changes 2015-08-17 11:47:25 -07:00
Kristian Høgsberg Kristensen
aac6f7c3bb vk: Drop aub dumper and PCI ID override feature
These are now available in intel_aubdump from intel-gpu-tools.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-17 11:41:19 -07:00
Kristian Høgsberg Kristensen
6d09d0644b vk: Use anv_image_create() for creating dmabuf VkImage
We need to make sure we use the VkImage infrastructure for creating
dmabuf images.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-17 11:41:19 -07:00
Jason Ekstrand
0deae66eb1 vk: Add an _autogen suffix to autogenerated spirv file names
This prevents make from stomping on nir_spirv.h
2015-08-17 11:40:16 -07:00
Jason Ekstrand
6a7ca4ef2c Merge remote-tracking branch 'mesa-public/master' into vulkan 2015-08-17 11:25:03 -07:00
Jason Ekstrand
b4c02253c4 vk: Add four unit tests for our lock-free data-structures 2015-08-14 17:04:39 -07:00
Jason Ekstrand
16c5b9f4ed vk: Build a version of the driver for linking into unit tests 2015-08-14 17:04:39 -07:00
Kristian Høgsberg Kristensen
30d82136bb vk: Update generated headers
This update brings usable IVB/HSW RENDER_SURFACE_STATE structs
and adds more float fields that we previously failed to
recognize.
2015-08-12 21:05:32 -07:00
Kristian Høgsberg Kristensen
9564dd37a0 vk: Query aperture size up front in anv_physical_device_init()
We already query the device in various ways here and we can just also
get the aperture size.  This avoids keeping an extra DRM fd open
during the lifetime of the driver.

Also, we need to use explicit 64-bit types for the aperture size, not
size_t.
2015-08-10 17:18:55 -07:00
Kristian Høgsberg Kristensen
8605ee60e0 vk: Share upload logic and add size assert
This lets us hit an assert if we exceed the block pool size instead of
GPU hanging.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-10 17:17:45 -07:00
Jason Ekstrand
6757e2f75c vk/cmd_buffer: Allow for null VkCmdPool's 2015-08-04 14:01:08 -07:00
Kristian Høgsberg Kristensen
4b097d73e6 vk: Call anv_batch_emit_dwords() up front in anv_batch_emit()
This avoids putting a memory barrier between the template struct and
the pack function, which generates much better code.
2015-08-03 15:38:14 -07:00
Kristian Høgsberg Kristensen
fbb119061e vk: Update generated headers
This adds zeroing of reserved blocks of dwords and removes an
instruction definition.
2015-08-03 15:21:27 -07:00
Jason Ekstrand
facf587dea vk/allocator: Solve a data race in anv_block_pool
The anv_block_pool data structure suffered from the exact same race as the
state pool.  Namely, that the uniqueness of the blocks handed out depends
on the next_block value increasing monotonically.  However, this invariant
did not hold thanks to our block "return" concept.
2015-08-03 01:19:34 -07:00
Jason Ekstrand
5e5a783530 vk: Add and use an anv_block_pool_size() helper 2015-08-03 01:18:09 -07:00
Jason Ekstrand
56ce219493 vk/allocator: Make block_pool_grow take and return a size
It takes the old size as an argument and returns the new size as the return
value.  On error, it returns a size of 0.
2015-08-03 01:06:45 -07:00
Jason Ekstrand
fd64598462 vk/allocator: Fix a data race in the state pool
The previous algorithm had a race because of the way we were using
__sync_fetch_and_add for everything.  In particular, the concept of
"returning" over-allocated states in the "next > end" case was completely
bogus.  If too many threads were hitting the state pool at the same time,
it was possible to have the following sequence:

A: Get an offset (next == end)
B: Get an offset (next > end)
A: Resize the pool (now next < end by a lot)
C: Get an offset (next < end)
B: Return the over-allocated offset
D: Get an offset

in which case D will get the same offset as C.  The solution to this race
is to get rid of the concept of "returning" over-allocated states.
Instead, the thread that gets a new block simply sets the next and end
offsets directly and threads that over-allocate don't return anything and
just futex-wait.  Since you can only ever hit the over-allocate case if
someone else hit the "next == end" case and hasn't resized yet, you're
guaranteed that the end value will get updated and the futex won't block
forever.
2015-08-03 00:38:48 -07:00
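A rough, self-contained sketch of the scheme described above: next and end live in one 64-bit word, every thread does a single fetch-and-add, exactly one thread (the one that sees next == end) grows the pool and publishes the new end, and over-allocating threads simply futex-wait and retry. The names, the union layout, and the trivial grow function are assumptions for illustration only.

    #include <stdint.h>
    #include <limits.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <linux/futex.h>

    /* 'next' and 'end' share one 64-bit word so a fetch-and-add hands out
     * unique offsets and the grower can publish a new (next, end) pair with
     * a single atomic store. */
    union demo_pool_state {
       struct {
          uint32_t next;
          uint32_t end;
       };
       uint64_t u64;
    };

    struct demo_pool {
       union demo_pool_state state;
       uint32_t block_size;
    };

    static void
    demo_futex_wake(uint32_t *addr, int count)
    {
       syscall(SYS_futex, addr, FUTEX_WAKE, count, NULL, NULL, 0);
    }

    static void
    demo_futex_wait(uint32_t *addr, uint32_t current)
    {
       syscall(SYS_futex, addr, FUTEX_WAIT, current, NULL, NULL, 0);
    }

    /* Stand-in for the real grow: pretend the backing storage is plentiful
     * and just extend 'end' by one block. */
    static uint32_t
    demo_pool_grow(struct demo_pool *pool, uint32_t old_end)
    {
       return old_end + pool->block_size;
    }

    static uint32_t
    demo_pool_alloc(struct demo_pool *pool)
    {
       union demo_pool_state state, grown;

       while (1) {
          /* Adding block_size to the union bumps only 'next' (the pool stays
           * well under 4 GB, so it never carries into 'end'). */
          state.u64 = __sync_fetch_and_add(&pool->state.u64, pool->block_size);

          if (state.next < state.end) {
             /* 'end' is always a whole number of blocks, so the block we
              * just claimed fits entirely inside the pool. */
             return state.next;
          } else if (state.next == state.end) {
             /* Exactly one thread sees this; it grows the pool, sets next
              * and end directly, and wakes everyone who over-allocated. */
             grown.next = state.next + pool->block_size;
             grown.end = demo_pool_grow(pool, state.end);
             __atomic_store_n(&pool->state.u64, grown.u64, __ATOMIC_RELEASE);
             demo_futex_wake(&pool->state.end, INT_MAX);
             return state.next;
          } else {
             /* Over-allocated: never "return" the offset; wait for the
              * grower to bump 'end', then retry from the top. */
             demo_futex_wait(&pool->state.end, state.end);
          }
       }
    }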
Jason Ekstrand
481122f4ac vk/allocator: Make a few things more consistent 2015-08-03 00:35:19 -07:00
Jason Ekstrand
e65953146c vk/allocator: Use memory pools rather than (MALLOC|FREE)LIKE
We have pools, so we should be using them.  Also, I think this will help
keep valgrind from getting confused when we have to end up fighting with
system allocations such as those from malloc/free and mmap/munmap.
2015-07-31 10:38:28 -07:00
Jason Ekstrand
1920ef9675 vk/allocator: Add an anv_state_pool_finish function
Currently this is a no-op but it gives us a place to put finalization
things in the future.
2015-07-31 10:38:28 -07:00
Jason Ekstrand
930598ad56 vk/instance: valgrind-guard client-provided allocations 2015-07-31 10:38:23 -07:00
Jason Ekstrand
e40bdcef1f vk/device: Add anv_instance_alloc/free helpers
This way we can more consistently alloc/free the device and it will provide
us a better place to put valgrind hooks in the next patch
2015-07-31 10:14:17 -07:00
Jason Ekstrand
0f050aaa15 vk/device: Mark newly allocated memory as undefined for valgrind
This way valgrind still works even if the client gives us memory that has
been initialized or re-uses memory for some reason.
2015-07-31 09:44:42 -07:00
Jason Ekstrand
1f49a7d9fc vk/batch_chain: Decrement num_relocs instead of incrementing it 2015-07-31 09:11:47 -07:00
Jason Ekstrand
220a01d525 vk/batch_chain: Compute secondary exec mode after finishing the bo
Figuring out whether or not to do a copy requires knowing the length of the
final batch_bo.  This gets set by anv_batch_bo_finish so we have to do it
afterwards.  Not sure how this was even working before.
2015-07-31 08:52:30 -07:00
Jason Ekstrand
26ba0ad54d vk: Re-name command buffer implementation files
Previously, the command buffer implementation was split between
anv_cmd_buffer.c and anv_cmd_emit.c.  However, this naming convention was
confusing because none of the Vulkan entrypoints for anv_cmd_buffer were
actually in anv_cmd_buffer.c.  This changes it so that anv_cmd_buffer.c is
what you think it is and the internals are in anv_batch_chain.c.
2015-07-30 15:00:42 -07:00
Jason Ekstrand
e379cd9a0e vk/cmd_buffer: Add a simple command pool implementation 2015-07-30 14:55:49 -07:00
Jason Ekstrand
4c2a182a36 vk/cmd_buffer: Add support for zero-copy batch chaining 2015-07-30 14:22:17 -07:00
Jason Ekstrand
21004f23bf vk: Add initial support for secondary command buffers 2015-07-30 11:36:48 -07:00
Jason Ekstrand
5aee803b97 vk/cmd_buffer: Split batch chaining into a helper function 2015-07-30 11:34:58 -07:00
Jason Ekstrand
0c4a2dab7e vk/device: Make BATCH_SIZE a global #define 2015-07-30 11:34:09 -07:00
Jason Ekstrand
ace093031d vk/cmd_buffer: Add functions for cloning a list of anv_batch_bo's
We'll need this to implement secondary command buffers.
2015-07-30 11:32:27 -07:00
Jason Ekstrand
7af67e085f vk/reloc_list: Actually set the new length in reloc_list_grow 2015-07-30 11:29:55 -07:00
Jason Ekstrand
f15be18c92 util/list: Add list splicing functions
This adds functions for splicing one list into another.  These have
more-or-less the same API as the kernel list splicing functions.
2015-07-30 11:28:22 -07:00
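A tiny usage sketch, assuming the new helpers follow the kernel's shape (list_splice(src, dst) prepends src's entries to dst and leaves src itself untouched); check util/list.h for the exact names and semantics.

    #include "util/list.h"

    /* Move every node from 'from' onto the front of 'to'. */
    static void
    demo_adopt_list(struct list_head *to, struct list_head *from)
    {
       list_splice(from, to);    /* splice 'from' in at the head of 'to' */
       list_inithead(from);      /* leave the donor list empty but still valid */
    }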
Jason Ekstrand
e39d0b635c CLONE 2015-07-30 08:24:02 -07:00
Jason Ekstrand
82548a3aca vk/cmd_buffer: Invalidate texture cache in emit_state_base_address
Previously, the caller of emit_state_base_address was doing this.  However,
putting it directly in emit_state_base_address means that we'll never
forget the flush, at the cost of one PIPE_CONTROL at the top of every batch
(that should do nothing since the kernel just flushed for us).
2015-07-30 08:24:02 -07:00
Jason Ekstrand
56ce896d73 vk/cmd_buffer: Rename emit_batch_buffer_end to end_batch_buffer
This is more generic and doesn't imply that it emits MI_BATCH_BUFFER_END.
While we're at it, we'll move NOOP adding from bo_finish to
end_batch_buffer.
2015-07-30 08:24:02 -07:00
Jason Ekstrand
3ed9cea84d vk/cmd_buffer: Use an array to track all known anv_batch_bo objects
Instead of walking the list of batch and surface buffers, we simply keep
track of all known batch and surface buffers as we build the command
buffer.  Then we use this new list to construct the validate list.
2015-07-29 15:30:15 -07:00
Jason Ekstrand
0f31c580bf vk/cmd_buffer: Rework validate list creation
The algorithm we used previously required us to call add_bo in a particular
order to guarantee that we get the initial batch buffer as the
last element in the validate list.  The new algorithm does a recursive walk
over the buffers and then re-orders the list.  This should be much more
robust as we start to add circular dependencies in the relocations.
2015-07-29 15:16:54 -07:00
Jason Ekstrand
4fc7510a7c vk/cmd_buffer: Move emit_batch_buffer_end higher in the file 2015-07-29 12:01:08 -07:00
Jason Ekstrand
8208f01a35 vk/cmd_buffer: Store the relocation list in the anv_batch_bo struct
Before, we were doing this thing where we had one big relocation list for
the whole command buffer and each subbuffer took a chunk out of it.  Now,
we store the actual relocation list in the anv_batch_bo.  This comes at the
cost of more small allocations but makes a lot of things simpler.
2015-07-29 12:01:08 -07:00
Jason Ekstrand
7d50734240 vk/batch: Make relocs a pointer to a relocation list
Previously anv_batch.relocs was an actual relocation list.  However, this
is limiting if the implementation of the batch wants to change the
relocation list as the batch progresses.
2015-07-29 12:01:08 -07:00
Kristian Høgsberg Kristensen
fcea3e2d23 vk/headers: Update to new generated gen headers
This update fixes cases where a 48-bit address field was split into
two parts:

    __gen_address_type                           MemoryAddress;
    uint32_t                                     MemoryAddressHigh;

which causes this pack code to be generated:

   dw[1] =
       __gen_combine_address(data, &dw[1], values->MemoryAddress, dw1);

   dw[2] =
      __gen_field(values->MemoryAddressHigh, 0, 15) |
      0;

which breaks for addresses above 4G.

This update also fixes arrays of structs in commands and structs, for
example, we now have:

   struct GEN8_BLEND_STATE_ENTRY                Entry[8];

and the pack functions now write all dwords in the packet, making
valgrind happy.

Finally, we would try to pack 64 bits of blend state into a uint32_t -
that's also fixed now.
2015-07-29 11:02:33 -07:00
Jason Ekstrand
65f3d00cd6 vk/cmd_buffer: Update a comment 2015-07-29 08:33:56 -07:00
Jason Ekstrand
86a53d2880 vk/cmd_buffer: Use a doubly-linked list for batch and surface buffers
This is probably better than hand-rolling the list of buffers.
2015-07-28 17:47:59 -07:00
Jason Ekstrand
6aba52381a vk/aub: Use the data directly from the execbuf2
Previously, we were crawling through the anv_cmd_buffer datastructure to
pull out batch buffers and things.  This meant that every time something in
anv_cmd_buffer changed, we broke aub dumping.  However, aub dumping should
just dump the stuff the kernel knows about so we really don't need to be
crawling driver internals.
2015-07-28 16:53:45 -07:00
Jason Ekstrand
3c2743dcd1 vk/cmd_buffer: Pull the execbuf stuff into a substruct 2015-07-27 16:37:09 -07:00
Jason Ekstrand
4ced8650d4 vk/cmd_buffer: Move the remaining entrypoints into cmd_emit.c 2015-07-27 15:14:31 -07:00
Jason Ekstrand
d4c249364d vk/cmd_buffer: Move the re-emission of STATE_BASE_ADDRESS to the flushing code
This used to happen magically in cmd_buffer_new_surface_state_bo.  However,
according to Ken, STATE_BASE_ADDRESS is very gen-specific so we really
shouldn't have it in the generic data-structure code.
2015-07-27 15:05:06 -07:00
Jason Ekstrand
117d74b4e2 vk/cmd_buffer: Factor the guts of CmdBufferEnd into two helpers 2015-07-27 14:52:16 -07:00
Jason Ekstrand
8fb6405718 vk/cmd_buffer: Factor the guts of (Create|Reset|Destroy)CmdBuffer into helpers 2015-07-27 14:23:56 -07:00
Jason Ekstrand
80ad578c4e vk/private.h: Re-arrange and better comment anv_cmd_buffer 2015-07-27 12:40:43 -07:00
Jason Ekstrand
50e86b5777 vk: Actually advertise 0.138.1 at runtime 2015-07-23 10:44:27 -07:00
Jason Ekstrand
f884b500d0 vk/vulkan.h: Bump to the version 0.138.1 header
This doesn't actually require any implementation changes but it does change
an enum so it is ABI-incompatible with 0.138.0.
2015-07-23 10:38:22 -07:00
Jason Ekstrand
e99773badd vk: Add two more valgrind checks 2015-07-23 08:57:54 -07:00
Jason Ekstrand
b1fcc30ff0 vk/meta: Destroy shader modules 2015-07-22 17:51:26 -07:00
Jason Ekstrand
3460e6cb2f vk/device: Finish the scratch block pool on device destruction 2015-07-22 17:51:14 -07:00
Jason Ekstrand
867f6cb90c vk: Add a FreeDescriptorSets function 2015-07-22 17:33:09 -07:00
Jason Ekstrand
c9dc1f4098 vk/pipeline: Be more sloppy about shader entrypoint names
The CTS passes in NULL names right now.  It's not too hard to support that
as just "main".  With this, and a patch to vulkancts, we now pass all 6
tests.
2015-07-22 15:26:56 -07:00
Chad Versace
2c2233e328 vk: Prefix most filenames with anv
Jason started the task by creating anv_cmd_buffer.c and anv_cmd_emit.c.
This patch finishes the task by renaming all other files except
gen*_pack.h and glsl_scraper.py.
2015-07-17 20:25:38 -07:00
Chad Versace
f70d079854 vk/image: Remove unneeded data from anv_buffer_view
This completes the FINISHME to trim unneeded data from anv_buffer_view.

A VkExtent3D doesn't make sense for a VkBufferView.
So remove the member anv_surface_view::extent, and push it up to the two
objects that actually need it, anv_image_view and anv_attachment_view.
2015-07-17 14:48:23 -07:00
Chad Versace
194b77d426 vk: Document members of anv_surface_view 2015-07-17 14:39:05 -07:00
Chad Versace
169251bff0 vk: Remove more raw casts
This removes nearly all the remaining raw Anvil<->Vulkan casts from the
C source files.  (File compiler.cpp still contains many raw casts, and
I plan on ignoring that).

As far as I can tell, the only remaining raw casts are:
    anv_attachment_view -> anv_depth_stencil_view
    anv_attachment_view -> anv_color_attachment_view
2015-07-17 14:32:22 -07:00
Chad Versace
fc3838376b vk/image: Add braces around multi-line ifs 2015-07-17 13:38:09 -07:00
Connor Abbott
b2cfd85060 nir/spirv: don't declare builtin blocks
They aren't used, and the backend was barfing on them. Also, remove a
hack in in compiler.cpp now that they're gone.
2015-07-16 11:04:22 -07:00
Connor Abbott
b599735be4 nir/spirv: add support for loading UBO's
We directly emit UBO load intrinsics based on the offset information
handed to us from SPIR-V.
2015-07-16 10:54:09 -07:00
Connor Abbott
513ee7fa48 nir/types: add more nir_type_is_xxx() wrappers 2015-07-15 21:58:32 -07:00
Connor Abbott
9fa0989ff2 nir: move to two-level binding model for UBO's
The GLSL layer above is still hacky, so we're really just moving the
hack into GLSL-to-NIR. I'd rather not go all the way and make GLSL
support the Vulkan binding model too, since presumably we'll be
switching to SPIR-V exclusively, and so working on proper GLSL support
will be a waste of time. For now, doing this keeps it working as we add
SPIR-V->NIR support though.
2015-07-15 17:18:48 -07:00
Chad Versace
5520221118 vk: Remove unneeded vulkan-138.h 2015-07-15 17:16:07 -07:00
Chad Versace
73a8f9543a vk: Bump vulkan.h version to 0.138 2015-07-15 17:16:07 -07:00
Chad Versace
55781f8d02 vk/0.138: Update VkResult values 2015-07-15 17:16:07 -07:00
Chad Versace
756d8064c1 vk/0.132: Do type-safety 2015-07-15 17:16:07 -07:00
Jason Ekstrand
927f54de68 vk/cmd_buffer: Move batch buffer padding to anv_batch_bo_finish() 2015-07-15 17:11:04 -07:00
Jason Ekstrand
9c0db9d349 vk/cmd_buffer: Rename bo_count to exec2_bo_count 2015-07-15 16:56:29 -07:00
Jason Ekstrand
6037b5d610 vk/cmd_buffer: Add a helper for allocating dynamic state
This matches what we do for surface state and makes the dynamic state pool
more opaque to things that need to get dynamic state.
2015-07-15 16:56:29 -07:00
Jason Ekstrand
7ccc8dd24a vk/private.h: Move cmd_buffer functions to near the cmd_buffer struct 2015-07-15 16:56:29 -07:00
Jason Ekstrand
d22d5f25fc vk: Split command buffer state into its own structure
Everything else in anv_cmd_buffer is the actual guts of the datastructure.
2015-07-15 16:56:29 -07:00
Jason Ekstrand
da4d9f6c7c vk: Move most of the anv_Cmd related stuff to its own file 2015-07-15 16:56:28 -07:00
Jason Ekstrand
d862099198 vk: Pull the guts of anv_cmd_buffer into its own file 2015-07-15 16:56:28 -07:00
Chad Versace
498ae009d3 vk/glsl: Replace raw casts
Needed for upcoming type-safety changes.
2015-07-15 15:51:37 -07:00
Chad Versace
6f140e8af1 vk/meta: Remove raw casts
Needed for upcoming type-safety changes.
2015-07-15 15:51:37 -07:00
Chad Versace
badbf0c94a vk/x11: Remove raw casts
The raw casts in the WSI functions will break the build when the
type-safety changes arrive.
2015-07-15 15:49:10 -07:00
Chad Versace
61a4bfe253 vk: Delete vkDbgSetObjectTag()
Because VkObject is going away.
2015-07-15 15:34:20 -07:00
Jason Ekstrand
e1c78ebe53 vk/device: Remove unneeded checks for NULL 2015-07-15 15:22:32 -07:00
Jason Ekstrand
f4748bff59 vk/device: Provide proper NULL handling in anv_device_free
The Vulkan spec does not specify that the free function provided to
CreateInstance must handle NULL properly so we do it in the wrapper.  If
this ever changes in the spec, we can delete the extra 2 lines.
2015-07-15 15:22:32 -07:00
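A hedged sketch of the wrapper behavior described above; the struct members standing in for the client's allocation callbacks are assumptions, not the driver's actual fields.

    #include <stddef.h>

    struct demo_instance {
       void *alloc_user_data;                          /* assumed field names */
       void (*pfn_free)(void *user_data, void *mem);
    };

    struct demo_device {
       struct demo_instance *instance;
    };

    void
    demo_device_free(struct demo_device *device, void *mem)
    {
       /* The client's free callback is not required to accept NULL,
        * so filter it out here in the wrapper. */
       if (mem == NULL)
          return;

       device->instance->pfn_free(device->instance->alloc_user_data, mem);
    }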
Chad Versace
4c8e1e5888 vk: Stop internally calling anv_DestroyObject()
Replace each anv_DestroyObject() with anv_DestroyFoo().

Let vkDestroyObject() live for a while longer for Crucible's sake.
2015-07-15 15:11:16 -07:00
Chad Versace
f5ad06eb78 vk: Fix vkDestroyObject dispatch for VkRenderPass
It called anv_device_free() instead of anv_DestroyRenderPass().
2015-07-15 15:07:41 -07:00
Chad Versace
188f2328de vk: Fix vkCreate/DestroyRenderPass
While updating vkDestroyObject, I discovered that vkDestroyPass reliably
crashes. That hasn't been an issue yet, though, because it is never
called.

In vkCreateRenderPass:
    - Don't allocate empty attachment arrays.
    - Ensure that pointers to empty attachment arrays are NULL.
    - Store VkRenderPassCreateInfo::subpassCount as
      anv_render_pass::subpass_count.

In vkDestroyRenderPass:
    - Fix loop bounds: s/attachment_count/subpass_count/
    - Don't call anv_device_free on null pointers.
2015-07-15 15:07:41 -07:00
Chad Versace
c6270e8044 vk: Refactor create/destroy code for anv_descriptor_set
Define two new functions:
    anv_descriptor_set_create
    anv_descriptor_set_destroy
2015-07-15 14:31:22 -07:00
Chad Versace
365d80a91e vk: Replace some raw casts with safe casts
That is, replace some instances of
    (VkFoo) foo
with
    anv_foo_to_handle(foo)
2015-07-15 14:00:21 -07:00
Chad Versace
7529e7ce86 vk: Correct anv_CreateShaderModule's prototype
s/VkShader/VkShaderModule/

:sigh: I look forward to type-safety.
2015-07-15 13:59:47 -07:00
Chad Versace
8213be790e vk: Define struct anv_image_view, anv_buffer_view
Follow the pattern of anv_attachment_view. We need these structs to
implement the type-safety that arrived in the 0.132 header.
2015-07-15 12:19:29 -07:00
Chad Versace
43241a24bc vk/meta: Fix declared type of a shader module
s/VkShader/VkShaderModule/

I'm looking forward to a type-safe vulkan.h ;)
2015-07-15 11:49:37 -07:00
Chad Versace
94e473c993 vk: Remove struct anv_object
Trivial removal because vkDestroyObject() no longer uses it.
2015-07-15 11:29:43 -07:00
Jason Ekstrand
e375f722a6 vk/device: More documentation on surface state flushing 2015-07-15 11:09:02 -07:00
Connor Abbott
9aabe69028 vk/device: explain why a flush is necessary
Jason found this from experimenting, but the docs give a reasonable
explanation of why it's necessary.
2015-07-14 23:03:19 -07:00
Chad Versace
5f46c4608f vk: Fix indentation of anv_dynamic_cb_state 2015-07-14 18:19:10 -07:00
Chad Versace
0eeba6b80c vk: Add finishmes for VkDescriptorPool
VkDescriptorPool is a stub object. As a consequence, it's impossible to
free descriptor set memory.
2015-07-14 18:19:00 -07:00
Jason Ekstrand
2b5a4dc5f3 vk: Add vulkan-138 and remove vulkan-0.132
Now, 138 is the target and not 132.  Once object destruction is finished,
we can delete 138 as it will be identical to vulkan.h
2015-07-14 17:54:13 -07:00
Jason Ekstrand
1f658bed70 vk/device: Add stub support for command pools
Real support isn't really that far away.  We just need a data structure
with a linked list and a few tests.
2015-07-14 17:40:00 -07:00
Jason Ekstrand
ca7243b54e vk/vulkan.h: Add the stuff for cross-queue resource sharing
We only have one queue, so this is currently a no-op on our implementation.
2015-07-14 17:20:50 -07:00
Jason Ekstrand
553b4434ca vk/vulkan.h: Add a couple of size fields for specialization constants 2015-07-14 17:12:39 -07:00
Jason Ekstrand
e5db209d54 vk/vulkan.h: Move around buffer image granularities 2015-07-14 17:10:37 -07:00
Jason Ekstrand
c7fcfebd5b vk: Add stubs for all the sparse resource stuff 2015-07-14 17:06:11 -07:00
Jason Ekstrand
2a9136feb4 vk/image: Add a stub for the new ImageFormatProperties function
This lets the client query about things like multisample.  We don't do
multisample right now, so I'll let Chad deal with that when he gets to it.
2015-07-14 17:05:30 -07:00
Jason Ekstrand
2c4dc92f40 vk/vulkan.h: Rename FormatInfo to FormatProperties 2015-07-14 17:04:46 -07:00
Jason Ekstrand
d7f44852be vk/vulkan.h: Re-order some #define's 2015-07-14 16:41:39 -07:00
Jason Ekstrand
1fd3bc818a vk/vulkan.h: Rename a function parameter 2015-07-14 16:39:01 -07:00
Jason Ekstrand
2e2f48f840 vk: Remove abreviations 2015-07-14 16:34:31 -07:00
Jason Ekstrand
02db21ae11 vk: Add the new extension/layer enumeration entrypoints 2015-07-14 16:11:21 -07:00
Jason Ekstrand
a463eacb8f vk/vulkan.h: Change maxAnisotropy to a float 2015-07-14 15:04:11 -07:00
Jason Ekstrand
98957b18d2 vk/vulkan.h: Add the VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT flag 2015-07-14 15:03:39 -07:00
Jason Ekstrand
a35811d086 vk/vulkan.h: Rename a couple of function parameters
No functional change.
2015-07-14 15:03:01 -07:00
Jason Ekstrand
55723e97f1 vk: Split the memory requirements/binding functions 2015-07-14 14:59:39 -07:00
Jason Ekstrand
ccb2e5cd62 vk: Make barriers more precise (rev. 133) 2015-07-14 14:50:35 -07:00
Jason Ekstrand
30445f8f7a vk: Split the dynamic state binding function into one per state 2015-07-14 14:26:10 -07:00
Jason Ekstrand
d2c0870ff3 vk/vulkan.h: Rename a function parameter to match 132 2015-07-14 14:11:04 -07:00
Jason Ekstrand
8478350992 vk: Implement Multipass 2015-07-14 11:37:14 -07:00
Jason Ekstrand
68768c40be vk/vulkan.h: Re-arrange some enums and definitions in preparation for 131 2015-07-14 11:32:15 -07:00
Chad Versace
66cbb7f76d vk/0.132: Add vkDestroyRenderPass() 2015-07-14 11:21:31 -07:00
Chad Versace
6d0ed38db5 vk/0.132: Add vkDestroy*View()
vkDestroyColorAttachmentView
vkDestroyDepthStencilView

These functions are not in the 0.132 header, but adding them will help
us attain the type-safety API updates more quickly.
2015-07-14 11:19:22 -07:00
Chad Versace
1ca611cbad vk/0.132: Add vkDestroyCommandBuffer() 2015-07-14 11:11:41 -07:00
Chad Versace
6eec0b186c vk/0.132: Add vkDestroyImageView()
Just declare it in vulkan.h. Jason defined the function earlier
in image.c.
2015-07-14 11:09:14 -07:00
Chad Versace
4b2c5a98f0 vk/0.132: Add vkDestroyBufferView()
Just declare it in vulkan.h. Jason already defined the function
earlier in vulkan.c.
2015-07-14 11:06:57 -07:00
Chad Versace
08f7731f67 vk/0.132: Add vkDestroyFramebuffer() 2015-07-14 10:59:30 -07:00
Chad Versace
0c8456ef1e vk/0.132: Add vkDestroyDynamicDepthStencilState() 2015-07-14 10:54:51 -07:00
Chad Versace
b29c929e8e vk/0.132: Add vkDestroyDynamicColorBlendState() 2015-07-14 10:52:45 -07:00
Chad Versace
5e1737c42f vk/0.132: Add vkDestroyDynamicRasterState() 2015-07-14 10:51:08 -07:00
Chad Versace
d80fea1af6 vk/0.132: Add vkDestroyDynamicViewportState() 2015-07-14 10:42:45 -07:00
Chad Versace
9250e1e9e5 vk/0.132: Add vkDestroyDescriptorPool() 2015-07-14 10:38:22 -07:00
Chad Versace
f925ea31e7 vk/0.132: Add vkDestroyDescriptorSetLayout() 2015-07-14 10:36:49 -07:00
Chad Versace
ec5e2f4992 vk/0.132: Add vkDestroySampler() 2015-07-14 10:34:00 -07:00
Chad Versace
a684198935 vk/0.132: Add vkDestroyPipelineLayout() 2015-07-14 10:29:47 -07:00
Chad Versace
6e5ab5cf1b vk/0.132: Add vkDestroyPipeline() 2015-07-14 10:26:17 -07:00
Chad Versace
114015321e vk/0.132: Add vkDestroyPipelineCache() 2015-07-14 10:19:27 -07:00
Chad Versace
cb57bff36c vk/0.132: Add vkDestroyShader() 2015-07-14 10:16:22 -07:00
Chad Versace
8ae8e14ba7 vk/0.132: Add vkDestroyShaderModule() 2015-07-14 10:13:09 -07:00
Chad Versace
dd67c134ad vk/0.132: Add vkDestroyImage()
We only need to add it to vulkan.h because Jason defined the function
earlier in image.c.
2015-07-14 10:13:00 -07:00
Chad Versace
e18377f435 vk/0.132: Dispatch vkDestroyObject to new destructors
Oops. My recent commits added new destructors, but forgot to teach
vkDestroyObject about them. They are:
  vkDestroyFence
  vkDestroyEvent
  vkDestroySemaphore
  vkDestroyQueryPool
  vkDestroyBuffer
2015-07-14 09:58:22 -07:00
Chad Versace
e93b6d8eb1 vk/0.132: Add vkDestroyBuffer() 2015-07-14 09:47:45 -07:00
Chad Versace
584cb7a16f vk/0.132: Add vkDestroyQueryPool() 2015-07-14 09:44:58 -07:00
Chad Versace
68c7ef502d vk/0.132: Add vkDestroyEvent() 2015-07-14 09:33:47 -07:00
Chad Versace
549070b18c vk/0.132: Add vkDestroySemaphore() 2015-07-14 09:31:34 -07:00
Chad Versace
ebb191f145 vk/0.132: Add vkDestroyFence() 2015-07-14 09:29:35 -07:00
Chad Versace
435ccf4056 vk/0.132: Rename VkDynamic*State types
sed -i -e 's/VkDynamicVpState/VkDynamicViewportState/g' \
       -e 's/VkDynamicRsState/VkDynamicRasterState/g' \
       -e 's/VkDynamicCbState/VkDynamicColorBlendState/g' \
       -e 's/VkDynamicDsState/VkDynamicDepthStencilState/g' \
       $(git ls-files include/vulkan src/vulkan)
2015-07-13 16:19:28 -07:00
Connor Abbott
ffb51fd112 nir/spirv: update to SPIR-V revision 31
This means that now the internal version of glslangValidator is
required. This includes some changes due to the sampler/texture rework,
but doesn't actually enable anything more yet. We also don't yet handle
UBO's correctly, and don't handle matrix stride and row major/column
major yet.
2015-07-13 15:01:01 -07:00
Chad Versace
45f8723f44 vk/0.132: Move VkQueryControlFlags 2015-07-13 13:09:32 -07:00
Chad Versace
180c07ee50 vk/0.132: Move VkImageAspectFlags 2015-07-13 13:08:56 -07:00
Chad Versace
4b05a8cd31 vk/0.132: Move VkCmdBufferOptimizeFlags 2015-07-13 13:08:07 -07:00
Chad Versace
f1cf55fae6 vk/0.132: Move VkWaitEvent 2015-07-13 13:06:53 -07:00
Chad Versace
3112098776 vk/0.132: Move VkCmdBufferLevel 2015-07-13 13:06:33 -07:00
Chad Versace
c633ab5822 vk/0.132: Drop VK_ATTACHMENT_STORE_OP_RESOLVE_MSAA 2015-07-13 13:05:24 -07:00
Chad Versace
8f3b2187e1 vk/0.132: Rename bool32_t -> VkBool32
sed -i 's/bool32_t/VkBool32/g' \
  $(git ls-files src/vulkan include/vulkan)
2015-07-13 13:03:36 -07:00
Chad Versace
77dcfe3c70 vk/0.132: Remove stray typedef 2015-07-13 12:58:17 -07:00
Chad Versace
601d0891a6 vk/0.132: Move VKImageUsageFlags 2015-07-13 12:48:44 -07:00
Chad Versace
829810fa27 vk/0.132: Move VkImageType and VkImageTiling 2015-07-13 11:49:56 -07:00
Chad Versace
17c8232ecf vk/0.132: Import the 0.132 header
Import it as vulkan-0.132.h.
2015-07-13 11:47:12 -07:00
Chad Versace
a158ff55f0 vk/vulkan.h: Remove headers for old API versions
Remove the temporary headers for 0.90 and 0.130.
2015-07-13 11:46:30 -07:00
Chad Versace
1c4238a8e5 vk/0.130: Bump header version to 0.130
All APIs have been updated. This eliminates the diff between the
work-in-progress header and the 0.130 header.
2015-07-10 20:06:09 -07:00
Chad Versace
f43a304dc6 vk/0.130: Update vkAllocMemory to use VkMemoryType 2015-07-10 17:35:52 -07:00
Chad Versace
df2a013881 vk/0.130: Implement vkGetPhysicalDeviceMemoryProperties() 2015-07-10 17:35:52 -07:00
Chad Versace
c7f512721c vk/gem: Change signature of anv_gem_get_aperture()
Replace the anv_device parameter with anv_physical_device, because this needs
querying before vkCreateDevice.
2015-07-10 17:35:52 -07:00
Chad Versace
8cda3e9b1b vk/device: Add member anv_physical_device::fd
During anv_physical_device_init(), we opened the DRM device to do some
queries, then promptly closed it. Now we keep it open for the lifetime
of the anv_physical_device so that we can query it some more during
vkGetPhysicalDevice*Properties() [which will happen in follow-up
commits].
2015-07-10 17:35:52 -07:00
Chad Versace
4422bd4cf6 vk/device: Add func anv_physical_device_finish()
Because in a follow-up patch I need to do some non-trival teardown on
anv_physical_device. Currently, however, anv_physical_device_finish() is
a no-op that's just called in the right place.

Also, rename function fill_physical_device -> anv_physical_device_init
for symmetry.
2015-07-10 17:35:52 -07:00
Jason Ekstrand
7552e026da vk/device: Add an explicit destructor for RenderPass 2015-07-10 12:33:04 -07:00
Jason Ekstrand
8b342b39a3 vk/image: Add an explicit DestroyImage function 2015-07-10 12:30:58 -07:00
Jason Ekstrand
b94b8dfad5 vk/image: Add explicit constructors for buffer/image view types 2015-07-10 12:26:31 -07:00
Jason Ekstrand
18340883e3 nir: Add C++ versions of NIR_(SRC|DEST)_INIT 2015-07-10 11:57:33 -07:00
Chad Versace
9e64a2a8e4 mesa: Fix generation of git_sha1.h.tmp for gitlinks
Don't assume that $(top_srcdir)/.git is a directory. It may be a
gitlink file [1] if $(top_srcdir) is a submodule checkout or a linked
worktree [2].

[1] A "gitlink" is a text file that specifies the real location of
    the gitdir.
[2] Linked worktrees are a new feature in Git 2.5.

Cc: "10.6, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 75784243df)
2015-07-10 11:24:25 -07:00
Jason Ekstrand
19f0a9b582 vk/query.c: Use the casting functions 2015-07-09 20:32:44 -07:00
Jason Ekstrand
6eb221c884 vk/pipeline.c: Use the casting functions 2015-07-09 20:28:08 -07:00
Jason Ekstrand
fb4e2195ec vk/formats.c: Use the casting functions 2015-07-09 20:24:17 -07:00
Jason Ekstrand
a52e208203 vk/image.c: Use the casting functions 2015-07-09 20:24:07 -07:00
Jason Ekstrand
b1de1d4f6e vk/device.c: One more use of a casting function 2015-07-09 20:23:46 -07:00
Jason Ekstrand
8739e8fbe2 vk/meta.c: Use the casting functions 2015-07-09 20:16:13 -07:00
Jason Ekstrand
92556c77f4 vk: Fix the build 2015-07-09 18:59:08 -07:00
Jason Ekstrand
098209eedf device.c: Use the cast helpers a bunch of places 2015-07-09 18:49:43 -07:00
Jason Ekstrand
73f9187e33 device.c: Use the cast helpers 2015-07-09 18:41:27 -07:00
Jason Ekstrand
7d24fab4ef vk/private.h: Add a bunch of static inline casting functions
We will need these as soon as we turn on type safety.  We might as well
define and start using them now rather than later.
2015-07-09 18:40:54 -07:00
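One illustrative pair showing the shape such helpers take while the handles are still untyped; the names follow the anv_*_from_handle / anv_*_to_handle pattern but are not copied from the header.

    #include <stdint.h>

    typedef uint64_t DemoBuffer;            /* stand-in for a pre-type-safety handle */

    struct demo_buffer { uint32_t size; };

    /* Wrap the raw casts once, in one place... */
    static inline struct demo_buffer *
    demo_buffer_from_handle(DemoBuffer handle)
    {
       return (struct demo_buffer *)(uintptr_t) handle;
    }

    static inline DemoBuffer
    demo_buffer_to_handle(struct demo_buffer *buffer)
    {
       return (DemoBuffer)(uintptr_t) buffer;
    }

    /* ...so entrypoints never cast by hand, and the casts become type-checked
     * for free once the typed 0.132 handles land. */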
Jason Ekstrand
5c49730164 vk/device.c: Fix whitespace issues 2015-07-09 18:20:28 -07:00
Jason Ekstrand
c95f9b61f2 vk/device.c: Use ANV_FROM_HANDLE a bunch of places 2015-07-09 18:20:10 -07:00
Jason Ekstrand
335e88c8ee vk/vulkan.h: Add the pEnabledFeatures field to DeviceCreateInfo 2015-07-09 16:21:31 -07:00
Jason Ekstrand
34871cf7f3 vk/vulkan.h: Change the MsCreateInfo structure to the 130 version
We do nothing with it at the moment, so this is a no-op.
2015-07-09 16:19:54 -07:00
Jason Ekstrand
8c2c37fae7 vk: Remove the old GetPhysicalDeviceInfo call 2015-07-09 16:14:37 -07:00
Jason Ekstrand
1f907011a3 vk: Add the new PhysicalDeviceQueue queries 2015-07-09 16:14:37 -07:00
Jason Ekstrand
977a469bce vk: Support GetPhysicalDeviceProperties 2015-07-09 16:14:37 -07:00
Jason Ekstrand
65e0b304b6 vk: Add support for GetPhysicalDeviceLimits 2015-07-09 16:14:37 -07:00
Jason Ekstrand
f6d51f3fd3 vk: Add GetPhysicalDeviceFeatures 2015-07-09 16:14:37 -07:00
Chad Versace
5b75dffd04 vk/device: Fix vkEnumeratePhysicalDevices()
The Vulkan spec says that pPhysicalDeviceCount is an out parameter if
pPhysicalDevices is NULL; otherwise it's an inout parameter.

Mesa incorrectly treated it unconditionally as an inout parameter, which
could have led to reading uninitialized data.
2015-07-09 15:53:21 -07:00
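A sketch of the corrected parameter handling (today's prototype, the single-GPU assumption, and the helper returning the handle are all illustrative):

    #include <vulkan/vulkan.h>

    /* Assumed helper: returns the one physical device behind this instance. */
    VkPhysicalDevice demo_get_physical_device(VkInstance instance);

    VkResult
    demo_EnumeratePhysicalDevices(VkInstance instance,
                                  uint32_t *pPhysicalDeviceCount,
                                  VkPhysicalDevice *pPhysicalDevices)
    {
       if (pPhysicalDevices == NULL) {
          /* Pure out parameter: never read it, just report the device count. */
          *pPhysicalDeviceCount = 1;
          return VK_SUCCESS;
       }

       /* In/out parameter: read the caller's capacity, write back how many
        * entries were actually filled. */
       if (*pPhysicalDeviceCount >= 1) {
          pPhysicalDevices[0] = demo_get_physical_device(instance);
          *pPhysicalDeviceCount = 1;
       } else {
          *pPhysicalDeviceCount = 0;
       }

       return VK_SUCCESS;
    }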
Chad Versace
fa915b661d vk/device: Move device enumeration to vkEnumeratePhysicalDevices()
Don't enumerate devices in vkCreateInstance(). That's where global,
device-independent initialization should happen. Move device enumeration
to the more logical location, vkEnumeratePhysicalDevices().
2015-07-09 15:41:17 -07:00
Chad Versace
c34d314db3 vk/device: Be consistent about path to DRM device
Function fill_physical_device() has a 'path' parameter, and struct
anv_physical_device has a 'path' member. Sometimes these are used;
sometimes hardcoded "/dev/dri/renderD128" is used instead.

Be consistent. Hardcode "/dev/dri/renderD128" in exactly one location,
during initialization of the physical device.
2015-07-09 15:27:26 -07:00
Connor Abbott
cff06bbe7d vk/compiler: create an empty parameters list
Prevents problems when initializing the sanity_param_count.
2015-07-09 14:29:23 -04:00
Connor Abbott
3318a86d12 nir/spirv: fix wrong writemask for ALU operations 2015-07-09 14:28:39 -04:00
Connor Abbott
b8fedc19f5 nir/spirv: fix memory context for builtin variable
Fixes valgrind errors with func.depthstencil.basic.
2015-07-08 22:03:30 -04:00
Connor Abbott
e4292ac039 nir/spirv: zero out value array
Before values are pushed or annotated with a name, decoration, etc.,
they need to have an invalid type, NULL name, NULL decoration, etc.
ralloc zeroes everything by accident, so this wasn't an issue in
practice, but we should be explicitly zeroing it.
2015-07-08 22:03:30 -04:00
Connor Abbott
997831868f vk/compiler: create the right kind of program struct
This fixes Valgrind errors and gets all the tests to pass with
--use-spir-v.
2015-07-08 22:03:30 -04:00
Connor Abbott
a841e2c747 vk/compiler: mark inputs/outputs as read/written
This doesn't handle inputs and outputs larger than a vec4, but we plan
to add a varying splitting/packing pass to handle those anyway.
2015-07-08 22:03:30 -04:00
Jason Ekstrand
8640dc12dc vk/vulkan.h: Copy the VkStructureType enum from version 130
We now have the exact same structs which require pType.
2015-07-08 17:45:52 -07:00
Jason Ekstrand
5a4ebf6bc1 vk: Move to the new pipeline creation API's 2015-07-08 17:30:18 -07:00
Chad Versace
4fcb32a17d vk/0.130: Remove VkImageViewCreateInfo::minLod
It's now set solely through VkSampler.
2015-07-08 14:48:22 -07:00
Jason Ekstrand
367b9ba78f vk/vulkan.h: Move renderPassContinue from GraphicsBeginInfo to BeginInfo 2015-07-08 14:37:30 -07:00
Jason Ekstrand
d29ec8fa36 vk/vulkan.h: Update to the new UpdateDescriptorSets api 2015-07-08 14:24:56 -07:00
Jason Ekstrand
c8577b5f52 vk: Add a macro for creating anv variables from vulkan handles
This is very helpful for doing the mass of casts at the top of a
function.  It will also be invaluable when we get type safety in the API.
2015-07-08 14:24:14 -07:00
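The macro pattern being described, sketched with illustrative names; it pairs a handle with the matching *_from_handle helper so an entrypoint can declare each local in one line.

    /* Declares 'struct demo_device *device' from a raw handle. */
    #define DEMO_FROM_HANDLE(type, name, handle) \
       struct type *name = type##_from_handle(handle)

    /* Typical use at the top of an entrypoint (hypothetical locals):
     *    DEMO_FROM_HANDLE(demo_device, device, _device);
     *    DEMO_FROM_HANDLE(demo_buffer, buffer, _buffer);
     */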
Chad Versace
ccb27a002c vk/0.130 Update VkObjectType values
Don't import any new enum tokens from the 0.130 header. Just update the
values of existing enums. This reduces the diff by about 16 lines.
2015-07-08 12:53:49 -07:00
Chad Versace
8985dd15a1 vk/0.130: Remove VkDescriptorUpdateMode
It is not used anywhere.
2015-07-08 12:51:46 -07:00
Chad Versace
e02dfa309a vk/0.130: Remove VK_DEVICE_CREATE_MULTI_DEVICE_IQ_MATCH_BIT 2015-07-08 12:49:48 -07:00
Chad Versace
e9034ed875 vk/0.130: Update vkCmdBlitImage signature
Add VkTexFilter param. Ignored for now.
2015-07-08 12:47:48 -07:00
Jason Ekstrand
aae45ab583 vk/vulkan.h: Add packing parameters to BufferImageCopy 2015-07-08 11:51:34 -07:00
Chad Versace
b4ef7f354b vk/0.130: Remove msaa members of VkDepthStencilViewCreateInfo 2015-07-08 11:50:51 -07:00
Jason Ekstrand
522ab835d6 vk/vulkan.h: Move over to the new border color enums 2015-07-08 11:44:52 -07:00
Jason Ekstrand
7598329774 vk/vulkan.h: Move VkFormatProperties 2015-07-08 11:16:45 -07:00
Jason Ekstrand
52940e8fcf vk/vulkan.h: Add RenderPassBeginContents 2015-07-08 10:57:13 -07:00
Jason Ekstrand
e19d6be2a9 vk/vulkan.h: Add command buffer levels 2015-07-08 10:53:32 -07:00
Jason Ekstrand
c84f2d3b8c vk/vulkan.h: Import the VkPipeEvent enum from 130
Now, VkPipeEventFlags is back in sync with VkPipeEvent
2015-07-08 10:49:46 -07:00
Jason Ekstrand
b20cc72603 vk/vulkan.h: Remove VkFormatInfoType 2015-07-08 10:39:31 -07:00
Jason Ekstrand
8e05bbeee9 vk/vulkan.h: Update extension handling to rev 130 2015-07-08 10:38:07 -07:00
Jason Ekstrand
cc29a5f4be vk/vulkan.h: Move format quering to the physical device 2015-07-08 09:34:47 -07:00
Jason Ekstrand
719fa8ac74 vk/vulkan.h: Remove some peer opening structs and STRUCTURE_TYPE enums 2015-07-08 09:25:13 -07:00
Jason Ekstrand
fc6dcc6227 vk: Add a copy of the v90 header. 2015-07-08 09:23:29 -07:00
Jason Ekstrand
12119282e6 vk/vulkan.h: Remove an unneeded comment 2015-07-08 09:18:09 -07:00
Jason Ekstrand
3c65a1ac14 vk/vulkan.h: Remove the MemoryRange stubs and add sparse stubs 2015-07-08 09:16:48 -07:00
Jason Ekstrand
bb6567f5d1 vk/vulkan.h: Switch BindObjectMemory to a device function and remove the index 2015-07-08 09:04:16 -07:00
Jason Ekstrand
e7acdda184 vk/vulkan.h: Switch to the split ProcAddr functions in 130 2015-07-07 18:51:53 -07:00
Jason Ekstrand
db24afee2f vk/vulkan.h: Switch from GetImageSubresourceInfo to GetImageSubresourceLayout 2015-07-07 18:20:18 -07:00
Jason Ekstrand
ef8980e256 vk/vulkan.h: Switch from GetObjectInfo to GetMemoryRequirements 2015-07-07 18:16:42 -07:00
Jason Ekstrand
d9c2caea6a vk: Update memory flushing functions to 130
This involves updating the prototype for FlushMappedMemory, adding
InvalidateMappedMemoryRanges, and removing PinSystemMemory.
2015-07-07 17:22:31 -07:00
Jason Ekstrand
d5349b1b18 vk/vulkan.h: Constify the pFences parameter to ResetFences 2015-07-07 17:18:00 -07:00
Jason Ekstrand
6aa1b89457 vk/vulkan.h: Move the definitions of Create(Framebuffer|RenderPass)
This better matches the 130 header.
2015-07-07 17:13:10 -07:00
Jason Ekstrand
0ff06540ae vk: Implement the GetRenderAreaGranularity function
At the moment, we're just going to scissor clears so a granularity of 1x1
is all we need.
2015-07-07 17:11:37 -07:00
Jason Ekstrand
435b062b26 vk/vulkan.h: Add a PipelineLayout parameter to BindDescriptorSets 2015-07-07 17:06:10 -07:00
Jason Ekstrand
518ca9e254 vk/vulkan.h: Add a compareEnable parameter to SamplerCreateInfo
Our hardware doesn't actually need this, so adding it is a no-op.
2015-07-07 16:49:04 -07:00
Jason Ekstrand
672590710b vk/vulkan.h: Remove initialCount from SemaphoreCreateInfo 2015-07-07 16:42:42 -07:00
Jason Ekstrand
80046a7d54 vk/vulkan.h: Update clear color handling to 130 2015-07-07 16:37:43 -07:00
Jason Ekstrand
3e4b00d283 meta: Use the VkClearColorValue structure for the color attribute 2015-07-07 16:27:06 -07:00
Jason Ekstrand
a35fef1ab2 vk/vulkan.h: Remove the pass argument from EndRenderPass 2015-07-07 16:22:23 -07:00
Jason Ekstrand
d2ca7e24b4 vk/vulkan.h: Rename VertexInputStateInfo to VertexInputStateCreateInfo 2015-07-07 16:15:55 -07:00
Jason Ekstrand
abbb776bbe vk/vulkan.h: Remove programPointSize
Instead, we auto-detect whether or not your shader writes gl_PointSize.  If
it does, we take the point size from the shader; otherwise, we use 1.0.
2015-07-07 16:00:46 -07:00
Chad Versace
e7ddfe03ab vk/0.130: Stub vkCmdClear*Attachment() funcs
vkCmdClearColorAttachment
vkCmdClearDepthStencilAttachment
2015-07-07 15:57:37 -07:00
Chad Versace
f89e2e6304 vk/0.130: Define enum VkImageAspectFlagBits 2015-07-07 15:57:37 -07:00
Chad Versace
55ab1737d3 vk/0.130: Define VkRect3D 2015-07-07 15:55:53 -07:00
Chad Versace
11901a9100 vk/0.130: Update name of vkCmdClearDepthStencilImage() 2015-07-07 15:53:35 -07:00
Chad Versace
dff32238c7 vk/0.130: Stub vkCmdExecuteCommands() 2015-07-07 15:51:55 -07:00
Chad Versace
85c0d69be9 vk/0.130: Update vkCmdWaitEvents() signature 2015-07-07 15:49:57 -07:00
Chad Versace
0ecb789b71 vk: Remove unused 'v' param from stub() macro 2015-07-07 15:47:24 -07:00
Chad Versace
f78d684772 vk: Stub vkCmdPushConstants() from 0.130 header 2015-07-07 15:46:19 -07:00
Chad Versace
18ee32ef9d vk: Update vkCmdPipelineBarrier to 0.130 header 2015-07-07 15:43:41 -07:00
Chad Versace
4af79ab076 vk: Add func anv_clear_mask()
A little helper func for inspecting and clearing bitmasks.
2015-07-07 15:43:41 -07:00
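A plausible shape for such a helper, reconstructed from the one-line description above rather than copied from the source:

    #include <stdbool.h>
    #include <stdint.h>

    /* If any of 'bits' are set in *mask, clear them and report that they were. */
    static inline bool
    demo_clear_mask(uint32_t *mask, uint32_t bits)
    {
       if (*mask & bits) {
          *mask &= ~bits;
          return true;
       }
       return false;
    }

    /* e.g. while lowering pipeline-barrier flags (hypothetical bit name):
     *    if (demo_clear_mask(&flags, DEMO_RENDER_TARGET_FLUSH_BIT))
     *       ... emit the matching PIPE_CONTROL bit ...
     */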
Jason Ekstrand
788a8352b9 vk/vulkan.h: Remove some unused fields.
In particular, the following are removed:

 - disableVertexReuse
 - clipOrigin
 - depthMode
 - pointOrigin
 - provokingVertex
2015-07-07 15:33:00 -07:00
Jason Ekstrand
7fbed521bb vk/vulkan.h: Remove the explicit primitive restart index
Unfortunately, this requires some non-trivial changes to the driver.  Now
that the primitive restart index isn't given explicitly by the client, we
always use ~0 for everything like D3D does.  Unfortunately, our hardware is
awesome and a 32-bit version of ~0 doesn't match any 16-bit values.  This
means, we have to set it to either UINT16_MAX or UINT32_MAX depending on
the size of the index type.  Since we get the index type from
CmdBindIndexBuffer and the rest of the VF packet from the pipeline, we need
to lazy-emit the VF packet.
2015-07-07 15:33:00 -07:00
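A small sketch of the index-type-dependent choice described above; since the index type only arrives in CmdBindIndexBuffer, this value gets folded into the lazily emitted VF packet.

    #include <stdint.h>
    #include <vulkan/vulkan.h>

    /* The API-level restart index is implicitly ~0, but the hardware compares
     * against the full index width, so pick the value per bound index type. */
    static uint32_t
    demo_restart_index(VkIndexType type)
    {
       switch (type) {
       case VK_INDEX_TYPE_UINT16:
          return UINT16_MAX;   /* 0xffff */
       case VK_INDEX_TYPE_UINT32:
       default:
          return UINT32_MAX;   /* 0xffffffff */
       }
    }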
Chad Versace
d6b840beff vk: Delete some comments not present in 0.130 header
Deleting the comments reduces diff noise.
2015-07-07 15:16:13 -07:00
Chad Versace
84a5bc25e3 vk: Pull in remaining 0.130 handle types
This pulls in the definition of VkShaderModule and VkPipelineCache,
which are not used anywhere yet.
2015-07-07 15:13:01 -07:00
Chad Versace
f2899b1af2 vk: Pull in #defines from 0.130 header
Despite not being used yet, pulling in the macros does diminish the
header diff.
2015-07-07 15:11:30 -07:00
Jason Ekstrand
962d6932fa vk/vulkan.h: Rename (min|max)Depth to (min|max)DepthBounds 2015-07-07 12:37:54 -07:00
Jason Ekstrand
1fb859e4b2 vk/vulkan.h: Remove client-settable pointSize from DynamicRsState 2015-07-07 12:35:32 -07:00
Jason Ekstrand
245583075c vk/vulkan.h: Remove UINT8 index buffers 2015-07-07 11:26:49 -07:00
Jason Ekstrand
0a42332904 vk/vulkan.h: Re-order the object declarations 2015-07-07 11:26:49 -07:00
Kristian Høgsberg Kristensen
a1eea996d4 vk: Emit 3DSTATE_SAMPLE_MASK
This was missing and was causing the driver to not work with
execlists. Presumably we get a different initial hw context with
execlists enabled, that has sample mask 0 initially.

Set this to 0xffff for now.  When we add MS support, we need to take the
value from VkPipelineMsStateCreateInfo::sampleMask.
2015-07-06 23:54:12 -07:00
Kristian Høgsberg Kristensen
c325bb24b5 vk: Pull in new generated headers
The new headers use stdbool for enable/disable fields which
implicitly converts expressions like (flags & 8) to 0 or 1.
Also handles MBO (must-be-one) fields by setting them to one,
corrects a bspec typo (_3DPRIM_LISTSTRIP_ADJ -> LINESTRIP) and
makes a few enum values less clashy.
2015-07-06 22:12:26 -07:00
Chad Versace
23075bccb3 vk/image: Validate vkCreateImageView more
Exhaustively validate the function input.  If it's not validated and
doesn't have an anv_finishme(), then I overlooked it.
2015-07-06 18:28:26 -07:00
Chad Versace
69e11adecc vk/image: Add more info to VkImageViewType table
Convert the table from the direct mapping
  VkImageViewType -> SurfaceType

into a mapping to an info struct
  VkImageViewType -> struct anv_image_view_info
2015-07-06 18:28:26 -07:00
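A sketch of the table shape described above; the info fields and the SurfaceType stand-ins are illustrative (the real values come from the generated gen headers).

    #include <stdbool.h>
    #include <stdint.h>
    #include <vulkan/vulkan.h>

    /* Stand-ins for the generated gen header SurfaceType tokens. */
    enum { DEMO_SURFTYPE_1D, DEMO_SURFTYPE_2D, DEMO_SURFTYPE_3D, DEMO_SURFTYPE_CUBE };

    struct demo_image_view_info {
       uint8_t surface_type;   /* RENDER_SURFACE_STATE.SurfaceType */
       bool    is_array;
       bool    is_cube;
    };

    static const struct demo_image_view_info demo_view_info[] = {
       [VK_IMAGE_VIEW_TYPE_1D]   = { .surface_type = DEMO_SURFTYPE_1D },
       [VK_IMAGE_VIEW_TYPE_2D]   = { .surface_type = DEMO_SURFTYPE_2D },
       [VK_IMAGE_VIEW_TYPE_3D]   = { .surface_type = DEMO_SURFTYPE_3D },
       [VK_IMAGE_VIEW_TYPE_CUBE] = { .surface_type = DEMO_SURFTYPE_CUBE,
                                     .is_cube = true },
    };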
Chad Versace
b844f542e0 vk: Update VkImageViewType to 0.130.0
This splits 1D and 1D_ARRAY, 2D and 2D_ARRAY, CUBE and CUBE_ARRAY.

The new tokens are unused. This is just a header update.
2015-07-06 18:28:26 -07:00
Chad Versace
5b04db71ff vk/image: Move validation for vkCreateImageView
Move the validation from anv_CreateImageView() and anv_image_view_init()
to anv_validate_CreateImageView(). No new validation is added.
2015-07-06 18:27:14 -07:00
Jason Ekstrand
1f1b26bceb vk/vulkan.h: Rename VkRect to VkRect2D 2015-07-06 17:47:18 -07:00
Jason Ekstrand
63c1190e47 vk/vulkan.h: Rename count to arraySize in VkDescriptorSetLayoutBinding 2015-07-06 17:43:58 -07:00
Jason Ekstrand
d84f3155b1 vk/vulkan.h: Remove the Vk(Memory|Semaphor|Image)OpenInfo structs
We already deleted the functions that need them.  The structs are just
dangling uselessly.
2015-07-06 17:37:13 -07:00
Jason Ekstrand
65f9ccb4e7 vk/vulkan.h: Remove VK_MEMORY_PROPERTY_PREFER_HOST_LOCAL_BIT
We weren't doing anything with it, so this is a no-op
2015-07-06 17:33:45 -07:00
Jason Ekstrand
68fa750f2e vk/vulkan.h: Replace DEVICE_COHERENT_BIT with DEVICE_NON_COHERENT_BIT 2015-07-06 17:32:28 -07:00
Jason Ekstrand
d5b5bd67f6 vk/vulkan.h: Use the query result bits from revision 130
None of the important bits or names actually changed.  It just
added/removed some no-op names.

No functional change.
2015-07-06 17:27:11 -07:00
Jason Ekstrand
d843418c2e vk/vulkan.h: One more quick enum refactor clean-up 2015-07-06 17:26:29 -07:00
Jason Ekstrand
2b37fc28d1 vk/vulkan.h: Get rid of VERTEX_INPUT_STEP_RATE_DRAW
We never supported it, so no functional change.
2015-07-06 17:24:26 -07:00
Jason Ekstrand
a75967b1bb vk/vulkan.h: Remove the CLEAR_OPTIMAL image layout 2015-07-06 17:21:19 -07:00
Jason Ekstrand
2b404e5d00 vk: Rename CPU_READ/WRITE_BIT to HOST_READ/WRITE_BIT 2015-07-06 17:18:25 -07:00
Jason Ekstrand
c57ca3f16f vk/vulkan.h: Remove VK_IMAGE_CREATE_CLONEABLE_BIT 2015-07-06 17:14:30 -07:00
Jason Ekstrand
2de388c49c vk: Remove SHAREABLE bits
They were removed from the Vulkan API and we don't really use them because
there are no multi-GPU i965 systems.
2015-07-06 17:12:51 -07:00
Jason Ekstrand
1b0c47bba6 vk/vulkan.h: Re-order the logic op enums 2015-07-06 17:08:11 -07:00
Jason Ekstrand
c7cef662d0 vk/vulkan.h: Reformat a bunch of enums to match revision 130
In theory, no functional change.
2015-07-06 17:06:02 -07:00
Jason Ekstrand
8c5e48f307 vk: Rename NUM_SHADER_STAGE to SHADER_STAGE_NUM
This is a refactor of more than just the header but it lets us finish
reformating the shader stage enum.
2015-07-06 16:43:28 -07:00
Jason Ekstrand
d9176f2ec7 vk: Reformat a bunch of enums
This accounts for a number of differences between the generated headers and
the hand-written header.  Not all reformatting is done in this commit but
it does make the headers much more diffable.

In theory, no functional change.
2015-07-06 16:41:31 -07:00
Jason Ekstrand
e95bf93e5a vk: Pull the VkResult enum from revision 130 2015-07-06 16:15:12 -07:00
Jason Ekstrand
1b7b580756 vk: re-arrange enums to match the order in revision 130 2015-07-06 16:11:05 -07:00
Jason Ekstrand
2fb524b369 vk: Rename a parameter in CmdBindDynamicStateObject 2015-07-06 15:37:17 -07:00
Jason Ekstrand
c5ffcc9958 vk: Remove multi-device stuff 2015-07-06 15:34:55 -07:00
Jason Ekstrand
c5ab5925df vk: Remove ClearDescriptorSets 2015-07-06 15:32:40 -07:00
Jason Ekstrand
ea5fbe1957 vk: Remove begin/end descriptor pool update 2015-07-06 15:32:27 -07:00
Jason Ekstrand
9a798fa946 vk: Remove stub for CloneImageData 2015-07-06 15:30:05 -07:00
Jason Ekstrand
78a0d23d4e vk: Remove the stub support for memory priorities 2015-07-06 15:28:10 -07:00
Jason Ekstrand
11cf214578 vk: Remove the stub support for explicit memory references 2015-07-06 15:27:58 -07:00
Jason Ekstrand
0dc7d4ac8a vk/vulkan.h: Reformat structs to match revision 130
Structs in the old version were specified as

typedef struct VkSomeThing_
{
   type                                        field; // comment
} VkSomeThing;

However, in the generated headers, you have

typedef struct {
   type                                        field;
} VkSomeThing;

This commit also removes some unneeded whitespaces.
2015-07-06 15:19:12 -07:00
Jason Ekstrand
19aabb5730 vk/vulkah.h: Re-arrange structures to match the order in 130 2015-07-06 15:09:30 -07:00
Connor Abbott
f9dbc34a18 nir/spirv: fix some bugs 2015-07-06 15:00:37 -07:00
Connor Abbott
f3ea3b6e58 nir/spirv: add support for builtins inside structures
We may be able to revert this depending on the outcome of bug 14190, but
for now it gets vertex shaders working with SPIR-V.
2015-07-06 15:00:37 -07:00
Connor Abbott
15047514c9 nir/spirv: fix a bug with structure creation
We were creating 2 extra bogus fields.
2015-07-06 15:00:37 -07:00
Connor Abbott
73351c6a18 nir/spirv: fix a bad assertion in the decoration handling
We should be asserting that the parent decoration didn't hand us
a member if the child decoration did, but different child decorations
may obviously have different members.
2015-07-06 15:00:37 -07:00
Connor Abbott
70d2336e7e nir/spirv: pull out logic for getting builtin locations
Also add support for more builtins.
2015-07-06 15:00:37 -07:00
Connor Abbott
aca5fc6af1 nir/spirv: plumb through the type of dereferences
We need this to know if a deref is of a builtin.
2015-07-06 15:00:37 -07:00
Connor Abbott
66375e2852 nir/spirv: handle structure member builtin decorations 2015-07-06 15:00:37 -07:00
Connor Abbott
23c179be75 nir/spirv: add a vtn_type struct
This will handle decorations that aren't in the glsl_type.
2015-07-06 15:00:37 -07:00
Connor Abbott
f9bb95ad4a nir/spirv: move 'type' into the union
Since SSA values now have their own types, it's more convenient to make
'type' only used when we want to look up an actual SPIR-V type, since
we're going to change its type soon to support various decorations that
are handled at the SPIR-V -> NIR level.
2015-07-06 15:00:37 -07:00
Jason Ekstrand
d5dccc1e7a vk: Move CreateFramebuffer and CreateRenderPass higher in the header
This matches where they are in the 130 header.
2015-07-06 14:41:43 -07:00
Jason Ekstrand
4a42f45514 vk: Remove atomic counters stubs 2015-07-06 14:38:45 -07:00
Jason Ekstrand
630b19a1c8 vk: Make vulkan.h look more like vulkan-130.h
Most of these changes are insubstantial.  The only potentially substantial
change is that we added a few new #defines for API maximums.
2015-07-06 14:32:52 -07:00
Jason Ekstrand
2f9180b1b2 vk: Add a revision 130 header along-side the current header 2015-07-06 14:16:51 -07:00
Jason Ekstrand
1f1465f077 vk/meta: Add an initial implementation of ClearColorImage 2015-07-02 18:15:06 -07:00
Jason Ekstrand
8a6c8177e0 vk/meta: Factor the guts out of cmd_buffer_clear 2015-07-02 18:13:59 -07:00
Jason Ekstrand
beb0e25327 vk: Roll back to API v90
This is what version 0.1 of the Vulkan SDK is built against.
2015-07-01 16:44:12 -07:00
Jason Ekstrand
fa663c27f5 nir/spirv: Add initial structure member decoration support 2015-07-01 15:38:26 -07:00
Jason Ekstrand
e3d60d479b nir/spirv: Make vtn_handle_type match the other handler functions
Previously, the caller of vtn_handle_type had to handle actually inserting
the type.  However, this didn't really work if the type was decorated in
any way.
2015-07-01 15:34:10 -07:00
Jason Ekstrand
7a749aa4ba nir/spirv: Add basic support for Op[Group]MemberDecorate 2015-07-01 14:18:07 -07:00
Jason Ekstrand
682eb9489d vk/x11: Allow for the client querying the size of the format properties 2015-07-01 14:18:07 -07:00
Chad Versace
bba767a9af vk/formats: Fix entry for S8_UINT
I forgot to update this when fixing the depth formats.
2015-06-30 09:41:44 -07:00
Chad Versace
6720b47717 vk/formats: Document new meaning of anv_format::cpp
The way the code currently works is that anv_format::cpp is the cpp of
anv_format::surface_format.

Me and Kristian disagree about how the code *should* work. Despite that,
I think it's in our discussion's best interest to document how the code
*currently* works. That should eliminate confusion.

If and when the code begins to work differently, then we'll update the
anv_format comments.
2015-06-30 09:41:41 -07:00
Chad Versace
709fa463ec vk/depth: Add a FIXME
3DSTATE_DEPTH_BUFFER.Width,Height are wrong.
2015-06-26 22:15:03 -07:00
Chad Versace
5b3a1ceb83 vk/image: Enable 2d single-sample color miptrees
What's been tested, for both image views and color attachment views:

    - VK_FORMAT_R8G8B8A8_UNORM
    - VK_IMAGE_VIEW_TYPE_2D
    - mipLevels: 1, 2
    - baseMipLevel: 0, 1
    - arraySize: 1, 2
    - baseArraySlice: 0, 1

What's known to be broken:

    - Depth and stencil miptrees. To fix this, anv_depth_stencil_view
      needs major rework.
    - VkImageViewType != 2D
    - MSAA

Fixes Crucible tests:

  func.miptree.view-2d.levels02.array01.*
  func.miptree.view-2d.levels01.array02.*
  func.miptree.view-2d.levels02.array02.*
2015-06-26 22:11:15 -07:00
Chad Versace
c6e76aed9d vk/image: Define anv_surface, refactor anv_image
This prepares for upcoming miptree support.

anv_surface is a proxy for color surfaces, depth surfaces, and stencil
surfaces.  Embed two instances of anv_surface into anv_image: the
primary surface (color or depth), and an optional stencil surface.
2015-06-26 21:45:53 -07:00
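A structural sketch of the arrangement described above (one embedded surface for color or depth, plus an optional stencil surface); member names are illustrative.

    #include <stdbool.h>
    #include <stdint.h>

    /* Proxy for one hardware surface: where it lives in the BO and its layout. */
    struct demo_surface {
       uint32_t offset;   /* byte offset into the image's BO */
       uint32_t stride;   /* row pitch in bytes */
       uint8_t  tiling;
    };

    struct demo_image {
       struct demo_surface primary;   /* color, or depth for depth/stencil images */
       struct demo_surface stencil;   /* only meaningful when has_stencil is set */
       bool has_stencil;
    };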
Chad Versace
127cb3f6c5 vk/image: Reformat function signatures
Reformat them to match Mesa code-style.
2015-06-26 20:12:42 -07:00
Chad Versace
fdcd71f71d vk/image: Embed VkImageCreateInfo* into anv_image_create_info
All function signatures that matched this pattern,
  old: f(const VkImageCreateInfo *, const struct anv_image_create_info *)

were rewritten as
  new: f(const struct anv_image_create_info *)
2015-06-26 20:06:08 -07:00
Chad Versace
ca6cef3302 vk/image: Drop some tmp vars in anv_image_view_init()
Variables 'tile_mode' and 'format' are unneeded.
2015-06-26 19:50:04 -07:00
Chad Versace
9c46ba9ca2 vk/image: Abort on stencil image views
The code doesn't work. Not even close.

Replace the broken code with a FINISHME and abort.
2015-06-26 19:23:21 -07:00
Chad Versace
667529fbaa vk: Reindent struct anv_image 2015-06-26 15:27:20 -07:00
Chad Versace
74e3eb304f vk: Define MIN(a, b) macro 2015-06-26 15:09:07 -07:00
Chad Versace
55752fe94a vk: Rename functions ALIGN_*32 -> align_*32
ALIGN_U32 and ALIGN_I32 are functions, not macros. So stop using
allcaps.
2015-06-26 15:07:59 -07:00
Connor Abbott
6ee082718f Merge branch 'wip/nir-vtn' into vulkan
Adds composites and matrix multiplication, plus some control flow fixes.
2015-06-26 12:14:05 -07:00
Chad Versace
37d6e04ba1 vk/formats: Remove the cpp=0 stencil hack
The format table defined cpp = 0 for stencil-only formats. The real cpp
is 1.

When code begins to lie, especially about stencil buffers, code becomes
increasingly fragile as time progresses, and the damage becomes
increasingly hard to undo. (For precedent, see the painful history of
stencil buffer cpp in the git log for gen6 and gen7 in the i965 driver).
Let's undo the stencil buffer cpp lie now to avoid future pain.

In the format table, set cpp = 1 for VK_FORMAT_S8; replace checks for
cpp == 0; and delete all comments about the hack.
2015-06-26 09:58:22 -07:00
Chad Versace
67a7659d69 vk/image: Refactor anv_image_create()
From my experience with intel_mipmap_tree.c, I learned that for structs
like anv_image and intel_mipmap_tree, which have sprawling
multi-function construction codepaths, it's easy to mistakenly use
uninitialized struct members during construction.

Let's eliminate the risk of using uninitialized anv_image members during
construction.  Fill the struct at the function bottom instead of
piecemeal throughout the constructor.
2015-06-26 09:32:59 -07:00
Chad Versace
5d7103ee15 vk/image: Group some assertions closer together
In anv_image_create(), group together the assertions on
VkImageCreateInfo.
2015-06-26 09:05:46 -07:00
Chad Versace
0349e8d607 vk/formats: #undef fmt at end of format table 2015-06-26 07:38:02 -07:00
Chad Versace
068b8a41e2 vk: Fix comment for anv_depth_stencil_view::stencil_qpitch
s/DEPTH/STENCIL/
2015-06-26 07:31:57 -07:00
Chad Versace
7ea707a42a vk/image: Add qpitch fields to anv_depth_stencil_view
For now, hard-code them to 0.
2015-06-25 20:10:16 -07:00
Chad Versace
b91a76de98 vk: Reindent and document struct anv_depth_stencil_view 2015-06-25 20:10:16 -07:00
Chad Versace
ebe1e768b8 vk/formats: Fix incorrect depth formats
anv_format::surface_format was incorrect for Vulkan depth formats.
For example, the format table mapped

    VK_FORMAT_D24_UNORM -> .surface_format = D24_UNORM_X8_UINT
    VK_FORMAT_D32_FLOAT -> .surface_format = D32_FLOAT

but should have mapped

    VK_FORMAT_D24_UNORM -> .surface_format = R24_UNORM_X8_TYPELESS
    VK_FORMAT_D32_FLOAT -> .surface_format = R32_FLOAT

The Crucible test func.depthstencil.basic passed despite the bug, but
only because it did not attempt to texture from the depth surface.

The core problem is that RENDER_SURFACE_STATE.SurfaceFormat and
3DSTATE_DEPTH_BUFFER.SurfaceFormat are distinct types. Considering them
as enum spaces, the two enum spaces have incompatible collisions.

Fix this by adding a new field 'depth_format' to struct anv_format.

Refer to brw_surface_formats.c:translate_tex_format() for precedent.
2015-06-25 20:10:16 -07:00
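As an illustration of the two distinct enum spaces described above, here is a self-contained sketch; the enum values and names below are stand-ins for the generated hardware headers, not the driver's real format table:

  #include <stdio.h>

  /* Hypothetical stand-ins for the two separate hardware enum spaces. */
  enum surf_fmt  { R24_UNORM_X8_TYPELESS = 1, R32_FLOAT = 2 };  /* RENDER_SURFACE_STATE */
  enum depth_fmt { D24_UNORM_X8_UINT = 1, D32_FLOAT = 2 };      /* 3DSTATE_DEPTH_BUFFER */
  enum vk_fmt    { FMT_D24_UNORM, FMT_D32_FLOAT };

  struct fmt_entry {
     enum surf_fmt  surface_format;  /* what texturing/rendering uses */
     enum depth_fmt depth_format;    /* what the depth buffer is programmed with */
  };

  static const struct fmt_entry table[] = {
     [FMT_D24_UNORM] = { R24_UNORM_X8_TYPELESS, D24_UNORM_X8_UINT },
     [FMT_D32_FLOAT] = { R32_FLOAT,             D32_FLOAT },
  };

  int main(void)
  {
     printf("D24: surface=%d depth=%d\n",
            table[FMT_D24_UNORM].surface_format,
            table[FMT_D24_UNORM].depth_format);
     return 0;
  }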
Chad Versace
45b804a049 vk/image: Rename local variable in anv_image_create()
This function has many local variables for info structs. Having one
named simply 'info' is confusing.  Rename it to 'format_info'.
2015-06-25 20:10:16 -07:00
Chad Versace
528071f004 vk/formats: Fix table entry for R8G8B8_SNORM
Now that anv_formats[] is formatted like a table, buggy entries are
easier to see.
2015-06-25 20:10:16 -07:00
Chad Versace
4c8146313f vk/formats: Rename anv_format::format -> surface_format
I misinterpreted anv_format::format as a VkFormat. Instead, it is
a hardware surface format (RENDER_SURFACE_STATE.SurfaceFormat). Rename
the field to 'surface_format' to make it unambiguous.
2015-06-25 20:10:16 -07:00
Chad Versace
4b8b451a1d vk/formats: Rename anv_format::channels -> num_channels
I misinterpreted anv_format::channels as a bitmask of channels.
Renaming it to 'num_channels' makes it unambiguous.
2015-06-25 20:10:16 -07:00
Chad Versace
af0ade0d6c vk: Reindent struct anv_format 2015-06-25 20:10:16 -07:00
Chad Versace
ae29fd1b55 vk/formats: Don't abbreviate tokens in the format table
Abbreviating the VK_FORMAT_* tokens doesn't help much. To the contrary,
it means grep and ctags can't find them.
2015-06-25 20:10:16 -07:00
Jason Ekstrand
d5e41a3a99 vk/compiler: Add the initial hacks to get SPIR-V up and going 2015-06-25 17:36:35 -07:00
Jason Ekstrand
c4c1d96a01 HACK: Get rid of sanity_param_count for FS 2015-06-25 17:36:34 -07:00
Jason Ekstrand
4f5ef945e0 i965: Don't print the GLSL IR if it doesn't exist 2015-06-25 17:36:34 -07:00
Jason Ekstrand
588acdb431 nir/spirv: Set the right location for shader input/outputs
We need to add FRAG_RESULT_DATA0 etc. to the input/output location.
2015-06-25 17:36:34 -07:00
Jason Ekstrand
333b8ddd6b nir/spirv: Set the interface type on uniform blocks 2015-06-25 17:36:34 -07:00
Jason Ekstrand
7e1792b1b7 nir/spirv: Set the system value mode on builtins 2015-06-25 17:36:34 -07:00
Jason Ekstrand
b72936fdad nir/spirv: Actually put variables on the right linked list 2015-06-25 17:36:34 -07:00
Jason Ekstrand
ee0a8f23e4 glsl: Move vert_attrib varying_slot and frag_result enums to shader_enums.h 2015-06-25 17:36:34 -07:00
Chad Versace
fa352969a2 vk/image: Check extent does not exceed surface type limits 2015-06-25 16:53:24 -07:00
Chad Versace
99031aa0f3 vk/image: Stop hardcoding SurfaceType of VkImageView
Instead, translate VkImageViewType to a gen SurfaceType.
2015-06-25 16:53:22 -07:00
Chad Versace
7ea121687c vk/image: Add anv_image::surf_type
This is the gen SurfaceType, such as SURFTYPE_2D.
2015-06-25 16:52:16 -07:00
Chad Versace
cb30acaced vk/image: Add tables for gen SurfaceType
Tables for mapping VkImageType and VkImageViewType to gen SurfaceType.
Tables are unused.
2015-06-25 16:52:16 -07:00
Chad Versace
1132080d5d vk/util: Add anv_loge() for logging error messages 2015-06-25 16:52:16 -07:00
Chad Versace
5f2d469e37 vk: Add func anv_is_aligned() 2015-06-25 16:52:16 -07:00
Chad Versace
f7fb7575ef vk: Add anv_minify() 2015-06-25 16:52:05 -07:00
Chad Versace
7cec6c5dfd vk: Define MAX(a, b) macro 2015-06-25 16:29:42 -07:00
Jason Ekstrand
d178e15567 nir/spirv: Fix up some deref ralloc parenting 2015-06-24 21:39:07 -07:00
Jason Ekstrand
845002e163 i965/nir: Handle returns as long as they're at the end of a function 2015-06-24 21:38:49 -07:00
Jason Ekstrand
2ecac045a4 i965/nir: Split NIR shader handling into two functions
The brw_create_nir function takes a GLSL or ARB shader and turns it into a
NIR shader.  The guts of the optimization and lowering code is now split
into a new brw_process_shader function.
2015-06-24 21:22:07 -07:00
Jason Ekstrand
e369a0eb41 nir/spirv: Use vtn_ssa_value for texture coordinates 2015-06-24 20:39:37 -07:00
Jason Ekstrand
d0bd2bc604 nir/spirv: Add support for the Uniform storage class
This is kinda sketchy.  I'm not really sure this is the way it's supposed to
be used.
2015-06-24 20:32:05 -07:00
Jason Ekstrand
ba0d9d33d4 nir/spirv: Add support for some more decorations including built-in 2015-06-24 20:30:32 -07:00
Jason Ekstrand
1bc0a1ad98 nir/spirv: Make the header file C++ safe 2015-06-24 19:01:10 -07:00
Jason Ekstrand
88d02a1b27 vk: Build xmlconfig stuff into libi965_compiler 2015-06-24 15:59:09 -07:00
Kristian Høgsberg Kristensen
24dff4f8fa vk/headers: Handle MBO fields
These must be set to one.
2015-06-24 09:37:50 -07:00
Jason Ekstrand
a62edcce4e Merge remote-tracking branch 'mesa-public/master' into vulkan 2015-06-23 18:05:25 -07:00
Connor Abbott
dee4a94e69 nir/vtn: add support for phi nodes 2015-06-23 10:34:55 -07:00
Connor Abbott
fe1269cf28 nir/builder: add support for inserting before/after blocks 2015-06-23 10:34:22 -07:00
Connor Abbott
9a3dda101e nir/vtn: fix emitting code after loops
When we're done emitting the code for a loop, we need to visit the new
break block, which is the merge block of the current loop, rather than
the old merge block, which is the merge block of the loop containing the
one we just emitted code for.
2015-06-22 13:53:08 -07:00
Connor Abbott
e9c21d0ca0 unbreak things 2015-06-22 11:59:55 -07:00
Kristian Høgsberg Kristensen
9b9f973ca6 vk: Implement scratch buffers to make spilling work 2015-06-19 15:42:15 -07:00
Kristian Høgsberg Kristensen
9e59003fb1 vk: Undo relocs for scratch bos 2015-06-19 15:42:15 -07:00
Kristian Høgsberg Kristensen
b20794cfa8 vk/allocator: Get rid of non-memfd path
We can just use modern valgrind now.
2015-06-19 15:42:15 -07:00
Kristian Høgsberg Kristensen
aba75d0546 vk/headers: Make General State offsets relocations 2015-06-19 15:42:15 -07:00
Connor Abbott
841aab6f50 matrices matrices matrices 2015-06-18 18:52:44 -07:00
Connor Abbott
d0fc04aacf nir/types: be less strict about constructing matrix types 2015-06-18 18:51:51 -07:00
Connor Abbott
22854a60ef nir/builder: add a nir_fdot() convenience function 2015-06-18 17:34:55 -07:00
Connor Abbott
0e86ab7c0a nir/types: add a helper to transpose a matrix type 2015-06-18 17:34:12 -07:00
Connor Abbott
de4c31a085 fix glsl450 for composites 2015-06-18 17:33:08 -07:00
Kristian Høgsberg Kristensen
aedd3c9579 vk: Add missing gen7 RENDER_SURFACE_STATE struct 2015-06-17 21:42:29 -07:00
Connor Abbott
bf5a615659 composites composites composites 2015-06-17 16:25:38 -07:00
Kristian Høgsberg Kristensen
fa8a07748d vk: Compute CS exec mask and thread width max in pipeline
We compute the right mask and thread width max parameters as part of
pipeline creation and set them accordingly at vkCmdDispatch() and
vkCmdDispatchIndirect() time. These parameters depend only on the local
group size and the dispatch width of the program so we can figure this
out at pipeline create time.
2015-06-12 18:21:50 -07:00
Kristian Høgsberg Kristensen
c103c4990c vk: Set binding table layout for CS
We weren't setting the binding table layout for the backend compiler.
2015-06-12 18:21:49 -07:00
Kristian Høgsberg Kristensen
2fdd17d259 vk: Generate CS prog_data into the pipeline instance
We were generating the prog_data into a local variable and never
initializing the pipeline->cs_prog_data one.
2015-06-12 18:21:49 -07:00
Kristian Høgsberg Kristensen
00494c6cb7 vk: Document how depth/stencil formats work in anv_image_create()
This reverts commits

  e17ed04 * vk/image: Don't double-allocate stencil buffers
  1ee2d1c * vk/image: Teach anv_image_choose_tile_mode about WMAJOR

and instead adds a comment to describe the subtlety of how we create
images for stencil only formats.
2015-06-11 22:07:16 -07:00
Kristian Høgsberg Kristensen
fbc9fe3c92 vk: Use compute pipeline layout when binding compute sets 2015-06-11 21:57:43 -07:00
Kristian Høgsberg Kristensen
765175f5d1 vk: Implement basic compute shader support 2015-06-11 15:31:42 -07:00
Kristian Høgsberg Kristensen
7637b02aaa vk: Emit PIPELINE_SELECT on demand 2015-06-11 15:21:49 -07:00
Kristian Høgsberg Kristensen
405697eb3d vk: Stop asserting we have a fragment shader
Even for graphics, this is not a requirement; we can have a depth-only output pipeline.
2015-06-11 15:07:38 -07:00
Kristian Høgsberg Kristensen
e7edde60ba vk: Defer setting viewport dynamic state
We can't emit this until we've done a 3D pipeline select.
2015-06-11 15:04:09 -07:00
Kristian Høgsberg Kristensen
f7fe06cf0a vk: Disable shader stages in the graphics pipeline batch
We need to move this into the graphics pipeline batch so we don't emit it
for compute pipelines.
2015-06-11 14:58:31 -07:00
Kristian Høgsberg Kristensen
9aae480cc4 vk: Don't emit STATE_SIP
We don't have a SIP kernel and don't enable exceptions.
2015-06-11 14:56:29 -07:00
Kristian Høgsberg Kristensen
923e923bbc vk: Compile fragment shader after VS and GS
Just moving code around to do shader stages in the natural order.
2015-06-11 14:55:50 -07:00
Jason Ekstrand
1dd63fcbed vk/entrypoints: Don't print every single function call 2015-06-11 10:10:13 -07:00
Kristian Høgsberg Kristensen
b581e924b6 vk: Remove left-over trp call 2015-06-11 09:26:49 -07:00
Kristian Høgsberg Kristensen
d76ea7644a vk: Set maximum point size range
We set both minimum and maximum point size to 0 in 3DSTATE_CLIP, which
will clip away all points.
2015-06-11 09:25:04 -07:00
Kristian Høgsberg Kristensen
a5b49d2799 vk: Use generated headers with fixed point support
The generated headers now convert float in the template struct to the
correct fixed point format.
2015-06-11 09:25:04 -07:00
Kristian Høgsberg Kristensen
ea7ef46cf9 vk: Regenerate headers with __gen_validate_value() 2015-06-11 09:25:03 -07:00
Jason Ekstrand
a566b1e08a vk/formats: Refactor format properties code
Along with the refactor, we now do the right thing when we hit an
unsupported format: Set the flags to 0 and return VK_SUCCESS.
2015-06-11 09:11:16 -07:00
Jason Ekstrand
2a3c29698c vk/image: Add a bunch of asserts 2015-06-10 21:04:51 -07:00
Jason Ekstrand
c8b62d109b vk: Add a couple vk_error calls 2015-06-10 21:04:13 -07:00
Jason Ekstrand
7153b56abc vk/private: Add a non-fatal assert 2015-06-10 21:03:50 -07:00
Jason Ekstrand
29d2bbb2b5 vk/cmd: Add an initial implementation of PipelineBarrier
We may want to do something more intelligent here later, such as actually
handling image layout transitions.  However, this should do for now.
2015-06-10 16:37:33 -07:00
Jason Ekstrand
047ed02723 vk/emit: Use valgrind to validate every packed field 2015-06-10 12:43:02 -07:00
Jason Ekstrand
9cae3d18ac vk: Add valgrind checks in various emit functions
The check in batch_bo_finish should catch any undefined values in the batch
but isn't that great for debugging.  The checks in the various emit
functions will help get better granularity.
2015-06-09 21:51:37 -07:00
Jason Ekstrand
d5ad24e39b vk: Move the valgrind include and VG() macro to private.h 2015-06-09 21:51:37 -07:00
Chad Versace
e17ed04b03 vk/image: Don't double-allocate stencil buffers
If the main surface has format S8_UINT, then don't allocate the
auxiliary stencil surface.
2015-06-09 16:39:28 -07:00
Chad Versace
1ee2d1c3fc vk/image: Teach anv_image_choose_tile_mode about WMAJOR 2015-06-09 16:38:55 -07:00
Chad Versace
2d2e148952 vk/util: Add anv_abortf(), anv_abortfv()
Convenience functions to print an error message then abort.
2015-06-09 16:38:50 -07:00
Chad Versace
ffb1ee5d20 vk: Define anv_noreturn macro 2015-06-09 16:38:46 -07:00
Chad Versace
f1db3b3869 vk/image: Factor tile mode selection into separate function
Because it will eventually need to get smarter.
2015-06-09 16:38:42 -07:00
Jason Ekstrand
11e941900a vk/device: Actually allow destruction 2015-06-09 16:28:46 -07:00
Jason Ekstrand
5d4b6a01af vk/cmd_buffer: Properly initialize/reset dynamic states 2015-06-09 16:27:55 -07:00
Jason Ekstrand
634a6150b9 vk/pipeline: Zero out the depth-stencil state when not in use 2015-06-09 16:26:55 -07:00
Jason Ekstrand
919e7b7551 vk/device: Use anv_CreateDynamicViewportState instead of the vk one 2015-06-09 16:01:56 -07:00
Jason Ekstrand
0599d39dd9 vk/device: Dedent the vkCreateDynamicViewportState call 2015-06-09 15:53:26 -07:00
Chad Versace
d57c4cf999 vk/util: Annotate anv_finishme() as printflike 2015-06-09 14:46:49 -07:00
Chad Versace
822cb16abe vk: Define anv_printflike() macro 2015-06-09 14:46:45 -07:00
Chad Versace
081f617b5a vk/image: Stop hardcoding alignment of stencil surfaces
Look up the alignment from anv_tile_info_table.
2015-06-09 14:16:56 -07:00
Chad Versace
e6bd568f36 vk/image: Rewrite tile info table
- Reduce the number of table lookups in anv_image_create from 4 to 1.
- Add field for surface alignment.
- Shorten field names tile_width, tile_height -> width, height.
2015-06-09 14:16:45 -07:00
Chad Versace
5b777e2bcf vk/image: Delete an old comment 2015-06-09 14:14:29 -07:00
Jason Ekstrand
d842a6965f vk/compiler: Free the GL errors data 2015-06-09 12:36:23 -07:00
Jason Ekstrand
9f292219bf vk/compiler: Free more of prog_data when tearing down a pipeline 2015-06-09 12:36:23 -07:00
Jason Ekstrand
66b00d5e5a vk/queue: Embed the queue in and allocate it with the device 2015-06-09 12:36:23 -07:00
Jason Ekstrand
38f5eef59d vk/device: Free border color states when we have valgrind 2015-06-09 12:36:23 -07:00
Jason Ekstrand
999b56c507 vk/device: Destroy all batch buffers
Due to a copy+paste error, we were destroying all but the first batch or
surface state buffer.  Now we destroy them all.
2015-06-09 12:36:23 -07:00
Jason Ekstrand
3a38b0db5f vk/meta: Clean up temporary objects 2015-06-09 12:36:23 -07:00
Jason Ekstrand
9d6f55dedf vk/surface_view: Add a destructor 2015-06-09 12:36:23 -07:00
Chad Versace
e6162c2fef vk/image: Add anv_image::h_align,v_align
Use the new fields to compute RENDER_SURFACE_STATE.Surface*Alignment.
We still hardcode them to 4, though.
2015-06-09 12:19:24 -07:00
Jason Ekstrand
58afc24e57 vk/allocator: Remove the concept of a slave block pool
This reverts commit d24f8245db.
2015-06-08 17:46:32 -07:00
Jason Ekstrand
b6363c3f12 vk/device: Remove the binding table pools/streams 2015-06-08 17:45:57 -07:00
Jason Ekstrand
531549d9fc vk/pipeline: Move freeing the program stream to pipeline.c
It's created in pipeline.c so we should free it there.
2015-06-08 14:27:04 -07:00
Jason Ekstrand
66a4dab89a vk/pipeline: Don't destroy the program stream
It's freed in compiler.cpp and we don't want to free it twice.
2015-06-08 13:53:19 -07:00
Jason Ekstrand
920fb771d4 vk/allocator: Make the use of NULL_BLOCK in state_stream_finish explicit 2015-06-08 13:53:19 -07:00
Kristian Høgsberg Kristensen
52637c0996 vk: Quiet a few warnings 2015-06-08 08:51:40 -07:00
Kristian Høgsberg Kristensen
9eab70e54f vk: Create a minimal context for the compiler
This avoids the full brw context initialization and just sets up context
constants, initializes extensions and sets a few driver vfuncs for the
front-end GLSL compiler.
2015-06-08 08:51:40 -07:00
Jason Ekstrand
ce00233c13 vk/cmd_buffer: Use the dynamic state stream in emit_dynamic and merge_dynamic 2015-06-05 17:26:41 -07:00
Jason Ekstrand
e69588b764 vk/device: Use a 64-byte alignment for CC state 2015-06-05 17:26:26 -07:00
Jason Ekstrand
c2eeab305b vk/pipeline: Actually free the program stream and dynamic pool 2015-06-05 17:26:26 -07:00
Jason Ekstrand
ed2ca020f8 vk/allocator: Avoid double-free in the bo pool 2015-06-05 17:12:28 -07:00
Jason Ekstrand
aa523d3c62 vk/gem: Call VALGRIND_FREELIKE_BLOCK before unmapping 2015-06-05 16:41:49 -07:00
Chad Versace
87d98e1935 vk: Fix 2 incorrect typecasts
The compiler didn't find the cast errors because all Vulkan types are
just integers.
2015-06-04 14:32:22 -07:00
Chad Versace
b981379bcf vk: Make make clean remove generated spirv headers 2015-06-04 14:26:46 -07:00
Jason Ekstrand
8d930da35d vk/allocator: Remove an unneeded VG() wrapper 2015-06-04 09:14:33 -07:00
Jason Ekstrand
7f90e56e42 vk/device: Disallow device destruction 2015-06-04 09:14:33 -07:00
Chad Versace
9cd42b3dea vk: Fix build
Commit 1286bd, which deleted vk.c, broke the build. Update the Makefile
to fix it.
2015-06-04 09:01:30 -07:00
Jason Ekstrand
251aea80b0 vk/DS: Mask stencil masks to 8 bits 2015-06-03 16:59:13 -07:00
Connor Abbott
47bd462b0c awesome control flow bugfixes/clarifications 2015-06-03 14:10:28 -04:00
Kristian Høgsberg Kristensen
a37d122e88 vk: Set color/blend state in meta clear if not set yet 2015-06-02 23:08:05 -07:00
Kristian Høgsberg Kristensen
1286bd3160 vk: Delete vk.c test case
We now have crucible up and running and all vk sub-cases have been moved
over. Delete this crufty old hack of a test case.
2015-06-02 22:57:42 -07:00
Kristian Høgsberg Kristensen
2f6aa424e9 vk: Update generated headers with support for 64 bit fields 2015-06-02 22:57:42 -07:00
Kristian Høgsberg Kristensen
5744d1763c vk: Set cb_state to NULL at cmd buffer create time
Dynamic color/blend state can be NULL in case we're not rendering to
color targets (only output to depth and/or stencil). Initialize
cmd_buffer->cb_state to NULL so we can reliably detect whether it's been
set or not.
2015-06-02 22:57:42 -07:00
Kristian Høgsberg Kristensen
c8f078537e vk: Implement vertexOffset parameter of vkCmdDrawIndexed()
As exposed by the func.draw_indexed test, we were ignoring the argument
and hardcoding 0.
2015-06-02 22:57:42 -07:00
Jason Ekstrand
e702197e3f vk/formats: Add a name to the metadata and better logging 2015-06-02 11:30:39 -07:00
Jason Ekstrand
fbafc946c6 vk/formats: Rework the formats table 2015-06-02 11:30:39 -07:00
Kristian Høgsberg Kristensen
f98c89ef31 vk: Move query related functionality to new file query.c 2015-06-01 21:52:45 -07:00
Jason Ekstrand
08748e3a0c i965: Use NIR by default for vertex shaders on GEN8+
GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:

   total instructions in shared programs: 2742062 -> 2681339 (-2.21%)
   instructions in affected programs:     1514770 -> 1454047 (-4.01%)
   helped:                                5813
   HURT:                                  1120

The gained programs are ARB vertex programs that were previously going
through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
programs can go through the scalar backend so they show up as "gained" in
the shader-db results.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-06-01 12:25:58 -07:00
Jason Ekstrand
d4cbf6a728 vk/compiler: Add an index_count to the bind map and check for OOB 2015-06-01 12:25:58 -07:00
Jason Ekstrand
510b5c3bed vk/HACK: Plumb real descriptor set/index into textures 2015-06-01 12:25:58 -07:00
Jason Ekstrand
aded32bf04 NIR: Add a helper for doing sampler lowering for vulkan 2015-06-01 12:25:58 -07:00
Kristian Høgsberg Kristensen
5caa408579 vk: Indent tables to align '=' at column 48 2015-05-31 22:36:26 -07:00
Kristian Høgsberg Kristensen
76bb658518 vk: Add support for anisotropic bits 2015-05-31 22:15:34 -07:00
Kristian Høgsberg Kristensen
dc56e4f7b8 vk: Implement support for sampler border colors
This supports the three Vulkan border color types for float color
formats. The support for integer formats is a little trickier, as we
don't know the format of the texture at this time.
2015-05-31 17:20:48 -07:00
Jason Ekstrand
e497ac2c62 vk/device: Only flush the texture cache when setting state base address
After further examination, it appears that the other flushes and stalls
weren't actually needed.
2015-05-30 18:04:50 -07:00
Jason Ekstrand
2251305e1a vk/cmd_buffer: Track descriptor set dirtying per-stage 2015-05-30 10:07:29 -07:00
Jason Ekstrand
33cccbbb73 vk/device: Emit PIPE_CONTROL flushes surrounding new STATE_BASE_ADDRESS
According to the bspec, you're supposed to emit a PIPE_CONTROL with a CS
stall and a render target flush prior to changing STATE_BASE_ADDRESS.  A
little experimentation, however, shows that this is not enough.  It also
appears as if you have to flush the texture cache after changing base
address or things won't propagate properly.
2015-05-30 08:08:07 -07:00
Jason Ekstrand
b2b9fc9fad vk/allocator: Don't call VALGRIND_MALLOCLIKE_BLOCK on fresh gem_mmap's 2015-05-29 21:15:47 -07:00
Jason Ekstrand
03ffa9ca31 vk: Don't crash on partial descriptor sets 2015-05-29 20:43:10 -07:00
Jason Ekstrand
4ffbab5ae0 vk/device: Allow for starting a new surface state buffer
This commit allows for us to create a whole new surface state buffer when
the old one runs out of room.  We simply re-emit the state base address for
the new state, re-emit binding tables, and keep going.
2015-05-29 17:49:41 -07:00
Jason Ekstrand
c4bd5f87a0 vk/device: Do lazy surface state emission for binding tables
Before, we were emitting surface states up-front when binding tables were
updated.  Now, we wait to emit the surface states until we emit the binding
table.  This makes meta simpler and should make it easier to deal with
swapping out the surface state buffer.
2015-05-29 16:51:11 -07:00
Kristian Høgsberg Kristensen
4aecec0bd6 vk: Store dynamic slot index with struct anv_descriptor_slot
We need to make sure we use the right index into the dynamic offset
array. Dynamic descriptors can be present or not in different stages and
to get the right offset, we need to compute the index at
vkCreateDescriptorSetLayout time.
2015-05-29 11:32:53 -07:00
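A rough sketch of the index pre-computation described above, assuming a flattened per-stage binding array; the helper and its layout are hypothetical:

  #include <stdint.h>

  /* At layout-creation time, walk the bindings once and record, for each
   * dynamic descriptor, its index into the dynamic-offset array that will be
   * passed at bind time; -1 marks non-dynamic bindings. */
  static void assign_dynamic_indices(const int *is_dynamic, int count,
                                     int16_t *dynamic_index)
  {
     int16_t next = 0;
     for (int i = 0; i < count; i++)
        dynamic_index[i] = is_dynamic[i] ? next++ : -1;
  }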
Kristian Høgsberg Kristensen
fad418ff47 vk: Implement dynamic buffer offsets
We do this by creating a surface state on the fly that incorporates the
dynamic offset. This patch also refactor the descriptor set layout
constructor a bit to be less clever with switch statement fall
through. Instead of duplicating the subtle code to update the sampler
and surface slot map, we just use two switch statements.
2015-05-28 22:41:20 -07:00
Jason Ekstrand
9ffc1bed15 vk/device: Split state base address emit into its own function 2015-05-28 15:34:08 -07:00
Jason Ekstrand
468c89a351 vk/device: Use anv_batch_emit for MI_BATCH_BUFFER_START 2015-05-28 15:25:02 -07:00
Jason Ekstrand
2dc0f7fe5b vk/device: Actually destroy batch buffers 2015-05-28 13:08:21 -07:00
Jason Ekstrand
8cf932fd25 vk/query: Don't emit a CS stall by itself
Both the bspec and the simulator don't like this.  I'm not sure if stalling
at the scoreboard is right but it at least shuts up the simulator.
2015-05-28 10:27:53 -07:00
Jason Ekstrand
730ca0efb1 vk/device: Fixups for batch buffer chaining
Somehow these didn't get merged with the other batch buffer chaining
stuff.  Oh well, it's here now.
2015-05-28 10:26:11 -07:00
Jason Ekstrand
de221a672d meta: Add a default ds_state and use it when no ds state is set 2015-05-28 10:06:45 -07:00
Jason Ekstrand
6eefeb1f84 vk/meta: Share the dummy RS and CB state between clear and blit 2015-05-28 10:00:38 -07:00
Kristian Høgsberg Kristensen
5a317ef4cb vk: Initialize dynamic state binding points to NULL
We rely on these being initialized to NULL so meta can reliably detect
whether or not they've been set. ds_state is also allowed to not be
present so we need a well-defined value for that.
2015-05-27 22:13:48 -07:00
Chad Versace
1435bf4bc4 .gitignore: Ignore spirv2nir binary 2015-05-27 17:01:09 -07:00
Chad Versace
f559fe9134 .gitignore: Scope Vulkan's generated source files
Don't ignore any file named entrypoints.{c,h}. Ignore it only if it's in
src/vulkan.
2015-05-27 16:59:53 -07:00
Chad Versace
ca385dcf2a vk: gitignore generated source files 2015-05-27 16:57:31 -07:00
Chad Versace
466f61e9f6 vk/glsl_scraper: Replace adhoc arg parsing with argparse 2015-05-27 16:56:02 -07:00
Chad Versace
fab9011c44 vk/image: Assert that VkImageTiling is valid 2015-05-27 16:21:04 -07:00
Chad Versace
c0739043b3 vk/image: Remove trailing whitespace 2015-05-27 16:15:47 -07:00
Chad Versace
4514e63893 vk/glsl: Reject invalid options
The script incorrectly interpreted --blah as the input filename.
2015-05-27 16:14:26 -07:00
Chad Versace
fd8b5e0df2 vk/glsl_scraper: Indent large text blocks
Indent them to the same level as if the text was code.

No changes in entrypoints.{c,h} after a clean build.
2015-05-27 16:09:31 -07:00
Chad Versace
df4b02f4ed vk/glsl_scraper: Fix code style for imports
Python style is one module imported per line, and imports are at the top
of the file.
2015-05-27 16:04:12 -07:00
Jason Ekstrand
b23885857f vk/meta: Actually create the CB state for blits 2015-05-27 12:06:30 -07:00
Jason Ekstrand
da8f148203 vk: Rework anv_batch and use chaining batch buffers
This mega-commit primarily does two things.  First, is to turn anv_batch
into a better abstraction of a batch.  Instead of actually having a BO, it
now has a few pointers to some piece of memory that are used to add data to
the "batch".  If it gets to the end, there is a function pointer that it
can call to attempt to grow the batch.

The second change is to start using chained batch buffers.  When the end of
the current batch BO is reached, it automatically creates a new one and
inserts an MI_BATCH_BUFFER_START command to chain to it.  In this way, our
batch buffers are effectively infinite in length.
2015-05-27 11:48:28 -07:00
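A minimal sketch of the growable-batch idea described above; the struct layout and the grow callback signature are hypothetical, not the actual anv_batch definition:

  #include <stddef.h>
  #include <string.h>

  struct batch {
     char *start, *next, *end;                   /* current window of memory */
     char *(*grow)(struct batch *b, size_t len); /* e.g. allocate a fresh BO and
                                                    chain to it with
                                                    MI_BATCH_BUFFER_START */
  };

  /* Copy 'len' bytes of commands into the batch, growing it when the current
   * window runs out of space. */
  static void *batch_emit(struct batch *b, const void *dwords, size_t len)
  {
     if (b->next + len > b->end)
        b->next = b->grow(b, len);

     void *p = b->next;
     memcpy(p, dwords, len);
     b->next += len;
     return p;
  }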
Jason Ekstrand
59def43fc8 Fixup for growable reloc lists 2015-05-27 11:48:28 -07:00
Jason Ekstrand
1c63575de8 vk/cmd_buffer: Allocate the surface_bo from device->batch_bo_pool 2015-05-27 11:48:28 -07:00
Jason Ekstrand
403266be05 vk/device: Make reloc lists growable 2015-05-27 11:48:28 -07:00
Jason Ekstrand
5ef81f0a05 vk/device: Use a bo pool for batch buffers 2015-05-27 11:48:28 -07:00
Jason Ekstrand
6f3e3c715a vk/allocator: Add a BO pool 2015-05-27 11:48:28 -07:00
Jason Ekstrand
59328bac10 vk/allocator: Add a free list that acts on pointers instead of offsets 2015-05-27 11:48:28 -07:00
Kristian Høgsberg
a1d30f867d vk: Add support for dynamic and pipeline color blend state 2015-05-26 17:12:37 -07:00
Kristian Høgsberg
2514ac5547 vk/test: Create and use color/blend dynamic and pipeline state 2015-05-26 17:12:37 -07:00
Kristian Høgsberg
1cd8437b9d vk/meta: Allocate and set color/blend state
For color blend, we have to set our own state to avoid inheriting bogus
blend state.
2015-05-26 17:12:37 -07:00
Kristian Høgsberg
610e6291da vk: Allocate samplers from dynamic stream 2015-05-26 11:50:34 -07:00
Kristian Høgsberg
b29f44218d vk: Emit color calc state
This involves pulling stencil ref values out of DS dynamic state and the
blend constant out of CB dynamic state.
2015-05-26 11:27:31 -07:00
Kristian Høgsberg
5e637c5d5a vk/pack: Generate length macros for structs 2015-05-26 11:27:31 -07:00
Kristian Høgsberg
998837764f vk: Program depth bias
This makes 3DSTATE_RASTER a split state command.
2015-05-26 11:27:31 -07:00
Kristian Høgsberg
0dbed616af vk: Add support for texture component swizzle
This also drops the shared create_surface_state helper and moves filling
out SURFACE_STATE directly into anv_image_view_init() and
anv_color_attachment_view_init().
2015-05-26 11:27:29 -07:00
Kristian Høgsberg
cbe7ed416e vk: Implement dynamic and pipeline ds state 2015-05-25 20:20:31 -07:00
Kristian Høgsberg
37743f90bc vk: Set up depth and stencil buffers 2015-05-25 20:20:31 -07:00
Kristian Høgsberg
7c0d0021eb vk/test: Add new depth-stencil test
Not yet a depth stencil test, but will become one.
2015-05-25 20:20:31 -07:00
Kristian Høgsberg
0997a7b2e3 vk: Add basic MOCS settings
This matches what we do for GL.
2015-05-25 20:20:31 -07:00
Kristian Høgsberg
c03314bdd3 vk: Update to header files with nested struct support
This will let us do MOCS settings right.
2015-05-25 20:20:31 -07:00
Jason Ekstrand
ae8c93e023 vk/cmd_buffer: Initialize the pipeline pointer to NULL
If a meta operation is called before the pipeline is set, this can cause
uses of undefined values.  They *should* be harmless, but we might as well
shut up valgrind on this one too.
2015-05-25 17:14:49 -07:00
Jason Ekstrand
912944e59d vk/device: Use the correct number of viewports when creating default VP state
Fixes valgrind uninitialized value errors
2015-05-25 17:14:49 -07:00
Jason Ekstrand
1b211feb6c vk/compiler: Zero out the vs_prog_data struct when VS is disabled
Prevents uninitialized value errors
2015-05-25 17:14:49 -07:00
Jason Ekstrand
903bd4b056 vk/compiler: Fix up the binding hack and make it work in NIR 2015-05-25 12:57:32 -07:00
Jason Ekstrand
57153da2d5 vk: Actually implement some sort of destructor for all object types 2015-05-22 15:15:08 -07:00
Jason Ekstrand
0f0b5aecb8 vk/pipeline: Track VB's that are actually used by the pipeline
Previously, we just blasted out whatever VB's we had marked as "dirty"
regardless of which ones were used by the pipeline.  Given that the stride
of the VB is embedded in the pipeline this can cause problems.  One problem
is if the pipeline doesn't use the given VB binding we emit a bogus stride.
Another problem is that we weren't properly resetting the dirty bits when
the pipeline changed.
2015-05-21 16:58:53 -07:00
Jason Ekstrand
0a54751910 vk/device: Memset descriptor sets to 0 and handle descriptor set holes 2015-05-21 16:33:04 -07:00
Jason Ekstrand
519fe765e2 vk: Do relocations in surface states when they are created
Previously, we waited until later and did a pass through the used surfaces
and did the relocations then.  This led to doing double-relocations which
was causing us to get bogus surface offsets.
2015-05-21 15:55:29 -07:00
Jason Ekstrand
ccf2bf9b99 vk/test: Use the glsl_scraper for building shaders 2015-05-21 12:24:02 -07:00
Jason Ekstrand
f3d70e4165 vk/glsl_scraper: Use the LunarG back-door for GLSL source 2015-05-21 12:22:44 -07:00
Jason Ekstrand
cb56372eeb vk/glsl_scraper: Use a fake GLSL version that glslang will accept 2015-05-21 12:21:02 -07:00
Jason Ekstrand
0e441cde71 vk: Bake the GLSL_VK_SHADER macro into the scraper output file 2015-05-21 12:21:00 -07:00
Jason Ekstrand
f17e835c26 vk/meta: Use glsl_scraper for our GLSL source
We are not yet using SPIR-V for meta but this is a first step.
2015-05-21 11:39:54 -07:00
Jason Ekstrand
b13c0f469b vk: More out-of-tree build fixes 2015-05-21 11:32:59 -07:00
Jason Ekstrand
f294154e42 vk: Fix for out-of-tree builds 2015-05-21 10:23:18 -07:00
Kristian Høgsberg
f9e66ea621 vk: Remove render pass stub call
This isn't really a stub.
2015-05-20 20:34:52 -07:00
Kristian Høgsberg
a29df71dd2 vk: Add WSI implementation 2015-05-20 20:34:52 -07:00
Kristian Høgsberg
f886647b75 vk: Add debug stubs 2015-05-20 20:34:52 -07:00
Kristian Høgsberg
63da974529 vk: Mark remaining unsupported formats as such 2015-05-20 20:34:52 -07:00
Kristian Høgsberg
387a1bb58f vk: Mark VK_FORMAT_UNDEFINED as 1 cpp, 1 channel 2015-05-20 20:34:52 -07:00
Kristian Høgsberg
a1bd426393 vk: Stream surface state instead of using the surface pool
Since the binding table pointer is only 16 bits, we can only have 64 KB
of binding table state allocated at any given time. With a block size of
1 KB, that amounts to just 64 command buffers, which is not enough.
2015-05-20 20:34:52 -07:00
Kristian Høgsberg
01504057f5 vk: Use surface_format_info from dri driver for vkGetFormatInfo 2015-05-20 20:34:52 -07:00
Chad Versace
a61f307996 vk: Fix result of vkCreateInstance
When fill_physical_device() fails, don't return VK_SUCCESS.
2015-05-20 19:51:10 -07:00
Jason Ekstrand
14929046ba vk/compiler: Add shader language detection
This commit adds support for the LunarG GLSL back-door as well as detecting
regular GLSL and SPIR-V.  The SPIR-V path doesn't exist yet, so that will
cause an assert-fail.
2015-05-20 17:05:41 -07:00
Jason Ekstrand
47c1cf5ce6 vk/test: Add a test for testing buffer copies 2015-05-20 16:20:04 -07:00
Jason Ekstrand
bea66ac5ad vk/meta: Add support for copying arbitrary size buffers 2015-05-20 16:20:04 -07:00
Jason Ekstrand
9557b85e3d vk/meta: Use the biggest format possible for buffer copies
This should substantially improve throughput of buffer copies.
2015-05-20 16:20:04 -07:00
Jason Ekstrand
13719e9225 vk/meta: Fix buffer copy extents 2015-05-20 16:20:04 -07:00
Jason Ekstrand
d7044a19b1 vk/meta: Use texture() instead of texture2D() 2015-05-19 12:44:35 -07:00
Jason Ekstrand
edff076188 vk: Use binding instead of index in uniform layout qualifiers
This more closely matches what the Vulkan docs say to do.
2015-05-19 12:44:22 -07:00
Jason Ekstrand
e37a89136f vk/glsl_scraper: Add a --glsl-only option 2015-05-19 11:29:07 -07:00
Jason Ekstrand
4bcf58a192 vk/glsl_scraper: Use the line number from the end of the macro
We used to use the line number from the start of the macro, but this doesn't
seem to match the C preprocessor.
2015-05-19 11:29:07 -07:00
Jason Ekstrand
1573913194 vk/glsl_scraper: Don't open files until needed
This prevents us from writing an empty file when the compile failed.
2015-05-19 11:29:07 -07:00
Kristian Høgsberg
e4c11f50b5 vk: Call finish for binding table state stream 2015-05-18 21:12:13 -07:00
Jason Ekstrand
851495d344 vk/meta: Use the new *view_init functions and stack-allocated views
This should save us a good deal of the leakage that meta currently has.
2015-05-18 20:57:43 -07:00
Jason Ekstrand
4668bbb161 vk/image: Factor view creation out into separate *_init functions
The *_init functions work basically the same as the Vulkan entrypoints
except that they act on an already-created view and take an optional
command buffer argument.  If a command buffer is given, the surface state is
allocated out of the command buffer's state stream.
2015-05-18 20:57:43 -07:00
Jason Ekstrand
7c9f209427 Revert "vk/allocator: Don't use memfd when valgrind is detected"
This reverts commit b6ab076d6b.

It turns out setting USE_MEMFD to 0 is really bad because it means we can't
resize the pool.  Besides, valgrind SVN handles memfd so we really don't
need this fallback for valgrind anymore.
2015-05-18 20:57:43 -07:00
Jason Ekstrand
923691c70d vk: Use a separate block pool and state stream for binding tables
The binding table pointers packet only allows for a 16-bit binding table
address so all binding tables have to be in the first 64 KB of the surface
state BO.  We solve this by adding a slave block pool that pulls off the
first 64 KB worth of blocks and reserves them for binding tables.
2015-05-18 20:57:43 -07:00
Jason Ekstrand
d24f8245db vk/allocator: Add a concept of a slave block pool
We probably need a better name but this will do for now.
2015-05-18 20:57:43 -07:00
Kristian Høgsberg
997596e4c4 vk/test: Add test that prints format features 2015-05-18 20:52:44 -07:00
Kristian Høgsberg
241b59cba0 vk/test: Test timestamps and occlusion queries 2015-05-18 20:52:44 -07:00
Kristian Høgsberg
ae9ac47c74 vk: Make timestamp command work correctly
This was using the wrong timestamp register and needs to write a 64 bit
value.
2015-05-18 20:52:43 -07:00
Kristian Høgsberg
82ddab4b18 vk: Make occlusion query work, both copy and get functions 2015-05-18 20:52:43 -07:00
Kristian Høgsberg
1d40e6ade8 vk: Update generated header files
This fixes a problem where register addresses were incorrectly shifted.
2015-05-18 20:52:43 -07:00
Kristian Høgsberg
f330bad545 vk: Only fill render targets for meta clear
Clear inherits the render targets from the current render pass. This
means we need to fill out the binding table after switching to meta
bindings. However, meta copies etc happen outside a render pass and
break when we try to fill in the render targets. This change fills the
render targets only for meta clear.
2015-05-18 20:52:43 -07:00
Jason Ekstrand
b6c7d8c911 vk/pipeline: Use a state_stream for storing programs
Previously, we were effectively using a state_stream, it was just
hand-rolled based on a block pool.  Now we actually use the data structure.
2015-05-18 15:58:20 -07:00
Jason Ekstrand
4063b7deb8 vk/allocator: Add support for valgrind tracking of state pools and streams
We leave the block pool untracked so that reads/writes to freed blocks
will get caught and do the tracking at the state pool/stream level.  We
have to do a few extra gymnastics for streams because valgrind works in
terms of pointers and we work in terms of separate map and offset.
Fortunately, the users of the state pool and stream should always be using
the map pointer provided in the anv_state structure.  We just have to
track, per block, the map that was used when we initially got the block.
Then we can make sure we always use that map and valgrind should stay
happy.
2015-05-18 15:58:20 -07:00
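For illustration, a sketch of the per-state valgrind annotations described above, using the standard memcheck client requests; the wrapper names are hypothetical and this assumes the valgrind headers are installed:

  #include <stdint.h>
  #include <valgrind/memcheck.h>

  /* The block pool itself stays untracked; each state allocated out of it is
   * reported to valgrind as if it were its own heap block, using the map
   * pointer the caller will actually dereference. */
  static void track_state_alloc(void *block_map, uint32_t offset, uint32_t size)
  {
     VALGRIND_MALLOCLIKE_BLOCK((char *)block_map + offset, size,
                               /*rzB=*/0, /*is_zeroed=*/0);
  }

  static void track_state_free(void *block_map, uint32_t offset)
  {
     VALGRIND_FREELIKE_BLOCK((char *)block_map + offset, /*rzB=*/0);
  }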
Jason Ekstrand
b6ab076d6b vk/allocator: Don't use memfd when valgrind is detected 2015-05-18 15:58:20 -07:00
Jason Ekstrand
682d11a6e8 vk/allocator: Assert that block_pool_grow succeeds 2015-05-18 15:48:19 -07:00
Jason Ekstrand
28804fb9e4 vk/gem: VG_CLEAR the padding for the gem_mmap struct 2015-05-18 12:05:17 -07:00
Jason Ekstrand
8440b13f55 vk/meta: Rework the indentation style
No functional change.
2015-05-18 10:43:51 -07:00
Kristian Høgsberg
5286ef7849 vk: Provide more realistic values for device info 2015-05-18 10:27:08 -07:00
Kristian Høgsberg
69fd473321 vk: Use a temporary buffer for formatting in finishme
This is more likely to avoid breaking up the message when racing with
other threads.
2015-05-18 10:27:08 -07:00
Jason Ekstrand
cd7ab6ba4e vk/meta: Add an initial implementation of vkCmdCopyBuffer
Compile-tested only
2015-05-18 10:27:08 -07:00
Jason Ekstrand
c25ce55fd3 vk/meta: Add an initial implementation of vkCmdCopyBufferToImage
Compile-tested only
2015-05-18 10:27:08 -07:00
Jason Ekstrand
08bd554cda vk/meta: Add an initial implementation of vkCmdBlitImage
Compile-tested only
2015-05-18 10:27:08 -07:00
Jason Ekstrand
fb27d80781 vk/meta: Add an initial implementation of vkCmdCopyImage
Compile-tested only
2015-05-18 10:27:08 -07:00
Jason Ekstrand
c15f3834e3 vk/gem: Set the gem_mmap.flags parameter to 0 if it exists 2015-05-18 10:27:08 -07:00
Jason Ekstrand
f7b0f922be vk/gem: Only VK_CLEAR the addr_ptr in gen_mmap 2015-05-18 10:27:07 -07:00
Kristian Høgsberg
ca7e62d421 vk: Add a logger wrapper for the generated entrypoint 2015-05-18 10:27:07 -07:00
Kristian Høgsberg
eb92745b2e vk/gem: Just return -1 from anv_gem_wait() on error
We were returning -errno, unlike all the other gem functions.
2015-05-18 10:27:07 -07:00
Kristian Høgsberg
05754549e8 vk: Fix vkGetObjectInfo return values
We weren't properly returning the allocation count.
2015-05-18 10:27:07 -07:00
Kristian Høgsberg
6afb26452b vk: Implement fences
This basic implementation uses a throw-away bo for synchronization.
2015-05-18 10:27:07 -07:00
Kristian Høgsberg
e26a7ffbd9 vk/meta: Use anv_* internal entrypoints 2015-05-18 10:27:07 -07:00
Kristian Høgsberg
b7fac7a7d1 vk: Implement allocation count query 2015-05-18 10:27:07 -07:00
Kristian Høgsberg
783e6217fc vk: Change pData/pDataSize semantics
We now always copy the entire struct unless pData is NULL and
unconditionally write back the struct size. It's not clear this is
useful if the structs may grow over time, but it seems to be the
expected behaviour for now.
2015-05-18 10:27:07 -07:00
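A small sketch of the convention described above (a hypothetical helper, not the driver's code): the struct size is always written back, and the data is copied only when pData is non-NULL. The VkResult/VK_SUCCESS stand-ins keep the sketch self-contained without vulkan.h:

  #include <stddef.h>
  #include <string.h>

  #define VK_SUCCESS 0
  typedef int VkResult;   /* stand-in so the sketch compiles without vulkan.h */

  static VkResult return_info(void *pData, size_t *pDataSize,
                              const void *info, size_t info_size)
  {
     /* Unconditionally report how big the struct is... */
     *pDataSize = info_size;

     /* ...and copy the whole thing only when the caller provided storage. */
     if (pData != NULL)
        memcpy(pData, info, info_size);

     return VK_SUCCESS;
  }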
Kristian Høgsberg
b4b3bd1c51 vk: Return VK_SUCCESS from vkAllocDescriptorSets
This should've been returning VK_SUCCESS all along.
2015-05-18 10:27:07 -07:00
Kristian Høgsberg
a9f2115486 vk: Return VK_SUCCESS for all descriptor pool entry points 2015-05-18 10:27:07 -07:00
Kristian Høgsberg
60ebcbed54 vk: Start Implementing vkGetFormatInfo()
We move the format table and vkGetFormatInfo to their own file in the
process.
2015-05-18 10:27:07 -07:00
Kristian Høgsberg
454345da1e vk: Add script for generating ifunc entry points
This lets us generate a hash table for vkGetProcAddress and lets us call
public functions internally without the public entrypoint overhead.
2015-05-18 10:27:02 -07:00
Kristian Høgsberg
333bcc2072 vk: Fix vulkan header inconsistency
The function pointer typedef and the function prototype for
vkCmdClearColorImage() didn't agree. Fix the typedef to match the
prototype.
2015-05-17 21:08:31 -07:00
Kristian Høgsberg
b9eb56a404 vk: Add function pointer typedef for intel extension
Also guard function prototype by VK_PROTOTYPES.
2015-05-17 21:08:30 -07:00
Kristian Høgsberg
75cb85c56a vk: Add missing VKAPI for vkQueueRemoveMemReferences 2015-05-17 21:08:30 -07:00
Jason Ekstrand
a924ea0c75 Merge remote-tracking branch 'fdo-personal/wip/nir-vtn' into vulkan
This adds the SPIR-V -> NIR translator.
2015-05-16 12:43:16 -07:00
Jason Ekstrand
a63952510d nir/spirv: Don't assert that the current block is empty
It's possible that someone will give us SPIR-V code in which someone
needlessly branches to new blocks.  We should handle that ok now.
2015-05-16 12:34:34 -07:00
Jason Ekstrand
4e44dcc312 nir/spirv: Add initial support for samplers 2015-05-16 12:34:15 -07:00
Jason Ekstrand
d6f52dfb3e nir/spirv: Move Exp and Log to the list of currently unhandled ALU ops
NIR doesn't have the native opcodes for them anymore
2015-05-16 12:33:32 -07:00
Jason Ekstrand
a53e795524 nir/types: Add support for sampler types 2015-05-16 12:32:58 -07:00
Jason Ekstrand
0fa9211d7f nir/spirv: Make the global constants in spirv.h static
I've been promised in a bug that this will be fixed in a future version of
the header.  However, in the interest of my branch building, I'm adding
these changes in myself for the moment.
2015-05-16 11:16:34 -07:00
Jason Ekstrand
036a4b1855 nir/spirv: Handle jump-to-loop in a more general way 2015-05-16 11:16:34 -07:00
Jason Ekstrand
56f533b3a0 nir/spirv: Handle boolean uniforms correctly 2015-05-16 11:16:34 -07:00
Jason Ekstrand
64bc58a88e nir/spirv: Handle control-flow with loops 2015-05-16 11:16:34 -07:00
Jason Ekstrand
3a2db9207d nir/spirv: Set a name on temporary variables 2015-05-16 11:16:34 -07:00
Jason Ekstrand
a28f8ad9f1 nir/spirv: Use the correct length for copying string literals 2015-05-16 11:16:34 -07:00
Jason Ekstrand
7b9c29e440 nir/spirv: Make vtn_ssa_value handle constants as well as ssa values 2015-05-16 11:16:33 -07:00
Jason Ekstrand
b0d1854efc nir/spirv: Add initial support for GLSL 4.50 builtins 2015-05-16 11:16:33 -07:00
Jason Ekstrand
1da9876486 nir/spirv: Split the core datastructures into a header file 2015-05-16 11:16:33 -07:00
Jason Ekstrand
98d78856f6 nir/spirv: Use the builder for all instructions
We don't actually use it to create all the instructions, but we do always
use it for insertion.  This should make things far more consistent for
implementing extended instructions.
2015-05-16 11:16:33 -07:00
Jason Ekstrand
ff828749ea nir/spirv: Add support for a bunch of ALU operations 2015-05-16 11:16:33 -07:00
Jason Ekstrand
d2a7972557 nir/spirv: Add support for indirect array accesses 2015-05-16 11:16:33 -07:00
Jason Ekstrand
683c99908a nir/spirv: Explicitly type constants and SSA values 2015-05-16 11:16:33 -07:00
Jason Ekstrand
c5650148a9 nir/spirv: Handle OpBranchConditional
We do control-flow handling as a two-step process.  The first step is to
walk the instructions list and record various information about blocks and
functions.  This is where the actual nir_function_overload objects get
created.  We also record the start/stop instruction for each block.  Then
a second pass walks over each of the functions and over the blocks in each
function in a way that's NIR-friendly and actually parses the instructions.
2015-05-16 11:16:33 -07:00
Jason Ekstrand
ebc152e4c9 nir/spirv: Add a helper for getting a value as an SSA value 2015-05-16 11:16:33 -07:00
Jason Ekstrand
f23afc549b nir/spirv: Split instruction handling into preamble and body sections 2015-05-16 11:16:33 -07:00
Jason Ekstrand
ae6d32c635 nir/spirv: Implement load/store instructions 2015-05-16 11:16:33 -07:00
Jason Ekstrand
88f6fbc897 nir: Add a helper for getting the tail of a deref chain 2015-05-16 11:16:33 -07:00
Jason Ekstrand
06acd174f3 nir/spirv: Actually add variables to the function or shader 2015-05-16 11:16:33 -07:00
Jason Ekstrand
5045efa4aa nir/spirv: Add a vtn_untyped_value helper 2015-05-16 11:16:33 -07:00
Jason Ekstrand
01f3aa9c51 nir/spirv: Use vtn_value in the types code and fix an off-by-one error 2015-05-16 11:16:33 -07:00
Jason Ekstrand
6ff0830d64 nir/types: Add an is_vector_or_scalar helper 2015-05-16 11:16:33 -07:00
Jason Ekstrand
5acd472271 nir/spirv: Add support for deref chains 2015-05-16 11:16:33 -07:00
Jason Ekstrand
7182597e50 nir/types: Add a scalar type constructor 2015-05-16 11:16:32 -07:00
Jason Ekstrand
eccd798cc2 nir/spirv: Add support for OpLabel 2015-05-16 11:16:32 -07:00
Jason Ekstrand
a6cb9d9222 nir/spirv: Add support for declaring functions 2015-05-16 11:16:32 -07:00
Jason Ekstrand
8ee23dab04 nir/types: Add accessors for function parameter/return types 2015-05-16 11:16:32 -07:00
Jason Ekstrand
707b706d18 nir/spirv: Add support for declaring variables
Deref chains and variable load/store operations are still missing.
2015-05-16 11:16:32 -07:00
Jason Ekstrand
b2db85d8e4 nir/spirv: Add support for constants 2015-05-16 11:16:32 -07:00
Jason Ekstrand
3f83579664 nir/spirv: Add basic support for types 2015-05-16 11:16:32 -07:00
Jason Ekstrand
e9d3b1e694 nir/types: Add more helpers for creating types 2015-05-16 11:16:32 -07:00
Jason Ekstrand
fe550f0738 glsl/types: Expose the function_param and struct_field structs to C
Previously, they were hidden behind a #ifdef __cplusplus so C wouldn't find
them.  This commit simply moves the ifdef.
2015-05-16 11:16:32 -07:00
Jason Ekstrand
053778c493 glsl/types: Add support for function types 2015-05-16 11:16:32 -07:00
Jason Ekstrand
7b63b3de93 glsl: Add GLSL_TYPE_FUNCTION to the base types enums 2015-05-16 11:16:32 -07:00
Jason Ekstrand
2b570a49a9 nir/spirv: Rework the way values are added
Instead of having functions to add values and set various things, we just
have a function that does a few asserts and then returns the value.  The
caller is then responsible for setting the various fields.
2015-05-16 11:16:32 -07:00
Jason Ekstrand
f9a31ba044 nir/spirv: Add stub support for extension instructions 2015-05-16 11:16:32 -07:00
Jason Ekstrand
4763a13b07 REVERT: Add a simple helper program for testing SPIR-V -> NIR translation 2015-05-16 11:16:32 -07:00
Jason Ekstrand
cae8db6b7e glsl/compiler: Move the error_no_memory stub to standalone_scaffolding.cpp 2015-05-16 11:16:32 -07:00
Jason Ekstrand
98452cd8ae nir: Add the start of a SPIR-V to NIR translator
At the moment, it can handle the very basics of strings and can ignore
debug instructions.  It also has basic support for decorations.
2015-05-16 11:16:32 -07:00
Jason Ekstrand
573ca4a4a7 nir: Import the revision 30 SPIR-V header from Khronos 2015-05-16 11:16:31 -07:00
Jason Ekstrand
057bef8a84 vk/device: Use bias rather than layers for computing binding table size
Because we statically use the first 8 binding table entries for render
targets, we need to create a table of size 8 + surfaces.
2015-05-16 10:42:53 -07:00
Jason Ekstrand
22e61c9da4 vk/meta: Make clear a no-op if no layers need clearing
Among other things, this prevents recursive meta.
2015-05-16 10:30:05 -07:00
Jason Ekstrand
120394ac92 vk/meta: Save and restore the old bindings pointer
If we don't do this then recursive meta is completely broken.  What happens
is that the outer meta call may change the bindings pointer and the inner
meta call will change it again and, when it exits set it back to the
default.  However, the outer meta call may be relying on it being left
alone so it uses the non-meta descriptor sets instead of its own.
2015-05-16 10:28:04 -07:00
Jason Ekstrand
4223de769e vk/device: Simplify surface_count calculation 2015-05-16 10:23:09 -07:00
Jason Ekstrand
eb1952592e vk/glsl_helpers: Fix GLSL_VK_SHADER with respect to commas
Previously, the GLSL_VK_SHADER macro didn't work if the shader contained
commas outside of parentheses due to the way the C preprocessor works.
This commit fixes this by making it variadic again and doing it correctly
this time.
2015-05-15 22:17:07 -07:00
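A self-contained sketch of the variadic trick described above; the macro name and the GLSL body here are made up, not the driver's GLSL_VK_SHADER definition:

  #include <stdio.h>

  /* __VA_ARGS__ swallows every top-level comma in the shader source, and
   * stringifying it turns the whole body back into one string literal. */
  #define SHADER_SRC(...) #__VA_ARGS__

  static const char vs_src[] = SHADER_SRC(
     layout(location = 0) in vec4 a_position;
     void main() { gl_Position = a_position; }
  );

  int main(void)
  {
     puts(vs_src);
     return 0;
  }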
Kristian Høgsberg
3b9f32e893 vk: Make cmd_buffer->bindings a pointer
This lets us save and restore efficiently by just moving the pointer to
a temporary bindings struct for meta.
2015-05-15 18:12:07 -07:00
Kristian Høgsberg
9540130c41 vk: Move vertex buffers into struct anv_bindings 2015-05-15 16:34:31 -07:00
Kristian Høgsberg
0cfc493775 vk: Fix GLSL_VK_SHADER macro
Stringify doesn't work with __VA_ARGS__. The last macro argument swallows
up excess arguments and as such we can just stringify that.
2015-05-15 16:15:04 -07:00
Kristian Høgsberg
af45f4a558 vk: Fix warning from missing initializer
Struct initializers need to be { 0, } to zero out the variable they're
initializing.
2015-05-15 16:07:17 -07:00
Kristian Høgsberg
bf096c9ec3 vk: Build binding tables at bind descriptor time
This changes the way descriptor sets and layouts work so that we fill
out binding table contents at the time we bind descriptor sets. We
manipulate the binding table contents and sampler state in a shadow-copy
in anv_cmd_buffer. At draw time, we allocate the actual binding table
and sampler state and flush the anv_cmd_buffer copies.
2015-05-15 16:05:31 -07:00
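A rough sketch of the shadow-copy scheme described above, with hypothetical names and sizes; the real bookkeeping in anv_cmd_buffer is more involved:

  #include <stdbool.h>
  #include <stdint.h>

  #define MAX_ENTRIES 256

  /* CPU-side copy, updated when descriptor sets are bound. */
  struct shadow_bindings {
     uint32_t surface_state_offsets[MAX_ENTRIES];
     bool dirty;
  };

  /* At draw time, allocate the real binding table from the surface-state
   * stream and flush the shadow copy into it. */
  static void flush_binding_table(struct shadow_bindings *shadow,
                                  uint32_t *table, int count)
  {
     if (!shadow->dirty)
        return;

     for (int i = 0; i < count; i++)
        table[i] = shadow->surface_state_offsets[i];

     shadow->dirty = false;
  }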
Kristian Høgsberg
1f6c220b45 vk: Update the bind map length to reflect MAX_SETS 2015-05-15 15:22:29 -07:00
Kristian Høgsberg
b806e80e66 vk: Flip back to using memfd for the allocators 2015-05-15 15:22:29 -07:00
Kristian Høgsberg
0a775e1eab vk: Rename dyn_state_pool to dynamic_state_pool
Given that we already tolerate surface_state_pool and the even longer
instruction_state_pool, there's no reason to arbitrarily abbreviate
dynamic.
2015-05-15 15:22:29 -07:00
Kristian Høgsberg
f5b0f1351f vk: Consolidate image, buffer and color attachment views
These are all just surface state, offset and a bo.
2015-05-15 15:22:29 -07:00
Jason Ekstrand
41db8db0f2 vk: Add a GLSL scraper utility
This new utility, glsl_scraper.py, scrapes C files for instances of the
GLSL_VK_SHADER macro, pulls out the shader source, and compiles it to
SPIR-V.  The compilation is done using glslangValidator.  The result is then
placed into another C file as arrays of dwords that can be easily handed
to a Vulkan driver.
2015-05-14 19:18:57 -07:00
Jason Ekstrand
79ace6def6 vk/meta: Add a magic GLSL shader source macro 2015-05-14 19:07:34 -07:00
Jason Ekstrand
018a0c1741 vk/meta: Add a better comment about the VS for blits 2015-05-14 11:39:32 -07:00
Jason Ekstrand
8c92701a69 vk/test: Use VK_IMAGE_TILING_OPTIMAL for the render target 2015-05-13 22:27:38 -07:00
Jason Ekstrand
4fb8bddc58 vk/test: Do a copy of the RT into a linear buffer and write that to a PNG 2015-05-13 22:23:30 -07:00
Jason Ekstrand
bd5b76d6d0 vk/meta: Add the start of a blit implementation
Currently, we only implement CopyImageToBuffer
2015-05-13 22:23:30 -07:00
Jason Ekstrand
94b8c0b810 vk/pipeline: Default to a SamplerCount of 1 for PS 2015-05-13 22:23:30 -07:00
Jason Ekstrand
d3d4776202 vk/pipeline: Add an extra flag for force-disabling the vertex shader
This way we can pass in a vertex shader and yet have the pipeline emit an
empty 3DSTATE_VS packet.  We need this for meta because we need to trick
the compiler into not deleting our inputs but at the same time disable the
VS so that we can use a rectlist.  This should go away once we actually get
SPIR-V.
2015-05-13 22:23:30 -07:00
Jason Ekstrand
a1309c5255 vk/pass: Emit a flushing pipe control at the end of the pass
This is rather crude but it at least makes sure that all the render targets
get flushed at the end of the pass.  We probably actually want to do
something based on image layout transitions, but this will work for now.
2015-05-13 22:23:30 -07:00
Jason Ekstrand
07943656a7 vk/compiler: Set the binding table texture_start
This is by no means a complete solution to the binding table problems.
However, it does make texturing actually work.  Before, we were texturing
from the render target since they were both starting at 0.
2015-05-13 22:23:30 -07:00
Jason Ekstrand
cd197181f2 vk/compiler: Zero the prog data
We use prog_data[stage] != NULL to determine whether or not we need to
clean up that stage.  Make sure it defaults to NULL.
2015-05-13 22:22:59 -07:00
Jason Ekstrand
1f7dcf9d75 vk/image: Stash more information in images and views 2015-05-13 22:22:59 -07:00
Jason Ekstrand
43126388cd vk/meta: Save/restore more stuff in cmd_buffer_restore 2015-05-13 22:22:59 -07:00
Chad Versace
50806e8dec vk: Install headers
I need this for building a testsuite.
2015-05-13 17:49:26 -07:00
Kristian Høgsberg
83c7e1f1db vk: Add support for sampler descriptors 2015-05-13 14:47:11 -07:00
Kristian Høgsberg
4f9eaf77a5 vk: Use a typesafe anv_descriptor struct 2015-05-13 14:47:11 -07:00
Kristian Høgsberg
5c9d77600b vk: Create and bind a sampler in vk.c 2015-05-13 14:47:11 -07:00
Kristian Høgsberg
18acfa7301 vk: Fix copy-n-paste sType in vkCreateSampler 2015-05-13 14:47:11 -07:00
Kristian Høgsberg
a1ec789b0b vk: Add a dynamic state stream to anv_cmd_buffer
We'll need this for sampler state.
2015-05-13 14:47:11 -07:00
Kristian Høgsberg
3f52c016fa vk: Move struct anv_sampler to private.h 2015-05-13 14:47:11 -07:00
Kristian Høgsberg
a77229c979 vk: Allocate layout->count number of descriptors
layout->count is the number of descriptors the application
requested. layout->total is the number of entries we need across all
stages.
2015-05-13 14:47:11 -07:00
Kristian Høgsberg
a3fd136509 vk: Fill out sampler state from API values 2015-05-13 14:47:11 -07:00
Chad Versace
828817b88f vk: Ignore vk executable 2015-05-13 12:05:38 -07:00
Kristian Høgsberg
2b7a060178 vk: Fix stale error handling in vkQueueSubmit 2015-05-12 14:38:58 -07:00
Kristian Høgsberg
cb986ef597 vk: Submit all cmd buffers passed to vkQueueSubmit 2015-05-12 14:38:12 -07:00
Kristian Høgsberg
9905481552 vk: Add generated header for HSW and IVB (GEN75 and GEN7) 2015-05-12 14:29:04 -07:00
Jason Ekstrand
ffe9f60358 vk: Add stub() and stub_return() macros and mark piles of functions as stubs 2015-05-12 13:45:02 -07:00
Jason Ekstrand
d3b374ce59 vk/util: Add a anv_finishme function/macro 2015-05-12 13:43:36 -07:00
Jason Ekstrand
7727720585 vk/meta: Break setting up meta clear state into it's own functin 2015-05-12 13:03:50 -07:00
Jason Ekstrand
4336a1bc00 vk/pipeline: Add support for disabling the scissor in "extra" 2015-05-12 12:53:01 -07:00
Kristian Høgsberg
d77c34d1d2 vk: Add clear load-op for render passes 2015-05-11 23:25:29 -07:00
Kristian Høgsberg
b734e0bcc5 vk: Add support for driver-internal custom pipelines
This lets us disable the viewport, use rect lists and repclear.
2015-05-11 23:25:29 -07:00
Kristian Høgsberg
ad132bbe48 vk: Fix 3DSTATE_VERTEX_BUFFER emission
Set VertexBufferIndex to the attribute binding, not the location.
2015-05-11 23:25:29 -07:00
Kristian Høgsberg
6a895c6681 vk: Add 32 bpc signed and unsigned integer formats 2015-05-11 23:25:29 -07:00
Kristian Høgsberg
55b9b703ea vk: Add anv_batch_emit_merge() helper macro
This lets us emit a state packet by merging two half-baked versions,
typically one from the pipeline object and one from a dynamic state
object.
2015-05-11 23:25:28 -07:00
Kristian Høgsberg
099faa1a2b vk: Store bo pointer in anv_image and anv_buffer
We don't need to point back to the memory object the bo came from.
Pointing directly to a bo lets us bind images and buffers to other
bos - like our allocator bos.
2015-05-11 23:25:28 -07:00
Kristian Høgsberg
4f25f5d86c vk: Support not having a vertex shader
This lets us bypass the vertex shader and pass data straight into
the rasterizer part of the pipeline.
2015-05-11 23:25:28 -07:00
Kristian Høgsberg
20ad071190 vk: Allow NULL as a valid pipeline layout
Vertex buffers and render targets aren't part of the layout so having
an empty layout is pretty common.
2015-05-11 22:12:56 -07:00
Kristian Høgsberg
769785c497 Add vulkan driver for BDW 2015-05-09 11:38:32 -07:00
340 changed files with 58862 additions and 6601 deletions

View File

@@ -1 +1 @@
11.0.0-devel
11.1.0-devel

View File

@@ -99,6 +99,7 @@ AM_PROG_CC_C_O
AM_PROG_AS
AX_CHECK_GNU_MAKE
AC_CHECK_PROGS([PYTHON2], [python2 python])
AC_CHECK_PROGS([PYTHON3], [python3])
AC_PROG_SED
AC_PROG_MKDIR_P
@@ -1517,6 +1518,12 @@ GBM_PC_LIB_PRIV="$DLOPEN_LIBS"
AC_SUBST([GBM_PC_REQ_PRIV])
AC_SUBST([GBM_PC_LIB_PRIV])
AM_CONDITIONAL(HAVE_VULKAN, true)
AC_ARG_VAR([GLSLC], [Path to the glslc executable])
AC_CHECK_PROGS([GLSLC], [glslc])
AC_SUBST([GLSLC])
dnl
dnl EGL configuration
dnl
@@ -2291,6 +2298,13 @@ AC_SUBST([XA_MINOR], $XA_MINOR)
AC_SUBST([XA_TINY], $XA_TINY)
AC_SUBST([XA_VERSION], "$XA_MAJOR.$XA_MINOR.$XA_TINY")
PKG_CHECK_MODULES(VALGRIND, [valgrind],
[have_valgrind=yes], [have_valgrind=no])
if test "x$have_valgrind" = "xyes"; then
AC_DEFINE([HAVE_VALGRIND], 1,
[Use valgrind intrinsics to suppress false warnings])
fi
dnl Restore LDFLAGS and CPPFLAGS
LDFLAGS="$_SAVE_LDFLAGS"
CPPFLAGS="$_SAVE_CPPFLAGS"
@@ -2317,6 +2331,7 @@ AC_CONFIG_FILES([Makefile
src/gallium/auxiliary/Makefile
src/gallium/auxiliary/pipe-loader/Makefile
src/gallium/drivers/freedreno/Makefile
src/gallium/drivers/ddebug/Makefile
src/gallium/drivers/i915/Makefile
src/gallium/drivers/ilo/Makefile
src/gallium/drivers/llvmpipe/Makefile
@@ -2399,6 +2414,9 @@ AC_CONFIG_FILES([Makefile
src/mesa/drivers/osmesa/osmesa.pc
src/mesa/drivers/x11/Makefile
src/mesa/main/tests/Makefile
src/vulkan/Makefile
src/vulkan/anv_icd.json
src/vulkan/tests/Makefile
src/util/Makefile
src/util/tests/hash_table/Makefile])
@@ -2517,6 +2535,7 @@ if test "x$MESA_LLVM" = x1; then
echo ""
fi
echo " PYTHON2: $PYTHON2"
echo " PYTHON3: $PYTHON3"
echo ""
echo " Run '${MAKE-make}' to build Mesa"

View File

@@ -196,7 +196,7 @@ GL 4.5, GLSL 4.50:
GL_ARB_get_texture_sub_image DONE (all drivers)
GL_ARB_shader_texture_image_samples not started
GL_ARB_texture_barrier DONE (nv50, nvc0, r600, radeonsi)
GL_KHR_context_flush_control DONE (all - but needs GLX/EXT extension to be useful)
GL_KHR_context_flush_control DONE (all - but needs GLX/EGL extension to be useful)
GL_KHR_robust_buffer_access_behavior not started
GL_KHR_robustness 90% done (the ARB variant)
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)

docs/relnotes/11.1.0.html Normal file (60 lines)
View File

@@ -0,0 +1,60 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 11.1.0 Release Notes / TBD</h1>
<p>
Mesa 11.1.0 is a new development release.
People who are concerned with stability and reliability should stick
with a previous release or wait for Mesa 11.1.1.
</p>
<p>
Mesa 11.1.0 implements the OpenGL 4.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.1. OpenGL
4.1 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
TBD.
</ul>
<h2>Bug fixes</h2>
TBD.
<h2>Changes</h2>
TBD.
</div>
</body>
</html>

View File

@@ -0,0 +1,90 @@
//
// File: vk_platform.h
//
/*
** Copyright (c) 2014-2015 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
** "Materials"), to deal in the Materials without restriction, including
** without limitation the rights to use, copy, modify, merge, publish,
** distribute, sublicense, and/or sell copies of the Materials, and to
** permit persons to whom the Materials are furnished to do so, subject to
** the following conditions:
**
** The above copyright notice and this permission notice shall be included
** in all copies or substantial portions of the Materials.
**
** THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
** EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
** MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
** IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
** CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
** TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
** MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
*/
#ifndef __VK_PLATFORM_H__
#define __VK_PLATFORM_H__
#ifdef __cplusplus
extern "C"
{
#endif // __cplusplus
/*
***************************************************************************************************
* Platform-specific directives and type declarations
***************************************************************************************************
*/
#if defined(_WIN32)
// On Windows, VKAPI should equate to the __stdcall convention
#define VKAPI __stdcall
#elif defined(__GNUC__)
// On other platforms using GCC, VKAPI stays undefined
#define VKAPI
#else
// Unsupported Platform!
#error "Unsupported OS Platform detected!"
#endif
#include <stddef.h>
#if !defined(VK_NO_STDINT_H)
#if defined(_MSC_VER) && (_MSC_VER < 1600)
typedef signed __int8 int8_t;
typedef unsigned __int8 uint8_t;
typedef signed __int16 int16_t;
typedef unsigned __int16 uint16_t;
typedef signed __int32 int32_t;
typedef unsigned __int32 uint32_t;
typedef signed __int64 int64_t;
typedef unsigned __int64 uint64_t;
#else
#include <stdint.h>
#endif
#endif // !defined(VK_NO_STDINT_H)
typedef uint64_t VkDeviceSize;
typedef uint32_t VkBool32;
typedef uint32_t VkSampleMask;
typedef uint32_t VkFlags;
#if (UINTPTR_MAX >= UINT64_MAX)
#define VK_UINTPTRLEAST64_MAX UINTPTR_MAX
typedef uintptr_t VkUintPtrLeast64;
#else
#define VK_UINTPTRLEAST64_MAX UINT64_MAX
typedef uint64_t VkUintPtrLeast64;
#endif
#ifdef __cplusplus
} // extern "C"
#endif // __cplusplus
#endif // __VK_PLATFORM_H__

View File

@@ -0,0 +1,249 @@
//
// File: vk_wsi_device_swapchain.h
//
/*
** Copyright (c) 2014 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
** "Materials"), to deal in the Materials without restriction, including
** without limitation the rights to use, copy, modify, merge, publish,
** distribute, sublicense, and/or sell copies of the Materials, and to
** permit persons to whom the Materials are furnished to do so, subject to
** the following conditions:
**
** The above copyright notice and this permission notice shall be included
** in all copies or substantial portions of the Materials.
**
** THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
** EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
** MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
** IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
** CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
** TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
** MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
*/
#ifndef __VK_WSI_DEVICE_SWAPCHAIN_H__
#define __VK_WSI_DEVICE_SWAPCHAIN_H__
#include "vulkan.h"
#define VK_WSI_DEVICE_SWAPCHAIN_REVISION 40
#define VK_WSI_DEVICE_SWAPCHAIN_EXTENSION_NUMBER 2
#define VK_WSI_DEVICE_SWAPCHAIN_EXTENSION_NAME "VK_WSI_device_swapchain"
#ifdef __cplusplus
extern "C"
{
#endif // __cplusplus
// ------------------------------------------------------------------------------------------------
// Objects
VK_DEFINE_NONDISP_HANDLE(VkSwapChainWSI);
// ------------------------------------------------------------------------------------------------
// Enumeration constants
#define VK_WSI_DEVICE_SWAPCHAIN_ENUM(type,id) ((type)((int)0xc0000000 - VK_WSI_DEVICE_SWAPCHAIN_EXTENSION_NUMBER * -1024 + (id)))
#define VK_WSI_DEVICE_SWAPCHAIN_ENUM_POSITIVE(type,id) ((type)((int)0x40000000 + (VK_WSI_DEVICE_SWAPCHAIN_EXTENSION_NUMBER - 1) * 1024 + (id)))
// Extend VkStructureType enum with extension specific constants
#define VK_STRUCTURE_TYPE_SWAP_CHAIN_CREATE_INFO_WSI VK_WSI_DEVICE_SWAPCHAIN_ENUM(VkStructureType, 0)
#define VK_STRUCTURE_TYPE_QUEUE_PRESENT_INFO_WSI VK_WSI_DEVICE_SWAPCHAIN_ENUM(VkStructureType, 1)
// Extend VkImageLayout enum with extension specific constants
#define VK_IMAGE_LAYOUT_PRESENT_SOURCE_WSI VK_WSI_DEVICE_SWAPCHAIN_ENUM(VkImageLayout, 2)
// Extend VkResult enum with extension specific constants
// Return codes for successful operation execution
#define VK_SUBOPTIMAL_WSI VK_WSI_DEVICE_SWAPCHAIN_ENUM_POSITIVE(VkResult, 3)
// Error codes
#define VK_ERROR_OUT_OF_DATE_WSI VK_WSI_DEVICE_SWAPCHAIN_ENUM(VkResult, 4)
// ------------------------------------------------------------------------------------------------
// Enumerations
typedef enum VkSurfaceTransformWSI_
{
VK_SURFACE_TRANSFORM_NONE_WSI = 0,
VK_SURFACE_TRANSFORM_ROT90_WSI = 1,
VK_SURFACE_TRANSFORM_ROT180_WSI = 2,
VK_SURFACE_TRANSFORM_ROT270_WSI = 3,
VK_SURFACE_TRANSFORM_HMIRROR_WSI = 4,
VK_SURFACE_TRANSFORM_HMIRROR_ROT90_WSI = 5,
VK_SURFACE_TRANSFORM_HMIRROR_ROT180_WSI = 6,
VK_SURFACE_TRANSFORM_HMIRROR_ROT270_WSI = 7,
VK_SURFACE_TRANSFORM_INHERIT_WSI = 8,
} VkSurfaceTransformWSI;
typedef enum VkSurfaceTransformFlagBitsWSI_
{
VK_SURFACE_TRANSFORM_NONE_BIT_WSI = 0x00000001,
VK_SURFACE_TRANSFORM_ROT90_BIT_WSI = 0x00000002,
VK_SURFACE_TRANSFORM_ROT180_BIT_WSI = 0x00000004,
VK_SURFACE_TRANSFORM_ROT270_BIT_WSI = 0x00000008,
VK_SURFACE_TRANSFORM_HMIRROR_BIT_WSI = 0x00000010,
VK_SURFACE_TRANSFORM_HMIRROR_ROT90_BIT_WSI = 0x00000020,
VK_SURFACE_TRANSFORM_HMIRROR_ROT180_BIT_WSI = 0x00000040,
VK_SURFACE_TRANSFORM_HMIRROR_ROT270_BIT_WSI = 0x00000080,
VK_SURFACE_TRANSFORM_INHERIT_BIT_WSI = 0x00000100,
} VkSurfaceTransformFlagBitsWSI;
typedef VkFlags VkSurfaceTransformFlagsWSI;
typedef enum VkSurfaceInfoTypeWSI_
{
VK_SURFACE_INFO_TYPE_PROPERTIES_WSI = 0,
VK_SURFACE_INFO_TYPE_FORMATS_WSI = 1,
VK_SURFACE_INFO_TYPE_PRESENT_MODES_WSI = 2,
VK_SURFACE_INFO_TYPE_BEGIN_RANGE_WSI = VK_SURFACE_INFO_TYPE_PROPERTIES_WSI,
VK_SURFACE_INFO_TYPE_END_RANGE_WSI = VK_SURFACE_INFO_TYPE_PRESENT_MODES_WSI,
VK_SURFACE_INFO_TYPE_NUM_WSI = (VK_SURFACE_INFO_TYPE_PRESENT_MODES_WSI - VK_SURFACE_INFO_TYPE_PROPERTIES_WSI + 1),
VK_SURFACE_INFO_TYPE_MAX_ENUM_WSI = 0x7FFFFFFF
} VkSurfaceInfoTypeWSI;
typedef enum VkSwapChainInfoTypeWSI_
{
VK_SWAP_CHAIN_INFO_TYPE_IMAGES_WSI = 0,
VK_SWAP_CHAIN_INFO_TYPE_BEGIN_RANGE_WSI = VK_SWAP_CHAIN_INFO_TYPE_IMAGES_WSI,
VK_SWAP_CHAIN_INFO_TYPE_END_RANGE_WSI = VK_SWAP_CHAIN_INFO_TYPE_IMAGES_WSI,
VK_SWAP_CHAIN_INFO_TYPE_NUM_WSI = (VK_SWAP_CHAIN_INFO_TYPE_IMAGES_WSI - VK_SWAP_CHAIN_INFO_TYPE_IMAGES_WSI + 1),
VK_SWAP_CHAIN_INFO_TYPE_MAX_ENUM_WSI = 0x7FFFFFFF
} VkSwapChainInfoTypeWSI;
typedef enum VkPresentModeWSI_
{
VK_PRESENT_MODE_IMMEDIATE_WSI = 0,
VK_PRESENT_MODE_MAILBOX_WSI = 1,
VK_PRESENT_MODE_FIFO_WSI = 2,
VK_PRESENT_MODE_BEGIN_RANGE_WSI = VK_PRESENT_MODE_IMMEDIATE_WSI,
VK_PRESENT_MODE_END_RANGE_WSI = VK_PRESENT_MODE_FIFO_WSI,
VK_PRESENT_MODE_NUM = (VK_PRESENT_MODE_FIFO_WSI - VK_PRESENT_MODE_IMMEDIATE_WSI + 1),
VK_PRESENT_MODE_MAX_ENUM_WSI = 0x7FFFFFFF
} VkPresentModeWSI;
// ------------------------------------------------------------------------------------------------
// Flags
// ------------------------------------------------------------------------------------------------
// Structures
typedef struct VkSurfacePropertiesWSI_
{
uint32_t minImageCount; // Supported minimum number of images for the surface
uint32_t maxImageCount; // Supported maximum number of images for the surface, 0 for unlimited
VkExtent2D currentExtent; // Current image width and height for the surface, (-1, -1) if undefined.
VkExtent2D minImageExtent; // Supported minimum image width and height for the surface
VkExtent2D maxImageExtent; // Supported maximum image width and height for the surface
VkSurfaceTransformFlagsWSI supportedTransforms;// 1 or more bits representing the transforms supported
VkSurfaceTransformWSI currentTransform; // The surface's current transform relative to the device's natural orientation.
uint32_t maxImageArraySize; // Supported maximum number of image layers for the surface
VkImageUsageFlags supportedUsageFlags;// Supported image usage flags for the surface
} VkSurfacePropertiesWSI;
typedef struct VkSurfaceFormatPropertiesWSI_
{
VkFormat format; // Supported rendering format for the surface
} VkSurfaceFormatPropertiesWSI;
typedef struct VkSurfacePresentModePropertiesWSI_
{
VkPresentModeWSI presentMode; // Supported presention mode for the surface
} VkSurfacePresentModePropertiesWSI;
typedef struct VkSwapChainCreateInfoWSI_
{
VkStructureType sType; // Must be VK_STRUCTURE_TYPE_SWAP_CHAIN_CREATE_INFO_WSI
const void* pNext; // Pointer to next structure
const VkSurfaceDescriptionWSI* pSurfaceDescription;// describes the swap chain's target surface
uint32_t minImageCount; // Minimum number of presentation images the application needs
VkFormat imageFormat; // Format of the presentation images
VkExtent2D imageExtent; // Dimensions of the presentation images
VkImageUsageFlags imageUsageFlags; // Bits indicating how the presentation images will be used
VkSurfaceTransformWSI preTransform; // The transform, relative to the device's natural orientation, applied to the image content prior to presentation
uint32_t imageArraySize; // Determines the number of views for multiview/stereo presentation
VkPresentModeWSI presentMode; // Which presentation mode to use for presents on this swap chain.
VkSwapChainWSI oldSwapChain; // Existing swap chain to replace, if any.
VkBool32 clipped; // Specifies whether presentable images may be affected by window clip regions.
} VkSwapChainCreateInfoWSI;
typedef struct VkSwapChainImagePropertiesWSI_
{
VkImage image; // Persistent swap chain image handle
} VkSwapChainImagePropertiesWSI;
typedef struct VkPresentInfoWSI_
{
VkStructureType sType; // Must be VK_STRUCTURE_TYPE_QUEUE_PRESENT_INFO_WSI
const void* pNext; // Pointer to next structure
uint32_t swapChainCount; // Number of swap chains to present in this call
const VkSwapChainWSI* swapChains; // Swap chains to present an image from.
const uint32_t* imageIndices; // Indices of which swapChain images to present
} VkPresentInfoWSI;
// ------------------------------------------------------------------------------------------------
// Function types
typedef VkResult (VKAPI *PFN_vkGetSurfaceInfoWSI)(VkDevice device, const VkSurfaceDescriptionWSI* pSurfaceDescription, VkSurfaceInfoTypeWSI infoType, size_t* pDataSize, void* pData);
typedef VkResult (VKAPI *PFN_vkCreateSwapChainWSI)(VkDevice device, const VkSwapChainCreateInfoWSI* pCreateInfo, VkSwapChainWSI* pSwapChain);
typedef VkResult (VKAPI *PFN_vkDestroySwapChainWSI)(VkDevice device, VkSwapChainWSI swapChain);
typedef VkResult (VKAPI *PFN_vkGetSwapChainInfoWSI)(VkDevice device, VkSwapChainWSI swapChain, VkSwapChainInfoTypeWSI infoType, size_t* pDataSize, void* pData);
typedef VkResult (VKAPI *PFN_vkAcquireNextImageWSI)(VkDevice device, VkSwapChainWSI swapChain, uint64_t timeout, VkSemaphore semaphore, uint32_t* pImageIndex);
typedef VkResult (VKAPI *PFN_vkQueuePresentWSI)(VkQueue queue, VkPresentInfoWSI* pPresentInfo);
// ------------------------------------------------------------------------------------------------
// Function prototypes
#ifdef VK_PROTOTYPES
VkResult VKAPI vkGetSurfaceInfoWSI(
VkDevice device,
const VkSurfaceDescriptionWSI* pSurfaceDescription,
VkSurfaceInfoTypeWSI infoType,
size_t* pDataSize,
void* pData);
VkResult VKAPI vkCreateSwapChainWSI(
VkDevice device,
const VkSwapChainCreateInfoWSI* pCreateInfo,
VkSwapChainWSI* pSwapChain);
VkResult VKAPI vkDestroySwapChainWSI(
VkDevice device,
VkSwapChainWSI swapChain);
VkResult VKAPI vkGetSwapChainInfoWSI(
VkDevice device,
VkSwapChainWSI swapChain,
VkSwapChainInfoTypeWSI infoType,
size_t* pDataSize,
void* pData);
VkResult VKAPI vkAcquireNextImageWSI(
VkDevice device,
VkSwapChainWSI swapChain,
uint64_t timeout,
VkSemaphore semaphore,
uint32_t* pImageIndex);
VkResult VKAPI vkQueuePresentWSI(
VkQueue queue,
VkPresentInfoWSI* pPresentInfo);
#endif // VK_PROTOTYPES
#ifdef __cplusplus
} // extern "C"
#endif // __cplusplus
#endif // __VK_WSI_DEVICE_SWAPCHAIN_H__

View File

@@ -0,0 +1,133 @@
//
// File: vk_wsi_swapchain.h
//
/*
** Copyright (c) 2014 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
** "Materials"), to deal in the Materials without restriction, including
** without limitation the rights to use, copy, modify, merge, publish,
** distribute, sublicense, and/or sell copies of the Materials, and to
** permit persons to whom the Materials are furnished to do so, subject to
** the following conditions:
**
** The above copyright notice and this permission notice shall be included
** in all copies or substantial portions of the Materials.
**
** THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
** EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
** MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
** IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
** CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
** TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
** MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
*/
#ifndef __VK_WSI_SWAPCHAIN_H__
#define __VK_WSI_SWAPCHAIN_H__
#include "vulkan.h"
#define VK_WSI_SWAPCHAIN_REVISION 12
#define VK_WSI_SWAPCHAIN_EXTENSION_NUMBER 1
#define VK_WSI_SWAPCHAIN_EXTENSION_NAME "VK_WSI_swapchain"
#ifdef __cplusplus
extern "C"
{
#endif // __cplusplus
// ------------------------------------------------------------------------------------------------
// Objects
// ------------------------------------------------------------------------------------------------
// Enumeration constants
#define VK_WSI_SWAPCHAIN_ENUM(type,id) ((type)((int)0xc0000000 - VK_WSI_SWAPCHAIN_EXTENSION_NUMBER * -1024 + (id)))
#define VK_WSI_SWAPCHAIN_ENUM_POSITIVE(type,id) ((type)((int)0x40000000 + (VK_WSI_SWAPCHAIN_EXTENSION_NUMBER - 1) * 1024 + (id)))
// Extend VkStructureType enum with extension specific constants
#define VK_STRUCTURE_TYPE_SURFACE_DESCRIPTION_WINDOW_WSI VK_WSI_SWAPCHAIN_ENUM(VkStructureType, 0)
// ------------------------------------------------------------------------------------------------
// Enumerations
typedef enum VkPlatformWSI_
{
VK_PLATFORM_WIN32_WSI = 0,
VK_PLATFORM_X11_WSI = 1,
VK_PLATFORM_XCB_WSI = 2,
VK_PLATFORM_ANDROID_WSI = 3,
VK_PLATFORM_WAYLAND_WSI = 4,
VK_PLATFORM_MIR_WSI = 5,
VK_PLATFORM_BEGIN_RANGE_WSI = VK_PLATFORM_WIN32_WSI,
VK_PLATFORM_END_RANGE_WSI = VK_PLATFORM_MIR_WSI,
VK_PLATFORM_NUM_WSI = (VK_PLATFORM_MIR_WSI - VK_PLATFORM_WIN32_WSI + 1),
VK_PLATFORM_MAX_ENUM_WSI = 0x7FFFFFFF
} VkPlatformWSI;
// ------------------------------------------------------------------------------------------------
// Flags
// ------------------------------------------------------------------------------------------------
// Structures
// pPlatformHandle points to this struct when platform is VK_PLATFORM_X11_WSI
#ifdef _X11_XLIB_H_
typedef struct VkPlatformHandleX11WSI_
{
Display* dpy; // Display connection to an X server
Window root; // To identify the X screen
} VkPlatformHandleX11WSI;
#endif /* _X11_XLIB_H_ */
// pPlatformHandle points to this struct when platform is VK_PLATFORM_XCB_WSI
#ifdef __XCB_H__
typedef struct VkPlatformHandleXcbWSI_
{
xcb_connection_t* connection; // XCB connection to an X server
xcb_window_t root; // To identify the X screen
} VkPlatformHandleXcbWSI;
#endif /* __XCB_H__ */
// Placeholder structure header for the different types of surface description structures
typedef struct VkSurfaceDescriptionWSI_
{
VkStructureType sType; // Can be any of the VK_STRUCTURE_TYPE_SURFACE_DESCRIPTION_XXX_WSI constants
const void* pNext; // Pointer to next structure
} VkSurfaceDescriptionWSI;
// Surface description structure for a native platform window surface
typedef struct VkSurfaceDescriptionWindowWSI_
{
VkStructureType sType; // Must be VK_STRUCTURE_TYPE_SURFACE_DESCRIPTION_WINDOW_WSI
const void* pNext; // Pointer to next structure
VkPlatformWSI platform; // e.g. VK_PLATFORM_*_WSI
void* pPlatformHandle;
void* pPlatformWindow;
} VkSurfaceDescriptionWindowWSI;
// ------------------------------------------------------------------------------------------------
// Function types
typedef VkResult (VKAPI *PFN_vkGetPhysicalDeviceSurfaceSupportWSI)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, const VkSurfaceDescriptionWSI* pSurfaceDescription, VkBool32* pSupported);
// ------------------------------------------------------------------------------------------------
// Function prototypes
#ifdef VK_PROTOTYPES
VkResult VKAPI vkGetPhysicalDeviceSurfaceSupportWSI(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
const VkSurfaceDescriptionWSI* pSurfaceDescription,
VkBool32* pSupported);
#endif // VK_PROTOTYPES
#ifdef __cplusplus
} // extern "C"
#endif // __cplusplus
#endif // __VK_WSI_SWAPCHAIN_H__

include/vulkan/vulkan.h Normal file (3054 lines)

File diff suppressed because it is too large.

View File

@@ -0,0 +1,61 @@
/*
* Copyright © 2015 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#ifndef __VULKAN_INTEL_H__
#define __VULKAN_INTEL_H__
#include "vulkan.h"
#ifdef __cplusplus
extern "C"
{
#endif // __cplusplus
#define VK_STRUCTURE_TYPE_DMA_BUF_IMAGE_CREATE_INFO_INTEL 1024
typedef struct VkDmaBufImageCreateInfo_
{
VkStructureType sType; // Must be VK_STRUCTURE_TYPE_DMA_BUF_IMAGE_CREATE_INFO_INTEL
const void* pNext; // Pointer to next structure.
int fd;
VkFormat format;
VkExtent3D extent; // Depth must be 1
uint32_t strideInBytes;
} VkDmaBufImageCreateInfo;
typedef VkResult (VKAPI *PFN_vkCreateDmaBufImageINTEL)(VkDevice device, const VkDmaBufImageCreateInfo* pCreateInfo, VkDeviceMemory* pMem, VkImage* pImage);
#ifdef VK_PROTOTYPES
VkResult VKAPI vkCreateDmaBufImageINTEL(
VkDevice _device,
const VkDmaBufImageCreateInfo* pCreateInfo,
VkDeviceMemory* pMem,
VkImage* pImage);
#endif
#ifdef __cplusplus
} // extern "C"
#endif // __cplusplus
#endif // __VULKAN_INTEL_H__

View File

@@ -53,6 +53,10 @@ EXTRA_DIST = \
AM_CFLAGS = $(VISIBILITY_CFLAGS)
AM_CXXFLAGS = $(VISIBILITY_CXXFLAGS)
if HAVE_VULKAN
SUBDIRS += vulkan
endif
AM_CPPFLAGS = \
-I$(top_srcdir)/include/ \
-I$(top_srcdir)/src/mapi/ \

View File

@@ -15,7 +15,6 @@ env.Append(CPPPATH = [
# parse Makefile.sources
egl_sources = env.ParseSourceList('Makefile.sources', 'LIBEGL_C_FILES')
egl_sources.append(env.ParseSourceList('Makefile.sources', 'dri2_backend_core_FILES'))
env.Append(CPPDEFINES = [
'_EGL_NATIVE_PLATFORM=_EGL_PLATFORM_HAIKU',

View File

@@ -11,6 +11,7 @@ SUBDIRS += auxiliary
##
SUBDIRS += \
drivers/ddebug \
drivers/noop \
drivers/trace \
drivers/rbug

View File

@@ -24,6 +24,7 @@
#include "util/ralloc.h"
#include "glsl/nir/nir.h"
#include "glsl/nir/nir_control_flow.h"
#include "glsl/nir/nir_builder.h"
#include "glsl/list.h"
#include "glsl/shader_enums.h"
@@ -64,24 +65,24 @@ struct ttn_compile {
nir_register *addr_reg;
/**
* Stack of cf_node_lists where instructions should be pushed as we pop
* Stack of nir_cursors where instructions should be pushed as we pop
* back out of the control flow stack.
*
* For each IF/ELSE/ENDIF block, if_stack[if_stack_pos] has where the else
* instructions should be placed, and if_stack[if_stack_pos - 1] has where
* the next instructions outside of the if/then/else block go.
*/
struct exec_list **if_stack;
nir_cursor *if_stack;
unsigned if_stack_pos;
/**
* Stack of cf_node_lists where instructions should be pushed as we pop
* Stack of nir_cursors where instructions should be pushed as we pop
* back out of the control flow stack.
*
* loop_stack[loop_stack_pos - 1] contains the cf_node_list for the outside
* of the loop.
*/
struct exec_list **loop_stack;
nir_cursor *loop_stack;
unsigned loop_stack_pos;
/* How many TGSI_FILE_IMMEDIATE vec4s have been parsed so far. */
@@ -307,7 +308,7 @@ ttn_emit_immediate(struct ttn_compile *c)
for (i = 0; i < 4; i++)
load_const->value.u[i] = tgsi_imm->u[i].Uint;
nir_instr_insert_after_cf_list(b->cf_node_list, &load_const->instr);
nir_builder_instr_insert(b, &load_const->instr);
}
static nir_src
@@ -363,7 +364,7 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index,
load->variables[0] = ttn_array_deref(c, load, var, offset, indirect);
nir_ssa_dest_init(&load->instr, &load->dest, 4, NULL);
nir_instr_insert_after_cf_list(b->cf_node_list, &load->instr);
nir_builder_instr_insert(b, &load->instr);
src = nir_src_for_ssa(&load->dest.ssa);
@@ -414,7 +415,7 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index,
load->num_components = ncomp;
nir_ssa_dest_init(&load->instr, &load->dest, ncomp, NULL);
nir_instr_insert_after_cf_list(b->cf_node_list, &load->instr);
nir_builder_instr_insert(b, &load->instr);
src = nir_src_for_ssa(&load->dest.ssa);
break;
@@ -476,7 +477,7 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index,
srcn++;
}
nir_ssa_dest_init(&load->instr, &load->dest, 4, NULL);
nir_instr_insert_after_cf_list(b->cf_node_list, &load->instr);
nir_builder_instr_insert(b, &load->instr);
src = nir_src_for_ssa(&load->dest.ssa);
break;
@@ -552,7 +553,7 @@ ttn_get_dest(struct ttn_compile *c, struct tgsi_full_dst_register *tgsi_fdst)
load->dest = nir_dest_for_reg(reg);
nir_instr_insert_after_cf_list(b->cf_node_list, &load->instr);
nir_builder_instr_insert(b, &load->instr);
} else {
assert(!tgsi_dst->Indirect);
dest.dest.reg.reg = c->temp_regs[index].reg;
@@ -667,7 +668,7 @@ ttn_alu(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
instr->src[i].src = nir_src_for_ssa(src[i]);
instr->dest = dest;
nir_instr_insert_after_cf_list(b->cf_node_list, &instr->instr);
nir_builder_instr_insert(b, &instr->instr);
}
static void
@@ -683,7 +684,7 @@ ttn_move_dest_masked(nir_builder *b, nir_alu_dest dest,
mov->src[0].src = nir_src_for_ssa(def);
for (unsigned i = def->num_components; i < 4; i++)
mov->src[0].swizzle[i] = def->num_components - 1;
nir_instr_insert_after_cf_list(b->cf_node_list, &mov->instr);
nir_builder_instr_insert(b, &mov->instr);
}
static void
@@ -902,7 +903,7 @@ ttn_kill(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
{
nir_intrinsic_instr *discard =
nir_intrinsic_instr_create(b->shader, nir_intrinsic_discard);
nir_instr_insert_after_cf_list(b->cf_node_list, &discard->instr);
nir_builder_instr_insert(b, &discard->instr);
}
static void
@@ -912,7 +913,7 @@ ttn_kill_if(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
nir_intrinsic_instr *discard =
nir_intrinsic_instr_create(b->shader, nir_intrinsic_discard_if);
discard->src[0] = nir_src_for_ssa(cmp);
nir_instr_insert_after_cf_list(b->cf_node_list, &discard->instr);
nir_builder_instr_insert(b, &discard->instr);
}
static void
@@ -921,7 +922,7 @@ ttn_if(struct ttn_compile *c, nir_ssa_def *src, bool is_uint)
nir_builder *b = &c->build;
/* Save the outside-of-the-if-statement node list. */
c->if_stack[c->if_stack_pos] = b->cf_node_list;
c->if_stack[c->if_stack_pos] = b->cursor;
c->if_stack_pos++;
src = ttn_channel(b, src, X);
@@ -932,11 +933,11 @@ ttn_if(struct ttn_compile *c, nir_ssa_def *src, bool is_uint)
} else {
if_stmt->condition = nir_src_for_ssa(nir_fne(b, src, nir_imm_int(b, 0)));
}
nir_cf_node_insert_end(b->cf_node_list, &if_stmt->cf_node);
nir_builder_cf_insert(b, &if_stmt->cf_node);
nir_builder_insert_after_cf_list(b, &if_stmt->then_list);
b->cursor = nir_after_cf_list(&if_stmt->then_list);
c->if_stack[c->if_stack_pos] = &if_stmt->else_list;
c->if_stack[c->if_stack_pos] = nir_after_cf_list(&if_stmt->else_list);
c->if_stack_pos++;
}
@@ -945,7 +946,7 @@ ttn_else(struct ttn_compile *c)
{
nir_builder *b = &c->build;
nir_builder_insert_after_cf_list(b, c->if_stack[c->if_stack_pos - 1]);
b->cursor = c->if_stack[c->if_stack_pos - 1];
}
static void
@@ -954,7 +955,7 @@ ttn_endif(struct ttn_compile *c)
nir_builder *b = &c->build;
c->if_stack_pos -= 2;
nir_builder_insert_after_cf_list(b, c->if_stack[c->if_stack_pos]);
b->cursor = c->if_stack[c->if_stack_pos];
}
static void
@@ -963,27 +964,27 @@ ttn_bgnloop(struct ttn_compile *c)
nir_builder *b = &c->build;
/* Save the outside-of-the-loop node list. */
c->loop_stack[c->loop_stack_pos] = b->cf_node_list;
c->loop_stack[c->loop_stack_pos] = b->cursor;
c->loop_stack_pos++;
nir_loop *loop = nir_loop_create(b->shader);
nir_cf_node_insert_end(b->cf_node_list, &loop->cf_node);
nir_builder_cf_insert(b, &loop->cf_node);
nir_builder_insert_after_cf_list(b, &loop->body);
b->cursor = nir_after_cf_list(&loop->body);
}
static void
ttn_cont(nir_builder *b)
{
nir_jump_instr *instr = nir_jump_instr_create(b->shader, nir_jump_continue);
nir_instr_insert_after_cf_list(b->cf_node_list, &instr->instr);
nir_builder_instr_insert(b, &instr->instr);
}
static void
ttn_brk(nir_builder *b)
{
nir_jump_instr *instr = nir_jump_instr_create(b->shader, nir_jump_break);
nir_instr_insert_after_cf_list(b->cf_node_list, &instr->instr);
nir_builder_instr_insert(b, &instr->instr);
}
static void
@@ -992,7 +993,7 @@ ttn_endloop(struct ttn_compile *c)
nir_builder *b = &c->build;
c->loop_stack_pos--;
nir_builder_insert_after_cf_list(b, c->loop_stack[c->loop_stack_pos]);
b->cursor = c->loop_stack[c->loop_stack_pos];
}
static void
@@ -1279,7 +1280,7 @@ ttn_tex(struct ttn_compile *c, nir_alu_dest dest, nir_ssa_def **src)
assert(src_number == num_srcs);
nir_ssa_dest_init(&instr->instr, &instr->dest, 4, NULL);
nir_instr_insert_after_cf_list(b->cf_node_list, &instr->instr);
nir_builder_instr_insert(b, &instr->instr);
/* Resolve the writemask on the texture op. */
ttn_move_dest(b, dest, &instr->dest.ssa);
@@ -1318,10 +1319,10 @@ ttn_txq(struct ttn_compile *c, nir_alu_dest dest, nir_ssa_def **src)
txs->src[0].src_type = nir_tex_src_lod;
nir_ssa_dest_init(&txs->instr, &txs->dest, 3, NULL);
nir_instr_insert_after_cf_list(b->cf_node_list, &txs->instr);
nir_builder_instr_insert(b, &txs->instr);
nir_ssa_dest_init(&qlv->instr, &qlv->dest, 1, NULL);
nir_instr_insert_after_cf_list(b->cf_node_list, &qlv->instr);
nir_builder_instr_insert(b, &qlv->instr);
ttn_move_dest_masked(b, dest, &txs->dest.ssa, TGSI_WRITEMASK_XYZ);
ttn_move_dest_masked(b, dest, &qlv->dest.ssa, TGSI_WRITEMASK_W);
@@ -1730,7 +1731,7 @@ ttn_emit_instruction(struct ttn_compile *c)
store->variables[0] = ttn_array_deref(c, store, var, offset, indirect);
store->src[0] = nir_src_for_reg(dest.dest.reg.reg);
nir_instr_insert_after_cf_list(b->cf_node_list, &store->instr);
nir_builder_instr_insert(b, &store->instr);
}
}
@@ -1759,11 +1760,26 @@ ttn_add_output_stores(struct ttn_compile *c)
store->const_index[0] = loc;
store->src[0].reg.reg = c->output_regs[loc].reg;
store->src[0].reg.base_offset = c->output_regs[loc].offset;
nir_instr_insert_after_cf_list(b->cf_node_list, &store->instr);
nir_builder_instr_insert(b, &store->instr);
}
}
}
static gl_shader_stage
tgsi_processor_to_shader_stage(unsigned processor)
{
switch (processor) {
case TGSI_PROCESSOR_FRAGMENT: return MESA_SHADER_FRAGMENT;
case TGSI_PROCESSOR_VERTEX: return MESA_SHADER_VERTEX;
case TGSI_PROCESSOR_GEOMETRY: return MESA_SHADER_GEOMETRY;
case TGSI_PROCESSOR_TESS_CTRL: return MESA_SHADER_TESS_CTRL;
case TGSI_PROCESSOR_TESS_EVAL: return MESA_SHADER_TESS_EVAL;
case TGSI_PROCESSOR_COMPUTE: return MESA_SHADER_COMPUTE;
default:
unreachable("invalid TGSI processor");
};
}
struct nir_shader *
tgsi_to_nir(const void *tgsi_tokens,
const nir_shader_compiler_options *options)
@@ -1775,17 +1791,19 @@ tgsi_to_nir(const void *tgsi_tokens,
int ret;
c = rzalloc(NULL, struct ttn_compile);
s = nir_shader_create(NULL, options);
tgsi_scan_shader(tgsi_tokens, &scan);
c->scan = &scan;
s = nir_shader_create(NULL, tgsi_processor_to_shader_stage(scan.processor),
options);
nir_function *func = nir_function_create(s, "main");
nir_function_overload *overload = nir_function_overload_create(func);
nir_function_impl *impl = nir_function_impl_create(overload);
nir_builder_init(&c->build, impl);
nir_builder_insert_after_cf_list(&c->build, &impl->body);
tgsi_scan_shader(tgsi_tokens, &scan);
c->scan = &scan;
c->build.cursor = nir_after_cf_list(&impl->body);
s->num_inputs = scan.file_max[TGSI_FILE_INPUT] + 1;
s->num_uniforms = scan.const_file_max[0] + 1;
@@ -1801,10 +1819,10 @@ tgsi_to_nir(const void *tgsi_tokens,
c->num_samp_types = scan.file_max[TGSI_FILE_SAMPLER_VIEW] + 1;
c->samp_types = rzalloc_array(c, nir_alu_type, c->num_samp_types);
c->if_stack = rzalloc_array(c, struct exec_list *,
c->if_stack = rzalloc_array(c, nir_cursor,
(scan.opcode_count[TGSI_OPCODE_IF] +
scan.opcode_count[TGSI_OPCODE_UIF]) * 2);
c->loop_stack = rzalloc_array(c, struct exec_list *,
c->loop_stack = rzalloc_array(c, nir_cursor,
scan.opcode_count[TGSI_OPCODE_BGNLOOP]);
ret = tgsi_parse_init(&parser, tgsi_tokens);

View File

@@ -11,6 +11,10 @@
* one or more debug driver: rbug, trace.
*/
#ifdef GALLIUM_DDEBUG
#include "ddebug/dd_public.h"
#endif
#ifdef GALLIUM_TRACE
#include "trace/tr_public.h"
#endif
@@ -30,6 +34,10 @@
static inline struct pipe_screen *
debug_screen_wrap(struct pipe_screen *screen)
{
#if defined(GALLIUM_DDEBUG)
screen = ddebug_screen_create(screen);
#endif
#if defined(GALLIUM_RBUG)
screen = rbug_screen_create(screen);
#endif

View File

@@ -372,30 +372,28 @@ void util_blitter_custom_resolve_color(struct blitter_context *blitter,
*
* States not listed here are not affected by util_blitter. */
static inline
void util_blitter_save_blend(struct blitter_context *blitter,
void *state)
static inline void
util_blitter_save_blend(struct blitter_context *blitter, void *state)
{
blitter->saved_blend_state = state;
}
static inline
void util_blitter_save_depth_stencil_alpha(struct blitter_context *blitter,
void *state)
static inline void
util_blitter_save_depth_stencil_alpha(struct blitter_context *blitter,
void *state)
{
blitter->saved_dsa_state = state;
}
static inline
void util_blitter_save_vertex_elements(struct blitter_context *blitter,
void *state)
static inline void
util_blitter_save_vertex_elements(struct blitter_context *blitter, void *state)
{
blitter->saved_velem_state = state;
}
static inline
void util_blitter_save_stencil_ref(struct blitter_context *blitter,
const struct pipe_stencil_ref *state)
static inline void
util_blitter_save_stencil_ref(struct blitter_context *blitter,
const struct pipe_stencil_ref *state)
{
blitter->saved_stencil_ref = *state;
}
@@ -407,23 +405,20 @@ void util_blitter_save_rasterizer(struct blitter_context *blitter,
blitter->saved_rs_state = state;
}
static inline
void util_blitter_save_fragment_shader(struct blitter_context *blitter,
void *fs)
static inline void
util_blitter_save_fragment_shader(struct blitter_context *blitter, void *fs)
{
blitter->saved_fs = fs;
}
static inline
void util_blitter_save_vertex_shader(struct blitter_context *blitter,
void *vs)
static inline void
util_blitter_save_vertex_shader(struct blitter_context *blitter, void *vs)
{
blitter->saved_vs = vs;
}
static inline
void util_blitter_save_geometry_shader(struct blitter_context *blitter,
void *gs)
static inline void
util_blitter_save_geometry_shader(struct blitter_context *blitter, void *gs)
{
blitter->saved_gs = gs;
}
@@ -442,24 +437,24 @@ util_blitter_save_tesseval_shader(struct blitter_context *blitter,
blitter->saved_tes = sh;
}
static inline
void util_blitter_save_framebuffer(struct blitter_context *blitter,
const struct pipe_framebuffer_state *state)
static inline void
util_blitter_save_framebuffer(struct blitter_context *blitter,
const struct pipe_framebuffer_state *state)
{
blitter->saved_fb_state.nr_cbufs = 0; /* It's ~0 now, meaning it's unsaved. */
util_copy_framebuffer_state(&blitter->saved_fb_state, state);
}
static inline
void util_blitter_save_viewport(struct blitter_context *blitter,
struct pipe_viewport_state *state)
static inline void
util_blitter_save_viewport(struct blitter_context *blitter,
struct pipe_viewport_state *state)
{
blitter->saved_viewport = *state;
}
static inline
void util_blitter_save_scissor(struct blitter_context *blitter,
struct pipe_scissor_state *state)
static inline void
util_blitter_save_scissor(struct blitter_context *blitter,
struct pipe_scissor_state *state)
{
blitter->saved_scissor = *state;
}

View File

@@ -41,6 +41,7 @@
#include "util/u_tile.h"
#include "util/u_prim.h"
#include "util/u_surface.h"
#include <inttypes.h>
#include <stdio.h>
#include <limits.h> /* CHAR_BIT */
@@ -275,7 +276,7 @@ debug_get_flags_option(const char *name,
for (; flags->name; ++flags)
namealign = MAX2(namealign, strlen(flags->name));
for (flags = orig; flags->name; ++flags)
_debug_printf("| %*s [0x%0*lx]%s%s\n", namealign, flags->name,
_debug_printf("| %*s [0x%0*"PRIu64"]%s%s\n", namealign, flags->name,
(int)sizeof(uint64_t)*CHAR_BIT/4, flags->value,
flags->desc ? " " : "", flags->desc ? flags->desc : "");
}
@@ -290,9 +291,9 @@ debug_get_flags_option(const char *name,
if (debug_get_option_should_print()) {
if (str) {
debug_printf("%s: %s = 0x%lx (%s)\n", __FUNCTION__, name, result, str);
debug_printf("%s: %s = 0x%"PRIu64" (%s)\n", __FUNCTION__, name, result, str);
} else {
debug_printf("%s: %s = 0x%lx\n", __FUNCTION__, name, result);
debug_printf("%s: %s = 0x%"PRIu64"\n", __FUNCTION__, name, result);
}
}

View File

@@ -21,7 +21,8 @@
* DEALINGS IN THE SOFTWARE.
*/
/* Copied from EXT_texture_shared_exponent and edited. */
/* Copied from EXT_texture_shared_exponent and edited, getting rid of
* expensive float math bits too. */
#ifndef RGB9E5_H
#define RGB9E5_H
@@ -39,7 +40,6 @@
#define RGB9E5_MANTISSA_VALUES (1<<RGB9E5_MANTISSA_BITS)
#define MAX_RGB9E5_MANTISSA (RGB9E5_MANTISSA_VALUES-1)
#define MAX_RGB9E5 (((float)MAX_RGB9E5_MANTISSA)/RGB9E5_MANTISSA_VALUES * (1<<MAX_RGB9E5_EXP))
#define EPSILON_RGB9E5 ((1.0/RGB9E5_MANTISSA_VALUES) / (1<<RGB9E5_EXP_BIAS))
typedef union {
unsigned int raw;
@@ -74,63 +74,59 @@ typedef union {
} field;
} rgb9e5;
static inline float rgb9e5_ClampRange(float x)
{
if (x > 0.0f) {
if (x >= MAX_RGB9E5) {
return MAX_RGB9E5;
} else {
return x;
}
} else {
/* NaN gets here too since comparisons with NaN always fail! */
return 0.0;
}
}
/* Ok, FloorLog2 is not correct for the denorm and zero values, but we
are going to do a max of this value with the minimum rgb9e5 exponent
that will hide these problem cases. */
static inline int rgb9e5_FloorLog2(float x)
static inline int rgb9e5_ClampRange(float x)
{
float754 f;
float754 max;
f.value = x;
return (f.field.biasedexponent - 127);
max.value = MAX_RGB9E5;
if (f.raw > 0x7f800000)
/* catches neg, NaNs */
return 0;
else if (f.raw >= max.raw)
return max.raw;
else
return f.raw;
}
static inline unsigned float3_to_rgb9e5(const float rgb[3])
{
rgb9e5 retval;
float maxrgb;
int rm, gm, bm;
float rc, gc, bc;
int exp_shared, maxm;
double denom;
int rm, gm, bm, exp_shared;
float754 revdenom = {0};
float754 rc, bc, gc, maxrgb;
rc = rgb9e5_ClampRange(rgb[0]);
gc = rgb9e5_ClampRange(rgb[1]);
bc = rgb9e5_ClampRange(rgb[2]);
rc.raw = rgb9e5_ClampRange(rgb[0]);
gc.raw = rgb9e5_ClampRange(rgb[1]);
bc.raw = rgb9e5_ClampRange(rgb[2]);
maxrgb.raw = MAX3(rc.raw, gc.raw, bc.raw);
maxrgb = MAX3(rc, gc, bc);
exp_shared = MAX2(-RGB9E5_EXP_BIAS-1, rgb9e5_FloorLog2(maxrgb)) + 1 + RGB9E5_EXP_BIAS;
/*
* Compared to what the spec suggests, instead of conditionally adjusting
* the exponent after the fact do it here by doing the equivalent of +0.5 -
* the int add will spill over into the exponent in this case.
*/
maxrgb.raw += maxrgb.raw & (1 << (23-9));
exp_shared = MAX2((maxrgb.raw >> 23), -RGB9E5_EXP_BIAS - 1 + 127) +
1 + RGB9E5_EXP_BIAS - 127;
revdenom.field.biasedexponent = 127 - (exp_shared - RGB9E5_EXP_BIAS -
RGB9E5_MANTISSA_BITS) + 1;
assert(exp_shared <= RGB9E5_MAX_VALID_BIASED_EXP);
assert(exp_shared >= 0);
/* This exp2 function could be replaced by a table. */
denom = exp2(exp_shared - RGB9E5_EXP_BIAS - RGB9E5_MANTISSA_BITS);
maxm = (int) floor(maxrgb / denom + 0.5);
if (maxm == MAX_RGB9E5_MANTISSA+1) {
denom *= 2;
exp_shared += 1;
assert(exp_shared <= RGB9E5_MAX_VALID_BIASED_EXP);
} else {
assert(maxm <= MAX_RGB9E5_MANTISSA);
}
rm = (int) floor(rc / denom + 0.5);
gm = (int) floor(gc / denom + 0.5);
bm = (int) floor(bc / denom + 0.5);
/*
* The spec uses strict round-up behavior (d3d10 disagrees, but in any case
* must match what is done above for figuring out exponent).
* We avoid the doubles ((int) rc * revdenom + 0.5) by doing the rounding
* ourselves (revdenom was adjusted by +1, above).
*/
rm = (int) (rc.value * revdenom.value);
gm = (int) (gc.value * revdenom.value);
bm = (int) (bc.value * revdenom.value);
rm = (rm & 1) + (rm >> 1);
gm = (gm & 1) + (gm >> 1);
bm = (bm & 1) + (bm >> 1);
assert(rm <= MAX_RGB9E5_MANTISSA);
assert(gm <= MAX_RGB9E5_MANTISSA);
@@ -151,15 +147,15 @@ static inline void rgb9e5_to_float3(unsigned rgb, float retval[3])
{
rgb9e5 v;
int exponent;
float scale;
float754 scale = {0};
v.raw = rgb;
exponent = v.field.biasedexponent - RGB9E5_EXP_BIAS - RGB9E5_MANTISSA_BITS;
scale = exp2f(exponent);
scale.field.biasedexponent = exponent + 127;
retval[0] = v.field.r * scale;
retval[1] = v.field.g * scale;
retval[2] = v.field.b * scale;
retval[0] = v.field.r * scale.value;
retval[1] = v.field.g * scale.value;
retval[2] = v.field.b * scale.value;
}
#endif

View File

@@ -457,7 +457,7 @@ null_constant_buffer(struct pipe_context *ctx)
void
util_run_tests(struct pipe_screen *screen)
{
struct pipe_context *ctx = screen->context_create(screen, NULL);
struct pipe_context *ctx = screen->context_create(screen, NULL, 0);
tgsi_vs_window_space_position(ctx);
null_sampler_view(ctx, TGSI_TEXTURE_2D);

View File

@@ -1120,7 +1120,7 @@ vl_create_mpeg12_decoder(struct pipe_context *context,
dec->base = *templat;
dec->base.context = context;
dec->context = context->screen->context_create(context->screen, NULL);
dec->context = context->screen->context_create(context->screen, NULL, 0);
dec->base.destroy = vl_mpeg12_destroy;
dec->base.begin_frame = vl_mpeg12_begin_frame;

View File

@@ -0,0 +1,9 @@
include Makefile.sources
include $(top_srcdir)/src/gallium/Automake.inc
AM_CFLAGS = \
$(GALLIUM_DRIVER_CFLAGS)
noinst_LTLIBRARIES = libddebug.la
libddebug_la_SOURCES = $(C_SOURCES)

View File

@@ -0,0 +1,6 @@
C_SOURCES := \
dd_pipe.h \
dd_public.h \
dd_context.c \
dd_draw.c \
dd_screen.c

View File

@@ -0,0 +1,771 @@
/**************************************************************************
*
* Copyright 2015 Advanced Micro Devices, Inc.
* Copyright 2008 VMware, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
**************************************************************************/
#include "dd_pipe.h"
#include "tgsi/tgsi_parse.h"
#include "util/u_memory.h"
static void
safe_memcpy(void *dst, const void *src, size_t size)
{
if (src)
memcpy(dst, src, size);
else
memset(dst, 0, size);
}
/********************************************************************
* queries
*/
static struct dd_query *
dd_query(struct pipe_query *query)
{
return (struct dd_query *)query;
}
static struct pipe_query *
dd_query_unwrap(struct pipe_query *query)
{
if (query) {
return dd_query(query)->query;
} else {
return NULL;
}
}
static struct pipe_query *
dd_context_create_query(struct pipe_context *_pipe, unsigned query_type,
unsigned index)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
struct pipe_query *query;
query = pipe->create_query(pipe, query_type, index);
/* Wrap query object. */
if (query) {
struct dd_query *dd_query = CALLOC_STRUCT(dd_query);
if (dd_query) {
dd_query->type = query_type;
dd_query->query = query;
query = (struct pipe_query *)dd_query;
} else {
pipe->destroy_query(pipe, query);
query = NULL;
}
}
return query;
}
static void
dd_context_destroy_query(struct pipe_context *_pipe,
struct pipe_query *query)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
pipe->destroy_query(pipe, dd_query_unwrap(query));
FREE(query);
}
static boolean
dd_context_begin_query(struct pipe_context *_pipe, struct pipe_query *query)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
return pipe->begin_query(pipe, dd_query_unwrap(query));
}
static void
dd_context_end_query(struct pipe_context *_pipe, struct pipe_query *query)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
pipe->end_query(pipe, dd_query_unwrap(query));
}
static boolean
dd_context_get_query_result(struct pipe_context *_pipe,
struct pipe_query *query, boolean wait,
union pipe_query_result *result)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
return pipe->get_query_result(pipe, dd_query_unwrap(query), wait, result);
}
static void
dd_context_render_condition(struct pipe_context *_pipe,
struct pipe_query *query, boolean condition,
uint mode)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
pipe->render_condition(pipe, dd_query_unwrap(query), condition, mode);
dctx->render_cond.query = dd_query(query);
dctx->render_cond.condition = condition;
dctx->render_cond.mode = mode;
}
/********************************************************************
* constant (immutable) non-shader states
*/
#define DD_CSO_CREATE(name, shortname) \
static void * \
dd_context_create_##name##_state(struct pipe_context *_pipe, \
const struct pipe_##name##_state *state) \
{ \
struct pipe_context *pipe = dd_context(_pipe)->pipe; \
struct dd_state *hstate = CALLOC_STRUCT(dd_state); \
\
if (!hstate) \
return NULL; \
hstate->cso = pipe->create_##name##_state(pipe, state); \
hstate->state.shortname = *state; \
return hstate; \
}
#define DD_CSO_BIND(name, shortname) \
static void \
dd_context_bind_##name##_state(struct pipe_context *_pipe, void *state) \
{ \
struct dd_context *dctx = dd_context(_pipe); \
struct pipe_context *pipe = dctx->pipe; \
struct dd_state *hstate = state; \
\
dctx->shortname = hstate; \
pipe->bind_##name##_state(pipe, hstate ? hstate->cso : NULL); \
}
#define DD_CSO_DELETE(name) \
static void \
dd_context_delete_##name##_state(struct pipe_context *_pipe, void *state) \
{ \
struct dd_context *dctx = dd_context(_pipe); \
struct pipe_context *pipe = dctx->pipe; \
struct dd_state *hstate = state; \
\
pipe->delete_##name##_state(pipe, hstate->cso); \
FREE(hstate); \
}
#define DD_CSO_WHOLE(name, shortname) \
DD_CSO_CREATE(name, shortname) \
DD_CSO_BIND(name, shortname) \
DD_CSO_DELETE(name)
DD_CSO_WHOLE(blend, blend)
DD_CSO_WHOLE(rasterizer, rs)
DD_CSO_WHOLE(depth_stencil_alpha, dsa)
DD_CSO_CREATE(sampler, sampler)
DD_CSO_DELETE(sampler)
static void
dd_context_bind_sampler_states(struct pipe_context *_pipe, unsigned shader,
unsigned start, unsigned count, void **states)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
memcpy(&dctx->sampler_states[shader][start], states,
sizeof(void*) * count);
if (states) {
void *samp[PIPE_MAX_SAMPLERS];
int i;
for (i = 0; i < count; i++) {
struct dd_state *s = states[i];
samp[i] = s ? s->cso : NULL;
}
pipe->bind_sampler_states(pipe, shader, start, count, samp);
}
else
pipe->bind_sampler_states(pipe, shader, start, count, NULL);
}
static void *
dd_context_create_vertex_elements_state(struct pipe_context *_pipe,
unsigned num_elems,
const struct pipe_vertex_element *elems)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
struct dd_state *hstate = CALLOC_STRUCT(dd_state);
if (!hstate)
return NULL;
hstate->cso = pipe->create_vertex_elements_state(pipe, num_elems, elems);
memcpy(hstate->state.velems.velems, elems, sizeof(elems[0]) * num_elems);
hstate->state.velems.count = num_elems;
return hstate;
}
DD_CSO_BIND(vertex_elements, velems)
DD_CSO_DELETE(vertex_elements)
/********************************************************************
* shaders
*/
#define DD_SHADER(NAME, name) \
static void * \
dd_context_create_##name##_state(struct pipe_context *_pipe, \
const struct pipe_shader_state *state) \
{ \
struct pipe_context *pipe = dd_context(_pipe)->pipe; \
struct dd_state *hstate = CALLOC_STRUCT(dd_state); \
\
if (!hstate) \
return NULL; \
hstate->cso = pipe->create_##name##_state(pipe, state); \
hstate->state.shader = *state; \
hstate->state.shader.tokens = tgsi_dup_tokens(state->tokens); \
return hstate; \
} \
\
static void \
dd_context_bind_##name##_state(struct pipe_context *_pipe, void *state) \
{ \
struct dd_context *dctx = dd_context(_pipe); \
struct pipe_context *pipe = dctx->pipe; \
struct dd_state *hstate = state; \
\
dctx->shaders[PIPE_SHADER_##NAME] = hstate; \
pipe->bind_##name##_state(pipe, hstate ? hstate->cso : NULL); \
} \
\
static void \
dd_context_delete_##name##_state(struct pipe_context *_pipe, void *state) \
{ \
struct dd_context *dctx = dd_context(_pipe); \
struct pipe_context *pipe = dctx->pipe; \
struct dd_state *hstate = state; \
\
pipe->delete_##name##_state(pipe, hstate->cso); \
tgsi_free_tokens(hstate->state.shader.tokens); \
FREE(hstate); \
}
DD_SHADER(FRAGMENT, fs)
DD_SHADER(VERTEX, vs)
DD_SHADER(GEOMETRY, gs)
DD_SHADER(TESS_CTRL, tcs)
DD_SHADER(TESS_EVAL, tes)
/********************************************************************
* immediate states
*/
#define DD_IMM_STATE(name, type, deref, ref) \
static void \
dd_context_set_##name(struct pipe_context *_pipe, type deref) \
{ \
struct dd_context *dctx = dd_context(_pipe); \
struct pipe_context *pipe = dctx->pipe; \
\
dctx->name = deref; \
pipe->set_##name(pipe, ref); \
}
DD_IMM_STATE(blend_color, const struct pipe_blend_color, *state, state)
DD_IMM_STATE(stencil_ref, const struct pipe_stencil_ref, *state, state)
DD_IMM_STATE(clip_state, const struct pipe_clip_state, *state, state)
DD_IMM_STATE(sample_mask, unsigned, sample_mask, sample_mask)
DD_IMM_STATE(min_samples, unsigned, min_samples, min_samples)
DD_IMM_STATE(framebuffer_state, const struct pipe_framebuffer_state, *state, state)
DD_IMM_STATE(polygon_stipple, const struct pipe_poly_stipple, *state, state)
static void
dd_context_set_constant_buffer(struct pipe_context *_pipe,
uint shader, uint index,
struct pipe_constant_buffer *constant_buffer)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
safe_memcpy(&dctx->constant_buffers[shader][index], constant_buffer,
sizeof(*constant_buffer));
pipe->set_constant_buffer(pipe, shader, index, constant_buffer);
}
static void
dd_context_set_scissor_states(struct pipe_context *_pipe,
unsigned start_slot, unsigned num_scissors,
const struct pipe_scissor_state *states)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
safe_memcpy(&dctx->scissors[start_slot], states,
sizeof(*states) * num_scissors);
pipe->set_scissor_states(pipe, start_slot, num_scissors, states);
}
static void
dd_context_set_viewport_states(struct pipe_context *_pipe,
unsigned start_slot, unsigned num_viewports,
const struct pipe_viewport_state *states)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
safe_memcpy(&dctx->viewports[start_slot], states,
sizeof(*states) * num_viewports);
pipe->set_viewport_states(pipe, start_slot, num_viewports, states);
}
static void dd_context_set_tess_state(struct pipe_context *_pipe,
const float default_outer_level[4],
const float default_inner_level[2])
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
memcpy(dctx->tess_default_levels, default_outer_level, sizeof(float) * 4);
memcpy(dctx->tess_default_levels+4, default_inner_level, sizeof(float) * 2);
pipe->set_tess_state(pipe, default_outer_level, default_inner_level);
}
/********************************************************************
* views
*/
static struct pipe_surface *
dd_context_create_surface(struct pipe_context *_pipe,
struct pipe_resource *resource,
const struct pipe_surface *surf_tmpl)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
struct pipe_surface *view =
pipe->create_surface(pipe, resource, surf_tmpl);
if (!view)
return NULL;
view->context = _pipe;
return view;
}
static void
dd_context_surface_destroy(struct pipe_context *_pipe,
struct pipe_surface *surf)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
pipe->surface_destroy(pipe, surf);
}
static struct pipe_sampler_view *
dd_context_create_sampler_view(struct pipe_context *_pipe,
struct pipe_resource *resource,
const struct pipe_sampler_view *templ)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
struct pipe_sampler_view *view =
pipe->create_sampler_view(pipe, resource, templ);
if (!view)
return NULL;
view->context = _pipe;
return view;
}
static void
dd_context_sampler_view_destroy(struct pipe_context *_pipe,
struct pipe_sampler_view *view)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
pipe->sampler_view_destroy(pipe, view);
}
static struct pipe_image_view *
dd_context_create_image_view(struct pipe_context *_pipe,
struct pipe_resource *resource,
const struct pipe_image_view *templ)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
struct pipe_image_view *view =
pipe->create_image_view(pipe, resource, templ);
if (!view)
return NULL;
view->context = _pipe;
return view;
}
static void
dd_context_image_view_destroy(struct pipe_context *_pipe,
struct pipe_image_view *view)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
pipe->image_view_destroy(pipe, view);
}
static struct pipe_stream_output_target *
dd_context_create_stream_output_target(struct pipe_context *_pipe,
struct pipe_resource *res,
unsigned buffer_offset,
unsigned buffer_size)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
struct pipe_stream_output_target *view =
pipe->create_stream_output_target(pipe, res, buffer_offset,
buffer_size);
if (!view)
return NULL;
view->context = _pipe;
return view;
}
static void
dd_context_stream_output_target_destroy(struct pipe_context *_pipe,
struct pipe_stream_output_target *target)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
pipe->stream_output_target_destroy(pipe, target);
}
/********************************************************************
* set states
*/
static void
dd_context_set_sampler_views(struct pipe_context *_pipe, unsigned shader,
unsigned start, unsigned num,
struct pipe_sampler_view **views)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
safe_memcpy(&dctx->sampler_views[shader][start], views,
sizeof(views[0]) * num);
pipe->set_sampler_views(pipe, shader, start, num, views);
}
static void
dd_context_set_shader_images(struct pipe_context *_pipe, unsigned shader,
unsigned start, unsigned num,
struct pipe_image_view **views)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
safe_memcpy(&dctx->shader_images[shader][start], views,
sizeof(views[0]) * num);
pipe->set_shader_images(pipe, shader, start, num, views);
}
static void
dd_context_set_shader_buffers(struct pipe_context *_pipe, unsigned shader,
unsigned start, unsigned num_buffers,
struct pipe_shader_buffer *buffers)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
safe_memcpy(&dctx->shader_buffers[shader][start], buffers,
sizeof(buffers[0]) * num_buffers);
pipe->set_shader_buffers(pipe, shader, start, num_buffers, buffers);
}
static void
dd_context_set_vertex_buffers(struct pipe_context *_pipe,
unsigned start, unsigned num_buffers,
const struct pipe_vertex_buffer *buffers)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
safe_memcpy(&dctx->vertex_buffers[start], buffers,
sizeof(buffers[0]) * num_buffers);
pipe->set_vertex_buffers(pipe, start, num_buffers, buffers);
}
static void
dd_context_set_index_buffer(struct pipe_context *_pipe,
const struct pipe_index_buffer *ib)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
safe_memcpy(&dctx->index_buffer, ib, sizeof(*ib));
pipe->set_index_buffer(pipe, ib);
}
static void
dd_context_set_stream_output_targets(struct pipe_context *_pipe,
unsigned num_targets,
struct pipe_stream_output_target **tgs,
const unsigned *offsets)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
dctx->num_so_targets = num_targets;
safe_memcpy(dctx->so_targets, tgs, sizeof(*tgs) * num_targets);
safe_memcpy(dctx->so_offsets, offsets, sizeof(*offsets) * num_targets);
pipe->set_stream_output_targets(pipe, num_targets, tgs, offsets);
}
static void
dd_context_destroy(struct pipe_context *_pipe)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
pipe->destroy(pipe);
FREE(dctx);
}
/********************************************************************
* transfer
*/
static void *
dd_context_transfer_map(struct pipe_context *_pipe,
struct pipe_resource *resource, unsigned level,
unsigned usage, const struct pipe_box *box,
struct pipe_transfer **transfer)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
return pipe->transfer_map(pipe, resource, level, usage, box, transfer);
}
static void
dd_context_transfer_flush_region(struct pipe_context *_pipe,
struct pipe_transfer *transfer,
const struct pipe_box *box)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
pipe->transfer_flush_region(pipe, transfer, box);
}
static void
dd_context_transfer_unmap(struct pipe_context *_pipe,
struct pipe_transfer *transfer)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
pipe->transfer_unmap(pipe, transfer);
}
static void
dd_context_transfer_inline_write(struct pipe_context *_pipe,
struct pipe_resource *resource,
unsigned level, unsigned usage,
const struct pipe_box *box,
const void *data, unsigned stride,
unsigned layer_stride)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
pipe->transfer_inline_write(pipe, resource, level, usage, box, data,
stride, layer_stride);
}
/********************************************************************
* miscellaneous
*/
static void
dd_context_texture_barrier(struct pipe_context *_pipe)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
pipe->texture_barrier(pipe);
}
static void
dd_context_memory_barrier(struct pipe_context *_pipe, unsigned flags)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
pipe->memory_barrier(pipe, flags);
}
static void
dd_context_get_sample_position(struct pipe_context *_pipe,
unsigned sample_count, unsigned sample_index,
float *out_value)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
return pipe->get_sample_position(pipe, sample_count, sample_index,
out_value);
}
static void
dd_context_invalidate_resource(struct pipe_context *_pipe,
struct pipe_resource *resource)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
pipe->invalidate_resource(pipe, resource);
}
static enum pipe_reset_status
dd_context_get_device_reset_status(struct pipe_context *_pipe)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
return pipe->get_device_reset_status(pipe);
}
static void
dd_context_dump_debug_state(struct pipe_context *_pipe, FILE *stream,
unsigned flags)
{
struct pipe_context *pipe = dd_context(_pipe)->pipe;
return pipe->dump_debug_state(pipe, stream, flags);
}
struct pipe_context *
dd_context_create(struct dd_screen *dscreen, struct pipe_context *pipe)
{
struct dd_context *dctx;
if (!pipe)
return NULL;
dctx = CALLOC_STRUCT(dd_context);
if (!dctx) {
pipe->destroy(pipe);
return NULL;
}
dctx->pipe = pipe;
dctx->base.priv = pipe->priv; /* expose wrapped priv data */
dctx->base.screen = &dscreen->base;
dctx->base.destroy = dd_context_destroy;
CTX_INIT(render_condition);
CTX_INIT(create_query);
CTX_INIT(destroy_query);
CTX_INIT(begin_query);
CTX_INIT(end_query);
CTX_INIT(get_query_result);
CTX_INIT(create_blend_state);
CTX_INIT(bind_blend_state);
CTX_INIT(delete_blend_state);
CTX_INIT(create_sampler_state);
CTX_INIT(bind_sampler_states);
CTX_INIT(delete_sampler_state);
CTX_INIT(create_rasterizer_state);
CTX_INIT(bind_rasterizer_state);
CTX_INIT(delete_rasterizer_state);
CTX_INIT(create_depth_stencil_alpha_state);
CTX_INIT(bind_depth_stencil_alpha_state);
CTX_INIT(delete_depth_stencil_alpha_state);
CTX_INIT(create_fs_state);
CTX_INIT(bind_fs_state);
CTX_INIT(delete_fs_state);
CTX_INIT(create_vs_state);
CTX_INIT(bind_vs_state);
CTX_INIT(delete_vs_state);
CTX_INIT(create_gs_state);
CTX_INIT(bind_gs_state);
CTX_INIT(delete_gs_state);
CTX_INIT(create_tcs_state);
CTX_INIT(bind_tcs_state);
CTX_INIT(delete_tcs_state);
CTX_INIT(create_tes_state);
CTX_INIT(bind_tes_state);
CTX_INIT(delete_tes_state);
CTX_INIT(create_vertex_elements_state);
CTX_INIT(bind_vertex_elements_state);
CTX_INIT(delete_vertex_elements_state);
CTX_INIT(set_blend_color);
CTX_INIT(set_stencil_ref);
CTX_INIT(set_sample_mask);
CTX_INIT(set_min_samples);
CTX_INIT(set_clip_state);
CTX_INIT(set_constant_buffer);
CTX_INIT(set_framebuffer_state);
CTX_INIT(set_polygon_stipple);
CTX_INIT(set_scissor_states);
CTX_INIT(set_viewport_states);
CTX_INIT(set_sampler_views);
CTX_INIT(set_tess_state);
CTX_INIT(set_shader_buffers);
CTX_INIT(set_shader_images);
CTX_INIT(set_vertex_buffers);
CTX_INIT(set_index_buffer);
CTX_INIT(create_stream_output_target);
CTX_INIT(stream_output_target_destroy);
CTX_INIT(set_stream_output_targets);
CTX_INIT(create_sampler_view);
CTX_INIT(sampler_view_destroy);
CTX_INIT(create_surface);
CTX_INIT(surface_destroy);
CTX_INIT(create_image_view);
CTX_INIT(image_view_destroy);
CTX_INIT(transfer_map);
CTX_INIT(transfer_flush_region);
CTX_INIT(transfer_unmap);
CTX_INIT(transfer_inline_write);
CTX_INIT(texture_barrier);
CTX_INIT(memory_barrier);
/* create_video_codec */
/* create_video_buffer */
/* create_compute_state */
/* bind_compute_state */
/* delete_compute_state */
/* set_compute_resources */
/* set_global_binding */
CTX_INIT(get_sample_position);
CTX_INIT(invalidate_resource);
CTX_INIT(get_device_reset_status);
CTX_INIT(dump_debug_state);
dd_init_draw_functions(dctx);
dctx->sample_mask = ~0;
return &dctx->base;
}

View File

@@ -0,0 +1,807 @@
/**************************************************************************
*
* Copyright 2015 Advanced Micro Devices, Inc.
* Copyright 2008 VMware, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
**************************************************************************/
#include "dd_pipe.h"
#include "util/u_dump.h"
#include "util/u_format.h"
#include "tgsi/tgsi_scan.h"
#include "os/os_process.h"
#include <errno.h>
#include <unistd.h> /* getpid() */
#include <sys/stat.h>
enum call_type
{
CALL_DRAW_VBO,
CALL_RESOURCE_COPY_REGION,
CALL_BLIT,
CALL_FLUSH_RESOURCE,
CALL_CLEAR,
CALL_CLEAR_BUFFER,
CALL_CLEAR_RENDER_TARGET,
CALL_CLEAR_DEPTH_STENCIL,
};
struct call_resource_copy_region
{
struct pipe_resource *dst;
unsigned dst_level;
unsigned dstx, dsty, dstz;
struct pipe_resource *src;
unsigned src_level;
const struct pipe_box *src_box;
};
struct call_clear
{
unsigned buffers;
const union pipe_color_union *color;
double depth;
unsigned stencil;
};
struct call_clear_buffer
{
struct pipe_resource *res;
unsigned offset;
unsigned size;
const void *clear_value;
int clear_value_size;
};
struct dd_call
{
enum call_type type;
union {
struct pipe_draw_info draw_vbo;
struct call_resource_copy_region resource_copy_region;
struct pipe_blit_info blit;
struct pipe_resource *flush_resource;
struct call_clear clear;
struct call_clear_buffer clear_buffer;
} info;
};
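/* Open a new dump file under $HOME/DD_DIR, named
 * <process name>_<pid>_<running index>, and write a header identifying the
 * driver and device. Returns NULL on failure. */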
static FILE *
dd_get_file_stream(struct dd_context *dctx)
{
struct pipe_screen *screen = dctx->pipe->screen;
static unsigned index;
char proc_name[128], dir[256], name[512];
FILE *f;
if (!os_get_process_name(proc_name, sizeof(proc_name))) {
fprintf(stderr, "dd: can't get the process name\n");
return NULL;
}
snprintf(dir, sizeof(dir), "%s/"DD_DIR, debug_get_option("HOME", "."));
if (mkdir(dir, 0774) && errno != EEXIST) {
fprintf(stderr, "dd: can't create a directory (%i)\n", errno);
return NULL;
}
snprintf(name, sizeof(name), "%s/%s_%u_%08u", dir, proc_name, getpid(), index++);
f = fopen(name, "w");
if (!f) {
fprintf(stderr, "dd: can't open file %s\n", name);
return NULL;
}
fprintf(f, "Driver vendor: %s\n", screen->get_vendor(screen));
fprintf(f, "Device vendor: %s\n", screen->get_device_vendor(screen));
fprintf(f, "Device name: %s\n\n", screen->get_name(screen));
return f;
}
static void
dd_close_file_stream(FILE *f)
{
fclose(f);
}
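/* Return how many viewports the bound shaders can address: if the last
 * pre-rasterizer stage (GS, else TES, else VS) writes the viewport index,
 * assume all PIPE_MAX_VIEWPORTS are in use, otherwise just viewport 0. */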
static unsigned
dd_num_active_viewports(struct dd_context *dctx)
{
struct tgsi_shader_info info;
const struct tgsi_token *tokens;
if (dctx->shaders[PIPE_SHADER_GEOMETRY])
tokens = dctx->shaders[PIPE_SHADER_GEOMETRY]->state.shader.tokens;
else if (dctx->shaders[PIPE_SHADER_TESS_EVAL])
tokens = dctx->shaders[PIPE_SHADER_TESS_EVAL]->state.shader.tokens;
else if (dctx->shaders[PIPE_SHADER_VERTEX])
tokens = dctx->shaders[PIPE_SHADER_VERTEX]->state.shader.tokens;
else
return 1;
tgsi_scan_shader(tokens, &info);
return info.writes_viewport_index ? PIPE_MAX_VIEWPORTS : 1;
}
#define COLOR_RESET "\033[0m"
#define COLOR_SHADER "\033[1;32m"
#define COLOR_STATE "\033[1;33m"
#define DUMP(name, var) do { \
fprintf(f, COLOR_STATE #name ": " COLOR_RESET); \
util_dump_##name(f, var); \
fprintf(f, "\n"); \
} while(0)
#define DUMP_I(name, var, i) do { \
fprintf(f, COLOR_STATE #name " %i: " COLOR_RESET, i); \
util_dump_##name(f, var); \
fprintf(f, "\n"); \
} while(0)
#define DUMP_M(name, var, member) do { \
fprintf(f, " " #member ": "); \
util_dump_##name(f, (var)->member); \
fprintf(f, "\n"); \
} while(0)
#define DUMP_M_ADDR(name, var, member) do { \
fprintf(f, " " #member ": "); \
util_dump_##name(f, &(var)->member); \
fprintf(f, "\n"); \
} while(0)
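/* For example, DUMP(framebuffer_state, &dctx->framebuffer_state) expands to:
 *
 *    fprintf(f, COLOR_STATE "framebuffer_state: " COLOR_RESET);
 *    util_dump_framebuffer_state(f, &dctx->framebuffer_state);
 *    fprintf(f, "\n");
 *
 * DUMP and DUMP_I print a highlighted label, while the DUMP_M variants print
 * an indented member name; all of them delegate to the matching util_dump_*
 * (or local static) dump helper. */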
static void
print_named_value(FILE *f, const char *name, int value)
{
fprintf(f, COLOR_STATE "%s" COLOR_RESET " = %i\n", name, value);
}
static void
print_named_xvalue(FILE *f, const char *name, int value)
{
fprintf(f, COLOR_STATE "%s" COLOR_RESET " = 0x%08x\n", name, value);
}
static void
util_dump_uint(FILE *f, unsigned i)
{
fprintf(f, "%u", i);
}
static void
util_dump_hex(FILE *f, unsigned i)
{
fprintf(f, "0x%x", i);
}
static void
util_dump_double(FILE *f, double d)
{
fprintf(f, "%f", d);
}
static void
util_dump_format(FILE *f, enum pipe_format format)
{
fprintf(f, "%s", util_format_name(format));
}
static void
util_dump_color_union(FILE *f, const union pipe_color_union *color)
{
fprintf(f, "{f = {%f, %f, %f, %f}, ui = {%u, %u, %u, %u}",
color->f[0], color->f[1], color->f[2], color->f[3],
color->ui[0], color->ui[1], color->ui[2], color->ui[3]);
}
static void
util_dump_query(FILE *f, struct dd_query *query)
{
if (query->type >= PIPE_QUERY_DRIVER_SPECIFIC)
fprintf(f, "PIPE_QUERY_DRIVER_SPECIFIC + %i",
query->type - PIPE_QUERY_DRIVER_SPECIFIC);
else
fprintf(f, "%s", util_dump_query_type(query->type, false));
}
static void
dd_dump_render_condition(struct dd_context *dctx, FILE *f)
{
if (dctx->render_cond.query) {
fprintf(f, "render condition:\n");
DUMP_M(query, &dctx->render_cond, query);
DUMP_M(uint, &dctx->render_cond, condition);
DUMP_M(uint, &dctx->render_cond, mode);
fprintf(f, "\n");
}
}
static void
dd_dump_draw_vbo(struct dd_context *dctx, struct pipe_draw_info *info, FILE *f)
{
int sh, i;
const char *shader_str[PIPE_SHADER_TYPES];
shader_str[PIPE_SHADER_VERTEX] = "VERTEX";
shader_str[PIPE_SHADER_TESS_CTRL] = "TESS_CTRL";
shader_str[PIPE_SHADER_TESS_EVAL] = "TESS_EVAL";
shader_str[PIPE_SHADER_GEOMETRY] = "GEOMETRY";
shader_str[PIPE_SHADER_FRAGMENT] = "FRAGMENT";
shader_str[PIPE_SHADER_COMPUTE] = "COMPUTE";
DUMP(draw_info, info);
if (info->indexed) {
DUMP(index_buffer, &dctx->index_buffer);
if (dctx->index_buffer.buffer)
DUMP_M(resource, &dctx->index_buffer, buffer);
}
if (info->count_from_stream_output)
DUMP_M(stream_output_target, info,
count_from_stream_output);
if (info->indirect)
DUMP_M(resource, info, indirect);
fprintf(f, "\n");
/* TODO: dump active queries */
dd_dump_render_condition(dctx, f);
for (i = 0; i < PIPE_MAX_ATTRIBS; i++)
if (dctx->vertex_buffers[i].buffer ||
dctx->vertex_buffers[i].user_buffer) {
DUMP_I(vertex_buffer, &dctx->vertex_buffers[i], i);
if (dctx->vertex_buffers[i].buffer)
DUMP_M(resource, &dctx->vertex_buffers[i], buffer);
}
if (dctx->velems) {
print_named_value(f, "num vertex elements",
dctx->velems->state.velems.count);
for (i = 0; i < dctx->velems->state.velems.count; i++) {
fprintf(f, " ");
DUMP_I(vertex_element, &dctx->velems->state.velems.velems[i], i);
}
}
print_named_value(f, "num stream output targets", dctx->num_so_targets);
for (i = 0; i < dctx->num_so_targets; i++)
if (dctx->so_targets[i]) {
DUMP_I(stream_output_target, dctx->so_targets[i], i);
DUMP_M(resource, dctx->so_targets[i], buffer);
fprintf(f, " offset = %i\n", dctx->so_offsets[i]);
}
fprintf(f, "\n");
for (sh = 0; sh < PIPE_SHADER_TYPES; sh++) {
if (sh == PIPE_SHADER_COMPUTE)
continue;
if (sh == PIPE_SHADER_TESS_CTRL &&
!dctx->shaders[PIPE_SHADER_TESS_CTRL] &&
dctx->shaders[PIPE_SHADER_TESS_EVAL])
fprintf(f, "tess_state: {default_outer_level = {%f, %f, %f, %f}, "
"default_inner_level = {%f, %f}}\n",
dctx->tess_default_levels[0],
dctx->tess_default_levels[1],
dctx->tess_default_levels[2],
dctx->tess_default_levels[3],
dctx->tess_default_levels[4],
dctx->tess_default_levels[5]);
if (sh == PIPE_SHADER_FRAGMENT)
if (dctx->rs) {
unsigned num_viewports = dd_num_active_viewports(dctx);
if (dctx->rs->state.rs.clip_plane_enable)
DUMP(clip_state, &dctx->clip_state);
for (i = 0; i < num_viewports; i++)
DUMP_I(viewport_state, &dctx->viewports[i], i);
if (dctx->rs->state.rs.scissor)
for (i = 0; i < num_viewports; i++)
DUMP_I(scissor_state, &dctx->scissors[i], i);
DUMP(rasterizer_state, &dctx->rs->state.rs);
if (dctx->rs->state.rs.poly_stipple_enable)
DUMP(poly_stipple, &dctx->polygon_stipple);
fprintf(f, "\n");
}
if (!dctx->shaders[sh])
continue;
fprintf(f, COLOR_SHADER "begin shader: %s" COLOR_RESET "\n", shader_str[sh]);
DUMP(shader_state, &dctx->shaders[sh]->state.shader);
for (i = 0; i < PIPE_MAX_CONSTANT_BUFFERS; i++)
if (dctx->constant_buffers[sh][i].buffer ||
dctx->constant_buffers[sh][i].user_buffer) {
DUMP_I(constant_buffer, &dctx->constant_buffers[sh][i], i);
if (dctx->constant_buffers[sh][i].buffer)
DUMP_M(resource, &dctx->constant_buffers[sh][i], buffer);
}
for (i = 0; i < PIPE_MAX_SAMPLERS; i++)
if (dctx->sampler_states[sh][i])
DUMP_I(sampler_state, &dctx->sampler_states[sh][i]->state.sampler, i);
for (i = 0; i < PIPE_MAX_SAMPLERS; i++)
if (dctx->sampler_views[sh][i]) {
DUMP_I(sampler_view, dctx->sampler_views[sh][i], i);
DUMP_M(resource, dctx->sampler_views[sh][i], texture);
}
/* TODO: print shader images */
/* TODO: print shader buffers */
fprintf(f, COLOR_SHADER "end shader: %s" COLOR_RESET "\n\n", shader_str[sh]);
}
if (dctx->dsa)
DUMP(depth_stencil_alpha_state, &dctx->dsa->state.dsa);
DUMP(stencil_ref, &dctx->stencil_ref);
if (dctx->blend)
DUMP(blend_state, &dctx->blend->state.blend);
DUMP(blend_color, &dctx->blend_color);
print_named_value(f, "min_samples", dctx->min_samples);
print_named_xvalue(f, "sample_mask", dctx->sample_mask);
fprintf(f, "\n");
DUMP(framebuffer_state, &dctx->framebuffer_state);
for (i = 0; i < dctx->framebuffer_state.nr_cbufs; i++)
if (dctx->framebuffer_state.cbufs[i]) {
fprintf(f, " " COLOR_STATE "cbufs[%i]:" COLOR_RESET "\n ", i);
DUMP(surface, dctx->framebuffer_state.cbufs[i]);
fprintf(f, " ");
DUMP(resource, dctx->framebuffer_state.cbufs[i]->texture);
}
if (dctx->framebuffer_state.zsbuf) {
fprintf(f, " " COLOR_STATE "zsbuf:" COLOR_RESET "\n ");
DUMP(surface, dctx->framebuffer_state.zsbuf);
fprintf(f, " ");
DUMP(resource, dctx->framebuffer_state.zsbuf->texture);
}
fprintf(f, "\n");
}
static void
dd_dump_resource_copy_region(struct dd_context *dctx,
struct call_resource_copy_region *info,
FILE *f)
{
fprintf(f, "%s:\n", __func__+8);
DUMP_M(resource, info, dst);
DUMP_M(uint, info, dst_level);
DUMP_M(uint, info, dstx);
DUMP_M(uint, info, dsty);
DUMP_M(uint, info, dstz);
DUMP_M(resource, info, src);
DUMP_M(uint, info, src_level);
DUMP_M(box, info, src_box);
}
static void
dd_dump_blit(struct dd_context *dctx, struct pipe_blit_info *info, FILE *f)
{
fprintf(f, "%s:\n", __func__+8);
DUMP_M(resource, info, dst.resource);
DUMP_M(uint, info, dst.level);
DUMP_M_ADDR(box, info, dst.box);
DUMP_M(format, info, dst.format);
DUMP_M(resource, info, src.resource);
DUMP_M(uint, info, src.level);
DUMP_M_ADDR(box, info, src.box);
DUMP_M(format, info, src.format);
DUMP_M(hex, info, mask);
DUMP_M(uint, info, filter);
DUMP_M(uint, info, scissor_enable);
DUMP_M_ADDR(scissor_state, info, scissor);
DUMP_M(uint, info, render_condition_enable);
if (info->render_condition_enable)
dd_dump_render_condition(dctx, f);
}
static void
dd_dump_flush_resource(struct dd_context *dctx, struct pipe_resource *res,
FILE *f)
{
fprintf(f, "%s:\n", __func__+8);
DUMP(resource, res);
}
static void
dd_dump_clear(struct dd_context *dctx, struct call_clear *info, FILE *f)
{
fprintf(f, "%s:\n", __func__+8);
DUMP_M(uint, info, buffers);
DUMP_M(color_union, info, color);
DUMP_M(double, info, depth);
DUMP_M(hex, info, stencil);
}
static void
dd_dump_clear_buffer(struct dd_context *dctx, struct call_clear_buffer *info,
FILE *f)
{
int i;
/* unsigned so the "%02x" dump below doesn't sign-extend negative bytes */
const unsigned char *value = (const unsigned char *)info->clear_value;
fprintf(f, "%s:\n", __func__+8);
DUMP_M(resource, info, res);
DUMP_M(uint, info, offset);
DUMP_M(uint, info, size);
DUMP_M(uint, info, clear_value_size);
fprintf(f, " clear_value:");
for (i = 0; i < info->clear_value_size; i++)
fprintf(f, " %02x", value[i]);
fprintf(f, "\n");
}
static void
dd_dump_clear_render_target(struct dd_context *dctx, FILE *f)
{
fprintf(f, "%s:\n", __func__+8);
/* TODO */
}
static void
dd_dump_clear_depth_stencil(struct dd_context *dctx, FILE *f)
{
fprintf(f, "%s:\n", __func__+8);
/* TODO */
}
static void
dd_dump_driver_state(struct dd_context *dctx, FILE *f, unsigned flags)
{
if (dctx->pipe->dump_debug_state) {
fprintf(f,"\n\n**************************************************"
"***************************\n");
fprintf(f, "Driver-specific state:\n\n");
dctx->pipe->dump_debug_state(dctx->pipe, f, flags);
}
}
static void
dd_dump_call(struct dd_context *dctx, struct dd_call *call, unsigned flags)
{
FILE *f = dd_get_file_stream(dctx);
if (!f)
return;
switch (call->type) {
case CALL_DRAW_VBO:
dd_dump_draw_vbo(dctx, &call->info.draw_vbo, f);
break;
case CALL_RESOURCE_COPY_REGION:
dd_dump_resource_copy_region(dctx, &call->info.resource_copy_region, f);
break;
case CALL_BLIT:
dd_dump_blit(dctx, &call->info.blit, f);
break;
case CALL_FLUSH_RESOURCE:
dd_dump_flush_resource(dctx, call->info.flush_resource, f);
break;
case CALL_CLEAR:
dd_dump_clear(dctx, &call->info.clear, f);
break;
case CALL_CLEAR_BUFFER:
dd_dump_clear_buffer(dctx, &call->info.clear_buffer, f);
break;
case CALL_CLEAR_RENDER_TARGET:
dd_dump_clear_render_target(dctx, f);
break;
case CALL_CLEAR_DEPTH_STENCIL:
dd_dump_clear_depth_stencil(dctx, f);
}
dd_dump_driver_state(dctx, f, flags);
dd_close_file_stream(f);
}
static void
dd_kill_process(void)
{
sync();
fprintf(stderr, "dd: Aborting the process...\n");
fflush(stdout);
fflush(stderr);
abort();
}
static bool
dd_flush_and_check_hang(struct dd_context *dctx,
struct pipe_fence_handle **flush_fence,
unsigned flush_flags)
{
struct pipe_fence_handle *fence = NULL;
struct pipe_context *pipe = dctx->pipe;
struct pipe_screen *screen = pipe->screen;
uint64_t timeout_ms = dd_screen(dctx->base.screen)->timeout_ms;
bool idle;
assert(timeout_ms > 0);
pipe->flush(pipe, &fence, flush_flags);
if (flush_fence)
screen->fence_reference(screen, flush_fence, fence);
if (!fence)
return false;
idle = screen->fence_finish(screen, fence, timeout_ms * 1000000);
screen->fence_reference(screen, &fence, NULL);
if (!idle)
fprintf(stderr, "dd: GPU hang detected!\n");
return !idle;
}
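/* Flush, and if the fence does not signal within the configured timeout,
 * dump the driver state into a new dump file and kill the process so the
 * hang is recorded as close to its cause as possible. */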
static void
dd_flush_and_handle_hang(struct dd_context *dctx,
struct pipe_fence_handle **fence, unsigned flags,
const char *cause)
{
if (dd_flush_and_check_hang(dctx, fence, flags)) {
FILE *f = dd_get_file_stream(dctx);
if (f) {
fprintf(f, "dd: %s.\n", cause);
dd_dump_driver_state(dctx, f, PIPE_DEBUG_DEVICE_IS_HUNG);
dd_close_file_stream(f);
}
/* Terminate the process to prevent future hangs. */
dd_kill_process();
}
}
static void
dd_context_flush(struct pipe_context *_pipe,
struct pipe_fence_handle **fence, unsigned flags)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
switch (dd_screen(dctx->base.screen)->mode) {
case DD_DETECT_HANGS:
dd_flush_and_handle_hang(dctx, fence, flags,
"GPU hang detected in pipe->flush()");
break;
case DD_DUMP_ALL_CALLS:
pipe->flush(pipe, fence, flags);
break;
default:
assert(0);
}
}
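/* In hang-detection mode, flush and check for a hang before recording a new
 * call, so that a hang caused by previously submitted (possibly internal
 * driver) commands is not attributed to the call that follows. */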
static void
dd_before_draw(struct dd_context *dctx)
{
if (dd_screen(dctx->base.screen)->mode == DD_DETECT_HANGS &&
!dd_screen(dctx->base.screen)->no_flush)
dd_flush_and_handle_hang(dctx, NULL, 0,
"GPU hang most likely caused by internal "
"driver commands");
}
static void
dd_after_draw(struct dd_context *dctx, struct dd_call *call)
{
switch (dd_screen(dctx->base.screen)->mode) {
case DD_DETECT_HANGS:
if (!dd_screen(dctx->base.screen)->no_flush &&
dd_flush_and_check_hang(dctx, NULL, 0)) {
dd_dump_call(dctx, call, PIPE_DEBUG_DEVICE_IS_HUNG);
/* Terminate the process to prevent future hangs. */
dd_kill_process();
}
break;
case DD_DUMP_ALL_CALLS:
dd_dump_call(dctx, call, 0);
break;
default:
assert(0);
}
}
static void
dd_context_draw_vbo(struct pipe_context *_pipe,
const struct pipe_draw_info *info)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
struct dd_call call;
call.type = CALL_DRAW_VBO;
call.info.draw_vbo = *info;
dd_before_draw(dctx);
pipe->draw_vbo(pipe, info);
dd_after_draw(dctx, &call);
}
static void
dd_context_resource_copy_region(struct pipe_context *_pipe,
struct pipe_resource *dst, unsigned dst_level,
unsigned dstx, unsigned dsty, unsigned dstz,
struct pipe_resource *src, unsigned src_level,
const struct pipe_box *src_box)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
struct dd_call call;
call.type = CALL_RESOURCE_COPY_REGION;
call.info.resource_copy_region.dst = dst;
call.info.resource_copy_region.dst_level = dst_level;
call.info.resource_copy_region.dstx = dstx;
call.info.resource_copy_region.dsty = dsty;
call.info.resource_copy_region.dstz = dstz;
call.info.resource_copy_region.src = src;
call.info.resource_copy_region.src_level = src_level;
call.info.resource_copy_region.src_box = src_box;
dd_before_draw(dctx);
pipe->resource_copy_region(pipe,
dst, dst_level, dstx, dsty, dstz,
src, src_level, src_box);
dd_after_draw(dctx, &call);
}
static void
dd_context_blit(struct pipe_context *_pipe, const struct pipe_blit_info *info)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
struct dd_call call;
call.type = CALL_BLIT;
call.info.blit = *info;
dd_before_draw(dctx);
pipe->blit(pipe, info);
dd_after_draw(dctx, &call);
}
static void
dd_context_flush_resource(struct pipe_context *_pipe,
struct pipe_resource *resource)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
struct dd_call call;
call.type = CALL_FLUSH_RESOURCE;
call.info.flush_resource = resource;
dd_before_draw(dctx);
pipe->flush_resource(pipe, resource);
dd_after_draw(dctx, &call);
}
static void
dd_context_clear(struct pipe_context *_pipe, unsigned buffers,
const union pipe_color_union *color, double depth,
unsigned stencil)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
struct dd_call call;
call.type = CALL_CLEAR;
call.info.clear.buffers = buffers;
call.info.clear.color = color;
call.info.clear.depth = depth;
call.info.clear.stencil = stencil;
dd_before_draw(dctx);
pipe->clear(pipe, buffers, color, depth, stencil);
dd_after_draw(dctx, &call);
}
static void
dd_context_clear_render_target(struct pipe_context *_pipe,
struct pipe_surface *dst,
const union pipe_color_union *color,
unsigned dstx, unsigned dsty,
unsigned width, unsigned height)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
struct dd_call call;
call.type = CALL_CLEAR_RENDER_TARGET;
dd_before_draw(dctx);
pipe->clear_render_target(pipe, dst, color, dstx, dsty, width, height);
dd_after_draw(dctx, &call);
}
static void
dd_context_clear_depth_stencil(struct pipe_context *_pipe,
struct pipe_surface *dst, unsigned clear_flags,
double depth, unsigned stencil, unsigned dstx,
unsigned dsty, unsigned width, unsigned height)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
struct dd_call call;
call.type = CALL_CLEAR_DEPTH_STENCIL;
dd_before_draw(dctx);
pipe->clear_depth_stencil(pipe, dst, clear_flags, depth, stencil,
dstx, dsty, width, height);
dd_after_draw(dctx, &call);
}
static void
dd_context_clear_buffer(struct pipe_context *_pipe, struct pipe_resource *res,
unsigned offset, unsigned size,
const void *clear_value, int clear_value_size)
{
struct dd_context *dctx = dd_context(_pipe);
struct pipe_context *pipe = dctx->pipe;
struct dd_call call;
call.type = CALL_CLEAR_BUFFER;
call.info.clear_buffer.res = res;
call.info.clear_buffer.offset = offset;
call.info.clear_buffer.size = size;
call.info.clear_buffer.clear_value = clear_value;
call.info.clear_buffer.clear_value_size = clear_value_size;
dd_before_draw(dctx);
pipe->clear_buffer(pipe, res, offset, size, clear_value, clear_value_size);
dd_after_draw(dctx, &call);
}
void
dd_init_draw_functions(struct dd_context *dctx)
{
CTX_INIT(flush);
CTX_INIT(draw_vbo);
CTX_INIT(resource_copy_region);
CTX_INIT(blit);
CTX_INIT(clear);
CTX_INIT(clear_render_target);
CTX_INIT(clear_depth_stencil);
CTX_INIT(clear_buffer);
CTX_INIT(flush_resource);
/* launch_grid */
}

View File

@@ -0,0 +1,141 @@
/**************************************************************************
*
* Copyright 2015 Advanced Micro Devices, Inc.
* Copyright 2008 VMware, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
**************************************************************************/
#ifndef DD_H_
#define DD_H_
#include "pipe/p_context.h"
#include "pipe/p_state.h"
#include "pipe/p_screen.h"
/* name of the directory in home */
#define DD_DIR "ddebug_dumps"
enum dd_mode {
DD_DETECT_HANGS,
DD_DUMP_ALL_CALLS
};
struct dd_screen
{
struct pipe_screen base;
struct pipe_screen *screen;
unsigned timeout_ms;
enum dd_mode mode;
bool no_flush;
};
struct dd_query
{
unsigned type;
struct pipe_query *query;
};
struct dd_state
{
void *cso;
union {
struct pipe_blend_state blend;
struct pipe_depth_stencil_alpha_state dsa;
struct pipe_rasterizer_state rs;
struct pipe_sampler_state sampler;
struct {
struct pipe_vertex_element velems[PIPE_MAX_ATTRIBS];
unsigned count;
} velems;
struct pipe_shader_state shader;
} state;
};
struct dd_context
{
struct pipe_context base;
struct pipe_context *pipe;
struct {
struct dd_query *query;
bool condition;
unsigned mode;
} render_cond;
struct pipe_index_buffer index_buffer;
struct pipe_vertex_buffer vertex_buffers[PIPE_MAX_ATTRIBS];
unsigned num_so_targets;
struct pipe_stream_output_target *so_targets[PIPE_MAX_SO_BUFFERS];
unsigned so_offsets[PIPE_MAX_SO_BUFFERS];
struct dd_state *shaders[PIPE_SHADER_TYPES];
struct pipe_constant_buffer constant_buffers[PIPE_SHADER_TYPES][PIPE_MAX_CONSTANT_BUFFERS];
struct pipe_sampler_view *sampler_views[PIPE_SHADER_TYPES][PIPE_MAX_SAMPLERS];
struct dd_state *sampler_states[PIPE_SHADER_TYPES][PIPE_MAX_SAMPLERS];
struct pipe_image_view *shader_images[PIPE_SHADER_TYPES][PIPE_MAX_SHADER_IMAGES];
struct pipe_shader_buffer shader_buffers[PIPE_SHADER_TYPES][PIPE_MAX_SHADER_BUFFERS];
struct dd_state *velems;
struct dd_state *rs;
struct dd_state *dsa;
struct dd_state *blend;
struct pipe_blend_color blend_color;
struct pipe_stencil_ref stencil_ref;
unsigned sample_mask;
unsigned min_samples;
struct pipe_clip_state clip_state;
struct pipe_framebuffer_state framebuffer_state;
struct pipe_poly_stipple polygon_stipple;
struct pipe_scissor_state scissors[PIPE_MAX_VIEWPORTS];
struct pipe_viewport_state viewports[PIPE_MAX_VIEWPORTS];
float tess_default_levels[6];
};
struct pipe_context *
dd_context_create(struct dd_screen *dscreen, struct pipe_context *pipe);
void
dd_init_draw_functions(struct dd_context *dctx);
static inline struct dd_context *
dd_context(struct pipe_context *pipe)
{
return (struct dd_context *)pipe;
}
static inline struct dd_screen *
dd_screen(struct pipe_screen *screen)
{
return (struct dd_screen*)screen;
}
#define CTX_INIT(_member) \
dctx->base._member = dctx->pipe->_member ? dd_context_##_member : NULL
#endif /* DD_H_ */
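For reference, the CTX_INIT hookup macro above expands as follows for one entry point; the member shown is just an example taken from dd_context_create():

   /* CTX_INIT(set_sample_mask) expands to: */
   dctx->base.set_sample_mask =
      dctx->pipe->set_sample_mask ? dd_context_set_sample_mask : NULL;

Entry points the wrapped driver does not implement are therefore left NULL in the wrapper as well.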

View File

@@ -0,0 +1,36 @@
/**************************************************************************
*
* Copyright 2015 Advanced Micro Devices, Inc.
* Copyright 2010 VMware, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
**************************************************************************/
#ifndef DD_PUBLIC_H_
#define DD_PUBLIC_H_
struct pipe_screen;
struct pipe_screen *
ddebug_screen_create(struct pipe_screen *screen);
#endif /* DD_PUBLIC_H_ */

View File

@@ -0,0 +1,353 @@
/**************************************************************************
*
* Copyright 2015 Advanced Micro Devices, Inc.
* Copyright 2008 VMware, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
**************************************************************************/
#include "dd_pipe.h"
#include "dd_public.h"
#include "util/u_memory.h"
#include <stdio.h>
static const char *
dd_screen_get_name(struct pipe_screen *_screen)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
return screen->get_name(screen);
}
static const char *
dd_screen_get_vendor(struct pipe_screen *_screen)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
return screen->get_vendor(screen);
}
static const char *
dd_screen_get_device_vendor(struct pipe_screen *_screen)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
return screen->get_device_vendor(screen);
}
static int
dd_screen_get_param(struct pipe_screen *_screen,
enum pipe_cap param)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
return screen->get_param(screen, param);
}
static float
dd_screen_get_paramf(struct pipe_screen *_screen,
enum pipe_capf param)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
return screen->get_paramf(screen, param);
}
static int
dd_screen_get_shader_param(struct pipe_screen *_screen, unsigned shader,
enum pipe_shader_cap param)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
return screen->get_shader_param(screen, shader, param);
}
static uint64_t
dd_screen_get_timestamp(struct pipe_screen *_screen)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
return screen->get_timestamp(screen);
}
static struct pipe_context *
dd_screen_context_create(struct pipe_screen *_screen, void *priv,
unsigned flags)
{
struct dd_screen *dscreen = dd_screen(_screen);
struct pipe_screen *screen = dscreen->screen;
flags |= PIPE_CONTEXT_DEBUG;
return dd_context_create(dscreen,
screen->context_create(screen, priv, flags));
}
static boolean
dd_screen_is_format_supported(struct pipe_screen *_screen,
enum pipe_format format,
enum pipe_texture_target target,
unsigned sample_count,
unsigned tex_usage)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
return screen->is_format_supported(screen, format, target, sample_count,
tex_usage);
}
static boolean
dd_screen_can_create_resource(struct pipe_screen *_screen,
const struct pipe_resource *templat)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
return screen->can_create_resource(screen, templat);
}
static void
dd_screen_flush_frontbuffer(struct pipe_screen *_screen,
struct pipe_resource *resource,
unsigned level, unsigned layer,
void *context_private,
struct pipe_box *sub_box)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
screen->flush_frontbuffer(screen, resource, level, layer, context_private,
sub_box);
}
static int
dd_screen_get_driver_query_info(struct pipe_screen *_screen,
unsigned index,
struct pipe_driver_query_info *info)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
return screen->get_driver_query_info(screen, index, info);
}
static int
dd_screen_get_driver_query_group_info(struct pipe_screen *_screen,
unsigned index,
struct pipe_driver_query_group_info *info)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
return screen->get_driver_query_group_info(screen, index, info);
}
/********************************************************************
* resource
*/
static struct pipe_resource *
dd_screen_resource_create(struct pipe_screen *_screen,
const struct pipe_resource *templat)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
struct pipe_resource *res = screen->resource_create(screen, templat);
if (!res)
return NULL;
res->screen = _screen;
return res;
}
static struct pipe_resource *
dd_screen_resource_from_handle(struct pipe_screen *_screen,
const struct pipe_resource *templ,
struct winsys_handle *handle)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
struct pipe_resource *res =
screen->resource_from_handle(screen, templ, handle);
if (!res)
return NULL;
res->screen = _screen;
return res;
}
static struct pipe_resource *
dd_screen_resource_from_user_memory(struct pipe_screen *_screen,
const struct pipe_resource *templ,
void *user_memory)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
struct pipe_resource *res =
screen->resource_from_user_memory(screen, templ, user_memory);
if (!res)
return NULL;
res->screen = _screen;
return res;
}
static void
dd_screen_resource_destroy(struct pipe_screen *_screen,
struct pipe_resource *res)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
screen->resource_destroy(screen, res);
}
static boolean
dd_screen_resource_get_handle(struct pipe_screen *_screen,
struct pipe_resource *resource,
struct winsys_handle *handle)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
return screen->resource_get_handle(screen, resource, handle);
}
/********************************************************************
* fence
*/
static void
dd_screen_fence_reference(struct pipe_screen *_screen,
struct pipe_fence_handle **pdst,
struct pipe_fence_handle *src)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
screen->fence_reference(screen, pdst, src);
}
static boolean
dd_screen_fence_finish(struct pipe_screen *_screen,
struct pipe_fence_handle *fence,
uint64_t timeout)
{
struct pipe_screen *screen = dd_screen(_screen)->screen;
return screen->fence_finish(screen, fence, timeout);
}
/********************************************************************
* screen
*/
static void
dd_screen_destroy(struct pipe_screen *_screen)
{
struct dd_screen *dscreen = dd_screen(_screen);
struct pipe_screen *screen = dscreen->screen;
screen->destroy(screen);
FREE(dscreen);
}
struct pipe_screen *
ddebug_screen_create(struct pipe_screen *screen)
{
struct dd_screen *dscreen;
const char *option = debug_get_option("GALLIUM_DDEBUG", NULL);
bool dump_always = option && !strcmp(option, "always");
bool no_flush = option && strstr(option, "noflush");
bool help = option && !strcmp(option, "help");
unsigned timeout = 0;
if (help) {
puts("Gallium driver debugger");
puts("");
puts("Usage:");
puts("");
puts(" GALLIUM_DDEBUG=always");
puts(" Dump context and driver information after every draw call into");
puts(" $HOME/"DD_DIR"/.");
puts("");
puts(" GALLIUM_DDEBUG=[timeout in ms] noflush");
puts(" Flush and detect a device hang after every draw call based on the given");
puts(" fence timeout and dump context and driver information into");
puts(" $HOME/"DD_DIR"/ when a hang is detected.");
puts(" If 'noflush' is specified, only detect hangs in pipe->flush.");
puts("");
exit(0);
}
if (!option)
return screen;
if (!dump_always && sscanf(option, "%u", &timeout) != 1)
return screen;
dscreen = CALLOC_STRUCT(dd_screen);
if (!dscreen)
return NULL;
#define SCR_INIT(_member) \
dscreen->base._member = screen->_member ? dd_screen_##_member : NULL
dscreen->base.destroy = dd_screen_destroy;
dscreen->base.get_name = dd_screen_get_name;
dscreen->base.get_vendor = dd_screen_get_vendor;
dscreen->base.get_device_vendor = dd_screen_get_device_vendor;
dscreen->base.get_param = dd_screen_get_param;
dscreen->base.get_paramf = dd_screen_get_paramf;
dscreen->base.get_shader_param = dd_screen_get_shader_param;
/* get_video_param */
/* get_compute_param */
SCR_INIT(get_timestamp);
dscreen->base.context_create = dd_screen_context_create;
dscreen->base.is_format_supported = dd_screen_is_format_supported;
/* is_video_format_supported */
SCR_INIT(can_create_resource);
dscreen->base.resource_create = dd_screen_resource_create;
dscreen->base.resource_from_handle = dd_screen_resource_from_handle;
SCR_INIT(resource_from_user_memory);
dscreen->base.resource_get_handle = dd_screen_resource_get_handle;
dscreen->base.resource_destroy = dd_screen_resource_destroy;
SCR_INIT(flush_frontbuffer);
SCR_INIT(fence_reference);
SCR_INIT(fence_finish);
SCR_INIT(get_driver_query_info);
SCR_INIT(get_driver_query_group_info);
#undef SCR_INIT
dscreen->screen = screen;
dscreen->timeout_ms = timeout;
dscreen->mode = dump_always ? DD_DUMP_ALL_CALLS : DD_DETECT_HANGS;
dscreen->no_flush = no_flush;
switch (dscreen->mode) {
case DD_DUMP_ALL_CALLS:
fprintf(stderr, "Gallium debugger active. Logging all calls.\n");
break;
case DD_DETECT_HANGS:
fprintf(stderr, "Gallium debugger active. "
"The hang detection timout is %i ms.\n", timeout);
break;
default:
assert(0);
}
return &dscreen->base;
}
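As a usage sketch (not part of the patch): a winsys or target would wrap its real screen with this layer once at creation time. The include path and the surrounding function names below are placeholders, not definitions from this series.

   #include "dd_public.h"   /* assumed include path */

   struct pipe_screen *
   example_create_screen(void)              /* hypothetical winsys hook */
   {
      struct pipe_screen *screen = hypothetical_driver_screen_create();

      /* ddebug_screen_create() returns the screen unchanged when
       * GALLIUM_DDEBUG is unset (or not parseable); otherwise it returns the
       * wrapping dd_screen, so every context created afterwards goes through
       * dd_context_create(). */
      return ddebug_screen_create(screen);
   }

With the wrapper installed, GALLIUM_DDEBUG=always dumps every call, and GALLIUM_DDEBUG=<timeout in ms> enables hang detection, as described by the help text above.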

View File

@@ -14,7 +14,7 @@ The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10551 bytes, from 2015-05-20 20:03:14)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14968 bytes, from 2015-05-20 20:12:27)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 67120 bytes, from 2015-08-14 23:22:03)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 63785 bytes, from 2015-08-14 18:27:06)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 63915 bytes, from 2015-08-24 16:56:28)
Copyright (C) 2013-2015 by the following authors:
- Rob Clark <robdclark@gmail.com> (robclark)

View File

@@ -86,7 +86,7 @@ static const uint8_t a20x_primtypes[PIPE_PRIM_MAX] = {
};
struct pipe_context *
fd2_context_create(struct pipe_screen *pscreen, void *priv)
fd2_context_create(struct pipe_screen *pscreen, void *priv, unsigned flags)
{
struct fd_screen *screen = fd_screen(pscreen);
struct fd2_context *fd2_ctx = CALLOC_STRUCT(fd2_context);

View File

@@ -47,6 +47,6 @@ fd2_context(struct fd_context *ctx)
}
struct pipe_context *
fd2_context_create(struct pipe_screen *pscreen, void *priv);
fd2_context_create(struct pipe_screen *pscreen, void *priv, unsigned flags);
#endif /* FD2_CONTEXT_H_ */

View File

@@ -14,7 +14,7 @@ The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10551 bytes, from 2015-05-20 20:03:14)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14968 bytes, from 2015-05-20 20:12:27)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 67120 bytes, from 2015-08-14 23:22:03)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 63785 bytes, from 2015-08-14 18:27:06)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 63915 bytes, from 2015-08-24 16:56:28)
Copyright (C) 2013-2015 by the following authors:
- Rob Clark <robdclark@gmail.com> (robclark)
@@ -680,6 +680,7 @@ static inline uint32_t REG_A3XX_CP_PROTECT_REG(uint32_t i0) { return 0x00000460
#define A3XX_GRAS_CL_CLIP_CNTL_VP_CLIP_CODE_IGNORE 0x00080000
#define A3XX_GRAS_CL_CLIP_CNTL_VP_XFORM_DISABLE 0x00100000
#define A3XX_GRAS_CL_CLIP_CNTL_PERSP_DIVISION_DISABLE 0x00200000
#define A3XX_GRAS_CL_CLIP_CNTL_ZERO_GB_SCALE_Z 0x00400000
#define A3XX_GRAS_CL_CLIP_CNTL_ZCOORD 0x00800000
#define A3XX_GRAS_CL_CLIP_CNTL_WCOORD 0x01000000
#define A3XX_GRAS_CL_CLIP_CNTL_ZCLIP_DISABLE 0x02000000

View File

@@ -98,7 +98,7 @@ static const uint8_t primtypes[PIPE_PRIM_MAX] = {
};
struct pipe_context *
fd3_context_create(struct pipe_screen *pscreen, void *priv)
fd3_context_create(struct pipe_screen *pscreen, void *priv, unsigned flags)
{
struct fd_screen *screen = fd_screen(pscreen);
struct fd3_context *fd3_ctx = CALLOC_STRUCT(fd3_context);

View File

@@ -119,6 +119,6 @@ fd3_context(struct fd_context *ctx)
}
struct pipe_context *
fd3_context_create(struct pipe_screen *pscreen, void *priv);
fd3_context_create(struct pipe_screen *pscreen, void *priv, unsigned flags);
#endif /* FD3_CONTEXT_H_ */

View File

@@ -563,10 +563,29 @@ fd3_emit_state(struct fd_context *ctx, struct fd_ringbuffer *ring,
val |= COND(fp->writes_pos, A3XX_GRAS_CL_CLIP_CNTL_ZCLIP_DISABLE);
val |= COND(fp->frag_coord, A3XX_GRAS_CL_CLIP_CNTL_ZCOORD |
A3XX_GRAS_CL_CLIP_CNTL_WCOORD);
/* TODO only use if prog doesn't use clipvertex/clipdist */
val |= MIN2(util_bitcount(ctx->rasterizer->clip_plane_enable), 6) << 26;
OUT_PKT0(ring, REG_A3XX_GRAS_CL_CLIP_CNTL, 1);
OUT_RING(ring, val);
}
if (dirty & (FD_DIRTY_RASTERIZER | FD_DIRTY_UCP)) {
uint32_t planes = ctx->rasterizer->clip_plane_enable;
int count = 0;
while (planes && count < 6) {
int i = ffs(planes) - 1;
planes &= ~(1U << i);
fd_wfi(ctx, ring);
OUT_PKT0(ring, REG_A3XX_GRAS_CL_USER_PLANE(count++), 4);
OUT_RING(ring, fui(ctx->ucp.ucp[i][0]));
OUT_RING(ring, fui(ctx->ucp.ucp[i][1]));
OUT_RING(ring, fui(ctx->ucp.ucp[i][2]));
OUT_RING(ring, fui(ctx->ucp.ucp[i][3]));
}
}
/* NOTE: since primitive_restart is not actually part of any
* state object, we need to make sure that we always emit
* PRIM_VTX_CNTL.. either that or be more clever and detect

View File

@@ -65,7 +65,8 @@ fd3_rasterizer_state_create(struct pipe_context *pctx,
if (cso->multisample)
TODO
*/
so->gras_cl_clip_cntl = A3XX_GRAS_CL_CLIP_CNTL_IJ_PERSP_CENTER; /* ??? */
so->gras_cl_clip_cntl = A3XX_GRAS_CL_CLIP_CNTL_IJ_PERSP_CENTER /* ??? */ |
COND(cso->clip_halfz, A3XX_GRAS_CL_CLIP_CNTL_ZERO_GB_SCALE_Z);
so->gras_su_point_minmax =
A3XX_GRAS_SU_POINT_MINMAX_MIN(psize_min) |
A3XX_GRAS_SU_POINT_MINMAX_MAX(psize_max);

View File

@@ -14,7 +14,7 @@ The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10551 bytes, from 2015-05-20 20:03:14)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14968 bytes, from 2015-05-20 20:12:27)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 67120 bytes, from 2015-08-14 23:22:03)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 63785 bytes, from 2015-08-14 18:27:06)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 63915 bytes, from 2015-08-24 16:56:28)
Copyright (C) 2013-2015 by the following authors:
- Rob Clark <robdclark@gmail.com> (robclark)
@@ -162,10 +162,13 @@ enum a4xx_tex_fmt {
TFMT4_8_UNORM = 4,
TFMT4_8_8_UNORM = 14,
TFMT4_8_8_8_8_UNORM = 28,
TFMT4_8_SNORM = 5,
TFMT4_8_8_SNORM = 15,
TFMT4_8_8_8_8_SNORM = 29,
TFMT4_8_UINT = 6,
TFMT4_8_8_UINT = 16,
TFMT4_8_8_8_8_UINT = 30,
TFMT4_8_SINT = 7,
TFMT4_8_8_SINT = 17,
TFMT4_8_8_8_8_SINT = 31,
TFMT4_16_UINT = 21,

View File

@@ -96,7 +96,7 @@ static const uint8_t primtypes[PIPE_PRIM_MAX] = {
};
struct pipe_context *
fd4_context_create(struct pipe_screen *pscreen, void *priv)
fd4_context_create(struct pipe_screen *pscreen, void *priv, unsigned flags)
{
struct fd_screen *screen = fd_screen(pscreen);
struct fd4_context *fd4_ctx = CALLOC_STRUCT(fd4_context);

View File

@@ -97,6 +97,6 @@ fd4_context(struct fd_context *ctx)
}
struct pipe_context *
fd4_context_create(struct pipe_screen *pscreen, void *priv);
fd4_context_create(struct pipe_screen *pscreen, void *priv, unsigned flags);
#endif /* FD4_CONTEXT_H_ */

View File

@@ -79,9 +79,9 @@ struct fd4_format {
static struct fd4_format formats[PIPE_FORMAT_COUNT] = {
/* 8-bit */
VT(R8_UNORM, 8_UNORM, R8_UNORM, WZYX),
V_(R8_SNORM, 8_SNORM, NONE, WZYX),
V_(R8_UINT, 8_UINT, NONE, WZYX),
V_(R8_SINT, 8_SINT, NONE, WZYX),
VT(R8_SNORM, 8_SNORM, NONE, WZYX),
VT(R8_UINT, 8_UINT, NONE, WZYX),
VT(R8_SINT, 8_SINT, NONE, WZYX),
V_(R8_USCALED, 8_UINT, NONE, WZYX),
V_(R8_SSCALED, 8_UINT, NONE, WZYX),
@@ -115,8 +115,8 @@ static struct fd4_format formats[PIPE_FORMAT_COUNT] = {
VT(R8G8_UNORM, 8_8_UNORM, R8G8_UNORM, WZYX),
VT(R8G8_SNORM, 8_8_SNORM, R8G8_SNORM, WZYX),
VT(R8G8_UINT, 8_8_UINT, NONE, WZYX),
VT(R8G8_SINT, 8_8_SINT, NONE, WZYX),
VT(R8G8_UINT, 8_8_UINT, R8G8_UINT, WZYX),
VT(R8G8_SINT, 8_8_SINT, R8G8_SINT, WZYX),
V_(R8G8_USCALED, 8_8_UINT, NONE, WZYX),
V_(R8G8_SSCALED, 8_8_SINT, NONE, WZYX),

View File

@@ -14,7 +14,7 @@ The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10551 bytes, from 2015-05-20 20:03:14)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14968 bytes, from 2015-05-20 20:12:27)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 67120 bytes, from 2015-08-14 23:22:03)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 63785 bytes, from 2015-08-14 18:27:06)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 63915 bytes, from 2015-08-24 16:56:28)
Copyright (C) 2013-2015 by the following authors:
- Rob Clark <robdclark@gmail.com> (robclark)

View File

@@ -14,7 +14,7 @@ The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10551 bytes, from 2015-05-20 20:03:14)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14968 bytes, from 2015-05-20 20:12:27)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 67120 bytes, from 2015-08-14 23:22:03)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 63785 bytes, from 2015-08-14 18:27:06)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 63915 bytes, from 2015-08-24 16:56:28)
Copyright (C) 2013-2015 by the following authors:
- Rob Clark <robdclark@gmail.com> (robclark)

View File

@@ -334,6 +334,7 @@ struct fd_context {
FD_DIRTY_INDEXBUF = (1 << 16),
FD_DIRTY_SCISSOR = (1 << 17),
FD_DIRTY_STREAMOUT = (1 << 18),
FD_DIRTY_UCP = (1 << 19),
} dirty;
struct pipe_blend_state *blend;
@@ -355,6 +356,7 @@ struct fd_context {
struct fd_constbuf_stateobj constbuf[PIPE_SHADER_TYPES];
struct pipe_index_buffer indexbuf;
struct fd_streamout_stateobj streamout;
struct pipe_clip_state ucp;
/* GMEM/tile handling fxns: */
void (*emit_tile_init)(struct fd_context *ctx);

View File

@@ -191,6 +191,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
return 16383;
case PIPE_CAP_DEPTH_CLIP_DISABLE:
case PIPE_CAP_CLIP_HALFZ:
case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
return is_a3xx(screen);
@@ -228,7 +229,6 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_TGSI_FS_FINE_DERIVATIVE:
case PIPE_CAP_CONDITIONAL_RENDER_INVERTED:
case PIPE_CAP_SAMPLER_VIEW_TARGET:
case PIPE_CAP_CLIP_HALFZ:
case PIPE_CAP_POLYGON_OFFSET_CLAMP:
case PIPE_CAP_MULTISAMPLE_Z_RESOLVE:
case PIPE_CAP_RESOURCE_FROM_USER_MEMORY:

View File

@@ -65,7 +65,9 @@ static void
fd_set_clip_state(struct pipe_context *pctx,
const struct pipe_clip_state *clip)
{
DBG("TODO: ");
struct fd_context *ctx = fd_context(pctx);
ctx->ucp = *clip;
ctx->dirty |= FD_DIRTY_UCP;
}
static void

View File

@@ -2312,7 +2312,7 @@ emit_instructions(struct ir3_compile *ctx)
ctx->ir = ir3_create(ctx->compiler, ninputs, noutputs);
/* Create inputs in first block: */
ctx->block = get_block(ctx, fxn->start_block);
ctx->block = get_block(ctx, nir_start_block(fxn));
ctx->in_block = ctx->block;
list_addtail(&ctx->block->node, &ctx->ir->block_list);

View File

@@ -29,6 +29,7 @@
#include "ir3_nir.h"
#include "glsl/nir/nir_builder.h"
#include "glsl/nir/nir_control_flow.h"
/* Based on nir_opt_peephole_select, and hacked up to more aggressively
* flatten anything that can be flattened
@@ -171,7 +172,7 @@ flatten_block(nir_builder *bld, nir_block *if_block, nir_block *prev_block,
(intr->intrinsic == nir_intrinsic_discard_if)) {
nir_ssa_def *discard_cond;
nir_builder_insert_after_instr(bld,
bld->cursor = nir_after_instr(
nir_block_last_instr(prev_block));
if (invert) {

View File

@@ -155,7 +155,7 @@ static void i915_destroy(struct pipe_context *pipe)
}
struct pipe_context *
i915_create_context(struct pipe_screen *screen, void *priv)
i915_create_context(struct pipe_screen *screen, void *priv, unsigned flags)
{
struct i915_context *i915;

View File

@@ -401,7 +401,7 @@ void i915_init_string_functions( struct i915_context *i915 );
* i915_context.c
*/
struct pipe_context *i915_create_context(struct pipe_screen *screen,
void *priv);
void *priv, unsigned flags);
/***********************************************************************

View File

@@ -135,7 +135,7 @@ ilo_context_destroy(struct pipe_context *pipe)
}
static struct pipe_context *
ilo_context_create(struct pipe_screen *screen, void *priv)
ilo_context_create(struct pipe_screen *screen, void *priv, unsigned flags)
{
struct ilo_screen *is = ilo_screen(screen);
struct ilo_context *ilo;

View File

@@ -128,7 +128,8 @@ llvmpipe_render_condition ( struct pipe_context *pipe,
}
struct pipe_context *
llvmpipe_create_context( struct pipe_screen *screen, void *priv )
llvmpipe_create_context(struct pipe_screen *screen, void *priv,
unsigned flags)
{
struct llvmpipe_context *llvmpipe;

View File

@@ -160,7 +160,8 @@ struct llvmpipe_context {
struct pipe_context *
llvmpipe_create_context( struct pipe_screen *screen, void *priv );
llvmpipe_create_context(struct pipe_screen *screen, void *priv,
unsigned flags);
struct pipe_resource *
llvmpipe_user_buffer_create(struct pipe_screen *screen,

View File

@@ -260,7 +260,8 @@ static void noop_destroy_context(struct pipe_context *ctx)
FREE(ctx);
}
static struct pipe_context *noop_create_context(struct pipe_screen *screen, void *priv)
static struct pipe_context *noop_create_context(struct pipe_screen *screen,
void *priv, unsigned flags)
{
struct pipe_context *ctx = CALLOC_STRUCT(pipe_context);

View File

@@ -190,7 +190,7 @@ nv30_context_destroy(struct pipe_context *pipe)
} while(0)
struct pipe_context *
nv30_context_create(struct pipe_screen *pscreen, void *priv)
nv30_context_create(struct pipe_screen *pscreen, void *priv, unsigned ctxflags)
{
struct nv30_screen *screen = nv30_screen(pscreen);
struct nv30_context *nv30 = CALLOC_STRUCT(nv30_context);

View File

@@ -132,7 +132,7 @@ nv30_context(struct pipe_context *pipe)
}
struct pipe_context *
nv30_context_create(struct pipe_screen *pscreen, void *priv);
nv30_context_create(struct pipe_screen *pscreen, void *priv, unsigned flags);
void
nv30_vbo_init(struct pipe_context *pipe);

View File

@@ -240,7 +240,7 @@ nv50_context_get_sample_position(struct pipe_context *, unsigned, unsigned,
float *);
struct pipe_context *
nv50_create(struct pipe_screen *pscreen, void *priv)
nv50_create(struct pipe_screen *pscreen, void *priv, unsigned ctxflags)
{
struct nv50_screen *screen = nv50_screen(pscreen);
struct nv50_context *nv50;

View File

@@ -186,7 +186,7 @@ nv50_context_shader_stage(unsigned pipe)
}
/* nv50_context.c */
struct pipe_context *nv50_create(struct pipe_screen *, void *);
struct pipe_context *nv50_create(struct pipe_screen *, void *, unsigned flags);
void nv50_bufctx_fence(struct nouveau_bufctx *, bool on_flush);

View File

@@ -117,7 +117,6 @@ nv50_blend_state_create(struct pipe_context *pipe,
struct nv50_blend_stateobj *so = CALLOC_STRUCT(nv50_blend_stateobj);
int i;
bool emit_common_func = cso->rt[0].blend_enable;
uint32_t ms;
if (nv50_context(pipe)->screen->tesla->oclass >= NVA3_3D_CLASS) {
SB_BEGIN_3D(so, BLEND_INDEPENDENT, 1);
@@ -189,15 +188,6 @@ nv50_blend_state_create(struct pipe_context *pipe,
SB_DATA (so, nv50_colormask(cso->rt[0].colormask));
}
ms = 0;
if (cso->alpha_to_coverage)
ms |= NV50_3D_MULTISAMPLE_CTRL_ALPHA_TO_COVERAGE;
if (cso->alpha_to_one)
ms |= NV50_3D_MULTISAMPLE_CTRL_ALPHA_TO_ONE;
SB_BEGIN_3D(so, MULTISAMPLE_CTRL, 1);
SB_DATA (so, ms);
assert(so->size <= (sizeof(so->state) / sizeof(so->state[0])));
return so;
}

View File

@@ -1,4 +1,6 @@
#include "util/u_format.h"
#include "nv50/nv50_context.h"
#include "nv50/nv50_defs.xml.h"
@@ -313,6 +315,25 @@ nv50_validate_derived_2(struct nv50_context *nv50)
}
}
static void
nv50_validate_derived_3(struct nv50_context *nv50)
{
struct nouveau_pushbuf *push = nv50->base.pushbuf;
struct pipe_framebuffer_state *fb = &nv50->framebuffer;
uint32_t ms = 0;
if ((!fb->nr_cbufs || !fb->cbufs[0] ||
!util_format_is_pure_integer(fb->cbufs[0]->format)) && nv50->blend) {
if (nv50->blend->pipe.alpha_to_coverage)
ms |= NV50_3D_MULTISAMPLE_CTRL_ALPHA_TO_COVERAGE;
if (nv50->blend->pipe.alpha_to_one)
ms |= NV50_3D_MULTISAMPLE_CTRL_ALPHA_TO_ONE;
}
BEGIN_NV04(push, NV50_3D(MULTISAMPLE_CTRL), 1);
PUSH_DATA (push, ms);
}
static void
nv50_validate_clip(struct nv50_context *nv50)
{
@@ -474,6 +495,7 @@ static struct state_validate {
{ nv50_validate_derived_rs, NV50_NEW_FRAGPROG | NV50_NEW_RASTERIZER |
NV50_NEW_VERTPROG | NV50_NEW_GMTYPROG },
{ nv50_validate_derived_2, NV50_NEW_ZSA | NV50_NEW_FRAMEBUFFER },
{ nv50_validate_derived_3, NV50_NEW_BLEND | NV50_NEW_FRAMEBUFFER },
{ nv50_validate_clip, NV50_NEW_CLIP | NV50_NEW_RASTERIZER |
NV50_NEW_VERTPROG | NV50_NEW_GMTYPROG },
{ nv50_constbufs_validate, NV50_NEW_CONSTBUF },
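The two nv50 hunks above move the MULTISAMPLE_CTRL setup out of the blend state object and into a new derived-state validation step, so alpha-to-coverage and alpha-to-one can be suppressed when color buffer 0 has a pure-integer format. A small standalone model of that rule; the bit values are placeholders, not the real NV50_3D encoding:

#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MS_ALPHA_TO_COVERAGE (1u << 0)   /* placeholder bit positions */
#define MS_ALPHA_TO_ONE      (1u << 4)

static uint32_t
derive_multisample_ctrl(bool rt0_pure_integer, bool alpha_to_coverage,
                        bool alpha_to_one)
{
   uint32_t ms = 0;
   if (!rt0_pure_integer) {          /* integer RT0 forces both bits off */
      if (alpha_to_coverage)
         ms |= MS_ALPHA_TO_COVERAGE;
      if (alpha_to_one)
         ms |= MS_ALPHA_TO_ONE;
   }
   return ms;
}

int main(void)
{
   printf("0x%"PRIx32"\n", derive_multisample_ctrl(false, true, true)); /* 0x11 */
   printf("0x%"PRIx32"\n", derive_multisample_ctrl(true, true, true));  /* 0x0  */
   return 0;
}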

View File

@@ -19,7 +19,7 @@
struct nv50_blend_stateobj {
struct pipe_blend_state pipe;
int size;
uint32_t state[84]; // TODO: allocate less if !independent_blend_enable
uint32_t state[82]; // TODO: allocate less if !independent_blend_enable
};
struct nv50_rasterizer_stateobj {

View File

@@ -68,6 +68,10 @@ nv50_2d_format(enum pipe_format format, bool dst, bool dst_src_equal)
return NV50_SURFACE_FORMAT_R16_UNORM;
case 4:
return NV50_SURFACE_FORMAT_BGRA8_UNORM;
case 8:
return NV50_SURFACE_FORMAT_RGBA16_FLOAT;
case 16:
return NV50_SURFACE_FORMAT_RGBA32_FLOAT;
default:
return 0;
}
@@ -1003,6 +1007,8 @@ nv50_blitctx_prepare_state(struct nv50_blitctx *blit)
/* zsa state */
BEGIN_NV04(push, NV50_3D(DEPTH_TEST_ENABLE), 1);
PUSH_DATA (push, 0);
BEGIN_NV04(push, NV50_3D(DEPTH_BOUNDS_EN), 1);
PUSH_DATA (push, 0);
BEGIN_NV04(push, NV50_3D(STENCIL_ENABLE), 1);
PUSH_DATA (push, 0);
BEGIN_NV04(push, NV50_3D(ALPHA_TEST_ENABLE), 1);

View File

@@ -262,7 +262,7 @@ nvc0_context_get_sample_position(struct pipe_context *, unsigned, unsigned,
float *);
struct pipe_context *
nvc0_create(struct pipe_screen *pscreen, void *priv)
nvc0_create(struct pipe_screen *pscreen, void *priv, unsigned ctxflags)
{
struct nvc0_screen *screen = nvc0_screen(pscreen);
struct nvc0_context *nvc0;

View File

@@ -214,7 +214,7 @@ nvc0_shader_stage(unsigned pipe)
/* nvc0_context.c */
struct pipe_context *nvc0_create(struct pipe_screen *, void *);
struct pipe_context *nvc0_create(struct pipe_screen *, void *, unsigned flags);
void nvc0_bufctx_fence(struct nvc0_context *, struct nouveau_bufctx *,
bool on_flush);
void nvc0_default_kick_notify(struct nouveau_pushbuf *);

View File

@@ -56,10 +56,10 @@ struct nvc0_query {
#define NVC0_QUERY_ALLOC_SPACE 256
static boolean nvc0_mp_pm_query_begin(struct nvc0_context *,
static boolean nvc0_hw_sm_query_begin(struct nvc0_context *,
struct nvc0_query *);
static void nvc0_mp_pm_query_end(struct nvc0_context *, struct nvc0_query *);
static boolean nvc0_mp_pm_query_result(struct nvc0_context *,
static void nvc0_hw_sm_query_end(struct nvc0_context *, struct nvc0_query *);
static boolean nvc0_hw_sm_query_result(struct nvc0_context *,
struct nvc0_query *, void *, boolean);
static inline struct nvc0_query *
@@ -159,7 +159,7 @@ nvc0_query_create(struct pipe_context *pipe, unsigned type, unsigned index)
} else
#endif
if (nvc0->screen->base.device->drm_version >= 0x01000101) {
if (type >= NVE4_PM_QUERY(0) && type <= NVE4_PM_QUERY_LAST) {
if (type >= NVE4_HW_SM_QUERY(0) && type <= NVE4_HW_SM_QUERY_LAST) {
/* for each MP:
* [00] = WS0.C0
* [04] = WS0.C1
@@ -189,7 +189,7 @@ nvc0_query_create(struct pipe_context *pipe, unsigned type, unsigned index)
space = (4 * 4 + 4 + 4) * nvc0->screen->mp_count * sizeof(uint32_t);
break;
} else
if (type >= NVC0_PM_QUERY(0) && type <= NVC0_PM_QUERY_LAST) {
if (type >= NVC0_HW_SM_QUERY(0) && type <= NVC0_HW_SM_QUERY_LAST) {
/* for each MP:
* [00] = MP.C0
* [04] = MP.C1
@@ -327,9 +327,9 @@ nvc0_query_begin(struct pipe_context *pipe, struct pipe_query *pq)
q->u.value = 0;
} else
#endif
if ((q->type >= NVE4_PM_QUERY(0) && q->type <= NVE4_PM_QUERY_LAST) ||
(q->type >= NVC0_PM_QUERY(0) && q->type <= NVC0_PM_QUERY_LAST)) {
ret = nvc0_mp_pm_query_begin(nvc0, q);
if ((q->type >= NVE4_HW_SM_QUERY(0) && q->type <= NVE4_HW_SM_QUERY_LAST) ||
(q->type >= NVC0_HW_SM_QUERY(0) && q->type <= NVC0_HW_SM_QUERY_LAST)) {
ret = nvc0_hw_sm_query_begin(nvc0, q);
}
break;
}
@@ -412,9 +412,9 @@ nvc0_query_end(struct pipe_context *pipe, struct pipe_query *pq)
return;
} else
#endif
if ((q->type >= NVE4_PM_QUERY(0) && q->type <= NVE4_PM_QUERY_LAST) ||
(q->type >= NVC0_PM_QUERY(0) && q->type <= NVC0_PM_QUERY_LAST)) {
nvc0_mp_pm_query_end(nvc0, q);
if ((q->type >= NVE4_HW_SM_QUERY(0) && q->type <= NVE4_HW_SM_QUERY_LAST) ||
(q->type >= NVC0_HW_SM_QUERY(0) && q->type <= NVC0_HW_SM_QUERY_LAST)) {
nvc0_hw_sm_query_end(nvc0, q);
}
break;
}
@@ -453,9 +453,9 @@ nvc0_query_result(struct pipe_context *pipe, struct pipe_query *pq,
return true;
} else
#endif
if ((q->type >= NVE4_PM_QUERY(0) && q->type <= NVE4_PM_QUERY_LAST) ||
(q->type >= NVC0_PM_QUERY(0) && q->type <= NVC0_PM_QUERY_LAST)) {
return nvc0_mp_pm_query_result(nvc0, q, result, wait);
if ((q->type >= NVE4_HW_SM_QUERY(0) && q->type <= NVE4_HW_SM_QUERY_LAST) ||
(q->type >= NVC0_HW_SM_QUERY(0) && q->type <= NVC0_HW_SM_QUERY_LAST)) {
return nvc0_hw_sm_query_result(nvc0, q, result, wait);
}
if (q->state != NVC0_QUERY_STATE_READY)
@@ -692,7 +692,7 @@ static const char *nvc0_drv_stat_names[] =
* We could add a kernel interface for it, but reading the counters like this
* has the advantage of being async (if get_result isn't called immediately).
*/
static const uint64_t nve4_read_mp_pm_counters_code[] =
static const uint64_t nve4_read_hw_sm_counters_code[] =
{
/* sched 0x20 0x20 0x20 0x20 0x20 0x20 0x20
* mov b32 $r8 $tidx
@@ -776,6 +776,33 @@ static const uint64_t nve4_read_mp_pm_counters_code[] =
static const char *nve4_pm_query_names[] =
{
/* MP counters */
"active_cycles",
"active_warps",
"atom_count",
"branch",
"divergent_branch",
"gld_request",
"global_ld_mem_divergence_replays",
"global_store_transaction",
"global_st_mem_divergence_replays",
"gred_count",
"gst_request",
"inst_executed",
"inst_issued",
"inst_issued1",
"inst_issued2",
"l1_global_load_hit",
"l1_global_load_miss",
"l1_local_load_hit",
"l1_local_load_miss",
"l1_local_store_hit",
"l1_local_store_miss",
"l1_shared_load_transactions",
"l1_shared_store_transactions",
"local_load",
"local_load_transactions",
"local_store",
"local_store_transactions",
"prof_trigger_00",
"prof_trigger_01",
"prof_trigger_02",
@@ -784,41 +811,14 @@ static const char *nve4_pm_query_names[] =
"prof_trigger_05",
"prof_trigger_06",
"prof_trigger_07",
"warps_launched",
"threads_launched",
"sm_cta_launched",
"inst_issued1",
"inst_issued2",
"inst_executed",
"local_load",
"local_store",
"shared_load",
"shared_store",
"l1_local_load_hit",
"l1_local_load_miss",
"l1_local_store_hit",
"l1_local_store_miss",
"gld_request",
"gst_request",
"l1_global_load_hit",
"l1_global_load_miss",
"uncached_global_load_transaction",
"global_store_transaction",
"branch",
"divergent_branch",
"active_warps",
"active_cycles",
"inst_issued",
"atom_count",
"gred_count",
"shared_load_replay",
"shared_store",
"shared_store_replay",
"local_load_transactions",
"local_store_transactions",
"l1_shared_load_transactions",
"l1_shared_store_transactions",
"global_ld_mem_divergence_replays",
"global_st_mem_divergence_replays",
"sm_cta_launched",
"threads_launched",
"uncached_global_load_transaction",
"warps_launched",
/* metrics, i.e. functions of the MP counters */
"metric-ipc", /* inst_executed, clock */
"metric-ipac", /* inst_executed, active_cycles */
@@ -852,7 +852,7 @@ struct nvc0_mp_counter_cfg
#define NVC0_COUNTER_OP2_AVG_DIV_MM 5 /* avg(ctr0 / ctr1) */
#define NVC0_COUNTER_OP2_AVG_DIV_M0 6 /* avg(ctr0) / ctr1 of MP[0]) */
struct nvc0_mp_pm_query_cfg
struct nvc0_hw_sm_query_cfg
{
struct nvc0_mp_counter_cfg ctr[4];
uint8_t num_counters;
@@ -860,17 +860,17 @@ struct nvc0_mp_pm_query_cfg
uint8_t norm[2]; /* normalization num,denom */
};
#define _Q1A(n, f, m, g, s, nu, dn) [NVE4_PM_QUERY_##n] = { { { f, NVE4_COMPUTE_MP_PM_FUNC_MODE_##m, 0, 0, NVE4_COMPUTE_MP_PM_A_SIGSEL_##g, s }, {}, {}, {} }, 1, NVC0_COUNTER_OPn_SUM, { nu, dn } }
#define _Q1B(n, f, m, g, s, nu, dn) [NVE4_PM_QUERY_##n] = { { { f, NVE4_COMPUTE_MP_PM_FUNC_MODE_##m, 0, 1, NVE4_COMPUTE_MP_PM_B_SIGSEL_##g, s }, {}, {}, {} }, 1, NVC0_COUNTER_OPn_SUM, { nu, dn } }
#define _M2A(n, f0, m0, g0, s0, f1, m1, g1, s1, o, nu, dn) [NVE4_PM_QUERY_METRIC_##n] = { { \
#define _Q1A(n, f, m, g, s, nu, dn) [NVE4_HW_SM_QUERY_##n] = { { { f, NVE4_COMPUTE_MP_PM_FUNC_MODE_##m, 0, 0, NVE4_COMPUTE_MP_PM_A_SIGSEL_##g, s }, {}, {}, {} }, 1, NVC0_COUNTER_OPn_SUM, { nu, dn } }
#define _Q1B(n, f, m, g, s, nu, dn) [NVE4_HW_SM_QUERY_##n] = { { { f, NVE4_COMPUTE_MP_PM_FUNC_MODE_##m, 0, 1, NVE4_COMPUTE_MP_PM_B_SIGSEL_##g, s }, {}, {}, {} }, 1, NVC0_COUNTER_OPn_SUM, { nu, dn } }
#define _M2A(n, f0, m0, g0, s0, f1, m1, g1, s1, o, nu, dn) [NVE4_HW_SM_QUERY_METRIC_##n] = { { \
{ f0, NVE4_COMPUTE_MP_PM_FUNC_MODE_##m0, 0, 0, NVE4_COMPUTE_MP_PM_A_SIGSEL_##g0, s0 }, \
{ f1, NVE4_COMPUTE_MP_PM_FUNC_MODE_##m1, 0, 0, NVE4_COMPUTE_MP_PM_A_SIGSEL_##g1, s1 }, \
{}, {}, }, 2, NVC0_COUNTER_OP2_##o, { nu, dn } }
#define _M2B(n, f0, m0, g0, s0, f1, m1, g1, s1, o, nu, dn) [NVE4_PM_QUERY_METRIC_##n] = { { \
#define _M2B(n, f0, m0, g0, s0, f1, m1, g1, s1, o, nu, dn) [NVE4_HW_SM_QUERY_METRIC_##n] = { { \
{ f0, NVE4_COMPUTE_MP_PM_FUNC_MODE_##m0, 0, 1, NVE4_COMPUTE_MP_PM_B_SIGSEL_##g0, s0 }, \
{ f1, NVE4_COMPUTE_MP_PM_FUNC_MODE_##m1, 0, 1, NVE4_COMPUTE_MP_PM_B_SIGSEL_##g1, s1 }, \
{}, {}, }, 2, NVC0_COUNTER_OP2_##o, { nu, dn } }
#define _M2AB(n, f0, m0, g0, s0, f1, m1, g1, s1, o, nu, dn) [NVE4_PM_QUERY_METRIC_##n] = { { \
#define _M2AB(n, f0, m0, g0, s0, f1, m1, g1, s1, o, nu, dn) [NVE4_HW_SM_QUERY_METRIC_##n] = { { \
{ f0, NVE4_COMPUTE_MP_PM_FUNC_MODE_##m0, 0, 0, NVE4_COMPUTE_MP_PM_A_SIGSEL_##g0, s0 }, \
{ f1, NVE4_COMPUTE_MP_PM_FUNC_MODE_##m1, 0, 1, NVE4_COMPUTE_MP_PM_B_SIGSEL_##g1, s1 }, \
{}, {}, }, 2, NVC0_COUNTER_OP2_##o, { nu, dn } }
@@ -881,8 +881,35 @@ struct nvc0_mp_pm_query_cfg
* metric-ipXc: we simply multiply by 4 to account for the 4 warp schedulers;
* this is inaccurate !
*/
static const struct nvc0_mp_pm_query_cfg nve4_mp_pm_queries[] =
static const struct nvc0_hw_sm_query_cfg nve4_hw_sm_queries[] =
{
_Q1B(ACTIVE_CYCLES, 0x0001, B6, WARP, 0x00000000, 1, 1),
_Q1B(ACTIVE_WARPS, 0x003f, B6, WARP, 0x31483104, 2, 1),
_Q1A(ATOM_COUNT, 0x0001, B6, BRANCH, 0x00000000, 1, 1),
_Q1A(BRANCH, 0x0001, B6, BRANCH, 0x0000000c, 1, 1),
_Q1A(DIVERGENT_BRANCH, 0x0001, B6, BRANCH, 0x00000010, 1, 1),
_Q1A(GLD_REQUEST, 0x0001, B6, LDST, 0x00000010, 1, 1),
_Q1B(GLD_MEM_DIV_REPLAY, 0x0001, B6, REPLAY, 0x00000010, 1, 1),
_Q1B(GST_TRANSACTIONS, 0x0001, B6, MEM, 0x00000004, 1, 1),
_Q1B(GST_MEM_DIV_REPLAY, 0x0001, B6, REPLAY, 0x00000014, 1, 1),
_Q1A(GRED_COUNT, 0x0001, B6, BRANCH, 0x00000008, 1, 1),
_Q1A(GST_REQUEST, 0x0001, B6, LDST, 0x00000014, 1, 1),
_Q1A(INST_EXECUTED, 0x0003, B6, EXEC, 0x00000398, 1, 1),
_Q1A(INST_ISSUED, 0x0003, B6, ISSUE, 0x00000104, 1, 1),
_Q1A(INST_ISSUED1, 0x0001, B6, ISSUE, 0x00000004, 1, 1),
_Q1A(INST_ISSUED2, 0x0001, B6, ISSUE, 0x00000008, 1, 1),
_Q1B(L1_GLD_HIT, 0x0001, B6, L1, 0x00000010, 1, 1),
_Q1B(L1_GLD_MISS, 0x0001, B6, L1, 0x00000014, 1, 1),
_Q1B(L1_LOCAL_LD_HIT, 0x0001, B6, L1, 0x00000000, 1, 1),
_Q1B(L1_LOCAL_LD_MISS, 0x0001, B6, L1, 0x00000004, 1, 1),
_Q1B(L1_LOCAL_ST_HIT, 0x0001, B6, L1, 0x00000008, 1, 1),
_Q1B(L1_LOCAL_ST_MISS, 0x0001, B6, L1, 0x0000000c, 1, 1),
_Q1B(L1_SHARED_LD_TRANSACTIONS, 0x0001, B6, TRANSACTION, 0x00000008, 1, 1),
_Q1B(L1_SHARED_ST_TRANSACTIONS, 0x0001, B6, TRANSACTION, 0x0000000c, 1, 1),
_Q1A(LOCAL_LD, 0x0001, B6, LDST, 0x00000008, 1, 1),
_Q1B(LOCAL_LD_TRANSACTIONS, 0x0001, B6, TRANSACTION, 0x00000000, 1, 1),
_Q1A(LOCAL_ST, 0x0001, B6, LDST, 0x0000000c, 1, 1),
_Q1B(LOCAL_ST_TRANSACTIONS, 0x0001, B6, TRANSACTION, 0x00000004, 1, 1),
_Q1A(PROF_TRIGGER_0, 0x0001, B6, USER, 0x00000000, 1, 1),
_Q1A(PROF_TRIGGER_1, 0x0001, B6, USER, 0x00000004, 1, 1),
_Q1A(PROF_TRIGGER_2, 0x0001, B6, USER, 0x00000008, 1, 1),
@@ -891,41 +918,14 @@ static const struct nvc0_mp_pm_query_cfg nve4_mp_pm_queries[] =
_Q1A(PROF_TRIGGER_5, 0x0001, B6, USER, 0x00000014, 1, 1),
_Q1A(PROF_TRIGGER_6, 0x0001, B6, USER, 0x00000018, 1, 1),
_Q1A(PROF_TRIGGER_7, 0x0001, B6, USER, 0x0000001c, 1, 1),
_Q1A(LAUNCHED_WARPS, 0x0001, B6, LAUNCH, 0x00000004, 1, 1),
_Q1A(LAUNCHED_THREADS, 0x003f, B6, LAUNCH, 0x398a4188, 1, 1),
_Q1B(LAUNCHED_CTA, 0x0001, B6, WARP, 0x0000001c, 1, 1),
_Q1A(INST_ISSUED1, 0x0001, B6, ISSUE, 0x00000004, 1, 1),
_Q1A(INST_ISSUED2, 0x0001, B6, ISSUE, 0x00000008, 1, 1),
_Q1A(INST_ISSUED, 0x0003, B6, ISSUE, 0x00000104, 1, 1),
_Q1A(INST_EXECUTED, 0x0003, B6, EXEC, 0x00000398, 1, 1),
_Q1A(LD_SHARED, 0x0001, B6, LDST, 0x00000000, 1, 1),
_Q1A(ST_SHARED, 0x0001, B6, LDST, 0x00000004, 1, 1),
_Q1A(LD_LOCAL, 0x0001, B6, LDST, 0x00000008, 1, 1),
_Q1A(ST_LOCAL, 0x0001, B6, LDST, 0x0000000c, 1, 1),
_Q1A(GLD_REQUEST, 0x0001, B6, LDST, 0x00000010, 1, 1),
_Q1A(GST_REQUEST, 0x0001, B6, LDST, 0x00000014, 1, 1),
_Q1B(L1_LOCAL_LOAD_HIT, 0x0001, B6, L1, 0x00000000, 1, 1),
_Q1B(L1_LOCAL_LOAD_MISS, 0x0001, B6, L1, 0x00000004, 1, 1),
_Q1B(L1_LOCAL_STORE_HIT, 0x0001, B6, L1, 0x00000008, 1, 1),
_Q1B(L1_LOCAL_STORE_MISS, 0x0001, B6, L1, 0x0000000c, 1, 1),
_Q1B(L1_GLOBAL_LOAD_HIT, 0x0001, B6, L1, 0x00000010, 1, 1),
_Q1B(L1_GLOBAL_LOAD_MISS, 0x0001, B6, L1, 0x00000014, 1, 1),
_Q1B(GLD_TRANSACTIONS_UNCACHED, 0x0001, B6, MEM, 0x00000000, 1, 1),
_Q1B(GST_TRANSACTIONS, 0x0001, B6, MEM, 0x00000004, 1, 1),
_Q1A(BRANCH, 0x0001, B6, BRANCH, 0x0000000c, 1, 1),
_Q1A(BRANCH_DIVERGENT, 0x0001, B6, BRANCH, 0x00000010, 1, 1),
_Q1B(ACTIVE_WARPS, 0x003f, B6, WARP, 0x31483104, 2, 1),
_Q1B(ACTIVE_CYCLES, 0x0001, B6, WARP, 0x00000000, 1, 1),
_Q1A(ATOM_COUNT, 0x0001, B6, BRANCH, 0x00000000, 1, 1),
_Q1A(GRED_COUNT, 0x0001, B6, BRANCH, 0x00000008, 1, 1),
_Q1B(LD_SHARED_REPLAY, 0x0001, B6, REPLAY, 0x00000008, 1, 1),
_Q1B(ST_SHARED_REPLAY, 0x0001, B6, REPLAY, 0x0000000c, 1, 1),
_Q1B(LD_LOCAL_TRANSACTIONS, 0x0001, B6, TRANSACTION, 0x00000000, 1, 1),
_Q1B(ST_LOCAL_TRANSACTIONS, 0x0001, B6, TRANSACTION, 0x00000004, 1, 1),
_Q1B(L1_LD_SHARED_TRANSACTIONS, 0x0001, B6, TRANSACTION, 0x00000008, 1, 1),
_Q1B(L1_ST_SHARED_TRANSACTIONS, 0x0001, B6, TRANSACTION, 0x0000000c, 1, 1),
_Q1B(GLD_MEM_DIV_REPLAY, 0x0001, B6, REPLAY, 0x00000010, 1, 1),
_Q1B(GST_MEM_DIV_REPLAY, 0x0001, B6, REPLAY, 0x00000014, 1, 1),
_Q1A(SHARED_LD, 0x0001, B6, LDST, 0x00000000, 1, 1),
_Q1B(SHARED_LD_REPLAY, 0x0001, B6, REPLAY, 0x00000008, 1, 1),
_Q1A(SHARED_ST, 0x0001, B6, LDST, 0x00000004, 1, 1),
_Q1B(SHARED_ST_REPLAY, 0x0001, B6, REPLAY, 0x0000000c, 1, 1),
_Q1B(SM_CTA_LAUNCHED, 0x0001, B6, WARP, 0x0000001c, 1, 1),
_Q1A(THREADS_LAUNCHED, 0x003f, B6, LAUNCH, 0x398a4188, 1, 1),
_Q1B(UNCACHED_GLD_TRANSACTIONS, 0x0001, B6, MEM, 0x00000000, 1, 1),
_Q1A(WARPS_LAUNCHED, 0x0001, B6, LAUNCH, 0x00000004, 1, 1),
_M2AB(IPC, 0x3, B6, EXEC, 0x398, 0xffff, LOGOP, WARP, 0x0, DIV_SUM_M0, 10, 1),
_M2AB(IPAC, 0x3, B6, EXEC, 0x398, 0x1, B6, WARP, 0x0, AVG_DIV_MM, 10, 1),
_M2A(IPEC, 0x3, B6, EXEC, 0x398, 0xe, LOGOP, EXEC, 0x398, AVG_DIV_MM, 10, 1),
@@ -940,7 +940,7 @@ static const struct nvc0_mp_pm_query_cfg nve4_mp_pm_queries[] =
#undef _M2B
/* === PERFORMANCE MONITORING COUNTERS for NVC0:NVE4 === */
static const uint64_t nvc0_read_mp_pm_counters_code[] =
static const uint64_t nvc0_read_hw_sm_counters_code[] =
{
/* mov b32 $r8 $tidx
* mov b32 $r9 $physid
@@ -993,29 +993,21 @@ static const uint64_t nvc0_read_mp_pm_counters_code[] =
static const char *nvc0_pm_query_names[] =
{
/* MP counters */
"inst_executed",
"active_cycles",
"active_warps",
"atom_count",
"branch",
"divergent_branch",
"active_warps",
"active_cycles",
"warps_launched",
"threads_launched",
"shared_load",
"shared_store",
"local_load",
"local_store",
"gred_count",
"atom_count",
"gld_request",
"gred_count",
"gst_request",
"inst_executed",
"inst_issued1_0",
"inst_issued1_1",
"inst_issued2_0",
"inst_issued2_1",
"thread_inst_executed_0",
"thread_inst_executed_1",
"thread_inst_executed_2",
"thread_inst_executed_3",
"local_load",
"local_store",
"prof_trigger_00",
"prof_trigger_01",
"prof_trigger_02",
@@ -1024,35 +1016,35 @@ static const char *nvc0_pm_query_names[] =
"prof_trigger_05",
"prof_trigger_06",
"prof_trigger_07",
"shared_load",
"shared_store",
"threads_launched",
"thread_inst_executed_0",
"thread_inst_executed_1",
"thread_inst_executed_2",
"thread_inst_executed_3",
"warps_launched",
};
#define _Q(n, f, m, g, c, s0, s1, s2, s3, s4, s5) [NVC0_PM_QUERY_##n] = { { { f, NVC0_COMPUTE_MP_PM_OP_MODE_##m, c, 0, g, s0|(s1 << 8)|(s2 << 16)|(s3 << 24)|(s4##ULL << 32)|(s5##ULL << 40) }, {}, {}, {} }, 1, NVC0_COUNTER_OPn_SUM, { 1, 1 } }
#define _Q(n, f, m, g, c, s0, s1, s2, s3, s4, s5) [NVC0_HW_SM_QUERY_##n] = { { { f, NVC0_COMPUTE_MP_PM_OP_MODE_##m, c, 0, g, s0|(s1 << 8)|(s2 << 16)|(s3 << 24)|(s4##ULL << 32)|(s5##ULL << 40) }, {}, {}, {} }, 1, NVC0_COUNTER_OPn_SUM, { 1, 1 } }
static const struct nvc0_mp_pm_query_cfg nvc0_mp_pm_queries[] =
static const struct nvc0_hw_sm_query_cfg nvc0_hw_sm_queries[] =
{
_Q(INST_EXECUTED, 0xaaaa, LOGOP, 0x2d, 3, 0x00, 0x11, 0x22, 0x00, 0x00, 0x00),
_Q(BRANCH, 0xaaaa, LOGOP, 0x1a, 2, 0x00, 0x11, 0x00, 0x00, 0x00, 0x00),
_Q(BRANCH_DIVERGENT, 0xaaaa, LOGOP, 0x19, 2, 0x20, 0x31, 0x00, 0x00, 0x00, 0x00),
_Q(ACTIVE_WARPS, 0xaaaa, LOGOP, 0x24, 6, 0x10, 0x21, 0x32, 0x43, 0x54, 0x65),
_Q(ACTIVE_CYCLES, 0xaaaa, LOGOP, 0x11, 1, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(LAUNCHED_WARPS, 0xaaaa, LOGOP, 0x26, 1, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(LAUNCHED_THREADS, 0xaaaa, LOGOP, 0x26, 6, 0x10, 0x21, 0x32, 0x43, 0x54, 0x65),
_Q(LD_SHARED, 0xaaaa, LOGOP, 0x64, 1, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(ST_SHARED, 0xaaaa, LOGOP, 0x64, 1, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(LD_LOCAL, 0xaaaa, LOGOP, 0x64, 1, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(ST_LOCAL, 0xaaaa, LOGOP, 0x64, 1, 0x50, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(GRED_COUNT, 0xaaaa, LOGOP, 0x63, 1, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(ACTIVE_WARPS, 0xaaaa, LOGOP, 0x24, 6, 0x10, 0x21, 0x32, 0x43, 0x54, 0x65),
_Q(ATOM_COUNT, 0xaaaa, LOGOP, 0x63, 1, 0x30, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(BRANCH, 0xaaaa, LOGOP, 0x1a, 2, 0x00, 0x11, 0x00, 0x00, 0x00, 0x00),
_Q(DIVERGENT_BRANCH, 0xaaaa, LOGOP, 0x19, 2, 0x20, 0x31, 0x00, 0x00, 0x00, 0x00),
_Q(GLD_REQUEST, 0xaaaa, LOGOP, 0x64, 1, 0x30, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(GRED_COUNT, 0xaaaa, LOGOP, 0x63, 1, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(GST_REQUEST, 0xaaaa, LOGOP, 0x64, 1, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(INST_EXECUTED, 0xaaaa, LOGOP, 0x2d, 3, 0x00, 0x11, 0x22, 0x00, 0x00, 0x00),
_Q(INST_ISSUED1_0, 0xaaaa, LOGOP, 0x7e, 1, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(INST_ISSUED1_1, 0xaaaa, LOGOP, 0x7e, 1, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(INST_ISSUED2_0, 0xaaaa, LOGOP, 0x7e, 1, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(INST_ISSUED2_1, 0xaaaa, LOGOP, 0x7e, 1, 0x50, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(TH_INST_EXECUTED_0, 0xaaaa, LOGOP, 0xa3, 6, 0x00, 0x11, 0x22, 0x33, 0x44, 0x55),
_Q(TH_INST_EXECUTED_1, 0xaaaa, LOGOP, 0xa5, 6, 0x00, 0x11, 0x22, 0x33, 0x44, 0x55),
_Q(TH_INST_EXECUTED_2, 0xaaaa, LOGOP, 0xa4, 6, 0x00, 0x11, 0x22, 0x33, 0x44, 0x55),
_Q(TH_INST_EXECUTED_3, 0xaaaa, LOGOP, 0xa6, 6, 0x00, 0x11, 0x22, 0x33, 0x44, 0x55),
_Q(LOCAL_LD, 0xaaaa, LOGOP, 0x64, 1, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(LOCAL_ST, 0xaaaa, LOGOP, 0x64, 1, 0x50, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(PROF_TRIGGER_0, 0xaaaa, LOGOP, 0x01, 1, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(PROF_TRIGGER_1, 0xaaaa, LOGOP, 0x01, 1, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(PROF_TRIGGER_2, 0xaaaa, LOGOP, 0x01, 1, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00),
@@ -1061,38 +1053,46 @@ static const struct nvc0_mp_pm_query_cfg nvc0_mp_pm_queries[] =
_Q(PROF_TRIGGER_5, 0xaaaa, LOGOP, 0x01, 1, 0x50, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(PROF_TRIGGER_6, 0xaaaa, LOGOP, 0x01, 1, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(PROF_TRIGGER_7, 0xaaaa, LOGOP, 0x01, 1, 0x70, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(SHARED_LD, 0xaaaa, LOGOP, 0x64, 1, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(SHARED_ST, 0xaaaa, LOGOP, 0x64, 1, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00),
_Q(THREADS_LAUNCHED, 0xaaaa, LOGOP, 0x26, 6, 0x10, 0x21, 0x32, 0x43, 0x54, 0x65),
_Q(TH_INST_EXECUTED_0, 0xaaaa, LOGOP, 0xa3, 6, 0x00, 0x11, 0x22, 0x33, 0x44, 0x55),
_Q(TH_INST_EXECUTED_1, 0xaaaa, LOGOP, 0xa5, 6, 0x00, 0x11, 0x22, 0x33, 0x44, 0x55),
_Q(TH_INST_EXECUTED_2, 0xaaaa, LOGOP, 0xa4, 6, 0x00, 0x11, 0x22, 0x33, 0x44, 0x55),
_Q(TH_INST_EXECUTED_3, 0xaaaa, LOGOP, 0xa6, 6, 0x00, 0x11, 0x22, 0x33, 0x44, 0x55),
_Q(WARPS_LAUNCHED, 0xaaaa, LOGOP, 0x26, 1, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00),
};
#undef _Q
static const struct nvc0_mp_pm_query_cfg *
nvc0_mp_pm_query_get_cfg(struct nvc0_context *nvc0, struct nvc0_query *q)
static const struct nvc0_hw_sm_query_cfg *
nvc0_hw_sm_query_get_cfg(struct nvc0_context *nvc0, struct nvc0_query *q)
{
struct nvc0_screen *screen = nvc0->screen;
if (screen->base.class_3d >= NVE4_3D_CLASS)
return &nve4_mp_pm_queries[q->type - PIPE_QUERY_DRIVER_SPECIFIC];
return &nvc0_mp_pm_queries[q->type - NVC0_PM_QUERY(0)];
return &nve4_hw_sm_queries[q->type - PIPE_QUERY_DRIVER_SPECIFIC];
return &nvc0_hw_sm_queries[q->type - NVC0_HW_SM_QUERY(0)];
}
boolean
nvc0_mp_pm_query_begin(struct nvc0_context *nvc0, struct nvc0_query *q)
nvc0_hw_sm_query_begin(struct nvc0_context *nvc0, struct nvc0_query *q)
{
struct nvc0_screen *screen = nvc0->screen;
struct nouveau_pushbuf *push = nvc0->base.pushbuf;
const bool is_nve4 = screen->base.class_3d >= NVE4_3D_CLASS;
const struct nvc0_mp_pm_query_cfg *cfg;
const struct nvc0_hw_sm_query_cfg *cfg;
unsigned i, c;
unsigned num_ab[2] = { 0, 0 };
cfg = nvc0_mp_pm_query_get_cfg(nvc0, q);
cfg = nvc0_hw_sm_query_get_cfg(nvc0, q);
/* check if we have enough free counter slots */
for (i = 0; i < cfg->num_counters; ++i)
num_ab[cfg->ctr[i].sig_dom]++;
if (screen->pm.num_mp_pm_active[0] + num_ab[0] > 4 ||
screen->pm.num_mp_pm_active[1] + num_ab[1] > 4) {
if (screen->pm.num_hw_sm_active[0] + num_ab[0] > 4 ||
screen->pm.num_hw_sm_active[1] + num_ab[1] > 4) {
NOUVEAU_ERR("Not enough free MP counter slots !\n");
return false;
}
@@ -1113,14 +1113,14 @@ nvc0_mp_pm_query_begin(struct nvc0_context *nvc0, struct nvc0_query *q)
for (i = 0; i < cfg->num_counters; ++i) {
const unsigned d = cfg->ctr[i].sig_dom;
if (!screen->pm.num_mp_pm_active[d]) {
if (!screen->pm.num_hw_sm_active[d]) {
uint32_t m = (1 << 22) | (1 << (7 + (8 * !d)));
if (screen->pm.num_mp_pm_active[!d])
if (screen->pm.num_hw_sm_active[!d])
m |= 1 << (7 + (8 * d));
BEGIN_NVC0(push, SUBC_SW(0x0600), 1);
PUSH_DATA (push, m);
}
screen->pm.num_mp_pm_active[d]++;
screen->pm.num_hw_sm_active[d]++;
for (c = d * 4; c < (d * 4 + 4); ++c) {
if (!screen->pm.mp_counter[c]) {
@@ -1163,7 +1163,7 @@ nvc0_mp_pm_query_begin(struct nvc0_context *nvc0, struct nvc0_query *q)
}
static void
nvc0_mp_pm_query_end(struct nvc0_context *nvc0, struct nvc0_query *q)
nvc0_hw_sm_query_end(struct nvc0_context *nvc0, struct nvc0_query *q)
{
struct nvc0_screen *screen = nvc0->screen;
struct pipe_context *pipe = &nvc0->base.pipe;
@@ -1174,9 +1174,9 @@ nvc0_mp_pm_query_end(struct nvc0_context *nvc0, struct nvc0_query *q)
const uint block[3] = { 32, is_nve4 ? 4 : 1, 1 };
const uint grid[3] = { screen->mp_count, 1, 1 };
unsigned c;
const struct nvc0_mp_pm_query_cfg *cfg;
const struct nvc0_hw_sm_query_cfg *cfg;
cfg = nvc0_mp_pm_query_get_cfg(nvc0, q);
cfg = nvc0_hw_sm_query_get_cfg(nvc0, q);
if (unlikely(!screen->pm.prog)) {
struct nvc0_program *prog = CALLOC_STRUCT(nvc0_program);
@@ -1185,11 +1185,11 @@ nvc0_mp_pm_query_end(struct nvc0_context *nvc0, struct nvc0_query *q)
prog->num_gprs = 14;
prog->parm_size = 12;
if (is_nve4) {
prog->code = (uint32_t *)nve4_read_mp_pm_counters_code;
prog->code_size = sizeof(nve4_read_mp_pm_counters_code);
prog->code = (uint32_t *)nve4_read_hw_sm_counters_code;
prog->code_size = sizeof(nve4_read_hw_sm_counters_code);
} else {
prog->code = (uint32_t *)nvc0_read_mp_pm_counters_code;
prog->code_size = sizeof(nvc0_read_mp_pm_counters_code);
prog->code = (uint32_t *)nvc0_read_hw_sm_counters_code;
prog->code_size = sizeof(nvc0_read_hw_sm_counters_code);
}
screen->pm.prog = prog;
}
@@ -1207,7 +1207,7 @@ nvc0_mp_pm_query_end(struct nvc0_context *nvc0, struct nvc0_query *q)
/* release counters for this query */
for (c = 0; c < 8; ++c) {
if (nvc0_query(screen->pm.mp_counter[c]) == q) {
screen->pm.num_mp_pm_active[c / 4]--;
screen->pm.num_hw_sm_active[c / 4]--;
screen->pm.mp_counter[c] = NULL;
}
}
@@ -1234,7 +1234,7 @@ nvc0_mp_pm_query_end(struct nvc0_context *nvc0, struct nvc0_query *q)
q = nvc0_query(screen->pm.mp_counter[c]);
if (!q)
continue;
cfg = nvc0_mp_pm_query_get_cfg(nvc0, q);
cfg = nvc0_hw_sm_query_get_cfg(nvc0, q);
for (i = 0; i < cfg->num_counters; ++i) {
if (mask & (1 << q->ctr[i]))
break;
@@ -1250,10 +1250,10 @@ nvc0_mp_pm_query_end(struct nvc0_context *nvc0, struct nvc0_query *q)
}
static inline bool
nvc0_mp_pm_query_read_data(uint32_t count[32][4],
nvc0_hw_sm_query_read_data(uint32_t count[32][4],
struct nvc0_context *nvc0, bool wait,
struct nvc0_query *q,
const struct nvc0_mp_pm_query_cfg *cfg,
const struct nvc0_hw_sm_query_cfg *cfg,
unsigned mp_count)
{
unsigned p, c;
@@ -1275,10 +1275,10 @@ nvc0_mp_pm_query_read_data(uint32_t count[32][4],
}
static inline bool
nve4_mp_pm_query_read_data(uint32_t count[32][4],
nve4_hw_sm_query_read_data(uint32_t count[32][4],
struct nvc0_context *nvc0, bool wait,
struct nvc0_query *q,
const struct nvc0_mp_pm_query_cfg *cfg,
const struct nvc0_hw_sm_query_cfg *cfg,
unsigned mp_count)
{
unsigned p, c, d;
@@ -1317,22 +1317,22 @@ nve4_mp_pm_query_read_data(uint32_t count[32][4],
* NOTE: Interpretation of IPC requires knowledge of MP count.
*/
static boolean
nvc0_mp_pm_query_result(struct nvc0_context *nvc0, struct nvc0_query *q,
nvc0_hw_sm_query_result(struct nvc0_context *nvc0, struct nvc0_query *q,
void *result, boolean wait)
{
uint32_t count[32][4];
uint64_t value = 0;
unsigned mp_count = MIN2(nvc0->screen->mp_count_compute, 32);
unsigned p, c;
const struct nvc0_mp_pm_query_cfg *cfg;
const struct nvc0_hw_sm_query_cfg *cfg;
bool ret;
cfg = nvc0_mp_pm_query_get_cfg(nvc0, q);
cfg = nvc0_hw_sm_query_get_cfg(nvc0, q);
if (nvc0->screen->base.class_3d >= NVE4_3D_CLASS)
ret = nve4_mp_pm_query_read_data(count, nvc0, wait, q, cfg, mp_count);
ret = nve4_hw_sm_query_read_data(count, nvc0, wait, q, cfg, mp_count);
else
ret = nvc0_mp_pm_query_read_data(count, nvc0, wait, q, cfg, mp_count);
ret = nvc0_hw_sm_query_read_data(count, nvc0, wait, q, cfg, mp_count);
if (!ret)
return false;
@@ -1410,11 +1410,11 @@ nvc0_screen_get_driver_query_info(struct pipe_screen *pscreen,
if (screen->base.device->drm_version >= 0x01000101) {
if (screen->compute) {
if (screen->base.class_3d == NVE4_3D_CLASS) {
count += NVE4_PM_QUERY_COUNT;
count += NVE4_HW_SM_QUERY_COUNT;
} else
if (screen->base.class_3d < NVE4_3D_CLASS) {
/* NVC0_COMPUTE is not always enabled */
count += NVC0_PM_QUERY_COUNT;
count += NVC0_HW_SM_QUERY_COUNT;
}
}
}
@@ -1444,15 +1444,15 @@ nvc0_screen_get_driver_query_info(struct pipe_screen *pscreen,
if (screen->compute) {
if (screen->base.class_3d == NVE4_3D_CLASS) {
info->name = nve4_pm_query_names[id - NVC0_QUERY_DRV_STAT_COUNT];
info->query_type = NVE4_PM_QUERY(id - NVC0_QUERY_DRV_STAT_COUNT);
info->query_type = NVE4_HW_SM_QUERY(id - NVC0_QUERY_DRV_STAT_COUNT);
info->max_value.u64 =
(id < NVE4_PM_QUERY_METRIC_MP_OCCUPANCY) ? 0 : 100;
(id < NVE4_HW_SM_QUERY_METRIC_MP_OCCUPANCY) ? 0 : 100;
info->group_id = NVC0_QUERY_MP_COUNTER_GROUP;
return 1;
} else
if (screen->base.class_3d < NVE4_3D_CLASS) {
info->name = nvc0_pm_query_names[id - NVC0_QUERY_DRV_STAT_COUNT];
info->query_type = NVC0_PM_QUERY(id - NVC0_QUERY_DRV_STAT_COUNT);
info->query_type = NVC0_HW_SM_QUERY(id - NVC0_QUERY_DRV_STAT_COUNT);
info->group_id = NVC0_QUERY_MP_COUNTER_GROUP;
return 1;
}
@@ -1494,7 +1494,7 @@ nvc0_screen_get_driver_query_group_info(struct pipe_screen *pscreen,
info->type = PIPE_DRIVER_QUERY_GROUP_TYPE_GPU;
if (screen->base.class_3d == NVE4_3D_CLASS) {
info->num_queries = NVE4_PM_QUERY_COUNT;
info->num_queries = NVE4_HW_SM_QUERY_COUNT;
/* On NVE4+, each multiprocessor has 8 hardware counters separated
* into two distinct domains, but we allow only one active query
@@ -1504,7 +1504,7 @@ nvc0_screen_get_driver_query_group_info(struct pipe_screen *pscreen,
return 1;
} else
if (screen->base.class_3d < NVE4_3D_CLASS) {
info->num_queries = NVC0_PM_QUERY_COUNT;
info->num_queries = NVC0_HW_SM_QUERY_COUNT;
/* On NVC0:NVE4, each multiprocessor has 8 hardware counters
* in a single domain. */
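As a rough model of the slot accounting in nvc0_hw_sm_query_begin above: each SM exposes four counters per signal domain, and a query is refused when the counters it needs would overflow either domain. A standalone sketch under those assumptions:

#include <stdbool.h>
#include <stdio.h>

struct sm_counters { unsigned num_active[2]; };   /* [0] = domain A, [1] = B */

static bool
try_begin(struct sm_counters *sm, const unsigned sig_dom[], unsigned num_counters)
{
   unsigned need[2] = { 0, 0 };
   for (unsigned i = 0; i < num_counters; i++)
      need[sig_dom[i]]++;                         /* per-domain demand */
   if (sm->num_active[0] + need[0] > 4 ||
       sm->num_active[1] + need[1] > 4)
      return false;                               /* not enough free slots */
   sm->num_active[0] += need[0];
   sm->num_active[1] += need[1];
   return true;
}

int main(void)
{
   struct sm_counters sm = { { 3, 0 } };          /* three A counters busy */
   const unsigned two_in_a[2] = { 0, 0 };
   printf("%d\n", try_begin(&sm, two_in_a, 2));   /* 0: domain A would need 5 */
   return 0;
}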

View File

@@ -95,7 +95,7 @@ struct nvc0_screen {
struct {
struct nvc0_program *prog; /* compute state object to read MP counters */
struct pipe_query *mp_counter[8]; /* counter to query allocation */
uint8_t num_mp_pm_active[2];
uint8_t num_hw_sm_active[2];
bool mp_counters_enabled;
} pm;
@@ -120,156 +120,139 @@ nvc0_screen(struct pipe_screen *screen)
/* Performance counter queries:
*/
#define NVE4_PM_QUERY_COUNT 49
#define NVE4_PM_QUERY(i) (PIPE_QUERY_DRIVER_SPECIFIC + (i))
#define NVE4_PM_QUERY_LAST NVE4_PM_QUERY(NVE4_PM_QUERY_COUNT - 1)
#define NVE4_PM_QUERY_PROF_TRIGGER_0 0
#define NVE4_PM_QUERY_PROF_TRIGGER_1 1
#define NVE4_PM_QUERY_PROF_TRIGGER_2 2
#define NVE4_PM_QUERY_PROF_TRIGGER_3 3
#define NVE4_PM_QUERY_PROF_TRIGGER_4 4
#define NVE4_PM_QUERY_PROF_TRIGGER_5 5
#define NVE4_PM_QUERY_PROF_TRIGGER_6 6
#define NVE4_PM_QUERY_PROF_TRIGGER_7 7
#define NVE4_PM_QUERY_LAUNCHED_WARPS 8
#define NVE4_PM_QUERY_LAUNCHED_THREADS 9
#define NVE4_PM_QUERY_LAUNCHED_CTA 10
#define NVE4_PM_QUERY_INST_ISSUED1 11
#define NVE4_PM_QUERY_INST_ISSUED2 12
#define NVE4_PM_QUERY_INST_EXECUTED 13
#define NVE4_PM_QUERY_LD_LOCAL 14
#define NVE4_PM_QUERY_ST_LOCAL 15
#define NVE4_PM_QUERY_LD_SHARED 16
#define NVE4_PM_QUERY_ST_SHARED 17
#define NVE4_PM_QUERY_L1_LOCAL_LOAD_HIT 18
#define NVE4_PM_QUERY_L1_LOCAL_LOAD_MISS 19
#define NVE4_PM_QUERY_L1_LOCAL_STORE_HIT 20
#define NVE4_PM_QUERY_L1_LOCAL_STORE_MISS 21
#define NVE4_PM_QUERY_GLD_REQUEST 22
#define NVE4_PM_QUERY_GST_REQUEST 23
#define NVE4_PM_QUERY_L1_GLOBAL_LOAD_HIT 24
#define NVE4_PM_QUERY_L1_GLOBAL_LOAD_MISS 25
#define NVE4_PM_QUERY_GLD_TRANSACTIONS_UNCACHED 26
#define NVE4_PM_QUERY_GST_TRANSACTIONS 27
#define NVE4_PM_QUERY_BRANCH 28
#define NVE4_PM_QUERY_BRANCH_DIVERGENT 29
#define NVE4_PM_QUERY_ACTIVE_WARPS 30
#define NVE4_PM_QUERY_ACTIVE_CYCLES 31
#define NVE4_PM_QUERY_INST_ISSUED 32
#define NVE4_PM_QUERY_ATOM_COUNT 33
#define NVE4_PM_QUERY_GRED_COUNT 34
#define NVE4_PM_QUERY_LD_SHARED_REPLAY 35
#define NVE4_PM_QUERY_ST_SHARED_REPLAY 36
#define NVE4_PM_QUERY_LD_LOCAL_TRANSACTIONS 37
#define NVE4_PM_QUERY_ST_LOCAL_TRANSACTIONS 38
#define NVE4_PM_QUERY_L1_LD_SHARED_TRANSACTIONS 39
#define NVE4_PM_QUERY_L1_ST_SHARED_TRANSACTIONS 40
#define NVE4_PM_QUERY_GLD_MEM_DIV_REPLAY 41
#define NVE4_PM_QUERY_GST_MEM_DIV_REPLAY 42
#define NVE4_PM_QUERY_METRIC_IPC 43
#define NVE4_PM_QUERY_METRIC_IPAC 44
#define NVE4_PM_QUERY_METRIC_IPEC 45
#define NVE4_PM_QUERY_METRIC_MP_OCCUPANCY 46
#define NVE4_PM_QUERY_METRIC_MP_EFFICIENCY 47
#define NVE4_PM_QUERY_METRIC_INST_REPLAY_OHEAD 48
#define NVE4_HW_SM_QUERY(i) (PIPE_QUERY_DRIVER_SPECIFIC + (i))
#define NVE4_HW_SM_QUERY_LAST NVE4_HW_SM_QUERY(NVE4_HW_SM_QUERY_COUNT - 1)
enum nve4_pm_queries
{
NVE4_HW_SM_QUERY_ACTIVE_CYCLES = 0,
NVE4_HW_SM_QUERY_ACTIVE_WARPS,
NVE4_HW_SM_QUERY_ATOM_COUNT,
NVE4_HW_SM_QUERY_BRANCH,
NVE4_HW_SM_QUERY_DIVERGENT_BRANCH,
NVE4_HW_SM_QUERY_GLD_REQUEST,
NVE4_HW_SM_QUERY_GLD_MEM_DIV_REPLAY,
NVE4_HW_SM_QUERY_GST_TRANSACTIONS,
NVE4_HW_SM_QUERY_GST_MEM_DIV_REPLAY,
NVE4_HW_SM_QUERY_GRED_COUNT,
NVE4_HW_SM_QUERY_GST_REQUEST,
NVE4_HW_SM_QUERY_INST_EXECUTED,
NVE4_HW_SM_QUERY_INST_ISSUED,
NVE4_HW_SM_QUERY_INST_ISSUED1,
NVE4_HW_SM_QUERY_INST_ISSUED2,
NVE4_HW_SM_QUERY_L1_GLD_HIT,
NVE4_HW_SM_QUERY_L1_GLD_MISS,
NVE4_HW_SM_QUERY_L1_LOCAL_LD_HIT,
NVE4_HW_SM_QUERY_L1_LOCAL_LD_MISS,
NVE4_HW_SM_QUERY_L1_LOCAL_ST_HIT,
NVE4_HW_SM_QUERY_L1_LOCAL_ST_MISS,
NVE4_HW_SM_QUERY_L1_SHARED_LD_TRANSACTIONS,
NVE4_HW_SM_QUERY_L1_SHARED_ST_TRANSACTIONS,
NVE4_HW_SM_QUERY_LOCAL_LD,
NVE4_HW_SM_QUERY_LOCAL_LD_TRANSACTIONS,
NVE4_HW_SM_QUERY_LOCAL_ST,
NVE4_HW_SM_QUERY_LOCAL_ST_TRANSACTIONS,
NVE4_HW_SM_QUERY_PROF_TRIGGER_0,
NVE4_HW_SM_QUERY_PROF_TRIGGER_1,
NVE4_HW_SM_QUERY_PROF_TRIGGER_2,
NVE4_HW_SM_QUERY_PROF_TRIGGER_3,
NVE4_HW_SM_QUERY_PROF_TRIGGER_4,
NVE4_HW_SM_QUERY_PROF_TRIGGER_5,
NVE4_HW_SM_QUERY_PROF_TRIGGER_6,
NVE4_HW_SM_QUERY_PROF_TRIGGER_7,
NVE4_HW_SM_QUERY_SHARED_LD,
NVE4_HW_SM_QUERY_SHARED_LD_REPLAY,
NVE4_HW_SM_QUERY_SHARED_ST,
NVE4_HW_SM_QUERY_SHARED_ST_REPLAY,
NVE4_HW_SM_QUERY_SM_CTA_LAUNCHED,
NVE4_HW_SM_QUERY_THREADS_LAUNCHED,
NVE4_HW_SM_QUERY_UNCACHED_GLD_TRANSACTIONS,
NVE4_HW_SM_QUERY_WARPS_LAUNCHED,
NVE4_HW_SM_QUERY_METRIC_IPC,
NVE4_HW_SM_QUERY_METRIC_IPAC,
NVE4_HW_SM_QUERY_METRIC_IPEC,
NVE4_HW_SM_QUERY_METRIC_MP_OCCUPANCY,
NVE4_HW_SM_QUERY_METRIC_MP_EFFICIENCY,
NVE4_HW_SM_QUERY_METRIC_INST_REPLAY_OHEAD,
NVE4_HW_SM_QUERY_COUNT
};
/*
#define NVE4_PM_QUERY_GR_IDLE 50
#define NVE4_PM_QUERY_BSP_IDLE 51
#define NVE4_PM_QUERY_VP_IDLE 52
#define NVE4_PM_QUERY_PPP_IDLE 53
#define NVE4_PM_QUERY_CE0_IDLE 54
#define NVE4_PM_QUERY_CE1_IDLE 55
#define NVE4_PM_QUERY_CE2_IDLE 56
*/
/* L2 queries (PCOUNTER) */
/*
#define NVE4_PM_QUERY_L2_SUBP_WRITE_L1_SECTOR_QUERIES 57
...
*/
/* TEX queries (PCOUNTER) */
/*
#define NVE4_PM_QUERY_TEX0_CACHE_SECTOR_QUERIES 58
...
*/
#define NVC0_PM_QUERY_COUNT 31
#define NVC0_PM_QUERY(i) (PIPE_QUERY_DRIVER_SPECIFIC + 2048 + (i))
#define NVC0_PM_QUERY_LAST NVC0_PM_QUERY(NVC0_PM_QUERY_COUNT - 1)
#define NVC0_PM_QUERY_INST_EXECUTED 0
#define NVC0_PM_QUERY_BRANCH 1
#define NVC0_PM_QUERY_BRANCH_DIVERGENT 2
#define NVC0_PM_QUERY_ACTIVE_WARPS 3
#define NVC0_PM_QUERY_ACTIVE_CYCLES 4
#define NVC0_PM_QUERY_LAUNCHED_WARPS 5
#define NVC0_PM_QUERY_LAUNCHED_THREADS 6
#define NVC0_PM_QUERY_LD_SHARED 7
#define NVC0_PM_QUERY_ST_SHARED 8
#define NVC0_PM_QUERY_LD_LOCAL 9
#define NVC0_PM_QUERY_ST_LOCAL 10
#define NVC0_PM_QUERY_GRED_COUNT 11
#define NVC0_PM_QUERY_ATOM_COUNT 12
#define NVC0_PM_QUERY_GLD_REQUEST 13
#define NVC0_PM_QUERY_GST_REQUEST 14
#define NVC0_PM_QUERY_INST_ISSUED1_0 15
#define NVC0_PM_QUERY_INST_ISSUED1_1 16
#define NVC0_PM_QUERY_INST_ISSUED2_0 17
#define NVC0_PM_QUERY_INST_ISSUED2_1 18
#define NVC0_PM_QUERY_TH_INST_EXECUTED_0 19
#define NVC0_PM_QUERY_TH_INST_EXECUTED_1 20
#define NVC0_PM_QUERY_TH_INST_EXECUTED_2 21
#define NVC0_PM_QUERY_TH_INST_EXECUTED_3 22
#define NVC0_PM_QUERY_PROF_TRIGGER_0 23
#define NVC0_PM_QUERY_PROF_TRIGGER_1 24
#define NVC0_PM_QUERY_PROF_TRIGGER_2 25
#define NVC0_PM_QUERY_PROF_TRIGGER_3 26
#define NVC0_PM_QUERY_PROF_TRIGGER_4 27
#define NVC0_PM_QUERY_PROF_TRIGGER_5 28
#define NVC0_PM_QUERY_PROF_TRIGGER_6 29
#define NVC0_PM_QUERY_PROF_TRIGGER_7 30
#define NVC0_HW_SM_QUERY(i) (PIPE_QUERY_DRIVER_SPECIFIC + 2048 + (i))
#define NVC0_HW_SM_QUERY_LAST NVC0_HW_SM_QUERY(NVC0_HW_SM_QUERY_COUNT - 1)
enum nvc0_pm_queries
{
NVC0_HW_SM_QUERY_ACTIVE_CYCLES = 0,
NVC0_HW_SM_QUERY_ACTIVE_WARPS,
NVC0_HW_SM_QUERY_ATOM_COUNT,
NVC0_HW_SM_QUERY_BRANCH,
NVC0_HW_SM_QUERY_DIVERGENT_BRANCH,
NVC0_HW_SM_QUERY_GLD_REQUEST,
NVC0_HW_SM_QUERY_GRED_COUNT,
NVC0_HW_SM_QUERY_GST_REQUEST,
NVC0_HW_SM_QUERY_INST_EXECUTED,
NVC0_HW_SM_QUERY_INST_ISSUED1_0,
NVC0_HW_SM_QUERY_INST_ISSUED1_1,
NVC0_HW_SM_QUERY_INST_ISSUED2_0,
NVC0_HW_SM_QUERY_INST_ISSUED2_1,
NVC0_HW_SM_QUERY_LOCAL_LD,
NVC0_HW_SM_QUERY_LOCAL_ST,
NVC0_HW_SM_QUERY_PROF_TRIGGER_0,
NVC0_HW_SM_QUERY_PROF_TRIGGER_1,
NVC0_HW_SM_QUERY_PROF_TRIGGER_2,
NVC0_HW_SM_QUERY_PROF_TRIGGER_3,
NVC0_HW_SM_QUERY_PROF_TRIGGER_4,
NVC0_HW_SM_QUERY_PROF_TRIGGER_5,
NVC0_HW_SM_QUERY_PROF_TRIGGER_6,
NVC0_HW_SM_QUERY_PROF_TRIGGER_7,
NVC0_HW_SM_QUERY_SHARED_LD,
NVC0_HW_SM_QUERY_SHARED_ST,
NVC0_HW_SM_QUERY_THREADS_LAUNCHED,
NVC0_HW_SM_QUERY_TH_INST_EXECUTED_0,
NVC0_HW_SM_QUERY_TH_INST_EXECUTED_1,
NVC0_HW_SM_QUERY_TH_INST_EXECUTED_2,
NVC0_HW_SM_QUERY_TH_INST_EXECUTED_3,
NVC0_HW_SM_QUERY_WARPS_LAUNCHED,
NVC0_HW_SM_QUERY_COUNT
};
/* Driver statistics queries:
*/
#ifdef NOUVEAU_ENABLE_DRIVER_STATISTICS
#define NVC0_QUERY_DRV_STAT(i) (PIPE_QUERY_DRIVER_SPECIFIC + 1024 + (i))
#define NVC0_QUERY_DRV_STAT_COUNT 29
#define NVC0_QUERY_DRV_STAT_LAST NVC0_QUERY_DRV_STAT(NVC0_QUERY_DRV_STAT_COUNT - 1)
#define NVC0_QUERY_DRV_STAT_TEX_OBJECT_CURRENT_COUNT 0
#define NVC0_QUERY_DRV_STAT_TEX_OBJECT_CURRENT_BYTES 1
#define NVC0_QUERY_DRV_STAT_BUF_OBJECT_CURRENT_COUNT 2
#define NVC0_QUERY_DRV_STAT_BUF_OBJECT_CURRENT_BYTES_VID 3
#define NVC0_QUERY_DRV_STAT_BUF_OBJECT_CURRENT_BYTES_SYS 4
#define NVC0_QUERY_DRV_STAT_TEX_TRANSFERS_READ 5
#define NVC0_QUERY_DRV_STAT_TEX_TRANSFERS_WRITE 6
#define NVC0_QUERY_DRV_STAT_TEX_COPY_COUNT 7
#define NVC0_QUERY_DRV_STAT_TEX_BLIT_COUNT 8
#define NVC0_QUERY_DRV_STAT_TEX_CACHE_FLUSH_COUNT 9
#define NVC0_QUERY_DRV_STAT_BUF_TRANSFERS_READ 10
#define NVC0_QUERY_DRV_STAT_BUF_TRANSFERS_WRITE 11
#define NVC0_QUERY_DRV_STAT_BUF_READ_BYTES_STAGING_VID 12
#define NVC0_QUERY_DRV_STAT_BUF_WRITE_BYTES_DIRECT 13
#define NVC0_QUERY_DRV_STAT_BUF_WRITE_BYTES_STAGING_VID 14
#define NVC0_QUERY_DRV_STAT_BUF_WRITE_BYTES_STAGING_SYS 15
#define NVC0_QUERY_DRV_STAT_BUF_COPY_BYTES 16
#define NVC0_QUERY_DRV_STAT_BUF_NON_KERNEL_FENCE_SYNC_COUNT 17
#define NVC0_QUERY_DRV_STAT_ANY_NON_KERNEL_FENCE_SYNC_COUNT 18
#define NVC0_QUERY_DRV_STAT_QUERY_SYNC_COUNT 19
#define NVC0_QUERY_DRV_STAT_GPU_SERIALIZE_COUNT 20
#define NVC0_QUERY_DRV_STAT_DRAW_CALLS_ARRAY 21
#define NVC0_QUERY_DRV_STAT_DRAW_CALLS_INDEXED 22
#define NVC0_QUERY_DRV_STAT_DRAW_CALLS_FALLBACK_COUNT 23
#define NVC0_QUERY_DRV_STAT_USER_BUFFER_UPLOAD_BYTES 24
#define NVC0_QUERY_DRV_STAT_CONSTBUF_UPLOAD_COUNT 25
#define NVC0_QUERY_DRV_STAT_CONSTBUF_UPLOAD_BYTES 26
#define NVC0_QUERY_DRV_STAT_PUSHBUF_COUNT 27
#define NVC0_QUERY_DRV_STAT_RESOURCE_VALIDATE_COUNT 28
#else
#define NVC0_QUERY_DRV_STAT_COUNT 0
enum nvc0_drv_stats_queries
{
#ifdef NOUVEAU_ENABLE_DRIVER_STATISTICS
NVC0_QUERY_DRV_STAT_TEX_OBJECT_CURRENT_COUNT = 0,
NVC0_QUERY_DRV_STAT_TEX_OBJECT_CURRENT_BYTES,
NVC0_QUERY_DRV_STAT_BUF_OBJECT_CURRENT_COUNT,
NVC0_QUERY_DRV_STAT_BUF_OBJECT_CURRENT_BYTES_VID,
NVC0_QUERY_DRV_STAT_BUF_OBJECT_CURRENT_BYTES_SYS,
NVC0_QUERY_DRV_STAT_TEX_TRANSFERS_READ,
NVC0_QUERY_DRV_STAT_TEX_TRANSFERS_WRITE,
NVC0_QUERY_DRV_STAT_TEX_COPY_COUNT,
NVC0_QUERY_DRV_STAT_TEX_BLIT_COUNT,
NVC0_QUERY_DRV_STAT_TEX_CACHE_FLUSH_COUNT,
NVC0_QUERY_DRV_STAT_BUF_TRANSFERS_READ,
NVC0_QUERY_DRV_STAT_BUF_TRANSFERS_WRITE,
NVC0_QUERY_DRV_STAT_BUF_READ_BYTES_STAGING_VID,
NVC0_QUERY_DRV_STAT_BUF_WRITE_BYTES_DIRECT,
NVC0_QUERY_DRV_STAT_BUF_WRITE_BYTES_STAGING_VID,
NVC0_QUERY_DRV_STAT_BUF_WRITE_BYTES_STAGING_SYS,
NVC0_QUERY_DRV_STAT_BUF_COPY_BYTES,
NVC0_QUERY_DRV_STAT_BUF_NON_KERNEL_FENCE_SYNC_COUNT,
NVC0_QUERY_DRV_STAT_ANY_NON_KERNEL_FENCE_SYNC_COUNT,
NVC0_QUERY_DRV_STAT_QUERY_SYNC_COUNT,
NVC0_QUERY_DRV_STAT_GPU_SERIALIZE_COUNT,
NVC0_QUERY_DRV_STAT_DRAW_CALLS_ARRAY,
NVC0_QUERY_DRV_STAT_DRAW_CALLS_INDEXED,
NVC0_QUERY_DRV_STAT_DRAW_CALLS_FALLBACK_COUNT,
NVC0_QUERY_DRV_STAT_USER_BUFFER_UPLOAD_BYTES,
NVC0_QUERY_DRV_STAT_CONSTBUF_UPLOAD_COUNT,
NVC0_QUERY_DRV_STAT_CONSTBUF_UPLOAD_BYTES,
NVC0_QUERY_DRV_STAT_PUSHBUF_COUNT,
NVC0_QUERY_DRV_STAT_RESOURCE_VALIDATE_COUNT,
#endif
NVC0_QUERY_DRV_STAT_COUNT
};
int nvc0_screen_get_driver_query_info(struct pipe_screen *, unsigned,
struct pipe_driver_query_info *);
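For reference, the numbering scheme behind the HW SM query enums above: NVE4 queries occupy the driver-specific range starting at PIPE_QUERY_DRIVER_SPECIFIC, NVC0 queries start 2048 slots later, and the _LAST macros bound the range checks used in nvc0_query.c. A standalone sketch; the base value here is a placeholder:

#include <stdbool.h>
#include <stdio.h>

#define PIPE_QUERY_DRIVER_SPECIFIC 256   /* placeholder base value */
#define NVE4_HW_SM_QUERY_COUNT     49
#define NVC0_HW_SM_QUERY_COUNT     31
#define NVE4_HW_SM_QUERY(i)   (PIPE_QUERY_DRIVER_SPECIFIC + (i))
#define NVE4_HW_SM_QUERY_LAST NVE4_HW_SM_QUERY(NVE4_HW_SM_QUERY_COUNT - 1)
#define NVC0_HW_SM_QUERY(i)   (PIPE_QUERY_DRIVER_SPECIFIC + 2048 + (i))
#define NVC0_HW_SM_QUERY_LAST NVC0_HW_SM_QUERY(NVC0_HW_SM_QUERY_COUNT - 1)

static bool is_hw_sm_query(unsigned type)
{
   return (type >= NVE4_HW_SM_QUERY(0) && type <= NVE4_HW_SM_QUERY_LAST) ||
          (type >= NVC0_HW_SM_QUERY(0) && type <= NVC0_HW_SM_QUERY_LAST);
}

int main(void)
{
   printf("%d\n", is_hw_sm_query(NVE4_HW_SM_QUERY(5)));              /* 1 */
   printf("%d\n", is_hw_sm_query(PIPE_QUERY_DRIVER_SPECIFIC + 100)); /* 0 */
   return 0;
}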

View File

@@ -887,6 +887,7 @@ nvc0_blitctx_prepare_state(struct nvc0_blitctx *blit)
/* zsa state */
IMMED_NVC0(push, NVC0_3D(DEPTH_TEST_ENABLE), 0);
IMMED_NVC0(push, NVC0_3D(DEPTH_BOUNDS_EN), 0);
IMMED_NVC0(push, NVC0_3D(STENCIL_ENABLE), 0);
IMMED_NVC0(push, NVC0_3D(ALPHA_TEST_ENABLE), 0);

View File

@@ -363,7 +363,7 @@ static void r300_init_states(struct pipe_context *pipe)
}
struct pipe_context* r300_create_context(struct pipe_screen* screen,
void *priv)
void *priv, unsigned flags)
{
struct r300_context* r300 = CALLOC_STRUCT(r300_context);
struct r300_screen* r300screen = r300_screen(screen);

View File

@@ -705,7 +705,7 @@ r300_get_nonnull_cb(struct pipe_framebuffer_state *fb, unsigned i)
}
struct pipe_context* r300_create_context(struct pipe_screen* screen,
void *priv);
void *priv, unsigned flags);
/* Context initialization. */
struct draw_stage* r300_draw_stage(struct r300_context* r300);

View File

@@ -120,7 +120,7 @@ int64_t compute_memory_prealloc_chunk(
assert(size_in_dw <= pool->size_in_dw);
COMPUTE_DBG(pool->screen, "* compute_memory_prealloc_chunk() size_in_dw = %ld\n",
COMPUTE_DBG(pool->screen, "* compute_memory_prealloc_chunk() size_in_dw = %"PRIi64"\n",
size_in_dw);
LIST_FOR_EACH_ENTRY(item, pool->item_list, link) {
@@ -151,7 +151,7 @@ struct list_head *compute_memory_postalloc_chunk(
struct compute_memory_item *next;
struct list_head *next_link;
COMPUTE_DBG(pool->screen, "* compute_memory_postalloc_chunck() start_in_dw = %ld\n",
COMPUTE_DBG(pool->screen, "* compute_memory_postalloc_chunck() start_in_dw = %"PRIi64"\n",
start_in_dw);
/* Check if we can insert it in the front of the list */
@@ -568,7 +568,7 @@ void compute_memory_free(struct compute_memory_pool* pool, int64_t id)
struct pipe_screen *screen = (struct pipe_screen *)pool->screen;
struct pipe_resource *res;
COMPUTE_DBG(pool->screen, "* compute_memory_free() id + %ld \n", id);
COMPUTE_DBG(pool->screen, "* compute_memory_free() id + %"PRIi64" \n", id);
LIST_FOR_EACH_ENTRY_SAFE(item, next, pool->item_list, link) {
@@ -628,7 +628,7 @@ struct compute_memory_item* compute_memory_alloc(
{
struct compute_memory_item *new_item = NULL;
COMPUTE_DBG(pool->screen, "* compute_memory_alloc() size_in_dw = %ld (%ld bytes)\n",
COMPUTE_DBG(pool->screen, "* compute_memory_alloc() size_in_dw = %"PRIi64" (%"PRIi64" bytes)\n",
size_in_dw, 4 * size_in_dw);
new_item = (struct compute_memory_item *)
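The format-string changes above are there because int64_t is not guaranteed to be long, so "%ld" is only portable through the <inttypes.h> PRIi64 macro. A minimal illustration:

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
   int64_t size_in_dw = 1 << 20;
   /* PRIi64 expands to the right conversion ("lld" or "ld") per platform */
   printf("size_in_dw = %"PRIi64" (%"PRIi64" bytes)\n",
          size_in_dw, 4 * size_in_dw);
   return 0;
}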

View File

@@ -2143,11 +2143,11 @@ static void evergreen_emit_shader_stages(struct r600_context *rctx, struct r600_
if (state->geom_enable) {
uint32_t cut_val;
if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 128)
if (rctx->gs_shader->gs_max_out_vertices <= 128)
cut_val = V_028A40_GS_CUT_128;
else if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 256)
else if (rctx->gs_shader->gs_max_out_vertices <= 256)
cut_val = V_028A40_GS_CUT_256;
else if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 512)
else if (rctx->gs_shader->gs_max_out_vertices <= 512)
cut_val = V_028A40_GS_CUT_512;
else
cut_val = V_028A40_GS_CUT_1024;
@@ -3013,7 +3013,7 @@ void evergreen_update_gs_state(struct pipe_context *ctx, struct r600_pipe_shader
struct r600_shader *rshader = &shader->shader;
struct r600_shader *cp_shader = &shader->gs_copy_shader->shader;
unsigned gsvs_itemsize =
(cp_shader->ring_item_size * rshader->gs_max_out_vertices) >> 2;
(cp_shader->ring_item_size * shader->selector->gs_max_out_vertices) >> 2;
r600_init_command_buffer(cb, 64);
@@ -3022,14 +3022,14 @@ void evergreen_update_gs_state(struct pipe_context *ctx, struct r600_pipe_shader
r600_store_context_reg(cb, R_028AB8_VGT_VTX_CNT_EN, 1);
r600_store_context_reg(cb, R_028B38_VGT_GS_MAX_VERT_OUT,
S_028B38_MAX_VERT_OUT(rshader->gs_max_out_vertices));
S_028B38_MAX_VERT_OUT(shader->selector->gs_max_out_vertices));
r600_store_context_reg(cb, R_028A6C_VGT_GS_OUT_PRIM_TYPE,
r600_conv_prim_to_gs_out(rshader->gs_output_prim));
r600_conv_prim_to_gs_out(shader->selector->gs_output_prim));
if (rctx->screen->b.info.drm_minor >= 35) {
r600_store_context_reg(cb, R_028B90_VGT_GS_INSTANCE_CNT,
S_028B90_CNT(MIN2(rshader->gs_num_invocations, 127)) |
S_028B90_ENABLE(rshader->gs_num_invocations > 0));
S_028B90_CNT(MIN2(shader->selector->gs_num_invocations, 127)) |
S_028B90_ENABLE(shader->selector->gs_num_invocations > 0));
}
r600_store_context_reg_seq(cb, R_02891C_SQ_GS_VERT_ITEMSIZE, 4);
r600_store_value(cb, cp_shader->ring_item_size >> 2);
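The cut-value selection shown above is unchanged in behaviour; it merely reads gs_max_out_vertices from the selector instead of the compiled shader. A sketch of the threshold logic, with placeholder enum names standing in for the V_028A40_GS_CUT_* values:

#include <stdio.h>

enum gs_cut { GS_CUT_128, GS_CUT_256, GS_CUT_512, GS_CUT_1024 };

static enum gs_cut pick_cut(unsigned gs_max_out_vertices)
{
   if (gs_max_out_vertices <= 128)
      return GS_CUT_128;
   if (gs_max_out_vertices <= 256)
      return GS_CUT_256;
   if (gs_max_out_vertices <= 512)
      return GS_CUT_512;
   return GS_CUT_1024;
}

int main(void)
{
   printf("%d\n", pick_cut(300));   /* 2, i.e. GS_CUT_512 */
   return 0;
}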

View File

@@ -2029,6 +2029,8 @@ void r600_bytecode_disasm(struct r600_bytecode *bc)
fprintf(stderr, "CND:%X ", cf->cond);
if (cf->pop_count)
fprintf(stderr, "POP:%X ", cf->pop_count);
if (cf->end_of_program)
fprintf(stderr, "EOP ");
fprintf(stderr, "\n");
}
}

View File

@@ -108,7 +108,8 @@ static void r600_destroy_context(struct pipe_context *context)
FREE(rctx);
}
static struct pipe_context *r600_create_context(struct pipe_screen *screen, void *priv)
static struct pipe_context *r600_create_context(struct pipe_screen *screen,
void *priv, unsigned flags)
{
struct r600_context *rctx = CALLOC_STRUCT(r600_context);
struct r600_screen* rscreen = (struct r600_screen *)screen;
@@ -624,7 +625,7 @@ struct pipe_screen *r600_screen_create(struct radeon_winsys *ws)
rscreen->global_pool = compute_memory_pool_new(rscreen);
/* Create the auxiliary context. This must be done last. */
rscreen->b.aux_context = rscreen->b.b.context_create(&rscreen->b.b, NULL);
rscreen->b.aux_context = rscreen->b.b.context_create(&rscreen->b.b, NULL, 0);
#if 0 /* This is for testing whether aux_context and buffer clearing work correctly. */
struct pipe_resource templ = {};

View File

@@ -36,6 +36,8 @@
#include "util/list.h"
#include "util/u_transfer.h"
#include "tgsi/tgsi_scan.h"
#define R600_NUM_ATOMS 75
#define R600_MAX_VIEWPORTS 16
@@ -305,12 +307,18 @@ struct r600_pipe_shader_selector {
struct tgsi_token *tokens;
struct pipe_stream_output_info so;
struct tgsi_shader_info info;
unsigned num_shaders;
/* PIPE_SHADER_[VERTEX|FRAGMENT|...] */
unsigned type;
/* geometry shader properties */
unsigned gs_output_prim;
unsigned gs_max_out_vertices;
unsigned gs_num_invocations;
unsigned nr_ps_max_color_exports;
};
@@ -936,28 +944,5 @@ static inline bool r600_can_read_depth(struct r600_texture *rtex)
#define V_028A6C_OUTPRIM_TYPE_LINESTRIP 1
#define V_028A6C_OUTPRIM_TYPE_TRISTRIP 2
static inline unsigned r600_conv_prim_to_gs_out(unsigned mode)
{
static const int prim_conv[] = {
V_028A6C_OUTPRIM_TYPE_POINTLIST,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_LINESTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP,
V_028A6C_OUTPRIM_TYPE_TRISTRIP
};
assert(mode < Elements(prim_conv));
return prim_conv[mode];
}
unsigned r600_conv_prim_to_gs_out(unsigned mode);
#endif

View File

@@ -1809,7 +1809,6 @@ static int r600_shader_from_tgsi(struct r600_context *rctx,
struct tgsi_token *tokens = pipeshader->selector->tokens;
struct pipe_stream_output_info so = pipeshader->selector->so;
struct tgsi_full_immediate *immediate;
struct tgsi_full_property *property;
struct r600_shader_ctx ctx;
struct r600_bytecode_output output[32];
unsigned output_done, noutput;
@@ -1840,7 +1839,7 @@ static int r600_shader_from_tgsi(struct r600_context *rctx,
shader->indirect_files = ctx.info.indirect_files;
indirect_gprs = ctx.info.indirect_files & ~(1 << TGSI_FILE_CONSTANT);
tgsi_parse_init(&ctx.parse, tokens);
ctx.type = ctx.parse.FullHeader.Processor.Processor;
ctx.type = ctx.info.processor;
shader->processor_type = ctx.type;
ctx.bc->type = shader->processor_type;
@@ -1968,6 +1967,12 @@ static int r600_shader_from_tgsi(struct r600_context *rctx,
ctx.nliterals = 0;
ctx.literals = NULL;
shader->fs_write_all = FALSE;
if (ctx.info.properties[TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS])
shader->fs_write_all = TRUE;
shader->vs_position_window_space = FALSE;
if (ctx.info.properties[TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION])
shader->vs_position_window_space = TRUE;
if (shader->vs_as_gs_a)
vs_add_primid_output(&ctx, key.vs.prim_id_out);
@@ -1994,34 +1999,7 @@ static int r600_shader_from_tgsi(struct r600_context *rctx,
goto out_err;
break;
case TGSI_TOKEN_TYPE_INSTRUCTION:
break;
case TGSI_TOKEN_TYPE_PROPERTY:
property = &ctx.parse.FullToken.FullProperty;
switch (property->Property.PropertyName) {
case TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS:
if (property->u[0].Data == 1)
shader->fs_write_all = TRUE;
break;
case TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION:
if (property->u[0].Data == 1)
shader->vs_position_window_space = TRUE;
break;
case TGSI_PROPERTY_VS_PROHIBIT_UCPS:
/* we don't need this one */
break;
case TGSI_PROPERTY_GS_INPUT_PRIM:
shader->gs_input_prim = property->u[0].Data;
break;
case TGSI_PROPERTY_GS_OUTPUT_PRIM:
shader->gs_output_prim = property->u[0].Data;
break;
case TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES:
shader->gs_max_out_vertices = property->u[0].Data;
break;
case TGSI_PROPERTY_GS_INVOCATIONS:
shader->gs_num_invocations = property->u[0].Data;
break;
}
break;
default:
R600_ERR("unsupported token type %d\n", ctx.parse.FullToken.Token.Type);
@@ -6151,10 +6129,10 @@ static int tgsi_cmp(struct r600_shader_ctx *ctx)
r = tgsi_make_src_for_op3(ctx, temp_regs[0], i, &alu.src[0], &ctx->src[0]);
if (r)
return r;
r = tgsi_make_src_for_op3(ctx, temp_regs[1], i, &alu.src[1], &ctx->src[2]);
r = tgsi_make_src_for_op3(ctx, temp_regs[2], i, &alu.src[1], &ctx->src[2]);
if (r)
return r;
r = tgsi_make_src_for_op3(ctx, temp_regs[2], i, &alu.src[2], &ctx->src[1]);
r = tgsi_make_src_for_op3(ctx, temp_regs[1], i, &alu.src[2], &ctx->src[1]);
if (r)
return r;
tgsi_dst(ctx, &inst->Dst[0], i, &alu.dst);
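On the tgsi_cmp swap above, my reading of the diff: TGSI CMP selects per component dst = (src0 < 0) ? src1 : src2, so each temporary register index has to stay paired with the source it feeds, and the old code crossed them. A scalar illustration of the intended semantics:

#include <stdio.h>

/* per-component semantics of TGSI CMP */
static float tgsi_cmp_scalar(float src0, float src1, float src2)
{
   return (src0 < 0.0f) ? src1 : src2;
}

int main(void)
{
   printf("%.1f\n", tgsi_cmp_scalar(-1.0f, 10.0f, 20.0f));   /* 10.0 */
   printf("%.1f\n", tgsi_cmp_scalar( 1.0f, 10.0f, 20.0f));   /* 20.0 */
   return 0;
}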

View File

@@ -78,11 +78,6 @@ struct r600_shader {
/* Temporarily workaround SB not handling CF_INDEX_[01] index registers */
boolean uses_index_registers;
/* geometry shader properties */
unsigned gs_input_prim;
unsigned gs_output_prim;
unsigned gs_max_out_vertices;
unsigned gs_num_invocations;
/* size in bytes of a data item in the ring (single vertex data) */
unsigned ring_item_size;

View File

@@ -1951,11 +1951,11 @@ static void r600_emit_shader_stages(struct r600_context *rctx, struct r600_atom
if (state->geom_enable) {
uint32_t cut_val;
if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 128)
if (rctx->gs_shader->gs_max_out_vertices <= 128)
cut_val = V_028A40_GS_CUT_128;
else if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 256)
else if (rctx->gs_shader->gs_max_out_vertices <= 256)
cut_val = V_028A40_GS_CUT_256;
else if (rctx->gs_shader->current->shader.gs_max_out_vertices <= 512)
else if (rctx->gs_shader->gs_max_out_vertices <= 512)
cut_val = V_028A40_GS_CUT_512;
else
cut_val = V_028A40_GS_CUT_1024;
@@ -2650,7 +2650,7 @@ void r600_update_gs_state(struct pipe_context *ctx, struct r600_pipe_shader *sha
struct r600_shader *rshader = &shader->shader;
struct r600_shader *cp_shader = &shader->gs_copy_shader->shader;
unsigned gsvs_itemsize =
(cp_shader->ring_item_size * rshader->gs_max_out_vertices) >> 2;
(cp_shader->ring_item_size * shader->selector->gs_max_out_vertices) >> 2;
r600_init_command_buffer(cb, 64);
@@ -2659,10 +2659,10 @@ void r600_update_gs_state(struct pipe_context *ctx, struct r600_pipe_shader *sha
if (rctx->b.chip_class >= R700) {
r600_store_context_reg(cb, R_028B38_VGT_GS_MAX_VERT_OUT,
S_028B38_MAX_VERT_OUT(rshader->gs_max_out_vertices));
S_028B38_MAX_VERT_OUT(shader->selector->gs_max_out_vertices));
}
r600_store_context_reg(cb, R_028A6C_VGT_GS_OUT_PRIM_TYPE,
r600_conv_prim_to_gs_out(rshader->gs_output_prim));
r600_conv_prim_to_gs_out(shader->selector->gs_output_prim));
r600_store_context_reg(cb, R_0288C8_SQ_GS_VERT_ITEMSIZE,
cp_shader->ring_item_size >> 2);

View File

@@ -34,6 +34,7 @@
#include "util/u_upload_mgr.h"
#include "util/u_math.h"
#include "tgsi/tgsi_parse.h"
#include "tgsi/tgsi_scan.h"
void r600_init_command_buffer(struct r600_command_buffer *cb, unsigned num_dw)
{
@@ -123,6 +124,31 @@ static unsigned r600_conv_pipe_prim(unsigned prim)
return prim_conv[prim];
}
unsigned r600_conv_prim_to_gs_out(unsigned mode)
{
static const int prim_conv[] = {
[PIPE_PRIM_POINTS] = V_028A6C_OUTPRIM_TYPE_POINTLIST,
[PIPE_PRIM_LINES] = V_028A6C_OUTPRIM_TYPE_LINESTRIP,
[PIPE_PRIM_LINE_LOOP] = V_028A6C_OUTPRIM_TYPE_LINESTRIP,
[PIPE_PRIM_LINE_STRIP] = V_028A6C_OUTPRIM_TYPE_LINESTRIP,
[PIPE_PRIM_TRIANGLES] = V_028A6C_OUTPRIM_TYPE_TRISTRIP,
[PIPE_PRIM_TRIANGLE_STRIP] = V_028A6C_OUTPRIM_TYPE_TRISTRIP,
[PIPE_PRIM_TRIANGLE_FAN] = V_028A6C_OUTPRIM_TYPE_TRISTRIP,
[PIPE_PRIM_QUADS] = V_028A6C_OUTPRIM_TYPE_TRISTRIP,
[PIPE_PRIM_QUAD_STRIP] = V_028A6C_OUTPRIM_TYPE_TRISTRIP,
[PIPE_PRIM_POLYGON] = V_028A6C_OUTPRIM_TYPE_TRISTRIP,
[PIPE_PRIM_LINES_ADJACENCY] = V_028A6C_OUTPRIM_TYPE_LINESTRIP,
[PIPE_PRIM_LINE_STRIP_ADJACENCY] = V_028A6C_OUTPRIM_TYPE_LINESTRIP,
[PIPE_PRIM_TRIANGLES_ADJACENCY] = V_028A6C_OUTPRIM_TYPE_TRISTRIP,
[PIPE_PRIM_TRIANGLE_STRIP_ADJACENCY] = V_028A6C_OUTPRIM_TYPE_TRISTRIP,
[PIPE_PRIM_PATCHES] = V_028A6C_OUTPRIM_TYPE_POINTLIST,
[R600_PRIM_RECTANGLE_LIST] = V_028A6C_OUTPRIM_TYPE_TRISTRIP
};
assert(mode < Elements(prim_conv));
return prim_conv[mode];
}
/* common state between evergreen and r600 */
static void r600_bind_blend_state_internal(struct r600_context *rctx,
@@ -818,6 +844,19 @@ static void *r600_create_shader_state(struct pipe_context *ctx,
sel->type = pipe_shader_type;
sel->tokens = tgsi_dup_tokens(state->tokens);
sel->so = state->stream_output;
tgsi_scan_shader(state->tokens, &sel->info);
switch (pipe_shader_type) {
case PIPE_SHADER_GEOMETRY:
sel->gs_output_prim =
sel->info.properties[TGSI_PROPERTY_GS_OUTPUT_PRIM];
sel->gs_max_out_vertices =
sel->info.properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES];
sel->gs_num_invocations =
sel->info.properties[TGSI_PROPERTY_GS_INVOCATIONS];
break;
}
return sel;
}
@@ -1524,7 +1563,7 @@ static void r600_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info
unsigned prim = info.mode;
if (rctx->gs_shader) {
prim = rctx->gs_shader->current->shader.gs_output_prim;
prim = rctx->gs_shader->gs_output_prim;
}
prim = r600_conv_prim_to_gs_out(prim); /* decrease the number of types to 3 */
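The r600 hunks above move the geometry-shader properties (output primitive, max output vertices, invocation count) from each compiled shader onto the selector, filled once from tgsi_scan_shader at CSO creation, so state emission no longer has to go through ->current. A simplified standalone sketch of that pattern; the struct and property names are stand-ins:

#include <stdio.h>

enum { PROP_GS_OUTPUT_PRIM, PROP_GS_MAX_OUTPUT_VERTICES, PROP_GS_INVOCATIONS };

struct scan_info { unsigned properties[8]; };   /* stand-in for tgsi_shader_info */

struct shader_selector {
   unsigned gs_output_prim;
   unsigned gs_max_out_vertices;
   unsigned gs_num_invocations;
};

/* filled once when the CSO is created, shared by every compiled variant */
static void init_selector(struct shader_selector *sel, const struct scan_info *info)
{
   sel->gs_output_prim      = info->properties[PROP_GS_OUTPUT_PRIM];
   sel->gs_max_out_vertices = info->properties[PROP_GS_MAX_OUTPUT_VERTICES];
   sel->gs_num_invocations  = info->properties[PROP_GS_INVOCATIONS];
}

int main(void)
{
   struct scan_info info = { .properties = { 5, 128, 1 } };
   struct shader_selector sel;
   init_selector(&sel, &info);
   printf("max_out_vertices = %u\n", sel.gs_max_out_vertices);
   return 0;
}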

View File

@@ -3428,7 +3428,6 @@
#define S_0085F0_SO3_DEST_BASE_ENA(x) (((x) & 0x1) << 5)
#define G_0085F0_SO3_DEST_BASE_ENA(x) (((x) >> 5) & 0x1)
#define C_0085F0_SO3_DEST_BASE_ENA 0xFFFFFFDF
#define S_0085F0_CB0_DEST_BASE_ENA_SHIFT 6
#define S_0085F0_CB0_DEST_BASE_ENA(x) (((x) & 0x1) << 6)
#define G_0085F0_CB0_DEST_BASE_ENA(x) (((x) >> 6) & 0x1)
#define C_0085F0_CB0_DEST_BASE_ENA 0xFFFFFFBF

View File

@@ -32,6 +32,7 @@ int bc_decoder::decode_cf(unsigned &i, bc_cf& bc) {
int r = 0;
uint32_t dw0 = dw[i];
uint32_t dw1 = dw[i+1];
assert(i+1 <= ndw);
if ((dw1 >> 29) & 1) { // CF_ALU
return decode_cf_alu(i, bc);

View File

@@ -199,6 +199,9 @@ void bc_finalizer::finalize_if(region_node* r) {
cf_node *if_jump = sh.create_cf(CF_OP_JUMP);
cf_node *if_pop = sh.create_cf(CF_OP_POP);
if (!last_cf || last_cf->get_parent_region() == r) {
last_cf = if_pop;
}
if_pop->bc.pop_count = 1;
if_pop->jump_after(if_pop);

View File

@@ -95,7 +95,7 @@ int bc_parser::decode_shader() {
if ((r = decode_cf(i, eop)))
return r;
} while (!eop || (i >> 1) <= max_cf);
} while (!eop || (i >> 1) < max_cf);
return 0;
}
@@ -769,6 +769,7 @@ int bc_parser::prepare_ir() {
}
int bc_parser::prepare_loop(cf_node* c) {
assert(c->bc.addr-1 < cf_map.size());
cf_node *end = cf_map[c->bc.addr - 1];
assert(end->bc.op == CF_OP_LOOP_END);
@@ -788,8 +789,12 @@ int bc_parser::prepare_loop(cf_node* c) {
}
int bc_parser::prepare_if(cf_node* c) {
assert(c->bc.addr-1 < cf_map.size());
cf_node *c_else = NULL, *end = cf_map[c->bc.addr];
if (!end)
return 0; // not quite sure how this happens, malformed input?
BCP_DUMP(
sblog << "parsing JUMP @" << c->bc.id;
sblog << "\n";
@@ -815,7 +820,7 @@ int bc_parser::prepare_if(cf_node* c) {
if (c_else->parent != c->parent)
c_else = NULL;
if (end->parent != c->parent)
if (end && end->parent != c->parent)
end = NULL;
region_node *reg = sh->create_region();

View File

@@ -236,7 +236,7 @@ void rp_gpr_tracker::unreserve(alu_node* n) {
for (i = 0; i < nsrc; ++i) {
value *v = n->src[i];
if (v->is_readonly())
if (v->is_readonly() || v->is_undef())
continue;
if (i == 1 && opt)
continue;

View File

@@ -197,7 +197,7 @@ static void r600_emit_query_begin(struct r600_common_context *ctx, struct r600_q
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 2, 0));
radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_ZPASS_DONE) | EVENT_INDEX(1));
radeon_emit(cs, va);
radeon_emit(cs, (va >> 32UL) & 0xFF);
radeon_emit(cs, (va >> 32) & 0xFFFF);
break;
case PIPE_QUERY_PRIMITIVES_EMITTED:
case PIPE_QUERY_PRIMITIVES_GENERATED:
@@ -206,13 +206,13 @@ static void r600_emit_query_begin(struct r600_common_context *ctx, struct r600_q
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 2, 0));
radeon_emit(cs, EVENT_TYPE(event_type_for_stream(query)) | EVENT_INDEX(3));
radeon_emit(cs, va);
radeon_emit(cs, (va >> 32UL) & 0xFF);
radeon_emit(cs, (va >> 32) & 0xFFFF);
break;
case PIPE_QUERY_TIME_ELAPSED:
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE_EOP, 4, 0));
radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_CACHE_FLUSH_AND_INV_TS_EVENT) | EVENT_INDEX(5));
radeon_emit(cs, va);
radeon_emit(cs, (3 << 29) | ((va >> 32UL) & 0xFF));
radeon_emit(cs, (3 << 29) | ((va >> 32) & 0xFFFF));
radeon_emit(cs, 0);
radeon_emit(cs, 0);
break;
@@ -220,7 +220,7 @@ static void r600_emit_query_begin(struct r600_common_context *ctx, struct r600_q
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 2, 0));
radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_SAMPLE_PIPELINESTAT) | EVENT_INDEX(2));
radeon_emit(cs, va);
radeon_emit(cs, (va >> 32UL) & 0xFF);
radeon_emit(cs, (va >> 32) & 0xFFFF);
break;
default:
assert(0);
@@ -254,7 +254,7 @@ static void r600_emit_query_end(struct r600_common_context *ctx, struct r600_que
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 2, 0));
radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_ZPASS_DONE) | EVENT_INDEX(1));
radeon_emit(cs, va);
radeon_emit(cs, (va >> 32UL) & 0xFF);
radeon_emit(cs, (va >> 32) & 0xFFFF);
break;
case PIPE_QUERY_PRIMITIVES_EMITTED:
case PIPE_QUERY_PRIMITIVES_GENERATED:
@@ -264,7 +264,7 @@ static void r600_emit_query_end(struct r600_common_context *ctx, struct r600_que
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 2, 0));
radeon_emit(cs, EVENT_TYPE(event_type_for_stream(query)) | EVENT_INDEX(3));
radeon_emit(cs, va);
radeon_emit(cs, (va >> 32UL) & 0xFF);
radeon_emit(cs, (va >> 32) & 0xFFFF);
break;
case PIPE_QUERY_TIME_ELAPSED:
va += query->buffer.results_end + query->result_size/2;
@@ -273,7 +273,7 @@ static void r600_emit_query_end(struct r600_common_context *ctx, struct r600_que
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE_EOP, 4, 0));
radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_CACHE_FLUSH_AND_INV_TS_EVENT) | EVENT_INDEX(5));
radeon_emit(cs, va);
radeon_emit(cs, (3 << 29) | ((va >> 32UL) & 0xFF));
radeon_emit(cs, (3 << 29) | ((va >> 32) & 0xFFFF));
radeon_emit(cs, 0);
radeon_emit(cs, 0);
break;
@@ -282,7 +282,7 @@ static void r600_emit_query_end(struct r600_common_context *ctx, struct r600_que
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 2, 0));
radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_SAMPLE_PIPELINESTAT) | EVENT_INDEX(2));
radeon_emit(cs, va);
radeon_emit(cs, (va >> 32UL) & 0xFF);
radeon_emit(cs, (va >> 32) & 0xFFFF);
break;
default:
assert(0);
@@ -341,8 +341,8 @@ static void r600_emit_query_predication(struct r600_common_context *ctx, struct
while (results_base < qbuf->results_end) {
radeon_emit(cs, PKT3(PKT3_SET_PREDICATION, 1, 0));
radeon_emit(cs, (va + results_base) & 0xFFFFFFFFUL);
radeon_emit(cs, op | (((va + results_base) >> 32UL) & 0xFF));
radeon_emit(cs, va + results_base);
radeon_emit(cs, op | (((va + results_base) >> 32) & 0xFF));
r600_emit_reloc(ctx, &ctx->rings.gfx, qbuf->buf, RADEON_USAGE_READ,
RADEON_PRIO_MIN);
results_base += query->result_size;
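
The query packets above split a 64-bit GPU virtual address into a low dword and a masked high dword; the change widens the high mask from 0xFF (keeping only bits 39:32) to 0xFFFF (bits 47:32). A minimal, standalone C sketch of that split, using a made-up address rather than a real driver value:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
   uint64_t va = 0x0000812345678000ULL;   /* hypothetical GPU virtual address */
   uint32_t lo = (uint32_t)va;            /* ADDRESS_LO, bits 31:0 */
   uint32_t hi = (va >> 32) & 0xFFFF;     /* ADDRESS_HI, bits 47:32 */

   /* With the old 0xFF mask, bits 47:40 of the address would be dropped. */
   printf("va=0x%016" PRIx64 " lo=0x%08" PRIx32 " hi=0x%04" PRIx32 "\n",
          va, lo, hi);
   return 0;
}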


@@ -680,7 +680,7 @@ struct radeon_winsys {
uint64_t (*query_value)(struct radeon_winsys *ws,
enum radeon_value_id value);
void (*read_registers)(struct radeon_winsys *ws, unsigned reg_offset,
bool (*read_registers)(struct radeon_winsys *ws, unsigned reg_offset,
unsigned num_registers, uint32_t *out);
};


@@ -0,0 +1 @@
sid_tables.h


@@ -31,3 +31,12 @@ AM_CFLAGS = \
noinst_LTLIBRARIES = libradeonsi.la
libradeonsi_la_SOURCES = $(C_SOURCES)
sid_tables.h: $(srcdir)/sid_tables.py $(srcdir)/sid.h
$(AM_V_GEN) $(PYTHON2) $(srcdir)/sid_tables.py $(srcdir)/sid.h > $@
EXTRA_DIST = \
sid_tables.py
BUILT_SOURCES =\
sid_tables.h


@@ -4,8 +4,10 @@ C_SOURCES := \
si_commands.c \
si_compute.c \
si_cp_dma.c \
si_debug.c \
si_descriptors.c \
sid.h \
sid_tables.h \
si_dma.c \
si_hw_context.c \
si_pipe.c \


@@ -362,7 +362,7 @@ static void si_launch_grid(
shader_va += pc;
#endif
si_pm4_add_bo(pm4, shader->bo, RADEON_USAGE_READ, RADEON_PRIO_SHADER_DATA);
si_pm4_set_reg(pm4, R_00B830_COMPUTE_PGM_LO, (shader_va >> 8) & 0xffffffff);
si_pm4_set_reg(pm4, R_00B830_COMPUTE_PGM_LO, shader_va >> 8);
si_pm4_set_reg(pm4, R_00B834_COMPUTE_PGM_HI, shader_va >> 40);
si_pm4_set_reg(pm4, R_00B848_COMPUTE_PGM_RSRC1,
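
The COMPUTE_PGM_LO change above drops a redundant "& 0xffffffff": storing the shifted 64-bit address into a 32-bit register value already truncates it, and the shader address is programmed in 256-byte units (hence the >> 8 / >> 40 split). A small sketch with an invented, 256-byte-aligned address:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
   /* Hypothetical 256-byte-aligned shader address. */
   uint64_t shader_va = 0x0000812345678900ULL;

   /* Same split as the COMPUTE_PGM_LO/HI writes above; the assignment
    * to uint32_t truncates exactly like the old explicit mask did. */
   uint32_t pgm_lo = shader_va >> 8;    /* bits 39:8  */
   uint32_t pgm_hi = shader_va >> 40;   /* bits 63:40 */

   printf("va=0x%016" PRIx64 " PGM_LO=0x%08" PRIx32 " PGM_HI=0x%02" PRIx32 "\n",
          shader_va, pgm_lo, pgm_hi);
   return 0;
}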


@@ -47,10 +47,11 @@ static void si_emit_cp_dma_copy_buffer(struct si_context *sctx,
unsigned size, unsigned flags)
{
struct radeon_winsys_cs *cs = sctx->b.rings.gfx.cs;
uint32_t sync_flag = flags & R600_CP_DMA_SYNC ? PKT3_CP_DMA_CP_SYNC : 0;
uint32_t raw_wait = flags & SI_CP_DMA_RAW_WAIT ? PKT3_CP_DMA_CMD_RAW_WAIT : 0;
uint32_t sync_flag = flags & R600_CP_DMA_SYNC ? S_411_CP_SYNC(1) : 0;
uint32_t raw_wait = flags & SI_CP_DMA_RAW_WAIT ? S_414_RAW_WAIT(1) : 0;
uint32_t sel = flags & CIK_CP_DMA_USE_L2 ?
PKT3_CP_DMA_SRC_SEL(3) | PKT3_CP_DMA_DST_SEL(3) : 0;
S_411_SRC_SEL(V_411_SRC_ADDR_TC_L2) |
S_411_DSL_SEL(V_411_DST_ADDR_TC_L2) : 0;
assert(size);
assert((size & ((1<<21)-1)) == size);
@@ -79,16 +80,16 @@ static void si_emit_cp_dma_clear_buffer(struct si_context *sctx,
uint32_t clear_value, unsigned flags)
{
struct radeon_winsys_cs *cs = sctx->b.rings.gfx.cs;
uint32_t sync_flag = flags & R600_CP_DMA_SYNC ? PKT3_CP_DMA_CP_SYNC : 0;
uint32_t raw_wait = flags & SI_CP_DMA_RAW_WAIT ? PKT3_CP_DMA_CMD_RAW_WAIT : 0;
uint32_t dst_sel = flags & CIK_CP_DMA_USE_L2 ? PKT3_CP_DMA_DST_SEL(3) : 0;
uint32_t sync_flag = flags & R600_CP_DMA_SYNC ? S_411_CP_SYNC(1) : 0;
uint32_t raw_wait = flags & SI_CP_DMA_RAW_WAIT ? S_414_RAW_WAIT(1) : 0;
uint32_t dst_sel = flags & CIK_CP_DMA_USE_L2 ? S_411_DSL_SEL(V_411_DST_ADDR_TC_L2) : 0;
assert(size);
assert((size & ((1<<21)-1)) == size);
if (sctx->b.chip_class >= CIK) {
radeon_emit(cs, PKT3(PKT3_DMA_DATA, 5, 0));
radeon_emit(cs, sync_flag | dst_sel | PKT3_CP_DMA_SRC_SEL(2)); /* CP_SYNC [31] | SRC_SEL[30:29] */
radeon_emit(cs, sync_flag | dst_sel | S_411_SRC_SEL(V_411_DATA)); /* CP_SYNC [31] | SRC_SEL[30:29] */
radeon_emit(cs, clear_value); /* DATA [31:0] */
radeon_emit(cs, 0);
radeon_emit(cs, dst_va); /* DST_ADDR_LO [31:0] */
@@ -97,7 +98,7 @@ static void si_emit_cp_dma_clear_buffer(struct si_context *sctx,
} else {
radeon_emit(cs, PKT3(PKT3_CP_DMA, 4, 0));
radeon_emit(cs, clear_value); /* DATA [31:0] */
radeon_emit(cs, sync_flag | PKT3_CP_DMA_SRC_SEL(2)); /* CP_SYNC [31] | SRC_SEL[30:29] */
radeon_emit(cs, sync_flag | S_411_SRC_SEL(V_411_DATA)); /* CP_SYNC [31] | SRC_SEL[30:29] */
radeon_emit(cs, dst_va); /* DST_ADDR_LO [31:0] */
radeon_emit(cs, (dst_va >> 32) & 0xffff); /* DST_ADDR_HI [15:0] */
radeon_emit(cs, size | raw_wait); /* COMMAND [29:22] | BYTE_COUNT [20:0] */


@@ -0,0 +1,439 @@
/*
* Copyright 2015 Advanced Micro Devices, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE.
*
* Authors:
* Marek Olšák <maraeo@gmail.com>
*/
#include "si_pipe.h"
#include "si_shader.h"
#include "sid.h"
#include "sid_tables.h"
static void si_dump_shader(struct si_shader_selector *sel, const char *name,
FILE *f)
{
if (!sel || !sel->current)
return;
fprintf(f, "%s shader disassembly:\n", name);
si_dump_shader_key(sel->type, &sel->current->key, f);
fprintf(f, "%s\n\n", sel->current->binary.disasm_string);
}
/* Parsed IBs are difficult to read without colors. Use "less -R file" to
* read them, or use "aha -b -f file" to convert them to html.
*/
#define COLOR_RESET "\033[0m"
#define COLOR_RED "\033[31m"
#define COLOR_GREEN "\033[1;32m"
#define COLOR_YELLOW "\033[1;33m"
#define COLOR_CYAN "\033[1;36m"
#define INDENT_PKT 8
static void print_spaces(FILE *f, unsigned num)
{
fprintf(f, "%*s", num, "");
}
static void print_value(FILE *file, uint32_t value, int bits)
{
/* Guess if it's int or float */
if (value <= (1 << 15))
fprintf(file, "%u\n", value);
else {
float f = uif(value);
if (fabs(f) < 100000 && f*10 == floor(f*10))
fprintf(file, "%.1ff\n", f);
else
/* Don't print more leading zeros than there are bits. */
fprintf(file, "0x%0*x\n", bits / 4, value);
}
}
static void print_named_value(FILE *file, const char *name, uint32_t value,
int bits)
{
print_spaces(file, INDENT_PKT);
fprintf(file, COLOR_YELLOW "%s" COLOR_RESET " <- ", name);
print_value(file, value, bits);
}
static void si_dump_reg(FILE *file, unsigned offset, uint32_t value,
uint32_t field_mask)
{
int r, f;
for (r = 0; r < ARRAY_SIZE(reg_table); r++) {
const struct si_reg *reg = &reg_table[r];
if (reg->offset == offset) {
bool first_field = true;
print_spaces(file, INDENT_PKT);
fprintf(file, COLOR_YELLOW "%s" COLOR_RESET " <- ",
reg->name);
if (!reg->num_fields) {
print_value(file, value, 32);
return;
}
for (f = 0; f < reg->num_fields; f++) {
const struct si_field *field = &reg->fields[f];
uint32_t val = (value & field->mask) >>
(ffs(field->mask) - 1);
if (!(field->mask & field_mask))
continue;
/* Indent the field. */
if (!first_field)
print_spaces(file,
INDENT_PKT + strlen(reg->name) + 4);
/* Print the field. */
fprintf(file, "%s = ", field->name);
if (val < field->num_values && field->values[val])
fprintf(file, "%s\n", field->values[val]);
else
print_value(file, val,
util_bitcount(field->mask));
first_field = false;
}
return;
}
}
fprintf(file, COLOR_YELLOW "0x%05x" COLOR_RESET " = 0x%08x", offset, value);
}
static void si_parse_set_reg_packet(FILE *f, uint32_t *ib, unsigned count,
unsigned reg_offset)
{
unsigned reg = (ib[1] << 2) + reg_offset;
int i;
for (i = 0; i < count; i++)
si_dump_reg(f, reg + i*4, ib[2+i], ~0);
}
static uint32_t *si_parse_packet3(FILE *f, uint32_t *ib, int *num_dw,
int trace_id)
{
unsigned count = PKT_COUNT_G(ib[0]);
unsigned op = PKT3_IT_OPCODE_G(ib[0]);
const char *predicate = PKT3_PREDICATE(ib[0]) ? "(predicate)" : "";
int i;
/* Print the name first. */
for (i = 0; i < ARRAY_SIZE(packet3_table); i++)
if (packet3_table[i].op == op)
break;
if (i < ARRAY_SIZE(packet3_table))
if (op == PKT3_SET_CONTEXT_REG ||
op == PKT3_SET_CONFIG_REG ||
op == PKT3_SET_UCONFIG_REG ||
op == PKT3_SET_SH_REG)
fprintf(f, COLOR_CYAN "%s%s" COLOR_CYAN ":\n",
packet3_table[i].name, predicate);
else
fprintf(f, COLOR_GREEN "%s%s" COLOR_RESET ":\n",
packet3_table[i].name, predicate);
else
fprintf(f, COLOR_RED "PKT3_UNKNOWN 0x%x%s" COLOR_RESET ":\n",
op, predicate);
/* Print the contents. */
switch (op) {
case PKT3_SET_CONTEXT_REG:
si_parse_set_reg_packet(f, ib, count, SI_CONTEXT_REG_OFFSET);
break;
case PKT3_SET_CONFIG_REG:
si_parse_set_reg_packet(f, ib, count, SI_CONFIG_REG_OFFSET);
break;
case PKT3_SET_UCONFIG_REG:
si_parse_set_reg_packet(f, ib, count, CIK_UCONFIG_REG_OFFSET);
break;
case PKT3_SET_SH_REG:
si_parse_set_reg_packet(f, ib, count, SI_SH_REG_OFFSET);
break;
case PKT3_DRAW_PREAMBLE:
si_dump_reg(f, R_030908_VGT_PRIMITIVE_TYPE, ib[1], ~0);
si_dump_reg(f, R_028AA8_IA_MULTI_VGT_PARAM, ib[2], ~0);
si_dump_reg(f, R_028B58_VGT_LS_HS_CONFIG, ib[3], ~0);
break;
case PKT3_ACQUIRE_MEM:
si_dump_reg(f, R_0301F0_CP_COHER_CNTL, ib[1], ~0);
si_dump_reg(f, R_0301F4_CP_COHER_SIZE, ib[2], ~0);
si_dump_reg(f, R_030230_CP_COHER_SIZE_HI, ib[3], ~0);
si_dump_reg(f, R_0301F8_CP_COHER_BASE, ib[4], ~0);
si_dump_reg(f, R_0301E4_CP_COHER_BASE_HI, ib[5], ~0);
print_named_value(f, "POLL_INTERVAL", ib[6], 16);
break;
case PKT3_SURFACE_SYNC:
si_dump_reg(f, R_0085F0_CP_COHER_CNTL, ib[1], ~0);
si_dump_reg(f, R_0085F4_CP_COHER_SIZE, ib[2], ~0);
si_dump_reg(f, R_0085F8_CP_COHER_BASE, ib[3], ~0);
print_named_value(f, "POLL_INTERVAL", ib[4], 16);
break;
case PKT3_EVENT_WRITE:
si_dump_reg(f, R_028A90_VGT_EVENT_INITIATOR, ib[1],
S_028A90_EVENT_TYPE(~0));
print_named_value(f, "EVENT_INDEX", (ib[1] >> 8) & 0xf, 4);
print_named_value(f, "INV_L2", (ib[1] >> 20) & 0x1, 1);
if (count > 0) {
print_named_value(f, "ADDRESS_LO", ib[2], 32);
print_named_value(f, "ADDRESS_HI", ib[3], 16);
}
break;
case PKT3_DRAW_INDEX_AUTO:
si_dump_reg(f, R_030930_VGT_NUM_INDICES, ib[1], ~0);
si_dump_reg(f, R_0287F0_VGT_DRAW_INITIATOR, ib[2], ~0);
break;
case PKT3_DRAW_INDEX_2:
si_dump_reg(f, R_028A78_VGT_DMA_MAX_SIZE, ib[1], ~0);
si_dump_reg(f, R_0287E8_VGT_DMA_BASE, ib[2], ~0);
si_dump_reg(f, R_0287E4_VGT_DMA_BASE_HI, ib[3], ~0);
si_dump_reg(f, R_030930_VGT_NUM_INDICES, ib[4], ~0);
si_dump_reg(f, R_0287F0_VGT_DRAW_INITIATOR, ib[5], ~0);
break;
case PKT3_INDEX_TYPE:
si_dump_reg(f, R_028A7C_VGT_DMA_INDEX_TYPE, ib[1], ~0);
break;
case PKT3_NUM_INSTANCES:
si_dump_reg(f, R_030934_VGT_NUM_INSTANCES, ib[1], ~0);
break;
case PKT3_WRITE_DATA:
si_dump_reg(f, R_370_CONTROL, ib[1], ~0);
si_dump_reg(f, R_371_DST_ADDR_LO, ib[2], ~0);
si_dump_reg(f, R_372_DST_ADDR_HI, ib[3], ~0);
for (i = 2; i < count; i++) {
print_spaces(f, INDENT_PKT);
fprintf(f, "0x%08x\n", ib[2+i]);
}
break;
case PKT3_CP_DMA:
si_dump_reg(f, R_410_CP_DMA_WORD0, ib[1], ~0);
si_dump_reg(f, R_411_CP_DMA_WORD1, ib[2], ~0);
si_dump_reg(f, R_412_CP_DMA_WORD2, ib[3], ~0);
si_dump_reg(f, R_413_CP_DMA_WORD3, ib[4], ~0);
si_dump_reg(f, R_414_COMMAND, ib[5], ~0);
break;
case PKT3_DMA_DATA:
si_dump_reg(f, R_500_DMA_DATA_WORD0, ib[1], ~0);
si_dump_reg(f, R_501_SRC_ADDR_LO, ib[2], ~0);
si_dump_reg(f, R_502_SRC_ADDR_HI, ib[3], ~0);
si_dump_reg(f, R_503_DST_ADDR_LO, ib[4], ~0);
si_dump_reg(f, R_504_DST_ADDR_HI, ib[5], ~0);
si_dump_reg(f, R_414_COMMAND, ib[6], ~0);
break;
case PKT3_NOP:
if (ib[0] == 0xffff1000) {
count = -1; /* One dword NOP. */
break;
} else if (count == 0 && SI_IS_TRACE_POINT(ib[1])) {
unsigned packet_id = SI_GET_TRACE_POINT_ID(ib[1]);
print_spaces(f, INDENT_PKT);
fprintf(f, COLOR_RED "Trace point ID: %u\n", packet_id);
if (trace_id == -1)
break; /* tracing was disabled */
print_spaces(f, INDENT_PKT);
if (packet_id < trace_id)
fprintf(f, COLOR_RED
"This trace point was reached by the CP."
COLOR_RESET "\n");
else if (packet_id == trace_id)
fprintf(f, COLOR_RED
"!!!!! This is the last trace point that "
"was reached by the CP !!!!!"
COLOR_RESET "\n");
else if (packet_id+1 == trace_id)
fprintf(f, COLOR_RED
"!!!!! This is the first trace point that "
"was NOT been reached by the CP !!!!!"
COLOR_RESET "\n");
else
fprintf(f, COLOR_RED
"!!!!! This trace point was NOT reached "
"by the CP !!!!!"
COLOR_RESET "\n");
break;
}
/* fall through, print all dwords */
default:
for (i = 0; i < count+1; i++) {
print_spaces(f, INDENT_PKT);
fprintf(f, "0x%08x\n", ib[1+i]);
}
}
ib += count + 2;
*num_dw -= count + 2;
return ib;
}
/**
* Parse and print an IB into a file.
*
* \param f file
* \param ib IB
* \param num_dw size of the IB
* \param chip_class chip class
* \param trace_id the last trace ID that is known to have been reached
* and executed by the CP, typically read from a buffer
*/
static void si_parse_ib(FILE *f, uint32_t *ib, int num_dw, int trace_id)
{
fprintf(f, "------------------ IB begin ------------------\n");
while (num_dw > 0) {
unsigned type = PKT_TYPE_G(ib[0]);
switch (type) {
case 3:
ib = si_parse_packet3(f, ib, &num_dw, trace_id);
break;
case 2:
/* type-2 nop */
if (ib[0] == 0x80000000) {
fprintf(f, COLOR_GREEN "NOP (type 2)" COLOR_RESET "\n");
ib++;
break;
}
/* fall through */
default:
fprintf(f, "Unknown packet type %i\n", type);
return;
}
}
fprintf(f, "------------------- IB end -------------------\n");
if (num_dw < 0) {
printf("Packet ends after the end of IB.\n");
exit(0);
}
}
static void si_dump_mmapped_reg(struct si_context *sctx, FILE *f,
unsigned offset)
{
struct radeon_winsys *ws = sctx->b.ws;
uint32_t value;
if (ws->read_registers(ws, offset, 1, &value))
si_dump_reg(f, offset, value, ~0);
}
static void si_dump_debug_registers(struct si_context *sctx, FILE *f)
{
if (sctx->screen->b.info.drm_major == 2 &&
sctx->screen->b.info.drm_minor < 42)
return; /* no radeon support */
fprintf(f, "Memory-mapped registers:\n");
si_dump_mmapped_reg(sctx, f, R_008010_GRBM_STATUS);
/* No other registers can be read on DRM < 3.1.0. */
if (sctx->screen->b.info.drm_major < 3 ||
sctx->screen->b.info.drm_minor < 1) {
fprintf(f, "\n");
return;
}
si_dump_mmapped_reg(sctx, f, R_008008_GRBM_STATUS2);
si_dump_mmapped_reg(sctx, f, R_008014_GRBM_STATUS_SE0);
si_dump_mmapped_reg(sctx, f, R_008018_GRBM_STATUS_SE1);
si_dump_mmapped_reg(sctx, f, R_008038_GRBM_STATUS_SE2);
si_dump_mmapped_reg(sctx, f, R_00803C_GRBM_STATUS_SE3);
si_dump_mmapped_reg(sctx, f, R_00D034_SDMA0_STATUS_REG);
si_dump_mmapped_reg(sctx, f, R_00D834_SDMA1_STATUS_REG);
si_dump_mmapped_reg(sctx, f, R_000E50_SRBM_STATUS);
si_dump_mmapped_reg(sctx, f, R_000E4C_SRBM_STATUS2);
si_dump_mmapped_reg(sctx, f, R_000E54_SRBM_STATUS3);
si_dump_mmapped_reg(sctx, f, R_008680_CP_STAT);
si_dump_mmapped_reg(sctx, f, R_008674_CP_STALLED_STAT1);
si_dump_mmapped_reg(sctx, f, R_008678_CP_STALLED_STAT2);
si_dump_mmapped_reg(sctx, f, R_008670_CP_STALLED_STAT3);
si_dump_mmapped_reg(sctx, f, R_008210_CP_CPC_STATUS);
si_dump_mmapped_reg(sctx, f, R_008214_CP_CPC_BUSY_STAT);
si_dump_mmapped_reg(sctx, f, R_008218_CP_CPC_STALLED_STAT1);
si_dump_mmapped_reg(sctx, f, R_00821C_CP_CPF_STATUS);
si_dump_mmapped_reg(sctx, f, R_008220_CP_CPF_BUSY_STAT);
si_dump_mmapped_reg(sctx, f, R_008224_CP_CPF_STALLED_STAT1);
fprintf(f, "\n");
}
static void si_dump_debug_state(struct pipe_context *ctx, FILE *f,
unsigned flags)
{
struct si_context *sctx = (struct si_context*)ctx;
if (flags & PIPE_DEBUG_DEVICE_IS_HUNG)
si_dump_debug_registers(sctx, f);
si_dump_shader(sctx->vs_shader, "Vertex", f);
si_dump_shader(sctx->tcs_shader, "Tessellation control", f);
si_dump_shader(sctx->tes_shader, "Tessellation evaluation", f);
si_dump_shader(sctx->gs_shader, "Geometry", f);
si_dump_shader(sctx->ps_shader, "Fragment", f);
if (sctx->last_ib) {
int last_trace_id = -1;
if (sctx->last_trace_buf) {
/* We are expecting that the ddebug pipe has already
* waited for the context, so this buffer should be idle.
* If the GPU is hung, there is no point in waiting for it.
*/
uint32_t *map =
sctx->b.ws->buffer_map(sctx->last_trace_buf->cs_buf,
NULL,
PIPE_TRANSFER_UNSYNCHRONIZED |
PIPE_TRANSFER_READ);
if (map)
last_trace_id = *map;
}
si_parse_ib(f, sctx->last_ib, sctx->last_ib_dw_size,
last_trace_id);
free(sctx->last_ib); /* dump only once */
sctx->last_ib = NULL;
r600_resource_reference(&sctx->last_trace_buf, NULL);
}
fprintf(f, "Done.\n");
}
void si_init_debug_functions(struct si_context *sctx)
{
sctx->b.b.dump_debug_state = si_dump_debug_state;
}
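
si_dump_reg in the new si_debug.c above decodes each register field by masking the value and shifting it down by ffs(mask) - 1. A tiny self-contained illustration of that idiom; the mask and value here are invented and do not correspond to a real SI register:

#include <stdint.h>
#include <stdio.h>
#include <strings.h>   /* ffs() */

int main(void)
{
   /* Pretend the field occupies bits 11:8 of a 32-bit register. */
   uint32_t field_mask = 0x00000F00;
   uint32_t reg_value  = 0x00012345;

   /* Same idiom as si_dump_reg: shift the field down to bit 0. */
   uint32_t field_val = (reg_value & field_mask) >> (ffs(field_mask) - 1);

   printf("field value = %u\n", field_val);   /* prints 3 */
   return 0;
}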


@@ -426,7 +426,7 @@ static bool si_upload_vertex_buffer_descriptors(struct si_context *sctx)
va = rbuffer->gpu_address + offset;
/* Fill in T# buffer resource description */
desc[0] = va & 0xFFFFFFFF;
desc[0] = va;
desc[1] = S_008F04_BASE_ADDRESS_HI(va >> 32) |
S_008F04_STRIDE(vb->stride);


@@ -86,8 +86,8 @@ static void si_dma_copy_buffer(struct si_context *ctx,
for (i = 0; i < ncopy; i++) {
csize = size < max_csize ? size : max_csize;
cs->buf[cs->cdw++] = SI_DMA_PACKET(SI_DMA_PACKET_COPY, sub_cmd, csize);
cs->buf[cs->cdw++] = dst_offset & 0xffffffff;
cs->buf[cs->cdw++] = src_offset & 0xffffffff;
cs->buf[cs->cdw++] = dst_offset;
cs->buf[cs->cdw++] = src_offset;
cs->buf[cs->cdw++] = (dst_offset >> 32UL) & 0xff;
cs->buf[cs->cdw++] = (src_offset >> 32UL) & 0xff;
dst_offset += csize << shift;


@@ -88,11 +88,8 @@ void si_need_cs_space(struct si_context *ctx, unsigned num_dw,
/* Count in framebuffer cache flushes at the end of CS. */
num_dw += ctx->atoms.s.cache_flush->num_dw;
#if SI_TRACE_CS
if (ctx->screen->b.trace_bo) {
num_dw += SI_TRACE_CS_DWORDS;
}
#endif
if (ctx->screen->b.trace_bo)
num_dw += SI_TRACE_CS_DWORDS * 2;
/* Flush if there's not enough space. */
if (num_dw > cs->max_dw) {
@@ -130,6 +127,19 @@ void si_context_gfx_flush(void *context, unsigned flags,
/* force to keep tiling flags */
flags |= RADEON_FLUSH_KEEP_TILING_FLAGS;
if (ctx->trace_buf)
si_trace_emit(ctx);
/* Save the IB for debug contexts. */
if (ctx->is_debug) {
free(ctx->last_ib);
ctx->last_ib_dw_size = cs->cdw;
ctx->last_ib = malloc(cs->cdw * 4);
memcpy(ctx->last_ib, cs->buf, cs->cdw * 4);
r600_resource_reference(&ctx->last_trace_buf, ctx->trace_buf);
r600_resource_reference(&ctx->trace_buf, NULL);
}
/* Flush the CS. */
ws->cs_flush(cs, flags, &ctx->last_gfx_fence,
ctx->screen->b.cs_count++);
@@ -138,31 +148,28 @@ void si_context_gfx_flush(void *context, unsigned flags,
if (fence)
ws->fence_reference(fence, ctx->last_gfx_fence);
#if SI_TRACE_CS
if (ctx->screen->b.trace_bo) {
struct si_screen *sscreen = ctx->screen;
unsigned i;
for (i = 0; i < 10; i++) {
usleep(5);
if (!ws->buffer_is_busy(sscreen->b.trace_bo->buf, RADEON_USAGE_READWRITE)) {
break;
}
}
if (i == 10) {
fprintf(stderr, "timeout on cs lockup likely happen at cs %d dw %d\n",
sscreen->b.trace_ptr[1], sscreen->b.trace_ptr[0]);
} else {
fprintf(stderr, "cs %d executed in %dms\n", sscreen->b.trace_ptr[1], i * 5);
}
}
#endif
si_begin_new_cs(ctx);
}
void si_begin_new_cs(struct si_context *ctx)
{
if (ctx->is_debug) {
uint32_t zero = 0;
/* Create a buffer used for writing trace IDs and initialize it to 0. */
assert(!ctx->trace_buf);
ctx->trace_buf = (struct r600_resource*)
pipe_buffer_create(ctx->b.b.screen, PIPE_BIND_CUSTOM,
PIPE_USAGE_STAGING, 4);
if (ctx->trace_buf)
pipe_buffer_write_nooverlap(&ctx->b.b, &ctx->trace_buf->b.b,
0, sizeof(zero), &zero);
ctx->trace_id = 0;
}
if (ctx->trace_buf)
si_trace_emit(ctx);
/* Flush read caches at the beginning of CS. */
ctx->b.flags |= SI_CONTEXT_FLUSH_AND_INV_FRAMEBUFFER |
SI_CONTEXT_INV_TC_L1 |


@@ -81,6 +81,9 @@ static void si_destroy_context(struct pipe_context *context)
LLVMDisposeTargetMachine(sctx->tm);
#endif
r600_resource_reference(&sctx->trace_buf, NULL);
r600_resource_reference(&sctx->last_trace_buf, NULL);
free(sctx->last_ib);
FREE(sctx);
}
@@ -92,7 +95,8 @@ si_amdgpu_get_reset_status(struct pipe_context *ctx)
return sctx->b.ws->ctx_query_reset_status(sctx->b.ctx);
}
static struct pipe_context *si_create_context(struct pipe_screen *screen, void *priv)
static struct pipe_context *si_create_context(struct pipe_screen *screen,
void *priv, unsigned flags)
{
struct si_context *sctx = CALLOC_STRUCT(si_context);
struct si_screen* sscreen = (struct si_screen *)screen;
@@ -111,6 +115,7 @@ static struct pipe_context *si_create_context(struct pipe_screen *screen, void *
sctx->b.b.destroy = si_destroy_context;
sctx->b.set_atom_dirty = (void *)si_set_atom_dirty;
sctx->screen = sscreen; /* Easy accessing of screen/winsys. */
sctx->is_debug = (flags & PIPE_CONTEXT_DEBUG) != 0;
if (!r600_common_context_init(&sctx->b, &sscreen->b))
goto fail;
@@ -121,6 +126,7 @@ static struct pipe_context *si_create_context(struct pipe_screen *screen, void *
si_init_blit_functions(sctx);
si_init_compute_functions(sctx);
si_init_cp_dma_functions(sctx);
si_init_debug_functions(sctx);
if (sscreen->b.info.has_uvd) {
sctx->b.b.create_video_codec = si_uvd_create_decoder;
@@ -586,7 +592,7 @@ struct pipe_screen *radeonsi_screen_create(struct radeon_winsys *ws)
sscreen->b.debug_flags |= DBG_FS | DBG_VS | DBG_GS | DBG_PS | DBG_CS;
/* Create the auxiliary context. This must be done last. */
sscreen->b.aux_context = sscreen->b.b.context_create(&sscreen->b.b, NULL);
sscreen->b.aux_context = sscreen->b.b.context_create(&sscreen->b.b, NULL, 0);
return &sscreen->b.b;
}


@@ -43,8 +43,7 @@
#define SI_RESTART_INDEX_UNKNOWN INT_MIN
#define SI_NUM_SMOOTH_AA_SAMPLES 8
#define SI_TRACE_CS 0
#define SI_TRACE_CS_DWORDS 6
#define SI_TRACE_CS_DWORDS 7
#define SI_MAX_DRAW_CS_DWORDS \
(/*scratch:*/ 3 + /*derived prim state:*/ 3 + \
@@ -82,6 +81,10 @@
SI_CONTEXT_FLUSH_AND_INV_DB | \
SI_CONTEXT_FLUSH_AND_INV_DB_META)
#define SI_ENCODE_TRACE_POINT(id) (0xcafe0000 | ((id) & 0xffff))
#define SI_IS_TRACE_POINT(x) (((x) & 0xcafe0000) == 0xcafe0000)
#define SI_GET_TRACE_POINT_ID(x) ((x) & 0xffff)
struct si_compute;
struct si_screen {
@@ -243,6 +246,14 @@ struct si_context {
struct si_shader_selector *last_tcs;
int last_num_tcs_input_cp;
int last_tes_sh_base;
/* Debug state. */
bool is_debug;
uint32_t *last_ib;
unsigned last_ib_dw_size;
struct r600_resource *last_trace_buf;
struct r600_resource *trace_buf;
unsigned trace_id;
};
/* cik_sdma.c */
@@ -275,6 +286,9 @@ void si_copy_buffer(struct si_context *sctx,
bool is_framebuffer);
void si_init_cp_dma_functions(struct si_context *sctx);
/* si_debug.c */
void si_init_debug_functions(struct si_context *sctx);
/* si_dma.c */
void si_dma_copy(struct pipe_context *ctx,
struct pipe_resource *dst,
@@ -290,10 +304,6 @@ void si_context_gfx_flush(void *context, unsigned flags,
void si_begin_new_cs(struct si_context *ctx);
void si_need_cs_space(struct si_context *ctx, unsigned num_dw, boolean count_draw_in);
#if SI_TRACE_CS
void si_trace_emit(struct si_context *sctx);
#endif
/* si_compute.c */
void si_init_compute_functions(struct si_context *sctx);
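
The SI_ENCODE_TRACE_POINT / SI_IS_TRACE_POINT / SI_GET_TRACE_POINT_ID macros added above pack a 16-bit trace ID into a recognizable 0xcafe0000 NOP payload, which is what si_parse_packet3 detects when it prints the trace-point status. A minimal standalone round-trip (the macros are copied from the diff so the snippet builds on its own):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Copied from the si_pipe.h hunk above. */
#define SI_ENCODE_TRACE_POINT(id)  (0xcafe0000 | ((id) & 0xffff))
#define SI_IS_TRACE_POINT(x)       (((x) & 0xcafe0000) == 0xcafe0000)
#define SI_GET_TRACE_POINT_ID(x)   ((x) & 0xffff)

int main(void)
{
   uint32_t payload = SI_ENCODE_TRACE_POINT(42);   /* 0xcafe002a */

   assert(SI_IS_TRACE_POINT(payload));
   assert(SI_GET_TRACE_POINT_ID(payload) == 42);
   printf("NOP payload 0x%08x -> trace id %u\n",
          (unsigned)payload, (unsigned)SI_GET_TRACE_POINT_ID(payload));
   return 0;
}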


@@ -135,12 +135,6 @@ unsigned si_pm4_dirty_dw(struct si_context *sctx)
continue;
count += state->ndw;
#if SI_TRACE_CS
/* for tracing each states */
if (sctx->screen->b.trace_bo) {
count += SI_TRACE_CS_DWORDS;
}
#endif
}
return count;
@@ -161,12 +155,6 @@ void si_pm4_emit(struct si_context *sctx, struct si_pm4_state *state)
}
cs->cdw += state->ndw;
#if SI_TRACE_CS
if (sctx->screen->b.trace_bo) {
si_trace_emit(sctx);
}
#endif
}
void si_pm4_emit_dirty(struct si_context *sctx)


@@ -2418,7 +2418,7 @@ static void tex_fetch_args(
num_deriv_channels = 1;
break;
default:
assert(0); /* no other targets are valid here */
unreachable("invalid target");
}
for (param = 0; param < 2; param++)
@@ -3781,7 +3781,7 @@ void si_shader_apply_scratch_relocs(struct si_context *sctx,
uint64_t scratch_va)
{
unsigned i;
uint32_t scratch_rsrc_dword0 = scratch_va & 0xffffffff;
uint32_t scratch_rsrc_dword0 = scratch_va;
uint32_t scratch_rsrc_dword1 =
S_008F04_BASE_ADDRESS_HI(scratch_va >> 32)
| S_008F04_STRIDE(shader->scratch_bytes_per_wave / 64);
@@ -3964,48 +3964,48 @@ static int si_generate_gs_copy_shader(struct si_screen *sscreen,
return r;
}
static void si_dump_key(unsigned shader, union si_shader_key *key)
void si_dump_shader_key(unsigned shader, union si_shader_key *key, FILE *f)
{
int i;
fprintf(stderr, "SHADER KEY\n");
fprintf(f, "SHADER KEY\n");
switch (shader) {
case PIPE_SHADER_VERTEX:
fprintf(stderr, " instance_divisors = {");
fprintf(f, " instance_divisors = {");
for (i = 0; i < Elements(key->vs.instance_divisors); i++)
fprintf(stderr, !i ? "%u" : ", %u",
fprintf(f, !i ? "%u" : ", %u",
key->vs.instance_divisors[i]);
fprintf(stderr, "}\n");
fprintf(f, "}\n");
if (key->vs.as_es)
fprintf(stderr, " es_enabled_outputs = 0x%"PRIx64"\n",
fprintf(f, " es_enabled_outputs = 0x%"PRIx64"\n",
key->vs.es_enabled_outputs);
fprintf(stderr, " as_es = %u\n", key->vs.as_es);
fprintf(stderr, " as_ls = %u\n", key->vs.as_ls);
fprintf(f, " as_es = %u\n", key->vs.as_es);
fprintf(f, " as_ls = %u\n", key->vs.as_ls);
break;
case PIPE_SHADER_TESS_CTRL:
fprintf(stderr, " prim_mode = %u\n", key->tcs.prim_mode);
fprintf(f, " prim_mode = %u\n", key->tcs.prim_mode);
break;
case PIPE_SHADER_TESS_EVAL:
if (key->tes.as_es)
fprintf(stderr, " es_enabled_outputs = 0x%"PRIx64"\n",
fprintf(f, " es_enabled_outputs = 0x%"PRIx64"\n",
key->tes.es_enabled_outputs);
fprintf(stderr, " as_es = %u\n", key->tes.as_es);
fprintf(f, " as_es = %u\n", key->tes.as_es);
break;
case PIPE_SHADER_GEOMETRY:
break;
case PIPE_SHADER_FRAGMENT:
fprintf(stderr, " export_16bpc = 0x%X\n", key->ps.export_16bpc);
fprintf(stderr, " last_cbuf = %u\n", key->ps.last_cbuf);
fprintf(stderr, " color_two_side = %u\n", key->ps.color_two_side);
fprintf(stderr, " alpha_func = %u\n", key->ps.alpha_func);
fprintf(stderr, " alpha_to_one = %u\n", key->ps.alpha_to_one);
fprintf(stderr, " poly_stipple = %u\n", key->ps.poly_stipple);
fprintf(f, " export_16bpc = 0x%X\n", key->ps.export_16bpc);
fprintf(f, " last_cbuf = %u\n", key->ps.last_cbuf);
fprintf(f, " color_two_side = %u\n", key->ps.color_two_side);
fprintf(f, " alpha_func = %u\n", key->ps.alpha_func);
fprintf(f, " alpha_to_one = %u\n", key->ps.alpha_to_one);
fprintf(f, " poly_stipple = %u\n", key->ps.poly_stipple);
break;
default:
@@ -4036,7 +4036,7 @@ int si_shader_create(struct si_screen *sscreen, LLVMTargetMachineRef tm,
/* Dump TGSI code before doing TGSI->LLVM conversion in case the
* conversion fails. */
if (dump && !(sscreen->b.debug_flags & DBG_NO_TGSI)) {
si_dump_key(sel->type, &shader->key);
si_dump_shader_key(sel->type, &shader->key, stderr);
tgsi_dump(tokens, 0);
si_dump_streamout(&sel->so);
}


@@ -304,6 +304,7 @@ static inline bool si_vs_exports_prim_id(struct si_shader *shader)
/* radeonsi_shader.c */
int si_shader_create(struct si_screen *sscreen, LLVMTargetMachineRef tm,
struct si_shader *shader);
void si_dump_shader_key(unsigned shader, union si_shader_key *key, FILE *f);
int si_compile_llvm(struct si_screen *sscreen, struct si_shader *shader,
LLVMTargetMachineRef tm, LLVMModuleRef mod);
void si_shader_destroy(struct pipe_context *ctx, struct si_shader *shader);


@@ -35,10 +35,10 @@
#include "util/u_pstipple.h"
static void si_init_atom(struct r600_atom *atom, struct r600_atom **list_elem,
void (*emit)(struct si_context *ctx, struct r600_atom *state),
void (*emit_func)(struct si_context *ctx, struct r600_atom *state),
unsigned num_dw)
{
atom->emit = (void*)emit;
atom->emit = (void*)emit_func;
atom->num_dw = num_dw;
atom->dirty = false;
*list_elem = atom;


@@ -281,6 +281,7 @@ extern const struct r600_atom si_atom_msaa_sample_locs;
extern const struct r600_atom si_atom_msaa_config;
void si_emit_cache_flush(struct r600_common_context *sctx, struct r600_atom *atom);
void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *dinfo);
void si_trace_emit(struct si_context *sctx);
/* si_commands.c */
void si_cmd_context_control(struct si_pm4_state *pm4);


@@ -835,11 +835,8 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
si_emit_draw_registers(sctx, info);
si_emit_draw_packets(sctx, info, &ib);
#if SI_TRACE_CS
if (sctx->screen->b.trace_bo) {
if (sctx->trace_buf)
si_trace_emit(sctx);
}
#endif
/* Workaround for a VGT hang when streamout is enabled.
* It must be done after drawing. */
@@ -874,23 +871,20 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
sctx->b.num_draw_calls++;
}
#if SI_TRACE_CS
void si_trace_emit(struct si_context *sctx)
{
struct si_screen *sscreen = sctx->screen;
struct radeon_winsys_cs *cs = sctx->b.rings.gfx.cs;
uint64_t va;
va = sscreen->b.trace_bo->gpu_address;
r600_context_bo_reloc(&sctx->b, &sctx->b.rings.gfx, sscreen->b.trace_bo,
sctx->trace_id++;
r600_context_bo_reloc(&sctx->b, &sctx->b.rings.gfx, sctx->trace_buf,
RADEON_USAGE_READWRITE, RADEON_PRIO_MIN);
radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 4, 0));
radeon_emit(cs, PKT3_WRITE_DATA_DST_SEL(PKT3_WRITE_DATA_DST_SEL_MEM_SYNC) |
PKT3_WRITE_DATA_WR_CONFIRM |
PKT3_WRITE_DATA_ENGINE_SEL(PKT3_WRITE_DATA_ENGINE_SEL_ME));
radeon_emit(cs, va & 0xFFFFFFFFUL);
radeon_emit(cs, (va >> 32UL) & 0xFFFFFFFFUL);
radeon_emit(cs, cs->cdw);
radeon_emit(cs, sscreen->b.cs_count);
radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 3, 0));
radeon_emit(cs, S_370_DST_SEL(V_370_MEMORY_SYNC) |
S_370_WR_CONFIRM(1) |
S_370_ENGINE_SEL(V_370_ME));
radeon_emit(cs, sctx->trace_buf->gpu_address);
radeon_emit(cs, sctx->trace_buf->gpu_address >> 32);
radeon_emit(cs, sctx->trace_id);
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
radeon_emit(cs, SI_ENCODE_TRACE_POINT(sctx->trace_id));
}
#endif


@@ -181,7 +181,7 @@ static void si_shader_es(struct si_shader *shader)
vgpr_comp_cnt = 3; /* all components are needed for TES */
num_user_sgprs = SI_TES_NUM_USER_SGPR;
} else
assert(0);
unreachable("invalid shader selector type");
num_sgprs = shader->num_sgprs;
/* One SGPR after user SGPRs is pre-loaded with es2gs_offset */
@@ -338,7 +338,7 @@ static void si_shader_vs(struct si_shader *shader)
vgpr_comp_cnt = 3; /* all components are needed for TES */
num_user_sgprs = SI_TES_NUM_USER_SGPR;
} else
assert(0);
unreachable("invalid shader selector type");
num_sgprs = shader->num_sgprs;
if (num_user_sgprs > num_sgprs) {

Some files were not shown because too many files have changed in this diff Show More